
[MachineForth] 25x Forth engine

Dear mail-list readers:

I am cross-posting this reply to colorForth and
NOSC mail-lists because I think it has significant
content about Chuck's latest reports on colorForth
and on his latest chips.   I think eventually the
lists will separate enough that cross posting
will not be appropriate most of the time, but
I think the readers of all three lists will
take different spins off of this information.

Andy Valencia wrote in the [MachineForth] list:

> I hope this list is still alive!

It has been very quiet for a long time.  But I figured
that Chuck's press via slashdot would eventually result
in more traffic in the MachineForth list.

The SVFIG chapter's Forth day was held last Saturday and
Chuck gave a presentation in the afternoon where he
talked about his colorForth mostly.  He said that he
would soon be posting two applications in his colorForth
and that these would be a cross compiler for c18 (was
called x18) and a simulator for 25x (which might become
50x before fabrication.)

He showed some of the colorForth code for the c18 Forth.
Of course it looked to me very much like the first MachineForth
that he wrote ten years ago for P21 and the MachineForths
for other chips that Chuck did like P8, i21, P32, F21 etc.
The use of colorForth actually simplified what little was
required, but the first MachineForth that he wrote in FPC
was about 300 bytes.  But the colorForth version will
be interesting to see in totality.

Most of the questions were about the Pentium colorForth
that he handed out on floppy and that he updates on the 
internet.  However since he mentioned c18 and 25x I
tried to get answers to as many of the issues that 
Andy raised as I could.  I seemed to have all the
hardware related questions.  

> it seems like some intermediate steps would
> be appropriate:
> 1. Spec out the CPU
>         a. Instruction architecture

The instruction set started out as the F21 instruction
set.  Then Chuck made a few changes.

First of all there is a new register and a couple of new
instructions. B register and !B but no @B but B@ and B!.
Other than that it looks very much like the other
MISC chips.

The second difference, of course, is that there is no bit 20 (carry)
used for addressing. In fact the biggest difference in
architecture of the processor cores is addressing.

The third difference is that P21 had only 10 bit
branch arguments, F21 has 10 and two types of 14 bit
arguments, and x18 has 8 bit arguments for a reason
that will soon be obvious.

The P21 and F21 and i21 had memory interfaces to off-
chip memory and stacks and control registers on-chip.
They all provided 1M word (20 bit) DRAM address spaces
with hardware provided paged control for on/off-page
DRAM timing.  On F21 and i21 there is a control register
to adjust (tune) the memory timing, and like P21 it varies
with the voltage at which the chip is run.

The 21 chips have a 20-bit wide external data bus and a 20-bit
DRAM address generated by hardware, which
is also used to address non-DRAM.  The hardware supported
non-DRAM megaword is split up into a 256K 8-bit wide
ROM/SRAM/FLASH space with 4 pages mapped in via two bits
in the memory control register and 20 bit wide SRAM space
and 20 bit wide register space.  In fact the chips have
both FAST and SLOW copies of the non-DRAM memory spaces
and all the chips in this family have the 20-bit external
memory interface and 21-bit wide stacks, registers, and alu.
The c18 ( * c for core) is simply 18 bits wide and has
various 18 bit wide busses.

It appears that c18 has 8 bit wide address busses on
chip which provide access to 256 words of 1ns ROM or DRAM.
The low address space is on-chip ROM and DRAM.  Somewhere
above that are control registers similar to those on
other MISC chips.
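
As a quick check of the arithmetic behind these widths, here is a minimal Python sketch that computes how many words each address or branch-argument width mentioned above can reach. The function name is made up for illustration and is not taken from any of Chuck's tools.

```python
def reachable_words(bits: int) -> int:
    """Number of distinct word addresses an unsigned field of `bits` bits can name."""
    return 1 << bits


# Widths mentioned in this message (labels are descriptive only).
WIDTHS = [
    ("c18 on-chip address / branch argument", 8),
    ("P21 branch argument", 10),
    ("F21 long branch argument", 14),
    ("21-family DRAM address", 20),
]

for label, bits in WIDTHS:
    print(f"{label}: {bits} bits -> {reachable_words(bits)} words")
```

An 8 bit argument reaches exactly the 256 on-chip words, which is presumably why c18 needs nothing wider.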

One (or more) of the control registers set up the operation
of the external memory bus and another provides access to
the external memory bus.  So the new B register is mainly
used to hold the address of the external-memory controller
but it could also hold other control port register
addresses or on-chip addresses like the A or R registers.

So one or more on-chip ROMs can be programmed to make the
chip assume the reverse pinout of a 4ns 18 bit wide
external fast-SRAM.  It can also be programmed to provide
control signals for other types of memory.

9 of the core CPU in 25x (center 3x3) are only connected
to other CPU via the horizontal and vertical internal
routing busses.  16 are on the outside of the chip and
have access to I/O pins.  Internal horizontal and vertical
busses are parallel so are fast and provide parallel
access.  They have both hardware and software arbitration
but more on that later.
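
The layout described above can be sketched in a few lines of Python. This assumes 25x is a 5x5 grid, which the 9 + 16 split and the "center 3x3" suggest but which I have not seen stated as a spec. The function names are mine.

```python
SIZE = 5  # assumption: 25x is a 5x5 array of c18 cores


def is_edge(row: int, col: int, size: int = SIZE) -> bool:
    """True if the core at (row, col) sits on the outside ring of the array."""
    return row in (0, size - 1) or col in (0, size - 1)


def bus_neighbours(row: int, col: int, size: int = SIZE):
    """Cores that share the horizontal bus and the vertical bus with (row, col)."""
    same_row = [(row, c) for c in range(size) if c != col]
    same_col = [(r, col) for r in range(size) if r != row]
    return same_row, same_col


cores = [(r, c) for r in range(SIZE) for c in range(SIZE)]
edge = [rc for rc in cores if is_edge(*rc)]          # cores with access to I/O pins
interior = [rc for rc in cores if not is_edge(*rc)]  # cores reachable only via busses

print(len(edge), "edge cores,", len(interior), "interior cores")
```

Under that assumption every core has 4 other cores on its row bus and 4 on its column bus.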

>         b. Instruction set

I would recommend my tutorials on the MISC instruction
sets and MachineForth for the various machines.  Then
when Chuck publishes the c18 Forth and 25x simulator
it will all be pretty obvious even if you have
never seen colorForth.     Likewise if you are new
to colorForth and new to MISC chips and new to MachineForth
it is hard to know where to start.

>         c. Bus(es) architecture

I think I have covered that basically.  Each CPU has
paths to data and return stacks (circular register
stacks with no conventional stack pointer) and an
18 bit wide address and data bus with 256 words
of ROM and DRAM onchip and control registers which
provide access to a hardware/software defined external
memory bus and I/O pins via control registers.  There
are also horizontal and vertical parallel communication
busses in the 25x array.  These are accessed via
software/hardware arbitrated control registers
mapped into the internal address space.
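
To illustrate what a circular register stack with no stack pointer means in practice, here is a small Python model. It is only a sketch: the depth, the class name, and the exact wrap-around behaviour are my assumptions, not the c18 design.

```python
from collections import deque

WORD_MASK = (1 << 18) - 1  # c18 words are 18 bits wide


class CircularStack:
    """Fixed-depth register stack that wraps around instead of over/underflowing."""

    def __init__(self, depth: int):
        self.cells = deque([0] * depth)

    def push(self, value: int) -> None:
        # Shift every cell one place down; the oldest cell wraps to the top
        # and is overwritten by the new value.
        self.cells.rotate(1)
        self.cells[0] = value & WORD_MASK

    def pop(self) -> int:
        # Take the top cell and shift everything up; the popped value wraps
        # to the bottom, so popping more than `depth` times repeats values.
        value = self.cells[0]
        self.cells.rotate(-1)
        return value


stack = CircularStack(depth=4)
for n in (1, 2, 3, 4, 5):  # the fifth push silently overwrites the 1
    stack.push(n)
popped = [stack.pop() for _ in range(5)]
print(popped)
```

There is never an overflow or underflow condition to detect, which is what lets the hardware do without a stack pointer.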

One or more chips on the outside have serial busses
but I don't know the details at this point.  I am
sure at some point Chuck will publish specs.

>         d. Memory organization and hierarchy

I think I have covered that.  But if you didn't
figure this out from above, Chuck also mentioned
explicitly that the CPU cannot execute from external
memory.  They can copy it to on-chip memory
and only execute from on-chip memory.  So the
CPU must assume some overhead that is done
entirely by other hardware on most other chips.
The CPU must manage on-chip memory manually
like other chips handle cache in hardware.
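
A minimal Python sketch of that copy-then-execute discipline follows. The `Core` class and its method names are hypothetical. Only the 256 word on-chip size comes from the description above.

```python
ON_CHIP_WORDS = 256  # on-chip address space size given above


class Core:
    """Toy model: code must be copied on-chip before it can be fetched."""

    def __init__(self, external):
        self.external = external            # list standing in for external memory
        self.on_chip = [0] * ON_CHIP_WORDS  # the only memory the core can execute

    def load_overlay(self, ext_addr: int, length: int, on_chip_addr: int) -> None:
        """Copy `length` words from external memory into on-chip memory."""
        if ext_addr < 0 or ext_addr + length > len(self.external):
            raise ValueError("overlay source lies outside external memory")
        if on_chip_addr < 0 or on_chip_addr + length > ON_CHIP_WORDS:
            raise ValueError("overlay does not fit in on-chip memory")
        self.on_chip[on_chip_addr:on_chip_addr + length] = \
            self.external[ext_addr:ext_addr + length]

    def fetch(self, pc: int) -> int:
        """Instruction fetch: an 8 bit program counter can only name on-chip words."""
        return self.on_chip[pc & 0xFF]


core = Core(external=list(range(1000, 2000)))
core.load_overlay(ext_addr=500, length=16, on_chip_addr=64)
print(core.fetch(64), core.fetch(79))
```

The software decides what to load and when, which is the job a hardware cache controller does on other chips.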

>         e. Inter-chip signalling

One control register includes destination (not
too unlike the F21 network coprocessor) but only
has 4 bits for the 4 other processors in a row
or column.  This means it can address one CPU
or any combination in a row or column in a
multicast.  The data is passed from a data
register (address) to a data register on
another chip and when that chip reads the
register an ack signal appears in other
control registers.  If a chip attempts to
write again before all recipients have
read the last message the write is blocked
by hardware.  So each CPU has 4 (2 hor and
2 vert, 2 direction and control and 2 data)
registers for multiprocessor communication.
There are also bits that allow a message
to be forwarded from a horizontal to
vertical bus, but I don't have exact

Other than that hardware write-block
synchronization the rest of the message
management and bus arbitration must be
done in software.  So many approaches
are possible.  I would favor one that
supports both random addressing mode
and parallel broadcast modes.
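
The multicast and write-block behaviour described above can be modelled roughly as follows. This is a sketch under my own assumptions: the class and method names are invented, and a blocked write is shown as a `False` return rather than a hardware stall.

```python
class BusPort:
    """One sender's port onto a row (or column) bus with four possible recipients."""

    def __init__(self):
        self.data = None
        self.pending = 0  # bit i set: recipient i has not yet read the current word

    def write(self, dest_mask: int, word: int) -> bool:
        """Send `word` to every recipient whose bit is set in the 4 bit `dest_mask`.

        Returns False (write blocked) while an earlier word is still unread.
        """
        if not 0 < dest_mask <= 0b1111:
            raise ValueError("destination mask must select 1 to 4 recipients")
        if self.pending:
            return False
        self.data = word
        self.pending = dest_mask
        return True

    def read(self, recipient: int):
        """Recipient 0..3 reads the word; clearing its pending bit acts as the ack."""
        bit = 1 << recipient
        if not self.pending & bit:
            return None  # nothing addressed to this recipient
        self.pending &= ~bit
        return self.data


port = BusPort()
first = port.write(0b0101, 42)   # multicast to recipients 0 and 2
blocked = port.write(0b0001, 7)  # refused: the first word is still unread
got0 = port.read(0)
got2 = port.read(2)
second = port.write(0b0001, 7)   # accepted once both recipients have read
print(first, blocked, got0, got2, second)
```

Everything beyond this write-block rule (who may send when, and how replies are routed) is left to software in the model, as it is in the description above.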

>         f. Pin-out

Most likely some ROMs will be programmed to
make it assume a superset of the reverse
of an 18-bit fast-SRAM.  The first test
chip will include unique ROM code on each
of the core CPU in the hopes that some
of them will mostly work on the first try.

> 2. Code up a simulation

Of course the first colorForth application was
OKAD II which includes Chuck's proprietary hardware
simulator.  It simulates the physical properties
at every junction on a chip at 1ps simulated intervals.

The various MISC chip simulators could be modified
fairly easily to convert to c18, but software only
simulators are actually very easy.  Simulators and
emulators are very nice and as the chip evolves
it will be nice to have access to simulators that
can be used to develop real ROMable code that 
will actually work.  We have been doing that for
over a decade on the project.

>         a. Accuracy versus planned CPU

The first 25x chip will include test circuits to fine tune
the CAD system to the actual .18u fabrication process
being used.  It is some of Chuck's proprietary code.

Once there is real hardware it is pretty trivial to
fine tune software simulators to also have any 
unexpected "features" that actual chips have.  But
testing partially working chips is something that
is somewhat unusual.

>         b. Tools (cross compiler, etc.)

There are dozens of P21, v21, f21, p8, p16, p24, p32,
p64 etc. tools that other people have done in various
languages and operating systems that are available and
dozens that are not.  They are pretty easy to do, at
least the idea of MISC chips with only 30 or so
instructions is easy, but the multiprocessing part
and programmable hardware/software memory and I/O
interfaces do make it a more difficult problem
unless you focus on some subset of the potential

>         c. OS

That seems pretty wide open.  Chuck mentioned that
to really "use" the chip as a computer it would need
all sorts of drivers etc. that are not really important
enough to him to write them for what he will do.  

I do not think people can expect Chuck to provide any
OS in a conventional sense since he thinks that "OS is
a dirty word."   But it is a chance for anyone who 
wants to get involved and provide value added software
to provide tools for the OS that they feel is 

>         d. Word set for operating the CPU "farm"

Again I think that is a very individual thing.  I like
F*F as I see the x family as an extension of the idea
of farming Forth chip nodes, which I proposed to Chuck in 91 and published
a couple of papers at Forml about.  F*F evolved from
Forth-Linda and Occam and I think it is the simplest
approach because it requires the smallest and simplest
set of words.  But there are many approaches to how
one wants to use a CPU farm.  However most of these
approaches are tuned for OS like Unix and are not
really a good match to something so tiny.

I like the idea that only a few words are needed to
make simple machineForth parallel on this type of
architecture.  That is how the architecture evolved.

>         e. Iterate back to the CPU design as indicated by experience

Chuck has been looking for people to do things like help
with chip testing, documentation, feedback etc.  It has
been a decade with a handful of responses.

>         f. Finish with an instruction set verifier

Test software is always helpful since it is usually
needed along the way.

> 3. Spec out an eval board for the CPU
>         a. 25x interface
>         b. Supporting peripherals
>         c. Bootstrap
>         d. PCB layout

Good idea.

> Note that all of this (except, arguably OS and 
> its supporting "farm" control functions) is 
> orthogonal to the target market.  But with something really
> new like this, I don't believe you *can* target a market 
> (read "The Innovator's Dilemma" for more on this).  My own 
> inclination is to start with code breaking applications; my 
> intuition is that the per-CPU memory size is
> too thin to do a good job here, but that's what the 
> simulation feedback to CPU design is all about.  

This is what I have been recommending for the last dozen
years.  This is how the instruction set and architectures
evolved to some of the places that they have gone.  It is
much easier to see whether it matches your ideas, and modify
your ideas or the design before it is completed than
it is to get hardware (custom vlsi anyway) working.

> Sorry for carrying on in my initial message.  I hope 
> others are thinking along similar lines.
> Regards,
> Andy Valencia
> vandys@xxxxxxxxx

It's nice to see some traffic in [MachineForth] and
interest in details of Chuck's hardware.  There has
been significant interest in colorForth, but mostly
by people who can get it and get it to run.  I
have not been able to run it on the machines I
have available and am more focused on software
for actual MISC hardware.  I suspect that Chuck's
providing chip tools as applications for public 
consumption might generate some interest, but
I suspect that the PC colorForth crowd and the
MachineForth and the NOSC groups see the topics
very differently than I do.

I think the ideas only really fit together when you
consider the chips, the chip tools, the software,
and software tools, and that brings you to Chuck's
colorForth.  But if you come into colorForth from
the regular PC hardware/software world, even if
you have a traditional Forth background the
whole thing must look pretty strange.

At Forth Day Wil Baden stated that Chuck's colorForth
source was the worst code that he had seen in twenty-
five years.  I said I thought it looked nicer than
when it had only 1/4 as much on the screen and, rather
than left justifying the red words, Chuck had first
felt that that was redundant and had all word definitions
packed end-to-end.  Wil laughed and agreed that the
new code was not the worst that he had seen, he
thought that the earlier colorForth was even worse.

But as I say, you really have to throw out a lot of
ideas to see how the ideas that Chuck is using fit
together, what his vision of his chips and his 
software is.  However if anyone else can fit the
chips or Chuck's software ideas into their vision
and make it work, more power to em!

I think you asked many interesting questions.  Thanks
for the invitation to talk about interesting subjects
like how Chuck's current designs are supposed to work.
After all, c18 and 25x are not hardware yet. ;-)
