
[NOSC] Chuck Moore website and new Forth chips

Eric Laforest wrote:
> Code is loaded into the on-chip 512w memory for execution...
> This seems to imply that the internal and external RAM are in a flat,
> contiguous memory space.

The X18 chip and 25x chip are a little different in that
X18 has a pinout that is a superset of a mirror image
of a 512Kx18 cache SRAM.  25x is a superset of that.

So X18 has access to >1MB of external 4ns memory and
access to internal 1ns DRAM and ROM.

25x adds 24 X18 cores, but without their external SRAM
connection pins.  So they have limited memory.  They
have register to register or memory to memory
(I don't know which) communication links in rows and
columns for a total of 180Gbps internal communication
bandwidth or something like that.
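
As a rough picture of the row and column links, here is a minimal sketch
that treats the 25 cores as a 5x5 grid in which each core is linked only
to its nearest neighbors.  The nearest-neighbor wiring, the (row, col)
coordinates and the function names are assumptions made for illustration;
the only thing stated above is that the links run in rows and columns.

```python
def mesh_neighbors(row, col, rows=5, cols=5):
    """Return the (row, col) coordinates of the cores adjacent to one core.

    Assumes a plain nearest-neighbor grid with no wraparound at the edges.
    """
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]


def count_links(rows=5, cols=5):
    """Count distinct links: the horizontal ones in every row plus the
    vertical ones in every column, each link counted once."""
    return rows * (cols - 1) + cols * (rows - 1)
```

Under those assumptions a corner core has two neighbors, an interior core
has four, and count_links() gives 40 links for the 5x5 block.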

Processors on the outside of the block have connections
to I/O pins that can be programmed to do digital or
analog I/O and whatever protocols will fit.  Chuck's
ideas about software drivers for I/O are part of his
idea of Forth.  You give up some speed in exchange
for a wider range of I/O capability than with 
dedicated I/O hardware.  But with a 2400Mip core
most things in the outside world look pretty slow.
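
To make the software driver idea concrete, here is a small sketch of
bit-banging a serial byte in software instead of using a dedicated UART.
The choice of 8N1 serial is only my example of a protocol that "will fit",
and set_pin and wait_one_bit_time are hypothetical callbacks standing in
for whatever the real pin and timing access would be.  Python is used just
to simulate the idea; it is not how code for these chips would be written.

```python
def uart_frame(byte):
    """Return the pin levels for one 8N1 frame:
    a start bit (0), 8 data bits sent LSB first, then a stop bit (1)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in the range 0..255")
    data_bits = [(byte >> i) & 1 for i in range(8)]
    return [0] + data_bits + [1]


def bit_bang(frame, set_pin, wait_one_bit_time):
    """Drive each level onto a pin and hold it for one bit period.

    set_pin and wait_one_bit_time are hypothetical hooks supplied by
    the caller; the loop itself is the whole 'driver'.
    """
    for level in frame:
        set_pin(level)
        wait_one_bit_time()
```

The trade-off is the one described above: the CPU spends its own cycles
on the timing loop, but changing the protocol means changing a few lines
of code rather than the hardware.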

It does not have one pool of memory with a 25-way
bus arbitration unit or anything like that.

I don't know what mechanism Chuck uses to distinguish
internal addresses from external since the chip is
18 bits wide and the external address bus is 18 bits.
I would guess that it uses some paging mechanism
but I couldn't find the information on the site yet.

The site is in progress.  There is nothing there on
the CAD internals yet.  I know Chuck plans to add
a lot of things.

He does not have time to do unpaid support for
hobbyists wanting to play with ColorForth or
his chip designs, but we can collect a list of
things we need, like missing bits of documentation,
in our mail lists and pass the requests on to
Chuck as a group.

Since the instruction set on X18 is basically the
same as F21, with a little more information one
could modify the simulators and emulators for
F21 to do X18 and then later 25X.  Armed with
those tools people could develop real code
suitable for framing in ROM.

The best ideas for donated code routines could
go into the ROM.  Chuck will have some interesting
ideas about what to put into the ROM but he
can be influenced by logic if someone else
does identify code that deserves to be ROMmed.

> I presume one simply copies the words one wants to 
> use right now into internal RAM and then do a 
> subroutine call to it?

The processors are asynchronous and software must
coordinate processes.  On X18 there is 4ns and
1ns memory just as on the F21 prototypes in .8u
there are SRAM and DRAM and ROM busses.  The
cache SRAM chips are not used as cache; there is
no cache controller.  The software running on
the CPU manages things by putting the things
that need to run fast into the faster memory.
Instead of the old external memories that
provided 18ns SRAM and 30ns DRAM access these
chips have 1ns internal DRAM and 4ns external SRAM.

> ...or is the dictionnary simply set up to place 
> the shorter, more oft-called words on the chip?

Dictionary is software.  It does whatever.
One would assume that time critical code and/or
more often used code would be run on chip.
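
Since placement is left to software, one possible policy can be sketched
as a simple greedy choice: rank words by calls per memory word and fill
the on-chip space until it runs out.  This is only an illustration of
"put the hot code in fast memory", not anything Chuck has described; the
function name, the tuple layout and the idea of ranking by call count are
all assumptions.  The 512 word budget comes from the quoted message.

```python
def choose_on_chip(words, budget=512):
    """Pick which words to keep in fast on-chip memory.

    words:  list of (name, size_in_words, call_count) tuples
    budget: on-chip memory available, in words
    Returns (names_placed_on_chip, words_used).
    """
    for name, size, _calls in words:
        if size <= 0:
            raise ValueError("size of %r must be positive" % (name,))
    # Most calls per word of memory first (a greedy heuristic, not optimal).
    ranked = sorted(words, key=lambda w: w[2] / w[1], reverse=True)
    on_chip, used = [], 0
    for name, size, _calls in ranked:
        if used + size <= budget:
            on_chip.append(name)
            used += size
    return on_chip, used
```

Anything that does not make the cut simply stays in the slower external
SRAM and runs from there.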

Remember that Chuck says that most programs fit
into 1K.  He has room for 1.5K inlined Forth
opcodes or .5K calls on chip or some mix of the two.
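
The arithmetic behind those figures can be sketched as follows.  The
512 word size comes from the quoted message; the figure of three opcodes
per word and one call per word is not stated anywhere above, it is only
inferred from 1.5K opcodes versus .5K calls in the same memory, so treat
both constants as assumptions.

```python
ON_CHIP_WORDS = 512     # from the quoted "on-chip 512w memory"
OPCODES_PER_WORD = 3    # assumption: inferred from 1.5K opcodes / .5K words


def opcode_capacity(call_words, total_words=ON_CHIP_WORDS):
    """Inlined opcode slots left after reserving call_words words for calls.

    Assumes each call occupies one whole word and every remaining word
    is packed full of opcodes.
    """
    if not 0 <= call_words <= total_words:
        raise ValueError("call_words must be between 0 and total_words")
    return (total_words - call_words) * OPCODES_PER_WORD
```

With no calls at all this gives 1536 opcode slots (the 1.5K figure), with
512 calls it gives none, and an even split of 256 calls leaves 768 slots.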

Chuck has also said that we could scale the tiles
on F21 down to .18u and modernize the memory busses
if there was interest.  The increased speed and
reduced power make it more scifi.  The 10G
timer could become a 50G timer.  200M analog and
faster bit banged analog and digital I/O etc.

One advantage of .8u on the old prototypes is that 
the chips could be made in the third world on 
"obsolete" fabs for almost nothing.  They have 
quoted prices for wafers that look like what we 
paid for die even in small quantities.  And 500Mips
per node is sufficient for many problems.  The
nodes are also bigger than 25x nodes but require
external memory.  I guess Chuck could fit 20K words
of memory onto an F21 in .18u without the die
becoming too large.

25x was Chuck's first multiprocessor to have
multiple CPUs instead of a CPU and multiple I/O
coprocessors.  He picked 5x5 to keep the chip
tiny and cheap.  If someone wanted to pay for
a big one with bigger node clusters that could
also be done.  A large wafer could hold thousands
of 2400 MIP X18 cores.

These designs are well suited to a class of
computationally challenging problems.  The problems
are real and attacking them today is very expensive.
People are using things like machines with thousands
of Pentium chips.  If you do a MIP/$ or MIP/W
comparison you can see the idea.  They are not
designed to run sluggish bloated popular software.
