
[NOSC] Chuck Moore website and new Forth chips


Mark Sandford wrote:
> The space people (NASA and satellite builders) should
> also be interested, as they can have redundant
> processors and just switch to another when a gamma ray
> takes out one processor.

iTV came from NASA.  iTV did radiation testing on
i21 and found it to be extremely resistant to 
ionizing radiation despite not being designed with
rad-hard rules.  They also worked with the Air Force
on processors for spacecraft.
 
> I would suggest that this be retargeted somewhat, as 25
> processors seems a little overkill; 16 or 9 (assuming
> you like squared numbers) seems more reasonable, and
> the SRAM at 4ns (250 MHz before timing margins
> on-chip) would need to get shared between 25
> processors.  Assuming they are doing similar things
> this leaves only an effective 10MHz per processor
> while they are running at 2400MHz, so unless the
> application is heavily, heavily inner loops they will
> spend a great amount of time twiddling their thumbs
> awaiting their turn on the bus.  

Of course.  The same thing applies to workstation farms.
All problems have a balance between node processing
and node communication.  The design was not created
for problems that are essentially serial and are
limited by communication bandwidth or serial processing.

Instead this design is for computationally intense
problems that can use 60,000 MIPS per $1 cluster chip
and not for software or problems that would limit
it to 250MIPS.  A single X18 is capable of 2400MIPS
so why limit 25 of them to a total of 250MIPS?
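
The arithmetic behind these figures can be checked with a quick
sketch.  The numbers are the ones quoted in this thread (4ns SRAM,
25 processors, 2400MIPS per X18).  The function names are made up
for illustration, and the model assumes all cores share a single
external memory evenly, which is a simplification.

```python
# Back-of-envelope figures from the thread; names are illustrative only.

def shared_bus_rate_mhz(cycle_ns: float, cores: int) -> float:
    """External memory accesses per core (MHz) if one SRAM is shared evenly."""
    bus_mhz = 1000.0 / cycle_ns      # a 4 ns cycle is 250 MHz
    return bus_mhz / cores

def peak_internal_mips(mips_per_core: int, cores: int) -> int:
    """Aggregate peak when every core runs from its own on-chip memory."""
    return mips_per_core * cores

if __name__ == "__main__":
    print(shared_bus_rate_mhz(4.0, 25))   # 10.0
    print(peak_internal_mips(2400, 25))   # 60000
```

The gap between the two results is the point: work that stays in
on-chip memory sees the 60,000 figure, work that waits on the
shared bus sees the 10MHz one.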

The proper model for F21 or 25X is a workstation
farm, but without the hardware and software overhead
needed to put C or Unix on each node.  A very small,
very cheap, Forth workstation farm.

> Even running solid
> multiplies at 125M this still leaves a large margin
> for data transfers.  So firstly I'd trim down the
> number of processors and might suggest looking at
> pairing the processor 

Like P21, F21, i21, and others the X18 design was
picked to reduce the prototyping cost and get a
chip with pins that fit the prototyping constraints.
So if someone has their own fab line and is not
restricted by such constraints and is also not
concerned with budget constraints the number of
processors per die is completely variable, from
1 to thousands.  There is interest in thousands
of processors per chip.

The width is also variable from 5 bits to whatever.  
Chuck's designs are in columns so scaling the
width is mostly trivial.  Chuck said that making
a P32 from a P21 was about a day's work in OKAD.

But the + and +* instructions timing is proportional 
to bus width, so those opcodes would be slower with a 
wider bus.  Also the pin count and costs go up.  Pins 
are more expensive than silicon in high volume.  That 
is why a 60,000 MIPS 25x can cost about the same thing
as a 2400 MIPS X18.
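
One way to picture why add time grows with width is a ripple-carry
model, where a carry may have to pass through one stage per bit.
This is only a sketch under that assumption; nothing in this thread
describes the actual adder circuit, and the function name is
hypothetical.

```python
def ripple_add(a: int, b: int, width: int):
    """Add two width-bit numbers bit by bit.

    Returns (sum modulo 2**width, longest run of consecutive carries).
    Assumes a plain ripple carry, not any particular chip's adder.
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    carry = 0
    result = 0
    chain = longest = 0
    for i in range(width):
        x = (a >> i) & 1
        y = (b >> i) & 1
        result |= (x ^ y ^ carry) << i           # sum bit for this stage
        carry = (x & y) | (carry & (x ^ y))      # carry out of this stage
        chain = chain + 1 if carry else 0
        longest = max(longest, chain)
    return result, longest
```

Adding 1 to an all-ones word is the worst case: the carry runs
through every stage, so the longest chain is 18 for an 18-bit word
and 32 for a 32-bit word.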

> with a x36 chip instead of the
> x18 to get two "18 bit words" per cycle and
> effectively running the memory at 500MHz x18.

It could be done, and still get 2400MIPS from
the internal memories.  Larger amounts of
internal memory could be put on larger more
expensive chips if prototyping costs are not
an issue.  But these have not been billion
dollar type funding projects so far so 
things have been kept small to make it possible.

> Another area that I might suggest a change is the
> memory per processor 384 words might be 1K words if
> the number of processors is trimmed down to 16 or 9 so
> you would be more likely to run without needing to
> load or store data as frequently.

True.  Have you a particular application in mind where
you have determined that twice as much on chip
memory is needed?  I spent years doing that sort of
thing to tweak F21 before it was fabbed.  

I suggest that anyone with a particular idea simulate
it extensively to be able to tweak the design to
do what you really find best suited to your needs.
Chuck is in the custom silicon business.  He can
make it work in many ways depending on what the client
wants.  It is a little like picking items from a
menu.  Chuck would love to make many custom versions.
But he would also really like to make a production
run and get some chips into some product somewhere.
It is sort of a key element that hasn't happened.

> I might be interested in contributing to such an
> effort, I would need to know more about Chuck's
> experience and how likely the first try is likely to
> work (Murphy's Law and all).  I bought one of the
> original P21 chips and I believe that those didn't
> function until the 8th run so this is never a slam
> dunk especially if 0.18 and TSMC are new to his
> techniques.

It did take 8 tries to get P21 completely working.
It had the thermal bug like all conventional chips
but at only 100MHz in 1.2u Chuck didn't bump into
it and didn't find it.  When he scaled down to .8u
and went to 500MHz he discovered a bug in the
transistor model.  There were almost thirty
prototypes made at iTV and four by UltraTechnology.
The modeling in OKAD got closer and closer to
what the fabs actually produced.

The problem was that no one could say what the fabs
would produce.  No one knew.  People would just
repeat the mantra that it is just too complex to understand
so you just have to trust in your half million dollar
CAD software and accept that if it only tries to get
within 1/10th of the potential speed in a given 
process that things will most likely work.  The problem
there is that that software can add so much complexity
to the design, and do such poor routing that even a
90% margin of error may not be enough.  So ultimately
it is a trial and error process whether you use
the half million dollar tools and aim for 10% or if
you try to actually understand what the fab process
will really produce and fine tune your own CAD
software to match it.

Even the people who wrote the half million dollar
CAD software would usually just say that they 
only wrote 1% of it and only really understood 1% 
of it the way Chuck needed to understand 100%.
Also by only aiming at 10% of the potential they
could live with a very fuzzy idea of what the
fabs would actually produce.  Remember that
it took hundreds of millions of man years of
testing to find some Pentium bugs.

I was convinced that everything was working in the
last .8u prototypes, the way OKAD predicted, but not 
all the bug fixes got put into the last designs prototyped.  
The last prototypes were made in 1998 then all funding was
gone.  The move to .18u will require prototyping
to get things fine tuned.  I doubt if things will
work 100% the first time.  Not very likely.  But most 
of the problems with CAD have been worked out over 
the last decade.  The only proof will be chips that
do exactly what OKAD predicts they will do.

The only one of the chips that worked 100% on the
first try was ShBoom.  It was a funny story.  Chuck
laid out the design, the software routed it and Chuck
said, "This could never run.  The software is brain
damaged and doesn't understand which circuits are
critical for timing.  It must just create a list and
go through it.  Look at this trace, it is the most
important trace on the design but must have been one
of the last ones routed because it goes all over the
chip to get from here to here.  It needs to be shorter
and straight.  The only solution is to lay out all
the components by hand and hand wire all the sections
together." 

The engineers at OKI thought Chuck was nuts.  They
said that his solution was impossible and simply could
not ever work.  But Chuck did it, it worked 100% on
the first try.  Chuck decided from that experience that
he needed his own tools that did what he needed.  I
think his explanations at his site of why his CAD tools 
work the way they do are very well written.  

The design was stolen from Chuck and eventually found
its way to Patriot Scientific where they spent a
decade making changes and trying to get them to work.

Even with OKAD as evolved as it is, I would expect
that more than one prototype fab would be needed to
get things working 100%.  Also testing every transistor
and combination of instructions etc. is a very
involved process.  That is the sort of thing that
Chuck expected from the client and owner of a chip.

I always thought that it made for an unusual programming
challenge when you have no idea what will work or is
working.  You can't count on anything and have to
start almost from scratch each time unless it just
happens to work with a subset of the problems you saw
with the last suite of diagnostic software.

I thought it would be a fun programming challenge for
Forth day.  Here is a simulated processor.  Here is
the instruction set that it is designed to support.
Write a program to do such and such to get X points.
You get X extra credit points for each bug where
the simulated processor that we give you does not
do what this documentation says.  You also get 
points for a work around for each bug that you
identify. You also get points for correctly
documenting the details of the hardware bugs that
you find.  The person with the most points at
the end of the lunch hour wins.
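
A contest like this amounts to differential testing: run the same
program against the documented behavior and against the simulator
that was handed out, and report every disagreement.  The sketch
below uses a made-up three-opcode stack machine with one planted
bug.  The opcodes, the bug, and all names are hypothetical and are
not taken from any real Forth chip or simulator.

```python
import itertools

MASK = 0xFF  # hypothetical 8-bit cells

def run(program, buggy=False):
    """Execute a list of (op, arg) pairs on a tiny stack machine."""
    stack = []
    for op, arg in program:
        if op == "lit":
            stack.append(arg & MASK)
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            total = a + b
            # Planted bug: the buggy simulator drops the carry into bit 4.
            if buggy and (a & 0x0F) + (b & 0x0F) > 0x0F:
                total -= 0x10
            stack.append(total & MASK)
        elif op == "dup":
            stack.append(stack[-1])
    return stack

def find_bugs(values):
    """Return operand pairs where the buggy simulator disagrees with the spec."""
    bad = []
    for a, b in itertools.product(values, repeat=2):
        prog = [("lit", a), ("lit", b), ("add", None)]
        if run(prog) != run(prog, buggy=True):
            bad.append((a, b))
    return bad
```

Calling find_bugs([1, 2]) finds nothing, since no carry crosses
bit 4, while find_bugs([1, 8, 15]) flags every pair whose low
nibbles overflow.  Choosing test values that reach the bug is most
of the work, which is what makes the contest hard.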

This sort of programming is very different than
what most people do.  You can't trust the chip hardware,
you can't trust the board hardware, you can't trust
the compiler, and the first few dozen things that
you try simply may not work at all.  So you can't
easily find software bugs by observing that your
program didn't work.  The problem becomes nearly
impossible if you introduce your own software bugs
on top of external hardware bugs that are seen
at a board level due to signal glitches or from
internal hardware bugs in the instruction set
or registers.  And the bugs that only appear
once in every few billion executions of a given
sequence of otherwise proper code are very very 
hard to find.  Most programmers have a hard time
finding their own bugs when given solid chips,
solid boards, solid operating systems, and solid
compilers.  The bare metal programming of 
prototype chips is tricky business.