Re: Scott McLoughlin, would you comment?


On Mon, 29 May 1995, fire-l wrote:

> From: "Rick Hohensee" <hohenzay@tmn.com>
> To: fire-l@artopro.mlnet.com
> Date: Mon, 29 May 1995 11:40:02 -0400 (EDT)
> Subject: Scott McLoughlin, would you comment?
>  
> The MuP21 isn't a Forth engine, really.   Fetch   fetches a datum from
> the address in the 'address register', not the top of the data stack.
> This is similar to Bana's address stack, methinks. However, why didn't
> Chuck make said register a stack? 

Why didn't Chuck make the A-register reside on the data stack?

Mainly for the sake of A@+ and A!+ , I guess. It is very convenient (and
spares a few stack jugglings) to be able to walk through an array without
consuming the pointer. A wild guess of mine is that this also simplifies
the logic to a degree, since only the A and the PC registers communicate
with the addressing part of the memory coprocessor.
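
To make the "without consuming the pointer" point concrete, here is a
rough sketch in Python. It is a toy model, not the real MuP21: the class
and method names are made up for illustration, and I assume that A@+
pushes the cell at the address in A and then increments A.

```python
class AddressRegisterMachine:
    """Toy model (hypothetical): a data stack plus one address register A."""

    def __init__(self, memory):
        self.memory = list(memory)
        self.stack = []
        self.a = 0

    def a_store(self):
        """Models A! : pop the top of the data stack into A."""
        self.a = self.stack.pop()

    def fetch_plus(self):
        """Models A@+ : push memory[A], then increment A."""
        self.stack.append(self.memory[self.a])
        self.a += 1

    def add(self):
        """Models + : replace the top two stack items with their sum."""
        y = self.stack.pop()
        x = self.stack.pop()
        self.stack.append(x + y)


def sum_array(memory, start, count):
    """Sum `count` cells beginning at `start`, walking the array through A."""
    m = AddressRegisterMachine(memory)
    m.stack.append(start)
    m.a_store()            # the pointer leaves the stack once, up front
    m.stack.append(0)      # running total
    for _ in range(count):
        m.fetch_plus()     # no DUP/SWAP needed to keep the pointer alive
        m.add()
    return m.stack.pop()
```

The loop body is two operations. With the pointer on the data stack, each
step would also need to duplicate and increment it by hand.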

Why didn't Chuck make a stack of A-registers?

It is very convenient for some algorithms to have two pointers, rather
than one. At the moment, MuP21 does not specifically facilitate these
algorithms. 

There are several approaches. The one taken in F21 is for R to be a second
address register, thus adding the instructions R@+ and R!+ . This
approach is very efficient when one has to deal with two-pointer
algorithms like scalar multiplication of vectors, convolutions, and
matrix row operations.
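
As an illustration of such a two-pointer algorithm, here is a small
Python sketch of a scalar (dot) product. The variables a and r stand in
for the A and R registers, and each fetch is assumed to post-increment
its register; the function name is mine, not anything from the chip.

```python
def dot_product(memory, a, r, count):
    """Dot product of two vectors of length `count` starting at a and r."""
    acc = 0
    for _ in range(count):
        x = memory[a]      # models A@+ : fetch through A ...
        a += 1             # ... then increment A
        y = memory[r]      # models R@+ : fetch through R ...
        r += 1             # ... then increment R
        acc += x * y
    return acc
```

Neither pointer ever has to be shuffled on the data stack, which is the
whole attraction of the second address register.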

Another approach is to have an A-stack and have APUSH, APOP and ADROP. If the 
stack is not a ring, one must have 2n-1 positions to APOP to the n-th 
pointer on the stack, and APUSH back to the first one. 

It is a bit easier when we have an A-ring, but once again, access
seems to be a problem. Suppose you have a ring of 8 addresses, and you
need 5 for your algorithm. Once a step is finished and you are at the
5th position, you need either 4 APUSHes or 4 APOPs to access the first
element again.
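
The arithmetic can be checked with a few lines of Python. This is only a
counting sketch: it assumes each ring operation moves the ring by one
position, and the function name is made up.

```python
def steps_to_first(ring_size, position):
    """Rotations needed to get from a 1-based position back to position 1.

    Returns (forward, backward): the count going on around the ring, and
    the count going back the way we came.
    """
    backward = position - 1
    forward = (ring_size - backward) % ring_size
    return forward, backward
```

For the ring of 8 with 5 pointers in use, steps_to_first(8, 5) gives 4 in
either direction, so neither direction saves anything.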

Shorter rings would be more efficient -- e.g. a ring of three or four. But 
then, this is so close to 2 that it is not clear whether the additional 
one or two pointers can justify the additional APOPs (or AROTs :-) 
needed to access them.

In all cases, if one has a specific application in mind, and is ready to
produce a 10-100K lot, why not design it the best way for the _specific
algorithms_ the chip will be running?


--
Penio Penev <Penev@venezia.Rockefeller.edu> 1-212-327-7423