
Re: Hobbit again


> Date: Sun, 24 Mar 1996 12:27:37 +0100
> From: Jaap van Ganswijk <ganswijk@xs4all.nl>
> >> We must learn to live with the fact that CPUs are complicated
> >> if we want them to do complicated tasks (quickly).
> >
> >Erm, no.  This is precisely the opposite of reality.  Interlocks and
> >MUXes within CPUs cause delays: the way to make a CPU go faster is to
> >reduce the number of such things.  The RISC lesson is that if you want
> >a fast computer you make it as simple as possible.
> 
> RISC is of course a very interesting approach, but it wastes memory
> (and disk space and program load time) and, because of its
> register structure, can't handle a variable number of arguments correctly, etc.

RISC instructions are often a bit less dense than CISC.  RISC
processors handle procedures with a variable number of arguments
perfectly correctly.  That is: the code does the right thing, and it
does it quickly.
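
For illustration, here is a minimal sketch of a variadic C function
(the name sum_ints is hypothetical, not from any particular library).
The same source compiles unchanged for RISC and CISC targets: the
calling convention decides which arguments travel in registers and
which on the stack, and the va_arg machinery hides that from the
programmer.

```c
#include <stdarg.h>

/* Sum `count` int arguments that follow the count itself. */
static int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);              /* start walking the unnamed arguments */
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);     /* fetch the next int argument */
    va_end(ap);
    return total;
}
```

A call such as sum_ints(3, 1, 2, 3) adds up the three trailing
arguments, wherever the ABI happened to place them.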

> I feel however, that you approach the problem too much from the
> hardware angle.
> Writing a compiler is very difficult, especially when too many axes of
> freedom are present and especially when some problems must be solved
> by a repetitive process. You can easily solve one such problem,
> but most architectures require several of these optimizations, which
> are interdependent.

Compiler writing for RISC architectures is to a large extent a problem
which has already been solved.  I use such a compiler every day: the
code it generates isn't perfectly optimal, but it is correct.  (In
fact that's not quite true: in three years of programming I have hit
one optimizer bug which generated incorrect code.)

> Hardware designers often create a lot of problems, which they say are
> each easily solvable, but the complexity of all these problems together
> may be too much to solve (at least in acceptable compiler time).
> 
> Example:
> When you have a small number of registers like the 68000 has (8),
> you may figure out how many registers a function needs,
> to calculate all of its expressions etc. with all local variables
> in memory. Then you allocate all remaining registers to local
> variables. Now you recalculate the number of temporary registers
> which may have become less, because some of the operands
> are already in a register. This process is repeated...
> When adding the complexity of also having address registers besides
> data registers, things become a lot more difficult.

Yes, but the 68000 is a completely broken CISC architecture.  Alpha is
a much better example...
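
The repeated process described in the quoted example can be sketched
as a small fixed-point loop.  This is a toy model under stated
assumptions, not how any real compiler allocates registers: the
function names are hypothetical, and the rule that every two locals
promoted to registers free one temporary is invented purely to make
the iteration visible.

```c
typedef struct {
    int locals_in_regs;   /* locals that ended up in registers */
    int temps;            /* temporaries still needed for expressions */
    int passes;           /* how many times the estimate was redone */
} Allocation;

/* ASSUMED cost model: two locals in registers save one temporary. */
static int temps_needed(int base_temps, int min_temps, int locals_in_regs)
{
    int t = base_temps - locals_in_regs / 2;
    return t < min_temps ? min_temps : t;
}

/* Repeat "give spare registers to locals, recount temporaries"
   until the number of locals held in registers stops changing. */
static Allocation allocate(int total_regs, int num_locals,
                           int base_temps, int min_temps)
{
    Allocation a = { 0, temps_needed(base_temps, min_temps, 0), 0 };

    for (;;) {
        int spare = total_regs - a.temps;
        if (spare < 0)
            spare = 0;
        int next = spare < num_locals ? spare : num_locals;

        a.passes++;
        if (next == a.locals_in_regs)
            break;                    /* fixed point reached */
        a.locals_in_regs = next;
        a.temps = temps_needed(base_temps, min_temps, next);
    }
    return a;
}
```

With 8 registers, 6 locals, and 5 temporaries needed when every local
lives in memory, this model settles on 5 locals in registers and 3
temporaries.  Splitting the register file into address and data
registers would turn the single count into two interdependent ones.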

> As Hennessy and Patterson suggest there should be a symbiosis
> between hardware development and compiler/software development.
> 
> >Making local variables appear to be in main memory slows down the
> >processor's maximum clock rate and the ability to go superscalar.
> >That's what's wrong with it.
> 
> I don't see why letting the registers be a cache on memory slows
> things down. It doesn't have to be associative memory.
> Just two bits are needed to indicate if the register is filled and/or changed.

Darn it, I've explained already why making the registers a memory
cache will slow things down.  I'll try again.

If registers are mapped into memory, any instruction that can modify
memory needs another write port into the register array.  That's a
cost.  There also has to be a bypass mechanism on memory reads and
writes.  That's another cost.  And if a processor has multiple
execution units, no further instructions can be issued until an
instruction which writes to memory has completed, because that write
may modify a register after the operands for one of the execution
units have already been issued.
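
The hazard can be shown with a toy model in C.  Everything here is an
assumption made for illustration (the Machine layout, the REG_BASE
address, the function names); it is not a description of Hobbit or of
any real pipeline.  It only shows that when a store can alias a
register, an add whose operands were read before the store landed
computes a different answer from one that waited.

```c
#include <stdint.h>

enum { NUM_REGS = 8, REG_BASE = 0x100, MEM_WORDS = 16 };

typedef struct {
    int32_t regs[NUM_REGS];
    int32_t mem[MEM_WORDS];
} Machine;

/* Nonzero if the address names a memory-mapped register (assumed layout). */
static int aliases_register(uint32_t addr)
{
    return addr >= REG_BASE && addr < REG_BASE + NUM_REGS;
}

/* A store needs its own write path into the register array. */
static void store_word(Machine *m, uint32_t addr, int32_t value)
{
    if (aliases_register(addr))
        m->regs[addr - REG_BASE] = value;
    else
        m->mem[addr % MEM_WORDS] = value;
}

/* Safe ordering: the store completes before the add reads operands. */
static int32_t add_after_store(Machine m, uint32_t addr, int32_t value,
                               int ra, int rb)
{
    store_word(&m, addr, value);
    return m.regs[ra] + m.regs[rb];
}

/* Overlapped ordering: operands were issued before the store landed. */
static int32_t add_issued_early(Machine m, uint32_t addr, int32_t value,
                                int ra, int rb)
{
    int32_t a = m.regs[ra];
    int32_t b = m.regs[rb];

    store_word(&m, addr, value);
    return a + b;                     /* may use a stale operand */
}
```

When the store address aliases one of the add's source registers the
two functions disagree; when it does not, they agree.  The hardware
cannot know which case it is in until the store address is resolved,
which is why issue has to stall.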

> Most of the data in the registers will never go to memory!
> For task switches several register banks can be used. 

You can only have multiple register banks by having additional MUXes
in the datapath, thus slowing the clock rate.  In any case, where are
you going to put these register banks?  You can only get high clock
rates by closely coupling the register bank to the execution units.

> >What do you mean by "optimum"?
> 
> Best solution.
> 
> >Do you mean from a theoretical point of view?
> 
> Yes, that way all system software can be written in C, because
> C programs can reach everything in that one space.
> Lots of problems are automatically solved: When all registers
> are in the memory space, a task switch can be done with a simple
> block move or a cache flush.

Why is it desirable to write everything in C?  In many OSes there is
only a tiny bit of task switching code in assembly: such code is
simple, reliable, and easy to write.  Why fix something that isn't
broken?
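
For comparison, the "simple block move" from the quoted proposal would
look roughly like this if registers really were addressable memory.
This is a toy model in plain C with hypothetical names (RegisterFile,
switch_task); on a real register machine the same step is the short
assembly routine mentioned above.

```c
#include <stdint.h>
#include <string.h>

enum { CTX_REGS = 16 };               /* assumed register count */

typedef struct {
    uint32_t r[CTX_REGS];
} RegisterFile;

/* Save the outgoing task's registers, then load the incoming task's:
   two block moves over a memory-mapped register file. */
static void switch_task(RegisterFile *live, RegisterFile *save_old,
                        const RegisterFile *load_new)
{
    memcpy(save_old, live, sizeof *live);
    memcpy(live, load_new, sizeof *live);
}
```

Either way the switch itself is only a few lines of code.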

> >That's right.  And experience has shown that if you want to execute C
> >quickly you should use a register based RISC CPU.  Using similar
> >process technologies, Hobbit executes C slower than the ARM, for
> >example.  And Hobbit is much bigger.
> 
> Much more time and money has probably been spent on the ARM.

No way.  ARM is a better architecture, that's all.

> This is no proof that a Hobbit-like architecture couldn't have
> saved a lot of programming effort, which is what I am trying to say...

Oh, I never doubted that.  But computer engineering is about doing the
whole job, not just optimizing one aspect of it.  

Andrew.