home .. forth .. misc mail list archive ..

Re: P64 Math



>>On Thu, 31 Aug 1995, Steve Atkins wrote:
>>
>>> >>Readers,
>>
>>> [ which adder architecture ]
>> 
>>> When you hit 32 bits I think it's worthwhile to use some sort of carry
>>> propagate acceleration. Something like a carry skip adder doesn't cost too
>>> much silicon, and will be *much* faster than a ripple adder for wide words.
>>
>>Can you name numbers? At least a qualitative factor. It still scales
>>linearly to bit length, I presume?

Depends how you do it, how many levels of skip you use etc. I don't have the
numbers here (they're at home), but it'll  be sub-linear with operand length.

>>> be essential - either a carry skip adder for medium speeds, a hybrid carry select
>>> adder (such as the one the Alpha uses) for fast speeds. If the word width grew
>>
>>Can you outline their principles, at least roughly? (My numerics 
>>hardware book is a bit outdated :(

I'll stick together a description and comparison (area, speed, complexity) of the
various adders tonight and post it tomorrow.

(Is there anyone around without web access? - this would be easier to do as
 HTML rather than flat ASCII)

[Later] Here are the basic numbers, from a handy paper:

(RCA = ripple carry adder, HPA = hybrid prefix adder)

n = operand width, T = delay, A = area in lamda^2

  n   T(RCA)   T(HPA)  A(RCA)      A(HPA)
8      18       7
16     34       8      200 x 700   350 x 700
32     66       9      200 x 1400  500 x 1400
64     130      10     200 x 2100  650 x 1200
256    514      17
432    866
512             19

>>
>>> beyond 64 bits, or *very* fast addition was the aim then bizarre adder
>>> architectures (Prefix adders, such as Brent-Kunge or Kogge-Stone (sp?)) start
>>> looking like a good idea. They're pretty expensive on silicon, unfortunately.
>>
>>Where do _these_ break down? 128 bit? 512 bit? Do they scale linearly, too?
>>How many transistors does one burn per bit slice?

They don't break down, but physical factors such as the capacitive load of tracks
across the datapath become non-negligable pretty quickly. For anything wider than
128 bit they'd be the first type I'd consider - a log N delay term is nice to have.
I can't recall how big an adder I've tried to model, but 512 bits certainly wouldn't
be a problem.

They have a delay varying as log N, for N bit operands, and an area varying 
roughly in the range of N log N to N^2 depending on the type of prefix adder.



I *like* adders - there are so many possible ways of implementing the one basic
operation. Much more interesting than multiply or Divide-Sqrt.

Cheers,
  Steve
--
-- Steve Atkins -- <atkinss@inmos.co.uk> -- +44 454 611439 --
"At least one good reason for studying multiplication and division is that
 there is an infinite number of ways of performing these operations and hence
 an infinite number of Ph.Ds (or expenses-paid visits to conferences in the
 USA) to be won from inventing new forms of multiplier."
                   -- Alan Clements, The Principles of Computer Hardware, 1986