Re: MISC-d Digest V97 #14


>Date: Tue, 1 Apr 1997 02:05:10 -0500 (EST)
>From: Penio Penev <penev@venezia>
>To: Chen Ting <pting@netcom.com>, MISC
>Subject: 80 MIPS peak?

>The difference is 1 million "nop nop nop nop"-s.
>
>Then we do:
>
>: 50T1 50 FOR T1 NEXT ;
>: 50T2 50 FOR T2 NEXT ;
>
>50T1 executes for 24 s, against 32 s for 50T2:
>
>These are 8G ns for 50M "nop nop nop nop", which is about 160ns per
>instruction -- roughly 125% * 115ns (adjusting for video bandwidth).
>
>This is suspiciously close to 50ns (RAS time) + 25ns (setup time) + 40ns
>(execution time).  I.e., the peak rate of memory access is 9MHz, not 20,
>as indicated.
>
>Am I completely at a loss?

160ns sounds suspiciously like a simple offpage access vs an onpage access.
The difference is about a factor of 3: 50ns vs 150ns.  Using video will slow
the CPU running in DRAM by about 2/3.

Of course you can never get 80 MIPS from the CPU while the video coprocessor
is running.  The video coprocessor takes 150ns for its own offpage access, and
it very possibly forces the next CPU access offpage too, so that access takes
150ns instead of 50ns.  The CPU also misses 40ns of prefetch every 600ns.

The advertised peak of 80 MIPS in DRAM is for unrolled stack access onpage in
DRAM with full memory bandwidth.  If the video is getting 250ns every 600ns,
you are only getting 350ns every 600ns at the CPU.  If you have any nonlinear
code (branching) or data access instructions, you must add time because of
delayed instruction prefetch.  If the access to memory is delayed by a video
access and then slowed to offpage speed, it is like getting 300+ns access at
the CPU.  The use of video on P21 drops the effective speed of the CPU on
linear code in DRAM to about 20 MIPS.  8 sec (8G ns) for 50M "nop nop nop nop"
is 200M instructions / 8 sec, or 25M instructions per second.
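
As a quick check of the arithmetic above, here is a small Python sketch.  The
function and constant names are made up for illustration.  The only inputs are
the figures already quoted in this thread: 50M extra words, 4 opcodes per
word, 8 extra seconds, and 250ns of video in every 600ns.

```python
# Illustrative arithmetic only; all names here are hypothetical.
EXTRA_WORDS = 50_000_000     # 50M extra "nop nop nop nop" words (50T2 vs 50T1)
OPCODES_PER_WORD = 4         # four nops packed into one instruction word
EXTRA_SECONDS = 8            # 32 s - 24 s
VIDEO_NS = 250               # time taken by video in each window
WINDOW_NS = 600              # length of the window


def measured_mips(words, opcodes_per_word, seconds):
    """Millions of opcodes executed per second."""
    return words * opcodes_per_word / seconds / 1e6


def ns_per_word(words, seconds):
    """Average time spent on each instruction word, in ns."""
    return seconds * 1e9 / words


def cpu_share(video_ns, window_ns):
    """Fraction of the memory window left over for the CPU."""
    return (window_ns - video_ns) / window_ns


print(measured_mips(EXTRA_WORDS, OPCODES_PER_WORD, EXTRA_SECONDS))  # 25.0
print(ns_per_word(EXTRA_WORDS, EXTRA_SECONDS))                      # 160.0
print(round(cpu_share(VIDEO_NS, WINDOW_NS), 3))                     # 0.583
```

So the measurement works out to 160ns per word and 25 MIPS, with the CPU
seeing a bit under 60% of the memory time in that example.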

If you want more bandwidth for the CPU, turn off video and let the CPU itself
keep some of DRAM refreshed, do some I/O, and do some timing.

The last time I heard Dr. Ting answer the question of speed he said 66 MIPS,
in regard to MuP21h running in the DRAM on his board.  Somewhere in the
documentation it should say that video could slow things down to about 20 MIPS.

One of the nice things about F21 will be that you don't have to generate
video on every CPU, so most of them can provide almost all the memory
bandwidth to their CPU.  The expanded SRAM space will let one put more code
into fast memory, and the home page branch instructions make calls and jumps
into that space a single word.  So instead of being limited to 20 MIPS in your
DRAM like P21, most F21s will be able to run at 100 MIPS in DRAM and 200 MIPS
in SRAM (I/O processors excepted).

>I think Jeff once mentioned that the next instruction prefetch begins as
>soon as there are no pending memory instructions, therefore the 40 ns of
>"nop nop nop nop" should be entirely overlapped with instruction prefetch,
>be it 50ns + 25ns.

Well, more like 35ns (really for an 80ns part) + 25ns setup, for onpage
timing of 60ns, or 135ns + 25ns, or 160ns, for offpage timing.  If "nop nop
nop nop" is forced offpage by video and you lose prefetch, you get 40ns +
160ns, or 200ns / 4 opcodes, or 20 MIPS.  If you add in the video timing of
160ns to the 200ns for the 4 opcodes you get 360ns, which is 11 MIPS on that
single word of CPU code.  If that happens 1/3 of the time you get
(60ns + 60ns + 360ns) / 12 opcodes, for an average of 160ns / 4 opcodes,
40ns / opcode, or 25 MIPS.
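
The same estimate can be written out as a short Python sketch.  The names are
hypothetical and the timings are just the rough figures from the paragraph
above (35ns or 135ns access, 25ns setup, 40ns to execute four nops, 160ns of
video), so treat it as a back-of-the-envelope model, not a timing spec.

```python
# Rough timing model; all names are illustrative, values come from the text.
ONPAGE_NS = 35 + 25      # onpage access + setup = 60 ns
OFFPAGE_NS = 135 + 25    # offpage access + setup = 160 ns
EXEC_NS = 40             # executing "nop nop nop nop" with no prefetch overlap
VIDEO_NS = 160           # video access charged against the same word


def mips(total_ns, opcodes):
    """Opcodes per microsecond, which is the same number as MIPS."""
    return opcodes * 1000 / total_ns


# One word forced offpage, prefetch lost: 40 + 160 = 200 ns for 4 opcodes.
forced_offpage_ns = EXEC_NS + OFFPAGE_NS

# Same word with the video access added in: 200 + 160 = 360 ns.
with_video_ns = forced_offpage_ns + VIDEO_NS

# One slow word in every three: 60 + 60 + 360 = 480 ns for 12 opcodes.
three_words_ns = ONPAGE_NS + ONPAGE_NS + with_video_ns

print(mips(forced_offpage_ns, 4))        # 20.0
print(round(mips(with_video_ns, 4), 1))  # 11.1
print(mips(three_words_ns, 12))          # 25.0
```

Changing the 1/3 ratio, or the video charge, shows how quickly the average
moves between the 11 MIPS worst case and the onpage rate.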

Jeff
+-----------------------------------------------------+
|   Jeff Fox                                          |
|   jfox@dnai.com              Ultra Technology Inc.  |
|   jeff@itvcorp.com           the iTV Corporation    |
|   http://www.dnai.com/~jfox/                        |
+-----------------------------------------------------+