RE: [colorforth] FS/Forth for DOS: crude performance test


> I think that what makes the "exchange" technique slower is precisely the
> exchanges, because PUSH and POP are to me the fastest way to fetch/store
> memory words. So I'm a bit puzzled by your 17% figure. I'm also surprised by

This was my thinking too, especially since pushes and pops can pair on the
Pentium; however, using PUSH and POP creates a rather large number of
read-after-write pipeline interlocks, which slow the CPU down.

> your XOR example. INC costs only 2 cycles whereas IIRC costs much more.

INC may cost two cycles EACH, but LEA takes one cycle and, if you use your
effective addresses properly, will pair with the preceding instruction.
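
To make the pairing and interlock argument concrete, here is a toy model of an
in-order, dual-issue pipeline, written in Python. It is only a sketch under
loud assumptions: every instruction takes one cycle, flags and memory are
ignored, and two adjacent instructions pair only when the second neither reads
nor writes a register that the first one writes. The cycles() function and the
three instruction lists are hypothetical names made up for this illustration;
none of this reproduces the Pentium's real pairing rules or timings.

```python
# Toy model of an in-order, dual-issue pipeline. ASSUMPTIONS: one cycle per
# instruction, no flags, no memory, no address-generation stalls.

def cycles(program):
    """Count issue cycles; each instruction is (text, reads, writes)."""
    count = 0
    i = 0
    while i < len(program):
        count += 1  # the instruction at position i issues in this cycle
        if i + 1 < len(program):
            _, _, first_writes = program[i]
            _, second_reads, second_writes = program[i + 1]
            # Pair only if the second instruction neither reads (RAW) nor
            # writes (WAW) a register that the first instruction writes.
            if not (first_writes & (second_reads | second_writes)):
                i += 1  # the second instruction issues in the same cycle
        i += 1
    return count

# Back-to-back increments of the same register: each one waits on the last.
dependent = [
    ("inc eax", {"eax"}, {"eax"}),
    ("inc eax", {"eax"}, {"eax"}),
    ("inc ebx", {"ebx"}, {"ebx"}),
    ("inc ebx", {"ebx"}, {"ebx"}),
]

# Same work, reordered so that neighbours touch different registers.
interleaved = [
    ("inc eax", {"eax"}, {"eax"}),
    ("inc ebx", {"ebx"}, {"ebx"}),
    ("inc eax", {"eax"}, {"eax"}),
    ("inc ebx", {"ebx"}, {"ebx"}),
]

# One LEA per register replaces each pair of INCs.
with_lea = [
    ("lea eax, [eax+2]", {"eax"}, {"eax"}),
    ("lea ebx, [ebx+2]", {"ebx"}, {"ebx"}),
]

for name, prog in [("dependent", dependent),
                   ("interleaved", interleaved),
                   ("with_lea", with_lea)]:
    print(name, cycles(prog))
```

In this model the three sequences take 3, 2 and 1 cycles respectively, even
though they all add 2 to each register. One caveat for real code: LEA leaves
the flags alone, whereas INC modifies them, so the two are not always
interchangeable.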

> But my instruction timings are for the 8086. Probably the timings are
> different for higher processors. I read that happens to be the case for

The timings are *very* different for the higher processors.  In fact, because
instructions are dynamically dispatched, and we have super-pipelines and
superscalar execution units, there is no longer a single fixed cycle count per
instruction; what Intel and AMD document now are latencies and throughputs.
How long an instruction really takes depends on the code around it.

Fast code for the 8086 is almost certainly going to be the *slowest* code for
any 80486-class processor, or higher.

--
Samuel A. Falvo II

