RE: [colorforth] FS/Forth for DOS: crude performance test


> I'm trying to send this again, the last time it didn't work...
> Just curious why all this XCHGing is required.

It's because the x86 has only one hardware stack pointer, but Forth requires
two stacks.  To manipulate the data stack with PUSH and POP instructions, the
data stack pointer must be in the SP register.  But when invoking colon
definitions, the return stack pointer must be in SP, because CALL and RET also
work through SP.  XCHG swaps the two pointers whenever the role of SP changes.
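
To make the swapping concrete, here is a toy Python model of a machine with a
single hardware stack pointer.  It is only a sketch, not FS/Forth code: the
class name Machine, the memory layout, and the choice of BP as the spare
register are assumptions made for illustration.

```python
class Machine:
    """Toy CPU: one hardware stack pointer (sp) plus a spare register (bp)."""

    def __init__(self):
        self.mem = [0] * 64
        self.sp = 32  # assumed layout: data stack grows down from 32
        self.bp = 64  # assumed layout: return stack grows down from 64

    def push(self, value):
        # Like the x86 PUSH: always goes through sp.
        self.sp -= 1
        self.mem[self.sp] = value

    def pop(self):
        # Like the x86 POP: always goes through sp.
        value = self.mem[self.sp]
        self.sp += 1
        return value

    def xchg(self):
        # Like XCHG SP,BP: swap which stack sp currently addresses.
        self.sp, self.bp = self.bp, self.sp


m = Machine()
m.push(2)            # sp addresses the data stack here
m.push(3)
m.xchg()             # sp now addresses the return stack
m.push(0x100)        # what a CALL would do: save a return address
m.xchg()             # back to the data stack inside the primitive
a = m.pop()
b = m.pop()
m.push(a + b)        # the body of "+"
m.xchg()             # return stack again, so the return can find its address
ret = m.pop()        # what a RET would do
m.xchg()
result = m.pop()
```

Every switch between "data stack work" and "call/return work" costs one
exchange, which is where the XCHG instructions in the generated code come from.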

> My forth doesn't use a PFA, so maybe it is for that?

Neither does mine.

> It seems FS/Forth and 4IM both are swapping (exchanging) between the
> RSP and PSP (return stack and parameter stack pointers.) Why is this
> necessary? Can't the pointers simply be left in their own dedicated
> register? 

If the SI register (or another register usable for effective addressing) were
used, yes.  But I didn't do this for the DOS version, for two reasons.
One, going through SP with PUSH and POP consumes less opcode space.  The DOS
version of FS/Forth is limited to 64K of combined code and data space (12K of
which is used for administrative stuff like block buffers and the stacks
themselves), and each word reference already consumes three bytes, so
hopefully you can see why this can become rather important.
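
A rough way to see the size trade-off is to add up instruction lengths.  The
byte counts below are the standard 16-bit real-mode encodings; the instruction
sequences themselves are assumptions about what a compiler might emit, not
FS/Forth's actual output, and via_sp and via_si are hypothetical helper names.

```python
# Encoded size in bytes of each instruction in 16-bit real mode.
SIZE = {
    "xchg sp,bp": 2,   # 87 /r
    "push ax": 1,      # 50
    "pop bx": 1,       # 5B
    "dec si": 1,       # 4E
    "inc si": 1,       # 46
    "mov [si],ax": 2,  # 89 04
    "mov bx,[si]": 2,  # 8B 1C
}


def size(seq):
    """Total encoded size of an instruction sequence."""
    return sum(SIZE[ins] for ins in seq)


def via_sp(pushes, pops):
    """Data stack through SP: one XCHG on entry, one on exit (assumed)."""
    return (["xchg sp,bp"] + ["push ax"] * pushes
            + ["pop bx"] * pops + ["xchg sp,bp"])


def via_si(pushes, pops):
    """Data stack through SI: explicit pointer arithmetic on every access."""
    return (["dec si", "dec si", "mov [si],ax"] * pushes
            + ["mov bx,[si]", "inc si", "inc si"] * pops)


one_op_sp = size(via_sp(1, 0))
one_op_si = size(via_si(1, 0))
three_ops_sp = size(via_sp(1, 2))
three_ops_si = size(via_si(1, 2))
```

Under these assumptions a single isolated stack operation is slightly larger
with the XCHG pair, but from two operations per bracket onward the SP version
is smaller, and the gap widens by three bytes per additional operation.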

> The reason I'm asking, is that, in order to learn Forth, I ended up
> writing a simple DOS based "classical" forth -- to get a better idea of
> what was going on internally. I don't recall having to XCHG the 
> two stack pointers. Is this for CREATE ... DOES> ? I haven't figured
> "DOES>" out yet... 

No.  See the beginning of this message for an explanation why.

> Sorry for the off-topic question, but since both the Forths mentioned
> above swap pointers (and I think Eforth does too.) it is making me wonder 
> if I've missed something in my implementation. 

No.  There are a myriad different ways to implement a Forth for the x86
architecture.  Chuck's code, for example, keeps the data stack pointer in the
ESI register (the 32-bit version of SI).

After some reflection, Frederick's suggestion of peephole optimizing out the
XCHG instructions makes a lot of sense.  I had forgotten that the Pentium
series of CPUs *pairs* stack operations with arithmetic operations; hence,
keeping the data stack pointer in ESP could actually be *faster* than using
ESI as the stack pointer.  So the occasional overhead of XCHG EBP,ESP will be
more than offset by the performance gains that simple PUSH/POP instructions
provide, for all but the most pathological cases.
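
One plausible reading of that peephole idea, sketched in Python: when two
primitives are inlined back to back, the XCHG that ends the first and the XCHG
that begins the second cancel out and can both be dropped.  The instruction
strings and the function name peephole are illustrative assumptions, and the
sketch assumes no branch target falls between the two exchanges.

```python
XCHG = "xchg sp,bp"


def peephole(code):
    """Remove adjacent XCHG pairs from a straight-line instruction list."""
    out = []
    for ins in code:
        if ins == XCHG and out and out[-1] == XCHG:
            out.pop()      # two swaps in a row are a no-op: drop both
        else:
            out.append(ins)
    return out


# Two inlined primitives, each bracketed by its own pair of exchanges.
before = [XCHG, "push ax", XCHG, XCHG, "pop bx", XCHG]
after = peephole(before)
```

Here the six-instruction sequence shrinks to four, with a single exchange on
entry and a single exchange on exit.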

--
Samuel A. Falvo II


