16-bit stack machine implemented on a Cypress CY37128 CPLD


I have developed a 16-bit zero-operand stack machine that I call Steamer16.
It fits on the Cypress CY37128 CPLD in an 84-pin PLCC package. Using the
125 MHz speed grade, wirewrapped operation at 20 MHz is predicted by the
simulator.

Unfortunately, a dual-stack Forth architecture doesn't fit in the 128
macrocells available. Consequently the design isn't a true Forth chip, but
it is a zero-operand stack machine nonetheless. In the future I would like
to fit a true Forth architecture to one of the CPLD or FPGA architectures
that include on-chip RAM blocks for the stacks.

Fearing that the design would not actually fit the target device, I
minimized the instruction set and architecture to a ridiculous extent, and
it indeed just barely fits. In the future, more elaborate implementations
may be built on larger devices that are unsuitable for hobby projects
because of their exotic packaging. For this reason, the documentation
contains nerdy phrases typical of growth-path specifications, but don't let
that distract you from understanding the initial Steamer16 implementation
that exists today.

I plan to design a companion chip, also using the CY37128, to provide a
timer, parallel I/O, a funnel shifter, memory decoder/wait state logic, and
glue logic for a 16-bit 3-port multiplier/accumulator.

I think it might be bad netiquette to attach the 40 Kbyte zip file I have
available because of the load on the MISC server. It contains the
assembler, JEDEC file, and side documentation. Interested parties should
e-mail me for a copy. Please withhold any technical questions until you
have read the documentation package.

BTW, I am well aware of the shortcomings of the Steamer16 implementation,
so please don't take me to task over it. My defense is:
1) it fits on a low-cost CPLD in a package hobbyists can deal with
2) companion chips can alleviate some of the shortcomings
3) at 20 MHz, it can clunk through inelegant code sequences quickly

Following is an excerpt from the assembler documentation (STASM.TXT), part
of the zipped package.

Happy New Millennium, MISCers!
Myron Plichota

************************************************************
Programming Model:

  The Steamer architecture consists of a program counter (P) and a 3-deep
  RPN evaluation stack (TOP, 2ND, 3RD). P is cleared on reset. The stack
  registers are undefined until loaded under program control. There is no
  program status word or carry flag. P addresses instruction groups, not
  necessarily individual instructions. Steamer architecture mandates
  operations on natural size words without forbidding other data types.

  Steamer16 implements the Steamer architecture in 16 bits, with no
  enhancements.



Stack diagrams:

  Stack diagrams are used to describe instruction behavior by showing both
  the inputs on the stack and the results in a concise notation. The input
  list is on the left-hand side of the "--" before/after separator, the
  results are on the right-hand side.

  e.g. ( 3RD 2ND TOP -- 3RD 2ND TOP)

  The input list shows the inputs in the order they are entered, left to
  right, and shows only the requisite stack entries.

  The output list shows all three entries. The symbols x, y, and z are
  used to denote the original values of any surviving independent stack
  entries.



Instruction Descriptions: opcodes are in hexadecimal order

  NOP,  {0}     ( -- x y z)             no operation

  lit,  {8}     ( -- y z data) P++      read memory at P, increment P

  @,    {9}     ( addr -- x y data)     read memory at addr

  !,    {A}     ( data addr -- x x x)   write data to memory at addr

  +,    {B}     ( n1 n2 -- x x n1+n2)   add 2ND to TOP

  AND,  {C}     ( n1 n2 -- x x n1&n2)   and 2ND to TOP

  OR,   {D}     ( n1 n2 -- x x n1|n2)   or 2ND to TOP

  XOR,  {E}     ( n1 n2 -- x x n1^n2)   exclusive-or 2ND to TOP

  zgo,  {F}     ( flg addr -- x x x)    if flg equals 0 then jump to addr
                                        else continue

  Notes:
        1) 3RD is sticky. When the stack shrinks it holds its value.
        2) lit, is the only instruction that grows the stack, destroying
           3RD.
        3) The Steamer16 instruction set contains no additions to the
           Steamer required instruction set.
        4) Opcodes {1..7} are implemented as no operation and are not part
           of the Steamer required instruction set.
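
  The stack effects listed above can be checked with a small
  instruction-level model. The following Python sketch is not part of the
  Steamer16 package; the class name Steamer16Model, the dictionary-based
  memory, and the zero-initialised stack registers are assumptions made for
  illustration (the real stack registers are undefined until loaded).
  Quartet packing and fetching are not modelled: each call executes one
  instruction, named by its assembler mnemonic.

```python
import operator

MASK = 0xFFFF  # Steamer16 natural word size is 16 bits

# Two-input ALU instructions: ( n1 n2 -- x x result)
ALU = {"+,": operator.add, "AND,": operator.and_,
       "OR,": operator.or_, "XOR,": operator.xor}


class Steamer16Model:
    """Hypothetical instruction-level model of the programming model."""

    def __init__(self):
        self.mem = {}   # sparse memory: address -> 16-bit word (assumption)
        self.p = 0      # P is cleared on reset
        # Undefined on real hardware; zeroed here for repeatability.
        self.top = self.second = self.third = 0

    def step(self, op):
        t, s, r = self.top, self.second, self.third
        if op == "NOP,":                      # ( -- x y z)
            return
        if op == "lit,":                      # ( -- y z data) P++
            self.third, self.second = s, t    # old 3RD is destroyed
            self.top = self.mem.get(self.p, 0) & MASK
            self.p = (self.p + 1) & MASK
        elif op == "@,":                      # ( addr -- x y data)
            self.top = self.mem.get(t, 0) & MASK
        elif op == "!,":                      # ( data addr -- x x x)
            self.mem[t] = s
            self.top = self.second = r        # 3RD is sticky
        elif op in ALU:                       # ( n1 n2 -- x x result)
            self.top = ALU[op](s, t) & MASK
            self.second = r                   # 3RD is sticky
        elif op == "zgo,":                    # ( flg addr -- x x x)
            if s == 0:
                self.p = t
            self.top = self.second = r        # 3RD is sticky
        else:
            raise ValueError("unknown mnemonic: " + op)


m = Steamer16Model()
m.mem[0], m.mem[1] = 5, 7      # literal data at addresses 0 and 1
for op in ("lit,", "lit,", "+,"):
    m.step(op)
print(m.top)                   # 12
```

  Keeping the whole machine state in four fields makes the "sticky 3RD"
  rule easy to see: every instruction that shrinks the stack simply copies
  3RD downward and leaves 3RD itself unchanged.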



Instruction Timing:

  Steamer16 executes all instructions in 1 clock cycle. A quartet fetch
  cycle is required when the current quartet has finished executing or a
  jump is taken. For sequential execution, quartets are fetched and
  executed in 5 clocks.

  Software delays are deterministic and may be counted from the fetch of
  any quartet.
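
  Because the timing is deterministic, delay arithmetic is simple. The
  Python sketch below works it through for straight-line code. It assumes
  that a quartet holds four instructions and that no jump is taken; the
  function names are hypothetical and only the 5-clock and 20 MHz figures
  come from the text above.

```python
CLOCK_HZ = 20_000_000          # predicted maximum clock frequency
CLOCKS_PER_QUARTET = 5         # sequential fetch plus execution
INSTRUCTIONS_PER_QUARTET = 4   # assumption: a quartet is four instructions


def sequential_time_ns(quartets, clock_hz=CLOCK_HZ):
    """Time in nanoseconds to fetch and execute `quartets` in sequence."""
    return quartets * CLOCKS_PER_QUARTET * 1e9 / clock_hz


def peak_instructions_per_second(clock_hz=CLOCK_HZ):
    """Upper bound on instruction rate for straight-line code."""
    return clock_hz * INSTRUCTIONS_PER_QUARTET // CLOCKS_PER_QUARTET


print(sequential_time_ns(1))             # 250.0
print(peak_instructions_per_second())    # 16000000
```

  A taken jump starts a new quartet fetch before the current quartet is
  exhausted, so real code with branches falls below this upper bound.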

  The adder for the +, instruction is implemented as a cascade of 8 2-bit
  ripple-carry adder cells. Running on a 125 MHz part, the maximum clock
  frequency is 20 MHz for unambiguous results.
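
  The adder structure can be modelled in software to show how the carry
  passes from cell to cell. The Python sketch below is a behavioural
  illustration only, not the CPLD equations; the names add2_cell and
  steamer_add are hypothetical. The carry out of the last cell is
  discarded, in keeping with the absence of a carry flag.

```python
def add2_cell(a, b, carry_in):
    """One 2-bit adder cell: returns (2-bit sum, carry out)."""
    total = a + b + carry_in
    return total & 0b11, total >> 2


def steamer_add(n1, n2):
    """16-bit add built from a cascade of 8 two-bit cells."""
    result, carry = 0, 0
    for cell in range(8):
        shift = 2 * cell
        s, carry = add2_cell((n1 >> shift) & 0b11,
                             (n2 >> shift) & 0b11,
                             carry)
        result |= s << shift
    return result   # final carry is dropped: there is no carry flag


print(hex(steamer_add(0x1234, 0x0FCD)))   # 0x2201
print(steamer_add(0xFFFF, 0x0001))        # 0
```

  The second call is a worst case: the carry generated in the lowest cell
  has to pass through all 8 cells before the result settles.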

  Instruction timing is not mandated in the Steamer architecture.