
No Subject


Dear Misc readers,                                             5/18/95

I figured there would be many questions about the serial/network coprocessor
specs.  I was actually impressed with the way Chuck had covered all the details
on only two pages.  I also figured that it is pretty easy to determine which
sections are Chuck's careful descriptions of spec data and which are my more
lengthy narratives about how it works.

In any event, here are some more narratives and some answers to questions.

The serial/network coprocessor on the F21 is a hardware device designed to
provide hardware support for Direct Memory Access transfers and remote
CPU interrupts in an F21 multiprocessor.  The unit is essentially a serial
shifter with a DMA and CPU interrupt unit.  The only instructions it
recognizes are control bits in the serial/network coprocessor configuration
register, and the Start of Message and End of Message tokens in the serial
bit stream.  This is why I say:

    > In normal operation the serial/network coprocessor will be
    > reading and echoing serial data, but it will not be making
    > any memory access.

In normal operation the unit is continuously scanning the serial bit
stream for the instructions to perform a download from the serial stream
to memory, or stop such a transfer and interrupt the CPU.  It does not
read any instructions from the memory bus to do this.  It only makes use
of the memory bus for DMA transfers.  Thus there is normally no overhead
involved in having it running.  It will simply echo incoming data to its
output.  If it sees a SOM that matches the one in its own SOM register in
the data stream it begins a DMA transfer, and if it sees its own EOM
instruction it stops a transfer and performs a CPU interrupt.  If the CPU
sets the transmit bit in the configuration register it will read from
memory and write to the output of the serial/network coprocessor.
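
The receive behaviour described above can be sketched as a small state
machine.  This is only a toy model in Python, not the hardware design: the
class name, the use of strings such as "SOM-A" in place of real bit patterns,
and the assumption that every incoming token is echoed unchanged (including
payload during a transfer) are all made up for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class SerialCoprocessorModel:
    """Toy model of the receive path; all names here are hypothetical."""
    som: str                                      # this node's Start of Message token
    eom: str                                      # this node's End of Message token
    memory: list = field(default_factory=list)    # words written by DMA
    echoed: list = field(default_factory=list)    # everything passed to the output
    interrupts: int = 0                           # CPU interrupts raised so far
    receiving: bool = False                       # True while a DMA transfer runs

    def step(self, token):
        """Process one incoming token from the serial stream."""
        self.echoed.append(token)          # assumed: input is always echoed
        if not self.receiving:
            if token == self.som:          # our SOM: begin the DMA transfer
                self.receiving = True
        elif token == self.eom:            # our EOM: stop DMA, interrupt the CPU
            self.receiving = False
            self.interrupts += 1
        else:
            self.memory.append(token)      # payload word goes to memory

node = SerialCoprocessorModel(som="SOM-A", eom="EOM-A")
stream = ["x", "SOM-B", "y", "EOM-B", "SOM-A", 1, 2, 3, "EOM-A", "z"]
for token in stream:
    node.step(token)
```

In this run the message framed by SOM-B and EOM-B passes through untouched,
only the words 1, 2, 3 land in node.memory, and a single interrupt is counted.
Memory is touched only while a transfer is active, which is the point made
above about overhead.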

If a transfer is made into memory, it will require a certain number of
memory write operations, but the serial/network coprocessor does not
read instructions from memory, and if it is not performing a DMA to
or from memory it will add no memory access overhead.

The F21 chip has two pins labeled Si and So; these are serial input
and serial output.  The serial/network coprocessor actually uses three
pins since it also shares the CLK signal with the video and analog
coprocessors.  It processes instructions in the serial data stream
and provides DMA and CPU interrupt functions in hardware.

The F21 serial/network coprocessor performs serialization, DMA, and
clocking, and recognizes its instructions, all with its own hardware;
it does not require the use of the CPU.  The serial/network coprocessor
can interrupt the CPU.  This happens after the End of Message pattern
turns off the DMA transfer.  The interaction between the CPU and the
serial/network coprocessor is programmable and determines what
happens on software layers.

The serial/network coprocessor provides all the hardware needed to
create a ring topology network.  But the single input and output
on the serial/network coprocessor does not limit the F21 network,
or even the serial/network coprocessor configuration to a ring.
With an amplifier the serial output of one unit can be fed into
any number of inputs.  With the addition of an OR gate any number
of outputs could be fed into one input.  A ring is one topology.
There is only one input and one output on each F21, and it is
not bi-directional.

Whatever topology is used, software is required for operation of
a network.  Among the operations in the network software
are configuration and initialization of nodes, administration of
the operating network, error detection and correction, and DMA
and interrupt services.  The network software will be layered
to provide more complex protocols and services.

An F21 network is controlled by software and is not limited to the
serial/network coprocessor.  Any node can be a bridge to another
ring of F21s or to an external network.  The on-chip parallel
port provides an easy way to expand an F21 network with bridges
between multiple rings.

The performance of the serial/network coprocessor will be limited by
the quality of the data and clock signals.  With an 11 bit counter
on the input CLK a wide frequency range is available for a given
clock.  Chips are designed to connect the output on one chip to
the input on another chip with wire.  Any medium could be used
to move the signal from one chip to another.  The connections could
be amplified wire, optical fiber, IR, or whatever.  There will be a limit
to the distance between units with a wire-only interconnect, and
it will be related to speed.  Lower speed should work over longer
unamplified wires within a range.

Ultra Technology provides the technical information on the hardware
operation of the serial/network coprocessor for those who want to
program this unit.  Ultra Technology will also provide software to
use the serial/network coprocessors on F21 to support a network.
Low level software routines will be part of the OS code in the boot
ROM.  The initial F21 must boot from slow 8-bit RAM/ROM memory, because the
bits in the Cn serial/network coprocessor configuration register do
not come up at boot with the network live and ready to do DMA and interrupts.

The first network software Ultra Technology will provide will support a
ring.  A master processor will assign addresses to each node and will
arbitrate write control on the ring.  This ring can operate in one of
two modes.

The first mode is the general purpose networked Distributed Shared
Memory mode.  In this DSM mode there will usually be only
one processor writing to the network at any time.  In this mode the
nodes are time sharing write access to the network.  If the number
of nodes is N and if the network is being operated at B bits per
second then each processor will only be able to use B/N bandwidth.
The network software provides collision avoidance, error detection
and correction, atomic write access to DSM, and the ability to
remotely process an execution vector.  One feature that can increase
the efficiency of operation in this mode is the group and unit
SOM bit assignments.  This feature permits a single message with
DMA and CPU interrupt to be broadcast to a group of nodes.

The second mode of operation of the ring is pipeline mode.  In this
mode all of the odd or even nodes in the ring would send at the
same time, but only to the next unit on the ring.  For some applications
it is more efficient to use the extra network bandwidth in this
mode.  Where data is piped between execution processes this mode
can provide greater network throughput.  The Occam-like Parallel
Channel wordset can be implemented in the DSM mode, but should
be able to also take advantage of pipeline mode network operation.
In this mode of operation the total network bandwidth becomes BN/2.
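
To put numbers on the two modes, here is a minimal Python sketch of the
B/N and BN/2 figures given above.  The function names and the example values
(a 10 megabit per second link and 8 nodes) are hypothetical and are not F21
specifications.

```python
def dsm_bandwidth_per_node(bits_per_second, nodes):
    """DSM mode: nodes time-share the ring, so each gets B/N."""
    return bits_per_second / nodes

def pipeline_total_bandwidth(bits_per_second, nodes):
    """Pipeline mode: half the nodes send at once, so the total is B*N/2."""
    return bits_per_second * nodes / 2

B = 10_000_000   # hypothetical link rate in bits per second
N = 8            # hypothetical number of nodes on the ring

per_node_dsm = dsm_bandwidth_per_node(B, N)       # 1,250,000 bits per second
total_pipeline = pipeline_total_bandwidth(B, N)   # 40,000,000 bits per second
```

With these made-up numbers each node gets an eighth of the link in DSM mode,
while pipeline mode carries four links' worth of traffic around the ring at
once, since four of the eight nodes are sending to their neighbours.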

Various hardware and software control layers will be provided in network
software.  At the highest levels of parallel code the network details
are not visible.  The programmer or compiler can specify sections
of code to run in parallel, configured automatically at run time for
dynamic load balancing.

But of course the details of the operation of the hardware are
provided for those who wish to have total control over the run time
hardware.

All that is available now are the hardware details, and they are
still preliminary.  There are few examples and little source code, and
no code running on these networks yet.

Now, about the questions:

	o	what kind of general concept (CSMA/CD <-> token ring <-> ???)

I think I answered that above.

	o	what kind of collision detect / arbitration

Software, see above.

	o	what kind of transmission medium

Wires, or amps, or fiber, or IR, or radio, or whatever.   It is
interesting to note that the design of the serial/network coprocessor
on F21 is basically the same as on the P32.  It is designed to
be able to provide multiple software protocols at up to gigabit rates.

	o	what max. line length (derived from the points above)

Who knows until we try.  Maybe you can tell me how far 5 V at TTL signal
levels will go at various speeds.  We will push for speed later.
Still, if you want a fiber with repeaters, the line length is pretty
long.

Analog processor question:
>Six I/O channels or I/O channels with 6bit resolution?

The chip diagram clearly shows one input and one output from the
analog coprocessor.  The documentation on the coprocessor says it
has six bits input and six bits output.  It stores those in the
top and bottom six bits and puts some control bits like interrupt
in the middle.  You will see Ai and Ao on the pinout.  These are
Analog In and Analog Out.  There is no multiplexor on chip, and
there are not six analog inputs and outputs.

If you want six analog inputs and outputs, use an external multiplexor
or six F21. If you need more than six or eight bits or whatever
of analog input you can always use an external part for a few
dollars more. :-)

    Jeff> Below that you can use oversampling techniques
    Jeff> to get more bits of resolution.  

Robert:
>I don't see how. 
>
>If you use a Sigma-Delta-Converter and you can adjust the digital
>filter parameter, you could adjust resolution by oversampling.
>
>If you have a additional D/A converter, you could use it as an 
>offset/limit generator and enhance resolution that way (a little).

There has been a thread going on on this subject in comp.arch.embedded
for a while now.  It is well known that if you have a sigma-delta
type converter you get smaller changes by sampling faster.  This is
not the way the F21 analog input works.  Instead it uses a ramp and
comparator and increments through 6 bits at up to 14 MHz.
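
As a rough illustration of a ramp and comparator converter in general, here
is a minimal Python sketch.  It is not taken from the F21 design: it assumes
a simple counting ramp that rises one level per step, a comparator that trips
when the ramp passes the input, and one step per clock at the 14 MHz upper
figure.  How the real unit generates its ramp, and whether a conversion
always runs the full sweep, is not stated here.

```python
def ramp_convert(vin, vref=1.0, bits=6):
    """Return (code, steps) for a counting ramp-and-comparator conversion."""
    levels = 1 << bits                      # 64 levels for 6 bits
    for code in range(levels):
        ramp = (code + 1) * vref / levels   # ramp voltage after this step
        if ramp > vin:                      # comparator trips: conversion done
            return code, code + 1
    return levels - 1, levels               # input at or above full scale

STEP_RATE_HZ = 14_000_000                   # assumed: one ramp step per clock

# Time for a full 64-step sweep under the assumptions above.
worst_case_seconds = (1 << 6) / STEP_RATE_HZ

mid_code, mid_steps = ramp_convert(0.5)
```

Under these assumptions a full sweep takes a little under 5 microseconds,
so the 6 bit result is available a couple of hundred thousand times per
second at most.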

However, for some types of analog signals you can still get extended
range resolution by oversampling.  It depends on the signal.  If a
signal is not changing, and is fixed between the lowest and second
to lowest levels in your A/D, you can never get any more resolution.
But if the signal is moving, the more often it is sampled the more
accurate the average or interpolation of the value between the lowest
levels you can sample.
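
The difference between a fixed and a moving signal can be shown with a small
Python sketch.  It assumes an ideal 6 bit quantizer and a made-up input whose
mean sits a quarter of a step above a code boundary; the moving case sweeps
evenly across exactly one step, which is a convenient assumption rather than
a property of any real signal.

```python
LEVELS = 64                      # 6-bit converter
LSB = 1.0 / LEVELS               # size of one step on a 0..1 input range

def quantize(v):
    """Ideal 6-bit quantizer for inputs in [0, 1); returns an integer code."""
    return min(LEVELS - 1, max(0, int(v * LEVELS)))

def estimate(samples):
    """Average many quantized samples, using the midpoint of each code."""
    codes = [quantize(s) for s in samples]
    return (sum(codes) / len(codes) + 0.5) * LSB

true_mean = 10.25 * LSB          # hypothetical value between two codes
n = 256                          # number of samples averaged

# A signal that never moves always lands on the same code.
fixed = [true_mean] * n

# A signal that sweeps evenly over one step around the same mean.
moving = [true_mean + ((k + 0.5) / n - 0.5) * LSB for k in range(n)]

fixed_error = abs(estimate(fixed) - true_mean) / LSB     # error in steps
moving_error = abs(estimate(moving) - true_mean) / LSB   # error in steps
```

Averaging the fixed input never improves on the single reading, which stays
a quarter of a step off, while the moving input spends a quarter of its time
on the lower code and three quarters on the upper one, so its average lands
on the true mean.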

The original spec for the F21 analog coprocessor input called for
a 1 bit delta squared input.  Chuck convinced me to try this design
for a number of reasons.

Units will have to be tested for accuracy, linearity, speed,
and noise immunity performance.  It is quite early to say how they
will perform.  Much of this is new territory.

As Chuck said in his last presentation to SVFIG, this (F21) is much
more analog than P21.  P21 had only one analog out; this will
have multiple analog I/O coprocessors.

Eugene.
>The only thing I didn't quite understand: how many links are
>there? Can one patch through through every link to every link?
>Did you actually integrate a crossbar switch on the die?
>Or does each transfer virtually block the processor?

You get one serial/network coprocessor per F21 chip.  Each has one
input and one output.  The parallel port can be used for bridges in
the network topology.  The upper limit that will be practical for
different things is not clear yet.

Yes, on a ring any node can send to any node with a little software.
There is no crossbar switch, but each node only receives messages
sent to it.  All other messages just fly around the ring until they
are stopped.  While messages pass by they block the use of the ring
for anything else, but they do not block the memory or CPU.

I understand that there are problems where DSM is not suitable because
almost all memory must be global, and there are systems where the message
overhead will be a bottleneck if you don't have 100 megabytes per node.
F21 is not designed for all applications or all parallel applications.

Jeff Fox