
Re: multi/tasking/processing


On Wed, 16 Aug 1995, Christophe Lavarenne wrote:

> I agree with Penio Penev that:
> > one is much better off with total control over all of the silicon ...
> 
> But his following argument is worth a comment:
> > On the other hand, big states (register files) are not good for
> > multitasking, since _all of them_ need to be swapped to/from memory
> > _every_ context switch.  (Don't tell me, that having _only_ two tasks is
> > adequate for a multitasking machine.)

I was not proposing multiple register files or stacks, only two
sets, and not even symmetrical ones (the OS stack being much shallower).
If these two banks are on chip, they need not be swapped out
to main store.
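
To make the two-bank idea concrete, here is a minimal software sketch. It is only a model under my own assumptions, not a description of any real MISC chip: the class name `BankedStacks`, the depths 16 and 4, and the `words_copied` counter are all hypothetical.

```python
class BankedStacks:
    """Hypothetical model: two on-chip stack banks chosen by one select bit."""

    def __init__(self, user_depth=16, os_depth=4):
        # Asymmetric banks (assumed depths): the OS bank is much shallower.
        self.banks = [[], []]
        self.depths = [user_depth, os_depth]
        self.active = 0        # 0 = user task, 1 = OS
        self.words_copied = 0  # words moved to/from main store by switches

    def push(self, value):
        bank = self.banks[self.active]
        if len(bank) >= self.depths[self.active]:
            raise OverflowError("active stack bank is full")
        bank.append(value)

    def pop(self):
        return self.banks[self.active].pop()

    def switch(self):
        # The whole context switch: flip the bank-select bit.
        # Nothing is copied, so words_copied never changes.
        self.active ^= 1
```

Each task finds its stack exactly as it left it after any number of switches, and no stack word ever travels to main store.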

The only drawback, apart from burning more transistors and needing a
bigger die, is, according to Chuck, the single gate delay needed to
determine which bank is active at the time. As the MISC chips are finely
honed high-speed machines, with stack operations being the hot spots, the
delay is _very_ noticeable. We are trading a continuous loss of peak
performance for reduced latency.
:(
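
The trade-off can be put into rough numbers. The sketch below is a back-of-the-envelope model of my own: the function names are hypothetical, and the figures used afterwards (a 5% cycle-time penalty for the extra gate, 40 cycles to swap a stack through memory) are invented for illustration, not measured on any chip.

```python
def banked_cost(useful_cycles, mux_penalty):
    # Every cycle is stretched by the bank-select gate delay;
    # the switch itself is free.
    return useful_cycles * (1.0 + mux_penalty)


def swapped_cost(useful_cycles, swap_cycles):
    # Full-speed cycles, plus one save/restore through main store.
    return useful_cycles + swap_cycles


def break_even_interval(swap_cycles, mux_penalty):
    # Banking wins while the work done between two switches is
    # shorter than this many cycles.
    return swap_cycles / mux_penalty
```

With the assumed figures, `break_even_interval(40, 0.05)` comes out at about 800 cycles: a node that switches more often than that gains from the second bank, and one that switches more rarely pays the gate delay for nothing.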


But I am still of the opinion that having a zero-latency context
switch between two tasks (possibly protecting the address space
of one task) is valuable. A lot of program code is OS call code.
The OS is a very special task, requiring dedicated resources.
The OS can be implemented as a VM with zero context switch.
The OS supervisor task gets called very often, particularly in
realtime machines. As long as we don't have maspar machines with
tens to hundreds of separate monotask nodes on a single chip, a reentrant
multitasking OS with memory protection is a must. Of course we don't
have memory trashing in distant nodes, but why should we tolerate
shooting the local OS, either by writing directly to memory or by
corrupting the stack contents?

Now imagine worm-like damage spreading lightning fast over the whole
network. It need not be malice; a spontaneous bit mutation (another
task freaked out) in a network-traveling agent code will suffice.

> You are right for monoprocessor multipurpose desktop workstations.
> 
> For multiprocessor real-time embedded machines, I would rather say that the
> most efficient use of hardware is only one task per instruction sequencer, to
> avoid the overhead of context switches.

What if these tasks need to communicate a lot, and the network bandwidth
is insufficient? The best way of communication is not having to push
the data around at all.

What if the task is much too small, shall we waste the
rest of local node memory resources?

There is a point beyond which the processor spends most of its time
switching contexts instead of executing useful code. But why should
we want to push things that far? 8-16 threads/node are
surely acceptable. One of the nice things about the Novix was its multiple
stack frames option, which switched the frame of reference instead of
pushing data to and fro.  One of the big minuses of cooperative multitasking
is that a hanging task will block the whole node.
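
What "switching the frame of reference" buys can be shown with a small model. This is a sketch under my own assumptions and makes no claim about how the Novix actually laid out its stack frames: `FramedStackMemory`, the 8 frames and the 16-cell frame size are hypothetical.

```python
class FramedStackMemory:
    """Hypothetical model: one stack memory divided into per-thread frames."""

    def __init__(self, frames=8, frame_size=16):
        self.frame_size = frame_size
        self.cells = [0] * (frames * frame_size)
        self.depth = [0] * frames  # current stack depth of each frame
        self.current = 0           # index of the frame now in use

    def push(self, value):
        if self.depth[self.current] >= self.frame_size:
            raise OverflowError("stack frame is full")
        base = self.current * self.frame_size
        self.cells[base + self.depth[self.current]] = value
        self.depth[self.current] += 1

    def pop(self):
        if self.depth[self.current] == 0:
            raise IndexError("stack frame is empty")
        self.depth[self.current] -= 1
        base = self.current * self.frame_size
        return self.cells[base + self.depth[self.current]]

    def switch_to(self, frame):
        # Thread switch: only the frame index changes, no cell is moved.
        self.current = frame
```

The cost of `switch_to` is the same whether a thread's stack holds one item or sixteen, which is what keeps 8-16 threads per node cheap.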

I think there are a lot of reasons in favor of fast reentrant
few-thread/node multitasking even on maspar machines.

> On a multi-F21, there are several instruction sequencers.

But they are not microcode sequencers. Instruction field decoding
takes virtually no sequencing logic.

[ interesting remarks on MISC philosophy snipped ]

Sorry for the muddle,

-- Eugene