Re: www.sicortex.com




From: "Paul A. Clayton" <paaronclayton@xxxxxxxxxxxxx>
Subject: Re: www.sicortex.com
Date: Friday, April 04, 2008 7:43 AM

On Apr 4, 4:53 am, "Chris Thomasson" <cris...@xxxxxxxxxxx> wrote:
> "Paul A. Clayton" <paaronclay...@xxxxxxxxxxxxx> wrote in message news:5cb22757-ec55-4c1e-907b-719b854cb9c8@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
[snip]
> > According to the limited documentation I have read, the DMA is
> > not exposed; they suggest the use of their MPI implementation.
>
> IMVHO, if it's indeed true that there are no instructions that drive
> their DMA engine, well, that's total crap! I would like the flexibility
> to program my own custom message-passing. Well, if somebody who posts
> to the group has access to one of these systems, would you kindly send
> a disassembly of the 'MPI_Send/Rcv' functions? There have to be some
> special instructions which trigger DMA events. Anyway, I did some more
> reading and found where they explicitly say that:

I did not mean to imply that they hide (as a trade secret) the DMA
interface, merely that they do not give any documentation (linked
from their website) to indicate how such would be used.

Yeah. I bet that they don't hide them at all.




"For programs that use multithreading facilities such as pthreads or openMP,
each SiCortex node is a cache-coherent SMP."

So it looks like I could use the MIPS64 instruction set to implement my
existing libraries which make heavy use of shared-memory non-blocking
algorithms. Humm... I wonder if I could use one of my nearly zero-overhead
atomic queue algorithms for message-passing instead of their DMA engines...
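
To make the idea of passing messages through shared memory on a cache-coherent node concrete, here is a minimal sketch of a single-producer/single-consumer ring buffer written with C11 atomics. It is not the queue algorithm referred to above, which is not shown in this thread; the names spsc_ring, spsc_init, spsc_push, spsc_pop and the capacity of 8 are all hypothetical, and it assumes exactly one producer thread and one consumer thread per ring.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical capacity; must be a power of two so indices can be masked. */
#define RING_CAPACITY 8u
static_assert((RING_CAPACITY & (RING_CAPACITY - 1u)) == 0,
              "RING_CAPACITY must be a power of two");

typedef struct {
    _Atomic size_t head;          /* next slot to read; stored only by the consumer */
    _Atomic size_t tail;          /* next slot to write; stored only by the producer */
    void *slots[RING_CAPACITY];   /* message pointers */
} spsc_ring;

static void spsc_init(spsc_ring *q)
{
    atomic_init(&q->head, 0);
    atomic_init(&q->tail, 0);
}

/* Producer side. Returns false without blocking if the ring is full. */
static bool spsc_push(spsc_ring *q, void *msg)
{
    size_t tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&q->head, memory_order_acquire);
    if (tail - head == RING_CAPACITY)
        return false;
    q->slots[tail & (RING_CAPACITY - 1u)] = msg;
    /* Release: the slot write becomes visible before the new tail does. */
    atomic_store_explicit(&q->tail, tail + 1, memory_order_release);
    return true;
}

/* Consumer side. Returns false without blocking if the ring is empty. */
static bool spsc_pop(spsc_ring *q, void **out)
{
    size_t head = atomic_load_explicit(&q->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&q->tail, memory_order_acquire);
    if (head == tail)
        return false;
    *out = q->slots[head & (RING_CAPACITY - 1u)];
    /* Release: the slot read completes before the slot is handed back. */
    atomic_store_explicit(&q->head, head + 1, memory_order_release);
    return true;
}
```

One thread would call spsc_push() and another spsc_pop() on the same ring. Neither call takes a lock or enters the kernel, which is the appeal of such a queue for traffic between processors on the same node; whether it would actually beat the DMA path on this hardware is not something this sketch can show.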

Well, a node is only one processor chip (6 processors) and 2 DDR2 DIMMs,
so even the Catapult (desk-side) unit has 12 nodes.

Right. I was thinking that I could use existing shared-memory multithreading techniques for intra-node programming, and use their MPI interface only for inter-node communication. I don't think I would use MPI to communicate between processors on the same node. I would rather use the node's "local" shared memory for that purpose...

For the Catapult unit, I could use 12 multi-threaded processes where each process has its execution affinity bound to a separate node. That way the threads of each process would be running on the CPUs belonging to the process's node. The threads within a process can use local shared memory to communicate. The processes would use the MPI interface to communicate. That simple scheme should work fine on these neat systems...




> Hope that was helpful.

It was helpful indeed.

I am pleased.
