Re: Cluster computing drawbacks




In article <dc7mjh$qqc$1@xxxxxxxxxxxxxxxx>,
Javier Fernández <javier@xxxxxxxxxx> writes:
|> Nick Maclaren wrote:
|> > scheduler or both. Just Do It. Converting to use MPI communication
|> > is harder, but still easier than converting to use SMP communication.
|> >
|> > Most people's experience is that it is EASIER than converting
|> > a serial program to use SMP communication. Seriously. Converting
|> > to use SMP is one of the foulest tasks that you can imagine, and is
|>
|> "This software is not thread-safe" :-)
|>
|> I made the same claim at my Ph.D. defense. You can find books and books
|> with chapters and chapters devoted to _explaining_ the possible deadlocks
|> on SMPs, and then chapters and chapters devoted to explaining The Right
|> Way to build several paradigms (producer-consumer, etc.)

To be fair, about half of those also apply to message passing.
What you don't get with message passing is IMPLICIT interaction;
if you don't pass a message, the threads are independent. With
shared memory, it is usually unclear when threads are interacting.
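A minimal sketch of that difference, in Python rather than anything
SMP- or MPI-specific. All names here (fill, produce, run_shared,
run_messages) are hypothetical, and the aliasing bug is an assumed
example of an implicit interaction. In the shared version nothing in
the worker code says the two threads touch the same object; in the
message version the only interaction is the explicit put/get pair.

```python
import queue
import threading

def fill(buf, label, n):
    # The worker assumes buf is private; nothing in the call says otherwise.
    for _ in range(n):
        buf.append(label)

def run_shared(n=3):
    # Assumed bug: list repetition copies the reference, so both slots
    # hold the same list object and the two threads interact implicitly.
    buffers = [[]] * 2
    threads = [threading.Thread(target=fill, args=(buffers[i], i, n))
               for i in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return buffers

def produce(outbox, label, n):
    # All interaction with the outside world is this one explicit send.
    outbox.put((label, [label] * n))

def run_messages(n=3):
    inbox = queue.Queue()
    threads = [threading.Thread(target=produce, args=(inbox, i, n))
               for i in range(2)]
    for t in threads:
        t.start()
    results = {}
    for _ in range(2):
        label, data = inbox.get()   # explicit receive
        results[label] = data
    for t in threads:
        t.join()
    return results
```

With the defaults, run_shared() hands back two names for one list
holding six entries from both workers, while run_messages() keeps the
two results apart, keyed by sender.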

|> Change or delete (or add) one line in such schemes and you'll face
|> a deadlock, for reasons so complex that you'll need to study all
|> those previous chapters again :-)

Actually, my experience is that people often don't get that far.
What catches them is the question of when objects are distinct and
when they are not (i.e. when they may be used independently in
separate threads), plus the total lack of tools for investigating
even the simpler problems, such as wrong answers and deadlock.

|> Message-passing handbooks usually include a remark on the man page
|> for _send (or _recv), clarifying that if you give the wrong tag or
|> receiver (sender), your message will get lost, be sent to a receiver
|> that is not listening, or be received from an unexpected sender.
|> Three lines of text, instead of chapters and chapters.

There are some tools to help check for that. There are also a
few situations where you can get a deadlock that is not obvious,
but not all that many, and it is pretty easy to diagnose the
erroneous operations once you have hit one.
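
To illustrate why a tag mismatch is easy to diagnose, here is a toy
sketch in Python. The Mailbox class is hypothetical and is not an MPI
API; it only mimics matching on (source, tag). The timeout is an
addition for illustration, since a real blocking receive would simply
wait forever. The point is that the unmatched message is still sitting
there, labelled with its sender and tag, when the receive gives up.

```python
import queue
import threading

class Mailbox:
    """Toy stand-in for one rank's incoming messages (hypothetical)."""

    def __init__(self):
        self._arrived = queue.Queue()
        self.unmatched = []   # arrived, but matched no receive so far

    def send(self, source, tag, payload):
        self._arrived.put((source, tag, payload))

    def recv(self, source, tag, timeout=0.2):
        # First look at messages set aside by earlier receives.
        for i, (s, t, payload) in enumerate(self.unmatched):
            if (s, t) == (source, tag):
                del self.unmatched[i]
                return payload
        while True:
            try:
                # The timeout applies to each wait, not to the whole call.
                s, t, payload = self._arrived.get(timeout=timeout)
            except queue.Empty:
                pending = [(s, t) for s, t, _ in self.unmatched]
                raise TimeoutError(
                    f"no message from {source} with tag {tag}; "
                    f"unmatched (source, tag): {pending}"
                ) from None
            if (s, t) == (source, tag):
                return payload
            self.unmatched.append((s, t, payload))

def demo():
    box = Mailbox()
    sender = threading.Thread(target=box.send, args=(0, 7, "result"))
    sender.start()
    sender.join()
    try:
        box.recv(source=0, tag=8)   # wrong tag: the sender used 7
    except TimeoutError as err:
        return str(err)
    return None
```

Calling demo() returns an error string that names the tag that was
waited for (8) and lists the message that actually arrived, (0, 7),
which is usually enough to spot the mistake.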

|> Nice to learn somebody else thinks the same :-)

For me too :-)


Regards,
Nick Maclaren.