Re: Is this group dying?



On Oct 1, 7:37 pm, "Chris Thomasson" <cris...@xxxxxxxxxxx> wrote:
"blytkerchan" <blytkerc...@xxxxxxxxx> wrote in message
news:1191204480.244573.305530@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
On Sep 29, 5:53 pm, "Chris Thomasson" <cris...@xxxxxxxxxxx> wrote:
[...]
The question remains, though: what is the problem domain?
Problem/Question:
- How to efficiently solve shared-memory coherency issues in software?
Possible Solution:
- Virtually Zero-Overhead:
[...]
Any thoughts?

I like the producer/consumer abstraction a lot, but your vZOOM will
only work in single-producer single-consumer cases if you don't want
to use either locks or interlocked instructions - at least I don't
really see how it could work with multiple producers or multiple
consumers.

You have to use distributed model. For instance, a
multiple-producer/single-consumer model in vZOOM can consist of per-producer
queue which are registered with a single consumer. This model is 100%
compatible with. You can transform that into a
multiple-producer/multiple-consumer model by hashing a pointer to a producer
thread into an index of consumer threads. The producer uses this index to
know what consumer to register its spsc queue with. All of the described
models can use the ultra-low overhead vZOOM unbounded wait-free queues. This
is ideal for a distributed message passing system.

Si you basically use single-producer single-consumer queues in much
the same way as the "input queues" and "output queues" I was talking
about a few days ago. You solve the multi-producer/multi-consumer
problem by hashing your pointer and you can always compensate for a
mis-attribution of a task through work stealing.

It's certainly a scalable approach, and if your virtually-zero
overhead is really virtually zero, the cost of having two queues per
thread is negligeable next to the use of locks.

For observers, I usually use a generic non-threaded implementation

[...]

I kind of like doing it with a single disributer thread; like the diagram I
linked to shows.

That is actually pretty close to what I would do for a dispatcher-
based Mediator implementation: a single multi-producer/single-
consumer queue toward the dispatcher thread and a single-producer/
single-consumer queue between the dispatcher and each consumer of an
observable event. The dispatcher basically works as a switchboard,
delivering its messages to the interested observers according to their
types (required quite a bit of template meta-programming, that
one :) )

I have one global queue (multiple producer/
multiple consumer, lock-free but not wait-free) from which the worker
threads take their next task if their local queue is empty. If both
are empty, they sleep on a condvar/event. The per-worker-thread queues
are multi-producer/single-consumer, lock-free and wait-free.

One global lock-free queue can cause quite a bit of overhead; can have 2
interlocks in the producer _and_ in the consumer. That is basically just
expensive as a traditional uncontended fast-pathed mutex assess; what
algorithm are you using? Also, what algorithm are you using for the
per-worker-thread queues?

The basic idea is that an empty queue consists of a single node with a
NULL as its "next"node. Its value is writable by the producer until
that producer has put a new "next" node in place. The consumer
acquires a node it can read from by putting NULL on its "next" and
checking that what was there wasn't NULL already. If it was, the node
is a dummy and should stay where it is. If it wasn't, the node value
is what it wants and whatever was in place is the new head. The
producer keeps track of the tail, the consumer keeps track of the
head. As both the head and the tail can only be mutated by a single
thread, there's no locking required for those.

I.e., the queue looks a bit like this (pseudo-code)

struct Node
{
Node() : next(0) {}
Node * next_;
Any val_;
};
struct Queue
{
Queue() : head_(new Node), tail_(head_) {}

void push(const Any & val)
{
assert(tail_);
std::auto_ptr< Node > new_tail(new Node);
tail_->val_ = val;
Node * old_tail(xchg(tail_, new_tail.release()));
assert(old_tail == 0);
}

Any pop()
{
assert(head_);
Node * new_head(xchg(head_->next_, 0));
if (new_head != 0)
{
Any retval(head_->val_);
delete head_;
head_ = new_head;
return retval;
}
else
throw Exceptions::Empty();
}

Node * head_; // usable only by consumer
Node * tail_; // usable only by producer
};

http://groups.google.com/group/comp.programming.threads/msg/a359bfb41...

If so, that works fine; indeed it has wait-free consumers. However, the
vZOOM model can still outperform it by quite a large margin simply because
it does not use interlocking, has wait-free producer and consumer operations
and is highly-distributed/NUMA-friendly.

What I gather from vZOOM is that only write operations are interlocked
(or even synchronized) and that you try to reduce write operations to
a strict minimum by consuming in large batches rather than in single
nodes. In my case, that would boil down to iterating to the
(perceived) en of the queue and cutting it off there, which would
reduce the number of XCHGs to one per batch of reads in stead of one
per read (e.g. if there are five writes between two reads, there would
be a total of seven XCHGs in stead of twelve). That is a nice idea..

I want the wait-free observer to be a multi-threaded daemon that can allow
multiple threads to create/register delegates; message-types and allow
consumer threads to register with those delegates and receive their
messages; producer threads create messages with subsequently signal the
delegates that manages those messages to multicast to their registered
consumers.

Which is very, *very* close to my Mediator implementation..

As for a Factory, I don't see why a factory would have to know about
threading unless you want the factory to keep track of whatever it's
created. Other than that, you just need a thread-safe memory manager..

Or perhaps I don't quite see the problem you're seeing.

I want to implement the wait-free factory as a multi-threaded daemon process
that multiple producers can register the path and name of a shared library,
which concurrent consumer processes can lookup by name; dynamically link
with the resulting library and call a "common" instance function (e.g.,
define common api). I guess you could think of it as a highly concurrent
per-computer COM daemon. The factory consumer threads can use the lock-free
reader pattern, and the producer threads can use a form of mutual exclusion;
including, but not limited, to locks.

OK, but what does the factory create?

As for a scripting language: I see scripting languages as a convenient
way for gluing otherwise disparate blocks of code together, or to make
decisions with "on-the-fly" logic (i.e. logic that wasn't built into
whatever you were building at the time). So the question really is:
what are you going to want to glue together? Message passing is a nice
abstraction for that: you just need to be able to create a message and
pass it into the mill, and to create whatever logic is needed to
handle the message and have that run by the mill. I can see a nice
opportunity for a scripting language there.

Yeah. BTW, are there any scripting langs that have a message passing
interface for their basic method of operation? Simply create a script and
enqueue it onto the message system, and wait/pool on its subsequent
resulting data. Humm...

not really: DCOM basically allows a (scripting) language to create a
network-transparent object with an underlying messaging protocol much
like CORBA, and I hear Erlang is message-passing-based (but haven't
taken the time to look closely at that) but neither is explicitly
multi-threaded and I don't think the DCOM-based scripting languages
(such as Visual Basic) allow for asynchronous operation - but I don't
know any of those languages, at all. I do know CORBA, however, as I
use it in a distributed multi-platform architecture... but CORBA only
uses "one-way" messages for asynchronous things...

rlc

.



Relevant Pages

  • Re: Is this group dying?
    ... consumer queue toward the dispatcher thread and a single-producer/ ... that producer has put a new "next" node in place. ... I want to implement the wait-free factory as a multi-threaded daemon process ...
    (comp.programming.threads)
  • Re: Thread question
    ... is the number of externally-caused awakenings. ... queue holding the items produced but not yet consumed. ... If the loop above is part of the consumer, ... Thing from the queue supplied by the producer. ...
    (comp.lang.java.programmer)
  • Re: Multithreading - Problem with notifyAll() and wait()
    ... NotifyAlland waitstatements (producer notifies and consumer ... private Queue queue = new Queue; ... Vector vectorQueue = new Vector; ...
    (comp.lang.java.programmer)
  • Re: Producer/Consumer
    ... but the consumer is only prepared to dequeue one at a time. ... even though there remain items in the queue (because the number of thread ... _newItemEvent is set by the producer after enqueueing an item, ...
    (microsoft.public.dotnet.general)
  • Re: Gas prices up to pay for bosses
    ... kilter profits become the more competition intensified. ... of the producer ought to be attended to, only so far as it may be necessary ... for promoting that of the consumer. ...
    (uk.politics.misc)