andrewb at inf.ethz.ch
Fri Dec 2 00:09:36 CET 2011
messages_wait_and_handle_next() is really a kludge for a legacy (and broken) IDC system, and shouldn't be used in any new code, although I'll be the first to admit that there are still too many references to it in the tree. If you look at the implementation, you'll see it's just a wrapper for "event_dispatch(get_default_waitset())". The problem with this is that it blocks indefinitely if there are no events to dispatch on the default waitset, and there's no guarantee on which event it does dispatch when it returns to you. So it's typically used in a loop testing some external condition, as in "while (!callback_has_run()) messages_wait_and_handle_next();" but even this only works when:
1. There is only one thread dispatching the default waitset. If more than one thread dispatches the waitset, another thread might run the event that triggers callback_has_run(), while your loop blocks waiting for an event that will never arrive.
2. The call to messages_wait_and_handle_next() doesn't happen in the context of an event handler. We don't support nested events on the same binding, so you will probably deadlock here.
That said, you do need to dispatch the default waitset for bindings to complete. If you want to go with a completely event-driven model, the clean thing to do is "stack-rip" your function where it blocks, so that all the logic that comes after the binding completes actually runs in the binding completion continuation. If you want to go with a threaded model, you could have one thread dispatching the waitset (and doing nothing else), and have the event handlers unblock the main thread (via mechanisms like semaphores or condition variables).
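To make the threaded option concrete, here is a rough sketch of its shape using plain POSIX threads and a condition variable rather than the Barrelfish APIs. Everything in it (toy_waitset, toy_post, dispatcher, bind_state, bind_continuation, bind_and_wait) is a made-up stand-in for illustration, not anything in the tree: one thread does nothing but dispatch, and the bind continuation it runs wakes the blocked main thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy one-slot "waitset" (hypothetical; NOT Barrelfish's struct waitset). */
struct toy_waitset {
    pthread_mutex_t lock;
    pthread_cond_t changed;
    void (*handler)(void *arg);   /* pending event handler, or NULL */
    void *arg;
};

/* State shared by the bind continuation and the blocked main thread. */
struct bind_state {
    pthread_mutex_t lock;
    pthread_cond_t done_cond;
    bool done;
};

/* Queue one event; stands in for the bind reply arriving. */
static void toy_post(struct toy_waitset *ws, void (*h)(void *), void *arg)
{
    pthread_mutex_lock(&ws->lock);
    ws->handler = h;
    ws->arg = arg;
    pthread_cond_signal(&ws->changed);
    pthread_mutex_unlock(&ws->lock);
}

/* Dispatcher thread: waits for one event and runs its handler.
 * A real dispatcher would loop forever; one pass keeps the sketch short. */
static void *dispatcher(void *p)
{
    struct toy_waitset *ws = p;
    pthread_mutex_lock(&ws->lock);
    while (ws->handler == NULL)
        pthread_cond_wait(&ws->changed, &ws->lock);
    void (*h)(void *) = ws->handler;
    void *arg = ws->arg;
    ws->handler = NULL;
    pthread_mutex_unlock(&ws->lock);
    h(arg);                       /* run the handler outside the lock */
    return NULL;
}

/* Bind continuation: runs on the dispatcher thread, unblocks main. */
static void bind_continuation(void *p)
{
    struct bind_state *st = p;
    assert(st != NULL);
    pthread_mutex_lock(&st->lock);
    st->done = true;
    pthread_cond_signal(&st->done_cond);
    pthread_mutex_unlock(&st->lock);
}

/* Main-thread side: start the dispatcher, "bind", block until the
 * continuation has run. Returns true if the continuation ran. */
bool bind_and_wait(void)
{
    struct toy_waitset ws;
    struct bind_state st;
    pthread_t tid;

    pthread_mutex_init(&ws.lock, NULL);
    pthread_cond_init(&ws.changed, NULL);
    ws.handler = NULL;
    ws.arg = NULL;
    pthread_mutex_init(&st.lock, NULL);
    pthread_cond_init(&st.done_cond, NULL);
    st.done = false;

    if (pthread_create(&tid, NULL, dispatcher, &ws) != 0)
        return false;

    toy_post(&ws, bind_continuation, &st);   /* simulated bind reply */

    pthread_mutex_lock(&st.lock);
    while (!st.done)                         /* guards against spurious wakeups */
        pthread_cond_wait(&st.done_cond, &st.lock);
    pthread_mutex_unlock(&st.lock);

    pthread_join(tid, NULL);
    bool ok = st.done;

    pthread_cond_destroy(&st.done_cond);
    pthread_mutex_destroy(&st.lock);
    pthread_cond_destroy(&ws.changed);
    pthread_mutex_destroy(&ws.lock);
    return ok;
}
```

The point of the shape is that the main thread never dispatches anything itself, so neither of the two restrictions above is violated: only the dispatcher thread touches the waitset, and the wait happens outside any event handler.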
BTW, have you looked into using THC for your application? Is there a reason you can't do this? It makes a lot of this event machinery much nicer to program...
From: Georgios Varisteas [mailto:yorgos at kth.se]
Sent: Thursday, 01 December, 2011 8:02
To: barrelfish-users at lists.inf.ethz.ch
Subject: [Barrelfish-users] Bindings
I'm really stuck on a problem because I can't figure out its source. The bottom line is that a client freezes while waiting for a reply from the bind operation on the server's iref. Let me elaborate on it...
There is a primary instance of the server which spawns multiple other instances of itself as separate applications. Each of them gets a binding to each other, thus creating an interconnected distributed service. Communication between them provably works.
The client comes up, successfully retrieves the primary server's iref from the nameservice and executes the bind operation on it. Right afterwards it executes the "messages_wait_and_handle_next()" function to wait for the bind's reply. The reply's handler is set as the continuation to the bind call but it never executes. At that point everything freezes. If I omit the "messages_wait_and_handle_next()" then the client proceeds normally, but the bind's reply handler still never executes.
Any hints or ideas would be most welcome. Thanks.