[Barrelfish-users] Messages

Zeus Gómez Marmolejo zeus.gomez at bsc.es
Wed Aug 17 18:10:49 CEST 2011


Hi Tim,

I used debug_printf() and things didn't change.

But I found where the problem is. In the stub generated by flounder,
"ambf_flounder_bindings.c", there is a function, ambf_ump_rx_handler(),
that is called from event_dispatch(). This function has a loop:

    while (true) {
        // try to retrieve a message from the channel

This loop dispatches one message and then dispatches another without
returning. I saw it in the debugger: ambf_ump_rx_handler() is hit only
once while the two messages are delivered.

So one event in the waitset may carry several messages...

That means that my code has to change a bit now :)
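
A minimal sketch of that behaviour, in plain C with no Barrelfish
dependencies. Everything here is hypothetical and only models the shape of
the loop quoted above: struct chan, chan_push(), chan_try_recv(),
rx_handler() and dispatch_until() are illustration names, not flounder or
libbarrelfish APIs. It shows why counting calls to the dispatcher is not the
same as counting messages, and one possible way the caller could wait on a
condition instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a UMP channel: a small FIFO of message ids. */
#define CHAN_CAP 8

struct chan {
    int    slots[CHAN_CAP];
    size_t head;   /* index of the next slot to read */
    size_t count;  /* number of messages currently queued */
};

struct stats {
    int events;    /* how many times the rx handler was invoked */
    int messages;  /* how many messages were delivered in total */
    int last_msg;  /* id of the most recently delivered message */
};

/* Queue one message; the caller must not overfill the channel. */
static void chan_push(struct chan *c, int msg)
{
    assert(c->count < CHAN_CAP);
    c->slots[(c->head + c->count) % CHAN_CAP] = msg;
    c->count++;
}

/* Non-blocking receive: returns false when the channel is empty. */
static bool chan_try_recv(struct chan *c, int *msg)
{
    if (c->count == 0) {
        return false;
    }
    *msg = c->slots[c->head];
    c->head = (c->head + 1) % CHAN_CAP;
    c->count--;
    return true;
}

/* Models the handler's loop: it runs once per event but keeps
 * delivering messages until the channel is empty. */
static void rx_handler(struct chan *c, struct stats *s)
{
    int msg;
    s->events++;
    while (true) {
        /* try to retrieve a message from the channel */
        if (!chan_try_recv(c, &msg)) {
            break;
        }
        s->messages++;
        s->last_msg = msg;
    }
}

/* Caller-side pattern: dispatch until a condition set by the message
 * handling holds, instead of assuming one message per dispatch. A real
 * dispatcher would block; this model stops when the channel is empty. */
static int dispatch_until(struct chan *c, struct stats *s, int wanted)
{
    int dispatches = 0;
    while (s->messages < wanted && c->count > 0) {
        rx_handler(c, s);
        dispatches++;
    }
    return dispatches;
}
```

With two messages queued before the first dispatch, the model delivers both
in a single handler invocation, which matches what the debugger showed.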


zeus.


2011/8/17 Tim Harris (RESEARCH) <tharris at microsoft.com>

> Hi Zeus,
>
> Two possible things to check:
>
> * Are you using printf or debug_printf for output? I’d suggest using the
>   latter – it really goes out right away, so you can trust what you see :)
>
> * Is it possible that the message handler is doing something that causes
>   a recursive call to event_dispatch? (Actually, maybe this would be from
>   within printf?)
>
> Cheers,
>
> Tim
>
> From: zeus at aluzina.org [mailto:zeus at aluzina.org] On Behalf Of Zeus
> Gómez Marmolejo
> Sent: 17 August 2011 15:31
> To: Timothy Roscoe
> Cc: barrelfish-users at lists.inf.ethz.ch; Tim Harris (RESEARCH)
> Subject: Re: [Barrelfish-users] Messages
>
> Hi,
>
> I discovered another surprising thing. I'm calling
> messages_wait_and_handle_next() to wait for and dispatch one message, which
> in turn calls event_dispatch(). According to the output I get on screen, it
> sometimes handles two messages before returning. This is strange.
>
> I was inspecting the code, and it calls get_next_event(), which is
> supposed to handle only one event from the waitset. I don't understand this
> behavior. Is this possible?
>
> zeus.
>
> On 17 August 2011 14:48, Zeus Gómez Marmolejo <zeus.gomez at bsc.es>
> wrote:
>
> Ok, I don't get any failures when sending messages, as I always check
> can_send() first. Otherwise, I call event_dispatch().
>
> But now I'm sure that messages get reordered somehow. It's difficult to
> send the code as it's quite involved, but I filled it with printf calls.
> I'm using barriers that I implemented myself on top of Barrelfish message
> passing, and I see the message handler in the barrier dispatching
> messages that were sent after the barrier reply. Then suddenly the
> barrier reply arrives... how can that be?
>
> I don't know how the stub works, but is it possible that it is sending a
> retry of a message after the following message has been sent?
>
> zeus.
>
> On 13 August 2011 17:49, Timothy Roscoe <troscoe at inf.ethz.ch>
> wrote:
>
>
> Well, messages should never be lost or reordered on a UMP channel, but
> both sending and receiving messages can temporarily fail in a variety
> of ways (we push retries due to full channels, etc. back to the
> sender).
>
> This means that a single-threaded sender and receiver can deadlock if
> they're not careful about handling things like send failures due to a
> full channel, or the need to send acks back from the receiver.
>
> The stubs try to hide most of the functionality required for this,
> except for the fact that any send or recv can fail (as with Unix NBIO
> and select/poll).
>
> That said, you probably already know most of this :-)
>
> And, of course, there are probably bugs.
>
>  -- Mothy
>
>
> At Sat, 13 Aug 2011 15:08:27 +0200, Zeus Gómez Marmolejo <
> zeus.gomez at bsc.es> wrote:
> > I'm trying to simplify the code (I'm using all gasnet and the library I
> > made), but whenever I get it simpler it turns out that it's working...
> >
> > So I have to figure out where the problem is.
> >
> >
> > On 12 August 2011 11:43, Tim Harris (RESEARCH)
> > <tharris at microsoft.com> wrote:
> >
> > > Could you post some code examples?
> > >
> > > Cheers,
> > >
> > > Tim
> > >
> > > From: Zeus Gómez Marmolejo [mailto:zeus.gomez at bsc.es]
> > > Sent: 12 August 2011 10:42
> > > To: barrelfish-users at lists.inf.ethz.ch
> > > Subject: [Barrelfish-users] Messages
> > >
> > > Hi,
> > >
> > > I'm developing a benchmark for message passing in Barrelfish, but I
> > > can't get it working properly. I experience some random errors.
> > >
> > > The application is single threaded and uses the flounder stubs to
> > > send messages via the UMP backend. Some of the messages get lost. Is it
> > > possible that they arrive in a different order than they were sent? I'm
> > > not sure how to debug it.
> > >
> > > Thanks for your help!
> > >
> > > --
> > > Zeus Gómez Marmolejo
> > > Barcelona Supercomputing Center
> > > PhD student
> > > http://www.bsc.es
> > >
> > >
> >
> >
> >
>
>
>



-- 
Zeus Gómez Marmolejo
Barcelona Supercomputing Center
PhD student
http://www.bsc.es