OK, yes, you are right. The flounder stub does an extra malloc() in the sender when the buffer is received back, and that allocation was never freed. I'm sending you the corrected version now. In any case, the user page fault still doesn't go away... :(<div>
<br></div><div>Thank you for your help</div><div><br></div><div>zeus.<br><br><div class="gmail_quote">On 15 September 2011 at 00:55, Zeus Gómez Marmolejo <span dir="ltr"><<a href="mailto:zeus.gomez@bsc.es">zeus.gomez@bsc.es</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">OK, I'm checking this now, but I don't think this is the problem. Even if malloc() failed, the sender itself would fault first when accessing the buffer, because the length is set to 65536 regardless.<div>
<br></div>
<div>But the sender never fails; it is always the receiver that fails. (The buffer on the receiver side is allocated in lib/barrelfish/flounder_support.c:424.) It is the user's job to free the pointer allocated by the flounder stub, and this is always done in the continuation closure (function endr()) on the receiver side.</div>
<div><br></div><div>zeus.</div><div><br><div class="gmail_quote">On 14 September 2011 at 19:59, Baumann Andrew <span dir="ltr"><<a href="mailto:andrewb@inf.ethz.ch" target="_blank">andrewb@inf.ethz.ch</a>></span> wrote:<div>
<div></div><div class="h5"><br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div lang="EN-AU" link="blue" vlink="purple">
<div>
<p class="MsoNormal"><span style="font-size:11.0pt;color:#1F497D">Hi,<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;color:#1F497D"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;color:#1F497D">A quick first thought: have you checked whether malloc() is failing and returning NULL on the sender side, which is then being delivered as a zero-length NULL
buffer on the receiver? You do appear to have a memory leak on the receiver (someone needs to free the buffer p after it has been sent), so this might be the cause.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;color:#1F497D"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;color:#1F497D">Andrew<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;color:#1F497D"><u></u> <u></u></span></p>
<p class="MsoNormal"><b><span lang="EN-US" style="font-size:10.0pt">From:</span></b><span lang="EN-US" style="font-size:10.0pt"> Zeus Gómez Marmolejo [mailto:<a href="mailto:zeus.gomez@bsc.es" target="_blank">zeus.gomez@bsc.es</a>]
<br>
<b>Sent:</b> Wednesday, 14 September, 2011 10:29<br>
<b>To:</b> <a href="mailto:barrelfish-users@lists.inf.ethz.ch" target="_blank">barrelfish-users@lists.inf.ethz.ch</a><br>
<b>Subject:</b> [Barrelfish-users] Possible bug?<u></u><u></u></span></p><div><div></div><div>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">Hi all,<u></u><u></u></p>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">I'm trying out Barrelfish's ability to send large buffers (currently 64 KB) between cores using flounder, and I'm seeing strange behaviour. After sending some messages and running for about 1 minute in QEMU, the program segfaults or
simply hangs. I'm not sure whether I am doing something wrong... I'm using the latest version of Barrelfish.<br clear="all">
<u></u><u></u></p>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">The program is based on "usr/tests/idctest/idctest.c". It spawns another instance on core 1 and sets up the binding. After that it creates another thread to "dispatch events". The main loop on core 0 sends messages, while the
main loop on core 1 just waits for a reply queue to have some messages and sends them back. It uses the same "test.if" interface that idctest.c uses. I'm sending it as an attachment. You can try it simply by copying it over the existing idctest.c.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">So, core 0 is sending a message to core 1:<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal"><b>a = malloc(65536);</b><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><b>test_buf__tx(b, MKCONT(end, 0), a, 65536);</b><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">Before sending, it obtains a lock to ensure that the next message is not sent before the previous one has been sent:<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<div>
<p class="MsoNormal"><b> while (__sync_lock_test_and_set(&busy, true))</b><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><b> thread_yield();</b><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
</div>
<div>
<p class="MsoNormal">The continuation closure releases the lock:<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<div>
<p class="MsoNormal"><b>static void end(void *r)</b><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><b>{</b><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><b> free (a);</b><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><b> busy = false;</b><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><b>}</b><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
</div>
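<div>
<p class="MsoNormal">The acquire loop and the continuation above can be written as one matched pair. This is only a minimal sketch, assuming GCC's __sync builtins; sched_yield() stands in for Barrelfish's thread_yield() so that the fragment compiles outside Barrelfish, and the send_slot_* names are made up for illustration. The one change from the code quoted above is that the release uses __sync_lock_release() rather than a plain store of false, which GCC documents as the counterpart of __sync_lock_test_and_set(). Whether that difference has anything to do with the reported fault is not established here.</p>
</div>

```c
#include <assert.h>
#include <stdbool.h>
#include <sched.h>   /* sched_yield(): POSIX stand-in for Barrelfish's thread_yield() */

static volatile int busy = 0;   /* 0 = free, 1 = a send is in flight */

/* Try once to take the flag; returns true if this caller now owns it. */
static bool send_slot_try_acquire(void)
{
    /* Atomically writes 1 and returns the previous value (acquire barrier). */
    return __sync_lock_test_and_set(&busy, 1) == 0;
}

/* Spin, yielding the CPU between attempts, until the flag has been taken. */
static void send_slot_acquire(void)
{
    while (!send_slot_try_acquire()) {
        sched_yield();
    }
}

/* Called from the send continuation: hand the flag back. */
static void send_slot_release(void)
{
    assert(busy == 1);            /* releasing a slot nobody holds is a bug */
    __sync_lock_release(&busy);   /* writes 0 with release semantics */
}
```

<div>
<p class="MsoNormal">In the program described above, send_slot_acquire() would go before test_buf__tx(), and send_slot_release() would take the place of busy = false in end().</p>
</div>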
<div>
<p class="MsoNormal">This function is always called by the second thread, which is always dispatching messages. I know that using this spinlock is not very efficient... Is there another way to block the thread and have the other thread wake it up?<u></u><u></u></p>
</div>
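<div>
<p class="MsoNormal">One common answer to this kind of question is a mutex plus a condition variable: the sender sleeps until the continuation signals that the previous send has finished. The sketch below uses POSIX threads purely so that it is self-contained; it is not Barrelfish code, and the send_gate names are hypothetical. Barrelfish's own thread library may offer equivalent primitives, which would need to be checked in its headers before porting this.</p>
</div>

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical "one send in flight" gate built from a mutex and a condition variable. */
struct send_gate {
    pthread_mutex_t lock;
    pthread_cond_t  idle;   /* signalled when busy becomes false */
    bool            busy;
};

static void send_gate_init(struct send_gate *g)
{
    pthread_mutex_init(&g->lock, NULL);
    pthread_cond_init(&g->idle, NULL);
    g->busy = false;
}

/* Sender thread: sleep until no send is in flight, then claim the gate. */
static void send_gate_acquire(struct send_gate *g)
{
    pthread_mutex_lock(&g->lock);
    while (g->busy) {   /* the loop guards against spurious wakeups */
        pthread_cond_wait(&g->idle, &g->lock);
    }
    g->busy = true;
    pthread_mutex_unlock(&g->lock);
}

/* Dispatcher thread, from the send continuation: reopen the gate and wake the sender. */
static void send_gate_release(struct send_gate *g)
{
    pthread_mutex_lock(&g->lock);
    g->busy = false;
    pthread_cond_signal(&g->idle);
    pthread_mutex_unlock(&g->lock);
}
```

<div>
<p class="MsoNormal">Compared with the yield loop, the waiting thread here consumes no CPU time while a send is outstanding, at the cost of taking a mutex on every send and every completion.</p>
</div>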
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">When core 1 receives the message:<u></u><u></u></p>
</div>
<div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal"><b>static void buf1(struct test_binding *b, uint8_t *p, size_t buflen)</b><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><b>{</b><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><b> debug_printf("buf1 %u\n", p[65535]);</b><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><b> ambf_buff_send(p);</b><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><b>}</b><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
</div>
<div>
<p class="MsoNormal">It simply prints the last byte and sends the buffer back to core 0, queuing it for the other thread so as not to block the event dispatcher.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">The program segfaults when accessing buffer p in the previous function, because p is sometimes NULL. In any case, no error is reported...<u></u><u></u></p>
</div>
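<div>
<p class="MsoNormal">While the cause is being tracked down, a guard at the top of the receive handler can turn the fault into a log line that also records the length that arrived with the NULL pointer. The following is a hedged sketch only: last_byte_checked() and EXPECTED_LEN are hypothetical names, fprintf() stands in for debug_printf(), and the check does not fix whatever makes p NULL in the first place.</p>
</div>

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define EXPECTED_LEN 65536u   /* payload size used in the report */

/* Returns the last payload byte, or -1 if the buffer is missing or too
 * short to index safely. Meant to be called before touching p[65535]. */
static int last_byte_checked(const uint8_t *p, size_t buflen)
{
    if (p == NULL || buflen < EXPECTED_LEN) {
        fprintf(stderr, "rx: bad buffer p=%p buflen=%zu\n",
                (const void *)p, buflen);
        return -1;
    }
    return p[EXPECTED_LEN - 1];
}
```

<div>
<p class="MsoNormal">In buf1(), the handler would call this helper and return early on -1 instead of indexing p directly. Seeing whether buflen is 0 or 65536 in the failing case would help tell an empty delivery apart from a failed allocation on the receiving side.</p>
</div>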
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">I would like to know if you see something wrong here ...<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">Many thanks!!<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<p class="MsoNormal" style="margin-bottom:12.0pt">-- <br>
Zeus Gómez Marmolejo<br>
Barcelona Supercomputing Center<br>
PhD student<br>
<a href="http://www.bsc.es" target="_blank">http://www.bsc.es</a><br>
<br>
<u></u><u></u></p>
</div>
</div></div></div>
</div>
</blockquote></div></div></div><div><div></div><div class="h5"><br><br clear="all"><div><br></div>-- <br>Zeus Gómez Marmolejo<br>Barcelona Supercomputing Center<br>PhD student<br><a href="http://www.bsc.es" target="_blank">http://www.bsc.es</a><br>
<br><br>
</div></div></div>
</blockquote></div><br><br clear="all"><div><br></div>-- <br>Zeus Gómez Marmolejo<br>Barcelona Supercomputing Center<br>PhD student<br><a href="http://www.bsc.es" target="_blank">http://www.bsc.es</a><br><br><br>
</div>