[Barrelfish-users] Corruption sending buffer

Zeus Gómez Marmolejo zeus.gomez at bsc.es
Tue Jan 17 15:06:27 CET 2012


Hi Andrew,

On 16 January 2012 20:58, Baumann Andrew <andrewb at inf.ethz.ch> wrote:

> Hi Zeus,
>
> At first look your program appears to be correct if sizeof(unsigned) == 4
> on both source and destination. Although sending 4kB buffers through the
> IDC system in this way is madness, it should work.
>

Yes, I forgot to say that the architecture I was running on is x86_64, so
sizeof(unsigned) == 4 on both sides.



> Unfortunately I’m not in a good position to run this right now (… it’s a
> long story involving snow and an out-of-date QEMU). When it fails, is the
> incorrect value one from the previous iteration, or is it garbage?
>

Almost every time it fails, the incorrect value is 0.


> BTW, there is a minor use-after-free bug in your debug_printf() in the
> receive handler.
>

You are right, but placing the debug_printf() before the free() doesn't
make the error go away.
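
To make the ordering concrete, here is a minimal stand-alone sketch of the
check with the print placed before the free(). It is not the actual idctest
code: make_buffer() and check_and_free() are hypothetical names, printf()
stands in for debug_printf(), and no IDC call is involved, so it only
illustrates the first/last comparison and the read-then-free order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

#define BUF_WORDS 1024  /* 1024 unsigned integers, as in the test description */

/* Sender side (hypothetical helper): stamp the sequence number into the
 * first and last slots of a freshly zeroed buffer. */
static unsigned *make_buffer(unsigned seq)
{
    assert(BUF_WORDS >= 2);
    unsigned *buf = calloc(BUF_WORDS, sizeof *buf);
    if (buf != NULL) {
        buf[0] = seq;
        buf[BUF_WORDS - 1] = seq;
    }
    return buf;
}

/* Receiver side (hypothetical helper): read both values, report a mismatch,
 * and only then free the buffer. Returns true if the buffer looks intact. */
static bool check_and_free(unsigned *buf, size_t len_bytes)
{
    bool ok = false;
    if (buf != NULL && len_bytes == BUF_WORDS * sizeof *buf) {
        unsigned first = buf[0];
        unsigned last = buf[BUF_WORDS - 1];
        ok = (first == last);
        if (!ok) {
            /* Print from the local copies, before the buffer is released. */
            printf("corrupt buffer: first=%u last=%u\n", first, last);
        }
    }
    free(buf);  /* nothing touches buf after this point */
    return ok;
}
```

Calling check_and_free(make_buffer(n), BUF_WORDS * sizeof(unsigned)) should
return true for any n. In my real test the equivalent check still fails after
a while, which is why I do not think the use-after-free is the cause.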


> Andrew
>
> From: Zeus Gómez Marmolejo [mailto:zeus.gomez at bsc.es]
> Sent: Monday, 16 January 2012 11:39
> To: barrelfish-users at lists.inf.ethz.ch
> Subject: [Barrelfish-users] Corruption sending buffer
>
> Dear Barrelfish developers,
>
> I believe that I've found a bug in the message passing interface when a
> buffer is sent between two endpoints. I would like you to have a look at
> this example I'm sending to you. You can copy it to the folder
> "usr/tests/idctest" of the latest version of the public repository. It
> should build correctly.
>
> This is a very simple example: it has 2 cores, with one thread per core.
> Only one core sends messages to the other core. Each message is a simple
> buffer of 1024 unsigned integers in which the first and the last integer
> hold the same value, which is incremented with each message sent. The
> message handler on the receiver just checks that the first and the last
> integer of the buffer are the same.
>
> The application keeps running until it finds that the two integers differ,
> which means that the buffer has been sent incorrectly. I've tested this in
> qemu and on a real machine, and after a while the program aborts because of
> a corrupt buffer. On both systems, the error appears within about 2
> minutes.
>
> Can you have a look at this example and check whether I'm doing something
> wrong?
>
> Many thanks!!
>
> --
> Zeus Gómez Marmolejo
> Barcelona Supercomputing Center
> PhD student
> http://www.bsc.es
>



-- 
Zeus Gómez Marmolejo
Barcelona Supercomputing Center
PhD student
http://www.bsc.es