[Barrelfish-users] Possible bug?
andrewb at inf.ethz.ch
Wed Sep 14 19:59:13 CEST 2011
A quick first thought: have you checked whether malloc() is failing and returning NULL on the sender side, which is then being delivered as a zero-length NULL buffer on the receiver? You do appear to have a memory leak on the receiver (someone needs to free the buffer p after it has been sent), so this might be the cause.
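As a rough sketch of both points (checking the result of malloc() and freeing the buffer once the send has completed), something along these lines may help. It is plain C rather than Barrelfish code: fake_send(), send_done(), send_one() and buffers_live are hypothetical names made up for illustration, and fake_send() merely stands in for a flounder-style send that runs a continuation when the message has gone out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical continuation type: called with arg once the send is done. */
typedef void (*send_done_fn)(void *arg);

/* Number of buffers allocated but not yet freed (for leak checking). */
static int buffers_live = 0;

/* Hypothetical stand-in for the real send. A real messaging stack would
 * run the continuation later, from the event dispatcher; here it runs
 * immediately so that the sketch is self-contained. */
static bool fake_send(const uint8_t *buf, size_t len,
                      send_done_fn done, void *arg)
{
    (void)buf;
    (void)len;
    done(arg);
    return true;
}

/* Continuation: the buffer is no longer needed, so release it here. */
static void send_done(void *arg)
{
    free(arg);
    buffers_live--;
}

/* Allocate, fill and send one buffer. Returns false if the allocation
 * or the send failed, instead of passing a NULL buffer along. */
static bool send_one(size_t len)
{
    if (len == 0) {
        return false;
    }
    uint8_t *a = malloc(len);
    if (a == NULL) {
        return false;           /* out of memory: report it, do not send */
    }
    buffers_live++;
    a[len - 1] = 0xAB;          /* touch the last byte of the payload */

    /* Pass the buffer itself as the continuation argument so that
     * send_done() knows what to free. */
    if (!fake_send(a, len, send_done, a)) {
        free(a);
        buffers_live--;
        return false;
    }
    return true;
}
```

In the original program the analogous change would be to pass the buffer as the argument of the continuation and free it there, but that is an assumption about code I have not seen in full.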
From: Zeus Gómez Marmolejo [mailto:zeus.gomez at bsc.es]
Sent: Wednesday, 14 September, 2011 10:29
To: barrelfish-users at lists.inf.ethz.ch
Subject: [Barrelfish-users] Possible bug?
I'm trying out Barrelfish's ability to send large buffers (currently 64 KB) between cores using flounder, and I'm seeing strange behaviour. After sending some messages and running for about a minute in QEMU, the program segfaults or simply hangs. I'm not sure whether I am doing something wrong. I'm using the latest version of Barrelfish.
The program is based on "usr/tests/idctest/idctest.c". It spawns another instance on core 1 and sets up the binding. After that, it creates another thread to dispatch events. The main loop on core 0 sends messages, while the main loop on core 1 just waits for messages to appear in a reply queue and sends them back. It uses the same "test.if" interface as idctest.c. I have attached the source; you can try it by copying it over the existing idctest.c.
So, core 0 is sending a message to core 1:
a = malloc(65536);
test_buf__tx(b, MKCONT(end, 0), a, 65536);
Before sending, it takes a lock to ensure that the next message is not sent until the previous one has been sent:
while (__sync_lock_test_and_set(&busy, true))
The continuation closure releases the lock:
static void end(void *r)
{
    busy = false;
}
This function is always called by the second thread, which is continuously dispatching events. I know that spinning on this lock is not very efficient. Is there another way to block the thread and have the other thread wake it up?
When core 1 receives the message:
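One common alternative to spinning is a mutex paired with a condition variable: the sender sleeps until the continuation signals that the previous send has finished. The sketch below uses POSIX threads only to show the pattern. struct send_gate and the gate_* functions are hypothetical names, and a Barrelfish program would have to use the equivalent primitives from its own threads library rather than pthreads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical gate that allows one message in flight at a time. */
struct send_gate {
    pthread_mutex_t lock;
    pthread_cond_t idle;    /* signalled when busy becomes false */
    bool busy;              /* true while a send is outstanding */
};

static void gate_init(struct send_gate *g)
{
    pthread_mutex_init(&g->lock, NULL);
    pthread_cond_init(&g->idle, NULL);
    g->busy = false;
}

/* Called by the sending thread before each send. Sleeps, rather than
 * spins, while the previous message is still in flight. */
static void gate_acquire(struct send_gate *g)
{
    pthread_mutex_lock(&g->lock);
    while (g->busy) {
        /* Atomically unlocks, sleeps, and relocks on wake-up. The loop
         * guards against spurious wake-ups. */
        pthread_cond_wait(&g->idle, &g->lock);
    }
    g->busy = true;
    pthread_mutex_unlock(&g->lock);
}

/* Called from the send continuation (on the dispatcher thread). */
static void gate_release(struct send_gate *g)
{
    pthread_mutex_lock(&g->lock);
    assert(g->busy);        /* releasing an idle gate would be a bug */
    g->busy = false;
    pthread_cond_signal(&g->idle);
    pthread_mutex_unlock(&g->lock);
}
```

With this shape, the main loop would call gate_acquire() where it currently spins on busy, and end() would call gate_release() instead of writing to busy directly.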
static void buf1(struct test_binding *b, uint8_t *p, size_t buflen)
debug_printf("buf1 %u\n", p);
It simply prints the last position and sends the buffer back to core 0, queuing it for the other thread so as not to block the event dispatcher.
The program segfaults when accessing the buffer p in the previous function, because p is sometimes NULL. In any case, no error is reported.
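Until the cause is found, the handler can at least avoid the crash and report the bad message by checking the buffer before indexing it. This is a minimal sketch in plain C, not the code from the attachment: last_byte() is a hypothetical helper, and fprintf() stands in for debug_printf().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: returns the last byte of the buffer, or -1 if
 * the buffer is missing or empty, logging the problem instead of
 * dereferencing a NULL pointer. */
static int last_byte(const uint8_t *p, size_t buflen)
{
    if (p == NULL || buflen == 0) {
        fprintf(stderr, "buf1: empty buffer (p=%p, buflen=%zu)\n",
                (const void *)p, buflen);
        return -1;
    }
    return p[buflen - 1];
}
```

Logging buflen alongside the pointer should also show whether the message arrives as a zero-length buffer or with a length but no data.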
I would like to know if you see anything wrong here.
Zeus Gómez Marmolejo
Barcelona Supercomputing Center