[Barrelfish-users] Threads on different cores and timers lead to deadlock

Manuel Stocker mensi at vis.ethz.ch
Sat Jan 5 13:01:52 CET 2013


Hey Luki,

We used this in our AHCI benchmarking code:

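     #if defined(__x86_64__) || defined(__i386__)
         // TSC ticks per millisecond, as reported by the kernel
         uint64_t tscperms;
         // err is assumed to be an errval_t declared earlier in the function
         err = sys_debug_get_tsc_per_ms(&tscperms);
         assert(err_is_ok(err));

         uint64_t start = rdtsc();  // timestamp before the measured work
     #endif

     // do stuff

     #if defined(__x86_64__) || defined(__i386__)
         uint64_t stop = rdtsc();   // timestamp after the measured work
         // elapsed time in whole milliseconds (integer division truncates)
         uint64_t elapsed_msecs = (stop - start) / tscperms;
     #endif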
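
For the "remaining time until an event fires" question quoted below, the same arithmetic can be turned around. This is only a minimal sketch, not code from our benchmark: the helper names deadline_from_now and remaining_ms are made up for illustration, and the TSC readings are passed in as plain numbers so the logic can be checked without Barrelfish. It assumes both readings come from the same core and that the TSC runs at a constant rate.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert "duration_ms from now" into an absolute
 * TSC deadline. now_tsc would come from rdtsc(), tsc_per_ms from
 * sys_debug_get_tsc_per_ms(). */
static uint64_t deadline_from_now(uint64_t now_tsc, uint64_t duration_ms,
                                  uint64_t tsc_per_ms)
{
    return now_tsc + duration_ms * tsc_per_ms;
}

/* Hypothetical helper: whole milliseconds left until deadline_tsc,
 * clamped to 0 once the deadline has passed. */
static uint64_t remaining_ms(uint64_t deadline_tsc, uint64_t now_tsc,
                             uint64_t tsc_per_ms)
{
    assert(tsc_per_ms != 0);
    if (now_tsc >= deadline_tsc) {
        return 0;
    }
    return (deadline_tsc - now_tsc) / tsc_per_ms;
}
```

You would compute the deadline once when arming the event and call remaining_ms with a fresh rdtsc() value on each request, so the result no longer depends on when the dispatcher last updated the system time.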

regards,
Manuel

On 04.01.2013 22:43, Andrew Baumann wrote:
> Hi Lukas,
>
> I think you are doing the best thing you can in the current API... there are a couple of issues here:
>
>   * the kernel's timer infrastructure is low resolution (specifically kernel_now is in ms, and unless you set CONFIG_ONESHOT_TIMER it is only updated by a slow periodic ticker)
>   * the user's copy of the system time is only updated at dispatch, and get_system_time() blindly returns this value
>
> Without having thought about it terribly hard, I think a nice solution would be to make the tsc_per_ms value available to user-mode somehow, and have the dispatcher track the tsc count (using rdtsc) since it was dispatched, so that it can add this to the systime value from the kernel in get_system_time(). This would require that we only update the systime field on dispatch, and not on resume.
>
> Andrew
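
A rough sketch of the scheme Andrew describes above, to make the arithmetic concrete. Everything here is an assumption for illustration: struct disp_time, on_dispatch and sketch_get_system_time are hypothetical names, not the real Barrelfish dispatcher structures or API, and the TSC values are passed in as parameters instead of being read with rdtsc().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-dispatcher timing state (not the real dispatcher struct). */
struct disp_time {
    uint64_t systime_ms;    /* kernel time in ms, written only on dispatch */
    uint64_t dispatch_tsc;  /* TSC value sampled at the same moment */
    uint64_t tsc_per_ms;    /* calibration value made available to user mode */
};

/* Called on dispatch (not on resume): record the kernel time together
 * with the TSC reading it corresponds to. */
static void on_dispatch(struct disp_time *d, uint64_t kernel_now_ms,
                        uint64_t tsc_now)
{
    d->systime_ms = kernel_now_ms;
    d->dispatch_tsc = tsc_now;
}

/* Kernel time at dispatch plus the milliseconds elapsed since then,
 * derived from the TSC delta. */
static uint64_t sketch_get_system_time(const struct disp_time *d,
                                       uint64_t tsc_now)
{
    assert(d->tsc_per_ms != 0);
    return d->systime_ms + (tsc_now - d->dispatch_tsc) / d->tsc_per_ms;
}
```

With this, two calls made 10 ms apart without an intervening dispatch would return values that differ by 10, which is the behaviour Lukas was missing.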
>
> -----Original Message-----
> From: Lukas Humbel [mailto:humbell at ethz.ch]
> Sent: Thursday, 3 January 2013 02:32
> To: Andrew Baumann
> Cc: barrelfish-users at lists.inf.ethz.ch
> Subject: Re: [Barrelfish-users] Threads on different cores and timers lead to deadlock
>
> Hi,
>
> Another question regarding deferred events: What is the preferred way of
> getting the remaining time until an event fires? I tried something like:
>
> time = get_system_time() + duration
>
> and then on request
>
> time - get_system_time()
>
> but it seems like the system time is only updated when the process
> yields. So it gives me the same value each time I call it, although
> there is a gap of 10ms between the calls. Should I use something that
> reads TSC?
>
> Lukas
>
>
> On 12/20/2012 04:51 PM, Andrew Baumann wrote:
>> Yes, you generally want to call event_dispatch() on the default waitset in a loop on a single main or message-handling thread.
>>
>> Andrew
>>
>> -----Original Message-----
>> From: Lukas Humbel [mailto:humbell at ethz.ch]
>> Sent: Thursday, 20 December 2012 06:04
>> To: Andrew Baumann
>> Cc: barrelfish-users at lists.inf.ethz.ch
>> Subject: Re: [Barrelfish-users] Threads on different cores and timers lead to deadlock
>>
>> Hi Andrew,
>>
>> Thanks. I'll try to use deferred events and see if it works. If
>> messages_wait_and_handle_next is deprecated, what should I use? Just
>> inline the function?
>>
>> I already tried replacing the while(1); loop with something that handles
>> the messages, but that didn't help either...
>>
>> Lukas
>>
>> On 12/20/2012 10:16 AM, Andrew Baumann wrote:
>>> Hi Lukas,
>>>
>>> The first thing to note is that the timer library you're using (<timer/timer.h>) has been superseded by the deferred events (<barrelfish/deferred.h>), which should be used for any new code. (We should really get around to updating the existing clients of the timer library and removing it.) The old timer library relies on IDC to a driver to function, which we've realised was a bad idea, whereas the new deferred events code plugs into the waitset logic directly and runs off the kernel timer, so is more accurate and not prone to the type of deadlock that you're seeing.
>>>
>>> To try to answer your questions: all the IDC mechanisms, including messages_wait_and_handle_next() which exists only for backwards compatibility and shouldn't be used in new code, are core-local. However, I suspect that by having the second core spin in while (1) you're probably preventing servicing of one of the internal event handlers needed by Flounder.
>>>
>>> Andrew
>>>
>>> -----Original Message-----
>>> From: Lukas Humbel [mailto:humbell at ethz.ch]
>>> Sent: Wednesday, 19 December 2012 11:17
>>> To: barrelfish-users at lists.inf.ethz.ch
>>> Subject: [Barrelfish-users] Threads on different cores and timers lead to deadlock
>>>
>>> Hi all,
>>>
>>> I'm having problems avoiding a deadlock when using the timer code,
>>> especially the function timer_remaining. It blocks in
>>> messages_wait_and_handle_next. There is also a comment next to it:
>>>
>>> // XXX: shouldn't block on default waitset! if we're in a callback we'll
>>> deadlock
>>>
>>> I'm not calling the function from within a callback. However, it does
>>> seem to be related to a flounder binding that I set up just before:
>>> when I remove it, the problem vanishes.
>>>
>>> Maybe this lock is just a symptom of a more fundamental problem so let
>>> me tell you what I'm trying to do:
>>> I have one domain which spans two cores and two threads (shared
>>> vspace): one on each core. The first thread exports a flounder
>>> interface, spins until a connection has been set up and enters the timer
>>> test (create, start, check remaining) and locks. The second thread
>>> connects to the interface, performs messages_wait_and_handle_next until
>>> the connection set up callback is called and enters a while(1){}. I'm
>>> using the default waitset everywhere, the flounder connection looks
>>> fine, as I can perform RPCs.
>>>
>>> Is this setup possible with flounder? Or is there something that
>>> prevents flounder from working when client and server are using the same
>>> vspace?
>>>
>>> Is messages_wait_and_handle_next "core aware"? That is, if I call it on
>>> core 0, will it only receive messages for core 0? (As far as I can see
>>> from my debug output, this seems to be true.) And, just out of curiosity,
>>> how about two threads on the same core?
>>>
>>> Do you have any ideas why this lock happens and how to avoid it?
>>>
>>>
>>> Cheers,
>>> Lukas
>>>
>>> _______________________________________________
>>> Barrelfish-users mailing list
>>> Barrelfish-users at lists.inf.ethz.ch
>>> https://lists.inf.ethz.ch/mailman/listinfo/barrelfish-users


