[Barrelfish-users] A Weird Bug about Page Fault
Simon Peter
speter at inf.ethz.ch
Tue Dec 4 09:45:08 CET 2012
Hi Jinghao,
It seems this is SCC specific. I just ran your test-case on QEMU on both
x86-64 and -32 platforms and it seems to work just fine (i.e. I get the
"all good" output).
Simon
On 12/03/2012 12:47 AM, Shi Jinghao wrote:
> Hi Andrew,
>
> Thanks for your reply. Your point about the two different exceptions
> is insightful. I tried your suggestion, but it does not help: the NaN
> errors still occur. I also tried adding extra dummy floating-point
> operations in the page fault handler, and that does not help either.
>
> Thanks,
> Jinghao
>
> On Sun, Dec 2, 2012 at 2:06 AM, Andrew Baumann
> <Andrew.Baumann at microsoft.com> wrote:
>
> Hi Jinghao,
>
> I notice that the first time you use floating point in this program
> is when writing to the array. There should be two different
> exceptions raised and handled here: one for the page fault, and one
> for the first use of the floating point hardware (which we lazily
> context-switch). My guess is that the page-fault path, which is not
> heavily exercised, does not interact well with the floating point
> save/restore code.
>
> If you initialise the floating point hardware by doing some other
> floating point operations (or writing to a statically allocated
> variable) beforehand, does the problem go away?
>
> Andrew
>
> *From:* Shi Jinghao [mailto:jhshi at cs.hku.hk]
> *Sent:* Saturday, 1 December 2012 02:20
> *To:* barrelfish-users at lists.inf.ethz.ch
> *Subject:* [Barrelfish-users] A Weird Bug about Page Fault
>
> Hi,
>
> I've been developing a memory management library on Barrelfish
> (SCC). Recently I ran into a very weird bug involving page faults. I
> have attached a minimal test case (pgfault_test.tgz) that reproduces
> the bug.
>
> The workflow of the test case is simple:
>
> 1) Allocate an array of doubles as read-only, using frame_alloc and
> vspace_map_one_frame_attr (or pmap->f.map, this doesn't matter)
>
> 2) Initialize the array; this generates a page fault
>
> 3) In the page fault handler, remap the faulting page as read-write,
> using pmap->f.modify_flags
>
> The weird thing is: the first write to this array does not store the
> proper value, but NaN!
>
> I've conducted several runs and found the following:
>
> 1) The bug occurs when the array type is double or float.
> Everything is fine if it's an integer array.
>
> 2) Only the item that caused the page fault ends up as NaN; the
> other items are fine. This applies wherever the faulting item lies
> within the page, not just at the page start.
>
> 3) If you assign each array item a constant value (say 1.0),
> or an int/double variable, then all items end up with the correct
> value. It seems that only assigning i (or any expression
> containing i) to a[i] produces this bug.
>
> I tested the attached code in release2012-05-25 (the revision I work
> on) and the latest revision (release2012-10-03).
>
> I've also written a minimal test case for sccLinux (write_fault.c).
> There, everything works: no NaN values.
>
> This bug has bothered me for quite a few days. I would really
> appreciate it if someone could give a hint on this.
>
> Thanks,
>
> Jinghao
>
> _______________________________________________
> Barrelfish-users mailing list
> Barrelfish-users at lists.inf.ethz.ch
> https://lists.inf.ethz.ch/mailman/listinfo/barrelfish-users
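
For readers without the attachments, the Linux-side comparison described
in the original report can be sketched roughly as follows. This is a
hypothetical reconstruction, not the actual write_fault.c: it assumes
POSIX mmap/mprotect/sigaction in place of Barrelfish's frame_alloc and
pmap->f.modify_flags, and the names run_write_fault_test and on_fault
are made up for illustration. Note that mprotect is not on the POSIX
list of async-signal-safe functions; calling it from a SIGSEGV handler
works on Linux in practice but is not guaranteed by the standard.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

static long page_size;                    /* set before any fault can occur */
static volatile sig_atomic_t fault_count; /* number of faults handled */

/* SIGSEGV handler: remap the faulting page read-write, then return so
 * that the faulting store instruction is retried. */
static void on_fault(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    uintptr_t page = (uintptr_t)info->si_addr & ~((uintptr_t)page_size - 1);
    if (mprotect((void *)page, (size_t)page_size,
                 PROT_READ | PROT_WRITE) != 0) {
        _exit(2); /* cannot recover; avoid faulting forever */
    }
    fault_count++;
}

/* Maps npages read-only pages, writes a[i] = i into them (faulting once
 * per page), and returns the number of items holding a wrong value.
 * Returns 0 for "all good" and -1 if setup fails. */
int run_write_fault_test(size_t npages)
{
    assert(npages > 0);
    page_size = sysconf(_SC_PAGESIZE);
    fault_count = 0;

    size_t bytes = npages * (size_t)page_size;
    size_t n = bytes / sizeof(double);

    struct sigaction sa = {0};
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0) {
        return -1;
    }

    /* Step 1: allocate the array read-only. */
    double *a = mmap(NULL, bytes, PROT_READ,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (a == MAP_FAILED) {
        return -1;
    }

    /* Step 2: initialize with an expression containing i; the first
     * store to each page faults and is fixed up by on_fault (step 3). */
    for (size_t i = 0; i < n; i++) {
        a[i] = (double)i;
    }

    /* A NaN compares unequal to everything, so it is counted as bad. */
    int bad = 0;
    for (size_t i = 0; i < n; i++) {
        if (a[i] != (double)i) {
            bad++;
        }
    }

    munmap(a, bytes);
    return bad;
}
```

Calling run_write_fault_test(4) from a small main and printing the
result gives a quick check; a nonzero return would correspond to the
NaN symptom reported on the SCC.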