
RE: [Xen-devel] Re: Reproducable data corruption on xen-unstable

On Sat, 5 Feb 2005, Robin Green wrote:
> On the assumption that this _is_ an FP save/restore bug,

Update: I have narrowed down this bug.

I have confirmed that there IS definitely an FP save/restore bug with this kernel/Xen combination (i.e. I have eliminated the possibility that it was just a non-floating-point-related bug)! I identified it using a different test case (running wget -d in a konsole), and I have established that it is case 1 in the list of possible causes I gave, namely:

1. Something leaves the FPU in a state where it has bogus data in it,
   but it won't trap to tell the kernel to restore the old, correct data

More specifically, in this particular case, according to my printf's, what happened was:

A syscall was made (connect). Immediately before the syscall, the floating-point stack was empty; immediately after the syscall, the floating-point stack was nonempty, and the TS (Task Switched) flag in CR0 was _cleared_.
(Source code and output available on request.)
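
For anyone who wants to repeat this kind of check, here is a minimal sketch of the decision logic only. It is not the test program mentioned above. It assumes you have already captured the 16-bit x87 tag word (as stored by FNSTENV/FSAVE, two bits per register, 11 = empty) and a CR0 value before and after the syscall by whatever means you have; the function names are made up for illustration.

```python
CR0_TS = 1 << 3      # TS is bit 3 of CR0
TAG_EMPTY = 0b11     # FNSTENV/FSAVE tag value for an empty register


def occupied_x87_slots(tag_word: int) -> int:
    """Count the x87 registers that are not tagged empty."""
    count = 0
    for reg in range(8):
        tag = (tag_word >> (2 * reg)) & 0b11
        if tag != TAG_EMPTY:
            count += 1
    return count


def matches_symptom(tag_before: int, tag_after: int, cr0_after: int) -> bool:
    """True if the stack was empty before, non-empty after, and TS is clear."""
    was_empty = occupied_x87_slots(tag_before) == 0
    now_dirty = occupied_x87_slots(tag_after) > 0
    ts_clear = (cr0_after & CR0_TS) == 0
    return was_empty and now_dirty and ts_clear
```

A tag word of 0xFFFF means all eight registers are empty; anything else after the syscall, combined with TS clear, is the pattern described above.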

This may not immediately cause problems. But each leftover entry reduces the usable depth of the eight-register x87 stack, so over time it would tend to lead to floating-point stack overflow, which makes floating-point calculations produce bogus output.
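
To illustrate why the damage can be delayed, here is a toy model (an assumption-laden sketch, not real FPU behaviour in every detail): the stack has eight slots, some of them already taken by stale entries, and a push onto a full stack yields NaN in place of the pushed value, roughly like a masked stack overflow. The function name and the "add up N ones" workload are invented for the example.

```python
import math

X87_DEPTH = 8  # the x87 register stack has eight slots


def sum_using_stack(operands: int, leaked: int) -> float:
    """Push `operands` copies of 1.0 on top of `leaked` stale entries, then add them."""
    stack = [0.0] * leaked  # stale entries that nobody will ever pop
    for _ in range(operands):
        if len(stack) == X87_DEPTH:
            # Toy masked overflow: the slot that wraps around is overwritten
            # with NaN instead of receiving the pushed value.
            stack.pop(0)
            stack.append(math.nan)
        else:
            stack.append(1.0)
    # Only the entries this computation pushed are summed.
    return sum(stack[len(stack) - operands:])
```

In this model a computation needing four slots still works with four stale entries present, but fails with five, and a computation needing all eight slots fails with even one stale entry.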

So, in theory, there are two possible algorithms the kernel could be following to avoid this situation:

A. Always set TS on task switch (Seems like the logical choice!)

B. Always set TS on task switch, except when the FPU has not been used by the switched-to process, in which case do an FINIT on task switch. (This seems pointlessly complicated and slow, so I doubt the kernel follows this approach.)
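
As a rough illustration of approach A and of what goes wrong when TS is not set, here is a toy model. It is purely hypothetical: the class, its fields and the trap handler are invented for the example and are not taken from the Linux or Xen sources. With TS set on every switch, the first FP instruction traps and the handler swaps in the right state; with TS left clear, the new task silently sees the old task's stack.

```python
class ToyCPU:
    """Toy lazy-FPU model; not real kernel or hypervisor code."""

    def __init__(self, set_ts_on_switch: bool = True):
        self.set_ts_on_switch = set_ts_on_switch
        self.ts = False
        self.fpu = []          # live x87 stack contents
        self.fpu_owner = None  # task whose state is currently in self.fpu
        self.saved = {}        # task name -> saved FPU stack
        self.current = None

    def switch_to(self, task: str) -> None:
        self.current = task
        if self.set_ts_on_switch:
            self.ts = True     # approach A: always set TS on task switch

    def _device_not_available(self) -> None:
        # Trap taken on the first FP instruction while TS is set.
        self.ts = False
        if self.fpu_owner is not None:
            self.saved[self.fpu_owner] = list(self.fpu)
        self.fpu = list(self.saved.get(self.current, []))
        self.fpu_owner = self.current

    def fpu_push(self, value: float) -> None:
        if self.ts:
            self._device_not_available()
        self.fpu.append(value)

    def fpu_depth(self) -> int:
        if self.ts:
            self._device_not_available()
        return len(self.fpu)


def depth_seen_by_second_task(set_ts_on_switch: bool) -> int:
    """Task 'a' pushes one value, then task 'b' looks at the stack depth."""
    cpu = ToyCPU(set_ts_on_switch)
    cpu.switch_to("a")
    cpu.fpu_push(1.0)
    cpu.switch_to("b")
    return cpu.fpu_depth()
```

Here depth_seen_by_second_task(True) gives 0 (task b starts with an empty stack), while depth_seen_by_second_task(False) gives 1, which is the same shape as the symptom described above.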

So, it looks like we are looking for a code path in which TS doesn't end
up set after a task switch. (And it might be specifically to do with

I will look for one - but does anyone have any ideas for what that code path might be, or how I could efficiently debug the kernel (while in X, remember, because this doesn't seem to occur in text mode!) to find out what that code path is? I don't have a serial console.

