
Re: [Xen-devel] x86-64 problem with invalid page fault in linux 2.6.16-rc1

On SMP systems we need the guest to handle spurious page-not-present faults at any time and at any virtual address. This is a side effect of the writable pagetable implementation.

If the vmalloc_fault path no longer covers all of the kernel virtual address space, then spurious-fault detection needs to be added before oopsing the kernel.
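
To make the idea concrete, here is a rough sketch of what such a spurious-fault check could look like. It is a userspace model, not kernel code: struct pt_table, level_index() and fault_is_spurious() are hypothetical names invented for illustration, and the model assumes a plain 4-level x86-64 walk with 4K pages (no large pages) and only looks at the present bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ENTRIES_PER_TABLE 512      /* x86-64: 9 index bits per level */
#define PTE_PRESENT       0x1ULL   /* bit 0 of an x86 page table entry */
#define LEVELS            4        /* pgd, pud, pmd, pte */

/* Hypothetical model of one page-table page: the entry flags plus a
 * pointer to the next-level table (a real entry encodes a frame number). */
struct pt_table {
    uint64_t flags[ENTRIES_PER_TABLE];
    struct pt_table *next[ENTRIES_PER_TABLE];
};

/* Index into the table at the given level: 3 = pgd ... 0 = pte. */
static unsigned level_index(uint64_t va, int level)
{
    return (unsigned)((va >> (12 + 9 * level)) & (ENTRIES_PER_TABLE - 1));
}

/* Re-walk the tables for the faulting address. If every level is present,
 * the not-present fault was spurious and the access can simply be retried
 * instead of oopsing. */
static bool fault_is_spurious(const struct pt_table *pgd, uint64_t va)
{
    const struct pt_table *t = pgd;

    assert(pgd != NULL);
    for (int level = LEVELS - 1; level >= 0; level--) {
        unsigned idx = level_index(va, level);

        if (!(t->flags[idx] & PTE_PRESENT))
            return false;          /* genuinely not present */
        if (level > 0) {
            t = t->next[idx];
            if (t == NULL)
                return false;      /* malformed model table */
        }
    }
    return true;
}
```

A fault handler would call something like this for a not-present fault on a kernel address and return without further action when it reports true. A real implementation would also have to compare the fault's error code against the entry permissions.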

Alternatively it would be quite reasonable to extend the writable pagetable implementation to exclude areas of the virtual address space from being subject to batched wrpt updates. Updates to such areas would be emulated synchronously. This would be a good thing anyway -- updates to kernel va space rarely occur in batches and so emulation would improve performance.
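
As a sketch of the decision this would involve (illustrative only, not existing Xen code): the names va_range, update_mode and choose_update_mode() are made up here, and the excluded ranges would be whatever the guest registers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical half-open range [start, end) of virtual addresses that is
 * excluded from batched writable-pagetable updates. */
struct va_range {
    uint64_t start;
    uint64_t end;
};

enum update_mode {
    UPDATE_BATCHED,        /* normal writable-pagetable path */
    UPDATE_EMULATED_SYNC   /* emulate the single write immediately */
};

static bool va_in_range(const struct va_range *r, uint64_t va)
{
    return va >= r->start && va < r->end;
}

/* Choose how to apply a page table update that maps the given virtual
 * address: updates inside any excluded range are emulated synchronously,
 * everything else stays on the batched path. */
static enum update_mode choose_update_mode(const struct va_range *excluded,
                                           size_t nr_excluded, uint64_t va)
{
    for (size_t i = 0; i < nr_excluded; i++) {
        if (va_in_range(&excluded[i], va))
            return UPDATE_EMULATED_SYNC;
    }
    return UPDATE_BATCHED;
}
```

With the kernel part of the address space registered as an excluded range, an update there would never leave a pagetable page temporarily unhooked, so other VCPUs would not see spurious faults for those addresses.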

 -- Keir

On 20 Jan 2006, at 15:50, Jan Beulich wrote:

I'm just trying to see if anyone has any clue about this, which only appears to happen with MP guests:

Since the check for the modules area is gone in 2.6.16's vmalloc_fault() (and we appropriately merged this change to the Xen files), we are now seeing page faults in the module area, where a subsequent software page table walk shows all page table entries present, and a get_user from inside the hypervisor's or the guest's page fault handler also succeeds. The module in the questionable space was loaded significantly before the page fault occurs, and we never saw a fault after the system fully booted. Faults of this kind may have existed before, but would have been hidden by the vmalloc_fault() handling assuming that another processor would have put in place the pgd entry meanwhile.

I have no clue how such a fault could be raised in the first place: the pgd entry for the modules area is shared with main kernel code, all lower-level entries are shared across the kernel and all processes, and the fault happens on an access to the modules area from kernel code (guaranteeing that the system isn't unintentionally running with the user-mode page tables).

I can also mostly rule out any sort of hardware problem since the issue is visible on both Intel and AMD processors.

Thanks for any thoughts, hints, or pointers,
