
Re: [Xen-devel] 2.6.32 PV Xen domU guest panic on nested call to arch_enter_lazy_mmu_mode()



On 12/08/2010 12:48 AM, Jan Beulich wrote:
>>>> On 08.12.10 at 01:54, Chuck Anderson <chuck.anderson@xxxxxxxxxx> wrote:
>> I'm posting this because I am writing a patch to fix a 2.6.32 based PV 
>> Xen domU panic due to a nested call to arch/x86/include/asm/paravirt.h 
>> arch_enter_lazy_mmu_mode() (see details below).  The following BUG_ON() 
>> was triggered:
>>
>>     arch/x86/kernel/paravirt.c
>>
>>     static inline void enter_lazy(enum paravirt_lazy_mode mode)
>>     {
>>             BUG_ON(percpu_read(paravirt_lazy_mode) != PARAVIRT_LAZY_NONE);
>>
>>             percpu_write(paravirt_lazy_mode, mode);
>>     }
>>
>> because enter_lazy() was called twice, once through mm/memory.c 
>> copy_pte_range() and a second time through an interrupt path.
>>
>> The easy fix is to disable interrupts in copy_pte_range() before calling 
>> arch_enter_lazy_mmu_mode() and re-enable them after the call to 
>> arch_leave_lazy_mmu_mode() but I'm asking if there is a better way to 
>> handle this.  If disabling interrupts is best, there are other calls to 
>> arch_enter_lazy_mmu_mode() that appear to have the same interruption 
>> issue.  It may be best then to disable interrupts in 
>> arch_enter_lazy_mmu_mode() or paravirt_enter_lazy_mmu().
> I don't think this is an option, as the period of time for which you
> would disable interrupts could be pretty much unbounded.
>
> Instead (being a performance optimization only anyway)
> the BUG_ON() could be removed (accepting that the
> interrupted sequence would not batch any further
> hypercalls, and provided all of this stuff can actually be
> used in a nested way), the flag could be converted to a
> counter (again provided nesting is okay here in the first
> place), or a filter could be applied when actually checking
> whether to batch (which is what we do in our non-pvops
> kernels: in IRQ context, no batching happens).

That's what happens in pvops kernels too - batching is disabled in
interrupt context so that (for example) vmalloc pagefault pte updates
aren't deferred.

Looks like enter/leave lazy should just be a no-op in interrupt context too.

Though I'm surprised it has taken so long for this to appear.
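
To make the "no-op in interrupt context" idea concrete, here is a rough
user-space model of it, not a kernel patch. The names lazy_mode,
model_in_interrupt and simulated_interrupt() are hypothetical stand-ins
(for the per-CPU paravirt_lazy_mode variable, for an in_interrupt()-style
check, and for an interrupt handler that touches page tables), and
assert() stands in for BUG_ON():

```c
#include <assert.h>
#include <stdbool.h>

/* Same three modes as the kernel's enum paravirt_lazy_mode. */
enum paravirt_lazy_mode {
    PARAVIRT_LAZY_NONE,
    PARAVIRT_LAZY_MMU,
    PARAVIRT_LAZY_CPU,
};

/* Hypothetical stand-in for the per-CPU variable (one "CPU" only). */
static enum paravirt_lazy_mode lazy_mode = PARAVIRT_LAZY_NONE;

/* Hypothetical stand-in for an in_interrupt()-style check. */
static bool model_in_interrupt = false;

static void enter_lazy(enum paravirt_lazy_mode mode)
{
    /* In interrupt context: do nothing, so the interrupted
     * sequence keeps its lazy mode and nothing nests. */
    if (model_in_interrupt)
        return;

    assert(lazy_mode == PARAVIRT_LAZY_NONE);   /* models the BUG_ON() */
    lazy_mode = mode;
}

static void leave_lazy(enum paravirt_lazy_mode mode)
{
    /* Must mirror enter_lazy(), otherwise the interrupt path
     * would clear the mode owned by the interrupted code. */
    if (model_in_interrupt)
        return;

    assert(lazy_mode == mode);
    lazy_mode = PARAVIRT_LAZY_NONE;
}

/* Models an interrupt arriving in the middle of a lazy MMU section
 * and itself running through an enter/leave pair. */
static void simulated_interrupt(void)
{
    bool saved = model_in_interrupt;

    model_in_interrupt = true;
    enter_lazy(PARAVIRT_LAZY_MMU);   /* no-op, no assertion failure */
    leave_lazy(PARAVIRT_LAZY_MMU);   /* no-op, outer mode untouched */
    model_in_interrupt = saved;
}
```

Calling enter_lazy(PARAVIRT_LAZY_MMU), then simulated_interrupt(), then
leave_lazy(PARAVIRT_LAZY_MMU) goes through without tripping either
assertion, where the unguarded version would hit the first one on the
nested enter. Whether the real check belongs in enter_lazy()/leave_lazy()
or in their callers is left open here.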

    J

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

