
Re: [Xen-devel] xen-4.7 regression when saving a pv guest



On 26.08.2016 13:53, Juergen Gross wrote:
> On 26/08/16 12:52, Stefan Bader wrote:
>> On 25.08.2016 19:31, Juergen Gross wrote:
>>> On 25/08/16 17:48, Stefan Bader wrote:
>>>> When I try to save a PV guest with 4G of memory using xen-4.7 I get the
>>>> following error:
>>>>
>>>> II: Guest memory 4096 MB
>>>> II: Saving guest state to file...
>>>> Saving to /tmp/pvguest.save new xl format (info 0x3/0x0/1131)
>>>> xc: info: Saving domain 23, type x86 PV
>>>> xc: error: Bad mfn in p2m_frame_list[0]: Internal error
>>>
>>> So the first mfn of the memory containing the p2m information is bogus.
>>> Weird.
>>
>> Hm, not sure how bogus. From below the first mfn is 0x4eb1c8 and points to
>> pfn=0xff7c8 which is above the current max of 0xbffff. But then the dmesg
>> inside the guest said: "last_pfn = 0x100000", which would be larger than
>> the pfn causing the error.
>>
>>>
>>>> xc: error: mfn 0x4eb1c8, max 0x820000: Internal error
>>>> xc: error:   m2p[0x4eb1c8] = 0xff7c8, max_pfn 0xbffff: Internal error
>>>> xc: error: Save failed (34 = Numerical result out of range): Internal error
>>>> libxl: error: libxl_stream_write.c:355:libxl__xc_domain_save_done: saving domain: domain did not respond to suspend request: Numerical result out of range
>>>> Failed to save domain, resuming domain
>>>> xc: error: Dom 23 not suspended: (shutdown 0, reason 255): Internal error
>>>> libxl: error: libxl_dom_suspend.c:460:libxl__domain_resume: xc_domain_resume failed for domain 23: Invalid argument
>>>> EE: Guest not off after save!
>>>> FAIL
>>>>
>>>> From dmesg inside the guest:
>>>> [    0.000000] e820: last_pfn = 0x100000 max_arch_pfn = 0x400000000
>>>>
>>>> Somehow I am slightly suspicious about
>>>>
>>>> commit 91e204d37f44913913776d0a89279721694f8b32
>>>>   libxc: try to find last used pfn when migrating
>>>>
>>>> since that seems to potentially lower ctx->x86_pv.max_pfn which is checked
>>>> against in mfn_in_pseudophysmap(). Is that a known problem?
>>>> With xen-4.6 and the same dom0/guest kernel version combination this does work.
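
(For context, my reading of that check: each mfn of the p2m frame list is
validated against the host's machine frame count and, via the m2p table,
against the saved max_pfn. A rough illustration with the numbers from the
log above -- just a sketch, not the actual mfn_in_pseudophysmap() code:)

#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* Values copied from the error messages above. */
    uint64_t mfn     = 0x4eb1c8;  /* p2m_frame_list[0]                      */
    uint64_t max_mfn = 0x820000;  /* host machine frames ("max" in the log) */
    uint64_t pfn     = 0xff7c8;   /* m2p[mfn]                               */
    uint64_t max_pfn = 0xbffff;   /* as computed by the saver               */

    bool ok = (mfn < max_mfn) && (pfn <= max_pfn);

    printf("mfn %#" PRIx64 " -> pfn %#" PRIx64 ": %s\n",
           mfn, pfn, ok ? "ok" : "bad mfn");

    /* pfn 0xff7c8 > max_pfn 0xbffff, so the check fails -- even though the
     * guest's e820 last_pfn is 0x100000, i.e. the pfn itself is fine. The
     * real problem is the too-small max_pfn, not the mfn.                  */
    return ok ? 0 : 1;
}
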
>>>
>>> Can you please share some more information? Especially:
>>>
>>> - guest kernel version?
>> Hm, apparently 4.4 and 4.6 with stable updates. I just tried a much older
>> guest kernel (3.2) environment and that works. So it is the combination of
>> switching from xen-4.6 to 4.7 and guest kernels running 4.4 and later. And
>> while the exact mfn/pfn which gets dumped varies a little, the offending
>> mapping always points to 0xffxxx, which would be below last_pfn.
> 
> Aah, okay. The problem seems to be specific to the linear p2m list
> handling.
> 
> Trying on my system... Yep, seeing your problem, too.
> 
> Weird that nobody else stumbled over it.
> Ian, don't we have any test in OSSTEST which should catch this problem?
> A 4GB 64-bit pv-domain with Linux kernel 4.3 or newer can't be saved
> currently.
> 
> Following upstream patch fixes it for me:

Ah! :) Thanks. I applied the below locally, too. And save works with a 4.6 guest
kernel.
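
FWIW the numbers add up, if I do the math right: with fpp = 512 (4k pages,
8-byte p2m entries on a 64-bit guest) a 4G domain needs four level-2 slots
of the p2m list, so when the scan ends with saved_idx == idx_end (3 here, if
I read it right) the hunk quoted below computes max_pfn = 0xbffff -- exactly
the bogus value from the save error -- while with the saved_idx++ it becomes
0xfffff, matching last_pfn = 0x100000 from the guest's dmesg. A quick sketch
of that arithmetic (my own illustration under those assumptions, not code
from the tree):

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* fpp = p2m entries per page: 4096-byte page / 8-byte entry = 512
     * (assuming a 64-bit PV guest).                                       */
    uint64_t fpp = 512;

    /* Assumption: for this 4G guest the scan ends with saved_idx equal to
     * idx_end == 3, i.e. the last level-2 slot is the last one in use.    */
    uint64_t max_pfn_old = ((uint64_t)3 << 9) * fpp - 1; /* without the fix */
    uint64_t max_pfn_new = ((uint64_t)4 << 9) * fpp - 1; /* with saved_idx++ */

    printf("without fix: max_pfn = %#" PRIx64 "\n", max_pfn_old); /* 0xbffff */
    printf("with fix:    max_pfn = %#" PRIx64 "\n", max_pfn_new); /* 0xfffff */

    /* 0xbffff is exactly the too-small max_pfn in the save error above;
     * 0xfffff is last_pfn - 1 from the guest's dmesg (last_pfn = 0x100000). */
    return 0;
}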

-Stefan

> 
> diff --git a/tools/libxc/xc_sr_save_x86_pv.c b/tools/libxc/xc_sr_save_x86_pv.c
> index 4a29460..7043409 100644
> --- a/tools/libxc/xc_sr_save_x86_pv.c
> +++ b/tools/libxc/xc_sr_save_x86_pv.c
> @@ -430,6 +430,8 @@ static int map_p2m_list(struct xc_sr_context *ctx, uint64_t p2m_cr3)
> 
>          if ( level == 2 )
>          {
> +            if ( saved_idx == idx_end )
> +                saved_idx++;
>              max_pfn = ((xen_pfn_t)saved_idx << 9) * fpp - 1;
>              if ( max_pfn < ctx->x86_pv.max_pfn )
>              {
> 
> 
> Juergen
> 

