
Re: [BUG REPORT] soft_reset (kexec/kdump) does not work with mainline xen


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Dongli Zhang <dongli.zhang@xxxxxxxxxx>
  • Date: Fri, 25 Feb 2022 10:45:50 -0800
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Anthony Perard <anthony.perard@xxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Delivery-date: Fri, 25 Feb 2022 18:46:25 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Hi Jan,

On 2/24/22 11:15 PM, Jan Beulich wrote:
> On 24.02.2022 23:27, Dongli Zhang wrote:
>> Hello,
>>
>> This is to report that soft_reset (kexec/kdump) has not been working for me
>> for a long time.
>>
>> I have tested again with the most recent mainline xen and the most recent
>> mainline kernel.
>>
>> While it works with my old xen version, it does not work with mainline xen.
>>
>>
>> This is the log of my HVM guest.
>>
>> Waiting for domain test-vm (domid 1) to die [pid 1265]
>> Domain 1 has shut down, reason code 5 0x5
>> Action for shutdown reason code 5 is soft-reset
>> Done. Rebooting now
>> xc: error: Failed to set d1's policy (err leaf 0xffffffff, subleaf 0xffffffff, msr 0xffffffff) (17 = File exists): Internal error
> 
> I don't suppose you tried to track down the origin of this EEXIST? I think
> it's pretty obvious, as in the handling of XEN_DOMCTL_set_cpu_policy we have
> 
>         if ( d->creation_finished )
>             ret = -EEXIST; /* No changing once the domain is running. */
> 
> Question is how to address it: One approach could be to clear
> d->creation_finished in domain_soft_reset(). But I think it would be more
> clean if the tool stack avoided trying to set the CPUID policy (again) on
> the guest when it soft-resets, as it's still the same guest after all.
> Cc-ing Andrew and Anthony for possible thoughts.
> 

The soft_reset of my HVM guest succeeds once I clear d->creation_finished at
the beginning of domain_soft_reset(). For now I am using this as a workaround
to test kexec/kdump.

However, while my image's console works well on old Xen versions, it does not
work well on mainline Xen.

I connect to the console with "xl console <domid>" immediately after the domU
panics (and kdump is triggered), but on mainline Xen I do not get the kdump
kernel's syslog output. The same image works on the old Xen version.

Thank you very much!

Dongli Zhang



 

