Re: [Xen-devel] HVM domains crash after upgrade from XEN 4.5.1 to 4.5.2



On 12/11/15 14:29, Atom2 wrote:
> Hi Andrew,
> thanks for your reply. Answers are inline further down.
>
> On 12.11.15 at 14:01, Andrew Cooper wrote:
>> On 12/11/15 12:52, Jan Beulich wrote:
>>>>>> On 12.11.15 at 02:08, <ariel.atom2@xxxxxxxxxx> wrote:
>>>> After the upgrade HVM domUs appear to no longer work - regardless of the
>>>> dom0 kernel (tested with both 3.18.9 and 4.1.7 as the dom0 kernel); PV
>>>> domUs, however, work just fine as before on both dom0 kernels.
>>>>
>>>> xl dmesg shows the following information after the first crashed HVM
>>>> domU which is started as part of the machine booting up:
>>>> [...]
>>>> (XEN) Failed vm entry (exit reason 0x80000021) caused by invalid guest state (0).
>>>> (XEN) ************* VMCS Area **************
>>>> (XEN) *** Guest State ***
>>>> (XEN) CR0: actual=0x0000000000000039, shadow=0x0000000000000011, gh_mask=ffffffffffffffff
>>>> (XEN) CR4: actual=0x0000000000002050, shadow=0x0000000000000000, gh_mask=ffffffffffffffff
>>>> (XEN) CR3: actual=0x0000000000800000, target_count=0
>>>> (XEN)      target0=0000000000000000, target1=0000000000000000
>>>> (XEN)      target2=0000000000000000, target3=0000000000000000
>>>> (XEN) RSP = 0x0000000000006fdc (0x0000000000006fdc)  RIP = 0x0000000100000000 (0x0000000100000000)
>>> Other than RIP looking odd for a guest still in non-paged protected
>>> mode, I can't seem to spot anything wrong with guest state.
>> odd? That will be the source of the failure.
>>
>> Outside of long mode, the upper 32 bits of %rip should all be zero, and
>> it should not be possible to set any of them.
>>
>> I suspect that the guest has exited for emulation, and there has been a
>> bad update to %rip.  The alternative (which I hope is not the case) is
>> that there is a hardware erratum which allows the guest to accidentally
>> get itself into this condition.
>>
>> Are you able to rerun with a debug build of the hypervisor?
> Given that I am compiling from source under Gentoo and provided you
> lend me a helping hand in case I get stuck, I am confident that this
> is possible.
>
> Gentoo has three Xen packages (they call those ebuilds) as follows:
>     app-emulation/xen
>     app-emulation/xen-tools
>     app-emulation/pvgrub
> all of which are installed on my system. The former two offer a debug
> USE-flag and I assume that debug code for the latter is not required
> as this is for (the still working) PV domUs only. Furthermore as you
> are talking about the hypervisor, I guess it is safe to assume that it
> is app-emulation/xen and not xen-tools. Right?

I would guess so.

>
> BTW: The description of the debug USE flag reads as follows:
> Enable extra debug codepaths, like asserts and extra output. If you
> want to get meaningful backtraces see
> https://wiki.gentoo.org/wiki/Project:Quality_Assurance/Backtraces
> I assume that backtraces are probably not required to get things moving.

By the sounds of it, the debug USE flag is what you want.

>
> Another question is whether prior to enabling the debug USE flag it
> might make sense to re-compile with gcc-4.8.5 (please see my previous
> list reply) to rule out any compiler related issues. Jan, Andrew -
> what are your thoughts?

First of all, check whether the compiler makes a difference on 4.5.2.

If both compiles result in a guest crashing in that manner, test a debug
Xen to see if any assertions/errors are encountered just before the
guest crashes.
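
To make the %rip point quoted above concrete, here is a rough sketch of how the dumped values can be read. It is illustrative Python, not Xen code; the helper names are made up for this mail, and the numbers are copied from the VMCS dump. It assumes the usual VMX encoding, where bit 31 of the exit reason flags a failed VM entry and the low 16 bits hold the basic reason.

```python
# Illustrative only: not Xen code. Values are copied from the VMCS dump above.
CR0_PE = 1 << 0    # CR0.PE: protected mode enabled
CR0_PG = 1 << 31   # CR0.PG: paging enabled (a prerequisite for long mode)


def decode_exit_reason(reason: int) -> tuple:
    """Split a VMX exit reason into (vm_entry_failed, basic_reason)."""
    return bool(reason & (1 << 31)), reason & 0xFFFF


def rip_valid_outside_long_mode(rip: int) -> bool:
    """Outside long mode, bits 63:32 of RIP must all be zero."""
    return (rip >> 32) == 0


cr0 = 0x0000000000000039
rip = 0x0000000100000000

entry_failed, basic_reason = decode_exit_reason(0x80000021)
print("VM entry failed:", entry_failed, "basic reason:", hex(basic_reason))
print("PE set:", bool(cr0 & CR0_PE), "PG set:", bool(cr0 & CR0_PG))
print("RIP acceptable outside long mode:", rip_valid_outside_long_mode(rip))
```

With PG clear the guest cannot be in long mode, so a %rip with bit 32 set is enough on its own to make the VM entry fail with basic reason 0x21 (invalid guest state).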

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

