
Re: HVM/PVH Balloon crash


  • To: Elliott Mitchell <ehem+xen@xxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Wed, 15 Sep 2021 08:05:05 +0200
  • Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Wed, 15 Sep 2021 06:05:25 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 15.09.2021 04:40, Elliott Mitchell wrote:
> On Tue, Sep 07, 2021 at 05:57:10PM +0200, Jan Beulich wrote:
>> On 07.09.2021 17:03, Elliott Mitchell wrote:
>>>  It could be that this system is in an
>>> intergenerational hole, and some spot in the PVH/HVM code assumes that
>>> the presence of NPT guarantees the presence of an operational IOMMU.
>>> Otherwise, if there was some copy and paste while writing the IOMMU
>>> code, some portion of it might be checking for the presence of
>>> NPT instead of the presence of an IOMMU.
>>
>> This is all very speculative; I consider what you suspect not very likely,
>> but also not entirely impossible. This is not the least because for a
>> long time we've been running without shared page tables on AMD.
>>
>> I'm afraid without technical data and without knowing how to repro, I
>> don't see a way forward here.
> 
> Downtimes are very expensive even for lower-end servers.  Plus there is
> the issue the system wasn't meant for development and thus never had
> appropriate setup done.
> 
> Experimentation with a system of similar age suggested another candidate.
> The system has a conventional BIOS.  Might some dependencies on the
> presence of UEFI have snuck into the NPT code?

I can't think of any such, but as all of this is very nebulous I can't
really rule out anything.

Jan
