
Re: [PATCH] x86/hvm: Fix boot on systems where HVM isn't available


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>
  • Date: Mon, 7 Feb 2022 16:00:55 +0000
  • Cc: Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, "Roger Pau Monne" <roger.pau@xxxxxxxxxx>
  • Delivery-date: Mon, 07 Feb 2022 16:01:25 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 07/02/2022 08:29, Jan Beulich wrote:
> On 05.02.2022 10:47, Roger Pau Monné wrote:
>> On Fri, Feb 04, 2022 at 05:34:05PM +0000, Andrew Cooper wrote:
>>> c/s 27a63cdac388 ("x86/HVM: convert remaining hvm_funcs hook invocations to
>>> alt-call") went too far with dropping NULL function pointer checks.
> Oh, I'm sorry, I should have noticed this.
>
>>> smp_callin() calls hvm_cpu_up() unconditionally.  When the platform doesn't
>>> support HVM, hvm_enable() exits without filling in hvm_funcs, after which the
>>> altcall pass nukes the (now unconditional) indirect call, causing:
>>>
>>>   (XEN) ----[ Xen-4.17.0-10.18-d  x86_64  debug=y  Not tainted ]----
>>>   (XEN) CPU:    1
>>>   (XEN) RIP:    e008:[<ffff82d04034bef5>] start_secondary+0x393/0x3b7
>>>   (XEN) RFLAGS: 0000000000010086   CONTEXT: hypervisor
>>>   ...
>>>   (XEN) Xen code around <ffff82d04034bef5> (start_secondary+0x393/0x3b7):
>>>   (XEN)  ff ff 8b 05 1b 84 17 00 <0f> 0b 0f ff ff 90 89 c3 85 c0 0f 84 db fe ff ff
>>>   ...
>>>   (XEN) Xen call trace:
>>>   (XEN)    [<ffff82d04034bef5>] R start_secondary+0x393/0x3b7
>>>   (XEN)    [<ffff82d0402000e2>] F __high_start+0x42/0x60
>>>
>>> To make matters worse, __stop_this_cpu() calls hvm_cpu_down() unconditionally
>>> too, so what happens next is:
>>>
>>>   (XEN) ----[ Xen-4.17.0-10.18-d  x86_64  debug=y  Not tainted ]----
>>>   (XEN) CPU:    0
>>>   (XEN) RIP:    e008:[<ffff82d04034ab02>] __stop_this_cpu+0x12/0x3c
>>>   (XEN) RFLAGS: 0000000000010046   CONTEXT: hypervisor
>>>   ...
>>>   (XEN) Xen code around <ffff82d04034ab02> (__stop_this_cpu+0x12/0x3c):
>>>   (XEN)  48 89 e5 e8 8a 1d fd ff <0f> 0b 0f ff ff 90 0f 06 db e3 48 89 e0 48 0d ff
>>>   ...
>>>   (XEN) Xen call trace:
>>>   (XEN)    [<ffff82d04034ab02>] R __stop_this_cpu+0x12/0x3c
>>>   (XEN)    [<ffff82d04034ac15>] F smp_send_stop+0xdd/0xf8
>>>   (XEN)    [<ffff82d04034a229>] F machine_restart+0xa2/0x298
>>>   (XEN)    [<ffff82d04034a42a>] F arch/x86/shutdown.c#__machine_restart+0xb/0x11
>>>   (XEN)    [<ffff82d04022fd15>] F smp_call_function_interrupt+0xbf/0xea
>>>   (XEN)    [<ffff82d04034acc6>] F call_function_interrupt+0x35/0x37
>>>   (XEN)    [<ffff82d040331a70>] F do_IRQ+0xa3/0x6b5
>>>   (XEN)    [<ffff82d04039482a>] F common_interrupt+0x10a/0x120
>>>   (XEN)    [<ffff82d04031f649>] F __udelay+0x3a/0x51
>>>   (XEN)    [<ffff82d04034d5fb>] F __cpu_up+0x48f/0x734
>>>   (XEN)    [<ffff82d040203c2b>] F cpu_up+0x7d/0xde
>>>   (XEN)    [<ffff82d0404543d3>] F __start_xen+0x200b/0x2618
>>>   (XEN)    [<ffff82d0402000ef>] F __high_start+0x4f/0x60
>>>
>>> which recurses until hitting a stack overflow.  The #DF handler, which resets
>>> its stack on each invocation, loops indefinitely.
>>>
>>> Reinstate the NULL function pointer checks for hvm_cpu_{up,down}().
>>>
>>> Fixes: 27a63cdac388 ("x86/HVM: convert remaining hvm_funcs hook invocations to alt-call")
>>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>> Reviewed-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>

Thanks.

>
>>> ---
>>> CC: Jan Beulich <JBeulich@xxxxxxxx>
>>> CC: Roger Pau Monné <roger.pau@xxxxxxxxxx>
>>> CC: Wei Liu <wl@xxxxxxx>
>>>
>>> RFC.  Not tested yet on the impacted hardware.  It's a Xeon Phi with another
>>> weird thing in need of debugging.  First boot is fine, while the second
>>> boot (loading microcode this time) has a problem with VMX.
> That sounds familiar: my somewhat old Romley needs to be cold-booted
> for VMX to actually be usable (not locked) on the APs.

This is something which goes wrong as a consequence of loading microcode.

>>> I wonder if we want to modify the callers to check for HVM being enabled,
>>> rather than leaving the NULL pointer checks in a position where they're liable
>>> to be reaped again.
>> What about adding a couple of comments to hvm_cpu_{up,down} to note
>> they are called unconditionally regardless of whether HVM is present
>> or not?
> I second this as the perhaps better alternative: the S3 path is
> similarly affected (and you may want to mention this in the
> description), so this would mean up to 5 conditionals (at the
> source level) instead of just the two you get away with here.

Ok.  I've added:

/* Called in boot/resume paths.  Must cope with no HVM support. */

and:

/* Called in shutdown paths.  Must cope with no HVM support. */

~Andrew

 

