
Re: HVM guest only bring up a single vCPU


  • To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Fri, 27 Aug 2021 08:28:24 +0200
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Fri, 27 Aug 2021 06:28:37 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 27.08.2021 01:42, Andrew Cooper wrote:
> On 26/08/2021 22:00, Julien Grall wrote:
>> Hi Andrew,
>>
>> While doing more testing today, I noticed that only one vCPU would be
>> brought up with an HVM guest with Xen 4.16 on my setup (QEMU):
>>
>> [    1.122180] ================================================================================
>> [    1.122180] UBSAN: shift-out-of-bounds in oss/linux/arch/x86/kernel/apic/apic.c:2362:13
>> [    1.122180] shift exponent -1 is negative
>> [    1.122180] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-rc7+ #304
>> [    1.122180] Hardware name: Xen HVM domU, BIOS 4.16-unstable 06/07/2021
>> [    1.122180] Call Trace:
>> [    1.122180]  dump_stack_lvl+0x56/0x6c
>> [    1.122180]  ubsan_epilogue+0x5/0x50
>> [    1.122180]  __ubsan_handle_shift_out_of_bounds+0xfa/0x140
>> [    1.122180]  ? cgroup_kill_write+0x4d/0x150
>> [    1.122180]  ? cpu_up+0x6e/0x100
>> [    1.122180]  ? _raw_spin_unlock_irqrestore+0x30/0x50
>> [    1.122180]  ? rcu_read_lock_held_common+0xe/0x40
>> [    1.122180]  ? irq_shutdown_and_deactivate+0x11/0x30
>> [    1.122180]  ? lock_release+0xc7/0x2a0
>> [    1.122180]  ? apic_id_is_primary_thread+0x56/0x60
>> [    1.122180]  apic_id_is_primary_thread+0x56/0x60
>> [    1.122180]  cpu_up+0xbd/0x100
>> [    1.122180]  bringup_nonboot_cpus+0x4f/0x60
>> [    1.122180]  smp_init+0x26/0x74
>> [    1.122180]  kernel_init_freeable+0x183/0x32d
>> [    1.122180]  ? _raw_spin_unlock_irq+0x24/0x40
>> [    1.122180]  ? rest_init+0x330/0x330
>> [    1.122180]  kernel_init+0x17/0x140
>> [    1.122180]  ? rest_init+0x330/0x330
>> [    1.122180]  ret_from_fork+0x22/0x30
>> [    1.122244] ================================================================================
>> [    1.123176] installing Xen timer for CPU 1
>> [    1.123369] x86: Booting SMP configuration:
>> [    1.123409] .... node  #0, CPUs:      #1
>> [    1.154400] Callback from call_rcu_tasks_trace() invoked.
>> [    1.154491] smp: Brought up 1 node, 1 CPU
>> [    1.154526] smpboot: Max logical packages: 2
>> [    1.154570] smpboot: Total of 1 processors activated (5999.99 BogoMIPS)
>>
>> I have tried a PV guest (same setup) and the kernel could bring up all
>> the vCPUs.
>>
>> Digging down, Linux will set smp_num_siblings to 0 (via
>> detect_ht_early()) and as a result will skip all the CPUs. The value
>> is retrieved from a CPUID leaf. So it sounds like we don't set the
>> leaf correctly.
>>
>> FWIW, I have also tried on Xen 4.11 and could spot the same issue.
>> Does this ring a bell for you?
> 
> The CPUID data we give to guests is generally nonsense when it comes to
> topology.  By any chance does the hardware you're booting this on not
> have hyperthreading enabled/active to begin with?

Well, I'd put the question slightly differently: What CPUID data does
qemu supply to Xen here? I could easily see us making an assumption
somewhere that is met by all hardware but is theoretically wrong to
make and not met by qemu, which then leads to further issues with what
we expose to our guest.
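
For reference, the failure in the quoted report can be reproduced in
isolation. The following is a minimal sketch, not kernel code: the helper
names are made up for illustration, and it assumes that Linux takes
smp_num_siblings from CPUID leaf 1, EBX bits 23:16, and that
apic_id_is_primary_thread() shifts by fls(smp_num_siblings) - 1. Under
those assumptions a sibling count of 0 yields the shift exponent of -1
that UBSAN complains about.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: 1-based position of the highest set bit,
 * returning 0 for an input of 0 (the convention assumed for fls()).
 */
int fls_sketch(uint32_t x)
{
    int pos = 0;

    while (x) {
        pos++;
        x >>= 1;
    }
    return pos;
}

/*
 * CPUID.1:EBX[23:16] holds the maximum number of addressable logical
 * processor IDs per package. Assumption: this is the field that ends
 * up in smp_num_siblings.
 */
unsigned int siblings_from_leaf1_ebx(uint32_t ebx)
{
    return (ebx >> 16) & 0xff;
}

/* Shift exponent that would be used to build the SMT mask. */
int smt_shift_exponent(unsigned int siblings)
{
    return fls_sketch(siblings) - 1;
}

/* The mask is only well defined when the exponent is non-negative. */
bool smt_mask_is_defined(unsigned int siblings)
{
    return smt_shift_exponent(siblings) >= 0;
}
```

With an EBX value whose bits 23:16 are zero, siblings_from_leaf1_ebx()
returns 0 and smt_shift_exponent(0) is -1, so smt_mask_is_defined(0) is
false. A value of 2 in that field gives an exponent of 1.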

Jan




 

