Re: Crash when using cpupools


  • To: Juergen Gross <jgross@xxxxxxxx>
  • From: Bertrand Marquis <Bertrand.Marquis@xxxxxxx>
  • Date: Mon, 6 Sep 2021 10:14:39 +0000
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Dario Faggioli <dfaggioli@xxxxxxxx>
  • Delivery-date: Mon, 06 Sep 2021 10:14:59 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-topic: Crash when using cpupools

Hi Juergen,

> On 6 Sep 2021, at 10:28, Juergen Gross <jgross@xxxxxxxx> wrote:
> 
> On 06.09.21 10:46, Andrew Cooper wrote:
>> On 06/09/2021 09:23, Juergen Gross wrote:
>>> On 03.09.21 17:41, Bertrand Marquis wrote:
>>>> Hi,
>>>> 
>>>> While doing some investigation with cpupools I encountered a crash
>>>> when trying to isolate a guest to its own physical cpu.
>>>> 
>>>> I am using current staging status.
>>>> 
>>>> I did the following (on FVP with 8 cores):
>>>> - start dom0 with dom0_max_vcpus=1
>>>> - remove core 1 from dom0 cpupool: xl cpupool-cpu-remove Pool-0 1
>>>> - create a new pool: xl cpupool-create name=\"NetPool\"
>>>> - add core 1 to the pool: xl cpupool-cpu-add NetPool 1
>>>> - create a guest in NetPool using the following in the guest config:
>>>> pool="NetPool"
>>>> 
>>>> I end with a crash with the following call trace during guest creation:
>>>> (XEN) Xen call trace:
>>>> (XEN)    [<0000000000234cb0>] credit2.c#csched2_alloc_udata+0x58/0xfc (PC)
>>>> (XEN)    [<0000000000234c80>] credit2.c#csched2_alloc_udata+0x28/0xfc (LR)
>>>> (XEN)    [<0000000000242d38>] sched_move_domain+0x144/0x6c0
>>>> (XEN)    [<000000000022dd18>] cpupool.c#cpupool_move_domain_locked+0x38/0x70
>>>> (XEN)    [<000000000022fadc>] cpupool_do_sysctl+0x73c/0x780
>>>> (XEN)    [<000000000022d8e0>] do_sysctl+0x788/0xa58
>>>> (XEN)    [<0000000000273b68>] traps.c#do_trap_hypercall+0x78/0x170
>>>> (XEN)    [<0000000000274f70>] do_trap_guest_sync+0x138/0x618
>>>> (XEN)    [<0000000000260458>] entry.o#guest_sync_slowpath+0xa4/0xd4
>>>> 
>>>> After some debugging I found out that unit->vcpu_list is NULL after
>>>> unit->vcpu_list = d->vcpu[unit->unit_id]; with unit_id 0 in
>>>> common/sched/core.c:688
>>>> This makes the call to is_idle_unit(unit) in csched2_alloc_udata
>>>> trigger the crash.
>>> 
>>> So there is no vcpu 0 in that domain? How is this possible?
>> Easy, depending on the order of hypercalls issued by the toolstack.
>> Between DOMCTL_createdomain and DOMCTL_max_vcpus, the domain exists but
>> the vcpus haven't been allocated.
> 
> Oh yes, indeed.
> 
> Bertrand, does the attached patch fix the issue for you?

It does, my guest is now booting properly :-)
So this solves the issue on Arm (and probably on x86 too, if it was present
there, but I have not tested that).

Feel free to add to your patch my:
Reviewed-by: Bertrand Marquis <bertrand.marquis@xxxxxxx>

Thanks a lot for the quick fix.
Cheers
Bertrand

> 
> 
> Juergen
> <0001-xen-sched-fix-sched_move_domain-for-domain-without-v.patch><OpenPGP_0xB0DE9DD628BF132F.asc>


 

