
Re: [RFC PATCH 00/10] Preemption in hypervisor (ARM only)


  • To: Stefano Stabellini <sstabellini@xxxxxxxxxx>
  • From: Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>
  • Date: Wed, 24 Feb 2021 00:19:57 +0000
  • Accept-language: en-US
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, "famzheng@xxxxxxxxxx" <famzheng@xxxxxxxxxx>, "cardoe@xxxxxxxxxx" <cardoe@xxxxxxxxxx>, "wl@xxxxxxx" <wl@xxxxxxx>, "Bertrand.Marquis@xxxxxxx" <Bertrand.Marquis@xxxxxxx>, "julien@xxxxxxx" <julien@xxxxxxx>, "andrew.cooper3@xxxxxxxxxx" <andrew.cooper3@xxxxxxxxxx>
  • Delivery-date: Wed, 24 Feb 2021 00:20:26 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHXCiKc1pmgaaPjJEuqIKIpTRVLeqpmcUqA
  • Thread-topic: [RFC PATCH 00/10] Preemption in hypervisor (ARM only)

Hi Stefano,

Stefano Stabellini writes:

> Hi Volodymyr,
>
> This looks like a genuine failure:

Thank you for the report. I have just debugged similar issues, which seem
to happen randomly, and found a flaw in the this_cpu() implementation: it
is currently not compatible with preemption in hypervisor mode.

It can happen that the CPU id is read while running on one pCPU, but
then the code is preempted and continues to run on another pCPU while
still accessing data for the previous pCPU.

In my case this mostly happens with the __preempt_count variable, but
other per_cpu variables are affected too. Linux uses the
get_cpu_var/put_cpu_var pair, which temporarily disables and re-enables
preemption around the access. Something like that should be implemented
in my patches as well. For __preempt_count, though, I need a completely
different approach, of course. I'm looking for a solution right now.

> https://urldefense.com/v3/__https://gitlab.com/xen-project/patchew/xen/-/jobs/1048475444__;!!GF_29dbcQIUBPA!mJXa6VegCDFete9dsvs4m8Epto5RJvSwfsRrESsenHBOJ4yxtj7XSU8QELo6TojcFLguIww$
>  [gitlab[.]com]
>
>
> (XEN) Data Abort Trap. Syndrome=0x1930046
> (XEN) Walking Hypervisor VA 0xf0008 on CPU0 via TTBR 0x0000000040545000
> (XEN) 0TH[0x0] = 0x0000000040544f7f
> (XEN) 1ST[0x0] = 0x0000000040541f7f
> (XEN) 2ND[0x0] = 0x0000000000000000
> (XEN) CPU0: Unexpected Trap: Data Abort
> (XEN) ----[ Xen-4.15-unstable  arm64  debug=y  Tainted: U     ]----
> (XEN) CPU:    0
> (XEN) PC:     00000000002273b8 timer.c#remove_from_heap+0x2c/0x114
> (XEN) LR:     0000000000227530
> (XEN) SP:     000080003ff7f9a0
> (XEN) CPSR:   800002c9 MODE:64-bit EL2h (Hypervisor, handler)
> (XEN)      X0: 000080000234e6a0  X1: 0000000000000001  X2: 0000000000000000
> (XEN)      X3: 00000000000f0000  X4: 0000000000000000  X5: 00000000014d014d
> (XEN)      X6: 0000000000000080  X7: fefefefefefeff09  X8: 7f7f7f7f7f7f7f7f
> (XEN)      X9: 717164616f726051 X10: 7f7f7f7f7f7f7f7f X11: 0101010101010101
> (XEN)     X12: 0000000000000008 X13: 0000000000000001 X14: 000080003ff7fa78
> (XEN)     X15: 0000000000000020 X16: 000000000028e558 X17: 0000000000000000
> (XEN)     X18: 00000000fffffffe X19: 0000000000000001 X20: 0000000000310180
> (XEN)     X21: 00000000000002c0 X22: 0000000000000000 X23: 0000000000346008
> (XEN)     X24: 0000000000310180 X25: 0000000000000000 X26: 00008000044e91b8
> (XEN)     X27: 000000000000ffff X28: 0000000041570018  FP: 000080003ff7f9a0
> (XEN) 
> (XEN)   VTCR_EL2: 80043594
> (XEN)  VTTBR_EL2: 000200007ffe3000
> (XEN) 
> (XEN)  SCTLR_EL2: 30cd183d
> (XEN)    HCR_EL2: 00000000807c663f
> (XEN)  TTBR0_EL2: 0000000040545000
> (XEN) 
> (XEN)    ESR_EL2: 97930046
> (XEN)  HPFAR_EL2: 0000000000030010
> (XEN)    FAR_EL2: 00000000000f0008
> (XEN) 
> (XEN) Xen stack trace from sp=000080003ff7f9a0:
> (XEN)    000080003ff7f9c0 0000000000227530 00008000044e9190 00000000002280dc
> (XEN)    000080003ff7f9e0 0000000000228234 00008000044e9190 000000000024dd04
> (XEN)    000080003ff7fa40 000000000024a414 0000000000311390 000080000234e430
> (XEN)    0000800002345000 0000000000000000 0000000000346008 00008000044e9150
> (XEN)    0000000000000001 0000000000000000 0000000000000240 0000000000270474
> (XEN)    000080003ff7faa0 000000000024b91c 0000000000000001 0000000000310238
> (XEN)    000080003ff7fbf8 0000000080000249 0000000093860047 00000000002a1de0
> (XEN)    000080003ff7fc88 00000000002a1de0 00000000000002c0 00008000044e9470
> (XEN)    000080003ff7fab0 00000000002217b4 000080003ff7fad0 000000000027a8c0
> (XEN)    0000000000311324 00000000002a1de0 000080003ff7fc00 0000000000265310
> (XEN)    0000000000000000 00000000002263d8 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000020
> (XEN)    0000000000000080 fefefefefefeff09 7f7f7f7f7f7f7f7f 717164616f726051
> (XEN)    7f7f7f7f7f7f7f7f 0101010101010101 0000000000000008 0000000000000001
> (XEN)    000080003ff7fa78 0000000000000020 000000000028e558 0000000000000000
> (XEN)    00000000fffffffe 0000000000000000 0000000000310238 000000000000000a
> (XEN)    0000000000310238 00000000002a64b0 00000000002a1de0 000080003ff7fc88
> (XEN)    0000000000000000 0000000000000240 0000000041570018 000080003ff7fc00
> (XEN)    000000000024c8c0 000080003ff7fc00 000000000024c8c4 9386004780000249
> (XEN)    000080003ff7fc90 000000000024c974 0000000000000384 0000000000000002
> (XEN)    0000800002345000 00000000ffffffff 0000000000000006 000080003ff7fe20
> (XEN)    0000000000000001 000080003ff7fe00 000080003ffe4a60 000080000234e430
> (XEN)    000080003ff7fd20 000080003ff7fd20 000080003ff7fce0 00000000ffffffc8
> (XEN)    000080003ff7fce0 000000000031a147 000080003ff7fd20 000000000027f7b8
> (XEN)    000080003ff7fd20 000080003ff7fd20 000080003ff7fce0 00000000ffffffc8
> (XEN)    000080003ff7fd20 000080003ff7fd20 000080003ff7fce0 00000000ffffffc8
> (XEN)    0000000000000240 0000800002345000 00000000ffffffff 0000000000000004
> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000022
> (XEN)    000080003ff7fda0 000000000026ff2c 000000000027f608 0000000000000000
> (XEN)    0000000000000093 0000800002345000 0000000000000000 000080003ffe4a60
> (XEN)    0000000000000001 000080003ff7fe00 000080003ffe4a60 0000000041570018
> (XEN)    000080003ff7fda0 000000000026fee0 000080003ff7fda0 000000000026ff18
> (XEN)    000080003ff7fe30 0000000000279b2c 0000000093860047 0000000000000090
> (XEN)    0000000003001384 000080003ff7feb0 ffff800011dc1384 ffff8000104b06a0
> (XEN)    ffff8000104b0240 ffff00000df806e8 0000000000000000 ffff800011b0ca88
> (XEN)    0000000003001384 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000093860047 0000000003001384 000080003ff7fe70 000000000027a180
> (XEN)    000080003ff7feb0 0000000093860047 0000000093860047 0000000060000085
> (XEN)    0000000093860047 ffff800011b0ca88 ffff800011b03d90 0000000000265458
> (XEN)    0000000000000000 ffff800011b0ca88 000080003ff7ffb8 000000000026545c
> (XEN) Xen call trace:
> (XEN)    [<00000000002273b8>] timer.c#remove_from_heap+0x2c/0x114 (PC)
> (XEN)    [<0000000000227530>] timer.c#remove_entry+0x90/0xa0 (LR)
> (XEN)    [<0000000000227530>] timer.c#remove_entry+0x90/0xa0
> (XEN)    [<0000000000228234>] stop_timer+0x1fc/0x254
> (XEN)    [<000000000024a414>] core.c#schedule+0xf4/0x380
> (XEN)    [<000000000024b91c>] wait+0xc/0x14
> (XEN)    [<00000000002217b4>] try_preempt+0x88/0xbc
> (XEN)    [<000000000027a8c0>] do_trap_irq+0x5c/0x60
> (XEN)    [<0000000000265310>] entry.o#hyp_irq+0x7c/0x80
> (XEN)    [<000000000024c974>] printk+0x68/0x70
> (XEN)    [<000000000027f7b8>] vgic-v2.c#vgic_v2_distr_mmio_write+0x1b0/0x7ac
> (XEN)    [<000000000026ff2c>] try_handle_mmio+0x1ac/0x27c
> (XEN)    [<0000000000279b2c>] traps.c#do_trap_stage2_abort_guest+0x18c/0x2d8
> (XEN)    [<000000000027a180>] do_trap_guest_sync+0x10c/0x63c
> (XEN)    [<0000000000265458>] entry.o#guest_sync_slowpath+0xa4/0xd4
> (XEN) 
> (XEN) 
> (XEN) ****************************************
> (XEN) Panic on CPU 0:
> (XEN) CPU0: Unexpected Trap: Data Abort
> (XEN) ****************************************
>
>
> On Mon, 22 Feb 2021, no-reply@xxxxxxxxxxx wrote:
>> Hi,
>> 
>> Patchew automatically ran gitlab-ci pipeline with this patch (series) 
>> applied, but the job failed. Maybe there's a bug in the patches?
>> 
>> You can find the link to the pipeline near the end of the report below:
>> 
>> Type: series
>> Message-id: 20210223023428.757694-1-volodymyr_babchuk@xxxxxxxx
>> Subject: [RFC PATCH 00/10] Preemption in hypervisor (ARM only)
>> 
>> === TEST SCRIPT BEGIN ===
>> #!/bin/bash
>> sleep 10
>> patchew gitlab-pipeline-check -p xen-project/patchew/xen
>> === TEST SCRIPT END ===
>> 
>> warning: redirecting to 
>> https://urldefense.com/v3/__https://gitlab.com/xen-project/patchew/xen.git/__;!!GF_29dbcQIUBPA!mJXa6VegCDFete9dsvs4m8Epto5RJvSwfsRrESsenHBOJ4yxtj7XSU8QELo6TojcGTFbSRQ$
>>  [gitlab[.]com]
>> warning: redirecting to 
>> https://urldefense.com/v3/__https://gitlab.com/xen-project/patchew/xen.git/__;!!GF_29dbcQIUBPA!mJXa6VegCDFete9dsvs4m8Epto5RJvSwfsRrESsenHBOJ4yxtj7XSU8QELo6TojcGTFbSRQ$
>>  [gitlab[.]com]
>> From 
>> https://urldefense.com/v3/__https://gitlab.com/xen-project/patchew/xen__;!!GF_29dbcQIUBPA!mJXa6VegCDFete9dsvs4m8Epto5RJvSwfsRrESsenHBOJ4yxtj7XSU8QELo6TojcntxRYAg$
>>  [gitlab[.]com]
>>  * [new tag]               
>> patchew/20210223023428.757694-1-volodymyr_babchuk@xxxxxxxx -> 
>> patchew/20210223023428.757694-1-volodymyr_babchuk@xxxxxxxx
>> Switched to a new branch 'test'
>> a569959cc0 alloc pages: enable preemption early
>> c943c35519 arm: traps: try to preempt before leaving IRQ handler
>> 4b634d1924 arm: context_switch: allow to run with IRQs already disabled
>> 7d78d6e861 sched: core: remove ASSERT_NOT_IN_ATOMIC and disable preemption[!]
>> d56302eb03 arm: setup: disable preemption during startup
>> 18a52ab80a preempt: add try_preempt() function
>> 9c4a07d0fa preempt: use atomic_t to for preempt_count
>> 904e59f28e sched: credit2: save IRQ state during locking
>> 3e3726692c sched: rt: save IRQ state during locking
>> c552842efc sched: core: save IRQ state during locking
>> 
>> === OUTPUT BEGIN ===
>> [2021-02-23 02:38:00] Looking up pipeline...
>> [2021-02-23 02:38:01] Found pipeline 260183774:
>> 
>> https://urldefense.com/v3/__https://gitlab.com/xen-project/patchew/xen/-/pipelines/260183774__;!!GF_29dbcQIUBPA!mJXa6VegCDFete9dsvs4m8Epto5RJvSwfsRrESsenHBOJ4yxtj7XSU8QELo6Tojc-d06GNY$
>>  [gitlab[.]com]
>> 
>> [2021-02-23 02:38:01] Waiting for pipeline to finish...
>> [2021-02-23 02:53:10] Still waiting...
>> [2021-02-23 03:08:19] Still waiting...
>> [2021-02-23 03:23:29] Still waiting...
>> [2021-02-23 03:38:38] Still waiting...
>> [2021-02-23 03:53:48] Still waiting...
>> [2021-02-23 04:08:57] Still waiting...
>> [2021-02-23 04:19:05] Pipeline failed
>> [2021-02-23 04:19:06] Job 'qemu-smoke-x86-64-clang-pvh' in stage 'test' is 
>> failed
>> [2021-02-23 04:19:06] Job 'qemu-smoke-x86-64-gcc-pvh' in stage 'test' is 
>> failed
>> [2021-02-23 04:19:06] Job 'qemu-smoke-x86-64-clang' in stage 'test' is failed
>> [2021-02-23 04:19:06] Job 'qemu-smoke-x86-64-gcc' in stage 'test' is failed
>> [2021-02-23 04:19:06] Job 'qemu-smoke-arm64-gcc' in stage 'test' is failed
>> [2021-02-23 04:19:06] Job 'qemu-alpine-arm64-gcc' in stage 'test' is failed
>> [2021-02-23 04:19:06] Job 'alpine-3.12-clang-debug' in stage 'build' is 
>> failed
>> [2021-02-23 04:19:06] Job 'alpine-3.12-clang' in stage 'build' is failed
>> [2021-02-23 04:19:06] Job 'alpine-3.12-gcc-debug' in stage 'build' is failed
>> [2021-02-23 04:19:06] Job 'alpine-3.12-gcc' in stage 'build' is failed
>> === OUTPUT END ===
>> 
>> Test command exited with code: 1


-- 
Volodymyr Babchuk at EPAM


 

