
S3 resume crash in memguard_guard_stack (stable-4.14)



Hi,

My journey to get S3 working on Xen 4.14 continues...
My current setup is:
 - stable-4.14 (28855ebcdb)
 - with "x86/S3: Fix Shadow Stack resume path"
 - with efi_get_time() disabled
 - with "write_cr4(read_cr4())" just after "system_state =
   SYS_STATE_resume" (should be more or less equivalent to "x86/S3:
   Restore CR4 earlier during resume")
 - with "xen: credit2: limit the max number of CPUs in a runqueue"
   reverted

With this, I get a crash on S3 resume:

(XEN) Preparing system for ACPI S3 state.
(XEN) Disabling non-boot CPUs ...
(XEN) Entering ACPI S3 state.
(XEN) [VT-D]Passed iommu=no-igfx option.  Disabling IGD VT-d engine.
(XEN) mce_intel.c:773: MCA Capability: firstbank 0, extended MCE MSR 0, BCAST, CMCI
(XEN) CPU0 CMCI LVT vector (0xf1) already installed
(XEN) Finishing wakeup from ACPI S3 state.
(XEN) Enabling non-boot CPUs  ...
(XEN) ----[ Xen-4.14.1-pre  x86_64  debug=y   Not tainted ]----
(XEN) CPU:    0
(XEN) RIP:    e008:[<ffff82d040311090>] memguard_guard_stack+0x7/0x1a5
(XEN) RFLAGS: 0000000000010286   CONTEXT: hypervisor
(XEN) rax: ffff830250ca03f8   rbx: 0000000000000001   rcx: ffff830250cb10b0
(XEN) rdx: 0000003210739000   rsi: 0000000000000001   rdi: ffff830250ca0000
(XEN) rbp: ffff830049a6fd70   rsp: ffff830049a6fd40   r8:  0000000000000001
(XEN) r9:  0000000000000000   r10: 0000000000000001   r11: 0000000000000002
(XEN) r12: 0000000000010000   r13: 0000000000000000   r14: 0000000000000001
(XEN) r15: ffff82d040598440   cr0: 000000008005003b   cr4: 00000000003526e0
(XEN) cr3: 0000000049a5d000   cr2: ffff830250ca03f8
(XEN) fsb: 0000000000000000   gsb: 0000000000000000   gss: 0000000000000000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
(XEN) Xen code around <ffff82d040311090> (memguard_guard_stack+0x7/0x1a5):
(XEN)  c3 48 8d 87 f8 03 00 00 <48> 89 87 f8 03 00 00 48 8d 87 f8 07 00 00 48 89
(XEN) Xen stack trace from rsp=ffff830049a6fd40:
(XEN)    ffff82d040321c2e ffff82d040461b68 ffff82d040461b60 ffff82d040461240
(XEN)    0000000000000001 0000000000000000 ffff830049a6fdb8 ffff82d040221f9c
(XEN)    ffff830049a6fde0 0000000000000001 0000000000000000 00000000ffffffef
(XEN)    ffff830049a6fe08 0000000000000001 ffff830250b66000 ffff830049a6fdd0
(XEN)    ffff82d0402036cf 0000000000000001 ffff830049a6fdf8 ffff82d040203a4d
(XEN)    0000000000000000 0000000000000001 0000000000000010 ffff830049a6fe28
(XEN)    ffff82d040203d00 ffff830049a6fef8 0000000000000000 0000000000000003
(XEN)    0000000000000200 ffff830049a6fe58 ffff82d040270c9a ffff830250139f70
(XEN)    ffff830250b45000 0000000000000000 0000000000000000 ffff830049a6fe78
(XEN)    ffff82d040207064 ffff830250b451b8 ffff82d0405781b0 ffff830049a6fe90
(XEN)    ffff82d04022b7bb ffff82d0405781a0 ffff830049a6fec0 ffff82d04022ba9c
(XEN)    0000000000000000 ffff82d0405781b0 ffff82d04057ed00 ffff82d040598440
(XEN)    ffff830049a6fef0 ffff82d0402f33e3 ffff830252b0e000 ffff830250b45000
(XEN)    ffff830252b0f000 0000000000000000 ffff830049a6fdc8 ffff88818ce029e0
(XEN)    ffffc900026b7f08 0000000000000003 0000000000000000 0000000000003403
(XEN)    ffffffff8277a5a8 0000000000000246 0000000000000003 0000000000003403
(XEN)    0000000000003403 0000000000000000 ffffffff810020ea 0000000000003403
(XEN)    0000000000000010 deadbeefdeadf00d 0000010000000000 ffffffff810020ea
(XEN)    000000000000e033 0000000000000246 ffffc900026b7cb8 000000000000e02b
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) Xen call trace:
(XEN)    [<ffff82d040311090>] R memguard_guard_stack+0x7/0x1a5
(XEN)    [<ffff82d040321c2e>] S smpboot.c#cpu_smpboot_callback+0xe5/0x6d5
(XEN)    [<ffff82d040221f9c>] F notifier_call_chain+0x6b/0x96
(XEN)    [<ffff82d0402036cf>] F cpu.c#cpu_notifier_call_chain+0x1b/0x33
(XEN)    [<ffff82d040203a4d>] F cpu_up+0x5f/0xd5
(XEN)    [<ffff82d040203d00>] F enable_nonboot_cpus+0xea/0x1fb
(XEN)    [<ffff82d040270c9a>] F power.c#enter_state_helper+0x152/0x606
(XEN)    [<ffff82d040207064>] F domain.c#continue_hypercall_tasklet_handler+0x4c/0xb9
(XEN)    [<ffff82d04022b7bb>] F tasklet.c#do_tasklet_work+0x76/0xa9
(XEN)    [<ffff82d04022ba9c>] F do_tasklet+0x58/0x8a
(XEN)    [<ffff82d0402f33e3>] F domain.c#idle_loop+0x40/0x96
(XEN) 
(XEN) Pagetable walk from ffff830250ca03f8:
(XEN)  L4[0x106] = 8000000049a5b063 ffffffffffffffff
(XEN)  L3[0x009] = 0000000250cae063 ffffffffffffffff
(XEN)  L2[0x086] = 0000000250cad063 ffffffffffffffff
(XEN)  L1[0x0a0] = 8000000250ca0161 ffffffffffffffff
(XEN) 
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) FATAL PAGE FAULT
(XEN) [error_code=0003]
(XEN) Faulting linear address: ffff830250ca03f8
(XEN) ****************************************
(XEN) 
(XEN) Reboot in five seconds...
(XEN) Executing kexec image on cpu0
(XEN) Shot down all CPUs
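
To make the fault details above easier to read, here is a minimal sketch
that decodes the error code, the L1 entry, and the pagetable indices from
the values in the log. The bit names follow the x86-64 architecture; the
helper names (decode, walk_indices) are made up for this illustration and
are not Xen functions.

```python
# Page-fault error code bits and PTE flag bits (x86-64 architecture).
PFEC_BITS = {0: "present", 1: "write", 2: "user", 3: "reserved", 4: "insn_fetch"}
PTE_BITS = {0: "P", 1: "RW", 2: "US", 3: "PWT", 4: "PCD",
            5: "A", 6: "D", 7: "PAT", 8: "G", 63: "NX"}


def decode(value, table):
    """Return the names of all bits from `table` that are set in `value`."""
    return [name for bit, name in sorted(table.items()) if (value >> bit) & 1]


def walk_indices(addr):
    """Split a linear address into its L4/L3/L2/L1 table indices (9 bits each)."""
    return [(addr >> shift) & 0x1FF for shift in (39, 30, 21, 12)]


# Values copied from the crash log above.
error_code = 0x0003
l1_pte = 0x8000000250CA0161
fault_addr = 0xFFFF830250CA03F8

print(decode(error_code, PFEC_BITS))               # ['present', 'write']
print(decode(l1_pte, PTE_BITS))                    # ['P', 'A', 'D', 'G', 'NX']
print([hex(i) for i in walk_indices(fault_addr)])  # ['0x106', '0x9', '0x86', '0xa0']
```

So this is a write to a present page whose L1 entry has RW clear and D
set. As far as I understand, read-only plus dirty is the combination CET
uses to mark shadow stack pages, which would be consistent with the page
still being mapped as a shadow stack from before suspend.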

The code in question seems to belong to this commit:

commit 91d26ed304ff562f341824be12bf49bd78c39e39
Author: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
Date:   Thu Apr 23 20:20:59 2020 +0100

    x86/shstk: Create shadow stacks


Disabling Shadow Stack in Kconfig makes the issue go away: I got S3
resume working on this machine, at least once. It then hung after the
second S3 resume (most likely yet another problem...).

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
