
Re: [Xen-devel] PVH dom0 construction timeout



On 28.02.2020 22:08, Andrew Cooper wrote:
> It turns out that PVH dom0 construction doesn't work so well on a
> 2-socket Rome system...
> 
> (XEN) NX (Execute Disable) protection active
> (XEN) *** Building a PVH Dom0 ***
> (XEN) Watchdog timer detects that CPU0 is stuck!
> (XEN) ----[ Xen-4.14-unstable  x86_64  debug=y   Not tainted ]----
> (XEN) CPU:    0
> (XEN) RIP:    e008:[<ffff82d08029a8fd>] page_get_ram_type+0x58/0xb6
> (XEN) RFLAGS: 0000000000000206   CONTEXT: hypervisor
> (XEN) rax: ffff82d080948fe0   rbx: 0000000002b73db9   rcx: 0000000000000000
> (XEN) rdx: 0000000004000000   rsi: 0000000004000000   rdi: 0000002b73db9000
> (XEN) rbp: ffff82d080827be0   rsp: ffff82d080827ba0   r8:  ffff82d080948fcc
> (XEN) r9:  0000002b73dba000   r10: ffff82d0809491fc   r11: 8000000000000000
> (XEN) r12: 0000000002b73db9   r13: ffff8320341bc000   r14: 000000000404fc00
> (XEN) r15: ffff82d08046f209   cr0: 000000008005003b   cr4: 00000000001506e0
> (XEN) cr3: 00000000a0414000   cr2: 0000000000000000
> (XEN) fsb: 0000000000000000   gsb: 0000000000000000   gss: 0000000000000000
> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
> (XEN) Xen code around <ffff82d08029a8fd> (page_get_ram_type+0x58/0xb6):
> (XEN)  4c 39 d0 74 4d 49 39 d1 <76> 0b 89 ca 83 ca 10 48 39 38 0f 47 ca 49 89 c0
> (XEN) Xen stack trace from rsp=ffff82d080827ba0:
> (XEN)    ffff82d08061ee91 ffff82d080827bb4 00000000000b2403 ffff82d080804340
> (XEN)    ffff8320341bc000 ffff82d080804340 ffff83000003df90 ffff8320341bc000
> (XEN)    ffff82d080827c08 ffff82d08061c38c ffff8320341bc000 ffff82d080827ca8
> (XEN)    ffff82d080648750 ffff82d080827c20 ffff82d08061852c 0000000000200000
> (XEN)    ffff82d080827d60 ffff82d080638abe ffff82d080232854 ffff82d080930c60
> (XEN)    ffff82d080930280 ffff82d080674800 ffff83000003df90 0000000001a40000
> (XEN)    ffff83000003df80 ffff82d080827c80 0000000000000206 ffff8320341bc000
> (XEN)    ffff82d080827cb8 ffff82d080827ca8 ffff82d080232854 ffff82d080961780
> (XEN)    ffff82d080930280 ffff82d080827c00 0000000000000002 ffff82d08022f9a0
> (XEN)    00000000010a4bb0 ffff82d080827ce0 0000000000000206 000000000381b66d
> (XEN)    ffff82d080827d00 ffff82d0802b1e87 ffff82d080936900 ffff82d080936900
> (XEN)    ffff82d080827d18 ffff82d0802b30d0 ffff82d080936900 ffff82d080827d50
> (XEN)    ffff82d08022ef5e ffff8320341bc000 ffff83000003df80 ffff8320341bc000
> (XEN)    ffff83000003df80 0000000001a40000 ffff83000003df90 ffff82d080674800
> (XEN)    ffff82d080827d98 ffff82d08063cd06 0000000000000001 ffff82d080674800
> (XEN)    ffff82d080931050 0000000000000100 ffff82d080950c80 ffff82d080827ee8
> (XEN)    ffff82d08062eae7 0000000001a40fff 0000000000000000 000ffff82d080e00
> (XEN)    ffffffff00000000 0000000000000005 0000000000000004 0000000000000004
> (XEN)    0000000000000003 0000000000000003 0000000000000002 0000000000000002
> (XEN)    0000000002050000 0000000000000000 ffff82d080674c20 ffff82d080674ea0
> (XEN) Xen call trace:
> (XEN)    [<ffff82d08029a8fd>] R page_get_ram_type+0x58/0xb6
> (XEN)    [<ffff82d08061ee91>] S arch_iommu_hwdom_init+0x239/0x2b7
> (XEN)    [<ffff82d08061c38c>] F drivers/passthrough/amd/pci_amd_iommu.c#amd_iommu_hwdom_init+0x85/0x9f
> (XEN)    [<ffff82d08061852c>] F iommu_hwdom_init+0x44/0x4b
> (XEN)    [<ffff82d080638abe>] F dom0_construct_pvh+0x160/0x1233
> (XEN)    [<ffff82d08063cd06>] F construct_dom0+0x5c/0x280e
> (XEN)    [<ffff82d08062eae7>] F __start_xen+0x25db/0x2860
> (XEN)    [<ffff82d0802000ec>] F __high_start+0x4c/0x4e
> (XEN)
> (XEN) CPU1 @ e008:ffff82d0802f203f (arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0xa9/0xbf)
> (XEN) CPU31 @ e008:ffff82d0802f203f (arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0xa9/0xbf)
> (XEN) CPU30 @ e008:ffff82d0802f203f (arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0xa9/0xbf)
> (XEN) CPU27 @ e008:ffff82d08022ad5a (scrub_one_page+0x6d/0x7b)
> (XEN) CPU26 @ e008:ffff82d0802f203f (arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0xa9/0xbf)
> (XEN) CPU244 @ e008:ffff82d0802f203f (arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0xa9/0xbf)
> (XEN) CPU245 @ e008:ffff82d08022ad5a (scrub_one_page+0x6d/0x7b)
> (XEN) CPU247 @ e008:ffff82d080256e3f (drivers/char/ns16550.c#ns_read_reg+0x2d/0x35)
> (XEN) CPU246 @ e008:ffff82d0802f203f (arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0xa9/0xbf)
>
> <snip rather a large number of cpus, all idle>
> 
> 
> This stack trace is the same on several boots; in particular,
> page_get_ram_type() is the %rip at the point of the timeout.  For an
> equivalent PV dom0 build, this step takes perceptibly zero time,
> judging by how quickly the next line is printed.
> 
> I haven't diagnosed the exact issue, but some observations:
> 
> The arch_iommu_hwdom_init() loop's positioning of
> process_pending_softirqs() looks problematic, because the call can be
> conditionally short-circuited by hwdom_iommu_map().

Yes, we want to avoid this bypassing. I'll make a patch.
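
Roughly along these lines, I expect (untested sketch only; the actual
loop in xen/drivers/passthrough/x86/iommu.c differs in detail, and the
iteration interval used here is purely illustrative):

    for ( i = 0; i < top; i++ )
    {
        unsigned long pfn = pdx_to_pfn(i);

        /*
         * Process softirqs ahead of the mapping decision, so the skip
         * path taken for most pfns can no longer starve softirq
         * handling.
         */
        if ( !(i & 0xfffff) )
            process_pending_softirqs();

        if ( !hwdom_iommu_map(d, pfn, max_pfn) )
            continue;

        /* ... establish the identity mapping for this pfn as before ... */
    }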

> page_get_ram_type() is definitely suboptimal here.  We have a linear
> search over a (large-ish) sorted list, and a caller which passes every
> MFN in the system into it, which makes the total runtime of
> arch_iommu_hwdom_init() quadratic in the size of the system.

This linear search is the same for PVH and PV, isn't it? In
fact hwdom_iommu_map(), on average, may do more work for PV
than for PVH, considering the is_hvm_domain()-based return
from the switch()'s default case. So for the moment the only
way I could explain such a huge difference in consumed time
is if the PV case ran with iommu_hwdom_passthrough set to
true (which isn't possible for PVH).
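
For reference, the pattern being discussed is roughly the following
(simplified sketch, not the actual Xen code): every pfn handed to
hwdom_iommu_map() does a linear walk over the sorted RAM-type ranges,
so the overall cost is on the order of nr_pfns * nr_ranges.

    struct ram_range {
        unsigned long smfn, emfn;   /* [smfn, emfn) */
        unsigned int type;
    };

    /* Linear lookup, invoked once per pfn by the hwdom init loop. */
    static unsigned int lookup_ram_type(const struct ram_range *tbl,
                                        unsigned int nr,
                                        unsigned long mfn)
    {
        unsigned int i;

        for ( i = 0; i < nr; i++ )
            if ( mfn >= tbl[i].smfn && mfn < tbl[i].emfn )
                return tbl[i].type;

        return 0; /* unknown */
    }

With the table sorted, this could in principle be a binary search,
turning the per-pfn cost from O(nr_ranges) into O(log nr_ranges),
though as said that alone wouldn't explain a PV/PVH difference.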

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

