[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC PATCH V1 04/12] xen/arm: Introduce arch specific bits for IOREQ/DM features





On 13/08/2020 19:41, Oleksandr wrote:
Rebooting domain 2
root@generic-armv8-xt-dom0:~# (XEN) Xen BUG at ...tAUTOINC+bb71237a55-r0/git/xen/include/xen/mm.h:683
(XEN) ----[ Xen-4.14.0  arm64  debug=y   Not tainted ]----
(XEN) CPU:    3
(XEN) PC:     0000000000246f28 ioreq.c#hvm_free_ioreq_mfn+0x68/0x6c
(XEN) LR:     0000000000246ef0
(XEN) SP:     0000800725eafd80
(XEN) CPSR:   60000249 MODE:64-bit EL2h (Hypervisor, handler)
(XEN)      X0: 0000000000000001  X1: 403fffffffffffff  X2: 000000000000001f
(XEN)      X3: 0000000080000000  X4: 0000000000000000  X5: 0000000000400000
(XEN)      X6: 0000800725eafe24  X7: 0000ffffd1ef3e08  X8: 0000000000000020
(XEN)      X9: 0000000000000000 X10: 00e800008ecebf53 X11: 0400000000000000
(XEN)     X12: ffff7e00013b3ac0 X13: 0000000000000002 X14: 0000000000000001
(XEN)     X15: 0000000000000001 X16: 0000000000000029 X17: 0000ffff9badb3d0
(XEN)     X18: 000000000000010f X19: 0000000810e60e38 X20: 0000800725e68ec0
(XEN)     X21: 0000000000000000 X22: 00008004dc0404a0 X23: 000000005a000ea1
(XEN)     X24: ffff8000460ec280 X25: 0000000000000124 X26: 000000000000001d
(XEN)     X27: ffff000008ad1000 X28: ffff800052e65100  FP: ffff0000223dbd20
(XEN)
(XEN)   VTCR_EL2: 80023558
(XEN)  VTTBR_EL2: 0002000765f04000
(XEN)
(XEN)  SCTLR_EL2: 30cd183d
(XEN)    HCR_EL2: 000000008078663f
(XEN)  TTBR0_EL2: 00000000781c5000
(XEN)
(XEN)    ESR_EL2: f2000001
(XEN)  HPFAR_EL2: 0000000000030010
(XEN)    FAR_EL2: ffff000008005f00
(XEN)
(XEN) Xen stack trace from sp=0000800725eafd80:
(XEN)    0000800725e68ec0 0000000000247078 00008004dc040000 00000000002477c8 (XEN)    ffffffffffffffea 0000000000000001 ffff8000460ec500 0000000000000002 (XEN)    000000000024645c 00000000002462dc 0000800725eafeb0 0000800725eafeb0 (XEN)    0000800725eaff30 0000000060000145 000000000027882c 0000800725eafeb0 (XEN)    0000800725eafeb0 01ff00000935de80 00008004dc040000 0000000000000006 (XEN)    ffff800000000000 0000000000000002 000000005a000ea1 000000019bc60002 (XEN)    0000ffffd1ef3e08 0000000000000020 0000000000000004 000000000027c7d8 (XEN)    000000005a000ea1 0000800725eafeb0 000000005a000ea1 0000000000279f98 (XEN)    0000000000000000 ffff8000460ec200 0000800725eaffb8 0000000000262c58 (XEN)    0000000000262c4c 07e0000160000249 0000000000000002 0000000000000001 (XEN)    ffff8000460ec500 ffff8000460ec508 ffff8000460ec208 ffff800052e65100 (XEN)    000000005060b478 0000ffffd20f3000 ffff7e00013c77e0 0000000000000000 (XEN)    00e800008ecebf53 0400000000000000 ffff7e00013b3ac0 0000000000000002 (XEN)    0000000000000001 0000000000000001 0000000000000029 0000ffff9badb3d0 (XEN)    000000000000010f ffff8000460ec210 ffff8000460ec200 ffff8000460ec210 (XEN)    0000000000000001 ffff8000460ec500 ffff8000460ec280 0000000000000124 (XEN)    000000000000001d ffff000008ad1000 ffff800052e65100 ffff0000223dbd20 (XEN)    ffff000008537004 ffffffffffffffff ffff0000080c17e4 5a000ea160000145 (XEN)    0000000060000000 0000000000000000 0000000000000000 ffff800052e65100 (XEN)    ffff0000223dbd20 0000ffff9badb3dc 0000000000000000 0000000000000000
(XEN) Xen call trace:
(XEN)    [<0000000000246f28>] ioreq.c#hvm_free_ioreq_mfn+0x68/0x6c (PC)
(XEN)    [<0000000000246ef0>] ioreq.c#hvm_free_ioreq_mfn+0x30/0x6c (LR)
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 3:
(XEN) Xen BUG at ...tAUTOINC+bb71237a55-r0/git/xen/include/xen/mm.h:683
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) PSCI cpu off failed for CPU0 err=-3
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...



Either I did something wrong (most likely) or there is an issue with page ref-counting in the IOREQ code. I am still trying to understand what is going on.

At a first glance, the implement of set_foreign_p2m_entry() looks fine to me.

Some notes on that:
1. I checked that put_page() was called for these pages in p2m_put_l3_page() when destroying domain. This happened before hvm_free_ioreq_mfn() execution. 2. There was no BUG detected if I passed "p2m_ram_rw" instead of "p2m_map_foreign_rw" in guest_physmap_add_entry(), but the DomU couldn't be fully destroyed because of the reference taken.

This definitely looks like a page reference issue. Would it be possible to print where the page reference are dropped? A WARN() in put_page() would help.

To avoid a lot of message, I tend to use a global variable that store the page I want to watch.

Cheers,

--
Julien Grall



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.