Re: [Xen-devel] Page table sync. in Xen arm (Arndale) with SMP enabled (Bug fix included)

On Thu, 2013-04-18 at 07:30 +0100, Sengul Thomas wrote:
> Hello,
> I'm running Xen on Arndale (Exynos5250) board with the following source.
> Xen repo: //xenbits.xen.org/people/julieng/xen-unstable.git
>        branch: arndale
> And when I generate heavy network traffic at DomU (with SMP enabled)
> the following error occasionally occurs:
> (XEN) Assertion 'map[slot].pt.avail != 0' failed, line 219, file mm.c
> (XEN) Xen BUG at mm.c:219
> (XEN) CPU1: Unexpected Trap: Undefined Instruction
> (XEN) ----[ Xen-4.3-unstable  arm32  debug=y  Tainted:    C ]----
> (XEN) CPU:    1
> (XEN) PC:     0023d1bc __bug+0x2c/0x44
> (XEN) CPSR:   200001da MODE:Hypervisor
> (XEN)      R0: 0025c70c R1: 00000003 R2: 3fd4fd80 R3: 00000fff
> (XEN)      R4: 00259170 R5: 000000db R6: 0025a4ec R7: 7ffd7180
> (XEN)      R8: bfcf8000 R9: 7ffd516c R10:7ffd7000 R11:7ffe7d2c R12:00000004
> (XEN) HYP: SP: 7ffe7d24 LR: 0023d1bc
> ... skip
> (XEN)    [<0023d1bc>] __bug+0x2c/0x44
> (XEN)    [<00243a18>] unmap_domain_page+0x7c/0xac
> (XEN)    [<002452a8>] p2m_lookup+0x13c/0x170
> (XEN)    [<0024562c>] gmfn_to_mfn+0x14/0x20
> (XEN)    [<00209e04>] __get_paged_frame+0x24/0x9c
> (XEN)    [<0020a2f4>] __acquire_grant_for_copy+0x478/0x6b0
> (XEN)    [<0020cd30>] do_grant_table_op+0x1938/0x2cb8
> (XEN)    [<00247d00>] do_trap_hypervisor+0x70c/0xa7c
> (XEN)    [<0024a030>] return_from_trap+0/0x4
> I have digged a little bit and figured out that map_domain_page and
> unmap_domain_page functions access page table without synchronization.

Hrm, I thought map_domain_page was supposed to create a PCPU local
mapping, at least I thought that was Tim's intent. By its very nature a
per-PCPU mapping shouldn't require locking except against local
interrupts (which it correctly disables).

I expect the real bug is that we forgot to set up per-PCPU mappings
at DOMHEAP_VIRT_START when we implemented SMP! Oops...
We don't appear to have per-PCPU pagetables at all, so this may
not be a completely trivial fix...
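
To illustrate what I mean by per-PCPU, here is a minimal standalone sketch
in C. It is not Xen code: the names (percpu_map, sketch_map_page,
sketch_unmap_page, sketch_slot_refs), the slot count, and the treatment of
"avail" as a simple use count are all assumptions made for illustration,
loosely modelled on the assertion in the trace above. The point is only
that when each CPU owns its own slot table, one CPU can never clear a slot
another CPU is still using, so no cross-CPU lock is needed.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS        2   /* illustrative: the Exynos5250 is dual-core */
#define SLOTS_PER_CPU  4   /* hypothetical value, not Xen's real slot count */

/* Hypothetical stand-in for a mapping slot; "avail" is assumed to be a use count. */
struct map_slot {
    unsigned long mfn;     /* machine frame currently mapped in this slot */
    unsigned int  avail;   /* 0 means the slot is free */
};

/* One private slot table per physical CPU, instead of one shared table. */
static struct map_slot percpu_map[NR_CPUS][SLOTS_PER_CPU];

/* Map a frame on the given CPU; returns the slot index, or -1 if none is free. */
static int sketch_map_page(unsigned int cpu, unsigned long mfn)
{
    struct map_slot *map = percpu_map[cpu];
    int i;

    /* Reuse an existing mapping of the same frame on this CPU. */
    for (i = 0; i < SLOTS_PER_CPU; i++) {
        if (map[i].avail != 0 && map[i].mfn == mfn) {
            map[i].avail++;
            return i;
        }
    }

    /* Otherwise take the first free slot. */
    for (i = 0; i < SLOTS_PER_CPU; i++) {
        if (map[i].avail == 0) {
            map[i].mfn = mfn;
            map[i].avail = 1;
            return i;
        }
    }
    return -1;
}

/* Drop one reference; only this CPU's table is ever touched. */
static void sketch_unmap_page(unsigned int cpu, int slot)
{
    struct map_slot *map = percpu_map[cpu];

    assert(slot >= 0 && slot < SLOTS_PER_CPU);
    assert(map[slot].avail != 0);   /* mirrors the failing check in the trace */
    map[slot].avail--;
}

/* Helper for inspecting the use count of a slot. */
static unsigned int sketch_slot_refs(unsigned int cpu, int slot)
{
    return percpu_map[cpu][slot].avail;
}
```

With a single shared table, two CPUs mapping and unmapping the same frame
would race on the same "avail" count, which is one way the assertion could
fire. A real implementation would still need to disable local interrupts
around the update and to give each CPU its own pagetable entries, neither
of which this sketch attempts.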

This also explains why Stefano was able to implement
map_domain_page_global in terms of map_domain_page:

> Here goes a simple fix and I hope you guys make it xen-compatible :)

Thanks, as I say I don't think the fix is right but the
analysis/identification of the issue was certainly valuable!

WRT future submissions, the most important thing is to include a
Signed-off-by line to indicate that you certify the patch under the
Developer's Certificate of Origin, which is included in:


