[Xen-devel] DMA trouble with current xen-sparse

  • To: xen-devel@xxxxxxxxxxxxxxxxxxx
  • From: "Stephen C. Tweedie" <sct@xxxxxxxxxx>
  • Date: Fri, 28 Oct 2005 15:21:20 -0400
  • Delivery-date: Fri, 28 Oct 2005 19:18:29 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>


I've been trying to get current xen-sparse up and running on a 2-cpu box
and have had a number of problems.  One has been that networking is
completely unstable: I get kernel panics under the slightest network
load.

The trouble is that this is a 1G box, so its memory is not large enough
to automatically enable the swiotlb.  (arch/xen/i386/kernel/swiotlb.c
enables swiotlb automatically for dom0 only if there's at least 2G of
memory.)  And the first time we get a pci_map_single() request for a
dom0-contiguous region which crosses a page boundary, we hit the BUG_ON
at arch/xen/i386/kernel/pci-dma.c:270 due to dma_map_single() checking:

                IOMMU_BUG_ON(range_straddles_page_boundary(ptr, size));

And this happens *instantly* on any loaded tcp connection on my e1000
NIC.  All I need to do to kill the box is to ssh in and type "find\n".
Instant dom0 death after the ssh client receives about a dozen lines of
output.  The stack trace is appended below.
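
For anyone not familiar with the check, here is a minimal sketch of what
a page-straddle test of this kind looks like.  This is not the code from
pci-dma.c: the function name straddles_page_boundary() and the 4K page
size are assumptions for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12                          /* assumption: 4 KiB pages, as on i386 */
#define PAGE_SIZE  ((uintptr_t)1 << PAGE_SHIFT)

/*
 * Hypothetical stand-in for range_straddles_page_boundary(): returns true
 * when the buffer [addr, addr + size) does not fit inside a single page.
 */
static bool straddles_page_boundary(uintptr_t addr, size_t size)
{
    uintptr_t offset = addr & (PAGE_SIZE - 1);  /* offset within the page */

    return offset + size > PAGE_SIZE;
}
```

With a test like this, a 64-byte buffer starting 16 bytes before the end
of a page trips the check, while a full page-aligned 4K buffer does not.
Without swiotlb to bounce the buffer, the first case is what hits the
BUG.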

The PCI mapping documentation certainly says that pci_map_single() needs
to be able to map a single region, not just a single page.  If it can't,
then I suspect we really need to enable swiotlb by default, because
we'll just be unstable without it.

The kernel panics after this with "Fatal DMA error! Please use
'swiotlb=force'".  But of course the default for Xen is to instantly
reboot at this point before the error is visible.  And even after
catching the message with serial console, I found that "swiotlb=force"
*also* dies on this box, with

(XEN) (file=memory.c, line=57) Could not allocate order=14 extent: id=0 flags=0 (0 of 1)
kernel BUG at arch/xen/i386/mm/hypervisor.c:354
 [<c011a77d>] xen_create_contiguous_region+0x26d/0x2b0
 [<c0112596>] swiotlb_init_with_default_size+0x86/0x1c0
 [<c0112735>] swiotlb_init+0x65/0xa0

because we don't have a large enough zone at boot time to create the
64MB swiotlb.  
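
The order=14 in the Xen message lines up with the 64MB figure.  A small
sketch of the arithmetic, assuming 4K pages; the helper names
extent_bytes() and order_for_bytes() are made up for this example and
are not Xen or Linux functions:

```c
#define PAGE_SHIFT 12   /* assumption: 4 KiB pages, as on i386 */

/* Size in bytes of a contiguous extent of 2^order pages. */
static unsigned long long extent_bytes(unsigned int order)
{
    return 1ULL << (order + PAGE_SHIFT);
}

/* Smallest order whose extent holds at least `bytes` bytes. */
static unsigned int order_for_bytes(unsigned long long bytes)
{
    unsigned int order = 0;

    while (extent_bytes(order) < bytes)
        order++;
    return order;
}
```

extent_bytes(14) is 2^26 bytes, i.e. 64MB, so the failed request is for
one machine-contiguous 64MB region.  By the same arithmetic an 8MB
buffer, if it were requested as a single extent, would need only order
11.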

Booting with "swiotlb=force swiotlb=8m" works around both of these bugs
and allows me to boot; fortunately things are much more stable after I
get this far.



kernel BUG at arch/xen/i386/kernel/pci-dma.c:270 (dma_map_single)!
 [<c010ecd6>] dma_map_single+0xf6/0x160
 [<f49cd40b>] e1000_xmit_frame+0x40b/0xd30 [e1000]
 [<c0313510>] qdisc_restart+0x100/0x2f0
 [<c03241d0>] ip_finish_output2+0x0/0x250
 [<c030d594>] nf_hook_slow+0x64/0x110
 [<c03010ff>] dev_queue_xmit+0x9f/0x340
 [<c032404c>] ip_finish_output+0x15c/0x2e0
 [<c03241d0>] ip_finish_output2+0x0/0x250
 [<c0324947>] ip_queue_xmit+0x2b7/0x560
 [<c0323ec0>] dst_output+0x0/0x30
 [<c0155bf2>] poison_obj+0x32/0x60
 [<c0155408>] dbg_redzone1+0x18/0x60
 [<c0155e06>] check_poison_obj+0x26/0x1c0
 [<c0155bf2>] poison_obj+0x32/0x60
 [<c0155408>] dbg_redzone1+0x18/0x60
 [<c0157dbc>] cache_alloc_debugcheck_after+0x4c/0x1b0
 [<c0336e24>] tcp_transmit_skb+0x3d4/0x810
 [<c02fab10>] skb_clone+0x20/0x1d0
 [<c0337efd>] tcp_write_xmit+0x10d/0x330
 [<c0334943>] __tcp_data_snd_check+0xa3/0xe0
 [<c02fa961>] kfree_skbmem+0x21/0x30
 [<c0335069>] tcp_rcv_established+0x2a9/0x910
 [<f4b3f036>] ipt_hook+0x36/0x40 [iptable_filter]
 [<c033ef5a>] tcp_v4_do_rcv+0xfa/0x150
 [<c033f8d5>] tcp_v4_rcv+0x925/0x980
 [<c030d594>] nf_hook_slow+0x64/0x110
 [<c03208d0>] ip_local_deliver_finish+0x0/0x270
 [<c03206bc>] ip_local_deliver+0xdc/0x2f0
 [<c03208d0>] ip_local_deliver_finish+0x0/0x270
 [<c0320f0e>] ip_rcv+0x3ce/0x5b0
 [<c03210f0>] ip_rcv_finish+0x0/0x320
 [<c0301be0>] netif_receive_skb+0x250/0x310
 [<f49cf3ae>] e1000_clean_rx_irq+0x13e/0x5d0 [e1000]
 [<f49ce8a2>] e1000_clean+0x52/0x1c0 [e1000]
 [<c0301f2c>] net_rx_action+0xdc/0x220
 [<c0128f4a>] __do_softirq+0x8a/0x120
 [<c012905d>] do_softirq+0x7d/0x80
 [<c010ee22>] do_IRQ+0x22/0x30
 [<c01049be>] evtchn_do_upcall+0x9e/0xe0
 [<c010a2f0>] hypervisor_callback+0x2c/0x34
 [<c0107b30>] xen_idle+0x40/0x80
 [<c0107bd4>] cpu_idle+0x64/0xb0
 [<c0436a4f>] start_kernel+0x1af/0x210
 [<c0436380>] unknown_bootoption+0x0/0x220
