[Xen-devel] Re: userspace block backend / gntdev problems

Derek Murray wrote:
> The 128-grant limit is fairly arbitrary, and I wanted to see how people
> were using gntdev before changing this. The reason for using a
> fixed-size array is that it gives us O(1)-time mapping and unmapping of
> single grants, which I anticipated would be the most frequently-used
> case.

Ok, try a hash instead of a list then ;)
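
To make the suggestion concrete, here is a rough sketch of what I mean. It is only a toy model in Python, not gntdev code: the class name, the method names and the (domid, ref) record are all made up for illustration. The point is that a hash keyed by slot index keeps single map/unmap at O(1) on average and needs no fixed 128-entry array.

```python
# Hypothetical sketch: grant records kept in a hash table (a Python dict)
# keyed by slot index, instead of a fixed 128-entry array.
# None of these names come from the real gntdev driver.
class GrantTable:
    def __init__(self):
        self.by_index = {}    # slot index -> (domid, grant ref)
        self.next_index = 0   # next unused index; never reused in this toy

    def map_grant(self, domid, ref):
        """Record one grant and return its index (average O(1))."""
        index = self.next_index
        self.next_index += 1
        self.by_index[index] = (domid, ref)
        return index

    def unmap_grant(self, index):
        """Drop one grant by index and return its record (average O(1))."""
        return self.by_index.pop(index)
```

A real driver would still have to hand out address space for the mappings, which this sketch ignores completely.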

>> Second problem is that batched grant mappings (using
>> xc_gnttab_map_grant_refs) don't work reliably.  Symptoms I see are random
>> failures with ENOMEM for no obvious reason (128 grant limit is *far*
>> away).
> 
> If it's failing with ENOMEM, a possible reason is that the address space
> for mapping grants within gntdev (the array I mentioned above) is
> becoming fragmented. Are you combining the mapping of single grants and
> batches within the same gntdev instance?

Yes, I'm mixing single and batched maps (the latter can have different
sizes too, depending on the requests coming in, in the 1 -> 11 range).
But I've seen ENOMEM failures with *only* the shared ring being mapped,
i.e. one of 128 slots being used.  That can't be fragmentation ...
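
To illustrate why I don't think fragmentation explains it, here is a small toy model. It assumes the slot array is allocated first-fit in contiguous runs, which is my guess and not taken from the gntdev source; the function names are made up. With only the ring mapped, a batch of 11 always fits. Fragmentation only bites when many single grants are scattered across the array.

```python
# Toy model only: a fixed array of grant slots with first-fit allocation of
# contiguous runs. This is NOT the gntdev code; names and policy are assumed.
SLOTS = 128  # the limit mentioned in this thread


def find_run(used, count):
    """Return the first index where `count` consecutive slots are free, else None."""
    run = 0
    for i, busy in enumerate(used):
        run = 0 if busy else run + 1
        if run == count:
            return i - count + 1
    return None


def map_batch(used, count):
    """Mark a contiguous run as used and return its start, or None (think ENOMEM)."""
    start = find_run(used, count)
    if start is not None:
        for i in range(start, start + count):
            used[i] = True
    return start


# Case 1: only the shared ring is mapped (1 of 128 slots in use).
ring_only = [False] * SLOTS
ring_slot = map_batch(ring_only, 1)
largest_batch = map_batch(ring_only, 11)

# Case 2: every 8th slot holds a single grant, so 16 slots are busy and
# 112 are free, but no free run is longer than 7.
fragmented = [i % 8 == 0 for i in range(SLOTS)]
free_slots = fragmented.count(False)
failed_batch = map_batch(fragmented, 11)

print(ring_slot, largest_batch, free_slots, failed_batch)
```

In this model case 1 succeeds and case 2 fails despite 112 free slots, so the failures I see with just the ring mapped must come from somewhere else.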

>> Also host kernel crashes (kernel 2.6.21-2952.fc8xen).
> 
> When does this happen? Could you post the kernel OOPS?

Dunno what exactly triggers it.  Oops attached.

cheers,
  Gerd

BUG: unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
0143e000 -> *pde = 00000000:5016e001
2c76e000 -> *pme = 00000000:00000000
Oops: 0000 [#1]
SMP 
last sysfs file: /devices/xen-backend/vbd-1-51712/statistics/wr_sect
Modules linked in: ipt_MASQUERADE(U) iptable_nat(U) nf_nat(U) 
nf_conntrack_ipv4(U) xt_state(U) nf_conntrack(U) nfnetlink(U) ipt_REJECT(U) 
xt_tcpudp(U) iptable_filter(U) ip_tables(U) x_tables(U) bridge(U) nfsd(U) 
exportfs(U) lockd(U) nfs_acl(U) autofs4(U) sunrpc(U) ipv6(U) ext2(U) loop(U) 
dm_multipath(U) netbk(U) blkbk(U) 8250_pnp(U) 8250_pci(U) snd_hda_intel(U) 
snd_hda_codec(U) snd_seq_dummy(U) snd_seq_oss(U) snd_seq_midi_event(U) 
snd_seq(U) snd_seq_device(U) snd_pcm_oss(U) snd_mixer_oss(U) snd_pcm(U) 
i2c_i801(U) parport_pc(U) snd_timer(U) i2c_core(U) snd(U) parport(U) 8250(U) 
e1000(U) pcspkr(U) soundcore(U) serio_raw(U) serial_core(U) ata_generic(U) 
snd_page_alloc(U) sr_mod(U) sg(U) cdrom(U) ata_piix(U) dm_snapshot(U) 
dm_zero(U) dm_mirror(U) dm_mod(U) ahci(U) libata(U) sd_mod(U) scsi_mod(U) 
ext3(U) jbd(U) mbcache(U) uhci_hcd(U) ohci_hcd(U) ehci_hcd(U)
CPU:    0
EIP:    0061:[<c10e85ba>]    Not tainted VLI
EFLAGS: 00010282   (2.6.21-2952.fc8xen #1)
EIP is at __sync_single+0x1c/0x197
eax: 00000000   ebx: 0005a6ca   ecx: 00000002   edx: 00000000
esi: 00000000   edi: 00000000   ebp: 00000400   esp: c136ce80
ds: 007b   es: 007b   fs: 00d8  gs: 0000  ss: 0069
Process swapper (pid: 0, ti=c136c000 task=c12d4260 task.ti=c1314000)
Stack: 00000002 ed6a1000 c1c5d100 c1c5d100 c136cee8 c1c5d5c0 00000000 c102b7a1 
       0005a6ca 00000000 00000000 ed6a1000 c10e87db 00000400 00000002 00000400 
       00000000 00000400 00000000 ec7fb480 c10e8a3e 00000002 00000001 c1d87848 
Call Trace:
 [<c102b7a1>] lock_timer_base+0x19/0x35
 [<c10e87db>] unmap_single+0x55/0xd2
 [<c10e8a3e>] swiotlb_unmap_sg+0x103/0x120
 [<ee107fec>] ata_sg_clean+0x103/0x1b9 [libata]
 [<ee1080f0>] __ata_qc_complete+0x4e/0x92 [libata]
 [<c1009859>] timer_interrupt+0x5a4/0x5b7
 [<ee10bc70>] ata_qc_complete_multiple+0x87/0x9d [libata]
 [<ee0e5f22>] ahci_interrupt+0x2ff/0x4bd [ahci]
 [<c104a53a>] handle_IRQ_event+0x36/0x6e
 [<c104b9f2>] handle_level_irq+0x81/0xc7
 [<c104b971>] handle_level_irq+0x0/0xc7
 [<c100719a>] do_IRQ+0xac/0xd2
 [<c1036cb6>] ktime_get+0xf/0x2b
 [<c114f076>] evtchn_do_upcall+0x82/0xdb
 [<c100585e>] hypervisor_callback+0x46/0x4e
 [<c1008840>] raw_safe_halt+0xb3/0xd5
 [<c100452e>] xen_idle+0x31/0x5c
 [<c1003435>] cpu_idle+0xa3/0xbc
 [<c1319be4>] start_kernel+0x481/0x489
 [<c131925a>] unknown_bootoption+0x0/0x202
 =======================
Code: c8 09 d0 5a 0f 94 c0 59 0f b6 c0 5b 5e 5f c3 55 57 56 89 c6 53 83 ec 20 
89 4c 24 04 8b 4c 24 38 89 54 24 18 8b 6c 24 34 89 0c 24 <8b> 08 c1 e9 1e 69 c9 
80 12 00 00 81 c1 00 9e 2d c1 8b 99 0c 12 
EIP: [<c10e85ba>] __sync_single+0x1c/0x197 SS:ESP 0069:c136ce80
Kernel panic - not syncing: Fatal exception in interrupt
(XEN) Domain 0 crashed: rebooting machine in 5 seconds.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
