
Re: [Xen-devel] Create an iSCSI DomU with disks in another DomU running on the same Dom0



On 21/12/12 18:35, Konrad Rzeszutek Wilk wrote:
> On Fri, Dec 21, 2012 at 03:47:20PM +0100, Roger Pau Monné wrote:
>> On 21/12/12 15:03, Konrad Rzeszutek Wilk wrote:
>>> On Fri, Dec 21, 2012 at 09:29:39AM +0100, Roger Pau Monné wrote:
>>>> Hello,
>>>>
>>>> I'm trying to use a strange setup that consists of having a DomU
>>>> serve iSCSI targets to the Dom0, which will then use these targets
>>>> as disks for other DomUs. I've tried setting up this iSCSI target
>>>> DomU using both Debian Squeeze and Wheezy (with kernels 2.6.32 and
>>>> 3.2) and iSCSI Enterprise Target (IET), and when launching the DomU
>>>> I get these messages from Xen:
>>>>
>>>> (XEN) mm.c:1925:d0 Error pfn 157e68: rd=ffff83019e60c000, od=ffff830141405000, caf=8000000000000003, taf=7400000000000001
>>>> (XEN) Xen WARN at mm.c:1926
>>>> (XEN) ----[ Xen-4.3-unstable  x86_64  debug=y  Not tainted ]----
>>>> (XEN) CPU:    0
>>>> (XEN) RIP:    e008:[<ffff82c48016ea17>] get_page+0xd5/0x101
>>>> (XEN) RFLAGS: 0000000000010286   CONTEXT: hypervisor
>>>> (XEN) rax: 0000000000000000   rbx: ffff830141405000   rcx: 0000000000000000
>>>> (XEN) rdx: ffff82c480300920   rsi: 000000000000000a   rdi: ffff82c4802766e8
>>>> (XEN) rbp: ffff82c4802bfbf8   rsp: ffff82c4802bfba8   r8:  0000000000000004
>>>> (XEN) r9:  0000000000000004   r10: 0000000000000004   r11: 0000000000000001
>>>> (XEN) r12: 0000000000157e68   r13: ffff83019e60c000   r14: 7400000000000001
>>>> (XEN) r15: 8000000000000003   cr0: 000000008005003b   cr4: 00000000000026f0
>>>> (XEN) cr3: 000000011c180000   cr2: 00007f668d1eb000
>>>> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
>>>> (XEN) Xen stack trace from rsp=ffff82c4802bfba8:
>>>> (XEN)    ffff830141405000 8000000000000003 7400000000000001 0000000000145028
>>>> (XEN)    ffff82f6028a0510 ffff83019e60c000 ffff82f602afcd00 ffff82c4802bfd28
>>>> (XEN)    ffff82c4802bfd18 0000000000157e68 ffff82c4802bfc58 ffff82c480109ba3
>>>> (XEN)    ffffffffffffffff 0000000000000000 ffff83011c977fb8 0000000061dfc3f0
>>>> (XEN)    0000000000000001 ffffffffffff8000 0000000000000002 ffff83011d555000
>>>> (XEN)    ffff83019e60c000 0000000000000000 ffff82c4802bfd98 ffff82c48010c607
>>>> (XEN)    ffff82c4802bfd34 ffff82c4802bfd30 ffff82c400000001 000000000011cf90
>>>> (XEN)    0000000000000000 ffff82c4802b8000 ffff82c4802b8000 ffff82c4802b8000
>>>> (XEN)    ffff82c4802b8000 ffff82c4802bfd5c 000000029e60c000 ffff82c480300920
>>>> (XEN)    ffff82c4802b8000 ffff82c4802bfd38 00000005802bfd38 ffff82c4802b8000
>>>> (XEN)    ffff82c400000000 0000000000000001 ffffc90000028b10 ffffc90000028b10
>>>> (XEN)    ffff8300dfb03000 0000000000000000 0000000000000000 0000000000145028
>>>> (XEN)    000000000011cf7c 0000000000001000 0000000000157e68 0000000000007ff0
>>>> (XEN)    000000000000027e 000000000042000d 0000000000020b50 ffff8300dfdf0000
>>>> (XEN)    ffff82c4802bfd78 ffffc90000028ac0 ffffc90000028ac0 ffff880185f6fd58
>>>> (XEN)    ffff880185f6fd78 0000000000000005 ffff82c4802bfef8 ffff82c48010eb65
>>>> (XEN)    ffff82c4802bfdc8 ffff82c480300960 ffff82c4802bfe18 ffff82c480181831
>>>> (XEN)    000000000006df66 000032cfdc175ce6 0000000000000000 0000000000000000
>>>> (XEN)    0000000000000000 0000000000000005 ffff82c4802bfe28 ffff8300dfb03000
>>>> (XEN)    ffff8300dfdf0000 0000150e11a417f8 0000000000000002 ffff82c480300948
>>>> (XEN) Xen call trace:
>>>> (XEN)    [<ffff82c48016ea17>] get_page+0xd5/0x101
>>>> (XEN)    [<ffff82c480109ba3>] __get_paged_frame+0xbf/0x162
>>>> (XEN)    [<ffff82c48010c607>] gnttab_copy+0x4c6/0x91a
>>>> (XEN)    [<ffff82c48010eb65>] do_grant_table_op+0x12ad/0x1b23
>>>> (XEN)    [<ffff82c48022280b>] syscall_enter+0xeb/0x145
>>>> (XEN)    
>>>> (XEN) grant_table.c:2076:d0 source frame ffffffffffffffff invalid.
>>>> (XEN) mm.c:1925:d0 Error pfn 157e68: rd=ffff83019e60c000, od=ffff830141405000, caf=8000000000000003, taf=7400000000000001
>>>> (XEN) Xen WARN at mm.c:1926
>>>> (XEN) ----[ Xen-4.3-unstable  x86_64  debug=y  Not tainted ]----
>>>> (XEN) CPU:    0
>>>> (XEN) RIP:    e008:[<ffff82c48016ea17>] get_page+0xd5/0x101
>>>> (XEN) RFLAGS: 0000000000010286   CONTEXT: hypervisor
>>>> (XEN) rax: 0000000000000000   rbx: ffff830141405000   rcx: 0000000000000000
>>>> (XEN) rdx: ffff82c480300920   rsi: 000000000000000a   rdi: ffff82c4802766e8
>>>> (XEN) rbp: ffff82c4802bfbf8   rsp: ffff82c4802bfba8   r8:  0000000000000004
>>>> (XEN) r9:  0000000000000004   r10: 0000000000000004   r11: 0000000000000001
>>>> (XEN) r12: 0000000000157e68   r13: ffff83019e60c000   r14: 7400000000000001
>>>> (XEN) r15: 8000000000000003   cr0: 000000008005003b   cr4: 00000000000026f0
>>>> (XEN) cr3: 000000011c180000   cr2: 00007f668d1eb000
>>>> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
>>>> (XEN) Xen stack trace from rsp=ffff82c4802bfba8:
>>>> (XEN)    ffff830141405000 8000000000000003 7400000000000001 000000000014581d
>>>> (XEN)    ffff82f6028b03b0 ffff83019e60c000 ffff82f602afcd00 ffff82c4802bfd28
>>>> (XEN)    ffff82c4802bfd18 0000000000157e68 ffff82c4802bfc58 ffff82c480109ba3
>>>> (XEN)    ffffffffffffffff 0000000000000000 ffff83011c977fb8 0000000061dfc308
>>>> (XEN)    0000000000000000 ffffffffffff8000 0000000000000001 ffff83011d555000
>>>> (XEN)    ffff83019e60c000 0000000000000000 ffff82c4802bfd98 ffff82c48010c607
>>>> (XEN)    ffff82c4802bfd34 ffff82c4802bfd30 ffff82c400000001 000000000011cf90
>>>> (XEN)    0000000000000000 ffff82c4802b8000 ffff82c4802b8000 ffff82c4802b8000
>>>> (XEN)    ffff82c4802b8000 ffff82c4802bfd5c 000000029e60c000 ffff82c480300920
>>>> (XEN)    ffff82c4802b8000 ffff82c4802bfd38 00000002802bfd38 ffff82c4802b8000
>>>> (XEN)    ffffffff00000000 0000000000000001 ffffc90000028b60 ffffc90000028b60
>>>> (XEN)    ffff8300dfb03000 0000000000000000 0000000000000000 000000000014581d
>>>> (XEN)    00000000000deb3e 0000000000001000 0000000000157e68 000000000b507ff0
>>>> (XEN)    0000000000000261 000000000042000d 00000000000204b0 ffffc90000028b38
>>>> (XEN)    0000000000000002 ffffc90000028b38 ffffc90000028b38 ffff880185f6fd58
>>>> (XEN)    ffff880185f6fd78 0000000000000005 ffff82c4802bfef8 ffff82c48010eb65
>>>> (XEN)    ffff82c4802bfdc8 ffff82c480300960 ffff82c4802bfe18 ffff82c480181831
>>>> (XEN)    000000000006df66 000032cfdc175ce6 0000000000000000 0000000000000000
>>>> (XEN)    0000000000000000 0000000000000005 ffff82c4802bfe28 0000000000000086
>>>> (XEN)    ffff82c4802bfe28 ffff82c480125eae ffff83019e60c000 0000000000000286
>>>> (XEN) Xen call trace:
>>>> (XEN)    [<ffff82c48016ea17>] get_page+0xd5/0x101
>>>> (XEN)    [<ffff82c480109ba3>] __get_paged_frame+0xbf/0x162
>>>> (XEN)    [<ffff82c48010c607>] gnttab_copy+0x4c6/0x91a
>>>> (XEN)    [<ffff82c48010eb65>] do_grant_table_op+0x12ad/0x1b23
>>>> (XEN)    [<ffff82c48022280b>] syscall_enter+0xeb/0x145
>>>> (XEN)    
>>>> (XEN) grant_table.c:2076:d0 source frame ffffffffffffffff invalid.
>>>>
>>>> (Note that I've added a WARN() to mm.c:1925 to see where the
>>>> get_page call was coming from).
>>>>
>>>> Connecting the iSCSI disks to another Dom0 works fine, so this
>>>> problem only happens when trying to connect the disks to the
>>>> Dom0 where the DomU is running.
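
For context, the WARN() mentioned above sits in Xen's get_page() error
path; a minimal sketch of the sort of diagnostic involved, assuming the
stock gdprintk()/WARN() macros (a paraphrase, not the actual patch):

    /* The error printk already exists at mm.c:1925; the WARN() added on
     * the next line is what produces the "Xen WARN at mm.c:1926" stack
     * traces above, showing get_page() being reached via gnttab_copy(). */
    if ( unlikely(owner != domain) )
    {
        gdprintk(XENLOG_INFO,
                 "Error pfn %lx: rd=%p, od=%p, caf=%lx, taf=%lx\n",
                 page_to_mfn(page), domain, owner,
                 page->count_info, page->u.inuse.type_info);
        WARN();   /* added for debugging */
        return 0;
    }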
>>>
>>> Is this happening when the 'disks' are exported to the domUs?
>>> Are they exported via QEMU or xen-blkback?
>>
>> The iSCSI disks are connected to the DomUs using blkback, and this is
>> happening when the DomU tries to access its disks.
>>
>>>>
>>>> I've replaced the Linux DomU serving iSCSI targets with a
>>>> NetBSD DomU, and the problem disappears; I'm able to
>>>> attach the targets shared by the DomU to the Dom0 without
>>>> issues.
>>>>
>>>> The problem seems to come from netfront/netback. Does anyone
>>>> have a clue about what might cause this bad interaction
>>>> between IET and netfront/netback?
>>>
>>> Or it might be that we are re-using the PFN for blkback/blkfront
>>> and using the m2p overrides and overwriting the netfront/netback
>>> m2p overrides?
>>
>> What's strange is that this doesn't happen when the domain that has the
>> targets is a NetBSD PV. There are also problems when blkback is not used
>> (see below), so I guess the problem is between netfront/netback and IET.
>>
>>> Is this with an HVM domU or PV domU?
>>
>> Both domains (the domain holding the iSCSI targets, and the created
>> guests) are PV.
>>
>> Also, I forgot to mention this in the previous email: if I just
>> connect the iSCSI disks to the Dom0, I don't see any errors from Xen,
>> but the Dom0 kernel starts complaining:
>>
>> [70272.569607] sd 14:0:0:0: [sdc]
>> [70272.569611] Sense Key : Medium Error [current]
>> [70272.569619] Info fld=0x0
>> [70272.569623] sd 14:0:0:0: [sdc]
>> [70272.569627] Add. Sense: Unrecovered read error
>> [70272.569633] sd 14:0:0:0: [sdc] CDB:
>> [70272.569637] Read(10): 28 00 00 00 00 00 00 00 08 00
>> [70272.569662] end_request: critical target error, dev sdc, sector 0
>> [70277.571208] sd 14:0:0:0: [sdc] Unhandled sense code
>> [70277.571220] sd 14:0:0:0: [sdc]
>> [70277.571224] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
>> [70277.571229] sd 14:0:0:0: [sdc]
>> [70277.571233] Sense Key : Medium Error [current]
>> [70277.571241] Info fld=0x0
>> [70277.571245] sd 14:0:0:0: [sdc]
>> [70277.571249] Add. Sense: Unrecovered read error
>> [70277.571255] sd 14:0:0:0: [sdc] CDB:
>> [70277.571259] Read(10): 28 00 00 00 00 00 00 00 08 00
>> [70277.571284] end_request: critical target error, dev sdc, sector 0
>> [70282.572768] sd 14:0:0:0: [sdc] Unhandled sense code
>> [70282.572781] sd 14:0:0:0: [sdc]
>> [70282.572785] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
>> [70282.572790] sd 14:0:0:0: [sdc]
>> [70282.572794] Sense Key : Medium Error [current]
>> [70282.572802] Info fld=0x0
>> [70282.572806] sd 14:0:0:0: [sdc]
>> [70282.572810] Add. Sense: Unrecovered read error
>> [70282.572816] sd 14:0:0:0: [sdc] CDB:
>> [70282.572820] Read(10): 28 00 00 00 00 00 00 00 08 00
>> [70282.572846] end_request: critical target error, dev sdc, sector 0
>> [70287.574397] sd 14:0:0:0: [sdc] Unhandled sense code
>> [70287.574409] sd 14:0:0:0: [sdc]
>> [70287.574413] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
>> [70287.574418] sd 14:0:0:0: [sdc]
>> [70287.574422] Sense Key : Medium Error [current]
>> [70287.574430] Info fld=0x0
>> [70287.574434] sd 14:0:0:0: [sdc]
>> [70287.574438] Add. Sense: Unrecovered read error
>> [70287.574445] sd 14:0:0:0: [sdc] CDB:
>> [70287.574448] Read(10): 28 00 00 00 00 00 00 00 08 00
>> [70287.574474] end_request: critical target error, dev sdc, sector 0
>>
>> When I try to attach the targets to another Dom0, everything works fine;
>> the problem only happens when the iSCSI target is a DomU and you attach
>> the disks from the Dom0 on the same machine.
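
As an aside, the failing CDB in the log above (28 00 00 00 00 00 00 00
08 00) decodes to a READ(10) of 8 blocks starting at LBA 0, i.e. the
very first read of the device (the partition table) is what errors out.
A minimal stand-alone decoder, for illustration:

    #include <stdint.h>
    #include <stdio.h>

    /* Decode a SCSI READ(10) CDB (opcode 0x28): bytes 2-5 hold the
     * big-endian LBA, bytes 7-8 the big-endian transfer length. */
    int main(void)
    {
        const uint8_t cdb[10] = { 0x28, 0x00, 0x00, 0x00, 0x00,
                                  0x00, 0x00, 0x00, 0x08, 0x00 };
        uint32_t lba = (uint32_t)cdb[2] << 24 | (uint32_t)cdb[3] << 16 |
                       (uint32_t)cdb[4] << 8  | cdb[5];
        uint16_t blocks = (uint16_t)cdb[7] << 8 | cdb[8];
        printf("READ(10) lba=%u blocks=%u\n", lba, blocks);
        /* Prints: READ(10) lba=0 blocks=8 (4 KiB at 512-byte sectors) */
        return 0;
    }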
> 
> I think we are just swizzling the PFNs with a different MFN when you
> do the domU -> domX, using two ring protocols. Weird though, as the
> m2p code has checks WARN_ON(PagePrivate(..)) to catch this sort of
> thing.
> 
> What happens if the dom0/domU are all 3.8 with the persistent grant
> patches?
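
The WARN_ON(PagePrivate(..)) check referenced above lives in the
kernel's m2p-override code; roughly like this (paraphrased from memory,
not quoted from arch/x86/xen/p2m.c):

    /* Sketch of the guard in m2p_add_override(): when dom0 maps a
     * foreign (granted) page, the previous p2m information is stashed
     * in page->private, so a page that already carries an override --
     * e.g. one touched by both the block and network grant paths at
     * once -- should trip this warning. */
    WARN_ON(PagePrivate(page));
    SetPagePrivate(page);
    set_page_private(page, mfn);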

Sorry for the delay; the same error happens when Dom0/DomU are using a
persistent-grants-enabled kernel, although I had to backport the
persistent grants patch to 3.2 because I was unable to get the iSCSI
Enterprise Target dkms module working with 3.8. I'm also seeing these
messages in the DomU that's running the iSCSI target:

[  511.338845] net_ratelimit: 36 callbacks suppressed
[  511.338851] net eth0: rx->offset: 0, size: 4294967295
[  512.288282] net eth0: rx->offset: 0, size: 4294967295
[  512.525639] net eth0: rx->offset: 0, size: 4294967295
[  512.800729] net eth0: rx->offset: 0, size: 4294967295
[  512.800732] net eth0: rx->offset: 0, size: 4294967295
[  513.049447] net eth0: rx->offset: 0, size: 4294967295
[  513.050125] net eth0: rx->offset: 0, size: 4294967295
[  513.313493] net eth0: rx->offset: 0, size: 4294967295
[  513.313497] net eth0: rx->offset: 0, size: 4294967295
[  513.557233] net eth0: rx->offset: 0, size: 4294967295
[  517.422772] net_ratelimit: 61 callbacks suppressed
[  517.422777] net eth0: rx->offset: 0, size: 4294967295
[  517.422780] net eth0: rx->offset: 0, size: 4294967295
[  517.667053] net eth0: rx->offset: 0, size: 4294967295
[  517.667640] net eth0: rx->offset: 0, size: 4294967295
[  517.879690] net eth0: rx->offset: 0, size: 4294967295
[  517.879693] net eth0: rx->offset: 0, size: 4294967295
[  518.125314] net eth0: rx->offset: 0, size: 4294967295
[  518.125907] net eth0: rx->offset: 0, size: 4294967295
[  518.477026] net eth0: rx->offset: 0, size: 4294967295
[  518.477029] net eth0: rx->offset: 0, size: 4294967295
[  553.400129] net_ratelimit: 84 callbacks suppressed
[  553.400134] net eth0: rx->offset: 0, size: 4294967295
[  553.400615] net eth0: rx->offset: 0, size: 4294967295
[  553.400618] net eth0: rx->offset: 0, size: 4294967295
[  553.400620] net eth0: rx->offset: 0, size: 4294967295
[  553.603476] net eth0: rx->offset: 0, size: 4294967295
[  553.604103] net eth0: rx->offset: 0, size: 4294967295
[  553.807444] net eth0: rx->offset: 0, size: 4294967295
[  553.807447] net eth0: rx->offset: 0, size: 4294967295
[  554.049223] net eth0: rx->offset: 0, size: 4294967295
[  554.049902] net eth0: rx->offset: 0, size: 4294967295
[  581.912017] net_ratelimit: 8 callbacks suppressed
[  581.912022] net eth0: rx->offset: 0, size: 4294967295
[  581.912496] net eth0: rx->offset: 0, size: 4294967295
[  582.118996] net eth0: rx->offset: 0, size: 4294967295
[  582.357495] net eth0: rx->offset: 0, size: 4294967295
[  615.983921] net eth0: rx->offset: 0, size: 4294967295
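
For reference, 4294967295 is (u32)-1: netfront prints the rx response's
status field with an unsigned format specifier, so a negative error
status from netback, i.e. XEN_NETIF_RSP_ERROR (-1), shows up as
2^32 - 1. The sanity check in drivers/net/xen-netfront.c looks roughly
like this, paraphrased:

    /* rx->status is negative on error (XEN_NETIF_RSP_ERROR == -1) but
     * is printed with %u, hence "size: 4294967295" in the log above. */
    if (unlikely(rx->status < 0 ||
                 rx->offset + rx->status > PAGE_SIZE)) {
            if (net_ratelimit())
                    dev_warn(dev, "rx->offset: %x, size: %u\n",
                             rx->offset, rx->status);
            /* the bad slot is dropped and the request re-queued */
    }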
