Update following some more testing.
This bug/feature appears to be specific to XCP 1.0: I pulled the power on an
identical XenServer 5.6 FP1 system and it came back up with no issue. I also
see Chris had the same issue with an iSCSI disk.
Also, when the VDI is not available (in XCP) I checked with fuser and there
was no process using the underlying VHD.
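For reference, the check was along these lines (the mount path and uuids here
are placeholders; file-based SRs are normally mounted under
/var/run/sr-mount/<sr-uuid>):

fuser -v /var/run/sr-mount/<sr-uuid>/<vdi-uuid>.vhd
lsof | grep <vdi-uuid>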
This is beyond my technical expertise to fix, but extracts from the logs are
below.
/var/log/messages
Jan 8 11:57:15 durham xenguest: Determined the following parameters from
xenstore:
Jan 8 11:57:15 durham xenguest: vcpu/number:1 vcpu/affinity:0 vcpu/weight:0
vcpu/cap:0 nx: 0 viridian: 1 apic: 1 acpi: 1 pae: 1 acpi_s4: 0 acpi_s3: 0
Jan 8 11:57:15 durham fe: 8251 (/opt/xensource/libexec/xenguest -controloutfd
6 -controlinfd 7 -debuglog /tmp...) exitted with code 2
Jan 8 11:57:16 durham xapi: [error|durham|642|Async.VM.start
R:b2baff469ec4|xapi] Memory F 5011684 KiB S 0 KiB T 6141 MiB
Jan 8 11:57:16 durham xapi: [error|durham|305 xal_listen||event] event could
not be processed because VM record not in database
Jan 8 11:57:16 durham xapi: [error|durham|305 xal_listen|VM (domid: 3)
device_event = ChangeUncooperative false D:7c80cc6a38b5|event] device_event
could not be processed because VM record not in database
/var/log/xensource.log
[20110108T11:57:16.349Z|debug|durham|642|Async.VM.start R:b2baff469ec4|sm] SM
ext vdi_detach sr=OpaqueRef:da1ceffc-453a-9bf3-108c-1255793bc4a0
vdi=OpaqueRef:fafa5ede-b57a-648d-2f88-0a7aa9ea9b30
[20110108T11:57:16.514Z|debug|durham|642|Async.VM.start
R:b2baff469ec4|storage_access] Executed detach succesfully on VDI
'96ccff27-332b-4fb4-b01b-0ee6e70d3a43'; attach refcount now: 0
[20110108T11:57:16.514Z|debug|durham|642|Async.VM.start R:b2baff469ec4|xapi]
Vmops.start_paused caught: SR_BACKEND_FAILURE_46: [ ; The VDI is not available
[opterr=VDI 96ccff27-332b-4fb4-b01b-0ee6e70d3a43 already attached RW]; ]:
calling domain_destroy
[20110108T11:57:16.515Z|error|durham|642|Async.VM.start R:b2baff469ec4|xapi]
Memory F 5011684 KiB S 0 KiB T 6141 MiB
[20110108T11:57:16.515Z|debug|durham|642|Async.VM.start R:b2baff469ec4|xenops]
Domain.destroy: all known devices = [ ]
[20110108T11:57:16.515Z|debug|durham|642|Async.VM.start R:b2baff469ec4|xenops]
Domain.destroy calling Xc.domain_destroy (domid 3)
[20110108T11:57:16.755Z|debug|durham|642|Async.VM.start R:b2baff469ec4|xenops]
No qemu-dm pid in xenstore; assuming this domain was PV
[20110108T11:57:16.756Z|debug|durham|642|Async.VM.start R:b2baff469ec4|xenops]
Domain.destroy: rm /local/domain/3
[20110108T11:57:16.762Z|debug|durham|642|Async.VM.start R:b2baff469ec4|xenops]
Domain.destroy: deleting backend paths
[20110108T11:57:16.768Z|debug|durham|642|Async.VM.start
R:b2baff469ec4|locking_helpers] Released lock on VM
OpaqueRef:478974cb-fed4-f2ac-8192-868c9e9cfe41 with token 4
[20110108T11:57:16.775Z|debug|durham|305 xal_listen|VM (domid 3) @releaseDomain
D:021cb80f6c7a|dispatcher] Server_helpers.exec exception_handler: Got exception
INTERNAL_ERROR: [ Vmopshelpers.Vm_corresponding_to_domid_not_in_db(3) ]
[20110108T11:57:16.776Z|error|durham|305 xal_listen||event] event could not be
processed because VM record not in database
[20110108T11:57:16.776Z|debug|durham|305 xal_listen||event] VM (domid: 3)
device_event = ChangeUncooperative false
[20110108T11:57:16.778Z|debug|durham|642|Async.VM.start R:b2baff469ec4|xapi]
Raised at pervasiveext.ml:26.22-25 -> pervasiveext.ml:22.2-9
[20110108T11:57:16.780Z|error|durham|305 xal_listen|VM (domid: 3) device_event
= ChangeUncooperative false D:7c80cc6a38b5|event] device_event could not be
processed because VM record not in database
[20110108T11:57:16.782Z|debug|durham|642|Async.VM.start R:b2baff469ec4|xapi]
Check operation error: op=snapshot
[20110108T11:57:16.782Z|debug|durham|642|Async.VM.start R:b2baff469ec4|xapi]
vdis_reset_and_caching: [(false,false);(false,false)]
[20110108T11:57:16.782Z|debug|durham|642|Async.VM.start R:b2baff469ec4|xapi]
Checking for vdis_reset_and_caching...
[20110108T11:57:16.782Z|debug|durham|642|Async.VM.start R:b2baff469ec4|xapi] Op
allowed!
[20110108T11:57:16.782Z|debug|durham|642|Async.VM.start R:b2baff469ec4|xapi]
Check operation error: op=copy
[20110108T11:57:16.782Z|debug|durham|642|Async.VM.start R:b2baff469ec4|xapi]
vdis_reset_and_caching: [(false,false);(false,false)]
[20110108T11:57:16.783Z|debug|durham|642|Async.VM.start R:b2baff469ec4|xapi]
Check operation error: op=clone
[20110108T11:57:16.786Z|debug|durham|642|Async.VM.start
R:b2baff469ec4|dispatcher] Server_helpers.exec exception_handler: Got exception
SR_BACKEND_FAILURE_46: [ ; The VDI is not available [opterr=VDI
96ccff27-332b-4fb4-b01b-0ee6e70d3a43 already attached RW]; ]
Hopefully somebody can diagnose and fix this bug.
Please note the VDIs had to be recreated since my last post, so the UUIDs are
not the same; everything else is.
Regards,
Jon
-----Original Message-----
From: George Shuklin [mailto:george.shuklin@xxxxxxxxx]
Sent: 31 December 2010 14:11
To: Jonathon Royle
Cc: xen-api@xxxxxxxxxxxxxxxxxxx
Subject: RE: [Xen-API] XCP 1.0 beta - Locked VDI issue
Well... I'm not familiar with file-based VDI provisioning, but I think the
problem is not in XCP itself (well, XCP has the bug, but it only triggers the
state) but in some forgotten mount.
Check the fuser/lsof output, and try restarting xapi (xe-toolstack-restart).
Also look at /var/log/messages and /var/log/xensource.log; every domain start
fills them with a huge amount of debug info (I sometimes think this info slows
domain start/shutdown by at least half).
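For example (the VM uuid is a placeholder):

xe-toolstack-restart
tail -f /var/log/messages /var/log/xensource.log

Then, from another shell, retry the start and watch what gets logged:

xe vm-start uuid=<vm-uuid>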
On Fri, 31/12/2010 at 13:56 +0000, Jonathon Royle wrote:
> George,
>
> Thanks - now my oops, I thought I had included SR details
>
> SR - /dev/cciss/C0d0p3 ext3, thin provisioned
>
> Server HP ML370 G5 - running Raid1
>
> NB Same thing happens on RAID 5 /dev/cciss/C0d1p1 also ext3. Not tested with
> Local LVM but I can do.
>
> Jon
>
>
>
> -----Original Message-----
> From: George Shuklin [mailto:george.shuklin@xxxxxxxxx]
> Sent: 31 December 2010 13:13
> To: Jonathon Royle
> Cc: xen-api@xxxxxxxxxxxxxxxxxxx
> Subject: RE: [Xen-API] XCP 1.0 beta - Locked VDI issue
>
> Oops, sorry, missed it.
>
> Next: is the SR iSCSI-based? Check whether the corresponding volume is
> mounted and try to deactivate it with lvchange (the name of the LV will
> contain the VDI uuid).
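>
> For example, if it does turn out to be LVM-based (on LVM SRs the volume
> group is typically named VG_XenStorage-<sr-uuid> and the LV VHD-<vdi-uuid>,
> but confirm with lvs first; the uuids are placeholders):
>
> lvs | grep <vdi-uuid>
> lvchange -an /dev/VG_XenStorage-<sr-uuid>/VHD-<vdi-uuid>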
>
>
> On Fri, 31/12/2010 at 12:13 +0000, Jonathon Royle wrote:
> > George,
> >
> > Thanks for quick response.
> >
> > The list of created VBDs is as per my original post, i.e. only attached to
> > the intended VM. As part of the initial troubleshooting I did remove all
> > VBDs, as there was (from memory) an errant one.
> >
> > Regards,
> >
> > Jon
> >
> >
> > -----Original Message-----
> > From: George Shuklin [mailto:george.shuklin@xxxxxxxxx]
> > Sent: 31 December 2010 12:06
> > To: Jonathon Royle
> > Cc: xen-api@xxxxxxxxxxxxxxxxxxx
> > Subject: Re: [Xen-API] XCP 1.0 beta - Locked VDI issue
> >
> > Try looking at the VBDs created for this VDI (xe vbd-list vdi-uuid=UUID);
> > some of them will be attached to the control domain where the VM was
> > stopped.
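> >
> > For instance (uuid values are placeholders; the idea is to spot a VBD
> > whose vm-name-label is the control domain and unplug/destroy it):
> >
> > xe vbd-list vdi-uuid=<vdi-uuid> params=uuid,vm-name-label,currently-attached
> > xe vbd-unplug uuid=<stale-vbd-uuid>
> > xe vbd-destroy uuid=<stale-vbd-uuid>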
> >
> >
> > On Fri, 31/12/2010 at 11:42 +0000, Jonathon Royle wrote:
> > > First time post so hope I am using the correct list.
> > >
> > >
> > >
> > > I have been trialling the XCP 1.0 beta for a few weeks now and have had
> > > no issues until now. If the host is ungracefully shut down (a power
> > > failure in my case) then the VDI of the running VM becomes unusable upon
> > > host restart.
> > >
> > >
> > >
> > >
> > >
> > > [root@----]# xe vm-start uuid=03ed2489-49f6-eb48-0819-549c74a96269
> > >
> > > Error code: SR_BACKEND_FAILURE_46
> > >
> > > Error parameters: , The VDI is not available [opterr=VDI
> > > fc77b366-950b-49be-90ce-2a466cf73502 already attached RW],
> > >
> > >
> > >
> > > I have been able to repeat this on several occasions.
> > >
> > >
> > >
> > > I have tried toolstack restart, host reboot as well as vbd-unplug
> > > etc. The only solution I have found is to use sr-forget (a bit
> > > drastic) and then reintroduce the SR
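> > >
> > > Roughly, that drastic workaround looks like this (the uuids, name-label
> > > and device path are placeholders, and the sr-introduce/pbd-create
> > > arguments may need adjusting for other SR types):
> > >
> > > xe sr-forget uuid=<sr-uuid>
> > > xe sr-introduce uuid=<sr-uuid> type=ext name-label=<sr-name> content-type=user
> > > xe pbd-create sr-uuid=<sr-uuid> host-uuid=<host-uuid> device-config:device=<device>
> > > xe pbd-plug uuid=<pbd-uuid>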
> > >
> > >
> > >
> > > Some config output.
> > >
> > >
> > >
> > > [root@----]# xe vdi-list uuid=fc77b366-950b-49be-90ce-2a466cf73502
> > >
> > > uuid ( RO) : fc77b366-950b-49be-90ce-2a466cf73502
> > >
> > > name-label ( RW): Cacti - /
> > >
> > > name-description ( RW): System
> > >
> > > sr-uuid ( RO): 0fe9e89c-e244-5cf2-d35d-1cdca89f798e
> > >
> > > virtual-size ( RO): 8589934592
> > >
> > > sharable ( RO): false
> > >
> > > read-only ( RO): false
> > >
> > >
> > >
> > > [root@----]# xe vbd-list vdi-uuid=fc77b366-950b-49be-90ce-2a466cf73502
> > >
> > > uuid ( RO) : 350d819b-ec36-faf4-5457-0a81668407f0
> > >
> > > vm-uuid ( RO): 03ed2489-49f6-eb48-0819-549c74a96269
> > >
> > > vm-name-label ( RO): Cacti
> > >
> > > vdi-uuid ( RO): fc77b366-950b-49be-90ce-2a466cf73502
> > >
> > > empty ( RO): false
> > >
> > > device ( RO):
> > >
> > >
> > >
> > >
> > >
> > > Is this a known bug, and is there a better solution? I am happy to test
> > > further.
> > >
> > >
> > >
> > > Regards,
> > >
> > >
> > >
> > > Jon
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> >
> >
>
>
_______________________________________________
xen-api mailing list
xen-api@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/mailman/listinfo/xen-api