Hello,
I’m a Linux administrator in charge of the Xen environment in a large organization.
We have about 300 VMs spread across 25 clusters (RHCS) running Xen.
Last night one particular virtual machine rebooted unexpectedly, and I can’t figure out what happened.
xend.log says:
[2010-08-12 18:58:39 xend 14729] ERROR (xmlrpclib2:184) (16, 'Device or resource busy')
Traceback (most recent call last):
  File "/usr/lib64/python2.4/site-packages/xen/util/xmlrpclib2.py", line 162, in _marshaled_dispatch
    response = self._dispatch(method, params)
  File "/usr/lib64/python2.4/SimpleXMLRPCServer.py", line 406, in _dispatch
    return func(*params)
  File "/usr/lib64/python2.4/site-packages/xen/xend/server/XMLRPCServer.py", line 54, in domain
    return fixup_sxpr(info.sxpr())
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 1319, in sxpr
    for config in self.getDeviceConfigurations(cls):
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 1250, in getDeviceConfigurations
    return self.getDeviceController(deviceClass).configurations()
  File "/usr/lib64/python2.4/site-packages/xen/xend/server/DevController.py", line 236, in configurations
    return map(self.configuration, self.deviceIDs())
  File "/usr/lib64/python2.4/site-packages/xen/xend/server/vfbif.py", line 39, in configuration
    r = DevController.configuration(self, devid)
  File "/usr/lib64/python2.4/site-packages/xen/xend/server/DevController.py", line 244, in configuration
    backdomid = xstransact.Read(self.devicePath(devid), "backend-id")
  File "/usr/lib64/python2.4/site-packages/xen/xend/xenstore/xstransact.py", line 297, in Read
    return complete(path, lambda t: t.read(*args))
  File "/usr/lib64/python2.4/site-packages/xen/xend/xenstore/xstransact.py", line 351, in complete
    t = xstransact(path)
  File "/usr/lib64/python2.4/site-packages/xen/xend/xenstore/xstransact.py", line 20, in __init__
    self.transaction = xshandle().transaction_start()
Error: (16, 'Device or resource busy')
[2010-08-12 18:58:39 xend.XendDomainInfo 14729] DEBUG (XendDomainInfo:1036) XendDomainInfo.handleShutdownWatch
[2010-08-12 18:58:39 xend.XendDomainInfo 14729] DEBUG (XendDomainInfo:1036) XendDomainInfo.handleShutdownWatch
The environment looks like this:
Physical hosts: Two Dell R710 servers, each with 144 GB of memory and 8 Intel Xeon X5460 3.16 GHz cores (2 quad-core CPUs), running RHEL 5.3, kernel 2.6.18-128.1.1.el5xen x86_64. Each has 2 bonded LAN NICs (bnx2), 2 bonded heartbeat NICs (e1000e), and one administration NIC. For SAN connectivity we use 2 redundant QLogic ISP2432-based 4Gb HBAs connected to an EMC CLARiiON CX4 storage array. Multipathing is done through EMC PowerPath.
VM: RHEL 5.3, kernel 2.6.18-128.1.1.el5xen x86_64, 2 GB of memory, 2 CPUs. The VM is LV-backed.
This VM had been up and running for months.
Since the reboot it has been running fine again.
Two other VMs were running in the same cluster, using the same storage, the same OS, and the same configuration. Those weren’t affected at all.
I couldn’t find anything in the log files apart from the reboot itself.
Can anyone decipher what the error above is trying to tell me?
Any ideas?
Thanks in advance.
Regards,
Emerson Ribeiro
55 11 4344-8905