xen-users

[Xen-users] Production problem

To: <xen-users@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-users] Production problem
From: Ribeiro Emerson Gomes <Emerson.Ribeiro@xxxxxxxxxx>
Date: Fri, 13 Aug 2010 10:44:07 -0300

Hello,

I’m a Linux administrator in charge of the Xen environment in a large organization.
We have about 300 VMs spread across 25 clusters (RHCS) running Xen.

Last night a particular virtual machine rebooted unexpectedly, and I can’t figure out what happened.

xend.log says:

[2010-08-12 18:58:39 xend 14729] ERROR (xmlrpclib2:184) (16, 'Device or resource busy')
Traceback (most recent call last):
  File "/usr/lib64/python2.4/site-packages/xen/util/xmlrpclib2.py", line 162, in _marshaled_dispatch
    response = self._dispatch(method, params)
  File "/usr/lib64/python2.4/SimpleXMLRPCServer.py", line 406, in _dispatch
    return func(*params)
  File "/usr/lib64/python2.4/site-packages/xen/xend/server/XMLRPCServer.py", line 54, in domain
    return fixup_sxpr(info.sxpr())
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 1319, in sxpr
    for config in self.getDeviceConfigurations(cls):
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 1250, in getDeviceConfigurations
    return self.getDeviceController(deviceClass).configurations()
  File "/usr/lib64/python2.4/site-packages/xen/xend/server/DevController.py", line 236, in configurations
    return map(self.configuration, self.deviceIDs())
  File "/usr/lib64/python2.4/site-packages/xen/xend/server/vfbif.py", line 39, in configuration
    r = DevController.configuration(self, devid)
  File "/usr/lib64/python2.4/site-packages/xen/xend/server/DevController.py", line 244, in configuration
    backdomid = xstransact.Read(self.devicePath(devid), "backend-id")
  File "/usr/lib64/python2.4/site-packages/xen/xend/xenstore/xstransact.py", line 297, in Read
    return complete(path, lambda t: t.read(*args))
  File "/usr/lib64/python2.4/site-packages/xen/xend/xenstore/xstransact.py", line 351, in complete
    t = xstransact(path)
  File "/usr/lib64/python2.4/site-packages/xen/xend/xenstore/xstransact.py", line 20, in __init__
    self.transaction = xshandle().transaction_start()
Error: (16, 'Device or resource busy')
[2010-08-12 18:58:39 xend.XendDomainInfo 14729] DEBUG (XendDomainInfo:1036) XendDomainInfo.handleShutdownWatch
[2010-08-12 18:58:39 xend.XendDomainInfo 14729] DEBUG (XendDomainInfo:1036) XendDomainInfo.handleShutdownWatch
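
For reference, the number in the error tuple looks like a standard errno value. Below is a minimal sketch, assuming the xend.log header layout seen in the excerpt above ([timestamp logger pid] LEVEL (module:line) message), that pulls apart such a line and maps the code to its symbolic name. The helper names parse_log_line and errno_name are made up for illustration and are not part of xend. The sketch only decodes the message; it does not explain why the xenstore transaction could not be started.

```python
import errno
import re

# Assumed layout of an xend.log header line, inferred from the excerpt only:
# [YYYY-MM-DD HH:MM:SS logger pid] LEVEL (module:line) message
LOG_LINE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<logger>\S+) (?P<pid>\d+)\] "
    r"(?P<level>[A-Z]+) \((?P<where>[^)]+)\) (?P<msg>.*)$"
)


def parse_log_line(line):
    """Return the fields of a header line as a dict, or None for other lines
    (for example the traceback lines that follow an ERROR entry)."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def errno_name(code):
    """Map a numeric errno to its symbolic name, e.g. 16 -> 'EBUSY'."""
    return errno.errorcode.get(code, "UNKNOWN")


sample = ("[2010-08-12 18:58:39 xend 14729] ERROR (xmlrpclib2:184) "
          "(16, 'Device or resource busy')")
entry = parse_log_line(sample)

# The message is a tuple-like string; take the leading integer as the errno.
code = int(re.match(r"\((\d+),", entry["msg"]).group(1))
print(entry["ts"], entry["level"], entry["where"], errno_name(code))
```

Run over the whole log, the same parser could be used to list every ERROR entry with its timestamp, which may help line the entries up against the time of the reboot.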

 

The environment looks like this:

Physical hosts: Two Dell R710 servers, 144 GB of memory, 8 Intel Xeon X5460 3.16GHz cores (2 quad-core CPUs), running RHEL 5.3, kernel 2.6.18-128.1.1.el5xen x86_64, with 2 bonded LAN NICs (bnx2), 2 bonded heartbeat NICs (e1000e), plus one administration NIC. To connect to our SAN we use 2 redundant QLogic ISP2432-based 4Gb HBAs attached to an EMC CLARiiON CX4 storage array. Multipathing is done through EMC PowerPath.

VM: RHEL 5.3, kernel 2.6.18-128.1.1.el5xen x86_64, 2 GB of memory, 2 CPUs. The VM is LV-backed.

This VM had been up and running for months. After the reboot it is running fine as well.
There were 2 other VMs running in the same cluster, using the same storage, the same OS, and the same configuration. Those weren’t affected at all.
I couldn’t find any information in the log files, except for the reboot.
Can anyone decipher what the error above is trying to tell us?

Any ideas?

Thanks in advance.

Regards,

Emerson Ribeiro
55 11 4344-8905

 
