Pasi wrote:
> On Mon, Aug 11, 2008 at 10:45:23AM -0600, Jim Fehlig wrote:
>> Ian Jackson wrote:
>>> Jim Fehlig writes ("[Xen-devel] [PATCH] [RFC] Add lock on domain start"):
>>>
>>>> This patch adds a simple lock mechanism when starting domains by placing
>>>> a lock file in xend-domains-path/<dom_uuid>. The lock file is removed
>>>> when domain is stopped. The motivation for such a mechanism is to
>>>> prevent starting the same domain from multiple hosts.
>>>>
>>> I think this should be dealt with in your next-layer-up management
>>> tools.
>>>
>> Perhaps. I wanted to see if there was any interest in having such a
>> feature at the xend layer. If not, I will no longer pursue this option.
>>
>
> Replying a bit late to this.. I think there is demand for this feature!
>
> Many people (mostly in a smaller environments) don't want to use
> 'next-layer-up' management tools..
>
>>> Lockfiles are bad because they can become stale.
>>>
>> Yep. Originally I considered a 'lockless-lock' approach where a bit it
>> set and counter is spun on a 'reserved' sector of vbd, e.g. first
>> sector. Attempting to attach the vbd to another domain would fail if
>> lock bit is set and counter is incrementing. If counter is not
>> incrementing assume lock is stale and proceed. This approach is
>> certainly more complex. We support various image formats (raw, qcow,
>> vmdk, ...) and such an approach may mean changing the format (e.g.
>> qcow3). Wouldn't work for existing images. Who is responsible for
>> spinning the counter? Anyhow seemed like a lot of complexity as
>> compared to the suggested simple approach with override for stale lock.
>>
>
> I assume you guys have this patch included in OpenSuse/SLES Xen rpms.
>
> Is the latest version available from somewhere?
>
> -- Pasi
I have seen such a patch in the SUSE Xen RPM before. Maybe Jim can tell you
its latest status.
In Oracle VM, we add hooks in xend and use an external locking utility.
Currently, we use DLM (distributed lock manager) to manage the domain running
lock, which prevents the same VM from being started on two servers
simultaneously.
We have added hooks to VM start/shutdown/migration to acquire/release the lock.
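For readers without a DLM setup, here is a minimal sketch of what such an external utility could look like. It is not the Oracle VM tool: it swaps DLM for a plain lock file created with O_EXCL, it is written for Python 3, and the VM_LOCK_DIR variable and the default directory are hypothetical names made up for illustration. It only follows the command-line contract documented in the xend-config.sxp hunk of the patch (`<--lock | --unlock> --name <name> --uuid <uuid>`, zero on success).

```python
import argparse
import os
import socket

# Hypothetical default; it would have to live on storage shared by all hosts.
DEFAULT_LOCK_DIR = "/var/lib/xend/running-locks"


def _lock_file(uuid):
    # VM_LOCK_DIR is an assumed override, not something xend knows about.
    lock_dir = os.environ.get("VM_LOCK_DIR", DEFAULT_LOCK_DIR)
    os.makedirs(lock_dir, exist_ok=True)
    return os.path.join(lock_dir, uuid + ".lock")


def acquire(name, uuid):
    """Return 0 if the lock was taken, 1 if a lock file already exists."""
    try:
        # O_CREAT | O_EXCL makes creation fail if the file already exists.
        fd = os.open(_lock_file(uuid), os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return 1
    with os.fdopen(fd, "w") as f:
        # Record who holds the lock, to help an admin judge a stale lock.
        f.write("%s %s\n" % (name, socket.gethostname()))
    return 0


def release(name, uuid):
    """Return 0 if the lock file was removed, 1 if there was none."""
    try:
        os.remove(_lock_file(uuid))
    except FileNotFoundError:
        return 1
    return 0


def main(argv):
    parser = argparse.ArgumentParser(description="toy domain running lock")
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("--lock", action="store_true")
    mode.add_argument("--unlock", action="store_true")
    parser.add_argument("--name", required=True)
    parser.add_argument("--uuid", required=True)
    args = parser.parse_args(argv)
    if args.lock:
        return acquire(args.name, args.uuid)
    return release(args.name, args.uuid)
```

A wrapper script would end with `sys.exit(main(sys.argv[1:]))`. Note that this sketch has exactly the stale-lock problem Ian mentioned above (a crashed host never removes its file), and release() does not check which host owns the lock.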
Note that during migration, we release the lock before starting the migration
process, and the lock is then acquired on the destination side. There is still
a chance for a server other than the destination to acquire the lock in
between, which causes the migration to fail.
I hope someone can give some advice on this.
Here is the patch for your reference.
thanks,
zhigang
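
One caveat about the patch below: it builds the command with os.system() and string formatting, so a domain name containing spaces or shell metacharacters is interpreted by the shell. A possible alternative, sketched here under the assumption that the utility is a single executable path, is to pass an argument list to subprocess.call(). The helper name run_lock_utility is hypothetical and is not part of the patch.

```python
import subprocess


def run_lock_utility(lock_path, action, name, uuid):
    """Run `<lock_path> --lock|--unlock --name NAME --uuid UUID`.

    Returns the utility's exit code (0 means success).
    """
    if action not in ("lock", "unlock"):
        raise ValueError("action must be 'lock' or 'unlock': %r" % (action,))
    # An argument list bypasses the shell, so the name is passed verbatim
    # as one argument, whatever characters it contains.
    argv = [lock_path, "--" + action, "--name", name, "--uuid", uuid]
    return subprocess.call(argv)
```

Unlike os.system(), which on Unix returns the encoded wait status, subprocess.call() returns the exit code directly, so an error message built from it shows the value the utility actually returned.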
diff -Nurp --exclude '*.orig' xen-3.4.0.bak/tools/examples/xend-config.sxp xen-3.4.0/tools/examples/xend-config.sxp
--- xen-3.4.0.bak/tools/examples/xend-config.sxp 2009-08-05 16:17:42.000000000 +0800
+++ xen-3.4.0/tools/examples/xend-config.sxp 2009-08-04 10:23:17.000000000 +0800
@@ -69,6 +69,12 @@
(xend-unix-path /var/lib/xend/xend-socket)
+# External locking utility for get/release domain running lock. By default,
+# no utility is specified. Thus there will be no lock as VM running.
+# The locking utility should accept:
+# <--lock | --unlock> --name <name> --uuid <uuid>
+# command line options, and returns zero on success, others on error.
+#(xend-domains-lock-path '')
# Address and port xend should use for the legacy TCP XMLRPC interface,
# if xend-tcp-xmlrpc-server is set.
diff -Nurp --exclude '*.orig' xen-3.4.0.bak/tools/python/xen/xend/XendDomainInfo.py xen-3.4.0/tools/python/xen/xend/XendDomainInfo.py
--- xen-3.4.0.bak/tools/python/xen/xend/XendDomainInfo.py 2009-08-05 16:17:42.000000000 +0800
+++ xen-3.4.0/tools/python/xen/xend/XendDomainInfo.py 2009-08-05 16:35:35.000000000 +0800
@@ -359,6 +359,8 @@ class XendDomainInfo:
@type state_updated: threading.Condition
@ivar refresh_shutdown_lock: lock for polling shutdown state
@type refresh_shutdown_lock: threading.Condition
+ @ivar running_lock: lock for running VM
+ @type running_lock: bool or None
@ivar _deviceControllers: device controller cache for this domain
@type _deviceControllers: dict 'string' to DevControllers
"""
@@ -427,6 +429,8 @@ class XendDomainInfo:
self.refresh_shutdown_lock = threading.Condition()
self._stateSet(DOM_STATE_HALTED)
+ self.running_lock = None
+
self._deviceControllers = {}
for state in DOM_STATES_OLD:
@@ -453,6 +457,7 @@ class XendDomainInfo:
if self._stateGet() in (XEN_API_VM_POWER_STATE_HALTED,
XEN_API_VM_POWER_STATE_SUSPENDED, XEN_API_VM_POWER_STATE_CRASHED):
try:
+ self.acquire_running_lock();
XendTask.log_progress(0, 30, self._constructDomain)
XendTask.log_progress(31, 60, self._initDomain)
@@ -485,6 +490,7 @@ class XendDomainInfo:
state = self._stateGet()
if state in (DOM_STATE_SUSPENDED, DOM_STATE_HALTED):
try:
+ self.acquire_running_lock();
self._constructDomain()
try:
@@ -2617,6 +2623,11 @@ class XendDomainInfo:
self._stateSet(DOM_STATE_HALTED)
self.domid = None # Do not push into _stateSet()!
+
+ try:
+ self.release_running_lock()
+ except:
+ log.exception("Release running lock failed")
finally:
self.refresh_shutdown_lock.release()
@@ -4073,6 +4084,28 @@ class XendDomainInfo:
params.get('burst', '50K'))
return 1
+    def acquire_running_lock(self):
+        if not self.running_lock:
+            lock_path = xoptions.get_xend_domains_lock_path()
+            if lock_path:
+                status = os.system('%s --lock --name %s --uuid %s' % \
+                                   (lock_path, self.info['name_label'], self.info['uuid']))
+                if status == 0:
+                    self.running_lock = True
+                else:
+                    raise XendError('Acquire running lock failed: %s' % status)
+
+    def release_running_lock(self):
+        if self.running_lock:
+            lock_path = xoptions.get_xend_domains_lock_path()
+            if lock_path:
+                status = os.system('%s --unlock --name %s --uuid %s' % \
+                                   (lock_path, self.info['name_label'], self.info['uuid']))
+                if status == 0:
+                    self.running_lock = False
+                else:
+                    raise XendError('Release running lock failed: %s' % status)
+
def __str__(self):
return '<domain id=%s name=%s memory=%s state=%s>' % \
(str(self.domid), self.info['name_label'],
diff -Nurp --exclude '*.orig' xen-3.4.0.bak/tools/python/xen/xend/XendDomain.py xen-3.4.0/tools/python/xen/xend/XendDomain.py
--- xen-3.4.0.bak/tools/python/xen/xend/XendDomain.py 2009-08-05 16:17:09.000000000 +0800
+++ xen-3.4.0/tools/python/xen/xend/XendDomain.py 2009-08-04 10:23:17.000000000 +0800
@@ -1317,6 +1317,7 @@ class XendDomain:
POWER_STATE_NAMES[dominfo._stateGet()])
""" The following call may raise a XendError exception """
+ dominfo.release_running_lock();
dominfo.testMigrateDevices(True, dst)
if live:
diff -Nurp --exclude '*.orig' xen-3.4.0.bak/tools/python/xen/xend/XendOptions.py xen-3.4.0/tools/python/xen/xend/XendOptions.py
--- xen-3.4.0.bak/tools/python/xen/xend/XendOptions.py 2009-08-05 16:17:42.000000000 +0800
+++ xen-3.4.0/tools/python/xen/xend/XendOptions.py 2009-08-04 10:23:17.000000000 +0800
@@ -281,6 +281,11 @@ class XendOptions:
"""
return self.get_config_string("xend-domains-path",
self.xend_domains_path_default)
+    def get_xend_domains_lock_path(self):
+        """ Get the path of the lock utility for running domains.
+        """
+        return self.get_config_string("xend-domains-lock-path")
+
def get_xend_state_path(self):
""" Get the path for persistent domain configuration storage
"""
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel