
Re: [Xen-devel] [PATCH] xend: do not polling vcpus info if guest state is not RUNNING or PAUSED



On 19/11/13 07:13, Joe Jin wrote:
> When creating a new guest on a NUMA server, xend tries to find the best
> node by examining the vcpu info of all guests. There is a race if another
> guest is rebooting: the guest is still in the list when
> find_relaxed_node() is entered, but has been terminated by the time
> getVCPUInfo() is called, so getVCPUInfo() fails with the error below:
> 
> [2013-09-04 20:01:26 6254] ERROR (XendDomainInfo:496) VM start failed
> Traceback (most recent call last):
>   File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 482, in start
>     XendTask.log_progress(31, 60, self._initDomain)
>   File "/usr/lib64/python2.4/site-packages/xen/xend/XendTask.py", line 209, in log_progress
>     retval = func(*args, **kwds)
>   File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 2918, in _initDomain
>     node = self._setCPUAffinity()
>   File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 2835, in _setCPUAffinity
>     best_node = find_relaxed_node(candidate_node_list)[0]
>   File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 2803, in find_relaxed_node
>     cpuinfo = dom.getVCPUInfo()
>   File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 1600, in getVCPUInfo
>     raise XendError(str(exn))
> XendError: (3, 'No such process')
> 
> This patch makes find_relaxed_node() poll vcpu info only for guests in
> the RUNNING or PAUSED state, to avoid the race.
> 
> Signed-off-by: Joe Jin <joe.jin@xxxxxxxxxx>
> ---
>  tools/python/xen/xend/XendDomainInfo.py |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
> 
> diff --git a/tools/python/xen/xend/XendDomainInfo.py b/tools/python/xen/xend/XendDomainInfo.py
> index e9d3e7e..66e4b9f 100644
> --- a/tools/python/xen/xend/XendDomainInfo.py
> +++ b/tools/python/xen/xend/XendDomainInfo.py
> @@ -2734,6 +2734,8 @@ class XendDomainInfo:
>                  from xen.xend import XendDomain
>                  doms = XendDomain.instance().list('all')
>                  for dom in filter (lambda d: d.domid != self.domid, doms):
> +                    if dom._stateGet() not in (DOM_STATE_RUNNING,DOM_STATE_PAUSED):
> +                        continue

Isn't it possible that the domain reboots and is no longer there between
these two calls (the _stateGet() check and getVCPUInfo())?

IMHO it's very unlikely, but the check only narrows the race: there is
still a window in which getVCPUInfo() could fail.

>                      cpuinfo = dom.getVCPUInfo()
>                      for vcpu in sxp.children(cpuinfo, 'vcpu'):
>                          if sxp.child_value(vcpu, 'online') == 0: continue
> 
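To make the remaining window concrete, here is a minimal sketch of one way
to close it: keep the state check, but also treat a failing getVCPUInfo()
as "domain went away" and skip that domain. This is not the xend code. The
FakeDomain class, the collect_vcpu_info() helper, the constant values and
the XendError stand-in are all hypothetical, so that the snippet runs
without xend installed.

```python
# Stand-ins for xend names; the values and classes here are hypothetical.
DOM_STATE_RUNNING = "running"
DOM_STATE_PAUSED = "paused"
DOM_STATE_SHUTDOWN = "shutdown"


class XendError(Exception):
    """Placeholder for xend's XendError."""


class FakeDomain(object):
    """Hypothetical domain whose backing guest may vanish between calls."""

    def __init__(self, domid, state, vanishes=False):
        self.domid = domid
        self._state = state
        self._vanishes = vanishes

    def _stateGet(self):
        return self._state

    def getVCPUInfo(self):
        if self._vanishes:
            # Simulates the guest being torn down after the state check.
            raise XendError("(3, 'No such process')")
        return ["vcpu-info-for-%d" % self.domid]


def collect_vcpu_info(doms, self_domid):
    """Return {domid: vcpu info} for other domains that could be queried."""
    result = {}
    for dom in doms:
        if dom.domid == self_domid:
            continue
        # Cheap filter: skip domains that are clearly not running or paused.
        if dom._stateGet() not in (DOM_STATE_RUNNING, DOM_STATE_PAUSED):
            continue
        try:
            result[dom.domid] = dom.getVCPUInfo()
        except XendError:
            # The domain disappeared between the two calls; skip it.
            continue
    return result
```

With this shape the state check is only an optimisation, and correctness
comes from handling the failure of the call itself.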

