
Re: [Xen-devel] [PATCH] xen: schedule: allow dom0 vCPUs to be re-pinned when dom0_vcpus_pin is set



On 05/12/12 15:59, Matt Wilson wrote:
> On Wed, Dec 05, 2012 at 10:44:04AM +0000, Andrew Cooper wrote:
>> On 05/12/12 06:02, Matt Wilson wrote:
>>> An administrator may want Xen to pin dom0 vCPUs to pCPUs 1:1 at boot,
>>> but still have the flexibility to change the configuration later.
>>> There's no logic that keys off of domain->is_pinned outside of
>>> sched_init_vcpu() and vcpu_set_affinity(). By adjusting the
>>> is_pinned_vcpu() macro to only check for a single CPU set in the
>>> cpu_affinity mask, dom0 vCPUs can safely be re-pinned after the system
>>> boots.
>> Sadly this patch will break things.  There are certain callers of
>> is_pinned_vcpu() which rely on the value to allow access to certain
>> power related MSRs, which is where the requirement for never permitting
>> an update of the affinity mask comes from.
> If this is true, the existing is_pinned_vcpu() test is broken:
>
>    #define is_pinned_vcpu(v) ((v)->domain->is_pinned || \
>                               cpumask_weight((v)->cpu_affinity) == 1)
>
> It's && not ||. So if someone pins dom0 vCPUs to pCPUs 1:1 after boot,
> the MSR traps will suddenly start working.
>
> See commit: http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=cc0854dd
>
>> When I encountered this problem before, I considered implementing
>> dom0_vcpu_pin=dynamic (or name to suit) which sets up an identity pin at
>> create time, but leaves is_pinned as false. 
> I could implement that, but I want to make sure we're fixing a real
> problem. It sounds like Keir thinks this can be relaxed.
>
> Matt

Hmm - the logic does indeed look broken.  When I was working on this
problem, I was on an older tree which predated this changeset.

With the current logic, the vCPU will transparently move to a different
set of MSRs whenever its affinity is changed, which does not sound safe
for a vCPU unaware of the change.  I did not pay much attention to what
the MSRs actually do.

~Andrew

>
>>> Signed-off-by: Matt Wilson <msw@xxxxxxxxxx>
>>>
>>> diff -r 29247e44df47 -r 2614dd8be3a0 docs/misc/xen-command-line.markdown
>>> --- a/docs/misc/xen-command-line.markdown   Fri Nov 30 21:51:17 2012 +0000
>>> +++ b/docs/misc/xen-command-line.markdown   Wed Dec 05 05:48:23 2012 +0000
>>> @@ -453,7 +453,7 @@ Practices](http://wiki.xen.org/wiki/Xen_
>>>  
>>>  > Default: `false`
>>>  
>>> -Pin dom0 vcpus to their respective pcpus
>>> +Initially pin dom0 vcpus to their respective pcpus
>>>  
>>>  ### e820-mtrr-clip
>>>  > `= <boolean>`
>>> diff -r 29247e44df47 -r 2614dd8be3a0 xen/common/domain.c
>>> --- a/xen/common/domain.c   Fri Nov 30 21:51:17 2012 +0000
>>> +++ b/xen/common/domain.c   Wed Dec 05 05:48:23 2012 +0000
>>> @@ -45,10 +45,6 @@
>>>  /* xen_processor_pmbits: xen control Cx, Px, ... */
>>>  unsigned int xen_processor_pmbits = XEN_PROCESSOR_PM_PX;
>>>  
>>> -/* opt_dom0_vcpus_pin: If true, dom0 VCPUs are pinned. */
>>> -bool_t opt_dom0_vcpus_pin;
>>> -boolean_param("dom0_vcpus_pin", opt_dom0_vcpus_pin);
>>> -
>>>  /* Protect updates/reads (resp.) of domain_list and domain_hash. */
>>>  DEFINE_SPINLOCK(domlist_update_lock);
>>>  DEFINE_RCU_READ_LOCK(domlist_read_lock);
>>> @@ -235,7 +231,6 @@ struct domain *domain_create(
>>>  
>>>      if ( domid == 0 )
>>>      {
>>> -        d->is_pinned = opt_dom0_vcpus_pin;
>>>          d->disable_migrate = 1;
>>>      }
>>>  
>>> diff -r 29247e44df47 -r 2614dd8be3a0 xen/common/schedule.c
>>> --- a/xen/common/schedule.c Fri Nov 30 21:51:17 2012 +0000
>>> +++ b/xen/common/schedule.c Wed Dec 05 05:48:23 2012 +0000
>>> @@ -52,6 +52,11 @@ boolean_param("sched_smt_power_savings",
>>>   * */
>>>  int sched_ratelimit_us = SCHED_DEFAULT_RATELIMIT_US;
>>>  integer_param("sched_ratelimit_us", sched_ratelimit_us);
>>> +
>>> +/* opt_dom0_vcpus_pin: If true, dom0 VCPUs are pinned at boot. */
>>> +bool_t opt_dom0_vcpus_pin;
>>> +boolean_param("dom0_vcpus_pin", opt_dom0_vcpus_pin);
>>> +
>>>  /* Various timer handlers. */
>>>  static void s_timer_fn(void *unused);
>>>  static void vcpu_periodic_timer_fn(void *data);
>>> @@ -194,7 +199,8 @@ int sched_init_vcpu(struct vcpu *v, unsi
>>>       * domain-0 VCPUs, are pinned onto their respective physical CPUs.
>>>       */
>>>      v->processor = processor;
>>> -    if ( is_idle_domain(d) || d->is_pinned )
>>> +
>>> +    if ( is_idle_domain(d) || (d->domain_id == 0 && opt_dom0_vcpus_pin) )
>>>          cpumask_copy(v->cpu_affinity, cpumask_of(processor));
>>>      else
>>>          cpumask_setall(v->cpu_affinity);
>>> @@ -595,8 +601,6 @@ int vcpu_set_affinity(struct vcpu *v, co
>>>      cpumask_t online_affinity;
>>>      cpumask_t *online;
>>>  
>>> -    if ( v->domain->is_pinned )
>>> -        return -EINVAL;
>>>      online = VCPU2ONLINE(v);
>>>      cpumask_and(&online_affinity, affinity, online);
>>>      if ( cpumask_empty(&online_affinity) )
>>> diff -r 29247e44df47 -r 2614dd8be3a0 xen/include/xen/sched.h
>>> --- a/xen/include/xen/sched.h       Fri Nov 30 21:51:17 2012 +0000
>>> +++ b/xen/include/xen/sched.h       Wed Dec 05 05:48:23 2012 +0000
>>> @@ -292,8 +292,6 @@ struct domain
>>>      enum { DOMDYING_alive, DOMDYING_dying, DOMDYING_dead } is_dying;
>>>      /* Domain is paused by controller software? */
>>>      bool_t           is_paused_by_controller;
>>> -    /* Domain's VCPUs are pinned 1:1 to physical CPUs? */
>>> -    bool_t           is_pinned;
>>>  
>>>      /* Are any VCPUs polling event channels (SCHEDOP_poll)? */
>>>  #if MAX_VIRT_CPUS <= BITS_PER_LONG
>>> @@ -713,8 +711,7 @@ void watchdog_domain_destroy(struct doma
>>>  
>>>  #define is_hvm_domain(d) ((d)->is_hvm)
>>>  #define is_hvm_vcpu(v)   (is_hvm_domain(v->domain))
>>> -#define is_pinned_vcpu(v) ((v)->domain->is_pinned || \
>>> -                           cpumask_weight((v)->cpu_affinity) == 1)
>>> +#define is_pinned_vcpu(v) (cpumask_weight((v)->cpu_affinity) == 1)
>>>  #ifdef HAS_PASSTHROUGH
>>>  #define need_iommu(d)    ((d)->need_iommu)
>>>  #else
>>>
>>> _______________________________________________
>>> Xen-devel mailing list
>>> Xen-devel@xxxxxxxxxxxxx
>>> http://lists.xen.org/xen-devel
>> -- 
>> Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
>> T: +44 (0)1223 225 900, http://www.citrix.com

-- 
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com

