
Re: [Xen-devel] [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus



On 14/11/16 18:44, Boris Ostrovsky wrote:
> On 11/14/2016 01:19 PM, Andrew Cooper wrote:
>> On 14/11/16 17:48, Boris Ostrovsky wrote:
>>> On 11/14/2016 12:17 PM, Andrew Cooper wrote:
>>>>>> I am not convinced though that we can start enforcing this new VCPU
>>>>>> count, at least for PV guests. They expect to start all max VCPUs and
>>>>>> then offline them. This, of course, can be fixed but all non-updated
>>>>>> kernels will stop booting.
>>>>> How about we don't clear _VPF_down if the bit in the availability bitmap
>>>>> is not set?
>>>> This is yet another PV mess.  We clearly need to quirk PV guests as the
>>>> exception to sanity, given that they expect (and have been able to)
>>>> online all cpus at start-of-day.
>>>>
>>>> To avoid race conditions, you necessarily need to be able to set a
>>>> reduced permitted map before asking the VM to unplug.
>>>>
>>>> For HVM guests, we can set a proper permitted map at boot, and really
>>>> should do so.
>>>>
>>>> For PV guests, we have to wait until it has completed its SMP bringup
>>>> before reducing the permitted set.  Therefore, making the initial
>>>> set_avail_vcpus call could be deferred until the first unplug request?
>>> I am not sure how we can determine in the hypervisor that a guest has
>>> completed the bringup: I don't think we can rely on the last VCPU (which
>>> is maxvcpus-1) doing VCPUOP_up. Just to mess with the hypervisor, the
>>> guest may decide to only bring up (maxvcpus-2) VCPUs. In other words, we
>>> can't assume a well-behaved guest.
>> I wasn't suggesting relying on the guest.  I was referring to the first
>> unplug request at the toolstack level.
> I don't think waiting for the toolstack's (un)plug request is going to help
> much --- the request may never come and the guest will be able to use
> all maxvcpus.

How does this currently work for PV guests, assuming there is no
explicit user input beyond what was written in the domain
configuration file?

>
>
>>> And then, even if we do determine the point when (maxvcpus-1) VCPUs are
>>> all up, when do we clamp them down to avail_vcpus? For the same reason,
>>> we can't assume that the guest will VCPUOP_down all extra VCPUs.
>> If at some point we observe all vcpus being up, then we could set the
>> restricted map then.  However, I can't think of a useful way of
>> identifying this point.
> Exactly.
>
> The question is then, if we can't do this for PV, should we still do it
> for HVM?

Absolutely.  It is embarrassing that there isn't enforcement for PV.

Getting extra vcpu power for a short while is something that you
typically buy from a cloud provider.
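
For HVM at least, the enforcement itself could be as small as refusing
to clear _VPF_down for a vcpu outside the permitted set.  A rough
sketch only (d->avail_vcpus here is a hypothetical per-domain bitmap
filled in by XEN_DOMCTL_set_avail_vcpus; none of this is existing
code):

/*
 * Sketch: a helper the VCPUOP_up path could consult before clearing
 * _VPF_down.  d->avail_vcpus is a hypothetical bitmap populated by
 * XEN_DOMCTL_set_avail_vcpus.
 */
static bool vcpu_is_available(const struct vcpu *v)
{
    const struct domain *d = v->domain;

    /*
     * PV guests expect to online all max vcpus at start of day, so
     * only enforce the bitmap for HVM guests here.
     */
    if ( !is_hvm_domain(d) )
        return true;

    return test_bit(v->vcpu_id, d->avail_vcpus);
}

/*
 * In the VCPUOP_up handler, before waking the vcpu, something like:
 *
 *     if ( !vcpu_is_available(v) )
 *         return -EPERM;
 *     if ( test_and_clear_bit(_VPF_down, &v->pause_flags) )
 *         vcpu_wake(v);
 */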

>
>>>> It also occurs to me that you necessarily need a get_avail_vcpus
>>>> hypercall to be able to use this interface sensibly from the toolstack.
>>> We could modify getdomaininfo but that would make set_avail_vcpus domctl
>>> non-symmetrical.
>>>
>>> And what would the use of this information be anyway?
>> Well, for a start, this information needs to move in the migration
>> stream, or by migrating a VM you will lose its current availability
>> bitmap and reintroduce the problem we are trying to solve.
> Oh, right, of course.

Everyone forgets migrate.

When we eventually get a checklist for new features, migration will
definitely be on there somewhere.
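
For illustration only, the availability bitmap could travel as one
extra record in the v2 stream, replayed on the restore side via the
new domctl.  The record name, type number and layout below are
placeholders I have made up for the example, not the real stream
format:

#include <stdint.h>

/* Placeholder type number; a real one would need to be allocated. */
#define REC_TYPE_AVAIL_VCPUS   0x000f0000U

struct rec_avail_vcpus {
    uint32_t nr_vcpus;      /* number of meaningful bits in the bitmap */
    uint32_t _res1;         /* padding, must be zero */
    uint64_t bitmap[];      /* availability bitmap, 64-bit aligned,
                               length rounded up to a multiple of 8 bytes */
};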

~Andrew


 

