
Re: Live migration and PV device handling



On 06/04/2020 08:50, Paul Durrant wrote:
>> -----Original Message-----
>> From: Xen-devel <xen-devel-bounces@xxxxxxxxxxxxxxxxxxxx> On Behalf Of Dongli Zhang
>> Sent: 03 April 2020 23:33
>> To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>; Anastassios Nanos <anastassios.nanos@xxxxxxxxxxx>; xen-devel@xxxxxxxxxxxxx
>> Subject: Re: Live migration and PV device handling
>>
>> Hi Andrew,
>>
>> On 4/3/20 5:42 AM, Andrew Cooper wrote:
>>> On 03/04/2020 13:32, Anastassios Nanos wrote:
>>>> Hi all,
>>>>
>>>> I am trying to understand how live migration happens in Xen. I am
>>>> looking at the HVM guest case and I have dug into the relevant parts
>>>> of the toolstack and the hypervisor regarding memory, vCPU context
>>>> etc.
>>>>
>>>> In particular, I am interested in how PV device migration happens. I
>>>> assume that the guest is not aware of any suspend/resume operations
>>>> being done
>>> Sadly, this assumption is not correct.  HVM guests with PV drivers
>>> currently have to be aware in exactly the same way as PV guests.
>>>
>>> Work is in progress to try and address this.  See
>>> https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=775a02452ddf3a6889690de90b1a94eb29c3c732
>>> (sorry - for some reason that doc isn't being rendered properly in
>>> https://xenbits.xen.org/docs/ )

Document rendering now fixed.

https://xenbits.xen.org/docs/unstable/designs/non-cooperative-migration.html

>> I read below from the commit:
>>
>> +* The toolstack choose a randomized domid for initial creation or default
>> +migration, but preserve the source domid non-cooperative migration.
>> +Non-Cooperative migration will have to be denied if the domid is
>> +unavailable on the target host, but randomization of domid on creation
>> +should hopefully minimize the likelihood of this. Non-Cooperative migration
>> +to localhost will clearly not be possible.
>>
>> Does that indicate that, while the scope of domid_t is a single server in
>> the old design, the scope of domid_t is a cluster of servers in the new
>> design?
>>
>> That is, must the domid be unique across all servers in the cluster if we
>> expect non-cooperative migration to always succeed?
>>
> That would be necessary to guarantee success (or rather guarantee no failure
> due to domid clash) but the scope of xl/libxl is a single server, hence
> randomization is the best we have to reduce clashes to a minimum.

Domids are inherently a local concept and will remain so, but a
toolstack managing multiple servers and wanting to use this version of
non-cooperative migration will have to manage domids cluster-wide.
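
To illustrate what cluster-wide management could look like, here is a
minimal sketch of a registry that hands out random domids unique across
all hosts, so a non-cooperative migration can keep the source domid
without clashing on the target. It is not part of xl/libxl or any
existing toolstack: the class and method names are hypothetical, and the
usable domid range (1 to 0x7FEF, below Xen's reserved values) is an
assumption made for the example.

```python
import random

# Assumption: guest domids range from 1 to 0x7FEF (0 is dom0 and values
# from 0x7FF0 upward are treated as reserved).
DOMID_MIN = 1
DOMID_MAX = 0x7FEF


class ClusterDomidRegistry:
    """Hypothetical cluster-wide record of which domids are in use."""

    def __init__(self, rng=None):
        self._owner = {}  # domid -> name of the host currently running it
        self._rng = rng or random.Random()

    def allocate(self, host):
        """Pick a random domid that is unused anywhere in the cluster."""
        if len(self._owner) >= DOMID_MAX - DOMID_MIN + 1:
            raise RuntimeError("no free domid in the cluster")
        while True:
            domid = self._rng.randint(DOMID_MIN, DOMID_MAX)
            if domid not in self._owner:
                self._owner[domid] = host
                return domid

    def migrate(self, domid, target):
        """Record a migration; the domain keeps its domid on the target."""
        if domid not in self._owner:
            raise KeyError(f"domid {domid} is not registered")
        if self._owner[domid] == target:
            # Source and target would need the same domid on one host.
            raise ValueError("cannot migrate to the host already running it")
        self._owner[domid] = target

    def release(self, domid):
        """Free a domid once its domain is destroyed."""
        self._owner.pop(domid, None)
```

Because uniqueness is enforced at allocation time, the migration step
never has to check the target for a clash; the cost is that every domain
creation must consult the shared registry.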

~Andrew



 

