
Re: [Xen-devel] [PATCH RFCv3 3/8] xen: introduce XEN_DOMCTL_set_recipient



"Jan Beulich" <JBeulich@xxxxxxxx> writes:

>>>> On 25.11.14 at 16:41, <vkuznets@xxxxxxxxxx> wrote:
>> "Jan Beulich" <JBeulich@xxxxxxxx> writes:
>>>>>> On 07.10.14 at 15:10, <vkuznets@xxxxxxxxxx> wrote:
>>>> @@ -1764,11 +1765,28 @@ void free_domheap_pages(struct page_info *pg, unsigned int order)
>>>>              scrub = 1;
>>>>          }
>>>>  
>>>> -        if ( unlikely(scrub) )
>>>> -            for ( i = 0; i < (1 << order); i++ )
>>>> -                scrub_one_page(&pg[i]);
>>>> +        if ( !d || !d->recipient || d->recipient->is_dying )
>>>> +        {
>>>> +            if ( unlikely(scrub) )
>>>> +                for ( i = 0; i < (1 << order); i++ )
>>>> +                    scrub_one_page(&pg[i]);
>>>>  
>>>> -        free_heap_pages(pg, order);
>>>> +            free_heap_pages(pg, order);
>>>> +        }
>>>> +        else
>>>> +        {
>>>> +            mfn = page_to_mfn(pg);
>>>> +            gmfn = mfn_to_gmfn(d, mfn);
>>>> +
>>>> +            page_set_owner(pg, NULL);
>>>> +            assign_pages(d->recipient, pg, order, 0);
>>>
>>> This can fail.
>> 
>> ... if the recipient is dying or we're over-allocating. That shouldn't
>> happen (if the toolstack does its job right), but I'll add a check.
>
> You must not rely on the tool stack doing things right (see XSA-77).
>
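
Right, and to be safe here independently of the toolstack, the failure
path can simply fall back to the existing scrub-and-free behaviour.
Roughly what I have in mind (an illustrative sketch only, not the final
patch):

        else
        {
            mfn = page_to_mfn(pg);
            gmfn = mfn_to_gmfn(d, mfn);

            page_set_owner(pg, NULL);
            /*
             * assign_pages() can fail, e.g. if the recipient is dying or
             * the transfer would exceed its allocation; don't rely on the
             * toolstack to prevent that.
             */
            if ( unlikely(assign_pages(d->recipient, pg, order, 0) != 0) )
            {
                /* Recipient refused the pages: scrub (if needed) and free. */
                if ( unlikely(scrub) )
                    for ( i = 0; i < (1 << order); i++ )
                        scrub_one_page(&pg[i]);
                free_heap_pages(pg, order);
            }
            else
            {
                /* ... continue with the GMFN <-> MFN transfer as before ... */
            }
        }
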
>>> What's worse though is that you don't guard against
>>> XEN_DOMCTL_set_recipient zapping d->recipient. If that really is
>>> necessary to be supported, some synchronization will be needed.
>>>
>>> And finally - what's the intended protocol for the tool stack to
>>> know that all pages got transferred (and hence the new domain
>>> can be launched)?
>>>
>> 
>> (hope this answers both questions above)
>> 
>> Now the protocol is:
>> 1) Toolstack queries for all page frames (with xc_get_pfn_type_batch())
>> for the old domain.
>> 2) Toolstack issues XEN_DOMCTL_set_recipient with the recipient domain
>> 3) Toolstack kills the source domain
>> 4) Toolstack waits for a while (a loop keeps checking while we see new
>> pages arriving, plus some timeout).
>> 5) Toolstack issues XEN_DOMCTL_set_recipient with DOMID_INVALID,
>> resetting the recipient.
>> 6) Toolstack checks whether all pages were transferred.
>> 7) If some pages are missing (e.g. the source domain became a zombie
>> holding something), it requests new empty pages instead.
>> 8) Toolstack starts new domain
>> 
>> So we don't actually need XEN_DOMCTL_set_recipient to switch from one
>> recipient to another; we just need a way to reset it.
>
> No, this doesn't address either question. For the first one, you again
> assume the tool stack behaves correctly, which is wrong nowadays.
> And for the second you give a description, but that's not really a
> dependable protocol. Nor do I really follow why resetting the recipient
> is necessary. Can't the tools e.g. wait for the final death of the domain
> if there's no other notification?

Yes, that's possible and should happen in all normal cases. However, my
idea was that it should be possible to start the new domain even if the
old one fails to die while holding several (presumably special) pages,
as long as we're fine with replacing those with empty pages. If we go
for the 'always wait for the original domain to die' solution, resetting
XEN_DOMCTL_set_recipient is not necessary (it is necessary if we don't
wait, as we could receive a page after we have already requested an
empty replacement for it).

>
>>>> --- a/xen/include/public/domctl.h
>>>> +++ b/xen/include/public/domctl.h
>>>> @@ -965,6 +965,23 @@ struct xen_domctl_vnuma {
>>>>  typedef struct xen_domctl_vnuma xen_domctl_vnuma_t;
>>>>  DEFINE_XEN_GUEST_HANDLE(xen_domctl_vnuma_t);
>>>>  
>>>> +/*
>>>> + * XEN_DOMCTL_set_recipient - sets the 'recipient' domain which will receive
>>>> + * all the domain's memory after its death or when any memory page from
>>>> + * the domain's heap is being freed.
>>>
>>> So this specifically allows for this to happen on a non-dying domain.
>>> Why, i.e. what's the intended usage scenario? If not really needed,
>>> please remove this and verify in the handling that the source domain
>>> is dying.
>> 
>> Sorry if I didn't get this comment right. We need a way to tell which
>> domain will receive memory, and XEN_DOMCTL_set_recipient sets (and
>> resets) this target domain. We call it from the toolstack before we
>> start killing the old domain, so it is not dying yet. We can't do it
>> hypervisor-side when our source domain exits doing SHUTDOWN_soft_reset,
>> as there is no recipient domain created yet.
>> 
>> From a general (hypervisor) point of view we don't actually care whether
>> the domain is dying or not. We just want to receive all freed pages from
>> this domain (so they go to some other domain instead of to the trash).
>
> We do care I think, primarily because we want a correct GMFN <->
> MFN mapping. Seeing multiple pages for the same GMFN is for
> example going to cause the whole process in its current form to fail.

Would adding a check that nothing is already mapped at that GMFN before
mapping the new MFN there be a viable solution?
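
Something along these lines, right before the pages get assigned to the
recipient; gfn_is_populated() is just a placeholder for whatever p2m
query is appropriate there:

            /*
             * Illustrative only: refuse the transfer if the recipient
             * already has something mapped at this GMFN, so a given GMFN
             * never ends up backed by more than one MFN.
             */
            if ( gfn_is_populated(d->recipient, gmfn) )
            {
                /* Same fallback as when assign_pages() fails:
                 * scrub (if needed) and free the pages normally. */
                if ( unlikely(scrub) )
                    for ( i = 0; i < (1 << order); i++ )
                        scrub_one_page(&pg[i]);
                free_heap_pages(pg, order);
            }
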

> And considering the need for such a correct mapping - is it always
> the case that the mapping gets updated after a page got removed
> from a guest? (I can see that the mapping doesn't change anymore
> for a dying guest, but you explicitly say that you want/need this to
> work before the guest is actually marked dying.)

In practice the actual reassignment happens for a dying guest only, as
XEN_DOMCTL_set_recipient by itself does nothing (pages are only
transferred as they get freed). If you think it's safer to depend on the
dying flag always being set, we can couple XEN_DOMCTL_set_recipient with
XEN_DOMCTL_destroydomain in one call. That is possible if we go for the
'always wait for the original domain to die' solution (so no reset is
required).
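
For example (just one possible shape, the field name is illustrative),
the destroy domctl could simply carry the recipient, so a recipient can
only ever be set for a domain that is about to be destroyed:

    /* Hypothetical payload for XEN_DOMCTL_destroydomain (currently it has
     * none); DOMID_INVALID would mean "no recipient", i.e. today's
     * behaviour. */
    struct xen_domctl_destroydomain {
        domid_t recipient;  /* IN */
    };
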

>
> Jan

-- 
  Vitaly

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

