
Re: [Xen-devel] vfree crash



On 03.07.2019 15:27, Andrew Cooper wrote:
> On 01/07/2019 11:47, Jan Beulich wrote:
>> On 01.07.2019 10:56, Andrew Cooper wrote:
>>> On 01/07/2019 09:45, Petre Ovidiu PIRCALABU wrote:
>>>> The problem lies with vfree because it creates a new list with the
>>>> pages, unmaps the va pointer and then frees the pages. If I do these
>>>> steps manually (without adding them to a new list) it works.
>>> The problem here is that struct page_info only has a single linked list
>>> pointer, and vfree() blindly assumes it is available for use, which
>>> isn't true once you've called assign_pages() on the vmap area.
>>>
>>> At the moment, it doesn't look like it is possible to set v*alloc()'d
>>> pages up suitably to be mapped by a guest.  (Similar corruption will
>>> occur via share_xen_page_with_guest() and the xenheap list).
>> Well, whoever assigns pages to a domain behind vmalloc()'s back has got
>> to make sure to de-assign those pages before freeing them.
> 
> Why?  Or perhaps more accurately, where is any of this written down.

I take this to be a rhetorical question.

> Allocation of memory seems logically unrelated to making it mappable by
> guests, so when vmalloc() *is* the correct allocation function to use,
> the fact that assign_pages() results in vfree() silently corrupting the
> domain's memory list is unexpected behaviour.

At best this depends on the position you take. Assignment of pages
_always_ transfers their freeing to the refcounting machinery. You
won't find many free_domheap_pages() matching alloc_domheap_pages()
with a non-NULL d (and no special flags). They'll be paired with
put_page() instead. The ones you will find (like in memory_exchange())
have special precautions taken up front.

>> An alternative
>> _might_ be to leave freeing to the normal cleanup processes (when the
>> last page ref gets put), and just vunmap()-ing the range when the mapping
>> isn't needed anymore.
> 
> So this is what I suggested as an interim solution, but I'm not sure if
> it is a sensible option longterm.
> 
> The scenario here is for the "vm-event-ng" interface which was posted as
> an RFC earlier.  Several key purposes for the new interface are to be
> slot-per-vcpu, and to be usable via the acquire_resource infrastructure.
> 
> struct vm_event_st is currently 384 bytes, which is only 10 full structs
> per page.  The size of the structure is liable to change over time, and
> most likely won't evenly divide a page, so vmalloc() is the correct
> allocation interface within Xen.
> 
> The alloc and free in this case is being done as a side effect of the
> vmi enable/disable calls.  The lifetime of the VMI interface isn't the
> same as the lifetime of the VM.
> 
> With HVI specifically, the SVA VM can reboot, and it needs to re-attach
> to the protected VMs.  There are other load balancing scenarios where
> the protection of a VM might logically move between two different SVAs.
> 
> In either case, retaining the first vmalloc() will result in a failure
> to remap the ring, as the domain assignment will be to the old domid.
> 
> Therefore, I think it is important to be able to fully disable and clean
> up the VMI interface at some point before the protected VM is destroyed.
> 
> As a result, I think the proper fix here is to modify vfree() not to
> clobber the pagelist.
> 
> Thoughts?

Beyond what I've said above already, from your description I infer that
it's the monitoring domain which is to own the page. Yet that's in
conflict with you also saying that this domain may want rebooting. In
such a case, the pages need to either not be owned by that domain, or
they need to be re-allocated during every attach operation.

Furthermore - what use would not clobbering the list be? There would
still be the call to free_domheap_page() there, which is legitimate
only if the page (still having an owner) has just been transitioned to
a zero general ref count. What might be possible here is

     for ( i = 0; i < pages; i++ )
     {
         struct page_info *page = vmap_to_page(va + i * PAGE_SIZE);

         ASSERT(page);
         if ( <whatever> )
             put_page(page);
         else
             page_list_add(page, &pg_list);
     }

Or maybe this is actually what you have been thinking of.

Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

