Re: [Xen-devel] Ubuntu 16.04.1 LTS kernel 4.4.0-57 over-allocation and xen-access fail



On 01/10/2017 04:13 PM, Andrew Cooper wrote:
> On 10/01/17 09:06, Razvan Cojocaru wrote:
>> On 01/09/2017 02:54 PM, Andrew Cooper wrote:
>>> On 09/01/17 11:36, Razvan Cojocaru wrote:
>>>> Hello,
>>>>
>>>> We've come across a weird phenomenon: an Ubuntu 16.04.1 LTS HVM guest
>>>> running kernel 4.4.0 installed via XenCenter in XenServer Dundee seems
>>>> to eat up all the RAM it can:
>>>>
>>>> (XEN) [  394.379760] d1v1 Over-allocation for domain 1: 524545 > 524544
>>>>
>>>> This leads to a problem with xen-access, specifically libxc which does
>>>> this in xc_vm_event_enable() (this is Xen 4.6):
>>>>
>>>> ring_page = xc_map_foreign_batch(xch, domain_id, PROT_READ | PROT_WRITE,
>>>>                                  &mmap_pfn, 1);
>>>>
>>>> if ( mmap_pfn & XEN_DOMCTL_PFINFO_XTAB )
>>>> {
>>>>     /* Map failed, populate ring page */
>>>>     rc1 = xc_domain_populate_physmap_exact(xch, domain_id, 1, 0, 0,
>>>>                                                &ring_pfn);
>>>>     if ( rc1 != 0 )
>>>>     {
>>>>         PERROR("Failed to populate ring pfn\n");
>>>>         goto out;
>>>>     }
>>>>
>>>> The first time, everything works: xen-access can map the ring page.
>>>> The second attempt, however, usually fails in the
>>>> xc_domain_populate_physmap_exact() call, and this is dumped in the
>>>> Xen log again (once for each failed attempt):
>>>>
>>>> (XEN) [  395.952188] d0v3 Over-allocation for domain 1: 524545 > 524544
>>> Thinking further about this, what happens if you avoid removing the page
>>> on exit?
>>>
>>> The first populate succeeds, and if you leave the page populated, the
>>> second time you come around the loop, it should not be of type XTAB, and
>>> the map should succeed.
>> Sorry for the late reply, had to put out another fire yesterday.
>>
>> I've taken your recommendation to roughly mean this:
>>
>> diff --git a/xen/common/vm_event.c b/xen/common/vm_event.c
>> index ba9690a..805564b 100644
>> --- a/xen/common/vm_event.c
>> +++ b/xen/common/vm_event.c
>> @@ -100,8 +100,11 @@ static int vm_event_enable(
>>      return 0;
>>
>>   err:
>> +    /*
>>      destroy_ring_for_helper(&ved->ring_page,
>>                              ved->ring_pg_struct);
>> +    */
>> +    ved->ring_page = NULL;
>>      vm_event_ring_unlock(ved);
>>
>>      return rc;
>> @@ -229,9 +232,12 @@ static int vm_event_disable(struct domain *d, struct vm_event_domain *ved)
>>              }
>>          }
>>
>> +        /*
>>          destroy_ring_for_helper(&ved->ring_page,
>>                                  ved->ring_pg_struct);
>> +       */
>>
>> +        ved->ring_page = NULL;
>>          vm_event_cleanup_domain(d);
>>
>>          vm_event_ring_unlock(ved);
>>
>> but this unfortunately still fails to map the page the second time. Do
>> you mean to simply no longer munmap() the ring page from libxc / the
>> client application?
> 
> Neither.
> 
> First of all, I notice that this is probably buggy:
> 
>     ring_pfn = pfn;
>     mmap_pfn = pfn;
>     rc1 = xc_get_pfn_type_batch(xch, domain_id, 1, &mmap_pfn);
>     if ( rc1 || mmap_pfn & XEN_DOMCTL_PFINFO_XTAB )
>     {
>         /* Page not in the physmap, try to populate it */
>         rc1 = xc_domain_populate_physmap_exact(xch, domain_id, 1, 0, 0,
>                                               &ring_pfn);
>         if ( rc1 != 0 )
>         {
>             PERROR("Failed to populate ring pfn\n");
>             goto out;
>         }
>     }
> 
> A failure of xc_get_pfn_type_batch() is not a suggestion that population
> might work.
> 
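
The distinction drawn above (a failed type query versus a page that is merely absent from the physmap) can be sketched as a small decision function. This is only an illustration, not libxc code: `ring_next_step`, `enum ring_action` and `SKETCH_PFINFO_XTAB` are hypothetical names, and the constant is a stand-in for XEN_DOMCTL_PFINFO_XTAB from Xen's public headers.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for XEN_DOMCTL_PFINFO_XTAB (assumption: the real definition
 * comes from Xen's public domctl.h and should be used instead). */
#define SKETCH_PFINFO_XTAB (UINT64_C(0xf) << 28)

/* Hypothetical outcome of inspecting the ring pfn before mapping it. */
enum ring_action { RING_ABORT, RING_POPULATE, RING_MAP_ONLY };

/*
 * query_rc: return code of the type query (0 on success).
 * pfn_type: the type word the query wrote back; only meaningful
 *           when query_rc == 0.
 */
enum ring_action ring_next_step(int query_rc, uint64_t pfn_type)
{
    if (query_rc != 0)
        return RING_ABORT;      /* query failed: the type word is not trustworthy */
    if (pfn_type & SKETCH_PFINFO_XTAB)
        return RING_POPULATE;   /* page genuinely not in the physmap */
    return RING_MAP_ONLY;       /* page already present, just map it */
}

/* Usage: the four combinations a caller can observe. */
void ring_next_step_selfcheck(void)
{
    assert(ring_next_step(-1, 0) == RING_ABORT);
    assert(ring_next_step(-1, SKETCH_PFINFO_XTAB) == RING_ABORT);
    assert(ring_next_step(0, SKETCH_PFINFO_XTAB) == RING_POPULATE);
    assert(ring_next_step(0, 0) == RING_MAP_ONLY);
}
```

Compared with `if ( rc1 || mmap_pfn & XEN_DOMCTL_PFINFO_XTAB )`, the only behavioural change in this sketch is that a failed query aborts instead of falling through to the populate path.
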
> 
> What I meant was taking out this call:
> 
>     /* Remove the ring_pfn from the guest's physmap */
>     rc1 = xc_domain_decrease_reservation_exact(xch, domain_id, 1, 0, &ring_pfn);
>     if ( rc1 != 0 )
>         PERROR("Failed to remove ring page from guest physmap");
> 
> To leave the frame in the guest physmap.  The issue is fundamentally
> that after this frame has been taken out, something kicks the VM to
> realise it has an extra frame of balloonable space, which it clearly
> compensates for.
> 
> You can work around the added attack surface by marking it RO in EPT;
> neither Xen's nor dom0's mappings are translated via EPT, so they can
> still make updates, but the guest won't be able to write to it.
> 
> I should say that this is all a gross hack, and is in desperate need of
> a proper API to make rings entirely outside of the gfn space, but this
> hack should work for now.
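
For readers following the accounting argument, here is a toy model of why the second populate fails and why leaving the frame in place avoids it. It is a deliberately simplified sketch: every `toy_*` name is hypothetical, the page is treated as freed the moment it leaves the physmap, and the guest's balloon driver is reduced to "claim pages until refused". The limit 524544 is taken from the log lines quoted earlier; the starting value of one page below it is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy view of a domain's page accounting (not Xen's struct domain). */
struct toy_domain {
    unsigned long tot_pages;   /* pages currently allocated to the domain */
    unsigned long max_pages;   /* allocation limit */
};

/* Try to add one page; refuse when that would exceed the limit,
 * which is the condition reported as "Over-allocation" in the log. */
bool toy_populate(struct toy_domain *d)
{
    if (d->tot_pages + 1 > d->max_pages)
        return false;
    d->tot_pages++;
    return true;
}

/* Give one page back (the decrease-reservation step, simplified). */
void toy_decrease(struct toy_domain *d)
{
    assert(d->tot_pages > 0);
    d->tot_pages--;
}

/* Guest balloon driver: claim pages until refused. Returns pages gained. */
unsigned long toy_guest_balloon_fill(struct toy_domain *d)
{
    unsigned long gained = 0;
    while (toy_populate(d))
        gained++;
    return gained;
}

/* One enable attempt. Returns true if the ring page could be set up.
 * remove_after_map selects the original behaviour (true) or the
 * workaround of leaving the frame in the physmap (false). */
bool toy_enable(struct toy_domain *d, bool *ring_in_physmap,
                bool remove_after_map)
{
    if (!*ring_in_physmap) {
        if (!toy_populate(d))
            return false;
        *ring_in_physmap = true;
    }
    if (remove_after_map) {
        toy_decrease(d);
        *ring_in_physmap = false;
    }
    return true;
}
```

Starting from `{ 524543, 524544 }` with the ring absent, `toy_enable(..., true)` succeeds once, `toy_guest_balloon_fill()` then takes the freed page, and the next `toy_enable(..., true)` fails. With `remove_after_map` set to false the guest finds no spare page to claim and the second enable succeeds without populating anything.
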

Thanks! So far, this seems to work like a charm:

diff --git a/tools/libxc/xc_vm_event.c b/tools/libxc/xc_vm_event.c
index 2fef96a..5dd00a6 100644
--- a/tools/libxc/xc_vm_event.c
+++ b/tools/libxc/xc_vm_event.c
@@ -130,9 +130,17 @@ void *xc_vm_event_enable(xc_interface *xch, domid_t domain_id, int param,
     }

     /* Remove the ring_pfn from the guest's physmap */
+    /*
     rc1 = xc_domain_decrease_reservation_exact(xch, domain_id, 1, 0, &ring_pfn);
     if ( rc1 != 0 )
         PERROR("Failed to remove ring page from guest physmap");
+    */
+
+    if ( xc_set_mem_access(xch, domain_id, XENMEM_access_r, mmap_pfn, 1) )
+    {
+        PERROR("Could not set ring page read-only\n");
+        goto out;
+    }

  out:
     saved_errno = errno;

Should I send this as a patch for mainline as well?


Thanks,
Razvan
