
Re: [PATCH] xen/x86: Fix memory leak in vcpu_create() error path


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Fri, 18 Dec 2020 13:58:19 +0000
  • Cc: Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxxxxx>, Tim Deegan <tim@xxxxxxx>, Michał Leszczyński <michal.leszczynski@xxxxxxx>
  • Delivery-date: Fri, 18 Dec 2020 13:58:29 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 18/12/2020 08:27, Jan Beulich wrote:
> On 17.12.2020 22:46, Andrew Cooper wrote:
>> On 29/09/2020 07:18, Jan Beulich wrote:
>>> On 28.09.2020 17:47, Andrew Cooper wrote:
>>>> --- a/xen/arch/x86/mm/hap/hap.c
>>>> +++ b/xen/arch/x86/mm/hap/hap.c
>>>> @@ -563,30 +563,37 @@ void hap_final_teardown(struct domain *d)
>>>>      paging_unlock(d);
>>>>  }
>>>>  
>>>> +void hap_vcpu_teardown(struct vcpu *v)
>>>> +{
>>>> +    struct domain *d = v->domain;
>>>> +    mfn_t mfn;
>>>> +
>>>> +    paging_lock(d);
>>>> +
>>>> +    if ( !paging_mode_hap(d) || !v->arch.paging.mode )
>>>> +        goto out;
>>> Any particular reason you don't use paging_get_hostmode() (as the
>>> original code did) here? Any particular reason for the seemingly
>>> redundant (and hence somewhat in conflict with the description's
>>> "with the minimum number of safety checks possible")
>>> paging_mode_hap()?
>> Yes to both.  As you spotted, I converted the shadow side first, and
>> made the two consistent.
>>
>> The paging_mode_{shadow,hap}() check is necessary for idempotency.  These
>> functions really might get called before paging is set up, for an early
>> failure in domain_create().
> In which case how would v->arch.paging.mode be non-NULL already?
> They get set in {hap,shadow}_vcpu_init() only.

Right, but we also might end up here with an error early in
vcpu_create(), where d->arch.paging is set up, but v->arch.paging isn't.

This logic needs to be safe to use at any point of partial initialisation.

(And to be clear, I found I needed both of these based on some
artificial error injection testing.)
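To illustrate, this is the shape it gives (a sketch only, mirroring the
hunk quoted above, with the real teardown work elided):

    void hap_vcpu_teardown(struct vcpu *v)
    {
        struct domain *d = v->domain;

        paging_lock(d);

        /* d->arch.paging might not be set up yet: an early failure in
         * domain_create(). */
        if ( !paging_mode_hap(d) ||
             /* v->arch.paging might not be set up yet: an early failure
              * in vcpu_create(). */
             !v->arch.paging.mode )
            goto out;

        /* ... actual per-vCPU teardown ... */

     out:
        paging_unlock(d);
    }

Each check guards one of the two partial-initialisation windows, which is
why neither is redundant.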

>> The paging mode has nothing really to do with hostmode/guestmode/etc.
>> Open-coding the check is the only way of expressing the logic in which
>> it is clear that the subsequent pointer dereferences are trivially safe.
> Well, yes and no - the other uses should then of course also use
> paging_get_hostmode(), as various wrappers in paging.h do. Or else
> I question why we have paging_get_hostmode() in the first place.

I'm not convinced it is an appropriate abstraction to have, and I don't
expect it to survive the nested virt work.
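For reference, the wrapper is (if memory serves; paging.h) nothing more
than a trivial accessor:

    static inline const struct paging_mode *paging_get_hostmode(struct vcpu *v)
    {
        /* Trivial today, but nothing guarantees it stays this way. */
        return v->arch.paging.mode;
    }

i.e. using it in the NULL check above would hide the very pointer the
check is guarding, and any indirection later added to the wrapper would
invalidate the safety check silently.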

> There are more examples in shadow code where this
> gets open-coded when it probably shouldn't be. There haven't been
> any such cases in HAP code so far ...

Doesn't matter.  Its use here would obfuscate the code (this is one part
of why I think it is a bad abstraction to begin with), and if the
implementation ever changed, the function would lose its safety.

> Additionally (noticing only now) in the shadow case you may now
> loop over all vCPU-s in shadow_teardown() just for
> shadow_vcpu_teardown() to bail right away. Wouldn't it make sense
> to retain the "if ( shadow_mode_enabled(d) )" there around the
> loop?

I'm not entirely convinced that was safe.  Irrespective, see the TODO.
The foreach_vcpu() is only a stopgap until some structural cleanup
changes come along (which I had queued behind this patch anyway).
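i.e. the stopgap in shadow_teardown() is just (sketch; foreach_vcpu()
above being Xen's for_each_vcpu macro):

    /* Each call bails immediately via the checks shown earlier when
     * there is nothing to do, so no outer shadow_mode_enabled(d) guard
     * around the loop is needed. */
    for_each_vcpu ( d, v )
        shadow_vcpu_teardown(v);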

~Andrew
