[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 1/4] xen/dmalloc: Introduce dmalloc() APIs



On 14.01.2021 00:16, Andrew Cooper wrote:
> On 05/01/2021 15:56, Jan Beulich wrote:
>> On 23.12.2020 17:34, Andrew Cooper wrote:
>>> RFC:
>>>  * This probably wants to be less fatal in release builds
>> I'm not even convinced this wants to be a panic() in debug builds.
> 
> Any memory leak spotted by this is an XSA, except in the narrow case of
> being due exclusively to a weird and non-default order of hypercalls.
> 
> It absolutely needs to be something fatal in debug builds, for avoid
> going unnoticed by testing.

This is a perfectly valid position to take, but I'm afraid once
again isn't the only possible one. Since I do routinely look at
logs, I'm personally in favor of avoiding crashing the host
even for debug builds. The more that I've had pretty bad fallout
from crashes in the past, due to (afaict) bugs in a file system
driver in Linux that persisted over a longer period of time.

>  Patch 4 is my proposed solution for this,
> which will hopefully prevent bugs from escaping staging.
> 
> For release builds, a real memory leak is less bad behaviour than taking
> the host down, but it certainly shouldn't shouldn't go unnoticed.

Of course - it absolutely needs logging.

>>>  * In an ideal world, we'd also want to count the total number of bytes
>>>    allocated from the xmalloc heap, which would be interesting to print in 
>>> the
>>>    'q' debugkey.  However, that data is fairly invasive to obtain.
>> Unless we used an xmem_pool rather than the new interface being
>> a thin layer around xmalloc(). (This would then also provide
>> better locality of the allocations, i.e. different domains
>> wouldn't share allocations from the same page.)
> 
> I'm not sure if using a separate memory pool is a clear cut improvement.

Neither am I, but it's an option to at least consider.

> For an attacker which has a corruption primitive, a single shared pool
> will reduce the chance of the exploit being stable across different
> systems.  Improved locality of allocations is an advantage from the
> attackers point of view, but the proper way to combat that is with a
> real randomised heap allocator.
> 
> Sharing within the same page isn't an issue, so long as we respect a
> cache line minimum allocation size.
> 
> More importantly however, until we know exactly how much memory we're
> talking about here, switching to a per-domain pool might be a massive
> waste.  The xmalloc() allocations are in the noise compared to RAM, and
> might only be a small multiple of the pool metadata to begin with.
> 
>> And even without doing so, adding a function to retrieve the actual size
>> shouldn't be all that difficult - internally xmalloc_tlsf.c
>> knows the size, after all, for e.g. xrealloc() to work right.
> 
> Yeah - the internals of the pool can calculate this.  I was considering
> doing just this, but wasn't sure how exposing an API for this would go down.
> 
> For maximum flexibility, it would be my preferred way forward.

I don't seen an issue with exposing such an API, so long as it's
made clear what purposes we want to permit it to be used for.

>>> +#define DFREE(d, p)                             \
>>> +    do {                                        \
>>> +        dfree(d, p);                            \
>>> +        (p) = NULL;                             \
>>> +    } while ( 0 )
>>> +
>>> +
>>> +void *_dzalloc(struct domain *d, size_t size, size_t align);
>>> +
>>> +static inline void *_dzalloc_array(struct domain *d, size_t size,
>>> +                                   size_t align, size_t num)
>>> +{
>>> +    return _dzalloc(d, size * num, align);
>> No protection at all against the multiplication overflowing?
> 
> Well - xmalloc doesn't have any either.  But yes - both want some
> compiler-aided overflow detection, rather than messing around with
> arbitrary limits pretending to be an overflow check.

You did suggest this previously, for xmalloc(). And I did look
into doing so, but either ran into some issue or simply didn't
see the point, considering the code we already have. Yes, the
use of UINT_MAX may seem somewhat arbitrary. We could make this
less arbitrary by deriving from MAX_ORDER instead, or by adding
a suitable BUILD_BUG_ON(UINT_MAX < ...). After all there's no
point even just trying allocations exceeding MAX_ORDER.

The reason for my comment was that the functions here are
specifically _not_ a simple copy of their xmalloc()
counterparts.

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.