[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] arm/p2m: Fix regression during domain shutdown with active mem_access



On 24/01/2017 23:38, Tamas K Lengyel wrote:
> On Tue, Jan 24, 2017 at 4:19 PM, Julien Grall <julien.grall@xxxxxxx> wrote:
>>
>> On 24/01/2017 22:45, Tamas K Lengyel wrote:
>>> On Tue, Jan 24, 2017 at 3:32 PM, Julien Grall <julien.grall@xxxxxxx>
>>> wrote:
>>>>
>>>>
>>>> On 24/01/2017 22:19, Tamas K Lengyel wrote:
>>>>>
>>>>> On Tue, Jan 24, 2017 at 3:13 PM, Julien Grall <julien.grall@xxxxxxx>
>>>>> wrote:
>>>>>>
>>>>>> Hi Tamas,
>>>>>>
>>>>>> On 24/01/2017 22:10, Tamas K Lengyel wrote:
>>>>>>>
>>>>>>>
>>>>>>> The change in commit 438c5fe4f0c introduced a regression for domains
>>>>>>> where
>>>>>>> mem_access is or was active. When relinquish_p2m_mapping attempts to
>>>>>>> clear
>>>>>>> a page where the order is not 0 the following ASSERT is triggered:
>>>>>>>
>>>>>>>     ASSERT(!p2m->mem_access_enabled || page_order == 0);
>>>>>>>
>>>>>>> This regression was unfortunately not caught during testing in
>>>>>>> preparation
>>>>>>> for the 4.8 release.
>>>>>>>
>>>>>>> As at this point during domain shutdown it is safe to skip mem_access
>>>>>>> paths
>>>>>>> altogether (pages are being relinquished), this patch flips the
>>>>>>> mem_access_enabled flag to forgo any radix-tree lookups and to avoid
>>>>>>> tripping the ASSERT.
>>>>>>>
>>>>>>> Ideally this fix would be part of Xen 4.8.1.
>>>>>>
>>>>>>
>>>>>>
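The approach described in the quoted patch can be illustrated with a small stand-alone sketch. The struct and helpers below are hypothetical simplifications, not the real definitions from Xen's ARM p2m code:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical minimal stand-in for Xen's struct p2m_domain
 * (the real one lives in xen/include/asm-arm/p2m.h). */
struct p2m_domain {
    bool mem_access_enabled;
};

/* Per-entry removal containing the ASSERT from commit 438c5fe4f0c
 * that the patch works around. */
static void remove_entry(struct p2m_domain *p2m, unsigned int page_order)
{
    assert(!p2m->mem_access_enabled || page_order == 0);
    /* ... actually clear the mapping ... */
}

/* Sketch of the patch's approach: flip mem_access_enabled before the
 * walk, so the ASSERT can no longer trip and the per-entry radix-tree
 * lookups are skipped for every relinquished page. */
static void relinquish_p2m_mapping(struct p2m_domain *p2m)
{
    p2m->mem_access_enabled = false;   /* safe: the domain is dying */
    remove_entry(p2m, 2);              /* order > 0 no longer asserts */
}
```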
>>>>>> How about fixing the ASSERT rather than turning off mem_access crudely?
>>>>>>
>>>>>> For instance, by checking whether the domain is dying.
>>>>>>
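Julien's alternative can be sketched as relaxing the ASSERT itself rather than clearing the flag. Again a hypothetical, self-contained simplification: `is_dying` is a real field of Xen's struct domain, but the surrounding types here are stand-ins:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the Xen types involved. */
struct domain     { bool is_dying; };
struct p2m_domain { bool mem_access_enabled; struct domain *domain; };

/* Relax the ASSERT for a dying domain instead of flipping the flag:
 * superpage (order > 0) removals are then tolerated only during
 * shutdown, while the check stays intact for live domains. */
static void remove_entry(struct p2m_domain *p2m, unsigned int page_order)
{
    assert(!p2m->mem_access_enabled || page_order == 0 ||
           p2m->domain->is_dying);
    /* ... actually clear the mapping ... */
}
```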
>>>>> We can do that too if preferred. This way though we also shortcut all
>>>>> calls to p2m_mem_access_radix_set, so shutdown would be faster.
>>>>
>>>>
>>>> Well yes and no. You will have to free the radix tree afterwards anyway.
>>>> So if you disable mem_access, the radix tree will be fully freed in
>>>> p2m_teardown.
>>>>
>>>> That may make destroying a domain faster. But you may notice a lag on
>>>> the CPU for a bit, as p2m_teardown is not preemptible
>>>> (radix_tree_destroy can take some time depending on the size of the
>>>> tree). On the other hand, p2m_relinquish is preemptible, so removing
>>>> elements one by one will be slower but doesn't introduce the potential
>>>> lag in p2m_teardown, as the function will be preempted if there is work
>>>> to do (such as an interrupt that needs to be injected into the guest
>>>> issuing the destroy hypercall).
>>>>
>>>> If I had to choose, I would go with freeing elements one by one, as it
>>>> potentially avoids the lag. But Stefano may disagree here.
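The preemption trade-off described above can be sketched as follows, assuming a `hypercall_preempt_check()`-style hook as in Xen; all names here are illustrative stand-ins, not the real implementation:

```c
#include <assert.h>
#include <stdbool.h>

#define ERESTART 85  /* illustrative errno-style value */

/* Stubbed-out preemption hook; in Xen this would check for pending
 * work (e.g. interrupts to inject) on the current CPU. */
static bool hypercall_preempt_check(void) { return false; }

/* Preemptible removal, one entry at a time: slower overall, but the
 * hypercall can be restarted, so there is no single uninterruptible
 * stretch the way a one-shot radix_tree_destroy() would cause with
 * ~800000 entries. */
static int relinquish_preemptible(unsigned long *remaining)
{
    while (*remaining) {
        --*remaining;                 /* remove one p2m entry */
        if (hypercall_preempt_check())
            return -ERESTART;         /* bail out, resume later */
    }
    return 0;                         /* all entries removed */
}
```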
>>>>
>>> Preemption makes sense. OTOH the teardown gets called for all GFNs,
>>> while the radix tree is likely just a subset (potentially only a couple
>>> of GFNs).
>>
>> How did you deduce this number? It is a valid use case to protect all the
>> RAM, and it is what xen-memaccess does by default. So if your guest has
>> 3GB of RAM, you would have ~800000 entries.
>>
>>> So instead of checking all possible gfns if they are
>>> present in the tree and then removing them, just to be followed by
>>> destroying the tree, we can just destroy the tree once and be done
>>> with it. Since the most expensive operation with the tree are adding
>>> and removing nodes, IMHO we should skip that if possible.
>>
>> If you look at the code, p2m_get_entry is taking a NULL pointer for the
>> access so the code will not retrieve the memaccess permission. The only
>> operation memaccess related done by p2m_relinquish will be removing an
>> element from the tree.
>>
>> You seem to only have in mind that destroying a domain will be faster with
>> this patch. You are right that it will be faster, but you also need to
>> keep in mind the impact on Xen and the rest of the platform. Any processor
>> running Xen code is not preemptible unless the code itself allows it. This
>> means that we need to break down expensive tasks to avoid "locking" a
>> processor for too long.
>>
>> The function radix_tree_destroy could be very expensive to run, as you
>> have to walk the tree to free the associated memory. A couple of entries
>> would be fast; now imagine 800000 entries (and even more on guests with
>> large amounts of memory).
>>
>> So yes, I prefer to have a domain destroyed more "slowly", but at least
>> there will be less impact on others.
> Actually, if users do follow the expected steps for properly cleaning
> up mem_access - as illustrated by xen-access - then they will first
> call xc_set_mem_access to reset all GFNs to the default permission. If
> this simple best-practice routine is followed, the radix tree is
> already empty by this point. In that case there is very little
> difference between flipping mem_access_enabled and just checking
> whether the tree is empty - just an extra function call. So it's fine
> either way. I'll just change the ASSERT.
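The emptiness check mentioned above is cheap. As a hypothetical sketch (the field name is an illustrative stand-in, not Xen's actual radix-tree layout):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-in for a radix tree root. */
struct radix_tree_root { void *rnode; };

/* If the toolstack already reset every GFN to the default permission,
 * the tree holds no nodes and this single NULL check suffices, making
 * the flag flip and the emptiness test nearly equivalent in cost. */
static bool mem_access_tree_empty(const struct radix_tree_root *root)
{
    return root->rnode == NULL;
}
```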

Sadly, the important case is never when people are following best
practice.  (Yay, it's fast, but...)

The code needs to be able to cope sensibly when someone (maliciously or
accidentally) has managed to stack everything in their favour, and
against Xen's.

This comment isn't specific to the issue here (I have no vested interest
in the specifics of how it is resolved), but a guideline in general.
To avoid security and/or performance problems, Xen needs to function by
bounding the worst case to something tolerable, rather than optimising
for the expectation of the common case.  (If you can do both, then
fantastic :))

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

