|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v3] x86/mm: Short circuit damage from "fishy" ref/typecount failure
On 29.01.2021 17:17, Andrew Cooper wrote:
> On 29/01/2021 11:29, Jan Beulich wrote:
>> On 25.01.2021 18:59, Andrew Cooper wrote:
>>> On 20/01/2021 08:06, Jan Beulich wrote:
>>>> Also, as far as "impossible" here goes - the constructs all
>>>> anyway exist only to deal with what we consider impossible.
>>>> The question therefore really is of almost exclusively
>>>> theoretical nature, and hence something like a counter
>>>> possibly overflowing imo needs to be accounted for as
>>>> theoretically possible, albeit impossible with today's
>>>> computers and realistic timing assumptions. If a counter
>>>> overflow occurred, it definitely wouldn't be because of a
>>>> bug in Xen, but because of abnormal behavior elsewhere.
>>>> Hence I remain unconvinced it is appropriate to deal with
>>>> the situation by BUG().
>>> I'm not sure how to be any clearer.
>>>
>>> I am literally not changing the current behaviour. Xen *will* hit a
>>> BUG() if any of these domain_crash() paths are taken.
>>>
>>> If you do not believe me, then please go and actually check what happens
>>> when simulating a ref-acquisition failure.
>> So I've now also played the same game on the ioreq path (see
>> debugging patch below, and again with some non-"//temp"
>> changes actually improving overall behavior in that "impossible"
>> case). No BUG()s hit, no leaks (thanks to the extra changes),
>> no other anomalies observed.
>>
>> Hence I'm afraid it is now really up to you to point out the
>> specific BUG()s (and additional context as necessary) that you
>> either believe could be hit, or that you have observed being hit.
>
> The refcounting logic was taken verbatim from ioreq, with the only
> difference being an order greater than 0. The logic is also identical
> to the vlapic logic.
>
> And the reason *why* it bugs is obvious - the cleanup logic
> unconditionally put()'s refs it never took to begin with, and hits
> underflow bugchecks.
In current staging, neither vmx_alloc_vlapic_mapping() nor
hvm_alloc_ioreq_mfn() put any refs they couldn't get. Hence
my failed attempt to repro your claimed misbehavior.
> For specifics, a simulated regular ref failure:
>
> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index 1051d86a20..314d258e31 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -171,9 +171,14 @@ static int vmtrace_alloc_buffer(struct vcpu *v)
> v->vmtrace.buf = pg;
>
> for ( i = 0; i < d->vmtrace_frames; i++ )
> + {
> + if ( i == 0 )
> + return -ENOMEM;
> +
> /* Domain can't know about this page yet - something fishy
> going on. */
> if ( !get_page_and_type(&pg[i], d, PGT_writable_page) )
> BUG();
> + }
No bad put here. This should have BUG() replaced just like
the other two instances mentioned above have it.
> and the simulated typeref failure:
>
> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index db845ccc81..bd810157f4 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -172,8 +172,16 @@ static int vmtrace_alloc_buffer(struct vcpu *v)
>
> for ( i = 0; i < d->vmtrace_frames; i++ )
> {
> + get_page(&pg[i], d);
> +
> if ( i == 0 )
> + {
> + put_page(&pg[i]);
This of course if wrong is the get_page() failed. Assuming it
succeeds I don't see what's wrong here.
> return -ENOMEM;
> + }
> +
> + get_page_type(&pg[i], PGT_writable_page);
> + continue;
>
> /* Domain can't know about this page yet - something fishy
> going on. */
> if ( !get_page_and_type(&pg[i], d, PGT_writable_page) )
>
> both yield:
>
> (XEN) Xen BUG at /local/xen.git/xen/include/xen/mm.h:610
> (XEN) RIP: e008:[<ffff82d04020423e>]
> common/domain.c#vmtrace_free_buffer+0x57/0xc1
> (XEN) Xen call trace:
> (XEN) [<ffff82d04020423e>] R
> common/domain.c#vmtrace_free_buffer+0x57/0xc1
> (XEN) [<ffff82d040205497>] F vcpu_create+0x245/0x32b
> (XEN) [<ffff82d04023ae5b>] F do_domctl+0xb48/0x1964
> (XEN) [<ffff82d04030c6b2>] F pv_hypercall+0x2e4/0x53d
> (XEN) [<ffff82d04039045d>] F lstar_enter+0x12d/0x140
So maybe vmtrace_free_buffer() is broken?
Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |