
Re: [PATCH v8 08/16] xen/domain: Add vmtrace_size domain creation parameter


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Mon, 1 Feb 2021 22:14:32 +0000
  • Cc: Michał Leszczyński <michal.leszczynski@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Tamas K Lengyel <tamas@xxxxxxxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Mon, 01 Feb 2021 22:14:56 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 01/02/2021 14:36, Jan Beulich wrote:
> On 01.02.2021 15:22, Andrew Cooper wrote:
>> On 01/02/2021 13:18, Jan Beulich wrote:
>>> On 30.01.2021 03:58, Andrew Cooper wrote:
>>>> +static int vmtrace_alloc_buffer(struct vcpu *v)
>>>> +{
>>>> +    struct domain *d = v->domain;
>>>> +    struct page_info *pg;
>>>> +    unsigned int i;
>>>> +
>>>> +    if ( !d->vmtrace_size )
>>>> +        return 0;
>>>> +
>>>> +    pg = alloc_domheap_pages(d, get_order_from_bytes(d->vmtrace_size),
>>>> +                             MEMF_no_refcount);
>>>> +    if ( !pg )
>>>> +        return -ENOMEM;
>>>> +
>>>> +    /*
>>>> +     * Getting the reference counting correct here is hard.
>>>> +     *
>>>> +     * All pages are now on the domlist.  They, or subranges within, will be
>>> "domlist" is too imprecise, as there's no list with this name. It's
>>> extra_page_list in this case (see also below).
>>>
>>>> +     * freed when their reference count drops to zero, which may be any time
>>>> +     * between now and the domain teardown path.
>>>> +     */
>>>> +
>>>> +    for ( i = 0; i < (d->vmtrace_size >> PAGE_SHIFT); i++ )
>>>> +        if ( unlikely(!get_page_and_type(&pg[i], d, PGT_writable_page)) )
>>>> +            goto refcnt_err;
>>>> +
>>>> +    /*
>>>> +     * We must only let vmtrace_free_buffer() take any action in the success
>>>> +     * case when we've taken all the refs it intends to drop.
>>>> +     */
>>>> +    v->vmtrace.pg = pg;
>>>> +
>>>> +    return 0;
>>>> +
>>>> + refcnt_err:
>>>> +    /*
>>>> +     * In the failure case, we must drop all the acquired typerefs thus far,
>>>> +     * skip vmtrace_free_buffer(), and leave domain_relinquish_resources() to
>>>> +     * drop the alloc refs on any remaining pages - some pages could already
>>>> +     * have been freed behind our backs.
>>>> +     */
>>>> +    while ( i-- )
>>>> +        put_page_and_type(&pg[i]);
>>>> +
>>>> +    return -ENODATA;
>>>> +}
>>> As said in reply on the other thread, PGC_extra pages don't get
>>> freed automatically. I too initially thought they would, but
>>> (re-)learned otherwise when trying to repro your claims on that
>>> other thread. For all pages you've managed to get the writable
>>> ref, freeing is easily done by prefixing the loop body above by
>>> put_page_alloc_ref(). For all other pages the best you can do (I
>>> think; see the debugging patches I had sent on that other
>>> thread) is to try get_page() - if it succeeds, calling
>>> put_page_alloc_ref() is allowed. Otherwise we can only leak the
>>> respective page (unless going to further lengths trying to
>>> recover from the "impossible"), or assume the failure here was
>>> because it did get freed already.
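
For illustration, a minimal sketch of the failure path Jan describes above,
reusing pg, i and d from vmtrace_alloc_buffer(); nr is a local introduced only
for the sketch, and this is one reading of the suggestion rather than the
committed code:

 refcnt_err:
    {
        unsigned int nr = d->vmtrace_size >> PAGE_SHIFT, j;

        /*
         * Pages [0, i) hold a writable typeref in addition to the alloc
         * ref.  put_page_alloc_ref() is called while we still hold that
         * extra reference (the alloc ref must never be the last one);
         * put_page_and_type() then drops the remaining refs and frees
         * the page.
         */
        for ( j = 0; j < i; j++ )
        {
            put_page_alloc_ref(&pg[j]);
            put_page_and_type(&pg[j]);
        }

        /*
         * Pages [i, nr) hold no typeref.  Try to pick up a plain ref;
         * if that succeeds, dropping the alloc ref is safe, and the
         * page is freed when the temporary ref is released.  If
         * get_page() fails, the page has presumably been freed behind
         * our back already, or has to be leaked.
         */
        for ( ; j < nr; j++ )
            if ( get_page(&pg[j], d) )
            {
                put_page_alloc_ref(&pg[j]);
                put_page(&pg[j]);
            }

        return -ENODATA;
    }
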
>> Right - I'm going to insist on breaking apart orthogonal issues.
>>
>> This refcounting issue isn't introduced by this series - this series
>> uses an established pattern, in which we've found a corner case.
>>
>> The corner case is theoretical, not practical - it is not possible for a
>> malicious PV domain to take 2^43 refs on any of the pages in this
>> allocation.  Doing so would require an hours-long SMI, or equivalent,
>> and even then all malicious activity would be paused after 1s for the
>> time calibration rendezvous which would livelock the system until the
>> watchdog kicked in.
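
(For scale on the 2^43 figure: that is roughly 8.8 x 10^12 reference
acquisitions.  Even at an assumed rate of 10^9 successful get_page() calls
per second against a single page, driving its count that high would take
around 8,800 seconds - about two and a half hours of uninterrupted work,
which is where the hours-long-SMI comparison comes from.)
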
> Actually an overflow is only one of the possible reasons here.
> Another, which may be more "practical", is that another entity
> has already managed to free the page (by dropping its alloc-ref,
> and of course implying it did guess at the MFN).

Yes, but in that case the page did get dropped from the extra page list,
so looping over the remaining ones in domain_relinquish_resources() would
be safe.

~Andrew
