[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v2 1/2] x86/mem_sharing: reorder when pages are unlocked and released



>>> On 15.04.19 at 19:28, <George.Dunlap@xxxxxxxxxx> wrote:
>> On Apr 15, 2019, at 4:31 PM, Tamas K Lengyel <tamas@xxxxxxxxxxxxx> wrote:
>> On Mon, Apr 15, 2019 at 9:21 AM George Dunlap <george.dunlap@xxxxxxxxxx> 
>> wrote:
>>> On 4/12/19 8:47 PM, Tamas K Lengyel wrote:
>>>> Patch 0502e0adae2 "x86: correct instances of PGC_allocated clearing" 
>>>> introduced
>>>> grabbing extra references for pages that drop references tied to 
>>>> PGC_allocated.
>>>> However, the way these extra references were grabbed were incorrect, 
>>>> resulting
>>>> in both share_pages and unshare_pages failing.
>>> 
>>> Why?  Is cd not the owner of the page?
>> 
>> It's not, dom_cow is.
>> 
>>>> @@ -1002,7 +994,10 @@ static int share_pages(struct domain *sd, gfn_t 
>>>> sgfn, shr_handle_t sh,
>>>>     /* Free the client page */
>>>>     if(test_and_clear_bit(_PGC_allocated, &cpage->count_info))
>>>>         put_page(cpage);
>>>> -    put_page(cpage);
>>>> +
>>>> +    ASSERT(put_count);
>>>> +    while ( put_count-- )
>>>> +        put_page_and_type(cpage);
>>> 
>>> I don't think this is in the spirit of the fix in 0502e0adae2; the idea
>>> there seems to have been to make sure that cpage belongs to the right
>>> domain, and that the ownership doesn't change.  If such a race / mistake
>>> were possible before that change, such a race / mistake would be
>>> possible after this change, wouldn't it?
>> 
>> I don't have an answer to your question. AFAIU this is just to ensure
>> that the page isn't dropped before test_and_clear_bit is used on the
>> page.
> 
> Right, but why would that be a bad thing?
> 
> I think it’s to avoid races like this:
> 
> P1: Gets pointer to page A, owned by domain M
> P2: Gets pointer to page A, owned by domain M
> P2: test_and_clear_bit(PGC_allocated) succeeds
> P2: put_page()
>   -> page freed
>   -> Page re-allocated to domain N
> P1: test_and_clear_bit(PGC_allocated) succeeds
> P1: put_page() 
>   -> page freed ##
> 
> Now P1 incorrectly “frees” a page owned by domain N.
> 
> If both P1 and P2 call get_page(A, M) first, and drop that reference 
> afterwards, this race can’t happen.
> 
> Jan, you’re the author of that patch — is that not the case?

Yes indeed, that's why I did add the extra get_page() (not
realizing the owning domain is the wrong one).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.