
Re: [Xen-devel] [PATCH 12 of 18] x86/mm: Make page_lock/unlock() in arch/x86/mm.c externally callable



>>> On 09.12.11 at 15:53, "Andres Lagar-Cavilla" <andres@xxxxxxxxxxxxxxxx> 
>>> wrote:
>>>>> On 09.12.11 at 03:54, "Andres Lagar-Cavilla" <andres@xxxxxxxxxxxxxxxx>
>>>>> wrote:
>>>>  At 02:47 -0500 on 08 Dec (1323312447), Andres Lagar-Cavilla wrote:
>>>>> This is necessary for a new consumer of page_lock/unlock to follow in
>>>>> the series.
>>>>>
>>>>> Signed-off-by: Andres Lagar-Cavilla <andres@xxxxxxxxxxxxxxxx>
>>>>
>>>> Nak, I'm afraid.
>>>>
>>>> These were OK as local functions but if they're going to be made
>>>> generally visible, they need clear comments describing what this
>>>> locking protects and what the discipline is for avoiding deadlocks.
>>>
>>> How about something along the lines of
>>> "page_lock() is used for two purposes: pte serialization, and memory
>>> sharing. All users of page_lock for pte serialization live in mm.c, use
>>> it to lock a page table page during pte updates, do not take other
>>> locks within the critical section delimited by page_lock/unlock, and
>>> perform no nesting. All users of page_lock for memory sharing live in
>>> mm/mem_sharing.c. For memory sharing, nesting may happen when sharing
>>> (and locking) two pages -- deadlock is avoided by locking pages in
>>> increasing order. Memory sharing may take the p2m_lock within a
>>> page_lock/unlock critical section. These two users (pte serialization
>>> and memory sharing) should never collide, as long as page table pages
>>> are properly unshared prior to being updated."
>>
>> This would seem to me like very undesirable lock ordering - a very
>> coarse (per-domain) lock taken inside a very fine grained (per-page)
>> one.
> I'm not sure I follow. Unshare would do its work, and then pte
> serialization would start. The two pieces of code will be disjoint,
> locking-wise.

But your original mail said "Memory sharing may take the p2m_lock
within a page_lock/unlock critical section" - see above. That's what
I'm referring to.
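[The ordering rule at issue can be made mechanical, in the style of the
order-enforcing checks in mm-locks.h. The following is a hypothetical
sketch, not Xen's actual code: each lock class is assigned a level, and
an assertion fires if a coarser (lower-level) lock is taken inside a
finer one -- exactly the p2m-inside-page ordering objected to above.]

```c
#include <assert.h>

/* Hypothetical sketch of an mm-locks.h-style ordering check.  Each lock
 * class gets a level; tracking the innermost level held (per-CPU in real
 * Xen, a single global here) lets us assert that locks are only ever
 * taken in increasing level order.  Names are illustrative. */

#define LEVEL_P2M   1   /* coarse, per-domain: must be taken first    */
#define LEVEL_PAGE  2   /* fine, per-page: may be taken inside p2m    */

static int innermost_level;  /* 0 = no mm lock currently held */

static void check_lock_level(int level)
{
    /* Taking a coarser (lower-level) lock inside a finer one would
     * trip this assertion -- "screaming deadlock" at first use. */
    assert(level > innermost_level);
    innermost_level = level;
}

static void check_unlock_level(int level)
{
    assert(innermost_level == level);
    innermost_level = 0;  /* simplification: no nesting bookkeeping */
}
```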

> Now it is true that during unshare we need to take the p2m lock to change
> the p2m entry. That's a very strong reason to make the p2m lock
> fine-grained. But I need to start somewhere, so I'm breaking up the global
> shr_lock first.

I don't really think that it'll be reasonable to split up the p2m lock.

>>> Now those are all pretty words, but here are the two things I (think I)
>>> need to do:
>>> - Prior to updating pte's, we do get_gfn on the page table page. We
>>> should be using get_gfn_unshare. Regardless of this patch. It's likely
>>> never going to trigger an actual unshare, yet better safe than sorry.
>>
>> Does memory sharing work on pv domains at all?
> Not. At. All :)
> 
> I can _not_ add the unshare. It's idempotent to pv.

Perhaps I should have clarified why I was asking: The pte handling is
a pv-only thing, and if memory sharing is hvm only, then the two can't
ever conflict.

>>> - I can wrap uses of page_lock in mem sharing in an "external"
>>> order-enforcing construct from mm-locks.h. And use that to scream
>>> deadlock between page_lock and p2m_lock.
>>>
>>> The code that actually uses page_lock()s in the memory sharing code
>>> can be found in "[PATCH] x86/mm: Eliminate global lock in mem sharing
>>> code". It already orders locking of individual pages in ascending
>>> order.
>>
>> It should be this patch to make the functions externally visible, not a
>> separate one (or at the very minimum the two ought to be in the same
>> series, back to back). Which is not to say that I'm fully convinced this
>> is a good idea.
> Whichever you prefer. I'm of the mind of making shorter patches when
> possible, that do one thing, to ease readability. But I can collapse the
> two.
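[The "ascending order" discipline referred to above can be sketched as
follows. This is a hypothetical illustration with stand-in helpers, not
the code from the patch in question: when two page locks must be held at
once, both are always acquired in order of increasing MFN, so any two
CPUs contending on the same pair take them in the same order and cannot
deadlock against each other.]

```c
#include <assert.h>

/* Stand-ins for Xen's struct page_info and page_lock/page_unlock;
 * a real page_lock can fail, which is ignored here for brevity. */
struct page_info { unsigned long mfn; int locked; };

static void page_lock(struct page_info *pg)   { pg->locked = 1; }
static void page_unlock(struct page_info *pg) { pg->locked = 0; }

/* Lock two pages deadlock-free by always locking the lower MFN first.
 * Returns the acquisition order through *first / *second so the caller
 * can unlock in reverse order. */
static void lock_two_pages(struct page_info *a, struct page_info *b,
                           struct page_info **first,
                           struct page_info **second)
{
    if ( a->mfn > b->mfn )
    {
        struct page_info *t = a;
        a = b;
        b = t;
    }
    page_lock(a);
    page_lock(b);
    *first = a;
    *second = b;
}
```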

In quite a few (recent) cases your patches did something where the user
of the change wasn't obvious at all (in some cases - I tried to point these
out - there was no user even at the end of a series). While I agree that
shorter patches are easier to review, trivial changes like the one here
should imo really be logically grouped with what requires them.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

