
Re: [Xen-devel] [PATCH v2 2/9] mm: Place unscrubbed pages at the end of pagelist



>>> On 04.04.17 at 17:14, <boris.ostrovsky@xxxxxxxxxx> wrote:
> On 04/04/2017 10:46 AM, Jan Beulich wrote:
>>> @@ -933,6 +952,10 @@ static bool_t can_merge(struct page_info *buddy, unsigned int node,
>>>           (phys_to_nid(page_to_maddr(buddy)) != node) )
>>>          return false;
>>>  
>>> +    if ( need_scrub !=
>>> +         !!test_bit(_PGC_need_scrub, &buddy->count_info) )
>>> +        return false;
>> I don't think leaving the tree in a state where larger order chunks
>> don't become available for allocation right away is going to be
>> acceptable. Hence with this issue being dealt with only in patch 7
>> as it seems, you should state clearly and visibly that (at least)
>> patches 2...7 should only be committed together.
> 
> The dirty pages are available for allocation as a result of this patch,
> but they might not be merged into higher-order chunks (which is what
> this check is for).

The individual chunks are available for allocation, but not the
combined one (for a suitably high order request). Or am I
missing something?
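
To make the concern concrete, a hypothetical example (the order values
and the pre-series free_heap_pages() signature are assumptions for
illustration): two adjacent order-9 buddies get freed, one already
clean, the other still dirty. With the need_scrub check in can_merge()
they stay two separate order-9 chunks:

    free_heap_pages(clean_pg, 9);   /* queued on heap(node, zone, 9) */
    free_heap_pages(dirty_pg, 9);   /* need_scrub set, merge refused */

An order-10 request arriving before the scrubber has run then cannot be
satisfied from these pages, even though 2^10 contiguous free pages
exist.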

>>> @@ -952,9 +977,10 @@ static struct page_info *merge_chunks(struct page_info *pg, unsigned int node,
>>>          {
>>>              /* Merge with predecessor block? */
>>>              buddy = pg - mask;
>>> -            if ( !can_merge(buddy, node, order) )
>>> +            if ( !can_merge(buddy, node, order, need_scrub) )
>>>                  break;
>>>  
>>> +            pg->count_info &= ~PGC_need_scrub;
>>>              pg = buddy;
>>>              page_list_del(pg, &heap(node, zone, order));
>>>          }
>>> @@ -962,9 +988,10 @@ static struct page_info *merge_chunks(struct page_info *pg, unsigned int node,
>>>          {
>>>              /* Merge with successor block? */
>>>              buddy = pg + mask;
>>> -            if ( !can_merge(buddy, node, order) )
>>> +            if ( !can_merge(buddy, node, order, need_scrub) )
>>>                  break;
>>>  
>>> +            buddy->count_info &= ~PGC_need_scrub;
>>>              page_list_del(buddy, &heap(node, zone, order));
>>>          }
>> For both of these, how come you can / want to clear the need-scrub
>> flag? Wouldn't it be better for each individual page to retain it, so
>> when encountering a higher-order one you know which pages need
>> scrubbing and which don't? Couldn't that also be used to avoid
>> suppressing their merging here right away?
> 
> I am trying to avoid having to keep a dirty bit for each page, since a
> buddy is either fully clean or fully dirty. That way we shouldn't need
> to walk the list to clear the bit. (In fact, I suspect there may be
> other state bits/fields that we could keep at the buddy level only.)

But as said, this comes at the expense of not being able to merge
early. I consider this a serious limitation.
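
A minimal sketch of the alternative (reusing the patch's helpers; this
is not the patch's actual code): keep _PGC_need_scrub per page, drop
the need_scrub test from can_merge() so buddies combine right away, and
have the scrubber test each page individually:

    /* Merging ignores scrub state; chunks combine immediately. */
    if ( !can_merge(buddy, node, order) )
        break;

    /* At scrub time, walk the (possibly mixed) buddy page by page. */
    for ( i = 0; i < (1UL << order); i++ )
        if ( test_bit(_PGC_need_scrub, &pg[i].count_info) )
        {
            scrub_one_page(&pg[i]);
            pg[i].count_info &= ~PGC_need_scrub;
        }

The cost is the per-page walk and flag clearing Boris wants to avoid;
the gain is that higher-order chunks become allocatable without waiting
for the scrubber.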

>>> +static void scrub_free_pages(unsigned int node)
>>> +{
>>> +    struct page_info *pg;
>>> +    unsigned int i, zone;
>>> +    int order;
>> There are no negative orders.
> 
> It actually becomes negative in the loop below, and that is the loop
> exit condition.

Only because of the way you've coded the loop. It becoming
negative can be easily avoided.
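
For example, with one of the usual countdown idioms the variable can
stay unsigned (a sketch, using MAX_ORDER as in the patch):

    unsigned int order;

    for ( order = MAX_ORDER + 1; order-- > 0; )
    {
        /* Body runs for order = MAX_ORDER down to 0.  The test
         * precedes the decrement, so no negative (or wrapped)
         * value is ever used inside the loop. */
    }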

>>> +    ASSERT(spin_is_locked(&heap_lock));
>>> +
>>> +    if ( !node_need_scrub[node] )
>>> +        return;
>>> +
>>> +    for ( zone = 0; zone < NR_ZONES; zone++ )
>>> +    {
>>> +        for ( order = MAX_ORDER; order >= 0; order-- )
>>> +        {
>>> +            while ( !page_list_empty(&heap(node, zone, order)) )
>>> +            {
>>> +                /* Unscrubbed pages are always at the end of the list. */
>>> +                pg = page_list_last(&heap(node, zone, order));
>>> +                if ( !test_bit(_PGC_need_scrub, &pg->count_info) )
>>> +                    break;
>>> +
>>> +                for ( i = 0; i < (1UL << order); i++)
>> Types of loop variable and upper bound do not match.
>>
>>> +                    scrub_one_page(&pg[i]);
>>> +
>>> +                pg->count_info &= ~PGC_need_scrub;
>>> +
>>> +                page_list_del(pg, &heap(node, zone, order));
>>> +                (void)merge_chunks(pg, node, zone, order);
>> Pointless cast.
> 
> Didn't Coverity complain about this kind of thing? That was the
> reason I added the cast here. If not, I'll drop it.

I don't know why Coverity would complain about an unused
return value. We've got plenty of such cases throughout the
code base. If this was a macro, the story might be different.
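
Folding the three comments together (signed order, mismatched loop
types, the cast), the hunk might instead read as below; a sketch only,
reusing the helpers exactly as quoted:

    static void scrub_free_pages(unsigned int node)
    {
        struct page_info *pg;
        unsigned long i;               /* matches 1UL << order below */
        unsigned int zone, order;

        ASSERT(spin_is_locked(&heap_lock));

        if ( !node_need_scrub[node] )
            return;

        for ( zone = 0; zone < NR_ZONES; zone++ )
        {
            for ( order = MAX_ORDER + 1; order-- > 0; )
            {
                while ( !page_list_empty(&heap(node, zone, order)) )
                {
                    /* Unscrubbed pages are always at the end of the list. */
                    pg = page_list_last(&heap(node, zone, order));
                    if ( !test_bit(_PGC_need_scrub, &pg->count_info) )
                        break;

                    for ( i = 0; i < (1UL << order); i++ )
                        scrub_one_page(&pg[i]);

                    pg->count_info &= ~PGC_need_scrub;

                    page_list_del(pg, &heap(node, zone, order));
                    merge_chunks(pg, node, zone, order);   /* cast dropped */
                }
            }
        }
    }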

>>> --- a/xen/include/asm-x86/mm.h
>>> +++ b/xen/include/asm-x86/mm.h
>>> @@ -233,6 +233,10 @@ struct page_info
>>>  #define PGC_count_width   PG_shift(9)
>>>  #define PGC_count_mask    ((1UL<<PGC_count_width)-1)
>>>  
>>> +/* Page needs to be scrubbed */
>>> +#define _PGC_need_scrub   PG_shift(10)
>>> +#define PGC_need_scrub    PG_mask(1, 10)
>> So why not a new PGC_state_dirty instead of this independent
>> flag? Pages other than PGC_state_free should never make it
>> to the scrubber, so the flag is meaningless for all other
>> PGC_state_*.
> 
> Wouldn't doing this possibly require two checks,
> page_state_is(pg, free) || page_state_is(pg, dirty)?

Well, your goal would normally be to look first for pages not needing
scrubbing, so quite likely you'd do two passes anyway. But of course
much depends on whether merging happens early or late.
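
A sketch of what that could look like (the encoding below is
hypothetical: fitting a fifth mutually-exclusive value means widening
PGC_state, which collides with neighbouring bits and is not a drop-in
change):

    #define PGC_state         PG_mask(7, 9)   /* widened to 3 bits */
    #define PGC_state_dirty   PG_mask(4, 9)   /* free, not yet scrubbed */

    /* The scrubber then looks for exactly one state... */
    if ( page_state_is(pg, dirty) )
        scrub_one_page(pg);

    /* ...while "is this page free?" becomes the double check Boris
     * mentions (page_is_free() is a hypothetical helper): */
    #define page_is_free(pg) \
        (page_state_is(pg, free) || page_state_is(pg, dirty))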

Jan
