Re: [Xen-devel] [PATCH v5 1/8] mm: Place unscrubbed pages at the end of pagelist



>>> Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx> 06/22/17 8:55 PM >>>
> I kept node_need_scrub[] as a global array and not a "per-node". I think
> splitting it should be part of making heap_lock a per-node lock, together
> with increasing scrub concurrency by having more than one CPU scrub a node.

Agreed - I hadn't meant my earlier comment to necessarily affect this series.

> @@ -798,11 +814,26 @@ static struct page_info *alloc_heap_pages(
>      return NULL;
>  
>   found: 
> +
> +    if ( pg->u.free.first_dirty != INVALID_DIRTY_IDX )
> +        first_dirty_pg = pg + pg->u.free.first_dirty;
> +
>      /* We may have to halve the chunk a number of times. */
>      while ( j != order )
>      {
> -        PFN_ORDER(pg) = --j;
> -        page_list_add_tail(pg, &heap(node, zone, j));
> +        unsigned int first_dirty;
> +
> +        if ( first_dirty_pg && ((pg + (1 << j)) > first_dirty_pg) )

Despite the various examples of doing it this way, please at least use 1u.
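
That is, something along these lines (illustrative only, just the literal
changed):

        if ( first_dirty_pg && ((pg + (1u << j)) > first_dirty_pg) )

so the shifted value is unsigned from the start.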

> +        {
> +            if ( pg < first_dirty_pg )
> +                first_dirty = (first_dirty_pg - pg) / sizeof(*pg);

Pointer subtraction already includes the involved division. Otoh I wonder
if you couldn't get away without pointer comparison/subtraction here
altogether.
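
Just to illustrate the first point (not a suggested final form): subtracting
two struct page_info pointers already yields an element count, so

        if ( pg < first_dirty_pg )
            first_dirty = first_dirty_pg - pg;
        else
            first_dirty = 0;

would be enough - the extra / sizeof(*pg) scales the result down a second
time. Avoiding the pointer arithmetic altogether would presumably mean
tracking first_dirty as an index relative to pg from the "found:" label
onwards, instead of converting back and forth through first_dirty_pg.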

> @@ -849,13 +880,22 @@ static int reserve_offlined_page(struct page_info *head)
>  {
>      unsigned int node = phys_to_nid(page_to_maddr(head));
>      int zone = page_to_zone(head), i, head_order = PFN_ORDER(head), count = 0;
> -    struct page_info *cur_head;
> +    struct page_info *cur_head, *first_dirty_pg = NULL;
>      int cur_order;
>  
>      ASSERT(spin_is_locked(&heap_lock));
>  
>      cur_head = head;
>  
> +    /*
> +     * We may break the buddy so let's mark the head as clean. Then, when
> +     * merging chunks back into the heap, we will see whether the chunk has
> +     * unscrubbed pages and set its first_dirty properly.
> +     */
> +    if (head->u.free.first_dirty != INVALID_DIRTY_IDX)

Coding style.
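
I.e. presumably

        if ( head->u.free.first_dirty != INVALID_DIRTY_IDX )

with blanks immediately inside the parentheses, matching the surrounding
code.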

> @@ -892,8 +934,25 @@ static int reserve_offlined_page(struct page_info *head)
>              {
>              merge:
>                  /* We don't consider merging outside the head_order. */
> -                page_list_add_tail(cur_head, &heap(node, zone, cur_order));
> -                PFN_ORDER(cur_head) = cur_order;
> +
> +                /* See if any of the pages indeed need scrubbing. */
> +                if ( first_dirty_pg && (cur_head + (1 << cur_order) > first_dirty_pg) )
> +                {
> +                    if ( cur_head < first_dirty_pg )
> +                        i = (first_dirty_pg - cur_head) / sizeof(*cur_head);
> +                    else
> +                        i = 0;
> +
> +                    for ( ; i < (1 << cur_order); i++ )
> +                        if ( test_bit(_PGC_need_scrub,
> +                                      &cur_head[i].count_info) )
> +                        {
> +                            first_dirty = i;
> +                            break;
> +                        }

Perhaps worth having ASSERT(first_dirty != INVALID_DIRTY_IDX) here? Or are
there cases where ->u.free.first_dirty of a page may be wrong?
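
To be clear about placement - assuming first_dirty starts out as
INVALID_DIRTY_IDX for every chunk, what I have in mind is simply

                    /* Right after the for () loop above. */
                    ASSERT(first_dirty != INVALID_DIRTY_IDX);

i.e. asserting that a buddy whose range covers first_dirty_pg really does
contain at least one page with _PGC_need_scrub set.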

> @@ -977,35 +1090,53 @@ static void free_heap_pages(
>  
>          if ( (page_to_mfn(pg) & mask) )
>          {
> +            struct page_info *predecessor = pg - mask;
> +
>              /* Merge with predecessor block? */
> -            if ( !mfn_valid(_mfn(page_to_mfn(pg-mask))) ||
> -                 !page_state_is(pg-mask, free) ||
> -                 (PFN_ORDER(pg-mask) != order) ||
> -                 (phys_to_nid(page_to_maddr(pg-mask)) != node) )
> +            if ( !mfn_valid(_mfn(page_to_mfn(predecessor))) ||
> +                 !page_state_is(predecessor, free) ||
> +                 (PFN_ORDER(predecessor) != order) ||
> +                 (phys_to_nid(page_to_maddr(predecessor)) != node) )
>                  break;
> -            pg -= mask;
> -            page_list_del(pg, &heap(node, zone, order));
> +
> +            page_list_del(predecessor, &heap(node, zone, order));
> +
> +            if ( predecessor->u.free.first_dirty != INVALID_DIRTY_IDX )
> +                need_scrub = true;

I'm afraid I continue to be confused by this: Why does need_scrub depend on
the state of pages not being the subject of the current free operation? I
realize that at this point in the series the path can't be taken yet, but
won't later patches need to rip it out or change it anyway, in which case it
would be better to introduce the then correct check (if any) only there?

> --- a/xen/include/asm-x86/mm.h
> +++ b/xen/include/asm-x86/mm.h
> @@ -88,7 +88,15 @@ struct page_info
>          /* Page is on a free list: ((count_info & PGC_count_mask) == 0). */
>          struct {
>              /* Do TLBs need flushing for safety before next page use? */
> -            bool_t need_tlbflush;
> +            unsigned long need_tlbflush:1;
> +
> +            /*
> +             * Index of the first *possibly* unscrubbed page in the buddy.
> +             * One more than maximum possible order (MAX_ORDER+1) to

Why +1 here and hence ...

> +             * accommodate INVALID_DIRTY_IDX.
> +             */
> +#define INVALID_DIRTY_IDX (-1UL & (((1UL<<MAX_ORDER) + 2) - 1))
> +            unsigned long first_dirty:MAX_ORDER + 2;

... why +2 instead of +1? And isn't the expression INVALID_DIRTY_IDX wrongly
parenthesized (apart from lacking blanks around the shift operator)? I'd
expect you want a value with MAX_ORDER+1 set bits, i.e.
(1UL << (MAX_ORDER + 1)) - 1. ANDing with -1UL seems quite pointless too.
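
I.e., unless there's a reason for the extra bit that I'm missing, I'd expect
something like

            /*
             * Index of the first *possibly* unscrubbed page in the buddy.
             * One more bit than MAX_ORDER to accommodate INVALID_DIRTY_IDX.
             */
#define INVALID_DIRTY_IDX ((1UL << (MAX_ORDER + 1)) - 1)
            unsigned long first_dirty:MAX_ORDER + 1;

Valid indexes run from 0 to (1UL << MAX_ORDER) - 1, so a MAX_ORDER + 1 bit
field leaves the all-ones pattern free to serve as the sentinel.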

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel