[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] page-alloc: detect double free earlier



>>> On 09.05.19 at 14:50, <andrew.cooper3@xxxxxxxxxx> wrote:
> On 09/05/2019 13:35, Jan Beulich wrote:
>> Right now this goes unnoticed until some subsequent page allocator
>> operation stumbles across the thus corrupted list. We can do better:
>> Only PGC_state_inuse and PGC_state_offlining pages can legitimately be
>> passed to free_heap_pages().
>>
>> Take the opportunity and also restrict the PGC_broken check to the
>> PGC_state_offlining case, as only pages of that type or
>> PGC_state_offlined may have this flag set on them. Similarly, since
>> PGC_state_offlined is not a valid input state, the setting of "tainted"
>> can be restricted to just this case.
>>
>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> 
> Acked-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, with a suggestion.

Thanks.

>> --- a/xen/common/page_alloc.c
>> +++ b/xen/common/page_alloc.c
>> @@ -1409,13 +1409,22 @@ static void free_heap_pages(
>>           *     in its pseudophysical address space).
>>           * In all the above cases there can be no guest mappings of this 
>> page.
>>           */
>> -        ASSERT(!page_state_is(&pg[i], offlined));
>> -        pg[i].count_info =
>> -            ((pg[i].count_info & PGC_broken) |
>> -             (page_state_is(&pg[i], offlining)
>> -              ? PGC_state_offlined : PGC_state_free));
>> -        if ( page_state_is(&pg[i], offlined) )
>> +        switch ( pg[i].count_info & PGC_state )
>> +        {
>> +        case PGC_state_inuse:
>> +            BUG_ON(pg[i].count_info & PGC_broken);
>> +            pg[i].count_info = PGC_state_free;
>> +            break;
>> +
>> +        case PGC_state_offlining:
>> +            pg[i].count_info = (pg[i].count_info & PGC_broken) |
>> +                               PGC_state_offlined;
>>              tainted = 1;
>> +            break;
>> +
>> +        default:
> 
> Given that this is a fully fatal condition, it would be helpful to at
> least print the state we found here.  For cases other than
> PGC_state_free, it would probably be a very useful piece of information
> for diagnosing what went wrong.

Funny you should say this - I have the debugging patch below on top
in my tree. I could easily submit this as a standalone follow-on patch.

Jan

--- unstable.orig/xen/common/page_alloc.c
+++ unstable/xen/common/page_alloc.c
@@ -1014,7 +1014,14 @@ static struct page_info *alloc_heap_page
     for ( i = 0; i < (1 << order); i++ )
     {
         /* Reference count must continuously be zero for free pages. */
-        BUG_ON((pg[i].count_info & ~PGC_need_scrub) != PGC_state_free);
+        if ( (pg[i].count_info & ~PGC_need_scrub) != PGC_state_free )
+        {
+            printk(XENLOG_ERR "pg[%x] m=%lx c=%lx o=%x v=%lx t=%x\n",
+                   i, mfn_x(page_to_mfn(pg + i)),
+                   pg[i].count_info, pg[i].v.free.order,
+                   pg[i].u.free.val, pg[i].tlbflush_timestamp);
+            BUG();
+        }
 
         /* PGC_need_scrub can only be set if first_dirty is valid */
         ASSERT(first_dirty != INVALID_DIRTY_IDX || !(pg[i].count_info & 
PGC_need_scrub));
@@ -1423,6 +1430,10 @@ static void free_heap_pages(
             break;
 
         default:
+            printk(XENLOG_ERR "pg[%x] m=%lx c=%lx o=%x v=%lx t=%x\n",
+                   i, mfn_x(page_to_mfn(pg + i)),
+                   pg[i].count_info, pg[i].v.free.order,
+                   pg[i].u.free.val, pg[i].tlbflush_timestamp);
             BUG();
         }
 



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.