[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] xen: credit2: fix spinlock irq-safety violation



On 09/20/2017 10:19 AM, George Dunlap wrote:
> On 09/19/2017 03:11 AM, Dario Faggioli wrote:
>> In commit ad4b3e1e9df34 ("xen: credit2: implement
>> utilization cap") xfree() was being called (for
>> deallocating the budget replenishment timer, during
>> domain destruction) inside an IRQ disabled critical
>> section.
>>
>> That must not happen, as it uses the mem-pool's lock,
>> which needs to be taken with IRQ enabled. And, in fact,
>> we crash (in debug builds):
>>
>> (XEN) ****************************************
>> (XEN) Panic on CPU 0:
>> (XEN) Xen BUG at spinlock.c:47
>> (XEN) ****************************************
>>
>> Let's, therefore, kill and deallocate the timer outside of
>> the critical sections, when IRQs are enabled.
>>
>> Signed-off-by: Dario Faggioli <dario.faggioli@xxxxxxxxxx>
>> ---
>> Cc: osstest service owner <osstest-admin@xxxxxxxxxxxxxx>
>> Cc: George Dunlap <george.dunlap@xxxxxxxxxxxxx>
>> Cc: Wei Liu <wei.liu2@xxxxxxxxxx>
>> ---
>> This was spotted by OSSTest's flight 113562:
>>
>>  http://logs.test-lab.xenproject.org/osstest/logs/113562/
>>  
>> http://logs.test-lab.xenproject.org/osstest/logs/113562/test-amd64-amd64-xl-credit2/serial-godello0.log
>> ---
>>  xen/common/sched_credit2.c |    6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
>> index 32234ac..7a550db 100644
>> --- a/xen/common/sched_credit2.c
>> +++ b/xen/common/sched_credit2.c
>> @@ -2923,13 +2923,13 @@ csched2_free_domdata(const struct scheduler *ops, 
>> void *data)
>>  
>>      write_lock_irqsave(&prv->lock, flags);
>>  
>> -    kill_timer(sdom->repl_timer);
>> -    xfree(sdom->repl_timer);
>> -
>>      list_del_init(&sdom->sdom_elem);
>>  
>>      write_unlock_irqrestore(&prv->lock, flags);
>>  
>> +    kill_timer(sdom->repl_timer);
>> +    xfree(sdom->repl_timer);
> 
> Any particular reason for moving the kill_timer() as well as the xfree
> outside the lock?  What happens if the timer goes off after the
> irqrestore but before the kill_timer?

Looks like if the timer fires, nothing terribly bad will happen; it will
just do a useless budget replenishment.

It looks like kill_timer() disables irqs, so it could be moved inside
the critical section.  I'm inclined to say we should do so.  I don't
anticipate the budget replenishment ever to need to walk the domain
list, but should that change, this would be a bear of a bug to find.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.