
RE: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)



Attached is a new patch, which should finally eliminate the race.
Xiantao 

-----Original Message-----
From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx 
[mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Zhang, Xiantao
Sent: Friday, October 16, 2009 5:50 PM
To: He, Qing
Cc: Cinco, Dante; xen-devel@xxxxxxxxxxxxxxxxxxx; Keir Fraser
Subject: RE: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on HP 
ProLiant G6 with dual Xeon 5540 (Nehalem)

He, Qing wrote:
> On Fri, 2009-10-16 at 16:35 +0800, Zhang, Xiantao wrote:
>> He, Qing wrote:
>>> On Fri, 2009-10-16 at 16:22 +0800, Zhang, Xiantao wrote:
>>>> He, Qing wrote:
>>>>> On Fri, 2009-10-16 at 15:32 +0800, Zhang, Xiantao wrote:
>>>>>> According to the description, the issue should be caused by a lost
>>>>>> EOI write for the MSI interrupt, which leaves the interrupt
>>>>>> permanently masked. There should be a race between the guest setting
>>>>>> a new vector and EOIing the old vector for the interrupt. Once the
>>>>>> guest sets the new vector before it EOIs the old one, the hypervisor
>>>>>> can't find the pirq that corresponds to the old vector (it has
>>>>>> already changed to the new vector), so it can never EOI the old
>>>>>> vector at the hardware level. Since the corresponding vector on the
>>>>>> real processor can't be EOIed, the system may lose all interrupts,
>>>>>> which ultimately results in the reported issues.
>>>>> 
>>>>>> But I remember there should be a timer to handle this case
>>>>>> through a forced EOI write to the real processor after a timeout,
>>>>>> but it seems it doesn't work as expected.
>>>>> 
>>>>> The EOI timer is supposed to deal with the irq sharing problem.
>>>>> Since MSI doesn't share, this timer will not be started in the
>>>>> case of MSI.
>>>> 
>>>> That may be a problem, if so. If a malicious/buggy guest won't EOI
>>>> the MSI vector, the host may hang due to the lack of a timeout
>>>> mechanism?
>>> 
>>> Why would the host hang? Only the assigned interrupt will block, and
>>> that's exactly what the guest wants :-)
>> 
>> The hypervisor shouldn't EOI the real vector until the guest EOIs the
>> corresponding virtual vector, right? Not sure. :-)
> 
> Yes, it is the algorithm used today.

So it should still be a problem. If the guest never does the EOI, the host can't 
do it either, and without a timeout mechanism this leads to a system hang. So we 
may need to introduce a timer for each MSI interrupt source to avoid hanging the 
host, Keir? 
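
To make the timer idea concrete, here is a rough sketch of a per-MSI EOI 
watchdog. It is only a toy model in Python with a logical clock: the class name, 
method names and tick values are all made up for illustration and are not Xen 
code or Xen interfaces.

```python
# Toy model only: every name here is hypothetical, none is taken from Xen.

class MsiEoiWatchdog:
    """Tracks, per pirq, the tick at which a real EOI would be forced."""

    def __init__(self, timeout_ticks):
        self.timeout_ticks = timeout_ticks
        self.deadline = {}          # pirq -> tick at which to force the EOI

    def on_deliver(self, pirq, now):
        # Arm the timer when the interrupt is injected into the guest.
        self.deadline[pirq] = now + self.timeout_ticks

    def on_guest_eoi(self, pirq):
        # The guest EOIed in time: disarm the timer.
        # Returns True if an EOI was actually pending for this pirq.
        return self.deadline.pop(pirq, None) is not None

    def expire(self, now):
        # Return the pirqs whose deadline has passed; the caller would
        # then do the forced real EOI for each of them.
        forced = sorted(p for p, d in self.deadline.items() if d <= now)
        for p in forced:
            del self.deadline[p]
        return forced


watchdog = MsiEoiWatchdog(timeout_ticks=10)
watchdog.on_deliver(pirq=16, now=0)    # this guest never EOIs pirq 16
watchdog.on_deliver(pirq=17, now=5)
watchdog.on_guest_eoi(17)              # pirq 17 is EOIed normally
forced = watchdog.expire(now=10)       # only pirq 16 needs a forced EOI
```

In this model the forced EOI is only a fallback for a guest that never EOIs; a 
well-behaved guest still gets the real EOI done after its virtual EOI.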

> After reviewing the code: if the guest really does something like
> changing affinity within the window between an irq firing and its EOI,
> there is indeed a problem; attached is the patch. Although I kinda
> doubt it: shouldn't desc->lock in the guest protect these two
> operations and make them mutually exclusive?

We shouldn't let the hypervisor do the real EOI before the guest does the 
corresponding virtual EOI, so this patch may have a correctness issue. :-)

Attached is the fix based on my previous guess; it should fix the issue. 
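
For anyone following the thread, the race described earlier can be shown with a 
toy model. This is not the content of the attached patch and not Xen code; all 
class, method and field names are made up for illustration. The first class 
looks up the pirq by the guest's current vector and loses the EOI when the 
vector changes in between. The second is just one possible remedy, assumed for 
the sketch: remember the vector that was actually injected until it is EOIed.

```python
# Toy model only: every name here is hypothetical, none is taken from Xen.

class NaiveHypervisor:
    def __init__(self):
        self.vector_to_pirq = {}   # current guest vector -> pirq
        self.in_flight = set()     # pirqs delivered, real EOI still pending

    def set_vector(self, pirq, vector):
        # Guest reprograms the MSI (e.g. on an affinity change): the old
        # vector's mapping is replaced by the new one.
        stale = [v for v, p in self.vector_to_pirq.items() if p == pirq]
        for v in stale:
            del self.vector_to_pirq[v]
        self.vector_to_pirq[vector] = pirq

    def deliver(self, pirq):
        # Inject the interrupt using whatever vector is bound right now.
        vector = next(v for v, p in self.vector_to_pirq.items() if p == pirq)
        self.in_flight.add(pirq)
        return vector

    def guest_eoi(self, vector):
        # Look up the pirq by the vector the guest is EOIing.
        pirq = self.vector_to_pirq.get(vector)
        if pirq is None:
            return False           # lost EOI: the pirq stays in flight
        self.in_flight.discard(pirq)
        return True


class TrackingHypervisor(NaiveHypervisor):
    def __init__(self):
        super().__init__()
        self.injected = {}         # vector actually injected -> pirq

    def deliver(self, pirq):
        vector = super().deliver(pirq)
        self.injected[vector] = pirq   # kept until the guest EOIs it
        return vector

    def guest_eoi(self, vector):
        pirq = self.injected.pop(vector, None)
        if pirq is None:
            return False
        self.in_flight.discard(pirq)
        return True


def race(hv):
    hv.set_vector(pirq=16, vector=0x41)
    old = hv.deliver(16)                   # irq fires with the old vector
    hv.set_vector(pirq=16, vector=0x51)    # vector changes before the EOI
    # Returns (was the EOI matched?, is pirq 16 still stuck in flight?)
    return hv.guest_eoi(old), 16 in hv.in_flight
```

Calling race(NaiveHypervisor()) leaves pirq 16 in flight forever, while 
race(TrackingHypervisor()) matches the EOI for the old vector and clears it.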

Xiantao

Attachment: fix-irq-affinity-msi3.patch
Description: fix-irq-affinity-msi3.patch

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

 

