WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-devel

Re: [Xen-devel] Re: Still struggling with HVM: tx timeouts on emulated n

On 03/10/11 19:13, Stefano Stabellini wrote:
> CC'ing Jan, that probably is going to have an opinion on this.
>
> Let me add a bit of background: Stefan found out that PV on HVM guests
> could loose level interrupts coming from emulated devices. Looking
> through the code I realized that we need to add some logic to inject a
> pirq in the guest if a level interrupt has been raised while the guest
> is servicing the first one.
> While this is all very specific to interrupt remapping and emulated
> devices, I realized that something similar could happen even with dom0
> or other PV guests with PCI passthrough:
>
> 1) the device raises a level interrupt and xen injects it into the
> guest;
>
> 2) the guest is temporarely stuck: it does not ack it or eoi it;
>
> 3) the xen timer kicks in and eois the interrupt;
>
> 4) the device thinks it is all fine and sends a second interrupt;
>
> 5) Xen fails to inject the second interrupt into the guest because the
> guest has still the event channel pending bit set;
>
> at this point the guest looses the second interrupt notification, that
> is not supposed to happen with level interrupts and I think it might
> cause problems with some devices.
>
> Jan, do you think we should try to handle this case, or is it too
> unlikely?

I am not certain whether this is relevant, but the ICH10 IO-APIC
documentation indicated that early EOI'ing of a line level interrupt
should not have this effect.  Specifically, it states that EOI'ing a
line level interrupt whos line is still asserted will cause the
interrupt to be "re-raised" from the IO-APIC.  It uses this to assert
that it is fine to use multiple IO-APIC entries with the same vector,
with a broadcast of vector number alone to EOI the interrupt.

In this case, while Xen sees two interrupts, from the devices point of
view, only I has happened.

In the case where the device has dropped its line level interrupt of its
own accord, then I would agree that the current Xen behavior is wrong. 
I cant offhand think of a good reason why this would occur.

I know it is not helpful in this case, but as a rule of thumb, line
level interrupts should not be used with Xen.  The average response time
on an unloaded system is ~30ms, ranging from 5 to 150.  (On a sample set
of a Dell R710, Xen 4.1.0 and 2.6.32 dom0, over 2 weeks of debugging
another line level interrupt bug).

~Andrew

> In any case we need to handle the PV on HVM remapping bug, that because
> of the way interrupts are emulated is much more likely to happen...
>
>
> On Mon, 3 Oct 2011, Stefano Stabellini wrote:
>> On Fri, 30 Sep 2011, Stefan Bader wrote:
>>> Also I did not completely remove the section that would return the status
>>> without setting needsEOI. I just changed the if condition to be <0 instead 
>>> of
>>> <=0 (I knew from the tests that the mapping was always 0 and maybe the <0 
>>> check
>>> could be useful for something.
>>>
>>>         irq_status_query.flags = 0;
>>>         if ( is_hvm_domain(v->domain) &&
>>>              domain_pirq_to_irq(v->domain, irq) < 0 )
>>>         {
>>>             ret = copy_to_guest(arg, &irq_status_query, 1) ? -EFAULT : 0;
>>>             break;
>>>         }
>>>
>> You need to remove the entire test because we want to receive
>> notifications in all cases.
>>
>>
>>> With that a quick test shows both the re-sends done sometimes and the domU 
>>> doing
>>> EOIs. And there is no stall apparent. Did the same quick test with the e1000
>>> emulated NIC and that still seems ok. Those were not very thorough tests 
>>> but at
>>> least I would have observed a stall pretty quick otherwise.
>> I am glad it fixes the problem for you.
>>
>> I am going to send a different patch upstream for Xen 4.2, because I
>> would also like it to cover the very unlikely scenario in which a PV
>> guest (like dom0 or a PV guest with PCI passthrough) is loosing level
>> interrupts because when Xen tries to set the corresponding event channel
>> pending the bit is alreay set. The codebase is different enough that
>> making the same change on 4.1 is non-trivial. I am appending the new
>> patch to this email, it would be great if you could test it. You just
>> need a 4.2 hypervisor, not the entire system. You should be able to
>> perform the test updating only xen.gz.
>> If you have trouble if xen-unstable.hg tip, try changeset 23843.
>>
>> ---
>>
>>
>> diff -r bf533533046c xen/arch/x86/hvm/irq.c
>> --- a/xen/arch/x86/hvm/irq.c Fri Sep 30 14:12:35 2011 +0000
>> +++ b/xen/arch/x86/hvm/irq.c Mon Oct 03 16:54:51 2011 +0000
>> @@ -36,7 +36,8 @@ static void assert_gsi(struct domain *d,
>>  
>>      if ( hvm_domain_use_pirq(d, pirq) )
>>      {
>> -        send_guest_pirq(d, pirq);
>> +        if ( send_guest_pirq(d, pirq) && ioapic_gsi >= NR_ISAIRQS )
>> +            pirq->lost++;
>>          return;
>>      }
>>      vioapic_irq_positive_edge(d, ioapic_gsi);
>> @@ -63,6 +64,7 @@ static void __hvm_pci_intx_assert(
>>  {
>>      struct hvm_irq *hvm_irq = &d->arch.hvm_domain.irq;
>>      unsigned int gsi, link, isa_irq;
>> +    struct pirq *pirq;
>>  
>>      ASSERT((device <= 31) && (intx <= 3));
>>  
>> @@ -72,6 +74,11 @@ static void __hvm_pci_intx_assert(
>>      gsi = hvm_pci_intx_gsi(device, intx);
>>      if ( hvm_irq->gsi_assert_count[gsi]++ == 0 )
>>          assert_gsi(d, gsi);
>> +    else {
>> +        pirq = pirq_info(d, domain_emuirq_to_pirq(d, gsi));
>> +        if ( hvm_domain_use_pirq(d, pirq) )
>> +            pirq->lost++;
>> +    }
>>  
>>      link    = hvm_pci_intx_link(device, intx);
>>      isa_irq = hvm_irq->pci_link.route[link];
>> diff -r bf533533046c xen/arch/x86/irq.c
>> --- a/xen/arch/x86/irq.c     Fri Sep 30 14:12:35 2011 +0000
>> +++ b/xen/arch/x86/irq.c     Mon Oct 03 16:54:51 2011 +0000
>> @@ -965,7 +965,11 @@ static void __do_IRQ_guest(int irq)
>>               !test_and_set_bool(pirq->masked) )
>>              action->in_flight++;
>>          if ( !hvm_do_IRQ_dpci(d, pirq) )
>> -            send_guest_pirq(d, pirq);
>> +        {
>> +            if ( send_guest_pirq(d, pirq) &&
>> +                    action->ack_type == ACKTYPE_EOI )
>> +                pirq->lost++;
>> +        }
>>      }
>>  
>>      if ( action->ack_type != ACKTYPE_NONE )
>> diff -r bf533533046c xen/arch/x86/physdev.c
>> --- a/xen/arch/x86/physdev.c Fri Sep 30 14:12:35 2011 +0000
>> +++ b/xen/arch/x86/physdev.c Mon Oct 03 16:54:51 2011 +0000
>> @@ -11,6 +11,7 @@
>>  #include <asm/current.h>
>>  #include <asm/io_apic.h>
>>  #include <asm/msi.h>
>> +#include <asm/hvm/irq.h>
>>  #include <asm/hypercall.h>
>>  #include <public/xen.h>
>>  #include <public/physdev.h>
>> @@ -270,6 +271,10 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_H
>>          if ( !is_hvm_domain(v->domain) ||
>>               domain_pirq_to_irq(v->domain, eoi.irq) > 0 )
>>              pirq_guest_eoi(pirq);
>> +        if ( pirq->lost > 0) {
>> +            if ( !send_guest_pirq(v->domain, pirq) )
>> +                pirq->lost--;
>> +        }
>>          spin_unlock(&v->domain->event_lock);
>>          ret = 0;
>>          break;
>> @@ -328,9 +333,10 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_H
>>              break;
>>          irq_status_query.flags = 0;
>>          if ( is_hvm_domain(v->domain) &&
>> -             domain_pirq_to_irq(v->domain, irq) <= 0 )
>> +             domain_pirq_to_irq(v->domain, irq) <= 0 &&
>> +             domain_pirq_to_emuirq(v->domain, irq) == IRQ_UNBOUND )
>>          {
>> -            ret = copy_to_guest(arg, &irq_status_query, 1) ? -EFAULT : 0;
>> +            ret = -EINVAL;
>>              break;
>>          }
>>  
>> diff -r bf533533046c xen/include/xen/irq.h
>> --- a/xen/include/xen/irq.h  Fri Sep 30 14:12:35 2011 +0000
>> +++ b/xen/include/xen/irq.h  Mon Oct 03 16:54:51 2011 +0000
>> @@ -146,6 +146,7 @@ struct pirq {
>>      int pirq;
>>      u16 evtchn;
>>      bool_t masked;
>> +    u32 lost;
>>      struct rcu_head rcu_head;
>>      struct arch_pirq arch;
>>  };
>>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel

-- 
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel