WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
 
   
 

xen-devel

Re: [Xen-devel] Re: Still struggling with HVM: tx timeouts on emulated nics

To: Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx>
Subject: Re: [Xen-devel] Re: Still struggling with HVM: tx timeouts on emulated nics
From: Stefan Bader <stefan.bader@xxxxxxxxxxxxx>
Date: Wed, 05 Oct 2011 18:10:19 +0200
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Wed, 05 Oct 2011 09:12:10 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <alpine.DEB.2.00.1110031800390.3519@kaball-desktop>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <4E7B4768.8060103@xxxxxxxxxxxxx> <alpine.DEB.2.00.1109221838370.8700@kaball-desktop> <4E85883C.7030808@xxxxxxxxxxxxx> <alpine.DEB.2.00.1109301427590.3519@kaball-desktop> <4E85E8E8.2020702@xxxxxxxxxxxxx> <4E860382.7040108@xxxxxxxxxxxxx> <alpine.DEB.2.00.1110031800390.3519@kaball-desktop>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.23) Gecko/20110922 Lightning/1.0b2 Thunderbird/3.1.15
On 03.10.2011 19:24, Stefano Stabellini wrote:
> I am going to send a different patch upstream for Xen 4.2, because I
> would also like it to cover the very unlikely scenario in which a PV
> guest (like dom0 or a PV guest with PCI passthrough) is losing level
> interrupts because, when Xen tries to set the corresponding event channel
> pending, the bit is already set. The codebase is different enough that
> making the same change on 4.1 is non-trivial. I am appending the new
> patch to this email; it would be great if you could test it. You just
> need a 4.2 hypervisor, not the entire system. You should be able to
> perform the test by updating only xen.gz.
> If you have trouble with xen-unstable.hg tip, try changeset 23843.

Hi Stefano,

currently I have the problem that there is not much time to move to another
hypervisor (and with substantial changes beside this one, tests there may or
may not be meaningful) with our next release being close.
But I think I have a usable backport of your change to 4.1.1 (does it look
ok to you?) and have given it a quick test, which seems fine...
One drawback is that I don't have a setup that uses passthrough, so that
path is not tested. With a debugging version I did see the lost count being
incremented and decremented in dom0, though.

-Stefan

---

Index: xen-4.1.1/xen/arch/x86/domain.c
===================================================================
--- xen-4.1.1.orig/xen/arch/x86/domain.c        2011-10-05 15:03:19.405815293 +0200
+++ xen-4.1.1/xen/arch/x86/domain.c     2011-10-05 15:09:59.781816622 +0200
@@ -514,6 +514,12 @@ int arch_domain_create(struct domain *d,
         memset(d->arch.pirq_irq, 0,
                d->nr_pirqs * sizeof(*d->arch.pirq_irq));

+        d->arch.pirq_lost = xmalloc_array(int, d->nr_pirqs);
+        if ( !d->arch.pirq_lost )
+            goto fail;
+        memset(d->arch.pirq_lost, 0,
+               d->nr_pirqs * sizeof(*d->arch.pirq_lost));
+
         d->arch.irq_pirq = xmalloc_array(int, nr_irqs);
         if ( !d->arch.irq_pirq )
             goto fail;
@@ -575,6 +581,7 @@ int arch_domain_create(struct domain *d,
  fail:
     d->is_dying = DOMDYING_dead;
     vmce_destroy_msr(d);
+    xfree(d->arch.pirq_lost);
     xfree(d->arch.pirq_irq);
     xfree(d->arch.irq_pirq);
     xfree(d->arch.pirq_emuirq);
@@ -628,6 +635,7 @@ void arch_domain_destroy(struct domain *
 #endif

     free_xenheap_page(d->shared_info);
+    xfree(d->arch.pirq_lost);
     xfree(d->arch.pirq_irq);
     xfree(d->arch.irq_pirq);
     xfree(d->arch.pirq_emuirq);
Index: xen-4.1.1/xen/arch/x86/hvm/irq.c
===================================================================
--- xen-4.1.1.orig/xen/arch/x86/hvm/irq.c       2011-10-05 15:14:35.441815292 +0200
+++ xen-4.1.1/xen/arch/x86/hvm/irq.c    2011-10-05 17:55:43.603986605 +0200
@@ -33,7 +33,9 @@ static void assert_gsi(struct domain *d,
     int pirq = domain_emuirq_to_pirq(d, ioapic_gsi);
     if ( hvm_domain_use_pirq(d, pirq) )
     {
-        send_guest_pirq(d, pirq);
+        if ( send_guest_pirq(d, pirq) && ioapic_gsi >= NR_ISAIRQS )
+            if ( d->arch.pirq_lost )
+                d->arch.pirq_lost[pirq]++;
         return;
     }
     vioapic_irq_positive_edge(d, ioapic_gsi);
@@ -67,6 +69,12 @@ static void __hvm_pci_intx_assert(
     gsi = hvm_pci_intx_gsi(device, intx);
     if ( hvm_irq->gsi_assert_count[gsi]++ == 0 )
         assert_gsi(d, gsi);
+    else {
+        int pirq = domain_emuirq_to_pirq(d, gsi);
+
+        if ( hvm_domain_use_pirq(d, pirq) && d->arch.pirq_lost )
+            d->arch.pirq_lost[pirq]++;
+    }

     link    = hvm_pci_intx_link(device, intx);
     isa_irq = hvm_irq->pci_link.route[link];
Index: xen-4.1.1/xen/arch/x86/irq.c
===================================================================
--- xen-4.1.1.orig/xen/arch/x86/irq.c   2011-10-05 15:26:58.477815292 +0200
+++ xen-4.1.1/xen/arch/x86/irq.c        2011-10-05 17:56:23.191986535 +0200
@@ -888,10 +888,13 @@ static void __do_IRQ_guest(int irq)
                 desc->status |= IRQ_INPROGRESS; /* cleared during hvm eoi */
             }
         }
-        else if ( send_guest_pirq(d, pirq) &&
-                  (action->ack_type == ACKTYPE_NONE) )
-        {
-            already_pending++;
+        else {
+            if ( send_guest_pirq(d, pirq) ) {
+                if ( action->ack_type == ACKTYPE_EOI && d->arch.pirq_lost )
+                    d->arch.pirq_lost[pirq]++;
+                else if ( action->ack_type == ACKTYPE_NONE )
+                    already_pending++;
+            }
         }
     }

Index: xen-4.1.1/xen/arch/x86/physdev.c
===================================================================
--- xen-4.1.1.orig/xen/arch/x86/physdev.c       2011-10-05 15:36:14.545815292 +0200
+++ xen-4.1.1/xen/arch/x86/physdev.c    2011-10-05 17:57:06.055986460 +0200
@@ -261,13 +261,18 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_H
         ret = -EINVAL;
         if ( eoi.irq >= v->domain->nr_pirqs )
             break;
+        spin_lock(&v->domain->event_lock);
         if ( v->domain->arch.pirq_eoi_map )
             evtchn_unmask(v->domain->pirq_to_evtchn[eoi.irq]);
         if ( !is_hvm_domain(v->domain) ||
              domain_pirq_to_irq(v->domain, eoi.irq) > 0 )
-            ret = pirq_guest_eoi(v->domain, eoi.irq);
-        else
-            ret = 0;
+            pirq_guest_eoi(v->domain, eoi.irq);
+        if ( v->domain->arch.pirq_lost && v->domain->arch.pirq_lost[eoi.irq] ) {
+            if ( !send_guest_pirq(v->domain, eoi.irq) )
+                v->domain->arch.pirq_lost[eoi.irq]--;
+        }
+        ret = 0;
+        spin_unlock(&v->domain->event_lock);
         break;
     }

@@ -323,9 +328,10 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_H
             break;
         irq_status_query.flags = 0;
         if ( is_hvm_domain(v->domain) &&
-             domain_pirq_to_irq(v->domain, irq) <= 0 )
+             domain_pirq_to_irq(v->domain, irq) <= 0 &&
+             domain_pirq_to_emuirq(v->domain, irq) == IRQ_UNBOUND )
         {
-            ret = copy_to_guest(arg, &irq_status_query, 1) ? -EFAULT : 0;
+            ret = -EINVAL;
             break;
         }

Index: xen-4.1.1/xen/include/asm-x86/domain.h
===================================================================
--- xen-4.1.1.orig/xen/include/asm-x86/domain.h 2011-10-05 15:10:11.709815293 +0200
+++ xen-4.1.1/xen/include/asm-x86/domain.h      2011-10-05 15:12:46.237815276 +0200
@@ -312,6 +312,9 @@ struct arch_domain
                                 (possibly other cases in the future */
     uint64_t vtsc_kerncount; /* for hvm, counts all vtsc */
     uint64_t vtsc_usercount; /* not used for hvm */
+
+    /* Protected by d->event_lock, count of lost pirqs */
+    int *pirq_lost;
 } __cacheline_aligned;

 #define has_arch_pdevs(d)    (!list_empty(&(d)->arch.pdev_list))

Attachment: xen-backport-pirq-lost.patch
Description: Text Data

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel