This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/

RE: [Xen-devel] Instability with Xen, interrupt routing frozen, HPET bro

To: Andreas Kinzler <ml-xen-devel@xxxxxx>
Subject: RE: [Xen-devel] Instability with Xen, interrupt routing frozen, HPET broadcast
From: "Zhang, Xiantao" <xiantao.zhang@xxxxxxxxx>
Date: Fri, 1 Oct 2010 12:14:13 +0800
Accept-language: en-US
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, "JBeulich@xxxxxxxxxx" <JBeulich@xxxxxxxxxx>, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Delivery-date: Thu, 30 Sep 2010 21:15:05 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4CA45B9E.1060608@xxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <4C88A6F3.9020207@xxxxxx> <20100921115604.GP2804@xxxxxxxxxxx> <4CA38093.9070802@xxxxxx> <BC00F5384FCFC9499AF06F92E8B78A9E1A90A388F5@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <4CA45B9E.1060608@xxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: Actgg+abgn4mlpn/QVm+SqOlWp+HRAAmB9ZQ
Thread-topic: [Xen-devel] Instability with Xen, interrupt routing frozen, HPET broadcast
Andreas Kinzler wrote:
> On 30.09.2010 07:00, Zhang, Xiantao wrote:
>> Maybe you can try disabling pirq_set_affinity with the following
>> patch. It may trigger IRQ migration in the hypervisor, and the IRQ
>> migration logic for (especially shared) level-triggered IOAPIC IRQs
>> is not well tested because it had no users before. After
>> pirq_set_affinity was introduced in #Cset21625, the logic is used
>> frequently when vcpu migration occurs.
> I am using Xen 4.0.1 which is c/s 21324 so I should not be affected?

Which changeset were you running when you collected the suspicious 'irr=1'
log: Xen 4.0.1 or 22068?
In addition, did you see that strange log entry at every hang? IRQ 16 is
assigned a relatively high vector, 216. If it is not acked correctly, the
other, lower-priority interrupt sources are masked automatically, so dom0
may hang. Another thing to try is to hack assign_irq_vector to allocate a
small vector for IRQ 16; then, when the aacraid controller misbehaves, you
still have a chance to log on to dom0 and get more information. Besides,
could you try enabling MSI for the aacraid controller?
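
To illustrate why a high vector matters here, below is a minimal Python
sketch of the x86 local APIC priority rule. It is a simplified model, not
Xen code: it assumes the task priority register is 0, so the processor
priority class is simply the upper four bits of the highest in-service
vector. The function names are hypothetical; only vector 216 comes from
this thread, and the other vectors are made-up examples.

```python
def priority_class(vector: int) -> int:
    """Return the priority class (upper four bits) of an 8-bit vector."""
    if not 0 <= vector <= 255:
        raise ValueError("vector must fit in 8 bits")
    return vector >> 4


def can_dispatch(pending_vector: int, in_service_vector: int) -> bool:
    """Simplified local APIC rule, assuming the task priority register is 0.

    A pending interrupt is dispatched only if its priority class is
    strictly higher than the class of the vector that is still in
    service (that is, has not received its EOI yet).
    """
    return priority_class(pending_vector) > priority_class(in_service_vector)


if __name__ == "__main__":
    stuck = 216  # vector of IRQ 16 in this thread, assumed never acked
    for pending in (49, 208, 224):  # hypothetical vectors of other sources
        state = "dispatched" if can_dispatch(pending, stuck) else "blocked"
        print(f"vector {pending} (class {priority_class(pending)}): {state}")
```

With vector 216 (class 13) stuck in service, only vectors in classes 14
and 15 would still get through in this model, which is why moving IRQ 16
to a small vector should leave more of dom0 responsive.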

>> Besides, there is a bug in the event driver that is fixed in the latest
>> pv_ops dom0; it seems the dom0 you are using doesn't include the fix.
>> This bug may result in lost events in dom0 and eventually cause dom0
>> to hang.
> Hmm, this really does not explain why everything is rock solid after
> disabling HPET broadcast. And the problem occurred with every kernel
> (xenified, pvops, all versions). Please correct me if I am wrong.

This is only a guess, but HPET broadcast may not be the real culprit;
judging from the log you attached, it may just expose the bug by accident.

>> To work around this bug, you can disable irqbalance in dom0. Good luck!
> As far as I know I am not using irq balancing (certainly not using the
> irqbalance daemon).
