
Re: [Xen-devel] Xen-unstable: Xen panic when shutting down HVM guest with PCI passthrough: RIP: e008:[<ffff82d0801099b1>] evtchn_move_pirqs+0x90/0xbf



Friday, June 6, 2014, 10:35:05 AM, you wrote:

>>>> On 06.06.14 at 10:15, <linux@xxxxxxxxxxxxxx> wrote:
>>> Hmm, does the following patch fix it?
>> 
>> 
>>> Juergen
>> 
>>> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
>>> index c174c41..3ea9fc8 100644
>>> --- a/xen/common/schedule.c
>>> +++ b/xen/common/schedule.c
>>> @@ -297,7 +297,8 @@ int sched_move_domain(struct domain *d, struct cpupool *c)
>>>           spin_unlock_irq(lock);
>> 
>>>           v->sched_priv = vcpu_priv[v->vcpu_id];
>>> -        evtchn_move_pirqs(v);
>>> +        if ( !d->is_dying )
>>> +            evtchn_move_pirqs(v);
>> 
>>>           new_p = cpumask_cycle(new_p, c->cpu_valid);
>> 
>> Hi Jürgen,
>> 
>> Tried that one; it prevents the host panic, but immediately afterwards I get:
>> 
>> [  658.222311] irq 16: nobody cared (try booting with the "irqpoll" option)
>> [  658.229069] CPU: 0 PID: 17104 Comm: smtp Not tainted 3.15.0-rc8-20140602-net-xendev-bt-mq+ #1
>> [  658.235996] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
>> [  658.243234]  ffff880054b00718 ffff88005f603d98 ffffffff81b93243 ffff88002e4e9330
>> [  658.250342]  ffff880054b00690 ffff88005f603dc8 ffffffff8112306d ffff88005f603dc8
>> [  658.257411]  ffff880054b00690 0000000000000010 0000000000000000 ffff88005f603e18
>> [  658.264362] Call Trace:
>> [  658.271128]  <IRQ>  [<ffffffff81b93243>] dump_stack+0x46/0x58
>> [  658.278096]  [<ffffffff8112306d>] __report_bad_irq+0x3d/0xe0
>> [  658.285175]  [<ffffffff8112355f>] note_interrupt+0x1bf/0x210
>> [  658.292210]  [<ffffffff81120ca5>] handle_irq_event_percpu+0xb5/0x1e0
>> [  658.299135]  [<ffffffff81120e18>] handle_irq_event+0x48/0x70
>> [  658.306032]  [<ffffffff8112411f>] ? handle_fasteoi_irq+0x2f/0x150
>> [  658.312935]  [<ffffffff8112417c>] handle_fasteoi_irq+0x8c/0x150
>> [  658.319847]  [<ffffffff811204f2>] generic_handle_irq+0x22/0x40
>> [  658.326627]  [<ffffffff8157dfb7>] evtchn_fifo_handle_events+0x137/0x140
>> [  658.333370]  [<ffffffff8157aef0>] __xen_evtchn_do_upcall+0x50/0xa0
>> [  658.340061]  [<ffffffff8157cc37>] xen_evtchn_do_upcall+0x37/0x50
>> [  658.346575]  [<ffffffff81b9ff5e>] xen_do_hypervisor_callback+0x1e/0x30
>> [  658.353214]  <EOI>
>> [  658.353264] handlers:
>> [  658.366234] [<ffffffff819a2440>] azx_interrupt
>> [  658.372806] Disabling IRQ #16
>> 
>> Could that be related?

> evtchn_move_pirqs() is an entirely optional thing, trying to keep the
> IRQs at CPUs where they would also get serviced. So from an
> abstract pov making the call conditional upon an alive domain should
> be unrelated. Otoh I suppose you would have told us if with the
> commit reverted that Andrew suggested you also saw this...
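
For reference, the guard from the patch can be sketched in isolation. This is only a minimal illustration, not the Xen code: `struct domain` and `struct vcpu` here are hypothetical, stripped-down stand-ins, and `move_pirqs_stub()` and `move_vcpu_sketch()` are made-up names. The stub merely counts calls where the real `evtchn_move_pirqs()` would rebind the IRQs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for Xen's struct domain / struct vcpu. */
struct domain {
    bool is_dying;   /* set once domain teardown has begun */
    int pirq_moves;  /* counts how often the rebind stub ran */
};

struct vcpu {
    struct domain *domain;
    int vcpu_id;
};

/* Stand-in for evtchn_move_pirqs(): only records that it was called. */
static void move_pirqs_stub(struct vcpu *v)
{
    v->domain->pirq_moves++;
}

/*
 * Mirrors the guarded call from the patch: the rebind is optional,
 * so it is skipped when the domain is already being torn down.
 */
static void move_vcpu_sketch(struct vcpu *v)
{
    assert(v != NULL && v->domain != NULL);
    if ( !v->domain->is_dying )
        move_pirqs_stub(v);
}
```

Calling `move_vcpu_sketch()` for a vcpu of a live domain increments `pirq_moves`; for a dying domain the counter stays untouched, which is the behaviour the patch aims for.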

OK, I did some more testing:

No, I didn't see it before, but I can't seem to trigger it again,
so the patch seems to be OK.

(Oh, and being stubborn and trying to restart the guest with passthrough
after this message causes a complete machine freeze, without any stack trace
on the serial console (though without sync_console).)

> Is the passed through device also sitting on IRQ 16 (the one getting
> disabled)? In which case the question would be whether the passed
> through device gets properly shut down when the guest terminates.

Assuming the devices get the same IRQ after a reboot (which seems to be the
case), it wasn't. IRQ 16 belonged to the onboard sound controller, which was
still assigned to dom0 (and not in active use).

00:14.2 Audio device: Advanced Micro Devices [AMD] nee ATI SBx00 Azalia (Intel HDA) (rev 40)
        Subsystem: Micro-Star International Co., Ltd. Device 7640
        Flags: bus master, slow devsel, latency 64, IRQ 16
        Memory at fddf8000 (64-bit, non-prefetchable) [size=16K]
        Capabilities: [50] Power Management version 2
        Kernel driver in use: snd_hda_intel

I will keep an eye on it and try to use the debug keys to dump the IRQ state
etc. from the serial console when I spot it again.

--
Sander

> Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

