xen-devel

Re: [Xen-devel] Xen Advisory 5 (CVE-2011-3131) IOMMU fault livelock

>>> On 15.08.11 at 11:26, Tim Deegan <tim@xxxxxxx> wrote:
> At 15:48 +0100 on 12 Aug (1313164084), Jan Beulich wrote:
>> >>> On 12.08.11 at 16:09, Tim Deegan <tim@xxxxxxx> wrote:
>> > At 14:53 +0100 on 12 Aug (1313160824), Jan Beulich wrote:
>> >> > This issue is resolved in changeset 23762:537ed3b74b3f of
>> >> > xen-unstable.hg, and 23112:84e3706df07a of xen-4.1-testing.hg.
>> >> 
>> >> Do you really think this helps much? Direct control of the device means
>> >> it could also (perhaps on a second vCPU) constantly re-enable the bus
>> >> mastering bit. 
>> > 
>> > That path goes through qemu/pciback, so at least lets Xen schedule the
>> > dom0 tools.
>> 
>> Are you sure? If (as said) the guest uses a second vCPU for doing the
>> config space accesses, I can't see how this would save the pCPU the
>> fault storm is occurring on.
> 
> Hmmm.  Yes, I see what you mean.  What was your concern about
> memory-mapped config registers?  That PCIback would need to be involved
> somehow?

Yes, unless we want to get into the business of intercepting Dom0's
writes to mmcfg space.

>> > The particular failure that this patch fixes was locking up
>> > cpu0 so hard that it couldn't even service softirqs, and the NMI
>> > watchdog rebooted the machine.
>> 
>> Hmm, that would point at a flaw in the interrupt exit path, on which
>> softirqs shouldn't be ignored.
> 
> Are you suggesting that we should handle softirqs before re-enabling
> interrupts?  That sounds perilous.

Ah, okay, I was assuming execution would get back into the guest at
least, but you're saying the interrupts hit right after the sti. Indeed,
in that case there's not much else we can do. Or maybe there is: how
about moving the whole fault handling into a softirq, and making the
low-level handler just raise that one? Provided this isn't a
performance-critical operation (and it really can't be, given that you
now basically knock the offending device in the face when a fault
happens), having to iterate through all IOMMUs shouldn't be that bad.

Jan
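
[Editorial sketch, not part of the original message.] The softirq idea
above can be illustrated with a toy model of the control flow. This is
not Xen code: the names Iommu, FaultSoftirq, low_level_handler and
run_softirq are hypothetical, and the "softirq" is only a pending flag
that the caller drains explicitly. It assumes the low-level handler does
not record which unit faulted, which is why the deferred handler scans
every IOMMU.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Iommu:
    """Hypothetical stand-in for one IOMMU unit and its fault records."""
    name: str
    pending_faults: List[str] = field(default_factory=list)
    handled: List[str] = field(default_factory=list)


class FaultSoftirq:
    """Toy model: the interrupt handler only raises a softirq."""

    def __init__(self, iommus: List[Iommu]) -> None:
        self.iommus = iommus
        self.pending = False  # models the softirq-pending bit

    def low_level_handler(self) -> None:
        # Interrupt context: do no fault processing here, so a storm of
        # interrupts costs only a flag write each time.
        self.pending = True

    def run_softirq(self) -> int:
        # Softirq context: the flag does not say which unit faulted,
        # so iterate through all IOMMUs and drain each one.
        if not self.pending:
            return 0
        # Clear first, so a fault raised during the scan is not lost.
        self.pending = False
        processed = 0
        for iommu in self.iommus:
            while iommu.pending_faults:
                iommu.handled.append(iommu.pending_faults.pop(0))
                processed += 1
        return processed


if __name__ == "__main__":
    units = [Iommu("iommu0"), Iommu("iommu1")]
    softirq = FaultSoftirq(units)
    units[1].pending_faults += ["fault-a", "fault-b"]
    for _ in range(3):  # several interrupts arrive back to back
        softirq.low_level_handler()
    print(softirq.run_softirq(), units[1].handled)
```

In this model the repeated interrupts collapse into a single deferred
pass, so the cost of scanning all units is paid once per softirq run
rather than once per interrupt.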

