WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-devel

Re: [Xen-devel] Xen Advisory 5 (CVE-2011-3131) IOMMU fault livelock

>>> On 16.08.11 at 17:06, Tim Deegan <tim@xxxxxxx> wrote:
> At 08:03 +0100 on 16 Aug (1313481813), Jan Beulich wrote:
>> >>> On 15.08.11 at 11:26, Tim Deegan <tim@xxxxxxx> wrote:
>> > At 15:48 +0100 on 12 Aug (1313164084), Jan Beulich wrote:
>> >> >>> On 12.08.11 at 16:09, Tim Deegan <tim@xxxxxxx> wrote:
>> >> > At 14:53 +0100 on 12 Aug (1313160824), Jan Beulich wrote:
>> >> >> > This issue is resolved in changeset 23762:537ed3b74b3f of
>> >> >> > xen-unstable.hg, and 23112:84e3706df07a of xen-4.1-testing.hg.
>> >> >> 
>> >> >> Do you really think this helps much? Direct control of the device means
>> >> >> it could also (perhaps on a second vCPU) constantly re-enable the bus
>> >> >> mastering bit. 
>> >> > 
>> >> > That path goes through qemu/pciback, so at least lets Xen schedule the
>> >> > dom0 tools.
>> >> 
>> >> Are you sure? If (as said) the guest uses a second vCPU for doing the
>> >> config space accesses, I can't see how this would save the pCPU the
>> >> fault storm is occurring on.
>> > 
>> > Hmmm.  Yes, I see what you mean.
>> 
>> Actually, a second vCPU may not even be needed: Since the "fault"
>> really is an external interrupt, if that one gets handled on a pCPU other
>> than the one the guest's vCPU is running on, it could execute such a
>> loop even in that case.
>> 
>> As to yesterdays softirq-based handling thoughts - perhaps the clearing
>> of the bus master bit on the device should still be done in the actual IRQ
>> handler, while the processing of the fault records could be moved out to
>> a softirq.
> 
> Hmmm.  I like the idea of using a softirq but in fact by the time we've
> figured out which BDF to silence we've pretty much done handling the
> fault.

Ugly, but yes, indeed.

> Reading the VTd docs it looks like we can just ack the IOMMU fault
> interrupt and it won't send any more until we clear the log, so we can
> leave the whole business to a softirq.  Delaying that might cause the
> log to overflow, but that's not necessarily the end of the world.
> Looks like we can do the same on AMD by disabling interrupt generation
> in the main handler and reenabling it in the softirq.
> 
> Is there any situation where we rally care terribly about the IOfault
> logs overflowing?

As long as older entries don't get overwritten, I don't think that's
going to be problematic. The more that we basically shut off the
offending device(s).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel