
Re: [Xen-devel] [PATCH] iommu: leave IOMMU enabled by default during kexec crash transition

On 22/02/2019 12:51, Jan Beulich wrote:
>>>> On 22.02.19 at 13:40, <igor.druzhinin@xxxxxxxxxx> wrote:
>> There are several reasons why it's better:
>> a) the kernel is able to perform device reset properly, as it has
>> bus-specific code that does this. There is even a comment in that
>> code noting that, by the time it disables translation, the
>> bus-specific reset has finished and it's safer (as devices have
>> likely stopped DMA at that point) to do it then.
>> b) the kernel has drivers that perform driver-specific resets of
>> devices that do not work well with a bus-specific reset. It's simply
>> impossible to implement that in Xen.
>> c) even if a device is uncooperative and keeps sending bus
>> transactions, the error event I mentioned earlier will be properly
>> handled, as the kernel has facilities for it at that point.
> Okay, c) is a convincing argument. a) and b) only partly: IIRC
> crash kernels don't load unnecessary drivers, so a babbling device
> may be left untouched unless generic kernel code can reset or
> otherwise silence it.

The crash kernel loads whatever drivers are necessary to save the crash
dump: if it needs to load a RAID controller driver while the controller
is still sending DMA, it will do so - and the driver will then reset
the device.

Even if a device is left untouched, it's still unsafe to disable
translation while any device on the bus hasn't been properly reset. So,
to avoid scenario (c), the kernel will do its best to reset all the
devices first.

I also want to mention that I've tested this particular change in our
lab on about 500 machines of different ages and classes. It definitely
makes the crash path more reliable: before this change, entering the
crash kernel failed in ~25-30% of cases, while with it reliability is
close to 98-99%.


Xen-devel mailing list


