
Re: [Xen-devel] [PATCH v2] x86 / iommu: set up a scratch page in the quarantine domain



On 10.12.2019 08:16, Tian, Kevin wrote:
>> From: Jan Beulich <jbeulich@xxxxxxxx>
>> Sent: Tuesday, December 3, 2019 5:36 PM
>>
>> On 28.11.2019 12:32, Jürgen Groß wrote:
>>> On 28.11.19 12:17, Jan Beulich wrote:
>>>> On 27.11.2019 18:11, Paul Durrant wrote:
>>>>> This patch introduces a new iommu_op to facilitate a per-implementation
>>>>> quarantine set up, and then further code for x86 implementations
>>>>> (amd and vtd) to set up a read-only scratch page to serve as the source
>>>>> for DMA reads whilst a device is assigned to dom_io. DMA writes will
>>>>> continue to fault as before.
>>>>>
>>>>> The reason for doing this is that some hardware may continue to re-try
>>>>> DMA (despite FLR) in the event of an error, or even BME being cleared, and
>>>>> will fail to deal with DMA read faults gracefully. Having a scratch page
>>>>> mapped will allow pending DMA reads to complete and thus such buggy
>>>>> hardware will eventually be quiesced.
>>>>>
>>>>> NOTE: These modifications are restricted to x86 implementations only as
>>>>>        the buggy h/w I am aware of is only used with Xen in an x86
>>>>>        environment. ARM may require similar code but, since I am not
>>>>>        aware of the need, this patch does not modify any ARM implementation.
>>>>>
>>>>> Signed-off-by: Paul Durrant <pdurrant@xxxxxxxxxx>
>>>>
>>>> Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>
>>>>
>>>>> There is still the open question of whether use of a scratch page ought
>>>>> to be gated on something, either at run-time or compile-time.
>>>>
>>>> I have no clear opinion either way here. The workaround seems low
>>>> overhead enough that there may not be a need to have an admin (or
>>>> build time) control for this.
>>>>
>>>> As to 4.13: The quarantining as a whole is pretty fresh. While it
>>>> has been backported to security maintained trees, I'd still consider
>>>> it a new feature in 4.13, and hence this workaround at least eligible
>>>> for consideration.
>>>
>>> I agree.
>>>
>>> Release-acked-by: Juergen Gross <jgross@xxxxxxxx>
>>
>> I notice this has been committed meanwhile. I had specifically not
>> done so due to the still missing VT-d ack, seeing that this wasn't
>> an entirely "trivial" change.
>>
> 
> While the quarantine idea sounds good overall, I'm still not convinced
> that it should be the only mechanism in place just to handle some
> known-buggy device. It kills the possibility of identifying a new buggy
> device and then deciding not to use it in the first place... I thought
> about whether it will get better when future IOMMUs implement A/D bits:
> by checking whether the access bit is set we'd know that some buggy
> device exists. But the scratch page is shared by all devices, so we
> cannot rely on this feature to find out which one is actually buggy.

Thinking about it - yes, I think I agree. This (as with so many
workarounds) would better be an off-by-default one. The main issue
I understand this would have is that buggy systems then might hang
without even having managed to get a log message out - Paul?
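
To make the trade-off concrete, here is a toy model of the behaviour under
discussion. It is only a sketch and not Xen code: the flag
quarantine_scratch_page, the enums and quarantine_dma() are all hypothetical
names, and the flag merely stands in for whatever admin control might be
added. It assumes the control defaults to off.

```c
#include <stdbool.h>

/* Toy model only: every name here is hypothetical, none of it is Xen code. */
enum dma_op { DMA_READ, DMA_WRITE };
enum dma_result { DMA_OK, DMA_FAULT };

/* Assumed admin control for the workaround, off by default. */
static bool quarantine_scratch_page = false;

/* Outcome of a DMA access issued by a device sitting in the quarantine domain. */
static enum dma_result quarantine_dma(enum dma_op op)
{
    /* Writes fault regardless of whether the scratch page is in use. */
    if (op == DMA_WRITE)
        return DMA_FAULT;

    /* Reads complete only when the read-only scratch page is mapped. */
    return quarantine_scratch_page ? DMA_OK : DMA_FAULT;
}
```

With the flag left at its default, reads fault just like writes, so a
misbehaving device remains visible through its faults. Only with the flag
set do pending reads complete against the scratch page.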

Jürgen - would you be amenable to an almost last minute refinement
here (would then also need to still be backported to 4.12.2, or
the original backport reverted, to avoid giving the impression of
a regression)?

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

