[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH for-4.13 2/2] x86/ioapic: don't use raw entry reads/writes in clear_IO_APIC_pin



On 08.11.2019 10:27, Roger Pau Monné  wrote:
> On Thu, Nov 07, 2019 at 04:56:09PM +0100, Jan Beulich wrote:
>> On 07.11.2019 16:46, Roger Pau Monné  wrote:
>>> On Thu, Nov 07, 2019 at 04:28:56PM +0100, Jan Beulich wrote:
>>>> On 07.11.2019 16:06, Roger Pau Monne wrote:
>>>>> @@ -530,9 +530,9 @@ static void clear_IO_APIC_pin(unsigned int apic, 
>>>>> unsigned int pin)
>>>>>       */
>>>>>      memset(&entry, 0, sizeof(entry));
>>>>>      entry.mask = 1;
>>>>> -    __ioapic_write_entry(apic, pin, true, entry);
>>>>> +    __ioapic_write_entry(apic, pin, false, entry);
>>>>
>>>> I may be able to understand why this one can't use raw mode, but as
>>>> per above a better overall description is needed.
>>>
>>> Yes, this is the one that's actually incorrect, but see my reasoning
>>> below.
>>>
>>>>
>>>>> -    entry = __ioapic_read_entry(apic, pin, true);
>>>>> +    entry = __ioapic_read_entry(apic, pin, false);
>>>>>      if (entry.irr)
>>>>>          printk(KERN_ERR "IO-APIC%02x-%u: Unable to reset IRR\n",
>>>>>                 IO_APIC_ID(apic), pin);
>>>>
>>>> This read again shouldn't need conversion, as the IRR bit doesn't
>>>> get touched (I think) by the interrupt remapping code during the
>>>> translation it does.
>>>
>>> TBH, I think raw mode should only be used by the iommu code in order
>>> to setup the entries to point to the interrupt remapping table,
>>> everything else shouldn't be using raw mode. While it's true that some
>>> of the cases here are safe to use raw mode I would discourage it's
>>> usage as it can lead to issues, and this is not a performance critical
>>> path anyway.
>>
>> You also should take the other possible perspective - not using
>> raw mode means going through interrupt remapping logic, which
>> can (needlessly) trigger errors. I think you want to break the
>> patch into a necessary and an optional part. The optional part
>> should be discussed separately and deferred until after 4.13.
> 
> IMO generic IO-APIC code has not business playing with raw entries
> when interrupt remapping is enabled, the layout of IO-APIC entries in
> that case is vendor-specific, and hence the generic IO-APIC code is
> not able to parse it.
> 
> For example the code in clear_IO_APIC_pin modifies the mask or the
> trigger fields of RAW entries, is there any guarantee that those
> fields don't have different meanings/layout when interrupt remapping
> is enabled?

From an abstract pov there's no such guarantee, but in practice
the meaning of the fields doesn't change. You make a good point
though nevertheless: For VT-d the trigger mode fields in RTE and
IRTE need to match, so the interrupt remapping code needs to see
the trigger mode change. See below for a possible alternative
patch.

> I can split the specific bugfix into a separate patch, but IMO the
> code in clear_IO_APIC_pin is not safe.

A change is needed, yes, but in particular because of the use of
the function from clear_IO_APIC(), in turn called from
disable_IO_APIC(), yet in turn used e.g. during emergency
shutdown after a crash, I'd like the function to do as simple
operations as possible, i.e. specifically avoid going through
interrupt remapping code (because its data structures may also
be corrupted at that point) unless really needed (hence the
alternative patch suggestion below).

As an aside, iommu_crash_shutdown() - even if actually doing
something, i.e. disabling interrupt remapping - does _not_
cause the RTEs to be re-written in non-translated format.

Jan

--- unstable.orig/xen/arch/x86/io_apic.c
+++ unstable/xen/arch/x86/io_apic.c
@@ -519,8 +519,9 @@ static void clear_IO_APIC_pin(unsigned i
     if (entry.irr) {
         /* Make sure the trigger mode is set to level. */
         if (!entry.trigger) {
+            entry = __ioapic_read_entry(apic, pin, false);
             entry.trigger = 1;
-            __ioapic_write_entry(apic, pin, TRUE, entry);
+            __ioapic_write_entry(apic, pin, false, entry);
         }
         __io_apic_eoi(apic, entry.vector, pin);
     }
@@ -530,7 +531,7 @@ static void clear_IO_APIC_pin(unsigned i
      */
     memset(&entry, 0, sizeof(entry));
     entry.mask = 1;
-    __ioapic_write_entry(apic, pin, TRUE, entry);
+    __ioapic_write_entry(apic, pin, false, entry);
 
     entry = __ioapic_read_entry(apic, pin, TRUE);
     if (entry.irr)


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.