[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v2 07/10] xen/arm: Release maintenance interrupt when CPU is hot-unplugged



Hi Julien,

On Fri, Apr 27, 2018 at 5:12 PM, Julien Grall <julien.grall@xxxxxxx> wrote:
>
>
> On 27/04/18 15:38, Mirela Simonovic wrote:
>>
>> Hi,
>>
>> On Fri, Apr 27, 2018 at 4:15 PM, Tim Deegan <tim@xxxxxxx> wrote:
>>>
>>> Hi,
>>>
>>> At 10:28 +0100 on 27 Apr (1524824906), Julien Grall wrote:
>>>>
>>>> On 26/04/18 15:23, Tim Deegan wrote:
>>>>>
>>>>> At 11:08 +0100 on 26 Apr (1524740921), Julien Grall wrote:
>>>>>>>>>>
>>>>>>>>>> On 20/04/18 13:25, Mirela Simonovic wrote:
>>>>>>>>
>>>>>>>> This looks a bit weird. AFAIU, if you disable the CPU interface,
>>>>>>>> you should never receive an interrupt afterwards. So why would you
>>>>>>>> re-enable them?
>>>>>>>>
>>>>>>>> I realize the code in __cpu_disable does that, but this looks quite
>>>>>>>> wrong to me. There is no way to receive a queued timer interrupt
>>>>>>>> afterwards.
>>>>>>>>
>>>>>>>
>>>>>>> That is what I took as a reference, but I asked myself the same.
>>>>>>> There is a time window (extremely small, but it exists) between
>>>>>>> disabling IRQs locally and disabling the CPU interface. An interrupt
>>>>>>> received in that window would propagate to the CPU, but I'm not
>>>>>>> sure what would happen after the GIC CPU interface is disabled and
>>>>>>> interrupts are locally enabled again. That is the only explanation I
>>>>>>> can come up with, although I believe the answer is nothing. Since
>>>>>>> you're at ARM you could check this internally.
>>>>>>
>>>>>>
>>>>>> Speaking with Andre (in CC), the GIC CPU interface may have
>>>>>> forwarded an interrupt to the processor before it gets disabled. So
>>>>>> when interrupts are re-enabled, the processor will jump to the
>>>>>> interrupt exception entry.
>>>>>>
>>>>>> However, looking at the spec (4-78 in ARM IHI 0048B.b), Xen will
>>>>>> read a spurious interrupt ID from GICC_IAR. So I am not sure what
>>>>>> the point of that code is. It looks like it has been taken from x86,
>>>>>> but some bits are missing.
>>>>>>
>>>>>> AFAIU, x86 will only suspend the timer afterwards (see
>>>>>> time_suspend). I am not fully sure why this code is there on Arm, or
>>>>>> whether we expect a timer interrupt to come up. Stefano, Tim, do you
>>>>>> have any insight on that code?
>>>>>
>>>>>
>>>>> Sorry, no.  I pretty clearly copied this logic from x86, which copied
>>>>> it directly from Linux at some point in the past.  I don't know why
>>>>> x86 does it this way, and I haven't dived into Linux to find out. :)
>>>>> But draining the outstanding IRQs seems like a polite thing to do if
>>>>> you're ever going to re-enable this CPU (at least without resetting
>>>>> it first).
>>>>
>>>>
>>>> I am not entirely sure what you mean by draining. Do you mean they
>>>> will be serviced by Xen? If so, what kind of interrupts do you expect
>>>> to be serviced (e.g. PPIs, SPIs)?
>>>
>>>
>>> All I mean is, when you disable the GICC (or APIC, or whatever), you
>>> know that it won't send any more interrupts to the CPU.  But you would
>>> like to also be certain that any interrupts it already sent to the CPU
>>> get processed now.  Otherwise, if you bring the CPU up again later
>>> that interrupt could still be there.  Better to get it out of the way
>>> now, right?
>>>
>>> AIUI that's what x86 is doing by re-enabling interrupts and waiting a
>>> bit, which seems a bit crude but OK.  ARM could maybe do the same
>>> thing by disabling GICC, dsb(), then disabling interrupts.  But I don't
>>> understand the interface between GICD, GICC and the CPU well enough to
>>> reason about it properly.
>>>
>>> It's also possible that there's some subtlety of the timer interrupt
>>> handling that I don't know about -- I _think_ that the reason timer
>>> interrupts are relevant is that they're generated inside the APIC,
>>> so that even when no interrupts are routed to the core, the APIC could
>>> still generate one as it's being shut down.
>>>
>>>> Clearly, this code does not seem to be doing what we are expecting.
>>>> Speaking to Marc Z. (the GIC maintainer in Linux), there is no need to
>>>> disable the GIC CPU interface in the hypervisor/OS. You are going to
>>>> shut down the CPU and it will be reset when it comes back.
>>>
>>>
>>
>> I don't think this assumption is guaranteed to be correct. In the
>> current implementation of ATF, and if no additional security software
>> runs, the assumption would likely be correct, but it wouldn't be
>> correct in general. Linux does disable GICC in such a scenario.
>
> Can you please give a pointer to code in Linux? The only place I see the
> GICC disabled is when using versatile TC2. It looks like it is just because
> they don't have PSCI support so Linux has to do the power management. That's
> not a platform we are ever going to fully support in Xen.
>

You already found it. Good to clarify, thanks.

>> So changing this could cause problems in some scenarios, while keeping
>> it does no harm.
>
>
> While I guess this code does no harm, it does not do what is expected
> (i.e. draining the interrupts). I can't see any reason to keep wrong
> code; we should really aim to have code that matches the architecture.
> And it is better to fix it when we discover the problem rather than
> wait until we rediscover it later.
>
> So at least a patch to update the code/comment should be done.
>

I don't feel comfortable removing these 3 lines because I have no way
to test and guarantee that the change will not introduce any issues.
However, if you still want me to remove these lines within this
series, I don't have a problem doing that in a separate patch. Please
just confirm the plan.

Thanks,
Mirela

>> Thanks for the feedback, I really appreciate it.
>
>
> Cheers,
>
> --
> Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

