[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] arch: arm: vgic-v3: fix GICD_ISACTIVER range



Hi Andre,

On 12/11/2019 12:46, Andre Przywara wrote:
On Mon, 11 Nov 2019 11:01:07 -0800 (PST)
Stefano Stabellini <sstabellini@xxxxxxxxxx> wrote:
On Sat, 9 Nov 2019, Julien Grall wrote:
On Sat, 9 Nov 2019, 04:27 Stefano Stabellini, <sstabellini@xxxxxxxxxx> wrote:
       On Thu, 7 Nov 2019, Peng Fan wrote:
       > The end should be GICD_ISACTIVERN not GICD_ISACTIVER.
       >
       > Signed-off-by: Peng Fan <peng.fan@xxxxxxx>

       Reviewed-by: Stefano Stabellini <sstabellini@xxxxxxxxxx>


To be honest, I am not sure the code is correct. A read to those registers 
should tell you the list of interrupts active. As we always
return 0, this will not return the correct state of the GIC.

I know that returning the list of actives interrupts is complicated with the 
old vGIC, but I don't think silently ignoring it is a good
idea.
The question here is why the guest accessed those registers? What is it trying 
to figure out?

I see Linux querying the active state (IRQCHIP_STATE_ACTIVE) at two relevant 
points for ARM:
- In kernel/irq/manage.c, in __synchronize_hardirq().
- In KVM's arch timer emulation code.

I think the latter is of no concern (yet), but the first might actually 
trigger. At the moment it's beyond me what it actually does, but maybe some IRQ 
changes (RT, threaded IRQs?) trigger this now?\
It happens I am sitting right next to Marc now, so I had a chat with him about this :). Let me try to summarize the discussion here.

__synchronize_hardirq() is used to ensure that all active interrupts have been handled before continuing. When sync_chip == false, we only ensure that all in progress interrupts (from Linux PoV) are handled before continue.

sync_chip == true will additionally ensure that any interrupt that were acknowledge but not yet marked as in progress by the kernel are also handled. The assumption is this is called after the interrupt has been masked/disabled.

The call to __synchronize_hardirq() in free_irq() (as reported by Peng stack trace) was introduced recently (see [1]). It is not entirely clear whether this would affect anyone using Linux 5.4 or just a limited subset of users.

Anyhow, this is a genuine bug and I think returning 0 is only papering over the bug with no long-term resolution. As Marc pointed out, Even the old vGIC in KVM was not spec compliant and thankfully this was fixed in the new vGIC.

As I said in a different sub-thread, I would reluctanly be ok to see returning 0 as long as we have add a warning for *every* access. Long-term, the current vGIC should really get killed.

Cheers,

[1] 62e0468650c30f0298822c580f382b16328119f6 "genirq: Add optional hardware synchronization for shutdown"

--
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.