
Re: [Xen-devel] [PATCH] xen/arm: fix rank/vgic locks inversion bug



On Mon, 19 Dec 2016, Julien Grall wrote:
> Hi Stefano,
> 
> On 19/12/2016 23:30, Stefano Stabellini wrote:
> > On Mon, 19 Dec 2016, Julien Grall wrote:
> > > > > 2) We run gic_update_one_lr and vgic_store_itargetsr in parallel
> > > > > safely and locklessly. There might be a way to do it, but it is not
> > > > > easy; I haven't found it yet.
> > > 
> > > Correct me if I am wrong. There is no restriction on writing into
> > > IROUTER/ITARGETSR while an IRQ is pending. So irq_set_affinity could be
> > > called once at the beginning of vgic_irq_migrate.
> > > 
> > > We may receive the interrupt on the wrong physical CPU at the beginning,
> > > but it would be the same when writing into IROUTER/ITARGETSR.
> > > 
> > > This would remove the need to get the rank lock in gic_update_one_lr.
> > > 
> > > Did I miss anything?
> > 
> > I am not sure if we can do that: the virtual interrupt might not be
> > EOI'd yet at that time. The guest EOI is used to deactivate the
> > corresponding physical interrupt. Migrating the physical IRQ at that
> > time could have unintended consequences. I am not sure what the spec
> > says about this, but even if it is correct on paper, I would prefer not
> > to put it to the test: I bet that not many interrupt controllers have
> > been heavily tested in this scenario. What do you think?
> 
> I don't think this is an issue. An interrupt could have its priority drop
> happen on pCPU A and be deactivated on pCPU B if vCPU A has been migrated
> while the interrupt is inflight. So if it is fine here, why would it not be
> when the guest is specifically requesting the routing?

That is true, but this is not exactly the same: here we would be changing
the physical irq affinity while both the physical and virtual irqs are
still active. As I wrote, operating systems usually only change affinity
after deactivating an irq, so I thought it would be wise in Xen to wait
at least for the EOI. If we tried to inject the same virtual interrupt on
a different vcpu while the interrupt is still inflight, we could get into
trouble. But we could avoid that by testing for GIC_IRQ_GUEST_MIGRATING
in vgic_vcpu_inject_irq, along the lines sketched below. Maybe I am
worrying about nothing.
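Something like this is what I have in mind (just a sketch; the deferral
path is a guess, it might have to re-queue the irq and let whoever clears
MIGRATING inject it later):

    /* Early in vgic_vcpu_inject_irq(), under the target vcpu's vgic
     * lock.  If the virq is still marked as migrating, do not add it to
     * the new vcpu's inflight list yet. */
    struct pending_irq *n = irq_to_pending(v, virq);

    if ( test_bit(GIC_IRQ_GUEST_MIGRATING, &n->status) )
    {
        /* Remember that the guest wants the interrupt... */
        set_bit(GIC_IRQ_GUEST_QUEUED, &n->status);
        /* ...but leave the actual injection to whoever completes the
         * migration and clears GIC_IRQ_GUEST_MIGRATING. */
        spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
        return;
    }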


> > Otherwise, I think it is very reasonable to store the vcpu id (or the
> > pcpu id) in struct pending_irq: we already store the lr register number
> > there. The current code can tell in which lr register an interrupt has
> > been written, but it cannot tell to which cpu that lr register belongs.
> > It's a paradox. Once we know the vcpu id for any inflight irq, we can
> > make sure to take the right vcpu.vgic lock from vgic_migrate_irq.
> 
> This is a good point. I was concerned about the size of pending_irq (we have
> to allocate one per IRQ), but it looks like we have some padding in the
> structure between priority and inflight.

That's right, that's exactly the padding I intended to use. This solution
comes at no cost, but we would have to remember that it is the vcpu vgic
lock, rather than the rank lock, that protects reads of rank->vcpu.
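To make the idea concrete, something roughly like this (the field name
and the locking sequence in vgic_migrate_irq below are only a sketch, not
a worked-out patch):

    /* xen/include/asm-arm/vgic.h: the new field fits in the existing
     * padding between 'priority' and 'inflight'. */
    struct pending_irq
    {
        /* ... existing fields ... */
        uint8_t lr;
        uint8_t priority;
        uint8_t lr_vcpu;    /* vcpu whose LR (if any) holds this irq */
        struct list_head inflight;
        /* ... */
    };

    /* In vgic_migrate_irq() (sketch): take the vgic lock of the vcpu
     * recorded in the pending_irq, so the rank lock is no longer needed
     * to find out where the irq currently lives. */
    struct vcpu *v_old = d->vcpu[p->lr_vcpu];
    unsigned long flags;

    spin_lock_irqsave(&v_old->arch.vgic.lock, flags);
    /* ... existing migration logic: check p->lr, set
     * GIC_IRQ_GUEST_MIGRATING, move the irq to the new vcpu, etc. ... */
    spin_unlock_irqrestore(&v_old->arch.vgic.lock, flags);

The new field would presumably have to be updated wherever the irq is
written into an LR (e.g. gic_raise_guest_irq / gic_update_one_lr), under
the vgic lock, so that vgic_migrate_irq always sees a consistent value.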

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

