[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v2 4/4] The locking order is: first rank lock, then vgic lock. The order is respected everywhere, except for gic_update_one_lr.



Hi Stefano,

On 16/01/2017 19:10, Stefano Stabellini wrote:
On Mon, 16 Jan 2017, Julien Grall wrote:
Hi Stefano,

On 03/01/17 22:51, Stefano Stabellini wrote:
On Wed, 28 Dec 2016, Julien Grall wrote:
Hi Stefano,

On 22/12/16 02:15, Stefano Stabellini wrote:
gic_update_one_lr is called with the vgic lock held, but it calls
vgic_get_target_vcpu, which tries to obtain the rank lock. This can
cause deadlocks.

We already have a version of vgic_get_target_vcpu that doesn't take the
rank lock: __vgic_get_target_vcpu.

Solve the lock inversion problem, by not taking the rank lock in
gic_update_one_lr (calling __vgic_get_target_vcpu instead of
vgic_get_target_vcpu).  This is safe, because vcpu target modifications
are protected by the same vgic vcpu lock.

I maintain what I said on the first version of this patch. All this patch
could be simplified by moving the call to irq_set_affinity in
vgic_irq_migrate.

There are no restriction to do that and no need to have any lock taken but
the
rank lock.

All right. I'll submit a patch that does exactly that. It is not
perfect, because it drops interrupts in the problematic scenario, but it
should be a good place to start for a technical discussion.

I think I have missed something. What is the problematic scenario you are
describing? Looking at the v3, I don't see any mention of this problem. Does
it mean it has been solved?

No, it hasn't been solved. Sorry for not having been clearer here. The
issue is described in the commit message of
http://marc.info/?l=xen-devel&m=148348625720995:

  After migrating an interrupt from vcpu/pcpu 0 to vcpu/pcpu 1, it is
  possible to receive a physical interrupt on pcpu 1, which Xen is
  supposed to inject into vcpu 1, before the LR on pcpu 0 has been
  cleared. In this case the irq is still marked as
  GIC_IRQ_GUEST_MIGRATING, and struct pending_irq is still "inflight" on
  vcpu 0. As the irq cannot be "inflight" on vcpu 0 and vcpu 1
  simultaneously, drop the interrupt.

This is not the right behavior: the interrupt should be injected into
vcpu 1.

I haven't yet reviewed v3, hence why I missed this paragraph, sorry. As you suggested on an earlier mail, I will start the discussion there.

Cheers,

--
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.