[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] ITS emulation race conditions



On Mon, 10 Apr 2017, Andre Przywara wrote:
> Hi,
> 
> On 10/04/17 00:12, André Przywara wrote:
> > Hi,
> > 
> > I wanted to run some ideas on how to prevent the race conditions we are
> > facing with the ITS emulation and removing devices and/or LPIs.
> > I think Stefano's idea of tagging a discarded LPI is the key, but still
> > some details are left to be solved.
> > I think we are dealing with two issues:
> > 1) A guest's DISCARD can race with in incoming LPI.
> > Ideally DISCARD would remove the host LPI -> vLPI connection (in the
> > host_lpis table), so any new incoming (host) LPI would simply be
> > discarded very early in gicv3_do_LPI() without ever resolving into a
> > virtual LPI. Now while this removal is atomic, we could have just missed
> > an incoming LPI, so the old virtual LPI would still traverse down the
> > VGIC with a "doomed" virtual LPI ID.
> > I wonder if that could be solved by a "crosswise" check:
> > - The DISCARD handler *first* tags the pending_irq as "UNMAPPED", *then*
> > removes the host_lpis connectin.
> > - do_LPI() reads the host_lpis() array, *then* checks the UNMAPPED tag
> > in the corresponding pending_irq (or lets vgic_vcpu_inject_irq() do that).
> > With this setup the DISCARD handler can assume that no new virtual LPI
> > instances enter the VGIC afterwards.
> >
> > 2) An unmapped LPI might still be in use by the VGIC, foremost might
> > still be in an LR.
> > Tagging the pending_irq should solve this in general. Whenever a VGIC
> > function finds the UNMAPPED tag, it does not process the vIRQ, but
> > retires it. For simplicity we might limit this to handling when a VCPU
> > exists and an LR gets cleaned up: If we hit a tagged pending_irq here,
> > we clean the LR, remove the pending_irq from all lists and signal the
> > ITS emulation that this pending_irq is now ready to be removed (by
> > calling some kind of cleanup_lpi() function, for instance).
> > The ITS code can then remove the struct pending_irq from the radix tree.
> > MAPD(V=0) is now using this to tag all still mapped events as
> > "UNMAPPED", the counting down the "still alive" pending_irqs in
> > cleanup_lpi() until they reach zero. At this point it should be safe to
> > free the pend_irq array.
> > 
> > Does that sound like a plan?
> > Do I miss something here? Probably yes, hence I am asking ;-)
> 
> So there are two issues with this:
> - For doing the LPI cleanup, we would need to call a virtual ITS or
> virtual LPI specific function directly from gic.c. This looks like bad
> coding style, as it breaks the abstraction (calling a virtual LPI/ITS
> specific function from within gic.c)

This is just code organization, I am not worried about it. We might have
to register cleanup functions. The real problem to solve is below.


> - If we have a "DISCARD; MAPTI" sequence, targeting the same vLPI, while
> this vLPI is still in an LR (but not cleaned up until both commands have
> been handled), we end up with using the wrong pending_irq struct (the
> new one). (I assume the UNMAPPED tag would be cleared upon the new MAPTI).

It looks like "DISCARD; MAPTI" would be a problem even if we go with
"GIC: Add checks for NULL pointer pending_irq's", because instead of a
NULL pending_irq, we could get the new pending_irq, the one backing the
same vlpi but a different eventid.


> Can this create problems?
> I see two possibilities:
> a) the old vLPI is still pending in the LR: as the new LR would be
> pristine, the code in update_one_lr() would just clean the QUEUED bit,
> but not do anything further. The LR would be kept as used by this vLPI,
> with the state still set to GICH_LR_PENDING. Upon re-entering the VCPU
> this would make the new vLPI pending.
> b) the old vLPI has been EOIed: the new LR would be pristine, the code
> in update_one_lr() would try to clean it up anyway, which should not
> hurt, AFAICT.

This cannot be allowed to happen. We have to keep a consistent internal
state at all times: we cannot change the vlpi mapping in pend_lpi_tree
before the old vlpi is completed discarded. We cannot have a vlpi marked
as "UNMAPPED", but still alive in an LR, while vlpi_to_pending already
returns the new vlpi.

We have a number of possible approaches, I'll mark them with [letter].


[a] On DISCARD we do everything we can to remove the vlpi from
everywhere:

- if pending_irq is in lr_queue, remove it from the list
- if the vlpi is in an LR register as pending (not active, see below),
  remove it from the LR (see the code in gic_restore_pending_irqs, under
  "No more free LRs: find a lower priority irq to evict")
- remove pending_irq from inflight
- remove pending_irq from pend_lpi_tree

At this stage there should be no more traces of the vlpi/pending_irq
anywhere.

What do we do if a "DISCARD; MAPTI" pair of commands is issued while the
old vlpi is still ACTIVE in an LR?  ACTIVE means that the guest is still
handling the irq and hasn't even EOIed it yet. I know that physical LPIs
have no active state, but what about virtual LPIs? I am tempted to say
that in any case for simplicity we could just remove the vlpi from the
LR ("evict") anyway. I don't think this case should happen with a well
behaved guest anyhow.

But the other issue is that "DISCARD, MAPTI" could be done on a
different vcpu, compared to the one having the old vlpi in an LR. In
this case, we would have to send an SGI to the right pcpu and clear the
LR, which is a royal pain in the neck because the other vcpu could not
even be running. Also another MAPTI could come in on a different pcpu
while we haven't completed the first LR removal. There might be races.
Let's explore other options.


[b] On DISCARD we would have to:
- if pending_irq is in lr_queue, remove it from the list
- if the vlpi is in an LR register:
  - mark as UNMAPPED (this check can be done from another vcpu)
  - remove from inflight (it could also be done in gic_update_one_lr)

Then when the guest EOIs the interrupt, in gic_update_one_lr:
- clear LR
- if UNMAPPED
  - clear UNMAPPED and return
  - remove old pending_irq from pend_lpi_tree (may not be necessary?)


Now the problem with this approach, like you pointed out, is: what do we
do if the guest issues a MAPTI before EOIing the interrupt?


[c] The simplest thing to do is ignore it (or report error back to the
guest) and printk a warning if the old vlpi is still UNMAPPED. It could
be a decent solution for Xen 4.9.


[d] Otherwise, we can try to handle it properly. On DISCARD:
- if pending_irq is in lr_queue, remove it from the list
- if vlpi is in LR:
  - mark as UNMAPPED
  - remove from inflight

On MAPTI:
if old pending_irq is UNMAPPED:
  - clear UNMAPPED in old pending_irq
  - add UNMAPPED to new pending_irq
- remove old pending_irq from pend_lpi_tree
- add new pending_irq to pend_lpi_tree

Then when the guest EOIs the interrupt, in gic_update_one_lr:
- clear LR
- if UNMAPPED
  - clear UNMAPPED and return


Among these approaches, [b]+[c] would be OK for 4.9. Or [d]. [a] looks
simple but it is actually difficult to do correctly.


> So I would like to know how to proceed here:
> I) stick to the tag, and fix that particular case above by checking for
> the "inflight && ENABLED" condition and clearing the LR if the
> pending_irq does not state it's pending anymore
> II) introduce a NO_LONGER_PENDING tag in addition to the UNMAPPED tag,
> where a new MAPTI just clears UNMAPPED_LPI, but keeps NO_LONGER_PENDING.
> NO_LONGER_PENDING would be checked and cleared by update_one_lr(),
> UNMAPPED by vgic_vcpu_inject_irq().
> That would leave the issue how to call the pend_irq_cleanup() function
> from update_one_lr() without breaking abstraction

If MAPTI is called with a different eventid but the same vlpi, wouldn't
vlpi_to_pending return the wrong pending_irq struct in
gic_update_one_lr, causing problems to both I) and II)? The flags would
not be set.


> III) Revert to the "check for NULL" solution.

Wouldn't irq_to_pending potentially return the new pending_irq instead
of NULL in gic_update_one_lr? This is a problem. For example, the new
pending_irq is not in the inflight list, but gic_update_one_lr will try
to remove it from it.

I think this approach is error prone: we should always get the right
pending_irq struct, or one that is appropriately marked with a flag, or
NULL. In this case, we would still have to mark the new pending_irq as
"UNMAPPED" to tell gic_update_one_lr to do nothing and return. It would
end up very similar to [d]. We might as well do [d].
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.