
Re: [Xen-devel] [RFC PATCH 21/24] ARM: vITS: handle INVALL command



On Fri, 2 Dec 2016, Andre Przywara wrote:
> Hi,
> 
> sorry for chiming in late ....
> 
> I've been spending some time thinking about this, and I think we can in
> fact get away without ever propagating commands from domains to the host.
> 
> I made a list of all commands that possibly require host ITS command
> propagation. There are two groups:
> 1: enabling/disabling LPIs: INV, INVALL
> 2: mapping/unmapping devices/events: DISCARD, MAPD, MAPTI.
> 
> The second group can be handled by mapping all required devices up
> front; I will elaborate on that in a different email.
> 
> For the first group, read below ...
> 
> On 01/12/16 01:19, Stefano Stabellini wrote:
> > On Fri, 25 Nov 2016, Julien Grall wrote:
> >> Hi,
> >>
> >> On 18/11/16 18:39, Stefano Stabellini wrote:
> >>> On Fri, 11 Nov 2016, Stefano Stabellini wrote:
> >>>> On Fri, 11 Nov 2016, Julien Grall wrote:
> >>>>> On 10/11/16 20:42, Stefano Stabellini wrote:
> >>>>> That's why the approach we had in the previous series was "host ITS
> >>>>> commands should be limited when emulating guest ITS commands". From
> >>>>> what I recall, in that series the host and guest LPIs were fully
> >>>>> separated (enabling a guest LPI was not enabling a host LPI).
> >>>>
> >>>> I am interested in reading what Ian suggested to do when the physical
> >>>> ITS queue is full, but I cannot find anything specific about it in the
> >>>> doc.
> >>>>
> >>>> Do you have a suggestion for this?
> >>>>
> >>>> The only things that come to mind right now are:
> >>>>
> >>>> 1) check if the ITS queue is full and busy-loop until it is not
> >>>> (spin_lock style)
> >>>> 2) check if the ITS queue is full and sleep until it is not (mutex style)
> >>>
> >>> Another, probably better, idea is to map all pLPIs of a device when the
> >>> device is assigned to a guest (including Dom0). This is what was written
> >>> in Ian's design doc. The advantage of this approach is that Xen doesn't
> >>> need to take any actions on the physical ITS command queue when the
> >>> guest issues virtual ITS commands, therefore completely solving this
> >>> problem at the root. (Although I am not sure about enable/disable
> >>> commands: could we avoid issuing enable/disable on pLPIs?)
> >>
> >> In the previous design document (see [1]), the pLPIs are enabled when the
> >> device is assigned to the guest. This means that it is not necessary to
> >> send commands there. It also means we may receive a pLPI before the
> >> associated vLPI has been configured.
> >>
> >> That said, given that LPIs are edge-triggered, there is no deactivate
> >> state (see 4.1 in ARM IHI 0069C). So as soon as the priority drop is
> >> done, the same LPI could potentially be raised again. This could
> >> generate a storm.
> > 
> > Thank you for raising this important point. You are correct.
> >
> >> The priority drop is necessary if we don't want to block the reception of
> >> interrupt for the current physical CPU.
> >>
> >> What I am more concerned about is that this problem can also happen
> >> during normal operation (i.e. the pLPI is bound to a vLPI and the vLPI
> >> is enabled). For edge-triggered interrupts, there is no way to prevent
> >> them from firing again. Maybe it is time to introduce interrupt
> >> rate-limiting for ARM. Any opinions?
> > 
> > Yes. It could be as simple as disabling the pLPI when Xen receives a
> > second pLPI before the guest EOIs the first corresponding vLPI, which
> > shouldn't happen in normal circumstances.
> > 
> > We need a simple per-LPI inflight counter, incremented when a pLPI is
> > received, decremented when the corresponding vLPI is EOIed (the LR is
> > cleared).
> > 
> > When the counter > 1, we disable the pLPI and request a maintenance
> > interrupt for the corresponding vLPI.
> 
> So why do we need a _counter_? This is about edge-triggered interrupts,
> so I think we can just accumulate all of them into one.

The counter is not to re-inject the same amount of interrupts into the
guest, but to detect interrupt storms.
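
To make this concrete, here is a minimal sketch of the counter scheme in
C. Everything below is illustrative: its_disable_plpi(),
its_enable_plpi() and gic_request_maintenance() are made-up names for
whatever the host ITS driver ends up providing.

/* Per-pLPI state; 'inflight' counts pLPIs received but not yet
 * EOIed by the guest via the corresponding vLPI. */
struct lpi_info {
    atomic_t inflight;
    uint32_t virt_lpi;
    /* ... domain, vcpu, ... */
};

/* Called from the host LPI handler, before injecting the vLPI. */
static void vlpi_storm_check(struct lpi_info *lpi, unsigned int plpi)
{
    if ( atomic_inc_return(&lpi->inflight) > 1 )
    {
        /* Second pLPI before the guest EOIed the first vLPI:
         * looks like a storm. Mask the pLPI and ask for a
         * maintenance interrupt on the corresponding LR. */
        its_disable_plpi(plpi);           /* made-up helper */
        gic_request_maintenance(lpi);     /* made-up helper */
    }
}

/* Called when the LR sync shows the guest EOIed the vLPI; in the
 * normal (non-storm) case the pLPI is still enabled and the
 * re-enable is a no-op. */
static void vlpi_eoi(struct lpi_info *lpi, unsigned int plpi)
{
    if ( atomic_dec_return(&lpi->inflight) == 0 )
        its_enable_plpi(plpi);            /* made-up helper */
}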


> So here is what I think:
> - We use the guest-provided pending table to hold a pending bit for each
> VLPI. We can unmap the memory from the guest, since software is not
> supposed to access this table as per the spec.
> - We use the guest-provided property table, without trapping it. There
> is nothing to be "validated" in that table, since it's a really tight
> encoding and every value written in there is legal. We only look at bit
> 0 for this exercise here anyway.

I am following...
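
To make sure we mean the same thing by "quick": per the GICv3 spec the
property table holds one byte per LPI with bit 0 as the enable bit, so
the check is a single byte load. A sketch (in the real table the index
is INTID - 8192; I ignore that offset here):

#define LPI_PROP_ENABLED    (1U << 0)   /* bit 0 of each property byte */

/* 'prop' is the guest-provided property table, one byte per vLPI. */
static bool vlpi_is_enabled(const uint8_t *prop, uint32_t vlpi)
{
    return prop[vlpi] & LPI_PROP_ENABLED;
}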


> - Upon reception of a physical LPI, we look it up to find the VCPU and
> virtual LPI number. This is what we need to do anyway and it's a quick
> two-level table lookup at the moment.
> - We use the VCPU's domain and the VLPI number to index the guest's
> property table and read the enabled bit. Again a quick table lookup.

They should both be O(1) (just a couple of table lookups each), correct?
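
Just so we agree on the cost, I am assuming a two-level lookup along
these lines (the union layout and all names are illustrative, not the
actual data structures):

/* Two-level host-LPI -> (domain, vcpu, vLPI) translation table. */
union host_lpi {
    uint64_t data;
    struct {
        uint32_t virt_lpi;
        uint16_t dom_id;
        uint16_t vcpu_id;
    };
};

#define HOST_LPIS_PER_PAGE  (PAGE_SIZE / sizeof(union host_lpi))

static union host_lpi **lpi_data;   /* first level, allocated at boot */

static union host_lpi *host_lpi_entry(uint32_t plpi)
{
    uint32_t idx = plpi - 8192;     /* LPI INTIDs start at 8192 */

    return &lpi_data[idx / HOST_LPIS_PER_PAGE][idx % HOST_LPIS_PER_PAGE];
}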


>  - If the VLPI is enabled, we EOI it on the host and inject it.
>  - If the VLPI is disabled, we set the pending bit in the VCPU's
>    pending table and EOI on the host - to allow other IRQs.
> - On a guest INV command, we check whether that vLPI is now enabled:
>  - If it is disabled now, we don't need to do anything.
>  - If it is enabled now, we check the pending bit for that VLPI:
>   - If it is 0, we don't do anything.
>   - If it is 1, we inject the VLPI and clear the pending bit.
> - On a guest INVALL command, we just need to iterate over the virtual
> LPIs.

Right, much better.
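
In pseudo-C, I read the rules above as follows. This is a sketch only:
vgic_vcpu_inject_irq() is the existing injection path, but the arch.*
fields and the two pending-state helpers are made up, with the pending
table layout deliberately left open:

/* Layout of the pending state is TBD; see below. */
bool vlpi_test_and_clear_pending(struct vcpu *v, uint32_t vlpi);
void vlpi_set_pending(struct vcpu *v, uint32_t vlpi);

/* Reception of a pLPI, after the lookup gave us (v, vlpi). The
 * caller EOIs on the host in both cases to unblock other IRQs. */
static void vlpi_deliver(struct vcpu *v, uint32_t vlpi)
{
    if ( vlpi_is_enabled(v->domain->arch.prop_table, vlpi) )
        vgic_vcpu_inject_irq(v, vlpi);
    else
        vlpi_set_pending(v, vlpi);
}

/* Guest INV: the only immediate action is when the vLPI is enabled
 * _and_ its pending bit was set while it was disabled. */
static void vits_handle_inv(struct vcpu *v, uint32_t vlpi)
{
    if ( !vlpi_is_enabled(v->domain->arch.prop_table, vlpi) )
        return;                             /* nothing to do */

    if ( vlpi_test_and_clear_pending(v, vlpi) )
        vgic_vcpu_inject_irq(v, vlpi);      /* was pending: inject */
}

/* Guest INVALL: apply the same check to every mapped vLPI. */
static void vits_handle_invall(struct vcpu *v)
{
    uint32_t vlpi;

    for ( vlpi = 0; vlpi <= v->domain->arch.max_vlpi; vlpi++ )
        vits_handle_inv(v, vlpi);
}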


> If you look at the conditions above, the only immediate action is
> when a VLPI gets enabled _and_ its pending bit is set. So we can do
> 64-bit read accesses over the whole pending table to find non-zero
> words, and thus set bits, which should be rare in practice. We can
> store the highest mapped VLPI to avoid iterating over the whole table.
> Ideally the guest has no direct control over the pending bits, since
> this is what the device generates. Also we limit the total number of
> VLPIs per guest anyway.

I wonder if we could even use a fully packed bitmask with only the
pending bits, so 1 bit per vLPI, rather than 1 byte per vLPI. That would
be a nice improvement.
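
With that packed layout, the word-at-a-time INVALL scan could be as
simple as the following (again a sketch; vits_handle_inv() is from my
fragment above, pend_table is assumed to be the 1-bit-per-vLPI bitmap):

/* Scan the pending bitmap 64 bits at a time; only words with at
 * least one set bit need per-bit processing, which should be rare. */
static void vits_scan_pending(struct vcpu *v)
{
    const uint64_t *pend = v->arch.pend_table;
    uint32_t nr_words = (v->domain->arch.max_vlpi + 64) / 64;
    uint32_t w;

    for ( w = 0; w < nr_words; w++ )
    {
        uint64_t word = pend[w];

        while ( word )
        {
            uint32_t vlpi = w * 64 + __builtin_ctzll(word);

            vits_handle_inv(v, vlpi);   /* inject if now enabled */
            word &= word - 1;           /* clear lowest set bit */
        }
    }
}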


> If that still sounds like a DoS vector, we could additionally rate-limit
> INVALLs, and/or track additions to the pending table after the last
> INVALL: if there haven't been any new pending bits since the last scan,
> INVALL is a NOP.
> 
> Does that make sense so far?

It makes sense. It should be OK.
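
The "no new pending bits since the last scan" tracking only needs a
dirty flag next to the table. Reworking vlpi_set_pending() and
vits_handle_invall() from my earlier fragments (pend_dirty is another
made-up field):

/* Record a pending bit for a disabled vLPI and mark the table
 * dirty; the flag is cleared once a full INVALL scan has run. */
void vlpi_set_pending(struct vcpu *v, uint32_t vlpi)
{
    if ( !test_and_set_bit(vlpi, v->arch.pend_table) )
        v->arch.pend_dirty = true;
}

static void vits_handle_invall(struct vcpu *v)
{
    if ( !v->arch.pend_dirty )
        return;                   /* nothing new: INVALL is a NOP */

    v->arch.pend_dirty = false;
    vits_scan_pending(v);         /* the word-at-a-time scan above */
}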


> So that just leaves us with this IRQ storm issue, which I am thinking
> about now. But I guess this is not a show-stopper, given that we can
> disable the physical LPI if we sense this situation.

That is true and it's exactly what we should do.


> > When we receive the maintenance interrupt and we clear the LR of the
> > vLPI, Xen should re-enable the pLPI.
> > Given that the state of the LRs is sync'ed before calling gic_interrupt,
> > we can be sure to know exactly what state the vLPI is in at any given
> > time. But for this to work correctly, it is important to configure the
> > pLPI to be delivered to the same pCPU running the vCPU which handles
> > the vLPI (as it is already the case today for SPIs).
> 
> Why would that be necessary?

Because the state of the LRs of other pCPUs won't be up to date: we
wouldn't know for sure whether the guest EOI'ed the vLPI or not.
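
Concretely, something like this on the maintenance interrupt path (a
sketch; lr_is_invalid() stands in for whatever LR query we end up with,
and it is only correct because the pLPI follows the vCPU's pCPU):

/* Maintenance interrupt: the LRs were just sync'ed, so if the guest
 * has EOIed the stormy vLPI we can safely unmask its pLPI. */
static void vlpi_maintenance(struct vcpu *v, struct lpi_info *lpi,
                             unsigned int plpi)
{
    ASSERT(v->processor == smp_processor_id());

    if ( lr_is_invalid(v, lpi->virt_lpi) )   /* made-up LR query */
    {
        atomic_set(&lpi->inflight, 0);
        its_enable_plpi(plpi);               /* made-up, as before */
    }
}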
