
Re: [Xen-devel] [RFC PATCH 21/24] ARM: vITS: handle INVALL command



Hi,

On 07/12/16 20:20, Stefano Stabellini wrote:
> On Tue, 6 Dec 2016, Julien Grall wrote:
>> On 06/12/2016 22:01, Stefano Stabellini wrote:
>>> On Tue, 6 Dec 2016, Stefano Stabellini wrote:
>>>> moving a vCPU with interrupts assigned to it is slower than moving a
>>>> vCPU without interrupts assigned to it. You could say that the
>>>> slowness is directly proportional to the number of interrupts assigned
>>>> to the vCPU.
>>>
>>> To be pedantic, by "assigned" I mean that a physical interrupt is routed
>>> to a given pCPU and is set to be forwarded to a guest vCPU running on it
>>> by the _IRQ_GUEST flag. The guest could be dom0. Upon receiving one of
>>> these physical interrupts, a corresponding virtual interrupt (could be a
>>> different irq) will be injected into the guest vCPU.
>>>
>>> When the vCPU is migrated to a new pCPU, the physical interrupts that
>>> are configured to be injected as virtual interrupts into the vCPU are
>>> migrated with it. The physical interrupt migration has a cost. However,
>>> receiving physical interrupts on the wrong pCPU has a higher cost.
>>
>> I don't understand why it is a problem for you to receive the first interrupt
>> on the wrong pCPU and to move it afterwards if necessary.
>>
>> While this may have a higher cost (I don't believe so) for the first received
>> interrupt, migrating thousands of interrupts at the same time is very
>> expensive and will likely get Xen stuck for a while (think about an ITS with
>> a single command queue).
>>
>> Furthermore, the current approach will move every single interrupt routed to
>> the vCPU, even the disabled ones. That's pointless and a waste of resources.
>> You may argue that we can skip the disabled ones, but in that case what would
>> be the benefit of migrating the IRQs while migrating the vCPUs?
>>
>> So I would suggest spreading it over time. This also means less headache
>> for the scheduler developers.
> 
> The most important aspect of interrupt handling in Xen is latency,
> measured as the time between Xen receiving a physical interrupt and the
> guest receiving it. This latency should be both small and deterministic.
> 
> We all agree so far, right?
> 
> 
> The issue with spreading interrupt migrations over time is that it makes
> interrupt latency less deterministic. It is OK, in the uncommon case of
> vCPU migration with interrupts, to take a hit for a short time. This
> "hit" can be measured. It can be known. If your workload cannot tolerate
> it, vCPUs can be pinned. It should be a rare event anyway. On the other
> hand, by spreading interrupt migrations, we make it harder to predict
> latency. Aside from determinism, another problem with this approach is
> that it ensures that every interrupt assigned to a vCPU will first hit
> the wrong pCPU, then it will be moved. It guarantees the worst-case
> scenario for interrupt latency for the vCPU that has been moved. If we
> migrated all interrupts as soon as possible, we would minimize the
> number of interrupts delivered to the wrong pCPU. Most interrupts would
> be delivered to the new pCPU right away, reducing interrupt latency.

So if this is such a crucial issue, why don't we put the ITS to good use
this time? The ITS hardware probably supports 16 bits worth of
collection IDs, so what about assigning each vCPU (in every guest) a
unique collection ID on the host and doing a MAPC & MOVALL on a vCPU
migration to make it point to the right physical redistributor?
I see that this does not cover all use cases (> 65536 vCPUs, for
instance), and it also depends on many implementation details:
- How costly is a MOVALL? It needs to scan the pending table and
transfer set bits to the other redistributor, which may take a while.
- Is there an impact if we exceed the number of hardware-backed
collections (GITS_TYPER.HCC)? If the ITS is forced to access system
memory for every table lookup, this may slow down everyday operations.
- How likely are those misdirected interrupts in the first place? How
often do we migrate vCPUs compared to the interrupt frequency?

There are more subtle parameters to consider, so I guess we just need
to try it and measure.

> Regardless of how we implement interrupt migrations on ARM, I think it
> still makes sense for the scheduler to know about it. I realize that
> this is a separate point. Even if we spread interrupt migrations over
> time, it still has a cost, in terms of latency as I wrote above, but also
> in terms of interactions with interrupt controllers and ITSes. A vCPU
> with no interrupts assigned to it poses no such problems. The scheduler
> should be aware of the difference. If the scheduler knew, I bet that
> vCPU migration would be a rare event for vCPUs that have many interrupts
> assigned to them. For example, Dom0 vCPU0 would never be moved, and
> dom0_pin_vcpus would be superfluous.

That's a good point, so indeed the "interrupt load" should be a
scheduler parameter. But as you said: that's a different story.

Cheers,
Andre.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

