
Re: [Xen-devel] [PATCH v2 09/27] ARM: GICv3: introduce separate pending_irq structs for LPIs



Hi,

On 24/03/17 11:40, Julien Grall wrote:
> Hi Andre
> 
> On 03/16/2017 11:20 AM, Andre Przywara wrote:
>> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
>> index 364d5f0..e5cfa54 100644
>> --- a/xen/arch/arm/vgic.c
>> +++ b/xen/arch/arm/vgic.c
>> @@ -30,6 +30,8 @@
>>
>>  #include <asm/mmio.h>
>>  #include <asm/gic.h>
>> +#include <asm/gic_v3_defs.h>
>> +#include <asm/gic_v3_its.h>
> 
> I really don't want to see gic_v3_* headers included in common code. Why
> do you need them?
> 
>>  #include <asm/vgic.h>
>>
>>  static inline struct vgic_irq_rank *vgic_get_rank(struct vcpu *v, int rank)
>> @@ -61,7 +63,7 @@ struct vgic_irq_rank *vgic_rank_irq(struct vcpu *v, unsigned int irq)
>>      return vgic_get_rank(v, rank);
>>  }
>>
>> -static void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq)
>> +void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq)
>>  {
>>      INIT_LIST_HEAD(&p->inflight);
>>      INIT_LIST_HEAD(&p->lr_queue);
>> @@ -244,10 +246,14 @@ struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int virq)
>>
>>  static int vgic_get_virq_priority(struct vcpu *v, unsigned int virq)
>>  {
>> -    struct vgic_irq_rank *rank = vgic_rank_irq(v, virq);
>> +    struct vgic_irq_rank *rank;
>>      unsigned long flags;
>>      int priority;
>>
>> +    if ( is_lpi(virq) )
>> +        return vgic_lpi_get_priority(v->domain, virq);
> 
> This would benefit from some comments to explain why LPI pending handling
> is a different path.
> 
>> +
>> +    rank = vgic_rank_irq(v, virq);
>>      vgic_lock_rank(v, rank, flags);
>>      priority = rank->priority[virq & INTERRUPT_RANK_MASK];
>>      vgic_unlock_rank(v, rank, flags);
>> @@ -446,13 +452,63 @@ bool vgic_to_sgi(struct vcpu *v, register_t sgir, enum gic_sgi_mode irqmode,
>>      return true;
>>  }
>>
>> +/*
>> + * Holding struct pending_irq's for each possible virtual LPI in each domain
>> + * requires too much Xen memory, also a malicious guest could potentially
>> + * spam Xen with LPI map requests. We cannot cover those with (guest allocated)
>> + * ITS memory, so we use a dynamic scheme of allocating struct pending_irq's
>> + * on demand.
>> + */
> 
> I am afraid this will not prevent a guest from using too much Xen memory.
> Let's imagine the guest is mapping thousands of LPIs but decides to
> never handle them, or is slow to do so. You would allocate a pending_irq
> for each LPI, and never release the memory.

So what would be the alternative here? In the current GICv2/3 scheme a
pending_irq for each (SPI) interrupt is allocated up front. The approach
here just tries to be a bit smarter for LPIs, since the expectation is
that we won't need nearly as many. In the worst case the memory
allocation would converge to the upper limit (the number of mapped LPIs),
which is what a static allocation scheme would have used from the start.
Which means we would never be worse off.
What is actually missing is freeing the memory upon domain destruction,
which can't be tested with Dom0.

> If we use dynamic allocation, we need a way to limit the number of LPIs
> received by a guest to avoid memory exhaustion. The only idea I have is
> an artificial limit, but I don't think it is good. Any other ideas?

I think if we are concerned about this, we shouldn't allow *DomUs* to
map too many LPIs in the first place; their number is bounded by the
DeviceID/EventID combinations anyway. This is under full control of Dom0,
which communicates the number of MSIs (events) for each passed-through
device, so we can easily impose a policy limit upon domain creation.
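
A rough sketch of such a policy check, just to illustrate the idea (this
is not part of the patch; the names vlpi_budget and vlpi_try_reserve, and
the place it would be called from, the virtual MAPTI handling, are made
up for the example):

    /* Hypothetical per-domain vLPI budget, filled in at domain creation
     * from the MSI (event) counts Dom0 reports for each passed-through
     * device. */
    struct vlpi_budget {
        uint32_t allowed;   /* maximum number of vLPIs the domain may map */
        uint32_t mapped;    /* number of vLPIs currently mapped */
    };

    /* Would be called from the virtual MAPTI handling before allocating
     * anything. Returns false once the policy limit is reached, so the
     * mapping can be refused instead of exhausting Xen memory. */
    static bool vlpi_try_reserve(struct vlpi_budget *b)
    {
        if ( b->mapped >= b->allowed )
            return false;
        b->mapped++;
        return true;
    }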

And just to be sure: we allocate at most one structure per event.
An MSI interrupt storm would still converge into one pending LPI
(struct), so it wouldn't trigger any further allocations.
Remapping LPIs won't increase that either, since we can only assign one
LPI to a DevID/EvID pair at any given point in time, and remapping
requires the pending bit to be cleared. Also the structures are not
indexed by the LPI number, but just take a free slot.
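
To illustrate that with the lpi_to_pending() quoted below (just a usage
sketch, the LPI number 8192 is an arbitrary example):

    struct pending_irq *p1 = lpi_to_pending(v, 8192, true); /* reuses or allocates a slot */
    struct pending_irq *p2 = lpi_to_pending(v, 8192, true); /* finds the very same slot */

    ASSERT(p1 == p2);   /* no second allocation for the same vLPI */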

For Dom0 I don't see any good ways of limiting the number of LPIs, but
we shouldn't need any.

>> +struct pending_irq *lpi_to_pending(struct vcpu *v, unsigned int lpi,
>> +                                   bool allocate)
>> +{
>> +    struct lpi_pending_irq *lpi_irq, *empty = NULL;
>> +
>> +    spin_lock(&v->arch.vgic.pending_lpi_list_lock);
>> +    list_for_each_entry(lpi_irq, &v->arch.vgic.pending_lpi_list, entry)
>> +    {
>> +        if ( lpi_irq->pirq.irq == lpi )
>> +        {
>> +            spin_unlock(&v->arch.vgic.pending_lpi_list_lock);
>> +            return &lpi_irq->pirq;
>> +        }
>> +
>> +        if ( lpi_irq->pirq.irq == 0 && !empty )
>> +            empty = lpi_irq;
>> +    }
>> +
>> +    if ( !allocate )
>> +    {
>> +        spin_unlock(&v->arch.vgic.pending_lpi_list_lock);
>> +        return NULL;
>> +    }
>> +
>> +    if ( !empty )
>> +    {
>> +        empty = xzalloc(struct lpi_pending_irq);
> 
> xzalloc will return NULL if memory is exhausted. There is a general lack
> of error checking within this series. Any missing error check could be a
> potential target for a guest, leading to a security issue. Stefano and I
> already spotted some, but that does not mean we found them all. So before
> sending the next version, please go through the series and verify *every*
> return value.
> 
> Also, I can't find the code which releases LPIs, either in this patch or
> anywhere else in this series. A general rule is to have allocation and
> free within the same patch; it makes it much easier to spot a missing free.

There is no such code, on purpose. We only grow the number, but never
shrink it (to what?, where to stop?, what if we need more again?). As
said above, in the worst case this ends up at the level a static
allocation would have started from in the first place.

And as mentioned, I only skipped the "free the whole list upon domain
destruction" part.

Cheers,
Andre.

>> +        vgic_init_pending_irq(&empty->pirq, lpi);
>> +        list_add_tail(&empty->entry, &v->arch.vgic.pending_lpi_list);
>> +    } else
>> +    {
>> +        empty->pirq.status = 0;
>> +        empty->pirq.irq = lpi;
>> +    }
>> +
>> +    spin_unlock(&v->arch.vgic.pending_lpi_list_lock);
>> +
>> +    return &empty->pirq;
>> +}
>> +
>>  struct pending_irq *irq_to_pending(struct vcpu *v, unsigned int irq)
>>  {
>>      struct pending_irq *n;
>> +
> 
> Spurious change.
> 
>>      /* Pending irqs allocation strategy: the first vgic.nr_spis irqs
>>       * are used for SPIs; the rests are used for per cpu irqs */
>>      if ( irq < 32 )
>>          n = &v->arch.vgic.pending_irqs[irq];
>> +    else if ( is_lpi(irq) )
>> +        n = lpi_to_pending(v, irq, true);
>>      else
>>          n = &v->domain->arch.vgic.pending_irqs[irq - 32];
>>      return n;
>> @@ -480,7 +536,7 @@ void vgic_clear_pending_irqs(struct vcpu *v)
>>  void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq)
>>  {
>>      uint8_t priority;
>> -    struct pending_irq *iter, *n = irq_to_pending(v, virq);
>> +    struct pending_irq *iter, *n;
>>      unsigned long flags;
>>      bool running;
>>
>> @@ -488,6 +544,8 @@ void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq)
>>
>>      spin_lock_irqsave(&v->arch.vgic.lock, flags);
>>
>> +    n = irq_to_pending(v, virq);
> 
> Why did you move this code?
> 
>> +
>>      /* vcpu offline */
>>      if ( test_bit(_VPF_down, &v->pause_flags) )
>>      {
>> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
>> index 00b9c1a..f44a84b 100644
>> --- a/xen/include/asm-arm/domain.h
>> +++ b/xen/include/asm-arm/domain.h
>> @@ -257,6 +257,8 @@ struct arch_vcpu
>>          paddr_t rdist_base;
>>  #define VGIC_V3_RDIST_LAST  (1 << 0)        /* last vCPU of the rdist */
>>          uint8_t flags;
>> +        struct list_head pending_lpi_list;
>> +        spinlock_t pending_lpi_list_lock;   /* protects the pending_lpi_list */
> 
> 
> It would be better to have this structure per domain, limiting the amount
> of memory allocated. Also, Stefano was suggesting using a more efficient
> data structure, such as a hashtable or a tree.
> 
> I would be OK with focusing on correctness for now, but this would need
> to be addressed before the ITS is marked as stable.
> 
>>  extern struct vgic_irq_rank *vgic_rank_offset(struct vcpu *v, int b, int n, int s);
>>  extern struct vgic_irq_rank *vgic_rank_irq(struct vcpu *v, unsigned int irq);
>>  extern bool vgic_emulate(struct cpu_user_regs *regs, union hsr hsr);
>>  extern void vgic_disable_irqs(struct vcpu *v, uint32_t r, int n);
>>  extern void vgic_enable_irqs(struct vcpu *v, uint32_t r, int n);
>> +/* placeholder function until the property table gets introduced */
>> +static inline int vgic_lpi_get_priority(struct domain *d, uint32_t vlpi)
>> +{
>> +    return 0xa;
>> +}
> 
> To be fair, you can avoid this function by re-ordering the patches. As
> suggested earlier, I would introduce some bits of the vITS first. This
> would also make the series easier to read.
> 
> Also, looking at how this has been implemented later in the series, I
> would prefer to see a new callback get_priority in vgic_ops which returns
> the correct priority.
> 
> I agree this would introduce a bit more duplicated code, but it would
> limit the use of is_lpi in the common code.
> 
> Cheers,
> 
