
Re: [Xen-devel] [PATCH v2 09/27] ARM: GICv3: introduce separate pending_irq structs for LPIs



Hi,

On 24/03/17 11:40, Julien Grall wrote:
> Hi Andre
> 
> On 03/16/2017 11:20 AM, Andre Przywara wrote:
>> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
>> index 364d5f0..e5cfa54 100644
>> --- a/xen/arch/arm/vgic.c
>> +++ b/xen/arch/arm/vgic.c
>> @@ -30,6 +30,8 @@
>>
>>  #include <asm/mmio.h>
>>  #include <asm/gic.h>
>> +#include <asm/gic_v3_defs.h>
>> +#include <asm/gic_v3_its.h>
> 
> I really don't want to see gic_v3_* headers included in common code. Why
> do you need them?
> 
>>  #include <asm/vgic.h>
>>
>>  static inline struct vgic_irq_rank *vgic_get_rank(struct vcpu *v, int rank)
>> @@ -61,7 +63,7 @@ struct vgic_irq_rank *vgic_rank_irq(struct vcpu *v, unsigned int irq)
>>      return vgic_get_rank(v, rank);
>>  }
>>
>> -static void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq)
>> +void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq)
>>  {
>>      INIT_LIST_HEAD(&p->inflight);
>>      INIT_LIST_HEAD(&p->lr_queue);
>> @@ -244,10 +246,14 @@ struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int virq)
>>
>>  static int vgic_get_virq_priority(struct vcpu *v, unsigned int virq)
>>  {
>> -    struct vgic_irq_rank *rank = vgic_rank_irq(v, virq);
>> +    struct vgic_irq_rank *rank;
>>      unsigned long flags;
>>      int priority;
>>
>> +    if ( is_lpi(virq) )
>> +        return vgic_lpi_get_priority(v->domain, virq);
> 
> This would benefit from some comments to explain why LPI pending handling
> is a different path.
> 
>> +
>> +    rank = vgic_rank_irq(v, virq);
>>      vgic_lock_rank(v, rank, flags);
>>      priority = rank->priority[virq & INTERRUPT_RANK_MASK];
>>      vgic_unlock_rank(v, rank, flags);
>> @@ -446,13 +452,63 @@ bool vgic_to_sgi(struct vcpu *v, register_t sgir, enum gic_sgi_mode irqmode,
>>      return true;
>>  }
>>
>> +/*
>> + * Holding struct pending_irq's for each possible virtual LPI in each domain
>> + * requires too much Xen memory, also a malicious guest could potentially
>> + * spam Xen with LPI map requests. We cannot cover those with (guest allocated)
>> + * ITS memory, so we use a dynamic scheme of allocating struct pending_irq's
>> + * on demand.
>> + */
> 
> I am afraid this will not prevent a guest from using too much Xen memory.
> Let's imagine the guest is mapping thousands of LPIs but decides to
> never handle them, or is slow to do so. You would allocate a pending_irq
> for each LPI, and never release the memory.

So what would be the alternative here? In the current GICv2/3 scheme a
pending_irq for each (SPI) interrupt is allocated up front. The approach
here just tries to be a bit smarter for LPIs, since the expectation is
that we won't need nearly as many. In the worst case the memory
allocation would converge to the upper limit (the number of mapped LPIs),
which is what a static allocation scheme would have used from the start.
Which means we would never be worse off.
What is actually missing is freeing the memory upon domain destruction,
which can't be tested with Dom0.

> If we use dynamic allocation, we need a way to limit the number of LPIs
> received by a guest to avoid memory exhaustion. The only idea I have is
> an artificial limit, but I don't think it is good. Any other ideas?

I think if we are concerned about this, we shouldn't allow *DomUs* to
map too many LPIs in the first place; their number is bounded by the
DeviceID/EventID combinations anyway. This is under full control of Dom0,
which communicates the number of MSIs (events) for each passed-through
device, so we can easily impose a policy limit upon domain creation.
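
A rough sketch of such a policy check, just to illustrate the idea (this
is not part of the patch; the names vlpi_budget and vlpi_try_reserve, and
the place it would be called from, the virtual MAPTI handling, are made
up for the example):

    /* Hypothetical per-domain vLPI budget, filled in at domain creation
     * from the MSI (event) counts Dom0 reports for each passed-through
     * device. */
    struct vlpi_budget {
        uint32_t allowed;   /* maximum number of vLPIs the domain may map */
        uint32_t mapped;    /* number of vLPIs currently mapped */
    };

    /* Would be called from the virtual MAPTI handling before allocating
     * anything. Returns false once the policy limit is reached, so the
     * mapping can be refused instead of exhausting Xen memory. */
    static bool vlpi_try_reserve(struct vlpi_budget *b)
    {
        if ( b->mapped >= b->allowed )
            return false;
        b->mapped++;
        return true;
    }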

And just to be sure: we allocate at most one structure per event.
An MSI interrupt storm would still converge into one pending LPI
(struct), so it wouldn't trigger any further allocations.
Remapping LPIs won't increase that either, since we can only assign one
LPI to a DevID/EvID pair at any given point in time, and remapping
requires the pending bit to be cleared. Also the structures are not
indexed by the LPI number, but just take a free slot.
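
To illustrate that with the lpi_to_pending() quoted below (just a usage
sketch, the LPI number 8192 is an arbitrary example):

    struct pending_irq *p1 = lpi_to_pending(v, 8192, true); /* reuses or allocates a slot */
    struct pending_irq *p2 = lpi_to_pending(v, 8192, true); /* finds the very same slot */

    ASSERT(p1 == p2);   /* no second allocation for the same vLPI */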

For Dom0 I don't see any good ways of limiting the number of LPIs, but
we shouldn't need any.

>> +struct pending_irq *lpi_to_pending(struct vcpu *v, unsigned int lpi,
>> +                                   bool allocate)
>> +{
>> +    struct lpi_pending_irq *lpi_irq, *empty = NULL;
>> +
>> +    spin_lock(&v->arch.vgic.pending_lpi_list_lock);
>> +    list_for_each_entry(lpi_irq, &v->arch.vgic.pending_lpi_list, entry)
>> +    {
>> +        if ( lpi_irq->pirq.irq == lpi )
>> +        {
>> +            spin_unlock(&v->arch.vgic.pending_lpi_list_lock);
>> +            return &lpi_irq->pirq;
>> +        }
>> +
>> +        if ( lpi_irq->pirq.irq == 0 && !empty )
>> +            empty = lpi_irq;
>> +    }
>> +
>> +    if ( !allocate )
>> +    {
>> +        spin_unlock(&v->arch.vgic.pending_lpi_list_lock);
>> +        return NULL;
>> +    }
>> +
>> +    if ( !empty )
>> +    {
>> +        empty = xzalloc(struct lpi_pending_irq);
> 
> xzalloc will return NULL if memory is exhausted. There is a general lack
> of error checking within this series. Any missing error check could be a
> potential target for a guest, leading to a security issue. Stefano and I
> already spotted some, but that does not mean we found them all. So before
> sending the next version, please go through the series and verify *every*
> return value.
> 
> Also, I can't find the code which releases LPIs, either in this patch or
> anywhere else in this series. A general rule is to have allocation and
> free within the same patch; it makes it much easier to spot a missing free.

There is no such code, on purpose. We only grow the number, but never
shrink it (to what?, where to stop?, what if we need more again?). As
said above, in the worst case this ends up at the level a static
allocation would have started from in the first place.

And as mentioned, I only skipped the "free the whole list upon domain
destruction" part.

Cheers,
Andre.

>> +        vgic_init_pending_irq(&empty->pirq, lpi);
>> +        list_add_tail(&empty->entry, &v->arch.vgic.pending_lpi_list);
>> +    } else
>> +    {
>> +        empty->pirq.status = 0;
>> +        empty->pirq.irq = lpi;
>> +    }
>> +
>> +    spin_unlock(&v->arch.vgic.pending_lpi_list_lock);
>> +
>> +    return &empty->pirq;
>> +}
>> +
>>  struct pending_irq *irq_to_pending(struct vcpu *v, unsigned int irq)
>>  {
>>      struct pending_irq *n;
>> +
> 
> Spurious change.
> 
>>      /* Pending irqs allocation strategy: the first vgic.nr_spis irqs
>>       * are used for SPIs; the rests are used for per cpu irqs */
>>      if ( irq < 32 )
>>          n = &v->arch.vgic.pending_irqs[irq];
>> +    else if ( is_lpi(irq) )
>> +        n = lpi_to_pending(v, irq, true);
>>      else
>>          n = &v->domain->arch.vgic.pending_irqs[irq - 32];
>>      return n;
>> @@ -480,7 +536,7 @@ void vgic_clear_pending_irqs(struct vcpu *v)
>>  void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq)
>>  {
>>      uint8_t priority;
>> -    struct pending_irq *iter, *n = irq_to_pending(v, virq);
>> +    struct pending_irq *iter, *n;
>>      unsigned long flags;
>>      bool running;
>>
>> @@ -488,6 +544,8 @@ void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq)
>>
>>      spin_lock_irqsave(&v->arch.vgic.lock, flags);
>>
>> +    n = irq_to_pending(v, virq);
> 
> Why did you move this code?
> 
>> +
>>      /* vcpu offline */
>>      if ( test_bit(_VPF_down, &v->pause_flags) )
>>      {
>> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
>> index 00b9c1a..f44a84b 100644
>> --- a/xen/include/asm-arm/domain.h
>> +++ b/xen/include/asm-arm/domain.h
>> @@ -257,6 +257,8 @@ struct arch_vcpu
>>          paddr_t rdist_base;
>>  #define VGIC_V3_RDIST_LAST  (1 << 0)        /* last vCPU of the rdist */
>>          uint8_t flags;
>> +        struct list_head pending_lpi_list;
>> +        spinlock_t pending_lpi_list_lock;   /* protects the pending_lpi_list */
> 
> 
> It would be better to have this structure per domain, limiting the amount
> of memory allocated. Also, Stefano was suggesting using a more efficient
> data structure, such as a hashtable or a tree.
> 
> I would be OK with focusing on correctness for now, but this would need
> to be addressed before the ITS is marked as stable.
> 
>>  extern struct vgic_irq_rank *vgic_rank_offset(struct vcpu *v, int b, int n, int s);
>>  extern struct vgic_irq_rank *vgic_rank_irq(struct vcpu *v, unsigned int irq);
>>  extern bool vgic_emulate(struct cpu_user_regs *regs, union hsr hsr);
>>  extern void vgic_disable_irqs(struct vcpu *v, uint32_t r, int n);
>>  extern void vgic_enable_irqs(struct vcpu *v, uint32_t r, int n);
>> +/* placeholder function until the property table gets introduced */
>> +static inline int vgic_lpi_get_priority(struct domain *d, uint32_t vlpi)
>> +{
>> +    return 0xa;
>> +}
> 
> To be fair, you can avoid this function by re-ordering the patches. As
> suggested earlier, I would introduce some bits of the vITS first. This
> would also make the series easier to read.
> 
> Also, looking at how this has been implemented later in the series, I
> would prefer to see a new callback get_priority in vgic_ops which returns
> the correct priority.
> 
> I agree this would introduce a bit more duplicated code, but it would
> limit the use of is_lpi in the common code.
> 
> Cheers,
> 
