[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH for-4.15] xen/vgic: Implement write to ISPENDR in vGICv{2, 3}


  • To: Julien Grall <julien@xxxxxxx>
  • From: Bertrand Marquis <Bertrand.Marquis@xxxxxxx>
  • Date: Mon, 22 Feb 2021 13:45:05 +0000
  • Accept-language: en-GB, en-US
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=B512dvbpOnEYWD1i4RnQHZAQXptOjs6RivQrMPNyXkc=; b=Xu1oLZGVbKEBkLvT/W9noBdP0WWbD1nVoxSjW1PpiLo9HYmylfisC3sk+OrAzwSQq+QKFJaL4LJ1jIYOvUN6ALy96n7rL/19LEosjMVeOB/Vig0H0Wn5WQ8v83uxNfUd90pBQ2HemwKcuBDyJKX/2prBpJywv2GAxxd2qWhzbmfLxQuBgk8XwAts1nw28W54VC9uOhb9SzvxmgyUoPtM2PotybgtbHc34oNOLktGgFPFHuLK0G9XshqAcKUZbUAKeORxmI5HwARh2raY4kFEd4UEY8ix7LvTVmEk0qfyaFAsaxhEIulJHeu3JPph1zIlZX9j/Z0X8w3DJXNljSbtLA==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=b/26/6d5cxdTWk+R5cOP4ZhcLYfR5H+lPnl+DPFi77IDMIQzjHh0lBlpGAk1OJWiIeg8Eg1awTvo80oo70fcK7/DLYnWalhPfTUNnpz0U8JuxJFM1annoiNBC+hQgnV+4NTNV8bexkHnKugbk3unnSRGtdc3vVIWaDpnCw2uSXLys3mk/ZPsHVQVLyLsGaqUc+dsu/kOBDwUZDO94CDUGsjMU4u9AN9Qcx/ML+ay8emshd8Sbwi6yaOMRsg3z/FR51vXOoiALl6RJxO9Ojx4jETNRKzY+h6UPfyAocuie6EIe7DOH/glIJqOVaFkWOai9f1mKo4JpfKoqd6d6xSCaQ==
  • Authentication-results-original: xen.org; dkim=none (message not signed) header.d=none;xen.org; dmarc=none action=none header.from=arm.com;
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, "iwj@xxxxxxxxxxxxxx" <iwj@xxxxxxxxxxxxxx>, Julien Grall <jgrall@xxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>
  • Delivery-date: Mon, 22 Feb 2021 13:45:27 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Nodisclaimer: true
  • Original-authentication-results: xen.org; dkim=none (message not signed) header.d=none;xen.org; dmarc=none action=none header.from=arm.com;
  • Thread-index: AQHXB5FR40tx26Grzkm8HRIE12rerqpkMreA
  • Thread-topic: [PATCH for-4.15] xen/vgic: Implement write to ISPENDR in vGICv{2, 3}

Hi Julien,

> On 20 Feb 2021, at 14:04, Julien Grall <julien@xxxxxxx> wrote:
> 
> From: Julien Grall <jgrall@xxxxxxxxxx>
> 
> Currently, Xen will send a data abort to a guest trying to write to the
> ISPENDR.
> 
> Unfortunately, recent version of Linux (at least 5.9+) will start
> writing to the register if the interrupt needs to be re-triggered
> (see the callback irq_retrigger). This can happen when a driver (such as
> the xgbe network driver on AMD Seattle) re-enable an interrupt:
> 
> (XEN) d0v0: vGICD: unhandled word write 0x00000004000000 to ISPENDR44
> [...]
> [   25.635837] Unhandled fault at 0xffff80001000522c
> [...]
> [   25.818716]  gic_retrigger+0x2c/0x38
> [   25.822361]  irq_startup+0x78/0x138
> [   25.825920]  __enable_irq+0x70/0x80
> [   25.829478]  enable_irq+0x50/0xa0
> [   25.832864]  xgbe_one_poll+0xc8/0xd8
> [   25.836509]  net_rx_action+0x110/0x3a8
> [   25.840328]  __do_softirq+0x124/0x288
> [   25.844061]  irq_exit+0xe0/0xf0
> [   25.847272]  __handle_domain_irq+0x68/0xc0
> [   25.851442]  gic_handle_irq+0xa8/0xe0
> [   25.855171]  el1_irq+0xb0/0x180
> [   25.858383]  arch_cpu_idle+0x18/0x28
> [   25.862028]  default_idle_call+0x24/0x5c
> [   25.866021]  do_idle+0x204/0x278
> [   25.869319]  cpu_startup_entry+0x24/0x68
> [   25.873313]  rest_init+0xd4/0xe4
> [   25.876611]  arch_call_rest_init+0x10/0x1c
> [   25.880777]  start_kernel+0x5b8/0x5ec
> 
> As a consequence, the OS may become unusable.
> 
> Implementing the write part of ISPENDR is somewhat easy. For
> virtual interrupt, we only need to inject the interrupt again.
> 
> For physical interrupt, we need to be more careful as the de-activation
> of the virtual interrupt will be propagated to the physical distributor.
> For simplicity, the physical interrupt will be set pending so the
> workflow will not differ from a "real" interrupt.
> 
> Longer term, we could possible directly activate the physical interrupt
> and avoid taking an exception to inject the interrupt to the domain.
> (This is the approach taken by the new vGIC based on KVM).
> 
> Signed-off-by: Julien Grall <jgrall@xxxxxxxxxx>

This is something which will not be done by a guest very often so I think your
implementation actually makes it simpler and reduce possibilities of race 
conditions
so I am not even sure that the XXX comment is needed.
But i am ok with it being in or not so:

Reviewed-by: Bertrand Marquis <bertrand.marquis@xxxxxxx>

I did some tests by manually generating interrupts and I can confirm that this
works as expected.

Cheers
Bertrand

> 
> ---
> 
> Note that this doesn't touch the read part for I{S,C}PENDR nor the write
> part of ICPENDR because they are more complex to implement.
> 
> For physical interrupt, I didn't implement the same solution as KVM because
> I couldn't convince myself this could be done race free for physical
> interrupt.
> 
> This was tested using the IRQ debugfs (CONFIG_GENERIC_IRQ_DEBUGFS=y)
> which allows to retrigger an interrupt:
> 
> 42sh> echo trigger > /sys/kernel/debug/irq/irqs/<irq>
> 
> This patch is candidate for 4.15 and also backporting to older trees.
> Without this patch, recent Linux version may not boot on Xen on some
> platforms (such as AMD Seattle used in OssTest).
> 
> The patch is self-contained to implementing a single set of registers.
> So this would not introduce any risk on platforms where OSes don't use
> those registers.
> 
> For the other setup (e.g. AMD Seattle + Linux 5.9+), it cannot be worse
> than today.
> 
> So therefore, I would consider the risk limited.
> ---
> xen/arch/arm/vgic-v2.c     | 10 ++++----
> xen/arch/arm/vgic-v3.c     | 18 ++++++---------
> xen/arch/arm/vgic.c        | 47 ++++++++++++++++++++++++++++++++++++++
> xen/include/asm-arm/vgic.h |  2 ++
> 4 files changed, 62 insertions(+), 15 deletions(-)
> 
> diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c
> index 64b141fea586..b2da886adc18 100644
> --- a/xen/arch/arm/vgic-v2.c
> +++ b/xen/arch/arm/vgic-v2.c
> @@ -472,10 +472,12 @@ static int vgic_v2_distr_mmio_write(struct vcpu *v, 
> mmio_info_t *info,
> 
>     case VRANGE32(GICD_ISPENDR, GICD_ISPENDRN):
>         if ( dabt.size != DABT_WORD ) goto bad_width;
> -        printk(XENLOG_G_ERR
> -               "%pv: vGICD: unhandled word write %#"PRIregister" to 
> ISPENDR%d\n",
> -               v, r, gicd_reg - GICD_ISPENDR);
> -        return 0;
> +        rank = vgic_rank_offset(v, 1, gicd_reg - GICD_ISPENDR, DABT_WORD);
> +        if ( rank == NULL ) goto write_ignore;
> +
> +        vgic_set_irqs_pending(v, r, rank->index);
> +
> +        return 1;
> 
>     case VRANGE32(GICD_ICPENDR, GICD_ICPENDRN):
>         if ( dabt.size != DABT_WORD ) goto bad_width;
> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
> index fd8cfc156d0c..613f37abab5e 100644
> --- a/xen/arch/arm/vgic-v3.c
> +++ b/xen/arch/arm/vgic-v3.c
> @@ -808,10 +808,12 @@ static int __vgic_v3_distr_common_mmio_write(const char 
> *name, struct vcpu *v,
> 
>     case VRANGE32(GICD_ISPENDR, GICD_ISPENDRN):
>         if ( dabt.size != DABT_WORD ) goto bad_width;
> -        printk(XENLOG_G_ERR
> -               "%pv: %s: unhandled word write %#"PRIregister" to 
> ISPENDR%d\n",
> -               v, name, r, reg - GICD_ISPENDR);
> -        return 0;
> +        rank = vgic_rank_offset(v, 1, reg - GICD_ISPENDR, DABT_WORD);
> +        if ( rank == NULL ) goto write_ignore;
> +
> +        vgic_set_irqs_pending(v, r, rank->index);
> +
> +        return 1;
> 
>     case VRANGE32(GICD_ICPENDR, GICD_ICPENDRN):
>         if ( dabt.size != DABT_WORD ) goto bad_width;
> @@ -975,6 +977,7 @@ static int vgic_v3_rdistr_sgi_mmio_write(struct vcpu *v, 
> mmio_info_t *info,
>     case VREG32(GICR_ICACTIVER0):
>     case VREG32(GICR_ICFGR1):
>     case VRANGE32(GICR_IPRIORITYR0, GICR_IPRIORITYR7):
> +    case VREG32(GICR_ISPENDR0):
>          /*
>           * Above registers offset are common with GICD.
>           * So handle common with GICD handling
> @@ -982,13 +985,6 @@ static int vgic_v3_rdistr_sgi_mmio_write(struct vcpu *v, 
> mmio_info_t *info,
>         return __vgic_v3_distr_common_mmio_write("vGICR: SGI", v,
>                                                  info, gicr_reg, r);
> 
> -    case VREG32(GICR_ISPENDR0):
> -        if ( dabt.size != DABT_WORD ) goto bad_width;
> -        printk(XENLOG_G_ERR
> -               "%pv: vGICR: SGI: unhandled word write %#"PRIregister" to 
> ISPENDR0\n",
> -               v, r);
> -        return 0;
> -
>     case VREG32(GICR_ICPENDR0):
>         if ( dabt.size != DABT_WORD ) goto bad_width;
>         printk(XENLOG_G_ERR
> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
> index 82f524a35c9e..8f9400a51960 100644
> --- a/xen/arch/arm/vgic.c
> +++ b/xen/arch/arm/vgic.c
> @@ -423,6 +423,53 @@ void vgic_enable_irqs(struct vcpu *v, uint32_t r, int n)
>     }
> }
> 
> +void vgic_set_irqs_pending(struct vcpu *v, uint32_t r, unsigned int rank)
> +{
> +    const unsigned long mask = r;
> +    unsigned int i;
> +    /* The first rank is always per-vCPU */
> +    bool private = rank == 0;
> +
> +    /* LPIs will never be set pending via this function */
> +    ASSERT(!is_lpi(32 * rank + 31));
> +
> +    for_each_set_bit( i, &mask, 32 )
> +    {
> +        unsigned int irq = i + 32 * rank;
> +
> +        if ( !private )
> +        {
> +            struct pending_irq *p = spi_to_pending(v->domain, irq);
> +
> +            /*
> +             * When the domain sets the pending state for a HW interrupt on
> +             * the virtual distributor, we set the pending state on the
> +             * physical distributor.
> +             *
> +             * XXX: Investigate whether we would be able to set the
> +             * physical interrupt active and save an interruption. (This
> +             * is what the new vGIC does).
> +             */
> +            if ( p->desc != NULL )
> +            {
> +                unsigned long flags;
> +
> +                spin_lock_irqsave(&p->desc->lock, flags);
> +                gic_set_pending_state(p->desc, true);
> +                spin_unlock_irqrestore(&p->desc->lock, flags);
> +                continue;
> +            }
> +        }
> +
> +        /*
> +         * If the interrupt is per-vCPU, then we want to inject the vIRQ
> +         * to v, otherwise we should let the function figuring out the
> +         * correct vCPU.
> +         */
> +        vgic_inject_irq(v->domain, private ? v : NULL, irq, true);
> +    }
> +}
> +
> bool vgic_to_sgi(struct vcpu *v, register_t sgir, enum gic_sgi_mode irqmode,
>                  int virq, const struct sgi_target *target)
> {
> diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
> index ce1e3c4bbdac..62c2ae538db2 100644
> --- a/xen/include/asm-arm/vgic.h
> +++ b/xen/include/asm-arm/vgic.h
> @@ -288,6 +288,8 @@ extern struct vgic_irq_rank *vgic_rank_offset(struct vcpu 
> *v, int b, int n, int
> extern struct vgic_irq_rank *vgic_rank_irq(struct vcpu *v, unsigned int irq);
> extern void vgic_disable_irqs(struct vcpu *v, uint32_t r, int n);
> extern void vgic_enable_irqs(struct vcpu *v, uint32_t r, int n);
> +extern void vgic_set_irqs_pending(struct vcpu *v, uint32_t r,
> +                                  unsigned int rank);
> extern void register_vgic_ops(struct domain *d, const struct vgic_ops *ops);
> int vgic_v2_init(struct domain *d, int *mmio_count);
> int vgic_v3_init(struct domain *d, int *mmio_count);
> -- 
> 2.17.1
> 




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.