
Re: [PATCH] xen/arm: fix SBDF calculation for vPCI MMIO handlers


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Oleksandr Andrushchenko <Oleksandr_Andrushchenko@xxxxxxxx>
  • Date: Thu, 28 Oct 2021 14:23:34 +0000
  • Accept-language: en-US
  • Cc: Julien Grall <julien@xxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, "sstabellini@xxxxxxxxxx" <sstabellini@xxxxxxxxxx>, "iwj@xxxxxxxxxxxxxx" <iwj@xxxxxxxxxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>, Rahul Singh <rahul.singh@xxxxxxx>
  • Delivery-date: Thu, 28 Oct 2021 14:23:52 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHXyww6rKHs3LJru0q52eLUia+oNavnHBIAgAE3H4CAABhAgIAADTsA
  • Thread-topic: [PATCH] xen/arm: fix SBDF calculation for vPCI MMIO handlers


On 28.10.21 16:36, Roger Pau Monné wrote:
> On Thu, Oct 28, 2021 at 12:09:23PM +0000, Oleksandr Andrushchenko wrote:
>> Hi, Julien!
>>
>> On 27.10.21 20:35, Julien Grall wrote:
>>> Hi Oleksandr,
>>>
>>> On 27/10/2021 09:25, Oleksandr Andrushchenko wrote:
>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
>>>>
>>>> While in the vPCI MMIO trap handlers for the guest PCI host bridge it is
>>>> not enough for the SBDF translation to simply call
>>>> VPCI_ECAM_BDF(info->gpa), as the base address may not be aligned in a way
>>>> that makes the translation always work. If the address is not adjusted
>>>> with respect to the base, the handler may fail to convert the SBDF
>>>> properly and Xen crashes:
>>>>
>>>> (XEN) vpci_mmio_read 0000:65:1a.0 reg 8bc gpa e65d08bc
>>> I can't find a printk() that may output this message. Where does this come
>>> from?
>> That was a debug print. I shouldn't have used it in the patch description;
>> it probably belongs after the "---" marker, to better explain what's
>> happening.
>>> Anyway, IIUC the guest physical address is 0xe65d08bc which, if I am not 
>>> mistaken, doesn't belong to the range advertised for GUEST_VPCI_ECAM.
>> This is from dom0 I am working on now.
>>> IMHO, the stack trace should come from upstream Xen, or include enough
>>> information to explain how this was reproduced.
>>>
>>>> (XEN) Data Abort Trap. Syndrome=0x6
>>>> (XEN) Walking Hypervisor VA 0x467a28bc on CPU0 via TTBR 0x00000000481d5000
>>> I can understand that if we don't subtract GUEST_VPCI_ECAM, we would (in
>>> theory) not get the correct BDF. But... I don't understand how this would
>>> result in a data abort in the hypervisor.
>>>
>>> In fact, I think the vPCI code should be resilient enough to not crash if 
>>> we pass the wrong BDF.
>> Well, there is no (?) easy way to validate an SBDF. And this could be a
>> problem if we have a misbehaving guest which may force Xen to access memory
>> beyond that of the PCI host bridge.
> How could that be? The ECAM region exposed to the guest should be
> the same as the physical one for dom0?
Ok, I have a DesignWare PCI host bridge which has 2 ECAM regions (I am just
starting to implement the driver for it, so I may be wrong here):
- Root Complex ECAM area ("dbi"), something like 0x3000 bytes long
- "Client" ECAM area ("config")
So from Dom0's POV we have 2 ECAM regions, while for the guest
we always emulate a single big region:
/*
 * 256 MB is reserved for VPCI configuration space based on calculation
 * 256 buses x 32 devices x 8 functions x 4 KB = 256 MB
 */
#define GUEST_VPCI_ECAM_BASE    xen_mk_ullong(0x10000000)
#define GUEST_VPCI_ECAM_SIZE    xen_mk_ullong(0x10000000)

So the base address and size of the emulated ECAM space are
not connected to the real host bridge.
>
> And for domUs you really need to fix vpci_{read,write} to not
> passthrough accesses not explicitly handled.
Do you mean that we need to validate SBDFs there?
This can be tricky if we have a use-case where a PCI device being
passed through is not put at 0000:00:0.0, but is requested to be at, for
example, 0000:0d:0.0. So we need to go over the list of virtual
devices and see if the SBDF the guest is trying to access is a valid SBDF.
Is this what you mean?
>
>>> When there is a data abort in Xen, you should get a stack trace showing
>>> where it comes from. Can you paste it here?
>> (XEN) Data Abort Trap. Syndrome=0x6
>> (XEN) Walking Hypervisor VA 0x467a28bc on CPU0 via TTBR 0x00000000481d5000
>> (XEN) 0TH[0x0] = 0x00000000481d4f7f
>> (XEN) 1ST[0x1] = 0x00000000481d2f7f
>> (XEN) 2ND[0x33] = 0x0000000000000000
>> (XEN) CPU0: Unexpected Trap: Data Abort
>> (XEN) ----[ Xen-4.16-unstable  arm64  debug=y  Not tainted ]----
>> (XEN) CPU:    0
>> (XEN) PC:     000000000026d3d4 pci_generic_config_read+0x88/0x9c
>> (XEN) LR:     000000000026d36c
>> (XEN) SP:     000080007ff97c00
>> (XEN) CPSR:   0000000060400249 MODE:64-bit EL2h (Hypervisor, handler)
>> (XEN)      X0: 00000000467a28bc  X1: 00000000065d08bc  X2: 00000000000008bc
>> (XEN)      X3: 000000000000000c  X4: 000080007ffc6fd0  X5: 0000000000000000
>> (XEN)      X6: 0000000000000014  X7: ffff800011a58000  X8: ffff0000225a0380
>> (XEN)      X9: 0000000000000000 X10: 0101010101010101 X11: 0000000000000028
>> (XEN)     X12: 0101010101010101 X13: 0000000000000020 X14: ffffffffffffffff
>> (XEN)     X15: 0000000000000001 X16: ffff800010da6708 X17: 0000000000000020
>> (XEN)     X18: 0000000000000002 X19: 0000000000000004 X20: 000080007ff97c5c
>> (XEN)     X21: 00000000000008bc X22: 00000000000008bc X23: 0000000000000004
>> (XEN)     X24: 0000000000000000 X25: 00000000000008bc X26: 00000000000065d0
>> (XEN)     X27: 000080007ffb9010 X28: 0000000000000000  FP: 000080007ff97c00
>> (XEN)
>> (XEN)   VTCR_EL2: 00000000800a3558
>> (XEN)  VTTBR_EL2: 00010000bffba000
>> (XEN)
>> (XEN)  SCTLR_EL2: 0000000030cd183d
>> (XEN)    HCR_EL2: 00000000807c663f
>> (XEN)  TTBR0_EL2: 00000000481d5000
>> (XEN)
>> (XEN)    ESR_EL2: 0000000096000006
>> (XEN)  HPFAR_EL2: 0000000000e65d00
>> (XEN)    FAR_EL2: 00000000467a28bc
>> (XEN)
>> [snip]
>> (XEN) Xen call trace:
>> (XEN)    [<000000000026d3d4>] pci_generic_config_read+0x88/0x9c (PC)
>> (XEN)    [<000000000026d36c>] pci_generic_config_read+0x20/0x9c (LR)
>> (XEN)    [<000000000026d2c8>] pci-access.c#pci_config_read+0x60/0x84
>> (XEN)    [<000000000026d4a8>] pci_conf_read32+0x10/0x18
>> (XEN)    [<000000000024dcf8>] vpci.c#vpci_read_hw+0x48/0xb8
>> (XEN)    [<000000000024e3e4>] vpci_read+0xac/0x24c
>> (XEN)    [<000000000024e934>] vpci_ecam_read+0x78/0xa8
>> (XEN)    [<000000000026e368>] vpci.c#vpci_mmio_read+0x44/0x7c
>> (XEN)    [<0000000000275054>] try_handle_mmio+0x1ec/0x264
>> (XEN)    [<000000000027ea50>] traps.c#do_trap_stage2_abort_guest+0x18c/0x2d8
>> (XEN)    [<000000000027f088>] do_trap_guest_sync+0xf0/0x618
>> (XEN)    [<0000000000269c58>] entry.o#guest_sync_slowpath+0xa4/0xd4
>> (XEN)
>> (XEN)
>> (XEN) ****************************************
>> (XEN) Panic on CPU 0:
>> (XEN) CPU0: Unexpected Trap: Data Abort
>> (XEN) ****************************************
> Are you exposing an ECAM region to the guest bigger than the
> underlying one, and that's why you get crashes? (because you go
> beyond the hardware range)
Please see above
> I would assume physical accesses to the ECAM area reported by the
> hardware don't trigger traps?
No
>
> Roger.
Thank you,
Oleksandr

 

