[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v6 5/7] xen/arm: do not map IRQs and memory for disabled devices



Hi Oleksandr,

On 18/11/2021 06:59, Oleksandr Andrushchenko wrote:
Hi, Julien!

On 16.11.21 21:22, Julien Grall wrote:
Hi Oleksandr,

On 05/11/2021 06:33, Oleksandr Andrushchenko wrote:
From: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>

Currently Xen maps all IRQs and memory ranges for all devices except
those marked for passthrough, e.g. it doesn't pay attention to the
"status" property of the node.

According to the device tree specification [1]:
   - "okay"     Indicates the device is operational.
   - "disabled" Indicates that the device is not presently operational,
                but it might become operational in the future (for example,
           something is not plugged in, or switched off).
           Refer to the device binding for details on what disabled means
           for a given device.

So, "disabled" status is device dependent and mapping should be taken by
case-by-case approach with that respect. Although in general Xen should map
IRQs and memory ranges as the disabled devices might become operational

Right, this change effectively prevent dom0 to use such device if it becomes 
operational in the future.
Or doesn't, see below
So this sounds like a big regression.

How do you plan to handle it?
It depends if this is really a regression or is ok and "by design"

Quoting what you wrote in the commit message:

"Indicates that the device is not presently operational, but it might become operational in the future."

I read that as it might be possible to have a device turn on after boot.
In fact, looking at Linux, I can see some code allowing to change the state of a device after boot (see [4]). So I think we want to keep mapping the regions in dom0 at least for some devices.

Note, that other bits of the DT overlay would not work. This should be covered by the new series from Xilinx [5].


it
makes it impossible for the other devices, which are not operational in
any case, to skip the mappings.

You wrote "makes it impossible for the other devices", but it is not clear to 
me what's would go wrong when we map a disabled device (Dom0 should not touch it). Do you 
have an example?
Ok, here we go. In the SoC I am working with the PCIe controller may run in two 
modes:
Root or Endpoint. Not configurable at run-time, so you configure it in the 
device tree.
The are two device tree entries for that: for the root [1] and for the endpoint 
[2].
So, when you want the controller to be a Root Complex then you put status = 
"disabled"
for the Endpoint entry and vise versa.

Thank for the example. I think this is something that should be explained in the commit message.


If you take a look at the nodes they both define the same "reg" and 
"interrupts",
effectively making it impossible for me in the patch [3] to actually trap MMIOs: > we 
skip the mappings for [1] and then, because we assume "disabled" is
still valid for mappings, we add those for [2].

Technically, you would have the same issues if your device is sharing the interrupt or the MMIO page.

The former is fairly common, but IIUC you are not emulated the interrupts. Right?

For the latter, I have seen multiple UARTs in the same page (e.g. pine64) but not multiple PCI hostbridges. However, this is only with 4KB page granularity. We may have the issue arising with 16KB/64KB ones. So I would say we could ignore it for now.


So, this is probably why device tree documentation says about the disabled 
status
to be device specific.

Correct. That said, until now, all the devices would have their MMIOsregions mapped to dom0. So the interpretation of "disabled" didn't matter too much.


Hope this describes a very valid use-case. At the same time you may argue that
I just need to remove the offending entry from the device tree which may also be
valid.

We need to be able to accept any *valid* Device-Tree without any modification. At the same time, we want to avoid break potential existing use-cases.

I think, you can handle the two gracefully by adding a else in pci_host_bridge_mappings() that would remove the P2M mapping if bridge->ops->need_p2m_hwdom_mapping() returns false.

This is not pretty but it should do the job for now. Long run, I think we should consider to create fake entries in the P2M for emulated regions.

The advantage is we could easily find clash and use them to ignore mapping when the region is emulated.

Cheers,


Cheers,

Thank you,
Oleksandr

[1] 
https://github.com/renesas-rcar/linux-bsp/blob/v5.10.41/rcar-5.1.3.rc6/arch/arm64/boot/dts/renesas/r8a779f0.dtsi#L843
[2] 
https://github.com/renesas-rcar/linux-bsp/blob/v5.10.41/rcar-5.1.3.rc6/arch/arm64/boot/dts/renesas/r8a779f0.dtsi#L896
[3] 
https://patchwork.kernel.org/project/xen-devel/patch/20211105063326.939843-5-andr2000@xxxxxxxxx/

[4] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/of/dynamic.c#n111 [5] https://lore.kernel.org/xen-devel/1636441347-133850-1-git-send-email-fnu.vikram@xxxxxxxxxx/




--
Julien Grall



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.