[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC v1 5/5] xen/arm: add SCI mediator support for DomUs



On Wed, 22 Dec 2021, Julien Grall wrote:
> > > > > > To me dtdev and XEN_DOMCTL_assign_device are appropriate because
> > > > > > they
> > > > > > signify device assignment of one or more devices. We are not adding
> > > > > > any
> > > > > > additional "meaning" to them. It is just that we'll automatically
> > > > > > detect
> > > > > > and generate any SCMI configurations based on the list of assigned
> > > > > > devices. Because if SCMI is enabled and a device is assigned to the
> > > > > > guest, then I think we want to add the device to the SCMI list of
> > > > > > devices automatically.
> > > > > 
> > > > > I am OK with re-using dtdev/XEN_DOMCTL_assign_device however there is
> > > > > a pitfall with that approach.
> > > > > 
> > > > > At the moment, they are only used for device protected by the
> > > > > IOMMU. If the device is not protected by the IOMMU then it will return
> > > > > an error.
> > > > IIRC there was a change, that allowed to assign device without a
> > > > IOMMU. At
> > > > least we discussed this internally.
> > > 
> > > I am not aware of any upstream. Do you have a pointer if there is any
> > > public discussion?
> > 
> > No, this is the first public discussion on this topic.
> > 
> > > 
> > > > > 
> > > > > Now, with your approach we may have a device that is not protected by
> > > > > the IOMMU but require to a SCMI configuration.
> > > > You need to protect only DMA-capable devices.
> > > 
> > > Xen doesn't know if the device is DMA-capable or not. So...
> > > 
> > 
> > But it knows if there is a iommus=<> node present for the device.
> 
> Not really. Not all the platforms have IOMMUs and not all the platforms with
> IOMMU have DMA-capable device protected by an IOMMU.
> 
> > 
> > > > 
> > > > > I don't think it would be sensible to just return "succeed" here
> > > > > because technically you are asking to assign a non-protected
> > > > > device. But at the same time, it would prevent a user to assign a
> > > > > non-DMA capable device.
> > > > > 
> > > > > So how do you suggest to approach this?
> > > > Well, in my head assign_device ideally should do the following:
> > > > 1. Assign IOMMU if it is configured for the device
> > > 
> > > ... with this approach you are at the risk to let the user passthrough
> > > a device that cannot be protected.
> > > 
> > > > 2. Assign SCMI access rights
> > > > (Not related to this patch series, but...)
> > > > 3. Assign IRQs
> > > > 4. Assign IO memory ranges.
> > > > Points 3. and 4. would allow us to not provide additional irq=[] and
> > > > iomem=[] entries in a guest config.
> > > 
> > > That could only work if your guest is using the same layout as the
> > > host.
> > 
> > Agreed. But in my experience, in most cases this is the true. We worked
> > with Renesas and IMX hardware and in both cases iomems were
> > mapped 1:1.
> > 
> > > Otherwise, there is a risk it will clash with other parts of the
> > > memory layout.
> > > 
> > 
> > > Today, guests started via the toolstack is only using a virtual
> > > layout, so you would first need to add support to use the host memory
> > > layout.
> > 
> > I can't say for all the possible configurations in the wild, but I'm
> > assuming that in most cases it will be fine to use 1:1 mapping. For
> > those more exotic cases it would be possible to implement some
> > configuration option like iomem_map=[mfn:gfn].
> Well, the Xen memory layout is not something that is stable nor AFAIK based on
> any memory layout. In fact, there is no such generic layout on Arm.
> 
> It is quite possible that it will work well with 1:1 MMIO on some platform
> (e.g. Renesas, IMX) but you can't expect to work for every Xen release or all
> the platforms.

Yeah this is a true problem. Thankfully with Penny's series we'll be
able to create domUs with the host memory layout (although dom0less
only but it is a step forward).

 
> > As Stefano pointed, right now user needs to provide 3 configuration
> > options to pass a device to a guest: dt_dev, irq, iomem. But in fact,
> > Xen is already knowing about irq and iomem from device tree.
> 
> I understand and this is not ideal. I would be happy to consider your
> approach. However, this will have to enabled only when the guest is re-using
> the host layout.

It looks like we all agree that today configuring device assignment with
Xen is too complicated and would like for it to be simpler. This thread
has some excellent ideas on how to address the issue. Skip to the end if
you are only interested in this patch series.


# Future Ideas

A great suggestion by Julien is to start supporting the dom0less partial
device tree format in xl/libxl as well so that we can have a single
"device_tree" option in dom.cfg instead of 4 (device_tree, iomem, irqs,
dtdev).

Even with that implemented, the user has still to provide a partial dtb,
xen,reg and xen,path. I think this is a great step forward and we should
do that, if nothing else to make it easier to switch between dom0less
and normal domU configurations. But the number of options and
information that the user has to provide is still similar to what we
have today.

After that, I think we need something along the lines of what Volodymyr
suggested. Let's say that the user only provides "dtdev" and
"device_tree" in dom.cfg.  We could:

- read interrupts from the  "interrupts" property like we do for dom0less
- read "xen,reg" for the mapping, if "xen,reg" is missing, read "reg"
  and assume 1:1 (we could try the mapping and print an error on
  failure, or we could only do 1:1 if the domain is entirely 1:1)
- optionally read "xen,path" to populate dtdev
- if an IOMMU configuration is present, then also configure the IOMMU
  for the device, if not then "xen,force-assign-without-iommu" must be
  present
- assign SCMI access rights


Now we only have:
- device_tree in dom.cfg
- dtdev in dom.cfg (or xen,path in the partial DTB)
- xen,force-assign-without-iommu in the partial DTB


It would be good if we could remove "xen,force-assign-without-iommu"
because at this stage it is the only Xen-specific property remaining in
the partial DTB. If we could get rid of it, it would make it easier to
write/generate the partial DTB because it becomes a strict subset of the
corresponding host node. It would enable us to automatically generate it
(we are going to do experiments with it at Xilinx in the next few
months) and it would be easier to explain to users how to write it.
The partial DTB usually starts as a copy of the corresponding host node
plus some edits. The fewer edits are required, the better.

But it is difficult because of the reasons mentioned by Julien. In Xen
we cannot know if a device is not behind an IOMMU because is not a DMA
master (so safe to assign) or because the system simply doesn't have
enough IOMMU coverage (so not safe to assign). Thankfully the hardware
has been improving in recent years and there are more and more platforms
with full IOMMU coverage. I think we should make it easier for users on
these well-behave platforms.

At least, we could move the "xen,force-assign-without-iommu" option from
the partial DTB to the VM config file dom.cfg. Something like:

force-assign-without-iommu="true"
or
platform-iommu-safe="true"

That way, it is global rather than per-device and doesn't have to be
added by the user by hand when creating the partial DTB.


# This patch series

Now in regards to this specific series and the SCMI options today, I
think it is OK to have a per-domain sci="scmi_smc", but I think we
should try to avoid another detailed list of device paths or list of
IDs in addition to dtdev.

dtdev already specifies the device tree nodes to be assigned to the
guest. Based on that list we can also do SCMI assignment.

As Julien pointed out, the issue is: what if a device needs SCMI
assignment but is not a DMA master (hence there is no IOMMU
configuration on the host)?

Attempting to assign a DMA mastering device without IOMMU protection is
not just unsafe, it will actively cause memory corruptions in most
cases. It is an erroneous configuration.

Of course we should try to stop people from running erroneous
configurations, but we should do so without penalizing all users.

With that in mind, also considering that dtdev is the list of devices to
be assigned, I think dtdev should be the list of all devices, even
non-DMA masters. It makes a lot of sense also because today is really
difficult to explain to a user that "yes, dtdev is the devices to be
assigned but no, if the device is a UART you don't have to add it to
dtdev because it is not a DMA master". It would be a lot easier if the
list of assigned devices was comprehensive, including both DMA masters
and not DMA masters.

So I think we should either:

a) extend dtdev to cover all devices, including non-DMA masters
b) or add a new "device_assignment" property which is like dtdev but is
   the list of both DMA masters and non-DMA masters

Either way, when non-DMA masters are present in the
dtdev/device_assignment list we could either:
    c) require force-assign-without-iommu="true" in dom.cfg
    d) or print a warning like:
    "WARNING: device assignment safety for device XXX cannot be
    verified. Please make sure XXX is not a DMA mastering device."


All these options are good I think. My preference is a) + d), so extend
dtdev and print a warning if non-DMA masters are in the list. We don't
necessarily need a new hypercall but that is also an option.

I think this discussion was a long time coming and I am glad we are
having it now. We have a lot of room for improvement! I don't want to
scope-creep this patch series, but I hope that this last bit could be
done as part of this series if we find agreement in the community.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.