[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC v1 5/5] xen/arm: add SCI mediator support for DomUs


  • To: Stefano Stabellini <sstabellini@xxxxxxxxxx>
  • From: Oleksii Moisieiev <Oleksii_Moisieiev@xxxxxxxx>
  • Date: Thu, 23 Dec 2021 19:11:44 +0000
  • Accept-language: en-US
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=epam.com; dmarc=pass action=none header.from=epam.com; dkim=pass header.d=epam.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=InUROT1dthpWZRzvpRvyssmT17opcVX/IN42VN0LxE4=; b=YyxqSNUYASzv+aENtKaGcQrYZzmcSfDfJWF9OXCev9q1H2MgqJPqUdQYRY9LEGIrt6/q8hzxhQWjifk2froiZ4QUZu0qqwuXzJK83pQ+RArc/kPMPNwVXqHNG/rmRI88NCmcB4QwwIfWyKe+JAIuJWdfunvT+mzkrqR4/3XmyyoKKu7JqqbVcPrOO3BU3Vov76W5BDUh6EcLRh7ebCYhQ1EEgVhVeK7h2PBfN8VASVMpQ9L7KfLRMhyPxAh9TokG3iH8henrFXxrFCs0EVd2SlVjo2f+e7GRr6CnGwntUAeZZ220uUOA5hPZ6/cNHc4OKnAOLqnJa7Vy8+Sb9YFB3g==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=gtZJyA1jsUqD8qfzGWQIu1zplsEjF7KAQXgZE/3JBnH3lXw7i9Gn107902mltyJZGzLzxhm0gS0f1L3B4D+5g3hiK/WhR+N0E17m3U1FKdfPS9GTMoQAWWbxObHJaQ0/qi2bLN+cTTPPOfj06ZDN5+zwisWMPTpIiHYjnQbIYf6q47gFJ+AJEh9hYhetV0vQi2Q1sBwuA4zhEatV1YKrgKGpL92HCjBTFY9+K7C8iW+oo3Lci0VUkl2pcB/558H0Y3CsiypUGeyU34lDOjSfn4CFPtz2K0bRgpesye5iF8lRl850kWpJmQChZm6bU0NA4FZkWw7VfFrV5xkqgrEwhw==
  • Cc: Julien Grall <julien@xxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Oleksandr <olekstysh@xxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Juergen Gross <jgross@xxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>
  • Delivery-date: Thu, 23 Dec 2021 19:12:15 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHX8M3LpzWz4l8PQUiIA2nAUV0H+qw0P2iAgAJehQCABnNTgIAAc5MAgADFCgCAAB+dgIAAA54AgAARtoCAABUwgIAA0niAgAEZxoA=
  • Thread-topic: [RFC v1 5/5] xen/arm: add SCI mediator support for DomUs

Hi Stefano,

On Wed, Dec 22, 2021 at 06:23:13PM -0800, Stefano Stabellini wrote:
> On Wed, 22 Dec 2021, Julien Grall wrote:
> > > > > > > To me dtdev and XEN_DOMCTL_assign_device are appropriate because
> > > > > > > they
> > > > > > > signify device assignment of one or more devices. We are not 
> > > > > > > adding
> > > > > > > any
> > > > > > > additional "meaning" to them. It is just that we'll automatically
> > > > > > > detect
> > > > > > > and generate any SCMI configurations based on the list of assigned
> > > > > > > devices. Because if SCMI is enabled and a device is assigned to 
> > > > > > > the
> > > > > > > guest, then I think we want to add the device to the SCMI list of
> > > > > > > devices automatically.
> > > > > > 
> > > > > > I am OK with re-using dtdev/XEN_DOMCTL_assign_device however there 
> > > > > > is
> > > > > > a pitfall with that approach.
> > > > > > 
> > > > > > At the moment, they are only used for device protected by the
> > > > > > IOMMU. If the device is not protected by the IOMMU then it will 
> > > > > > return
> > > > > > an error.
> > > > > IIRC there was a change, that allowed to assign device without a
> > > > > IOMMU. At
> > > > > least we discussed this internally.
> > > > 
> > > > I am not aware of any upstream. Do you have a pointer if there is any
> > > > public discussion?
> > > 
> > > No, this is the first public discussion on this topic.
> > > 
> > > > 
> > > > > > 
> > > > > > Now, with your approach we may have a device that is not protected 
> > > > > > by
> > > > > > the IOMMU but require to a SCMI configuration.
> > > > > You need to protect only DMA-capable devices.
> > > > 
> > > > Xen doesn't know if the device is DMA-capable or not. So...
> > > > 
> > > 
> > > But it knows if there is a iommus=<> node present for the device.
> > 
> > Not really. Not all the platforms have IOMMUs and not all the platforms with
> > IOMMU have DMA-capable device protected by an IOMMU.
> > 
> > > 
> > > > > 
> > > > > > I don't think it would be sensible to just return "succeed" here
> > > > > > because technically you are asking to assign a non-protected
> > > > > > device. But at the same time, it would prevent a user to assign a
> > > > > > non-DMA capable device.
> > > > > > 
> > > > > > So how do you suggest to approach this?
> > > > > Well, in my head assign_device ideally should do the following:
> > > > > 1. Assign IOMMU if it is configured for the device
> > > > 
> > > > ... with this approach you are at the risk to let the user passthrough
> > > > a device that cannot be protected.
> > > > 
> > > > > 2. Assign SCMI access rights
> > > > > (Not related to this patch series, but...)
> > > > > 3. Assign IRQs
> > > > > 4. Assign IO memory ranges.
> > > > > Points 3. and 4. would allow us to not provide additional irq=[] and
> > > > > iomem=[] entries in a guest config.
> > > > 
> > > > That could only work if your guest is using the same layout as the
> > > > host.
> > > 
> > > Agreed. But in my experience, in most cases this is the true. We worked
> > > with Renesas and IMX hardware and in both cases iomems were
> > > mapped 1:1.
> > > 
> > > > Otherwise, there is a risk it will clash with other parts of the
> > > > memory layout.
> > > > 
> > > 
> > > > Today, guests started via the toolstack is only using a virtual
> > > > layout, so you would first need to add support to use the host memory
> > > > layout.
> > > 
> > > I can't say for all the possible configurations in the wild, but I'm
> > > assuming that in most cases it will be fine to use 1:1 mapping. For
> > > those more exotic cases it would be possible to implement some
> > > configuration option like iomem_map=[mfn:gfn].
> > Well, the Xen memory layout is not something that is stable nor AFAIK based 
> > on
> > any memory layout. In fact, there is no such generic layout on Arm.
> > 
> > It is quite possible that it will work well with 1:1 MMIO on some platform
> > (e.g. Renesas, IMX) but you can't expect to work for every Xen release or 
> > all
> > the platforms.
> 
> Yeah this is a true problem. Thankfully with Penny's series we'll be
> able to create domUs with the host memory layout (although dom0less
> only but it is a step forward).
> 
>  
> > > As Stefano pointed, right now user needs to provide 3 configuration
> > > options to pass a device to a guest: dt_dev, irq, iomem. But in fact,
> > > Xen is already knowing about irq and iomem from device tree.
> > 
> > I understand and this is not ideal. I would be happy to consider your
> > approach. However, this will have to enabled only when the guest is re-using
> > the host layout.
> 
> It looks like we all agree that today configuring device assignment with
> Xen is too complicated and would like for it to be simpler. This thread
> has some excellent ideas on how to address the issue. Skip to the end if
> you are only interested in this patch series.
> 
> 
> # Future Ideas
> 
> A great suggestion by Julien is to start supporting the dom0less partial
> device tree format in xl/libxl as well so that we can have a single
> "device_tree" option in dom.cfg instead of 4 (device_tree, iomem, irqs,
> dtdev).
> 
> Even with that implemented, the user has still to provide a partial dtb,
> xen,reg and xen,path. I think this is a great step forward and we should
> do that, if nothing else to make it easier to switch between dom0less
> and normal domU configurations. But the number of options and
> information that the user has to provide is still similar to what we
> have today.
> 
> After that, I think we need something along the lines of what Volodymyr
> suggested. Let's say that the user only provides "dtdev" and
> "device_tree" in dom.cfg.  We could:
> 
> - read interrupts from the  "interrupts" property like we do for dom0less
> - read "xen,reg" for the mapping, if "xen,reg" is missing, read "reg"
>   and assume 1:1 (we could try the mapping and print an error on
>   failure, or we could only do 1:1 if the domain is entirely 1:1)
> - optionally read "xen,path" to populate dtdev
> - if an IOMMU configuration is present, then also configure the IOMMU
>   for the device, if not then "xen,force-assign-without-iommu" must be
>   present
> - assign SCMI access rights
> 
> 
> Now we only have:
> - device_tree in dom.cfg
> - dtdev in dom.cfg (or xen,path in the partial DTB)
> - xen,force-assign-without-iommu in the partial DTB
> 
> 
> It would be good if we could remove "xen,force-assign-without-iommu"
> because at this stage it is the only Xen-specific property remaining in
> the partial DTB. If we could get rid of it, it would make it easier to
> write/generate the partial DTB because it becomes a strict subset of the
> corresponding host node. It would enable us to automatically generate it
> (we are going to do experiments with it at Xilinx in the next few
> months) and it would be easier to explain to users how to write it.
> The partial DTB usually starts as a copy of the corresponding host node
> plus some edits. The fewer edits are required, the better.
> 
> But it is difficult because of the reasons mentioned by Julien. In Xen
> we cannot know if a device is not behind an IOMMU because is not a DMA
> master (so safe to assign) or because the system simply doesn't have
> enough IOMMU coverage (so not safe to assign). Thankfully the hardware
> has been improving in recent years and there are more and more platforms
> with full IOMMU coverage. I think we should make it easier for users on
> these well-behave platforms.
> 
> At least, we could move the "xen,force-assign-without-iommu" option from
> the partial DTB to the VM config file dom.cfg. Something like:
> 
> force-assign-without-iommu="true"
> or
> platform-iommu-safe="true"
> 
> That way, it is global rather than per-device and doesn't have to be
> added by the user by hand when creating the partial DTB.
> 
> 
> # This patch series
> 
> Now in regards to this specific series and the SCMI options today, I
> think it is OK to have a per-domain sci="scmi_smc", but I think we
> should try to avoid another detailed list of device paths or list of
> IDs in addition to dtdev.
> 
> dtdev already specifies the device tree nodes to be assigned to the
> guest. Based on that list we can also do SCMI assignment.
> 
> As Julien pointed out, the issue is: what if a device needs SCMI
> assignment but is not a DMA master (hence there is no IOMMU
> configuration on the host)?
> 
> Attempting to assign a DMA mastering device without IOMMU protection is
> not just unsafe, it will actively cause memory corruptions in most
> cases. It is an erroneous configuration.
> 
> Of course we should try to stop people from running erroneous
> configurations, but we should do so without penalizing all users.
> 
> With that in mind, also considering that dtdev is the list of devices to
> be assigned, I think dtdev should be the list of all devices, even
> non-DMA masters. It makes a lot of sense also because today is really
> difficult to explain to a user that "yes, dtdev is the devices to be
> assigned but no, if the device is a UART you don't have to add it to
> dtdev because it is not a DMA master". It would be a lot easier if the
> list of assigned devices was comprehensive, including both DMA masters
> and not DMA masters.
> 
> So I think we should either:
> 
> a) extend dtdev to cover all devices, including non-DMA masters
> b) or add a new "device_assignment" property which is like dtdev but is
>    the list of both DMA masters and non-DMA masters
> 
> Either way, when non-DMA masters are present in the
> dtdev/device_assignment list we could either:
>     c) require force-assign-without-iommu="true" in dom.cfg
>     d) or print a warning like:
>     "WARNING: device assignment safety for device XXX cannot be
>     verified. Please make sure XXX is not a DMA mastering device."
> 
> 
> All these options are good I think. My preference is a) + d), so extend
> dtdev and print a warning if non-DMA masters are in the list. We don't
> necessarily need a new hypercall but that is also an option.
> 
> I think this discussion was a long time coming and I am glad we are
> having it now. We have a lot of room for improvement! I don't want to
> scope-creep this patch series, but I hope that this last bit could be
> done as part of this series if we find agreement in the community.

For me a) + d) looks good. I will implement it in v2 if everybody ok
with this approach.


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.