
Re: [Xen-devel] Community call: PCI Emulation - Future Direction (Wed, May 2nd, UTC 16:00-17:00 / BST 17:00-18:00) - Minutes



On Fri, 4 May 2018 15:38:49 +0000
Lars Kurth <lars.kurth@xxxxxxxxxx> wrote:
[...]
>Julien: where would the reset code live then?
>Christopher: would want to avoid Dom0 having access to the config space. The 
>VM hosting
>the toolstack will need to exercise control over access to the config space.
>Roger: Another option would be to do this inside of Xen via a hypercall
>Julien: moving reset from Linux into Xen would be quite complex.
>Paul: Handling the reset and quirks within Xen seems perfectly reasonable
>Christopher: handling the sequence to reset the device is quite complex
>Stefano: Aside from who does what are there any specific requirements we need 
>to pay
>attention to for complex devices such as GPUs (such as IOMMU mapping)
>Alexey: saw devices which do not like secondary bus reset (e.g. some NVIDIA 
>GPUs) -
>When we use the device and restart the domain, it will hang during boot.
>Roger: know there are issues with some devices.
>Stefano: Surprisingly high number of quirks. So the question is who maintains 
>the quirks. If
>we moved it to Xen, we may not get contributions to fix quirks. We would have 
>to monitor
>Linux and then move code, which increases the code size
>Roger: The code would be somewhere in any case, either Xen or Dom0 kernel: so 
>why does
>the codesize matter?
>Daniel: the code size does not go away, but the question is how it can be 
>isolated
>Stefano: depending on where it is, the stability of the system is directly 
>impacted
>Alexey: need to provide device specific quirks to reset the device
>Alexey: Have not looked at Linux quirks for resetting devices. Reset is 
>mandatory (must be
>performed in many cases such as domain restart, ...). Can move from secondary 
>reset to
>other reset methods and work around specific quirks.
>Rich: Mentioned that Oracle posted some reset code recently for XenClient into 
>Linux.
>**Next steps:**
>* Should we start a discussion on the mailing list on how to resolve the reset 
>question.
>ACTION: Rich to start the thread (the people participating in the reset 
>discussion to
>be CC’ed)

First of all, sorry for my poor English skills.

Regarding secondary bus reset (SBR) vs. some NVIDIA GPUs -- the exact
behavior was as follows:

- if we do SBR on the affected device BEFORE it has been initialized by
  its proprietary driver -- there is no problem: SBR works as expected,
  with no issues upon domain restart (for example, if we restart the
  domain before the guest OS has switched to its native display driver).

- the same applies if no proprietary driver is installed at all -- SBR
  causes no problems at any point in the domain's lifetime

- if we do SBR on the affected device AFTER it has been initialized by
  its proprietary driver -- there will be a problem after restarting
  the domain with such a GPU in primary mode: the GPU's video BIOS will
  hang polling one of the GPU registers. Interestingly enough, if we
  hide the Option ROM and skip video BIOS execution, leaving the GPU to
  the proprietary driver only, then there are no issues. The drawback
  is that there will be no early video output (until the Welcome
  screen).

- in both cases, performing SBR looks fine at first -- the screen goes
  blank, the PCI config space fields are reset (including BAR values),
  and both functions are affected by SBR.

- in all cases, a proprietary reset for these GPUs works perfectly

It seems that these particular devices do not fully conform to the PCIe
specification when responding to a PCIe link reset and leave some
state/registers unreset. This (apparently buggy) behavior is not
common to all NVIDIA GPUs, but multiple video cards exhibiting this
issue were encountered.

With such devices in mind, a possible priority order for reset methods
could be (the relative priority of SBR and FLR is debatable):

Quirks (if any) -> SBR -> FLR -> (PM reset, though this one seems
pretty much useless)

Quirks can be handled in the usual way -- probing the device by VID/DID
against a quirk table and calling a custom reset callback if a match is
found.
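
To make the above concrete, here is a minimal sketch of a quirk table
lookup combined with the priority order suggested earlier. All type,
field and function names are hypothetical (this is not existing Xen or
Linux code), the device ID in the table is a placeholder, and the
callback is a stub rather than a real reset sequence:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types and names for illustration; not Xen or Linux code. */
struct pt_dev {
    uint16_t vendor;      /* PCI vendor ID (VID) */
    uint16_t device;      /* PCI device ID (DID) */
    int sbr_usable;       /* secondary bus reset can be issued for it */
    int has_flr;          /* device advertises Function Level Reset */
    int has_pm_reset;     /* device supports a power-management reset */
};

enum reset_method { RESET_NONE, RESET_QUIRK, RESET_SBR, RESET_FLR, RESET_PM };

typedef int (*reset_fn)(struct pt_dev *dev);

struct reset_quirk {
    uint16_t vendor;
    uint16_t device;
    reset_fn reset;       /* device-specific reset callback */
};

/* Stub standing in for a vendor-specific reset sequence. */
static int example_gpu_reset(struct pt_dev *dev)
{
    (void)dev;
    return 0;
}

/* 0x10de is the NVIDIA vendor ID; the device ID 0xffff is a placeholder. */
static const struct reset_quirk quirk_table[] = {
    { 0x10de, 0xffff, example_gpu_reset },
};

/* Return the quirk entry matching the device's VID/DID, or NULL. */
const struct reset_quirk *find_reset_quirk(const struct pt_dev *dev)
{
    size_t i;

    assert(dev != NULL);
    for (i = 0; i < sizeof(quirk_table) / sizeof(quirk_table[0]); i++)
        if (quirk_table[i].vendor == dev->vendor &&
            quirk_table[i].device == dev->device)
            return &quirk_table[i];
    return NULL;
}

/* Pick a method in the order: quirk -> SBR -> FLR -> PM reset. */
enum reset_method choose_reset_method(const struct pt_dev *dev)
{
    if (find_reset_quirk(dev))
        return RESET_QUIRK;
    if (dev->sbr_usable)
        return RESET_SBR;
    if (dev->has_flr)
        return RESET_FLR;
    if (dev->has_pm_reset)
        return RESET_PM;
    return RESET_NONE;
}
```

A caller would invoke the quirk's reset callback when RESET_QUIRK is
returned and fall back to the generic methods otherwise. Swapping the
SBR and FLR checks is a one-line change if FLR should be preferred.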

One thing to note is that both quirk reset and SBR may affect multiple
devices (i.e. functions).

So a good topic to discuss might be designing a new interface for
resetting passed-through devices, one that takes into account the
dependencies between PCI/PCIe devices and the reset methods that may
affect a group of devices at the same time. The existing
('do_flr'-like) interface currently knows nothing about the relations
between devices.
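
As a rough illustration of what such an interface would need to know,
the sketch below computes the set of functions affected by a reset of a
given scope and checks that all of them belong to the same domain. The
structures, the scope enum and the ownership check are my assumptions
for the sake of the example, not an existing interface. The bus scope
is also simplified: it matches functions on the same bus number only
and ignores subordinate buses behind further bridges.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: how far a given reset method reaches. */
enum reset_scope {
    SCOPE_FUNCTION,   /* e.g. FLR: only the function itself */
    SCOPE_SLOT,       /* e.g. a quirk reset hitting all functions of a device */
    SCOPE_BUS         /* e.g. SBR: everything on the same bus */
};

/* Hypothetical description of one passed-through PCI function. */
struct pt_func {
    uint16_t seg;
    uint8_t bus, slot, fn;
    int owner;        /* ID of the domain the function is assigned to */
};

/* Non-zero if 'other' is hit when 'target' is reset with the given scope. */
static int in_reset_scope(const struct pt_func *target,
                          const struct pt_func *other,
                          enum reset_scope scope)
{
    if (target->seg != other->seg || target->bus != other->bus)
        return 0;
    if (scope == SCOPE_BUS)
        return 1;
    if (target->slot != other->slot)
        return 0;
    if (scope == SCOPE_SLOT)
        return 1;
    return target->fn == other->fn;
}

/* Store the indices of all affected functions in 'out'; return the count. */
size_t reset_group(const struct pt_func *all, size_t n, size_t target,
                   enum reset_scope scope, size_t *out)
{
    size_t i, count = 0;

    assert(target < n);
    for (i = 0; i < n; i++)
        if (in_reset_scope(&all[target], &all[i], scope))
            out[count++] = i;
    return count;
}

/* Non-zero if every affected function has the same owner as the target. */
int reset_group_single_owner(const struct pt_func *all, size_t n,
                             size_t target, enum reset_scope scope)
{
    size_t i;

    assert(target < n);
    for (i = 0; i < n; i++)
        if (in_reset_scope(&all[target], &all[i], scope) &&
            all[i].owner != all[target].owner)
            return 0;
    return 1;
}
```

With something like this, a reset request for a GPU function could be
refused (or downgraded to a narrower method) when its audio function,
or another device on the same bus, is assigned to a different domain.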

>## Stefano/Julien: ARM guest pci-passthrough
>
>Julien: the idea was not really speaking about PCI passthrough, but to follow 
>what is
>happening on ARM. Don’t have any specific things to talk about.
>Stefano: The challenge on ARM has been a few incompatible implementations in 
>the config
>space. Initially we didn't know what to do. We then decided to start simple 
>and implement the
>standard compliant functions in the HV. And then cross the bridge of 
>incompatible config
>space registers when we come to it.
>Julien: mostly looking on what is going on. Not currently working on PCI 
>passthrough
>Roger: asks whether suitable for ARM
>Julien: in principle yes, but the different implementations (e.g. for timers). 
>IOMMU may not
>translate all the hardware (some commands may bypass). Not sure whether the 
>same
>challenge exists on x86.
>
>## Rich: discuss the level of security support that will be asserted in SUPPORT.md for driver domains which contain untrusted PCI devices.
>
>* Will Xen security support be different for SR-IOV devices? GPUs vs. NICs?
>* There have been past discussions on this topic and a proposed 
>PCI-iommu-bugs.txt
>file to help Xen users and developers understand the risks [2][3][4] that may 
>arise
>from a hostile device and potentially buggy firmware. If we can document 
>specific
>risks, we can ask firmware developers to make specific improvements to improve 
>the
>security of PCI emulation.
>* There is an active effort [4] underway to improve firmware security in 
>servers (and
>eventually desktops), including a reduction of attack surface due to SMM. 
>There is
>also work underway [5][6] to perform secure boot between individual PCI 
>devices and
>server motherboards. Some of these concepts may already be deployed in Azure.
>* Several stakeholders will be attending or presenting at the PSEC [6] 
>conference.
>[1] Performance Isolation Exposure in Virtualized Platforms with PCI
>Passthrough I/O Sharing,
>https://mediatum.ub.tum.de/doc/1187609/972322.pdf
>[2] Securing Self-Virtualizing Ethernet Devices,
>https://www.usenix.org/system/files/conference/usenixsecurity15/sec15-paper-smolyar.pdf
>[3] Denial-of-Service Attacks on PCI Passthrough Devices,
>http://publications.andre-richter.com/richter2015denial.pdf
>[4] Open Compute Open System Firmware,
>http://www.opencompute.org/wiki/Open_System_Firmware
>[5] Open Compute Security,
>http://www.opencompute.org/wiki/Security
>[6] Firmware attestation:
>https://www.platformsecuritysummit.com/prepare/#attestation
>[0] Notes for upcoming PCI emulation call thread:
>https://lists.xenproject.org/archives/html/xen-devel/2018-05/msg00091.html
>Note: we have no stake-holders from the security team on the call, which makes 
>this a
>difficult discussion.
>Rich: Andrew, Roger mentioned some problems related to security support in a 
>previous
>discussion <Lars: is there a link to it?>
>Rich: Earlier in this meeting we mentioned blacklisting, but thought we were 
>going to use
>whitelisting?
>Alexey: we know nothing about vendor specific capabilities for some devices
>which we may need to expose, so whitelisting is problematic
>Roger: maybe add a list of extra capabilities.
>Rich: roughly agrees. Maybe someone can write down what the plan is such that 
>it can be
>reviewed?
>Alexey: there are a series of patches in this area to expose capabilities 
>after the Q
>patches (such as support for dynamic fields?).
>Rich: once we can document precisely how this works we can revisit the 
>security support
>question
>Roger: part of the problem was that some devices expose a configuration space 
>on a Base
>Address Register (e.g. for Windows drivers).
>* Could whitelist some known devices
>* Paul confirms that some devices did that - ACTION: Paul to write up a couple
>
>## AOB
>
>```
>* Continue on the mailing list
>* If needed try and arrange a call with a more narrow topic
>```
>
>

