Re: [Xen-devel] [PATCH v2] IOMMU: make DMA containment of quarantined devices optional
> From: Tian, Kevin
> Sent: Monday, December 16, 2019 1:58 PM
>
> > From: Jürgen Groß <jgross@xxxxxxxx>
> > Sent: Friday, December 13, 2019 11:36 PM
> >
> > On 13.12.19 15:45, Jan Beulich wrote:
> > > On 13.12.2019 15:24, Jürgen Groß wrote:
> > >> On 13.12.19 15:11, Jan Beulich wrote:
> > >>> On 13.12.2019 14:46, Jürgen Groß wrote:
> > >>>> On 13.12.19 14:38, Jan Beulich wrote:
> > >>>>> On 13.12.2019 14:31, Jürgen Groß wrote:
> > >>>>>> Maybe I have misunderstood the current state, but I thought that it
> > >>>>>> would just silently hide quirky devices without imposing a security
> > >>>>>> risk. We would not learn which devices are quirky, but OTOH I doubt
> > >>>>>> we'd get many reports about those in case your patch goes in.
> > >>>>>
> > >>>>> We don't want or need such reports; that's not the point. The
> > >>>>> security risk comes from the quirkiness of the devices: admins
> > >>>>> may wrongly think all is well and expose quirky devices to
> > >>>>> insufficiently trusted guests. (I say this fully realizing that
> > >>>>> exposing devices to untrusted guests almost always carries a
> > >>>>> certain level of risk.)
> > >>>>
> > >>>> Do we _know_ those devices are problematic from a security
> > >>>> standpoint? Normally the IOMMU should do the isolation just fine.
> > >>>> If it doesn't, then it's not the quirky device that is problematic,
> > >>>> but the IOMMU.
> > >>>>
> > >>>> I thought the problem was that the quirky devices would not stop
> > >>>> all (read) DMA even when being unassigned from the guest, resulting
> > >>>> in fatal IOMMU faults. The dummy page should stop those faults from
> > >>>> happening, resulting in a more stable system.
> > >>>
> > >>> IOMMU faults by themselves do not impact stability (they add
> > >>> processing overhead, yes). The problem, according to Paul's
> > >>> description, is that the occurrence of at least some forms of IOMMU
> > >>> faults (not-present ones, it seems, as opposed to permission
> > >>> violation ones) is fatal to certain systems. Irrespective of the
> > >>> sink page used after de-assignment, a guest can arrange for IOMMU
> > >>> faults to occur even while it still has the device assigned. Hence
> > >>> it is important for the admin to know that their system (not the
> > >>> particular device) behaves in this undesirable way.
> > >>
> > >> So how does the admin learn this? It's not as if your patch would
> > >> result in a system crash or hang all the time, right? That would be
> > >> the case only if there either is a malicious (on purpose or due to a
> > >> bug) guest which gets the device assigned, or if there happens to be
> > >> a pending DMA operation when the device gets unassigned.
> > >
> > > I didn't claim the change would cover all cases. All I am claiming
> > > is that it increases the chances of admins becoming aware of reasons
> > > not to pass through devices to certain guests.
> >
> > So combined with your answer this means to me:
> >
> > With your patch (or the original one reverted) a DoS will occur either
> > due to a malicious guest or in case a DMA is still pending. As a result
> > the admin will no longer pass this device to any untrusted guest.
> >
> > With the current 4.13-staging a DoS will occur only due to a malicious
> > guest. The admin will then no longer pass this device to any untrusted
> > guest.
> >
> > So right now, without any untrusted guest, there is no DoS, while with
> > your patch a DoS is possible. How is that better?
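[To make the "sink page" / "dummy page" idea in the exchange above
concrete: below is a minimal, self-contained C sketch of the concept
only, not Xen's actual quarantine code. The names quarantine_init,
iommu_map_page and sink_page_maddr, the IOVA range, and the flag value
are all hypothetical stand-ins. The point is that a de-assigned quirky
device's stray DMA reads resolve to one harmless scratch page instead
of raising the (possibly fatal) not-present faults discussed above.]

#include <stdint.h>

#define PAGE_SIZE_4K        4096ULL
#define QUARANTINE_IOVA_END (1ULL << 32) /* assumed: cover 32-bit DMA space */
#define IOMMUF_readable     1u           /* hypothetical read-only flag */

/* Machine address of the scratch page; assume it is allocated and
 * zeroed elsewhere before quarantine_init() runs. */
static uint64_t sink_page_maddr;

/* Stub: a real implementation would walk/allocate the quarantine
 * domain's IOMMU page tables and write a PTE mapping iova -> maddr. */
static int iommu_map_page(uint64_t iova, uint64_t maddr, unsigned int flags)
{
    (void)iova; (void)maddr; (void)flags;
    return 0;
}

int quarantine_init(void)
{
    for ( uint64_t iova = 0; iova < QUARANTINE_IOVA_END;
          iova += PAGE_SIZE_4K )
    {
        /* Every address a quirky device may still read resolves to the
         * same harmless page; writes stay unmapped and still fault. */
        int rc = iommu_map_page(iova, sink_page_maddr, IOMMUF_readable);

        if ( rc )
            return rc;
    }

    return 0;
}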
>
> I'd suggest separating run-time DoS from the original quarantine
> purpose of this patch.
>
> For quarantine, I'm with Jan that giving the admin a chance to know
> whether quarantine is required is important. Say an admin just gets a
> sample device from a new vendor and needs to decide whether his
> employer should put such a device into their production system. It's
> essential to have a default configuration which can warn on any
> possible violation of the expectations on a good assignable device.
> The admin can then consult the Xen user guide to find out what the
> warning means and do whatever is required (usually more scrutiny than
> the warning itself) to figure out whether the identified problems can
> be made safe (e.g. by enabling quarantine) or are real indicators of a
> bogus implementation (in which case the device should not be used).
> Having quarantine on by default means that every admin would have to
> remember that Xen already applies band-aids to otherwise-benign
> warnings, and would have to explicitly turn those options off to do
> such an evaluation, which to me is not realistic.
>
> On the other hand, although a sink page can help mitigate run-time
> DoS, how to handle it should be an admin policy. If the DoS is caused
> by a malicious guest, it makes more sense to warn the admin and then
> switch to the sink page once a DoS detection condition is met (e.g.
> the number of faults within a time window exceeds a threshold). Lots
> of such warnings from multiple VMs may imply a massive attack, in
> which case the admin may adopt more comprehensive analysis or
> defensive means. It is also possible that the DoS is caused by a
> vulnerability within the guest; in that case the guest may want to
> contain the attack itself through a virtual IOMMU, so Xen needs to see
> the fault and forward it to the guest through the vIOMMU. In either
> case I don't think the current patch is sufficient, so it's better to
> leave the DoS handling to be improved separately (of course the sink
> logic can be leveraged), given the 4.13 timeline.

btw, I forgot that today we already have some level of DoS detection.
You can check pci_check_disable_device, which does exactly the above
threshold check and then disables the PCI_COMMAND register. The
missing part might be that the device still issues DMAs even after
the disabling. In that case, VT-d at least allows software to disable
fault reporting for a given device. Such a method better suits a
warn-and-disable model for handling a DoS attack than blindly setting
up a sink page.

Thanks
Kevin
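[For reference, the warn-and-disable pattern Kevin points to in
pci_check_disable_device looks roughly like the following. This is a
paraphrased, self-contained sketch of the idea, not the verbatim Xen
source: the struct layout, the simulated command word, the now_ms
helper, and the threshold/window constants are all assumptions. The
idea: count IOMMU faults per device, decaying the count when faults
are spread out in time, and once a sustained burst crosses the
threshold, clear the bus-master bit in PCI_COMMAND so the device can
no longer issue DMA.]

#include <stdint.h>
#include <time.h>

#define PCI_COMMAND        0x04  /* config-space offset (unused in stub) */
#define PCI_COMMAND_MASTER 0x4   /* bus-master enable bit */
#define FAULT_THRESHOLD    8     /* stand-in for Xen's PT_FAULT_THRESHOLD */
#define FAULT_WINDOW_MS    10    /* assumed decay window */

struct pci_dev {
    unsigned int fault_count;
    uint64_t fault_time_ms;      /* time of the last recorded fault */
    uint16_t command;            /* simulated PCI_COMMAND register; real
                                  * code uses config-space accessors */
};

static uint64_t now_ms(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

void record_iommu_fault(struct pci_dev *pdev)
{
    uint64_t now = now_ms();

    /* Faults spaced farther apart than the window decay the count,
     * so only a sustained burst trips the threshold. */
    if ( now - pdev->fault_time_ms > FAULT_WINDOW_MS )
        pdev->fault_count >>= 1;
    pdev->fault_time_ms = now;

    if ( ++pdev->fault_count < FAULT_THRESHOLD )
        return;

    /* Past the threshold: stop the device DMAing ourselves, since the
     * guest cannot be relied on to quiesce it. */
    pdev->command &= ~PCI_COMMAND_MASTER;
}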