
Re: [Xen-devel] [PATCH v3 0/2] VT-d flush issue



> On 21.12.2015 at 9:23pm, <JBeulich@xxxxxxxx> wrote:
> >>> On 21.12.15 at 14:08, <quan.xu@xxxxxxxxx> wrote:
> >>  On 21.12.2015 at 8:50pm, <JBeulich@xxxxxxxx> wrote:
> >> >>> On 21.12.15 at 13:28, <quan.xu@xxxxxxxxx> wrote:
> >> > On 21.12.2015 at 7:47pm, <JBeulich@xxxxxxxx> wrote:
> >> >> >>> On 20.12.15 at 14:57, <quan.xu@xxxxxxxxx> wrote:
> >> >> > 2. If VT-d is buggy, does the hardware_domain continue to work
> >> >> > well with PCIe devices / DRAM despite a DMA remapping error?
> >> >> >    I think not. Furthermore, I think the VMM can NOT run a
> >> >> > normal HVM domain without device passthrough.
> >> >>
> >> >> In addition to what Andrew said - VT-d is effectively not in use
> >> >> for domains without PT device.
> >> >
> >> > IMO, when VT-d is enabled but not working correctly, the DMA/interrupt
> >> > behavior of these PCIe devices (disks/NICs...) is not predictable.
> >> > Even assuming that VT-d is effectively not in use for domains without
> >> > a PT device, the virtualization infrastructure itself is not trusted.
> >> > I think it is also not secure to run PV domains.
> >> >
> >> >> Impacting all such domains by crashing the hypervisor just because
> >> >> (in the extreme case) a single domain with PT devices exhibited a
> >> >> flush issue is a no-go imo.
> >> >>
> >> >
> >> > IMO, a VT-d (IEC/Context/IOTLB) flush issue is not a single-domain
> >> > behavior; it is a hypervisor and infrastructure issue.
> >> > An ATS device's Device-TLB flush is a single-domain issue.
> >> > Back to our original goal: my patch set is for the ATS flush issue, right?
> >>
> >> You mean you don't like this entailing clean up of other code?
> >
> > Jan, for ARM/AMD, I really have no knowledge to fix it, and I have no
> > ARM/AMD hardware to verify it. If I need to fix these common parts of
> > Intel/ARM/AMD, I think I need to make sure Xen still compiles correctly
> > and that I do not break the logic.
> 
> You indeed aren't expected to fix AMD or ARM code, but it may be necessary to
> adjust that code to make error propagation work.
> 
> >> I'm sorry, but I'm
> >> afraid you won't get away without - perhaps the VT-d maintainers
> >> could help here, but in the end you have to face that it was mainly
> >> Intel people who introduced the code which now needs fixing up, so I
> >> consider it not exactly unfair for you (as a
> >> company) to do this work.
> >>
> >
> > Furthermore, I found out that
> >      if an IEC/IOTLB/Context flush fails, then panic.
> >      Else if a Device-TLB flush fails, we'll hide the target ATS device
> > and kill the domain owning this ATS device. If the impacted domain is
> > the hardware domain, just issue a warning.
> >
> >      Then, it is fine to _not_ check all the way up the Device-TLB
> > flush call trees (maybe that is our next topic of discussion).
> 
> I don't follow - this sounds more or less like the model you've been
> following in past versions, yet it was that which prompted the request to
> properly propagate errors.
> 
Jan,
Maybe we can first discuss the big picture of how to deal with
IEC/IOTLB/Context and Device-TLB flush errors, and then discuss the details.
We can ignore some points on the way up the Device-TLB flush call trees,
such as:

   iommu_hwdom_init()
   *|--hd->platform_ops->map_page(d, gfn, mfn, mapping);
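
For comparison, here is a minimal sketch of what propagating an error at a
call site like this could look like, instead of ignoring it. It is NOT the
actual Xen code: the struct, the function names, and the simplified
map_page signature are all hypothetical and only illustrate the idea.

```c
#include <errno.h>

/* Hypothetical stand-in for the platform ops; not the Xen structure. */
struct iommu_ops_sketch {
    /* Returns 0 on success or a negative errno value on failure. */
    int (*map_page)(unsigned long gfn, unsigned long mfn);
};

/* Hypothetical backend that always succeeds. */
int map_always_ok(unsigned long gfn, unsigned long mfn)
{
    (void)gfn;
    (void)mfn;
    return 0;
}

/* Hypothetical backend that fails on gfn 3, to exercise the error path. */
int map_fails_at_gfn_3(unsigned long gfn, unsigned long mfn)
{
    (void)mfn;
    return gfn == 3 ? -EIO : 0;
}

/*
 * Hypothetical init loop: identity-map nr_pages pages and stop at the
 * first failure, handing the error code back to the caller rather than
 * dropping it. *mapped reports how many pages were mapped successfully.
 */
int hwdom_init_sketch(const struct iommu_ops_sketch *ops,
                      unsigned long nr_pages, unsigned long *mapped)
{
    unsigned long i;

    for (i = 0; i < nr_pages; i++) {
        int rc = ops->map_page(i, i);

        if (rc) {
            *mapped = i;
            return rc; /* propagate; the caller decides what to do */
        }
    }

    *mapped = nr_pages;
    return 0;
}
```

Whether this particular call site needs such a check is exactly the open
question; the sketch only shows the shape of the change.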


Moreover, if we are on the same page, I am glad to write patches for all of
the VT-d issues, including the IOMMU_WAIT_OP issue, etc.
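
To make the big picture concrete, the policy I described earlier could be
sketched roughly as below. All names are hypothetical and this is not Xen
code; in particular, what exactly happens to the device when the hardware
domain is affected is my assumption (the sketch only warns).

```c
#include <stdbool.h>

/* Hypothetical flush classes; illustrative names, not Xen identifiers. */
enum flush_kind {
    FLUSH_IEC,
    FLUSH_CONTEXT,
    FLUSH_IOTLB,
    FLUSH_DEVICE_TLB
};

/* Hypothetical actions taken after a flush attempt. */
enum flush_action {
    ACT_NONE,                        /* flush succeeded */
    ACT_PANIC,                       /* infrastructure-wide failure */
    ACT_HIDE_DEVICE_AND_KILL_DOMAIN, /* failure confined to one domain */
    ACT_WARN                         /* hardware domain: warning only */
};

/*
 * Map a flush result to an action:
 *  - rc == 0                        -> nothing to do
 *  - IEC/Context/IOTLB flush error  -> panic
 *  - Device-TLB flush error         -> hide the ATS device and kill the
 *                                      owning domain, unless that domain
 *                                      is the hardware domain (warn only)
 */
enum flush_action flush_error_policy(enum flush_kind kind, int rc,
                                     bool is_hardware_domain)
{
    if (rc == 0)
        return ACT_NONE;

    if (kind == FLUSH_DEVICE_TLB)
        return is_hardware_domain ? ACT_WARN
                                  : ACT_HIDE_DEVICE_AND_KILL_DOMAIN;

    return ACT_PANIC;
}
```

The point of keeping the decision in one place is that callers higher up
the Device-TLB flush call trees would not each need their own handling.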

-Quan