
Re: [Xen-devel] [Patch RFC 00/13] VT-d Asynchronous Device-TLB Flush for ATS Device



> >> >> >>> On September 29, 2015 at 3:22 PM, <JBeulich@xxxxxxxx> wrote:
> >>> On 29.09.15 at 04:53, <quan.xu@xxxxxxxxx> wrote:
> >>>> Monday, September 28, 2015 2:47 PM,<JBeulich@xxxxxxxx> wrote:
> >> >>> On 28.09.15 at 05:08, <quan.xu@xxxxxxxxx> wrote:
> >> >>>> Thursday, September 24, 2015 12:27 AM, Tim Deegan wrote:
> >
> > Regarding Tim's suggestion "to make the IOMMU table take typed refcounts
> > to anything it points to, and only drop those refcounts when the flush
> > completes":
> >
> > From the IOMMU point of view, could it walk through the IOMMU table to
> > find these pages and take typed refcounts on them?
> > These pages may be owned by hardware_domain, dummy, an HVM guest, etc.
> > Could I narrow it down to HVM guests? That is, take refcounts not on
> > anything it points to, but only on HVM guest related pages. This would
> > simplify the design.
> 
> I don't follow. Why would you want to walk page tables? And why would a HVM
> guest have pages other than those owned by itself or granted access to by
> another guest mapped in its IOMMU page tables?

It is tricky. Let's ignore it.

I was analysing how the IOMMU table could take typed refcounts on anything
it points to.
I know the IOMMU table and the EPT table may share the same page
table ('iommu_hap_pt_share = 1').
My idea was then to walk the IOMMU table and take typed refcounts.

> In any event - the ref-counting
> would need to happen as you _create_ the mappings, not at some later point.
> 
As a general rule, agreed.

When creating the mappings, under what conditions should typed refcounts be taken?



> >  Just to check, do typed refcounts refer to the following?
> >
> > --- a/xen/include/asm-x86/mm.h
> > +++ b/xen/include/asm-x86/mm.h
> > @@ -183,6 +183,7 @@ struct page_info
> >  #define PGT_seg_desc_page PG_mask(5, 4)  /* using this page in a GDT/LDT?  */
> >  #define PGT_writable_page PG_mask(7, 4)  /* has writable mappings?         */
> >  #define PGT_shared_page   PG_mask(8, 4)  /* CoW sharable page              */
> > +#define PGT_dev_tlb_page  PG_mask(9, 4)  /* Maybe in Device-TLB mapping?   */
> >  #define PGT_type_mask     PG_mask(15, 4) /* Bits 28-31 or 60-63.           */
> >
> > * I define a new typed refcount, PGT_dev_tlb_page.
> 
> Why? I.e. why won't a base ref for r/o pages and a writable type-ref for r/w
> ones suffice, just like we do everywhere else?
> 

I think it is different from r/o or writable.

The page has been freed from the domain, but the Device-TLB flush has not
yet completed.
The page is then _not_ r/o or writable; it can only be accessed through DMA.

A new type might require modifying a lot of related code, though.
r/o or writable refs are acceptable to me.
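
To check my understanding of "a base ref for r/o pages and a writable
type-ref for r/w ones", here is a minimal standalone sketch. It is not Xen
code: struct toy_page and the toy_* helpers are hypothetical names I made up,
and I am assuming that a r/w mapping holds a base ref in addition to its
writable type ref.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for Xen's struct page_info; not the real layout. */
struct toy_page {
    unsigned long count;          /* general ("base") references */
    unsigned long writable_refs;  /* writable type references */
};

/* Called when an IOMMU mapping is created, never at some later point. */
static void toy_iommu_map_get(struct toy_page *pg, bool writable)
{
    pg->count++;                  /* every mapping holds a base ref */
    if (writable)
        pg->writable_refs++;      /* r/w mappings also hold a writable type ref */
}

/* Called once the mapping is gone and its flush has completed. */
static void toy_iommu_map_put(struct toy_page *pg, bool writable)
{
    if (writable) {
        assert(pg->writable_refs > 0);
        pg->writable_refs--;
    }
    assert(pg->count > 0);
    pg->count--;
}

/* A page may go back to the allocator only when nothing references it. */
static bool toy_page_can_be_freed(const struct toy_page *pg)
{
    return pg->count == 0;
}
```

With this, the condition at mapping time is only the permission of the
mapping itself, so no new page type would be needed.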


> >> Once you do that, I
> >> don't think there'll be a reason to pause the guest for the duration
> >> of the flush.
> >> And really (as pointed out before) pausing the guest would get us
> >> _far_ away from how real hardware behaves.
> >>
> >
> > Once I do that, I think the guest should still be paused if the
> > Device-TLB flush has not completed.
> >
> > As mentioned in a previous email, for example:
> > The guest calls the do_memory_op hypercall to free pageX (gfn1 <---> mfn1),
> > so gfn1 is in the freed portion of the GPA space.
> > Assume that there is a mapping (gfn1 <---> mfn1) in the Device-TLB. If we
> > return to guest mode before the Device-TLB flush has completed, the guest
> > may call the do_memory_op hypercall to allocate a new pageY (mfn2) at
> > gfn1.
> > Then:
> > the EPT mapping is (gfn1 <---> mfn2), while the Device-TLB mapping is
> > still (gfn1 <---> mfn1).
> >
> > Until the Device-TLB flush has completed, DMA associated with gfn1 may
> > still write data to pageX (gfn1 <---> mfn1), but pageX will be
> > released to Xen once the Device-TLB flush completes. It may therefore be
> > incorrect for the guest to read data from gfn1 after the DMA (the page
> > associated with gfn1 is now pageY).
> >
> > Right?
> 
> No. The extra ref taken will prevent the page from getting freed. And as long
> as the flush is in process, DMA to/from the page is going to produce undefined
> results (affecting only the guest). But note that there may be reasons for an
> external to the guest entity invoking the operation which ultimately led to
> the flush to do this on a paused guest only. But that's not of concern to the
> hypervisor side implementation.
> 

Reasonable.
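
To convince myself, I modelled the pageX case from my example as a small
standalone sketch. It is only an illustration under my own assumptions, not
Xen code: struct sk_page, struct sk_flush and the sk_* helpers are
hypothetical, and a single pending flush pins a single page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model, not Xen code: one page and one pending flush. */
struct sk_page {
    unsigned long refs;   /* references currently held on the page */
    bool freed;           /* set once the last reference is dropped */
};

struct sk_flush {
    struct sk_page *pinned;  /* page kept alive until the flush completes */
};

static void sk_get(struct sk_page *pg)
{
    pg->refs++;
}

static void sk_put(struct sk_page *pg)
{
    assert(pg->refs > 0);
    if (--pg->refs == 0)
        pg->freed = true;    /* only now may the page be reused */
}

/* Guest frees the page: the domain's reference is dropped at once, while
 * the reference taken at map time stays parked on the in-flight flush. */
static void sk_unmap_and_queue_flush(struct sk_page *pg, struct sk_flush *f)
{
    f->pinned = pg;
    sk_put(pg);
}

/* Flush completion: the device can no longer reach the page via its TLB. */
static void sk_flush_done(struct sk_flush *f)
{
    sk_put(f->pinned);
    f->pinned = NULL;
}
```

In this model pageX cannot go back to Xen while the flush is outstanding,
even if the guest has already populated gfn1 with pageY, so stale DMA can
only touch memory that nobody else owns yet.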

Jan, thanks!

-Quan


> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

