
RE: [PATCH v10 7/7] vtd: use a bit field for dma_pte


  • To: Jan Beulich <jbeulich@xxxxxxxx>, Paul Durrant <paul@xxxxxxx>
  • From: "Tian, Kevin" <kevin.tian@xxxxxxxxx>
  • Date: Mon, 30 Nov 2020 05:29:04 +0000
  • Cc: Paul Durrant <pdurrant@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Mon, 30 Nov 2020 05:29:35 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-topic: [PATCH v10 7/7] vtd: use a bit field for dma_pte

> From: Jan Beulich
> Sent: Saturday, November 28, 2020 12:02 AM
> 
> On 20.11.2020 14:24, Paul Durrant wrote:
> > @@ -709,20 +709,23 @@ static void dma_pte_clear_one(struct domain
> *domain, uint64_t addr,
> >      page = (struct dma_pte *)map_vtd_domain_page(pg_maddr);
> >      pte = page + address_level_offset(addr, 1);
> >
> > -    if ( !dma_pte_present(*pte) )
> > +    if ( !pte->r && !pte->w )
> 
> I think dma_pte_present() wants to stay, so we would have to touch
> only one place when adding support for x.
> 
> >      {
> >          spin_unlock(&hd->arch.mapping_lock);
> >          unmap_vtd_domain_page(page);
> >          return;
> >      }
> >
> > -    dma_clear_pte(*pte);
> > -    *flush_flags |= IOMMU_FLUSHF_modified;
> > +    pte->r = pte->w = false;
> > +    smp_wmb();
> > +    pte->val = 0;
> >
> >      spin_unlock(&hd->arch.mapping_lock);
> >      iommu_sync_cache(pte, sizeof(struct dma_pte));
> 
> Just as an observation - in an earlier patch I think there was a
> code sequence having these the other way around. I think we want
> to settle one one way of doing this (flush then unlock, or unlock
> then flush). Kevin?
> 

Agree. Generally speaking, 'unlock then flush' is preferred, since
the spinlock doesn't protect the IOMMU anyway.

> > @@ -1775,15 +1778,12 @@ static int __must_check
> intel_iommu_map_page(struct domain *d, dfn_t dfn,
> >      page = (struct dma_pte *)map_vtd_domain_page(pg_maddr);
> >      pte = &page[dfn_x(dfn) & LEVEL_MASK];
> >      old = *pte;
> > -
> > -    dma_set_pte_addr(new, mfn_to_maddr(mfn));
> > -    dma_set_pte_prot(new,
> > -                     ((flags & IOMMUF_readable) ? DMA_PTE_READ  : 0) |
> > -                     ((flags & IOMMUF_writable) ? DMA_PTE_WRITE : 0));
> > -
> > -    /* Set the SNP on leaf page table if Snoop Control available */
> > -    if ( iommu_snoop )
> > -        dma_set_pte_snp(new);
> > +    new = (struct dma_pte){
> > +        .r = flags & IOMMUF_readable,
> > +        .w = flags & IOMMUF_writable,
> > +        .snp = iommu_snoop,
> > +        .addr = mfn_x(mfn),
> > +    };
> 
> We still haven't settled on a newer gcc baseline, so this kind of
> initializer is still not allowed (as in: will break the build) for
> struct-s with unnamed sub-struct-s / sub-union-s.
> 
> > @@ -2611,18 +2611,18 @@ static void
> vtd_dump_page_table_level(paddr_t pt_maddr, int level, paddr_t gpa,
> >              process_pending_softirqs();
> >
> >          pte = &pt_vaddr[i];
> > -        if ( !dma_pte_present(*pte) )
> > +        if ( !pte->r && !pte->w )
> >              continue;
> >
> >          address = gpa + offset_level_address(i, level);
> >          if ( next_level >= 1 )
> > -            vtd_dump_page_table_level(dma_pte_addr(*pte), next_level,
> > +            vtd_dump_page_table_level(pfn_to_paddr(pte->addr), next_level,
> >                                        address, indent + 1);
> >          else
> >              printk("%*sdfn: %08lx mfn: %08lx\n",
> >                     indent, "",
> >                     (unsigned long)(address >> PAGE_SHIFT_4K),
> > -                   (unsigned long)(dma_pte_addr(*pte) >> PAGE_SHIFT_4K));
> > +                   (unsigned long)(pte->addr));
> 
> Could you also drop the no longer needed pair of parentheses. I
> further suspect the cast isn't needed (anymore?). (Otoh I think
> I recall oddities with gcc's printf()-style format checking and
> direct passing of bitfields. But if that's a problem, I think
> one of the earlier ones already introduced such an issue. So
> perhaps we can wait until someone actually confirms there is an
> issue - quite likely this someone would be me anyway.)
> 
> > --- a/xen/drivers/passthrough/vtd/iommu.h
> > +++ b/xen/drivers/passthrough/vtd/iommu.h
> > @@ -244,38 +244,21 @@ struct context_entry {
> >  #define level_size(l) (1 << level_to_offset_bits(l))
> >  #define align_to_level(addr, l) ((addr + level_size(l) - 1) & 
> > level_mask(l))
> >
> > -/*
> > - * 0: readable
> > - * 1: writable
> > - * 2-6: reserved
> > - * 7: super page
> > - * 8-11: available
> > - * 12-63: Host physical address
> > - */
> >  struct dma_pte {
> > -    u64 val;
> > +    union {
> > +        uint64_t val;
> > +        struct {
> > +            bool r:1;
> > +            bool w:1;
> > +            unsigned int reserved0:1;

The 'X' bit is ignored, rather than reserved, when execute requests
are not reported or are disabled.

Thanks
Kevin

> > +            unsigned int ignored0:4;
> > +            bool ps:1;
> > +            unsigned int ignored1:3;
> > +            bool snp:1;
> > +            uint64_t addr:52;
> 
> As per the doc I look at this extends only to bit 51 at most.
> Above are 11 ignored bits and (in leaf entries) the TM one.
> 
> Considering the differences between leaf and intermediate
> entries, perhaps leaf-only fields could gain a brief comment?
> 
> Jan


 

