[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v5 1/5] pci: add pci_iomap_wc() variants



On Thu, Apr 30, 2015 at 10:36:04AM -0700, Luis R. Rodriguez wrote:
> From: "Luis R. Rodriguez" <mcgrof@xxxxxxxx>
> 
> This allows drivers to take advantage of write-combining
> when possible. The PCI specification does not allow for us
> to automatically identify a memory region which needs
> write-combining so drivers have to identify these areas
> on their own. There is IORESOURCE_PREFETCH but as clarified
> by Michael and confirmed later by Bjorn, PCI prefetch bit
> merely means bridges can combine writes and prefetch reads.
> Prefetch does not affect ordering rules and does not allow
> writes to be collapsed [0]. WC is stronger, it allows collapsing
> and changes ordering rules. WC can also hurt latency as small
> writes are buffered. Because of all this drivers needs to
> know what they are doing, we can't set a write-combining
> preference flag in the pci core automatically for drivers.
> 
> Lastly although there is also arch_phys_wc_add() this makes
> use of architecture specific write-combining *hacks* and
> the only one currently defined and used is MTRR for x86.
> MTRRs are legacy, limited in number, have restrictive size
> constraints, and are known to interact pooly with the BIOS.
> MTRRs should only really be considered on old video framebuffer
> drivers. If we made ioremap_wc() and similar calls start
> automatically adding MTRRs, then performance will vary wildly
> with the order of driver loading because we'll run out of MTRRs
> part-way through bootup.
> 
> There are a few motivations for phasing out of MTRR and
> helping driver change over to use write-combining with PAT:
> 
> a) Take advantage of PAT when available
> 
> b) Help bury MTRR code away, MTRR is architecture specific and on
>    x86 its replaced by PAT
> 
> c) Help with the goal of eventually using _PAGE_CACHE_UC over
>    _PAGE_CACHE_UC_MINUS on x86 on ioremap_nocache() (see commit
>    de33c442e titled "x86 PAT: fix performance drop for glx,
>    use UC minus for ioremap(), ioremap_nocache() and
>    pci_mmap_page_range()")
> 
> [0] https://lkml.org/lkml/2015/4/21/714
> 
> Cc: Toshi Kani <toshi.kani@xxxxxx>
> Cc: Andy Lutomirski <luto@xxxxxxxxxxxxxx>
> Cc: Suresh Siddha <sbsiddha@xxxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: Juergen Gross <jgross@xxxxxxxx>
> Cc: Daniel Vetter <daniel.vetter@xxxxxxxx>
> Cc: Dave Airlie <airlied@xxxxxxxxxx>
> Cc: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
> Cc: Antonino Daplas <adaplas@xxxxxxxxx>
> Cc: Jean-Christophe Plagniol-Villard <plagnioj@xxxxxxxxxxxx>
> Cc: Tomi Valkeinen <tomi.valkeinen@xxxxxx>
> Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
> Cc: Arnd Bergmann <arnd@xxxxxxxx>
> Cc: Michael S. Tsirkin <mst@xxxxxxxxxx>
> Cc: venkatesh.pallipadi@xxxxxxxxx
> Cc: Stefan Bader <stefan.bader@xxxxxxxxxxxxx>
> Cc: Ville Syrjälä <syrjala@xxxxxx>
> Cc: Mel Gorman <mgorman@xxxxxxx>
> Cc: Vlastimil Babka <vbabka@xxxxxxx>
> Cc: Borislav Petkov <bp@xxxxxxx>
> Cc: Davidlohr Bueso <dbueso@xxxxxxx>
> Cc: konrad.wilk@xxxxxxxxxx
> Cc: ville.syrjala@xxxxxxxxxxxxxxx
> Cc: david.vrabel@xxxxxxxxxx
> Cc: jbeulich@xxxxxxxx
> Cc: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> Cc: linux-fbdev@xxxxxxxxxxxxxxx
> Cc: linux-kernel@xxxxxxxxxxxxxxx
> Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
> Cc: linux-pci@xxxxxxxxxxxxxxx
> Signed-off-by: Luis R. Rodriguez <mcgrof@xxxxxxxx>

Acked-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>

> ---
> 
> This v5 makes the code return NULL for IORESOURCE_IO and fixes the commit
> log to clarify the conclusions reached for MTRR and our review of
> IORESOURCE_PREFETCH.
> 
>  include/asm-generic/pci_iomap.h | 14 ++++++++++
>  lib/pci_iomap.c                 | 61 
> +++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 75 insertions(+)
> 
> diff --git a/include/asm-generic/pci_iomap.h b/include/asm-generic/pci_iomap.h
> index 7389c87..b1e17fc 100644
> --- a/include/asm-generic/pci_iomap.h
> +++ b/include/asm-generic/pci_iomap.h
> @@ -15,9 +15,13 @@ struct pci_dev;
>  #ifdef CONFIG_PCI
>  /* Create a virtual mapping cookie for a PCI BAR (memory or IO) */
>  extern void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned long 
> max);
> +extern void __iomem *pci_iomap_wc(struct pci_dev *dev, int bar, unsigned 
> long max);
>  extern void __iomem *pci_iomap_range(struct pci_dev *dev, int bar,
>                                    unsigned long offset,
>                                    unsigned long maxlen);
> +extern void __iomem *pci_iomap_wc_range(struct pci_dev *dev, int bar,
> +                                     unsigned long offset,
> +                                     unsigned long maxlen);
>  /* Create a virtual mapping cookie for a port on a given PCI device.
>   * Do not call this directly, it exists to make it easier for architectures
>   * to override */
> @@ -34,12 +38,22 @@ static inline void __iomem *pci_iomap(struct pci_dev 
> *dev, int bar, unsigned lon
>       return NULL;
>  }
>  
> +static inline void __iomem *pci_iomap_wc(struct pci_dev *dev, int bar, 
> unsigned long max)
> +{
> +     return NULL;
> +}
>  static inline void __iomem *pci_iomap_range(struct pci_dev *dev, int bar,
>                                           unsigned long offset,
>                                           unsigned long maxlen)
>  {
>       return NULL;
>  }
> +static inline void __iomem *pci_iomap_wc_range(struct pci_dev *dev, int bar,
> +                                            unsigned long offset,
> +                                            unsigned long maxlen)
> +{
> +     return NULL;
> +}
>  #endif
>  
>  #endif /* __ASM_GENERIC_IO_H */
> diff --git a/lib/pci_iomap.c b/lib/pci_iomap.c
> index bcce5f1..9604dcb 100644
> --- a/lib/pci_iomap.c
> +++ b/lib/pci_iomap.c
> @@ -52,6 +52,46 @@ void __iomem *pci_iomap_range(struct pci_dev *dev,
>  EXPORT_SYMBOL(pci_iomap_range);
>  
>  /**
> + * pci_iomap_wc_range - create a virtual WC mapping cookie for a PCI BAR
> + * @dev: PCI device that owns the BAR
> + * @bar: BAR number
> + * @offset: map memory at the given offset in BAR
> + * @maxlen: max length of the memory to map
> + *
> + * Using this function you will get a __iomem address to your device BAR.
> + * You can access it using ioread*() and iowrite*(). These functions hide
> + * the details if this is a MMIO or PIO address space and will just do what
> + * you expect from them in the correct way. When possible write combining
> + * is used.
> + *
> + * @maxlen specifies the maximum length to map. If you want to get access to
> + * the complete BAR from offset to the end, pass %0 here.
> + * */
> +void __iomem *pci_iomap_wc_range(struct pci_dev *dev,
> +                              int bar,
> +                              unsigned long offset,
> +                              unsigned long maxlen)
> +{
> +     resource_size_t start = pci_resource_start(dev, bar);
> +     resource_size_t len = pci_resource_len(dev, bar);
> +     unsigned long flags = pci_resource_flags(dev, bar);
> +
> +     if (len <= offset || !start)
> +             return NULL;
> +     len -= offset;
> +     start += offset;
> +     if (maxlen && len > maxlen)
> +             len = maxlen;
> +     if (flags & IORESOURCE_IO)
> +             return NULL;
> +     if (flags & IORESOURCE_MEM)
> +             return ioremap_wc(start, len);
> +     /* What? */
> +     return NULL;
> +}
> +EXPORT_SYMBOL_GPL(pci_iomap_wc_range);
> +
> +/**
>   * pci_iomap - create a virtual mapping cookie for a PCI BAR
>   * @dev: PCI device that owns the BAR
>   * @bar: BAR number
> @@ -70,4 +110,25 @@ void __iomem *pci_iomap(struct pci_dev *dev, int bar, 
> unsigned long maxlen)
>       return pci_iomap_range(dev, bar, 0, maxlen);
>  }
>  EXPORT_SYMBOL(pci_iomap);
> +
> +/**
> + * pci_iomap_wc - create a virtual WC mapping cookie for a PCI BAR
> + * @dev: PCI device that owns the BAR
> + * @bar: BAR number
> + * @maxlen: length of the memory to map
> + *
> + * Using this function you will get a __iomem address to your device BAR.
> + * You can access it using ioread*() and iowrite*(). These functions hide
> + * the details if this is a MMIO or PIO address space and will just do what
> + * you expect from them in the correct way. When possible write combining
> + * is used.
> + *
> + * @maxlen specifies the maximum length to map. If you want to get access to
> + * the complete BAR without checking for its length first, pass %0 here.
> + * */
> +void __iomem *pci_iomap_wc(struct pci_dev *dev, int bar, unsigned long 
> maxlen)
> +{
> +     return pci_iomap_wc_range(dev, bar, 0, maxlen);
> +}
> +EXPORT_SYMBOL_GPL(pci_iomap_wc);
>  #endif /* CONFIG_PCI */
> -- 
> 2.3.2.209.gd67f9d5.dirty
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.