
Re: [Xen-devel] [PATCH 1/2] xen/swiotlb: If iommu=soft was not passed in on > 4GB, don't turn it on.



On Fri, Jul 27, 2012 at 08:27:39AM +0100, Jan Beulich wrote:
> >>> On 26.07.12 at 22:43, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> 
> >>> wrote:
> > If we boot a 64-bit guest with more than 4GB memory, the SWIOTLB
> > gets turned on:
> > PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
> > software IO TLB [mem 0xfb43d000-0xff43cfff] (64MB) mapped at 
> > [ffff8800fb43d000-ffff8800ff43cfff]
> > 
> > which is OK if we had PCI devices, but not if we did not. In a PV
> > guest the SWIOTLB ends up asking the hypervisor for precious lowmem
> > memory - and 64MB of it per guest. On a 32GB machine, this limits the
> > amount of guests that are 4GB to start due to lowmem exhaustion.
> > 
> > What we do is detect whether the user supplied e820_hole=1
> > parameter, which is used to construct an E820 that is similar to
> > the machine  - so that the PCI regions do not overlap with RAM regions.
> > We check for that by looking at the E820 and seeing if it diverges
> > from the standard - and if so (and if iommu=soft was not turned on),
> > we disable the check pci_swiotlb_detect_4gb code.
> > 
> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
> > ---
> >  arch/x86/xen/pci-swiotlb-xen.c |   26 ++++++++++++++++++++++++++
> >  1 files changed, 26 insertions(+), 0 deletions(-)
> > 
> > diff --git a/arch/x86/xen/pci-swiotlb-xen.c b/arch/x86/xen/pci-swiotlb-xen.c
> > index 967633a..56f373e 100644
> > --- a/arch/x86/xen/pci-swiotlb-xen.c
> > +++ b/arch/x86/xen/pci-swiotlb-xen.c
> > @@ -8,6 +8,10 @@
> >  #include <xen/xen.h>
> >  #include <asm/iommu_table.h>
> >  
> > +#include <asm/e820.h>
> > +#include <asm/dma.h>
> > +#include <asm/iommu.h>
> > +
> >  int xen_swiotlb __read_mostly;
> >  
> >  static struct dma_map_ops xen_swiotlb_dma_ops = {
> > @@ -24,7 +28,19 @@ static struct dma_map_ops xen_swiotlb_dma_ops = {
> >     .unmap_page = xen_swiotlb_unmap_page,
> >     .dma_supported = xen_swiotlb_dma_supported,
> >  };
> > +bool __init e820_has_acpi(void)
> > +{
> > +   int i;
> >  
> > +   /* Check if the user supplied the e820_hole parameter
> > +    * which would create a machine looking E820 region. */
> > +   for (i = 0; i < e820.nr_map; i++) {
> > +           if ((e820.map[i].type == E820_ACPI) ||
> > +               (e820.map[i].type == E820_NVS))
> > +                   return true;
> 
> Tying this decision to the presence of ACPI regions in E820 is
> problematic for two reasons imo: For one, it precludes cleaning
> up this (bogus!) construct where it gets produced (PV DomU-s
> really shouldn't ever see such E820 entries, they should get
> converted to simple reserved entries, to wipe any notion of
> ACPI presence). And second it ties you to running on systems
> that actually have ACPI, whereas it is my rudimentary
> understanding that systems with e.g. SFI would not have any
> ACPI).

Right. The other idea was to check XenBus for the existence
of a vpci backend, but XenBus is not up yet at this stage.

Perhaps what I should check for is the existence of two E820_RSV
and two E820_RAM regions - and that would be a normal PV guest.
Anything that is outside of that scope would be considered
a PCI PV guest?
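
The region-count idea above can be sketched as a small userspace
mock-up. This is only an illustration: the mock_* names are hypothetical
stand-ins for the kernel's E820 map, and the "two reserved plus two RAM
entries means a plain PV guest" rule is the assumption floated above,
not verified behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's E820 types. The numeric values
 * follow the usual E820 type codes; the mock_* names are hypothetical. */
enum mock_e820_type {
	MOCK_E820_RAM      = 1,
	MOCK_E820_RESERVED = 2,
	MOCK_E820_ACPI     = 3,
	MOCK_E820_NVS      = 4,
};

struct mock_e820_entry {
	uint64_t addr;
	uint64_t size;
	uint32_t type;
};

/* Returns true if the map holds exactly two RAM and two reserved
 * entries and nothing else, i.e. the layout assumed here for a plain
 * PV guest. Any other layout is treated as a PCI PV guest. */
static bool mock_e820_looks_like_plain_pv(const struct mock_e820_entry *map,
					  size_t nr)
{
	size_t i, ram = 0, rsv = 0;

	assert(map != NULL || nr == 0);

	for (i = 0; i < nr; i++) {
		switch (map[i].type) {
		case MOCK_E820_RAM:
			ram++;
			break;
		case MOCK_E820_RESERVED:
			rsv++;
			break;
		default:
			/* ACPI, NVS or any other type: not a plain layout. */
			return false;
		}
	}
	return ram == 2 && rsv == 2;
}
```

Unlike the e820_has_acpi() check in the patch, this does not depend on
ACPI entries being present, but it still hard-codes an assumption about
what the toolstack generates for a guest without PCI devices.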

The other thought I had was to skip this check altogether and
do one of the following:
1). Initialize SWIOTLB when xen-pcifront starts up and detects
    that it has devices (so initialization happens later, similar
    to how IA64 does it) - but I am not sure how the PCI-DMA works
    with these late bloomers (especially as one could just make
    xen-pcifront be a module).
2). If xen-pcifront starts and does not detect any backends
    it calls swiotlb_free. But that also requires the PCI-DMA
    to swap in the dma_ops, and I am not entirely sure how
    that would work out.
3). Have an "early_init" xen-pcifront component that does a
    quick XenBus init (similar to how hvmloader checks for
    DMI overwrites) and if it finds vpci then declares it is
    time to turn SWIOTLB on.
4). The other thing is to wrap this code with something like
    this:

#ifdef CONFIG_SWIOTLB
#ifdef CONFIG_XEN_PCI_FRONTEND
        if (...) /* do the check as outlined in 3). */
#else /* PCI_FRONTEND is not present, so we won't need SWIOTLB */
        swiotlb = 0;
        iommu = 1;
#endif
#endif

That would take care of the built-in issues.
