
Re: [Xen-devel] [RFC 0 PATCH 3/3] PVH dom0: construct_dom0 changes



On Fri, Oct 04, 2013 at 07:53:20AM +0100, Jan Beulich wrote:
> >>> On 03.10.13 at 02:53, Mukesh Rathor <mukesh.rathor@xxxxxxxxxx> wrote:
> > On Fri, 27 Sep 2013 07:54:39 +0100
> > "Jan Beulich" <JBeulich@xxxxxxxx> wrote:
> > 
> >> >>> On 27.09.13 at 02:17, Mukesh Rathor <mukesh.rathor@xxxxxxxxxx>
> >> >>> wrote:
> >> > On Thu, 26 Sep 2013 09:02:41 +0100 "Jan Beulich"
> >> > <JBeulich@xxxxxxxx> wrote:
> >> >> >>> On 25.09.13 at 23:03, Mukesh Rathor <mukesh.rathor@xxxxxxxxxx>
> >> >> >>> wrote:
> >> >> > +/*
> >> >> > + * Set the 1:1 map for all non-RAM regions for dom 0. Thus, dom0 will have
> >> >> > + * the entire io region mapped in the EPT/NPT.
> >> >> > + *
> >> >> > + * PVH FIXME: The following doesn't map MMIO ranges when they sit above the
> >> >> > + *            highest E820 covered address.
> >> >> 
> >> >> This absolutely needs fixing before this can go in.
> >> > 
> >> > Any suggestions on how to fix it? Mapping all the way to end could
> >> > result in a huge hap table. 
> >> 
> >> You'll probably need a call down from Dom0 telling you where it
> >> finds/puts MMIO resources. Or perhaps that could be mapped
> >> in on demand from the EPT fault handler (since these regions
> >> shouldn't be subject to DMA, and hence IOMMU faults shouldn't
> >> occur - perhaps that's even a reason to not share page tables
> >> at least in dom0-strict mode)?
> > 
> > Thinking about mapping in on demand from the EPT fault handler, how
> > would I know if the access beyond last e820 entry is genuine and not 
> > a faulty pte in a buggy guest? Could I consult the mmconfig table (?) 
> > or the ACPI table in xen? Any pointers would be helpful... my 
> > knowledge runs out quickly here.
> 
> You'd have to inspect all the BARs of the devices the domain owns.
> Hence the thought of having Dom0 tell you about those resource
> assignments.

Doesn't that happen via PHYSDEVOP_pci_device_add hypercalls?
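To make the BAR check concrete, here is a rough sketch of the lookup the EPT
fault handler would need, assuming Xen already holds a per-domain table of
BAR ranges (however it gets populated). The struct and function names are
hypothetical and do not correspond to existing Xen code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one MMIO BAR owned by a domain. */
struct bar_range {
    uint64_t start;  /* first byte of the BAR */
    uint64_t size;   /* length in bytes, assumed non-zero */
};

/* Return true if addr falls inside any BAR in the table. */
static bool addr_in_owned_bar(const struct bar_range *bars, size_t nr,
                              uint64_t addr)
{
    for (size_t i = 0; i < nr; i++) {
        /* Subtract first so that start + size can never overflow. */
        if (addr >= bars[i].start && addr - bars[i].start < bars[i].size)
            return true;
    }
    return false;
}
```

A fault on an address for which this returns false would be treated as a
guest bug rather than mapped in on demand.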
> 
> > FWIW, at present pv-ops linux doesn't allow any mmio access beyond
> > the last e820 entry. So, we'd need a fix there too. In my very orig
> > patch, I was updating all IO mappings on demand by putting a hook
> > in linux native_pte_update if it was _PAGE_BIT_IOMAP. Another 
> > possibility would be to do that for any mappings above the last
> > e820 entry. What do you think?
> 
> Special casing IOMAP page table creation might be an option, but
> has the downside of allowing kernel bugs to propagate into Xen's
> view of the world.
> 
> > For testing purposes, do you have reference for hardware? I don't see 
> > any here with such configuration.
> 
> Nothing specific, but I know that SR-IOV virtual functions easily
> cause kernels to run out of MMIO space below 4G (namely when
> the hole is only around 1Gb or even less), and Intel must have
> knowledge of graphics cards having so huge a frame buffer that
> it can only be mapped above 4G.

Right, but the BIOS Writer's Guide and docs all talk about setting the MCFG
up for that. Granted, the MCFG (or was it the ACPI spec?) says that the
MCFG regions do not have to be defined in the E820.
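
As an illustration of the gap in question, a minimal sketch of testing
whether a region starts at or above the highest address the E820 covers.
The entry type is a simplified, hypothetical stand-in rather than Xen's
own E820 structure, and entries are assumed not to wrap around 2^64.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical memory map entry (any E820 type). */
struct map_entry {
    uint64_t addr;  /* start of the region */
    uint64_t size;  /* length in bytes */
};

/* Return the end (exclusive) of the highest region in the map. */
static uint64_t highest_covered_end(const struct map_entry *map, size_t nr)
{
    uint64_t end = 0;

    for (size_t i = 0; i < nr; i++) {
        uint64_t e = map[i].addr + map[i].size;

        if (e > end)
            end = e;
    }
    return end;
}

/* True if a region starting at 'start' lies entirely above the map. */
static bool range_above_map(const struct map_entry *map, size_t nr,
                            uint64_t start)
{
    return start >= highest_covered_end(map, nr);
}
```

Any MMIO region for which this returns true is one the current 1:1 setup
would not map.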

You also pointed out that the MCFG entries might come from the ACPI
DSDT, which I think all comes back to dom0 parsing this and
providing this sort of information back to the hypervisor?


