
Re: [Xen-devel] Is: PVH - how to solve maxmem != memory scenario? Was:Re: [PATCH] libxl: create PVH guests with max memory assigned



On Tue, Aug 05, 2014 at 04:05:28PM +0100, David Vrabel wrote:
> On 05/08/14 15:18, Konrad Rzeszutek Wilk wrote:
> > On Tue, Aug 05, 2014 at 01:08:22PM +0200, Roger Pau Monné wrote:
> >> On 05/08/14 11:34, David Vrabel wrote:
> >>>
> >>> I now regret accepting the PVH support in Linux without a clear
> >>> specification of what PVH actually is.
> > 
> > It is evolving :-)
> 
> I don't think this is a valid excuse for not having documentation.
> 
> > My personal opinion is that the easiest path is the best.
> > If the way forward is simply to make Xen's P2M and E820 exactly
> > the same and let the Linux guest figure out, based on 'nr_pages',
> > how many RAM pages are really provided, then let's do it that way.
> 
> Here's a rough design for two different options that I think would be
> sensible.
> 
> [Note: the hypercall names are from memory and may be wrong.  I also
> can't remember whether PVH guests get a start_info page, and the
> existing docs don't say.]
> 
> The first is the PV-like approach.
> 
>   The toolstack shall construct an e820 memory map including all
>   appropriate holes for MMIO regions.  This memory map will be well
>   ordered, no regions shall overlap and all regions shall begin and end
>   on page boundaries.
> 
>   The toolstack shall issue a XENMEM_set_memory_map hypercall for this
>   memory map.
> 
>   The toolstack shall issue a XENMEM_set_maximum_reservation hypercall.
> 
>   Xen (or toolstack via populate_physmap? I can't remember) shall
>   populate the guest's p2m using the provided e820 map.  Frames shall
>   be added starting from the first E820_RAM region, fully
>   populating each RAM region before moving on to the next, until the
>   initial number of pages is reached.
> 
>   Xen shall write this initial number of pages into the nr_pages field
>   of the start_info frame.
> 
>   The guest shall issue a XENMEM_memory_map hypercall to obtain the
>   e820 memory map (as set by the toolstack).
> 
>   The guest shall obtain the initial number of pages from
>   start_info->nr_pages.
> 
>   The guest may then iterate over the e820 map, adding (sub) RAM
>   regions that are unpopulated to the balloon driver (or similar).
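
FWIW, the guest-side part of this (the last three steps) boils down
to the walk in the standalone sketch below.  It is only a model of
the logic: the e820 map, the nr_pages value and the printf standing
in for the balloon call are all invented for illustration, it is not
the actual Linux code.

/*
 * Model of the guest-side e820 walk: count populated frames region
 * by region until nr_pages is exhausted; everything after that is
 * handed to the balloon driver.  In the real guest the map comes
 * from XENMEM_memory_map and nr_pages from start_info->nr_pages;
 * here both are invented.
 */
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define E820_RAM      1
#define E820_RESERVED 2
#define PAGE_SHIFT    12

struct e820entry {
    uint64_t addr;
    uint64_t size;
    uint32_t type;
};

/* Example map: 3GB RAM, a 1GB MMIO hole, 1GB RAM above 4GB. */
static const struct e820entry map[] = {
    { 0x000000000ULL, 0xc0000000ULL, E820_RAM      },
    { 0x0c0000000ULL, 0x40000000ULL, E820_RESERVED },
    { 0x100000000ULL, 0x40000000ULL, E820_RAM      },
};

int main(void)
{
    uint64_t nr_pages = 0xe0000;    /* 3.5GB initially populated */
    uint64_t populated = 0;
    unsigned int i;

    for (i = 0; i < sizeof(map) / sizeof(map[0]); i++) {
        uint64_t start_pfn, end_pfn, pages;

        if (map[i].type != E820_RAM)
            continue;

        start_pfn = map[i].addr >> PAGE_SHIFT;
        end_pfn = (map[i].addr + map[i].size) >> PAGE_SHIFT;
        pages = end_pfn - start_pfn;

        if (populated + pages <= nr_pages) {
            /* Fully populated by Xen; nothing for the balloon. */
            populated += pages;
            continue;
        }

        /* The tail of this region is unpopulated: hand it over. */
        printf("balloon: pfns %#" PRIx64 "-%#" PRIx64 "\n",
               start_pfn + (nr_pages - populated), end_pfn);
        populated = nr_pages;
    }
    return 0;
}

With these example numbers the 3GB region below the hole is fully
populated and the last 512MB of the region above 4GB ends up in the
balloon (pfns 0x120000-0x140000).
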
> 
> This second one uses PoD, but we can require specific behaviour from
> the guest to ensure the PoD pool (cache) is large enough.
> 
>   The toolstack shall construct an e820 memory map including all
>   appropriate holes for MMIO regions.  This memory map will be well
>   ordered, no regions shall overlap and all regions shall begin and end
>   on page boundaries.
> 
>   The toolstack shall issue a XENMEM_set_memory_map hypercall for this
>   memory map.
> 
>   The toolstack shall issue a XENMEM_set_maximum_reservation hypercall.
> 
>   Xen shall initialize and fill the PoD cache to the initial number of
>   pages.
> 
>   Xen (toolstack?) shall write the initial number of pages into the
>   nr_pages field of the start_info frame.
> 
>   The guest shall issue a XENMEM_memory_map hypercall to obtain the
>   e820 memory map (as set by the toolstack).
> 
>   The guest shall obtain the initial number of pages from
>   start_info->nr_pages.
> 
>   The guest may then iterate over the e820 map, adding (sub) RAM
>   regions that are unpopulated to the balloon driver (or similar).
> 
>   Xen must not use the PoD pool for allocations outside the initial
>   regions.  Xen must inject a fault into the guest should it attempt to
>   access frames outside of the initial region without an appropriate
>   XENMEM_populate_physmap hypercall to mark the region as populated (or
>   it could inject a fault/kill the domain if it runs out of PoD pool for
>   the initial allocation).
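
On the toolstack side the PoD variant would presumably be a call
sequence roughly like the libxc sketch below.  The libxc signatures
are from memory (so treat the exact arguments as approximate) and
build_pvh_e820() is a made-up helper, so this is a sketch of the
sequence rather than working code.

/*
 * Toolstack-side sketch of the PoD variant: set the e820 map, cap
 * the reservation at maxmem, and size the PoD target to the initial
 * allocation.  build_pvh_e820() is hypothetical; the xc_* calls are
 * the existing libxc wrappers as I remember them.
 */
#include <stdint.h>
#include <xenctrl.h>

/* Hypothetical helper: builds the map with the MMIO holes punched. */
extern struct e820entry *build_pvh_e820(unsigned int *nr_entries,
                                        uint64_t maxmem_kb);

int setup_pvh_pod_memory(xc_interface *xch, uint32_t domid,
                         uint64_t memory_kb, uint64_t maxmem_kb)
{
    unsigned int nr_entries;
    struct e820entry *map = build_pvh_e820(&nr_entries, maxmem_kb);
    uint64_t tot, cache, entries;

    /* XENMEM_set_memory_map: what the guest later reads back with
     * XENMEM_memory_map. */
    if (xc_domain_set_memory_map(xch, domid, map, nr_entries))
        return -1;

    /* XENMEM_set_maximum_reservation, in KiB. */
    if (xc_domain_setmaxmem(xch, domid, maxmem_kb))
        return -1;

    /* Size the PoD target to the initial allocation ("memory="),
     * converted from KiB to 4KiB pages. */
    if (xc_domain_set_pod_target(xch, domid, memory_kb >> 2,
                                 &tot, &cache, &entries))
        return -1;

    return 0;
}
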
> 
> From the guest's point of view both approaches are the same.  PoD could
> allow for deferred allocation which might help with start-up times for
> large guests, if that's something that interests people.

There is a tiny problem with PoD: PCI passthrough.

Right now we don't allow PoD + PCI passthrough until the hardware
allows it (if it ever will).

Though for PV we do allow ballooning and PCI passthrough, as we can
at least control that the pages being ballooned are not going to be
used for DMA operations (unless the guest does something stupid and
returns a page back to the allocator but still does DMA ops on
the PFN).
> 
> David
