
Re: [Xen-devel] [RFC Design Doc] Add vNVDIMM support for Xen



On 03/04/16 10:20, Haozhong Zhang wrote:
> On 03/02/16 06:03, Jan Beulich wrote:
> > >>> On 02.03.16 at 08:14, <haozhong.zhang@xxxxxxxxx> wrote:
> > > It means NVDIMM is very possibly mapped in page granularity, and
> > > hypervisor needs per-page data structures like page_info (rather than the
> > > range set style nvdimm_pages) to manage those mappings.
> > > 
> > > Then we will face the problem that the potentially huge number of
> > > per-page data structures may not fit in the normal ram. Linux kernel
> > > developers came across the same problem, and their solution is to
> > > reserve an area of NVDIMM and put the page structures in the reserved
> > > area (https://lwn.net/Articles/672457/). I think we may take the similar
> > > solution:
> > > (1) Dom0 Linux kernel reserves an area on each NVDIMM for Xen usage
> > >     (besides the one used by Linux kernel itself) and reports the address
> > >     and size to Xen hypervisor.
> > > 
> > >     Reasons to choose Linux kernel to make the reservation include:
> > >     (a) only Dom0 Linux kernel has the NVDIMM driver,
> > >     (b) make it flexible for Dom0 Linux kernel to handle all
> > >         reservations (for itself and Xen).
> > > 
> > > (2) Then Xen hypervisor builds the page structures for NVDIMM pages and
> > >     stores them in above reserved areas.
> > 
[...]
> > Furthermore - why would Dom0 waste space
> > creating per-page control structures for regions which are
> > meant to be handed to guests anyway?
> > 
> 
> I found my description was not accurate after consulting with our driver
> developers. By default the linux kernel does not create page structures
> for NVDIMM which is called by kernel the "raw mode". We could enforce
> the Dom0 kernel to pin NVDIMM in "raw mode" so as to avoid waste.
> 

More thoughts on reserving NVDIMM space for per-page structures

Currently, a per-page struct for managing mappings of NVDIMM pages may
include the following fields:

struct nvdimm_page
{
    uint64_t mfn;        /* MFN of the SPA of this NVDIMM page */
    uint64_t gfn;        /* GFN where this NVDIMM page is mapped */
    domid_t  domain_id;  /* domain this NVDIMM page is mapped to */
    int      is_broken;  /* is this NVDIMM page broken? (for MCE) */
};

Its size is 24 bytes (or 22 bytes if packed). For a 2 TB NVDIMM, the
nvdimm_page structures would occupy 12 GB, which may not fit in the
normal ram of a host with little memory. For smaller NVDIMMs and/or
hosts with ample ram, however, those structures may still fit; in that
case they can be stored in the normal ram and accessed more quickly.

So we may add a boot parameter to Xen that lets users choose where
those structures are stored: in the normal ram or on the nvdimm. With
the normal-ram option, Xen can manage nvdimm_page structures more
quickly (and hence start a domain with NVDIMM more quickly), but less
normal ram is left for VMs. With the nvdimm option, Xen takes more
time to manage nvdimm_page structures, but more normal ram is left for
VMs.

Haozhong

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

