Xen project Mailing List

Re: [Xen-devel] [PATCH 00/11] Alternate p2m: support multiple copies of host p2m

On 01/13/2015 12:45 PM, Andrew Cooper wrote:
> On 13/01/15 20:02, Ed White wrote:
>> On 01/13/2015 11:01 AM, Andrew Cooper wrote:
>>> On 09/01/15 21:26, Ed White wrote:
>>>> This set of patches adds support to hvm domains for EPTP switching by
>>>> creating multiple copies of the host p2m (currently limited to 10 copies).
>>>>
>>>> The primary use of this capability is expected to be in scenarios where
>>>> access to memory needs to be monitored and/or restricted below the level
>>>> at which the guest OS page tables operate. Two examples that were
>>>> discussed at the 2014 Xen developer summit are:
>>>>
>>>> VM introspection:
>>>> http://www.slideshare.net/xen_com_mgr/zero-footprint-guest-memory-introspection-from-xen
>>>>
>>>> Secure inter-VM communication:
>>>> http://www.slideshare.net/xen_com_mgr/nakajima-nvf
>>>>
>>>> Each p2m copy is populated lazily on EPT violations, and only contains
>>>> entries for ram p2m types. Permissions for pages in alternate p2m's can
>>>> be changed in a similar way to the existing memory access interface, and
>>>> gfn->mfn mappings can be changed.
>>>>
>>>> All this is done through extra HVMOP types.
>>>>
>>>> The cross-domain HVMOP code has been compile-tested only. Also, the
>>>> cross-domain code is hypervisor-only, the toolstack has not been modified.
>>>>
>>>> The intra-domain code has been tested. Violation notifications can only
>>>> be received for pages that have been modified (access permissions and/or
>>>> gfn->mfn mapping) intra-domain, and only on VCPU's that have enabled
>>>> notification.
>>>>
>>>> VMFUNC and #VE will both be emulated on hardware without native support.
>>>>
>>>> This code is not compatible with nested hvm functionality and will refuse
>>>> to work with nested hvm active. It is also not compatible with migration.
>>>> It should be considered experimental.
>>> Having reviewed most of the series, I believe I now have a feeling for
>>> what you are trying to achieve, but I would like to discuss some of the
>>> design implications.
>>>
>>> The following is my understanding of the situation. Please correct me
>>> if I have made a mistake.
>>>
>>>
>> Thanks for investing the time to do this. Maybe this first couple of days
>> would have gone more smoothly if something like this was in the cover letter.
>
> No problem. (I tend to find that things like this save time in the long
> run)
>
>>
>> With the exception of a couple of minor points, you are spot on.
>
> Cool!
>
>>
>>> Currently, a domain has a single host p2m. This contains the guest
>>> physical address mappings, and a combination of p2m types which are used
>>> by existing components to allow certain actions to happen. All vcpus
>>> run with the same host p2m.
>>>
>>> A domain may have a number of nested p2ms (currently an arbitrary limit
>>> of 10). These are used for nested-virt and are translated by the host
>>> p2m. Vcpus in guest mode run under a nested p2m.
>>>
>>> This new altp2m infrastructure adds the ability to use a different set
>>> of tables in the place of the host p2m. This, in practice, allows for
>>> different translations, different p2m types, different access permissions.
>>>
>>> One use case of alternate p2ms is to provide introspection information to
>>> out-of-guest entities (via the mem_event interface) or to in-guest
>>> entities (via #VE).
>>>
>>>
>>> Now for some observations and assumptions.
>>>
>>> It occurs to me that the altp2m mechanism is generic. From the look of
>>> the series, it is mostly implemented in a generic way, which is great.
>>> The only Intel-specific bits appear to be the ept handling itself,
>>> 'vmfunc' instruction support and #VE injection to in-guest entities.
>>>
>> That was my intention. I don't know enough about the state of AMD
>> virtualization to know if it can support these patches by emulating
>> vmfunc and #VE, but that was my target.
>
> As far as I am aware, AMD SVM has no similar concept to vmfunc, nor
> #VE. However, the same kinds of introspection are certainly possible by
> playing with the read/write bits on the NPT tables and causing a vmexit.
>
>>
>>> I can't think of any reasonable case where the alternate p2m would want
>>> mappings different to the host p2m. That is to say, an altp2m will map
>>> the same set of mfns to make a guest physical address space, but may
>>> differ in page permissions and possibly p2m types.
>>>
>> The set of mfn's is the same, but I do allow gfn->mfn mappings to be
>> modified under certain circumstances. One use of this is to point the
>> same VA to different physical pages (with different access permissions)
>> in different p2m's to hide memory changes.
>
> What is the practical use of being able to play paging tricks like this
> behind a VM's back?
>

I'm restricted in how much detail I can go into on a public mailing list,
but imagine that you want a data read to see one thing and an instruction
fetch to see something else. If you need more than that we'll have to go
off-list, and even then I'll have to check what I can say.

Ed

>>
>>> Given the above restriction, I believe a lot of the existing features
>>> can continue to work and coexist. For generating mem_events, the
>>> permissions can be altered in the altp2m. For injecting #VE, the altp2m
>>> type can change to the new p2m_ram_rw, so long as the host p2m type is
>>> compatible. For both, a vmexit can occur. Xen can do the appropriate
>>> action and also inject a #VE on its way back into the guest.
>>>
>>> One thing I have noticed while looking at the #VE stuff is that EPT also
>>> supports A/D tracking, which might be quite a nice optimisation and
>>> forgo the need for p2m_ram_logdirty, but I think this should be treated
>>> as an orthogonal item.
>>>
>> This is far from my area of expertise, but I believe there is code in Xen
>> to use EPT D bits in migration.
>
> Not that I can spot, although I seem to remember some talk about it. All
> logdirty code still appears to rely on the logdirty bitmap being
> filled, which is done from vmexits for p2m_ram_logdirty regions.
>
> ~Andrew
>
>>
>> Ed
>>
>>> When shared ept/iommu is not in use, altp2m can safely be used by vcpus,
>>> as this will not interfere with the IOMMU permissions.
>>>
>>> Furthermore, I can't conceptually think of an issue against the idea of
>>> nestedp2m alternatives, following the same rule that the mapped mfns
>>> match up. That should allow all existing nestedvirt infrastructure
>>> to continue to work.
>>>
>>> Does the above look sensible, or have I overlooked something?
>>>
>>> ~Andrew
>>>
>
>

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
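
For readers following the thread, here is a small toy model of two ideas
discussed above: alternate views that start empty and are filled lazily from
the host p2m, and a gfn that resolves to different mfns for an instruction
fetch and a data read. It is only a sketch, not code from the patch series.
Every name in it (Domain, Entry, modify, translate, access_with_switch) is
hypothetical, and the "retry in the other view on a violation" policy is an
assumption about how such a split could be driven, not something stated in
the thread.

```python
from dataclasses import dataclass
from enum import Flag


class Access(Flag):
    R = 1
    W = 2
    X = 4


RWX = Access.R | Access.W | Access.X


@dataclass
class Entry:
    mfn: int          # machine frame backing this guest frame
    access: Access    # permissions enforced by the view holding this entry
    is_ram: bool = True


class Violation(Exception):
    """The active view does not permit the attempted access."""


class Domain:
    def __init__(self, host):
        self.host = host      # gfn -> Entry: the single host p2m
        self.views = []       # alternate views, each a gfn -> Entry dict

    def add_view(self):
        self.views.append({})             # starts empty, filled on demand
        return len(self.views) - 1

    def _entry(self, view, gfn):
        """Return the view's entry, copying it from the host on first use."""
        table = self.views[view]
        if gfn not in table:
            src = self.host.get(gfn)
            if src is None or not src.is_ram:
                raise Violation(f"gfn {gfn:#x} is not RAM in the host p2m")
            table[gfn] = Entry(src.mfn, src.access)
        return table[gfn]

    def modify(self, view, gfn, mfn=None, access=None):
        """Change the mapping and/or permissions of one gfn in one view."""
        entry = self._entry(view, gfn)
        if mfn is not None:
            # The set of mfns stays that of the host; only the mapping moves.
            if all(e.mfn != mfn for e in self.host.values()):
                raise ValueError(f"mfn {mfn:#x} is not mapped by the host p2m")
            entry.mfn = mfn
        if access is not None:
            entry.access = access

    def translate(self, view, gfn, kind):
        """Resolve gfn in the given view, or raise Violation if not allowed."""
        entry = self._entry(view, gfn)
        if kind not in entry.access:
            raise Violation(f"{kind} denied on gfn {gfn:#x} in view {view}")
        return entry.mfn


def access_with_switch(dom, view, other, gfn, kind):
    """Try `view`; on a violation retry in `other`. Returns (mfn, view used)."""
    try:
        return dom.translate(view, gfn, kind), view
    except Violation:
        return dom.translate(other, gfn, kind), other
```

As a usage illustration under the same assumptions: with a host p2m of
{0x10: Entry(0xA0, RWX), 0x11: Entry(0xA1, RWX)}, create two views, call
modify(code_view, 0x10, access=Access.X) and
modify(data_view, 0x10, mfn=0xA1, access=Access.R | Access.W). An
instruction fetch of gfn 0x10 then resolves to mfn 0xA0 in the code view,
while a data read faults there and resolves to mfn 0xA1 in the data view.
The host p2m itself is never modified.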
