

[Xen-devel] RE: [PATCH 00/13] Nested Virtualization: Overview

To: Christoph Egger <Christoph.Egger@xxxxxxx>
Subject: [Xen-devel] RE: [PATCH 00/13] Nested Virtualization: Overview
From: "Dong, Eddie" <eddie.dong@xxxxxxxxx>
Date: Wed, 8 Sep 2010 16:58:52 +0800
Accept-language: en-US
Cc: "Deegan, Tim" <Tim.Deegan@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, "Dong, Eddie" <eddie.dong@xxxxxxxxx>, "He, Qing" <qing.he@xxxxxxxxx>
Delivery-date: Wed, 08 Sep 2010 02:07:08 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <201009071749.23010.Christoph.Egger@xxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <201009011653.27082.Christoph.Egger@xxxxxxx> <201009061437.26601.Christoph.Egger@xxxxxxx> <1A42CE6F5F474C41B63392A5F80372B22A7C2808@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <201009071749.23010.Christoph.Egger@xxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: ActOpHGmnj5fXHwvQ125+cDGMAokxQAhnobA
Thread-topic: [PATCH 00/13] Nested Virtualization: Overview
>>> When the VCPU switches between host and guest mode then you need to
>>> save the l1 guest state, restore the l2 guest state and vice versa.
>> No. Switching from L1 to L2 guest, it is a simple restore of L2
>> guest state (VMCLEAR/VMPTRLD). VMX doesn't save L1 state,
> So the function hooks prepare4vmentry is pretty cheap and the
> 'hostsave' function hook is empty.

I don't know how you reach that conclusion. Are you extending SVM knowledge
to VMX again? prepare4vmentry has to load several VMCS fields, which is
expensive in VMX.

From the SVM code, prepare4vmentry loads 11 VMCB fields, which may mean 11
VMCS field accesses. nsvm_vmcb_prepare4vmexit does the reverse, which touches
at least 6 VMCS fields.

Those 17 VMCS accesses already far exceed the average number of VMCS accesses
in Xen for a typical VM exit, and may even be more expensive than an entire
software VM exit handler on some old platforms (modern processors may be
pretty fast, though). In many cases (even with nested virtualization), we can
complete the VM exit handling in 2-3 VMCS field accesses.
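
To make the cost comparison concrete, here is a minimal sketch in C that counts field accesses on both paths. It is only a model under stated assumptions: `struct model_vmcs`, `model_vmread`, `model_vmwrite`, `exit_via_wrapper` and `exit_direct` are hypothetical names, the field indices are invented, and nothing here is Xen code. Only the counts (6 on exit, 11 on entry, 1 for the exit reason) come from the figures above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: a VMCS is an array of fields plus an access counter. */
enum { NR_FIELDS = 32, FIELD_EXIT_REASON = 0 };

struct model_vmcs {
    unsigned long field[NR_FIELDS];
    unsigned int accesses;          /* every modelled VMREAD/VMWRITE */
};

static unsigned long model_vmread(struct model_vmcs *v, size_t idx)
{
    assert(idx < NR_FIELDS);
    v->accesses++;
    return v->field[idx];
}

static void model_vmwrite(struct model_vmcs *v, size_t idx, unsigned long val)
{
    assert(idx < NR_FIELDS);
    v->accesses++;
    v->field[idx] = val;
}

/*
 * Wrapper path (assumed shape): the exit hook always reads 6 fields and
 * the entry hook always writes 11, whatever the exit reason was.
 */
static unsigned int exit_via_wrapper(struct model_vmcs *v)
{
    v->accesses = 0;
    for (size_t i = 0; i < 6; i++)
        (void)model_vmread(v, i);
    for (size_t i = 0; i < 11; i++)
        model_vmwrite(v, i, v->field[i]);   /* rewrite the same value */
    return v->accesses;
}

/* Direct path: an exit that only needs the exit reason reads one field. */
static unsigned int exit_direct(struct model_vmcs *v, unsigned long *reason)
{
    v->accesses = 0;
    *reason = model_vmread(v, FIELD_EXIT_REASON);
    return v->accesses;
}
```

In this model the wrapper path always costs 17 accesses, while the direct path costs one for an exit that only needs the reason field.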

>> while SVM does require the save of L1 state per my understanding.
> SVM does not per architecture. It is the implementation I have taken
> over from KVM. It will be replaced with an implementation that only
> requires a VMSAVE to save the L1 state. But that won't change the
> patch series fundamentally as this change touches SVM code only.
>> Intel processors can hold the VMCS in the processor for performance.
>> Switching from L2 to L1, we need to convert (or store) some VMCS
>> information from physical to virtual VMCS.
> That is what can be put into the prepare4vmexit function hook.

No, all of those are conditional and accessible only in the current VMCS
context. That is not guaranteed at the time of prepare4vmexit. It is
particularly risky if we change the nested VMX code algorithm.
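
The "current VMCS context" constraint can be illustrated with a small C model. This is a sketch only: on real hardware VMREAD takes no VMCS argument and always operates on the VMCS most recently loaded with VMPTRLD. The explicit `v` parameter and the names `ctx_vmcs`, `ctx_vmptrld` and `ctx_vmread` are hypothetical, added here so that the failure case is visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { CTX_NR_FIELDS = 8 };

struct ctx_vmcs {
    unsigned long field[CTX_NR_FIELDS];
};

/* Models the single "current VMCS" pointer of one logical processor. */
static struct ctx_vmcs *ctx_current;

/* Models VMPTRLD: makes v the current VMCS. */
static void ctx_vmptrld(struct ctx_vmcs *v)
{
    ctx_current = v;
}

/*
 * Models a field read. It succeeds only if v is the current VMCS, so a
 * generic hook that runs after the context has switched cannot read the
 * fields it expects.
 */
static bool ctx_vmread(const struct ctx_vmcs *v, size_t idx,
                       unsigned long *out)
{
    if (v != ctx_current || idx >= CTX_NR_FIELDS)
        return false;
    *out = v->field[idx];
    return true;
}
```

For example, after `ctx_vmptrld(&l2)` a read from `l2` succeeds, but once `ctx_vmptrld(&l1)` has run the same read from `l2` fails. That is the ordering hazard a fixed hook point would have to rule out.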

>> But it is limited and only covers the "guest state" and exit
>> information. Load of L1 state may be as simple as VMPTRLD (of course
>> it may modify some VMCS field upon different situation).
> That is what can be put into the 'hostrestore' function hook.
>>> This requires a lot of accesses from/to the vmcb/vmcs  unless you
>>> have a lazy switching technique, do you ?
>> Yes, the Intel hardware already did lazy switching thru VMPTRLD.
>> And the access of VMCS is depending on the L1 guest modification,
>> only dirty VMCS fields needs to be updated.
> Sounds like you need a 'shadow vmcs' that holds the l2 guest state.
>> In the majority case, the VM exit from L2 guest will be handled by
>> root VMM directly.
> Same on SVM. root VMM handles everything L1 guest does not intercept
> plus some special intercepts such as interrupts, nmi's, page
> fault/nested page faults. 

That is the key reason that I am strongly against wrapping VM exit and VM
entry. Without the wrapper, we can complete the handler in 2-3 VMCS field
accesses in most cases, but with the wrapper, we have to spend 17+ VMCS field
accesses.
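
One way the common path can stay at a few accesses is the dirty-field approach from the quoted text earlier in this mail, where only the fields that L1 modified are copied. The C sketch below assumes a bitmap of dirty fields. `struct shadow_vmcs`, `l1_write_field` and `sync_dirty_fields` are hypothetical names for illustration, not the nested VMX patch itself, and the `phys` array merely stands in for the hardware VMCS.

```c
#include <assert.h>
#include <stdint.h>

enum { NR_SHADOW_FIELDS = 32 };

struct shadow_vmcs {
    unsigned long virt[NR_SHADOW_FIELDS]; /* L1's view of the L2 VMCS */
    unsigned long phys[NR_SHADOW_FIELDS]; /* stand-in for hardware VMCS */
    uint32_t dirty;                       /* bit i set: virt[i] changed */
};

/* Called when L1 writes a field of its virtual VMCS. */
static void l1_write_field(struct shadow_vmcs *s, unsigned int idx,
                           unsigned long val)
{
    assert(idx < NR_SHADOW_FIELDS);
    s->virt[idx] = val;
    s->dirty |= UINT32_C(1) << idx;
}

/*
 * Called before entering L2: copy only the dirty fields and return how
 * many (modelled) VMWRITEs were needed.
 */
static unsigned int sync_dirty_fields(struct shadow_vmcs *s)
{
    unsigned int writes = 0;

    for (unsigned int idx = 0; idx < NR_SHADOW_FIELDS; idx++) {
        if (s->dirty & (UINT32_C(1) << idx)) {
            s->phys[idx] = s->virt[idx];
            writes++;
        }
    }
    s->dirty = 0;
    return writes;
}
```

If L1 touched two fields since the last entry, the sync costs two writes, and an entry with no L1 modifications costs none.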

>> One example is external interrupt, which doesn't need to access
>> rest of VMCS fields except the reason, but the wrapper makes the
>> access a must, which I can't agree.
> Which conversion function do you mean by 'wrapper' here ? Why does it
> require additional VMCS field accesses ?
> Can you explain this in detail, please ?

See above: nsvm_vmcb_prepare4vmexit and prepare4vmentry.

>>> The reason to leave this in the generic code is what Keir stated out
>>> as feedback to the second patch series:
>>> http://lists.xensource.com/archives/html/xen-devel/2010-06/msg00280.html
>>> (What was patch #10 there is  patch #8 in latest patch series).
>> Well, I didn't read it that way. Even if the function itself is
>> wrapped, it should go to the SVM tree if it does not go together
>> with the caller.
>> Anyway, to me given that nested SVM & VMX is still on the very
>> beginning of development, I can only say yes to those wrappers that
>> have clear understanding to both side.
> Good to know. See my offer below.

I am not sure whether you want to start efficiently with lightweight wrappers
first (wrappers where we have consensus, and individual solutions where we do
not), or whether you want to keep spinning on the heavy wrapper. For me, if
you go with lightweight wrappers, as single-layer VMX/SVM virtualization does,
I can ack soon. But the heavyweight wrapper is not necessary to me and will
block future development of nested VMX, so I can't ack it. Even the current
code is a lot harder to understand, given that I need to understand the
semantics of those new (vendor-neutral) fields, whereas the semantics of VMX
fields are clear to all VMX developers. Sorry for that, although I know Tim is
more considerate of the effort you have called out. I appreciate your effort
as well, but that reason is not yet strong enough to impact nested VMX
development.

>> I would rather leave those uncertain wrappers to future, once the
>> basic shape of nested virtualization is good and stable enough, i.e.
>> lightweight wrapper.  We have plenty of performance work ahead such
>> as virtual VTd support, enhanced PV driver for nested etc. Excessive
>> wrapper is simply a burden to nested VMX developers for those future
>> features. 
>> Qing will post his patch today or tomorrow for your reference if you
>> want. 
> Thanks. May I take code from there and add into my patch series ?

Sure, but please don't assume this logic will not change. The more
performance tuning happens, the more we may change it, and some new hardware
features may come along later.

Thx, Eddie