

Re: [Xen-devel] Questioning the Xen Design of the VMM

To: Al Boldi <a1426z@xxxxxxxxx>
Subject: Re: [Xen-devel] Questioning the Xen Design of the VMM
From: Daniel Stodden <stodden@xxxxxxxxxx>
Date: Thu, 10 Aug 2006 13:20:00 +0200
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
Delivery-date: Thu, 10 Aug 2006 06:00:33 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <200608091553.06042.a1426z@xxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Organization: Technische Universität München
References: <907625E08839C4409CE5768403633E0BA7FE0E@xxxxxxxxxxxxxxxxx> <200608091553.06042.a1426z@xxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
On Wed, 2006-08-09 at 15:53 +0300, Al Boldi wrote:
> Petersson, Mats wrote:
> > > > Al Boldi wrote:
> > > > > I maybe missing something, but why should the Xen-design
> > > > > require the guest to be patched?
> >
> > The main reason to use a para-virtual kernel that it performs better
> > than the fully virtualized version.
> >
> > > So HVM solves the problem, but why can't this layer be implemented in
> > > software?
> >
> > It CAN, and has been done.
> You mean full virtualization using binary translation in software?
> My understanding was, that HVM implies full virtualization without the need 
> for binary translation in software.
> > It is however, a little bit difficult to
> > cover some of the "strange" corner cases, as the x86 processor wasn't
> > really designed to handle virtualization natively [until these
> > extensions where added].
> You mean AMDV/IntelVT extensions?
> If so, then these extensions don't actively participate in the act of 
> virtualization, but rather fix some x86-arch shortcomings, that make it 
> easier for software (i.e. Xen) to virtualize, thus circumventing the need to 
> do binary translation.  Is this a correct reading?

they fix the issues, removing the general need for binary translation,
but go well beyond that as well.

a comparatively simple example of where they go beyond that is privilege
levels. basic system virtualization would just move the guest kernel to
a nonprivileged level to maintain control in the vmm. so you'd have the
hypervisor in supervisor mode (that's why it's called a hypervisor), and
both guest kernel and applications in user mode [1]. [should note that
xen makes a difference here, using x86 privilege levels which are more

what vt-x does is keep the privilege rings in protected mode untouched
by the virtualization features. instead, two whole new modes are added:
'vmx root' and 'vmx non-root'. the former applies to the vmm, the latter
to the guests. _both_ of these basically implement the protected mode as
it used to be. so hardware virtualization won't have to muck around with
the regular privilege system.

one example where this is particularly useful is hosted vmms, e.g.
vmware workstation. imagine a natively-running operating system and a
machine monitor running on top of (or integrated with) that. the system
would run in vmx root mode, with regular application processes in ring 3
as before. additionally, one may start guest systems on top of the
vmm, which again run on a regular x86 protected mode,
but in non-root mode.
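
a minimal sketch of that layout, in python purely for illustration. the class names, the LAYOUT table and the context names are hypothetical and not taken from xen, vmware or the intel manuals. the only point it makes is that vmx mode and ring are two independent coordinates, so a guest kernel can sit in ring 0 without being the host.

```python
from dataclasses import dataclass
from enum import Enum


class VmxMode(Enum):
    ROOT = "vmx root"          # host side: the vmm and, if hosted, the host os
    NON_ROOT = "vmx non-root"  # guest side


@dataclass(frozen=True)
class Context:
    name: str
    mode: VmxMode
    ring: int  # ordinary protected-mode privilege level, 0..3


# hypothetical hosted-vmm setup; the names are illustrative only
LAYOUT = [
    Context("host kernel + vmm", VmxMode.ROOT, 0),
    Context("host application", VmxMode.ROOT, 3),
    Context("guest kernel", VmxMode.NON_ROOT, 0),
    Context("guest application", VmxMode.NON_ROOT, 3),
]


def is_guest(ctx: Context) -> bool:
    # whether something is a guest is decided by the vmx mode alone,
    # the ring it runs in plays no part
    return ctx.mode is VmxMode.NON_ROOT


def describe(layout):
    return [f"{c.name}: {c.mode.value}, ring {c.ring}" for c in layout]


if __name__ == "__main__":
    print("\n".join(describe(LAYOUT)))
```

both the host and the guest get the usual ring 0 / ring 3 split, which is why the guest kernel does not need to be moved to a less privileged ring.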

all of the above
 - can be functionally achieved _efficiently_ without hardware
   extensions like vmx
 - but ONLY as long as the privilege architecture supports it
 - x86 does NOT [2]
   - the pushf/popf case quoted below is an example of where the
     problems are
   - binary translation is a way to do it anyway, but does not count
     as 'efficient'.
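
to make the list above concrete, here is a toy model in python, not a cpu emulator. it assumes plain protected mode with the guest kernel deprivileged so that cpl > iopl, and it ignores vme/pvi and the rules for the iopl bits themselves. the table and the function names are made up for this sketch. it encodes the condition from [1] (every sensitive instruction must trap when deprivileged) and shows how popf breaks it.

```python
IF_BIT = 1 << 9  # interrupt enable flag in eflags

# name: (sensitive, traps_when_deprivileged)
# simplified: protected mode, cpl > iopl, no vme/pvi
INSTRUCTIONS = {
    "cli":   (True, True),    # raises #GP, so the vmm gets control
    "sti":   (True, True),    # same
    "popf":  (True, False),   # IF update is silently dropped, no fault
    "pushf": (True, False),   # exposes the real flags, no fault
    "add":   (False, False),  # innocuous
}


def non_trapping_sensitive(table):
    """Sensitive instructions the vmm never gets to see."""
    return sorted(n for n, (sens, traps) in table.items() if sens and not traps)


def trap_and_emulate_possible(table):
    """Condition from [1]: all sensitive instructions must trap."""
    return not non_trapping_sensitive(table)


def popf(current_flags, popped_value, cpl, iopl):
    """New flags after popf; IF only changes if cpl <= iopl."""
    if cpl <= iopl:
        return popped_value
    # deprivileged: keep the old IF, take the other bits, raise nothing
    return (popped_value & ~IF_BIT) | (current_flags & IF_BIT)


if __name__ == "__main__":
    print(non_trapping_sensitive(INSTRUCTIONS))
    print(trap_and_emulate_possible(INSTRUCTIONS))
    # guest kernel in ring 1 tries to clear IF via popf
    print(hex(popf(0x202, 0x002, cpl=1, iopl=0)))
```

the guest believes it has disabled interrupts, the hardware did nothing, and the vmm was never told. that is the gap binary translation or the hardware extensions have to close.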

with vmx
  - efficient virtualization is achieved.
  - some things just get additional flexibility. 

related reading:

[1] popek & goldberg: Formal Requirements for Virtualizable Third
Generation Architectures, 1974 (!)

[2] robin & irvine: Analysis of the Intel Pentium's Ability to Support
a Secure Virtual Machine Monitor, 2000

both should be available from the web if you dig around long enough. :)

> > This is why you end up with binary translation
> > in VMWare for example. For example, let's say that we use the method of
> > "ring compression" (which is when the guest-OS is moved from Ring 0
> > [full privileges] to Ring 1 [less than full privileges]), and the
> > hypervisor wants to have full control of interrupt flags:
> >
> > some_function:
> >     ...
> >     pushf                   // Save interrupt flag.
> >     cli                     // Disable interrupts
> >     ...


Daniel Stodden
LRR     -      Lehrstuhl für Rechnertechnik und Rechnerorganisation
Institut für Informatik der TU München             D-85748 Garching
http://www.lrr.in.tum.de/~stodden         mailto:stodden@xxxxxxxxxx
PGP Fingerprint: F5A4 1575 4C56 E26A 0B33  3D80 457E 82AE B0D8 735B
