
Re: [Xen-devel] Questioning the Xen Design of the VMM



On Wed, 2006-08-09 at 15:53 +0300, Al Boldi wrote:
> Petersson, Mats wrote:
> > > > Al Boldi wrote:
> > > > > I may be missing something, but why should the Xen-design
> > > > > require the guest to be patched?
> >
> > The main reason to use a para-virtual kernel is that it performs better
> > than the fully virtualized version.
> >
> > > So HVM solves the problem, but why can't this layer be implemented in
> > > software?
> >
> > It CAN, and has been done.
> 
> You mean full virtualization using binary translation in software?
> 
> My understanding was, that HVM implies full virtualization without the need 
> for binary translation in software.
> 
> > It is however, a little bit difficult to
> > cover some of the "strange" corner cases, as the x86 processor wasn't
> > really designed to handle virtualization natively [until these
> > extensions were added].
> 
> You mean AMDV/IntelVT extensions?
> 
> If so, then these extensions don't actively participate in the act of 
> virtualization, but rather fix some x86-arch shortcomings, that make it 
> easier for software (i.e. Xen) to virtualize, thus circumventing the need to 
> do binary translation.  Is this a correct reading?

they fix the issues, removing the general need for binary translation,
but go well beyond that as well.

a comparatively simple example of where they go beyond that is privilege
levels. basic system virtualization would just move the guest kernel to
a nonprivileged level to maintain control in the vmm. so you'd have the
hypervisor in supervisor mode (that's why it's called a hypervisor), and
both guest kernel and applications in user mode [1]. [should note that
xen differs here, using the x86 privilege rings, which are more
complex].
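
to make the deprivileging idea concrete, here is a minimal sketch in
python. it is a toy model, not xen or any real vmm: the names
PRIVILEGED, cpu_execute and Vmm are made up for illustration, and the
"cpu" only knows two privileged instructions. the point is just that a
guest kernel running in user mode traps on every privileged instruction,
and the vmm emulates the effect on virtual state.

```python
class PrivilegeTrap(Exception):
    """Raised when a privileged instruction runs outside supervisor mode."""


# Illustrative subset only; a real instruction set has many more.
PRIVILEGED = {"cli", "sti"}


def cpu_execute(instr, supervisor):
    """Hypothetical CPU: privileged instructions trap in user mode."""
    if instr in PRIVILEGED and not supervisor:
        raise PrivilegeTrap(instr)
    return instr


class Vmm:
    """Toy monitor that keeps the guest's interrupt flag as virtual state."""

    def __init__(self):
        self.virtual_if = True  # the guest's view of the interrupt flag
        self.traps = 0

    def run_guest(self, instrs):
        for instr in instrs:
            try:
                # The guest kernel is deprivileged: it never runs in supervisor mode.
                cpu_execute(instr, supervisor=False)
            except PrivilegeTrap:
                self.traps += 1
                self.emulate(instr)

    def emulate(self, instr):
        # Apply the instruction's effect to the virtual flag, not the real one.
        if instr == "cli":
            self.virtual_if = False
        elif instr == "sti":
            self.virtual_if = True
```

running Vmm().run_guest(["mov", "cli", "add"]) takes exactly one trap
and leaves the virtual interrupt flag cleared, while the real flag
stays under the vmm's control.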

what vt-x does is keep the privilege rings of protected mode untouched
by the virtualization features. instead, two whole new modes are added:
'vmx root' and 'vmx non-root'. the former applies to the vmm, the latter
to the guests. _both_ of these basically implement protected mode as
it used to be, so hardware virtualization won't have to muck around with
the regular privilege system.
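
one way to picture this is as a second axis next to the ring number.
the sketch below is only a mental model in python, with made-up names
(CpuState, describe, vm_exit); it does not touch real hardware. it
assumes the vmm resumes in ring 0 after a vm exit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CpuState:
    vmx_root: bool  # True: vmm side, False: guest side
    ring: int       # 0..3, exactly as in plain protected mode


def describe(state):
    """Human-readable form of the (mode, ring) pair."""
    side = "vmx root" if state.vmx_root else "vmx non-root"
    return f"{side}, ring {state.ring}"


def vm_exit(state):
    """Model of a vm exit: control goes back to the vmm in root mode.

    Assumption for this sketch: the vmm resumes in ring 0.
    """
    if state.vmx_root:
        raise ValueError("vm exit only happens from non-root mode")
    return CpuState(vmx_root=True, ring=0)


# The guest kernel keeps ring 0 and guest applications keep ring 3;
# only the root/non-root axis separates them from the vmm.
guest_kernel = CpuState(vmx_root=False, ring=0)
guest_app = CpuState(vmx_root=False, ring=3)
```

the guest kernel here still sits in ring 0, which is exactly what the
deprivileging approach cannot offer.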

one example where this is particularly useful is hosted vmms, e.g.
vmware workstation. imagine a natively running operating system and a
machine monitor running on top of (or integrated with) that. the host
system would run in vmx root mode, with regular application processes
in ring 3 as before. additionally, one may start guest systems on top
of the vmm, which again run in regular x86 protected mode, but in
non-root mode.

all of the above
 - can be functionally achieved _efficiently_ without hardware
   extensions like vmx
 - but ONLY as long as the privilege architecture supports
   virtualization
 - x86 does NOT [2]
   the pushf/popf case mats outlined is an example of where the problems
   are: a deprivileged popf silently drops the interrupt-flag change
   instead of trapping, so the vmm never sees it.
   - binary translation is a way to do it anyway, but does not count
     as 'efficient'.
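
the popf problem can be shown with a few lines of python. this is a
simplified model of protected-mode popf that only looks at the interrupt
flag; the function name and arguments are chosen for this sketch. the
rule it encodes: if the current privilege level is numerically greater
than iopl, the IF bit is left unchanged and no fault is raised.

```python
IF_BIT = 1 << 9  # interrupt-enable flag in EFLAGS


def popf(eflags, popped, cpl, iopl):
    """Simplified protected-mode POPF; returns the new EFLAGS value.

    Only IF is modeled. When cpl > iopl the IF bit keeps its old value
    and nothing traps, which is what hides the instruction from a vmm.
    """
    if cpl <= iopl:
        return (eflags & ~IF_BIT) | (popped & IF_BIT)
    return eflags


# A guest kernel tries to restore saved flags that have IF cleared.
in_ring0 = popf(IF_BIT, 0, cpl=0, iopl=0)  # IF really changes
in_ring1 = popf(IF_BIT, 0, cpl=1, iopl=0)  # change is silently ignored
```

the same instruction gives different results depending on the ring and
never traps in the deprivileged case, so plain trap-and-emulate has
nothing to hook into.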

with vmx
  - efficient virtualization is achieved.
  - some things just get additional flexibility. 

related reading:

[1] popek & goldberg: Formal Requirements for Virtualizable Third
Generation Architectures, 1974 (!)

[2] robin & irvine: Analysis of the Intel Pentium's Ability to Support
a Secure Virtual Machine Monitor, 2000

both should be available from the web if you dig around long enough. :)

> > This is why you end up with binary translation
> > in VMWare for example. For example, let's say that we use the method of
> > "ring compression" (which is when the guest-OS is moved from Ring 0
> > [full privileges] to Ring 1 [less than full privileges]), and the
> > hypervisor wants to have full control of interrupt flags:
> >
> > some_function:
> >     ...
> >     pushf                   // Save interrupt flag.
> >     cli                     // Disable interrupts
> >     ...


regards,
daniel

-- 
Daniel Stodden
LRR     -      Lehrstuhl für Rechnertechnik und Rechnerorganisation
Institut für Informatik der TU München             D-85748 Garching
http://www.lrr.in.tum.de/~stodden         mailto:stodden@xxxxxxxxxx
PGP Fingerprint: F5A4 1575 4C56 E26A 0B33  3D80 457E 82AE B0D8 735B

