RE: [Xen-users] continued question on Xen 3d virtualization with IOMMU

 

> -----Original Message-----
> From: xen-users-bounces@xxxxxxxxxxxxxxxxxxx 
> [mailto:xen-users-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Tao Shen
> Sent: 28 May 2007 10:53
> To: xen-users@xxxxxxxxxxxxxxxxxxx
> Subject: [Xen-users] continued question on Xen 3d 
> virtualization with IOMMU
> 
> Hi , Xen-User group:
> 
> I am planning on a new system to run 4 VMs within Xen, 
> hopefully after 3d(openGL, Direct3d) working in Xen or windows.
> 
> I have some questions after extensively googling for it:
> 
> Mats, you said that you don't need Xen-aware drivers in DomU 
> if the system has IOMMU. 

Yes. 
> 
> now on the subject of hardware support, currently we have 
> VT-x and AMD-V and that's hardware assisted CPU 
> virtualization. what's coming up in Penryn is the Extended 
> Page Table(EPT) and AMD Barcelona's Nested Page Table(NPT) 
> for help with hardware assisted memory virtualization as far 
> as I understand it. 

Yes. 
> 
> Now question #1, EPT and NPT should only help performance of 
> the VM, it doesn't help with 3d right? What I understand is 
> that you need IOMMU instead which is a chipset feature 
> instead of a CPU feature.

That is correct. {N,E}PT only addresses the fact that the guest's view
of the memory layout differs from the physical memory that is ACTUALLY
used by the guest. It removes the need for the work otherwise done by
the shadow-paging code in the hypervisor. 
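
To make the two views of memory concrete, here is a toy model (not Xen
code). The table contents, the flat single-level tables and all the
function names are made up for the illustration. With {N,E}PT the
hardware does both lookups itself; with shadow paging the hypervisor has
to maintain the combined table in software.

```python
PAGE_SIZE = 4096  # 4 KiB pages, assumed for the illustration


def translate(addr, table):
    """Translate an address through a flat page-number -> frame-number map."""
    page, offset = divmod(addr, PAGE_SIZE)
    return table[page] * PAGE_SIZE + offset


# Hypothetical mappings, invented for this example.
guest_page_table = {0: 5, 1: 2}   # guest-virtual page -> guest-physical frame
host_page_table = {2: 40, 5: 17}  # guest-physical frame -> machine frame


def nested_translate(guest_virtual):
    """Two-step lookup, as nested paging does it in hardware."""
    guest_physical = translate(guest_virtual, guest_page_table)
    return translate(guest_physical, host_page_table)


def build_shadow_table():
    """Combined guest-virtual -> machine table, as shadow paging keeps it."""
    return {vpage: host_page_table[gframe]
            for vpage, gframe in guest_page_table.items()}
```

Both routes give the same machine address; the difference is who does
the work, and when.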

> 
> on the subject of IOMMU support:  The Bearlake Q35 chipset 
> will come with Intel VT-d(intel's version of IO virt), 
> expected in a few months, Bearlake P35 is already out.  On 
> the AMD side, I have heard that current chipsets already have 
> IOMMU support built in.(probably not AMD IOMMU spec 1.2 just 
> released, but at least 1.0)

As far as I'm aware, there are no AMD chipsets with IOMMU available - I
could be wrong, but that's my understanding.
> 
> Now question #2,  which AMD chipsets(there is a bunch of 
> Nforce, and ATI chipsets) that Xen developers know of that 
> has IOMMU working?(I have heard that the GART and DEV 
> together is a fully functional IOMMU unit)  and if I were to 
> get an Athlon X2 AM2 chip with that chipset mobo, 
> technically, I can get the 3d working right? but without the 
> benefits of NPT which later comes with Barcelona(which is 
> also AM2 socket compatible) 

GART will support re-mapping of device memory accesses, but it only
supports one map for the entire system, which may be insufficient for
anything but the most minimal setup. Also, there's currently no software
to support GART at all in Xen, although this is about to change for the
purpose of using GART to map the para-virtual memory. Thus far I've
heard of no plans to use this to support fully virtualized guests. 

DEV will prevent one guest from accessing another guest's memory (which
is another function that the IOMMU provides - making sure that the PCI
device doesn't access somewhere OUTSIDE its own memory).
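
As a rough sketch of the difference between the two (a toy model only;
the class names, the frame numbers and the domain labels are all
hypothetical and this is not how the hardware is programmed): GART
translates but has a single system-wide map, while DEV checks
permissions per domain but does not translate.

```python
class ToyGart:
    """One system-wide remapping table: translation, no per-guest view."""

    def __init__(self, remap):
        self.remap = dict(remap)  # device-visible frame -> machine frame

    def device_access(self, frame):
        # Frames without an entry pass through untranslated.
        return self.remap.get(frame, frame)


class ToyDev:
    """Per-domain set of permitted machine frames: protection, no translation."""

    def __init__(self):
        self.allowed = {}  # domain name -> set of machine frames

    def allow(self, domain, frames):
        self.allowed.setdefault(domain, set()).update(frames)

    def device_access(self, domain, frame):
        if frame not in self.allowed.get(domain, set()):
            raise PermissionError(f"{domain} may not touch frame {frame}")
        return frame  # address is used as-is
```

A full IOMMU gives you both at once: a separate translation table per
domain, with accesses outside that table refused.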
> 
> Question #3:  you said that Xen aware GPU drivers can help 3d 
> acceleration in domU VMs if the GPU driver is open source.  
> Intel's GPUs are all open source now, when can users expect 
> to have Xen work with Intel's embedded GPUs like GMA950 and X3100s? 

Just to clarify, unless we start making really big changes to the driver
architecture, we have to use a modified driver in the guest. I'm not
100% sure that the entire interface necessary to perform this task is
there in the para-virtual driver interface [it probably can be ADDED,
but that makes the task even more complicated]. 

If the driver is open source, you have some chance of actually modifying
it. But these drivers are quite clearly non-trivial, so it's not just a
case of "recompiling for Xen". It is a case of wading through the code
and modifying any place where a reference to memory is given to the
graphics card, such that the new code takes into account the fact that
memory in the guest isn't actually the REAL physical memory layout. I'm
sure this CAN be done, but it's a lot of hard work to find all the
relevant places [also, you have to be aware that memory the guest thinks
is contiguous may not actually be contiguous in the ACTUAL physical
memory map, which means that some process of re-mapping this to a
MACHINE PHYSICAL contiguous memory region would be necessary]. 
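
The contiguity problem can be sketched like this (illustration only;
`guest_to_machine` is a hypothetical dictionary standing in for whatever
guest-frame to machine-frame table the driver would really consult, and
the function names are mine):

```python
def machine_frames(guest_to_machine, first_guest_frame, count):
    """Machine frames backing `count` consecutive guest frames."""
    return [guest_to_machine[first_guest_frame + i] for i in range(count)]


def is_machine_contiguous(guest_to_machine, first_guest_frame, count):
    """True if a guest-contiguous buffer is also contiguous in machine memory."""
    frames = machine_frames(guest_to_machine, first_guest_frame, count)
    return all(b == a + 1 for a, b in zip(frames, frames[1:]))
```

A modified driver would need a check of this kind (and a fallback that
re-maps or copies the buffer) everywhere it hands the card a buffer
larger than one page.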

I also don't think the GPU drivers for Windows are Open Source, and
since the vast majority of "requests" for 3D graphics in guest are
related to using Windows to do 3D graphics, this is clearly where the
effort would have to be put in. 


> 
> Now question #4: not that important, but how much performance 
> benefits do you think you can get from the addition of NPT 
> and EPT?,  VMware argues that the first gen VT-x and AMD-V 
> sometimes made the VMs slower.  If EPT doesn't add much and 
> AMD's got IOMMU already working, there is no reason for me to 
> wait for Penryn IMHO. 

It very much depends on the "application" you're using. Nested paging
requires much less interaction with the hypervisor, which is why it's
there. The shadow-paging code in Xen is not trivial; it interprets
instructions that are "trapped" by the hypervisor. Each update will take
many thousands of cycles, guaranteed! On the other hand, nested paging
adds overhead reads of the "host-pagetable", which in a worst-case
scenario is 4 per page-table level, so a maximum of 20 reads for one
complete page-table fill. This is unlikely to happen very often (the
highest level page-table usually only has two entries, one for kernel
and one for "user-code", so at least these should be cached in the TLB -
the next levels depend on the application). 
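
One way to arrive at the figure of 20 is sketched below. It assumes
4-level page tables on both the guest and the host side and no caching
of intermediate results at all; the function names are made up for the
example.

```python
def host_table_reads(guest_levels=4, host_levels=4):
    """Worst-case host page-table reads for one guest TLB fill.

    Every guest page-table level is located by a guest-physical address
    that needs its own host walk, and so does the final data address.
    """
    return (guest_levels + 1) * host_levels


def total_walk_reads(guest_levels=4, host_levels=4):
    """Host page-table reads plus the guest's own page-table reads."""
    return host_table_reads(guest_levels, host_levels) + guest_levels
```

With the defaults, `host_table_reads()` gives 20 extra reads, on top of
the 4 reads of the guest's own tables that a native walk needs anyway.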

So, in a test-case like "kernel compile" (which does lots of page-table
updates), the benefit will definitely be noticeable [if not at the speed
the compile scrolls past, at least you will be able to measure it with a
regular wrist-watch with a second hand, rather than a chronograph]. 

On the other "extreme", you'll have the case where you have a HUGE array
(many gigabytes), and use a random number to index that array - then
you'll have few updates to the page-table, and many TLB-miss operations
where the whole chain of memory reads has to take place. 

In between come benchmarks such as CPU-intensive calculations, where
the amount of memory accessed is relatively small and there are not many
page-table updates, so there's no big difference in either direction. 
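
The trade-off between these cases can be put into a back-of-envelope
model. This is only a sketch: the function names and parameters are my
own, and any numbers you feed it are guesses rather than measurements.

```python
def shadow_overhead(page_table_updates, cycles_per_trap):
    """Cycles spent trapping to the hypervisor on page-table updates."""
    return page_table_updates * cycles_per_trap


def nested_overhead(tlb_misses, extra_reads_per_miss, cycles_per_read):
    """Cycles spent on the additional host page-table reads per TLB miss."""
    return tlb_misses * extra_reads_per_miss * cycles_per_read


def cheaper_scheme(page_table_updates, cycles_per_trap,
                   tlb_misses, extra_reads_per_miss, cycles_per_read):
    """Return 'nested' or 'shadow', whichever has the lower modelled overhead."""
    shadow = shadow_overhead(page_table_updates, cycles_per_trap)
    nested = nested_overhead(tlb_misses, extra_reads_per_miss, cycles_per_read)
    return "nested" if nested < shadow else "shadow"
```

A workload with many page-table updates and few TLB misses comes out in
favour of nested paging; one with few updates and very many TLB misses
comes out the other way.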

Just like for the x86-64 vs. x86-32 performance difference, one isn't
necessarily better than the other in individual cases, and it may even
be that the "new" one is slower. But on average, over some reasonably
different benchmarks, the overall win is with the "new" technology. 

I can't give any direct benchmarks, simply because I don't have any. 

--
Mats

> 
> Thanks for your time and thank you in advance for helping me 
> with those questions,
> 
> 
> 


