WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
xen-ia64-devel

RE: [Xen-ia64-devel] Re: PMT table for XEN/IA64 (was: RE:Transparentparavirtualization vs. xen paravirtualization)

To: "Magenheimer, Dan \(HP Labs Fort Collins\)" <dan.magenheimer@xxxxxx>, "Dong, Eddie" <eddie.dong@xxxxxxxxx>, "Matt Chapman" <matthewc@xxxxxxxxxxxxxxx>
Subject: RE: [Xen-ia64-devel] Re: PMT table for XEN/IA64 (was: RE:Transparentparavirtualization vs. xen paravirtualization)
From: "Tian, Kevin" <kevin.tian@xxxxxxxxx>
Date: Tue, 1 Nov 2005 23:19:17 +0800
Cc: "Ling, Xiaofeng" <xiaofeng.ling@xxxxxxxxx>, xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
Delivery-date: Tue, 01 Nov 2005 15:16:24 +0000
>From: Magenheimer, Dan (HP Labs Fort Collins) [mailto:dan.magenheimer@xxxxxx]
>Sent: November 1, 2005 22:32
>
>However, I agree with Matt that a PMT for other domains
>(domU) is a bad idea as it creates many problems for migration,
>save/restore, ballooning, and adding new domains to an already
>loaded system. Further, the grant table abstraction is the primary
>mechanism for page sharing for domU in Xen (on Xen/x86).
>I think if domU has any knowledge of actual machine addresses,
>the Xen team would consider this a bug that should be fixed.

Here we need to clarify one concept. When a domU acts as a driver domain, it 
becomes a necessary assistant to dom0 in co-constructing the virtual 
environment. In that case, the driver domU should also be owned by the system 
administrator, like dom0, since these domains control physical resources. 
Driver domUs are just service providers (backends), like dom0, and there's no 
need to migrate them. IMO, migration mainly applies to non-driver domUs, which 
are simply service subscribers (frontends). For those domains, migration works 
by hooking them up to different service providers on another machine.

Moreover, driver domains are necessary in a server environment (especially on 
IA64), and it would be a nightmare to have a single dom0 control all physical 
resources and act as the only backend serving all other domains.


Thanks,
Kevin
>
>> -----Original Message-----
>> From: xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx
>> [mailto:xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf
>> Of Dong, Eddie
>> Sent: Tuesday, November 01, 2005 12:09 AM
>> To: Matt Chapman; Tian, Kevin
>> Cc: Ling, Xiaofeng; xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>> Subject: RE: [Xen-ia64-devel] Re: PMT table for XEN/IA64
>> (was: RE:Transparentparavirtualization vs. xen paravirtualization)
>>
>> Matt:
>>      Yes, as you mentioned, letting domU or a VTI domain do only
>> page flipping, on the assumption that the service domain owns all
>> system pages (i.e. all other domains' pages come from the service
>> domain), works. But it eventually becomes impossible with driver
>> domains, as only one domain can own all system pages.
>> So either we start with what you proposed and roll
>> back to what X86 is doing now at some later time, for example
>> Xen3.1, or we start by aligning with Xen/X86 and save all the
>> various maintenance and rework effort. I suggest we go
>> with the design that will eventually be the right one.
>>      Yes, supporting the PMT may require modifications to
>> Xenia64Linux, but as you pointed out, domU in any case
>> (migration, memory location etc.) has to maintain a PMT table,
>> so why not let dom0 work the same way? Letting dom0 and domU share
>> as much code as possible is the right way to go IMO, right?
>>      The modification to Xenia64Linux is not that big,
>> probably only PMT setup for now, and then the VBD/VNIF work may
>> reference and modify it. It should be almost the same as the X86 approach.
>>      What specific question do you have about X86 shadow_translate? I
>> can consult an expert here too if you need :-)
>>
>>      So, now it may be time for us to dig into details of
>> how to do PMTs...:-) And dan?
>> Eddie
>>
>>
>>
>>
>> Matt Chapman wrote:
>> > I'm still not clear about the details.  Could you outline the changes
>> > that you want to make to Xen/ia64?
>> >
>> > Would DomU have a PMT?  Surely DomU should not know about real machine
>> > addresses, that should be hidden behind the grant table interface.
>> > Otherwise migration, save/restore, etc. are difficult (as they have
>> > found on x86).
>> >
>> > Do you know how x86 shadow_translate mode works?  Perhaps we should
>> > use that as an example.
>> >
>> > Matt
>> >
>> >
>> > On Mon, Oct 31, 2005 at 05:11:09PM +0800, Tian, Kevin wrote:
>> >> Matt Chapman wrote:
>> >>> 1. Packet arrives in a Dom0 SKB.  Of course the buffer needs
>> >>>    to be page sized/aligned (this is true on x86 too).
>> >>> 2. netback steals the buffer
>> >>> 3. netback donates it to DomU *without freeing it*
>> >>> 4. DomU receives the frame and passes it up its network stack
>> >>> 5. DomU gives away other frame(s) to restore balance
>> >>> 6. Dom0 eventually receives extra frames via its balloon driver
>> >>>
>> >>> 5 and 6 can be done lazily in batches.  Alternatively, 4 and 5
>> >>> could be a single "flip" operation.
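
The six steps above amount to bookkeeping over who owns which frame. A minimal sketch of that bookkeeping follows, assuming a toy model in which frames are plain integers; the class `PageFlipLedger`, its fields, and the `batch` threshold are hypothetical names for illustration and are not part of Xen or netback.

```python
class PageFlipLedger:
    """Toy ownership ledger for the page-flipping flow (all names hypothetical)."""

    def __init__(self, dom0_frames, domu_spare, batch=2):
        self.dom0 = set(dom0_frames)         # frames Dom0 currently owns
        self.domu_rx = set()                 # frames DomU received packets in
        self.domu_spare = list(domu_spare)   # DomU frames it is able to give away
        self.returned = []                   # given away, not yet reclaimed by Dom0
        self.owed = 0                        # frames DomU still has to give back
        self.batch = batch                   # how many frames to owe before rebalancing

    def deliver(self, frame):
        # Steps 1-4: the packet sits in a Dom0 frame, netback steals it and
        # donates it to DomU without freeing it; DomU now holds the frame.
        self.dom0.remove(frame)
        self.domu_rx.add(frame)
        self.owed += 1
        # Step 5, done lazily: only rebalance once a full batch is owed.
        if self.owed >= self.batch:
            self.give_back()

    def give_back(self):
        # Step 5: DomU gives away other frames to restore the balance.
        while self.owed and self.domu_spare:
            self.returned.append(self.domu_spare.pop())
            self.owed -= 1

    def balloon_reclaim(self):
        # Step 6: Dom0 eventually picks up the extra frames via its balloon driver.
        count = len(self.returned)
        self.dom0.update(self.returned)
        self.returned.clear()
        return count


ledger = PageFlipLedger(dom0_frames=[0, 1, 2, 3], domu_spare=[10, 11, 12], batch=2)
ledger.deliver(0)            # one frame owed, nothing returned yet
ledger.deliver(1)            # batch reached, DomU gives two spare frames back
reclaimed = ledger.balloon_reclaim()
```

In this sketch Dom0 ends with as many frames as it started with, but not the same ones, which is the property the later messages in the thread are arguing about.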
>> >>
>> >> The solution will work with some tweaks.  But is there any obvious
>> >> benefit over the PMT approach used on x86? (If yes, you should suggest
>> >> it to xen-devel ;-) Usually we want a different approach only when
>> >> something "can't be done on this architecture" or gives "far better
>> >> performance than the existing one". Otherwise, why deviate from the Xen
>> >> design and take on extra maintenance effort?  This extra effort has
>> >> already cost us 2+ weeks to get VBD up to support DomU for the last 2
>> >> upstream merges.
>> >>
>> >>>
>> >>> I think this is not significantly different from x86.
>> >>>
>> >>> I'm not saying this is necessarily better than a PMT solution,
>> >>> but I want to discuss the differences and trade-offs.  By PMT
>> >>> I assume you mean to make Dom0 not 1:1 mapped, and then give
>> >>> it access to the translation table?  Can you describe how the
>> >>> above works differently with a PMT?
>> >>
>> >>
>> >> In terms of work flow, the PMT approach is similar, except that the
>> >> backend/frontend need to touch the PMT table on ownership change.
>> >> However, have you evaluated how many tricky changes are required to
>> >> support Domain0 with gpn=mfn on top of the existing code? For example,
>> >>   - Backend drivers are not bound to dom0; they can also be used by
>> >> a domU acting as a driver domain. In that case, 1:1 mapping makes no
>> >> sense. There has already been some discussion of DomUs serving as
>> >> driver (IO) domains.
>> >>   - You need to ensure all available pages are granted to dom0. That
>> >> means you need to change the current dom0 allocation code.
>> >>   - You need to change the current vnif code with an unknown number of
>> >> #ifdefs and workarounds, since you are implementing new behavior on
>> >> top of a different approach.
>> >>   - ... (maintenance!)
>> >>
>> >> So if you implement a VM from scratch, then your approach is definitely
>> >> worth trying, since there is no limitation there. However, since we work
>> >> on XEN, we should take advantage of the current Xen design as much as
>> >> possible, right? ;-)
>> >>
>> >>>
>> >>> One disadvantage I see of having Dom0 not 1:1 is that superpages
>> >>> are more difficult, we can't just use the guest's superpages.
>> >>
>> >>
>> >> Superpages are an optimization option, and we still need to support
>> >> non-contiguous pages as a basic requirement. You can still add an option
>> >> to allocate contiguous pages for the guest even with a PMT table, since
>> >> para-virtualization is cooperative.
>> >>
>> >>>
>> >>> Also, are there paravirtualisation changes needed to support a
>> >>> PMT?  I'm concerned about not making the paravirtualisation
>> >>> changes too complex (I think x86 Xen changes the OS too much).
>> >>> Also, it should be possible to load Xen frontend drivers into
>> >>> unmodified OSs (on VT).
>> >>
>> >>
>> >> We need a balance between new designs and maintenance effort.
>> >> Currently Xiaofeng Ling from Intel is working on para-drivers for
>> >> unmodified domains, and both VBD & VNIF are already working for x86 VT
>> >> domains and are being reviewed by Cambridge. This work is based
>> >> on the PMT table.
>> >>
>> >> Kevin
>> >>>
>> >>> On Mon, Oct 31, 2005 at 01:28:43PM +0800, Tian, Kevin wrote:
>> >>>> Hi, Matt,
>> >>>>
>> >>>>         The point here is how to tell when a donated frame is done
>> >>>> with and where the "free" actually happens in domU. Currently the Linux
>> >>>> network driver uses zero-copy to pass a received packet up without
>> >>>> any copy. In this case, the receive pages are allocated from skbuff
>> >>>> and are freed by the upper layer rather than by the vnif driver itself.
>> >>>> To let dom0 know when the donated page is done with, you may either:
>> >>>>         - Copy the content from the donated page to a local skbuff page,
>> >>>> and then notify dom0 immediately, at the cost of performance
>> >>>>         - Modify upper layer code to register a "free" hook which notifies
>> >>>> dom0 when done, at the cost of more modification to common code and
>> >>>> divergence from x86.
>> >>>>
>> >>>>         There are definitely other possibilities to make it "work" with
>> >>>> this approach, and even more alternatives. However, the point we
>> >>>> really want to emphasize here is that we can move towards the x86
>> >>>> solution by adding a PMT, with the best performance and less maintenance
>> >>>> effort. That can actually minimize our future re-base effort as
>> >>>> para-drivers keep evolving. ;-)
>> >>>>
>> >>>> Thanks,
>> >>>> Kevin
>> >>>>
>> >>>>> -----Original Message-----
>> >>>>> From: Matt Chapman [mailto:matthewc@xxxxxxxxxxxxxxx]
>> >>>>> Sent: October 31, 2005 13:09
>> >>>>> To: Tian, Kevin
>> >>>>> Cc: Dong, Eddie; xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>> >>>>> Subject: Re: [Xen-ia64-devel] Re: PMT table for XEN/IA64 (was: RE:
>> >>>>> Transparentparavirtualization vs. xen paravirtualization)
>> >>>>>
>> >>>>> Yes, I think I understand the problem now.
>> >>>>>
>> >>>>> The way I imagine this could work is that Dom0 would know about
>> >>>>> all of the memory in the machine (i.e. it would be passed the
>> >>>>> original EFI memmap, minus memory used by Xen).
>> >>>>>
>> >>>>> Then Dom0 would donate memory for other domains (=ballooning).
>> >>>>> Dom0 can donate data frames to DomU in the same way - by granting
>> >>>>> the frame and not freeing it.  When DomU donates a data frame to
>> >>>>> Dom0, Dom0 frees it when it is done, and now the kernel can use
>> >>>>> it.
>> >>>>>
>> >>>>> What do you think of this approach?
>> >>>>>
>> >>>>> Matt
>> >>>>>
>> >>>>>
>> >>>>> On Mon, Oct 31, 2005 at 11:09:04AM +0800, Tian, Kevin wrote:
>> >>>>>> Hi, Matt,
>> >>>>>>       It's not related to the mapped virtual address, but only to the
>> >>>>>> physical/machine pfn.
>> >>>>>> The current vnif backend (on x86) works as follows:
>> >>>>>>
>> >>>>>> 1. Allocate a set of physical pfns from kernel
>> >>>>>> 2. Break the mapping between the physical pfn and the old machine pfn
>> >>>>>> 3. Transfer ownership of the old machine pfn to the frontend
>> >>>>>> 4. Allocate a new machine pfn and bind it to that physical pfn
>> >>>>>> (In this case, there's no ownership return from frontend for
>> >>>>>> performance reason)
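
The four backend steps listed above can be sketched as operations on a per-domain physical-to-machine table. This is a toy model under stated assumptions, not the x86 vnif code: `Machine`, `Domain`, `flip`, and the frame numbers are hypothetical names chosen for illustration, and the table is simply a Python dict.

```python
class Machine:
    """Toy record of which domain owns each machine frame (mfn)."""

    def __init__(self, num_frames):
        self.owner = {mfn: None for mfn in range(num_frames)}

    def alloc(self, domain_name):
        # Hand the lowest-numbered free machine frame to the given domain.
        for mfn in sorted(self.owner):
            if self.owner[mfn] is None:
                self.owner[mfn] = domain_name
                return mfn
        raise MemoryError("no free machine frames")


class Domain:
    def __init__(self, name):
        self.name = name
        self.pmt = {}   # hypothetical PMT: guest physical pfn -> machine pfn


def flip(machine, backend, frontend, backend_gpfn, frontend_gpfn):
    # Step 2: break the mapping between the physical pfn and its old machine pfn.
    old_mfn = backend.pmt.pop(backend_gpfn)
    # Step 3: transfer ownership of the old machine pfn to the frontend.
    machine.owner[old_mfn] = frontend.name
    frontend.pmt[frontend_gpfn] = old_mfn
    # Step 4: allocate a new machine pfn and bind it to the same physical pfn.
    new_mfn = machine.alloc(backend.name)
    backend.pmt[backend_gpfn] = new_mfn
    return old_mfn, new_mfn


machine = Machine(4)
dom0, domU = Domain("dom0"), Domain("domU")
# Step 1: the backend gets a physical pfn from its kernel, backed by some mfn.
dom0.pmt[100] = machine.alloc(dom0.name)
old_mfn, new_mfn = flip(machine, dom0, domU, backend_gpfn=100, frontend_gpfn=7)
```

The backend's physical pfn 100 stays valid throughout; only the machine frame behind it changes, which is why this flow cannot assume gpn=mfn for the backend domain.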
>> >>>>>>
>> >>>>>>       Without a PMT table (assuming guest==machine for dom0), that
>> >>>>>> means you have to hotplug physical pfns from the guest (at page
>> >>>>>> granularity) under the current vnif model. Or maybe you have a better
>> >>>>>> alternative that avoids both the PMT and big changes to the existing
>> >>>>>> vnif driver?
>> >>>>>>
>> >>>>>> Thanks,
>> >>>>>> Kevin
>> >>>>>>
>> >>>>>>> -----Original Message-----
>> >>>>>>> From: xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx
>> >>>>>>> [mailto:xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of
>> >>>>>>> Matt Chapman
>> >>>>>>> Sent: October 31, 2005 10:59
>> >>>>>>> To: Dong, Eddie
>> >>>>>>> Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>> >>>>>>> Subject: [Xen-ia64-devel] Re: PMT table for XEN/IA64 (was: RE:
>> >>>>>>> Transparentparavirtualization vs. xen paravirtualization)
>> >>>>>>>
>> >>>>>>> Hi Eddie,
>> >>>>>>>
>> >>>>>>> The way I did it was to make the address argument to grant
>> >>>>>>> hypercalls in/out; that is, the hypervisor might possibly return
>> >>>>>>> a different address than the one requested, like mmap on UNIX.
>> >>>>>>>
>> >>>>>>> For DomU, the hypervisor would map the page at the requested
>> >>>>>>> address.  For Dom0, the hypervisor would instead return the
>> >>>>>>> existing address of that page, since Dom0 already has access
>> >>>>>>> to the whole address space.
>> >>>>>>>
>> >>>>>>> (N.B. I'm referring to physical/machine mappings here; unlike
>> >>>>>>> the x86 implementation where the grant table ops map pages
>> >>>>>>> directly into virtual address space).
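
The in/out address idea described above can be illustrated with a small sketch. It assumes a toy domain record with an `identity_mapped` flag and a `p2m` dict; both fields and the function `map_grant` are hypothetical and do not correspond to the real grant table hypercall interface.

```python
def map_grant(domain, requested_gpaddr, machine_addr):
    """Return the physical address at which the granted page is visible.

    The address is an in/out value: the caller passes the address it would
    like, and must use whatever address comes back (compare mmap on UNIX).
    """
    if domain["identity_mapped"]:
        # Dom0 case: it already has access to the whole address space, so the
        # request is ignored and the page's existing address is returned.
        return machine_addr
    # DomU case: map the page at the address the caller asked for.
    domain["p2m"][requested_gpaddr] = machine_addr
    return requested_gpaddr


dom0 = {"identity_mapped": True, "p2m": {}}
domU = {"identity_mapped": False, "p2m": {}}

dom0_addr = map_grant(dom0, requested_gpaddr=0x4000, machine_addr=0x9000)
domU_addr = map_grant(domU, requested_gpaddr=0x4000, machine_addr=0x9000)
```

The same call returns a different address for each kind of domain, so a caller written against this interface never needs to know whether it is 1:1 mapped.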
>> >>>>>>>
>> >>>>>>> Matt
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> On Fri, Oct 28, 2005 at 10:28:08PM +0800, Dong, Eddie wrote:
>> >>>>>>>>>  Page flipping should work just fine
>> >>>>>>>>> in the current design; Matt had it almost working (out of
>> >>>>>>>>> tree) before he went back to school.
>> >>>>>>>>>
>> >>>>>>>> Matt:
>> >>>>>>>>     Dan mentioned that you had the VNIF work almost done without PMT
>> >>>>>>>> table support for dom0. Can you share the idea with us?
>> >>>>>>>>     Usually VNIF swaps pages between dom0 and domU so that the network
>> >>>>>>>> packet copy (between the dom0 native driver and the domU frontend
>> >>>>>>>> driver) can be avoided, thus achieving high performance. With
>> >>>>>>>> this swap, we can no longer assume dom0 gpn=mfn. So what did
>> >>>>>>>> you propose for porting VNIF without a PMT table? Thanks a
>> >>>>>>>> lot, eddie
>> >>>>>>>
>> >>>>>>> _______________________________________________
>> >>>>>>> Xen-ia64-devel mailing list
>> >>>>>>> Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>> >>>>>>> http://lists.xensource.com/xen-ia64-devel