Re: [Xen-devel] Could Xen hyperviosr be able to invoke Linux systemcalls?
On Tue, 2015-08-18 at 01:18 +0000, Kun Cheng wrote:
> On Tue, Aug 18, 2015 at 3:25 AM Dario Faggioli
> <dario.faggioli@xxxxxxxxxx> wrote:
> > On Mon, 2015-08-17 at 00:55 +0000, Kun Cheng wrote:
> > > On Mon, Aug 17, 2015 at 12:16 AM Frediano Ziglio
> > > <freddy77@xxxxxxxxx>
> > >
> > > What I'm planning is adding page migration support for NUMA
> > > aware scheduling. In such a case, most of the time I'll be
> > > dealing with Xen's memory management & scheduling part to make
> > > relevant pages migrate to another node with their VCPU. However,
> > > Linux kernel has already implemented some basic mechanisms so
> > > the whole work would be better by leveraging the kernel's
> > > existing code or functions.
> > >
> > No, not at all. As you figured (or at least had intuition about)
> > yourself, Xen does run below Linux. Actually, it runs below any
> > guest, including Dom0, which is a special guest but still a guest,
> > and can even not be a Linux guest.
> >
> > So there's no code sharing, or no mechanism to invoke Linux code
> > and have it affect Xen's scheduling or memory management (and
> > never will be :-P).
> >
> Not being able to share the existing kernel mechanism is some kind of
> frustrating......
>
You think? Well, I guess I see what you mean. However, being able to do
custom things, specifically tailored to the kind of workload that Xen
focuses on (i.e., virtualization, of course), instead of having to rely
on tweaking a general purpose operating system, trying to bend it as
much as possible to some specific needs (i.e., basically, what KVM is
doing), is one of Xen's strengths.

Then, whether or not we always manage to take proper advantage of that
is a different matter.

> But just as you said it's the point of virtualization. And now I gain
> a better understanding why you said it would be tough ;) (I start to
> envy KVM guys, LOL)
>
Yeah, sometimes it happens that they get something sort of "for free",
but I really believe what I just said above, so no envy.
:-)

> > So, in summary, what you're after should be achieved entirely
> > inside Xen. It is possible that, in the PV guest case, you'd need
> > some help from the guest. However, that would be in the form of
> > "Xen asking/forcing the guest to do something on the *guest*
> > *itself*", not in the form of "Xen asking dom0 to do something on
> > Xen's own memory/scheduling or (directly) on other guests' memory".
> >
> > Hope this helps clearing things out for you. :-)
> >
> At this point I still have other plans. But 'asking the guest to do
> something on the guest itself' sounds like exposing the virtual NUMA
> topology to the guest (vNUMA).
>
How so? We already have it, although it's not yet fully usable
(precisely for PV guests) due to other issues. But I don't see what
that has to do with what we're talking about.

In the PV case, what virtual NUMA topology takes is:
 - the tools and the hypervisor being able to allocate memory for the
   guest in a specific way (matching the topology we want the guest to
   have)
 - the hypervisor to store the virtual topology somewhere, in order to
   be able to provide it to the guest
 - the guest to ask about its own NUMA topology via a PV path
   (hypercalls), rather than via ACPI (which basically doesn't exist
   in PV)

Again, what does this have to do with memory migration?

> I wrote this email because hypervisor is responsible to allocate
> machine memory for each guest. Then, in a PV case there are P2M and
> M2P to help address translation (and shadow page tables in HVMs). So
> what first came to my mind was hypervisor should move the pages for
> guests and then P2M things should better be renewed somehow. However
> inside a guest domain, its OS can only manage the guest physical
> memory, which I don't think is able to be moved to another node by
> itself.
>
A PV guest knows about the fact that it is a PV guest (that's the point
of paravirtualization), and in fact, it performs hypercalls and
everything.
However, such knowledge does not go as far as being aware of the host
NUMA layout, and being able to move its own memory to a different NUMA
node in the host.

What I recommend is to have a look at the migration code. It's kind of
a beast, I know, but it's been rewritten almost from scratch just very
recently, and I'm sure now it's a lot better and easier to understand
than before.

Reason I'm suggesting this is that, particularly for PV, moving the
guest's RAM under its own feet is going to be possible only with
something similar to performing a local migration. The main difference
is that we may want to be able to do it more 'lively' (i.e., without
stopping the world, even for a small amount of time, as it happens in
migration), as well as that we may want to be able to move specific
chunks of memory, rather than all of it.

These are not small differences, and the migration code probably
wouldn't be reusable as it is, but it's the closest thing to what
you're saying you're trying to achieve that I can imagine.

> Maybe I misunderstood your words... 'asking the guest to do something
> on the guest itself' confuses me a bit, could you explain more details
> of your thought if it's convenient for you?
>
Yeah, my bad. Perhaps, for now, it's better if you forget about this.

Very quickly, what I was hinting at is some mechanisms that we could
come up with (but that will be one of the last steps) for putting the
PV guest under some kind of quiescent state, i.e., a state where it
does not change its page tables --as we're fiddling with them--
without being completely suspended.

If we ever get there, I think that this could only be done with some
cooperation from the guest, e.g., having it go through a protocol that
we'd need to define, upon request from the hypervisor. But that's just
speculation at this time, and we really shouldn't think about it until
we get there... It's not like there aren't super difficult problems to
solve already!
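To make the above a little less abstract, here is a toy model of what
moving a page "under the guest's feet" boils down to for the P2M and
M2P tables. It is only a sketch in plain Python, not Xen code:
`ToyHost`, `make_host`, `map_page`, `migrate_page` and the `quiesced`
flag are all names made up for illustration, and the `quiesced` check
merely stands in for the (yet to be defined) protocol mentioned above.

```python
# Toy model only, NOT Xen code. Every name below is hypothetical.
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyHost:
    free_mfns: Dict[int, List[int]]   # NUMA node -> free machine frames
    node_of: Dict[int, int]           # machine frame -> NUMA node
    p2m: Dict[int, int] = field(default_factory=dict)   # guest PFN -> MFN
    m2p: Dict[int, int] = field(default_factory=dict)   # MFN -> guest PFN
    contents: Dict[int, bytes] = field(default_factory=dict)  # MFN -> data
    quiesced: bool = False            # stand-in for a guest quiescing protocol

    def map_page(self, pfn: int, node: int, data: bytes) -> int:
        """Back guest frame `pfn` with a free machine frame on `node`."""
        mfn = self.free_mfns[node].pop()
        self.p2m[pfn] = mfn
        self.m2p[mfn] = pfn
        self.contents[mfn] = data
        return mfn

    def migrate_page(self, pfn: int, dst_node: int) -> int:
        """Move the machine frame behind `pfn` to `dst_node`."""
        if not self.quiesced:
            # Assumption: the guest must not touch its mappings meanwhile.
            raise RuntimeError("guest must be quiesced before moving pages")
        old = self.p2m[pfn]
        if self.node_of[old] == dst_node:
            return old                        # already on the right node
        new = self.free_mfns[dst_node].pop()  # allocate on the target node
        self.contents[new] = self.contents.pop(old)  # copy the data over
        self.p2m[pfn] = new                   # renew P2M ...
        self.m2p[new] = pfn                   # ... and M2P
        del self.m2p[old]
        self.free_mfns[self.node_of[old]].append(old)  # release old frame
        return new


def make_host(nodes: int = 2, frames_per_node: int = 4) -> ToyHost:
    """Build a host whose machine frames are split evenly across nodes."""
    free: Dict[int, List[int]] = {}
    node_of: Dict[int, int] = {}
    for n in range(nodes):
        mfns = list(range(n * frames_per_node, (n + 1) * frames_per_node))
        free[n] = mfns
        for m in mfns:
            node_of[m] = n
    return ToyHost(free_mfns=free, node_of=node_of)
```

Note that the guest's PFN never changes; only the machine frame behind
it does. A real PV guest also has MFNs in its own page tables, which
this sketch does not model, and that is precisely why some form of
quiescing would be needed.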
:-P

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel