This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


RE: [Xen-devel] Scheduler portability problem

To: "Magenheimer, Dan (HP Labs Fort Collins)" <dan.magenheimer@xxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxxxx>
Subject: RE: [Xen-devel] Scheduler portability problem
From: "Tian, Kevin" <kevin.tian@xxxxxxxxx>
Date: Wed, 9 Mar 2005 16:28:51 +0800
Delivery-date: Wed, 09 Mar 2005 08:30:09 +0000
Envelope-to: xen+James.Bulpin@xxxxxxxxxxxx
List-archive: <http://sourceforge.net/mailarchive/forum.php?forum=xen-devel>
List-help: <mailto:xen-devel-request@lists.sourceforge.net?subject=help>
List-id: List for Xen developers <xen-devel.lists.sourceforge.net>
List-post: <mailto:xen-devel@lists.sourceforge.net>
List-subscribe: <https://lists.sourceforge.net/lists/listinfo/xen-devel>, <mailto:xen-devel-request@lists.sourceforge.net?subject=subscribe>
List-unsubscribe: <https://lists.sourceforge.net/lists/listinfo/xen-devel>, <mailto:xen-devel-request@lists.sourceforge.net?subject=unsubscribe>
Sender: xen-devel-admin@xxxxxxxxxxxxxxxxxxxxx
Thread-index: AcUkMLMmuQFw0055QJSt5e32tY6tcgATgkBg
Thread-topic: [Xen-devel] Scheduler portability problem
Hi, Dan,
        Your finding is a real problem for porting XEN to archs like IA64,
which have large register files. Current XEN/x86 adopts a so-called
continuation mechanism to provide only one HV stack per LP, shared by
all domains running on that LP. A simple flow for a context switch can
be:

1. Scheduler picks a new domain
2. In switch_to:
        - Save the domain context (xen_regs) into prev
        - Load next's domain context to the bottom of the stack (xen_regs)
3. Then schedule_tail simply does assembly tricks, as you said, to
reset the stack pointer to the xen_regs area and resume the new domain
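
The flow above can be sketched roughly as follows. This is only a toy
model, not the real XEN code: the reduced xen_regs layout, struct
cpu_stack, and demo_continuation_switch are made up for illustration,
and schedule_tail here just reports where the stack pointer would be
reset to instead of doing the assembly.

```c
#include <assert.h>

/* Hypothetical, heavily reduced register frame; the real xen_regs is larger. */
struct xen_regs { unsigned long ip, sp, gpr[4]; };

struct domain {
    int id;
    struct xen_regs saved;      /* per-domain copy of the guest context */
};

/* One hypervisor stack per logical processor. The frame is modelled at the
 * end of the array, assuming a stack that grows downwards. */
struct cpu_stack {
    unsigned char scratch[4096 - sizeof(struct xen_regs)];
    struct xen_regs frame;
};

static struct cpu_stack this_cpu_stack;

/* Two arguments suffice: only the frame contents change, not the stack. */
static void switch_to(struct domain *prev, struct domain *next)
{
    prev->saved = this_cpu_stack.frame;   /* save prev's context */
    this_cpu_stack.frame = next->saved;   /* load next's context */
}

/* Stand-in for the assembly: return the frame the stack pointer would be
 * reset to before resuming the new domain. */
static struct xen_regs *schedule_tail(struct domain *next)
{
    (void)next;
    return &this_cpu_stack.frame;
}

/* Illustrative driver: domain 1 is running, domain 2 is picked next. */
unsigned long demo_continuation_switch(void)
{
    struct domain d1 = { .id = 1 }, d2 = { .id = 2 };

    d2.saved.ip = 0x2000;
    this_cpu_stack.frame.ip = 0x1000;     /* d1's live context */

    switch_to(&d1, &d2);
    assert(d1.saved.ip == 0x1000);        /* d1's context was preserved */
    return schedule_tail(&d2)->ip;        /* resume point of d2 */
}
```

Nothing above the frame on the shared stack needs to survive the
switch, which is why discarding it is safe in this model.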

        This flow is elegant given the small context of x86: it saves
the time of normal function exits, since the stack content is known to
be useless under this continuation mechanism. Also, this way, two
parameters are enough for switch_to, since no stack switch happens at

        IA64, say, has large register files (n Kbytes) and, in
particular, a hardware engine to manage the stacked registers. Both
performance and implementation difficulty are dramatically affected if
we keep the same mechanism there. So, yes, we need to find a generic way
to let both mechanisms (per-LP stack and per-domain stack) co-exist. A
quick look through the code suggests the first and major blocker is the
BUG at the end of __enter_scheduler. If we move that check into the
arch-specific schedule_tail, letting each arch decide whether it wants
a normal return, per-domain stacks may start to work if we are
fortunate. As long as the execution path follows the normal function
return path to the assembly stub, the ia64_switch_to you ported from IPF
linux can work smoothly. However, you are right, we need comments from a
broader set of developers to see what a complete solution should be. :)
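
To make the idea of moving the check concrete, here is a rough sketch.
It is an assumption about how it could look, not existing XEN code:
enter_scheduler, the two schedule_tail variants, enum exit_path, and
demo_exit are hypothetical names, and setjmp/longjmp only stands in for
the x86 assembly that resets the stack pointer.

```c
#include <assert.h>
#include <setjmp.h>

struct domain { int id; };

enum exit_path { EXIT_BY_JUMP = 1, EXIT_BY_RETURN = 2 };

static jmp_buf exit_stub;   /* models the assembly exit stub */

/* x86-like port: never returns, jumps straight to the exit stub.
 * A "must not get here" check would live in this function, after the jump. */
static void x86_style_schedule_tail(struct domain *next)
{
    (void)next;
    longjmp(exit_stub, EXIT_BY_JUMP);
}

/* ia64-like port: plain return, so each caller unwinds normally and the
 * state saved on the per-domain stack is restored on the way out. */
static void ia64_style_schedule_tail(struct domain *next)
{
    (void)next;
}

/* Generic scheduler tail end: no unconditional BUG after the hook, because
 * returning from schedule_tail is legal for some architectures. */
static enum exit_path enter_scheduler(void (*schedule_tail)(struct domain *),
                                      struct domain *next)
{
    if (setjmp(exit_stub) != 0)
        return EXIT_BY_JUMP;      /* reached via the x86-style jump */

    schedule_tail(next);
    return EXIT_BY_RETURN;        /* reached via a normal return */
}

/* Illustrative driver: pick the hook a given port would install. */
enum exit_path demo_exit(int per_domain_stacks)
{
    struct domain next = { .id = 1 };

    return enter_scheduler(per_domain_stacks ? ia64_style_schedule_tail
                                             : x86_style_schedule_tail,
                           &next);
}
```

The generic code no longer cares which exit path is taken; each arch
hook decides for itself.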

>-----Original Message-----
>From: xen-devel-admin@xxxxxxxxxxxxxxxxxxxxx
>[mailto:xen-devel-admin@xxxxxxxxxxxxxxxxxxxxx] On Behalf Of
>Magenheimer, Dan (HP Labs Fort Collins)
>Sent: Tuesday, March 08, 2005 2:47 PM
>To: xen-devel@xxxxxxxxxxxxxxxxxxxxx
>Subject: [Xen-devel] Scheduler portability problem
>I am working on Xen/ia64 changes (within Xen itself) to support
>multiple domains and ran into the following problem:
>It appears that __enter_scheduler was derived from an old version
>of the Linux scheduler ("schedule()"), with some changes made for
>simplification.  Many of the function names are the same but some
>of the syntax and semantics have changed.  In particular, four
>of note:
>1) switch_to now takes two arguments instead of three, and
>2) after switch_to is called, "other things" are done which
>   utilize the "next" pointer
>3) schedule_tail is passed the "next" task, rather than "prev"
>4) schedule_tail is assumed to never return
>I'm all for simplification if the Linux code is too complicated,
>but in this case, some of the complexity is present to support
>other architectures.  I can speak for ia64 but I suspect that
>similar problems will occur with other non-x86 ports.
>On Linux, switch_to is actually a macro and on ia64, another routine
>is called which returns a value that is "passed back" in the
>third switch_to argument.  Why?  Because switch_to actually does a task
>switch and the world may be very different when it returns.
>In particular, the values for prev and next are *different* when
>it returns.  Why?  Because switch_to (at least on ia64) is the
>key point where all of the current task state is put in memory,
>stacks are changed, and the new task state is taken back out
>of memory.  Actually, that's not quite accurate... at the point of
>the call to switch_to, a fair amount of state has *already* been
>put in memory in both the memory stack and the register stack.
>The only way to restore this state (short of some very complex
>stack analysis) is to exit each routine in the call stack the same
>way as it was called.
>So, on Linux, after the call to switch_to, "next" is no longer
>valid and is not used.  "Prev" is used only because of the third
>argument macro trick, and "current" has already been changed to
>point to the new task.
>On Xen/x86, it appears schedule_tail never returns because some cool
>assembly tricks are used to jump directly to the right place,
>basically as if throwing an exception (I'm guessing because there is no
>useful state on the call stack on x86).  As previously noted, this is
>problematic on ia64.
>Bottom line: The current code in __enter_scheduler() does not easily
>accommodate other architectures.  I'll be taking a look at what it
>will take to "fix" it, but wanted to open discussion first.  I know
>there are some that will say "just change the ia64 code"... because
>of architectural constraints, this is far FAR more easily said than
>done.  And there are some that will say that mimicking Linux is
>a mistake because XINL (Xen is not Linux).  However, I believe this
>is a case where leveraging the many many years of experience on many
>many architectures (with said experience only documented in the code
>itself) of Linux will benefit Xen portability in the long run (and,
>in my case, in the short run).
>SF email is sponsored by - The IT Product Guide
>Read honest & candid reviews on hundreds of IT Products from real
>Discover which products truly live up to the hype. Start reading now.
>Xen-devel mailing list
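
On the three-argument switch_to described in the quoted message, a
small user-space model may help show why "prev" has to be passed back
when every task has its own stack. This is not the Linux or ia64 code:
it assumes a POSIX system providing <ucontext.h>, and struct task, the
switch_to macro, and demo_three_arg_switch are illustrative names
only. Task A switches to B, B to C, and C back to A.

```c
#include <assert.h>
#include <stddef.h>
#include <ucontext.h>

/* Hypothetical task record: every task owns its own stack. */
struct task {
    const char *name;
    ucontext_t  ctx;
    char        stack[64 * 1024];
};

static struct task task_a, task_b, task_c;
static struct task *last_ran;   /* handed across the stack switch */

/* Three-argument form: "last" receives the task we were switched from,
 * which the resumed task cannot know from its own stale locals. */
#define switch_to(prev, next, last)                  \
    do {                                             \
        last_ran = (prev);                           \
        swapcontext(&(prev)->ctx, &(next)->ctx);     \
        (last) = last_ran;                           \
    } while (0)

static void run_b(void)
{
    struct task *prev = &task_b, *next = &task_c;
    switch_to(prev, next, prev);    /* B -> C; B is not resumed again */
}

static void run_c(void)
{
    struct task *prev = &task_c, *next = &task_a;
    switch_to(prev, next, prev);    /* C -> A; C is not resumed again */
}

static void init_task(struct task *t, const char *name, void (*fn)(void))
{
    t->name = name;
    getcontext(&t->ctx);
    t->ctx.uc_stack.ss_sp = t->stack;
    t->ctx.uc_stack.ss_size = sizeof t->stack;
    t->ctx.uc_link = NULL;
    makecontext(&t->ctx, fn, 0);
}

/* Runs as task A and returns the name of the task A was resumed from. */
const char *demo_three_arg_switch(void)
{
    struct task *prev = &task_a, *next = &task_b;

    task_a.name = "A";
    init_task(&task_b, "B", run_b);
    init_task(&task_c, "C", run_c);

    switch_to(prev, next, prev);    /* A -> B -> C -> back to A */

    assert(next == &task_b);        /* A's local "next" is now stale */
    return prev->name;              /* refreshed by the third argument */
}
```

When A resumes, its local "next" still names B, while the task that
actually ran last is C. Only the value passed back through the third
argument tells A that.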

