xen-devel

[Xen-devel] Re: lazy context switching

On Aug 26, 2005, at 4:37 AM, Keir Fraser wrote:
On 25 Aug 2005, at 22:55, Hollis Blanchard wrote:

Later on, if it turns out we are switching domains, we save/restore all the state we can, then return to the exception handler, which saves the old set of nonvolatiles and loads the new one. Until that point, some domain state is spread arbitrarily across our stack.

That means that context_switch() cannot actually save all of @prev's state to memory (and neither can __sync_lazy_execstate()) -- only by returning all the way to assembly can we accomplish that.

Thoughts?
What you need is a synchronisation point, visible to other CPUs, 
beyond which things like DOM0_GETVCPUCONTEXT can be sure to read 
consistent current state for the descheduled vcpu. See 
domain_sleep_sync() for the current way we ensure that state is 
committed to memory.
Hmmmmm. I think the basic problem is that in the exception handler we 
don't usually know we will need this state. The one case where we do 
know is a debug exception, because the GDB stub will need it.
However, we also have a hypervisor-dedicated timer, HDEC (hypervisor 
decrementer). Rather than using it as a plain tick which may or may not 
cause a scheduler exception, we can use it to *always* mean a context 
switch. In that case, we would always save the full state on HDEC 
entry, because we know it will always cause a context switch. Judging 
by set_ac_timer() callers, it seems that only the scheduler really uses 
the Xen timer tick. If non-scheduler components start using 
Xen-internal ticks, this approach wouldn't hold up (or rather, it would 
start becoming less efficient).
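
To make the idea concrete, here is a rough sketch in C of the entry-path policy described above. It is only a model, not Xen code: the struct layouts, register counts, and function names (exception_entry, commit_prev_state) are all hypothetical, and the real save/restore would be done in assembly. Ordinary exceptions save only the volatiles; HDEC and debug exceptions save everything up front, so by the time the scheduler runs, the outgoing vcpu's state is already complete in memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Illustrative register counts only; not the real PowerPC split. */
#define NUM_VOLATILE    13
#define NUM_NONVOLATILE 19

/* Stand-in for the live register file of a physical CPU. */
struct cpu {
    uint64_t volatiles[NUM_VOLATILE];
    uint64_t nonvolatiles[NUM_NONVOLATILE];
};

/* Hypothetical in-memory save area for one vcpu. */
struct vcpu_regs {
    uint64_t volatiles[NUM_VOLATILE];
    uint64_t nonvolatiles[NUM_NONVOLATILE];
    bool     full_state_saved;  /* true once nonvolatiles are in memory too */
};

enum exc_kind { EXC_ORDINARY, EXC_HDEC, EXC_DEBUG };

/* Model of the exception entry path: volatiles are always saved,
 * nonvolatiles only when we know up front they will be needed. */
void exception_entry(const struct cpu *cpu, struct vcpu_regs *save,
                     enum exc_kind kind)
{
    memcpy(save->volatiles, cpu->volatiles, sizeof save->volatiles);
    save->full_state_saved = false;

    if (kind == EXC_HDEC || kind == EXC_DEBUG) {
        /* HDEC always means a context switch in this scheme, and the
         * debug exception feeds the GDB stub: save everything now. */
        memcpy(save->nonvolatiles, cpu->nonvolatiles,
               sizeof save->nonvolatiles);
        save->full_state_saved = true;
    }
}

/* Stand-in for the C-level context switch: it can commit the outgoing
 * vcpu's state only if the entry path already saved all of it. */
void commit_prev_state(const struct vcpu_regs *prev, struct vcpu_regs *out)
{
    assert(prev->full_state_saved);
    *out = *prev;
}
```

The cost is the one mentioned above: every HDEC pays for a full save, which is only a good trade as long as HDEC really does mean "context switch".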
Would that also work for DOM0_GETVCPUCONTEXT? Let's assume the dom0 
vcpu and the target vcpu are running on separate dedicated processors. 
In that case, dom0 could wait for the target vcpu to take an HDEC at 
some point in the future, but if it really is a dedicated vcpu then we 
would want the schedule interval to be the maximum, so that could be a 
long time. Another option is to have vcpu_pause() end up resetting the 
target vcpu's processor's HDEC via an IPI, which would cause a fake 
scheduler HDEC to go off, synchronizing the target vcpu's state.
What do you think?
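
For what it's worth, the two options can be compared with a toy single-threaded model in C. Everything here is hypothetical (struct pcpu, force_hdec_ipi, vcpu_pause_sync are made-up names, and the decrementer is simplified to "fires once it has counted down to zero"); it only illustrates why forcing the HDEC bounds the wait, whereas waiting for the natural HDEC takes as long as the schedule interval.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of the processor running the target vcpu. */
struct pcpu {
    uint32_t hdec;          /* ticks left until the HDEC fires */
    bool     state_synced;  /* full vcpu state has been written to memory */
};

/* Models the IPI handler on the target processor: expire HDEC at once. */
void force_hdec_ipi(struct pcpu *target)
{
    target->hdec = 0;
}

/* Advance the target processor by one tick.  Returns true if the HDEC
 * fired, which in this scheme always saves the full vcpu state. */
bool pcpu_tick(struct pcpu *p)
{
    if (p->hdec > 0) {
        p->hdec--;
        return false;
    }
    p->state_synced = true;
    return true;
}

/* Models the pausing side: optionally kick the target, then wait until
 * its state is in memory.  Returns how many ticks the wait took. */
unsigned vcpu_pause_sync(struct pcpu *target, bool send_ipi)
{
    unsigned ticks = 0;

    if (send_ipi)
        force_hdec_ipi(target);

    while (!target->state_synced) {
        pcpu_tick(target);
        ticks++;
    }
    return ticks;
}
```

In this model a target with hdec = 1000 takes 1001 ticks to sync if we just wait, and 1 tick if we send the IPI first.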

If you have a lot of register state, have you considered maintaining a Xen stack per VCPU? The context-switch interface already supports this, for ia64.
We have plenty of space on the per-CPU stack for the register state (we 
use it anyways on a debug exception for the GDB stub). And even if we 
had one stack per VCPU, we would still want to avoid unnecessarily 
saving/restoring the nonvolatiles...
--
Hollis Blanchard
IBM Linux Technology Center

