
Re: [Xen-devel] x86's context switch ordering of operations


  • To: Jan Beulich <jbeulich@xxxxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxx>
  • From: Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
  • Date: Tue, 29 Apr 2008 13:50:30 +0100
  • Delivery-date: Tue, 29 Apr 2008 05:50:34 -0700
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>
  • Thread-index: Acip95rZ2WtDFhXqEd2QpAAWy6hiGQ==
  • Thread-topic: [Xen-devel] x86's context switch ordering of operations

On 29/4/08 13:39, "Jan Beulich" <jbeulich@xxxxxxxxxx> wrote:

> To do so, I was considering using {un,}map_domain_page() from
> the context switch path, but there are two major problems with the
> ordering of operations:
> - for the outgoing task, 'current' is changed before the
> ctxt_switch_from() hook is called
> - for the incoming task, write_ptbase() happens only after the
> ctxt_switch_to() hook has already been called
> I'm wondering whether there are hidden dependencies that require
> this particular (somewhat unnatural) ordering.

ctxt_switch_{from,to} exist only in x86 Xen and are called from a single
hook point in the common scheduler. Thus either they both happen before, or
they both happen after, 'current' is changed by the common scheduler. It
took a while for the scheduler interfaces to settle down to something both
x86 and ia64 were happy with, so I'm not particularly excited about revisiting
them. I'm not sure why you'd want to map_domain_page() on context switch
anyway: the 32-bit map_domain_page() implementation is inherently per-domain
already.
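
For reference, the ordering being discussed reduces to something like the
minimal, self-contained sketch below. It only models the sequence of calls
that Jan describes (the real code lives in xen/arch/x86/domain.c); the
function bodies and the standalone main() are purely illustrative.

    #include <stdio.h>

    /* Stand-ins for the Xen types and helpers involved; only the call
     * ordering in context_switch() below is meaningful. */
    struct vcpu { const char *name; };

    static struct vcpu *current_vcpu;                /* models 'current' */

    static void set_current(struct vcpu *v) { current_vcpu = v; }
    static void ctxt_switch_from(struct vcpu *v)
    {
        printf("ctxt_switch_from(%s), current is already %s\n",
               v->name, current_vcpu->name);
    }
    static void ctxt_switch_to(struct vcpu *v)
    {
        printf("ctxt_switch_to(%s)\n", v->name);
    }
    static void write_ptbase(struct vcpu *v)
    {
        printf("write_ptbase(%s)\n", v->name);
    }

    static void context_switch(struct vcpu *prev, struct vcpu *next)
    {
        set_current(next);      /* 'current' changes first ...             */
        ctxt_switch_from(prev); /* ... so the outgoing hook already sees
                                   the new 'current' ...                   */
        ctxt_switch_to(next);   /* ... and the incoming hook runs ...      */
        write_ptbase(next);     /* ... before the new page tables load.    */
    }

    int main(void)
    {
        struct vcpu prev = { "prev" }, next = { "next" };
        current_vcpu = &prev;
        context_switch(&prev, &next);
        return 0;
    }

The two points Jan raises correspond to the first and last lines of
context_switch() relative to the two hooks.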

> 1) How does the storing of vcpu_info_mfn in the hypervisor survive
> migration or save/restore? The mainline Linux code, which uses this
> hypercall, doesn't appear to make any attempt to revert to the
> default location during suspend or to re-establish the alternate location
> during resume (but of course I'm not sure that the guest is save/restore/
> migrate-ready in the first place). I would imagine it to be at least
> difficult for the guest to manage its state post-resume without the
> hypervisor having restored the previously established alternative
> placement.

I don't see that it would be hard for the guest to do it itself before
bringing back all VCPUs (either by bringing them up or by exiting the
stopmachine state). Is save/restore even supported by pv_ops kernels yet?
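
What Keir suggests could look roughly like the guest-side sketch below,
run for each VCPU during resume before that VCPU starts relying on its
vcpu_info again. The hypercall and structure are the standard
VCPUOP_register_vcpu_info interface from xen/include/public/vcpu.h; the
two helper functions are hypothetical stand-ins, not mainline Linux code.

    #include <xen/interface/vcpu.h>
    #include <asm/xen/hypercall.h>

    /* Guest-side sketch: re-establish the relocated vcpu_info after
     * resume.  vcpu_info_page_mfn() and vcpu_info_page_offset() are
     * hypothetical helpers standing in for however the guest tracks
     * its per-CPU vcpu_info placement. */
    static int xen_reregister_vcpu_info(unsigned int cpu)
    {
        struct vcpu_register_vcpu_info info = {
            .mfn    = vcpu_info_page_mfn(cpu),     /* hypothetical helper */
            .offset = vcpu_info_page_offset(cpu),  /* hypothetical helper */
        };

        /* Must run before the VCPU is brought back up (or before it
         * leaves the stop_machine state), since the hypervisor no
         * longer knows about the alternate placement after restore. */
        return HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_info, cpu, &info);
    }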

> 2) The implementation in the hypervisor seems to have added yet another
> scalability issue (on 32-bit), as this is carried out using
> map_domain_page_global() - if there are sufficiently many guests with
> sufficiently many vCPUs, there just won't be any space left at some
> point. This worries me especially in the context of seeing a call to
> sh_map_domain_page_global() that is followed by a BUG_ON() checking
> whether the call failed.

The hypervisor generally assumes that vcpu_info structures are permanently and
globally mapped. That obviously places an unavoidable scalability limit on
32-bit Xen. I have no problem with telling people who are concerned about
the limit to use 64-bit Xen instead.
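
For context, the 32-bit concern reduces to something like the sketch
below: the registration path needs a mapping that stays valid in every
address space, map_domain_page_global() hands those out from a fixed-size
region, and once that region is exhausted the call fails. This is a
simplified illustration, not the actual VCPUOP_register_vcpu_info handler;
map_vcpu_info_sketch() and its exact error handling are illustrative only.

    /* Hypervisor-side sketch (simplified, not the real handler in
     * xen/arch/x86/domain.c).  On 32-bit Xen, map_domain_page_global()
     * allocates from a fixed global mapping area, so with enough guests
     * and VCPUs it can eventually return NULL. */
    static int map_vcpu_info_sketch(struct vcpu *v, unsigned long mfn,
                                    unsigned int offset)
    {
        void *mapping = map_domain_page_global(mfn);

        if ( mapping == NULL )
            return -ENOMEM;  /* the failure Jan worries a BUG_ON() hides */

        v->vcpu_info = (vcpu_info_t *)((char *)mapping + offset);
        return 0;
    }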

 -- Keir



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

