
Re: [Xen-devel] How to migrate vCPUs based on Credit Scheduler



On Mon, 2017-04-03 at 18:18 +0100, Lars Kurth wrote:
> Adding George, Dario & Anshul
>
Hey, hello,

Thanks Lars for the heads up...

> > On 3 Apr 2017, at 17:21, 甘清甜 <qingtiangan@xxxxxxxxx> wrote:
> > 
> > Hi,
> > 
> >  I'm now designing new vCPU scheduler in Xen, and trying to
> > implement 
> > the scheduler based on the Credit scheduler in Xen-4.5.1.
>
Can I ask what the purpose and the end goal of this is? 99% of the
time, knowing that helps in giving good advice.

Also, are you forced to use 4.5.x for some specific reason? Because, if
that is not the case, it's always better to base your work on the most
recent code base possible (which, ideally, would be the upstream git
repo or, if that's not possible or convenient, the latest released
version, i.e., 4.8, and soon enough, 4.9-rc).

> >  But I encountered
> >  some problems when debugging the code.
> > 
> > Most of the code modification is done in function csched_schedule()
> > in file xen/common/csched_schedule.c. The core code is as follows:
> > 
> > if( vcpu_runnable(current) )
> > {
> >         if( match the migration condition )
> >         {
> >             /* pick_pcpu_runq() is defined by myself */
> >             cpu_affinity = pick_pcpu_runq();
> >
> >             pcpulock = pcpu_schedule_lock_irqsave(cpu_affinity,
> >                    &pcpulock_flag);
> > 
Err... It's rather hard to comment without seeing the code. All of it,
I mean. For instance, you're already in csched_schedule(), called, say,
on cpu X, and hence you hold the scheduler lock of pcpu X.

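Calling pcpu_schedule_lock() like above carries a high risk of deadlock
(you would be waiting for another pcpu's lock while holding X's, and
that pcpu may be doing the same in reverse), depending on what "match
the migration condition" actually means, and on how pick_pcpu_runq() is
defined.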

In fact, if you look, for instance, in csched_load_balance(), you'll
see that it does a trylock, exactly for this very reason.
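
To make the trylock idea concrete, here is a toy model in plain C11. It
is not Xen code: toy_runq, toy_trylock() and toy_try_push() are made-up
names, and an atomic_flag stands in for the per-pCPU scheduler lock. The
only point is that the caller, which already holds its own lock, backs
off when the peer's lock is busy instead of waiting for it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4

/* Hypothetical stand-in for a per-pCPU runqueue and its lock. */
struct toy_runq {
    atomic_flag lock;   /* set means "held" */
    int nr_queued;
};

static struct toy_runq toy_runqs[TOY_NR_CPUS];

static void toy_init(void)
{
    for (int i = 0; i < TOY_NR_CPUS; i++) {
        atomic_flag_clear(&toy_runqs[i].lock);
        toy_runqs[i].nr_queued = 0;
    }
}

/* Returns true if the lock was free and is now ours; never waits. */
static bool toy_trylock(struct toy_runq *rq)
{
    /* test_and_set returns the previous value: false means we got it. */
    return !atomic_flag_test_and_set_explicit(&rq->lock,
                                              memory_order_acquire);
}

static void toy_unlock(struct toy_runq *rq)
{
    atomic_flag_clear_explicit(&rq->lock, memory_order_release);
}

/*
 * Meant to be called with toy_runqs[this_cpu].lock already held, as a
 * scheduler would be. Returns true if an item was queued on peer_cpu,
 * false if the peer's lock was busy and we gave up for this round.
 */
static bool toy_try_push(int this_cpu, int peer_cpu)
{
    assert(this_cpu != peer_cpu);

    if (!toy_trylock(&toy_runqs[peer_cpu]))
        return false;

    toy_runqs[peer_cpu].nr_queued++;
    toy_unlock(&toy_runqs[peer_cpu]);
    return true;
}
```

A caller that gets false simply keeps the item local and retries at a
later scheduling decision, so two CPUs pushing to each other at the same
time cannot end up waiting on each other forever.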


> >             TRACE_3D(TRC_CSCHED_STOLEN_VCPU, cpu_affinity,
> >                      domain_id, vcpu_id);
> >             SCHED_VCPU_STAT_CRANK(scurr, migrate_q);
> >             SCHED_STAT_CRANK(migrate_queued);
> >             WARN_ON(scurr->vcpu->is_urgent);
> >             scurr->vcpu->processor = cpu_affinity;
> > 
> >             __runq_insert(cpu_affinity, scurr);
> >             pcpu_schedule_unlock_irqrestore(pcpulock, pcpulock_flag,
> >                                             cpu_affinity);
> >         }
> >         else 
> >             __runq_insert(cpu, scurr);
> > }
> > else 
> >         BUG_ON( is_idle_vcpu(current) || list_empty(runq) );
> > 
> > 
> > I tried running the modified Xen. According to the log, although I
> > insert the vCPU into the runqueue of another pCPU, the vCPU still
> > appears on the old pCPU in the following scheduling period.
>
Again: it's impossible to tell why this is happening by only looking at
the code snippet above.

For instance, what does csched_schedule() return in the case where you
want the migration to occur? That influences what will be run in the
next scheduling period.
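
A toy model of why the return value matters (again plain C, not Xen
code; sel_vcpu, sel_pcpu and both pick functions are invented for
illustration, and I am only guessing that something like the first
variant may be going on in your patch). Both variants queue the current
vCPU on the peer; they differ only in what they hand back as the next
thing to run on the local pCPU.

```c
#include <assert.h>
#include <stddef.h>

#define SEL_QLEN 8

struct sel_vcpu { int id; };

/* Hypothetical per-pCPU FIFO runqueue. */
struct sel_pcpu {
    struct sel_vcpu *queue[SEL_QLEN];
    int len;
};

static void sel_enqueue(struct sel_pcpu *p, struct sel_vcpu *v)
{
    assert(p->len < SEL_QLEN);
    p->queue[p->len++] = v;
}

static struct sel_vcpu *sel_dequeue(struct sel_pcpu *p)
{
    struct sel_vcpu *head;

    if (p->len == 0)
        return NULL;

    head = p->queue[0];
    for (int i = 1; i < p->len; i++)
        p->queue[i - 1] = p->queue[i];
    p->len--;
    return head;
}

/* Queues curr on the peer, but still picks curr to run locally. */
static struct sel_vcpu *sel_pick_keep_current(struct sel_pcpu *peer,
                                              struct sel_vcpu *curr)
{
    sel_enqueue(peer, curr);
    return curr;
}

/* Queues curr on the peer and picks the local head (NULL means idle). */
static struct sel_vcpu *sel_pick_from_local(struct sel_pcpu *local,
                                            struct sel_pcpu *peer,
                                            struct sel_vcpu *curr)
{
    sel_enqueue(peer, curr);
    return sel_dequeue(local);
}
```

With the first variant the vCPU is in the peer's queue and yet keeps
running where it was, which would look a lot like the symptom described
above.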

> > Now I have a few questions:
> > 
> > 1. Does the Xen scheduler framework support changing the pCPU of a
> > vCPU after it has used up its scheduling time slice, rather than
> > only stealing a vCPU from the runqueue of another pCPU during the
> > load_balance period?
> > 
It does such a thing already (at least, if I understood correctly what
you're asking). Look at csched_load_balance() and csched_runq_steal().
They do exactly this.

> > 2. If yes, what state of the vCPU should be changed before
> > inserting the vCPU into the destination pCPU?
> > 
The answer varies, depending on whether the vCPU is currently running
on a pCPU, whether it is in a pCPU's runqueue, or whether it is
blocked/sleeping.

Looking closely at what csched_runq_steal() does is the best source of
information, but really, in order to be able to say anything useful, we
need to see the code, and to know what the end goal is. :-)
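
As a rough picture of how the three cases differ, here is a small
self-contained sketch (plain C, not Xen code; st_vcpu, st_state and
st_can_steal() are invented names, and the affinity bitmask is a
simplification). It encodes my reading of the steal path, which you
should double check against the tree you are using: only a vCPU that is
sitting in a runqueue, is not currently running, and is allowed on the
destination pCPU is a candidate.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of where a vCPU currently is. */
enum st_state {
    ST_RUNNING,   /* executing on some pCPU right now */
    ST_QUEUED,    /* runnable, waiting in a pCPU's runqueue */
    ST_BLOCKED    /* blocked/sleeping, not in any runqueue */
};

struct st_vcpu {
    enum st_state state;
    uint64_t affinity;   /* bit n set: allowed to run on pCPU n */
};

/* True only for a queued, not running vCPU whose affinity allows dest_cpu. */
static bool st_can_steal(const struct st_vcpu *v, unsigned int dest_cpu)
{
    if (dest_cpu >= 64)
        return false;    /* outside the toy 64-bit affinity mask */

    if (v->state != ST_QUEUED)
        return false;    /* running or blocked: needs a different path */

    return ((v->affinity >> dest_cpu) & 1u) != 0;
}
```

A running or blocked vCPU is rejected here on purpose: moving one of
those is not a matter of relinking a runqueue entry.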

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

