
Re: [Xen-devel] [PATCH v2] Modified RTDS scheduler to use an event-driven model instead of polling.



2015-07-08 1:01 GMT-07:00 Dario Faggioli <dario.faggioli@xxxxxxxxxx>:
> [Trimming the Cc-list a bit, to avoid bothering Wei and Jan]
>
> On Tue, 2015-07-07 at 22:56 -0700, Meng Xu wrote:
>> Hi Dario,
>>
> Hi,
>
>> 2015-07-07 7:03 GMT-07:00 Dario Faggioli <dario.faggioli@xxxxxxxxxx>:
>> >
>> > On Mon, 2015-07-06 at 22:51 -0700, Meng Xu wrote:
>
>> > So, it looks to me that, as far as (1) and (2) are concerned, since we
>> > are "just" inserting a vCPU in the runq, if we have M pCPUs, and we know
>> > whether we inserted it within the first M spots, we already have what we
>> > want, or am I missing something? And if __runq_insert() now (with Dagaen
>> > patch) tells us this, well, we can simplify the tickling logic, can't
>> > we?
>>
>> I think you might assume that the first M VCPUs in the runq are the
>> currently running VCPUs on the M pCPUs. Am I correct? (From what you
>> described in the following example, I think I'm correct. ;-) )
>>
> Mmm... Interesting. Yes, I was. I was basing this assumption on this
> chunk of Dagaen's patch:
>
>     // If we become one of top [# CPUs] in the runq, tickle it
>     // TODO: make this work when multiple tickles are required
>     if ( new_position > 0 && new_position <= prv->NUM_CPUS )
>         runq_tickle(ops, svc);
>
> And I forgot (and did not go check) about the __q_remove() in
> rt_schedule(). My bad again.

Actually this assumption is good! :-D

>
> But then, since we don't have the running vCPUs in the runq, how is the
> code above supposed to be correct?

That's why, at the very beginning of this thread, I asked Dagaen why we
need to return the position where a VCPU is inserted in the runq.
Without this assumption, I don't see how the returned position of the
vcpu in the runq could be useful in the current patch. But with this
assumption, it could be very useful!
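
To make the idea concrete, here is a rough sketch of what returning the
position could look like (this is *not* Dagaen's actual code; the helper
names follow the ones in sched_rt.c, and the depletedq handling is left
out):

    /* Insert svc into the deadline-sorted runq and return the 0-based
     * position at which it landed, so the caller can compare it against
     * the number of pCPUs. */
    static int __runq_insert(const struct scheduler *ops, struct rt_vcpu *svc)
    {
        struct list_head *runq = rt_runq(ops);
        struct list_head *iter;
        int pos = 0;

        list_for_each ( iter, runq )
        {
            struct rt_vcpu *iter_svc = __q_elem(iter);

            if ( svc->cur_deadline <= iter_svc->cur_deadline )
                break;
            pos++;
        }

        /* Insert right before 'iter' (or at the tail if we hit the end). */
        list_add_tail(&svc->q_elem, iter);

        return pos;
    }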


>
>> > With an example:
>> > We are waking up (or re-inserting, in rt_context_saved()) vCPU j. We
>> > have 6 pCPUs. __runq_insert() tells us that it put vCPU j at the 3rd
>> > place in the runq. This means vCPU j should be set to run as soon as
>> > possible. So, if vCPU j is 3rd in runq, either
>> >  (a) there are only 3 runnable vCPUs (i.e., if we are waking up j, there
>> >      were 2 of them, and j is the third; if we are in context_saved,
>> >      there already were 3, and j just got its deadline postponed, or
>> >      someone else got its one replenished);
>> >  (b) there are more than 3 runnable vCPUs, i.e., there is at least a 4th
>> >      vCPU --say vCPU k-- in the runq, which was the 3rd before vCPU j
>> >      was woken (or re-inserted), but now became the 4th, because
>> >      deadline(j)<deadline(k).
>> > In case (a), there are for sure idle pCPUs, and we should tickle one of
>> > them.
>>
>> I can tell that you make the above assumption from here.
>>
>> However, in the current implementation, the runq does not hold the
>> currently running VCPUs on the pCPUs. We remove the vcpu from the runq
>> in the rt_schedule() function. What you described above makes perfect
>> sense "if" we decide to make the runq hold the currently running VCPUs.
>>
> Yep. And it indeed seems to me that we may well think about doing so. It
> will make it possible to rely on the position for making/optimizing
> scheduling decisions, and at the same time I don't think I see many
> downsides in that, do you?

No. Actually, I think it is better to keep the running VCPUs in the
runq (rather than in a separate running queue) so that we can make
better use of a VCPU's position in the runq. This could speed up the
process of finding a pCPU to tickle.
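
Just to show what I picture, the wake-up/context-saved path could then do
something like this (only a sketch: prv->cpus and the idle_cpus mask below
are assumed/proposed fields, not something we have today, and the real
code would still need to handle multiple tickles):

    /* 'pos' is the position returned by __runq_insert(). If the running
     * vcpus stay in the runq, landing within the first nr_cpus slots
     * means svc should be running right now. */
    static void check_tickle(const struct scheduler *ops,
                             struct rt_vcpu *svc, int pos)
    {
        struct rt_private *prv = rt_priv(ops);
        int nr_cpus = cpumask_weight(&prv->cpus);      /* assumed field */

        if ( pos >= nr_cpus )
            return;          /* not among the nr_cpus earliest deadlines */

        if ( !cpumask_empty(&prv->idle_cpus) )         /* proposed mask */
        {
            /* Case (a): there is an idle pCPU, just kick one of them. */
            cpu_raise_softirq(cpumask_any(&prv->idle_cpus), SCHEDULE_SOFTIRQ);
            return;
        }

        /* Case (b): no idle pCPU, go preempt the latest-deadline runner. */
        runq_tickle(ops, svc);
    }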

>
>> Actually, after thinking about the example you described, I think we
>> can hold the currently running VCPUs *and* the currently idle pCPUs in
>> the scheduler-wide structure;
>>
> What do you mean by 'current idle pCPUs'? I said something similar as
> well, and what I meant was a cpumask with bit i set if the i-th pCPU is
> idle; do you also mean this?

Yes.
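
A sketch of what I mean, assuming we add such a mask to rt_private (the
field name is just for illustration) and keep it up to date from
rt_schedule():

    struct rt_private {
        /* ... existing fields (lock, sdom, runq, depletedq, ...) ... */
        cpumask_t idle_cpus;   /* proposed: bit i set iff pCPU i is idle */
    };

    /* In rt_schedule(), once snext has been chosen for this pCPU: */
    if ( is_idle_vcpu(snext->vcpu) )
        cpumask_set_cpu(cpu, &prv->idle_cpus);
    else
        cpumask_clear_cpu(cpu, &prv->idle_cpus);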

>
> About the running vCPUs, why not just leave them in the actual runq?
>
>> In other words, we can have another runningq
>> (not runq) and an idle_pcpu list in rt_private; then all VCPUs are
>> stored in three queues: runningq, runq, and depletedq, in increasing
>> priority order.
>>
> Perhaps, but I'm not sure I see the need for another list. Again, why
> not just leave them in the runq?

When I proposed the runningq, I was thinking about the situation when
we decided to split the (old) runq in RT-Xen into the (current) runq
and depletedq; I thought the same reasoning behind splitting into runq
and depletedq might still hold here when adding a runningq.

But after thinking twice, maybe the runq approach is a better way since
it makes the position information more meaningful. As you described in
the previous email, we can compare the position at which a vcpu is
inserted into the runq with the number of pCPUs so as to quickly know
which pCPU to tickle.

> I appreciate this is a rather big change
> (although, perhaps it looks bigger said than done), but I think it could
> be worth pursuing.

It is worth pursuing since it simplifies the CPU tickling process a lot.

>
> For double checking, asserting, and making sure that we are able to
> identify the running svc-s, we have the __RTDS_scheduled flag.

Yes. In rt_schedule(), we set the flag for the next vcpu with
"set_bit(__RTDS_scheduled, &snext->flags);".

>
>> When we make the tickle decision, we only need to scan the idle_pcpu
>> list and then the runningq to figure out which pCPU to tickle. All of
>> the other design you describe still holds here, except that the
>> position where a VCPU is inserted into the runq cannot directly tell
>> us which pCPU to tickle. What do you think?
>>
> I think that I'd like to know why you think adding another queue is
> necessary, instead of just leaving the vCPUs in the actual runq. Is
> there something bad about that which I'm missing?

Actually, nothing bad. I just recalled the situation when we split the
runq into runq and depletedq. I was thinking it might be the same case
here as well. (Yes, it is different here, since we get more useful
information for tickling a pCPU if we keep the vCPUs in the runq
instead of adding one more queue.) :-)
>
>> > In case (b) there may be idle pCPUs (and, if that's the case, we
>> > should tickle one of them, of course) or not. If not, we need to go
>> > figure out which pCPU to tickle, which is exactly what runq_tickle()
>> > does, but we at least know for sure that we want to tickle the pCPU
>> > where vCPU k runs, or others where vCPUs with deadline greater than vCPU
>> > k run.
>> >
>> > Does this make sense?
>>
>> Yes, if we decide to hold the currently running VCPUs in a
>> scheduler-wide structure: it can be the runq or a runningq.
>>
> Yes, but if we use two queues, we defeat at least part of this
> optimization/simplification.

Agree. Thanks! :-D

>
>> > Still, I think I gave enough material for an actual optimization. What
>> > do you think?
>>
>> Yes. It is very clear.
>> The only thing is how we are going to "realize" your assumption. :-)
>>
> :-D
>
>> I'm not so sure we really need the runningq to hold the currently
>> running VCPUs, since we can still use "for_each_cpu()" to find all
>> pCPUs used in the scheduler and get the currently running VCPU on each.
>> Adding a new field to hold all such running VCPUs might help speed up
>> the search, but it needs more space. So there is a trade-off between
>> space and time.
>>
> I don't think the added space would be a problem, but I don't see why we
> need it.

If we don't add another queue, there is no extra space cost. So we get a
free lunch here. :-)
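
For reference, the "for_each_cpu()" scan I was referring to would look
roughly like this (prv->cpus and the helper name are just for
illustration); keeping the running vCPUs in the runq lets us avoid it in
the common case:

    /* Find a pCPU to tickle by looking at what each pCPU is running:
     * prefer an idle pCPU, else the one running the latest deadline. */
    static int pick_cpu_to_tickle(struct rt_private *prv)
    {
        struct rt_vcpu *latest = NULL;
        int cpu;

        for_each_cpu ( cpu, &prv->cpus )
        {
            struct vcpu *vc = curr_on_cpu(cpu);

            if ( is_idle_vcpu(vc) )
                return cpu;                    /* an idle pCPU: tickle it */

            if ( latest == NULL ||
                 rt_vcpu(vc)->cur_deadline > latest->cur_deadline )
                latest = rt_vcpu(vc);          /* latest deadline so far */
        }

        return latest ? latest->vcpu->processor : -1;
    }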

Thanks and best regards,

Meng

-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/
