
Re: [Xen-devel] [PATCH v10 09/11] x86/ctxt: Issue a speculation barrier between vcpu contexts

>>> On 27.01.18 at 02:27, <dfaggioli@xxxxxxxx> wrote:
>>  On 25/01/18 16:09, Andrew Cooper wrote:
>> > On 25/01/18 15:57, Jan Beulich wrote:
>> > > For the record, the overwhelming majority of calls to
>> > > __sync_local_execstate() being responsible for the behavior
>> > > come from invalidate_interrupt(), which suggests to me that
>> > > there's a meaningful number of cases where a vCPU is migrated
>> > > to another CPU and then back, without another vCPU having
>> > > run on the original CPU in between. If I'm not wrong with this,
>> > > I have to question why the vCPU is migrated then in the first
>> > > place.
>> > 
> So, about this. I haven't applied Jan's measurement patch yet (I'm
> doing some reshuffling of my dev and test hardware here), but I have
> given a look at traces.
> So, Jan, a question: why are you saying "migrated to another CPU **and
> then back**"?

Because that's how I interpret some of the output from my
logging additions.

> I'm asking because, AFAICT, the fact that
> __sync_local_execstate() is called from invalidate_interrupt() means
> that:
> * a vCPU is running on a pCPU
> * the vCPU is migrated, and the pCPU became idle
> * the vCPU starts to run where it was migrated, while its 'original'
>   pCPU is still idle ==> inv. IPI ==> sync state.

This is just the first half of it. In some cases I then see the vCPU
go back without the pCPU having run anything else in between.

> So there seems to me to be no need for the vCPU to actually "go back",
> is there?

There is no need for it to go back, but it does. There's also no need
for it to be migrated in the first place if there are no more runnable
vCPU-s than there are pCPU-s.

> Anyway, looking at traces, I observed the following:
> [...]
> At (1) d3v8 starts running on CPU 9. Then, at (2), d3v5 wakes up, and
> at (3) CPU 8 (which is idle) is tickled, as a consequence of that. At
> (4), CPU 8 picks up d3v5 and run it (this may seem unrelated, but bear
> with me a little).
> At (5), a periodic tick arrives on CPU 9. Periodic ticks are a core
> part of the Credit1 algorithm, and are used for accounting and load
> balancing. In fact, csched_tick() calls csched_vcpu_acct() which, at
> (6), calls _csched_cpu_pick().
> Pick realizes that d3v8 is running on CPU 9, and that CPU 8 is also
> busy. Now, since CPU 8 and 9 are hyperthreads of the same core, and
> since there are fully idle cores, Credit1 decides that it's better to
> kick d3v8 to one of those fully idle cores, so both d3v5 and d3v8
> itself can run at full "core speed". In fact, we see that CPU 11 is
> picked, as both the hyperthreads --CPU 10 and CPU 11 itself-- are idle.
> (To be continued, below)

Ah, I see. I admit I did not look at the output with topology in
mind. Nevertheless I still don't really understand why it appears
to be not uncommon for a vCPU to move back and forth relatively
rapidly (I take the fact that no other vCPU has run on the pCPU
in between as a sign that the period of time elapsed is not very
large). Please don't forget that the act of moving a vCPU has a
(performance) price, too.

> The problem, as I was expecting, is not work stealing, the problem is,
> well... Credit1! :-/
> [...]
> Credit2, for instance, does not suffer from this issue. In fact,
> hyperthreading, there, is considered during wakeup/tickling already.

Well - when is Credit2 going to become the default?


Xen-devel mailing list


