
Re: [Xen-devel] [PATCH RFC V2 42/45] xen/sched: add fall back to idle vcpu when scheduling item



>>> On 17.05.19 at 07:13, <jgross@xxxxxxxx> wrote:
> On 16/05/2019 16:41, Jan Beulich wrote:
>>>>> On 16.05.19 at 15:51, <jgross@xxxxxxxx> wrote:
>>> On 16/05/2019 15:05, Jan Beulich wrote:
>>>>>>> On 06.05.19 at 08:56, <jgross@xxxxxxxx> wrote:
>>>>> --- a/xen/arch/x86/domain.c
>>>>> +++ b/xen/arch/x86/domain.c
>>>>> @@ -154,6 +154,24 @@ static void idle_loop(void)
>>>>>      }
>>>>>  }
>>>>>  
>>>>> +/*
>>>>> + * Idle loop for siblings of active schedule items.
>>>>> + * We don't do any standard idle work like tasklets, page scrubbing or
>>>>> + * livepatching.
>>>>> + * Use default_idle() in order to simulate v->is_urgent.
>>>>
>>>> I guess I'm missing a part of the description which explains all this:
>>>> What's wrong with doing scrubbing work, for example? Why is
>>>> doing tasklet work not okay, but softirqs are? What is the deal with
>>>> v->is_urgent, i.e. what justifies not entering a decent power
>>>> saving mode here on Intel, but doing so on AMD?
>>>
>>> One of the reasons for using core scheduling is to avoid running vcpus
>>> of different domains on the same core, in order to minimize the chances
>>> of side channel attacks on data of other domains. Scrubbing and
>>> tasklets are disallowed here to avoid touching other domains' data.
>> 
>> So how is running softirqs okay then? And how is scrubbing accessing
>> other domains' data?
> 
> Right now I'm not sure whether it is a good idea to block any softirqs
> at all. We definitely need to process scheduling requests, and I believe
> RCU and tasklets, too. The tlbflush one should be uncritical, so the
> timer softirq is the remaining questionable one. This can be fine-tuned
> later IMO, e.g. by defining a softirq mask of critical softirqs to
> block, and possibly splitting e.g. the timer and tasklet softirqs into
> critical and uncritical variants.

Well, okay, but please add an abridged version of this to the patch
description then.

> Scrubbing will probably pull the cache lines of the dirty pages into
> the L1 cache of the CPU. That sounds problematic to me. If scrubbing
> really carries no risk here, I'm fine with adding it back in.

Well, of course there's going to be a brief period of time where
a cache line will be present in CPU internal buffers (it's not just the
cache after all, as we've learned with XSA-297). So I can certainly
buy that when using core granularity you don't want to scrub on
the other thread. But what about the socket granularity case?
Scrubbing on fully idle cores should still be fine, I would think.

>>> With core scheduling we can be sure the other thread is active
>>> (otherwise we would have scheduled the idle item), so hoping to save
>>> power by using mwait is moot.
>> 
>> Saving power may be indirect, by the CPU re-arranging
>> resource assignment between threads when one goes idle.
>> I have no idea whether they do this when entering C1, or
>> only when entering deeper C states.
> 
> SDM Vol. 3 chapter 8.10.1 "HLT instruction":
> 
> "Here shared resources that were being used by the halted logical
> processor become available to active logical processors, allowing them
> to execute at greater efficiency."

To be honest, this is too broad/generic a statement to fully
trust, judging from other areas of the SDM. And then, as
per above, what about the socket granularity case? Putting
entirely idle cores to sleep is surely worthwhile?

>> And anyway - I'm still none the wiser as to the v->is_urgent
>> relationship.
> 
> With v->is_urgent set, today's idle loop will drop into default_idle().
> I can remove this sentence if it is just confusing.

I'd prefer that the connection be made more obvious. One
needs to go from ->is_urgent via ->urgent_count to
sched_has_urgent_vcpu() to find where the described
behavior really lives.

What's worse though: This won't work as intended on AMD
at all. I don't think it's correct to fall back to default_idle() in
this case. Instead sched_has_urgent_vcpu() returning true
should amount to the same effect as max_cstate being set
to 1. There's
(a) no reason not to use MWAIT on Intel CPUs in this case,
if MWAIT can enter C1, and
(b) a strong need to use MWAIT on (at least) AMD Fam17,
or else it won't be C1 that gets entered.
I'll see about making a patch in due course.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

