
Re: [Xen-devel] [PATCH 0/3] VMX: Properly handle pi descriptor and per-cpu blocking list




> -----Original Message-----
> From: Dario Faggioli [mailto:dario.faggioli@xxxxxxxxxx]
> Sent: Thursday, June 23, 2016 11:12 PM
> To: Wu, Feng <feng.wu@xxxxxxxxx>; Jan Beulich <JBeulich@xxxxxxxx>
> Cc: Tian, Kevin <kevin.tian@xxxxxxxxx>; keir@xxxxxxx;
> george.dunlap@xxxxxxxxxxxxx; andrew.cooper3@xxxxxxxxxx; xen-
> devel@xxxxxxxxxxxxx
> Subject: Re: [Xen-devel] [PATCH 0/3] VMX: Properly handle pi descriptor and
> per-cpu blocking list
> 
> On Thu, 2016-06-23 at 12:33 +0000, Wu, Feng wrote:
> > > -----Original Message-----
> > > From: Dario Faggioli [mailto:dario.faggioli@xxxxxxxxxx]
> > >
> > > It goes through all the vcpus of all domains, and does not check or
> > > care whether they are running, runnable or blocked.
> > >
> > > Let's look at this in some more details. So, let's assume that
> > > processor 5 is going away, and that you have the following vcpus
> > > around:
> > >
> > >  d0v0 : v->processor = 5, running on cpu 5
> > >  d0v1 : v->processor = 4, running on cpu 4
> > >  d1v0 : v->processor = 5, runnable but not running
> > >  d2v3 : v->processor = 5, blocked
> > >
> > > for d0v0, we do:
> > >   cpu_disable_scheduler(5)
> > >     set_bit(_VPF_migrating, d0v0->pause_flags);
> > >     vcpu_sleep_nosync(d0v0);
> > >       SCHED_OP(sleep, d0v0);
> > >         csched_vcpu_sleep(d0v0)
> > >           cpu_raise_softirq(5, SCHEDULE_SOFTIRQ);
> > >     vcpu_migrate(d0v0);
> > >       if ( v->is_running || ...) // assume v->is_running is true
> > >         return
> > Hi Dario, after reading this mail again I have another question;
> > could you please help me out?
> >
> > In the above code flow, we return in vcpu_migrate(d0v0) because
> > v->is_running == 1, after vcpu_migrate() return, we check:
> >
> >     if ( v->processor == cpu )
> >         ret = -EAGAIN;
> >
> > In my understanding, in the above case 'v->processor' is likely equal
> > to 'cpu', hence we return -EAGAIN. However, in __cpu_disable(), there is
> > some check as below:
> >
> >     if ( cpu_disable_scheduler(cpu) )
> >         BUG();
> >
> Right. But, as the comment inside cpu_disable_scheduler() itself says,
> we only return -EAGAIN in case we are calling cpu_disable_scheduler for
> removing a pCPU from a cpupool.
> 
> In that case, we do not use __cpu_disable(), and hence we can safely
> return an error value. In that case, in fact, the caller of
> cpu_disable_scheduler() is cpupool_unassign_cpu_helper(), which does
> what's necessary to deal with such error.
> 
> > Might we hit the BUG() in the above case?
> >
> No, because we call cpu_disable_scheduler() from __cpu_disable() only
> when the system state is already SYS_STATE_suspend, and hence we take
> the 'then' branch of the 'if', which never returns an error.

Thanks for the elaboration. I found that __cpu_disable() can be called
while the system state is not SYS_STATE_suspend. Here is my experiment:

1. Launch a guest and pin vCPU 3 to pCPU 3 (which makes the experiment simpler)
2. offline pCPU 3 via "xen-hptool cpu-offline 3"

The call path of the above steps is:
arch_do_sysctl()
  -> cpu_down_helper()
    -> cpu_down()
      -> take_cpu_down()
        -> __cpu_disable()
          -> cpu_disable_scheduler() (enters the 'else' branch)

I ran an infinite loop on guest vCPU3 (and hence on pCPU3) to increase the
chance of hitting 'ret = -EAGAIN' in this function, expecting it to return
-EAGAIN while the vCPU is running. However, no matter how many times I
tested, that never happened: after vcpu_migrate() returns,
vCPU3->runstate.state is 1 and vCPU3->is_running is 0, which means the
vCPU is not currently running. I then walked back up the call path and
found that 'v->runstate.state' changes during stop_machine_run(). Here
are the findings:

int stop_machine_run(int (*fn)(void *), void *data, unsigned int cpu)
{

......

point 1

    for_each_cpu ( i, &allbutself )
        tasklet_schedule_on_cpu(&per_cpu(stopmachine_tasklet, i), i);

point 2
......

}

at point 1 above, 
vCPU3->runstate.state: 0, vCPU3->is_running: 1
while at point 2 above:
vCPU3->runstate.state: 1, vCPU3->is_running: 0

I repeated the test many times and always got the same result. I am not
sure whether this vCPU state transition just happens to occur here by
coincidence. If the transition did not happen, and is_running were still 1
when we reach vcpu_migrate() in cpu_disable_scheduler() in the case above,
would that be a problem?
Thanks a lot!

Thanks,
Feng

> 
> Regards,
> Dario
> --
> <<This happens because I choose it to happen!>> (Raistlin Majere)
> -----------------------------------------------------------------
> Dario Faggioli, Ph.D, http://about.me/dario.faggioli
> Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

