Xen project Mailing List

Re: [Xen-devel] High CPU temp, suspend problem - xen 4.1.5-pre, linux 3.7.x

To: Marek Marczykowski <marmarek@xxxxxxxxxxxxxxxxxxxxxx>

From: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>

Date: Tue, 2 Apr 2013 10:05:14 -0400

Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Ben Guthro <ben@xxxxxxxxxx>, Jan Beulich <JBeulich@xxxxxxxx>, "xen-devel@xxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxx>

Delivery-date: Tue, 02 Apr 2013 14:05:53 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

On Tue, Apr 02, 2013 at 03:13:56AM +0200, Marek Marczykowski wrote:
> On 01.04.2013 15:53, Ben Guthro wrote:
> > On Thu, Mar 28, 2013 at 3:03 PM, Marek Marczykowski
> > <marmarek@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> >> (XEN) Restoring affinity for d2v3
> >> (XEN) Assertion '!cpus_empty(cpus) && cpu_isset(cpu, cpus)' failed at
> >> sched_credit.c:481
> >
> >
> > I think the "fix-suspend-scheduler-*" patches posted here are applicable
> > here:
> > http://markmail.org/message/llj3oyhgjzvw3t23
> >
> >
> > Specifically, I think you need this bit:
> >
> > diff --git a/xen/common/cpu.c b/xen/common/cpu.c
> > index 630881e..e20868c 100644
> > --- a/xen/common/cpu.c
> > +++ b/xen/common/cpu.c
> > @@ -5,6 +5,7 @@
> >  #include <xen/init.h>
> >  #include <xen/sched.h>
> >  #include <xen/stop_machine.h>
> > +#include <xen/sched-if.h>
> >
> >  unsigned int __read_mostly nr_cpu_ids = NR_CPUS;
> >  #ifndef nr_cpumask_bits
> > @@ -212,6 +213,8 @@ void enable_nonboot_cpus(void)
> >              BUG_ON(error == -EBUSY);
> >              printk("Error taking CPU%d up: %d\n", cpu, error);
> >          }
> > +        if (system_state == SYS_STATE_resume)
> > +            cpumask_set_cpu(cpu, cpupool0->cpu_valid);
> >      }
> >
> >      cpumask_clear(&frozen_cpus);
>
> Indeed, this makes things better, but still not ideal.
> Now after resume all CPUs are in Pool-0, which is good. But CPU0 is much more
> preferred than others (xl vcpu-list). For example if I start 4 busy loops in
> dom0, I got (even after some time):
> [user@dom0 ~]$ xl vcpu-list
> Name          ID  VCPU   CPU  State  Time(s)  CPU Affinity
> dom0           0     0     0  r--       98.5  any cpu
> dom0           0     1     0  ---      181.3  any cpu
> dom0           0     2     2  r--      262.4  any cpu
> dom0           0     3     3  r--      230.8  any cpu
> netvm          1     0     0  -b-       18.4  any cpu
> netvm          1     1     0  -b-        9.1  any cpu
> netvm          1     2     0  -b-        7.1  any cpu
> netvm          1     3     0  -b-        5.4  any cpu
> firewallvm     2     0     0  -b-       10.7  any cpu
> firewallvm     2     1     0  -b-        3.0  any cpu
> firewallvm     2     2     0  -b-        2.5  any cpu
> firewallvm     2     3     3  -b-        3.6  any cpu
>
> If I remove some CPU from Pool-0 and re-add it, things back to normal for this
> particular CPU (so I got two equally used CPUs) - to fully restore system I
> must remove all but CPU0 from Pool-0 and add it again.
>
> Also still only CPU0 have all C-states (C0-C3), all others have only C0-C1.
> This probably could be fixed by your "xen: Re-upload processor PM data to
> hypervisor after S3 resume" patch (reload of xen-acpi-processor module helps
> here). But I don't think it is a right way. It isn't necessary on other
> systems (with somehow older hardware). It must be something missing on resume
> path. The question is what...

The xen-acpi-processor should probably also have the cpu hotplug
notification in it to deal with this - so that you don't need to do the
reload.

>
> Perhaps someone need to go through enable_nonboot_cpus() (__cpu_up?) and check
> if it restore all things disabled in disable_nonboot_cpus() (__cpu_disable?).
> Unfortunately I don't know x86 details so good to follow that code...
>
> --
> Best Regards / Pozdrawiam,
> Marek Marczykowski
> Invisible Things Lab
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxx
> http://lists.xen.org/xen-devel
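
Not part of the original message: for readers who want to quantify the placement skew visible in the `xl vcpu-list` output quoted above, here is a minimal sketch in Python. It assumes the column order shown in the thread (Name, ID, VCPU, CPU, ...) and domain names without spaces. The helper name `vcpus_per_cpu` is made up for illustration, and the sample rows are copied from the quoted output rather than captured from a live system.

```python
from collections import Counter

# Rows copied from the `xl vcpu-list` output quoted in this thread.
SAMPLE = """\
Name ID VCPU CPU State Time(s) CPU Affinity
dom0 0 0 0 r-- 98.5 any cpu
dom0 0 1 0 --- 181.3 any cpu
dom0 0 2 2 r-- 262.4 any cpu
dom0 0 3 3 r-- 230.8 any cpu
netvm 1 0 0 -b- 18.4 any cpu
netvm 1 1 0 -b- 9.1 any cpu
netvm 1 2 0 -b- 7.1 any cpu
netvm 1 3 0 -b- 5.4 any cpu
firewallvm 2 0 0 -b- 10.7 any cpu
firewallvm 2 1 0 -b- 3.0 any cpu
firewallvm 2 2 0 -b- 2.5 any cpu
firewallvm 2 3 3 -b- 3.6 any cpu
"""


def vcpus_per_cpu(text):
    """Count how many vCPUs are currently placed on each physical CPU.

    Hypothetical helper: assumes the 4th whitespace-separated field is the
    physical CPU number, as in the output quoted in this thread.
    """
    counts = Counter()
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4 or not fields[3].isdigit():
            continue  # blank line, or a vCPU with no CPU assigned
        counts[int(fields[3])] += 1
    return counts


if __name__ == "__main__":
    for cpu, n in sorted(vcpus_per_cpu(SAMPLE).items()):
        print(f"CPU{cpu}: {n} vCPU(s)")
```

On the quoted sample this counts 9 of the 12 vCPUs on CPU0, none on CPU1, 1 on CPU2 and 2 on CPU3, which matches the imbalance Marek describes. On a live dom0 the same function could be fed the text of `xl vcpu-list` instead of `SAMPLE`.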
