Xen project Mailing List

Re: [Xen-devel] Need help in debugging partially blocked hypervisor

From: Dietmar Hahn <dietmar.hahn@xxxxxxxxxxxxxx>

Date: Tue, 3 Nov 2009 08:52:53 +0100

Cc: "Shan, Haitao" <haitao.shan@xxxxxxxxx>, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>

Delivery-date: Mon, 02 Nov 2009 23:53:27 -0800

List-id: Xen developer discussion <xen-devel.lists.xensource.com>

Please see below.

> See my comments embedded. :)
>
> Haitao
>
>
> Dietmar Hahn wrote:
> > The conclusion is that this seems to be a workaround for the endless
> > NMI loop. PMIs are a very rare event and this should not raise a
> > performance problem.
> I totally agree that this is only a workaround for approach 1.
>
> >
> > I didn't try your second approach
> >> 2> Remove unmasking PMI from vpmu_do_interrupt and unmask *physical
> >> PMI* when guest vcpu unmasks virtual PMI.
> > but I have some questions.
> >
> > - What if the 'physical PMI' is not unmasked in vpmu_do_interrupt and
> >   a watchdog NMI would occur before the domU unmasks it?
> I think the second NMI will be lost.
>
> > - Is it possible that after handling the NMI (and not unmasking)
> >   another domU got running on this CPU and therefore PMIs got lost?
> The LVTPC entry in the physical local APIC is saved/restored by Xen on VCPU
> switches. So unmasking (or not) of the PMI of one vcpu should have no impact
> on another vcpu. When developing vPMU, I treated both the PMU MSRs and the
> LVTPC entry in the local APIC as vPMU context. vPMU context is
> saved/restored on physical HW when vcpus are scheduled, either in an active
> save/restore manner or a lazy one (depending on the PMU usage at the time
> of the switch).
>
> >
> > But the real cause of the problem is unknown. As said I saw this only
> > on Nehalem. Maybe there is a problem together with the hardware? Perhaps
> > your hardware colleagues know something more ;-)
> When I found this problem, I just thought it might be a corner case that only
> happens on my box (of course, I only see this on NHM, too).
> I will try to pin HW guy to see if any explanation, since it is proven to be
> a general problem on NHM.
>
> But before everything is clear, I think approach 2 is a better solution now.

What would be the effect if the guest unmasks the PMI (which leads to
unmasking the 'physical PMI') but doesn't reset the counter to a value != 0?
Is the guest able to produce the NMI endless loop?

Dietmar.

> >
> > Thanks
> > Dietmar
> >
> >>
> >>>
> >>> When I met this problem, I remember that I tried two approaches:
> >>> 1> Setting the counter to non-zero before unmasking PMI in
> >>>    vpmu_do_interrupt;
> >>> 2> Remove unmasking PMI from vpmu_do_interrupt and unmask
> >>>    *physical PMI* when guest vcpu unmasks virtual PMI.
> >>> I remember that approach 2 can fix this issue. But I do not
> >>> remember the result of approach 1, since I met this about one year
> >>> ago.
> >>> It is my understanding that approach 2 is quite the same as approach 1,
> >>> since normally the guest will set the counter to some negative value
> >>> (for example, -100000) before unmasking virtual PMI.
> >>> However, approach 2 looks cleaner and more reasonable.
> >>>
> >>> Can you have a try and let me know the result? If both cannot
> >>> work, there might be some problems that I have not met before.
> >>>
> >>> BTW: Sorry, I did not see your patch to enable NHM vpmu before. So,
> >>> there is no need for me to work on that now. :)
> >>>
> >>> Haitao
> >>>
> >>>
> >>> Dietmar Hahn wrote:
> >>>> Hi Haitao,
> >>>>
> >>>>> Can I know how you enabled vPMU on Nehalem? This is not supported
> >>>>> in current Xen.
> >>>>
> >>>> http://lists.xensource.com/archives/html/xen-devel/2009-09/msg00829.html
> >>>>
> >>>>>
> >>>>> Concerning vpmu support, I totally agree that we can disable this
> >>>>> feature by default. If anyone really wants to use it, he can use
> >>>>> boot options to turn it on.
> >>>>
> >>>> Yes, that's OK for me.
> >>>>
> >>>>> I am preparing a patch for that. And I will
> >>>>> send a patch to enable NHM vpmu together.
> >>>>>
> >>>>> For the problem that Dietmar met, I think I once met this before.
> >>>>> Can you add some code in vpmu_do_interrupt that sets the counter
> >>>>> you are using to a value other than zero? Please let me know if
> >>>>> that can help.
> >>>>
> >>>> I don't set the counter to zero. I use 0-val to set the counter.
> >>>> Actually I tested on Nehalem with
> >>>> - General Perf-counter #2 (0xc3) with CPU_CLK_UNHALTED and
> >>>>   val=1100000
> >>>> - Fixed counter #1 (0x30a) and val=1100000
> >>>> The thing is that in the normal case the overflows of both counters
> >>>> appear nearly at the same time. As described I added some extra
> >>>> tracers for xentrace in core2_vpmu_do_interrupt() so the code looks
> >>>> like:
> >>>>
> >>>> rdmsrl(MSR_CORE_PERF_GLOBAL_STATUS, msr_content);    -> 1. Step
> >>>> { uint32_t HAHN_l, HAHN_h;
> >>>>   HAHN_l = (uint32_t) msr_content;
> >>>>   HAHN_h = (uint32_t) (msr_content >> 32);
> >>>>   HVMTRACE_3D(HAHN_TR2, v, 1, HAHN_h, HAHN_l);       -> 2. Step
> >>>> }
> >>>> if ( !msr_content )
> >>>>     return 0;
> >>>> core2_vpmu_cxt->global_ovf_status |= msr_content;
> >>>> msr_content = 0xC000000700000000 | ((1 << core2_get_pmc_count()) - 1);
> >>>> wrmsrl(MSR_CORE_PERF_GLOBAL_OVF_CTRL, msr_content);  -> 3. Step
> >>>>
> >>>> rdmsrl(MSR_CORE_PERF_GLOBAL_STATUS, msr_content);    -> 4. Step
> >>>> { uint32_t HAHN_l, HAHN_h;
> >>>>   HAHN_l = (uint32_t) msr_content;
> >>>>   HAHN_h = (uint32_t) (msr_content >> 32);
> >>>>   HVMTRACE_3D(HAHN_TR2, v, 0xa, HAHN_h, HAHN_l);     -> 5. Step
> >>>>
> >>>>   rdmsrl(0xc3, msr_content);     -> 6. Step General counter #2
> >>>>   HAHN_l = (uint32_t) msr_content;
> >>>>   HAHN_h = (uint32_t) (msr_content >> 32);
> >>>>   HVMTRACE_3D(HAHN_TR2, v, 0xc3, HAHN_h, HAHN_l);
> >>>>   rdmsrl(0x30a, msr_content);    -> 7. Step Fixed counter #1
> >>>>   HAHN_l = (uint32_t) msr_content;
> >>>>   HAHN_h = (uint32_t) (msr_content >> 32);
> >>>>   HVMTRACE_3D(HAHN_TR2, v, 0x30a, HAHN_h, HAHN_l);
> >>>> }
> >>>>
> >>>> With these tracers I got the following output:
> >>>>
> >>>> Last good NMI:
> >>>> Both counters cause the NMI. Resetting works OK.
> >>>> The counters themselves kept running.
> >>>> 2. Step: par1 = 0x01,  high = 0x0002, low = 0x0004 ] rdmsrl(MSR_CORE_PERF_GLOBAL_STATUS)
> >>>> 5. Step: par1 = 0x0a,  high = 0x0000, low = 0x0000 ] rdmsrl(MSR_CORE_PERF_GLOBAL_STATUS)
> >>>> 6. Step: par1 = 0xc3,  high = 0x0000, low = 0x03c4 ] rdmsrl(0xc3) -> #2 general counter
> >>>> 7. Step: par1 = 0x30a, high = 0x0000, low = 0x02da ] rdmsrl(0x30a) -> #1 fixed counter
> >>>>
> >>>> NMI from where things go wrong:
> >>>> Both counters cause the NMI. Resetting does NOT work correctly, only
> >>>> for the general counter! The general counter (caused the NMI) seems to
> >>>> be stopped!
> >>>> 2. Step: par1 = 0x01,  high = 0x0002, low = 0x0004 ] rdmsrl(MSR_CORE_PERF_GLOBAL_STATUS)
> >>>> 5. Step: par1 = 0x0a,  high = 0x0002, low = 0x0000 ] rdmsrl(MSR_CORE_PERF_GLOBAL_STATUS)
> >>>> 6. Step: par1 = 0xc3,  high = 0x0000, low = 0x00ec ] rdmsrl(0xc3) -> #2 general counter
> >>>> 7. Step: par1 = 0x30a, high = 0x0000, low = 0x0000 ] rdmsrl(0x30a) -> #1 fixed counter
> >>>>
> >>>> Wrong NMI:
> >>>> Only the fixed counter causes the NMI (which was not reset
> >>>> during NMI handling above!) Both counters seem to be stopped!
> >>>> 2. Step: par1 = 0x01,  high = 0x0002, low = 0x0000 ] rdmsrl(MSR_CORE_PERF_GLOBAL_STATUS)
> >>>> 5. Step: par1 = 0x0a,  high = 0x0002, low = 0x0000 ] rdmsrl(MSR_CORE_PERF_GLOBAL_STATUS)
> >>>> 6. Step: par1 = 0xc3,  high = 0x0000, low = 0x00ec ] rdmsrl(0xc3) -> #2 general counter
> >>>> 7. Step: par1 = 0x30a, high = 0x0000, low = 0x0000 ] rdmsrl(0x30a) -> #1 fixed counter
> >>>>
> >>>> And this state remains forever!
> >>>> I hope my explanations are understandable ;-)
> >>>>
> >>>> Until now I can see this behavior only on a Nehalem processor.
> >>>>
> >>>> Thanks.
> >>>> Dietmar
> >>>>
> >>>>>
> >>>>> Best Regards
> >>>>> Shan Haitao
> >>>>>
> >>>>> 2009/10/30 Keir Fraser <keir.fraser@xxxxxxxxxxxxx>:
> >>>>>> On 30/10/2009 12:20, "Dietmar Hahn"
> >>>>>> <dietmar.hahn@xxxxxxxxxxxxxx> wrote:
> >>>>>>
> >>>>>>> I searched the Intel processor spec but couldn't find any help.
> >>>>>>> So my question is, what is wrong here?
> >>>>>>> Can anybody with more knowledge point me in the right direction,
> >>>>>>> what can I still do to find the real cause of this?
> >>>>>>
> >>>>>> You should probably Cc one of the Intel guys who implemented this
> >>>>>> stuff -- I've added Haitao Shan.
> >>>>>>
> >>>>>> Meanwhile I'd be interested to know whether things work okay for
> >>>>>> you, minus performance counters and the hypervisor hang, if you
> >>>>>> return immediately from vpmu_initialise(). Really at minimum we
> >>>>>> need such a fix, perhaps with a boot parameter to re-enable the
> >>>>>> feature, for 3.4.2 release; allowing guests to hose the
> >>>>>> hypervisor like this is of course not on.
> >>>>>>
> >>>>>> -- Keir

--
Company details: http://ts.fujitsu.com/imprint.html

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
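
Note on reading the trace output in the thread: the high/low pairs logged from MSR_CORE_PERF_GLOBAL_STATUS can be decoded by hand. The following is a minimal sketch, not code from the thread or from Xen. It assumes the architectural layout in which bits 0..n-1 flag overflow of the general counters and bits 32..34 flag the three fixed counters, and it assumes n = 4 general counters on Nehalem. The helper names (join_halves, general_overflow_mask, fixed_overflow_mask, ovf_ctrl_clear_mask, describe_status) are hypothetical and exist only for this illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed layout of the global status MSR (not taken from the thread):
 *   bits 0..num_gp-1 : overflow flag of general counter #i
 *   bits 32..34      : overflow flag of fixed counter #i
 */
#define FIXED_CTR_SHIFT 32
#define NUM_FIXED_CTRS  3

/* Rebuild the 64-bit MSR value from the two 32-bit halves in the trace. */
static uint64_t join_halves(uint32_t high, uint32_t low)
{
    return ((uint64_t)high << 32) | low;
}

/* Overflow flags of the general counters (num_gp must be <= 32). */
static uint32_t general_overflow_mask(uint64_t status, unsigned int num_gp)
{
    return (uint32_t)(status & ((1ULL << num_gp) - 1));
}

/* Overflow flags of the fixed counters, shifted down to bit 0. */
static uint32_t fixed_overflow_mask(uint64_t status)
{
    return (uint32_t)((status >> FIXED_CTR_SHIFT) &
                      ((1U << NUM_FIXED_CTRS) - 1));
}

/* Same expression as the value written in "3. Step" of the quoted code,
 * with 64-bit arithmetic for the shift. The constant also sets bits 62
 * and 63 in addition to the three fixed-counter bits. */
static uint64_t ovf_ctrl_clear_mask(unsigned int num_gp)
{
    return 0xC000000700000000ULL | ((1ULL << num_gp) - 1);
}

/* Print which counters a traced status value reports as overflowed. */
static void describe_status(const char *label, uint32_t high, uint32_t low,
                            unsigned int num_gp)
{
    uint64_t status = join_halves(high, low);
    uint32_t gp = general_overflow_mask(status, num_gp);
    uint32_t fixed = fixed_overflow_mask(status);
    unsigned int i;

    printf("%s: status = 0x%016llx\n", label, (unsigned long long)status);
    for (i = 0; i < num_gp; i++)
        if (gp & (1U << i))
            printf("  general counter #%u overflowed\n", i);
    for (i = 0; i < NUM_FIXED_CTRS; i++)
        if (fixed & (1U << i))
            printf("  fixed counter #%u overflowed\n", i);
}
```

With the "2. Step" values of the last good NMI (high = 0x0002, low = 0x0004) the two masks come out as 0x4 and 0x2, that is general counter #2 and fixed counter #1, which matches the counters Dietmar programmed. With the "5. Step" values of the failing NMI (high = 0x0002, low = 0x0000) the fixed mask is still 0x2, so the overflow flag of the fixed counter was still set after the write in "3. Step". A call such as describe_status("failing NMI, step 5", 0x0002, 0x0000, 4) prints the same information in readable form.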

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.