
RE: [Xen-devel] million cycle interrupt



> be suspected. Then you may use "ioapic_ack=old" to verify.

FYI ioapic_ack=old doesn't seem to help


> -----Original Message-----
> From: Tian, Kevin [mailto:kevin.tian@xxxxxxxxx]
> Sent: Monday, April 13, 2009 10:45 PM
> To: Dan Magenheimer; Keir Fraser; Xen-Devel (E-mail)
> Subject: RE: [Xen-devel] million cycle interrupt
>
>
> >From: Dan Magenheimer [mailto:dan.magenheimer@xxxxxxxxxx]
> >Sent: April 14, 2009 12:24
> >
> >The IO-APIC-edge interrupt action is timer_interrupt.
>
> this could be driven by either the PIT or the HPET (in legacy
> replacement mode), and is activated when the PIT is used as the
> platform timer or when the PIT/HPET is used to broadcast wakeup
> events to cpus in deep C-states. The logic in timer_interrupt is
> very simple, so maybe you could again go one step further and trace
> into that function, e.g. the block protected by pit_lock. :-)
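
(For concreteness, here is a rough standalone analogue of that kind of
tracing, not actual Xen code and with illustrative names: bracket the
lock-protected block with TSC reads and remember the worst case.  The
two contending threads are only there to show how contention on
something like pit_lock would inflate the reading.)

/* Illustrative, standalone analogue (not the actual Xen code): time a
 * lock-protected critical section with the TSC, the same bracketing one
 * could add around the pit_lock block inside timer_interrupt(). */
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h>              /* __rdtsc() */

static pthread_mutex_t pit_lock = PTHREAD_MUTEX_INITIALIZER;
static uint64_t max_cycles[2];      /* worst case seen by each thread */

static void *worker(void *arg)
{
    int id = *(int *)arg;

    for (int i = 0; i < 200000; i++) {
        uint64_t t0 = __rdtsc();

        pthread_mutex_lock(&pit_lock);
        /* stand-in for latching and reading the PIT counter */
        volatile unsigned int count = i & 0xffff;
        (void)count;
        pthread_mutex_unlock(&pit_lock);

        uint64_t delta = __rdtsc() - t0;
        if (delta > max_cycles[id])
            max_cycles[id] = delta;
    }
    return NULL;
}

int main(void)
{
    pthread_t t[2];
    int ids[2] = { 0, 1 };

    for (int i = 0; i < 2; i++)
        pthread_create(&t[i], NULL, worker, &ids[i]);
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);

    printf("max cycles in locked block: %llu / %llu\n",
           (unsigned long long)max_cycles[0],
           (unsigned long long)max_cycles[1]);
    return 0;
}

(Build with "gcc -O2 -pthread"; the worst-case numbers should grow
noticeably once the two threads actually contend for the lock.)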
>
> >
> >The MSI interrupt action is showing up strange
> >(2010001)... I'll need to look into this more.
>
> That's expected, since the action is allocated by xmalloc for devices
> owned by a guest. I guess you found the two MSI instances in the
> __do_IRQ_guest path? If yes, the logic to handle pending EOI may
> be suspected. Then you may use "ioapic_ack=old" to verify.
>
> From the information so far, it looks like the weirdly high cost could
> come from an individual handler implementation; otherwise you would be
> able to observe it in every handler (such as serial).
>
> >
> >> lock contention?
> >
> >Possible, though the consistent max numbers make it
> >seem unlikely.
> >
> >> If it's really caused by SMT, it's more likely to be
> >> software-caused resource contention due to the doubled number of
> >> logical processors.
> >
> >You mentioned the PIT earlier... I wonder whether the PIT can
> >be read in parallel from different cores, but must be serialized
> >when read from different hardware threads?  A million cycles is
> >probably about right for a platform timer read, true?
>
> It looks too large to me.
>
> Thanks,
> Kevin
>
> >
> >> -----Original Message-----
> >> From: Tian, Kevin [mailto:kevin.tian@xxxxxxxxx]
> >> Sent: Monday, April 13, 2009 9:05 PM
> >> To: Dan Magenheimer; Keir Fraser; Xen-Devel (E-mail)
> >> Subject: RE: [Xen-devel] million cycle interrupt
> >>
> >>
> >> >From: Dan Magenheimer [mailto:dan.magenheimer@xxxxxxxxxx]
> >> >Sent: April 14, 2009 10:59
> >> >
> >> >Some more interesting data that echoes my earlier
> >> >measurements:
> >> >
> >> >Booting with maxcpus=4, the three "large" interrupts
> >> >are all about 20000 cycles.  Booting with maxcpus=6,
> >> >the three are at 146896, 183260, and 240959 cycles.
> >> >
> >> >So it definitely appears to get worse as maxcpus
> >> >gets larger.
> >>
> >> lock contention?
> >>
> >> >
> >> >I also wonder if it might be related to SMT?
> >>
> >> If it's really caused by SMT, it's more likely to be
> >> software-caused resource contention due to the doubled number of
> >> logical processors. At the hardware level, although the cache is
> >> shared, ISRs in Xen are normally short, and I don't think such a
> >> high factor could come from there...
> >>
> >> Thanks,
> >> Kevin
> >>
> >> >
> >> >> One immediate trick is to print the handler address for the problematic vector
> >> >
> >> >I'll give that a try, but probably not until tomorrow.
> >> >
> >> >> -----Original Message-----
> >> >> From: Tian, Kevin [mailto:kevin.tian@xxxxxxxxx]
> >> >> Sent: Monday, April 13, 2009 8:42 PM
> >> >> To: Dan Magenheimer; Keir Fraser; Xen-Devel (E-mail)
> >> >> Subject: RE: [Xen-devel] million cycle interrupt
> >> >>
> >> >>
> >> >> >From: Dan Magenheimer [mailto:dan.magenheimer@xxxxxxxxxx]
> >> >> >Sent: April 14, 2009 10:34
> >> >> >
> >> >> >> handler, why not take one more step to measure every handler
> >> >> >
> >> >> >Great idea!  I added a max_cycles field to irq_desc_t
> >> >> >and check/update it at every interrupt in do_IRQ,
> >> >> >then print max_cycles in "xm debug-key i",
> >> >> >including a "max max".
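
(Roughly what that bookkeeping looks like, as a standalone analogue
rather than the actual do_IRQ()/irq_desc_t change; names here are
illustrative only:)

/* Per-vector worst-case accounting, analogous to a max_cycles field in
 * irq_desc_t updated from do_IRQ() and dumped via "xm debug-key i". */
#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h>                  /* __rdtsc() */

#define NR_VECTORS 256

static uint64_t max_cycles[NR_VECTORS]; /* per-vector worst case */

/* would sit in do_IRQ(), bracketing the action handler call */
static void account_irq(int vector, void (*handler)(void))
{
    uint64_t t0 = __rdtsc();
    handler();
    uint64_t delta = __rdtsc() - t0;
    if (delta > max_cycles[vector])
        max_cycles[vector] = delta;
}

/* analogue of the extra output added to the debug-key handler */
static void dump_max_cycles(void)
{
    uint64_t max_max = 0;

    for (int v = 0; v < NR_VECTORS; v++) {
        if (!max_cycles[v])
            continue;
        printf("Vec%3d max_cycles=%llu\n",
               v, (unsigned long long)max_cycles[v]);
        if (max_cycles[v] > max_max)
            max_max = max_cycles[v];
    }
    printf("max_max_cycles = %llu\n", (unsigned long long)max_max);
}

static void dummy_handler(void)
{
    volatile int x = 0;
    for (int i = 0; i < 1000; i++)
        x += i;
}

int main(void)
{
    for (int i = 0; i < 10000; i++)
        account_irq(49, dummy_handler);     /* pretend vector 49 fires */
    dump_max_cycles();
    return 0;
}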
> >> >> >
> >> >> >I'm not entirely sure how to interpret the output
> >> >> >from dump_irqs(), but the only two IRQs with
> >> >> >type==PCI-MSI have a "large" max (450044 cycles and
> >> >> >730972 cycles).  The third is Vec240, an IO-APIC-edge
> >> >> >interrupt that maxes at 1047500 cycles.  No other
> >> >> >interrupt has a max exceeding 10000.
> >> >> >
> >> >> >Here's the relevant output.  How do I map this to
> >> >> >something meaningful?
> >> >>
> >> >> One immediate trick is to print the handler address for the
> >> >> problematic vector, and then search for its name in your dumped
> >> >> Xen symbol file.
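
(Something like the following, illustrative rather than the exact Xen
structures; in Xen it would be the action handler pointer printed from
dump_irqs(), and the address can then be looked up in the dumped symbol
file, e.g. with "nm --defined-only xen-syms | sort":)

#include <stdio.h>

typedef void (*irq_handler_t)(void);

static void timer_interrupt_stub(void) { }   /* stand-in handler */

/* print the raw handler address so it can be matched against the
 * symbol file by hand */
static void dump_handler(int vector, irq_handler_t h)
{
    printf("Vec%3d handler at %p\n", vector, (void *)h);
}

int main(void)
{
    dump_handler(240, timer_interrupt_stub);
    return 0;
}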
> >> >>
> >> >> Thanks,
> >> >> Kevin
> >> >>
> >> >> >
> >> >> >(XEN)    Vec 49 IRQ -1: type=PCI-MSI         status=00000010
> >> >> >max_cycles=450044 in-flight=0 domain-list=0:254(----),
> >> >> >
> >> >> >(XEN)    Vec208 IRQ -1: type=PCI-MSI         status=00000010
> >> >> >max_cycles=730972 in-flight=0 domain-list=0:255(----),
> >> >>
> >> >> The two above look like they are owned by dom0.
> >> >>
> >> >> >
> >> >> >(XEN)    Vec240 IRQ  0: type=IO-APIC-edge    status=00000000
> >> >> >max_cycles=1047500 mapped, unbound
> >> >>
> >> >>
> >> >> Looks like the PIT, but I'm not quite sure.
> >> >>
> >> >> >
> >> >> >(XEN) max_max_cycles = 1047500
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >> >> -----Original Message-----
> >> >> >> From: Tian, Kevin [mailto:kevin.tian@xxxxxxxxx]
> >> >> >> Sent: Monday, April 13, 2009 5:19 PM
> >> >> >> To: Dan Magenheimer; Keir Fraser; Xen-Devel (E-mail)
> >> >> >> Subject: RE: [Xen-devel] million cycle interrupt
> >> >> >>
> >> >> >>
> >> >> >> >From: Dan Magenheimer
> >> >> >> >Sent: April 14, 2009 5:15
> >> >> >> >
> >> >> >> >> You can instrument irq_enter() and irq_exit() to read TSC
> >> >> >> >
> >> >> >> >Rather than do this generically and ensure I get all the
> >> >> >> >macros correct (e.g. per_cpu, nesting) I manually
> >> >> >> >instrumented three likely suspect irq_enter/exit pairs, two
> >> >> >> >in do_IRQ() and one in smp_call_function().  ALL of them
> >> >> >> >show an issue with max readings in the 300K-1M range... with
> >> >> >> >smp_call_function showing the lowest max and the second in
> >> >> >> >do_IRQ (the non-guest one) showing readings over 1M (and the
> >> >> >> >guest one at about 800K).
> >> >> >>
> >> >> >> Since you already reach this step, around calling the actual
> >> >> >> action's handler, why not take one more step to measure every
> >> >> >> handler (serial, apic, vtd, ...)? You can first simply print
> >> >> >> which handlers are registered or invoked on your platform. If
> >> >> >> only one handler shows abnormally high latency, the problem is
> >> >> >> probably in that one specific spot. Otherwise you can suspect
> >> >> >> some common code shared by all handlers, or ... as Keir said,
> >> >> >> it could be SMM. :-)
> >> >> >>
> >> >> >> Thanks,
> >> >> >> Kevin
> >> >> >>
> >> >> >> >
> >> >> >> >Interestingly, I get no readings at all over 60K when I
> >> >> >> >recompile with max_phys_cpus=4 (and with nosmp) on my
> >> >> >> >quad-core-by-two-thread machine.  This is versus several
> >> >> >> >readings over 60K nearly every second when max_phys_cpus=8.
> >> >> >> >
> >> >> >> >> Otherwise who knows, it could even be system management mode
> >> >> >> >
> >> >> >> >I suppose measuring irq_enter/exit pairs still doesn't rule
> >> >> >> >this out.  But the "large" interrupts don't seem to happen
> >> >> >> >(at least not nearly as frequently) with fewer physical
> >> >> >> >processors enabled, so system management mode seems unlikely.
> >> >> >> >
> >> >> >> >Anyway, it's still a probable problem, and still mostly a
> >> >> >> >mystery as to what is actually happening.  And, to repeat,
> >> >> >> >this has nothing to do with tmem... I'm just observing it
> >> >> >> >using tmem as a convenient measurement tool.
> >> >> >> >
> >> >> >> >> -----Original Message-----
> >> >> >> >> From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
> >> >> >> >> Sent: Monday, April 13, 2009 2:24 AM
> >> >> >> >> To: Dan Magenheimer; Xen-Devel (E-mail)
> >> >> >> >> Subject: Re: [Xen-devel] million cycle interrupt
> >> >> >> >>
> >> >> >> >>
> >> >> >> >> On 12/04/2009 21:16, "Dan Magenheimer"
> >> >> >> >> <dan.magenheimer@xxxxxxxxxx> wrote:
> >> >> >> >>
> >> >> >> >> > Is a million cycles in an interrupt handler bad?  Any
> >> >> >> >> > idea what might be consuming this?  The evidence might
> >> >> >> >> > imply more cpus means longer interrupts, which bodes
> >> >> >> >> > poorly for larger machines.  I tried disabling the timer
> >> >> >> >> > rendezvous code (not positive I was successful), but
> >> >> >> >> > still got large measurements, and eventually the machine
> >> >> >> >> > froze up (but not before I observed the stime skew
> >> >> >> >> > climbing quickly to the millisecond-plus range).
> >> >> >> >>
> >> >> >> >> You can instrument irq_enter() and irq_exit() to read TSC
> >> >> >> >> and find out the distribution of irq handling times for
> >> >> >> >> interruptions that Xen knows about.
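
(A hedged sketch of what such instrumentation could collect, again as a
standalone analogue rather than the real irq_enter()/irq_exit() patch:
bucket each TSC delta by power of two to get a rough distribution.)

#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h>              /* __rdtsc() */

#define NBUCKETS 32
static unsigned long buckets[NBUCKETS];

static void record(uint64_t cycles)
{
    int b = 0;

    /* bucket b counts deltas in roughly [2^b, 2^(b+1)) cycles */
    while (cycles > 1 && b < NBUCKETS - 1) {
        cycles >>= 1;
        b++;
    }
    buckets[b]++;
}

static void fake_irq(void)
{
    volatile int x = 0;
    for (int i = 0; i < 500; i++)
        x += i;
}

int main(void)
{
    for (int i = 0; i < 100000; i++) {
        uint64_t t0 = __rdtsc();    /* irq_enter() would read this */
        fake_irq();
        record(__rdtsc() - t0);     /* irq_exit() would record this */
    }
    for (int b = 0; b < NBUCKETS; b++)
        if (buckets[b])
            printf("~2^%-2d cycles: %lu\n", b, buckets[b]);
    return 0;
}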
> >> >> >> >> Otherwise who knows, it could even be system management
> >> >> >> >> mode on that particular box.
> >> >> >> >>
> >> >> >> >>  -- Keir
> >> >> >> >>
> >> >> >> >>
> >> >> >> >>
> >> >> >> >
> >> >> >
> >> >
> >

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

