
RE: [Xen-devel] million cycle interrupt



> be suspected. Then you may use "ioapic_ack=old" to verify.

FYI ioapic_ack=old doesn't seem to help


> -----Original Message-----
> From: Tian, Kevin [mailto:kevin.tian@xxxxxxxxx]
> Sent: Monday, April 13, 2009 10:45 PM
> To: Dan Magenheimer; Keir Fraser; Xen-Devel (E-mail)
> Subject: RE: [Xen-devel] million cycle interrupt
>
>
> >From: Dan Magenheimer [mailto:dan.magenheimer@xxxxxxxxxx]
> >Sent: April 14, 2009 12:24
> >
> >The IO-APIC-edge interrupt action is timer_interrupt.
>
> this could be driven by either the PIT or the HPET (in legacy
> replacement mode), and is activated when the PIT is used as the
> platform timer or when the PIT/HPET is used to broadcast wakeup
> events to cpus in deep C-states. The logic in timer_interrupt is
> very simple, so maybe you could again go one step further and trace
> into that function, e.g. the block protected by pit_lock. :-)
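
(For concreteness, here is a rough standalone analogue of that kind of
tracing, not actual Xen code and with illustrative names: bracket the
lock-protected block with TSC reads and remember the worst case.  The
two contending threads are only there to show how contention on
something like pit_lock would inflate the reading.)

/* Illustrative, standalone analogue (not the actual Xen code): time a
 * lock-protected critical section with the TSC, the same bracketing one
 * could add around the pit_lock block inside timer_interrupt(). */
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h>              /* __rdtsc() */

static pthread_mutex_t pit_lock = PTHREAD_MUTEX_INITIALIZER;
static uint64_t max_cycles[2];      /* worst case seen by each thread */

static void *worker(void *arg)
{
    int id = *(int *)arg;

    for (int i = 0; i < 200000; i++) {
        uint64_t t0 = __rdtsc();

        pthread_mutex_lock(&pit_lock);
        /* stand-in for latching and reading the PIT counter */
        volatile unsigned int count = i & 0xffff;
        (void)count;
        pthread_mutex_unlock(&pit_lock);

        uint64_t delta = __rdtsc() - t0;
        if (delta > max_cycles[id])
            max_cycles[id] = delta;
    }
    return NULL;
}

int main(void)
{
    pthread_t t[2];
    int ids[2] = { 0, 1 };

    for (int i = 0; i < 2; i++)
        pthread_create(&t[i], NULL, worker, &ids[i]);
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);

    printf("max cycles in locked block: %llu / %llu\n",
           (unsigned long long)max_cycles[0],
           (unsigned long long)max_cycles[1]);
    return 0;
}

(Build with "gcc -O2 -pthread"; the worst-case numbers should grow
noticeably once the two threads actually contend for the lock.)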
>
> >
> >The MSI interrupt action is showing up strange
> >(2010001)... I'll need to look into this more.
>
> That's expected, since the action is allocated by xmalloc for devices
> owned by a guest. I guess you found the two MSI instances in the
> __do_IRQ_guest path? If yes, the logic to handle pending EOI may
> be suspected. Then you may use "ioapic_ack=old" to verify.
>
> From the information so far, it looks like the weirdly high cost could
> come from an individual handler implementation; otherwise you would be
> able to observe it in every handler (such as serial).
>
> >
> >> lock contention?
> >
> >Possible, though the consistent max numbers make it
> >seem unlikely.
> >
> >> If it's really caused by SMT, it's more likely to be
> >> software-caused resource contention due to the doubled number of
> >> logical processors.
> >
> >You mentioned the PIT earlier... I wonder whether the PIT can
> >be read in parallel from different cores, but must be serialized
> >when read from different hardware threads?  A million cycles is
> >probably about right for a platform timer read, true?
>
> It looks too large to me.
>
> Thanks,
> Kevin
>
> >
> >> -----Original Message-----
> >> From: Tian, Kevin [mailto:kevin.tian@xxxxxxxxx]
> >> Sent: Monday, April 13, 2009 9:05 PM
> >> To: Dan Magenheimer; Keir Fraser; Xen-Devel (E-mail)
> >> Subject: RE: [Xen-devel] million cycle interrupt
> >>
> >>
> >> >From: Dan Magenheimer [mailto:dan.magenheimer@xxxxxxxxxx]
> >> >Sent: April 14, 2009 10:59
> >> >
> >> >Some more interesting data that echoes my earlier
> >> >measurements:
> >> >
> >> >Booting with maxcpus=4, the three "large" interrupts
> >> >are all about 20000 cycles.  Booting with maxcpus=6,
> >> >the three are at 146896, 183260, and 240959 cycles.
> >> >
> >> >So it definitely appears to get worse as maxcpus
> >> >gets larger.
> >>
> >> lock contention?
> >>
> >> >
> >> >I also wonder if it might be related to SMT?
> >>
> >> If it's really caused by SMT, it's more likely to be
> >> software-caused resource contention due to the doubled number of
> >> logical processors. At the hardware level, although the cache is
> >> shared, ISRs in Xen are normally short, and I don't think such a
> >> high factor could come from there...
> >>
> >> Thanks,
> >> Kevin
> >>
> >> >
> >> >> One immediate trick is to print the handler address for the problematic vector
> >> >
> >> >I'll give that a try, but probably not until tomorrow.
> >> >
> >> >> -----Original Message-----
> >> >> From: Tian, Kevin [mailto:kevin.tian@xxxxxxxxx]
> >> >> Sent: Monday, April 13, 2009 8:42 PM
> >> >> To: Dan Magenheimer; Keir Fraser; Xen-Devel (E-mail)
> >> >> Subject: RE: [Xen-devel] million cycle interrupt
> >> >>
> >> >>
> >> >> >From: Dan Magenheimer [mailto:dan.magenheimer@xxxxxxxxxx]
> >> >> >Sent: April 14, 2009 10:34
> >> >> >
> >> >> >> handler, why not take one more step to measure every handler
> >> >> >
> >> >> >Great idea!  I added a max_cycles field to irq_desc_t
> >> >> >and check/update it at every interrupt in do_IRQ,
> >> >> >then print max_cycles in "xm debug-key i",
> >> >> >including a "max max".
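
(Roughly what that bookkeeping looks like, as a standalone analogue
rather than the actual do_IRQ()/irq_desc_t change; names here are
illustrative only:)

/* Per-vector worst-case accounting, analogous to a max_cycles field in
 * irq_desc_t updated from do_IRQ() and dumped via "xm debug-key i". */
#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h>                  /* __rdtsc() */

#define NR_VECTORS 256

static uint64_t max_cycles[NR_VECTORS]; /* per-vector worst case */

/* would sit in do_IRQ(), bracketing the action handler call */
static void account_irq(int vector, void (*handler)(void))
{
    uint64_t t0 = __rdtsc();
    handler();
    uint64_t delta = __rdtsc() - t0;
    if (delta > max_cycles[vector])
        max_cycles[vector] = delta;
}

/* analogue of the extra output added to the debug-key handler */
static void dump_max_cycles(void)
{
    uint64_t max_max = 0;

    for (int v = 0; v < NR_VECTORS; v++) {
        if (!max_cycles[v])
            continue;
        printf("Vec%3d max_cycles=%llu\n",
               v, (unsigned long long)max_cycles[v]);
        if (max_cycles[v] > max_max)
            max_max = max_cycles[v];
    }
    printf("max_max_cycles = %llu\n", (unsigned long long)max_max);
}

static void dummy_handler(void)
{
    volatile int x = 0;
    for (int i = 0; i < 1000; i++)
        x += i;
}

int main(void)
{
    for (int i = 0; i < 10000; i++)
        account_irq(49, dummy_handler);     /* pretend vector 49 fires */
    dump_max_cycles();
    return 0;
}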
> >> >> >
> >> >> >I'm not entirely sure how to interpret the output
> >> >> >from dump_irqs(), but the only two IRQs with
> >> >> >type==PCI-MSI have a "large" max (450044 cycles and
> >> >> >730972 cycles).  The third is Vec240, an IO-APIC-edge
> >> >> >interrupt that maxes at 1047500 cycles.  No other
> >> >> >interrupt has a max exceeding 10000.
> >> >> >
> >> >> >Here's the relevant output.  How do I map this to
> >> >> >something meaningful?
> >> >>
> >> >> One immediate trick is to print the handler address for the
> >> >> problematic vector, and then search for its name in your dumped
> >> >> Xen symbol file.
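
(Something like the following, illustrative rather than the exact Xen
structures; in Xen it would be the action handler pointer printed from
dump_irqs(), and the address can then be looked up in the dumped symbol
file, e.g. with "nm --defined-only xen-syms | sort":)

#include <stdio.h>

typedef void (*irq_handler_t)(void);

static void timer_interrupt_stub(void) { }   /* stand-in handler */

/* print the raw handler address so it can be matched against the
 * symbol file by hand */
static void dump_handler(int vector, irq_handler_t h)
{
    printf("Vec%3d handler at %p\n", vector, (void *)h);
}

int main(void)
{
    dump_handler(240, timer_interrupt_stub);
    return 0;
}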
> >> >>
> >> >> Thanks,
> >> >> Kevin
> >> >>
> >> >> >
> >> >> >(XEN)    Vec 49 IRQ -1: type=PCI-MSI         status=00000010
> >> >> >max_cycles=450044 in-flight=0 domain-list=0:254(----),
> >> >> >
> >> >> >(XEN)    Vec208 IRQ -1: type=PCI-MSI         status=00000010
> >> >> >max_cycles=730972 in-flight=0 domain-list=0:255(----),
> >> >>
> >> >> The two above look like they are owned by dom0.
> >> >>
> >> >> >
> >> >> >(XEN)    Vec240 IRQ  0: type=IO-APIC-edge    status=00000000
> >> >> >max_cycles=1047500 mapped, unbound
> >> >>
> >> >>
> >> >> Looks like the PIT, but I'm not quite sure.
> >> >>
> >> >> >
> >> >> >(XEN) max_max_cycles = 1047500
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >> >> -----Original Message-----
> >> >> >> From: Tian, Kevin [mailto:kevin.tian@xxxxxxxxx]
> >> >> >> Sent: Monday, April 13, 2009 5:19 PM
> >> >> >> To: Dan Magenheimer; Keir Fraser; Xen-Devel (E-mail)
> >> >> >> Subject: RE: [Xen-devel] million cycle interrupt
> >> >> >>
> >> >> >>
> >> >> >> >From: Dan Magenheimer
> >> >> >> >Sent: April 14, 2009 5:15
> >> >> >> >
> >> >> >> >> You can instrument irq_enter() and irq_exit() to read TSC
> >> >> >> >
> >> >> >> >Rather than do this generically and ensure I get all the
> >> >> >> >macros correct (e.g. per_cpu, nesting) I manually
> >> >> >> >instrumented three likely suspect irq_enter/exit pairs, two
> >> >> >> >in do_IRQ() and one in smp_call_function().  ALL of them
> >> >> >> >show an issue with max readings in the 300K-1M range... with
> >> >> >> >smp_call_function showing the lowest max and the second in
> >> >> >> >do_IRQ (the non-guest one) showing readings over 1M (and the
> >> >> >> >guest one at about 800K).
> >> >> >>
> >> >> >> Since you already reach this step, around calling the actual
> >> >> >> action's handler, why not take one more step to measure every
> >> >> >> handler (serial, apic, vtd, ...)? You can first simply print
> >> >> >> which handlers are registered or invoked on your platform. If
> >> >> >> only one handler shows abnormally high latency, the problem is
> >> >> >> probably in that one specific spot. Otherwise you can suspect
> >> >> >> some common code shared by all handlers, or ... as Keir said,
> >> >> >> it could be SMM. :-)
> >> >> >>
> >> >> >> Thanks,
> >> >> >> Kevin
> >> >> >>
> >> >> >> >
> >> >> >> >Interestingly, I get no readings at all over 60K when I
> >> >> >> >recompile with max_phys_cpus=4 (and with nosmp) on my
> >> >> >> >quad-core-by-two-thread machine.  This is versus several
> >> >> >> >readings over 60K nearly every second when max_phys_cpus=8.
> >> >> >> >
> >> >> >> >> Otherwise who knows, it could even be system management mode
> >> >> >> >
> >> >> >> >I suppose measuring irq_enter/exit pairs still doesn't rule
> >> >> >> >this out.  But the "large" interrupts don't seem to happen
> >> >> >> >(at least not nearly as frequently) with fewer physical
> >> >> >> >processors enabled, so system management mode seems unlikely.
> >> >> >> >
> >> >> >> >Anyway, it's still a probable problem, and still mostly a
> >> >> >> >mystery as to what is actually happening.  And, to repeat,
> >> >> >> >this has nothing to do with tmem... I'm just observing it
> >> >> >> >using tmem as a convenient measurement tool.
> >> >> >> >
> >> >> >> >> -----Original Message-----
> >> >> >> >> From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
> >> >> >> >> Sent: Monday, April 13, 2009 2:24 AM
> >> >> >> >> To: Dan Magenheimer; Xen-Devel (E-mail)
> >> >> >> >> Subject: Re: [Xen-devel] million cycle interrupt
> >> >> >> >>
> >> >> >> >>
> >> >> >> >> On 12/04/2009 21:16, "Dan Magenheimer"
> >> >> >> >> <dan.magenheimer@xxxxxxxxxx> wrote:
> >> >> >> >>
> >> >> >> >> > Is a million cycles in an interrupt handler bad?  Any
> >> >> >> >> > idea what might be consuming this?  The evidence might
> >> >> >> >> > imply more cpus means longer interrupts, which bodes
> >> >> >> >> > poorly for larger machines.  I tried disabling the timer
> >> >> >> >> > rendezvous code (not positive I was successful), but
> >> >> >> >> > still got large measurements, and eventually the machine
> >> >> >> >> > froze up (but not before I observed the stime skew
> >> >> >> >> > climbing quickly to the millisecond-plus range).
> >> >> >> >>
> >> >> >> >> You can instrument irq_enter() and irq_exit() to read TSC
> >> >> >> >> and find out the distribution of irq handling times for
> >> >> >> >> interruptions that Xen knows about.
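
(A hedged sketch of what such instrumentation could collect, again as a
standalone analogue rather than the real irq_enter()/irq_exit() patch:
bucket each TSC delta by power of two to get a rough distribution.)

#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h>              /* __rdtsc() */

#define NBUCKETS 32
static unsigned long buckets[NBUCKETS];

static void record(uint64_t cycles)
{
    int b = 0;

    /* bucket b counts deltas in roughly [2^b, 2^(b+1)) cycles */
    while (cycles > 1 && b < NBUCKETS - 1) {
        cycles >>= 1;
        b++;
    }
    buckets[b]++;
}

static void fake_irq(void)
{
    volatile int x = 0;
    for (int i = 0; i < 500; i++)
        x += i;
}

int main(void)
{
    for (int i = 0; i < 100000; i++) {
        uint64_t t0 = __rdtsc();    /* irq_enter() would read this */
        fake_irq();
        record(__rdtsc() - t0);     /* irq_exit() would record this */
    }
    for (int b = 0; b < NBUCKETS; b++)
        if (buckets[b])
            printf("~2^%-2d cycles: %lu\n", b, buckets[b]);
    return 0;
}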
> >> >> >> >> Otherwise who knows, it could even be system management
> >> >> >> >> mode on that particular box.
> >> >> >> >>
> >> >> >> >>  -- Keir
> >> >> >> >>
> >> >> >> >>
> >> >> >> >>
> >> >> >> >
> >> >> >
> >> >
> >

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

