
RE: [Xen-devel] million cycle interrupt



> You can instrument irq_enter() and irq_exit() to read TSC 

Rather than do this generically and ensure I get all the macros
correct (e.g. per_cpu, nesting), I manually instrumented three
likely-suspect irq_enter/exit pairs: two in do_IRQ() and one
in smp_call_function().  ALL of them show an issue, with max
readings in the 300K-1M cycle range.  smp_call_function() shows
the lowest max, the guest pair in do_IRQ() reaches about 800K,
and the second (non-guest) pair in do_IRQ() shows readings over 1M.
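
For illustration only, here is a minimal sketch of this kind of
manual instrumentation.  It is not the actual patch: struct irq_stat,
read_tsc(), irq_stat_record() and IRQ_STAT_THRESHOLD are hypothetical
names, and like the manual approach described above it ignores
per-cpu storage and nested interrupts.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Threshold in cycles; 60K is the mark used for "large" readings here. */
#define IRQ_STAT_THRESHOLD 60000ULL

/* Hypothetical statistics block, one per instrumented enter/exit pair. */
struct irq_stat {
    uint64_t count;   /* intervals recorded */
    uint64_t over;    /* intervals longer than IRQ_STAT_THRESHOLD */
    uint64_t max;     /* longest interval seen, in cycles */
};

/* Read the time stamp counter (x86 only; returns 0 on other targets). */
static inline uint64_t read_tsc(void)
{
#if defined(__x86_64__) || defined(__i386__)
    uint32_t lo, hi;
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
#else
    return 0;
#endif
}

/* Account for one interval: call with the TSC values sampled right
 * after irq_enter() and right before irq_exit(). */
static inline void irq_stat_record(struct irq_stat *s,
                                   uint64_t start, uint64_t end)
{
    uint64_t delta = end - start;

    s->count++;
    if (delta > IRQ_STAT_THRESHOLD)
        s->over++;
    if (delta > s->max)
        s->max = delta;
}

/* Dump the counters for one instrumented site. */
static inline void irq_stat_print(const char *site, const struct irq_stat *s)
{
    printf("%s: count=%" PRIu64 " over=%" PRIu64 " max=%" PRIu64 "\n",
           site, s->count, s->over, s->max);
}
```

At each site the idea is to sample start = read_tsc() just after
irq_enter(), then call irq_stat_record(&stat, start, read_tsc()) just
before irq_exit().  A single shared struct like this would race
across cpus, so it is only good enough for a rough look at the max.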

Interestingly, I get no readings at all over 60K when I
recompile with max_phys_cpus=4 (and with nosmp) on my
quad-core-by-two-thread machine.  This is versus several
readings over 60K nearly every second when max_phys_cpus=8.

> Otherwise who knows, it could even be system management mode

I suppose measuring irq_enter/exit pairs still doesn't rule
this out.  But the "large" interrupts don't seem to happen
(at least not nearly as frequently) with fewer physical
processors enabled, so system management mode seems unlikely.

Anyway, still a probable problem, still mostly a mystery
as to what is actually happening.  And, to repeat, this has
nothing to do with tmem... I'm just observing it using
tmem as a convenient measurement tool.

> -----Original Message-----
> From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
> Sent: Monday, April 13, 2009 2:24 AM
> To: Dan Magenheimer; Xen-Devel (E-mail)
> Subject: Re: [Xen-devel] million cycle interrupt
> 
> 
> On 12/04/2009 21:16, "Dan Magenheimer" 
> <dan.magenheimer@xxxxxxxxxx> wrote:
> 
> > Is a million cycles in an interrupt handler bad?  Any idea what
> > might be consuming this?  The evidence might imply more cpus
> > means longer interrupt, which bodes poorly for larger machines.
> > I tried disabling the timer rendezvous code (not positive I
> > was successful), but still got large measurements, and
> > eventually the machine froze up (but not before I observed
> > the stime skew climbing quickly to the millisecond-plus
> > range).
> 
> You can instrument irq_enter() and irq_exit() to read TSC and 
> find out the
> distribution of irq handling times for interruptions that Xen 
> knows about.
> Otherwise who knows, it could even be system management mode on that
> particular box.
> 
>  -- Keir
> 
> 
>
