
RE: [Xen-devel] RE: [PATCH] record max stime skew (was RE: [PATCH] strictly increasing hvm guest time)



> Skipping cpu0 makes no sense.

Oops, I misunderstood that for some reason.

Here's a fixed version.  I also now preserve the "Platform timer is"
line since that can get flushed out of the dmesg buffer.

Any idea why the skew can get so bad?

Dan

> -----Original Message-----
> From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
> Sent: Thursday, July 03, 2008 5:00 PM
> To: dan.magenheimer@xxxxxxxxxx; Xen-Devel (E-mail)
> Cc: Dave Winchell
> Subject: Re: [Xen-devel] RE: [PATCH] record max stime skew (was RE:
> [PATCH] strictly increasing hvm guest time)
> 
> 
> Skipping cpu0 makes no sense. It's not the 'master'. master_stime is time
> calculated from the platform timer (hpet, pit, or whatever). All cpus are
> equal peers. Apart from that looks plausible to me.
> 
>   -- Keir
> 
> On 3/7/08 21:03, "Dan Magenheimer" <dan.magenheimer@xxxxxxxxxx> wrote:
> 
> >>> IMHO, it would be nice to put this patch into the tree as it
> >>> will be good for helping to diagnose time skew problems
> >>> such as the one just reported on the list.
> >>
> >> Oops!  Just after I sent the above email, I checked again and
> >> the same machine (no reboots, no guests ever launched) now reports
> >> a max stime skew of 4333ns!!  Methinks there might be some
> >> periodic glitch in the calibration code?
> >
> > OK this version records not only max but also a distribution
> > of skew.  (The code is a bit ugly... I thought about doing
> > something fancy with log-binary but decided a few base-10
> > ranges were clearer for a human to read.)
> >
> > With this, I use "watch -d 'xm debug-key t; xm dmesg | tail -3'"
> > and can observe that (on my single-socket two-core recent-vintage
> > Intel box) roughly three-quarters of the skew measurements are
> > between 10-100nsec, roughly one-quarter are between 100ns-1us,
> > a couple percent are between 1us-10us and a few are >10us.
> >
> > This represents an approximate distribution of how long an hvm
> > guest might observe time to be stopped (if it is able to repeatedly
> > read time values quickly enough).
> >
> > So on some machines, this might be substantially worse than the
> > old hvm-platform-timer-built-on-tsc mechanism (though we had
> > no monotonicity constraint built into that).
> >
> > I wonder if the >1us outliers are occurring only if the
> > processor has been idle for a while, vs entirely random.
> >
> > Dan
> 
> 
>
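
For anyone following the thread without the patch at hand, the "max plus
base-10 distribution" bookkeeping described in the quoted message can be
sketched roughly as below. This is only an illustration, not the code from
the patch: the names skew_stats, skew_bucket and skew_record are hypothetical,
the bucket boundaries are taken from the ranges mentioned above, and treating
skew as the absolute difference between a cpu's stime and master_stime is an
assumption. Every cpu, including cpu0, would be fed through the same function.

```c
#include <stdint.h>

/* Hypothetical bucket layout, following the ranges quoted in the thread:
 * [0] <10ns, [1] 10-100ns, [2] 100ns-1us, [3] 1us-10us, [4] >=10us */
#define SKEW_BUCKETS 5

struct skew_stats {
    uint64_t max_ns;               /* largest skew seen so far */
    uint64_t count[SKEW_BUCKETS];  /* number of samples per base-10 range */
};

/* Map a skew in nanoseconds to its base-10 bucket index. */
unsigned int skew_bucket(uint64_t skew_ns)
{
    unsigned int b = 0;
    uint64_t limit = 10;

    /* Step up one decade at a time; the last bucket is open-ended. */
    while (b < SKEW_BUCKETS - 1 && skew_ns >= limit) {
        limit *= 10;
        b++;
    }
    return b;
}

/* Record one sample. Assumption: skew is the absolute difference between
 * a cpu's system time and the platform-timer-derived master time. */
void skew_record(struct skew_stats *s, int64_t cpu_stime, int64_t master_stime)
{
    int64_t d = cpu_stime - master_stime;
    uint64_t skew = (uint64_t)(d < 0 ? -d : d);

    if (skew > s->max_ns)
        s->max_ns = skew;
    s->count[skew_bucket(skew)]++;
}
```

With this layout, the 4333ns reading mentioned earlier would land in the
1us-10us bucket. Fixed decade ranges are less compact than a log2 histogram
but map directly onto the ns/us units a person reads in dmesg.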


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

