To: "dan.magenheimer@xxxxxxxxxx" <dan.magenheimer@xxxxxxxxxx>
Subject: Re: [Xen-devel] [PATCH] Add a timer mode that disables pending missed ticks
From: Dave Winchell <dwinchell@xxxxxxxxxxxxxxx>
Date: Fri, 08 Feb 2008 16:21:16 -0500
Cc: Dave Winchell <dwinchell@xxxxxxxxxxxxxxx>, Deepak Patel <deepak.patel@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Mon, 18 Feb 2008 09:54:00 -0800
In-reply-to: <47A77075.3010707@xxxxxxxxxxxxxxx>
References: <20080201153111687.00000002384@djm-pc> <47A77075.3010707@xxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mozilla Thunderbird 1.0.7-1.1.fc4 (X11/20050929)
Hi Dan,

Sorry it took me so long, but I finally ran an ltp test today.
It's on rh4u4-64. I'm using the ltp defaults, driven by the
runltp script. I had a usex load on rh4u5-64. No ntpd.
virtual processors / physical processors = 2.

The clocks drifted -1 sec (4u5) and +1.5 sec (4u4) in 300 minutes,
for errors of -.005% and +.008%.
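
The percentages above are just drift divided by elapsed time. A minimal sketch of that arithmetic follows; the function name is illustrative and not taken from any tool used in the thread.

```python
def drift_percent(drift_seconds: float, elapsed_minutes: float) -> float:
    """Clock drift expressed as a percentage of elapsed wall-clock time."""
    elapsed_seconds = elapsed_minutes * 60.0
    return 100.0 * drift_seconds / elapsed_seconds


if __name__ == "__main__":
    # Drift figures from the 300-minute ltp/usex run described above.
    for guest, drift in (("rh4u5-64", -1.0), ("rh4u4-64", 1.5)):
        print(f"{guest}: {drift_percent(drift, 300):+.4f}%")
```

This gives about -0.0056% and +0.0083%, in line with the rounded figures quoted in the message.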

I'm running a 3.1 based hypervisor with some clock related tweaks that
I haven't submitted, because I'm still characterizing them.

I'm stopping the usex load on 4u5-64 now and replacing it with ltp
and will leave the two guests running ltp over the weekend.

Regards,
Dave


Dave Winchell wrote:

Hi Dan, Deepak:

Thanks for the data. Those drifts are severe - no wonder ntp couldn't
keep them in sync. I'll try to reproduce that behaviour here, with my code base.
If I can't reproduce it, I'll try 3.2.

If you can isolate what ltp is doing during the cliffs, that would be very
helpful.

thanks,
Dave




Dan Magenheimer wrote:

OK, Deepak repeated the test without ntpd and using ntpdate -b before
the test.

The attached graph shows his results: el5u1-64 (best=~0.07%),
el4u5-64 (middle=~0.2%), and el4u5-32 (worst=~0.3%).

We will continue to look at LTP to try to isolate.

Thanks,
Dan

P.S. elXuY is essentially RHEL XuY with some patches.

-----Original Message-----
From: Dave Winchell [mailto:dwinchell@xxxxxxxxxxxxxxx]
Sent: Wednesday, January 30, 2008 2:45 PM
To: Deepak Patel
Cc: dan.magenheimer@xxxxxxxxxx; Keir Fraser;
xen-devel@xxxxxxxxxxxxxxxxxxx; akira.ijuin@xxxxxxxxxx; Dave Winchell
Subject: Re: [Xen-devel] [PATCH] Add a timer mode that disables pending
missed ticks


Dan, Deepak,

It may be that the underlying clock error is too great for ntp
to handle. It would be useful if you did not run ntpd
and instead ran ntpdate -b <timeserver> at the start of the test
for each guest. Then capture the data as you have been doing.
If the drift is greater than .05%, then we need to address that.
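
As a rough illustration of the .05% criterion, the sketch below converts a percentage drift to parts per million and compares it with that limit. The helper names are hypothetical; .05% works out to 500 ppm, which is the figure usually quoted as the largest frequency error ntpd will correct.

```python
NTP_LIMIT_PERCENT = 0.05  # threshold stated in the message above


def percent_to_ppm(percent: float) -> float:
    """Convert a percentage error to parts per million (1% = 10,000 ppm)."""
    return percent * 1e4


def within_ntp_limit(drift_percent: float,
                     limit_percent: float = NTP_LIMIT_PERCENT) -> bool:
    """True if the magnitude of the drift is at or below the limit."""
    return abs(drift_percent) <= limit_percent


if __name__ == "__main__":
    # Approximate drifts reported elsewhere in this thread for the no-ntpd run.
    for guest, drift in (("el5u1-64", 0.07), ("el4u5-64", 0.2), ("el4u5-32", 0.3)):
        verdict = "ok" if within_ntp_limit(drift) else "too high"
        print(f"{guest}: {percent_to_ppm(drift):.0f} ppm ({verdict})")
```

All three of the reported guest drifts fall outside the limit under this check.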

Another option, when running ntpd, is to enable loop statistics
by adding this to /etc/ntp.conf:

statistics loopstats
statsdir /var/lib/ntp/

Then you will see loop data in that directory.
Correlating the data in the loopstats files with the
peaks in skew would be interesting. You will see entries of the form

54495 76787.701 -0.045153303 -132.569229 0.020806776 239.735511 10

where the second-to-last column is the Allan deviation. When that
gets over 1000, ntpd is working pretty hard. However, I have not seen ntpd
completely lose it like you have.
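
A minimal sketch of reading such entries, assuming the seven-column layout of the sample line above (day, seconds past midnight, offset, frequency, jitter, Allan deviation, time constant). The field names are my own labels, and the 1000 threshold is the rule of thumb from this message, not an ntpd constant.

```python
from typing import Iterable, List, NamedTuple


class LoopStat(NamedTuple):
    mjd: int              # Modified Julian Day
    seconds: float        # seconds past midnight
    offset_s: float       # clock offset
    freq_ppm: float       # frequency offset
    jitter_s: float       # RMS jitter
    allan_dev: float      # second-to-last column
    time_constant: int    # loop time constant


def parse_loopstats_line(line: str) -> LoopStat:
    """Parse one whitespace-separated loopstats entry."""
    f = line.split()
    if len(f) != 7:
        raise ValueError(f"expected 7 fields, got {len(f)}")
    return LoopStat(int(f[0]), float(f[1]), float(f[2]), float(f[3]),
                    float(f[4]), float(f[5]), int(f[6]))


def hard_working_entries(lines: Iterable[str],
                         threshold: float = 1000.0) -> List[LoopStat]:
    """Return entries whose Allan deviation exceeds the threshold."""
    stats = (parse_loopstats_line(l) for l in lines if l.strip())
    return [s for s in stats if s.allan_dev > threshold]
```

Matching the timestamps of the returned entries against the skew graph is one way to do the correlation suggested above.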

I'm on vacation until Monday, and won't be reading
email.

Thanks for all your work on this!

-Dave

Deepak Patel wrote:

Is the graph for RHEL5u1-64? (I've never tested this one.)

I do not know which graph was attached with this. But I saw this
behavior in EL4u5 - 32, EL4U5 - 64 and EL5U1 - 64 hvm guests when I
was running ltp tests continuously.

What was the behaviour of the other guests running?

All pvm guests are fine. But the behavior of most of the hvm guests was
as described.

If they had spikes, were they at the same wall time?

No. They are not at the same wall time.

Were the other guests running ltp as well?


Yes, all 6 guests (4 hvm and 2 pvm) are running ltp
continuously.

How are you measuring skew?

I was collecting the output of "ntpdate -q <timeserver>" every 300 seconds
(5 minutes) and created the graph from that.
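
The sampling described here could be scripted roughly as follows. This is a sketch, not the script Deepak used: it assumes ntpdate -q prints lines containing "offset <seconds>" (as common ntpdate builds do), and the function names are illustrative.

```python
import re
import subprocess
import time
from typing import List, Optional, Tuple

# Assumed output shape: "... offset -0.012345, delay ..." / "... offset -0.012345 sec"
_OFFSET_RE = re.compile(r"offset\s+([-+]?[0-9]*\.?[0-9]+)")


def parse_ntpdate_offset(output: str) -> Optional[float]:
    """Return the last offset (seconds) found in ntpdate -q output, or None."""
    matches = _OFFSET_RE.findall(output)
    return float(matches[-1]) if matches else None


def sample_offsets(server: str, samples: int,
                   interval_s: float = 300.0) -> List[Tuple[float, Optional[float]]]:
    """Query the server every interval_s seconds; return (timestamp, offset) pairs."""
    readings = []
    for n in range(samples):
        result = subprocess.run(["ntpdate", "-q", server],
                                capture_output=True, text=True)
        readings.append((time.time(), parse_ntpdate_offset(result.stdout)))
        if n + 1 < samples:
            time.sleep(interval_s)
    return readings
```

Plotting the offset column against the timestamp column reproduces the kind of skew graph discussed in the thread.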

Are you running ntpd?


Yes. ntp was running on all the guests.

I am investigating what causes these spikes and will let everyone know
what my findings are.

Thanks,
Deepak

Anything that you can discover that would be in sync with
the spikes would be very helpful!

The code that I test with is our product code, which is based
on 3.1. So it is possible that something in 3.2 other than vpt.c
is the cause. I can test with 3.2, if necessary.

thanks,
Dave



Dan Magenheimer wrote:

Hi Dave (Keir, see suggestion below) --

Thanks!

Turning off vhpet certainly helps a lot (though see below).

I wonder if timekeeping with vhpet is so bad that it should be
turned off by default (in 3.1, 3.2, and unstable) until it is
fixed?  (I have a patch that defaults it off, can post it if
there is agreement on the above point.)  The whole point of an
HPET is to provide more precise timekeeping and if vhpet is
worse than vpit, it can only confuse users.  Comments?


In your testing, are you just measuring % skew over a long
period of time? We are graphing the skew continuously and
seeing periodic behavior that is unsettling, even with pit.
See attached.  Though your algorithm recovers, the "cliffs"
could still cause real user problems.  I wonder if there is
anything that can be done to make the "recovery" more
responsive?
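
One way to locate such cliffs in a series of periodic offset samples is to flag intervals where the offset changes faster than some rate. The sketch below is an assumption about what counts as a cliff (a change rate above .05% between consecutive samples taken 300 seconds apart); it is not the method used to produce the attached graph.

```python
from typing import List, Sequence, Tuple


def find_cliffs(offsets: Sequence[float],
                interval_s: float = 300.0,
                rate_limit_percent: float = 0.05) -> List[Tuple[int, float]]:
    """Return (sample index, rate in %) where skew changed faster than the limit.

    offsets are clock offsets in seconds, sampled every interval_s seconds.
    """
    cliffs = []
    for i in range(1, len(offsets)):
        rate_percent = 100.0 * (offsets[i] - offsets[i - 1]) / interval_s
        if abs(rate_percent) > rate_limit_percent:
            cliffs.append((i, rate_percent))
    return cliffs


if __name__ == "__main__":
    # Made-up samples: slow drift, then a sudden 1.52 s drop at index 3.
    samples = [0.0, 0.01, 0.02, -1.5, -1.49]
    for index, rate in find_cliffs(samples):
        print(f"cliff at sample {index}: {rate:+.3f}% over the interval")
```

The sample indices can then be matched against what LTP was running at those times.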

We are looking into what part(s) of LTP is causing the cliffs.

Thanks,
Dan



-----Original Message-----
From: Dave Winchell [mailto:dwinchell@xxxxxxxxxxxxxxx]
Sent: Monday, January 28, 2008 8:21 AM
To: dan.magenheimer@xxxxxxxxxx
Cc: Keir Fraser; xen-devel@xxxxxxxxxxxxxxxxxxx;
deepak.patel@xxxxxxxxxx;
akira.ijuin@xxxxxxxxxx; Dave Winchell
Subject: Re: [Xen-devel] [PATCH] Add a timer mode that disables
pending
missed ticks


Dan,

I guess I'm a bit out of date calling for clock= usage.
Looking at linux 2.6.20.4 sources, I think you should specify
"clocksource=pit nohpet" on the linux guest bootline.

You can leave the xen and dom0 bootlines as they are.
The xen and guest clocksources do not need to be the same.
In my tests, xen is using the hpet for its timekeeping and
that appears to be the default.

When you boot the guests you should see
  time.c: Using PIT/TSC based timekeeping.
on the rh4u5-64 guest, and something similar on the others.

(xm dmesg shows 8x Xeon 3.2GHz stepping 04, Platform timer
14.318MHz HPET.)

This appears to be the xen state, which is fine.
I was wrongly assuming that this was the guest state.
You might want to look in your guest logs and see what they were
picking for a clock source.
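
To double-check which clock options a guest was actually booted with, its kernel command line can be inspected from inside the guest. The sketch below is illustrative only: it looks for the three parameters named in this thread (clock=, clocksource=, nohpet) and assumes the guest exposes /proc/cmdline.

```python
from typing import Dict, Union

CLOCK_KEYS = ("clock", "clocksource")   # key=value parameters
CLOCK_FLAGS = ("nohpet",)               # bare flags


def clock_params(cmdline: str) -> Dict[str, Union[str, bool]]:
    """Extract clock-related parameters from a kernel command line string."""
    params: Dict[str, Union[str, bool]] = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key in CLOCK_KEYS:
            params[key] = value
        elif not sep and token in CLOCK_FLAGS:
            params[token] = True
    return params


def guest_clock_params(path: str = "/proc/cmdline") -> Dict[str, Union[str, bool]]:
    """Read the running kernel's command line and extract clock parameters."""
    with open(path) as f:
        return clock_params(f.read())
```

An empty result means none of these options were passed, so the guest kernel chose its clock source on its own.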

Regards,
Dave




Dan Magenheimer wrote:



Thanks, I hadn't realized that! No wonder we didn't see the same
improvement you saw!

> Try specifying clock=pit on the linux boot line...

I'm confused... do you mean "clocksource=pit" on the Xen command line or
"nohpet" / "clock=pit" / "clocksource=pit" on the guest (or dom0?) command
line?  Or both places?  Since the tests take awhile, it would be nice
to get this right the first time.  Do the Xen and guest clocksources need
to be the same?

Thanks,
Dan

-----Original Message-----
*From:* Dave Winchell [mailto:dwinchell@xxxxxxxxxxxxxxx]
*Sent:* Sunday, January 27, 2008 2:22 PM
*To:* dan.magenheimer@xxxxxxxxxx; Keir Fraser
*Cc:* xen-devel@xxxxxxxxxxxxxxxxxxx; deepak.patel@xxxxxxxxxx;
akira.ijuin@xxxxxxxxxx; Dave Winchell
*Subject:* RE: [Xen-devel] [PATCH] Add a timer mode

that disables
pending missed ticks

  Hi Dan,

  The hpet timer does have a fairly large error; I was trying it
  recently.
  I don't remember what I got for error, but 1% sounds about right.
  The problem is that hpet is not built on top of vpt.c, the module
  Keir and I did all the recent work in, for its periodic timer needs.
  Try specifying clock=pit on the linux boot line. If it still picks
  the hpet, which it might, let me know and I'll tell you how to get
  around this.

  Regards,
  Dave






--------------------------------------------------------------
----------


  *From:* Dan Magenheimer [mailto:dan.magenheimer@xxxxxxxxxx]
  *Sent:* Fri 1/25/2008 6:50 PM
  *To:* Dave Winchell; Keir Fraser
  *Cc:* xen-devel@xxxxxxxxxxxxxxxxxxx; deepak.patel@xxxxxxxxxx;
  akira.ijuin@xxxxxxxxxx
  *Subject:* RE: [Xen-devel] [PATCH] Add a timer mode

that disables


  pending missed ticks

  Sorry for the very late followup on this but we finally were able
  to get our testing set up again on stable 3.1 bits and have
  seen some very bad results on 3.1.3-rc1, on the order of 1%.

  Test environment was a 4-socket dual core machine with 24GB of
  memory running six two-vcpu 2GB domains, four hvm plus two pv.
  All six guests were running LTP simultaneously.  The four hvm
  guests were: RHEL5u1-64, RHEL4u5-32, RHEL5-64, and RHEL4u5-64.
  Timer_mode was set to 2 for 64-bit guests and 0 for 32-bit guests.

  All four hvm guests experienced skew around -1%, even the 32-bit
  guest. Less intensive testing didn't exhibit much skew at all.
  A representative graph is attached.

  Dave, I wonder if some portion of your patches didn't end up in
  the xen trees?

  (xm dmesg shows 8x Xeon 3.2GHz stepping 04, Platform timer
  14.318MHz HPET.)

  Thanks,
  Dan

  P.S. Many thanks to Deepak and Akira for running tests.

  > -----Original Message-----
  > From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
  > [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx]On Behalf Of
  > Dave Winchell
  > Sent: Wednesday, January 09, 2008 9:53 AM
  > To: Keir Fraser
  > Cc: dan.magenheimer@xxxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxx; Dave Winchell
  > Subject: Re: [Xen-devel] [PATCH] Add a timer mode that
  > disables pending missed ticks
  >
  >
  > Hi Keir,
  >
  > The latest change, c/s 16690, looks fine.
  > I agree that the code in c/s 16690 is equivalent to
  > the code I submitted. Also, your version is more
  > concise.
  >
  > The error tests confirm the equivalence. With overnight cpu loads,
  > the checked in version was accurate to +.048% for sles
  > and +.038% for red hat. My version was +.046% and +.032% in a
  > 2 hour test.
  > I don't think the difference is significant.
  >
  > i/o loads produced errors of +.01%.
  >
  > Thanks for all your efforts on this issue.
  >
  > Regards,
  > Dave
  >
  >
  >
  > Keir Fraser wrote:
  >
  > >Applied as c/s 16690, although the checked-in patch is smaller. I think the
  > >only important fix is to pt_intr_post() and the only bit of the patch I
  > >totally omitted was the change to pt_process_missed_ticks(). I don't think
  > >that change can be important, but let's see what happens to the error
  > >percentage...
  > >
  > > -- Keir
  > >
  > >On 4/1/08 23:24, "Dave Winchell" <dwinchell@xxxxxxxxxxxxxxx> wrote:
  > >
  > >>Hi Dan and Keir,
  > >>
  > >>Attached is a patch that fixes some issues with the SYNC policy
  > >>(no_missed_ticks_pending).
  > >>I have not tried to make the change the minimal one, but, rather, just
  > >>ported into the new code what I know to work well. The error for
  > >>no_missed_ticks_pending goes from over 3% to .03% with this change
  > >>according to my testing.
  > >>
  > >>Regards,
  > >>Dave
  > >>
  > >>Dan Magenheimer wrote:
  > >>
  > >>>Hi Dave --
  > >>>
  > >>>Did you get your correction ported?  If so, it would be nice to see this get
  > >>>into 3.1.3.
  > >>>
  > >>>Note that I just did some very limited testing with timer_mode=2 (=SYNC=no
  > >>>missed ticks pending) on tip of xen-3.1-testing (64-bit Linux hv guest) and
  > >>>the worst error I've seen so far is 0.012%. But I haven't tried any exotic
  > >>>loads, just LTP.
  > >>>
  > >>>Thanks,
  > >>>Dan
  > >>>
  > >>>>-----Original Message-----
  > >>>>From: Dave Winchell [mailto:dwinchell@xxxxxxxxxxxxxxx]
  > >>>>Sent: Wednesday, December 19, 2007 12:33 PM
  > >>>>To: dan.magenheimer@xxxxxxxxxx
  > >>>>Cc: Keir Fraser; Shan, Haitao; xen-devel@xxxxxxxxxxxxxxxxxxx; Dong,
  > >>>>Eddie; Jiang, Yunhong; Dave Winchell
  > >>>>Subject: Re: [Xen-devel] [PATCH] Add a timer mode that
  > >>>>disables pending missed ticks
  > >>>>
  > >>>>
  > >>>>Dan,
  > >>>>
  > >>>>I did some testing with the constant tsc offset SYNC method
  > >>>>(now called no_missed_ticks_pending)
  > >>>>and found the error to be very high, much larger than 1%, as
  > >>>>I recall.
  > >>>>I have not had a chance to submit a correction. I will try to
  > >>>>do it later this week or the first week in January. My version of
  > >>>>the constant tsc offset SYNC method
  > >>>>produces .02% error, so I just need to port that into the
  > >>>>current code.
  > >>>>
  > >>>>The error you got for both of those kernels is what I would expect
  > >>>>for the default mode, delay_for_missed_ticks.
  > >>>>
  > >>>>I'll let Keir answer on how to set the time mode.
  > >>>>
  > >>>>Regards,
  > >>>>Dave
  > >>>>
  > >>>>Dan Magenheimer wrote:
  > >>>>
  > >>>>>Anyone make measurements on the final patch?
  > >>>>>
  > >>>>>I just ran a 64-bit RHEL5.1 pvm kernel and saw a loss of
  > >>>>>about 0.2% with no load. This was xen-unstable tip today
  > >>>>>with no options specified.  32-bit was about 0.01%.
  > >>>>>
  > >>>>>I think I missed something... how do I run the various
  > >>>>>accounting choices and which ones are known to be appropriate
  > >>>>>for which kernels?
  > >>>>>
  > >>>>>Thanks,
  > >>>>>Dan
  > >>>>>
  > >>>>>>-----Original Message-----
  > >>>>>>From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
  > >>>>>>[mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx]On Behalf Of Keir Fraser
  > >>>>>>Sent: Thursday, December 06, 2007 4:57 AM
  > >>>>>>To: Dave Winchell
  > >>>>>>Cc: Shan, Haitao; xen-devel@xxxxxxxxxxxxxxxxxxx; Dong, Eddie; Jiang,
  > >>>>>>Yunhong
  > >>>>>>Subject: Re: [Xen-devel] [PATCH] Add a timer mode that
  > >>>>>>disables pending missed ticks
  > >>>>>>
  > >>>>>>
  > >>>>>>Please take a look at xen-unstable changeset 16545.
  > >>>>>>
  > >>>>>>-- Keir
  > >>>>>>
  > >>>>>>On 26/11/07 20:57, "Dave Winchell" <dwinchell@xxxxxxxxxxxxxxx> wrote:
  > >>>>>>
  > >>>>>>>Keir,
  > >>>>>>>
  > >>>>>>>The accuracy data I've collected for i/o loads for the
  > >>>>>>>various time protocols follows. In addition, the data
  > >>>>>>>for cpu loads is shown.
  > >>>>>>>
  > >>>>>>>The loads labeled cpu and i/o-8 are on an 8 processor AMD box.
  > >>>>>>>Two guests, red hat and sles 64 bit, 8 vcpu each.
  > >>>>>>>The cpu load is usex -e36 on each guest.
  > >>>>>>>(usex is available at http://people.redhat.com/anderson/usex.)
  > >>>>>>>i/o load is 8 instances of dd if=/dev/hda6 of=/dev/null.
  > >>>>>>>
  > >>>>>>>The loads labeled i/o-32 are 32 instances of dd.
  > >>>>>>>Also, these are run on 4 cpu AMD box.
  > >>>>>>>In addition, there is an idle rh-32bit guest.
  > >>>>>>>All three guests are 8vcpu.
  > >>>>>>>
  > >>>>>>>The loads labeled i/o-4/32 are the same as i/o-32
  > >>>>>>>except that the redhat-64 guest has 4 instances of dd.
  > >>>>>>>
  > >>>>>>>Date   Duration        Protocol  sles error  rhat error  sles %   rhat %   load
  > >>>>>>>
  > >>>>>>>11/07  23 hrs 40 min   ASYNC     -4.96 sec   +4.42 sec   -.006%   +.005%   cpu
  > >>>>>>>11/09  3 hrs 19 min    ASYNC     -.13 sec    +1.44 sec   -.001%   +.012%   cpu
  > >>>>>>>
  > >>>>>>>11/08  2 hrs 21 min    SYNC      -.80 sec    -.34 sec    -.009%   -.004%   cpu
  > >>>>>>>11/08  1 hr 25 min     SYNC      -.24 sec    -.26 sec    -.005%   -.005%   cpu
  > >>>>>>>11/12  65 hrs 40 min   SYNC      -18 sec     -8 sec      -.008%   -.003%   cpu
  > >>>>>>>
  > >>>>>>>11/08  28 min          MIXED     -.75 sec    -.67 sec    -.045%   -.040%   cpu
  > >>>>>>>11/08  15 hrs 39 min   MIXED     -19. sec    -17.4 sec   -.034%   -.031%   cpu
  > >>>>>>>
  > >>>>>>>11/14  17 hrs 17 min   ASYNC     -6.1 sec    -55.7 sec   -.01%    -.09%    i/o-8
  > >>>>>>>11/15  2 hrs 44 min    ASYNC     -1.47 sec   -14.0 sec   -.015%   -.14%    i/o-8
  > >>>>>>>
  > >>>>>>>11/13  15 hrs 38 min   SYNC      -9.7 sec    -12.3 sec   -.017%   -.022%   i/o-8
  > >>>>>>>11/14  48 min          SYNC      -.46 sec    -.48 sec    -.017%   -.018%   i/o-8
  > >>>>>>>
  > >>>>>>>11/14  4 hrs 2 min     MIXED     -2.9 sec    -4.15 sec   -.020%   -.029%   i/o-8
  > >>>>>>>11/20  16 hrs 2 min    MIXED     -13.4 sec   -18.1 sec   -.023%   -.031%   i/o-8
  > >>>>>>>
  > >>>>>>>11/21  28 min          MIXED     -2.01 sec   -.67 sec    -.12%    -.04%    i/o-32
  > >>>>>>>11/21  2 hrs 25 min    SYNC      -.96 sec    -.43 sec    -.011%   -.005%   i/o-32
  > >>>>>>>11/21  40 min          ASYNC     -2.43 sec   -2.77 sec   -.10%    -.11%    i/o-32
  > >>>>>>>
  > >>>>>>>11/26  113 hrs 46 min  MIXED     -297. sec   13. sec     -.07%    .003%    i/o-4/32
  > >>>>>>>11/26  4 hrs 50 min    SYNC      -3.21 sec   1.44 sec    -.017%   .01%     i/o-4/32
  > >>>>>>>
  > >>>>>>>
  > >>>>>>>Overhead measurements:
  > >>>>>>>
  > >>>>>>>Progress in terms of number of passes through a fixed system workload
  > >>>>>>>on an 8 vcpu red hat with an 8 vcpu sles idle.
  > >>>>>>>The workload was usex -b48.
  > >>>>>>>
  > >>>>>>>ASYNC   167 min  145 passes  .868 passes/min
  > >>>>>>>SYNC    167 min  144 passes  .862 passes/min
  > >>>>>>>SYNC   1065 min  919 passes  .863 passes/min
  > >>>>>>>MIXED   221 min  196 passes  .887 passes/min
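
The passes/min column is simply passes divided by run length. A small sketch that recomputes it from the figures in the overhead measurements above (the names are illustrative):

```python
# (protocol, minutes, passes) as listed in the overhead measurements.
RUNS = [
    ("ASYNC", 167, 145),
    ("SYNC", 167, 144),
    ("SYNC", 1065, 919),
    ("MIXED", 221, 196),
]


def passes_per_minute(minutes: float, passes: int) -> float:
    """Workload throughput for one run."""
    return passes / minutes


if __name__ == "__main__":
    for protocol, minutes, passes in RUNS:
        rate = passes_per_minute(minutes, passes)
        print(f"{protocol:5s} {minutes:5d} min {passes:4d} passes {rate:.3f} passes/min")
```

The recomputed rates agree with the quoted column to three decimal places and differ from each other by only a few percent.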
  > >>>>>>>
  > >>>>>>>
  > >>>>>>>Conclusions:
  > >>>>>>>
  > >>>>>>>The only protocol which meets the .05% accuracy requirement for ntp
  > >>>>>>>tracking under the loads above is the SYNC protocol. The worst case
  > >>>>>>>accuracies for SYNC, MIXED, and ASYNC
  > >>>>>>>are .022%, .12%, and .14%, respectively.
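
The worst-case figures can be rechecked from the percentage columns of the accuracy data. The sketch below transcribes those columns by hand (so any transcription slip is mine) and takes the largest magnitude per protocol.

```python
# (protocol, load, sles error %, rhat error %) transcribed from the accuracy data.
RESULTS = [
    ("ASYNC", "cpu", -0.006, 0.005), ("ASYNC", "cpu", -0.001, 0.012),
    ("SYNC", "cpu", -0.009, -0.004), ("SYNC", "cpu", -0.005, -0.005),
    ("SYNC", "cpu", -0.008, -0.003),
    ("MIXED", "cpu", -0.045, -0.040), ("MIXED", "cpu", -0.034, -0.031),
    ("ASYNC", "i/o-8", -0.01, -0.09), ("ASYNC", "i/o-8", -0.015, -0.14),
    ("SYNC", "i/o-8", -0.017, -0.022), ("SYNC", "i/o-8", -0.017, -0.018),
    ("MIXED", "i/o-8", -0.020, -0.029), ("MIXED", "i/o-8", -0.023, -0.031),
    ("MIXED", "i/o-32", -0.12, -0.04), ("SYNC", "i/o-32", -0.011, -0.005),
    ("ASYNC", "i/o-32", -0.10, -0.11),
    ("MIXED", "i/o-4/32", -0.07, 0.003), ("SYNC", "i/o-4/32", -0.017, 0.01),
]


def worst_case(results):
    """Largest absolute error (in %) seen for each protocol."""
    worst = {}
    for protocol, _load, sles, rhat in results:
        err = max(abs(sles), abs(rhat))
        worst[protocol] = max(worst.get(protocol, 0.0), err)
    return worst


def meets_limit(results, limit_percent=0.05):
    """Protocols whose worst-case error stays within the limit."""
    return sorted(p for p, w in worst_case(results).items() if w <= limit_percent)
```

With these inputs, worst_case gives .022 for SYNC, .12 for MIXED and .14 for ASYNC, and meets_limit returns only SYNC, matching the conclusion above.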
  > >>>>>>>
  > >>>>>>>We could reduce the cost of the SYNC method by only scheduling the extra
  > >>>>>>>wakeups if a certain number of ticks are missed.
  > >>>>>>>
  > >>>>>>>Regards,
  > >>>>>>>Dave
  > >>>>>>>
  > >>>>>>>Keir Fraser wrote:
  > >>>>>>>
  > >>>>>>>
  > >>>>>>>>On 9/11/07 19:22, "Dave Winchell" <dwinchell@xxxxxxxxxxxxxxx> wrote:
  > >>>>>>>>
  > >>>>>>>>>Since I had a high error (~.03%) for the ASYNC method a couple of days ago,
  > >>>>>>>>>I ran another ASYNC test. I think there may have been something
  > >>>>>>>>>wrong with the code I used a couple of days ago for ASYNC. It may have been
  > >>>>>>>>>missing the immediate delivery of interrupt after context switch in.
  > >>>>>>>>>
  > >>>>>>>>>My results indicate that either SYNC or ASYNC give acceptable accuracy,
  > >>>>>>>>>each running consistently around or under .01%. MIXED has a fairly high
  > >>>>>>>>>error of greater than .03%. Probably too close to .05% ntp threshold
  > >>>>>>>>>for comfort.
  > >>>>>>>>>
  > >>>>>>>>>I don't have an overnight run with SYNC. I plan to leave SYNC running
  > >>>>>>>>>over the weekend. If you'd rather I can leave MIXED running instead.
  > >>>>>>>>>
  > >>>>>>>>>It may be too early to pick the protocol and I can run more overnight tests
  > >>>>>>>>>next week.
  > >>>>>>>>>
  > >>>>>>>>I'm a bit worried about any unwanted side effects of the SYNC+run_timer
  > >>>>>>>>approach -- e.g., whether timer wakeups will cause higher system-wide CPU
  > >>>>>>>>contention. I find it easier to think through the implications of ASYNC. I'm
  > >>>>>>>>surprised that MIXED loses time, and is less accurate than ASYNC. Perhaps it
  > >>>>>>>>delivers more timer interrupts than the other approaches, and each interrupt
  > >>>>>>>>event causes a small accumulated error?
  > >>>>>>>>
  > >>>>>>>>Overall I would consider MIXED and ASYNC as favourites and if the latter is
  > >>>>>>>>actually more accurate then I can simply revert the changeset that
  > >>>>>>>>implemented MIXED.
  > >>>>>>>>
  > >>>>>>>>Perhaps rather than running more of the same workloads you could try idle
  > >>>>>>>>VCPUs and I/O bound VCPUs (e.g., repeated large disc reads to /dev/null)? We
  > >>>>>>>>don't have any data on workloads that aren't CPU bound, so that's really an
  > >>>>>>>>obvious place to put any further effort imo.
  > >>>>>>>>
  > >>>>>>>> -- Keir
  > >>>>>>>>
  > >>>>>>_______________________________________________
  > >>>>>>Xen-devel mailing list
  > >>>>>>Xen-devel@xxxxxxxxxxxxxxxxxxx
  > >>>>>>http://lists.xensource.com/xen-devel
  > >>diff -r cfdbdca5b831 xen/arch/x86/hvm/vpt.c
  > >>--- a/xen/arch/x86/hvm/vpt.c Thu Dec 06 15:36:07 2007 +0000
  > >>+++ b/xen/arch/x86/hvm/vpt.c Fri Jan 04 17:58:16 2008 -0500
  > >>@@ -58,7 +58,7 @@ static void pt_process_missed_ticks(stru
  > >>
  > >>     missed_ticks = missed_ticks / (s_time_t) pt->period + 1;
  > >>     if ( mode_is(pt->vcpu->domain, no_missed_ticks_pending) )
  > >>-        pt->do_not_freeze = !pt->pending_intr_nr;
  > >>+        pt->do_not_freeze = 1;
  > >>     else
  > >>         pt->pending_intr_nr += missed_ticks;
  > >>     pt->scheduled += missed_ticks * pt->period;
  > >>@@ -127,7 +127,12 @@ static void pt_timer_fn(void *data)
  > >>
  > >>     pt_lock(pt);
  > >>
  > >>-    pt->pending_intr_nr++;
  > >>+    if ( mode_is(pt->vcpu->domain, no_missed_ticks_pending) ) {
  > >>+        pt->pending_intr_nr = 1;
  > >>+        pt->do_not_freeze = 0;
  > >>+    }
  > >>+    else
  > >>+        pt->pending_intr_nr++;
  > >>
  > >>     if ( !pt->one_shot )
  > >>     {
  > >>@@ -221,8 +226,6 @@ void pt_intr_post(struct vcpu *v, struct
  > >>         return;
  > >>     }
  > >>
  > >>-    pt->do_not_freeze = 0;
  > >>-
  > >>     if ( pt->one_shot )
  > >>     {
  > >>         pt->enabled = 0;
  > >>@@ -235,6 +238,10 @@ void pt_intr_post(struct vcpu *v, struct
  > >>             pt->last_plt_gtime = hvm_get_guest_time(v);
  > >>             pt->pending_intr_nr = 0; /* 'collapse' all missed ticks */
  > >>         }
  > >>+        else if ( mode_is(v->domain, no_missed_ticks_pending) ) {
  > >>+            pt->pending_intr_nr--;
  > >>+            pt->last_plt_gtime = hvm_get_guest_time(v);
  > >>+        }
  > >>         else
  > >>         {
  > >>             pt->last_plt_gtime += pt->period_cycles;
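
The effect of the pt_timer_fn() hunk on the pending-interrupt count can be illustrated with a toy model. This is not Xen code: the class below is hypothetical, it ignores locking, freezing and guest-time bookkeeping, and the decrement on delivery in the non-SYNC case is an assumption, since the diff only shows the decrement in the new branch.

```python
class ToyPeriodicTimer:
    """Toy model of pending-tick accounting; names loosely follow the diff."""

    def __init__(self, no_missed_ticks_pending: bool):
        self.no_missed_ticks_pending = no_missed_ticks_pending
        self.pending_intr_nr = 0

    def timer_fired(self) -> None:
        # Mirrors the pt_timer_fn() hunk: in no_missed_ticks_pending mode
        # at most one interrupt is ever pending; otherwise ticks accumulate.
        if self.no_missed_ticks_pending:
            self.pending_intr_nr = 1
        else:
            self.pending_intr_nr += 1

    def interrupt_delivered(self) -> None:
        # Assumed: one pending tick is consumed per delivered interrupt.
        if self.pending_intr_nr > 0:
            self.pending_intr_nr -= 1


if __name__ == "__main__":
    for mode in (True, False):
        timer = ToyPeriodicTimer(no_missed_ticks_pending=mode)
        for _ in range(3):          # three ticks while the vcpu cannot take them
            timer.timer_fired()
        print(f"no_missed_ticks_pending={mode}: pending={timer.pending_intr_nr}")
```

In the model, three undelivered ticks leave one pending interrupt in no_missed_ticks_pending mode and three in the accumulating mode, which is the behavioural difference the patch is after.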
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel