Thanks very much Dave and Keir!
Keir, if it's not already too late, can this patch be added to 3.1.3?
Which reminds me, when will 3.1.3 go final?
Thanks,
Dan
> -----Original Message-----
> From: Dave Winchell [mailto:dwinchell@xxxxxxxxxxxxxxx]
> Sent: Wednesday, January 09, 2008 9:53 AM
> To: Keir Fraser
> Cc: dan.magenheimer@xxxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxx; Dave
> Winchell
> Subject: Re: [Xen-devel] [PATCH] Add a timer mode that disables
> pending missed ticks
>
> Hi Keir,
>
> The latest change, c/s 16690, looks fine.
> I agree that the code in c/s 16690 is equivalent to
> the code I submitted. Also, your version is more
> concise.
>
> The error tests confirm the equivalence. With overnight cpu loads,
> the checked-in version was accurate to +.048% for sles and +.038%
> for red hat. My version was +.046% and +.032% in a 2 hour test.
> I don't think the difference is significant.
>
> i/o loads produced errors of +.01%.
>
> Thanks for all your efforts on this issue.
>
> Regards,
> Dave
>
> Keir Fraser wrote:
>
> >Applied as c/s 16690, although the checked-in patch is smaller. I
> >think the only important fix is to pt_intr_post() and the only bit
> >of the patch I totally omitted was the change to
> >pt_process_missed_ticks(). I don't think that change can be
> >important, but let's see what happens to the error percentage...
> >
> > -- Keir
> >
> >On 4/1/08 23:24, "Dave Winchell" <dwinchell@xxxxxxxxxxxxxxx> wrote:
> >
> >>Hi Dan and Keir,
> >>
> >>Attached is a patch that fixes some issues with the SYNC policy
> >>(no_missed_ticks_pending).
> >>I have not tried to make the change the minimal one; rather, I just
> >>ported into the new code what I know to work well. The error for
> >>no_missed_ticks_pending goes from over 3% to .03% with this change,
> >>according to my testing.
> >>
> >>Regards,
> >>Dave
> >>
> >>Dan Magenheimer wrote:
> >>
> >>>Hi Dave --
> >>>
> >>>Did you get your correction ported? If so, it would be nice to see
> >>>this get into 3.1.3.
> >>>
> >>>Note that I just did some very limited testing with timer_mode=2
> >>>(=SYNC=no missed ticks pending) on the tip of xen-3.1-testing
> >>>(64-bit Linux hv guest), and the worst error I've seen so far is
> >>>0.012%. But I haven't tried any exotic loads, just LTP.
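> >>>
> >>>In case it helps anyone else, the mode is set in the HVM guest
> >>>config file. A minimal sketch (timer_mode=2 is the only value I've
> >>>tried -- it selects SYNC/no_missed_ticks_pending; leaving it unset
> >>>gives the default, delay_for_missed_ticks):
> >>>
> >>># HVM guest config fragment (sketch)
> >>>timer_mode=2    # SYNC = no missed ticks pending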
> >>>
> >>>Thanks,
> >>>Dan
> >>>
> >>>>-----Original Message-----
> >>>>From: Dave Winchell [mailto:dwinchell@xxxxxxxxxxxxxxx]
> >>>>Sent: Wednesday, December 19, 2007 12:33 PM
> >>>>To: dan.magenheimer@xxxxxxxxxx
> >>>>Cc: Keir Fraser; Shan, Haitao; xen-devel@xxxxxxxxxxxxxxxxxxx;
> >>>>Dong, Eddie; Jiang, Yunhong; Dave Winchell
> >>>>Subject: Re: [Xen-devel] [PATCH] Add a timer mode that disables
> >>>>pending missed ticks
> >>>>
> >>>>Dan,
> >>>>
> >>>>I did some testing with the constant tsc offset SYNC method (now
> >>>>called no_missed_ticks_pending) and found the error to be very
> >>>>high, much larger than 1%, as I recall. I have not had a chance to
> >>>>submit a correction. I will try to do it later this week or the
> >>>>first week in January. My version of the constant tsc offset SYNC
> >>>>method produces .02% error, so I just need to port that into the
> >>>>current code.
> >>>>
> >>>>The error you got for both of those kernels is what I would expect
> >>>>for the default mode, delay_for_missed_ticks.
> >>>>
> >>>>I'll let Keir answer on how to set the time mode.
> >>>>
> >>>>Regards,
> >>>>Dave
> >>>>
> >>>>Dan Magenheimer wrote:
> >>>>
> >>>>>Anyone make measurements on the final patch?
> >>>>>
> >>>>>I just ran a 64-bit RHEL5.1 pvm kernel and saw a loss of about
> >>>>>0.2% with no load. This was xen-unstable tip today with no
> >>>>>options specified. 32-bit was about 0.01%.
> >>>>>
> >>>>>I think I missed something... how do I run the various accounting
> >>>>>choices and which ones are known to be appropriate for which
> >>>>>kernels?
> >>>>>
> >>>>>Thanks,
> >>>>>Dan
> >>>>>
> >>>>>>-----Original Message-----
> >>>>>>From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
> >>>>>>[mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of
> >>>>>>Keir Fraser
> >>>>>>Sent: Thursday, December 06, 2007 4:57 AM
> >>>>>>To: Dave Winchell
> >>>>>>Cc: Shan, Haitao; xen-devel@xxxxxxxxxxxxxxxxxxx; Dong, Eddie;
> >>>>>>Jiang, Yunhong
> >>>>>>Subject: Re: [Xen-devel] [PATCH] Add a timer mode that disables
> >>>>>>pending missed ticks
> >>>>>>
> >>>>>>Please take a look at xen-unstable changeset 16545.
> >>>>>>
> >>>>>>-- Keir
> >>>>>>
> >>>>>>On 26/11/07 20:57, "Dave Winchell" <dwinchell@xxxxxxxxxxxxxxx>
> >>>>>>wrote:
> >>>>>>
> >>>>>>>Keir,
> >>>>>>>
> >>>>>>>The accuracy data I've collected for i/o loads for the
> >>>>>>>various time protocols follows. In addition, the data
> >>>>>>>for cpu loads is shown.
> >>>>>>>
> >>>>>>>The loads labeled cpu and i/o-8 are on an 8 processor AMD box.
> >>>>>>>Two guests, red hat and sles 64 bit, 8 vcpu each.
> >>>>>>>The cpu load is usex -e36 on each guest.
> >>>>>>>(usex is available at http://people.redhat.com/anderson/usex.)
> >>>>>>>i/o load is 8 instances of dd if=/dev/hda6 of=/dev/null.
> >>>>>>>
> >>>>>>>The loads labeled i/o-32 are 32 instances of dd.
> >>>>>>>Also, these are run on a 4 cpu AMD box.
> >>>>>>>In addition, there is an idle rh-32bit guest.
> >>>>>>>All three guests are 8vcpu.
> >>>>>>>
> >>>>>>>The loads labeled i/o-4/32 are the same as i/o-32
> >>>>>>>except that the redhat-64 guest has 4 instances of dd.
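> >>>>>>>
> >>>>>>>For anyone reproducing the i/o load, each dd set was started
> >>>>>>>with something like this (a sketch -- substitute your own spare
> >>>>>>>partition for /dev/hda6 and set N per the cases above):
> >>>>>>>
> >>>>>>>N=8   # 8, 32, or 4 instances, per the loads above
> >>>>>>>for i in `seq 1 $N`; do
> >>>>>>>    dd if=/dev/hda6 of=/dev/null &
> >>>>>>>done
> >>>>>>>wait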
> >>>>>>>
> >>>>>>>Date Duration Protocol sles, rhat error load
> >>>>>>>
> >>>>>>>11/07 23 hrs 40 min ASYNC -4.96 sec, +4.42 sec, -.006%, +.005% cpu
> >>>>>>>11/09 3 hrs 19 min ASYNC -.13 sec, +1.44 sec, -.001%, +.012% cpu
> >>>>>>>
> >>>>>>>11/08 2 hrs 21 min SYNC -.80 sec, -.34 sec, -.009%, -.004% cpu
> >>>>>>>11/08 1 hr 25 min SYNC -.24 sec, -.26 sec, -.005%, -.005% cpu
> >>>>>>>11/12 65 hrs 40 min SYNC -18 sec, -8 sec, -.008%, -.003% cpu
> >>>>>>>
> >>>>>>>11/08 28 min MIXED -.75 sec, -.67 sec, -.045%, -.040% cpu
> >>>>>>>11/08 15 hrs 39 min MIXED -19. sec, -17.4 sec, -.034%, -.031% cpu
> >>>>>>>
> >>>>>>>11/14 17 hrs 17 min ASYNC -6.1 sec, -55.7 sec, -.01%, -.09% i/o-8
> >>>>>>>11/15 2 hrs 44 min ASYNC -1.47 sec, -14.0 sec, -.015%, -.14% i/o-8
> >>>>>>>
> >>>>>>>11/13 15 hrs 38 min SYNC -9.7 sec, -12.3 sec, -.017%, -.022% i/o-8
> >>>>>>>11/14 48 min SYNC -.46 sec, -.48 sec, -.017%, -.018% i/o-8
> >>>>>>>
> >>>>>>>11/14 4 hrs 2 min MIXED -2.9 sec, -4.15 sec, -.020%, -.029% i/o-8
> >>>>>>>11/20 16 hrs 2 min MIXED -13.4 sec, -18.1 sec, -.023%, -.031% i/o-8
> >>>>>>>
> >>>>>>>11/21 28 min MIXED -2.01 sec, -.67 sec, -.12%, -.04% i/o-32
> >>>>>>>11/21 2 hrs 25 min SYNC -.96 sec, -.43 sec, -.011%, -.005% i/o-32
> >>>>>>>11/21 40 min ASYNC -2.43 sec, -2.77 sec, -.10%, -.11% i/o-32
> >>>>>>>
> >>>>>>>11/26 113 hrs 46 min MIXED -297. sec, +13. sec, -.07%, +.003% i/o-4/32
> >>>>>>>11/26 4 hrs 50 min SYNC -3.21 sec, +1.44 sec, -.017%, +.01% i/o-4/32
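> >>>>>>>
> >>>>>>>(The error column is just drift divided by elapsed time: e.g.,
> >>>>>>>for the 11/07 ASYNC sles run, 23 hrs 40 min = 85,200 sec, and
> >>>>>>>-4.96 sec / 85,200 sec = -.0058%, which rounds to the -.006%
> >>>>>>>shown.)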
> >>>>>>>
> >>>>>>>Overhead measurements:
> >>>>>>>
> >>>>>>>Progress in terms of number of passes through a fixed system
> >>>>>>>workload on an 8 vcpu red hat with an 8 vcpu sles idle.
> >>>>>>>The workload was usex -b48.
> >>>>>>>
> >>>>>>>ASYNC 167 min 145 passes .868 passes/min
> >>>>>>>SYNC 167 min 144 passes .862 passes/min
> >>>>>>>SYNC 1065 min 919 passes .863 passes/min
> >>>>>>>MIXED 221 min 196 passes .887 passes/min
> >>>>>>>
> >>>>>>>Conclusions:
> >>>>>>>
> >>>>>>>The only protocol which meets the .05% accuracy requirement for
> >>>>>>>ntp tracking under the loads above is the SYNC protocol. The
> >>>>>>>worst case accuracies for SYNC, MIXED, and ASYNC are .022%,
> >>>>>>>.12%, and .14%, respectively.
> >>>>>>>
> >>>>>>>We could reduce the cost of the SYNC method by only scheduling
> >>>>>>>the extra wakeups if a certain number of ticks are missed.
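> >>>>>>>
> >>>>>>>Roughly like this, inside pt_process_missed_ticks() -- a
> >>>>>>>hypothetical sketch only; MISSED_THRESHOLD and
> >>>>>>>pt_schedule_wakeup() are invented names, not existing vpt.c
> >>>>>>>identifiers:
> >>>>>>>
> >>>>>>>/* Hypothetical: only schedule the extra SYNC wakeup when the
> >>>>>>> * guest has actually fallen behind by several ticks. */
> >>>>>>>#define MISSED_THRESHOLD 2  /* tuning knob; value invented */
> >>>>>>>
> >>>>>>>    if ( mode_is(pt->vcpu->domain, no_missed_ticks_pending) &&
> >>>>>>>         (missed_ticks >= MISSED_THRESHOLD) )
> >>>>>>>        pt_schedule_wakeup(pt);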
> >>>>>>>
> >>>>>>>Regards,
> >>>>>>>Dave
> >>>>>>>
> >>>>>>>Keir Fraser wrote:
> >>>>>>>
> >>>>>>>>On 9/11/07 19:22, "Dave Winchell" <dwinchell@xxxxxxxxxxxxxxx>
> >>>>>>>>wrote:
> >>>>>>>>
> >>>>>>>>>Since I had a high error (~.03%) for the ASYNC method a
> >>>>>>>>>couple of days ago, I ran another ASYNC test. I think there
> >>>>>>>>>may have been something wrong with the code I used a couple
> >>>>>>>>>of days ago for ASYNC. It may have been missing the immediate
> >>>>>>>>>delivery of interrupt after context switch in.
> >>>>>>>>>
> >>>>>>>>>My results indicate that either SYNC or ASYNC gives
> >>>>>>>>>acceptable accuracy, each running consistently around or
> >>>>>>>>>under .01%. MIXED has a fairly high error of greater than
> >>>>>>>>>.03%. Probably too close to the .05% ntp threshold for
> >>>>>>>>>comfort.
> >>>>>>>>>
> >>>>>>>>>I don't have an overnight run with SYNC. I plan to leave SYNC
> >>>>>>>>>running over the weekend. If you'd rather I can leave MIXED
> >>>>>>>>>running instead. It may be too early to pick the protocol and
> >>>>>>>>>I can run more overnight tests next week.
> >>>>>>>>>
> >>>>>>>>I'm a bit worried about any unwanted side effects of the
> >>>>>>>>SYNC+run_timer approach -- e.g., whether timer wakeups will
> >>>>>>>>cause higher system-wide CPU contention. I find it easier to
> >>>>>>>>think through the implications of ASYNC. I'm surprised that
> >>>>>>>>MIXED loses time, and is less accurate than ASYNC. Perhaps it
> >>>>>>>>delivers more timer interrupts than the other approaches, and
> >>>>>>>>each interrupt event causes a small accumulated error?
> >>>>>>>>
> >>>>>>>>Overall I would consider MIXED and ASYNC as favourites and if
> >>>>>>>>the latter is actually more accurate then I can simply revert
> >>>>>>>>the changeset that implemented MIXED.
> >>>>>>>>
> >>>>>>>>Perhaps rather than running more of the same workloads you
> >>>>>>>>could try idle VCPUs and I/O bound VCPUs (e.g., repeated large
> >>>>>>>>disc reads to /dev/null)? We don't have any data on workloads
> >>>>>>>>that aren't CPU bound, so that's really an obvious place to
> >>>>>>>>put any further effort imo.
> >>>>>>>>
> >>>>>>>>-- Keir
> >>
> >>diff -r cfdbdca5b831 xen/arch/x86/hvm/vpt.c
> >>--- a/xen/arch/x86/hvm/vpt.c Thu Dec 06 15:36:07 2007 +0000
> >>+++ b/xen/arch/x86/hvm/vpt.c Fri Jan 04 17:58:16 2008 -0500
> >>@@ -58,7 +58,7 @@ static void pt_process_missed_ticks(stru
> >> 
> >>     missed_ticks = missed_ticks / (s_time_t) pt->period + 1;
> >>     if ( mode_is(pt->vcpu->domain, no_missed_ticks_pending) )
> >>-        pt->do_not_freeze = !pt->pending_intr_nr;
> >>+        pt->do_not_freeze = 1;
> >>     else
> >>         pt->pending_intr_nr += missed_ticks;
> >>     pt->scheduled += missed_ticks * pt->period;
> >>@@ -127,7 +127,12 @@ static void pt_timer_fn(void *data)
> >> 
> >>     pt_lock(pt);
> >> 
> >>-    pt->pending_intr_nr++;
> >>+    if ( mode_is(pt->vcpu->domain, no_missed_ticks_pending) ) {
> >>+        pt->pending_intr_nr = 1;
> >>+        pt->do_not_freeze = 0;
> >>+    }
> >>+    else
> >>+        pt->pending_intr_nr++;
> >> 
> >>     if ( !pt->one_shot )
> >>     {
> >>@@ -221,8 +226,6 @@ void pt_intr_post(struct vcpu *v, struct
> >>         return;
> >>     }
> >> 
> >>-    pt->do_not_freeze = 0;
> >>-
> >>     if ( pt->one_shot )
> >>     {
> >>         pt->enabled = 0;
> >>@@ -235,6 +238,10 @@ void pt_intr_post(struct vcpu *v, struct
> >>         pt->last_plt_gtime = hvm_get_guest_time(v);
> >>         pt->pending_intr_nr = 0; /* 'collapse' all missed ticks */
> >>     }
> >>+    else if ( mode_is(v->domain, no_missed_ticks_pending) ) {
> >>+        pt->pending_intr_nr--;
> >>+        pt->last_plt_gtime = hvm_get_guest_time(v);
> >>+    }
> >>     else
> >>     {
> >>         pt->last_plt_gtime += pt->period_cycles;
> >>
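> >>(Reading the hunks together: in this mode the timer never
> >>accumulates a missed-tick backlog, and interrupt delivery
> >>resynchronizes guest time rather than replaying one period per
> >>missed tick. A paraphrase of the patch, not literal vpt.c source:)
> >>
> >>/* pt_timer_fn(): clamp the backlog -- at most one tick pending. */
> >>if ( mode_is(pt->vcpu->domain, no_missed_ticks_pending) ) {
> >>    pt->pending_intr_nr = 1;
> >>    pt->do_not_freeze = 0;
> >>}
> >>
> >>/* pt_intr_post(): consume the tick and set guest time to "now",
> >> * rather than advancing it by period_cycles per tick as the
> >> * default delay_for_missed_ticks path does. */
> >>if ( mode_is(v->domain, no_missed_ticks_pending) ) {
> >>    pt->pending_intr_nr--;
> >>    pt->last_plt_gtime = hvm_get_guest_time(v);
> >>}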
>
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel