[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] x86/time: further improve TSC / CPU freq calibration accuracy


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Tue, 18 Jan 2022 14:39:03 +0100
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=46/AV6e9Ml293GQJtOOgTGpbANnAuqmnD4/momB9YNs=; b=NLj08Gl1DfcusZ6gkGbrerYlMJpFpI+Mk9BTKzCC0D9NXpB/W8kVPkQruuy935NgqmPXczbdngTQ7BD5K9EXZVXy7aMN3kJFz6zCKRlY3JIuSdj6oHuGHqp4Dzo5ZW9Gw+dmVNN/b6w5+p7sqJfzxq0b7jLXLzbwkPsyLzGDhPkq4StQm+vhxkDsj5Ff2a3NOL4N7YZIlUooMG4NUnXCp2xNdo1ntQu6wugYkatwdqJT2/+Rg6RPd+FRwTilKKMQL+jNNurGWkTWaFmEXRjKZG/gEZywDYRpiupMDObR5i6BPK2Oq2R04Iyh2R8Z1SrCdOZDqC/IuqBr+HIjvjYFbw==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=C5igp/YCnJTOAcHoCQ2OBJzOBw5rSXlBio2UUDwxZZdPcE8tQZQvTlEbAA9kT1u3ZRVyvvcQQXTKOP3lzyK+no/EPJqqdSDdhwU4lmBcfm/qo2F7Fw3aTcTU8HuLU5hQhb6NBp5msBNgOxaIiu/Y+EpmL1DPqIiqzto4rIcgR4Zf0sMIRfBPiSNxi5zQsvqLb9aaR7tyq7W6TXuGM5jOtOKCacwTwuZAdnfpjSobIJT8Z5NHCmo320gMOpKI/3vvOuC0LsbeSxBHnFwvQkLPoVUxwpLwqEdcSO3owIk16HCpu/m4ByXwCuOTKQ05SRLSQ55nElkidMP92+0/KKN7hQ==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Tue, 18 Jan 2022 13:39:17 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 18.01.2022 13:45, Roger Pau Monné wrote:
> On Thu, Jan 13, 2022 at 02:41:31PM +0100, Jan Beulich wrote:
>> Calibration logic assumes that the platform timer (HPET or ACPI PM
>> timer) and the TSC are read at about the same time. This assumption may
>> not hold when a long latency event (e.g. SMI or NMI) occurs between the
>> two reads. Reduce the risk of reading uncorrelated values by doing at
>> least four pairs of reads, using the tuple where the delta between the
>> enclosing TSC reads was smallest. From the fourth iteration onwards bail
>> if the new TSC delta isn't better (smaller) than the best earlier one.
> 
> Do you have some measurements as to whether this makes the frequency
> detection any more accurate?

In the normal case (no SMI or alike) I haven't observed any further
improvement on top of the earlier patch.

>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
>> ---
>> Obviously (I think) instead of having both read_{hpet,pmtmr}_and_tsc()
>> the calibration logic could be folded between HPET and PMTMR, at the
> 
> As an intermediate solution you could have a read_counter_and_tsc I
> would think?

Not sure in which way you view this as "intermediate".

>> expense of a couple more indirect calls (which may not be that much of a
>> problem as this is all boot-time only). Really such folding would have
>> been possible already before, just that the amount of duplicate code
>> hasn't been this large so far. IOW if so desired, then perhaps the
>> folding should be a separate prereq patch.
> 
> You could even pass a platform_timesource as a parameter to that
> generic read counter and TSC helper.

This was the plan, yes, in case I was asked to follow that route (or
if I decided myself that I'd prefer that significantly over the
redundancy).

>> --- a/xen/arch/x86/time.c
>> +++ b/xen/arch/x86/time.c
>> @@ -392,9 +392,36 @@ static u64 read_hpet_count(void)
>>      return hpet_read32(HPET_COUNTER);
>>  }
>>  
>> +static uint32_t __init read_hpet_and_tsc(uint64_t *tsc)
>> +{
>> +    uint64_t tsc_prev = *tsc = rdtsc_ordered(), tsc_min = ~0;
>> +    uint32_t best = best;
>> +    unsigned int i;
>> +
>> +    for ( i = 0; ; ++i )
>> +    {
>> +        uint32_t hpet = hpet_read32(HPET_COUNTER);
>> +        uint64_t tsc_cur = rdtsc_ordered();
>> +        uint64_t tsc_delta = tsc_cur - tsc_prev;
>> +
>> +        if ( tsc_delta < tsc_min )
>> +        {
>> +            tsc_min = tsc_delta;
>> +            *tsc = tsc_cur;
>> +            best = hpet;
>> +        }
>> +        else if ( i > 2 )
>> +            break;
>> +
>> +        tsc_prev = tsc_cur;
> 
> Would it be better to set tsc_prev to the current TSC value here in
> order to minimize the window you are measuring? Or else tsc_delta will
> also account for the time in the loop code, and there's more
> likeliness from being interrupted?

I did consider this (or some variant thereof), but did for the moment
conclude that it wouldn't make enough of a difference. If you think
it would be meaningfully better that way, I'll switch.

Both here and for the aspect above, please express you preference (i.e.
not in the form of a question).

> I guess being interrupted four times so that all loops are bad is very
> unlikely.
> 
> Since this loop could in theory run multiple times, do we need to
> check that the counter doesn't overflow? Or else the elapsed counter
> value would be miscalculated.

I don't think I see any issue with overflow here. It's the callers
who need to care about that. And I don't think this function will run
for long enough such that a counter would not only wrap, but also
reach its initial count again.

Jan




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.