[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[PATCH 1/5] x86/time: deal with negative deltas in get_s_time_fixed()



Hi, I'm not sure about the other places. In hvm_load_cpu_ctxt (xen/arch/x86/hvm/hvm.c ), it was easy to catch because process_pending_softirqs is frequently called there, which in turn processes softirqs from the timer (where the timestamp is updated). After I fixed sync_tsc in hvm_load_cpu_ctxt, the problem stopped reproducing under no load. However, when the number of vCPUs is 4 times greater than the number of CPUs (under heavy load), the problem rarely reoccurs (mostly during snapshot restores during process_pending_softirqs calls), and this is no longer a simple case. If get_s_time_fixed can indeed be interrupted during execution after rdtsc_ordered, then the current fix is ​​insufficient. It's necessary to atomically copy "t->stamp" to the stack using local_irq_disable and local_irq_enable (as in local_time_calibration), and then work with the copy, confident in its lifetime and immutability. But until get_s_time_fixed is proven to be interruptible, this is premature, so your fix is ​​sufficient. I think I need more information and testing to say more. 

Regarding the other scale_delta calls, if they include values ​​calculated from externally saved tsc values ​​that could have become stale during the process_pending_softirqs call, this definitely needs to be fixed. 

Here's the problematic code from dwm.exe, if that helps:

void __fastcall CPartitionVerticalBlankScheduler::UpdateCurrentTime(CPartitionVerticalBlankScheduler *this) {
    uint64_t *xor_time_ptr = (uint64_t *)((char *)this + 0x3E40);
    CPartitionVerticalBlankScheduler *scheduler = this;
    LARGE_INTEGER *current_time_ptr = (LARGE_INTEGER *)((char *)this + 0x3E30);
    uint64_t previous_time = *((_QWORD *)this + 0x7C6); // 0x3e30
    uint64_t address_high = ((_QWORD)this + 0x3E40) << 32;
    uint64_t xor_key = ((uint64_t)this + 0x3E40) | address_high;

    // validate internal state
    if ((previous_time ^ xor_key) != *((_QWORD *)this + 0x7C8)) { // 0x3e40
        char exception_record[0x98];
        memset(exception_record, 0, sizeof(exception_record));
        *(int *)exception_record = 0x88980080; // MILERR_INTERNALERROR

        uint64_t computed_xor = *xor_time_ptr ^ ((uint64_t)xor_time_ptr | address_high);
        int param_count = 4;
        int previous_time_high = (int)(previous_time >> 32);
        uint32_t previous_time_low = (uint32_t)previous_time;
        int computed_xor_high = (int)(computed_xor >> 32);
        uint32_t computed_xor_low = (uint32_t)computed_xor;

        RaiseFailFastException((PEXCEPTION_RECORD)exception_record, nullptr, 0);
        previous_time = *((_QWORD *)scheduler + 1990);
    }

    // get current timestamp
    *((_QWORD *)scheduler + 0x7C7) = previous_time; // 0x3e38
    QueryPerformanceCounter(current_time_ptr);

    LARGE_INTEGER new_time = *current_time_ptr;
    uint64_t last_previous_time = *((_QWORD *)scheduler + 0x7C7); // 0x3e38

    // check if time go backward
    if (new_time.QuadPart < last_previous_time) {
        char exception_record[0x98];
        memset(exception_record, 0, sizeof(exception_record));
        *(int *)exception_record = 0x8898009B; // MILERR_QPC_TIME_WENT_BACKWARD

        int new_time_high = new_time.HighPart;
        uint32_t new_time_low = new_time.LowPart;
        int last_previous_high = (int)(last_previous_time >> 32);
        uint32_t last_previous_low = (uint32_t)last_previous_time;
        int frequency_high = g_qpcFrequency.HighPart;
        uint32_t frequency_low = g_qpcFrequency.LowPart;

        int64_t time_delta = last_previous_time - new_time.QuadPart;
        int64_t millis_delta = (1000 * time_delta) / g_qpcFrequency.QuadPart;
        int millis_high = (int)(millis_delta >> 32);
        uint32_t millis_low = (uint32_t)millis_delta;

        int param_count = 8;

        RaiseFailFastException((PEXCEPTION_RECORD)exception_record, nullptr, 0);
        new_time = *(LARGE_INTEGER *)((char *)scheduler + 15920);
    }

    *xor_time_ptr = new_time.QuadPart ^ xor_key;
}

Thanks.

06.01.2026 16:58, Jan Beulich пишет:
Callers may pass in TSC values from before the local TSC stamp was last
updated (this would in particular be the case when the TSC was latched, a
time rendezvous occurs, and the latched value is used only afterwards).
scale_delta(), otoh, deals with unsigned values, and hence would treat
negative incoming deltas as huge positive values, yielding entirely bogus
return values.

Fixes: 88e64cb785c1 ("x86/HVM: use fixed TSC value when saving or restoring domain")
Reported-by: Антон Марков <akmarkov45@xxxxxxxxx>
Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
---
An alternative might be to have scale_delta() deal with signed deltas, yet
that seemed more risky to me.

There could likely be more Fixes: tags; the one used is the oldest
applicable one, from what I can tell.

A similar issue looks to exist in read_xen_timer() and its read_cycle()
helper, if we're scheduled out (and beck in) between reading of the TSC
and calculation of the delta (involving ->tsc_timestamp). Am I
overlooking anything there?

stime2tsc() guards against negative deltas by using 0 instead; I'm not
quite sure that's correct either.

amd_check_erratum_1474() (next to its call to tsc_ticks2ns()) has a
comment towards the TSC being "sane", but is that correct? Due to
TSC_ADJUST, rdtsc() may well return a huge value (and the TSC would then
wrap through 0 at some point). Shouldn't we subtract boot_tsc_stamp before
calling tsc_ticks2ns()?

A similar issue looks to exist in tsc_get_info(), again when rdtsc()
possibly returns a huge value due to TSC_ADJUST. Once again I wonder
whether we shouldn't subtract boot_tsc_stamp.

--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -1649,8 +1649,13 @@ s_time_t get_s_time_fixed(u64 at_tsc)
         tsc = at_tsc;
     else
         tsc = rdtsc_ordered();
-    delta = tsc - t->stamp.local_tsc;
-    return t->stamp.local_stime + scale_delta(delta, &t->tsc_scale);
+
+    if ( tsc >= t->stamp.local_tsc )
+        delta = scale_delta(tsc - t->stamp.local_tsc, &t->tsc_scale);
+    else
+        delta = -scale_delta(t->stamp.local_tsc - tsc, &t->tsc_scale);
+
+    return t->stamp.local_stime + delta;
 }
 
 s_time_t get_s_time(void)


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.