From: Thomas Gleixner <tglx@linutronix.de>
To: LKML <linux-kernel@vger.kernel.org>
Cc: Daniel J Blueman <daniel@quora.org>,
"Paul E. McKenney" <paulmck@kernel.org>,
John Stultz <jstultz@google.com>,
Waiman Long <longman@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Dave Hansen <dave.hansen@linux.intel.com>,
Tony Luck <tony.luck@intel.com>, Borislav Petkov <bp@alien8.de>,
Stephen Boyd <sboyd@kernel.org>,
Scott Hamilton <scott.hamilton@eviden.com>
Subject: clocksource: Reduce watchdog readout delay limit to prevent false positives
Date: Wed, 17 Dec 2025 18:21:05 +0100
Message-ID: <87bjjxc9dq.ffs@tglx>
The "valid" readout delay between the two reads of the watchdog is larger
than the valid delta between the resulting watchdog and clocksource
intervals, which results in false positive watchdog results.
Assume TSC is the clocksource and HPET is the watchdog and both have an
uncertainty margin of 250us (default). The watchdog readout does:
1) wdnow = read(HPET);
2) csnow = read(TSC);
3) wdend = read(HPET);
The valid window for the delta between #1 and #3 is calculated by the
uncertainty margins of the watchdog and the clocksource:
m = 2 * watchdog.uncertainty_margin + cs.uncertainty_margin;
which results in 750us for the TSC/HPET case.
The actual interval comparison uses a smaller margin:
m = watchdog.uncertainty_margin + cs.uncertainty_margin;
which results in 500us for the TSC/HPET case.
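To make the two limits concrete, here is a minimal C sketch that computes both for the TSC/HPET example. The struct and function names are hypothetical stand-ins and not the kernel's definitions; only the two formulas are taken from the description above, and all values are in microseconds.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the kernel's struct clocksource. Only the
 * field needed for the arithmetic is modelled, in microseconds.
 */
struct cs_model {
	int64_t uncertainty_margin_us;
};

/* Limit for the delay between the two watchdog reads (#1 and #3). */
static int64_t readout_limit_us(const struct cs_model *wd,
				const struct cs_model *cs)
{
	assert(wd->uncertainty_margin_us >= 0 && cs->uncertainty_margin_us >= 0);
	return 2 * wd->uncertainty_margin_us + cs->uncertainty_margin_us;
}

/* Limit applied when the watchdog and clocksource intervals are compared. */
static int64_t interval_limit_us(const struct cs_model *wd,
				 const struct cs_model *cs)
{
	assert(wd->uncertainty_margin_us >= 0 && cs->uncertainty_margin_us >= 0);
	return wd->uncertainty_margin_us + cs->uncertainty_margin_us;
}
```

With both margins set to 250, the readout limit is 250us wider than the interval limit, which is the gap the scenario exploits.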
That means the following scenario will trigger the watchdog:
Watchdog cycle N:
1) wdnow[N] = read(HPET);
2) csnow[N] = read(TSC);
3) wdend[N] = read(HPET);
Assume the delay between #1 and #2 is 100us and the delay between #1 and
#3 is within the 750us margin, i.e. the readout is considered valid.
Watchdog cycle N + 1:
4) wdnow[N + 1] = read(HPET);
5) csnow[N + 1] = read(TSC);
6) wdend[N + 1] = read(HPET);
If the delay between #4 and #6 is within the 750us margin then any delay
between #4 and #5 which is larger than 600us will fail the interval check
and mark the TSC unstable because the intervals are calculated against the
previous value:
wd_int = wdnow[N + 1] - wdnow[N];
cs_int = csnow[N + 1] - csnow[N];
Putting the above delays in place this results in:
cs_int = (wdnow[N + 1] + 610us) - (wdnow[N] + 100us);
-> cs_int = wd_int + 510us;
which is obviously larger than the allowed 500us margin and results in
marking TSC unstable.
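The arithmetic above can be replayed with a small C model. This is a sketch under stated assumptions: both clocks are taken to be perfectly in sync, so the only difference between cs_int and wd_int is the change in the delay between the watchdog read and the clocksource read from one cycle to the next. The function names are hypothetical, and comparing the absolute difference against the limit is an assumption of this model.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * cs_int - wd_int for two consecutive cycles, given the delay between
 * the watchdog read and the clocksource read in each cycle (in us).
 * wd_int cancels out because the clocks are assumed to be in sync.
 */
static int64_t readout_skew_us(int64_t delay_prev_us, int64_t delay_cur_us)
{
	return delay_cur_us - delay_prev_us;
}

/* True if the interval check would mark the clocksource unstable. */
static bool marked_unstable(int64_t delay_prev_us, int64_t delay_cur_us,
			    int64_t limit_us)
{
	int64_t skew = readout_skew_us(delay_prev_us, delay_cur_us);

	if (skew < 0)
		skew = -skew;
	return skew > limit_us;
}
```

With a 100us delay in cycle N and a 610us delay in cycle N + 1, the skew is 510us and exceeds the 500us limit, although both readouts passed the 750us readout check.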
Fix this by using the same margin as the interval comparison. If the delay
between two watchdog reads is larger than that, then the readout was either
disturbed by interconnect congestion, NMIs or SMIs.
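As a rough sanity check of this reasoning, the following C sketch scans pairs of read delays that a given readout limit would accept and reports whether any pair can exceed the interval limit. It assumes that the clocksource read always falls between the two watchdog reads, so its delay is bounded by the readout limit, and that the clocks do not drift. The function name and the 10us step are illustrative choices and not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true if two accepted readouts can differ by more than the
 * interval limit purely because of readout delay. Delays are scanned
 * in 10us steps from 0 up to and including the readout limit.
 */
static bool false_positive_possible(int64_t readout_limit_us,
				    int64_t interval_limit_us)
{
	for (int64_t prev = 0; prev <= readout_limit_us; prev += 10) {
		for (int64_t cur = 0; cur <= readout_limit_us; cur += 10) {
			int64_t skew = cur > prev ? cur - prev : prev - cur;

			if (skew > interval_limit_us)
				return true;
		}
	}
	return false;
}
```

In this model a 750us readout limit against a 500us interval limit admits false positives, while a 500us readout limit does not, because the largest possible skew then equals the interval limit and the check only fails on a strictly larger value.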
Fixes: 4ac1dd3245b9 ("clocksource: Set cs_watchdog_read() checks based on .uncertainty_margin")
Reported-by: Daniel J Blueman <daniel@quora.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/lkml/20250602223251.496591-1-daniel@quora.org/
---
kernel/time/clocksource.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -252,7 +252,7 @@ enum wd_read_status {
static enum wd_read_status cs_watchdog_read(struct clocksource *cs, u64 *csnow, u64 *wdnow)
{
- int64_t md = 2 * watchdog->uncertainty_margin;
+ int64_t md = watchdog->uncertainty_margin;
unsigned int nretries, max_retries;
int64_t wd_delay, wd_seq_delay;
u64 wd_end, wd_end2;
@@ -285,7 +285,7 @@ static enum wd_read_status cs_watchdog_r
* watchdog test.
*/
wd_seq_delay = cycles_to_nsec_safe(watchdog, wd_end, wd_end2);
- if (wd_seq_delay > md)
+ if (wd_seq_delay > 2 * md)
goto skip_test;
}