From 3d916a443e97169a3d88765c4e0b07ac813f439f Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Thu, 10 Aug 2017 14:33:17 -0700 Subject: documentation: Slow systems can stall RCU grace periods If a fast system has a worst-case grace-period duration of (say) ten seconds, then running the same workload on a system ten times as slow will get you an RCU CPU stall warning given default stall-warning timeout settings. This commit therefore adds this possibility to stallwarn.txt. Reported-by: Daniel Lezcano Signed-off-by: Paul E. McKenney --- Documentation/RCU/stallwarn.txt | 6 ++++++ 1 file changed, 6 insertions(+) (limited to 'Documentation/RCU') diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt index 21b8913acbdf..238acbd94917 100644 --- a/Documentation/RCU/stallwarn.txt +++ b/Documentation/RCU/stallwarn.txt @@ -70,6 +70,12 @@ o A periodic interrupt whose handler takes longer than the time considerably longer than normal, which can in turn result in RCU CPU stall warnings. +o Testing a workload on a fast system, tuning the stall-warning + timeout down to just barely avoid RCU CPU stall warnings, and then + running the same workload with the same stall-warning timeout on a + slow system. Note that thermal throttling and on-demand governors + can cause a single system to be sometimes fast and sometimes slow! + o A hardware or software issue shuts off the scheduler-clock interrupt on a CPU that is not in dyntick-idle mode. This problem really has happened, and seems to be most likely to -- cgit v1.2.3