summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2012-03-07rt: Introduce cpu_chill()Thomas Gleixner
Retry loops on RT might loop forever when the modifying side was preempted. Add cpu_chill() to replace cpu_relax(). cpu_chill() defaults to cpu_relax() for non RT. On RT it puts the looping task to sleep for a tick so the preempted task can make progress. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable-rt@vger.kernel.org
2012-03-07softirq: Check preemption after reenabling interruptsThomas Gleixner
raise_softirq_irqoff() disables interrupts and wakes the softirq daemon, but after reenabling interrupts there is no preemption check, so the execution of the softirq thread might be delayed arbitrarily. In principle we could add that check to local_irq_enable/restore, but that's overkill as the rasie_softirq_irqoff() sections are the only ones which show this behaviour. Reported-by: Carsten Emde <cbe@osadl.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable-rt@vger.kernel.org
2012-03-07lglock/rt: Use non-rt for_each_cpu() in -rt codeSteven Rostedt
Currently the RT version of the lglocks() does a for_each_online_cpu() in the name##_global_lock_online() functions. Non-rt uses its own mask for this, and for good reason. A task may grab a *_global_lock_online(), and in the mean time, one of the CPUs goes offline. Now when that task does a *_global_unlock_online() it releases all the locks *except* the one that went offline. Now if that CPU were to come back on line, its lock is now owned by a task that never released it when it should have. This causes all sorts of fun errors. Like owners of a lock no longer existing, or sleeping on IO, waiting to be woken up by a task that happens to be blocked on the lock it never released. Convert the RT versions to use the lglock specific cpumasks. As once a CPU comes on line, the mask is set, and never cleared even when the CPU goes offline. The locks for that CPU will still be taken and released. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Cc: Carsten Emde <C.Emde@osadl.org> Cc: John Kacur <jkacur@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Clark Williams <clark.williams@gmail.com> Cc: stable-rt@vger.kernel.org Link: http://lkml.kernel.org/r/20120301190345.374756214@goodmis.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07net: u64_stat: Protect seqcountThomas Gleixner
On RT we must prevent that the writer gets preempted inside the write section. Otherwise a preempting reader might spin forever. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable-rt@vger.kernel.org
2012-03-07fs: Protect open coded isize seqcountThomas Gleixner
A writer might be preempted in the write side critical section on RT. Disable preemption to avoid endless spinning of a preempting reader. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable-rt@vger.kernel.org
2012-03-07seqlock: Prevent rt starvationThomas Gleixner
If a low prio writer gets preempted while holding the seqlock write locked, a high prio reader spins forever on RT. To prevent this let the reader grab the spinlock, so it blocks and eventually boosts the writer. This way the writer can proceed and endless spinning is prevented. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable-rt@vger.kernel.org
2012-03-07sysrq: Allow immediate Magic SysRq output for PREEMPT_RT_FULLFrank Rowand
Add a CONFIG option to allow the output from Magic SysRq to be output immediately, even if this causes large latencies. If PREEMPT_RT_FULL, printk() will not try to acquire the console lock when interrupts or preemption are disabled. If the console lock is not acquired the printk() output will be buffered, but will not be output immediately. Some drivers call into the Magic SysRq code with interrupts or preemption disabled, so the output of Magic SysRq will be buffered instead of printing immediately if this option is not selected. Even with this option selected, Magic SysRq output will be delayed if the attempt to acquire the console lock fails. Signed-off-by: Frank Rowand <frank.rowand@am.sony.com> Link: http://lkml.kernel.org/r/4E7CEF60.5020508@am.sony.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07mm, rt: kmap_atomic schedulingPeter Zijlstra
In fact, with migrate_disable() existing one could play games with kmap_atomic. You could save/restore the kmap_atomic slots on context switch (if there are any in use of course), this should be esp easy now that we have a kmap_atomic stack. Something like the below.. it wants replacing all the preempt_disable() stuff with pagefault_disable() && migrate_disable() of course, but then you can flip kmaps around like below. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> [dvhart@linux.intel.com: build fix] Link: http://lkml.kernel.org/r/1311842631.5890.208.camel@twins
2012-03-07kgdb/serial: Short term workaroundJason Wessel
On 07/27/2011 04:37 PM, Thomas Gleixner wrote: > - KGDB (not yet disabled) is reportedly unusable on -rt right now due > to missing hacks in the console locking which I dropped on purpose. > To work around this in the short term you can use this patch, in addition to the clocksource watchdog patch that Thomas brewed up. Comments are welcome of course. Ultimately the right solution is to change separation between the console and the HW to have a polled mode + work queue so as not to introduce any kind of latency. Thanks, Jason.
2012-03-07ping-sysrq.patchCarsten Emde
There are (probably rare) situations when a system crashed and the system console becomes unresponsive but the network icmp layer still is alive. Wouldn't it be wonderful, if we then could submit a sysreq command via ping? This patch provides this facility. Please consult the updated documentation Documentation/sysrq.txt for details. Signed-off-by: Carsten Emde <C.Emde@osadl.org>
2012-03-07skbufhead-raw-lock.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07jump-label-rt.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07workqueue: Fix cpuhotplug trainwreckPeter Zijlstra
The current workqueue code does crazy stuff on cpu unplug, it relies on forced affine breakage, thereby violating per-cpu expectations. Worse, it tries to re-attach to a cpu if the thing comes up again before all previously queued works are finished. This breaks (admittedly bonkers) cpu-hotplug use that relies on a down-up cycle to push all usage away. Introduce a new WQ_NON_AFFINE flag that indicates a per-cpu workqueue will not respect cpu affinity and use this to migrate all its pending works to whatever cpu is doing cpu-down. This also adds a warning for queue_on_cpu() users which warns when its used on WQ_NON_AFFINE workqueues for the API implies you care about what cpu things are ran on when such workqueues cannot guarantee this. For the rest, simply flush all per-cpu works and don't mess about. This also means that currently all workqueues that are manually flushing things on cpu-down in order to provide the per-cpu guarantee no longer need to do so. In short, we tell the WQ what we want it to do, provide validation for this and loose ~250 lines of code. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07lglocks-rt.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07rcu: Make ksoftirqd do RCU quiescent statesPaul E. McKenney
Implementing RCU-bh in terms of RCU-preempt makes the system vulnerable to network-based denial-of-service attacks. This patch therefore makes __do_softirq() invoke rcu_bh_qs(), but only when __do_softirq() is running in ksoftirqd context. A wrapper layer in interposed so that other calls to __do_softirq() avoid invoking rcu_bh_qs(). The underlying function __do_softirq_common() does the actual work. The reason that rcu_bh_qs() is bad in these non-ksoftirqd contexts is that there might be a local_bh_enable() inside an RCU-preempt read-side critical section. This local_bh_enable() can invoke __do_softirq() directly, so if __do_softirq() were to invoke rcu_bh_qs() (which just calls rcu_preempt_qs() in the PREEMPT_RT_FULL case), there would be an illegal RCU-preempt quiescent state in the middle of an RCU-preempt read-side critical section. Therefore, quiescent states can only happen in cases where __do_softirq() is invoked directly from ksoftirqd. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20111005184518.GA21601@linux.vnet.ibm.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07rcu: Fix macro substitution for synchronize_rcu_bh() on RTJohn Kacur
kernel/rcutorture.c:492: error: ‘synchronize_rcu_bh’ undeclared here (not in a function) synchronize_rcu_bh() is not just called as a normal function, but can also be referenced as a function pointer. When CONFIG_PREEMPT_RT_FULL is enabled, synchronize_rcu_bh() is defined as synchronize_rcu(), but needs to be defined without the parenthesis because the compiler will complain when synchronize_rcu_bh is referenced as a function pointer and not a function. Signed-off-by: John Kacur <jkacur@redhat.com> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: stable-rt@vger.kernel.org Link: http://lkml.kernel.org/r/1321235083-21756-1-git-send-email-jkacur@redhat.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07rcu: Merge RCU-bh into RCU-preemptThomas Gleixner
The Linux kernel has long RCU-bh read-side critical sections that intolerably increase scheduling latency under mainline's RCU-bh rules, which include RCU-bh read-side critical sections being non-preemptible. This patch therefore arranges for RCU-bh to be implemented in terms of RCU-preempt for CONFIG_PREEMPT_RT_FULL=y. This has the downside of defeating the purpose of RCU-bh, namely, handling the case where the system is subjected to a network-based denial-of-service attack that keeps at least one CPU doing full-time softirq processing. This issue will be fixed by a later commit. The current commit will need some work to make it appropriate for mainline use, for example, it needs to be extended to cover Tiny RCU. [ paulmck: Added a useful changelog ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20111005185938.GA20403@linux.vnet.ibm.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07timer-handle-idle-trylock-in-get-next-timer-irq.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07rwlocks: Fix section mismatchJohn Kacur
This fixes the following build error for the preempt-rt kernel. make kernel/fork.o CC kernel/fork.o kernel/fork.c:90: error: section of ¡tasklist_lock¢ conflicts with previous declaration make[2]: *** [kernel/fork.o] Error 1 make[1]: *** [kernel/fork.o] Error 2 The rt kernel cache aligns the RWLOCK in DEFINE_RWLOCK by default. The non-rt kernels explicitly cache align only the tasklist_lock in kernel/fork.c That can create a build conflict. This fixes the build problem by making the non-rt kernels cache align RWLOCKs by default. The side effect is that the other RWLOCKs are also cache aligned for non-rt. This is a short term solution for rt only. The longer term solution would be to push the cache aligned DEFINE_RWLOCK to mainline. If there are objections, then we could create a DEFINE_RWLOCK_CACHE_ALIGNED or something of that nature. Comments? Objections? Signed-off-by: John Kacur <jkacur@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/alpine.LFD.2.00.1109191104010.23118@localhost6.localdomain6 Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07rt: Add the preempt-rt lock replacement APIsThomas Gleixner
Map spinlocks, rwlocks, rw_semaphores and semaphores to the rt_mutex based locking functions for preempt-rt. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07rwsem-add-rt-variant.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07rt-add-rt-to-mutex-headers.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07rt-add-rt-spinlocks.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07rtmutex-avoid-include-hell.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07spinlock-types-separate-raw.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07rt-mutex-add-sleeping-spinlocks-support.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07rtmutex-lock-killable.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07local-vars-migrate-disable.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07genirq: Allow disabling of softirq processing in irq thread contextThomas Gleixner
The processing of softirqs in irq thread context is a performance gain for the non-rt workloads of a system, but it's counterproductive for interrupts which are explicitely related to the realtime workload. Allow such interrupts to prevent softirq processing in their thread context. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable-rt@vger.kernel.org
2012-03-07tasklet: Prevent tasklets from going into infinite spin in RTIngo Molnar
When CONFIG_PREEMPT_RT_FULL is enabled, tasklets run as threads, and spinlocks turn are mutexes. But this can cause issues with tasks disabling tasklets. A tasklet runs under ksoftirqd, and if a tasklets are disabled with tasklet_disable(), the tasklet count is increased. When a tasklet runs, it checks this counter and if it is set, it adds itself back on the softirq queue and returns. The problem arises in RT because ksoftirq will see that a softirq is ready to run (the tasklet softirq just re-armed itself), and will not sleep, but instead run the softirqs again. The tasklet softirq will still see that the count is non-zero and will not execute the tasklet and requeue itself on the softirq again, which will cause ksoftirqd to run it again and again and again. It gets worse because ksoftirqd runs as a real-time thread. If it preempted the task that disabled tasklets, and that task has migration disabled, or can't run for other reasons, the tasklet softirq will never run because the count will never be zero, and ksoftirqd will go into an infinite loop. As an RT task, it this becomes a big problem. This is a hack solution to have tasklet_disable stop tasklets, and when a tasklet runs, instead of requeueing the tasklet softirqd it delays it. When tasklet_enable() is called, and tasklets are waiting, then the tasklet_enable() will kick the tasklets to continue. This prevents the lock up from ksoftirq going into an infinite loop. [ rostedt@goodmis.org: ported to 3.0-rt ] Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07softirq-disable-softirq-stacks-for-rt.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07hardirq.h: Define softirq_count() as OUL to kill build warningYong Zhang
kernel/lockdep.c: In function ‘print_bad_irq_dependency’: kernel/lockdep.c:1476:3: warning: format ‘%lu’ expects type ‘long unsigned int’, but argument 7 has type ‘unsigned int’ kernel/lockdep.c: In function ‘print_usage_bug’: kernel/lockdep.c:2193:3: warning: format ‘%lu’ expects type ‘long unsigned int’, but argument 7 has type ‘unsigned int’ kernel/lockdep.i show this: printk("%s/%d [HC%u[%lu]:SC%u[%lu]:HE%u:SE%u] is trying to acquire:\n", curr->comm, task_pid_nr(curr), curr->hardirq_context, ((current_thread_info()->preempt_count) & (((1UL << (10))-1) << ((0 + 8) + 8))) >> ((0 + 8) + 8), curr->softirq_context, (0U) >> (0 + 8), curr->hardirqs_enabled, curr->softirqs_enabled); Signed-off-by: Yong Zhang <yong.zhang0@gmail.com> Link: http://lkml.kernel.org/r/20111013091909.GA32739@zhy Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07softirq-local-lock.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07lockdep-rt.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07softirq: Sanitize softirq pending for NOHZ/RTThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07sched: teach migrate_disable about atomic contextsPeter Zijlstra
<NMI> [<ffffffff812dafd8>] spin_bug+0x94/0xa8 [<ffffffff812db07f>] do_raw_spin_lock+0x43/0xea [<ffffffff814fa9be>] _raw_spin_lock_irqsave+0x6b/0x85 [<ffffffff8106ff9e>] ? migrate_disable+0x75/0x12d [<ffffffff81078aaf>] ? pin_current_cpu+0x36/0xb0 [<ffffffff8106ff9e>] migrate_disable+0x75/0x12d [<ffffffff81115b9d>] pagefault_disable+0xe/0x1f [<ffffffff81047027>] copy_from_user_nmi+0x74/0xe6 [<ffffffff810489d7>] perf_callchain_user+0xf3/0x135 Now clearly we can't go around taking locks from NMI context, cure this by short-circuiting migrate_disable() when we're in an atomic context already. Add some extra debugging to avoid things like: preempt_disable() migrate_disable(); preempt_enable(); migrate_enable(); Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1314967297.1301.14.camel@twins Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/n/tip-wbot4vsmwhi8vmbf83hsclk6@git.kernel.org
2012-03-07sched: Generic migrate_disablePeter Zijlstra
Make migrate_disable() be a preempt_disable() for !rt kernels. This allows generic code to use it but still enforces that these code sections stay relatively small. A preemptible migrate_disable() accessible for general use would allow people growing arbitrary per-cpu crap instead of clean these things up. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-275i87sl8e1jcamtchmehonm@git.kernel.org
2012-03-07migrate-disable-rt-variant.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07ftrace-migrate-disable-tracing.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07sched-migrate-disable.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07hotplug: Lightweight get online cpusThomas Gleixner
get_online_cpus() is a heavy weight function which involves a global mutex. migrate_disable() wants a simpler construct which prevents only a CPU from going doing while a task is in a migrate disabled section. Implement a per cpu lockless mechanism, which serializes only in the real unplug case on a global mutex. That serialization affects only tasks on the cpu which should be brought down. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07stomp-machine-mark-stomper-thread.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07cond-resched-lock-rt-tweak.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07sched-no-work-when-pi-blocked.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07cond-resched-softirq-fix.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07sched-might-sleep-do-not-account-rcu-depth.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07sched-rt-mutex-wakeup.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07sched-mmdrop-delayed.patchThomas Gleixner
Needs thread context (pgd_lock) -> ifdeffed. workqueues wont work with RT Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07sched-delay-put-task.patchThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-07posix-timers: thread posix-cpu-timers on -rtJohn Stultz
posix-cpu-timer code takes non -rt safe locks in hard irq context. Move it to a thread. [ 3.0 fixes from Peter Zijlstra <peterz@infradead.org> ] Signed-off-by: John Stultz <johnstul@us.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>