summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorThomas Gleixner <tglx@linutronix.de>2014-02-07 20:58:41 +0100
committerZefan Li <lizefan@huawei.com>2015-09-18 09:20:47 +0800
commitaaedb09057b05c7c9e213dc465bff5f70e708535 (patch)
tree2f2a0c645f970d5f6390ca39ee2e1fe0f2eea790
parenta39bf4a8e29c7336c0c72652b7d0dd1cd1b13c51 (diff)
sched: Queue RT tasks to head when prio drops
commit 81a44c5441d7f7d2c3dc9105f4d65ad0d5818617 upstream. The following scenario does not work correctly: Runqueue of CPUx contains two runnable and pinned tasks: T1: SCHED_FIFO, prio 80 T2: SCHED_FIFO, prio 80 T1 is on the cpu and executes the following syscalls (classic priority ceiling scenario): sys_sched_setscheduler(pid(T1), SCHED_FIFO, .prio = 90); ... sys_sched_setscheduler(pid(T1), SCHED_FIFO, .prio = 80); ... Now T1 gets preempted by T3 (SCHED_FIFO, prio 95). After T3 goes back to sleep the scheduler picks T2. Surprise! The same happens w/o actual preemption when T1 is forced into the scheduler due to a sporadic NEED_RESCHED event. The scheduler invokes pick_next_task() which returns T2. So T1 gets preempted and scheduled out. This happens because sched_setscheduler() dequeues T1 from the prio 90 list and then enqueues it on the tail of the prio 80 list behind T2. This violates the POSIX spec and surprises user space which relies on the guarantee that SCHED_FIFO tasks are not scheduled out unless they give the CPU up voluntarily or are preempted by a higher priority task. In the latter case the preempted task must get back on the CPU after the preempting task schedules out again. We fixed a similar issue already in commit 60db48c (sched: Queue a deboosted task to the head of the RT prio queue). The same treatment is necessary for sched_setscheduler(). So enqueue to head of the prio bucket list if the priority of the task is lowered. It might be possible that existing user space relies on the current behaviour, but it can be considered highly unlikely due to the corner case nature of the application scenario. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1391803122-4425-6-git-send-email-bigeasy@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Zefan Li <lizefan@huawei.com>
-rw-r--r--kernel/sched/core.c9
1 files changed, 7 insertions, 2 deletions
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 2f8363e0a1ec..15be43522c8b 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4396,8 +4396,13 @@ recheck:
if (running)
p->sched_class->set_curr_task(rq);
- if (on_rq)
- enqueue_task(rq, p, 0);
+ if (on_rq) {
+ /*
+ * We enqueue to tail when the priority of a task is
+ * increased (user space view).
+ */
+ enqueue_task(rq, p, oldprio <= p->prio ? ENQUEUE_HEAD : 0);
+ }
check_class_changed(rq, p, prev_class, oldprio);
task_rq_unlock(rq, p, &flags);