From 9f2bf6513a6cca0b00cbbf67ba6197017cfba548 Mon Sep 17 00:00:00 2001 From: Tejun Heo Date: Wed, 23 Mar 2011 10:37:00 +0100 Subject: ptrace: Remove the extra wake_up_state() from ptrace_detach() This wake_up_state() has a turbulent history. This is a remnant from ancient ptrace implementation and patently wrong. Commit 95a3540d (ptrace_detach: the wrong wakeup breaks the ERESTARTxxx logic) removed it but the change was reverted later by commit edaba2c5 (ptrace: revert "ptrace_detach: the wrong wakeup breaks the ERESTARTxxx logic") citing compatibility breakage and general brokeness of the whole group stop / ptrace interaction. Then, recently, it got converted from wake_up_process() to wake_up_state() to make it less dangerous. Digging through the mailing archives, the compatibility breakage doesn't seem to be critical in the sense that the behavior isn't well defined or reliable to begin with and it seems to have been agreed to remove the wakeup with proper cleanup of the whole thing. Now that the group stop and its interaction with ptrace are being cleaned up, it's high time to finally kill this silliness. Signed-off-by: Tejun Heo Acked-by: Oleg Nesterov Cc: Roland McGrath --- kernel/ptrace.c | 2 -- 1 file changed, 2 deletions(-) (limited to 'kernel/ptrace.c') diff --git a/kernel/ptrace.c b/kernel/ptrace.c index e2302e40b360..6acf8954017c 100644 --- a/kernel/ptrace.c +++ b/kernel/ptrace.c @@ -312,8 +312,6 @@ static int ptrace_detach(struct task_struct *child, unsigned int data) if (child->ptrace) { child->exit_code = data; dead = __ptrace_detach(current, child); - if (!child->exit_state) - wake_up_state(child, TASK_TRACED | TASK_STOPPED); } write_unlock_irq(&tasklist_lock); -- cgit v1.2.3 From d79fdd6d96f46fabb779d86332e3677c6f5c2a4f Mon Sep 17 00:00:00 2001 From: Tejun Heo Date: Wed, 23 Mar 2011 10:37:00 +0100 Subject: ptrace: Clean transitions between TASK_STOPPED and TRACED Currently, if the task is STOPPED on ptrace attach, it's left alone and the state is silently changed to TRACED on the next ptrace call. The behavior breaks the assumption that arch_ptrace_stop() is called before any task is poked by ptrace and is ugly in that a task manipulates the state of another task directly. With GROUP_STOP_PENDING, the transitions between TASK_STOPPED and TRACED can be made clean. The tracer can use the flag to tell the tracee to retry stop on attach and detach. On retry, the tracee will enter the desired state in the correct way. The lower 16bits of task->group_stop is used to remember the signal number which caused the last group stop. This is used while retrying for ptrace attach as the original group_exit_code could have been consumed with wait(2) by then. As the real parent may wait(2) and consume the group_exit_code anytime, the group_exit_code needs to be saved separately so that it can be used when switching from regular sleep to ptrace_stop(). This is recorded in the lower 16bits of task->group_stop. If a task is already stopped and there's no intervening SIGCONT, a ptrace request immediately following a successful PTRACE_ATTACH should always succeed even if the tracer doesn't wait(2) for attach completion; however, with this change, the tracee might still be TASK_RUNNING trying to enter TASK_TRACED which would cause the following request to fail with -ESRCH. This intermediate state is hidden from the ptracer by setting GROUP_STOP_TRAPPING on attach and making ptrace_check_attach() wait for it to clear on its signal->wait_chldexit. Completing the transition or getting killed clears TRAPPING and wakes up the tracer. Note that the STOPPED -> RUNNING -> TRACED transition is still visible to other threads which are in the same group as the ptracer and the reverse transition is visible to all. Please read the comments for details. Oleg: * Spotted a race condition where a task may retry group stop without proper bookkeeping. Fixed by redoing bookkeeping on retry. * Spotted that the transition is visible to userland in several different ways. Most are fixed with GROUP_STOP_TRAPPING. Unhandled corner case is documented. * Pointed out not setting GROUP_STOP_SIGMASK on an already stopped task would result in more consistent behavior. * Pointed out that calling ptrace_stop() from do_signal_stop() in TASK_STOPPED can race with group stop start logic and then confuse the TRAPPING wait in ptrace_check_attach(). ptrace_stop() is now called with TASK_RUNNING. * Suggested using signal->wait_chldexit instead of bit wait. * Spotted a race condition between TRACED transition and clearing of TRAPPING. Signed-off-by: Tejun Heo Acked-by: Oleg Nesterov Cc: Roland McGrath Cc: Jan Kratochvil --- kernel/ptrace.c | 49 ++++++++++++++++++++++++++++++++++++++++++++----- 1 file changed, 44 insertions(+), 5 deletions(-) (limited to 'kernel/ptrace.c') diff --git a/kernel/ptrace.c b/kernel/ptrace.c index 6acf8954017c..745fc2dd00c5 100644 --- a/kernel/ptrace.c +++ b/kernel/ptrace.c @@ -49,14 +49,22 @@ static void ptrace_untrace(struct task_struct *child) spin_lock(&child->sighand->siglock); if (task_is_traced(child)) { /* - * If the group stop is completed or in progress, - * this thread was already counted as stopped. + * If group stop is completed or in progress, it should + * participate in the group stop. Set GROUP_STOP_PENDING + * before kicking it. + * + * This involves TRACED -> RUNNING -> STOPPED transition + * which is similar to but in the opposite direction of + * what happens while attaching to a stopped task. + * However, in this direction, the intermediate RUNNING + * state is not hidden even from the current ptracer and if + * it immediately re-attaches and performs a WNOHANG + * wait(2), it may fail. */ if (child->signal->flags & SIGNAL_STOP_STOPPED || child->signal->group_stop_count) - __set_task_state(child, TASK_STOPPED); - else - signal_wake_up(child, 1); + child->group_stop |= GROUP_STOP_PENDING; + signal_wake_up(child, 1); } spin_unlock(&child->sighand->siglock); } @@ -165,6 +173,7 @@ bool ptrace_may_access(struct task_struct *task, unsigned int mode) static int ptrace_attach(struct task_struct *task) { + bool wait_trap = false; int retval; audit_ptrace(task); @@ -204,12 +213,42 @@ static int ptrace_attach(struct task_struct *task) __ptrace_link(task, current); send_sig_info(SIGSTOP, SEND_SIG_FORCED, task); + spin_lock(&task->sighand->siglock); + + /* + * If the task is already STOPPED, set GROUP_STOP_PENDING and + * TRAPPING, and kick it so that it transits to TRACED. TRAPPING + * will be cleared if the child completes the transition or any + * event which clears the group stop states happens. We'll wait + * for the transition to complete before returning from this + * function. + * + * This hides STOPPED -> RUNNING -> TRACED transition from the + * attaching thread but a different thread in the same group can + * still observe the transient RUNNING state. IOW, if another + * thread's WNOHANG wait(2) on the stopped tracee races against + * ATTACH, the wait(2) may fail due to the transient RUNNING. + * + * The following task_is_stopped() test is safe as both transitions + * in and out of STOPPED are protected by siglock. + */ + if (task_is_stopped(task)) { + task->group_stop |= GROUP_STOP_PENDING | GROUP_STOP_TRAPPING; + signal_wake_up(task, 1); + wait_trap = true; + } + + spin_unlock(&task->sighand->siglock); + retval = 0; unlock_tasklist: write_unlock_irq(&tasklist_lock); unlock_creds: mutex_unlock(&task->signal->cred_guard_mutex); out: + if (wait_trap) + wait_event(current->signal->wait_chldexit, + !(task->group_stop & GROUP_STOP_TRAPPING)); return retval; } -- cgit v1.2.3 From e3bd058f62896ec7a2c605ed62a0a811e9147947 Mon Sep 17 00:00:00 2001 From: Tejun Heo Date: Wed, 23 Mar 2011 10:37:01 +0100 Subject: ptrace: Collapse ptrace_untrace() into __ptrace_unlink() Remove the extra task_is_traced() check in __ptrace_unlink() and collapse ptrace_untrace() into __ptrace_unlink(). This is to prepare for further changes. While at it, drop the comment on top of ptrace_untrace() and convert __ptrace_unlink() comment to docbook format. Detailed comment will be added by the next patch. This patch doesn't cause any visible behavior changes. Signed-off-by: Tejun Heo Acked-by: Oleg Nesterov --- kernel/ptrace.c | 40 +++++++++++++++------------------------- 1 file changed, 15 insertions(+), 25 deletions(-) (limited to 'kernel/ptrace.c') diff --git a/kernel/ptrace.c b/kernel/ptrace.c index 745fc2dd00c5..e6098434b533 100644 --- a/kernel/ptrace.c +++ b/kernel/ptrace.c @@ -37,15 +37,23 @@ void __ptrace_link(struct task_struct *child, struct task_struct *new_parent) child->parent = new_parent; } -/* - * Turn a tracing stop into a normal stop now, since with no tracer there - * would be no way to wake it up with SIGCONT or SIGKILL. If there was a - * signal sent that would resume the child, but didn't because it was in - * TASK_TRACED, resume it now. - * Requires that irqs be disabled. +/** + * __ptrace_unlink - unlink ptracee and restore its execution state + * @child: ptracee to be unlinked + * + * Remove @child from the ptrace list, move it back to the original parent. + * + * CONTEXT: + * write_lock_irq(tasklist_lock) */ -static void ptrace_untrace(struct task_struct *child) +void __ptrace_unlink(struct task_struct *child) { + BUG_ON(!child->ptrace); + + child->ptrace = 0; + child->parent = child->real_parent; + list_del_init(&child->ptrace_entry); + spin_lock(&child->sighand->siglock); if (task_is_traced(child)) { /* @@ -69,24 +77,6 @@ static void ptrace_untrace(struct task_struct *child) spin_unlock(&child->sighand->siglock); } -/* - * unptrace a task: move it back to its original parent and - * remove it from the ptrace list. - * - * Must be called with the tasklist lock write-held. - */ -void __ptrace_unlink(struct task_struct *child) -{ - BUG_ON(!child->ptrace); - - child->ptrace = 0; - child->parent = child->real_parent; - list_del_init(&child->ptrace_entry); - - if (task_is_traced(child)) - ptrace_untrace(child); -} - /* * Check that we have indeed attached to the thing.. */ -- cgit v1.2.3 From 0e9f0a4abfd80f8adca624538d479d95159b16d8 Mon Sep 17 00:00:00 2001 From: Tejun Heo Date: Wed, 23 Mar 2011 10:37:01 +0100 Subject: ptrace: Always put ptracee into appropriate execution state Currently, __ptrace_unlink() wakes up the tracee iff it's in TASK_TRACED. For unlinking from PTRACE_DETACH, this is correct as the tracee is guaranteed to be in TASK_TRACED or dead; however, unlinking also happens when the ptracer exits and in this case the ptracee can be in any state and ptrace might be left running even if the group it belongs to is stopped. This patch updates __ptrace_unlink() such that GROUP_STOP_PENDING is reinstated regardless of the ptracee's current state as long as it's alive and makes sure that signal_wake_up() is called if execution state transition is necessary. Test case follows. #include #include #include #include #include static const struct timespec ts1s = { .tv_sec = 1 }; int main(void) { pid_t tracee; siginfo_t si; tracee = fork(); if (tracee == 0) { while (1) { nanosleep(&ts1s, NULL); write(1, ".", 1); } } ptrace(PTRACE_ATTACH, tracee, NULL, NULL); waitid(P_PID, tracee, &si, WSTOPPED); ptrace(PTRACE_CONT, tracee, NULL, (void *)(long)si.si_status); waitid(P_PID, tracee, &si, WSTOPPED); ptrace(PTRACE_CONT, tracee, NULL, (void *)(long)si.si_status); write(1, "exiting", 7); return 0; } Before the patch, after the parent process exits, the child is left running and prints out "." every second. exiting..... (continues) After the patch, the group stop initiated by the implied SIGSTOP from PTRACE_ATTACH is re-established when the parent exits. exiting Signed-off-by: Tejun Heo Reported-by: Oleg Nesterov Acked-by: Oleg Nesterov --- kernel/ptrace.c | 59 ++++++++++++++++++++++++++++++++++++++------------------- 1 file changed, 39 insertions(+), 20 deletions(-) (limited to 'kernel/ptrace.c') diff --git a/kernel/ptrace.c b/kernel/ptrace.c index e6098434b533..43485866749a 100644 --- a/kernel/ptrace.c +++ b/kernel/ptrace.c @@ -41,7 +41,26 @@ void __ptrace_link(struct task_struct *child, struct task_struct *new_parent) * __ptrace_unlink - unlink ptracee and restore its execution state * @child: ptracee to be unlinked * - * Remove @child from the ptrace list, move it back to the original parent. + * Remove @child from the ptrace list, move it back to the original parent, + * and restore the execution state so that it conforms to the group stop + * state. + * + * Unlinking can happen via two paths - explicit PTRACE_DETACH or ptracer + * exiting. For PTRACE_DETACH, unless the ptracee has been killed between + * ptrace_check_attach() and here, it's guaranteed to be in TASK_TRACED. + * If the ptracer is exiting, the ptracee can be in any state. + * + * After detach, the ptracee should be in a state which conforms to the + * group stop. If the group is stopped or in the process of stopping, the + * ptracee should be put into TASK_STOPPED; otherwise, it should be woken + * up from TASK_TRACED. + * + * If the ptracee is in TASK_TRACED and needs to be moved to TASK_STOPPED, + * it goes through TRACED -> RUNNING -> STOPPED transition which is similar + * to but in the opposite direction of what happens while attaching to a + * stopped task. However, in this direction, the intermediate RUNNING + * state is not hidden even from the current ptracer and if it immediately + * re-attaches and performs a WNOHANG wait(2), it may fail. * * CONTEXT: * write_lock_irq(tasklist_lock) @@ -55,25 +74,25 @@ void __ptrace_unlink(struct task_struct *child) list_del_init(&child->ptrace_entry); spin_lock(&child->sighand->siglock); - if (task_is_traced(child)) { - /* - * If group stop is completed or in progress, it should - * participate in the group stop. Set GROUP_STOP_PENDING - * before kicking it. - * - * This involves TRACED -> RUNNING -> STOPPED transition - * which is similar to but in the opposite direction of - * what happens while attaching to a stopped task. - * However, in this direction, the intermediate RUNNING - * state is not hidden even from the current ptracer and if - * it immediately re-attaches and performs a WNOHANG - * wait(2), it may fail. - */ - if (child->signal->flags & SIGNAL_STOP_STOPPED || - child->signal->group_stop_count) - child->group_stop |= GROUP_STOP_PENDING; - signal_wake_up(child, 1); - } + + /* + * Reinstate GROUP_STOP_PENDING if group stop is in effect and + * @child isn't dead. + */ + if (!(child->flags & PF_EXITING) && + (child->signal->flags & SIGNAL_STOP_STOPPED || + child->signal->group_stop_count)) + child->group_stop |= GROUP_STOP_PENDING; + + /* + * If transition to TASK_STOPPED is pending or in TASK_TRACED, kick + * @child in the butt. Note that @resume should be used iff @child + * is in TASK_TRACED; otherwise, we might unduly disrupt + * TASK_KILLABLE sleeps. + */ + if (child->group_stop & GROUP_STOP_PENDING || task_is_traced(child)) + signal_wake_up(child, task_is_traced(child)); + spin_unlock(&child->sighand->siglock); } -- cgit v1.2.3 From 321fb561971ba0f10ce18c0f8a4b9fbfc7cef4b9 Mon Sep 17 00:00:00 2001 From: Oleg Nesterov Date: Fri, 1 Apr 2011 20:13:01 +0200 Subject: ptrace: ptrace_check_attach() should not do s/STOPPED/TRACED/ After "ptrace: Clean transitions between TASK_STOPPED and TRACED" d79fdd6d96f46fabb779d86332e3677c6f5c2a4f, ptrace_check_attach() should never see a TASK_STOPPED tracee and s/STOPPED/TRACED/ is no longer legal. Add the warning. Note: ptrace_check_attach() can be greatly simplified, in particular it doesn't need tasklist. But I'd prefer another patch for that. Signed-off-by: Oleg Nesterov Signed-off-by: Tejun Heo --- kernel/ptrace.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) (limited to 'kernel/ptrace.c') diff --git a/kernel/ptrace.c b/kernel/ptrace.c index 43485866749a..20d5efdeee02 100644 --- a/kernel/ptrace.c +++ b/kernel/ptrace.c @@ -112,16 +112,14 @@ int ptrace_check_attach(struct task_struct *child, int kill) */ read_lock(&tasklist_lock); if ((child->ptrace & PT_PTRACED) && child->parent == current) { - ret = 0; /* * child->sighand can't be NULL, release_task() * does ptrace_unlink() before __exit_signal(). */ spin_lock_irq(&child->sighand->siglock); - if (task_is_stopped(child)) - child->state = TASK_TRACED; - else if (!task_is_traced(child) && !kill) - ret = -ESRCH; + WARN_ON_ONCE(task_is_stopped(child)); + if (task_is_traced(child) || kill) + ret = 0; spin_unlock_irq(&child->sighand->siglock); } read_unlock(&tasklist_lock); -- cgit v1.2.3