
From: Roland McGrath <roland@redhat.com>

Klaus Dittrich observed this bug and posted a test case for it.  This patch
fixes that failure mode and some other possible ones.  What Klaus saw was
a false negative (i.e. ECHILD when there was a child) when the group
leader was a zombie whose reaping was delayed because other threads in its
group were still alive; in the test program this happens in a race between
the two threads dying on a signal.  The change to the TASK_TRACED case
avoids a potential false positive (blocking, or WNOHANG returning 0, when
there are really no children left) in the race condition where
my_ptrace_child returns zero.
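
For reference, the user-visible contract that do_wait is expected to honour
can be sketched from user space as below.  This is only an illustration and
is not Klaus's test case: it does not reproduce the threaded race, and the
helper name check_wait_semantics is hypothetical.  It checks that WNOHANG
returns 0 while a child is still alive and that ECHILD appears only once
the child has been reaped.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): returns 0 if waitpid()
 * follows the expected contract, or a positive step number identifying
 * the first check that failed.
 */
int check_wait_semantics(void)
{
	int status;
	pid_t child;

	/* Make sure exited children become zombies rather than auto-reaped. */
	signal(SIGCHLD, SIG_DFL);

	child = fork();
	if (child < 0)
		return 1;
	if (child == 0) {
		/* Child: sleep until the parent kills it. */
		for (;;)
			pause();
	}

	/* The child is alive: WNOHANG must return 0, not fail with ECHILD. */
	if (waitpid(child, &status, WNOHANG) != 0)
		return 2;

	/* Kill the child and reap it with a blocking wait. */
	if (kill(child, SIGKILL) != 0)
		return 3;
	if (waitpid(child, &status, 0) != child)
		return 4;
	if (!WIFSIGNALED(status) || WTERMSIG(status) != SIGKILL)
		return 5;

	/* The child has been reaped: only now is ECHILD the right answer. */
	errno = 0;
	if (waitpid(child, &status, WNOHANG) != -1 || errno != ECHILD)
		return 6;

	return 0;
}
```

The false negative described above corresponds to step 2 failing with
ECHILD; the false positive corresponds to step 6 returning 0 or blocking.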

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
---

 25-akpm/kernel/exit.c |   15 +++++++++++++--
 1 files changed, 13 insertions(+), 2 deletions(-)

diff -puN kernel/exit.c~fix-bogus-echild-return-from-wait-with-zombie-group-leader kernel/exit.c
--- 25/kernel/exit.c~fix-bogus-echild-return-from-wait-with-zombie-group-leader	2004-12-06 13:55:07.985260472 -0800
+++ 25-akpm/kernel/exit.c	2004-12-06 13:55:07.989259864 -0800
@@ -1322,6 +1322,10 @@ static long do_wait(pid_t pid, int optio
 
 	add_wait_queue(&current->wait_chldexit,&wait);
 repeat:
+	/*
+	 * We will set this flag if we see any child that might later
+	 * match our criteria, even if we are not able to reap it yet.
+	 */
 	flag = 0;
 	current->state = TASK_INTERRUPTIBLE;
 	read_lock(&tasklist_lock);
@@ -1340,11 +1344,14 @@ repeat:
 
 			switch (p->state) {
 			case TASK_TRACED:
-				flag = 1;
 				if (!my_ptrace_child(p))
 					continue;
 				/*FALLTHROUGH*/
 			case TASK_STOPPED:
+				/*
+				 * It's stopped now, so it might later
+				 * continue, exit, or stop again.
+				 */
 				flag = 1;
 				if (!(options & WUNTRACED) &&
 				    !my_ptrace_child(p))
@@ -1380,8 +1387,12 @@ repeat:
 						goto end;
 					break;
 				}
-				flag = 1;
 check_continued:
+				/*
+				 * It's running now, so it might later
+				 * exit, stop, or stop and then continue.
+				 */
+				flag = 1;
 				if (!unlikely(options & WCONTINUED))
 					continue;
 				retval = wait_task_continued(
_
