oom: task->mm == NULL doesn't mean the memory was freed
authorOleg Nesterov <oleg@redhat.com>
Sat, 30 Jul 2011 14:35:02 +0000 (16:35 +0200)
committerGreg Kroah-Hartman <gregkh@suse.de>
Fri, 5 Aug 2011 04:58:42 +0000 (21:58 -0700)
commit c027a474a68065391c8773f6e83ed5412657e369 upstream.

exit_mm() sets ->mm == NULL then it does mmput()->exit_mmap() which
frees the memory.

However select_bad_process() checks ->mm != NULL before TIF_MEMDIE,
so it continues to kill other tasks even if we have the oom-killed
task freeing its memory.

Change select_bad_process() to check ->mm after TIF_MEMDIE, but skip
the tasks which have already passed exit_notify() to ensure a zombie
with TIF_MEMDIE set can't block oom-killer. Alternatively we could
probably clear TIF_MEMDIE after exit_mmap().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
mm/oom_kill.c

index e4b0991ca3516b3fc3590f40da1ede9f8a4ba637..8093fc766d16fda592f42074fbe7293cc7f6892a 100644 (file)
@@ -303,7 +303,7 @@ static struct task_struct *select_bad_process(unsigned int *ppoints,
        do_each_thread(g, p) {
                unsigned int points;
 
-               if (!p->mm)
+               if (p->exit_state)
                        continue;
                if (oom_unkillable_task(p, mem, nodemask))
                        continue;
@@ -319,6 +319,8 @@ static struct task_struct *select_bad_process(unsigned int *ppoints,
                 */
                if (test_tsk_thread_flag(p, TIF_MEMDIE))
                        return ERR_PTR(-1UL);
+               if (!p->mm)
+                       continue;
 
                if (p->flags & PF_EXITING) {
                        /*