IB/ipath: Convert ipath_user_sdma_pin_pages() to use get_user_pages_fast()
authorJan Kara <jack@suse.cz>
Fri, 4 Oct 2013 13:29:06 +0000 (09:29 -0400)
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Wed, 4 Dec 2013 18:56:16 +0000 (10:56 -0800)
commit 4adcf7fb6783e354aab38824d803fa8c4f8e8a27 upstream.

ipath_user_sdma_queue_pkts() gets called with mmap_sem held for
writing.  Except for get_user_pages() deep down in
ipath_user_sdma_pin_pages() we don't seem to need mmap_sem at all.

Even more interestingly the function ipath_user_sdma_queue_pkts() (and
also ipath_user_sdma_coalesce() called somewhat later) call
copy_from_user() which can hit a page fault and we deadlock on trying
to get mmap_sem when handling that fault.  So just make
ipath_user_sdma_pin_pages() use get_user_pages_fast() and leave
mmap_sem locking for mm.

This deadlock has actually been observed in the wild when the node
is under memory pressure.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
[ Merged in fix for call to get_user_pages_fast from Tetsuo Handa
  <penguin-kernel@I-love.SAKURA.ne.jp>.  - Roland ]
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
drivers/infiniband/hw/ipath/ipath_user_sdma.c

index f5cb13b2144566229c03be75e31623cb19942d45..cc04b7ba34882cc7981ec3815af73dc7a0519d91 100644 (file)
@@ -280,9 +280,7 @@ static int ipath_user_sdma_pin_pages(const struct ipath_devdata *dd,
        int j;
        int ret;
 
-       ret = get_user_pages(current, current->mm, addr,
-                            npages, 0, 1, pages, NULL);
-
+       ret = get_user_pages_fast(addr, npages, 0, pages);
        if (ret != npages) {
                int i;
 
@@ -811,10 +809,7 @@ int ipath_user_sdma_writev(struct ipath_devdata *dd,
        while (dim) {
                const int mxp = 8;
 
-               down_write(&current->mm->mmap_sem);
                ret = ipath_user_sdma_queue_pkts(dd, pq, &list, iov, dim, mxp);
-               up_write(&current->mm->mmap_sem);
-
                if (ret <= 0)
                        goto done_unlock;
                else {