drm/amdgpu: hold reference to fences in amdgpu_sa_bo_new (v2)
authorNicolai Hähnle <nicolai.haehnle@amd.com>
Fri, 5 Feb 2016 15:59:43 +0000 (10:59 -0500)
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Thu, 3 Mar 2016 23:07:19 +0000 (15:07 -0800)
commit a8d81b36267366603771431747438d18f32ae2d5 upstream.

An arbitrary amount of time can pass between spin_unlock and
fence_wait_any_timeout, so we need to ensure that nobody frees the
fences from under us.

A stress test (rapidly starting and killing hundreds of glxgears
instances) ran into a deadlock in fence_wait_any_timeout after
about an hour, and this race condition appears to be a plausible
cause.

v2: agd: rebase on upstream

Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c

index 8b88edb0434bfb43ed4b7aae7e6f3e7dc63d0893..ca72a2e487b9b9d846b65479e9b865dd1ef71281 100644 (file)
@@ -354,12 +354,15 @@ int amdgpu_sa_bo_new(struct amdgpu_sa_manager *sa_manager,
 
                for (i = 0, count = 0; i < AMDGPU_MAX_RINGS; ++i)
                        if (fences[i])
-                               fences[count++] = fences[i];
+                               fences[count++] = fence_get(fences[i]);
 
                if (count) {
                        spin_unlock(&sa_manager->wq.lock);
                        t = fence_wait_any_timeout(fences, count, false,
                                                   MAX_SCHEDULE_TIMEOUT);
+                       for (i = 0; i < count; ++i)
+                               fence_put(fences[i]);
+
                        r = (t > 0) ? 0 : t;
                        spin_lock(&sa_manager->wq.lock);
                } else {