Refer to the commit
85fbd722ad0f ("libata, freezer: avoid
block device removal while system is frozen"), when system
enter suspend, it may freeze kthreads and workqueues, and
do not restart them until complete PM resume all of devices.
If we remove XHCI hcd while system is frozen, it may call
usb_disconnect() to remove a usb block device which pluged
in before, but has gone missing. Unfortunately, remove the
block device can race with the rest of device resume. Since
freezable kthreads and workqueues are thawed after all of
devices resume are completed and block device removal depends
on freezable workqueues and kthreads (e.g. bdi_wq) to make
progress, this can lead to deadlock - block device removal
can't proceed because kthreads and workqueues are frozen and
can't be restarted because device resume is blocked behind
block device removal.
This patch must be used and tested with the commit
bc68c26eff86
("CHROMIUM: usb: dwc3: rockchip: fix NULL pointer dereference
when resume"). This issue can be easily reproduced with USB-C
HUB and USB2/3 flash drive, result in the following backtrace.
[ 360.201135] INFO: task kworker/u12:3:122 blocked for more than 120 seconds.
[ 360.208094] Not tainted 4.4.21 #185
[ 360.212102] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 360.219923] kworker/u12:3 D
ffffffc000204fd8 0 122 2 0x00000000
[ 360.227007] Workqueue: events_unbound async_run_entry_fn
[ 360.232326] Call trace:
[ 360.234776] [<
ffffffc000204fd8>] __switch_to+0x9c/0xa8
[ 360.239918] [<
ffffffc000915bf4>] __schedule+0x440/0x6d8
[ 360.245139] [<
ffffffc000915f20>] schedule+0x94/0xb4
[ 360.250016] [<
ffffffc00091909c>] schedule_timeout+0x44/0x27c
[ 360.255670] [<
ffffffc000916b78>] wait_for_common+0xf8/0x198
[ 360.261237] [<
ffffffc000916c40>] wait_for_completion+0x28/0x34
[ 360.267067] [<
ffffffc0005f3f4c>] dpm_wait+0x40/0x4c
[ 360.271942] [<
ffffffc0005f4770>] device_resume+0x60/0x1a4
[ 360.277337] [<
ffffffc0005f48e4>] async_resume+0x30/0x60
[ 360.282558] [<
ffffffc000242fc4>] async_run_entry_fn+0x50/0x104
[ 360.288387] [<
ffffffc0002397f0>] process_one_work+0x240/0x424
[ 360.294128] [<
ffffffc00023a28c>] worker_thread+0x2fc/0x424
[ 360.299608] [<
ffffffc00023f5fc>] kthread+0x10c/0x114
[ 360.304570] [<
ffffffc000203dd0>] ret_from_fork+0x10/0x40
[ 360.309876] task PC stack pid father
[ 360.315789] init D
ffffffc000204fd8 0 1 0 0x00400009
...
[ 360.564124] [<
ffffffc000204fd8>] __switch_to+0x9c/0xa8
[ 360.569259] [<
ffffffc000915bf4>] __schedule+0x440/0x6d8
[ 360.574481] [<
ffffffc000915f20>] schedule+0x94/0xb4
[ 360.579355] [<
ffffffc00091909c>] schedule_timeout+0x44/0x27c
[ 360.585010] [<
ffffffc000916b78>] wait_for_common+0xf8/0x198
[ 360.590580] [<
ffffffc000916c40>] wait_for_completion+0x28/0x34
[ 360.596408] [<
ffffffc000239270>] flush_work+0x168/0x1a4
[ 360.601629] [<
ffffffc0002395a4>] flush_delayed_work+0x44/0x50
[ 360.607371] [<
ffffffc000322f48>] bdi_unregister+0xa8/0xfc
[ 360.612766] [<
ffffffc00049afdc>] blk_cleanup_queue+0xf4/0x10c
[ 360.618508] [<
ffffffc000625d7c>] __scsi_remove_device+0x80/0xc8
[ 360.624423] [<
ffffffc000623dec>] scsi_forget_host+0x5c/0x74
[ 360.629991] [<
ffffffc000619a98>] scsi_remove_host+0x90/0x110
[ 360.635646] [<
ffffffc000692940>] usb_stor_disconnect+0x78/0xec
[ 360.641474] [<
ffffffc0006545e4>] usb_unbind_interface+0xa0/0x1f8
[ 360.647477] [<
ffffffc0005e70cc>] __device_release_driver+0xb4/0x114
[ 360.653746] [<
ffffffc0005e7158>] device_release_driver+0x2c/0x40
[ 360.659748] [<
ffffffc0005e61f8>] bus_remove_device+0x110/0x128
[ 360.665575] [<
ffffffc0005e3178>] device_del+0x164/0x1f4
[ 360.670797] [<
ffffffc000652094>] usb_disable_device+0x94/0x1c8
[ 360.676625] [<
ffffffc000649b74>] usb_disconnect+0x9c/0x1d0
[ 360.682106] [<
ffffffc000649b60>] usb_disconnect+0x88/0x1d0
[ 360.687587] [<
ffffffc00064e0e4>] usb_remove_hcd+0xc8/0x1e0
[ 360.693068] [<
ffffffc000664b4c>] dwc3_rockchip_otg_extcon_evt_work+0x14c/0x198
[ 360.700284] [<
ffffffc0002397f0>] process_one_work+0x240/0x424
[ 360.706026] [<
ffffffc00023a28c>] worker_thread+0x2fc/0x424
[ 360.711506] [<
ffffffc00023f5fc>] kthread+0x10c/0x114
[ 360.716467] [<
ffffffc000203dd0>] ret_from_fork+0x10/0x40
BUG=chrome-os-partner:58705, chrome-os-partner:59103
TEST=Plug in USB-C HUB and USB2/3 flash drive, then set
system to enter S3. After system suspend, plug out the
USB-C HUB first, and then press keyboard or power key to
check if system can wakeup successfully.
Change-Id: I6cb8ea1a4399b9b69b522ec0ed5f0f7810118850
Signed-off-by: William wu <wulf@rock-chips.com>
Reviewed-on: https://chromium-review.googlesource.com/408499
Commit-Ready: Guenter Roeck <groeck@chromium.org>
Tested-by: Guenter Roeck <groeck@chromium.org>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
Reviewed-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: William Wu <wulf@rock-chips.com>