From: Robert Richter Date: Mon, 3 May 2010 17:44:32 +0000 (+0200) Subject: oprofile/x86: fix uninitialized counter usage during cpu hotplug X-Git-Tag: firefly_0821_release~9833^2~2137^2^2~6 X-Git-Url: http://demsky.eecs.uci.edu/git/?a=commitdiff_plain;h=2623a1d55a6260c855e1f6d1895900b50b40a896;p=firefly-linux-kernel-4.4.55.git oprofile/x86: fix uninitialized counter usage during cpu hotplug This fixes a NULL pointer dereference that is triggered when taking a cpu offline after oprofile was initialized, e.g.: $ opcontrol --init $ opcontrol --start-daemon $ opcontrol --shutdown $ opcontrol --deinit $ echo 0 > /sys/devices/system/cpu/cpu1/online See the crash dump below. Though the counter has been disabled the cpu notifier is still active and trying to use already freed counter data. This fix is for linux-stable. To proper fix this, the hotplug code must be rewritten. Thus I will leave a WARN_ON_ONCE() message with this patch. BUG: unable to handle kernel NULL pointer dereference at (null) IP: [] op_amd_stop+0x2d/0x8e PGD 0 Oops: 0000 [#1] SMP last sysfs file: /sys/devices/system/cpu/cpu1/online CPU 1 Modules linked in: Pid: 0, comm: swapper Not tainted 2.6.34-rc5-oprofile-x86_64-standard-00210-g8c00f06 #16 Anaheim/Anaheim RIP: 0010:[] [] op_amd_stop+0x2d/0x8e RSP: 0018:ffff880001843f28 EFLAGS: 00010006 RAX: 0000000000000000 RBX: 0000000000000000 RCX: dead000000200200 RDX: ffff880001843f68 RSI: dead000000100100 RDI: 0000000000000000 RBP: ffff880001843f48 R08: 0000000000000000 R09: ffff880001843f08 R10: ffffffff8102c9a5 R11: ffff88000184ea80 R12: 0000000000000000 R13: ffff88000184f6c0 R14: 0000000000000000 R15: 0000000000000000 FS: 00007fec6a92e6f0(0000) GS:ffff880001840000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000000 CR3: 000000000163b000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process swapper (pid: 0, threadinfo ffff88042fcd8000, task ffff88042fcd51d0) Stack: ffff880001843f48 0000000000000001 ffff88042e9f7d38 ffff880001843f68 <0> ffff880001843f58 ffffffff8132a602 ffff880001843f98 ffffffff810521b3 <0> ffff880001843f68 ffff880001843f68 ffff880001843f88 ffff88042fcd9fd8 Call Trace: [] nmi_cpu_stop+0x21/0x23 [] generic_smp_call_function_single_interrupt+0xdf/0x11b [] smp_call_function_single_interrupt+0x22/0x31 [] call_function_single_interrupt+0x13/0x20 [] ? wake_up_process+0x10/0x12 [] ? default_idle+0x22/0x37 [] c1e_idle+0xdf/0xe6 [] ? atomic_notifier_call_chain+0x13/0x15 [] cpu_idle+0x4b/0x7e [] start_secondary+0x1ae/0x1b2 Code: 89 e5 41 55 49 89 fd 41 54 45 31 e4 53 31 db 48 83 ec 08 89 df e8 be f8 ff ff 48 98 48 83 3c c5 10 67 7a 81 00 74 1f 49 8b 45 08 <42> 8b 0c 20 0f 32 48 c1 e2 20 25 ff ff bf ff 48 09 d0 48 89 c2 RIP [] op_amd_stop+0x2d/0x8e RSP CR2: 0000000000000000 ---[ end trace 679ac372d674b757 ]--- Kernel panic - not syncing: Fatal exception in interrupt Pid: 0, comm: swapper Tainted: G D 2.6.34-rc5-oprofile-x86_64-standard-00210-g8c00f06 #16 Call Trace: [] panic+0x9e/0x10c [] ? up+0x34/0x39 [] ? kmsg_dump+0x112/0x12c [] oops_end+0x81/0x8e [] no_context+0x1f3/0x202 [] __bad_area_nosemaphore+0x1ba/0x1e0 [] ? enqueue_task_fair+0x16d/0x17a [] ? activate_task+0x42/0x53 [] ? try_to_wake_up+0x272/0x284 [] bad_area_nosemaphore+0xe/0x10 [] do_page_fault+0x1c8/0x37c [] ? enqueue_task_fair+0x16d/0x17a [] page_fault+0x1f/0x30 [] ? wake_up_process+0x10/0x12 [] ? op_amd_stop+0x2d/0x8e [] ? op_amd_stop+0x1c/0x8e [] nmi_cpu_stop+0x21/0x23 [] generic_smp_call_function_single_interrupt+0xdf/0x11b [] smp_call_function_single_interrupt+0x22/0x31 [] call_function_single_interrupt+0x13/0x20 [] ? wake_up_process+0x10/0x12 [] ? default_idle+0x22/0x37 [] c1e_idle+0xdf/0xe6 [] ? atomic_notifier_call_chain+0x13/0x15 [] cpu_idle+0x4b/0x7e [] start_secondary+0x1ae/0x1b2 ------------[ cut here ]------------ WARNING: at /local/rrichter/.source/linux/arch/x86/kernel/smp.c:118 native_smp_send_reschedule+0x27/0x53() Hardware name: Anaheim Modules linked in: Pid: 0, comm: swapper Tainted: G D 2.6.34-rc5-oprofile-x86_64-standard-00210-g8c00f06 #16 Call Trace: [] ? native_smp_send_reschedule+0x27/0x53 [] warn_slowpath_common+0x77/0xa4 [] warn_slowpath_null+0xf/0x11 [] native_smp_send_reschedule+0x27/0x53 [] resched_task+0x60/0x62 [] check_preempt_curr_idle+0x10/0x12 [] try_to_wake_up+0x1f5/0x284 [] default_wake_function+0xd/0xf [] pollwake+0x57/0x5a [] ? default_wake_function+0x0/0xf [] __wake_up_common+0x46/0x75 [] __wake_up+0x38/0x50 [] printk_tick+0x39/0x3b [] update_process_times+0x3f/0x5c [] tick_periodic+0x5d/0x69 [] tick_handle_periodic+0x21/0x71 [] smp_apic_timer_interrupt+0x82/0x95 [] apic_timer_interrupt+0x13/0x20 [] ? panic_blink_one_second+0x0/0x7b [] ? panic+0x10a/0x10c [] ? up+0x34/0x39 [] ? kmsg_dump+0x112/0x12c [] ? oops_end+0x81/0x8e [] ? no_context+0x1f3/0x202 [] ? __bad_area_nosemaphore+0x1ba/0x1e0 [] ? enqueue_task_fair+0x16d/0x17a [] ? activate_task+0x42/0x53 [] ? try_to_wake_up+0x272/0x284 [] ? bad_area_nosemaphore+0xe/0x10 [] ? do_page_fault+0x1c8/0x37c [] ? enqueue_task_fair+0x16d/0x17a [] ? page_fault+0x1f/0x30 [] ? wake_up_process+0x10/0x12 [] ? op_amd_stop+0x2d/0x8e [] ? op_amd_stop+0x1c/0x8e [] ? nmi_cpu_stop+0x21/0x23 [] ? generic_smp_call_function_single_interrupt+0xdf/0x11b [] ? smp_call_function_single_interrupt+0x22/0x31 [] ? call_function_single_interrupt+0x13/0x20 [] ? wake_up_process+0x10/0x12 [] ? default_idle+0x22/0x37 [] ? c1e_idle+0xdf/0xe6 [] ? atomic_notifier_call_chain+0x13/0x15 [] ? cpu_idle+0x4b/0x7e [] ? start_secondary+0x1ae/0x1b2 ---[ end trace 679ac372d674b758 ]--- Cc: Andi Kleen Cc: stable Signed-off-by: Robert Richter --- diff --git a/arch/x86/oprofile/nmi_int.c b/arch/x86/oprofile/nmi_int.c index 9f001d904599..24582040b718 100644 --- a/arch/x86/oprofile/nmi_int.c +++ b/arch/x86/oprofile/nmi_int.c @@ -95,7 +95,10 @@ static void nmi_cpu_save_registers(struct op_msrs *msrs) static void nmi_cpu_start(void *dummy) { struct op_msrs const *msrs = &__get_cpu_var(cpu_msrs); - model->start(msrs); + if (!msrs->controls) + WARN_ON_ONCE(1); + else + model->start(msrs); } static int nmi_start(void) @@ -107,7 +110,10 @@ static int nmi_start(void) static void nmi_cpu_stop(void *dummy) { struct op_msrs const *msrs = &__get_cpu_var(cpu_msrs); - model->stop(msrs); + if (!msrs->controls) + WARN_ON_ONCE(1); + else + model->stop(msrs); } static void nmi_stop(void)