From 75118a82e21cafb4a82b53bb85d1c7689787e046 Mon Sep 17 00:00:00 2001 From: Suresh Siddha Date: Fri, 13 Jun 2008 15:47:12 -0700 Subject: x86: fix NULL pointer deref in __switch_to Patrick McHardy reported a crash: > > I get this oops once a day, its apparently triggered by something > > run by cron, but the process is a different one each time. > > > > Kernel is -git from yesterday shortly before the -rc6 release > > (last commit is the usb-2.6 merge, the x86 patches are missing), > > .config is attached. > > > > I'll retry with current -git, but the patches that have gone in > > since I last updated don't look related. > > > > [62060.043009] BUG: unable to handle kernel NULL pointer dereference at > > 000001ff > > [62060.043009] IP: [] __switch_to+0x2f/0x118 > > [62060.043009] *pde = 00000000 > > [62060.043009] Oops: 0002 [#1] PREEMPT Vegard Nossum analyzed it: > This decodes to > > 0: 0f ae 00 fxsave (%eax) > > so it's related to the floating-point context. This is the exact > location of the crash: > > $ addr2line -e arch/x86/kernel/process_32.o -i ab0 > include/asm/i387.h:232 > include/asm/i387.h:262 > arch/x86/kernel/process_32.c:595 > > ...so it looks like prev_task->thread.xstate->fxsave has become NULL. > Or maybe it never had any other value. Somehow (as described below) TS_USEDFPU is set but the fpu is not allocated or freed. Another possible FPU pre-emption issue with the sleazy FPU optimization which was benign before but not so anymore, with the dynamic FPU allocation patch. New task is getting exec'd and it is prempted at the below point. flush_thread() { ... /* * Forget coprocessor state.. */ clear_fpu(tsk); <----- Preemption point clear_used_math(); ... } Now when it context switches in again, as the used_math() is still set and fpu_counter can be > 5, we will do a math_state_restore() which sets the task's TS_USEDFPU. After it continues from the above preemption point it does clear_used_math() and much later free_thread_xstate(). Now, at the next context switch, it is quite possible that xstate is null, used_math() is not set and TS_USEDFPU is still set. This will trigger unlazy_fpu() causing kernel oops. Fix this by clearing tsk's fpu_counter before clearing task's fpu. Reported-by: Patrick McHardy Signed-off-by: Suresh Siddha Signed-off-by: Ingo Molnar --- arch/x86/kernel/process_32.c | 1 + arch/x86/kernel/process_64.c | 1 + 2 files changed, 2 insertions(+) (limited to 'arch') diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c index 6d5483356e74..e2db9ac5c61c 100644 --- a/arch/x86/kernel/process_32.c +++ b/arch/x86/kernel/process_32.c @@ -333,6 +333,7 @@ void flush_thread(void) /* * Forget coprocessor state.. */ + tsk->fpu_counter = 0; clear_fpu(tsk); clear_used_math(); } diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c index ac54ff56df80..c6eb5c91e5f6 100644 --- a/arch/x86/kernel/process_64.c +++ b/arch/x86/kernel/process_64.c @@ -294,6 +294,7 @@ void flush_thread(void) /* * Forget coprocessor state.. */ + tsk->fpu_counter = 0; clear_fpu(tsk); clear_used_math(); } -- cgit v1.2.3 From df17b1d990fc214f033c5588e58216ec941591e0 Mon Sep 17 00:00:00 2001 From: Mikael Pettersson Date: Sun, 15 Jun 2008 02:19:56 +0200 Subject: x86, 32-bit: fix boot failure on TSC-less processors Booting 2.6.26-rc6 on my 486 DX/4 fails with a "BUG: Int 6" (invalid opcode) and a kernel halt immediately after the kernel has been uncompressed. The BUG shows EIP pointing to an rdtsc instruction in native_read_tsc(), invoked from native_sched_clock(). (This error occurs so early that not even the serial console can capture it.) A bisection showed that this bug first occurs in 2.6.26-rc3-git7, via commit 9ccc906c97e34fd91dc6aaf5b69b52d824386910: >x86: distangle user disabled TSC from unstable > >tsc_enabled is set to 0 from the command line switch "notsc" and from >the mark_tsc_unstable code. Seperate those functionalities and replace >tsc_enable with tsc_disable. This makes also the native_sched_clock() >decision when to use TSC understandable. > >Preparatory patch to solve the sched_clock() issue on 32 bit. > >Signed-off-by: Thomas Gleixner The core reason for this bug is that native_sched_clock() gets called before tsc_init(). Before the commit above, tsc_32.c used a "tsc_enabled" variable which defaulted to 0 == disabled, and which only got enabled late in tsc_init(). Thus early calls to native_sched_clock() would skip the TSC and use jiffies instead. After the commit above, tsc_32.c uses a "tsc_disabled" variable which defaults to 0, meaning that the TSC is Ok to use. Early calls to native_sched_clock() now erroneously try to use the TSC on !cpu_has_tsc processors, leading to invalid opcode exceptions. My proposed fix is to initialise tsc_disabled to a "soft disabled" state distinct from the hard disabled state set up by the "notsc" kernel option. This fixes the native_sched_clock() problem. It also allows tsc_init() to be simplified: instead of setting tsc_disabled = 1 on every error return, we just set tsc_disabled = 0 once when all checks have succeeded. I've verified that this lets my 486 boot again. I've also verified that a Core2 machine still uses the TSC as clocksource after the patch. Signed-off-by: Mikael Pettersson Signed-off-by: Ingo Molnar --- arch/x86/kernel/tsc_32.c | 18 ++++++++---------- 1 file changed, 8 insertions(+), 10 deletions(-) (limited to 'arch') diff --git a/arch/x86/kernel/tsc_32.c b/arch/x86/kernel/tsc_32.c index 068759db63dd..65b70637ad97 100644 --- a/arch/x86/kernel/tsc_32.c +++ b/arch/x86/kernel/tsc_32.c @@ -14,7 +14,10 @@ #include "mach_timer.h" -static int tsc_disabled; +/* native_sched_clock() is called before tsc_init(), so + we must start with the TSC soft disabled to prevent + erroneous rdtsc usage on !cpu_has_tsc processors */ +static int tsc_disabled = -1; /* * On some systems the TSC frequency does not @@ -402,25 +405,20 @@ void __init tsc_init(void) { int cpu; - if (!cpu_has_tsc || tsc_disabled) { - /* Disable the TSC in case of !cpu_has_tsc */ - tsc_disabled = 1; + if (!cpu_has_tsc || tsc_disabled > 0) return; - } cpu_khz = calculate_cpu_khz(); tsc_khz = cpu_khz; if (!cpu_khz) { mark_tsc_unstable("could not calculate TSC khz"); - /* - * We need to disable the TSC completely in this case - * to prevent sched_clock() from using it. - */ - tsc_disabled = 1; return; } + /* now allow native_sched_clock() to use rdtsc */ + tsc_disabled = 0; + printk("Detected %lu.%03lu MHz processor.\n", (unsigned long)cpu_khz / 1000, (unsigned long)cpu_khz % 1000); -- cgit v1.2.3 From d3942cff620bea073fc4e3c8ed878eb1e84615ce Mon Sep 17 00:00:00 2001 From: Bernhard Walle Date: Sun, 8 Jun 2008 16:16:07 +0200 Subject: x86: use BOOTMEM_EXCLUSIVE on 32-bit This patch uses the BOOTMEM_EXCLUSIVE for crashkernel reservation also for i386 and prints a error message on failure. The patch is still for 2.6.26 since it is only bug fixing. The unification of reserve_crashkernel() between i386 and x86_64 should be done for 2.6.27. Signed-off-by: Bernhard Walle Signed-off-by: Ingo Molnar Cc: --- arch/x86/kernel/setup_32.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) (limited to 'arch') diff --git a/arch/x86/kernel/setup_32.c b/arch/x86/kernel/setup_32.c index 2c5f8b213e86..5a2f8e063887 100644 --- a/arch/x86/kernel/setup_32.c +++ b/arch/x86/kernel/setup_32.c @@ -532,10 +532,16 @@ static void __init reserve_crashkernel(void) (unsigned long)(crash_size >> 20), (unsigned long)(crash_base >> 20), (unsigned long)(total_mem >> 20)); + + if (reserve_bootmem(crash_base, crash_size, + BOOTMEM_EXCLUSIVE) < 0) { + printk(KERN_INFO "crashkernel reservation " + "failed - memory is in use\n"); + return; + } + crashk_res.start = crash_base; crashk_res.end = crash_base + crash_size - 1; - reserve_bootmem(crash_base, crash_size, - BOOTMEM_DEFAULT); } else printk(KERN_INFO "crashkernel reservation failed - " "you have to specify a base address\n"); -- cgit v1.2.3 From ffe6e1da86d21d7855495b5a772c93f050258f6e Mon Sep 17 00:00:00 2001 From: Jordan Crouse Date: Wed, 18 Jun 2008 11:34:38 -0600 Subject: x86, geode: add a VSA2 ID for General Software General Software writes their own VSA2 module for their version of the Geode BIOS, which returns a different ID then the standard VSA2. This was causing the framebuffer driver to break for most GSW boards. Signed-off-by: Jordan Crouse Cc: tglx@linutronix.de Cc: linux-geode@lists.infradead.org Signed-off-by: Ingo Molnar --- arch/x86/kernel/geode_32.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) (limited to 'arch') diff --git a/arch/x86/kernel/geode_32.c b/arch/x86/kernel/geode_32.c index e8edd63ab000..9b08e852fd1a 100644 --- a/arch/x86/kernel/geode_32.c +++ b/arch/x86/kernel/geode_32.c @@ -166,6 +166,8 @@ int geode_has_vsa2(void) static int has_vsa2 = -1; if (has_vsa2 == -1) { + u16 val; + /* * The VSA has virtual registers that we can query for a * signature. @@ -173,7 +175,8 @@ int geode_has_vsa2(void) outw(VSA_VR_UNLOCK, VSA_VRC_INDEX); outw(VSA_VR_SIGNATURE, VSA_VRC_INDEX); - has_vsa2 = (inw(VSA_VRC_DATA) == VSA_SIG); + val = inw(VSA_VRC_DATA); + has_vsa2 = (val == AMD_VSA_SIG || val == GSW_VSA_SIG); } return has_vsa2; -- cgit v1.2.3