path: root/arch
2010-05-12MIPS: Sibyte: Apply M3 workaround only on affected chip types and versions.Ralf Baechle
(cherry picked from commit e65c7f33d75e977350ca350573d93c517ec02776) Previously it was unconditionally used on all Sibyte family SoCs. The M3 bug has to be handled in the TLB exception handler, which is extremely performance sensitive, so this modification is expected to deliver around a 2-3% performance improvement. This is important as required changes to the M3 workaround will make it more costly. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-12pxa/colibri: fix missing #include <mach/mfp.h> in colibri.hJakob Viketoft
commit ccb8d8d070b8f25f0163da5c9ceacf63a5169540 upstream. The use of mfp_cfg_t causes build errors without including <mach/mfp.h>. CC: Daniel Mack <daniel@caiaq.de> Signed-off-by: Jakob Viketoft <jakob.viketoft@bitsim.com> Signed-off-by: Eric Miao <eric.y.miao@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-12parisc: Set PCI CLS early in boot.Carlos O'Donell
commit 5fd4514bb351b5ecb0da3692fff70741e5ed200c upstream. Set the PCI CLS early in the boot process to prevent device failures. In pcibios_set_master use the new pci_cache_line_size instead of a hard-coded value. Signed-off-by: Carlos O'Donell <carlos@codesourcery.com> Reviewed-by: Grant Grundler <grundler@google.com> Signed-off-by: Kyle McMartin <kyle@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-12kgdb: don't needlessly skip PAGE_USER test for Fsl bookeWufei
commit 56151e753468e34aeb322af4b0309ab727c97d2e upstream. The bypassing of this test is a leftover from 2.4 vintage kernels, and is no longer appropriate, or even used by KGDB. Currently KGDB uses probe_kernel_write() for all access to memory via the KGDB core, so it can simply be deleted. This fixes CVE-2010-1446. CC: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: Paul Mackerras <paulus@samba.org> CC: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Wufei <fei.wu@windriver.com> Signed-off-by: Jason Wessel <jason.wessel@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-12KVM: remove unused load_segment_descriptor_to_kvm_desctMarcelo Tosatti
Commit 78ce64a384 in v2.6.32.12 introduced a warning due to the unused load_segment_descriptor_to_kvm_desct helper, which that commit open-coded. Upstream, the helper was removed as part of a different commit. Remove the now unused function. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-12ACPI: introduce kernel parameter acpi_sleep=sci_force_enableZhang Rui
commit d7f0eea9e431e1b8b0742a74db1a9490730b2a25 upstream. Introduce the kernel parameter acpi_sleep=sci_force_enable. Some laptops require SCI_EN to be set directly on resume, or else they hang somewhere in the resume code path. We already have a blacklist for these laptops, but we still need this option, especially when debugging suspend/resume problems, in case there are systems that need this workaround and are not yet in the blacklist. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
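For reference, the workaround is enabled from the boot loader's kernel command line, e.g.:

    acpi_sleep=sci_force_enable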
2010-05-12x86, AMD: Fix stale cpuid4_info shared_map data in shared_cpu_map cpumasksPrarit Bhargava
commit ebb682f522411abbe358059a256a8672ec0bd55b upstream. The per_cpu cpuid4_info shared_map can contain stale data when CPUs are added and removed. The stale data can lead to a NULL pointer dereference panic on a remove of a CPU that has had siblings previously removed. This patch resolves the panic by verifying a cpu is actually online before adding it to the shared_cpu_map, only examining cpus that are part of the same lower level cache, and by updating other siblings' lowest level cache maps when a cpu is added. Signed-off-by: Prarit Bhargava <prarit@redhat.com> LKML-Reference: <20091209183336.17855.98708.sendpatchset@prarit.bos.redhat.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-12x86, k8 nb: Fix boot crash: enable k8_northbridges unconditionally on AMD systemsBorislav Petkov
commit 0e152cd7c16832bd5cadee0c2e41d9959bc9b6f9 upstream. de957628ce7c84764ff41331111036b3ae5bad0f changed setting of the x86_init.iommu.iommu_init function ptr only when GART IOMMU is found. One side effect of it is that num_k8_northbridges is not initialized anymore if not explicitly called. This resulted in uninitialized pointers in <arch/x86/kernel/cpu/intel_cacheinfo.c:amd_calc_l3_indices()>, for example, which uses the num_k8_northbridges thing through node_to_k8_nb_misc(). Fix that through an initcall that runs right after the PCI subsystem and does all the scanning. Then, remove initialization in gart_iommu_init() which is a rootfs_initcall and we're running before that. What is more, since num_k8_northbridges is being used in other places beside GART IOMMU, include it whenever we add AMD CPU support. The previous dependency chain in kconfig contained K8_NB depends on AGP_AMD64|GART_IOMMU which was clearly incorrect. The more natural way in terms of hardware dependency should be AGP_AMD64|GART_IOMMU depends on K8_NB depends on CPU_SUP_AMD && PCI. Make it so Number One! Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Joerg Roedel <joerg.roedel@amd.com> LKML-Reference: <20100312144303.GA29262@aftab> Signed-off-by: Ingo Molnar <mingo@elte.hu> Tested-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-12x86: Disable large pages on CPUs with Atom erratum AAE44H. Peter Anvin
commit 7a0fc404ae663776e96db43879a0fa24fec1fa3a upstream. Atom erratum AAE44/AAF40/AAG38/AAH41: "If software clears the PS (page size) bit in a present PDE (page directory entry), that will cause linear addresses mapped through this PDE to use 4-KByte pages instead of using a large page after old TLB entries are invalidated. Due to this erratum, if a code fetch uses this PDE before the TLB entry for the large page is invalidated then it may fetch from a different physical address than specified by either the old large page translation or the new 4-KByte page translation. This erratum may also cause speculative code fetches from incorrect addresses." [http://download.intel.com/design/processor/specupdt/319536.pdf] Whereas commit 211b3d03c7400f48a781977a50104c9d12f4e229 seems to work around erratum AAH41 (mixed 4K TLBs), it only reduces the window of opportunity for the bug to occur and does not remove it entirely. This patch disables mixed 4K/4MB page tables altogether, avoiding the page splitting and not tripping this processor issue. This is based on an original patch by Colin King. Originally-by: Colin Ian King <colin.king@canonical.com> Cc: Colin Ian King <colin.king@canonical.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> LKML-Reference: <1269271251-19775-1-git-send-email-colin.king@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-12x86-64: Clear a 64-bit FS/GS base on fork if selector is nonzeroH. Peter Anvin
commit 7ce5a2b9bb2e92902230e3121d8c3047fab9cb47 upstream. When we do a thread switch, we clear the outgoing FS/GS base if the corresponding selector is nonzero. This is taken by __switch_to() as an entry invariant; it does not verify that it is true on entry. However, copy_thread() doesn't enforce this constraint, which can result in inconsistent results after fork(). Make copy_thread() match the behavior of __switch_to(). Reported-and-tested-by: Samuel Thibault <samuel.thibault@inria.fr> Signed-off-by: H. Peter Anvin <hpa@zytor.com> LKML-Reference: <4BD1E061.8030605@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
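A sketch of the idea, following the upstream diff against 2.6.32-era arch/x86/kernel/process_64.c (field names illustrative): copy_thread() keeps the saved FS/GS base only when the corresponding selector is zero, matching the invariant __switch_to() assumes.

    /* in copy_thread(): a nonzero selector implies a zero saved base */
    savesegment(gs, p->thread.gsindex);
    p->thread.gs = p->thread.gsindex ? 0 : me->thread.gs;
    savesegment(fs, p->thread.fsindex);
    p->thread.fs = p->thread.fsindex ? 0 : me->thread.fs;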
2010-05-12powernow-k8: Fix frequency reportingMark Langsdorf
commit b810e94c9d8e3fff6741b66cd5a6f099a7887871 upstream. With F10, model 10, all valid frequencies are in the ACPI _PST table. Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com> LKML-Reference: <1270065406-1814-6-git-send-email-bp@amd64.org> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Reviewed-by: Thomas Renninger <trenn@suse.de> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-12core, x86: make LIST_POISON less deadlyAvi Kivity
commit a29815a333c6c6e677294bbe5958e771d0aad3fd upstream. The list macros use LIST_POISON1 and LIST_POISON2 as non-dereferenceable pointers in order to trap erroneous use of freed list_heads. Unfortunately userspace can arrange for those pointers to actually be dereferenceable, potentially turning an oops into an exploit. To avoid this, allow architectures (currently x86_64 only) to override the default values for these pointers with truly non-dereferenceable values. This is easy on x86_64 as the virtual address space is large and contains areas that cannot be mapped. Other 64-bit architectures will likely find similar unmapped ranges. [ingo: switch to 0xdead000000000000 as the unmapped area] [ingo: add comments, cleanup] [jaswinder: eliminate sparse warnings] Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
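The mechanism, roughly as it appears in include/linux/poison.h after this change (the 0xdead000000000000 delta being the x86_64 override mentioned above):

    #ifdef CONFIG_ILLEGAL_POINTER_VALUE
    # define POISON_POINTER_DELTA _AC(CONFIG_ILLEGAL_POINTER_VALUE, UL)
    #else
    # define POISON_POINTER_DELTA 0
    #endif

    /* The chosen delta lands in a hole of the x86_64 virtual address
     * space, so userspace can never map the poisoned addresses. */
    #define LIST_POISON1  ((void *) 0x00100100 + POISON_POINTER_DELTA)
    #define LIST_POISON2  ((void *) 0x00200200 + POISON_POINTER_DELTA)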
2010-04-26x86/gart: Disable GART explicitly before initializationJoerg Roedel
commit 4b83873d3da0704987cb116833818ed96214ee29 upstream. If we boot into a crash kernel, the GART might still be enabled and its caches might be dirty. This can result in undefined behavior later. Fix it by explicitly disabling the GART hardware before initialization and flushing the caches after enablement. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-26KVM: x86: Fix TSS size check for 16-bit tasksJan Kiszka
(Cherry-picked from commit e8861cfe2c75bdce36655b64d7ce02c2b31b604d) A 16-bit TSS is only 44 bytes long. So make sure to test for the correct size on task switch. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
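The size test distinguishes the two TSS formats via the descriptor type bit; a sketch of the check (identifiers per the upstream fix, treat as illustrative):

    desc_limit = get_desc_limit(&nseg_desc);
    if (!nseg_desc.p ||
        ((desc_limit < 0x67 && (nseg_desc.type & 8)) || /* 32-bit TSS: 104 bytes */
         desc_limit < 0x2b)) {                          /* 16-bit TSS:  44 bytes */
            kvm_queue_exception_e(vcpu, TS_VECTOR, tss_selector & 0xfffc);
            return 1;
    }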
2010-04-26KVM: fix the handling of dirty bitmaps to avoid overflowsTakuya Yoshikawa
(Cherry-picked from commit 87bf6e7de1134f48681fd2ce4b7c1ec45458cb6d) Int is not long enough to store the size of a dirty bitmap. This patch fixes this problem with the introduction of a wrapper function to calculate the sizes of dirty bitmaps. Note: in mark_page_dirty(), we have to consider the fact that __set_bit() takes the offset as int, not long. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
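The wrapper is essentially one line, doing the arithmetic in unsigned long instead of int:

    /* bytes needed for a slot's dirty bitmap; on large slots this
     * computation would overflow an int */
    static inline unsigned long kvm_dirty_bitmap_bytes(struct kvm_memory_slot *memslot)
    {
            return ALIGN(memslot->npages, BITS_PER_LONG) / 8;
    }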
2010-04-26KVM: MMU: fix kvm_mmu_zap_page() and its calling pathXiao Guangrong
(Cherry-picked from commit 77662e0028c7c63e34257fda03ff9625c59d939d) This patch fixes: - calculating the zapped page count properly in mmu_zap_unsync_children() - calculating the freed page count properly in kvm_mmu_change_mmu_pages() - restarting the hlist walk if a child page was zapped KVM-Stable-Tag. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-26KVM: VMX: Save/restore rflags.vm correctly in real modeAvi Kivity
(Cherry-picked from commit 78ac8b47c566dd6177a3b9b291b756ccb70670b7) Currently we set eflags.vm unconditionally when entering real mode emulation through virtual-8086 mode, and clear it unconditionally when we enter protected mode. This means that the following sequence KVM_SET_REGS (rflags.vm=1) KVM_SET_SREGS (cr0.pe=1) ends up with rflags.vm clear due to KVM_SET_SREGS triggering enter_pmode(). Fix by shadowing rflags.vm (and rflags.iopl) correctly while in real mode: reads and writes to those bits access a shadow register instead of the actual register. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-26KVM: allow bit 10 to be cleared in MSR_IA32_MC4_CTLAndre Przywara
(Cherry-picked from commit 114be429c8cd44e57f312af2bbd6734e5a185b0d) There is a quirk for AMD K8 CPUs in many Linux kernels (see arch/x86/kernel/cpu/mcheck/mce.c:__mcheck_cpu_apply_quirks()) that clears bit 10 in that MCE-related MSR. KVM can only cope with all zeros or all ones, so it will inject a #GP into the guest, which will make it panic. So let's add a quirk to the quirk and ignore this single cleared bit. This fixes -cpu kvm64 on all machines and -cpu host on K8 machines with some guest Linux kernels. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
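A sketch of the relaxed check in the MCE MSR write path (shape per the upstream patch; variable names illustrative):

    /* Only 0 or all-1s are architecturally valid for IA32_MCi_CTL, but
     * Linux guests clear bit 10 of MC4_CTL as a K8 quirk of their own --
     * tolerate that one cleared bit instead of injecting #GP. */
    if ((offset & 0x3) == 0 &&
        data != 0 && (data | (1 << 10)) != ~(u64)0)
            return -1;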
2010-04-26KVM: Don't spam kernel log when injecting exceptions due to bad cr writesAvi Kivity
(Cherry-picked from commit d6a23895aa82353788a1cc5a1d9a1c963465463e) These are guest-triggerable. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-26KVM: SVM: Fix memory leaks that happen when svm_create_vcpu() failsTakuya Yoshikawa
(Cherry-picked from commit b7af40433870aa0636932ad39b0c48a0cb319057) svm_create_vcpu() does not free the pages allocated during the creation when it fails to complete the allocations. This patch fixes it. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-26perf_events, x86: Implement Intel Westmere/Nehalem-EX supportPeter Zijlstra
original patch commit ids: 452a339a976e7f782c786eb3f73080401e2fa3a6 and 134fbadf028a5977a1b06b0253d3ee33e6f0c642 perf_events, x86: Implement Intel Westmere support The new Intel documentation includes Westmere arch-specific event maps that are significantly different from the Nehalem ones. Add support for this generation. Found the CPUID model numbers on Wikipedia. Also amend some Nehalem constraints; spotted those when looking for the differences between Nehalem and Westmere. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <20100127221122.151865645@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> perf, x86: Enable Nehalem-EX support According to the Intel Software Developer's Manual Volume 3B, the Nehalem-EX PMU is just like regular Nehalem (except for the uncore support, which is completely different). Signed-off-by: Vince Weaver <vweaver1@eecs.utk.edu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Lin Ming <ming.m.lin@intel.com> LKML-Reference: <alpine.DEB.2.00.1004060956580.1417@cl320.eecs.utk.edu> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Youquan Song <youquan.song@linux.intel.com>
2010-04-26x86/PCI: irq and pci_ids patch for Intel Cougar Point DeviceIDsSeth Heasley
commit 93da6202264ce1256b04db8008a43882ae62d060 upstream. This patch adds the Intel Cougar Point (PCH) LPC and SMBus Controller DeviceIDs. Signed-off-by: Seth Heasley <seth.heasley@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: maximilian attems <max@stro.at> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-26x86-64, rwsem: Avoid store forwarding hazard in __downgrade_writeAvi Kivity
commit 0d1622d7f526311d87d7da2ee7dd14b73e45d3fc upstream. The Intel Architecture Optimization Reference Manual states that a short load that follows a long store to the same object will suffer a store forwarding penalty, particularly if the two accesses use different addresses. Trivially, a long load that follows a short store will also suffer a penalty. __downgrade_write() in rwsem incurs both penalties: the increment operation will not be able to reuse a recently-loaded rwsem value, and its result will not be reused by any recently-following rwsem operation. A comment in the code states that this is because 64-bit immediates are special and expensive; but while they are slightly special (only a single instruction allows them), they aren't expensive: a test shows that two loops, one loading a 32-bit immediate and one loading a 64-bit immediate, both take 1.5 cycles per iteration. Fix this by changing __downgrade_write to use the same add instruction on i386 and on x86_64, so that it uses the same operand size as all the other rwsem functions. Signed-off-by: Avi Kivity <avi@redhat.com> LKML-Reference: <1266049992-17419-1-git-send-email-avi@redhat.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
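A sketch of the rewritten helper, following the upstream diff (constraint details illustrative, not authoritative):

    static inline void __downgrade_write(struct rw_semaphore *sem)
    {
            asm volatile("# beginning __downgrade_write\n\t"
                         /* same add, same operand size as every other
                          * rwsem op, so store forwarding can work */
                         LOCK_PREFIX _ASM_ADD "%2,(%1)\n\t"
                         "jns 1f\n\t"
                         "call call_rwsem_downgrade_wake\n"
                         "1:\n\t"
                         "# ending __downgrade_write\n"
                         : "+m" (sem->count)
                         : "a" (sem), "er" (-RWSEM_WAITING_BIAS)
                         : "memory", "cc");
    }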
2010-04-26x86: Fix breakage of UML from the changes in the rwsem systemLinus Torvalds
commit 4126faf0ab7417fbc6eb99fb0fd407e01e9e9dfe upstream. The patches 5d0b7235d83eefdafda300656e97d368afcafc9a and bafaecd11df15ad5b1e598adc7736afcd38ee13d broke the UML build: On Sun, 17 Jan 2010, Ingo Molnar wrote: > > FYI, -tip testing found that these changes break the UML build: > > kernel/built-in.o: In function `__up_read': > /home/mingo/tip/arch/x86/include/asm/rwsem.h:192: undefined reference to `call_rwsem_wake' > kernel/built-in.o: In function `__up_write': > /home/mingo/tip/arch/x86/include/asm/rwsem.h:210: undefined reference to `call_rwsem_wake' > kernel/built-in.o: In function `__downgrade_write': > /home/mingo/tip/arch/x86/include/asm/rwsem.h:228: undefined reference to `call_rwsem_downgrade_wake' > kernel/built-in.o: In function `__down_read': > /home/mingo/tip/arch/x86/include/asm/rwsem.h:112: undefined reference to `call_rwsem_down_read_failed' > kernel/built-in.o: In function `__down_write_nested': > /home/mingo/tip/arch/x86/include/asm/rwsem.h:154: undefined reference to `call_rwsem_down_write_failed' > collect2: ld returned 1 exit status Add lib/rwsem_64.o to the UML subarch objects to fix. LKML-Reference: <alpine.LFD.2.00.1001171023440.13231@localhost.localdomain> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-26x86-64: support native xadd rwsem implementationLinus Torvalds
commit bafaecd11df15ad5b1e598adc7736afcd38ee13d upstream. This one is much faster than the spinlock-based fallback rwsem code, with certain artificial benchmarks having shown 300%+ improvement on threaded page faults etc. Again, note the 32767-thread limit here. So this really does need that whole "make rwsem_count_t be 64-bit and fix the BIAS values to match" extension on top of it, but that is conceptually a totally independent issue. NOT TESTED! The original patch that this all was based on was tested by KAMEZAWA Hiroyuki, but maybe I screwed up something when I created the cleaned-up series, so caveat emptor.. Also note that it _may_ be a good idea to mark some more registers clobbered on x86-64 in the inline asms instead of saving/restoring them. They are inline functions, but they are only used in places where there are not a lot of live registers _anyway_, so doing for example the clobbers of %r8-%r11 in the asm wouldn't make the fast-path code any worse, and would make the slow-path code smaller. (Not that the slow path really matters to that degree. Saving a few unnecessary registers is the _least_ of our problems when we hit the slow path. The instruction/cycle counting really only matters in the fast path). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <alpine.LFD.2.00.1001121810410.17145@localhost.localdomain> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-26x86-64, rwsem: 64-bit xadd rwsem implementationH. Peter Anvin
commit 1838ef1d782f7527e6defe87e180598622d2d071 upstream. For x86-64, 32767 threads really is not enough. Change rwsem_count_t to a signed long, so that it is 64 bits on x86-64. This required the following changes to the assembly code: a) %z0 doesn't work on all versions of gcc! At least gcc 4.4.2 as shipped with Fedora 12 emits "ll" not "q" for 64 bits, even for integer operands. Newer gccs apparently do this correctly, but avoid this problem by using the _ASM_ macros instead of %z. b) 64-bit immediates are only allowed in "movq $imm,%reg" constructs... no others. Change some of the constraints to "e", and fix the one case where we would have had to use an invalid immediate -- in that case, we only care about the upper half anyway, so just access the upper half. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <tip-bafaecd11df15ad5b1e598adc7736afcd38ee13d@git.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
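Condensed, the counter type and bias definitions after this change look roughly like the following (per the upstream patch; illustrative):

    #ifdef CONFIG_X86_64
    # define RWSEM_ACTIVE_MASK      0xffffffffL     /* lifts the 32767 limit */
    #else
    # define RWSEM_ACTIVE_MASK      0x0000ffffL     /* the old 15-bit count */
    #endif

    #define RWSEM_UNLOCKED_VALUE    0x00000000L
    #define RWSEM_ACTIVE_BIAS       0x00000001L
    #define RWSEM_WAITING_BIAS      (-RWSEM_ACTIVE_MASK-1)

    typedef signed long rwsem_count_t;      /* 64 bits on x86-64 */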
2010-04-26x86: clean up rwsem type systemLinus Torvalds
commit 5d0b7235d83eefdafda300656e97d368afcafc9a upstream. The fast version of the rwsems (the code that uses xadd) has traditionally only worked on x86-32, and as a result it mixes different kinds of types wildly - they just all happen to be 32-bit. We have "long", we have "__s32", and we have "int". To make it work on x86-64, the types suddenly matter a lot more. It can be either a 32-bit or 64-bit signed type, and both work (with the caveat that a 32-bit counter will only have 15 bits of effective write counters, so it's limited to 32767 users). But whatever type you choose, it needs to be used consistently. This makes a new 'rwsem_count_t', that is a 32-bit signed type. For a 64-bit type, you'd need to also update the BIAS values. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <alpine.LFD.2.00.1001121755220.17145@localhost.localdomain> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-26x86-32: clean up rwsem inline asm statementsLinus Torvalds
commit 59c33fa7791e9948ba467c2b83e307a0d087ab49 upstream. This makes gcc use the right register names and instruction operand sizes automatically for the rwsem inline asm statements. So instead of using "(%%eax)" to specify the memory address that is the semaphore, we use "(%1)" or similar. And instead of forcing the operation to always be 32-bit, we use "%z0", taking the size from the actual semaphore data structure itself. This doesn't actually matter on x86-32, but if we want to use the same inline asm for x86-64, we'll need to have the compiler generate the proper 64-bit names for the registers (%rax instead of %eax), and if we want to use a 64-bit counter too (in order to avoid the 15-bit limit on the write counter that limits concurrent users to 32767 threads), we'll need to be able to generate instructions with "q" accesses rather than "l". Since this header currently isn't enabled on x86-64, none of that matters, but we do want to use the xadd version of the semaphores rather than have to take spinlocks to do a rwsem. The mm->mmap_sem can be heavily contended when you have lots of threads all taking page faults, and the fallback rwsem code that uses a spinlock performs abysmally badly in that case. [ hpa: modified the patch to skip size suffixes entirely when they are redundant due to register operands. ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <alpine.LFD.2.00.1001121613560.17145@localhost.localdomain> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
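In miniature, the operand-style change looks like this (an illustrative sketch, not the exact upstream hunks):

    /* before: register name and operand size hard-coded for x86-32 */
    asm volatile(LOCK_PREFIX "incl (%%eax)"
                 : "+m" (sem->count) : "a" (sem) : "memory");

    /* after: gcc substitutes the register name for %1, and the size
     * macro (or a %z0 suffix) follows the counter's actual type */
    asm volatile(LOCK_PREFIX _ASM_INC "(%1)"
                 : "+m" (sem->count) : "a" (sem) : "memory");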
2010-04-26x86, cacheinfo: Enable L3 CID only on AMDBorislav Petkov
commit cb19060abfdecac0d1eb2d2f0e7d6b7a3f8bc4f4 upstream. Final stage linking can fail with arch/x86/built-in.o: In function `store_cache_disable': intel_cacheinfo.c:(.text+0xc509): undefined reference to `amd_get_nb_id' arch/x86/built-in.o: In function `show_cache_disable': intel_cacheinfo.c:(.text+0xc7d3): undefined reference to `amd_get_nb_id' when CONFIG_CPU_SUP_AMD is not enabled because the amd_get_nb_id helper is defined in AMD-specific code but also used in generic code (intel_cacheinfo.c). Reorganize the L3 cache index disable code under CONFIG_CPU_SUP_AMD since it is AMD-only anyway. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <20100218184210.GF20473@aftab> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-26x86, cacheinfo: Remove NUMA dependency, fix for AMD Fam10h rev D1Borislav Petkov
commit f619b3d8427eb57f0134dab75b0d217325c72411 upstream. The show/store_cache_disable routines depend unnecessarily on NUMA's cpu_to_node and the disabling of cache indices broke when !CONFIG_NUMA. Remove that dependency by using a helper which is always correct. While at it, enable L3 Cache Index disable on rev D1 Istanbuls which sport the feature too. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <20100218184339.GG20473@aftab> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-26x86, cacheinfo: Calculate L3 indicesBorislav Petkov
commit 048a8774ca43488d78605031f11cc206d7a2682a upstream. We need to know the valid L3 indices interval when disabling them over /sysfs. Do that when the core is brought online and add boundary checks to the sysfs .store attribute. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <1264172467-25155-6-git-send-email-bp@amd64.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-26x86, cacheinfo: Add cache index disable sysfs attrs only to L3 cachesBorislav Petkov
commit 897de50e08937663912c86fb12ad7f708af2386c upstream. The cache_disable_[01] attribute in /sys/devices/system/cpu/cpu?/cache/index[0-3]/ is enabled on all cache levels although only L3 supports it. Add it only to the cache level that actually supports it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <1264172467-25155-5-git-send-email-bp@amd64.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-26x86, cacheinfo: Fix disabling of L3 cache indicesBorislav Petkov
commit dcf39daf3d6d97f8741e82f0b9fb7554704ed2d1 upstream. * Correct the masks used for writing the cache index disable indices. * Do not turn off L3 scrubber - it is not necessary. * Make sure wbinvd is executed on the same node where the L3 is. * Check for out-of-bounds values written to the registers. * Make show_cache_disable hex values unambiguous * Check for Erratum #388 Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <1264172467-25155-4-git-send-email-bp@amd64.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-26x86, lib: Add wbinvd smp helpersBorislav Petkov
commit a7b480e7f30b3813353ec009f10f2ac7a6669f3b upstream. Add wbinvd_on_cpu and wbinvd_on_all_cpus stubs for executing wbinvd on a particular CPU. [ hpa: renamed lib/smp.c to lib/cache-smp.c ] [ hpa: wbinvd_on_all_cpus() returns int, but wbinvd() returns void. Thus, the former cannot be a macro for the latter, replace with an inline function. ] Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <1264172467-25155-2-git-send-email-bp@amd64.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
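The helpers are thin wrappers over the cross-CPU call API; essentially:

    /* arch/x86/lib/cache-smp.c */
    static void __wbinvd(void *dummy)
    {
            wbinvd();
    }

    void wbinvd_on_cpu(int cpu)
    {
            smp_call_function_single(cpu, __wbinvd, NULL, 1);
    }

    int wbinvd_on_all_cpus(void)
    {
            return on_each_cpu(__wbinvd, NULL, 1);
    }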
2010-04-26Revert "x86: disable IOMMUs on kernel crash"Chris Wright
commit 8f9f55e83e939724490d7cde3833c4883c6d1310 upstream. This effectively reverts commit 61d047be99757fd9b0af900d7abce9a13a337488. Disabling the IOMMU can potentially allow DMA transactions to complete without being translated. Leave it enabled, and allow the crash kernel to do the IOMMU reinitialization properly. Cc: Joerg Roedel <joerg.roedel@amd.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-26x86/amd-iommu: enable iommu before attaching devicesChris Wright
commit 75f66533bc883f761a7adcab3281fe3323efbc90 upstream. Hit another kdump problem as reported by Neil Horman. When initializing the IOMMU, we attach devices to their domains before the IOMMU is fully (re)initialized. Attaching a device will issue some important invalidations. In the context of the newly kexec'd kdump kernel, the IOMMU may have stale cached data from the original kernel. Because we do the attach too early, the invalidation commands are placed in the new command buffer before the IOMMU is updated with that buffer. This leaves the stale entries in the kdump context and can render devices unusable. Simply enable the IOMMU before we do the attach. Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-26x86/amd-iommu: Use helper function to destroy domainJoerg Roedel
commit 8b408fe4f853dcfa18d133aa4cf1d7546b4c3870 upstream. In the amd_iommu_domain_destroy the protection_domain_free function is partly reimplemented. The 'partly' is the bug here because the domain is not deleted from the domain list. This results in use-after-free errors and data-corruption. Fix it by just using protection_domain_free instead. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-26x86: Fix double enable_IR_x2apic() call on SMP kernel on !SMP boardsSuresh Siddha
commit 472a474c6630efd195d3738339fd1bdc8aa3b1aa upstream. Jan Grossmann reported a kernel boot panic while booting an SMP kernel on his system with a single-core cpu. SMP kernels call enable_IR_x2apic() from native_smp_prepare_cpus(), and on platforms where the kernel doesn't find an SMP configuration we ended up calling enable_IR_x2apic() again from the APIC_init_uniprocessor() call in smp_sanity_check(), thus leading to a kernel panic. Don't call enable_IR_x2apic() and default_setup_apic_routing() from APIC_init_uniprocessor() in the CONFIG_SMP case. NOTE: this kind of non-idempotent and asymmetric initialization sequence is rather fragile and unclean; we'll clean that up in v2.6.35. This is the minimal fix for v2.6.34. Reported-by: Jan.Grossmann@kielnet.net Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: <jbarnes@virtuousgeek.org> Cc: <david.woodhouse@intel.com> Cc: <weidong.han@intel.com> Cc: <youquan.song@intel.com> Cc: <Jan.Grossmann@kielnet.net> LKML-Reference: <1270083887.7835.78.camel@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
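Schematically, the minimal fix guards the uniprocessor path so the initialization runs exactly once (a sketch per the upstream diff; surrounding code elided):

    int __init APIC_init_uniprocessor(void)
    {
            ...
    #ifndef CONFIG_SMP
            /* on CONFIG_SMP kernels native_smp_prepare_cpus() has
             * already called these -- don't repeat them here */
            enable_IR_x2apic();
            default_setup_apic_routing();
    #endif
            ...
    }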
2010-04-26x86, hpet: Erratum workaround for read after write of HPET comparatorPallipadi, Venkatesh
commit 8da854cb02156c90028233ae1e85ce46a1d3f82c upstream. On Wed, Feb 24, 2010 at 03:37:04PM -0800, Justin Piszcz wrote: > Hello, > > Again, on the Intel DP55KG board: > > # uname -a > Linux host 2.6.33 #1 SMP Wed Feb 24 18:31:00 EST 2010 x86_64 GNU/Linux > > [ 1.237600] ------------[ cut here ]------------ > [ 1.237890] WARNING: at arch/x86/kernel/hpet.c:404 hpet_next_event+0x70/0x80() > [ 1.238221] Hardware name: > [ 1.238504] hpet: compare register read back failed. > [ 1.238793] Modules linked in: > [ 1.239315] Pid: 0, comm: swapper Not tainted 2.6.33 #1 > [ 1.239605] Call Trace: > [ 1.239886] <IRQ> [<ffffffff81056c13>] ? warn_slowpath_common+0x73/0xb0 > [ 1.240409] [<ffffffff81079608>] ? tick_dev_program_event+0x38/0xc0 > [ 1.240699] [<ffffffff81056cb0>] ? warn_slowpath_fmt+0x40/0x50 > [ 1.240992] [<ffffffff81079608>] ? tick_dev_program_event+0x38/0xc0 > [ 1.241281] [<ffffffff81041ad0>] ? hpet_next_event+0x70/0x80 > [ 1.241573] [<ffffffff81079608>] ? tick_dev_program_event+0x38/0xc0 > [ 1.241859] [<ffffffff81078e32>] ? tick_handle_oneshot_broadcast+0xe2/0x100 > [ 1.246533] [<ffffffff8102a67a>] ? timer_interrupt+0x1a/0x30 > [ 1.246826] [<ffffffff81085499>] ? handle_IRQ_event+0x39/0xd0 > [ 1.247118] [<ffffffff81087368>] ? handle_edge_irq+0xb8/0x160 > [ 1.247407] [<ffffffff81029f55>] ? handle_irq+0x15/0x20 > [ 1.247689] [<ffffffff810294a2>] ? do_IRQ+0x62/0xe0 > [ 1.247976] [<ffffffff8146be53>] ? ret_from_intr+0x0/0xa > [ 1.248262] <EOI> [<ffffffff8102f277>] ? mwait_idle+0x57/0x80 > [ 1.248796] [<ffffffff8102645c>] ? cpu_idle+0x5c/0xb0 > [ 1.249080] ---[ end trace db7f668fb6fef4e1 ]--- > > Is this something Intel has to fix or is it a bug in the kernel? This is a chipset erratum. Thomas: You mentioned we can retain this check only for known-buggy and hpet debug kind of options. But here is the simple workaround patch for this particular erratum. Some chipsets have an erratum due to which a read immediately following a write of the HPET comparator returns the old comparator value instead of the most recently written value. Erratum 15 in "Intel I/O Controller Hub 9 (ICH9) Family Specification Update" (http://www.intel.com/assets/pdf/specupdate/316973.pdf) The workaround for the erratum is to read the comparator twice if the first read fails. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> LKML-Reference: <20100225185348.GA9674@linux-os.sc.intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
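The workaround itself is one extra read before warning; sketched against hpet_next_event() (illustrative):

    hpet_writel(cnt, HPET_Tn_CMP(timer));

    /*
     * Erratum workaround: the first read after a write of the
     * comparator may return the stale value, so only warn if a
     * second read disagrees as well.
     */
    WARN_ON_ONCE((u32)hpet_readl(HPET_Tn_CMP(timer)) != cnt &&
                 (u32)hpet_readl(HPET_Tn_CMP(timer)) != cnt);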
2010-04-26x86: hpet: Make WARN_ON understandableThomas Gleixner
commit 18ed61da985c57eea3fe8038b13fa2837c9b3c3f upstream. Andrew complained rightly that the WARN_ON in hpet_next_event() is confusing and the code comment not really helpful. Change it to WARN_ONCE and print the reason in clear text. Change the comment to explain what kind of hardware wreckage we deal with. Pointed-out-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Venki Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-26x86-32, resume: do a global tlb flush in S4 resumeShaohua Li
commit 8ae06d223f8203c72104e5c0c4ee49a000aedb42 upstream. Colin King reported a strange oops in the S4 resume code path (see below). The test system has an i5/i7 CPU. The kernel doesn't enable PAE, so 4M page tables are used. The oops always happens at virtual address 0xc03ff000, which is mapped to the last 4k of the first 4M of memory. Doing a global tlb flush fixes the issue. EIP: 0060:[<c0493a01>] EFLAGS: 00010086 CPU: 0 EIP is at copy_loop+0xe/0x15 EAX: 36aeb000 EBX: 00000000 ECX: 00000400 EDX: f55ad46c ESI: 0f800000 EDI: c03ff000 EBP: f67fbec4 ESP: f67fbea8 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 ... ... CR2: 00000000c03ff000 Tested-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Shaohua Li <shaohua.li@intel.com> LKML-Reference: <20100305005932.GA22675@sli10-desk.sh.intel.com> Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
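Conceptually, the fix is a single global flush once the image pages have been copied back (the actual change is in the 32-bit resume assembly; a C-level sketch):

    /* A plain CR3 reload skips global/large-page TLB entries; toggling
     * CR4.PGE, as __flush_tlb_all() does, invalidates those too. */
    __flush_tlb_all();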
2010-04-26ARM: 6031/1: fix Thumb-2 decompressorRabin Vincent
commit d4d9959c099751158c5cf14813fe378e206339c6 upstream. 98e12b5a6e05413 ("ARM: Fix decompressor's kernel size estimation for ROM=y") broke the Thumb-2 decompressor because it added an entry in the LC0 table but didn't adjust the offset the Thumb-2 code uses to load the SP from that table. Fix it. Signed-off-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-26KVM: x86: disable paravirt mmu reportingMarcelo Tosatti
commit a68a6a7282373bedba8a2ed751b6384edb983a64 upstream Disable paravirt MMU capability reporting, so that new (or rebooted) guests switch to native operation. Paravirt MMU is a burden to maintain and does not bring significant advantages compared to shadow anymore. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-26KVM: VMX: Disable unrestricted guest when EPT disabledSheng Yang
commit 046d87103addc117f0d397196e85189722d4d7de upstream Otherwise this would cause a VM entry failure when using ept=0 on processors that support the unrestricted guest feature. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-26KVM: SVM: Reset cr0 properly on vcpu resetEduardo Habkost
commit 18fa000ae453767b59ab97477925895a3f0c46ea upstream svm_vcpu_reset() was not properly resetting the contents of the guest-visible cr0 register, causing the following issue: https://bugzilla.redhat.com/show_bug.cgi?id=525699 Without resetting cr0 properly, the vcpu was running the SIPI bootstrap routine with paging enabled, making the vcpu get a pagefault exception while trying to run it. Instead of setting vmcb->save.cr0 directly, the new code just resets kvm->arch.cr0 and calls kvm_set_cr0(). The bits that were set/cleared on vmcb->save.cr0 (PG, WP, !CD, !NW) will be set properly by svm_set_cr0(). kvm_set_cr0() is used instead of calling svm_set_cr0() directly to make sure kvm_mmu_reset_context() is called to reset the mmu to nonpaging mode. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
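The reset path after the fix, sketched from the upstream diff (names per 2.6.32-era KVM code):

    /* in init_vmcb(): don't write vmcb->save.cr0 directly */
    svm->vcpu.arch.cr0 = 0;
    kvm_set_cr0(&svm->vcpu, X86_CR0_NW | X86_CR0_CD | X86_CR0_ET);
    /* svm_set_cr0() applies the PG/WP/CD/NW adjustments itself, and
     * kvm_set_cr0() resets the MMU back to nonpaging mode */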
2010-04-26KVM: VMX: Use macros instead of hex value on cr0 initializationEduardo Habkost
commit fa40052ca04bdbbeb20b839cc8ffe9fa7beefbe9 upstream This should have no effect, it is just to make the code clearer. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
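For reference, the magic number and its spelled-out form are identical:

    /* before */
    vmx_set_cr0(vcpu, 0x60000010);
    /* after: X86_CR0_NW | X86_CR0_CD | X86_CR0_ET == 0x60000010 */
    vmx_set_cr0(vcpu, X86_CR0_NW | X86_CR0_CD | X86_CR0_ET);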
2010-04-26KVM: VMX: Update instruction length on intercepted BPJan Kiszka
commit c573cd22939e54fc1b8e672054a505048987a7cb upstream We intercept #BP while in guest debugging mode. As VM exits due to intercepted exceptions do not necessarily come with valid idt_vectoring, we have to update event_exit_inst_len explicitly in such cases. At least in the absence of migration, this ensures that re-injections of #BP will find and use the correct instruction length. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-26KVM: Fix segment descriptor loadingGleb Natapov
commit c697518a861e6c43b92b848895f9926580ee63c3 upstream Add proper error and permission checking. This patch also changes the task-switching code to load segment selectors before segment descriptors, as the SDM requires; otherwise permission checking during segment descriptor loading will be incorrect. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-26KVM: x86 emulator: Fix popf emulationGleb Natapov
commit d4c6a1549c056f1d817e8f6f2f97d8b44933472f upstream POPF behaves differently depending on current CPU mode. Emulate correct logic to prevent guest from changing flags that it can't change otherwise. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-26KVM: x86 emulator: Check IOPL level during io instruction emulationGleb Natapov
commit f850e2e603bf5a05b0aee7901857cf85715aa694 upstream Make emulator check that vcpu is allowed to execute IN, INS, OUT, OUTS, CLI, STI. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
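The privilege test the emulator gains looks essentially like this (sketch per the upstream change; treat identifiers as illustrative):

    /* may the vcpu execute IN/INS/OUT/OUTS/CLI/STI at its current CPL? */
    static bool emulator_bad_iopl(struct x86_emulate_ctxt *ctxt)
    {
            int iopl;

            if (ctxt->mode == X86EMUL_MODE_REAL)
                    return false;   /* real mode: no IOPL checks */
            if (ctxt->mode == X86EMUL_MODE_VM86)
                    return true;    /* vm86: CPL is effectively 3 */
            iopl = (ctxt->eflags & X86_EFLAGS_IOPL) >> IOPL_SHIFT;
            return kvm_x86_ops->get_cpl(ctxt->vcpu) > iopl;
    }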