summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2007-02-26hwmon: Refactor SENSOR_DEVICE_ATTR_2Jim Cromie
This patch refactors SENSOR_DEVICE_ATTR_2 macro, following pattern set by SENSOR_ATTR. First it creates a new macro SENSOR_ATTR_2() which expands to an initialization expression, then it uses that in SENSOR_DEVICE_ATTR_2, which declares and initializes a struct sensor_device_attribute_2. Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-02-26hwmon: Allow sensor attributes arraysJim Cromie
This patch refactors SENSOR_DEVICE_ATTR macro. First it creates a new macro SENSOR_ATTR() which expands to an initialization expression, then it uses that in SENSOR_DEVICE_ATTR, which declares and initializes a struct sensor_device_attribute. IOW, SENSOR_ATTR() imitates __ATTR() in include/linux/device.h. Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-02-26i2c-piix4: Add ATI IXP200/300/400 supportRudolf Marek
This patch adds the ATI IXP southbridges support to i2c-piix4, as it turned out those chips are compatible with it. Signed-off-by: Rudolf Marek <r.marek@sh.cvut.cz> Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-02-26I2C: i2c-piix4: Add Broadcom HT-1000 supportMartin Devera
Add Broadcom HT-1000 south bridge's PCI ID to i2c-piix driver. Note that at least on Supermicro H8SSL it uses non-standard SMBHSTCFG = 3 and standard values like 0 or 9 causes hangup. Signed-off-by: Martin Devera <devik@cdi.cz> Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-02-14[TCP]: struct tcp_sack_block annotationsAl Viro
Some of the instances of tcp_sack_block are host-endian, some - net-endian. Define struct tcp_sack_block_wire identical to struct tcp_sack_block with u32 replaced with __be32; annotate uses of tcp_sack_block replacing net-endian ones with tcp_sack_block_wire. Change is obviously safe since for cc(1) __be32 is typedefed to u32. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-02-14SPARC32: Fix over-optimization by GCC near ip_fast_csum.Bob Breuer
In some cases such as: iph->check = 0; iph->check = ip_fast_csum((unsigned char *)iph, iph->ihl); GCC may optimize out the previous store. Observed as a failure of NFS over udp (bad checksums on ip fragments) when compiled with GCC 3.4.2. Signed-off-by: Bob Breuer <breuerr@mc.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-02-03reiserfs: avoid tail packing if an inode was ever mmappedVladimir Saveliev
This patch fixes a confusion reiserfs has for a long time. On release file operation reiserfs used to try to pack file data stored in last incomplete page of some files into metadata blocks. After packing the page got cleared with clear_page_dirty. It did not take into account that the page may be mmaped into other process's address space. Recent replacement for clear_page_dirty cancel_dirty_page found the confusion with sanity check that page has to be not mapped. The patch fixes the confusion by making reiserfs avoid tail packing if an inode was ever mmapped. reiserfs_mmap and reiserfs_file_release are serialized with mutex in reiserfs specific inode. reiserfs_mmap locks the mutex and sets a bit in reiserfs specific inode flags. reiserfs_file_release checks the bit having the mutex locked. If bit is set - tail packing is avoided. This eliminates a possibility that mmapped page gets cancel_page_dirty-ed. Signed-off-by: Vladimir Saveliev <vs@namesys.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-02-03ia64: add pci_get_legacy_ide_irq()Bartlomiej Zolnierkiewicz
Add pci_get_legacy_ide_irq() identical to the one used by i386/x86_64. Fixes amd74xx driver build on ia64 (bugzilla bug #6644). Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-01-30hwmon: New driver k8tempRudolf Marek
Add support for the temperature sensor(s) found in AMD K8 CPUs. Signed-off-by: Rudolf Marek <r.marek@sh.cvut.cz> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-01-30[SCSI] arcmsr: initial driver, version 1.20.00.13Erich Chen
arcmsr is a driver for the Areca Raid controller, a host based RAID subsystem that speaks SCSI at the firmware level. This patch is quite a clean up over the initial submission with contributions from: Randy Dunlap <rdunlap@xenotime.net> Christoph Hellwig <hch@lst.de> Matthew Wilcox <matthew@wil.cx> Adrian Bunk <bunk@stusta.de> Signed-off-by: Erich Chen <erich@areca.com.tw> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-01-26[Bluetooth] Handle command complete event for exit periodic inquiryMarcel Holtmann
The command complete event of the exit periodic inquiry command must clear the HCI_INQUIRY flag and finish the HCI request. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-01-09x86_64: re-add a newline to RESTORE_CONTEXTAdrian Bunk
RESTORE_CONTEXT lost a newline: http://www.mail-archive.com/kgdb-bugreport@lists.sourceforge.net/msg00559.html Reported by Steven M. Christey. Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-01-09PCI: irq: irq and pci_ids patch for Intel ICH9Jason Gaston
This updated patch adds the Intel ICH9 LPC and SMBus Controller DID's. Signed-off-by: Jason Gaston <jason.d.gaston@intel.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-01-09x86_64: Don't leak NT bit into next task (CVE-2006-5755)Andi Kleen
SYSENTER can cause a NT to be set which might cause crashes on the IRET in the next task. Following similar i386 patch from Linus. Backport to 2.6.16 by Chuck Ebbert <76306.1226@compuserve.com> [Changed 'set_debugreg' to the older 'set_debug' in setup64.c and added raw_local_save_flags() from 2.6.19 to system.h] Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-01-09x86_64: fix ia32 syscall countChuck Ebbert
Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-01-04i386: save/restore eflags in context switch (CVE-2006-5173)Linus Torvalds
(And reset it on new thread creation) It turns out that eflags is important to save and restore not just because of iopl, but due to the magic bits like the NT bit, which we don't want leaking between different threads. Backported to 2.6.16 by Chuck Ebbert <76306.1226@compuserve.com> [Backport consisted of removing the CFI annotations.] Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-12-18bridge-netfilter: don't overwrite memory outside of skbStephen Hemminger
The bridge netfilter code needs to check for space at the front of the skb before overwriting; otherwise if skb from device doesn't have headroom, then it will cause random memory corruption. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-12-14pci_ids.h: Add NVIDIA PCI IDPeer Chen
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-12-14amd74xx.c: add some NVIDIA chipset IDsRandy Dunlap
Add some nVidia chipset ID's support. Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-12-14sata_nv/amd74xx: Add MCP61 supportAndrew Chew
Added MCP61 support to sata_nv and amd74xx. Signed-off-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-12-08PCI: Unhide the SMBus on Asus PU-DLSJean Delvare
Unhide the SMBus controller on the Asus PU-DLS board. This fixes bug #6763. Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-12-08PCI: nVidia quirk to make AER PCI-E extended capability visibleBrice Goglin
The nVidia CK804 PCI-E chipset supports the AER extended capability but sometimes fails to link it (with some BIOS or after a warm reboot). It makes the AER cap invisible to pci_find_ext_capability(). The patch adds a quirk to set the missing bit that controls the linking of the capability. By the way, it removes the corresponding code in the myri10ge driver. Signed-off-by: Brice Goglin <brice@myri.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-12-08pci_ids.h: correct naming of 1022:7450 (AMD 8131 Bridge)John W. Linville
The naming of the constant defined for PCI ID 1022:7450 does not seem to match the information at http://pciids.sourceforge.net/: http://pci-ids.ucw.cz/iii/?i=1022 There 1022:7450 is listed as "AMD-8131 PCI-X Bridge" while 1022:7451 is listed as "AMD-8131 PCI-X IOAPIC". Yet, the current definition for 0x7450 is PCI_DEVICE_ID_AMD_8131_APIC. It seems to me like that name should map to 0x7451, while a name like PCI_DEVICE_ID_AMD_8131_BRIDGE should map to 0x7450. Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-12-06Fix mempolicy.h build errorRalf Baechle
<linux/mempolicy.h> uses struct mm_struct and relies on a definition or declaration somehow magically being dragged in which may result in a build: CC mm/mempolicy.o In file included from mm/mempolicy.c:69: include/linux/mempolicy.h:150: warning: 'struct mm_struct' declared inside parameter list include/linux/mempolicy.h:150: warning: its scope is only this definition or declaration, which is probably not what you want include/linux/mempolicy.h:174: warning: 'struct mm_struct' declared inside parameter list mm/mempolicy.c:673: error: conflicting types for 'do_migrate_pages' include/linux/mempolicy.h:174: error: previous declaration of 'do_migrate_pages' was here mm/mempolicy.c:1696: error: conflicting types for 'mpol_rebind_mm' include/linux/mempolicy.h:150: error: previous declaration of 'mpol_rebind_mm' was here make[1]: *** [mm/mempolicy.o] Error 1 make: *** [mm] Error 2 $ Including <linux/sched.h> is a step into direction of include hell so fixed by adding a forward declaration of struct mm_struct instead. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-12-04remove garbage the sneaked into the ext3 fixAdrian Bunk
Spotted by Thomas Voegtle. Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-29Fix incorrent type of flags in <asm/semaphore.h>Kyle McMartin
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-29add forgotten ->b_data in memcpy() call in ext3/resize.c (oopsable)Al Viro
sbi->s_group_desc is an array of pointers to buffer_head. memcpy() of buffer size from address of buffer_head is a bad idea - it will generate junk in any case, may oops if buffer_head is close to the end of slab page and next page is not mapped and isn't what was intended there. IOW, ->b_data is missing in that call. Fortunately, result doesn't go into the primary on-disk data structures, so only backup ones get crap written to them; that had allowed this bug to remain unnoticed until now. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-25[IGMP]: Fix IGMPV3_EXP() normalization bit shift value.David L Stevens
The IGMPV3_EXP() macro doesn't correctly shift the normalization bit, so time-out values are longer than they should be. Thanks to Dirk Ooms for finding the problem in IGMPv3 - MLDv2 had a similar problem that was already fixed a year ago. :-( Signed-off-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-24[IPX]: Annotate and fix IPX checksumAl Viro
Calculation of IPX checksum got buggered about 2.4.0. The old variant mangled the packet; that got fixed, but calculation itself got buggered. Restored the correct logics, fixed a subtle breakage we used to have even back then: if the sum is 0 mod 0xffff, we want to return 0, not 0xffff. The latter has special meaning for IPX (cheksum disabled). Observation (and obvious fix) nicked from history of FreeBSD ipx_cksum.c... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-11ia64/sparc: fix local DoS with corrupted ELFs (CVE-2006-4538)Kirill Korotaev
This patch prevents cross-region mappings on IA64 and SPARC which could lead to system crash. Adrian Bunk: Adapted to 2.6.16. Signed-Off-By: Kirill Korotaev <dev@openvz.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-11[NET]: Update frag_list in pskb_trimHerbert Xu
When pskb_trim has to defer to ___pksb_trim to trim the frag_list part of the packet, the frag_list is not updated to reflect the trimming. This will usually work fine until you hit something that uses the packet length or tail from the frag_list. Examples include esp_output and ip_fragment. Another problem caused by this is that you can end up with a linear packet with a frag_list attached. It is possible to get away with this if we audit everything to make sure that they always consult skb->len before going down onto frag_list. In fact we can do the samething for the paged part as well to avoid copying the data area of the skb. For now though, let's do the conservative fix and update frag_list. Many thanks to Marco Berizzi for helping me to track down this bug. This 4-year old bug took 3 months to track down. Marco was very patient indeed :) Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-09drivers/video/nvidia/nvidia.c: Add ID for Quadro NVS280Pavel Roskin
Quadro NVS280 is a dual-head PCIe card with PCI ID 10de:00fd and subsystem I 10de:0215. Signed-off-by: Pavel Roskin <proski@gnu.org> Signed-off-by: Antonino Daplas <adaplas@pol.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-05ACPI: enable SMP C-states on x86_64Shaohua Li
http://bugzilla.kernel.org/show_bug.cgi?id=5653 Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-09-18[AGPGART] VIA PT880 Ultra support.Magnus Kessler
This patch enables agpgart on a Via "PT880 Ultra" based motherboard (Asus P4V800D-X). The PCI ID of the PT880 Ultra is 0x0308 instead of 0x0258 of the PT880. The patched via-agp passes testgart. Signed-off-by: Magnus Kessler <Magnus.Kessler@gmx.net> Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-09-18PFKEYV2: Fix inconsistent typing in struct sadb_x_kmprivate.Tushar Gohad
Fixes inconsistent use of "uint32_t" vs. "u_int32_t". Fix pfkeyv2 userspace builds. Signed-off-by: Tushar Gohad <tgohad@mvista.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-09-06pci_ids.h: add some VIA IDE identifiersAlan Cox
Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-09-06ia64 SGI-SN2: fix silent data corruption caused by XPCDean Nelson
Jack Steiner identified a problem where XPC can cause a silent data corruption. On module load, the placement may cause the xpc_remote_copy_buffer to span two physical pages. DMA transfers are done to the start virtual address translated to physical. This patch changes the buffer from a statically allocated buffer to a kmalloc'd buffer. Dean Nelson reviewed this before posting. I have tested it in the configuration that was showing the memory corruption and verified it works. I also added a BUG_ON statement to help catch this if a similar situation is encountered. Signed-off-by: Robin Holt <holt@sgi.com> Signed-off-by: Dean Nelson <dcn@sgi.com> Signed-off-by: Jack Steiner <steiner@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-08-30ext3: avoid triggering ext3_error on bad NFS file handleNeil Brown
The inode number out of an NFS file handle gets passed eventually to ext3_get_inode_block() without any checking. If ext3_get_inode_block() allows it to trigger an error, then bad filehandles can have unpleasant effect - ext3_error() will usually cause a forced read-only remount, or a panic if `errors=panic' was used. So remove the call to ext3_error there and put a matching check in ext3/namei.c where inode numbers are read off storage. Andrew Morton fixed an off-by-one error. Dann Frazier ported the patch to 2.6.16. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-08-26SCTP: Reject sctp packets with broadcast addresses.Vlad Yasevich
Make SCTP handle broadcast properly Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-08-23Fix sctp privilege elevation (CVE-2006-3745)Sridhar Samudrala
sctp_make_abort_user() now takes the msg_len along with the msg so that we don't have to recalculate the bytes in iovec. It also uses memcpy_fromiovec() so that we don't go beyond the length allocated. It is good to have this fix even if verify_iovec() is fixed to return error on overflow. Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Acked-by: David Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-08-18SPARC64: Fix quad-float multiply emulation.David S. Miller
Something is wrong with the 3-multiply (vs. 4-multiply) optimized version of _FP_MUL_MEAT_2_*(), so just use the slower version which actually computes correct values. Noticed by Rene Rebe Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-22[PATCH] I2O: Bugfixes to get I2O working againMarkus Lidel
- Fixed locking of struct i2o_exec_wait in Executive-OSM - Removed LCT Notify in i2o_exec_probe() which caused freeing memory and accessing freed memory during first enumeration of I2O devices - Added missing locking in i2o_exec_lct_notify() - removed put_device() of I2O controller in i2o_iop_remove() which caused the controller structure get freed to early - Fixed size of mempool in i2o_iop_alloc() - Fixed access to freed memory in i2o_msg_get() See http://bugzilla.kernel.org/show_bug.cgi?id=6561 Signed-off-by: Markus Lidel <Markus.Lidel@shadowconnect.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-22[PATCH] SPARC64: Respect gfp_t argument to dma_alloc_coherent().David Miller
Using asm-generic/dma-mapping.h does not work because pushing the call down to pci_alloc_coherent() causes the gfp_t argument of dma_alloc_coherent() to be ignored. Fix this by implementing things directly, and adding a gfp_t argument we can use in the internal call down to the PCI DMA implementation of pci_alloc_coherent(). This fixes massive memory corruption when using the sound driver layer, which passes things like __GFP_COMP down into these routines and (correctly) expects that to work. This is a disk eater when sound is used, so it's pretty critical. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-22[PATCH] SPARC64: Fix D-cache corruption in mremapDavid Miller
If we move a mapping from one virtual address to another, and this changes the virtual color of the mapping to those pages, we can see corrupt data due to D-cache aliasing. Check for and deal with this by overriding the move_pte() macro. Set things up so that other platforms can cleanly override the move_pte() macro too. This long standing bug corrupts user memory, and in particular has been notorious for corrupting Debian package database files on sparc64 boxes. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-05-20[PATCH] SCTP: Respect the real chunk length when walking parameters ↵Vladislav Yasevich
(CVE-2006-1858) When performing bound checks during the parameter processing, we want to use the real chunk and paramter lengths for bounds instead of the rounded ones. This prevents us from potentially walking of the end if the chunk length was miscalculated. We still use rounded lengths when advancing the pointer. This was found during a conformance test that changed the chunk length without modifying parameters. (Vlad noted elsewhere: the most you'd overflow is 3 bytes, so problem is parameter dependent). Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2006-05-09[PATCH] SCTP: Allow spillover of receive buffer to avoid deadlock. ↵Neil Horman
(CVE-2006-2275) This patch fixes a deadlock situation in the receive path by allowing temporary spillover of the receive buffer. - If the chunk we receive has a tsn that immediately follows the ctsn, accept it even if we run out of receive buffer space and renege data with higher TSNs. - Once we accept one chunk in a packet, accept all the remaining chunks even if we run out of receive buffer space. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Mark Butler <butlerm@middle.net> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2006-05-01[PATCH] i386: fix broken FP exception handlingChuck Ebbert
The FXSAVE information leak patch introduced a bug in FP exception handling: it clears FP exceptions only when there are already none outstanding. Mikael Pettersson reported that causes problems with the Erlang runtime and has tested this fix. Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com> Acked-by: Mikael Pettersson <mikpe@it.uu.se> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-05-01[PATCH] MIPS: R2 build fixes for gcc < 3.4.Ralf Baechle
Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-05-01[PATCH] MIPS: Use "R" constraint for cache_op.Ralf Baechle
Gcc might emit an absolute address for the the "m" constraint which gas unfortunately does not permit. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-05-01[PATCH] x86/PAE: Fix pte_clear for the >4GB RAM caseZachary Amsden
Proposed fix for ptep_get_and_clear_full PAE bug. Pte_clear had the same bug, so use the same fix for both. Turns out pmd_clear had it as well, but pgds are not affected. The problem is rather intricate. Page table entries in PAE mode are 64-bits wide, but the only atomic 8-byte write operation available in 32-bit mode is cmpxchg8b, which is expensive (at least on P4), and thus avoided. But it can happen that the processor may prefetch entries into the TLB in the middle of an operation which clears a page table entry. So one must always clear the P-bit in the low word of the page table entry first when clearing it. Since the sequence *ptep = __pte(0) leaves the order of the write dependent on the compiler, it must be coded explicitly as a clear of the low word followed by a clear of the high word. Further, there must be a write memory barrier here to enforce proper ordering by the compiler (and, in the future, by the processor as well). On > 4GB memory machines, the implementation of pte_clear for PAE was clearly deficient, as it could leave virtual mappings of physical memory above 4GB aliased to memory below 4GB in the TLB. The implementation of ptep_get_and_clear_full has a similar bug, although not nearly as likely to occur, since the mappings being cleared are in the process of being destroyed, and should never be dereferenced again. But, as luck would have it, it is possible to trigger bugs even without ever dereferencing these bogus TLB mappings, even if the clear is followed fairly soon after with a TLB flush or invalidation. The problem is that memory above 4GB may now be aliased into the first 4GB of memory, and in fact, may hit a region of memory with non-memory semantics. These regions include AGP and PCI space. As such, these memory regions are not cached by the processor. This introduces the bug. The processor can speculate memory operations, including memory writes, as long as they are committed with the proper ordering. Speculating a memory write to a linear address that has a bogus TLB mapping is possible. Normally, the speculation is harmless. But for cached memory, it does leave the falsely speculated cacheline unmodified, but in a dirty state. This cache line will be eventually written back. If this cacheline happens to intersect a region of memory that is not protected by the cache coherency protocol, it can corrupt data in I/O memory, which is generally a very bad thing to do, and can cause total system failure or just plain undefined behavior. These bugs are extremely unlikely, but the severity is of such magnitude, and the fix so simple that I think fixing them immediately is justified. Also, they are nearly impossible to debug. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>