summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2007-08-15[PATCH] pi-futex: Fix exit races and locking problemsAlexey Kuznetsov
1. New entries can be added to tsk->pi_state_list after task completed exit_pi_state_list(). The result is memory leakage and deadlocks. 2. handle_mm_fault() is called under spinlock. The result is obvious. 3. results in self-inflicted deadlock inside glibc. Sometimes futex_lock_pi returns -ESRCH, when it is not expected and glibc enters to for(;;) sleep() to simulate deadlock. This problem is quite obvious and I think the patch is right. Though it looks like each "if" in futex_lock_pi() got some stupid special case "else if". :-) 4. sometimes futex_lock_pi() returns -EDEADLK, when nobody has the lock. The reason is also obvious (see comment in the patch), but correct fix is far beyond my comprehension. I guess someone already saw this, the chunk: if (rt_mutex_trylock(&q.pi_state->pi_mutex)) ret = 0; is obviously from the same opera. But it does not work, because the rtmutex is really taken at this point: wake_futex_pi() of previous owner reassigned it to us. My fix works. But it looks very stupid. I would think about removal of shift of ownership in wake_futex_pi() and making all the work in context of process taking lock. From: Thomas Gleixner <tglx@linutronix.de> Fix 1) Avoid the tasklist lock variant of the exit race fix by adding an additional state transition to the exit code. This fixes also the issue, when a task with recursive segfaults is not able to release the futexes. Fix 2) Cleanup the lookup_pi_state() failure path and solve the -ESRCH problem finally. Fix 3) Solve the fixup_pi_state_owner() problem which needs to do the fixup in the lock protected section by using the in_atomic userspace access functions. This removes also the ugly lock drop / unqueue inside of fixup_pi_state() Fix 4) Fix a stale lock in the error path of futex_wake_pi() Added some error checks for verification. The -EDEADLK problem is solved by the rtmutex fixups. Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-08-15[PATCH] x86_64: allocate sparsemem memmap above 4GZou Nan hai
On systems with huge amount of physical memory, VFS cache and memory memmap may eat all available system memory under 4G, then the system may fail to allocate swiotlb bounce buffer. There was a fix for this issue in arch/x86_64/mm/numa.c, but that fix dose not cover sparsemem model. This patch add fix to sparsemem model by first try to allocate memmap above 4G. Signed-off-by: Zou Nan hai <nanhai.zou@intel.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Andi Kleen <ak@suse.de> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [chrisw: trivial backport] Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-08-15[PATCH] pci_ids: update patch for Intel ICH9MJason Gaston
This patch updates the Intel ICH9M LPC Controller DID's, due to a specification change. Signed-off-by: Jason Gaston <jason.d.gaston@intel.com> Cc: <stable@kernel.org> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-08-15[PATCH] make freezeable workqueues singlethreadOleg Nesterov
It is a known fact that freezeable multithreaded workqueues doesn't like CPU_DEAD. We keep them only for the incoming CPU-hotplug rework. Sadly, we can't just kill create_freezeable_workqueue() right now, make them singlethread. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-08-15[PATCH] md: Don't write more than is required of the last page of a bitmapNeilBrown
It is possible that real data or metadata follows the bitmap without full page alignment. So limit the last write to be only the required number of bytes, rounded up to the hard sector size of the device. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-06-11[PATCH] NET: Fix BMSR_100{HALF,FULL}2 defines in linux/mii.hDavid Miller
Noticed by Matvejchikov Ilya. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-06-11[PATCH] {ip, nf}_nat_proto_gre: do not modify/corrupt GREv0 packets through NATJorge Boncompte
While porting some changes of the 2.6.21-rc7 pptp/proto_gre conntrack and nat modules to a 2.4.32 kernel I noticed that the gre_key function returns a wrong pointer to the GRE key of a version 0 packet thus corrupting the packet payload. The intended behaviour for GREv0 packets is to act like nf_conntrack_proto_generic/nf_nat_proto_unknown so I have ripped the offending functions (not used anymore) and modified the nf_nat_proto_gre modules to not touch version 0 (non PPTP) packets. Signed-off-by: Jorge Boncompte <jorge@dti2.net> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-01Taskstats fix the structure members alignment issueBalbir Singh
We broke the the alignment of members of taskstats to the 8 byte boundary with the CSA patches. In the current kernel, the taskstats structure is not suitable for use by 32 bit applications in a 64 bit kernel. On x86_64 Offsets of taskstats' members (64 bit kernel, 64 bit application) @taskstats'offsetof[@taskstats'indices] = ( 0, # version 4, # ac_exitcode 8, # ac_flag 9, # ac_nice 16, # cpu_count 24, # cpu_delay_total 32, # blkio_count 40, # blkio_delay_total 48, # swapin_count 56, # swapin_delay_total 64, # cpu_run_real_total 72, # cpu_run_virtual_total 80, # ac_comm 112, # ac_sched 113, # ac_pad 116, # ac_uid 120, # ac_gid 124, # ac_pid 128, # ac_ppid 132, # ac_btime 136, # ac_etime 144, # ac_utime 152, # ac_stime 160, # ac_minflt 168, # ac_majflt 176, # coremem 184, # virtmem 192, # hiwater_rss 200, # hiwater_vm 208, # read_char 216, # write_char 224, # read_syscalls 232, # write_syscalls 240, # read_bytes 248, # write_bytes 256, # cancelled_write_bytes ); Offsets of taskstats' members (64 bit kernel, 32 bit application) @taskstats'offsetof[@taskstats'indices] = ( 0, # version 4, # ac_exitcode 8, # ac_flag 9, # ac_nice 12, # cpu_count 20, # cpu_delay_total 28, # blkio_count 36, # blkio_delay_total 44, # swapin_count 52, # swapin_delay_total 60, # cpu_run_real_total 68, # cpu_run_virtual_total 76, # ac_comm 108, # ac_sched 109, # ac_pad 112, # ac_uid 116, # ac_gid 120, # ac_pid 124, # ac_ppid 128, # ac_btime 132, # ac_etime 140, # ac_utime 148, # ac_stime 156, # ac_minflt 164, # ac_majflt 172, # coremem 180, # virtmem 188, # hiwater_rss 196, # hiwater_vm 204, # read_char 212, # write_char 220, # read_syscalls 228, # write_syscalls 236, # read_bytes 244, # write_bytes 252, # cancelled_write_bytes ); This is one way to solve the problem without re-arranging structure members is to pack the structure. The patch adds an __attribute__((aligned(8))) to the taskstats structure members so that 32 bit applications using taskstats can work with a 64 bit kernel. Using __attribute__((packed)) would break the 64 bit alignment of members. The fix was tested on x86_64. After the fix, we got Offsets of taskstats' members (64 bit kernel, 64 bit application) @taskstats'offsetof[@taskstats'indices] = ( 0, # version 4, # ac_exitcode 8, # ac_flag 9, # ac_nice 16, # cpu_count 24, # cpu_delay_total 32, # blkio_count 40, # blkio_delay_total 48, # swapin_count 56, # swapin_delay_total 64, # cpu_run_real_total 72, # cpu_run_virtual_total 80, # ac_comm 112, # ac_sched 113, # ac_pad 120, # ac_uid 124, # ac_gid 128, # ac_pid 132, # ac_ppid 136, # ac_btime 144, # ac_etime 152, # ac_utime 160, # ac_stime 168, # ac_minflt 176, # ac_majflt 184, # coremem 192, # virtmem 200, # hiwater_rss 208, # hiwater_vm 216, # read_char 224, # write_char 232, # read_syscalls 240, # write_syscalls 248, # read_bytes 256, # write_bytes 264, # cancelled_write_bytes ); Offsets of taskstats' members (64 bit kernel, 32 bit application) @taskstats'offsetof[@taskstats'indices] = ( 0, # version 4, # ac_exitcode 8, # ac_flag 9, # ac_nice 16, # cpu_count 24, # cpu_delay_total 32, # blkio_count 40, # blkio_delay_total 48, # swapin_count 56, # swapin_delay_total 64, # cpu_run_real_total 72, # cpu_run_virtual_total 80, # ac_comm 112, # ac_sched 113, # ac_pad 120, # ac_uid 124, # ac_gid 128, # ac_pid 132, # ac_ppid 136, # ac_btime 144, # ac_etime 152, # ac_utime 160, # ac_stime 168, # ac_minflt 176, # ac_majflt 184, # coremem 192, # virtmem 200, # hiwater_rss 208, # hiwater_vm 216, # read_char 224, # write_char 232, # read_syscalls 240, # write_syscalls 248, # read_bytes 256, # write_bytes 264, # cancelled_write_bytes ); Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Jay Lan <jlan@engr.sgi.com> Cc: Shailabh Nagar <nagar@watson.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-04-25[PATCH] IPV6: Disallow RH0 by default.YOSHIFUJI Hideaki
[IPV6]: Disallow RH0 by default. A security issue is emerging. Disallow Routing Header Type 0 by default as we have been doing for IPv4. Note: We allow RH2 by default because it is harmless. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-04-13ide: use correct IDE error recoverySuleiman Souhlal
ide: use correct IDE error recovery IDE error recovery is using IDLE IMMEDIATE if the drive is busy or has DRQ set. This violates the ATA spec (can only send IDLE IMMEDIATE when drive is not busy) and really hoses up some drives (modern drives will not be able to recover using this error handling). The correct thing to do is issue a SRST followed by a SET FEATURES command. This is what Western Digital recommends for error recovery and what Western Digital says Windows does.  It also does not violate the ATA spec as far as I can tell. Bart: * port the patch over the current tree * undo the recalibration code removal * send SET FEATURES command after checking for good drive status * don't check whether the current request is of REQ_TYPE_ATA_{CMD,TASK} type because we need to send SET FEATURES before handling any requests * some pre-ATA4 drives require INITIALIZE DEVICE PARAMETERS command before other commands (except IDENTIFY) so send SET FEATURES only if there are no pending drive->special requests * update comments and patch description * any bugs introduced by this patch are mine and not Suleiman's :-) Signed-off-by: Suleiman Souhlal <suleiman@google.com> Acked-by: Alan Cox <alan@redhat.com> Cc: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2007-04-13Fix IFB net driver input device crashesPatrick McHardy
[IFB]: Fix crash on input device removal The input_device pointer is not refcounted, which means the device may disappear while packets are queued, causing a crash when ifb passes packets with a stale skb->dev pointer to netif_rx(). Fix by storing the interface index instead and do a lookup where neccessary. Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-04-06UML - fix epollJeff Dike
UML/x86_64 needs the same packing of struct epoll_event as x86_64. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-04-06ide: remove clearing bmdma status from cdrom_decode_status() (rev #4)Albert Lee
ide: remove clearing bmdma status from cdrom_decode_status() (rev #4) patch 2/2: Remove clearing bmdma status from cdrom_decode_status() since ATA devices might need it as well. (http://lkml.org/lkml/2006/12/4/201 and http://lkml.org/lkml/2006/11/15/94) Signed-off-by: Albert Lee <albertcc@tw.ibm.com> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: "Adam W. Hawks" <awhawks@us.ibm.com> Cc: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-04-06ide: clear bmdma status in ide_intr() for ICHx controllers (revised #4)Albert Lee
ide: clear bmdma status in ide_intr() for ICHx controllers (revised #4) patch 1/2 (revised): - Fix drive->waiting_for_dma to work with CDB-intr devices. - Do the dma status clearing in ide_intr() and add a new hwif->ide_dma_clear_irq for Intel ICHx controllers. Revised per Alan, Sergei and Bart's advice. Patch against 2.6.20-rc6. Tested ok on my ICH4 and pdc20275 adapters. Please review/apply, thanks. Signed-off-by: Albert Lee <albertcc@tw.ibm.com> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Adam Hawks <awhawks@us.ibm.com> Cc: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-03-23fix MTIME_SEC_MAX on 32-bitThomas Gleixner
The maximum seconds value we can handle on 32bit is LONG_MAX. Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-03-13conntrack: fix {nf, ip}_ct_iterate_cleanup endless loopsPatrick McHardy
[NETFILTER]: conntrack: fix {nf,ip}_ct_iterate_cleanup endless loops Fix {nf,ip}_ct_iterate_cleanup unconfirmed list handling: - unconfirmed entries can not be killed manually, they are removed on confirmation or final destruction of the conntrack entry, which means we might iterate forever without making forward progress. This can happen in combination with the conntrack event cache, which holds a reference to the conntrack entry, which is only released when the packet makes it all the way through the stack or a different packet is handled. - taking references to an unconfirmed entry and using it outside the locked section doesn't work, the list entries are not refcounted and another CPU might already be waiting to destroy the entry What the code really wants to do is make sure the references of the hash table to the selected conntrack entries are released, so they will be destroyed once all references from skbs and the event cache are dropped. Since unconfirmed entries haven't even entered the hash yet, simply mark them as dying and skip confirmation based on that. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-03-09throttle_vm_writeout(): don't loop on GFP_NOFS and GFP_NOIO allocationsAndrew Morton
throttle_vm_writeout() is designed to wait for the dirty levels to subside. But if the caller holds IO or FS locks, we might be holding up that writeout. So change it to take a single nap to give other devices a chance to clean some memory, then return. Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: Pete Zaitcev <zaitcev@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-03-09lockdep: forward declare struct task_structHeiko Carstens
3117df0453828bd045c16244e6f50e5714667a8a causes this: In file included from arch/s390/kernel/early.c:13: include/linux/lockdep.h:300: warning: "struct task_struct" declared inside parameter list include/linux/lockdep.h:300: warning: its scope is only this definition or declaration, which is probably not what you want Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-03-09ufs: restore back support of openstepEvgeniy Dushistov
This is a fix of regression, which triggered by ~2.6.16. Patch with name ufs-directory-and-page-cache-from-blocks-to-pages.patch: in additional to conversation from block to page cache mechanism added new checks of directory integrity, one of them that directory entry do not across directory chunks. But some kinds of UFS: OpenStep UFS and Apple UFS (looks like these are the same filesystems) have different directory chunk size, then common UFSes(BSD and Solaris UFS). So this patch adds ability to works with variable size of directory chunks, and set it for ufstype=openstep to right size. Tested on darwin ufs. Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-03-09mmc: Power quirk for ENE controllersDarren Salt
mmc: Power quirk for ENE controllers Support for these devices was broken for 2.6.18-rc1 and later by commit 146ad66eac836c0b976c98f428d73e1f6a75270d, which added voltage level support. This restores the previous behaviour for these devices by ensuring that when the voltage is changed, only one write to set the voltage is performed. It may be that both writes are needed if the voltage is being changed between two non-zero values or that it's safe to ensure that only one write is done if the hardware only supports one voltage; I don't know whether either is the case nor can I test since I have only the one SD reader (1524:0550), and it supports just the one voltage. Signed-off-by: Darren Salt <linux@youmustbejoking.demon.co.uk> Signed-off-by: Pierre Ossman <drzeus@drzeus.cx> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-03-09x86: Don't require the vDSO for handling a.out signalsAndi Kleen
x86: Don't require the vDSO for handling a.out signals and in other strange binfmts. vDSO is not necessarily mapped there. This fixes signals in a.out programs Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-03-09Fix atmarp.h for userspaceDavid Miller
[ATM]: atmarp.h needs to always include linux/types.h To provide the __be* types, even for userspace includes. Reported by Andrew Walrond. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-03-09Fix recently introduced problem with shutting down a busy NFS server.NeilBrown
When the last thread of nfsd exits, it shuts down all related sockets. It currently uses svc_close_socket to do this, but that only is immediately effective if the socket is not SK_BUSY. If the socket is busy - i.e. if a request has arrived that has not yet been processes - svc_close_socket is not effective and the shutdown process spins. So create a new svc_force_close_socket which removes the SK_BUSY flag is set and then calls svc_close_socket. Also change some open-codes loops in svc_destroy to use list_for_each_entry_safe. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-03-09md: Avoid possible BUG_ON in md bitmap handling.Neil Brown
md/bitmap tracks how many active write requests are pending on blocks associated with each bit in the bitmap, so that it knows when it can clear the bit (when count hits zero). The counter has 14 bits of space, so if there are ever more than 16383, we cannot cope. Currently the code just calles BUG_ON as "all" drivers have request queue limits much smaller than this. However is seems that some don't. Apparently some multipath configurations can allow more than 16383 concurrent write requests. So, in this unlikely situation, instead of calling BUG_ON we now wait for the count to drop down a bit. This requires a new wait_queue_head, some waiting code, and a wakeup call. Tested by limiting the counter to 20 instead of 16383 (writes go a lot slower in that case...). Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> diff .prev/drivers/md/bitmap.c ./drivers/md/bitmap.c
2007-03-09knfsd: Fix a race in closing NFSd connections.NeilBrown
If you lose this race, it can iput a socket inode twice and you get a BUG in fs/inode.c When I added the option for user-space to close a socket, I added some cruft to svc_delete_socket so that I could call that function when closing a socket per user-space request. This was the wrong thing to do. I should have just set SK_CLOSE and let normal mechanisms do the work. Not only wrong, but buggy. The locking is all wrong and it openned up a race where-by a socket could be closed twice. So this patch: Introduces svc_close_socket which sets SK_CLOSE then either leave the close up to a thread, or calls svc_delete_socket if it can get SK_BUSY. Adds a bias to sk_busy which is removed when SK_DEAD is set, This avoid races around shutting down the socket. Changes several 'spin_lock' to 'spin_lock_bh' where the _bh was missing. Bugzilla-url: http://bugzilla.kernel.org/show_bug.cgi?id=7916 Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-02-02libata: Fix ata_busy_wait() kernel docsAlan
> Looks like you should use ata_busy_wait() here, rather than reproducing > the same code again. It waits in 10uS chunks while 1uS chunks were used in the workaround. Could indeed do that once I know the fix is right. While I'm at it the ata_busy_wait kerneldoc is borked so here's a fix Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-01[PATCH] efi_set_rtc_mmss() is not __initAl Viro
fix the extern in efi.h Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-30Merge branch 'for-linus' of ↵Linus Torvalds
master.kernel.org:/pub/scm/linux/kernel/git/jikos/hid * 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jikos/hid: HID: fix pb_fnmode and move it to generic HID HID: fix hid-input mapping for Firefly Mini Remote Control USB HID: fix hid_blacklist clash for 0x08ca/0x0010 HID: fix memleaking of collection
2007-01-30[PATCH] namespaces: fix task exit disasterSerge E. Hallyn
This is based on a patch by Eric W. Biederman, who pointed out that pid namespaces are still fake, and we only have one ever active. So for the time being, we can modify any code which could access tsk->nsproxy->pid_ns during task exit to just use &init_pid_ns instead, and move the exit_task_namespaces call in do_exit() back above exit_notify(), so that an exiting nfs server has a valid tsk->sighand to work with. Long term, pulling pid_ns out of nsproxy might be the cleanest solution. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> [ Eric's patch fixed to take care of free_pid() too ] Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-30Revert "[PATCH] namespaces: fix exit race by splitting exit"Linus Torvalds
This reverts commit 7a238fcba0629b6f2edbcd37458bae56fcf36be5 in preparation for a better and simpler fix proposed by Eric Biederman (and fixed up by Serge Hallyn) Acked-by: Serge E. Hallyn <serue@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-30Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/agpgartLinus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/davej/agpgart: [AGPGART] Add new IDs to VIA AGP. [AGPGART] Remove pointless assignment. [AGPGART] Remove pointless typedef in ati-agp [AGPGART] Prevent (unlikely) memory leak in amd_create_gatt_pages() [AGPGART] intel_agp: restore graphics device's pci space early in resume
2007-01-30Merge master.kernel.org:/pub/scm/linux/kernel/git/bart/ide-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/bart/ide-2.6: via82cxxx/pata_via: correct PCI_DEVICE_ID_VIA_SATA_EIDE ID and add support for CX700 and 8237S ide: unregister idepnp driver on unload ide: add missing __init tags to IDE PCI host drivers ia64: add pci_get_legacy_ide_irq() ide/generic: Jmicron has its own drivers now atiixp.c: add cable detection support for ATI IDE atiixp.c: sb600 ide only has one channel atiixp.c: remove unused code jmicron: fix warning ide: update MAINTAINERS entry
2007-01-30[PATCH] cdev.h: forward declarationsJan Engelhardt
Apparently this broke due to missing `struct inode' declaration. Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Cc: Noah Watkins <nwatkins@ittc.ku.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-30[PATCH] use __u8 rather than u8 in userspace SIZE defines in hdreg.hMike Frysinger
Use __u8 rather than u8 in SIZE defines exported to userspace. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-30[PATCH] use __u8/__u32 in userspace ioctl defines for I2OMike Frysinger
Make sure exported I2O ioctls utilize userspace safe types. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Markus Lidel <Markus.Lidel@shadowconnect.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-30[PATCH] namespaces: fix exit race by splitting exitSerge E. Hallyn
Fix exit race by splitting the nsproxy putting into two pieces. First piece reduces the nsproxy refcount. If we dropped the last reference, then it puts the mnt_ns, and returns the nsproxy as a hint to the caller. Else it returns NULL. The second piece of exiting task namespaces sets tsk->nsproxy to NULL, and drops the references to other namespaces and frees the nsproxy only if an nsproxy was passed in. A little awkward and should probably be reworked, but hopefully it fixes the NFS oops. Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Daniel Hokka Zakrisson <daniel@hozac.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-30HID: fix pb_fnmode and move it to generic HIDJiri Kosina
The apple powerbook people are used to switch the pb_fnmode setting at runtime through writing to sysfs, altering the module parameter value. This was broken for them in 2.6.20-rc1 when generic HID layer was introduced, as the pb_fnmode flag was made per-hiddevice, instead of global variable. This patch moves the pb_fnmode module parameter from usbhid module to hid module, but apart from that retains backward compatibility with respect to changing the mode through sysfs. Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2007-01-28[AGPGART] Add new IDs to VIA AGP.Dave Jones
Culled from the VIA codedrop. Also fixes up one ID used in amd64-agp to use the VIA part number instead of the board name in its ID. Signed-off-by: Dave Jones <davej@redhat.com>
2007-01-27via82cxxx/pata_via: correct PCI_DEVICE_ID_VIA_SATA_EIDE ID and add support ↵Josepch Chan
for CX700 and 8237S This patch: * Corrects the wrong device ID of PCI_DEVICE_ID_VIA_SATA_EIDE from 0x0581 to 0x5324. * Adds VIA CX700 and VT8237S support in drivers/ide/pci/via82cxxx.c * Adds VIA VT8237S support in drivers/ata/pata_via.c Signed-off-by: Josepch Chan <josephchan@via.com.tw> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2007-01-26Merge branch 'upstream-linus' of ↵Linus Torvalds
master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev: Fix Maple PATA IRQ assignment. ahci: use 0x80 as wait stat value instead of 0xff sata_via: style clean up, no indirect method call in LLD ahci: fix endianness in spurious interrupt message libata-sff: Don't call bmdma_stop on non DMA capable controllers libata: implement ATA_FLAG_IGN_SIMPLEX and use it in sata_uli ahci: improve and limit spurious interrupt messages, take#3 sata_via: don't diddle with ATA_NIEN in ->freeze libata: set_mode, Fix the FIXME libata hpt3xn: Hopefully sort out the DPLL logic versus the vendor code libata cmd64x: whack into a shape that looks like the documentation
2007-01-26[PATCH] md: fix potential memalloc deadlock in mdNeilBrown
If a GFP_KERNEL allocation is attempted in md while the mddev_lock is held, it is possible for a deadlock to eventuate. This happens if the array was marked 'clean', and the memalloc triggers a write-out to the md device. For the writeout to succeed, the array must be marked 'dirty', and that requires getting the mddev_lock. So, before attempting a GFP_KERNEL allocation while holding the lock, make sure the array is marked 'dirty' (unless it is currently read-only). Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-26[PATCH] knfsd: Fix type mismatch with filldir_t used by nfsdNeilBrown
nfsd defines a type 'encode_dent_fn' which is much like 'filldir_t' except that the first pointer is 'struct readdir_cd *' rather than 'void *'. It then casts encode_dent_fn points to 'filldir_t' as needed. This hides any other type mismatches between the two such as the fact that the 'ino' arg recently changed from ino_t to u64. So: get rid of 'encode_dent_fn', get rid of the cast of the function type, change the first arg of various functions from 'struct readdir_cd *' to 'void *', and live with the fact that we have a little less type checking on the calling of these functions now. Less internal (to nfsd) checking offset by more external checking, which is more important. Thanks to Gabriel Paubert <paubert@iram.es> for discovering this and providing an initial patch. Signed-off-by: Gabriel Paubert <paubert@iram.es> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-26[PATCH] fix various kernel-doc in header filesRobert P. J. Day
Fix a number of kernel-doc entries for header files in include/linux by making sure they begin with the appropriate '/**' notation and use @var notation. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-26[PATCH] knfsd: replace some warning ins nfsfh.h with BUG_ON or WARN_ONNeilBrown
A couple of the warnings will be followed by an Oops if they ever fire, so may as well be BUG_ON. Another isn't obviously fatal but has never been known to fire, so make it a WARN_ON. Cc: Adrian Bunk <bunk@stusta.de> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-26[PATCH] knfsd: fix an NFSD bug with full sized, non-page-aligned readsNeilBrown
NFSd assumes that largest number of pages that will be needed for a request+response is 2+N where N pages is the size of the largest permitted read/write request. The '2' are 1 for the non-data part of the request, and 1 for the non-data part of the reply. However, when a read request is not page-aligned, and we choose to use ->sendfile to send it directly from the page cache, we may need N+1 pages to hold the whole reply. This can overflow and array and cause an Oops. This patch increases size of the array for holding pages by one and makes sure that entry is NULL when it is not in use. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-26[PATCH] Add VM_ALWAYSDUMPRoland McGrath
This patch adds the VM_ALWAYSDUMP flag for vm_flags in vm_area_struct. This provides a clean explicit way to have a vma always included in core dumps, as is needed for vDSO's. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-26[PATCH] KVM: SVM: Propagate cpu shutdown events to userspaceJoerg Roedel
This patch implements forwarding of SHUTDOWN intercepts from the guest on to userspace on AMD SVM. A SHUTDOWN event occurs when the guest produces a triple fault (e.g. on reboot). This also fixes the bug that a guest reboot actually causes a host reboot under some circumstances. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-25libata: implement ATA_FLAG_IGN_SIMPLEX and use it in sata_uliTejun Heo
Some uli controllers have stuck SIMPLEX bit which can't be cleared with ata_pci_clear_simplex(), but the controller is capable of doing DMAs on both channels simultaneously. Implement ATA_FLAG_IGN_SIMPLEX which makes libata ignore the simplex bit and use it in sata_uli. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-01-24libata: set_mode, Fix the FIXMEAlan
When set_mode() changed ->set_mode didn't adapt. This makes the needed changes and removes the relevant FIXME case. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-01-24[PATCH] NFS: Fix races in nfs_revalidate_mapping()Trond Myklebust
Prevent the call to invalidate_inode_pages2() from racing with file writes by taking the inode->i_mutex across the page cache flush and invalidate. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>