summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2007-01-09skip data conversion in compat_sys_mount when data_page is NULLAndrey Mirkin
OpenVZ Linux kernel team has found a problem with mounting in compat mode. Simple command "mount -t smbfs ..." on Fedora Core 5 distro in 32-bit mode leads to oops: Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP: [<ffffffff802bc7c6>] compat_sys_mount+0xd6/0x290 PGD 34d48067 PUD 34d03067 PMD 0 Oops: 0000 [1] SMP CPU: 0 Modules linked in: iptable_nat simfs smbfs ip_nat ip_conntrack vzdquota parport_pc lp parport 8021q bridge llc vznetdev vzmon nfs lockd sunrpc vzdev iptable_filter af_packet xt_length ipt_ttl xt_tcpmss ipt_TCPMSS iptable_mangle xt_limit ipt_tos ipt_REJECT ip_tables x_tables thermal processor fan button battery asus_acpi ac uhci_hcd ehci_hcd usbcore i2c_i801 i2c_core e100 mii floppy ide_cd cdrom Pid: 14656, comm: mount RIP: 0060:[<ffffffff802bc7c6>] [<ffffffff802bc7c6>] compat_sys_mount+0xd6/0x290 RSP: 0000:ffff810034d31f38 EFLAGS: 00010292 RAX: 000000000000002c RBX: 0000000000000000 RCX: 0000000000000000 RDX: ffff810034c86bc0 RSI: 0000000000000096 RDI: ffffffff8061fc90 RBP: ffff810034d31f78 R08: 0000000000000000 R09: 000000000000000d R10: ffff810034d31e58 R11: 0000000000000001 R12: ffff810039dc3000 R13: 000000000805ea48 R14: 0000000000000000 R15: 00000000c0ed0000 FS: 0000000000000000(0000) GS:ffffffff80749000(0033) knlGS:00000000b7d556b0 CS: 0060 DS: 007b ES: 007b CR0: 000000008005003b CR2: 0000000000000000 CR3: 0000000034d43000 CR4: 00000000000006e0 Process mount (pid: 14656, veid=300, threadinfo ffff810034d30000, task ffff810034c86bc0) Stack: 0000000000000000 ffff810034dd0000 ffff810034e4a000 000000000805ea48 0000000000000000 0000000000000000 0000000000000000 0000000000000000 000000000805ea48 ffffffff8021e64e 0000000000000000 0000000000000000 Call Trace: [<ffffffff8021e64e>] ia32_sysret+0x0/0xa Code: 83 3b 06 0f 85 41 01 00 00 0f b7 43 0c 89 43 14 0f b7 43 0a RIP [<ffffffff802bc7c6>] compat_sys_mount+0xd6/0x290 RSP <ffff810034d31f38> CR2: 0000000000000000 The problem is that data_page pointer can be NULL, so we should skip data conversion in this case. Signed-off-by: Andrey Mirkin <amirkin@openvz.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-01-09corrupted cramfs filesystems cause kernel oops (CVE-2006-5823)Phillip Lougher
Steve Grubb's fzfuzzer tool (http://people.redhat.com/sgrubb/files/ fsfuzzer-0.6.tar.gz) generates corrupt Cramfs filesystems which cause Cramfs to kernel oops in cramfs_uncompress_block(). The cause of the oops is an unchecked corrupted block length field read by cramfs_readpage(). This patch adds a sanity check to cramfs_readpage() which checks that the block length field is sensible. The (PAGE_CACHE_SIZE << 1) size check is intentional, even though the uncompressed data is not going to be larger than PAGE_CACHE_SIZE, gzip sometimes generates compressed data larger than the original source data. Mkcramfs checks that the compressed size is always less than or equal to PAGE_CACHE_SIZE << 1. Of course Cramfs could use the original uncompressed data in this case, but it doesn't. Signed-off-by: Phillip Lougher <phillip@lougher.org.uk> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-01-09handle ext3 directory corruption better (CVE-2006-6053)Eric Sandeen
I've been using Steve Grubb's purely evil "fsfuzzer" tool, at http://people.redhat.com/sgrubb/files/fsfuzzer-0.4.tar.gz Basically it makes a filesystem, splats some random bits over it, then tries to mount it and do some simple filesystem actions. At best, the filesystem catches the corruption gracefully. At worst, things spin out of control. As you might guess, we found a couple places in ext3 where things spin out of control :) First, we had a corrupted directory that was never checked for consistency... it was corrupt, and pointed to another bad "entry" of length 0. The for() loop looped forever, since the length of ext3_next_entry(de) was 0, and we kept looking at the same pointer over and over and over and over... I modeled this check and subsequent action on what is done for other directory types in ext3_readdir... (adding this check adds some computational expense; I am testing a followup patch to reduce the number of times we check and re-check these directory entries, in all cases. Thanks for the idea, Andreas). Next we had a root directory inode which had a corrupted size, claimed to be > 200M on a 4M filesystem. There was only really 1 block in the directory, but because the size was so large, readdir kept coming back for more, spewing thousands of printk's along the way. Per Andreas' suggestion, if we're in this read error condition and we're trying to read an offset which is greater than i_blocks worth of bytes, stop trying, and break out of the loop. With these two changes fsfuzz test survives quite well on ext3. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-01-09ext2: skip pages past number of blocks in ext2_find_entry (CVE-2006-6054)Eric Sandeen
This one was pointed out on the MOKB site: http://kernelfun.blogspot.com/2006/11/mokb-09-11-2006-linux-26x-ext2checkpage.html If a directory's i_size is corrupted, ext2_find_entry() will keep processing pages until the i_size is reached, even if there are no more blocks associated with the directory inode. This patch puts in some minimal sanity-checking so that we don't keep checking pages (and issuing errors) if we know there can be no more data to read, based on the block count of the directory inode. This is somewhat similar in approach to the ext3 patch I sent earlier this year. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-01-09hfs_fill_super returns success even if no root inode (CVE-2006-6056)Eric Sandeen
http://kernelfun.blogspot.com/2006/11/mokb-14-11-2006-linux-26x-selinux.html mount that image... fs: filesystem was not cleanly unmounted, running fsck.hfs is recommended. mounting read-only. hfs: get root inode failed. BUG: unable to handle kernel NULL pointer dereference at virtual address 00000018 printing eip ... EIP is at superblock_doinit+0x21/0x767 ... [] selinux_sb_kern_mount+0xc/0x4b [] vfs_kern_mount+0x99/0xf6 [] do_kern_mount+0x2d/0x3e [] do_mount+0x5fa/0x66d [] sys_mount+0x77/0xae [] syscall_call+0x7/0xb DWARF2 unwinder stuck at syscall_call+0x7/0xb hfs_fill_super() returns success even if root_inode = hfs_iget(sb, &fd.search_key->cat, &rec); or sb->s_root = d_alloc_root(root_inode); fails. This superblock finds its way to superblock_doinit() which does: struct dentry *root = sb->s_root; struct inode *inode = root->d_inode; and boom. Need to make sure the error cases return an error, I think. [akpm@osdl.org: return -ENOMEM on oom] Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-01-09grow_buffers() infinite loop fix (CVE-2006-5757/CVE-2006-6060)Andrew Morton
If grow_buffers() is for some reason passed a block number which wants to li outside the maximum-addressable pagecache range (PAGE_SIZE * 4G bytes) then will accidentally truncate `index' and will then instnatiate a page at the wrong pagecache offset. This causes __getblk_slow() to go into an infinite loop. This can happen with corrupted disks, or with software errors elsewhere. Detect that, and handle it. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-01-04fuse: fix hang on SMPMiklos Szeredi
Fuse didn't always call i_size_write() with i_mutex held which caused rare hangs on SMP/32bit. This bug has been present since fuse-2.2, well before being merged into mainline. The simplest solution is to protect i_size_write() with the per-connection spinlock. Using i_mutex for this purpose would require some restructuring of the code and I'm not even sure it's always safe to acquire i_mutex in all places i_size needs to be set. Since most of vmtruncate is already duplicated for other reasons, duplicate the remaining part as well, making all i_size_write() calls internal to fuse. Using i_size_write() was unnecessary in fuse_init_inode(), since this function is only called on a newly created locked inode. Reported by a few people over the years, but special thanks to Dana Henriksen who was persistent enough in helping me debug it. Adrian Bunk: Backported to 2.6.16. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-12-18NFS: nfs_lookup - don't hash dentry when optimising away the lookupTrond Myklebust
If the open intents tell us that a given lookup is going to result in a, exclusive create, we currently optimize away the lookup call itself. The reason is that the lookup would not be atomic with the create RPC call, so why do it in the first place? A problem occurs, however, if the VFS aborts the exclusive create operation after the lookup, but before the call to create the file/directory: in this case we will end up with a hashed negative dentry in the dcache that has never been looked up. Fix this by only actually hashing the dentry once the create operation has been successfully completed. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-12-09binfmt_elf: fix checks for bad addressChuck Ebbert
Fix check for bad address; use macro instead of open-coding two checks. Taken from RHEL4 kernel update. From: Ernie Petrides <petrides@redhat.com> For background, the BAD_ADDR() macro should return TRUE if the address is TASK_SIZE, because that's the lowest address that is *not* valid for user-space mappings. The macro was correct in binfmt_aout.c but was wrong for the "equal to" case in binfmt_elf.c. There were two in-line validations of user-space addresses in binfmt_elf.c, which have been appropriately converted to use the corrected BAD_ADDR() macro in the patch you posted yesterday. Note that the size checks against TASK_SIZE are okay as coded. The additional changes that I propose are below. These are in the error paths for bad ELF entry addresses once load_elf_binary() has already committed to exec'ing the new image (following the tearing down of the task's original address space). The 1st hunk deals with the interp-side of the outer "if". There were two problems here. The printk() should be removed because this path can be triggered at will by a bogus interpreter image created and used by a malicious user. Further, the error code should not be ENOEXEC, because that causes the loop in search_binary_handler() to continue trying other exec handlers (twice, in fact). But it's too late for this to work correctly, because the user address space has already been torn down, and an exec() failure cannot be returned to the user code because the code no longer exists. The only recovery is to force a SIGSEGV, but it's best to terminate the search loop immediately. I somewhat arbitrarily chose EINVAL as a fallback error code, but any error returned by load_elf_interp() will override that (but this value will never be seen by user-space). The 2nd hunk deals with the non-interp-side of the outer "if". There were two problems here as well. The SIGSEGV needs to be forced, because a prior sigaction() syscall might have set the associated disposition to SIG_IGN. And the ENOEXEC should be changed to EINVAL as described above. Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-12-04fcntl(F_SETSIG) fixTrond Myklebust
fcntl(F_SETSIG) no longer works on leases because lease_release_private_callback() gets called as the lease is copied in order to initialise it. The problem is that lease_alloc() performs an unnecessary initialisation, which sets the lease_manager_ops. Avoid the problem by allocating the target lease structure using locks_alloc_lock(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-12-04JFS: pageno needs to be longDave Kleikamp
diRead and diWrite are representing the page number as an unsigned int. This causes file system corruption on volumes larger than 16TB. Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-29freevxfs: Add missing lock_kernel() to vxfs_readdirJosh Triplett
Commit 7b2fd697427e73c81d5fa659efd91bd07d303b0e in the historical GIT tree stopped calling the readdir member of a file_operations struct with the big kernel lock held, and fixed up all the readdir functions to do their own locking. However, that change added calls to unlock_kernel() in vxfs_readdir, but no call to lock_kernel(). Fix this by adding a call to lock_kernel(). Signed-off-by: Josh Triplett <josh@freedesktop.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-29add forgotten ->b_data in memcpy() call in ext3/resize.c (oopsable)Al Viro
sbi->s_group_desc is an array of pointers to buffer_head. memcpy() of buffer size from address of buffer_head is a bad idea - it will generate junk in any case, may oops if buffer_head is close to the end of slab page and next page is not mapped and isn't what was intended there. IOW, ->b_data is missing in that call. Fortunately, result doesn't go into the primary on-disk data structures, so only backup ones get crap written to them; that had allowed this bug to remain unnoticed until now. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-25CIFS: report rename failure when target file is locked by WindowsSteve French
Fixes Samba bugzilla bug # 4182 Rename by handle failures (retry after rename by path) were not being returned back. Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-24sysfs: remove duplicated dput in sysfs_update_fileHidetoshi Seto
Following function can drops d_count twice against one reference by lookup_one_len. <SOURCE> /** * sysfs_update_file - update the modified timestamp on an object attribute. * @kobj: object we're acting for. * @attr: attribute descriptor. */ int sysfs_update_file(struct kobject * kobj, const struct attribute * attr) { struct dentry * dir = kobj->dentry; struct dentry * victim; int res = -ENOENT; mutex_lock(&dir->d_inode->i_mutex); victim = lookup_one_len(attr->name, dir, strlen(attr->name)); if (!IS_ERR(victim)) { /* make sure dentry is really there */ if (victim->d_inode && (victim->d_parent->d_inode == dir->d_inode)) { victim->d_inode->i_mtime = CURRENT_TIME; fsnotify_modify(victim); /** * Drop reference from initial sysfs_get_dentry(). */ dput(victim); res = 0; } else d_drop(victim); /** * Drop the reference acquired from sysfs_get_dentry() above. */ dput(victim); } mutex_unlock(&dir->d_inode->i_mutex); return res; } </SOURCE> PCI-hotplug (drivers/pci/hotplug/pci_hotplug_core.c) is only user of this function. I confirmed that dentry of /sys/bus/pci/slots/XXX/* have negative d_count value. This patch removes unnecessary dput(). Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-20Fix BeFS slab corruptionDiego Calleja
In bugzilla #6941, Jens Kilian reported: "The function befs_utf2nls (in fs/befs/linuxvfs.c) writes a 0 byte past the end of a block of memory allocated via kmalloc(), leading to memory corruption. This happens only for filenames which are pure ASCII and a multiple of 4 bytes in length. [...] Without DEBUG_SLAB, this leads to further corruption and hard lockups; I believe this is the bug which has made kernels later than 2.6.8 unusable for me. (This must be due to changes in memory management, the bug has been in the BeFS driver since the time it was introduced (AFAICT).) Steps to reproduce: Create a directory (in BeOS, naturally :-) with files named, e.g., "1", "22", "333", "4444", ... Mount it in Linux and do an "ls" or "find"" This patch implements the suggested fix. Credits to Jens Kilian for debugging the problem and finding the right fix. Signed-off-by: Diego Calleja <diegocg@gmail.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-17ext3 -nobh option causes oopsBadari Pulavarty
For files other than IFREG, nobh option doesn't make sense. Modifications to them are journalled and needs buffer heads to do that. Without this patch, we get kernel oops in page_buffers(). Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-09[DISKLABEL] SUN: Fix signed int usage for sector countJeff Mahoney
The current sun disklabel code uses a signed int for the sector count. When partitions larger than 1 TB are used, the cast to a sector_t causes the partition sizes to be invalid: # cat /proc/paritions | grep sdan 66 112 2146435072 sdan 66 115 9223372036853660736 sdan3 66 120 9223372036853660736 sdan8 This patch switches the sector count to an unsigned int to fix this. Eric Sandeen also submitted the same patch. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-08Don't allow chmod() on the /proc/<pid>/ filesMarcel Holtmann
This just turns off chmod() on the /proc/<pid>/ files, since there is no good reason to allow it, and had we disallowed it originally, the nasty /proc race exploit wouldn't have been possible. The other patches already fixed the problem chmod() could cause, so this is really just some final mop-up.. This particular version is based off a patch by Eugene and Marcel which had much better naming than my original equivalent one. Signed-off-by: Eugene Teo <eteo@redhat.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-07[PATCH] md: Make sure bi_max_vecs is set properly in bio_splitNeil Brown
Else a subsequent bio_clone might make a mess. Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Jens Axboe <axboe@suse.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-10-14[CIFS] Allow cifsd to suspend if connection is lostRafael J. Wysocki
Make cifsd allow us to suspend if it has lost the connection with a server Ref: http://bugzilla.kernel.org/show_bug.cgi?id=6811 Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-10-14[CIFS] Fix typo in earlier cifs_unlink change and protect one extra path.Steve French
Since cifs_unlink can also be called from rename path and there was one report of oops am making the extra check for null inode. Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-10-14[CIFS] Fix unlink oops when indirectly called in rename error path under ↵Steve French
heavy stress. Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-10-14[CIFS] fs/cifs/dir.c: fix possible NULL dereferenceSteve French
Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-10-13fix fdset leakageKirill Korotaev
When found, it is obvious. nfds calculated when allocating fdsets is rewritten by calculation of size of fdtable, and when we are unlucky, we try to free fdsets of wrong size. Found due to OpenVZ resource management (User Beancounters). Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: Kirill Korotaev <dev@openvz.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-09-09Have ext2 reject file handles with bad inode numbers early.Neil Brown
This prevents bad inode numbers from triggering errors in ext2_get_inode. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-08-30fix struct file leakageKirill Korotaev
2.6.16 leaks like hell. While testing, I found massive filp leakage (reproduced in openvz) in the bowels of namei.c. Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-08-30Have ext3 reject file handles with bad inode numbers earlyEric Sandeen
blatantly ripped off from Neil Brown's ext2 patch. Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Acked-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-08-30ext3: avoid triggering ext3_error on bad NFS file handleNeil Brown
The inode number out of an NFS file handle gets passed eventually to ext3_get_inode_block() without any checking. If ext3_get_inode_block() allows it to trigger an error, then bad filehandles can have unpleasant effect - ext3_error() will usually cause a forced read-only remount, or a panic if `errors=panic' was used. So remove the call to ext3_error there and put a matching check in ext3/namei.c where inode numbers are read off storage. Andrew Morton fixed an off-by-one error. Dann Frazier ported the patch to 2.6.16. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-08-23Fix possible UDF deadlock and memory corruption (CVE-2006-4145)Jan Kara
UDF code is not really ready to handle extents larger that 1GB. This is the easy way to forbid creating those. Also truncation code did not count with the case when there are no extents in the file and we are extending the file. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-08-11fix debugfs inode leakJens Axboe
Looking at the reiser4 crash, I found a leak in debugfs. In debugfs_mknod(), we create the inode before checking if the dentry already has one attached. We don't free it if that is the case. Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-08-11Fix missing ret assignment in __bio_map_user() error pathJens Axboe
If get_user_pages() returns less pages than what we asked for, we jump to out_unmap which will return ERR_PTR(ret). But ret can contain a positive number just smaller than local_nr_pages, so be sure to set it to -EFAULT always. Problem found and diagnosed by Damien Le Moal <damien@sdl.hitachi.co.jp> Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-07-15[PATCH] Relax /proc fix a bitLinus Torvalds
Relax /proc fix a bit Clearign all of i_mode was a bit draconian. We only really care about S_ISUID/ISGID, after all. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-07-14[PATCH] Fix nasty /proc vulnerability (CVE-2006-3626)Linus Torvalds
Fix nasty /proc vulnerability We have a bad interaction with both the kernel and user space being able to change some of the /proc file status. This fixes the most obvious part of it, but I expect we'll also make it harder for users to modify even their "own" files in /proc. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-22[PATCH] NTFS: Critical bug fix (affects MIPS and possibly others)Anton Altaparmakov
It fixes a crash in NTFS on architectures where flush_dcache_page() is a real function. I never noticed this as all my testing is done on i386 where flush_dcache_page() is NULL. http://bugzilla.kernel.org/show_bug.cgi?id=6700 Many thanks to Pauline Ng for the detailed bug report and analysis! Signed-off-by: Anton Altaparmakov <aia21@cantab.net> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2006-06-22[PATCH] JFS: Fix multiple errors in metapage_releasepageDave Kleikamp
It looks like metapage_releasepage was making in invalid assumption that the releasepage method would not be called on a dirty page. Instead of issuing a warning and releasing the metapage, it should return 0, indicating that the private data for the page cannot be released. I also realized that metapage_releasepage had the return code all wrong. If it is successful in releasing the private data, it should return 1, otherwise it needs to return 0. Lastly, there is no need to call wait_on_page_writeback, since try_to_release_page will not call us with a page in writback state. Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-22[PATCH] fs/namei.c: Call to file_permission() under a spinlock in ↵Trond Myklebust
do_lookup_path() We're presently running lock_kernel() under fs_lock via nfs's ->permission handler. That's a ranking bug and sometimes a sleep-in-spinlock bug. This problem was introduced in the openat() patchset. We should not need to hold the current->fs->lock for a codepath that doesn't use current->fs. [vsu@altlinux.ru: fix error path] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Al Viro <viro@ftp.linux.org.uk> Signed-off-by: Sergey Vlasov <vsu@altlinux.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-22[PATCH] Missed error checking for intent's filp in open_namei().Oleg Drokin
It seems there is error check missing in open_namei for errors returned through intent.open.file (from lookup_instantiate_filp). If there is plain open performed, then such a check done inside __path_lookup_intent_open called from path_lookup_open(), but when the open is performed with O_CREAT flag set, then __path_lookup_intent_open is only called with LOOKUP_PARENT set where no file opening can occur yet. Later on lookup_hash is called where exact opening might take place and intent.open.file may be filled. If it is filled with error value of some sort, then we get kernel attempting to dereference this error value as address (and corresponding oops) in nameidata_to_filp() called from filp_open(). While this is relatively simple to workaround in ->lookup() method by just checking lookup_instantiate_filp() return value and returning error as needed, this is not so easy in ->d_revalidate(), where we can only return "yes, dentry is valid" or "no, dentry is invalid, perform full lookup again", and just returning 0 on error would cause extra lookup (with potential extra costly RPCs). So in short, I believe that there should be no difference in error handling for opening a file and creating a file in open_namei() and propose this simple patch as a solution. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-05-20[PATCH] fs/compat.c: fix 'if (a |= b )' typoAlexey Dobriyan
Mentioned by Mark Armbrust somewhere on Usenet. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David Woodhouse <dwmw2@infradead.org> Cc: Ulrich Drepper <drepper@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-05-20[PATCH] smbfs: Fix slab corruption in samba error pathJan Niehusmann
Yesterday, I got the following error with 2.6.16.13 during a file copy from a smb filesystem over a wireless link. I guess there was some error on the wireless link, which in turn caused an error condition for the smb filesystem. In the log, smb_file_read reports error=4294966784 (0xfffffe00), which also shows up in the slab dumps, and also is -ERESTARTSYS. Error code 27499 corresponds to 0x6b6b, so the rq_errno field seems to be the only one being set after freeing the slab. In smb_add_request (which is the only place in smbfs where I found ERESTARTSYS), I found the following: if (!timeleft || signal_pending(current)) { /* * On timeout or on interrupt we want to try and remove the * request from the recvq/xmitq. */ smb_lock_server(server); if (!(req->rq_flags & SMB_REQ_RECEIVED)) { list_del_init(&req->rq_queue); smb_rput(req); } smb_unlock_server(server); } [...] if (signal_pending(current)) req->rq_errno = -ERESTARTSYS; I guess that some codepath like smbiod_flush() caused the request to be removed from the queue, and smb_rput(req) be called, without SMB_REQ_RECEIVED being set. This violates an asumption made by the quoted code. Then, the above code calls smb_rput(req) again, the req gets freed, and req->rq_errno = -ERESTARTSYS writes into the already freed slab. As list_del_init doesn't cause an error if called multiple times, that does cause the observed behaviour (freed slab with rq_errno=-ERESTARTSYS). If this observation is correct, the following patch should fix it. I wonder why the smb code uses list_del_init everywhere - using list_del instead would catch such situations by poisoning the next and prev pointers. May 4 23:29:21 knautsch kernel: [17180085.456000] ipw2200: Firmware error detected. Restarting. May 4 23:29:21 knautsch kernel: [17180085.456000] ipw2200: Sysfs 'error' log captured. May 4 23:33:02 knautsch kernel: [17180306.316000] ipw2200: Firmware error detected. Restarting. May 4 23:33:02 knautsch kernel: [17180306.316000] ipw2200: Sysfs 'error' log already exists. May 4 23:33:02 knautsch kernel: [17180306.968000] smb_file_read: //some_file validation failed, error=4294966784 May 4 23:34:18 knautsch kernel: [17180383.256000] smb_file_read: //some_file validation failed, error=4294966784 May 4 23:34:18 knautsch kernel: [17180383.284000] SMB connection re-established (-5) May 4 23:37:19 knautsch kernel: [17180563.956000] smb_file_read: //some_file validation failed, error=4294966784 May 4 23:40:09 knautsch kernel: [17180733.636000] smb_file_read: //some_file validation failed, error=4294966784 May 4 23:40:26 knautsch kernel: [17180750.700000] smb_file_read: //some_file validation failed, error=4294966784 May 4 23:43:02 knautsch kernel: [17180907.304000] smb_file_read: //some_file validation failed, error=4294966784 May 4 23:43:08 knautsch kernel: [17180912.324000] smb_file_read: //some_file validation failed, error=4294966784 May 4 23:43:34 knautsch kernel: [17180938.416000] smb_errno: class Unknown, code 27499 from command 0x6b May 4 23:43:34 knautsch kernel: [17180938.416000] Slab corruption: start=c4ebe09c, len=244 May 4 23:43:34 knautsch kernel: [17180938.416000] Redzone: 0x5a2cf071/0x5a2cf071. May 4 23:43:34 knautsch kernel: [17180938.416000] Last user: [<e087b903>](smb_rput+0x53/0x90 [smbfs]) May 4 23:43:34 knautsch kernel: [17180938.416000] 000: 6b 6b 6b 6b 6b 6b 6b 6b 6a 6b 6b 6b 6b 6b 6b 6b May 4 23:43:34 knautsch kernel: [17180938.416000] 0f0: 00 fe ff ff May 4 23:43:34 knautsch kernel: [17180938.416000] Next obj: start=c4ebe19c, len=244 May 4 23:43:34 knautsch kernel: [17180938.416000] Redzone: 0x5a2cf071/0x5a2cf071. May 4 23:43:34 knautsch kernel: [17180938.416000] Last user: [<00000000>](_stext+0x3feffde0/0x30) May 4 23:43:34 knautsch kernel: [17180938.416000] 000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b May 4 23:43:34 knautsch kernel: [17180938.416000] 010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b May 4 23:43:34 knautsch kernel: [17180938.460000] SMB connection re-established (-5) May 4 23:43:42 knautsch kernel: [17180946.292000] ipw2200: Firmware error detected. Restarting. May 4 23:43:42 knautsch kernel: [17180946.292000] ipw2200: Sysfs 'error' log already exists. May 4 23:45:04 knautsch kernel: [17181028.752000] ipw2200: Firmware error detected. Restarting. May 4 23:45:04 knautsch kernel: [17181028.752000] ipw2200: Sysfs 'error' log already exists. May 4 23:45:05 knautsch kernel: [17181029.868000] smb_file_read: //some_file validation failed, error=4294966784 May 4 23:45:36 knautsch kernel: [17181060.984000] smb_errno: class Unknown, code 27499 from command 0x6b May 4 23:45:36 knautsch kernel: [17181060.984000] Slab corruption: start=c4ebe09c, len=244 May 4 23:45:36 knautsch kernel: [17181060.984000] Redzone: 0x5a2cf071/0x5a2cf071. May 4 23:45:36 knautsch kernel: [17181060.984000] Last user: [<e087b903>](smb_rput+0x53/0x90 [smbfs]) May 4 23:45:36 knautsch kernel: [17181060.984000] 000: 6b 6b 6b 6b 6b 6b 6b 6b 6a 6b 6b 6b 6b 6b 6b 6b May 4 23:45:36 knautsch kernel: [17181060.984000] 0f0: 00 fe ff ff May 4 23:45:36 knautsch kernel: [17181060.984000] Next obj: start=c4ebe19c, len=244 May 4 23:45:36 knautsch kernel: [17181060.984000] Redzone: 0x5a2cf071/0x5a2cf071. May 4 23:45:36 knautsch kernel: [17181060.984000] Last user: [<00000000>](_stext+0x3feffde0/0x30) May 4 23:45:36 knautsch kernel: [17181060.984000] 000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b May 4 23:45:36 knautsch kernel: [17181060.984000] 010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b May 4 23:45:36 knautsch kernel: [17181061.024000] SMB connection re-established (-5) May 4 23:46:17 knautsch kernel: [17181102.132000] smb_file_read: //some_file validation failed, error=4294966784 May 4 23:47:46 knautsch kernel: [17181190.468000] smb_errno: class Unknown, code 27499 from command 0x6b May 4 23:47:46 knautsch kernel: [17181190.468000] Slab corruption: start=c4ebe09c, len=244 May 4 23:47:46 knautsch kernel: [17181190.468000] Redzone: 0x5a2cf071/0x5a2cf071. May 4 23:47:46 knautsch kernel: [17181190.468000] Last user: [<e087b903>](smb_rput+0x53/0x90 [smbfs]) May 4 23:47:46 knautsch kernel: [17181190.468000] 000: 6b 6b 6b 6b 6b 6b 6b 6b 6a 6b 6b 6b 6b 6b 6b 6b May 4 23:47:46 knautsch kernel: [17181190.468000] 0f0: 00 fe ff ff May 4 23:47:46 knautsch kernel: [17181190.468000] Next obj: start=c4ebe19c, len=244 May 4 23:47:46 knautsch kernel: [17181190.468000] Redzone: 0x5a2cf071/0x5a2cf071. May 4 23:47:46 knautsch kernel: [17181190.468000] Last user: [<00000000>](_stext+0x3feffde0/0x30) May 4 23:47:46 knautsch kernel: [17181190.468000] 000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b May 4 23:47:46 knautsch kernel: [17181190.468000] 010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b May 4 23:47:46 knautsch kernel: [17181190.492000] SMB connection re-established (-5) May 4 23:49:20 knautsch kernel: [17181284.828000] smb_file_read: //some_file validation failed, error=4294966784 May 4 23:49:39 knautsch kernel: [17181303.896000] smb_file_read: //some_file validation failed, error=4294966784 Signed-off-by: Jan Niehusmann <jan@gondor.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-05-20[PATCH] fs/locks.c: Fix sys_flock() raceTrond Myklebust
sys_flock() currently has a race which can result in a double free in the multi-thread case. Thread 1 Thread 2 sys_flock(file, LOCK_EX) sys_flock(file, LOCK_UN) If Thread 2 removes the lock from inode->i_lock before Thread 1 tests for list_empty(&lock->fl_link) at the end of sys_flock, then both threads will end up calling locks_free_lock for the same lock. Fix is to make flock_lock_file() do the same as posix_lock_file(), namely to make a copy of the request, so that the caller can always free the lock. This also has the side-effect of fixing up a reference problem in the lockd handling of flock. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2006-05-10[PATCH] fs/locks.c: Fix lease_init (CVE-2006-1860)Trond Myklebust
It is insane to be giving lease_init() the task of freeing the lock it is supposed to initialise, given that the lock is not guaranteed to be allocated on the stack. This causes lockups in fcntl_setlease(). Problem diagnosed by Daniel Hokka Zakrisson <daniel@hozac.com> Also fix a slab leak in __setlease() due to an uninitialised return value. Problem diagnosed by Björn Steinbrink. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Tested-by: Daniel Hokka Zakrisson <daniel@hozac.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org> Cc: Björn Steinbrink <B.Steinbrink@gmx.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2006-05-04[PATCH] smbfs chroot issue (CVE-2006-1864)Olaf Kirch
Mark Moseley reported that a chroot environment on a SMB share can be left via "cd ..\\". Similar to CVE-2006-1863 issue with cifs, this fix is for smbfs. Steven French <sfrench@us.ibm.com> wrote: Looks fine to me. This should catch the slash on lookup or equivalent, which will be all obvious paths of interest. Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2006-05-01[PATCH] LSM: add missing hook to do_compat_readv_writev()James Morris
This patch addresses a flaw in LSM, where there is no mediation of readv() and writev() in for 32-bit compatible apps using a 64-bit kernel. This bug was discovered and fixed initially in the native readv/writev code [1], but was not fixed in the compat code. Thanks to Al for spotting this one. [1] http://lwn.net/Articles/154282/ Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2006-05-01[PATCH] Fix reiserfs deadlockJan Kara
reiserfs_cache_default_acl() should return whether we successfully found the acl or not. We have to return correct value even if reiserfs_get_acl() returns error code and not just 0. Otherwise callers such as reiserfs_mkdir() can unnecessarily lock the xattrs and later functions such as reiserfs_new_inode() fail to notice that we have already taken the lock and try to take it again with obvious consequences. Signed-off-by: Jan Kara <jack@suse.cz> Cc: <reiserfs-dev@namesys.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-05-01[PATCH] Simplify proc/devices and fix early termination regressionAndrew Morton
Repair /proc/devices early-termination regression. 2.6.16 broke /proc/devices. An application often gets an EOF before the end of data is reached, if that application uses a series of short read(2)s to access the data. I have used read buffers of varying sizes with varying degrees of unsuccess (larger sizes get further into the data than smaller sizes, following a simple pattern). It appears that the only safe way to get the data is to use a single read buffer larger than all the data in /proc/devices. The following example demonstates the problem: # dd if=/proc/devices bs=1 Character devices: 1 mem 27+0 records in 27+0 records out This patch is a backport of the fix recently accepted to Linus's tree: commit 68eef3b4791572ecb70249c7fb145bb3742dd899 [PATCH] Simplify proc/devices and fix early termination regression It replaces the complex, state-machine algorithm introduced in 2.6.16 with a simple algorithm, modeled on the implementation of /proc/interrupts. [akpm@osdl.org: cleanups, simplifications] Signed-off-by: Joe Korty <joe.korty@ccur.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-04-24[PATCH] Don't allow a backslash in a path component (CVE-2006-1863)Steve French
Unless Posix paths have been negotiated, the backslash, "\", is not a valid character in a path component. Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com> Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-04-24[PATCH] Fix file lookup without refDipankar Sarma
There are places in the kernel where we look up files in fd tables and access the file structure without holding refereces to the file. So, we need special care to avoid the race between looking up files in the fd table and tearing down of the file in another CPU. Otherwise, one might see a NULL f_dentry or such torn down version of the file. This patch fixes those special places where such a race may happen. Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com> Acked-by: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-04-24[PATCH] x86: be careful about tailcall breakage for sys_open[at] tooLinus Torvalds
x86: be careful about tailcall breakage for sys_open[at] too Came up through a quick grep for other cases similar to the ftruncate() one in commit 0a489cb3b6a7b277030cdbc97c2c65905db94536. Also, add a comment, so that people who read the code understand why we do what looks like a no-op. (Again, this won't actually matter to any sane user, since libc will save and restore the register gcc stomps on, but it's still wrong to stomp on it) Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-04-24[PATCH] x86: don't allow tail-calls in sys_ftruncate[64]()Linus Torvalds
x86: don't allow tail-calls in sys_ftruncate[64]() Gcc thinks it owns the incoming argument stack, but that's not true for "asmlinkage" functions, and it corrupts the caller-set-up argument stack when it pushes the third argument onto the stack. Which can result in %ebx getting corrupted in user space. Now, normally nobody sane would ever notice, since libc will save and restore %ebx anyway over the system call, but it's still wrong. I'd much rather have "asmlinkage" tell gcc directly that it doesn't own the stack, but no such attribute exists, so we're stuck with our hacky manual "prevent_tail_call()" macro once more (we've had the same issue before with sys_waitpid() and sys_wait4()). Thanks to Hans-Werner Hilse <hilse@sub.uni-goettingen.de> for reporting the issue and testing the fix. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>