summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2007-10-07sysfs: store sysfs inode nrs in s_ino to avoid readdir oopses (CVE-2007-3104)Eric Sandeen
Backport of ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.22-rc1/2.6.22-rc1-mm1/broken-out/gregkh-driver-sysfs-allocate-inode-number-using-ida.patch For regular files in sysfs, sysfs_readdir wants to traverse sysfs_dirent->s_dentry->d_inode->i_ino to get to the inode number. But, the dentry can be reclaimed under memory pressure, and there is no synchronization with readdir. This patch follows Tejun's scheme of allocating and storing an inode number in the new s_ino member of a sysfs_dirent, when dirents are created, and retrieving it from there for readdir, so that the pointer chain doesn't have to be traversed. Tejun's upstream patch uses a new-ish "ida" allocator which brings along some extra complexity; this -stable patch has a brain-dead incrementing counter which does not guarantee uniqueness, but because sysfs doesn't hash inodes as iunique expects, uniqueness wasn't guaranteed today anyway. Adrian Bunk: Backported to 2.6.16. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Adrian Bunk <bunk@kernel.org>
2007-10-07Reset current->pdeath_signal on SUID binary execution (CVE-2007-3848)Marcel Holtmann
This fixes a vulnerability in the "parent process death signal" implementation discoverd by Wojciech Purczynski of COSEINC PTE Ltd. and iSEC Security Research. http://marc.info/?l=bugtraq&m=118711306802632&w=2 Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Adrian Bunk <bunk@kernel.org>
2007-07-22coda: wrong order of arguments of ->readdir()Al Viro
Shows how many people are testing coda - the bug had been there for 5 years and results of stepping on it are not subtle. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-07-22ntfs_init_locked_inode(): fix array indexingAndrew Morton
Local variable `i' is a byte-counter. Don't use it as an index into an array of le32's. Reported-by: "young dave" <hidave.darkstar@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-04-26aio: remove bare user-triggerable error printkZach Brown
The user can generate console output if they cause do_mmap() to fail during sys_io_setup(). This was seen in a regression test that does exactly that by spinning calling mmap() until it gets -ENOMEM before calling io_setup(). We don't need this printk at all, just remove it. Signed-off-by: Zach Brown <zach.brown@oracle.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-03-02fs/bad_inode.c 64bit fixAdrian Bunk
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-02-26fix ext3 block bitmap leakageKirill Korotaev
This patch fixes ext3 block bitmap leakage, which leads to the following fsck messages on _healthy_ filesystem: Block bitmap differences: -64159 -73707 All kernels up to 2.6.17 have this bug. Found by Vasily Averin <vvs@sw.ru> and Andrey Savochkin <saw@sawoct.com> Test case triggered the issue was created by Dmitry Monakhov <dmonakhov@sw.ru> Signed-Off-By: Kirill Korotaev <dev@openvz.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-02-21fix bad_inode_ops memory corruption (CVE-2006-5753)Eric Sandeen
CVE-2006-5753 is for a case where an inode can be marked bad, switching the ops to bad_inode_ops, which are all connected as: static int return_EIO(void) { return -EIO; } #define EIO_ERROR ((void *) (return_EIO)) static struct inode_operations bad_inode_ops = { .create = bad_inode_create ...etc... The problem here is that the void cast causes return types to not be promoted, and for ops such as listxattr which expect more than 32 bits of return value, the 32-bit -EIO is interpreted as a large positive 64-bit number, i.e. 0x00000000fffffffa instead of 0xfffffffa. This goes particularly badly when the return value is taken as a number of bytes to copy into, say, a user's buffer for example... I originally had coded up the fix by creating a return_EIO_<TYPE> macro for each return type, like this: static int return_EIO_int(void) { return -EIO; } #define EIO_ERROR_INT ((void *) (return_EIO_int)) static struct inode_operations bad_inode_ops = { .create = EIO_ERROR_INT, ...etc... but Al felt that it was probably better to create an EIO-returner for each actual op signature. Since so few ops share a signature, I just went ahead & created an EIO function for each individual file & inode op that returns a value. Adrian Bunk: backported to 2.6.16 Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-02-21Fix a free-wrong-pointer bug in nfs/acl server (CVE-2007-0772)Greg Banks
Due to type confusion, when an nfsacl verison 2 'ACCESS' request finishes and tries to clean up, it calls fh_put on entiredly the wrong thing and this can cause an oops. Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-02-13Fix up CIFS for "test_clear_page_dirty()" removalLinus Torvalds
This also adds he required page "writeback" flag handling, that cifs hasn't been doing and that the page dirty flag changes made obvious. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Acked-by: Steve French <smfltc@us.ibm.com>
2007-02-13fix umask when noACL kernel meets extN tuned for ACLsHugh Dickins
Fix insecure default behaviour reported by Tigran Aivazian: if an ext2 or ext3 filesystem is tuned to mount with "acl", but mounted by a kernel built without ACL support, then umask was ignored when creating inodes - though root or user has umask 022, touch creates files as 0666, and mkdir creates directories as 0777. This appears to have worked right until 2.6.11, when a fix to the default mode on symlinks (always 0777) assumed VFS applies umask: which it does, unless the mount is marked for ACLs; but ext[23] set MS_POSIXACL in s_flags according to s_mount_opt set according to def_mount_opts. We could revert to the 2.6.10 ext[23]_init_acl (adding an S_ISLNK test); but other filesystems only set MS_POSIXACL when ACLs are configured. We could fix this at another level; but it seems most robust to avoid setting the s_mount_opt flag in the first place (at the expense of more ifdefs). Likewise don't set the XATTR_USER flag when built without XATTR support. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-02-03reiserfs: avoid tail packing if an inode was ever mmappedVladimir Saveliev
This patch fixes a confusion reiserfs has for a long time. On release file operation reiserfs used to try to pack file data stored in last incomplete page of some files into metadata blocks. After packing the page got cleared with clear_page_dirty. It did not take into account that the page may be mmaped into other process's address space. Recent replacement for clear_page_dirty cancel_dirty_page found the confusion with sanity check that page has to be not mapped. The patch fixes the confusion by making reiserfs avoid tail packing if an inode was ever mmapped. reiserfs_mmap and reiserfs_file_release are serialized with mutex in reiserfs specific inode. reiserfs_mmap locks the mutex and sets a bit in reiserfs specific inode flags. reiserfs_file_release checks the bit having the mutex locked. If bit is set - tail packing is avoided. This eliminates a possibility that mmapped page gets cancel_page_dirty-ed. Signed-off-by: Vladimir Saveliev <vs@namesys.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-01-22adfs: fix filename handlingJames Bursa
Fix filenames on adfs discs being terminated at the first character greater than 128 (adfs filenames are Latin 1). I saw this problem when using a loopback adfs image on a 2.6.17-rc5 x86_64 machine, and the patch fixed it there. Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-01-20mm: fix bug in set_page_dirty_buffersNick Piggin
This was triggered, but not the fault of, the dirty page accounting patches. Suitable for -stable as well, after it goes upstream. Unable to handle kernel NULL pointer dereference at virtual address 0000004c EIP is at _spin_lock+0x12/0x66 Call Trace: [<401766e7>] __set_page_dirty_buffers+0x15/0xc0 [<401401e7>] set_page_dirty+0x2c/0x51 [<40140db2>] set_page_dirty_balance+0xb/0x3b [<40145d29>] __do_fault+0x1d8/0x279 [<40147059>] __handle_mm_fault+0x125/0x951 [<401133f1>] do_page_fault+0x440/0x59f [<4034d0c1>] error_code+0x39/0x40 [<08048a33>] 0x8048a33 ======================= Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-01-09skip data conversion in compat_sys_mount when data_page is NULLAndrey Mirkin
OpenVZ Linux kernel team has found a problem with mounting in compat mode. Simple command "mount -t smbfs ..." on Fedora Core 5 distro in 32-bit mode leads to oops: Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP: [<ffffffff802bc7c6>] compat_sys_mount+0xd6/0x290 PGD 34d48067 PUD 34d03067 PMD 0 Oops: 0000 [1] SMP CPU: 0 Modules linked in: iptable_nat simfs smbfs ip_nat ip_conntrack vzdquota parport_pc lp parport 8021q bridge llc vznetdev vzmon nfs lockd sunrpc vzdev iptable_filter af_packet xt_length ipt_ttl xt_tcpmss ipt_TCPMSS iptable_mangle xt_limit ipt_tos ipt_REJECT ip_tables x_tables thermal processor fan button battery asus_acpi ac uhci_hcd ehci_hcd usbcore i2c_i801 i2c_core e100 mii floppy ide_cd cdrom Pid: 14656, comm: mount RIP: 0060:[<ffffffff802bc7c6>] [<ffffffff802bc7c6>] compat_sys_mount+0xd6/0x290 RSP: 0000:ffff810034d31f38 EFLAGS: 00010292 RAX: 000000000000002c RBX: 0000000000000000 RCX: 0000000000000000 RDX: ffff810034c86bc0 RSI: 0000000000000096 RDI: ffffffff8061fc90 RBP: ffff810034d31f78 R08: 0000000000000000 R09: 000000000000000d R10: ffff810034d31e58 R11: 0000000000000001 R12: ffff810039dc3000 R13: 000000000805ea48 R14: 0000000000000000 R15: 00000000c0ed0000 FS: 0000000000000000(0000) GS:ffffffff80749000(0033) knlGS:00000000b7d556b0 CS: 0060 DS: 007b ES: 007b CR0: 000000008005003b CR2: 0000000000000000 CR3: 0000000034d43000 CR4: 00000000000006e0 Process mount (pid: 14656, veid=300, threadinfo ffff810034d30000, task ffff810034c86bc0) Stack: 0000000000000000 ffff810034dd0000 ffff810034e4a000 000000000805ea48 0000000000000000 0000000000000000 0000000000000000 0000000000000000 000000000805ea48 ffffffff8021e64e 0000000000000000 0000000000000000 Call Trace: [<ffffffff8021e64e>] ia32_sysret+0x0/0xa Code: 83 3b 06 0f 85 41 01 00 00 0f b7 43 0c 89 43 14 0f b7 43 0a RIP [<ffffffff802bc7c6>] compat_sys_mount+0xd6/0x290 RSP <ffff810034d31f38> CR2: 0000000000000000 The problem is that data_page pointer can be NULL, so we should skip data conversion in this case. Signed-off-by: Andrey Mirkin <amirkin@openvz.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-01-09corrupted cramfs filesystems cause kernel oops (CVE-2006-5823)Phillip Lougher
Steve Grubb's fzfuzzer tool (http://people.redhat.com/sgrubb/files/ fsfuzzer-0.6.tar.gz) generates corrupt Cramfs filesystems which cause Cramfs to kernel oops in cramfs_uncompress_block(). The cause of the oops is an unchecked corrupted block length field read by cramfs_readpage(). This patch adds a sanity check to cramfs_readpage() which checks that the block length field is sensible. The (PAGE_CACHE_SIZE << 1) size check is intentional, even though the uncompressed data is not going to be larger than PAGE_CACHE_SIZE, gzip sometimes generates compressed data larger than the original source data. Mkcramfs checks that the compressed size is always less than or equal to PAGE_CACHE_SIZE << 1. Of course Cramfs could use the original uncompressed data in this case, but it doesn't. Signed-off-by: Phillip Lougher <phillip@lougher.org.uk> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-01-09handle ext3 directory corruption better (CVE-2006-6053)Eric Sandeen
I've been using Steve Grubb's purely evil "fsfuzzer" tool, at http://people.redhat.com/sgrubb/files/fsfuzzer-0.4.tar.gz Basically it makes a filesystem, splats some random bits over it, then tries to mount it and do some simple filesystem actions. At best, the filesystem catches the corruption gracefully. At worst, things spin out of control. As you might guess, we found a couple places in ext3 where things spin out of control :) First, we had a corrupted directory that was never checked for consistency... it was corrupt, and pointed to another bad "entry" of length 0. The for() loop looped forever, since the length of ext3_next_entry(de) was 0, and we kept looking at the same pointer over and over and over and over... I modeled this check and subsequent action on what is done for other directory types in ext3_readdir... (adding this check adds some computational expense; I am testing a followup patch to reduce the number of times we check and re-check these directory entries, in all cases. Thanks for the idea, Andreas). Next we had a root directory inode which had a corrupted size, claimed to be > 200M on a 4M filesystem. There was only really 1 block in the directory, but because the size was so large, readdir kept coming back for more, spewing thousands of printk's along the way. Per Andreas' suggestion, if we're in this read error condition and we're trying to read an offset which is greater than i_blocks worth of bytes, stop trying, and break out of the loop. With these two changes fsfuzz test survives quite well on ext3. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-01-09ext2: skip pages past number of blocks in ext2_find_entry (CVE-2006-6054)Eric Sandeen
This one was pointed out on the MOKB site: http://kernelfun.blogspot.com/2006/11/mokb-09-11-2006-linux-26x-ext2checkpage.html If a directory's i_size is corrupted, ext2_find_entry() will keep processing pages until the i_size is reached, even if there are no more blocks associated with the directory inode. This patch puts in some minimal sanity-checking so that we don't keep checking pages (and issuing errors) if we know there can be no more data to read, based on the block count of the directory inode. This is somewhat similar in approach to the ext3 patch I sent earlier this year. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-01-09hfs_fill_super returns success even if no root inode (CVE-2006-6056)Eric Sandeen
http://kernelfun.blogspot.com/2006/11/mokb-14-11-2006-linux-26x-selinux.html mount that image... fs: filesystem was not cleanly unmounted, running fsck.hfs is recommended. mounting read-only. hfs: get root inode failed. BUG: unable to handle kernel NULL pointer dereference at virtual address 00000018 printing eip ... EIP is at superblock_doinit+0x21/0x767 ... [] selinux_sb_kern_mount+0xc/0x4b [] vfs_kern_mount+0x99/0xf6 [] do_kern_mount+0x2d/0x3e [] do_mount+0x5fa/0x66d [] sys_mount+0x77/0xae [] syscall_call+0x7/0xb DWARF2 unwinder stuck at syscall_call+0x7/0xb hfs_fill_super() returns success even if root_inode = hfs_iget(sb, &fd.search_key->cat, &rec); or sb->s_root = d_alloc_root(root_inode); fails. This superblock finds its way to superblock_doinit() which does: struct dentry *root = sb->s_root; struct inode *inode = root->d_inode; and boom. Need to make sure the error cases return an error, I think. [akpm@osdl.org: return -ENOMEM on oom] Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-01-09grow_buffers() infinite loop fix (CVE-2006-5757/CVE-2006-6060)Andrew Morton
If grow_buffers() is for some reason passed a block number which wants to li outside the maximum-addressable pagecache range (PAGE_SIZE * 4G bytes) then will accidentally truncate `index' and will then instnatiate a page at the wrong pagecache offset. This causes __getblk_slow() to go into an infinite loop. This can happen with corrupted disks, or with software errors elsewhere. Detect that, and handle it. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-01-04fuse: fix hang on SMPMiklos Szeredi
Fuse didn't always call i_size_write() with i_mutex held which caused rare hangs on SMP/32bit. This bug has been present since fuse-2.2, well before being merged into mainline. The simplest solution is to protect i_size_write() with the per-connection spinlock. Using i_mutex for this purpose would require some restructuring of the code and I'm not even sure it's always safe to acquire i_mutex in all places i_size needs to be set. Since most of vmtruncate is already duplicated for other reasons, duplicate the remaining part as well, making all i_size_write() calls internal to fuse. Using i_size_write() was unnecessary in fuse_init_inode(), since this function is only called on a newly created locked inode. Reported by a few people over the years, but special thanks to Dana Henriksen who was persistent enough in helping me debug it. Adrian Bunk: Backported to 2.6.16. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-12-18NFS: nfs_lookup - don't hash dentry when optimising away the lookupTrond Myklebust
If the open intents tell us that a given lookup is going to result in a, exclusive create, we currently optimize away the lookup call itself. The reason is that the lookup would not be atomic with the create RPC call, so why do it in the first place? A problem occurs, however, if the VFS aborts the exclusive create operation after the lookup, but before the call to create the file/directory: in this case we will end up with a hashed negative dentry in the dcache that has never been looked up. Fix this by only actually hashing the dentry once the create operation has been successfully completed. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-12-09binfmt_elf: fix checks for bad addressChuck Ebbert
Fix check for bad address; use macro instead of open-coding two checks. Taken from RHEL4 kernel update. From: Ernie Petrides <petrides@redhat.com> For background, the BAD_ADDR() macro should return TRUE if the address is TASK_SIZE, because that's the lowest address that is *not* valid for user-space mappings. The macro was correct in binfmt_aout.c but was wrong for the "equal to" case in binfmt_elf.c. There were two in-line validations of user-space addresses in binfmt_elf.c, which have been appropriately converted to use the corrected BAD_ADDR() macro in the patch you posted yesterday. Note that the size checks against TASK_SIZE are okay as coded. The additional changes that I propose are below. These are in the error paths for bad ELF entry addresses once load_elf_binary() has already committed to exec'ing the new image (following the tearing down of the task's original address space). The 1st hunk deals with the interp-side of the outer "if". There were two problems here. The printk() should be removed because this path can be triggered at will by a bogus interpreter image created and used by a malicious user. Further, the error code should not be ENOEXEC, because that causes the loop in search_binary_handler() to continue trying other exec handlers (twice, in fact). But it's too late for this to work correctly, because the user address space has already been torn down, and an exec() failure cannot be returned to the user code because the code no longer exists. The only recovery is to force a SIGSEGV, but it's best to terminate the search loop immediately. I somewhat arbitrarily chose EINVAL as a fallback error code, but any error returned by load_elf_interp() will override that (but this value will never be seen by user-space). The 2nd hunk deals with the non-interp-side of the outer "if". There were two problems here as well. The SIGSEGV needs to be forced, because a prior sigaction() syscall might have set the associated disposition to SIG_IGN. And the ENOEXEC should be changed to EINVAL as described above. Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-12-04fcntl(F_SETSIG) fixTrond Myklebust
fcntl(F_SETSIG) no longer works on leases because lease_release_private_callback() gets called as the lease is copied in order to initialise it. The problem is that lease_alloc() performs an unnecessary initialisation, which sets the lease_manager_ops. Avoid the problem by allocating the target lease structure using locks_alloc_lock(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-12-04JFS: pageno needs to be longDave Kleikamp
diRead and diWrite are representing the page number as an unsigned int. This causes file system corruption on volumes larger than 16TB. Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-29freevxfs: Add missing lock_kernel() to vxfs_readdirJosh Triplett
Commit 7b2fd697427e73c81d5fa659efd91bd07d303b0e in the historical GIT tree stopped calling the readdir member of a file_operations struct with the big kernel lock held, and fixed up all the readdir functions to do their own locking. However, that change added calls to unlock_kernel() in vxfs_readdir, but no call to lock_kernel(). Fix this by adding a call to lock_kernel(). Signed-off-by: Josh Triplett <josh@freedesktop.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-29add forgotten ->b_data in memcpy() call in ext3/resize.c (oopsable)Al Viro
sbi->s_group_desc is an array of pointers to buffer_head. memcpy() of buffer size from address of buffer_head is a bad idea - it will generate junk in any case, may oops if buffer_head is close to the end of slab page and next page is not mapped and isn't what was intended there. IOW, ->b_data is missing in that call. Fortunately, result doesn't go into the primary on-disk data structures, so only backup ones get crap written to them; that had allowed this bug to remain unnoticed until now. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-25CIFS: report rename failure when target file is locked by WindowsSteve French
Fixes Samba bugzilla bug # 4182 Rename by handle failures (retry after rename by path) were not being returned back. Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-24sysfs: remove duplicated dput in sysfs_update_fileHidetoshi Seto
Following function can drops d_count twice against one reference by lookup_one_len. <SOURCE> /** * sysfs_update_file - update the modified timestamp on an object attribute. * @kobj: object we're acting for. * @attr: attribute descriptor. */ int sysfs_update_file(struct kobject * kobj, const struct attribute * attr) { struct dentry * dir = kobj->dentry; struct dentry * victim; int res = -ENOENT; mutex_lock(&dir->d_inode->i_mutex); victim = lookup_one_len(attr->name, dir, strlen(attr->name)); if (!IS_ERR(victim)) { /* make sure dentry is really there */ if (victim->d_inode && (victim->d_parent->d_inode == dir->d_inode)) { victim->d_inode->i_mtime = CURRENT_TIME; fsnotify_modify(victim); /** * Drop reference from initial sysfs_get_dentry(). */ dput(victim); res = 0; } else d_drop(victim); /** * Drop the reference acquired from sysfs_get_dentry() above. */ dput(victim); } mutex_unlock(&dir->d_inode->i_mutex); return res; } </SOURCE> PCI-hotplug (drivers/pci/hotplug/pci_hotplug_core.c) is only user of this function. I confirmed that dentry of /sys/bus/pci/slots/XXX/* have negative d_count value. This patch removes unnecessary dput(). Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-20Fix BeFS slab corruptionDiego Calleja
In bugzilla #6941, Jens Kilian reported: "The function befs_utf2nls (in fs/befs/linuxvfs.c) writes a 0 byte past the end of a block of memory allocated via kmalloc(), leading to memory corruption. This happens only for filenames which are pure ASCII and a multiple of 4 bytes in length. [...] Without DEBUG_SLAB, this leads to further corruption and hard lockups; I believe this is the bug which has made kernels later than 2.6.8 unusable for me. (This must be due to changes in memory management, the bug has been in the BeFS driver since the time it was introduced (AFAICT).) Steps to reproduce: Create a directory (in BeOS, naturally :-) with files named, e.g., "1", "22", "333", "4444", ... Mount it in Linux and do an "ls" or "find"" This patch implements the suggested fix. Credits to Jens Kilian for debugging the problem and finding the right fix. Signed-off-by: Diego Calleja <diegocg@gmail.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-17ext3 -nobh option causes oopsBadari Pulavarty
For files other than IFREG, nobh option doesn't make sense. Modifications to them are journalled and needs buffer heads to do that. Without this patch, we get kernel oops in page_buffers(). Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-09[DISKLABEL] SUN: Fix signed int usage for sector countJeff Mahoney
The current sun disklabel code uses a signed int for the sector count. When partitions larger than 1 TB are used, the cast to a sector_t causes the partition sizes to be invalid: # cat /proc/paritions | grep sdan 66 112 2146435072 sdan 66 115 9223372036853660736 sdan3 66 120 9223372036853660736 sdan8 This patch switches the sector count to an unsigned int to fix this. Eric Sandeen also submitted the same patch. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-08Don't allow chmod() on the /proc/<pid>/ filesMarcel Holtmann
This just turns off chmod() on the /proc/<pid>/ files, since there is no good reason to allow it, and had we disallowed it originally, the nasty /proc race exploit wouldn't have been possible. The other patches already fixed the problem chmod() could cause, so this is really just some final mop-up.. This particular version is based off a patch by Eugene and Marcel which had much better naming than my original equivalent one. Signed-off-by: Eugene Teo <eteo@redhat.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-11-07[PATCH] md: Make sure bi_max_vecs is set properly in bio_splitNeil Brown
Else a subsequent bio_clone might make a mess. Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Jens Axboe <axboe@suse.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-10-14[CIFS] Allow cifsd to suspend if connection is lostRafael J. Wysocki
Make cifsd allow us to suspend if it has lost the connection with a server Ref: http://bugzilla.kernel.org/show_bug.cgi?id=6811 Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-10-14[CIFS] Fix typo in earlier cifs_unlink change and protect one extra path.Steve French
Since cifs_unlink can also be called from rename path and there was one report of oops am making the extra check for null inode. Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-10-14[CIFS] Fix unlink oops when indirectly called in rename error path under ↵Steve French
heavy stress. Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-10-14[CIFS] fs/cifs/dir.c: fix possible NULL dereferenceSteve French
Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-10-13fix fdset leakageKirill Korotaev
When found, it is obvious. nfds calculated when allocating fdsets is rewritten by calculation of size of fdtable, and when we are unlucky, we try to free fdsets of wrong size. Found due to OpenVZ resource management (User Beancounters). Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: Kirill Korotaev <dev@openvz.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-09-09Have ext2 reject file handles with bad inode numbers early.Neil Brown
This prevents bad inode numbers from triggering errors in ext2_get_inode. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-08-30fix struct file leakageKirill Korotaev
2.6.16 leaks like hell. While testing, I found massive filp leakage (reproduced in openvz) in the bowels of namei.c. Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-08-30Have ext3 reject file handles with bad inode numbers earlyEric Sandeen
blatantly ripped off from Neil Brown's ext2 patch. Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Acked-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-08-30ext3: avoid triggering ext3_error on bad NFS file handleNeil Brown
The inode number out of an NFS file handle gets passed eventually to ext3_get_inode_block() without any checking. If ext3_get_inode_block() allows it to trigger an error, then bad filehandles can have unpleasant effect - ext3_error() will usually cause a forced read-only remount, or a panic if `errors=panic' was used. So remove the call to ext3_error there and put a matching check in ext3/namei.c where inode numbers are read off storage. Andrew Morton fixed an off-by-one error. Dann Frazier ported the patch to 2.6.16. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-08-23Fix possible UDF deadlock and memory corruption (CVE-2006-4145)Jan Kara
UDF code is not really ready to handle extents larger that 1GB. This is the easy way to forbid creating those. Also truncation code did not count with the case when there are no extents in the file and we are extending the file. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-08-11fix debugfs inode leakJens Axboe
Looking at the reiser4 crash, I found a leak in debugfs. In debugfs_mknod(), we create the inode before checking if the dentry already has one attached. We don't free it if that is the case. Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-08-11Fix missing ret assignment in __bio_map_user() error pathJens Axboe
If get_user_pages() returns less pages than what we asked for, we jump to out_unmap which will return ERR_PTR(ret). But ret can contain a positive number just smaller than local_nr_pages, so be sure to set it to -EFAULT always. Problem found and diagnosed by Damien Le Moal <damien@sdl.hitachi.co.jp> Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-07-15[PATCH] Relax /proc fix a bitLinus Torvalds
Relax /proc fix a bit Clearign all of i_mode was a bit draconian. We only really care about S_ISUID/ISGID, after all. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-07-14[PATCH] Fix nasty /proc vulnerability (CVE-2006-3626)Linus Torvalds
Fix nasty /proc vulnerability We have a bad interaction with both the kernel and user space being able to change some of the /proc file status. This fixes the most obvious part of it, but I expect we'll also make it harder for users to modify even their "own" files in /proc. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-22[PATCH] NTFS: Critical bug fix (affects MIPS and possibly others)Anton Altaparmakov
It fixes a crash in NTFS on architectures where flush_dcache_page() is a real function. I never noticed this as all my testing is done on i386 where flush_dcache_page() is NULL. http://bugzilla.kernel.org/show_bug.cgi?id=6700 Many thanks to Pauline Ng for the detailed bug report and analysis! Signed-off-by: Anton Altaparmakov <aia21@cantab.net> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2006-06-22[PATCH] JFS: Fix multiple errors in metapage_releasepageDave Kleikamp
It looks like metapage_releasepage was making in invalid assumption that the releasepage method would not be called on a dirty page. Instead of issuing a warning and releasing the metapage, it should return 0, indicating that the private data for the page cannot be released. I also realized that metapage_releasepage had the return code all wrong. If it is successful in releasing the private data, it should return 1, otherwise it needs to return 0. Lastly, there is no need to call wait_on_page_writeback, since try_to_release_page will not call us with a page in writback state. Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>