summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2009-09-15JFFS2: add missing verify buffer allocation/deallocationMassimo Cirillo
commit bc8cec0dff072f1a45ce7f6b2c5234bb3411ac51 upstream. The function jffs2_nor_wbuf_flash_setup() doesn't allocate the verify buffer if CONFIG_JFFS2_FS_WBUF_VERIFY is defined, so causing a kernel panic when that macro is enabled and the verify function is called. Similarly the jffs2_nor_wbuf_flash_cleanup() must free the buffer if CONFIG_JFFS2_FS_WBUF_VERIFY is enabled. The following patch fixes the problem. The following patch applies to 2.6.30 kernel. Signed-off-by: Massimo Cirillo <maxcir@gmail.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-09-09OCFS2: fix build errorGreg Kroah-Hartman
Somehow a previous patch did not get committed correctly. This fixes the build. Thanks to Jayson King, Michael Tokarev, Joel Becker, and Chuck Ebbert for pointing out the problem, and the solution. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-09-08ocfs2: ocfs2_write_begin_nolock() should handle len=0Sunil Mushran
commit 8379e7c46cc48f51197dd663fc6676f47f2a1e71 upstream. Bug introduced by mainline commit e7432675f8ca868a4af365759a8d4c3779a3d922 The bug causes ocfs2_write_begin_nolock() to oops when len=0. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-09-08ocfs2: Initialize the cluster we're writing to in a non-sparse extendSunil Mushran
commit e7432675f8ca868a4af365759a8d4c3779a3d922 upstream. In a non-sparse extend, we correctly allocate (and zero) the clusters between the old_i_size and pos, but we don't zero the portions of the cluster we're writing to outside of pos<->len. It handles clustersize > pagesize and blocksize < pagesize. [Cleaned up by Joel Becker.] Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-08-17Revert "compat_ioctl: hook up compat handler for FIEMAP ioctl"Greg Kroah-Hartman
This reverts commit 9ac3664242f11fb38ea5029712bc77ee317fe38c. This ioctl is not present in the 2.6.27 tree. I incorrectly added this patch to this tree. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-08-16NFS: Fix an O_DIRECT Oops...Trond Myklebust
commit 1ae88b2e446261c038f2c0c3150ffae142b227a2 upstream. We can't call nfs_readdata_release()/nfs_writedata_release() without first initialising and referencing args.context. Doing so inside nfs_direct_read_schedule_segment()/nfs_direct_write_schedule_segment() causes an Oops. We should rather be calling nfs_readdata_free()/nfs_writedata_free() in those cases. Looking at the O_DIRECT code, the "struct nfs_direct_req" is already referencing the nfs_open_context for us. Since the readdata and writedata structures carry a reference to that, we can simplify things by getting rid of the extra nfs_open_context references, so that we can replace all instances of nfs_readdata_release()/nfs_writedata_release(). Reported-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Tested-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-08-16mm_for_maps: shift down_read(mmap_sem) to the callerOleg Nesterov
commit 00f89d218523b9bf6b522349c039d5ac80aa536d upstream. mm_for_maps() takes ->mmap_sem after security checks, this looks strange and obfuscates the locking rules. Move this lock to its single caller, m_start(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-08-16mm_for_maps: simplify, use ptrace_may_access()Oleg Nesterov
commit 13f0feafa6b8aead57a2a328e2fca6a5828bf286 upstream. It would be nice to kill __ptrace_may_access(). It requires task_lock(), but this lock is only needed to read mm->flags in the middle. Convert mm_for_maps() to use ptrace_may_access(), this also simplifies the code a little bit. Also, we do not need to take ->mmap_sem in advance. In fact I think mm_for_maps() should not play with ->mmap_sem at all, the caller should take this lock. With or without this patch, without ->cred_guard_mutex held we can race with exec() and get the new ->mm but check old creds. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-08-16flat: fix uninitialized ptr with shared libsLinus Torvalds
commit 3440625d78711bee41a84cf29c3d8c579b522666 upstream. The new credentials code broke load_flat_shared_library() as it now uses an uninitialized cred pointer. Reported-by: Bernd Schmidt <bernds_cb1@t-online.de> Tested-by: Bernd Schmidt <bernds_cb1@t-online.de> Cc: Mike Frysinger <vapier@gentoo.org> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-08-16compat_ioctl: hook up compat handler for FIEMAP ioctlEric Sandeen
commit 69130c7cf96ea853dc5be599dd6a4b98907d39cc upstream. The FIEMAP_IOC_FIEMAP mapping ioctl was missing a 32-bit compat handler, which means that 32-bit suerspace on 64-bit kernels cannot use this ioctl command. The structure is nicely aligned, padded, and sized, so it is just this simple. Tested w/ 32-bit ioctl tester (from Josef) on a 64-bit kernel on ext4. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Cc: <linux-ext4@vger.kernel.org> Cc: Mark Lord <lkml@rtr.ca> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Josef Bacik <josef@redhat.com> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-08-16sysfs: fix hardlink count on device_movePeter Oberparleiter
commit 0f58b44582001c8bcdb75f36cf85ebbe5170e959 upstream. Update directory hardlink count when moving kobjects to a new parent. Fixes the following problem which occurs when several devices are moved to the same parent and then unregistered: > ls -laF /sys/devices/css0/defunct/ > total 0 > drwxr-xr-x 4294967295 root root 0 2009-07-14 17:02 ./ > drwxr-xr-x 114 root root 0 2009-07-14 17:02 ../ > drwxr-xr-x 2 root root 0 2009-07-14 17:01 power/ > -rw-r--r-- 1 root root 4096 2009-07-14 17:01 uevent Signed-off-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-07-30eCryptfs: parse_tag_3_packet check tag 3 packet encrypted key size ↵Ramon de Carvalho Valle
(CVE-2009-2407) commit f151cd2c54ddc7714e2f740681350476cda03a28 upstream. The parse_tag_3_packet function does not check if the tag 3 packet contains a encrypted key size larger than ECRYPTFS_MAX_ENCRYPTED_KEY_BYTES. Signed-off-by: Ramon de Carvalho Valle <ramon@risesecurity.org> [tyhicks@linux.vnet.ibm.com: Added printk newline and changed goto to out_free] Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-07-30eCryptfs: Check Tag 11 literal data buffer size (CVE-2009-2406)Tyler Hicks
commit 6352a29305373ae6196491e6d4669f301e26492e upstream. Tag 11 packets are stored in the metadata section of an eCryptfs file to store the key signature(s) used to encrypt the file encryption key. After extracting the packet length field to determine the key signature length, a check is not performed to see if the length would exceed the key signature buffer size that was passed into parse_tag_11_packet(). Thanks to Ramon de Carvalho Valle for finding this bug using fsfuzzer. Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-07-30elf: fix one check-after-useAmerigo Wang
commit e2dbe12557d85d81f4527879499f55681c3cca4f upstream. Check before use it. Signed-off-by: WANG Cong <amwang@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: David Howells <dhowells@redhat.com> Acked-by: Roland McGrath <roland@redhat.com> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-07-02jbd: fix race in buffer processing in commit codeJan Kara
commit a61d90d75d0f9e86432c45b496b4b0fbf0fd03dc upstream. In commit code, we scan buffers attached to a transaction. During this scan, we sometimes have to drop j_list_lock and then we recheck whether the journal buffer head didn't get freed by journal_try_to_free_buffers(). But checking for buffer_jbd(bh) isn't enough because a new journal head could get attached to our buffer head. So add a check whether the journal head remained the same and whether it's still at the same transaction and list. This is a nasty bug and can cause problems like memory corruption (use after free) or trigger various assertions in JBD code (observed). Signed-off-by: Jan Kara <jack@suse.cz> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-11ext4: Fix race in ext4_inode_info.i_cached_extentTheodore Ts'o
(cherry picked from commit 2ec0ae3acec47f628179ee95fe2c4da01b5e9fc4) If two CPU's simultaneously call ext4_ext_get_blocks() at the same time, there is nothing protecting the i_cached_extent structure from being used and updated at the same time. This could potentially cause the wrong location on disk to be read or written to, including potentially causing the corruption of the block group descriptors and/or inode table. This bug has been in the ext4 code since almost the very beginning of ext4's development. Fortunately once the data is stored in the page cache cache, ext4_get_blocks() doesn't need to be called, so trying to replicate this problem to the point where we could identify its root cause was *extremely* difficult. Many thanks to Kevin Shanahan for working over several months to be able to reproduce this easily so we could finally nail down the cause of the corruption. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-11ext4: Clear the unwritten buffer_head flag after the extent is initializedAneesh Kumar K.V
(cherry picked from commit 2a8964d63d50dd2d65d71d342bc7fb6ef4117614) The BH_Unwritten flag indicates that the buffer is allocated on disk but has not been written; that is, the disk was part of a persistent preallocation area. That flag should only be set when a get_blocks() function is looking up a inode's logical to physical block mapping. When ext4_get_blocks_wrap() is called with create=1, the uninitialized extent is converted into an initialized one, so the BH_Unwritten flag is no longer appropriate. Hence, we need to make sure the BH_Unwritten is not left set, since the combination of BH_Mapped and BH_Unwritten is not allowed; among other things, it will result ext4's get_block() to be called over and over again during the write_begin phase of write(2). Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-11ext4: Use a fake block number for delayed new buffer_headAneesh Kumar K.V
(cherry picked from commit 33b9817e2ae097c7b8d256e3510ac6c54fc6d9d0) Use a very large unsigned number (~0xffff) as as the fake block number for the delayed new buffer. The VFS should never try to write out this number, but if it does, this will make it obvious. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-11ext4: Fix sub-block zeroing for writes into preallocated extentsAneesh Kumar K.V
(cherry picked from commit 9c1ee184a30394e54165fa4c15923cabd952c106) We need to mark the buffer_head mapping preallocated space as new during write_begin. Otherwise we don't zero out the page cache content properly for a partial write. This will cause file corruption with preallocation. Now that we mark the buffer_head new we also need to have a valid buffer_head blocknr so that unmap_underlying_metadata() unmaps the correct block. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-11ext4: Ignore i_file_acl_high unless EXT4_FEATURE_INCOMPAT_64BIT is presentTheodore Ts'o
(cherry picked from commit a9e817425dc0baede8ebe5fbc9984a640257432b) Don't try to look at i_file_acl_high unless the INCOMPAT_64BIT feature bit is set. The field is normally zero, but older versions of e2fsck didn't automatically check to make sure of this, so in the spirit of "be liberal in what you accept", don't look at i_file_acl_high unless we are using a 64-bit filesystem. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-11ext4: Fix softlockup caused by illegal i_file_acl value in on-disk inodeTheodore Ts'o
(cherry picked from commit 485c26ec70f823f2a9cf45982b724893e53a859e) If the block containing external extended attributes (which is stored in i_file_acl and i_file_acl_high) is larger than the on-disk filesystem, the process which tried to access the extended attributes will endlessly issue kernel printks complaining that "__find_get_block_slow() failed", locking up that CPU until the system is forcibly rebooted. So when we read in the inode, make sure the i_file_acl value is legal, and if not, flag the filesystem as being corrupted. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-11ext4: really print the find_group_flex fallback warning only onceChuck Ebbert
(cherry picked from commit 6b82f3cb2d480b7714eb0ff61aee99c22160389e) Missing braces caused the warning to print more than once. Signed-Off-By: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-11ext4: fix locking typo in mballoc which could cause soft lockup hangsTheodore Ts'o
upstream commit: e7c9e3e99adf6c49c5d593a51375916acc039d1e Smatch (http://repo.or.cz/w/smatch.git/) complains about the locking in ext4_mb_add_n_trim() from fs/ext4/mballoc.c 4438 list_for_each_entry_rcu(tmp_pa, &lg->lg_prealloc_list[order], 4439 pa_inode_list) { 4440 spin_lock(&tmp_pa->pa_lock); 4441 if (tmp_pa->pa_deleted) { 4442 spin_unlock(&pa->pa_lock); 4443 continue; 4444 } Brown paper bag time... Reported-by: Dan Carpenter <error27@gmail.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-11ext4: fix typo which causes a memory leak on error pathDan Carpenter
upstream commit: a7b19448ddbdc34b2b8fedc048ba154ca798667b This was found by smatch (http://repo.or.cz/w/smatch.git/) Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-11jbd2: Update locking comentsJan Kara
(cherry picked from commit 86db97c87f744364d5889ca8a4134ca2048b8f83) Update information about locking in JBD2 revoke code. Inconsistency in comments found by Lin Tan <tammy000@gmail.com> CC: Lin Tan <tammy000@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-11ext4: Check for an valid i_mode when reading the inode from diskTheodore Ts'o
(cherry picked from commit 563bdd61fe4dbd6b58cf7eb06f8d8f14479ae1dc) Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-11ext4: Fix discard of inode prealloc space with delayed allocation.Aneesh Kumar K.V
(cherry picked from commit d6014301b5599fba395c42a1e96a7fe86f7d0b2d) With delayed allocation we should not/cannot discard inode prealloc space during file close. We would still have dirty pages for which we haven't allocated blocks yet. With this fix after each get_blocks request we check whether we have zero reserved blocks and if yes and we don't have any writers on the file we discard inode prealloc space. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-11ext4: Automatically allocate delay allocated blocks on renameTheodore Ts'o
(cherry picked from commit 8750c6d5fcbd3342b3d908d157f81d345c5325a7) When renaming a file such that a link to another inode is overwritten, force any delay allocated blocks that to be allocated so that if the filesystem is mounted with data=ordered, the data blocks will be pushed out to disk along with the journal commit. Many application programs expect this, so we do this to avoid zero length files if the system crashes unexpectedly. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-11ext4: Automatically allocate delay allocated blocks on closeTheodore Ts'o
(cherry picked from commit 7d8f9f7d150dded7b68e61ca6403a1f166fb4edf) When closing a file that had been previously truncated, force any delay allocated blocks that to be allocated so that if the filesystem is mounted with data=ordered, the data blocks will be pushed out to disk along with the journal commit. Many application programs expect this, so we do this to avoid zero length files if the system crashes unexpectedly. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-11ext4: add EXT4_IOC_ALLOC_DA_BLKS ioctlTheodore Ts'o
(cherry picked from commit ccd2506bd43113659aa904d5bea5d1300605e2a6) Add an ioctl which forces all of the delay allocated blocks to be allocated. This also provides a function ext4_alloc_da_blocks() which will be used by the following commits to force files to be fully allocated to preserve application-expected ext3 behaviour. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-11ext4: return -EIO not -ESTALE on directory traversal through deleted inodeBryan Donlan
(cherry picked from commit e6f009b0b45220c004672d41a58865e94946104d) ext4_iget() returns -ESTALE if invoked on a deleted inode, in order to report errors to NFS properly. However, in ext4_lookup(), this -ESTALE can be propagated to userspace if the filesystem is corrupted such that a directory entry references a deleted inode. This leads to a misleading error message - "Stale NFS file handle" - and confusion on the part of the admin. The bug can be easily reproduced by creating a new filesystem, making a link to an unused inode using debugfs, then mounting and attempting to ls -l said link. This patch thus changes ext4_lookup to return -EIO if it receives -ESTALE from ext4_iget(), as ext4 does for other filesystem metadata corruption; and also invokes the appropriate ext*_error functions when this case is detected. Signed-off-by: Bryan Donlan <bdonlan@gmail.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-11ext4: tighten restrictions on inode flagsDuane Griffin
(cherry picked from commit 2dc6b0d48ca0599837df21b14bb8393d0804af57) At the moment there are few restrictions on which flags may be set on which inodes. Specifically DIRSYNC may only be set on directories and IMMUTABLE and APPEND may not be set on links. Tighten that to disallow TOPDIR being set on non-directories and only NODUMP and NOATIME to be set on non-regular file, non-directories. Introduces a flags masking function which masks flags based on mode and use it during inode creation and when flags are set via the ioctl to facilitate future consistency. Signed-off-by: Duane Griffin <duaneg@dghda.com> Acked-by: Andreas Dilger <adilger@sun.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-11ext4: don't inherit inappropriate inode flags from parentDuane Griffin
(cherry picked from commit 8fa43a81b97853fc69417bb6054182e78f95cbeb) At present INDEX and EXTENTS are the only flags that new ext4 inodes do NOT inherit from their parent. In addition prevent the flags DIRTY, ECOMPR, IMAGIC, TOPDIR, HUGE_FILE and EXT_MIGRATE from being inherited. List inheritable flags explicitly to prevent future flags from accidentally being inherited. This fixes the TOPDIR flag inheritance bug reported at http://bugzilla.kernel.org/show_bug.cgi?id=9866. Signed-off-by: Duane Griffin <duaneg@dghda.com> Acked-by: Andreas Dilger <adilger@sun.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-11ext4: fix bb_prealloc_list corruption due to wrong group lockingEric Sandeen
(cherry-picked from commit d33a1976fbee1ee321d6f014333d8f03a39d526c) This is for Red Hat bug 490026: EXT4 panic, list corruption in ext4_mb_new_inode_pa ext4_lock_group(sb, group) is supposed to protect this list for each group, and a common code flow to remove an album is like this: ext4_get_group_no_and_offset(sb, pa->pa_pstart, &grp, NULL); ext4_lock_group(sb, grp); list_del(&pa->pa_group_list); ext4_unlock_group(sb, grp); so it's critical that we get the right group number back for this prealloc context, to lock the right group (the one associated with this pa) and prevent concurrent list manipulation. however, ext4_mb_put_pa() passes in (pa->pa_pstart - 1) with a comment, "-1 is to protect from crossing allocation group". This makes sense for the group_pa, where pa_pstart is advanced by the length which has been used (in ext4_mb_release_context()), and when the entire length has been used, pa_pstart has been advanced to the first block of the next group. However, for inode_pa, pa_pstart is never advanced; it's just set once to the first block in the group and not moved after that. So in this case, if we subtract one in ext4_mb_put_pa(), we are actually locking the *previous* group, and opening the race with the other threads which do not subtract off the extra block. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-11ext4: fix bogus BUG_ONs in in mballoc codeEric Sandeen
(cherry picked from commit 8d03c7a0c550e7ab24cadcef5e66656bfadec8b9) Thiemo Nagel reported that: # dd if=/dev/zero of=image.ext4 bs=1M count=2 # mkfs.ext4 -v -F -b 1024 -m 0 -g 512 -G 4 -I 128 -N 1 \ -O large_file,dir_index,flex_bg,extent,sparse_super image.ext4 # mount -o loop image.ext4 mnt/ # dd if=/dev/zero of=mnt/file oopsed, with a BUG_ON in ext4_mb_normalize_request because size == EXT4_BLOCKS_PER_GROUP It appears to me (esp. after talking to Andreas) that the BUG_ON is bogus; a request of exactly EXT4_BLOCKS_PER_GROUP should be allowed, though larger sizes do indicate a problem. Fix that an another (apparently rare) codepath with a similar check. Reported-by: Thiemo Nagel <thiemo.nagel@ph.tum.de> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-11ext4: Print the find_group_flex() warning only onceTheodore Ts'o
(cherry picked from commit 2842c3b5449f31470b61db716f1926b594fb6156) This is a short-term warning, and even printk_ratelimit() can result in too much noise in system logs. So only print it once as a warning. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-11ext4: fix header check in ext4_ext_search_right() for deep extent trees.Eric Sandeen
(cherry picked from commit 395a87bfefbc400011417e9eaae33169f9f036c0) The ext4_ext_search_right() function is confusing; it uses a "depth" variable which is 0 at the root and maximum at the leaves, but the on-disk metadata uses a "depth" (actually eh_depth) which is opposite: maximum at the root, and 0 at the leaves. The ext4_ext_check_header() function is given a depth and checks the header agaisnt that depth; it expects the on-disk semantics, but we are giving it the opposite in the while loop in this function. We should be giving it the on-disk notion of "depth" which we can get from (p_depth - depth) - and if you look, the last (more commonly hit) call to ext4_ext_check_header() does just this. Sending in the wrong depth results in (incorrect) messages about corruption: EXT4-fs error (device sdb1): ext4_ext_search_right: bad header in inode #2621457: unexpected eh_depth - magic f30a, entries 340, max 340(0), depth 1(2) http://bugzilla.kernel.org/show_bug.cgi?id=12821 Reported-by: David Dindorp <ddi@dubex.dk> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-11ext4: fix ext4_free_inode() vs. ext4_claim_inode() raceEric Sandeen
(cherry picked from commit 7ce9d5d1f3c8736511daa413c64985a05b2feee3) I was seeing fsck errors on inode bitmaps after a 4 thread dbench run on a 4 cpu machine: Inode bitmap differences: -50736 -(50752--50753) etc... I believe that this is because ext4_free_inode() uses atomic bitops, and although ext4_new_inode() *used* to also use atomic bitops for synchronization, commit 393418676a7602e1d7d3f6e560159c65c8cbd50e changed this to use the sb_bgl_lock, so that we could also synchronize against read_inode_bitmap and initialization of uninit inode tables. However, that change left ext4_free_inode using atomic bitops, which I think leaves no synchronization between setting & unsetting bits in the inode table. The below patch fixes it for me, although I wonder if we're getting at all heavy-handed with this spinlock... Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-11nfs: Fix NFS v4 client handling of MAY_EXEC in nfs_permission.Frank Filz
commit 7ee2cb7f32b299c2b06a31fde155457203e4b7dd upstream. The problem is that permission checking is skipped if atomic open is possible, but when exec opens a file, it just opens it O_READONLY which means EXEC permission will not be checked at that time. This problem is observed by the following sequence (executed as root): mount -t nfs4 server:/ /mnt4 echo "ls" >/mnt4/foo chmod 744 /mnt4/foo su guest -c "mnt4/foo" Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Tested-by: Eugene Teo <eugeneteo@kernel.sg> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-19ocfs2: fix i_mutex locking in ocfs2_splice_to_file()Miklos Szeredi
commit 328eaaba4e41a04c1dc4679d65bea3fee4349d86 upstream. Rearrange locking of i_mutex on destination and call to ocfs2_rw_lock() so locks are only held while buffers are copied with the pipe_to_file() actor, and not while waiting for more data on the pipe. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-19splice: fix i_mutex locking in generic_splice_write()Miklos Szeredi
commit eb443e5a25d43996deb62b9bcee1a4ce5dea2ead upstream. Rearrange locking of i_mutex on destination so it's only held while buffers are copied with the pipe_to_file() actor, and not while waiting for more data on the pipe. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-19splice: remove i_mutex locking in splice_from_pipe()Miklos Szeredi
commit 2933970b960223076d6affcf7a77e2bc546b8102 upstream. splice_from_pipe() is only called from two places: - generic_splice_sendpage() - splice_write_null() Neither of these require i_mutex to be taken on the destination inode. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-19splice: split up __splice_from_pipe()Miklos Szeredi
commit b3c2d2ddd63944ef2a1e4a43077b602288107e01 upstream. Split up __splice_from_pipe() into four helper functions: splice_from_pipe_begin() splice_from_pipe_next() splice_from_pipe_feed() splice_from_pipe_end() splice_from_pipe_next() will wait (if necessary) for more buffers to be added to the pipe. splice_from_pipe_feed() will feed the buffers to the supplied actor and return when there's no more data available (or if all of the requested data has been copied). This is necessary so that implementations can do locking around the non-waiting splice_from_pipe_feed(). This patch should not cause any change in behavior. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-19NFS: Fix the notifications when renaming onto an existing fileTrond Myklebust
commit b1e4adf4ea41bb8b5a7bfc1a7001f137e65495df upstream. NFS appears to be returning an unnecessary "delete" notification when we're doing an atomic rename. See http://bugzilla.gnome.org/show_bug.cgi?id=575684 The fix is to get rid of the redundant call to d_delete(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-19nfsd4: check for negative dentry before use in nfsv4 readdirJ. Bruce Fields
commit b2c0cea6b1cb210e962f07047df602875564069e upstream. After 2f9092e1020246168b1309b35e085ecd7ff9ff72 "Fix i_mutex vs. readdir handling in nfsd" (and 14f7dd63 "Copy XFS readdir hack into nfsd code"), an entry may be removed between the first mutex_unlock and the second mutex_lock. In this case, lookup_one_len() will return a negative dentry. Check for this case to avoid a NULL dereference. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Reviewed-by: J. R. Okajima <hooanon05@yahoo.co.jp> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-19epoll: fix size check in epoll_create()Davide Libenzi
commit bfe3891a5f5d3b78146a45f40e435d14f5ae39dd upstream. Fix a size check WRT the manual pages. This was inadvertently broken by commit 9fe5ad9c8cef9ad5873d8ee55d1cf00d9b607df0 ("flag parameters add-on: remove epoll_create size param"). Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Cc: <Hiroyuki.Mach@gmail.com> Cc: rohit verma <rohit.170309@gmail.com> Cc: Ulrich Drepper <drepper@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-19cifs: Fix unicode string area word alignment in session setupJeff Layton
commit 27b87fe52baba0a55e9723030e76fce94fabcea4 refreshed. cifs: fix unicode string area word alignment in session setup The handling of unicode string area alignment is wrong. decode_unicode_ssetup improperly assumes that it will always be preceded by a pad byte. This isn't the case if the string area is already word-aligned. This problem, combined with the bad buffer sizing for the serverDomain string can cause memory corruption. The bad alignment can make it so that the alignment of the characters is off. This can make them translate to characters that are greater than 2 bytes each. If this happens we can overflow the allocation. Fix this by fixing the alignment in CIFS_SessSetup instead so we can verify it against the head of the response. Also, clean up the workaround for improperly terminated strings by checking for a odd-length unicode buffers and then forcibly terminating them. Finally, resize the buffer for serverDomain. Now that we've fixed the alignment, it's probably fine, but a malicious server could overflow it. A better solution for handling these strings is still needed, but this should be a suitable bandaid. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com> Cc: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-19cifs: Fix buffer size in cifs_convertUCSpathSuresh Jayaraman
Relevant commits 7fabf0c9479fef9fdb9528a5fbdb1cb744a744a4 and f58841666bc22e827ca0dcef7b71c7bc2758ce82. The upstream commits adds cifs_from_ucs2 that includes functionality of cifs_convertUCSpath and does cleanup. Reported-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de> Acked-by: Steve French <sfrench@us.ibm.com> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-19cifs: Fix incorrect destination buffer size in cifs_strncpy_to_hostSuresh Jayaraman
Relevant commits 968460ebd8006d55661dec0fb86712b40d71c413 and 066ce6899484d9026acd6ba3a8dbbedb33d7ae1b. Minimal hunks to fix buffer size and fix an existing problem pointed out by Guenter Kukuk that length of src is used for NULL termination of dst. cifs: Rename cifs_strncpy_to_host and fix buffer size There is a possibility for the path_name and node_name buffers to overflow if they contain charcters that are >2 bytes in the local charset. Resize the buffer allocation so to avoid this possibility. Also, as pointed out by Jeff Layton, it would be appropriate to rename the function to cifs_strlcpy_to_host to reflect the fact that the copied string is always NULL terminated. Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-05-19cifs: Increase size of tmp_buf in cifs_readdir to avoid potential overflowsSuresh Jayaraman
Commit 7b0c8fcff47a885743125dd843db64af41af5a61 refreshed and use a #define from commit f58841666bc22e827ca0dcef7b71c7bc2758ce82. cifs: Increase size of tmp_buf in cifs_readdir to avoid potential overflows Increase size of tmp_buf to possible maximum to avoid potential overflows. Also moved UNICODE_NAME_MAX definition so that it can be used elsewhere. Pointed-out-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>