summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2010-04-01coredump: suppress uid comparison test if core output files are pipesNeil Horman
commit 76595f79d76fbe6267a51b3a866a028d150f06d4 upstream. Modify uid check in do_coredump so as to not apply it in the case of pipes. This just got noticed in testing. The end of do_coredump validates the uid of the inode for the created file against the uid of the crashing process to ensure that no one can pre-create a core file with different ownership and grab the information contained in the core when they shouldn' tbe able to. This causes failures when using pipes for a core dumps if the crashing process is not root, which is the uid of the pipe when it is created. The fix is simple. Since the check for matching uid's isn't relevant for pipes (a process can't create a pipe that the uermodehelper code will open anyway), we can just just skip it in the event ispipe is non-zero Reverts a pipe-affecting change which was accidentally made in : commit c46f739dd39db3b07ab5deb4e3ec81e1c04a91af : Author: Ingo Molnar <mingo@elte.hu> : AuthorDate: Wed Nov 28 13:59:18 2007 +0100 : Commit: Linus Torvalds <torvalds@woody.linux-foundation.org> : CommitDate: Wed Nov 28 10:58:01 2007 -0800 : : vfs: coredumping fix Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: maximilian attems <max@stro.at> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01Fix 'flush_old_exec()/setup_new_exec()' splitLinus Torvalds
commit 7ab02af428c2d312c0cf8fb0b01cc1eb21131a3d upstream. Commit 221af7f87b9 ("Split 'flush_old_exec' into two functions") split the function at the point of no return - ie right where there were no more error cases to check. That made sense from a technical standpoint, but when we then also combined it with the actual personality setting going in between flush_old_exec() and setup_new_exec(), it needs to be a bit more careful. In particular, we need to make sure that we really flush the old personality bits in the 'flush' stage, rather than later in the 'setup' stage, since otherwise we might be flushing the _new_ personality state that we're just setting up. So this moves the flags and personality flushing (and 'flush_thread()', which is the arch-specific function that generally resets lazy FP state etc) of the old process into flush_old_exec(), so that it doesn't affect any state that execve() is setting up for the new process environment. This was reported by Michal Simek as breaking his Microblaze qemu environment. Reported-and-tested-by: Michal Simek <michal.simek@petalogix.com> Cc: Peter Anvin <hpa@zytor.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
2010-04-01Split 'flush_old_exec' into two functionsLinus Torvalds
commit 221af7f87b97431e3ee21ce4b0e77d5411cf1549 upstream. 'flush_old_exec()' is the point of no return when doing an execve(), and it is pretty badly misnamed. It doesn't just flush the old executable environment, it also starts up the new one. Which is very inconvenient for things like setting up the new personality, because we want the new personality to affect the starting of the new environment, but at the same time we do _not_ want the new personality to take effect if flushing the old one fails. As a result, the x86-64 '32-bit' personality is actually done using this insane "I'm going to change the ABI, but I haven't done it yet" bit (TIF_ABI_PENDING), with SET_PERSONALITY() not actually setting the personality, but just the "pending" bit, so that "flush_thread()" can do the actual personality magic. This patch in no way changes any of that insanity, but it does split the 'flush_old_exec()' function up into a preparatory part that can fail (still called flush_old_exec()), and a new part that will actually set up the new exec environment (setup_new_exec()). All callers are changed to trivially comply with the new world order. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
2010-04-01cifs: fix length calculation for converted unicode readdir namesJeff Layton
commit f12f98dba6ea1517cd7fbb912208893b9c014c15 upstream. cifs_from_ucs2 returns the length of the converted name, including the length of the NULL terminator. We don't want to include the NULL terminator in the dentry name length however since that'll throw off the hash calculation for the dentry cache. I believe that this is the root cause of several problems that have cropped up recently that seem to be papered over with the "noserverino" mount option. More confirmation of that would be good, but this is clearly a bug and it fixes at least one reproducible problem that was reported. This patch fixes at least this reproducer in this kernel.org bug: http://bugzilla.kernel.org/show_bug.cgi?id=15088#c12 Reported-by: Bjorn Tore Sund <bjorn.sund@it.uib.no> Acked-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01NFS: Fix a bug in nfs_fscache_release_page()Trond Myklebust
commit 2c1740098c708b465e87637b237feb2fd98f129a upstream. Not having an fscache cookie is perfectly valid if the user didn't mount with the fscache option. This patch fixes http://bugzilla.kernel.org/show_bug.cgi?id=15234 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01mm: replace various uses of num_physpages by totalram_pagesJan Beulich
commit 4481374ce88ba8f460c8b89f2572027bd27057d0 upstream. Sizing of memory allocations shouldn't depend on the number of physical pages found in a system, as that generally includes (perhaps a huge amount of) non-RAM pages. The amount of what actually is usable as storage should instead be used as a basis here. Some of the calculations (i.e. those not intending to use high memory) should likely even use (totalram_pages - totalhigh_pages). Signed-off-by: Jan Beulich <jbeulich@novell.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Dave Airlie <airlied@linux.ie> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: "David S. Miller" <davem@davemloft.net> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01fix LOOKUP_FOLLOW on automount "symlinks"Al Viro
commit ac278a9c505092dd82077a2446af8f9fc0d9c095 upstream. Make sure that automount "symlinks" are followed regardless of LOOKUP_FOLLOW; it should have no effect on them. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01inotify: fix coalesce duplicate events into a single event in special caseWei Yongjun
commit 3de0ef4f2067da58fa5126d821a56dcb98cdb565 upstream. If we do rename a dir entry, like this: rename("/tmp/ino7UrgoJ.rename1", "/tmp/ino7UrgoJ.rename2") rename("/tmp/ino7UrgoJ.rename2", "/tmp/ino7UrgoJ") The duplicate events should be coalesced into a single event. But those two events do not be coalesced into a single event, due to some bad check in event_compare(). It can not match the two NULL inodes as the same event. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01dnotify: ignore FS_EVENT_ON_CHILDAndreas Gruenbacher
commit 945526846a84c00adac1efd1c6befdaa77039623 upstream. Mask off FS_EVENT_ON_CHILD in dnotify_handle_event(). Otherwise, when there is more than one watch on a directory and dnotify_should_send_event() succeeds, events with FS_EVENT_ON_CHILD set will trigger all watches and cause spurious events. This case was overlooked in commit e42e2773. #define _GNU_SOURCE #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <signal.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <string.h> static void create_event(int s, siginfo_t* si, void* p) { printf("create\n"); } static void delete_event(int s, siginfo_t* si, void* p) { printf("delete\n"); } int main (void) { struct sigaction action; char *tmpdir, *file; int fd1, fd2; sigemptyset (&action.sa_mask); action.sa_flags = SA_SIGINFO; action.sa_sigaction = create_event; sigaction (SIGRTMIN + 0, &action, NULL); action.sa_sigaction = delete_event; sigaction (SIGRTMIN + 1, &action, NULL); # define TMPDIR "/tmp/test.XXXXXX" tmpdir = malloc(strlen(TMPDIR) + 1); strcpy(tmpdir, TMPDIR); mkdtemp(tmpdir); # define TMPFILE "/file" file = malloc(strlen(tmpdir) + strlen(TMPFILE) + 1); sprintf(file, "%s/%s", tmpdir, TMPFILE); fd1 = open (tmpdir, O_RDONLY); fcntl(fd1, F_SETSIG, SIGRTMIN); fcntl(fd1, F_NOTIFY, DN_MULTISHOT | DN_CREATE); fd2 = open (tmpdir, O_RDONLY); fcntl(fd2, F_SETSIG, SIGRTMIN + 1); fcntl(fd2, F_NOTIFY, DN_MULTISHOT | DN_DELETE); if (fork()) { /* This triggers a create event */ creat(file, 0600); /* This triggers a create and delete event (!) */ unlink(file); } else { sleep(1); rmdir(tmpdir); } return 0; } Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01Fix race in tty_fasync() properlyLinus Torvalds
commit 80e1e823989ec44d8e35bdfddadbddcffec90424 upstream. This reverts commit 703625118069 ("tty: fix race in tty_fasync") and commit b04da8bfdfbb ("fnctl: f_modown should call write_lock_irqsave/ restore") that tried to fix up some of the fallout but was incomplete. It turns out that we really cannot hold 'tty->ctrl_lock' over calling __f_setown, because not only did that cause problems with interrupt disables (which the second commit fixed), it also causes a potential ABBA deadlock due to lock ordering. Thanks to Tetsuo Handa for following up on the issue, and running lockdep to show the problem. It goes roughly like this: - f_getown gets filp->f_owner.lock for reading without interrupts disabled, so an interrupt that happens while that lock is held can cause a lockdep chain from f_owner.lock -> sighand->siglock. - at the same time, the tty->ctrl_lock -> f_owner.lock chain that commit 703625118069 introduced, together with the pre-existing sighand->siglock -> tty->ctrl_lock chain means that we have a lock dependency the other way too. So instead of extending tty->ctrl_lock over the whole __f_setown() call, we now just take a reference to the 'pid' structure while holding the lock, and then release it after having done the __f_setown. That still guarantees that 'struct pid' won't go away from under us, which is all we really ever needed. Reported-and-tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-by: Américo Wang <xiyou.wangcong@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01fnctl: f_modown should call write_lock_irqsave/restoreGreg Kroah-Hartman
commit b04da8bfdfbbd79544cab2fadfdc12e87eb01600 upstream. Commit 703625118069f9f8960d356676662d3db5a9d116 exposed that f_modown() should call write_lock_irqsave instead of just write_lock_irq so that because a caller could have a spinlock held and it would not be good to renable interrupts. Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Tavis Ormandy <taviso@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-01partitions: use sector size for EFI GPTKarel Zak
commit 7d13af3279985f554784a45cc961f706dbcdbdd1 upstream. Currently, kernel uses strictly 512-byte sectors for EFI GPT parsing. That's wrong. UEFI standard (version 2.3, May 2009, 5.3.1 GUID Format overview, page 95) defines that LBA is always based on the logical block size. It means bdev_logical_block_size() (aka BLKSSZGET) for Linux. This patch removes static sector size from EFI GPT parser. The problem is reproducible with the latest GNU Parted: # modprobe scsi_debug dev_size_mb=50 sector_size=4096 # ./parted /dev/sdb print Model: Linux scsi_debug (scsi) Disk /dev/sdb: 52.4MB Sector size (logical/physical): 4096B/4096B Partition Table: gpt Number Start End Size File system Name Flags 1 24.6kB 3002kB 2978kB primary 2 3002kB 6001kB 2998kB primary 3 6001kB 9003kB 3002kB primary # blockdev --rereadpt /dev/sdb # dmesg | tail -1 sdb: unknown partition table <---- !!! with this patch: # blockdev --rereadpt /dev/sdb # dmesg | tail -1 sdb: sdb1 sdb2 sdb3 Signed-off-by: Karel Zak <kzak@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01partitions: read whole sector with EFI GPT headerKarel Zak
commit 87038c2d5bda2418fda8b1456a0ae81cc3ff5bd8 upstream. The size of EFI GPT header is not static, but whole sector is allocated for the header. The HeaderSize field must be greater than 92 (= sizeof(struct gpt_header) and must be less than or equal to the logical block size. It means we have to read whole sector with the header, because the header crc32 checksum is calculated according to HeaderSize. For more details see UEFI standard (version 2.3, May 2009): - 5.3.1 GUID Format overview, page 93 - Table 13. GUID Partition Table Header, page 96 Signed-off-by: Karel Zak <kzak@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01ecryptfs: use after freeDan Carpenter
commit ece550f51ba175c14ec3ec047815927d7386ea1f upstream. The "full_alg_name" variable is used on a couple error paths, so we shouldn't free it until the end. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01ecryptfs: initialize private persistent file before dereferencing pointerErez Zadok
commit e27759d7a333d1f25d628c4f7caf845c51be51c2 upstream. Ecryptfs_open dereferences a pointer to the private lower file (the one stored in the ecryptfs inode), without checking if the pointer is NULL. Right afterward, it initializes that pointer if it is NULL. Swap order of statements to first initialize. Bug discovered by Duckjin Kang. Signed-off-by: Duckjin Kang <fromdj2k@gmail.com> Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu> Cc: Dustin Kirkland <kirkland@canonical.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01reiserfs: truncate blocks not used by a writeJan Kara
commit ec8e2f7466ca370f5e09000ca40a71759afc9ac8 upstream. It can happen that write does not use all the blocks allocated in write_begin either because of some filesystem error (like ENOSPC) or because page with data to write has been removed from memory. We truncate these blocks so that we don't have dangling blocks beyond i_size. Cc: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01inotify: only warn once for inotify problemsEric Paris
commit 976ae32be45a736acd49215a7e4771ff91f161c3 upstream. inotify will WARN() if it finds that the idr and the fsnotify internals somehow got out of sync. It was only supposed to do this once but due to this stupid bug it would warn every single time a problem was detected. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-04-01inotify: do not reuse watch descriptorsEric Paris
commit 9e572cc9877ee6c43af60778f6b8d5ba0692d935 upstream. Since commit 7e790dd5fc937bc8d2400c30a05e32a9e9eef276 ("inotify: fix error paths in inotify_update_watch") inotify changed the manor in which it gave watch descriptors back to userspace. Previous to this commit inotify acted like the following: inotify_add_watch(X, Y, Z) = 1 inotify_rm_watch(X, 1); inotify_add_watch(X, Y, Z) = 2 but after this patch inotify would return watch descriptors like so: inotify_add_watch(X, Y, Z) = 1 inotify_rm_watch(X, 1); inotify_add_watch(X, Y, Z) = 1 which I saw as equivalent to opening an fd where open(file) = 1; close(1); open(file) = 1; seemed perfectly reasonable. The issue is that quite a bit of userspace apparently relies on the behavior in which watch descriptors will not be quickly reused. KDE relies on it, I know some selinux packages rely on it, and I have heard complaints from other random sources such as debian bug 558981. Although the man page implies what we do is ok, we broke userspace so this patch almost reverts us to the old behavior. It is still slightly racey and I have patches that would fix that, but they are rather large and this will fix it for all real world cases. The race is as follows: - task1 creates a watch and blocks in idr_new_watch() before it updates the hint. - task2 creates a watch and updates the hint. - task1 updates the hint with it's older wd - task removes the watch created by task2 - task adds a new watch and will reuse the wd originally given to task2 it requires moving some locking around the hint (last_wd) but this should solve it for the real world and be -stable safe. As a side effect this patch papers over a bug in the lib/idr code which is causing a large number WARN's to pop on people's system and many reports in kerneloops.org. I'm working on the root cause of that idr bug seperately but this should make inotify immune to that issue. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-01-18quota: Fix dquot_transfer for filesystems different from ext4Jan Kara
commit 05b5d898235401c489c68e1f3bc5706a29ad5713 upstream. Commit fd8fbfc1 modified the way we find amount of reserved space belonging to an inode. The amount of reserved space is checked from dquot_transfer and thus inode_reserved_space gets called even for filesystems that don't provide get_reserved_space callback which results in a BUG. Fix the problem by checking get_reserved_space callback and return 0 if the filesystem does not provide it. CC: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-01-18fasync: split 'fasync_helper()' into separate add/remove functionsLinus Torvalds
commit 53281b6d34d44308372d16acb7fb5327609f68b6 upstream. Yes, the add and remove cases do share the same basic loop and the locking, but the compiler can inline and then CSE some of the end result anyway. And splitting it up makes the code way easier to follow, and makes it clearer exactly what the semantics are. In particular, we must make sure that the FASYNC flag in file->f_flags exactly matches the state of "is this file on any fasync list", since not only is that flag visible to user space (F_GETFL), but we also use that flag to check whether we need to remove any fasync entries on file close. We got that wrong for the case of a mixed use of file locking (which tries to remove any fasync entries for file leases) and fasync. Splitting the function up also makes it possible to do some future optimizations without making the function even messier. In particular, since the FASYNC flag has to match the state of "is this on a list", we can do the following future optimizations: - on remove, we don't even need to get the locks and traverse the list if FASYNC isn't set, since we can know a priori that there is no point (this is effectively the same optimization that we already do in __fput() wrt removing fasync on file close) - on add, we can use the FASYNC flag to decide whether we are changing an existing entry or need to allocate a new one. but this is just the cleanup + fix for the FASYNC flag. Acked-by: Al Viro <viro@ZenIV.linux.org.uk> Tested-by: Tavis Ormandy <taviso@google.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Matt Mackall <mpm@selenic.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-01-06ext4: fix sleep inside spinlock issue with quota and dealloc (#14739)Dmitry Monakhov
commit 39bc680a8160bb9d6743f7873b535d553ff61058 upstream. Unlock i_block_reservation_lock before vfs_dq_reserve_block(). This patch fixes http://bugzilla.kernel.org/show_bug.cgi?id=14739 Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-01-06ext4: Convert to generic reserved quota's space management.Dmitry Monakhov
commit a9e7f4472075fb6937c545af3f6329e9946bbe66 upstream. This patch also fixes write vs chown race condition. Acked-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-01-06quota: decouple fs reserved space from quota reservationDmitry Monakhov
commit fd8fbfc1709822bd94247c5b2ab15a5f5041e103 upstream. Currently inode_reservation is managed by fs itself and this reservation is transfered on dquot_transfer(). This means what inode_reservation must always be in sync with dquot->dq_dqb.dqb_rsvspace. Otherwise dquot_transfer() will result in incorrect quota(WARN_ON in dquot_claim_reserved_space() will be triggered) This is not easy because of complex locking order issues for example http://bugzilla.kernel.org/show_bug.cgi?id=14739 The patch introduce quota reservation field for each fs-inode (fs specific inode is used in order to prevent bloating generic vfs inode). This reservation is managed by quota code internally similar to i_blocks/i_bytes and may not be always in sync with internal fs reservation. Also perform some code rearrangement: - Unify dquot_reserve_space() and dquot_reserve_space() - Unify dquot_release_reserved_space() and dquot_free_space() - Also this patch add missing warning update to release_rsv() dquot_release_reserved_space() must call flush_warnings() as dquot_free_space() does. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-01-06Add unlocked version of inode_add_bytes() functionDmitry Monakhov
commit b462707e7ccad058ae151e5c5b06eb5cadcb737f upstream. Quota code requires unlocked version of this function. Off course we can just copy-paste the code, but copy-pasting is always an evil. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-01-06generic_permission: MAY_OPEN is not write accessSerge E. Hallyn
commit 7ea6600148c265b1fd53e521022b1d7aec81d974 upstream. generic_permission was refusing CAP_DAC_READ_SEARCH-enabled processes from opening DAC-protected files read-only, because do_filp_open adds MAY_OPEN to the open mask. Ignore MAY_OPEN. After this patch, CAP_DAC_READ_SEARCH is again sufficient to open(fname, O_RDONLY) on a file to which DAC otherwise refuses us read permission. Reported-by: Mike Kazantsev <mk.fraggod@gmail.com> Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Tested-by: Mike Kazantsev <mk.fraggod@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-01-06XFS bug in log recover with quota (bugzilla id 855)Jan Rekorajski
commit 8ec6dba2581754e375be66f7bedd708d856d8b30 upstream. Hi, I was hit by a bug in linux 2.6.31 when XFS is not able to recover the log after a crash if fs was mounted with quotas. Gory details in XFS bugzilla: http://oss.sgi.com/bugzilla/show_bug.cgi?id=855. It looks like wrong struct is used in buffer length check, and the following patch should fix the problem. xfs_dqblk_t has a size of 104+32 bytes, while xfs_disk_dquot_t is 104 bytes long, and this is exactly what I see in system logs - "XFS: dquot too small (104) in xlog_recover_do_dquot_trans." Signed-off-by: Jan Rekorajski <baggins@sith.mimuw.edu.pl> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com> Cc: Simon Kirby <sim@hostway.ca> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-01-06udf: Try harder when looking for VAT inodeJan Kara
commit e971b0b9e0dd50d9ceecb67a6a6ab80a80906033 upstream. Some disks do not contain VAT inode in the last recorded block as required by the standard but a few blocks earlier (or the number of recorded blocks is wrong). So look for the VAT inode a bit before the end of the media. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-01-06cifs: NULL out tcon, pSesInfo, and srvTcp pointers when chasing DFS referralsJeff Layton
commit a2934c7b363ddcc001964f2444649f909e583bef upstream. The scenario is this: The kernel gets EREMOTE and starts chasing a DFS referral at mount time. The tcon reference is put, which puts the session reference too, but neither pointer is zeroed out. The mount gets retried (goto try_mount_again) with new mount info. Session setup fails fails and rc ends up being non-zero. The code then falls through to the end and tries to put the previously freed tcon pointer again. Oops at: cifs_put_smb_ses+0x14/0xd0 Fix this by moving the initialization of the rc variable and the tcon, pSesInfo and srvTcp pointers below the try_mount_again label. Also, add a FreeXid() before the goto to prevent xid "leaks". Signed-off-by: Jeff Layton <jlayton@redhat.com> Reported-by: Gustavo Carvalho Homem <gustavo@angulosolido.pt> Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-18jffs2: Fix long-standing bug with symlink garbage collection.David Woodhouse
commit 2e16cfca6e17ae37ae21feca080a6f2eca9087dc upstream. Ever since jffs2_garbage_collect_metadata() was first half-written in February 2001, it's been broken on architectures where 'char' is signed. When garbage collecting a symlink with target length above 127, the payload length would end up negative, causing interesting and bad things to happen. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-18ext3: Fix data / filesystem corruption when write fails to copy dataJan Kara
commit 68eb3db08344286733adac48304d9fb7a0e53b27 upstream. When ext3_write_begin fails after allocating some blocks or generic_perform_write fails to copy data to write, we truncate blocks already instantiated beyond i_size. Although these blocks were never inside i_size, we have to truncate pagecache of these blocks so that corresponding buffers get unmapped. Otherwise subsequent __block_prepare_write (called because we are retrying the write) will find the buffers mapped, not call ->get_block, and thus the page will be backed by already freed blocks leading to filesystem and data corruption. Reported-by: James Y Knight <foom@fuhm.net> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-18jbd2: don't wipe the journal on a failed journal checksumTheodore Ts'o
commit e6a47428de84e19fda52f21ab73fde2906c40d09 upstream. If there is a failed journal checksum, don't reset the journal. This allows for userspace programs to decide how to recover from this situation. It may be that ignoring the journal checksum failure might be a better way of recovering the file system. Once we add per-block checksums, we can definitely do better. Until then, a system administrator can try backing up the file system image (or taking a snapshot) and and trying to determine experimentally whether ignoring the checksum failure or aborting the journal replay results in less data loss. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-18hfs: fix a potential buffer overflowAmerigo Wang
commit ec81aecb29668ad71f699f4e7b96ec46691895b6 upstream. A specially-crafted Hierarchical File System (HFS) filesystem could cause a buffer overflow to occur in a process's kernel stack during a memcpy() call within the hfs_bnode_read() function (at fs/hfs/bnode.c:24). The attacker can provide the source buffer and length, and the destination buffer is a local variable of a fixed length. This local variable (passed as "&entry" from fs/hfs/dir.c:112 and allocated on line 60) is stored in the stack frame of hfs_bnode_read()'s caller, which is hfs_readdir(). Because the hfs_readdir() function executes upon any attempt to read a directory on the filesystem, it gets called whenever a user attempts to inspect any filesystem contents. [amwang@redhat.com: modify this patch and fix coding style problems] Signed-off-by: WANG Cong <amwang@redhat.com> Cc: Eugene Teo <eteo@redhat.com> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Dave Anderson <anderson@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-18devpts_get_tty() should validate inodeSukadev Bhattiprolu
commit edfacdd6f81119b9005615593f2cbd94b8c7e2d8 upstream. devpts_get_tty() assumes that the inode passed in is associated with a valid pty. But if the only reference to the pty is via a bind-mount, the inode passed to devpts_get_tty() while valid, would refer to a pty that no longer exists. With a lot of debug effort, Grzegorz Nosek developed a small program (see below) to reproduce a crash on recent kernels. This crash is a regression introduced by the commit: commit 527b3e4773628b30d03323a2cb5fb0d84441990f Author: Sukadev Bhattiprolu <sukadev@us.ibm.com> Date: Mon Oct 13 10:43:08 2008 +0100 To fix, ensure that the dentry associated with the inode has not yet been deleted/unhashed by devpts_pty_kill(). See also: https://lists.linux-foundation.org/pipermail/containers/2009-July/019273.html tty-bug.c: #define _GNU_SOURCE #include <fcntl.h> #include <sched.h> #include <stdlib.h> #include <sys/mount.h> #include <sys/signal.h> #include <unistd.h> #include <stdio.h> #include <linux/fs.h> void dummy(int sig) { } static int child(void *unused) { int fd; signal(SIGINT, dummy); signal(SIGHUP, dummy); pause(); /* cheesy synchronisation to wait for /dev/pts/0 to appear */ mount("/dev/pts/0", "/dev/console", NULL, MS_BIND, NULL); sleep(2); fd = open("/dev/console", O_RDWR); dup(0); dup(0); write(1, "Hello world!\n", sizeof("Hello world!\n")-1); return 0; } int main(void) { pid_t pid; char *stack; stack = malloc(16384); pid = clone(child, stack+16384, CLONE_NEWNS|SIGCHLD, NULL); open("/dev/ptmx", O_RDWR|O_NOCTTY|O_NONBLOCK); unlockpt(fd); grantpt(fd); sleep(2); kill(pid, SIGHUP); sleep(1); return 0; /* exit before child opens /dev/console */ } Reported-by: Grzegorz Nosek <root@localdomain.pl> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Tested-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-18debugfs: fix create mutex racy fops and private dataMathieu Desnoyers
commit d3a3b0adad0865c12e39b712ca89efbd0a3a0dbc upstream. Setting fops and private data outside of the mutex at debugfs file creation introduces a race where the files can be opened with the wrong file operations and private data. It is easy to trigger with a process waiting on file creation notification. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-14ext4: Fix potential fiemap deadlock (mmap_sem vs. i_data_sem)Theodore Ts'o
(cherry picked from commit fab3a549e204172236779f502eccb4f9bf0dc87d) Fix the following potential circular locking dependency between mm->mmap_sem and ei->i_data_sem: ======================================================= [ INFO: possible circular locking dependency detected ] 2.6.32-04115-gec044c5 #37 ------------------------------------------------------- ureadahead/1855 is trying to acquire lock: (&mm->mmap_sem){++++++}, at: [<ffffffff81107224>] might_fault+0x5c/0xac but task is already holding lock: (&ei->i_data_sem){++++..}, at: [<ffffffff811be1fd>] ext4_fiemap+0x11b/0x159 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&ei->i_data_sem){++++..}: [<ffffffff81099bfa>] __lock_acquire+0xb67/0xd0f [<ffffffff81099e7e>] lock_acquire+0xdc/0x102 [<ffffffff81516633>] down_read+0x51/0x84 [<ffffffff811a2414>] ext4_get_blocks+0x50/0x2a5 [<ffffffff811a3453>] ext4_get_block+0xab/0xef [<ffffffff81154f39>] do_mpage_readpage+0x198/0x48d [<ffffffff81155360>] mpage_readpages+0xd0/0x114 [<ffffffff811a104b>] ext4_readpages+0x1d/0x1f [<ffffffff810f8644>] __do_page_cache_readahead+0x12f/0x1bc [<ffffffff810f86f2>] ra_submit+0x21/0x25 [<ffffffff810f0cfd>] filemap_fault+0x19f/0x32c [<ffffffff81107b97>] __do_fault+0x55/0x3a2 [<ffffffff81109db0>] handle_mm_fault+0x327/0x734 [<ffffffff8151aaa9>] do_page_fault+0x292/0x2aa [<ffffffff81518205>] page_fault+0x25/0x30 [<ffffffff812a34d8>] clear_user+0x38/0x3c [<ffffffff81167e16>] padzero+0x20/0x31 [<ffffffff81168b47>] load_elf_binary+0x8bc/0x17ed [<ffffffff81130e95>] search_binary_handler+0xc2/0x259 [<ffffffff81166d64>] load_script+0x1b8/0x1cc [<ffffffff81130e95>] search_binary_handler+0xc2/0x259 [<ffffffff8113255f>] do_execve+0x1ce/0x2cf [<ffffffff81027494>] sys_execve+0x43/0x5a [<ffffffff8102918a>] stub_execve+0x6a/0xc0 -> #0 (&mm->mmap_sem){++++++}: [<ffffffff81099aa4>] __lock_acquire+0xa11/0xd0f [<ffffffff81099e7e>] lock_acquire+0xdc/0x102 [<ffffffff81107251>] might_fault+0x89/0xac [<ffffffff81139382>] fiemap_fill_next_extent+0x95/0xda [<ffffffff811bcb43>] ext4_ext_fiemap_cb+0x138/0x157 [<ffffffff811be069>] ext4_ext_walk_space+0x178/0x1f1 [<ffffffff811be21e>] ext4_fiemap+0x13c/0x159 [<ffffffff811390e6>] do_vfs_ioctl+0x348/0x4d6 [<ffffffff811392ca>] sys_ioctl+0x56/0x79 [<ffffffff81028cb2>] system_call_fastpath+0x16/0x1b other info that might help us debug this: 1 lock held by ureadahead/1855: #0: (&ei->i_data_sem){++++..}, at: [<ffffffff811be1fd>] ext4_fiemap+0x11b/0x159 stack backtrace: Pid: 1855, comm: ureadahead Not tainted 2.6.32-04115-gec044c5 #37 Call Trace: [<ffffffff81098c70>] print_circular_bug+0xa8/0xb7 [<ffffffff81099aa4>] __lock_acquire+0xa11/0xd0f [<ffffffff8102f229>] ? sched_clock+0x9/0xd [<ffffffff81099e7e>] lock_acquire+0xdc/0x102 [<ffffffff81107224>] ? might_fault+0x5c/0xac [<ffffffff81107251>] might_fault+0x89/0xac [<ffffffff81107224>] ? might_fault+0x5c/0xac [<ffffffff81124b44>] ? __kmalloc+0x13b/0x18c [<ffffffff81139382>] fiemap_fill_next_extent+0x95/0xda [<ffffffff811bcb43>] ext4_ext_fiemap_cb+0x138/0x157 [<ffffffff811bca0b>] ? ext4_ext_fiemap_cb+0x0/0x157 [<ffffffff811be069>] ext4_ext_walk_space+0x178/0x1f1 [<ffffffff811be21e>] ext4_fiemap+0x13c/0x159 [<ffffffff81107224>] ? might_fault+0x5c/0xac [<ffffffff811390e6>] do_vfs_ioctl+0x348/0x4d6 [<ffffffff8129f6d0>] ? __up_read+0x8d/0x95 [<ffffffff81517fb5>] ? retint_swapgs+0x13/0x1b [<ffffffff811392ca>] sys_ioctl+0x56/0x79 [<ffffffff81028cb2>] system_call_fastpath+0x16/0x1b Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-14ext4: Fix insufficient checks in EXT4_IOC_MOVE_EXTAkira Fujita
(cherry picked from commit 4a58579b9e4e2a35d57e6c9c8483e52f6f1b7fd6) This patch fixes three problems in the handling of the EXT4_IOC_MOVE_EXT ioctl: 1. In current EXT4_IOC_MOVE_EXT, there are read access mode checks for original and donor files, but they allow the illegal write access to donor file, since donor file is overwritten by original file data. To fix this problem, change access mode checks of original (r->r/w) and donor (r->w) files. 2. Disallow the use of donor files that have a setuid or setgid bits. 3. Call mnt_want_write() and mnt_drop_write() before and after ext4_move_extents() calling to get write access to a mount. Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-14ext4: Wait for proper transaction commit on fsyncJan Kara
(cherry picked from commit b436b9bef84de6893e86346d8fbf7104bc520645) We cannot rely on buffer dirty bits during fsync because pdflush can come before fsync is called and clear dirty bits without forcing a transaction commit. What we do is that we track which transaction has last changed the inode and which transaction last changed allocation and force it to disk on fsync. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-14ext4: fix incorrect block reservation on quota transfer.Dmitry Monakhov
(cherry picked from commit 194074acacebc169ded90a4657193f5180015051) Inside ->setattr() call both ATTR_UID and ATTR_GID may be valid This means that we may end-up with transferring all quotas. Add we have to reserve QUOTA_DEL_BLOCKS for all quotas, as we do in case of QUOTA_INIT_BLOCKS. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Reviewed-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-14ext4: quota macros cleanupDmitry Monakhov
(cherry picked from commit 5aca07eb7d8f14d90c740834d15ca15277f4820c) Currently all quota block reservation macros contains hard-coded "2" aka MAXQUOTAS value. This is no good because in some places it is not obvious to understand what does this digit represent. Let's introduce new macro with self descriptive name. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Acked-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-14ext4: ext4_get_reserved_space() must return bytes instead of blocksDmitry Monakhov
(cherry picked from commit 8aa6790f876e81f5a2211fe1711a5fe3fe2d7b20) Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Acked-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-14ext4: remove blocks from inode prealloc list on failureCurt Wohlgemuth
(cherry picked from commit b844167edc7fcafda9623955c05e4c1b3c32ebc7) This fixes a leak of blocks in an inode prealloc list if device failures cause ext4_mb_mark_diskspace_used() to fail. Signed-off-by: Curt Wohlgemuth <curtw@google.com> Acked-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-14ext4: wait for log to commit when umountingJosef Bacik
(cherry picked from commit d4edac314e9ad0b21ba20ba8bc61b61f186f79e1) There is a potential race when a transaction is committing right when the file system is being umounting. This could reduce in a race because EXT4_SB(sb)->s_group_info could be freed in ext4_put_super before the commit code calls a callback so the mballoc code can release freed blocks in the transaction, resulting in a panic trying to access the freed s_group_info. The fix is to wait for the transaction to finish committing before we shutdown the multiblock allocator. Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-14ext4: Avoid data / filesystem corruption when write fails to copy dataJan Kara
(cherry picked from commit b9a4207d5e911b938f73079a83cc2ae10524ec7f) When ext4_write_begin fails after allocating some blocks or generic_perform_write fails to copy data to write, we truncate blocks already instantiated beyond i_size. Although these blocks were never inside i_size, we have to truncate the pagecache of these blocks so that corresponding buffers get unmapped. Otherwise subsequent __block_prepare_write (called because we are retrying the write) will find the buffers mapped, not call ->get_block, and thus the page will be backed by already freed blocks leading to filesystem and data corruption. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-14ext4: Return the PTR_ERR of the correct pointer in setup_new_group_blocks()Roel Kluin
(cherry picked from commit c09eef305dd43846360944ad072f051f964fa383) Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-14jbd2: Add ENOMEM checking in and for jbd2_journal_write_metadata_buffer()Theodore Ts'o
(cherry picked from commit e6ec116b67f46e0e7808276476554727b2e6240b) OOM happens. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-14ext4: move_extent_per_page() cleanupAkira Fujita
(cherry picked from commit ac48b0a1d068887141581bea8285de5fcab182b0) Integrate duplicate lines (acquire/release semaphore and invalidate extent cache in move_extent_per_page()) into mext_replace_branches(), to reduce source and object code size. Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-14ext4: initialize moved_len before calling ext4_move_extents()Kazuya Mio
(cherry picked from commit 446aaa6e7e993b38a6f21c6acfa68f3f1af3dbe3) The move_extent.moved_len is used to pass back the number of exchanged blocks count to user space. Currently the caller must clear this field; but we spend more code space checking for this requirement than simply zeroing the field ourselves, so let's just make life easier for everyone all around. Signed-off-by: Kazuya Mio <k-mio@sx.jp.nec.com> Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-14ext4: Fix double-free of blocks with EXT4_IOC_MOVE_EXTAkira Fujita
(cherry picked from commit 94d7c16cbbbd0e03841fcf272bcaf0620ad39618) At the beginning of ext4_move_extent(), we call ext4_discard_preallocations() to discard inode PAs of orig and donor inodes. But in the following case, blocks can be double freed, so move ext4_discard_preallocations() to the end of ext4_move_extents(). 1. Discard inode PAs of orig and donor inodes with ext4_discard_preallocations() in ext4_move_extents(). orig : [ DATA1 ] donor: [ DATA2 ] 2. While data blocks are exchanging between orig and donor inodes, new inode PAs is created to orig by other process's block allocation. (Since there are semaphore gaps in ext4_move_extents().) And new inode PAs is used partially (2-1). 2-1 Create new inode PAs to orig inode orig : [ DATA1 | used PA1 | free PA1 ] donor: [ DATA2 ] 3. Donor inode which has old orig inode's blocks is deleted after EXT4_IOC_MOVE_EXT finished (3-1, 3-2). So the block bitmap corresponds to old orig inode's blocks are freed. 3-1 After EXT4_IOC_MOVE_EXT finished orig : [ DATA2 | free PA1 ] donor: [ DATA1 | used PA1 ] 3-2 Delete donor inode orig : [ DATA2 | free PA1 ] donor: [ FREE SPACE(DATA1) | FREE SPACE(used PA1) ] 4. The double-free of blocks is occurred, when close() is called to orig inode. Because ext4_discard_preallocations() for orig inode frees used PA1 and free PA1, though used PA1 is already freed in 3. 4-1 Double-free of blocks is occurred orig : [ DATA2 | FREE SPACE(free PA1) ] donor: [ FREE SPACE(DATA1) | DOUBLE FREE(used PA1) ] Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-14ext4: make "norecovery" an alias for "noload"Eric Sandeen
(cherry picked from commit e3bb52ae2bb9573e84c17b8e3560378d13a5c798) Users on the linux-ext4 list recently complained about differences across filesystems w.r.t. how to mount without a journal replay. In the discussion it was noted that xfs's "norecovery" option is perhaps more descriptively accurate than "noload," so let's make that an alias for ext4. Also show this status in /proc/mounts Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-14ext4: make trim/discard optional (and off by default)Eric Sandeen
(cherry picked from commit 5328e635315734d42080de9a5a1ee87bf4cae0a4) It is anticipated that when sb_issue_discard starts doing real work on trim-capable devices, we may see issues. Make this mount-time optional, and default it to off until we know that things are working out OK. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>