summaryrefslogtreecommitdiff
path: root/fs
diff options
context:
space:
mode:
authorLiu Bo <bo.li.liu@oracle.com>2016-08-03 12:33:01 -0700
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>2017-01-06 11:16:10 +0100
commit25ee81e5462015d2ba3792d2ae0ea902a98ae8a4 (patch)
tree5007c319d0e103a114edc7472930e43c9c69b324 /fs
parent590202271ffd18b63fefeaa902f1ee36b853cf91 (diff)
Btrfs: fix memory leak in reading btree blocks
commit 2571e739677f1e4c0c63f5ed49adcc0857923625 upstream. So we can read a btree block via readahead or intentional read, and we can end up with a memory leak when something happens as follows, 1) readahead starts to read block A but does not wait for read completion, 2) btree_readpage_end_io_hook finds that block A is corrupted, and it needs to clear all block A's pages' uptodate bit. 3) meanwhile an intentional read kicks in and checks block A's pages' uptodate to decide which page needs to be read. 4) when some pages have the uptodate bit during 3)'s check so 3) doesn't count them for eb->io_pages, but they are later cleared by 2) so we has to readpage on the page, we get the wrong eb->io_pages which results in a memory leak of this block. This fixes the problem by firstly getting all pages's locking and then checking pages' uptodate bit. t1(readahead) t2(readahead endio) t3(the following read) read_extent_buffer_pages end_bio_extent_readpage for pg in eb: for page 0,1,2 in eb: if pg is uptodate: btree_readpage_end_io_hook(pg) num_reads++ if uptodate: eb->io_pages = num_reads SetPageUptodate(pg) _______________ for pg in eb: for page 3 in eb: read_extent_buffer_pages if pg is NOT uptodate: btree_readpage_end_io_hook(pg) for pg in eb: __extent_read_full_page(pg) sanity check reports something wrong if pg is uptodate: clear_extent_buffer_uptodate(eb) num_reads++ for pg in eb: eb->io_pages = num_reads ClearPageUptodate(page) _______________ for pg in eb: if pg is NOT uptodate: __extent_read_full_page(pg) So t3's eb->io_pages is not consistent with the number of pages it's reading, and during endio(), atomic_dec_and_test(&eb->io_pages) will get a negative number so that we're not able to free the eb. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Diffstat (limited to 'fs')
-rw-r--r--fs/btrfs/extent_io.c9
1 files changed, 9 insertions, 0 deletions
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 257bbdcb5df6..e767f347f2b1 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -5294,11 +5294,20 @@ int read_extent_buffer_pages(struct extent_io_tree *tree,
lock_page(page);
}
locked_pages++;
+ }
+ /*
+ * We need to firstly lock all pages to make sure that
+ * the uptodate bit of our pages won't be affected by
+ * clear_extent_buffer_uptodate().
+ */
+ for (i = start_i; i < num_pages; i++) {
+ page = eb->pages[i];
if (!PageUptodate(page)) {
num_reads++;
all_uptodate = 0;
}
}
+
if (all_uptodate) {
if (start_i == 0)
set_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags);