bcache: recover data from backing when data is clean

commit e393aa2446150536929140739f09c6ecbcbea7f0 upstream. When we send a read request and hit the clean data in cache device, there is a situation called cache read race in bcache(see the commit in the tail of cache_look_up(), the following explaination just copy from there): The bucket we're reading from might be reused while our bio is in flight, and we could then end up reading the wrong data. We guard against this by checking (in bch_cache_read_endio()) if the pointer is stale again; if so, we treat it as an error (s->iop.error = -EINTR) and reread from the backing device (but we don't pass that error up anywhere) It should be noted that cache read race happened under normal circumstances, not the circumstance when SSD failed, it was counted and shown in /sys/fs/bcache/XXX/internal/cache_read_races. Without this patch, when we use writeback mode, we will never reread from the backing device when cache read race happened, until the whole cache device is clean, because the condition (s->recoverable && (dc && !atomic_read(&dc->has_dirty))) is false in cached_dev_read_error(). In this situation, the s->iop.error(= -EINTR) will be passed up, at last, user will receive -EINTR when it's bio end, this is not suitable, and wield to up-application. In this patch, we use s->read_dirty_data to judge whether the read request hit dirty data in cache device, it is safe to reread data from the backing device when the read request hit clean data. This can not only handle cache read race, but also recover data when failed read request from cache device. [edited by mlyle to fix up whitespace, commit log title, comment spelling] Fixes: d59b23795933 ("bcache: only permit to recovery read error when cache device is clean") Signed-off-by: Hua Rui <huarui.dev@gmail.com> Reviewed-by: Michael Lyle <mlyle@lyle.org> Reviewed-by: Coly Li <colyli@suse.de> Signed-off-by: Michael Lyle <mlyle@lyle.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
author: Rui Hua <huarui.dev@gmail.com> 2017-11-24 15:14:26 -0800
committer: Greg Kroah-Hartman <gregkh@linuxfoundation.org> 2017-12-09 18:42:37 +0100
commit: 3f7477e64478c7d22642a97c25eb43649d2a480c (patch)
tree: d07d097d5a90b2a923b77dc7069da106637e38cc
parent: f80f34d8ba92b29f84228f91f9b6c0e0fca5c641 (diff)
1 files changed, 6 insertions, 7 deletions
diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
index 3bd2e4f55f2c..525ce56524ba 100644
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -707,16 +707,15 @@ static void cached_dev_read_error(struct closure *cl)
 {
 	struct search *s = container_of(cl, struct search, cl);
 	struct bio *bio = &s->bio.bio;
-	struct cached_dev *dc = container_of(s->d, struct cached_dev, disk);
 
 	/*
-	 * If cache device is dirty (dc->has_dirty is non-zero), then
-	 * recovery a failed read request from cached device may get a
-	 * stale data back. So read failure recovery is only permitted
-	 * when cache device is clean.
+	 * If read request hit dirty data (s->read_dirty_data is true),
+	 * then recovery a failed read request from cached device may
+	 * get a stale data back. So read failure recovery is only
+	 * permitted when read request hit clean data in cache device,
+	 * or when cache read race happened.
 	 */
-	if (s->recoverable &&
-	    (dc && !atomic_read(&dc->has_dirty))) {
+	if (s->recoverable && !s->read_dirty_data) {
 		/* Retry from the backing device: */
 		trace_bcache_read_retry(s->orig_bio);
author	Rui Hua <huarui.dev@gmail.com>	2017-11-24 15:14:26 -0800
committer	Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-12-09 18:42:37 +0100
commit	3f7477e64478c7d22642a97c25eb43649d2a480c (patch)
tree	d07d097d5a90b2a923b77dc7069da106637e38cc
parent	f80f34d8ba92b29f84228f91f9b6c0e0fca5c641 (diff)