Skip to content

Instantly share code, notes, and snippets.

@fdmanana
Last active December 28, 2021 11:46
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save fdmanana/08d0e28191f4c491c0c09a7811d28b60 to your computer and use it in GitHub Desktop.
Save fdmanana/08d0e28191f4c491c0c09a7811d28b60 to your computer and use it in GitHub Desktop.
From 05eee78f0be8040d972c240c4a48c77464fb2230 Mon Sep 17 00:00:00 2001
Message-Id: <05eee78f0be8040d972c240c4a48c77464fb2230.1640691941.git.fdmanana@suse.com>
From: Filipe Manana <fdmanana@suse.com>
Date: Fri, 17 Dec 2021 11:53:42 +0000
Subject: [PATCH] btrfs: fix reserved space leak on log tree nodes after
transaction abort
After the recent changes by commit c2e39305299f01 ("btrfs: clear extent
buffer uptodate when we fail to write it") and its followup fix that has
the subject "btrfs: check WRITE_ERR when trying to read an extent buffer"
(not yet in Linus' tree), after a transaction abort we can often end up
not unreserving the space that was reserved for log tree extent buffers.
This happens because if writeback for a log tree extent buffer failed,
than we have cleared the EXTENT_BUFFER_UPTODATE from the extent buffer
and we have also set the bit EXTENT_BUFFER_WRITE_ERR on it. Later on,
when trying to free the log tree with free_log_tree(), which iterates
over the tree, we can end up getting an -EIO error when trying to read
a node or leaf, since read_extent_buffer_pages() returns -EIO if an
extent buffer does not have EXTENT_BUFFER_UPTODATE set and has the
EXTENT_BUFFER_WRITE_ERR bit set. Getting that -EIO means we return
immediately as we can not iterate over the entire tree.
In that case we never update the reserved space for every extent buffer
in the respective block group and space_info object. When this happens
we get the following traces when unmmounting the filesystem:
[174957.284509] BTRFS: error (device dm-0) in cleanup_transaction:1913: errno=-5 IO failure
[174957.286497] BTRFS: error (device dm-0) in free_log_tree:3420: errno=-5 IO failure
[174957.399379] ------------[ cut here ]------------
[174957.402497] WARNING: CPU: 2 PID: 3206883 at fs/btrfs/block-group.c:127 btrfs_put_block_group+0x77/0xb0 [btrfs]
[174957.407523] Modules linked in: btrfs overlay dm_zero (...)
[174957.424917] CPU: 2 PID: 3206883 Comm: umount Tainted: G W 5.16.0-rc5-btrfs-next-109 #1
[174957.426689] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[174957.428716] RIP: 0010:btrfs_put_block_group+0x77/0xb0 [btrfs]
[174957.429717] Code: 21 48 8b bd (...)
[174957.432867] RSP: 0018:ffffb70d41cffdd0 EFLAGS: 00010206
[174957.433632] RAX: 0000000000000001 RBX: ffff8b09c3848000 RCX: ffff8b0758edd1c8
[174957.434689] RDX: 0000000000000001 RSI: ffffffffc0b467e7 RDI: ffff8b0758edd000
[174957.436068] RBP: ffff8b0758edd000 R08: 0000000000000000 R09: 0000000000000000
[174957.437114] R10: 0000000000000246 R11: 0000000000000000 R12: ffff8b09c3848148
[174957.438140] R13: ffff8b09c3848198 R14: ffff8b0758edd188 R15: dead000000000100
[174957.439317] FS: 00007f328fb82800(0000) GS:ffff8b0a2d200000(0000) knlGS:0000000000000000
[174957.440402] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[174957.441164] CR2: 00007fff13563e98 CR3: 0000000404f4e005 CR4: 0000000000370ee0
[174957.442117] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[174957.443076] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[174957.443948] Call Trace:
[174957.444264] <TASK>
[174957.444538] btrfs_free_block_groups+0x255/0x3c0 [btrfs]
[174957.445238] close_ctree+0x301/0x357 [btrfs]
[174957.445803] ? call_rcu+0x16c/0x290
[174957.446250] generic_shutdown_super+0x74/0x120
[174957.446832] kill_anon_super+0x14/0x30
[174957.447305] btrfs_kill_super+0x12/0x20 [btrfs]
[174957.447890] deactivate_locked_super+0x31/0xa0
[174957.448440] cleanup_mnt+0x147/0x1c0
[174957.448888] task_work_run+0x5c/0xa0
[174957.449336] exit_to_user_mode_prepare+0x1e5/0x1f0
[174957.449934] syscall_exit_to_user_mode+0x16/0x40
[174957.450512] do_syscall_64+0x48/0xc0
[174957.450980] entry_SYSCALL_64_after_hwframe+0x44/0xae
[174957.451605] RIP: 0033:0x7f328fdc4a97
[174957.452059] Code: 03 0c 00 f7 (...)
[174957.454320] RSP: 002b:00007fff13564ec8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
[174957.455262] RAX: 0000000000000000 RBX: 00007f328feea264 RCX: 00007f328fdc4a97
[174957.456131] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000560b8ae51dd0
[174957.457118] RBP: 0000560b8ae51ba0 R08: 0000000000000000 R09: 00007fff13563c40
[174957.458005] R10: 00007f328fe49fc0 R11: 0000000000000246 R12: 0000000000000000
[174957.459113] R13: 0000560b8ae51dd0 R14: 0000560b8ae51cb0 R15: 0000000000000000
[174957.460193] </TASK>
[174957.460534] irq event stamp: 0
[174957.461003] hardirqs last enabled at (0): [<0000000000000000>] 0x0
[174957.461947] hardirqs last disabled at (0): [<ffffffffb0e94214>] copy_process+0x934/0x2040
[174957.463147] softirqs last enabled at (0): [<ffffffffb0e94214>] copy_process+0x934/0x2040
[174957.465116] softirqs last disabled at (0): [<0000000000000000>] 0x0
[174957.466323] ---[ end trace bc7ee0c490bce3af ]---
[174957.467282] ------------[ cut here ]------------
[174957.468184] WARNING: CPU: 2 PID: 3206883 at fs/btrfs/block-group.c:3976 btrfs_free_block_groups+0x330/0x3c0 [btrfs]
[174957.470066] Modules linked in: btrfs overlay dm_zero (...)
[174957.483137] CPU: 2 PID: 3206883 Comm: umount Tainted: G W 5.16.0-rc5-btrfs-next-109 #1
[174957.484691] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[174957.486853] RIP: 0010:btrfs_free_block_groups+0x330/0x3c0 [btrfs]
[174957.488050] Code: 00 00 00 ad de (...)
[174957.491479] RSP: 0018:ffffb70d41cffde0 EFLAGS: 00010206
[174957.492520] RAX: ffff8b08d79310b0 RBX: ffff8b09c3848000 RCX: 0000000000000000
[174957.493868] RDX: 0000000000000001 RSI: fffff443055ee600 RDI: ffffffffb1131846
[174957.495183] RBP: ffff8b08d79310b0 R08: 0000000000000000 R09: 0000000000000000
[174957.496580] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8b08d7931000
[174957.498027] R13: ffff8b09c38492b0 R14: dead000000000122 R15: dead000000000100
[174957.499438] FS: 00007f328fb82800(0000) GS:ffff8b0a2d200000(0000) knlGS:0000000000000000
[174957.500990] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[174957.502117] CR2: 00007fff13563e98 CR3: 0000000404f4e005 CR4: 0000000000370ee0
[174957.503513] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[174957.504864] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[174957.506167] Call Trace:
[174957.506654] <TASK>
[174957.507047] close_ctree+0x301/0x357 [btrfs]
[174957.507867] ? call_rcu+0x16c/0x290
[174957.508567] generic_shutdown_super+0x74/0x120
[174957.509447] kill_anon_super+0x14/0x30
[174957.510194] btrfs_kill_super+0x12/0x20 [btrfs]
[174957.511123] deactivate_locked_super+0x31/0xa0
[174957.511976] cleanup_mnt+0x147/0x1c0
[174957.512610] task_work_run+0x5c/0xa0
[174957.513309] exit_to_user_mode_prepare+0x1e5/0x1f0
[174957.514231] syscall_exit_to_user_mode+0x16/0x40
[174957.515069] do_syscall_64+0x48/0xc0
[174957.515718] entry_SYSCALL_64_after_hwframe+0x44/0xae
[174957.516688] RIP: 0033:0x7f328fdc4a97
[174957.517413] Code: 03 0c 00 f7 d8 (...)
[174957.521052] RSP: 002b:00007fff13564ec8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
[174957.522514] RAX: 0000000000000000 RBX: 00007f328feea264 RCX: 00007f328fdc4a97
[174957.523950] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000560b8ae51dd0
[174957.525375] RBP: 0000560b8ae51ba0 R08: 0000000000000000 R09: 00007fff13563c40
[174957.526763] R10: 00007f328fe49fc0 R11: 0000000000000246 R12: 0000000000000000
[174957.528058] R13: 0000560b8ae51dd0 R14: 0000560b8ae51cb0 R15: 0000000000000000
[174957.529404] </TASK>
[174957.529843] irq event stamp: 0
[174957.530256] hardirqs last enabled at (0): [<0000000000000000>] 0x0
[174957.531061] hardirqs last disabled at (0): [<ffffffffb0e94214>] copy_process+0x934/0x2040
[174957.532075] softirqs last enabled at (0): [<ffffffffb0e94214>] copy_process+0x934/0x2040
[174957.533083] softirqs last disabled at (0): [<0000000000000000>] 0x0
[174957.533865] ---[ end trace bc7ee0c490bce3b0 ]---
[174957.534452] BTRFS info (device dm-0): space_info 4 has 1070841856 free, is not full
[174957.535404] BTRFS info (device dm-0): space_info total=1073741824, used=2785280, pinned=0, reserved=49152, may_use=0, readonly=65536 zone_unusable=0
[174957.537029] BTRFS info (device dm-0): global_block_rsv: size 0 reserved 0
[174957.537859] BTRFS info (device dm-0): trans_block_rsv: size 0 reserved 0
[174957.538697] BTRFS info (device dm-0): chunk_block_rsv: size 0 reserved 0
[174957.539552] BTRFS info (device dm-0): delayed_block_rsv: size 0 reserved 0
[174957.540403] BTRFS info (device dm-0): delayed_refs_rsv: size 0 reserved 0
This is often triggered with test cases generic/475 and generic/648 from
fstests, which makes the tests fail.
So instead of trying to iterate a log tree when freeing it after a
transaction abort, iterate over the io tree that tracks the ranges of the
log tree's extent buffers, and unaccount each extent buffer's range as
well as clear any extent buffers that are still dirty. Because when we
trigger writeback of log tree extent buffers we clear their range from
the ->dirty_log_pages io tree, we need to make sure the range is kept
but with a different bit, so that we can find it if we ever run into a
transaction abort - so when triggering the writeback, instead of clearing
the range, convert it to the EXTENT_UPTODATE bit.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
fs/btrfs/ctree.c | 44 +++++++++---
fs/btrfs/transaction.c | 23 ++++--
fs/btrfs/tree-log.c | 154 +++++++++++++++++++++++------------------
fs/btrfs/tree-log.h | 7 ++
4 files changed, 148 insertions(+), 80 deletions(-)
diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index defc494a2d08..fef24334c5be 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -16,6 +16,7 @@
#include "volumes.h"
#include "qgroup.h"
#include "tree-mod-log.h"
+#include "tree-log.h"
static int split_node(struct btrfs_trans_handle *trans, struct btrfs_root
*root, struct btrfs_path *path, int level);
@@ -112,6 +113,34 @@ noinline void btrfs_release_path(struct btrfs_path *p)
}
}
+static void delete_tree_block(struct btrfs_trans_handle *trans,
+ struct btrfs_root *root,
+ struct extent_buffer *eb,
+ u64 parent_start,
+ bool is_last_ref)
+{
+ const u64 root_id = btrfs_root_id(root);
+
+ /*
+ * If we are deleting an unwritten block from a log tree, then delete its
+ * range from the io tree that tracks the blocks. This is only to ensure
+ * that if a transaction abort happens, we are able to do proper cleanup
+ * of space reservations, because we may not be able to iterate over the
+ * log tree in case he had a writeback failure for a log tree node - so
+ * we rely on io tree to figure out the range of each log tree block. We
+ * ignore any error from clear_extent_bits() because it's not common and
+ * it's not critical either.
+ */
+ if (root_id == BTRFS_TREE_LOG_OBJECTID &&
+ btrfs_header_generation(eb) == trans->transid &&
+ !btrfs_header_flag(eb, BTRFS_HEADER_FLAG_WRITTEN) &&
+ !btrfs_is_zoned(trans->fs_info))
+ clear_extent_bits(&root->dirty_log_pages, eb->start, eb->len,
+ BTRFS_LOG_PAGES_BITS);
+
+ btrfs_free_tree_block(trans, root_id, eb, parent_start, is_last_ref);
+}
+
/*
* safely gets a reference on the root node of a tree. A lock
* is not taken, so a concurrent writer may put a different node
@@ -463,8 +492,7 @@ static noinline int __btrfs_cow_block(struct btrfs_trans_handle *trans,
BUG_ON(ret < 0);
rcu_assign_pointer(root->node, cow);
- btrfs_free_tree_block(trans, btrfs_root_id(root), buf,
- parent_start, last_ref);
+ delete_tree_block(trans, root, buf, parent_start, last_ref);
free_extent_buffer(buf);
add_root_to_dirty_list(root);
} else {
@@ -485,8 +513,7 @@ static noinline int __btrfs_cow_block(struct btrfs_trans_handle *trans,
return ret;
}
}
- btrfs_free_tree_block(trans, btrfs_root_id(root), buf,
- parent_start, last_ref);
+ delete_tree_block(trans, root, buf, parent_start, last_ref);
}
if (unlock_orig)
btrfs_tree_unlock(buf);
@@ -930,7 +957,7 @@ static noinline int balance_level(struct btrfs_trans_handle *trans,
free_extent_buffer(mid);
root_sub_used(root, mid->len);
- btrfs_free_tree_block(trans, btrfs_root_id(root), mid, 0, 1);
+ delete_tree_block(trans, root, mid, 0, true);
/* once for the root ptr */
free_extent_buffer_stale(mid);
return 0;
@@ -989,8 +1016,7 @@ static noinline int balance_level(struct btrfs_trans_handle *trans,
btrfs_tree_unlock(right);
del_ptr(root, path, level + 1, pslot + 1);
root_sub_used(root, right->len);
- btrfs_free_tree_block(trans, btrfs_root_id(root), right,
- 0, 1);
+ delete_tree_block(trans, root, right, 0, true);
free_extent_buffer_stale(right);
right = NULL;
} else {
@@ -1035,7 +1061,7 @@ static noinline int balance_level(struct btrfs_trans_handle *trans,
btrfs_tree_unlock(mid);
del_ptr(root, path, level + 1, pslot);
root_sub_used(root, mid->len);
- btrfs_free_tree_block(trans, btrfs_root_id(root), mid, 0, 1);
+ delete_tree_block(trans, root, mid, 0, true);
free_extent_buffer_stale(mid);
mid = NULL;
} else {
@@ -4157,7 +4183,7 @@ static noinline void btrfs_del_leaf(struct btrfs_trans_handle *trans,
root_sub_used(root, leaf->len);
atomic_inc(&leaf->refs);
- btrfs_free_tree_block(trans, btrfs_root_id(root), leaf, 0, 1);
+ delete_tree_block(trans, root, leaf, 0, true);
free_extent_buffer_stale(leaf);
}
/*
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 03de89b45f27..4b17553e5b64 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -1083,7 +1083,8 @@ int btrfs_write_marked_extents(struct btrfs_fs_info *fs_info,
* on all the pages and clear them from the dirty pages state tree
*/
static int __btrfs_wait_marked_extents(struct btrfs_fs_info *fs_info,
- struct extent_io_tree *dirty_pages)
+ struct extent_io_tree *dirty_pages,
+ bool is_log_tree)
{
int err = 0;
int werr = 0;
@@ -1101,8 +1102,22 @@ static int __btrfs_wait_marked_extents(struct btrfs_fs_info *fs_info,
* after committing the log because the tree can be accessed
* concurrently - we do it only at transaction commit time when
* it's safe to do it (through extent_io_tree_release()).
+ *
+ * For a log tree, we convert the range bit so that we know
+ * about the range of log tree extent buffers even after they
+ * were written, so that if a transaction abort happens we
+ * know about the logical bytenr of the extents and can free
+ * them up, releasing reserved space in their block groups and
+ * in the metadata space_info. Ignore any errors in this case,
+ * we have no way to handle them, if they happen they are
+ * harmless, they only result in some warning during ummount.
*/
- err = clear_extent_bit(dirty_pages, start, end,
+ if (is_log_tree)
+ convert_extent_bit(dirty_pages, start, end,
+ EXTENT_UPTODATE, EXTENT_NEED_WAIT,
+ &cached_state);
+ else
+ err = clear_extent_bit(dirty_pages, start, end,
EXTENT_NEED_WAIT, 0, 0, &cached_state);
if (err == -ENOMEM)
err = 0;
@@ -1126,7 +1141,7 @@ static int btrfs_wait_extents(struct btrfs_fs_info *fs_info,
bool errors = false;
int err;
- err = __btrfs_wait_marked_extents(fs_info, dirty_pages);
+ err = __btrfs_wait_marked_extents(fs_info, dirty_pages, false);
if (test_and_clear_bit(BTRFS_FS_BTREE_ERR, &fs_info->flags))
errors = true;
@@ -1144,7 +1159,7 @@ int btrfs_wait_tree_log_extents(struct btrfs_root *log_root, int mark)
ASSERT(log_root->root_key.objectid == BTRFS_TREE_LOG_OBJECTID);
- err = __btrfs_wait_marked_extents(fs_info, dirty_pages);
+ err = __btrfs_wait_marked_extents(fs_info, dirty_pages, true);
if ((mark & EXTENT_DIRTY) &&
test_and_clear_bit(BTRFS_FS_LOG1_ERR, &fs_info->flags))
errors = true;
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 57554155ffb4..e57a5ef1be6f 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -2792,24 +2792,17 @@ static noinline int walk_down_log_tree(struct btrfs_trans_handle *trans,
return ret;
}
- if (trans) {
- btrfs_tree_lock(next);
- btrfs_clean_tree_block(next);
- btrfs_wait_tree_block_writeback(next);
- btrfs_tree_unlock(next);
- ret = btrfs_pin_reserved_extent(trans,
- bytenr, blocksize);
- if (ret) {
- free_extent_buffer(next);
- return ret;
- }
- btrfs_redirty_list_add(
- trans->transaction, next);
- } else {
- if (test_and_clear_bit(EXTENT_BUFFER_DIRTY, &next->bflags))
- clear_extent_buffer_dirty(next);
- unaccount_log_buffer(fs_info, bytenr);
+ btrfs_tree_lock(next);
+ btrfs_clean_tree_block(next);
+ btrfs_wait_tree_block_writeback(next);
+ btrfs_tree_unlock(next);
+ ret = btrfs_pin_reserved_extent(trans, bytenr,
+ blocksize);
+ if (ret) {
+ free_extent_buffer(next);
+ return ret;
}
+ btrfs_redirty_list_add(trans->transaction, next);
}
free_extent_buffer(next);
continue;
@@ -2838,7 +2831,6 @@ static noinline int walk_up_log_tree(struct btrfs_trans_handle *trans,
struct btrfs_path *path, int *level,
struct walk_control *wc)
{
- struct btrfs_fs_info *fs_info = root->fs_info;
int i;
int slot;
int ret;
@@ -2862,25 +2854,16 @@ static noinline int walk_up_log_tree(struct btrfs_trans_handle *trans,
next = path->nodes[*level];
- if (trans) {
- btrfs_tree_lock(next);
- btrfs_clean_tree_block(next);
- btrfs_wait_tree_block_writeback(next);
- btrfs_tree_unlock(next);
- ret = btrfs_pin_reserved_extent(trans,
+ btrfs_tree_lock(next);
+ btrfs_clean_tree_block(next);
+ btrfs_wait_tree_block_writeback(next);
+ btrfs_tree_unlock(next);
+ ret = btrfs_pin_reserved_extent(trans,
path->nodes[*level]->start,
path->nodes[*level]->len);
- if (ret)
- return ret;
- btrfs_redirty_list_add(trans->transaction,
- next);
- } else {
- if (test_and_clear_bit(EXTENT_BUFFER_DIRTY, &next->bflags))
- clear_extent_buffer_dirty(next);
-
- unaccount_log_buffer(fs_info,
- path->nodes[*level]->start);
- }
+ if (ret)
+ return ret;
+ btrfs_redirty_list_add(trans->transaction, next);
}
free_extent_buffer(path->nodes[*level]);
path->nodes[*level] = NULL;
@@ -2898,13 +2881,14 @@ static noinline int walk_up_log_tree(struct btrfs_trans_handle *trans,
static int walk_log_tree(struct btrfs_trans_handle *trans,
struct btrfs_root *log, struct walk_control *wc)
{
- struct btrfs_fs_info *fs_info = log->fs_info;
int ret = 0;
int wret;
int level;
struct btrfs_path *path;
int orig_level;
+ ASSERT(trans != NULL);
+
path = btrfs_alloc_path();
if (!path)
return -ENOMEM;
@@ -2945,21 +2929,15 @@ static int walk_log_tree(struct btrfs_trans_handle *trans,
next = path->nodes[orig_level];
- if (trans) {
- btrfs_tree_lock(next);
- btrfs_clean_tree_block(next);
- btrfs_wait_tree_block_writeback(next);
- btrfs_tree_unlock(next);
- ret = btrfs_pin_reserved_extent(trans,
- next->start, next->len);
- if (ret)
- goto out;
- btrfs_redirty_list_add(trans->transaction, next);
- } else {
- if (test_and_clear_bit(EXTENT_BUFFER_DIRTY, &next->bflags))
- clear_extent_buffer_dirty(next);
- unaccount_log_buffer(fs_info, next->start);
- }
+ btrfs_tree_lock(next);
+ btrfs_clean_tree_block(next);
+ btrfs_wait_tree_block_writeback(next);
+ btrfs_tree_unlock(next);
+ ret = btrfs_pin_reserved_extent(trans,
+ next->start, next->len);
+ if (ret)
+ goto out;
+ btrfs_redirty_list_add(trans->transaction, next);
}
}
@@ -3382,35 +3360,77 @@ int btrfs_sync_log(struct btrfs_trans_handle *trans,
return ret;
}
+/*
+ * When cleaning up an aborted transaction we can't iterate the log tree because
+ * in case writeback failed for one of its extent buffers, we won't be able to
+ * read it, we'll get -EIO from btrfs_read_buffer(). So we instead iterate over
+ * the log tree's ->dirty_log_pages io tree.
+ */
+static void unaccount_all_log_buffers(struct btrfs_root *log)
+{
+ u64 start = 0;
+ u64 end;
+
+ while (!find_first_extent_bit(&log->dirty_log_pages, start, &start, &end,
+ BTRFS_LOG_PAGES_BITS, NULL)) {
+ struct btrfs_fs_info *fs_info = log->fs_info;
+
+ for (; start < end; start += fs_info->nodesize) {
+ struct extent_buffer *eb;
+
+ cond_resched();
+
+ unaccount_log_buffer(fs_info, start);
+ eb = find_extent_buffer(fs_info, start);
+ /*
+ * eb can be NULL in case it was already written and
+ * evicted from memory.
+ */
+ if (!eb)
+ continue;
+
+ wait_on_extent_buffer_writeback(eb);
+ if (test_and_clear_bit(EXTENT_BUFFER_DIRTY, &eb->bflags))
+ clear_extent_buffer_dirty(eb);
+ free_extent_buffer(eb);
+ }
+
+ start = end + 1;
+ }
+}
+
static void free_log_tree(struct btrfs_trans_handle *trans,
struct btrfs_root *log)
{
- int ret;
- struct walk_control wc = {
- .free = 1,
- .process_func = process_one_buffer
- };
+ if (trans && log->node) {
+ int ret;
+ struct walk_control wc = {
+ .free = 1,
+ .process_func = process_one_buffer
+ };
- if (log->node) {
ret = walk_log_tree(trans, log, &wc);
- if (ret) {
- if (trans)
- btrfs_abort_transaction(trans, ret);
- else
- btrfs_handle_fs_error(log->fs_info, ret, NULL);
- }
+ if (ret)
+ btrfs_abort_transaction(trans, ret);
+ } else if (!trans) {
+ unaccount_all_log_buffers(log);
}
- clear_extent_bits(&log->dirty_log_pages, 0, (u64)-1,
- EXTENT_DIRTY | EXTENT_NEW | EXTENT_NEED_WAIT);
+ clear_extent_bits(&log->dirty_log_pages, 0, (u64)-1, BTRFS_LOG_PAGES_BITS);
extent_io_tree_release(&log->log_csum_range);
btrfs_put_root(log);
}
/*
- * free all the extents used by the tree log. This should be called
- * at commit time of the full transaction
+ * Free all the extents used by a log tree.
+ *
+ * @trans: A transaction handle or NULL.
+ * @root: The parent root of a log tree.
+ *
+ * This should be called either at commit time of the full transaction, or when
+ * cleaning up a transaction that was aborted. In the former case @trans is not
+ * NULL, while in the second case @trans must be NULL.
*/
int btrfs_free_log(struct btrfs_trans_handle *trans, struct btrfs_root *root)
{
diff --git a/fs/btrfs/tree-log.h b/fs/btrfs/tree-log.h
index f6811c3df38a..d11a16a97e32 100644
--- a/fs/btrfs/tree-log.h
+++ b/fs/btrfs/tree-log.h
@@ -12,6 +12,13 @@
/* return value for btrfs_log_dentry_safe that means we don't need to log it at all */
#define BTRFS_NO_LOG_SYNC 256
+/*
+ * All the bits we can set for an extent buffer's range in a log root's
+ * ->dirty_log_pages io tree.
+ */
+#define BTRFS_LOG_PAGES_BITS (EXTENT_DIRTY | EXTENT_NEW | EXTENT_NEED_WAIT | \
+ EXTENT_UPTODATE)
+
struct btrfs_log_ctx {
int log_ret;
int log_transid;
--
2.33.0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment