In preparation for adding another incore inode tree tag, refactor the
code that sets and clears tags from the per-AG inode tree and the tree
of per-AG structures, and remove the open-coded versions used by the
blockgc code.
Note: For reclaim, we now rely on the radix tree tags instead of the
reclaimable inode count more heavily than we used to. The conversion
should be fine, but the logic isn't 100% identical.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Merge these two inode walk loops together, since they're pretty similar
now.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Pass a pointer to the actual eofb structure around the inode scanner
functions instead of a void pointer, now that none of the functions is
used as a callback.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Soon we're going to be adding two new callers to the incore inode walk
code: reclaim of incore inodes, and (later) inactivation of inodes.
Both states operate on inodes that no longer have any VFS state, so we
need to move the xfs_irele calls into the processing functions.
In other words, icwalk processing functions are responsible for cleaning
up whatever state changes are made by the corresponding icwalk igrab
function that picked the inode for processing.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Clean up the definition of which inode states are not eligible for
speculative preallocation garbage collecting by creating a private
#define. The deferred inactivation patchset will add two new entries to
the set of flags-to-ignore, so we want the definition not to end up a
cluttered mess.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
It turns out that there is a 1:1 mapping between the execute and goal
parameters that are passed to xfs_inode_walk_ag:
xfs_blockgc_scan_inode <=> XFS_ICWALK_BLOCKGC
xfs_dqrele_inode <=> XFS_ICWALK_DQRELE
Because of this exact correspondence, we don't need the execute function
pointer and can replace it with a direct call.
For the price of a forward static declaration, we can eliminate the
indirect function call. This likely has a negligible impact on
performance (since the execute function runs transactions), but it also
simplifies the function signature.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
The sole iter_flags is XFS_INODE_WALK_INEW_WAIT, and there are no users.
Remove the flag, and the parameter, and all the code that used it.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Move the INEW wait into xfs_dqrele_inode so that we can drop the
iter_flags parameter in the next patch.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Disentangle the dqrele_all inode grab code from the "generic" inode walk
grabbing code, and and use the opportunity to document why the dqrele
grab function does what it does. Since xfs_inode_walk_ag_grab is now
only used for blockgc, rename it to reflect that.
Ultimately, there will be four reasons to perform a walk of incore
inodes: quotaoff dquote releasing (dqrele), garbage collection of
speculative preallocations (blockgc), reclamation of incore inodes
(reclaim), and deferred inactivation (inodegc). Each of these four have
their own slightly different criteria for deciding if they want to
handle an inode, so it makes more sense to have four cohesive igrab
functions than one confusing parameteric grab function like we do now.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
As part of removing the indirect calls and radix tag implementation
details from the incore inode walk loop, create an enum to represent the
goal of the inode iteration. More immediately, this separate removes
the need for the "ICI_NOTAG" define which makes little sense.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Shorten the prefix so that all the incore inode cache walk code has
"xfs_icwalk" in the name somewhere.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Move the inode walk functions further down in the file to limit the
forward declarations to the two walk functions as we add new code that
uses the inode walks. We'll clean them out later (i.e. after the
deferred inode inactivation series).
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Once we're done with inactivating an inode, we're finished updating
metadata for that inode. This means that we can detach the dquots at
the end and not have to wait for reclaim to do it for us.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
The only external caller of xfs_inode_walk* happens in quotaoff, when we
want to walk all the incore inodes to detach the dquots. Move this code
to xfs_icache.c so that we can hide xfs_inode_walk as the starting step
in more cleanups of inode walks.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
for_each_perag_tag() is defined in xfs_icache.c for local use.
Promote this to xfs_ag.h and define equivalent iteration functions
so that we can use them to iterate AGs instead to replace open coded
perag walks and perag lookups.
We also convert as many of the straight forward open coded AG walks
to use these iterators as possible. Anything that is not a direct
conversion to an iterator is ignored and will be updated in future
commits.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
They are AG functions, not superblock functions, so move them to the
appropriate location.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
In xfs_inode_free_eofblocks, move the xfs_can_free_eofblocks call
further down in the function to the point where we have taken the
IOLOCK. This is preparation for the next patch, where we will need that
lock (or equivalent) so that we can check if there are any post-eof
blocks to clean out.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
In preparation of removing the historic icinode struct, move the flags2
field into the containing xfs_inode structure.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
In preparation of removing the historic icinode struct, move the
forkoff field into the containing xfs_inode structure.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
In preparation of removing the historic icinode struct, move the
flushiter field into the containing xfs_inode structure.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
In preparation of removing the historic icinode struct, move the nblocks
field into the containing xfs_inode structure.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
In preparation of removing the historic icinode struct, move the projid
field into the containing xfs_inode structure.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
The xfs_icdinode structure just contains a random mix of inode field,
which are all read from the on-disk inode and mostly not looked at
before reading the inode or initializing a new inode cluster. The
only exceptions are the forkoff and blocks field, which are used
in sanity checks for freshly allocated inodes.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Make sure di_flags2 is always initialized. We currently get this implicitly
by clearing the dinode core on allocating the in-core inode, but that is
about to go away.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Split looking up the dinode from xfs_imap_to_bp, which can be
significantly simplified as a result.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Since we're about to start using the blockgc workqueue to dispose of
inactivated inodes, strip the "block" prefix from the name; now it's
merely the general garbage collection (gc) workqueue.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Since xfs_inode_free_eofblocks and xfs_inode_free_cowblocks are now
internal static functions, we can save ourselves a cycling of the iolock
by passing the lock state out to xfs_blockgc_scan_inode and letting it
do all the unlocking.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Split the block preallocation garbage collection work into per-AG work
items so that we can take advantage of parallelization.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Shorten the names of the two functions that start and stop block
preallocation garbage collection and move them up to the other blockgc
functions.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Perform background block preallocation gc scans more efficiently by
walking the incore inode tree once.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Remove the separate cowblocks work items and knob so that we can control
and run everything from a single blockgc work queue. Note that the
speculative_prealloc_lifetime sysfs knob retains its historical name
even though the functions move to prefix xfs_blockgc_*.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
The clearing of posteof blocks and cowblocks serve the same purpose:
removing speculative block preallocations from inactive files. We don't
need to burn two radix tree tags on this, so combine them into one.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Change the one remaining caller of xfs_icache_free_cowblocks to use our
new combined blockgc scan function instead, since we will soon be
combining the two scans. This introduces a slight behavior change,
since a readonly remount now clears out post-EOF preallocations and not
just CoW staging extents.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Change the one remaining caller of xfs_icache_free_eofblocks to use our
new combined blockgc scan function instead, since we will soon be
combining the two scans. This introduces a slight behavior change,
since the XFS_IOC_FREE_EOFBLOCKS now clears out speculative CoW
reservations in addition to post-eof blocks.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Move the xfs_{eof,cow}blocks_worker and xfs_queue_{eof,cow}blocks
functions further down in the file so that the cleanups in the next
patches won't have to pre-declare static functions. No functional
changes.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
In anticipation of more restructuring of the eof/cowblocks gc code,
refactor calling of those two functions into a single internal helper
function, then present a new standard interface to purge speculative
block preallocations and start shifting higher level code to use that.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
If an inode creation is unable to reserve enough quota to handle the
modification, try clearing whatever space the filesystem might have been
hanging onto in the hopes of speeding up the filesystem. The flushing
behavior will become particularly important when we add deferred inode
inactivation because that will increase the amount of space that isn't
actively tied to user data.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Change the signature of xfs_blockgc_free_quota in preparation for the
next few patches. Callers can now pass EOF_FLAGS into the function to
control scan parameters; and the function will now pass back any
corruption errors seen while scanning, though for our retry loops we'll
just try again unconditionally.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Move this function further down in the file so that later cleanups won't
have to declare static functions. Change the name because we're about
to rework all the code that performs garbage collection of speculatively
allocated file blocks. No functional changes.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Buffered writers who have run out of quota reservation call
xfs_inode_free_quota_blocks to try to free any space reservations that
might reduce the quota usage. Unfortunately, the buffered write path
treats "out of project quota" the same as "out of overall space" so this
function has never supported scanning for space that might ease an "out
of project quota" condition.
We're about to start using this function for cases where we actually
/can/ tell if we're out of project quota, so add in this functionality.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Don't stall the cowblocks scan on a locked inode if we possibly can.
We'd much rather the background scanner keep moving.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
The functions to run an eof/cowblocks scan to try to reduce quota usage
are kind of a mess -- the logic repeatedly initializes an eofb structure
and there are logic bugs in the code that result in the cowblocks scan
never actually happening.
Replace all three functions with a single function that fills out an
eofb and runs both eof and cowblocks scans.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Fixes coccicheck warning:
fs/xfs/xfs_icache.c:1214:2-3: Unneeded semicolon
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
With the recent rework of the inode cluster flushing, we no longer
ever wait on the the inode flush "lock". It was never a lock in the
first place, just a completion to allow callers to wait for inode IO
to complete. We now never wait for flush completion as all inode
flushing is non-blocking. Hence we can get rid of all the iflock
infrastructure and instead just set and check a state flag.
Rename the XFS_IFLOCK flag to XFS_IFLUSHING, convert all the
xfs_iflock_nowait() test-and-set operations on that flag, and
replace all the xfs_ifunlock() calls to clear operations.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Use kmem_cache_alloc() directly.
All kmem_zone_alloc() users pass 0 as flags, which are translated into:
GFP_KERNEL | __GFP_NOWARN, and kmem_zone_alloc() loops forever until the
allocation succeeds.
We can use __GFP_NOFAIL to tell the allocator to loop forever rather
than doing it ourself, and because the allocation will never fail, we do
not need to use __GFP_NOWARN anymore. Hence, all callers can be
converted to use GFP_KERNEL | __GFP_NOFAIL
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: add a comment back in about nofail]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
We're going to split up the incore dquot state flags from the ondisk
dquot flags (eventually renaming this "type") so start by renaming the
three flags and the bitmask that are going to participate in this.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Rather than attach inodes to the cluster buffer just when we are
doing IO, attach the inodes to the cluster buffer when they are
dirtied. The means the buffer always carries a list of dirty inodes
that reference it, and we can use that list to make more fundamental
changes to inode writeback that aren't otherwise possible.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Inode reclaim is quite different now to the way described in various
comments, so update all the comments explaining what it does and how
it works.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Clean up xfs_reclaim_inodes() callers. Most callers want blocking
behaviour, so just make the existing SYNC_WAIT behaviour the
default.
For the xfs_reclaim_worker(), just call xfs_reclaim_inodes_ag()
directly because we just want optimistic clean inode reclaim to be
done in the background.
For xfs_quiesce_attr() we can just remove the inode reclaim calls as
they are a historic relic that was required to flush dirty inodes
that contained unlogged changes. We now log all changes to the
inodes, so the sync AIL push from xfs_log_quiesce() called by
xfs_quiesce_attr() will do all the required inode writeback for
freeze.
Seeing as we now want to loop until all reclaimable inodes have been
reclaimed, make xfs_reclaim_inodes() loop on the XFS_ICI_RECLAIM_TAG
tag rather than having xfs_reclaim_inodes_ag() tell it that inodes
were skipped. This is much more reliable and will always loop until
all reclaimable inodes are reclaimed.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>