docs: fs: convert docs without extension to ReST

There are 3 remaining files without an extension inside the fs docs
dir. Manually convert them to ReST.

In the case of the nfs/exporting.rst file, as the nfs docs aren't
ported yet, I opted to convert it and add an :orphan: tag there, which
should be removed when it gets added into an nfs-specific part of the
fs documentation.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
parent 5a5e045bb3
commit ec23eb54fb
11 changed files with 225 additions and 119 deletions
|
@ -1,3 +1,8 @@
|
||||||
|
=================
|
||||||
|
Directory Locking
|
||||||
|
=================
|
||||||
|
|
||||||
|
|
||||||
Locking scheme used for directory operations is based on two
|
Locking scheme used for directory operations is based on two
|
||||||
kinds of locks - per-inode (->i_rwsem) and per-filesystem
|
kinds of locks - per-inode (->i_rwsem) and per-filesystem
|
||||||
(->s_vfs_rename_mutex).
|
(->s_vfs_rename_mutex).
|
||||||
|
@ -27,14 +32,17 @@ NB: we might get away with locking the the source (and target in exchange
|
||||||
case) shared.
|
case) shared.
|
||||||
|
|
||||||
5) link creation. Locking rules:
|
5) link creation. Locking rules:
|
||||||
|
|
||||||
* lock parent
|
* lock parent
|
||||||
* check that source is not a directory
|
* check that source is not a directory
|
||||||
* lock source
|
* lock source
|
||||||
* call the method.
|
* call the method.
|
||||||
|
|
||||||
All locks are exclusive.
|
All locks are exclusive.
|
||||||
|
|
||||||
6) cross-directory rename. The trickiest in the whole bunch. Locking
|
6) cross-directory rename. The trickiest in the whole bunch. Locking
|
||||||
rules:
|
rules:
|
||||||
|
|
||||||
* lock the filesystem
|
* lock the filesystem
|
||||||
* lock parents in "ancestors first" order.
|
* lock parents in "ancestors first" order.
|
||||||
* find source and target.
|
* find source and target.
|
||||||
|
@ -46,6 +54,7 @@ rules:
|
||||||
* If the target exists, lock it. If the source is a non-directory,
|
* If the target exists, lock it. If the source is a non-directory,
|
||||||
lock it. If we need to lock both, do so in inode pointer order.
|
lock it. If we need to lock both, do so in inode pointer order.
|
||||||
* call the method.
|
* call the method.
|
||||||
|
|
||||||
All ->i_rwsem are taken exclusive. Again, we might get away with locking
|
All ->i_rwsem are taken exclusive. Again, we might get away with locking
|
||||||
the the source (and target in exchange case) shared.
|
the the source (and target in exchange case) shared.
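
Reviewer note: the "inode pointer order" rule quoted in this hunk can be sketched in userspace C. This is only an illustration, not the kernel's code: `struct demo_inode`, `demo_lock()` and `demo_lock_two()` are hypothetical names, and a plain flag stands in for ->i_rwsem taken exclusive. The point is that any two callers locking the same pair agree on the order, so neither can hold one lock while waiting for the other.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct inode; "locked" models ->i_rwsem held exclusive. */
struct demo_inode {
	int locked;
};

/* Acquisition log, kept only so the resulting order can be inspected. */
static struct demo_inode *demo_order[2];
static size_t demo_count;

static void demo_lock(struct demo_inode *inode)
{
	assert(!inode->locked);	/* a real lock would block here instead */
	inode->locked = 1;
	demo_order[demo_count++] = inode;
}

/*
 * Lock one or two inodes. When both are present and distinct, the one
 * at the lower address is always taken first, whatever the argument order.
 */
static void demo_lock_two(struct demo_inode *a, struct demo_inode *b)
{
	demo_count = 0;
	if (a && b && (uintptr_t)a > (uintptr_t)b) {
		struct demo_inode *tmp = a;

		a = b;
		b = tmp;
	}
	if (a)
		demo_lock(a);
	if (b && b != a)
		demo_lock(b);
}
```

Calling `demo_lock_two(&y, &x)` and `demo_lock_two(&x, &y)` on fresh inodes produces the same acquisition order, which is the property the rule relies on.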
|
||||||
|
|
||||||
|
@ -54,6 +63,7 @@ read, modified or removed by method will be locked by caller.
|
||||||
|
|
||||||
|
|
||||||
If no directory is its own ancestor, the scheme above is deadlock-free.
|
If no directory is its own ancestor, the scheme above is deadlock-free.
|
||||||
|
|
||||||
Proof:
|
Proof:
|
||||||
|
|
||||||
First of all, at any moment we have a partial ordering of the
|
First of all, at any moment we have a partial ordering of the
|
|
@ -20,6 +20,8 @@ algorithms work.
|
||||||
path-lookup
|
path-lookup
|
||||||
api-summary
|
api-summary
|
||||||
splice
|
splice
|
||||||
|
locking
|
||||||
|
directory-locking
|
||||||
|
|
||||||
Filesystem support layers
|
Filesystem support layers
|
||||||
=========================
|
=========================
|
||||||
|
|
|
@ -1,3 +1,7 @@
|
||||||
|
=======
|
||||||
|
Locking
|
||||||
|
=======
|
||||||
|
|
||||||
The text below describes the locking rules for VFS-related methods.
|
The text below describes the locking rules for VFS-related methods.
|
||||||
It is (believed to be) up-to-date. *Please*, if you change anything in
|
It is (believed to be) up-to-date. *Please*, if you change anything in
|
||||||
prototypes or locking protocols - update this file. And update the relevant
|
prototypes or locking protocols - update this file. And update the relevant
|
||||||
|
@ -5,10 +9,14 @@ instances in the tree, don't leave that to maintainers of filesystems/devices/
|
||||||
etc. At the very least, put the list of dubious cases in the end of this file.
|
etc. At the very least, put the list of dubious cases in the end of this file.
|
||||||
Don't turn it into log - maintainers of out-of-the-tree code are supposed to
|
Don't turn it into log - maintainers of out-of-the-tree code are supposed to
|
||||||
be able to use diff(1).
|
be able to use diff(1).
|
||||||
|
|
||||||
Thing currently missing here: socket operations. Alexey?
|
Thing currently missing here: socket operations. Alexey?
|
||||||
|
|
||||||
--------------------------- dentry_operations --------------------------
|
dentry_operations
|
||||||
prototypes:
|
=================
|
||||||
|
|
||||||
|
prototypes::
|
||||||
|
|
||||||
int (*d_revalidate)(struct dentry *, unsigned int);
|
int (*d_revalidate)(struct dentry *, unsigned int);
|
||||||
int (*d_weak_revalidate)(struct dentry *, unsigned int);
|
int (*d_weak_revalidate)(struct dentry *, unsigned int);
|
||||||
int (*d_hash)(const struct dentry *, struct qstr *);
|
int (*d_hash)(const struct dentry *, struct qstr *);
|
||||||
|
@ -24,7 +32,10 @@ prototypes:
|
||||||
struct dentry *(*d_real)(struct dentry *, const struct inode *);
|
struct dentry *(*d_real)(struct dentry *, const struct inode *);
|
||||||
|
|
||||||
locking rules:
|
locking rules:
|
||||||
rename_lock ->d_lock may block rcu-walk
|
|
||||||
|
================== =========== ======== ============== ========
|
||||||
|
ops rename_lock ->d_lock may block rcu-walk
|
||||||
|
================== =========== ======== ============== ========
|
||||||
d_revalidate: no no yes (ref-walk) maybe
|
d_revalidate: no no yes (ref-walk) maybe
|
||||||
d_weak_revalidate: no no yes no
|
d_weak_revalidate: no no yes no
|
||||||
d_hash no no no maybe
|
d_hash no no no maybe
|
||||||
|
@ -38,9 +49,13 @@ d_dname: no no no no
|
||||||
d_automount: no no yes no
|
d_automount: no no yes no
|
||||||
d_manage: no no yes (ref-walk) maybe
|
d_manage: no no yes (ref-walk) maybe
|
||||||
d_real no no yes no
|
d_real no no yes no
|
||||||
|
================== =========== ======== ============== ========
|
||||||
|
|
||||||
|
inode_operations
|
||||||
|
================
|
||||||
|
|
||||||
|
prototypes::
|
||||||
|
|
||||||
--------------------------- inode_operations ---------------------------
|
|
||||||
prototypes:
|
|
||||||
int (*create) (struct inode *,struct dentry *,umode_t, bool);
|
int (*create) (struct inode *,struct dentry *,umode_t, bool);
|
||||||
struct dentry * (*lookup) (struct inode *,struct dentry *, unsigned int);
|
struct dentry * (*lookup) (struct inode *,struct dentry *, unsigned int);
|
||||||
int (*link) (struct dentry *,struct inode *,struct dentry *);
|
int (*link) (struct dentry *,struct inode *,struct dentry *);
|
||||||
|
@ -68,7 +83,10 @@ prototypes:
|
||||||
|
|
||||||
locking rules:
|
locking rules:
|
||||||
all may block
|
all may block
|
||||||
i_rwsem(inode)
|
|
||||||
|
============ =============================================
|
||||||
|
ops i_rwsem(inode)
|
||||||
|
============ =============================================
|
||||||
lookup: shared
|
lookup: shared
|
||||||
create: exclusive
|
create: exclusive
|
||||||
link: exclusive (both)
|
link: exclusive (both)
|
||||||
|
@ -89,17 +107,21 @@ fiemap: no
|
||||||
update_time: no
|
update_time: no
|
||||||
atomic_open: exclusive
|
atomic_open: exclusive
|
||||||
tmpfile: no
|
tmpfile: no
|
||||||
|
============ =============================================
|
||||||
|
|
||||||
|
|
||||||
Additionally, ->rmdir(), ->unlink() and ->rename() have ->i_rwsem
|
Additionally, ->rmdir(), ->unlink() and ->rename() have ->i_rwsem
|
||||||
exclusive on victim.
|
exclusive on victim.
|
||||||
cross-directory ->rename() has (per-superblock) ->s_vfs_rename_sem.
|
cross-directory ->rename() has (per-superblock) ->s_vfs_rename_sem.
|
||||||
|
|
||||||
See Documentation/filesystems/directory-locking for more detailed discussion
|
See Documentation/filesystems/directory-locking.rst for more detailed discussion
|
||||||
of the locking scheme for directory operations.
|
of the locking scheme for directory operations.
|
||||||
|
|
||||||
----------------------- xattr_handler operations -----------------------
|
xattr_handler operations
|
||||||
prototypes:
|
========================
|
||||||
|
|
||||||
|
prototypes::
|
||||||
|
|
||||||
bool (*list)(struct dentry *dentry);
|
bool (*list)(struct dentry *dentry);
|
||||||
int (*get)(const struct xattr_handler *handler, struct dentry *dentry,
|
int (*get)(const struct xattr_handler *handler, struct dentry *dentry,
|
||||||
struct inode *inode, const char *name, void *buffer,
|
struct inode *inode, const char *name, void *buffer,
|
||||||
|
@ -110,13 +132,20 @@ prototypes:
|
||||||
|
|
||||||
locking rules:
|
locking rules:
|
||||||
all may block
|
all may block
|
||||||
i_rwsem(inode)
|
|
||||||
|
===== ==============
|
||||||
|
ops i_rwsem(inode)
|
||||||
|
===== ==============
|
||||||
list: no
|
list: no
|
||||||
get: no
|
get: no
|
||||||
set: exclusive
|
set: exclusive
|
||||||
|
===== ==============
|
||||||
|
|
||||||
|
super_operations
|
||||||
|
================
|
||||||
|
|
||||||
|
prototypes::
|
||||||
|
|
||||||
--------------------------- super_operations ---------------------------
|
|
||||||
prototypes:
|
|
||||||
struct inode *(*alloc_inode)(struct super_block *sb);
|
struct inode *(*alloc_inode)(struct super_block *sb);
|
||||||
void (*free_inode)(struct inode *);
|
void (*free_inode)(struct inode *);
|
||||||
void (*destroy_inode)(struct inode *);
|
void (*destroy_inode)(struct inode *);
|
||||||
|
@ -138,7 +167,10 @@ prototypes:
|
||||||
|
|
||||||
locking rules:
|
locking rules:
|
||||||
All may block [not true, see below]
|
All may block [not true, see below]
|
||||||
s_umount
|
|
||||||
|
====================== ============ ========================
|
||||||
|
ops s_umount note
|
||||||
|
====================== ============ ========================
|
||||||
alloc_inode:
|
alloc_inode:
|
||||||
free_inode: called from RCU callback
|
free_inode: called from RCU callback
|
||||||
destroy_inode:
|
destroy_inode:
|
||||||
|
@ -157,6 +189,7 @@ show_options: no (namespace_sem)
|
||||||
quota_read: no (see below)
|
quota_read: no (see below)
|
||||||
quota_write: no (see below)
|
quota_write: no (see below)
|
||||||
bdev_try_to_free_page: no (see below)
|
bdev_try_to_free_page: no (see below)
|
||||||
|
====================== ============ ========================
|
||||||
|
|
||||||
->statfs() has s_umount (shared) when called by ustat(2) (native or
|
->statfs() has s_umount (shared) when called by ustat(2) (native or
|
||||||
compat), but that's an accident of bad API; s_umount is used to pin
|
compat), but that's an accident of bad API; s_umount is used to pin
|
||||||
|
@ -164,31 +197,44 @@ the superblock down when we only have dev_t given us by userland to
|
||||||
identify the superblock. Everything else (statfs(), fstatfs(), etc.)
|
identify the superblock. Everything else (statfs(), fstatfs(), etc.)
|
||||||
doesn't hold it when calling ->statfs() - superblock is pinned down
|
doesn't hold it when calling ->statfs() - superblock is pinned down
|
||||||
by resolving the pathname passed to syscall.
|
by resolving the pathname passed to syscall.
|
||||||
|
|
||||||
->quota_read() and ->quota_write() functions are both guaranteed to
|
->quota_read() and ->quota_write() functions are both guaranteed to
|
||||||
be the only ones operating on the quota file by the quota code (via
|
be the only ones operating on the quota file by the quota code (via
|
||||||
dqio_sem) (unless an admin really wants to screw up something and
|
dqio_sem) (unless an admin really wants to screw up something and
|
||||||
writes to quota files with quotas on). For other details about locking
|
writes to quota files with quotas on). For other details about locking
|
||||||
see also dquot_operations section.
|
see also dquot_operations section.
|
||||||
|
|
||||||
->bdev_try_to_free_page is called from the ->releasepage handler of
|
->bdev_try_to_free_page is called from the ->releasepage handler of
|
||||||
the block device inode. See there for more details.
|
the block device inode. See there for more details.
|
||||||
|
|
||||||
--------------------------- file_system_type ---------------------------
|
file_system_type
|
||||||
prototypes:
|
================
|
||||||
|
|
||||||
|
prototypes::
|
||||||
|
|
||||||
struct dentry *(*mount) (struct file_system_type *, int,
|
struct dentry *(*mount) (struct file_system_type *, int,
|
||||||
const char *, void *);
|
const char *, void *);
|
||||||
void (*kill_sb) (struct super_block *);
|
void (*kill_sb) (struct super_block *);
|
||||||
|
|
||||||
locking rules:
|
locking rules:
|
||||||
may block
|
|
||||||
|
======= =========
|
||||||
|
ops may block
|
||||||
|
======= =========
|
||||||
mount yes
|
mount yes
|
||||||
kill_sb yes
|
kill_sb yes
|
||||||
|
======= =========
|
||||||
|
|
||||||
->mount() returns ERR_PTR or the root dentry; its superblock should be locked
|
->mount() returns ERR_PTR or the root dentry; its superblock should be locked
|
||||||
on return.
|
on return.
|
||||||
|
|
||||||
->kill_sb() takes a write-locked superblock, does all shutdown work on it,
|
->kill_sb() takes a write-locked superblock, does all shutdown work on it,
|
||||||
unlocks and drops the reference.
|
unlocks and drops the reference.
|
||||||
|
|
||||||
--------------------------- address_space_operations --------------------------
|
address_space_operations
|
||||||
prototypes:
|
========================
|
||||||
|
prototypes::
|
||||||
|
|
||||||
int (*writepage)(struct page *page, struct writeback_control *wbc);
|
int (*writepage)(struct page *page, struct writeback_control *wbc);
|
||||||
int (*readpage)(struct file *, struct page *);
|
int (*readpage)(struct file *, struct page *);
|
||||||
int (*writepages)(struct address_space *, struct writeback_control *);
|
int (*writepages)(struct address_space *, struct writeback_control *);
|
||||||
|
@ -218,7 +264,9 @@ prototypes:
|
||||||
locking rules:
|
locking rules:
|
||||||
All except set_page_dirty and freepage may block
|
All except set_page_dirty and freepage may block
|
||||||
|
|
||||||
PageLocked(page) i_rwsem
|
====================== ======================== =========
|
||||||
|
ops PageLocked(page) i_rwsem
|
||||||
|
====================== ======================== =========
|
||||||
writepage: yes, unlocks (see below)
|
writepage: yes, unlocks (see below)
|
||||||
readpage: yes, unlocks
|
readpage: yes, unlocks
|
||||||
writepages:
|
writepages:
|
||||||
|
@ -239,6 +287,7 @@ is_partially_uptodate: yes
|
||||||
error_remove_page: yes
|
error_remove_page: yes
|
||||||
swap_activate: no
|
swap_activate: no
|
||||||
swap_deactivate: no
|
swap_deactivate: no
|
||||||
|
====================== ======================== =========
|
||||||
|
|
||||||
->write_begin(), ->write_end() and ->readpage() may be called from
|
->write_begin(), ->write_end() and ->readpage() may be called from
|
||||||
the request handler (/dev/loop).
|
the request handler (/dev/loop).
|
||||||
|
@ -299,10 +348,10 @@ in the filesystem like having dirty inodes at umount and losing written data.
|
||||||
|
|
||||||
->writepages() is used for periodic writeback and for syscall-initiated
|
->writepages() is used for periodic writeback and for syscall-initiated
|
||||||
sync operations. The address_space should start I/O against at least
|
sync operations. The address_space should start I/O against at least
|
||||||
*nr_to_write pages. *nr_to_write must be decremented for each page which is
|
``*nr_to_write`` pages. ``*nr_to_write`` must be decremented for each page
|
||||||
written. The address_space implementation may write more (or less) pages
|
which is written. The address_space implementation may write more (or less)
|
||||||
than *nr_to_write asks for, but it should try to be reasonably close. If
|
pages than ``*nr_to_write`` asks for, but it should try to be reasonably close.
|
||||||
nr_to_write is NULL, all dirty pages must be written.
|
If nr_to_write is NULL, all dirty pages must be written.
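
Reviewer note: the ``*nr_to_write`` contract described in this paragraph can be modelled with a small userspace sketch. Everything here is an assumption for illustration: `struct demo_page` and `demo_writepages()` are hypothetical, the budget is passed as a bare `long *` as the quoted text phrases it, and "writing" a page just clears its dirty flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical page stand-in: only the dirty bit is modelled. */
struct demo_page {
	int dirty;
};

/*
 * Toy ->writepages(): "write" dirty pages until the budget is used up.
 * A NULL budget means every dirty page must be written.
 * Returns the number of pages written.
 */
static size_t demo_writepages(struct demo_page *pages, size_t n,
			      long *nr_to_write)
{
	size_t written = 0;

	for (size_t i = 0; i < n; i++) {
		if (nr_to_write && *nr_to_write <= 0)
			break;			/* budget exhausted */
		if (!pages[i].dirty)
			continue;		/* clean pages cost nothing */
		pages[i].dirty = 0;		/* pretend I/O was started */
		written++;
		if (nr_to_write)
			(*nr_to_write)--;	/* one decrement per page written */
	}
	return written;
}
```

This sketch stops exactly at the budget; the text allows a real implementation to write somewhat more or fewer pages as long as it stays reasonably close.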
|
||||||
|
|
||||||
writepages should _only_ write pages which are present on
|
writepages should _only_ write pages which are present on
|
||||||
mapping->io_pages.
|
mapping->io_pages.
|
||||||
|
@ -344,23 +393,34 @@ address space operations.
|
||||||
->swap_deactivate() will be called in the sys_swapoff()
|
->swap_deactivate() will be called in the sys_swapoff()
|
||||||
path after ->swap_activate() returned success.
|
path after ->swap_activate() returned success.
|
||||||
|
|
||||||
----------------------- file_lock_operations ------------------------------
|
file_lock_operations
|
||||||
prototypes:
|
====================
|
||||||
|
|
||||||
|
prototypes::
|
||||||
|
|
||||||
void (*fl_copy_lock)(struct file_lock *, struct file_lock *);
|
void (*fl_copy_lock)(struct file_lock *, struct file_lock *);
|
||||||
void (*fl_release_private)(struct file_lock *);
|
void (*fl_release_private)(struct file_lock *);
|
||||||
|
|
||||||
|
|
||||||
locking rules:
|
locking rules:
|
||||||
inode->i_lock may block
|
|
||||||
fl_copy_lock: yes no
|
|
||||||
fl_release_private: maybe maybe[1]
|
|
||||||
|
|
||||||
[1]: ->fl_release_private for flock or POSIX locks is currently allowed
|
=================== ============= =========
|
||||||
|
ops inode->i_lock may block
|
||||||
|
=================== ============= =========
|
||||||
|
fl_copy_lock: yes no
|
||||||
|
fl_release_private: maybe maybe[1]_
|
||||||
|
=================== ============= =========
|
||||||
|
|
||||||
|
.. [1]:
|
||||||
|
->fl_release_private for flock or POSIX locks is currently allowed
|
||||||
to block. Leases however can still be freed while the i_lock is held and
|
to block. Leases however can still be freed while the i_lock is held and
|
||||||
so fl_release_private called on a lease should not block.
|
so fl_release_private called on a lease should not block.
|
||||||
|
|
||||||
----------------------- lock_manager_operations ---------------------------
|
lock_manager_operations
|
||||||
prototypes:
|
=======================
|
||||||
|
|
||||||
|
prototypes::
|
||||||
|
|
||||||
void (*lm_notify)(struct file_lock *); /* unblock callback */
|
void (*lm_notify)(struct file_lock *); /* unblock callback */
|
||||||
int (*lm_grant)(struct file_lock *, struct file_lock *, int);
|
int (*lm_grant)(struct file_lock *, struct file_lock *, int);
|
||||||
void (*lm_break)(struct file_lock *); /* break_lease callback */
|
void (*lm_break)(struct file_lock *); /* break_lease callback */
|
||||||
|
@ -368,24 +428,33 @@ prototypes:
|
||||||
|
|
||||||
locking rules:
|
locking rules:
|
||||||
|
|
||||||
inode->i_lock blocked_lock_lock may block
|
========== ============= ================= =========
|
||||||
|
ops inode->i_lock blocked_lock_lock may block
|
||||||
|
========== ============= ================= =========
|
||||||
lm_notify: yes yes no
|
lm_notify: yes yes no
|
||||||
lm_grant: no no no
|
lm_grant: no no no
|
||||||
lm_break: yes no no
|
lm_break: yes no no
|
||||||
lm_change yes no no
|
lm_change yes no no
|
||||||
|
========== ============= ================= =========
|
||||||
|
|
||||||
|
buffer_head
|
||||||
|
===========
|
||||||
|
|
||||||
|
prototypes::
|
||||||
|
|
||||||
--------------------------- buffer_head -----------------------------------
|
|
||||||
prototypes:
|
|
||||||
void (*b_end_io)(struct buffer_head *bh, int uptodate);
|
void (*b_end_io)(struct buffer_head *bh, int uptodate);
|
||||||
|
|
||||||
locking rules:
|
locking rules:
|
||||||
|
|
||||||
called from interrupts. In other words, extreme care is needed here.
|
called from interrupts. In other words, extreme care is needed here.
|
||||||
bh is locked, but that's all warranties we have here. Currently only RAID1,
|
bh is locked, but that's all warranties we have here. Currently only RAID1,
|
||||||
highmem, fs/buffer.c, and fs/ntfs/aops.c are providing these. Block devices
|
highmem, fs/buffer.c, and fs/ntfs/aops.c are providing these. Block devices
|
||||||
call this method upon the IO completion.
|
call this method upon the IO completion.
|
||||||
|
|
||||||
--------------------------- block_device_operations -----------------------
|
block_device_operations
|
||||||
prototypes:
|
=======================
|
||||||
|
prototypes::
|
||||||
|
|
||||||
int (*open) (struct block_device *, fmode_t);
|
int (*open) (struct block_device *, fmode_t);
|
||||||
int (*release) (struct gendisk *, fmode_t);
|
int (*release) (struct gendisk *, fmode_t);
|
||||||
int (*ioctl) (struct block_device *, fmode_t, unsigned, unsigned long);
|
int (*ioctl) (struct block_device *, fmode_t, unsigned, unsigned long);
|
||||||
|
@ -399,7 +468,10 @@ prototypes:
|
||||||
void (*swap_slot_free_notify) (struct block_device *, unsigned long);
|
void (*swap_slot_free_notify) (struct block_device *, unsigned long);
|
||||||
|
|
||||||
locking rules:
|
locking rules:
|
||||||
bd_mutex
|
|
||||||
|
======================= ===================
|
||||||
|
ops bd_mutex
|
||||||
|
======================= ===================
|
||||||
open: yes
|
open: yes
|
||||||
release: yes
|
release: yes
|
||||||
ioctl: no
|
ioctl: no
|
||||||
|
@ -410,6 +482,7 @@ unlock_native_capacity: no
|
||||||
revalidate_disk: no
|
revalidate_disk: no
|
||||||
getgeo: no
|
getgeo: no
|
||||||
swap_slot_free_notify: no (see below)
|
swap_slot_free_notify: no (see below)
|
||||||
|
======================= ===================
|
||||||
|
|
||||||
media_changed, unlock_native_capacity and revalidate_disk are called only from
|
media_changed, unlock_native_capacity and revalidate_disk are called only from
|
||||||
check_disk_change().
|
check_disk_change().
|
||||||
|
@ -418,8 +491,11 @@ swap_slot_free_notify is called with swap_lock and sometimes the page lock
|
||||||
held.
|
held.
|
||||||
|
|
||||||
|
|
||||||
--------------------------- file_operations -------------------------------
|
file_operations
|
||||||
prototypes:
|
===============
|
||||||
|
|
||||||
|
prototypes::
|
||||||
|
|
||||||
loff_t (*llseek) (struct file *, loff_t, int);
|
loff_t (*llseek) (struct file *, loff_t, int);
|
||||||
ssize_t (*read) (struct file *, char __user *, size_t, loff_t *);
|
ssize_t (*read) (struct file *, char __user *, size_t, loff_t *);
|
||||||
ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *);
|
ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *);
|
||||||
|
@ -455,7 +531,6 @@ prototypes:
|
||||||
size_t, unsigned int);
|
size_t, unsigned int);
|
||||||
int (*setlease)(struct file *, long, struct file_lock **, void **);
|
int (*setlease)(struct file *, long, struct file_lock **, void **);
|
||||||
long (*fallocate)(struct file *, int, loff_t, loff_t);
|
long (*fallocate)(struct file *, int, loff_t, loff_t);
|
||||||
};
|
|
||||||
|
|
||||||
locking rules:
|
locking rules:
|
||||||
All may block.
|
All may block.
|
||||||
|
@ -490,8 +565,11 @@ in sys_read() and friends.
|
||||||
the lease within the individual filesystem to record the result of the
|
the lease within the individual filesystem to record the result of the
|
||||||
operation
|
operation
|
||||||
|
|
||||||
--------------------------- dquot_operations -------------------------------
|
dquot_operations
|
||||||
prototypes:
|
================
|
||||||
|
|
||||||
|
prototypes::
|
||||||
|
|
||||||
int (*write_dquot) (struct dquot *);
|
int (*write_dquot) (struct dquot *);
|
||||||
int (*acquire_dquot) (struct dquot *);
|
int (*acquire_dquot) (struct dquot *);
|
||||||
int (*release_dquot) (struct dquot *);
|
int (*release_dquot) (struct dquot *);
|
||||||
|
@ -503,20 +581,26 @@ a proper locking wrt the filesystem and call the generic quota operations.
|
||||||
|
|
||||||
What filesystem should expect from the generic quota functions:
|
What filesystem should expect from the generic quota functions:
|
||||||
|
|
||||||
FS recursion Held locks when called
|
============== ============ =========================
|
||||||
|
ops FS recursion Held locks when called
|
||||||
|
============== ============ =========================
|
||||||
write_dquot: yes dqonoff_sem or dqptr_sem
|
write_dquot: yes dqonoff_sem or dqptr_sem
|
||||||
acquire_dquot: yes dqonoff_sem or dqptr_sem
|
acquire_dquot: yes dqonoff_sem or dqptr_sem
|
||||||
release_dquot: yes dqonoff_sem or dqptr_sem
|
release_dquot: yes dqonoff_sem or dqptr_sem
|
||||||
mark_dirty: no -
|
mark_dirty: no -
|
||||||
write_info: yes dqonoff_sem
|
write_info: yes dqonoff_sem
|
||||||
|
============== ============ =========================
|
||||||
|
|
||||||
FS recursion means calling ->quota_read() and ->quota_write() from superblock
|
FS recursion means calling ->quota_read() and ->quota_write() from superblock
|
||||||
operations.
|
operations.
|
||||||
|
|
||||||
More details about quota locking can be found in fs/dquot.c.
|
More details about quota locking can be found in fs/dquot.c.
|
||||||
|
|
||||||
--------------------------- vm_operations_struct -----------------------------
|
vm_operations_struct
|
||||||
prototypes:
|
====================
|
||||||
|
|
||||||
|
prototypes::
|
||||||
|
|
||||||
void (*open)(struct vm_area_struct*);
|
void (*open)(struct vm_area_struct*);
|
||||||
void (*close)(struct vm_area_struct*);
|
void (*close)(struct vm_area_struct*);
|
||||||
vm_fault_t (*fault)(struct vm_area_struct*, struct vm_fault *);
|
vm_fault_t (*fault)(struct vm_area_struct*, struct vm_fault *);
|
||||||
|
@ -525,7 +609,10 @@ prototypes:
|
||||||
int (*access)(struct vm_area_struct *, unsigned long, void*, int, int);
|
int (*access)(struct vm_area_struct *, unsigned long, void*, int, int);
|
||||||
|
|
||||||
locking rules:
|
locking rules:
|
||||||
mmap_sem PageLocked(page)
|
|
||||||
|
============= ======== ===========================
|
||||||
|
ops mmap_sem PageLocked(page)
|
||||||
|
============= ======== ===========================
|
||||||
open: yes
|
open: yes
|
||||||
close: yes
|
close: yes
|
||||||
fault: yes can return with page locked
|
fault: yes can return with page locked
|
||||||
|
@ -533,6 +620,7 @@ map_pages: yes
|
||||||
page_mkwrite: yes can return with page locked
|
page_mkwrite: yes can return with page locked
|
||||||
pfn_mkwrite: yes
|
pfn_mkwrite: yes
|
||||||
access: yes
|
access: yes
|
||||||
|
============= ======== ===========================
|
||||||
|
|
||||||
->fault() is called when a previously not present pte is about
|
->fault() is called when a previously not present pte is about
|
||||||
to be faulted in. The filesystem must find and return the page associated
|
to be faulted in. The filesystem must find and return the page associated
|
||||||
|
@ -569,7 +657,8 @@ access_process_vm(), typically used to debug a process through
|
||||||
/proc/pid/mem or ptrace. This function is needed only for
|
/proc/pid/mem or ptrace. This function is needed only for
|
||||||
VM_IO | VM_PFNMAP VMAs.
|
VM_IO | VM_PFNMAP VMAs.
|
||||||
|
|
||||||
================================================================================
|
--------------------------------------------------------------------------------
|
||||||
|
|
||||||
Dubious stuff
|
Dubious stuff
|
||||||
|
|
||||||
(if you break something or notice that it is broken and do not fix it yourself
|
(if you break something or notice that it is broken and do not fix it yourself
|
|
@ -1,3 +1,4 @@
|
||||||
|
:orphan:
|
||||||
|
|
||||||
Making Filesystems Exportable
|
Making Filesystems Exportable
|
||||||
=============================
|
=============================
|
||||||
|
@ -42,9 +43,9 @@ filehandle fragment, there is no automatic creation of a path prefix
|
||||||
for the object. This leads to two related but distinct features of
|
for the object. This leads to two related but distinct features of
|
||||||
the dcache that are not needed for normal filesystem access.
|
the dcache that are not needed for normal filesystem access.
|
||||||
|
|
||||||
1/ The dcache must sometimes contain objects that are not part of the
|
1. The dcache must sometimes contain objects that are not part of the
|
||||||
proper prefix. i.e that are not connected to the root.
|
proper prefix. i.e that are not connected to the root.
|
||||||
2/ The dcache must be prepared for a newly found (via ->lookup) directory
|
2. The dcache must be prepared for a newly found (via ->lookup) directory
|
||||||
to already have a (non-connected) dentry, and must be able to move
|
to already have a (non-connected) dentry, and must be able to move
|
||||||
that dentry into place (based on the parent and name in the
|
that dentry into place (based on the parent and name in the
|
||||||
->lookup). This is particularly needed for directories as
|
->lookup). This is particularly needed for directories as
|
||||||
|
@ -52,7 +53,7 @@ the dcache that are not needed for normal filesystem access.
|
||||||
|
|
||||||
To implement these features, the dcache has:
|
To implement these features, the dcache has:
|
||||||
|
|
||||||
a/ A dentry flag DCACHE_DISCONNECTED which is set on
|
a. A dentry flag DCACHE_DISCONNECTED which is set on
|
||||||
any dentry that might not be part of the proper prefix.
|
any dentry that might not be part of the proper prefix.
|
||||||
This is set when anonymous dentries are created, and cleared when a
|
This is set when anonymous dentries are created, and cleared when a
|
||||||
dentry is noticed to be a child of a dentry which is in the proper
|
dentry is noticed to be a child of a dentry which is in the proper
|
||||||
|
@ -71,19 +72,23 @@ a/ A dentry flag DCACHE_DISCONNECTED which is set on
|
||||||
dentries. That guarantees that we won't need to hunt them down upon
|
dentries. That guarantees that we won't need to hunt them down upon
|
||||||
umount.
|
umount.
|
||||||
|
|
||||||
b/ A primitive for creation of secondary roots - d_obtain_root(inode).
|
b. A primitive for creation of secondary roots - d_obtain_root(inode).
|
||||||
Those do _not_ bear DCACHE_DISCONNECTED. They are placed on the
|
Those do _not_ bear DCACHE_DISCONNECTED. They are placed on the
|
||||||
per-superblock list (->s_roots), so they can be located at umount
|
per-superblock list (->s_roots), so they can be located at umount
|
||||||
time for eviction purposes.
|
time for eviction purposes.
|
||||||
|
|
||||||
c/ Helper routines to allocate anonymous dentries, and to help attach
|
c. Helper routines to allocate anonymous dentries, and to help attach
|
||||||
loose directory dentries at lookup time. They are:
|
loose directory dentries at lookup time. They are:
|
||||||
|
|
||||||
d_obtain_alias(inode) will return a dentry for the given inode.
|
d_obtain_alias(inode) will return a dentry for the given inode.
|
||||||
If the inode already has a dentry, one of those is returned.
|
If the inode already has a dentry, one of those is returned.
|
||||||
|
|
||||||
If it doesn't, a new anonymous (IS_ROOT and
|
If it doesn't, a new anonymous (IS_ROOT and
|
||||||
DCACHE_DISCONNECTED) dentry is allocated and attached.
|
DCACHE_DISCONNECTED) dentry is allocated and attached.
|
||||||
|
|
||||||
In the case of a directory, care is taken that only one dentry
|
In the case of a directory, care is taken that only one dentry
|
||||||
can ever be attached.
|
can ever be attached.
|
||||||
|
|
||||||
d_splice_alias(inode, dentry) will introduce a new dentry into the tree;
|
d_splice_alias(inode, dentry) will introduce a new dentry into the tree;
|
||||||
either the passed-in dentry or a preexisting alias for the given inode
|
either the passed-in dentry or a preexisting alias for the given inode
|
||||||
(such as an anonymous one created by d_obtain_alias), if appropriate.
|
(such as an anonymous one created by d_obtain_alias), if appropriate.
|
||||||
|
@ -95,17 +100,17 @@ Filesystem Issues
|
||||||
|
|
||||||
For a filesystem to be exportable it must:
|
For a filesystem to be exportable it must:
|
||||||
|
|
||||||
1/ provide the filehandle fragment routines described below.
|
1. provide the filehandle fragment routines described below.
|
||||||
2/ make sure that d_splice_alias is used rather than d_add
|
2. make sure that d_splice_alias is used rather than d_add
|
||||||
when ->lookup finds an inode for a given parent and name.
|
when ->lookup finds an inode for a given parent and name.
|
||||||
|
|
||||||
If inode is NULL, d_splice_alias(inode, dentry) is equivalent to
|
If inode is NULL, d_splice_alias(inode, dentry) is equivalent to::
|
||||||
|
|
||||||
d_add(dentry, inode), NULL
|
d_add(dentry, inode), NULL
|
||||||
|
|
||||||
Similarly, d_splice_alias(ERR_PTR(err), dentry) = ERR_PTR(err)
|
Similarly, d_splice_alias(ERR_PTR(err), dentry) = ERR_PTR(err)
|
||||||
|
|
||||||
Typically the ->lookup routine will simply end with a:
|
Typically the ->lookup routine will simply end with a::
|
||||||
|
|
||||||
return d_splice_alias(inode, dentry);
|
return d_splice_alias(inode, dentry);
|
||||||
}
|
}
|
|
@@ -20,7 +20,7 @@ kernel which allows different filesystem implementations to coexist.
|
||||||
|
|
||||||
VFS system calls open(2), stat(2), read(2), write(2), chmod(2) and so on
|
VFS system calls open(2), stat(2), read(2), write(2), chmod(2) and so on
|
||||||
are called from a process context. Filesystem locking is described in
|
are called from a process context. Filesystem locking is described in
|
||||||
the document Documentation/filesystems/Locking.
|
the document Documentation/filesystems/locking.rst.
|
||||||
|
|
||||||
|
|
||||||
Directory Entry Cache (dcache)
|
Directory Entry Cache (dcache)
|
||||||
|
|
|
@@ -24,7 +24,7 @@
|
||||||
*/
|
*/
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* See Documentation/filesystems/nfs/Exporting
|
* See Documentation/filesystems/nfs/exporting.rst
|
||||||
* and examples in fs/exportfs
|
* and examples in fs/exportfs
|
||||||
*
|
*
|
||||||
* Since cifs is a network file system, an "fsid" must be included for
|
* Since cifs is a network file system, an "fsid" must be included for
|
||||||
|
|
|
@@ -7,7 +7,7 @@
|
||||||
* and for mapping back from file handles to dentries.
|
* and for mapping back from file handles to dentries.
|
||||||
*
|
*
|
||||||
* For details on why we do all the strange and hairy things in here
|
* For details on why we do all the strange and hairy things in here
|
||||||
* take a look at Documentation/filesystems/nfs/Exporting.
|
* take a look at Documentation/filesystems/nfs/exporting.rst.
|
||||||
*/
|
*/
|
||||||
#include <linux/exportfs.h>
|
#include <linux/exportfs.h>
|
||||||
#include <linux/fs.h>
|
#include <linux/fs.h>
|
||||||
|
|
|
@@ -10,7 +10,7 @@
|
||||||
*
|
*
|
||||||
* The following files are helpful:
|
* The following files are helpful:
|
||||||
*
|
*
|
||||||
* Documentation/filesystems/nfs/Exporting
|
* Documentation/filesystems/nfs/exporting.rst
|
||||||
* fs/exportfs/expfs.c.
|
* fs/exportfs/expfs.c.
|
||||||
*/
|
*/
|
||||||
|
|
||||||
|
|
|
@@ -555,7 +555,7 @@ static int orangefs_fsync(struct file *file,
|
||||||
* Change the file pointer position for an instance of an open file.
|
* Change the file pointer position for an instance of an open file.
|
||||||
*
|
*
|
||||||
* \note If .llseek is overriden, we must acquire lock as described in
|
* \note If .llseek is overriden, we must acquire lock as described in
|
||||||
* Documentation/filesystems/Locking.
|
* Documentation/filesystems/locking.rst.
|
||||||
*
|
*
|
||||||
* Future upgrade could support SEEK_DATA and SEEK_HOLE but would
|
* Future upgrade could support SEEK_DATA and SEEK_HOLE but would
|
||||||
* require much changes to the FS
|
* require much changes to the FS
|
||||||
|
|
|
@@ -151,7 +151,7 @@ struct dentry_operations {
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* Locking rules for dentry_operations callbacks are to be found in
|
* Locking rules for dentry_operations callbacks are to be found in
|
||||||
* Documentation/filesystems/Locking. Keep it updated!
|
* Documentation/filesystems/locking.rst. Keep it updated!
|
||||||
*
|
*
|
||||||
* FUrther descriptions are found in Documentation/filesystems/vfs.rst.
|
* FUrther descriptions are found in Documentation/filesystems/vfs.rst.
|
||||||
* Keep it updated too!
|
* Keep it updated too!
|
||||||
|
|
|
@@ -139,7 +139,7 @@ struct fid {
|
||||||
* @get_parent: find the parent of a given directory
|
* @get_parent: find the parent of a given directory
|
||||||
* @commit_metadata: commit metadata changes to stable storage
|
* @commit_metadata: commit metadata changes to stable storage
|
||||||
*
|
*
|
||||||
* See Documentation/filesystems/nfs/Exporting for details on how to use
|
* See Documentation/filesystems/nfs/exporting.rst for details on how to use
|
||||||
* this interface correctly.
|
* this interface correctly.
|
||||||
*
|
*
|
||||||
* encode_fh:
|
* encode_fh:
|
||||||
|
|