
NFS Client Updates for Linux 6.14

New Features:
   * Enable using direct IO with localio
   * Added localio related tracepoints
 
 Bugfixes:
   * Sunrpc fixes for working with a very large cl_tasks list
   * Fix a possible buffer overflow in nfs_sysfs_link_rpc_client()
   * Fixes for handling reconnections with localio
   * Fix how the NFS_FSCACHE kconfig option interacts with NETFS_SUPPORT
   * Fix COPY_NOTIFY xdr_buf size calculations
   * pNFS/Flexfiles fix for retrying requesting a layout segment for reads
   * Sunrpc fix for retrying on EKEYEXPIRED error when the TGT is expired
 
 Cleanups:
   * Various other nfs & nfsd localio cleanups
   * Preparatory patches for async copy improvements that are under development
   * Make OFFLOAD_CANCEL, LAYOUTSTATS, and LAYOUTERR moveable to other xprts
   * Add netns inum and srcaddr to debugfs rpc_xprt info
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEnZ5MQTpR7cLU7KEp18tUv7ClQOsFAmeZUzUACgkQ18tUv7Cl
 QOvArw/9HltIlcJHbi7tApGJ4dFpuJCa/fHbA1n5bHvKrCR5aElmFZoiFDdsM1JX
 kFAlMED9n1dW9VmzLJcepxmrLo/t7KXueiZNharHynWTxcszSl6jS+tOFBW6OflG
 Rrrjq/SrsWI2Fu8X4e/7ZV7pqRLGGn5SSMwgbuMbcyzBvVgN8mZM/BneIp1J59AI
 5NOsif5KWetVhQc43zlRlbVWR5cvNGcUK4i58LIaPFzPMt0xq/XJI+QWffj6kv4g
 cHabCNYTdQYMkhiPQC+LLYkw6sMbw2NatajTTYNMWfR/I+7wz9k5ej6CHKPIFCSr
 xjmscypySTLfMFQjrDFZkpX2CwSp/VIbV6go36DJwAlcCRzqz+I7cajlrRK4zvyr
 DyrcaZHvClEczP9QqdPj2wqRXbmIOsDMksOu4ACTUImd4o3f2v1K6DcwRj9oUIhV
 AGR31OEMt2A+RaVvVZYR4PpixJ01vH9LcmsaOu5KkHX8X4q2osQ7eMy+FV4kV09S
 pMnxDMAyszJU8IuzUG1/HfkonNlDMivIbqpgG4ZaVW08Nq4mCxJll1vTAa9FTLz2
 z+9eocqKwf724q1RAgOB7vj4AwOwL4Ul6d18UBtyUitZz3ndLRZ8Yy6r/AhrpCsC
 3co0Y3znZbKeRjmReNl0GLG4qiKE+E7Xh23Lf3IqXg8GE2Mu+Ls=
 =srvH
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-6.14-1' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client updates from Anna Schumaker:
 "New Features:
   - Enable using direct IO with localio
   - Added localio related tracepoints

  Bugfixes:
   - Sunrpc fixes for working with a very large cl_tasks list
   - Fix a possible buffer overflow in nfs_sysfs_link_rpc_client()
   - Fixes for handling reconnections with localio
   - Fix how the NFS_FSCACHE kconfig option interacts with NETFS_SUPPORT
   - Fix COPY_NOTIFY xdr_buf size calculations
   - pNFS/Flexfiles fix for retrying requesting a layout segment for
     reads
   - Sunrpc fix for retrying on EKEYEXPIRED error when the TGT is
     expired

  Cleanups:
   - Various other nfs & nfsd localio cleanups
   - Preparatory patches for async copy improvements that are under
     development
   - Make OFFLOAD_CANCEL, LAYOUTSTATS, and LAYOUTERR moveable to other
     xprts
   - Add netns inum and srcaddr to debugfs rpc_xprt info"

* tag 'nfs-for-6.14-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (28 commits)
  SUNRPC: do not retry on EKEYEXPIRED when user TGT ticket expired
  sunrpc: add netns inum and srcaddr to debugfs rpc_xprt info
  pnfs/flexfiles: retry getting layout segment for reads
  NFSv4.2: make LAYOUTSTATS and LAYOUTERROR MOVEABLE
  NFSv4.2: mark OFFLOAD_CANCEL MOVEABLE
  NFSv4.2: fix COPY_NOTIFY xdr buf size calculation
  NFS: Rename struct nfs4_offloadcancel_data
  NFS: Fix typo in OFFLOAD_CANCEL comment
  NFS: CB_OFFLOAD can return NFS4ERR_DELAY
  nfs: Make NFS_FSCACHE select NETFS_SUPPORT instead of depending on it
  nfs: fix incorrect error handling in LOCALIO
  nfs: probe for LOCALIO when v3 client reconnects to server
  nfs: probe for LOCALIO when v4 client reconnects to server
  nfs/localio: remove redundant code and simplify LOCALIO enablement
  nfs_common: add nfs_localio trace events
  nfs_common: track all open nfsd_files per LOCALIO nfs_client
  nfs_common: rename nfslocalio nfs_uuid_lock to nfs_uuids_lock
  nfsd: nfsd_file_acquire_local no longer returns GC'd nfsd_file
  nfsd: rename nfsd_serv_ prefixed methods and variables with nfsd_net_
  nfsd: update percpu_ref to manage references on nfsd_net
  ...
Linus Torvalds 2025-01-28 14:23:46 -08:00
commit b88fe2b5dd
36 changed files with 840 additions and 319 deletions

View file

@ -218,64 +218,30 @@ NFS Client and Server Interlock
===============================
LOCALIO provides the nfs_uuid_t object and associated interfaces to
allow proper network namespace (net-ns) and NFSD object refcounting:
allow proper network namespace (net-ns) and NFSD object refcounting.
We don't want to keep a long-term counted reference on each NFSD's
net-ns in the client because that prevents a server container from
completely shutting down.
So we avoid taking a reference at all and rely on the per-cpu
reference to the server (detailed below) being sufficient to keep
the net-ns active. This involves allowing the NFSD's net-ns exit
code to iterate all active clients and clear their ->net pointers
(which are needed to find the per-cpu-refcount for the nfsd_serv).
Details:
- Embed nfs_uuid_t in nfs_client. nfs_uuid_t provides a list_head
that can be used to find the client. It does add the 16-byte
uuid_t to nfs_client so it is bigger than needed (given that
uuid_t is only used during the initial NFS client and server
LOCALIO handshake to determine if they are local to each other).
If that is really a problem we can find a fix.
- When the nfs server confirms that the uuid_t is local, it moves
the nfs_uuid_t onto a per-net-ns list in NFSD's nfsd_net.
- When each server's net-ns is shutting down - in a "pre_exit"
handler, all these nfs_uuid_t have their ->net cleared. There is
an rcu_synchronize() call between pre_exit() handlers and exit()
handlers so any caller that sees nfs_uuid_t ->net as not NULL can
safely manage the per-cpu-refcount for nfsd_serv.
- The client's nfs_uuid_t is passed to nfsd_open_local_fh() so it
can safely dereference ->net in a private rcu_read_lock() section
to allow safe access to the associated nfsd_net and nfsd_serv.
So LOCALIO required the introduction and use of NFSD's percpu_ref to
interlock nfsd_destroy_serv() and nfsd_open_local_fh(), to ensure each
nn->nfsd_serv is not destroyed while in use by nfsd_open_local_fh(), and
LOCALIO required the introduction and use of NFSD's percpu nfsd_net_ref
to interlock nfsd_shutdown_net() and nfsd_open_local_fh(), to ensure
each net-ns is not destroyed while in use by nfsd_open_local_fh(), and
warrants a more detailed explanation:
nfsd_open_local_fh() uses nfsd_serv_try_get() before opening its
nfsd_open_local_fh() uses nfsd_net_try_get() before opening its
nfsd_file handle and then the caller (NFS client) must drop the
reference for the nfsd_file and associated nn->nfsd_serv using
nfs_file_put_local() once it has completed its IO.
reference for the nfsd_file and associated net-ns using
nfsd_file_put_local() once it has completed its IO.
This interlock working relies heavily on nfsd_open_local_fh() being
afforded the ability to safely deal with the possibility that the
NFSD's net-ns (and nfsd_net by association) may have been destroyed
by nfsd_destroy_serv() via nfsd_shutdown_net() -- which is only
possible given the nfs_uuid_t ->net pointer management detailed
above.
by nfsd_destroy_serv() via nfsd_shutdown_net().
All told, this elaborate interlock of the NFS client and server has been
verified to fix an easy to hit crash that would occur if an NFSD
instance running in a container, with a LOCALIO client mounted, is
shutdown. Upon restart of the container and associated NFSD the client
would go on to crash due to NULL pointer dereference that occurred due
to the LOCALIO client's attempting to nfsd_open_local_fh(), using
nn->nfsd_serv, without having a proper reference on nn->nfsd_serv.
This interlock of the NFS client and server has been verified to fix an
easy to hit crash that would occur if an NFSD instance running in a
container, with a LOCALIO client mounted, is shutdown. Upon restart of
the container and associated NFSD, the client would go on to crash due
to NULL pointer dereference that occurred due to the LOCALIO client's
attempting to nfsd_open_local_fh() without having a proper reference on
NFSD's net-ns.
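The try-get/put pattern described above can be illustrated with a small
userspace model. This is only a sketch under stated assumptions: it uses a
single C11 atomic counter with "increment unless zero" semantics, not the
kernel's percpu_ref, and the names net_ref, net_ref_init(),
net_ref_try_get() and net_ref_put() are hypothetical stand-ins for the
nfsd_net_try_get() and put interfaces named in the text.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical userspace model of a reference that guards a net-ns.
 * count == 0 means the object is gone; it starts at 1, which stands
 * for the reference held by the server itself.
 */
struct net_ref {
	atomic_int count;
};

static void net_ref_init(struct net_ref *r)
{
	atomic_init(&r->count, 1);
}

/* Take a reference only if the object is still alive. */
static bool net_ref_try_get(struct net_ref *r)
{
	int old = atomic_load(&r->count);

	while (old > 0) {
		/* On failure, 'old' is reloaded and the liveness check repeats. */
		if (atomic_compare_exchange_weak(&r->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if this was the last one. */
static bool net_ref_put(struct net_ref *r)
{
	return atomic_fetch_sub(&r->count, 1) == 1;
}
```

In this model, an IO path calls net_ref_try_get() before touching the
guarded object and net_ref_put() when the IO completes. Shutdown drops the
initial reference, and teardown may proceed only once some put reports that
it was the last. Any try-get attempted after that point fails instead of
dereferencing freed state, which is the property the interlock is meant to
provide.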
NFS Client issues IO instead of Server
======================================
@ -306,10 +272,26 @@ is issuing IO to the underlying local filesystem that it is sharing with
the NFS server. See: fs/nfs/localio.c:nfs_local_doio() and
fs/nfs/localio.c:nfs_local_commit().
With normal NFS that makes use of RPC to issue IO to the server, if an
application uses O_DIRECT the NFS client will bypass the pagecache but
the NFS server will not. The NFS server's use of buffered IO affords
applications to be less precise with their alignment when issuing IO to
the NFS client. But if all applications properly align their IO, LOCALIO
can be configured to use end-to-end O_DIRECT semantics from the NFS
client to the underlying local filesystem, that it is sharing with
the NFS server, by setting the 'localio_O_DIRECT_semantics' nfs module
parameter to Y, e.g.:
echo Y > /sys/module/nfs/parameters/localio_O_DIRECT_semantics
Once enabled, it will cause LOCALIO to use end-to-end O_DIRECT semantics
(but again, this may cause IO to fail if applications do not properly
align their IO).
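As an illustration of what "properly align their IO" means for an
application, the following userspace sketch checks a request before it is
issued with O_DIRECT. It is a minimal example under assumptions: the helper
names dio_is_aligned() and dio_alloc() are hypothetical, the block size is
passed in by the caller (the real logical block size has to be queried from
the backing device), and the buffer-address check reflects the usual
O_DIRECT expectation rather than anything stated in this document.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: true if n is a nonzero power of two. */
static bool is_pow2(size_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical check: buffer address, file offset and length must all
 * be multiples of the (power-of-two) block size.
 */
static bool dio_is_aligned(const void *buf, uint64_t offset, size_t len,
			   size_t blksz)
{
	size_t mask;

	if (!is_pow2(blksz))
		return false;
	mask = blksz - 1;
	return ((uintptr_t)buf & mask) == 0 &&
	       (offset & mask) == 0 &&
	       (len & mask) == 0;
}

/*
 * Hypothetical allocator: returns a block-aligned buffer, or NULL if
 * the request itself is not block-sized. Release it with free().
 */
static void *dio_alloc(size_t len, size_t blksz)
{
	if (!is_pow2(blksz) || len == 0 || (len & (blksz - 1)) != 0)
		return NULL;
	return aligned_alloc(blksz, len);
}
```

An application would allocate its buffers with something like dio_alloc()
and reject or bounce any request for which dio_is_aligned() returns false
before enabling 'localio_O_DIRECT_semantics'.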
Security
========
Localio is only supported when UNIX-style authentication (AUTH_UNIX, aka
LOCALIO is only supported when UNIX-style authentication (AUTH_UNIX, aka
AUTH_SYS) is used.
Care is taken to ensure the same NFS security mechanisms are used
@ -324,6 +306,24 @@ client is afforded this same level of access (albeit in terms of the NFS
protocol via SUNRPC). No other namespaces (user, mount, etc) have been
altered or purposely extended from the server to the client.
Module Parameters
=================
/sys/module/nfs/parameters/localio_enabled (bool)
controls if LOCALIO is enabled, defaults to Y. If client and server are
local but 'localio_enabled' is set to N then LOCALIO will not be used.
/sys/module/nfs/parameters/localio_O_DIRECT_semantics (bool)
controls if O_DIRECT extends down to the underlying filesystem, defaults
to N. Application IO must be logical blocksize aligned, otherwise
O_DIRECT will fail.
/sys/module/nfsv3/parameters/nfs3_localio_probe_throttle (uint)
controls if NFSv3 read and write IOs will trigger (re)enabling of
LOCALIO every N (nfs3_localio_probe_throttle) IOs, defaults to 0
(disabled). Must be power-of-2, admin keeps all the pieces if they
misconfigure (too low a value or non-power-of-2).
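The power-of-2 requirement follows from the throttle being applied as a
bit mask on an IO counter. The userspace sketch below models that check;
should_probe() and is_power_of_2_u32() are hypothetical names used only for
illustration, and the counter is a plain integer rather than the per-client
field used by the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if n is a nonzero power of two. */
static bool is_power_of_2_u32(uint32_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical model of the throttle check: returns true when the IO
 * that bumps *count should trigger a probe. A throttle of 0 disables
 * probing and leaves the counter untouched.
 */
static bool should_probe(uint32_t *count, uint32_t throttle)
{
	if (throttle == 0)
		return false;
	/* With a power-of-2 throttle, this is true once every 'throttle' IOs. */
	return ((*count)++ & (throttle - 1)) == 0;
}
```

With a throttle of 4 the mask is 3, so a probe fires on counts 0, 4, 8 and
so on. With a non-power-of-2 value such as 6 the mask is 5, so probes fire
on counts 0, 2, 8, 10 and so on rather than on every sixth IO, which is why
such values count as misconfiguration.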
Testing
=======

View file

@ -170,7 +170,8 @@ config ROOT_NFS
config NFS_FSCACHE
bool "Provide NFS client caching support"
depends on NFS_FS=m && NETFS_SUPPORT || NFS_FS=y && NETFS_SUPPORT=y
depends on NFS_FS
select NETFS_SUPPORT
select FSCACHE
help
Say Y here if you want NFS data to be cached locally on disc through

View file

@ -718,7 +718,7 @@ __be32 nfs4_callback_offload(void *data, void *dummy,
copy = kzalloc(sizeof(struct nfs4_copy_state), GFP_KERNEL);
if (!copy)
return htonl(NFS4ERR_SERVERFAULT);
return cpu_to_be32(NFS4ERR_DELAY);
spin_lock(&cps->clp->cl_lock);
rcu_read_lock();

View file

@ -38,7 +38,7 @@
#include <linux/sunrpc/bc_xprt.h>
#include <linux/nsproxy.h>
#include <linux/pid_namespace.h>
#include <linux/nfslocalio.h>
#include "nfs4_fs.h"
#include "callback.h"
@ -186,7 +186,7 @@ struct nfs_client *nfs_alloc_client(const struct nfs_client_initdata *cl_init)
seqlock_init(&clp->cl_boot_lock);
ktime_get_real_ts64(&clp->cl_nfssvc_boot);
nfs_uuid_init(&clp->cl_uuid);
spin_lock_init(&clp->cl_localio_lock);
INIT_WORK(&clp->cl_local_probe_work, nfs_local_probe_async_work);
#endif /* CONFIG_NFS_LOCALIO */
clp->cl_principal = "*";
@ -244,7 +244,7 @@ static void pnfs_init_server(struct nfs_server *server)
*/
void nfs_free_client(struct nfs_client *clp)
{
nfs_local_disable(clp);
nfs_localio_disable_client(clp);
/* -EIO all pending I/O */
if (!IS_ERR(clp->cl_rpcclient))

View file

@ -303,6 +303,7 @@ static void nfs_read_sync_pgio_error(struct list_head *head, int error)
static void nfs_direct_pgio_init(struct nfs_pgio_header *hdr)
{
get_dreq(hdr->dreq);
set_bit(NFS_IOHDR_ODIRECT, &hdr->flags);
}
static const struct nfs_pgio_completion_ops nfs_direct_read_completion_ops = {

View file

@ -164,18 +164,17 @@ decode_name(struct xdr_stream *xdr, u32 *id)
}
static struct nfsd_file *
ff_local_open_fh(struct nfs_client *clp, const struct cred *cred,
ff_local_open_fh(struct pnfs_layout_segment *lseg, u32 ds_idx,
struct nfs_client *clp, const struct cred *cred,
struct nfs_fh *fh, fmode_t mode)
{
if (mode & FMODE_WRITE) {
/*
* Always request read and write access since this corresponds
* to a rw layout.
*/
mode |= FMODE_READ;
}
#if IS_ENABLED(CONFIG_NFS_LOCALIO)
struct nfs4_ff_layout_mirror *mirror = FF_LAYOUT_COMP(lseg, ds_idx);
return nfs_local_open_fh(clp, cred, fh, mode);
return nfs_local_open_fh(clp, cred, fh, &mirror->nfl, mode);
#else
return NULL;
#endif
}
static bool ff_mirror_match_fh(const struct nfs4_ff_layout_mirror *m1,
@ -247,6 +246,7 @@ static struct nfs4_ff_layout_mirror *ff_layout_alloc_mirror(gfp_t gfp_flags)
spin_lock_init(&mirror->lock);
refcount_set(&mirror->ref, 1);
INIT_LIST_HEAD(&mirror->mirrors);
nfs_localio_file_init(&mirror->nfl);
}
return mirror;
}
@ -257,6 +257,7 @@ static void ff_layout_free_mirror(struct nfs4_ff_layout_mirror *mirror)
ff_layout_remove_mirror(mirror);
kfree(mirror->fh_versions);
nfs_close_local_fh(&mirror->nfl);
cred = rcu_access_pointer(mirror->ro_cred);
put_cred(cred);
cred = rcu_access_pointer(mirror->rw_cred);
@ -847,6 +848,9 @@ ff_layout_pg_init_read(struct nfs_pageio_descriptor *pgio,
struct nfs4_pnfs_ds *ds;
u32 ds_idx;
if (NFS_SERVER(pgio->pg_inode)->flags &
(NFS_MOUNT_SOFT|NFS_MOUNT_SOFTERR))
pgio->pg_maxretrans = io_maxretrans;
retry:
pnfs_generic_pg_check_layout(pgio, req);
/* Use full layout for now */
@ -860,6 +864,8 @@ retry:
if (!pgio->pg_lseg)
goto out_nolseg;
}
/* Reset wb_nio, since getting layout segment was successful */
req->wb_nio = 0;
ds = ff_layout_get_ds_for_read(pgio, &ds_idx);
if (!ds) {
@ -876,14 +882,24 @@ retry:
pgm->pg_bsize = mirror->mirror_ds->ds_versions[0].rsize;
pgio->pg_mirror_idx = ds_idx;
if (NFS_SERVER(pgio->pg_inode)->flags &
(NFS_MOUNT_SOFT|NFS_MOUNT_SOFTERR))
pgio->pg_maxretrans = io_maxretrans;
return;
out_nolseg:
if (pgio->pg_error < 0)
return;
if (pgio->pg_error < 0) {
if (pgio->pg_error != -EAGAIN)
return;
/* Retry getting layout segment if lower layer returned -EAGAIN */
if (pgio->pg_maxretrans && req->wb_nio++ > pgio->pg_maxretrans) {
if (NFS_SERVER(pgio->pg_inode)->flags & NFS_MOUNT_SOFTERR)
pgio->pg_error = -ETIMEDOUT;
else
pgio->pg_error = -EIO;
return;
}
pgio->pg_error = 0;
/* Sleep for 1 second before retrying */
ssleep(1);
goto retry;
}
out_mds:
trace_pnfs_mds_fallback_pg_init_read(pgio->pg_inode,
0, NFS4_MAX_UINT64, IOMODE_READ,
@ -1820,7 +1836,7 @@ ff_layout_read_pagelist(struct nfs_pgio_header *hdr)
hdr->mds_offset = offset;
/* Start IO accounting for local read */
localio = ff_local_open_fh(ds->ds_clp, ds_cred, fh, FMODE_READ);
localio = ff_local_open_fh(lseg, idx, ds->ds_clp, ds_cred, fh, FMODE_READ);
if (localio) {
hdr->task.tk_start = ktime_get();
ff_layout_read_record_layoutstats_start(&hdr->task, hdr);
@ -1896,7 +1912,7 @@ ff_layout_write_pagelist(struct nfs_pgio_header *hdr, int sync)
hdr->args.offset = offset;
/* Start IO accounting for local write */
localio = ff_local_open_fh(ds->ds_clp, ds_cred, fh,
localio = ff_local_open_fh(lseg, idx, ds->ds_clp, ds_cred, fh,
FMODE_READ|FMODE_WRITE);
if (localio) {
hdr->task.tk_start = ktime_get();
@ -1981,7 +1997,7 @@ static int ff_layout_initiate_commit(struct nfs_commit_data *data, int how)
data->args.fh = fh;
/* Start IO accounting for local commit */
localio = ff_local_open_fh(ds->ds_clp, ds_cred, fh,
localio = ff_local_open_fh(lseg, idx, ds->ds_clp, ds_cred, fh,
FMODE_READ|FMODE_WRITE);
if (localio) {
data->task.tk_start = ktime_get();

View file

@ -83,6 +83,7 @@ struct nfs4_ff_layout_mirror {
nfs4_stateid stateid;
const struct cred __rcu *ro_cred;
const struct cred __rcu *rw_cred;
struct nfs_file_localio nfl;
refcount_t ref;
spinlock_t lock;
unsigned long flags;

View file

@ -1137,6 +1137,8 @@ struct nfs_open_context *alloc_nfs_open_context(struct dentry *dentry,
ctx->lock_context.open_context = ctx;
INIT_LIST_HEAD(&ctx->list);
ctx->mdsthreshold = NULL;
nfs_localio_file_init(&ctx->nfl);
return ctx;
}
EXPORT_SYMBOL_GPL(alloc_nfs_open_context);
@ -1168,6 +1170,7 @@ static void __put_nfs_open_context(struct nfs_open_context *ctx, int is_sync)
nfs_sb_deactive(sb);
put_rpccred(rcu_dereference_protected(ctx->ll_cred, 1));
kfree(ctx->mdsthreshold);
nfs_close_local_fh(&ctx->nfl);
kfree_rcu(ctx, rcu_head);
}

View file

@ -455,11 +455,13 @@ extern int nfs_wait_bit_killable(struct wait_bit_key *key, int mode);
#if IS_ENABLED(CONFIG_NFS_LOCALIO)
/* localio.c */
extern void nfs_local_disable(struct nfs_client *);
extern void nfs_local_probe(struct nfs_client *);
extern void nfs_local_probe_async(struct nfs_client *);
extern void nfs_local_probe_async_work(struct work_struct *);
extern struct nfsd_file *nfs_local_open_fh(struct nfs_client *,
const struct cred *,
struct nfs_fh *,
struct nfs_file_localio *,
const fmode_t);
extern int nfs_local_doio(struct nfs_client *,
struct nfsd_file *,
@ -471,11 +473,12 @@ extern int nfs_local_commit(struct nfsd_file *,
extern bool nfs_server_is_local(const struct nfs_client *clp);
#else /* CONFIG_NFS_LOCALIO */
static inline void nfs_local_disable(struct nfs_client *clp) {}
static inline void nfs_local_probe(struct nfs_client *clp) {}
static inline void nfs_local_probe_async(struct nfs_client *clp) {}
static inline struct nfsd_file *
nfs_local_open_fh(struct nfs_client *clp, const struct cred *cred,
struct nfs_fh *fh, const fmode_t mode)
struct nfs_fh *fh, struct nfs_file_localio *nfl,
const fmode_t mode)
{
return NULL;
}

View file

@ -35,6 +35,7 @@ struct nfs_local_kiocb {
struct bio_vec *bvec;
struct nfs_pgio_header *hdr;
struct work_struct work;
void (*aio_complete_work)(struct work_struct *);
struct nfsd_file *localio;
};
@ -48,9 +49,14 @@ struct nfs_local_fsync_ctx {
static bool localio_enabled __read_mostly = true;
module_param(localio_enabled, bool, 0644);
static bool localio_O_DIRECT_semantics __read_mostly = false;
module_param(localio_O_DIRECT_semantics, bool, 0644);
MODULE_PARM_DESC(localio_O_DIRECT_semantics,
"LOCALIO will use O_DIRECT semantics to filesystem.");
static inline bool nfs_client_is_local(const struct nfs_client *clp)
{
return !!test_bit(NFS_CS_LOCAL_IO, &clp->cl_flags);
return !!rcu_access_pointer(clp->cl_uuid.net);
}
bool nfs_server_is_local(const struct nfs_client *clp)
@ -115,30 +121,6 @@ const struct rpc_program nfslocalio_program = {
.stats = &nfslocalio_rpcstat,
};
/*
* nfs_local_enable - enable local i/o for an nfs_client
*/
static void nfs_local_enable(struct nfs_client *clp)
{
spin_lock(&clp->cl_localio_lock);
set_bit(NFS_CS_LOCAL_IO, &clp->cl_flags);
trace_nfs_local_enable(clp);
spin_unlock(&clp->cl_localio_lock);
}
/*
* nfs_local_disable - disable local i/o for an nfs_client
*/
void nfs_local_disable(struct nfs_client *clp)
{
spin_lock(&clp->cl_localio_lock);
if (test_and_clear_bit(NFS_CS_LOCAL_IO, &clp->cl_flags)) {
trace_nfs_local_disable(clp);
nfs_uuid_invalidate_one_client(&clp->cl_uuid);
}
spin_unlock(&clp->cl_localio_lock);
}
/*
* nfs_init_localioclient - Initialise an NFS localio client connection
*/
@ -178,7 +160,7 @@ static bool nfs_server_uuid_is_local(struct nfs_client *clp)
rpc_shutdown_client(rpcclient_localio);
/* Server is only local if it initialized required struct members */
if (status || !clp->cl_uuid.net || !clp->cl_uuid.dom)
if (status || !rcu_access_pointer(clp->cl_uuid.net) || !clp->cl_uuid.dom)
return false;
return true;
@ -194,44 +176,64 @@ void nfs_local_probe(struct nfs_client *clp)
/* Disallow localio if disabled via sysfs or AUTH_SYS isn't used */
if (!localio_enabled ||
clp->cl_rpcclient->cl_auth->au_flavor != RPC_AUTH_UNIX) {
nfs_local_disable(clp);
nfs_localio_disable_client(clp);
return;
}
if (nfs_client_is_local(clp)) {
/* If already enabled, disable and re-enable */
nfs_local_disable(clp);
nfs_localio_disable_client(clp);
}
if (!nfs_uuid_begin(&clp->cl_uuid))
return;
if (nfs_server_uuid_is_local(clp))
nfs_local_enable(clp);
nfs_localio_enable_client(clp);
nfs_uuid_end(&clp->cl_uuid);
}
EXPORT_SYMBOL_GPL(nfs_local_probe);
void nfs_local_probe_async_work(struct work_struct *work)
{
struct nfs_client *clp =
container_of(work, struct nfs_client, cl_local_probe_work);
nfs_local_probe(clp);
}
void nfs_local_probe_async(struct nfs_client *clp)
{
queue_work(nfsiod_workqueue, &clp->cl_local_probe_work);
}
EXPORT_SYMBOL_GPL(nfs_local_probe_async);
static inline struct nfsd_file *nfs_local_file_get(struct nfsd_file *nf)
{
return nfs_to->nfsd_file_get(nf);
}
static inline void nfs_local_file_put(struct nfsd_file *nf)
{
nfs_to->nfsd_file_put(nf);
}
/*
* nfs_local_open_fh - open a local filehandle in terms of nfsd_file
* __nfs_local_open_fh - open a local filehandle in terms of nfsd_file.
*
* Returns a pointer to a struct nfsd_file or NULL
* Returns a pointer to a struct nfsd_file or ERR_PTR.
* Caller must release returned nfsd_file with nfs_to_nfsd_file_put_local().
*/
struct nfsd_file *
nfs_local_open_fh(struct nfs_client *clp, const struct cred *cred,
struct nfs_fh *fh, const fmode_t mode)
static struct nfsd_file *
__nfs_local_open_fh(struct nfs_client *clp, const struct cred *cred,
struct nfs_fh *fh, struct nfs_file_localio *nfl,
const fmode_t mode)
{
struct nfsd_file *localio;
int status;
if (!nfs_server_is_local(clp))
return NULL;
if (mode & ~(FMODE_READ | FMODE_WRITE))
return NULL;
localio = nfs_open_local_fh(&clp->cl_uuid, clp->cl_rpcclient,
cred, fh, mode);
cred, fh, nfl, mode);
if (IS_ERR(localio)) {
status = PTR_ERR(localio);
int status = PTR_ERR(localio);
trace_nfs_local_open_fh(fh, mode, status);
switch (status) {
case -ENOMEM:
@ -240,10 +242,59 @@ nfs_local_open_fh(struct nfs_client *clp, const struct cred *cred,
/* Revalidate localio, will disable if unsupported */
nfs_local_probe(clp);
}
return NULL;
}
return localio;
}
/*
* nfs_local_open_fh - open a local filehandle in terms of nfsd_file.
* First checking if the open nfsd_file is already cached, otherwise
* must __nfs_local_open_fh and insert the nfsd_file in nfs_file_localio.
*
* Returns a pointer to a struct nfsd_file or NULL.
*/
struct nfsd_file *
nfs_local_open_fh(struct nfs_client *clp, const struct cred *cred,
struct nfs_fh *fh, struct nfs_file_localio *nfl,
const fmode_t mode)
{
struct nfsd_file *nf, *new, __rcu **pnf;
if (!nfs_server_is_local(clp))
return NULL;
if (mode & ~(FMODE_READ | FMODE_WRITE))
return NULL;
if (mode & FMODE_WRITE)
pnf = &nfl->rw_file;
else
pnf = &nfl->ro_file;
new = NULL;
rcu_read_lock();
nf = rcu_dereference(*pnf);
if (!nf) {
rcu_read_unlock();
new = __nfs_local_open_fh(clp, cred, fh, nfl, mode);
if (IS_ERR(new))
return NULL;
/* try to swap in the pointer */
spin_lock(&clp->cl_uuid.lock);
nf = rcu_dereference_protected(*pnf, 1);
if (!nf) {
nf = new;
new = NULL;
rcu_assign_pointer(*pnf, nf);
}
spin_unlock(&clp->cl_uuid.lock);
rcu_read_lock();
}
nf = nfs_local_file_get(nf);
rcu_read_unlock();
if (new)
nfs_to_nfsd_file_put_local(new);
return nf;
}
EXPORT_SYMBOL_GPL(nfs_local_open_fh);
static struct bio_vec *
@ -285,10 +336,19 @@ nfs_local_iocb_alloc(struct nfs_pgio_header *hdr,
kfree(iocb);
return NULL;
}
init_sync_kiocb(&iocb->kiocb, file);
if (localio_O_DIRECT_semantics &&
test_bit(NFS_IOHDR_ODIRECT, &hdr->flags)) {
iocb->kiocb.ki_filp = file;
iocb->kiocb.ki_flags = IOCB_DIRECT;
} else
init_sync_kiocb(&iocb->kiocb, file);
iocb->kiocb.ki_pos = hdr->args.offset;
iocb->hdr = hdr;
iocb->kiocb.ki_flags &= ~IOCB_APPEND;
iocb->aio_complete_work = NULL;
return iocb;
}
@ -328,7 +388,7 @@ nfs_local_pgio_done(struct nfs_pgio_header *hdr, long status)
hdr->res.op_status = NFS4_OK;
hdr->task.tk_status = 0;
} else {
hdr->res.op_status = nfs4_stat_to_errno(status);
hdr->res.op_status = nfs_localio_errno_to_nfs4_stat(status);
hdr->task.tk_status = status;
}
}
@ -338,11 +398,23 @@ nfs_local_pgio_release(struct nfs_local_kiocb *iocb)
{
struct nfs_pgio_header *hdr = iocb->hdr;
nfs_to_nfsd_file_put_local(iocb->localio);
nfs_local_file_put(iocb->localio);
nfs_local_iocb_free(iocb);
nfs_local_hdr_release(hdr, hdr->task.tk_ops);
}
/*
* Complete the I/O from iocb->kiocb.ki_complete()
*
* Note that this function can be called from a bottom half context,
* hence we need to queue the rpc_call_done() etc to a workqueue
*/
static inline void nfs_local_pgio_aio_complete(struct nfs_local_kiocb *iocb)
{
INIT_WORK(&iocb->work, iocb->aio_complete_work);
queue_work(nfsiod_workqueue, &iocb->work);
}
static void
nfs_local_read_done(struct nfs_local_kiocb *iocb, long status)
{
@ -365,6 +437,23 @@ nfs_local_read_done(struct nfs_local_kiocb *iocb, long status)
status > 0 ? status : 0, hdr->res.eof);
}
static void nfs_local_read_aio_complete_work(struct work_struct *work)
{
struct nfs_local_kiocb *iocb =
container_of(work, struct nfs_local_kiocb, work);
nfs_local_pgio_release(iocb);
}
static void nfs_local_read_aio_complete(struct kiocb *kiocb, long ret)
{
struct nfs_local_kiocb *iocb =
container_of(kiocb, struct nfs_local_kiocb, kiocb);
nfs_local_read_done(iocb, ret);
nfs_local_pgio_aio_complete(iocb); /* Calls nfs_local_read_aio_complete_work */
}
static void nfs_local_call_read(struct work_struct *work)
{
struct nfs_local_kiocb *iocb =
@ -379,10 +468,10 @@ static void nfs_local_call_read(struct work_struct *work)
nfs_local_iter_init(&iter, iocb, READ);
status = filp->f_op->read_iter(&iocb->kiocb, &iter);
WARN_ON_ONCE(status == -EIOCBQUEUED);
nfs_local_read_done(iocb, status);
nfs_local_pgio_release(iocb);
if (status != -EIOCBQUEUED) {
nfs_local_read_done(iocb, status);
nfs_local_pgio_release(iocb);
}
revert_creds(save_cred);
}
@ -410,6 +499,11 @@ nfs_do_local_read(struct nfs_pgio_header *hdr,
nfs_local_pgio_init(hdr, call_ops);
hdr->res.eof = false;
if (iocb->kiocb.ki_flags & IOCB_DIRECT) {
iocb->kiocb.ki_complete = nfs_local_read_aio_complete;
iocb->aio_complete_work = nfs_local_read_aio_complete_work;
}
INIT_WORK(&iocb->work, nfs_local_call_read);
queue_work(nfslocaliod_workqueue, &iocb->work);
@ -534,6 +628,24 @@ nfs_local_write_done(struct nfs_local_kiocb *iocb, long status)
nfs_local_pgio_done(hdr, status);
}
static void nfs_local_write_aio_complete_work(struct work_struct *work)
{
struct nfs_local_kiocb *iocb =
container_of(work, struct nfs_local_kiocb, work);
nfs_local_vfs_getattr(iocb);
nfs_local_pgio_release(iocb);
}
static void nfs_local_write_aio_complete(struct kiocb *kiocb, long ret)
{
struct nfs_local_kiocb *iocb =
container_of(kiocb, struct nfs_local_kiocb, kiocb);
nfs_local_write_done(iocb, ret);
nfs_local_pgio_aio_complete(iocb); /* Calls nfs_local_write_aio_complete_work */
}
static void nfs_local_call_write(struct work_struct *work)
{
struct nfs_local_kiocb *iocb =
@ -552,11 +664,11 @@ static void nfs_local_call_write(struct work_struct *work)
file_start_write(filp);
status = filp->f_op->write_iter(&iocb->kiocb, &iter);
file_end_write(filp);
WARN_ON_ONCE(status == -EIOCBQUEUED);
nfs_local_write_done(iocb, status);
nfs_local_vfs_getattr(iocb);
nfs_local_pgio_release(iocb);
if (status != -EIOCBQUEUED) {
nfs_local_write_done(iocb, status);
nfs_local_vfs_getattr(iocb);
nfs_local_pgio_release(iocb);
}
revert_creds(save_cred);
current->flags = old_flags;
@ -592,10 +704,16 @@ nfs_do_local_write(struct nfs_pgio_header *hdr,
case NFS_FILE_SYNC:
iocb->kiocb.ki_flags |= IOCB_DSYNC|IOCB_SYNC;
}
nfs_local_pgio_init(hdr, call_ops);
nfs_set_local_verifier(hdr->inode, hdr->res.verf, hdr->args.stable);
if (iocb->kiocb.ki_flags & IOCB_DIRECT) {
iocb->kiocb.ki_complete = nfs_local_write_aio_complete;
iocb->aio_complete_work = nfs_local_write_aio_complete_work;
}
INIT_WORK(&iocb->work, nfs_local_call_write);
queue_work(nfslocaliod_workqueue, &iocb->work);
@ -626,8 +744,8 @@ int nfs_local_doio(struct nfs_client *clp, struct nfsd_file *localio,
if (status != 0) {
if (status == -EAGAIN)
nfs_local_disable(clp);
nfs_to_nfsd_file_put_local(localio);
nfs_localio_disable_client(clp);
nfs_local_file_put(localio);
hdr->task.tk_status = status;
nfs_local_hdr_release(hdr, call_ops);
}
@ -668,7 +786,7 @@ nfs_local_commit_done(struct nfs_commit_data *data, int status)
data->task.tk_status = 0;
} else {
nfs_reset_boot_verifier(data->inode);
data->res.op_status = nfs4_stat_to_errno(status);
data->res.op_status = nfs_localio_errno_to_nfs4_stat(status);
data->task.tk_status = status;
}
}
@ -678,7 +796,7 @@ nfs_local_release_commit_data(struct nfsd_file *localio,
struct nfs_commit_data *data,
const struct rpc_call_ops *call_ops)
{
nfs_to_nfsd_file_put_local(localio);
nfs_local_file_put(localio);
call_ops->rpc_call_done(&data->task, data);
call_ops->rpc_release(data);
}

View file

@ -844,6 +844,41 @@ nfs3_proc_pathconf(struct nfs_server *server, struct nfs_fh *fhandle,
return status;
}
#if IS_ENABLED(CONFIG_NFS_LOCALIO)
static unsigned nfs3_localio_probe_throttle __read_mostly = 0;
module_param(nfs3_localio_probe_throttle, uint, 0644);
MODULE_PARM_DESC(nfs3_localio_probe_throttle,
"Probe for NFSv3 LOCALIO every N IO requests. Must be power-of-2, defaults to 0 (probing disabled).");
static void nfs3_localio_probe(struct nfs_server *server)
{
struct nfs_client *clp = server->nfs_client;
/* Throttled to reduce nfs_local_probe_async() frequency */
if (!nfs3_localio_probe_throttle || nfs_server_is_local(clp))
return;
/*
* Try (re)enabling LOCALIO if isn't enabled -- admin deems
* it worthwhile to periodically check if LOCALIO possible by
* setting the 'nfs3_localio_probe_throttle' module parameter.
*
* This is useful if LOCALIO was previously enabled, but was
* disabled due to server restart, and IO has successfully
* completed in terms of normal RPC.
*/
if ((clp->cl_uuid.nfs3_localio_probe_count++ &
(nfs3_localio_probe_throttle - 1)) == 0) {
if (!nfs_server_is_local(clp))
nfs_local_probe_async(clp);
}
}
#else
static void nfs3_localio_probe(struct nfs_server *server) {}
#endif
static int nfs3_read_done(struct rpc_task *task, struct nfs_pgio_header *hdr)
{
struct inode *inode = hdr->inode;
@ -855,8 +890,11 @@ static int nfs3_read_done(struct rpc_task *task, struct nfs_pgio_header *hdr)
if (nfs3_async_handle_jukebox(task, inode))
return -EAGAIN;
if (task->tk_status >= 0 && !server->read_hdrsize)
cmpxchg(&server->read_hdrsize, 0, hdr->res.replen);
if (task->tk_status >= 0) {
if (!server->read_hdrsize)
cmpxchg(&server->read_hdrsize, 0, hdr->res.replen);
nfs3_localio_probe(server);
}
nfs_invalidate_atime(inode);
nfs_refresh_inode(inode, &hdr->fattr);
@ -886,8 +924,10 @@ static int nfs3_write_done(struct rpc_task *task, struct nfs_pgio_header *hdr)
if (nfs3_async_handle_jukebox(task, inode))
return -EAGAIN;
if (task->tk_status >= 0)
if (task->tk_status >= 0) {
nfs_writeback_update_inode(hdr);
nfs3_localio_probe(NFS_SERVER(inode));
}
return 0;
}

View file

@ -498,15 +498,15 @@ out_put_src_lock:
return err;
}
struct nfs42_offloadcancel_data {
struct nfs42_offload_data {
struct nfs_server *seq_server;
struct nfs42_offload_status_args args;
struct nfs42_offload_status_res res;
};
static void nfs42_offload_cancel_prepare(struct rpc_task *task, void *calldata)
static void nfs42_offload_prepare(struct rpc_task *task, void *calldata)
{
struct nfs42_offloadcancel_data *data = calldata;
struct nfs42_offload_data *data = calldata;
nfs4_setup_sequence(data->seq_server->nfs_client,
&data->args.osa_seq_args,
@ -515,7 +515,7 @@ static void nfs42_offload_cancel_prepare(struct rpc_task *task, void *calldata)
static void nfs42_offload_cancel_done(struct rpc_task *task, void *calldata)
{
struct nfs42_offloadcancel_data *data = calldata;
struct nfs42_offload_data *data = calldata;
trace_nfs4_offload_cancel(&data->args, task->tk_status);
nfs41_sequence_done(task, &data->res.osr_seq_res);
@ -525,22 +525,22 @@ static void nfs42_offload_cancel_done(struct rpc_task *task, void *calldata)
rpc_restart_call_prepare(task);
}
static void nfs42_free_offloadcancel_data(void *data)
static void nfs42_offload_release(void *data)
{
kfree(data);
}
static const struct rpc_call_ops nfs42_offload_cancel_ops = {
.rpc_call_prepare = nfs42_offload_cancel_prepare,
.rpc_call_prepare = nfs42_offload_prepare,
.rpc_call_done = nfs42_offload_cancel_done,
.rpc_release = nfs42_free_offloadcancel_data,
.rpc_release = nfs42_offload_release,
};
static int nfs42_do_offload_cancel_async(struct file *dst,
nfs4_stateid *stateid)
{
struct nfs_server *dst_server = NFS_SERVER(file_inode(dst));
struct nfs42_offloadcancel_data *data = NULL;
struct nfs42_offload_data *data = NULL;
struct nfs_open_context *ctx = nfs_file_open_context(dst);
struct rpc_task *task;
struct rpc_message msg = {
@@ -552,14 +552,14 @@ static int nfs42_do_offload_cancel_async(struct file *dst,
.rpc_message = &msg,
.callback_ops = &nfs42_offload_cancel_ops,
.workqueue = nfsiod_workqueue,
.flags = RPC_TASK_ASYNC,
.flags = RPC_TASK_ASYNC | RPC_TASK_MOVEABLE,
};
int status;
if (!(dst_server->caps & NFS_CAP_OFFLOAD_CANCEL))
return -EOPNOTSUPP;
data = kzalloc(sizeof(struct nfs42_offloadcancel_data), GFP_KERNEL);
data = kzalloc(sizeof(struct nfs42_offload_data), GFP_KERNEL);
if (data == NULL)
return -ENOMEM;
@@ -861,7 +861,7 @@ int nfs42_proc_layoutstats_generic(struct nfs_server *server,
.rpc_message = &msg,
.callback_ops = &nfs42_layoutstat_ops,
.callback_data = data,
.flags = RPC_TASK_ASYNC,
.flags = RPC_TASK_ASYNC | RPC_TASK_MOVEABLE,
};
struct rpc_task *task;
@@ -1016,7 +1016,7 @@ int nfs42_proc_layouterror(struct pnfs_layout_segment *lseg,
struct rpc_task_setup task_setup = {
.rpc_message = &msg,
.callback_ops = &nfs42_layouterror_ops,
.flags = RPC_TASK_ASYNC,
.flags = RPC_TASK_ASYNC | RPC_TASK_MOVEABLE,
};
unsigned int i;

@@ -144,9 +144,11 @@
decode_putfh_maxsz + \
decode_offload_cancel_maxsz)
#define NFS4_enc_copy_notify_sz (compound_encode_hdr_maxsz + \
encode_sequence_maxsz + \
encode_putfh_maxsz + \
encode_copy_notify_maxsz)
#define NFS4_dec_copy_notify_sz (compound_decode_hdr_maxsz + \
decode_sequence_maxsz + \
decode_putfh_maxsz + \
decode_copy_notify_maxsz)
#define NFS4_enc_deallocate_sz (compound_encode_hdr_maxsz + \
@@ -549,7 +551,7 @@ static void nfs4_xdr_enc_copy(struct rpc_rqst *req,
}
/*
* Encode OFFLOAD_CANEL request
* Encode OFFLOAD_CANCEL request
*/
static void nfs4_xdr_enc_offload_cancel(struct rpc_rqst *req,
struct xdr_stream *xdr,

@@ -1955,6 +1955,7 @@ restart:
}
rcu_read_unlock();
nfs4_free_state_owners(&freeme);
nfs_local_probe_async(clp);
if (lost_locks)
pr_warn("NFS: %s: lost %d locks\n",
clp->cl_hostname, lost_locks);

@@ -1714,38 +1714,6 @@ TRACE_EVENT(nfs_local_open_fh,
)
);
DECLARE_EVENT_CLASS(nfs_local_client_event,
TP_PROTO(
const struct nfs_client *clp
),
TP_ARGS(clp),
TP_STRUCT__entry(
__field(unsigned int, protocol)
__string(server, clp->cl_hostname)
),
TP_fast_assign(
__entry->protocol = clp->rpc_ops->version;
__assign_str(server);
),
TP_printk(
"server=%s NFSv%u", __get_str(server), __entry->protocol
)
);
#define DEFINE_NFS_LOCAL_CLIENT_EVENT(name) \
DEFINE_EVENT(nfs_local_client_event, name, \
TP_PROTO( \
const struct nfs_client *clp \
), \
TP_ARGS(clp))
DEFINE_NFS_LOCAL_CLIENT_EVENT(nfs_local_enable);
DEFINE_NFS_LOCAL_CLIENT_EVENT(nfs_local_disable);
DECLARE_EVENT_CLASS(nfs_xdr_event,
TP_PROTO(
const struct xdr_stream *xdr,

@@ -961,8 +961,9 @@ static int nfs_generic_pg_pgios(struct nfs_pageio_descriptor *desc)
struct nfs_client *clp = NFS_SERVER(hdr->inode)->nfs_client;
struct nfsd_file *localio =
nfs_local_open_fh(clp, hdr->cred,
hdr->args.fh, hdr->args.context->mode);
nfs_local_open_fh(clp, hdr->cred, hdr->args.fh,
&hdr->args.context->nfl,
hdr->args.context->mode);
if (NFS_SERVER(hdr->inode)->nfs_client->cl_minorversion)
task_flags = RPC_TASK_MOVEABLE;

@@ -280,9 +280,9 @@ void nfs_sysfs_link_rpc_client(struct nfs_server *server,
char name[RPC_CLIENT_NAME_SIZE];
int ret;
strcpy(name, clnt->cl_program->name);
strcat(name, uniq ? uniq : "");
strcat(name, "_client");
strscpy(name, clnt->cl_program->name, sizeof(name));
strncat(name, uniq ? uniq : "", sizeof(name) - strlen(name) - 1);
strncat(name, "_client", sizeof(name) - strlen(name) - 1);
ret = sysfs_create_link_nowarn(&server->kobj,
&clnt->cl_sysfs->kobject, name);
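The hunk above swaps unbounded strcpy()/strcat() calls for bounded ones when building the sysfs link name. A minimal userspace sketch of the same idea, assuming standard snprintf() in place of the kernel-only strscpy(); `build_link_name()` and `NAME_SIZE` are hypothetical and not part of the patch:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical buffer size; the kernel code uses RPC_CLIENT_NAME_SIZE. */
#define NAME_SIZE 32

/*
 * Build "<program><uniq>_client" into dst without writing past dst_size.
 * Returns 0 on success, -1 if the name did not fit.
 */
static int build_link_name(char *dst, size_t dst_size,
			   const char *program, const char *uniq)
{
	int n;

	assert(dst != NULL && program != NULL);
	if (dst_size == 0)
		return -1;
	/* snprintf() never writes more than dst_size bytes, NUL included. */
	n = snprintf(dst, dst_size, "%s%s_client", program, uniq ? uniq : "");
	if (n < 0 || (size_t)n >= dst_size)
		return -1;	/* output error or truncated name */
	return 0;
}
```

Unlike the kernel version, this sketch reports truncation to the caller; whether a truncated link name should be treated as an error is a design choice the patch itself does not make.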

@@ -1826,7 +1826,8 @@ nfs_commit_list(struct inode *inode, struct list_head *head, int how,
task_flags = RPC_TASK_MOVEABLE;
localio = nfs_local_open_fh(NFS_SERVER(inode)->nfs_client, data->cred,
data->args.fh, data->context->mode);
data->args.fh, &data->context->nfl,
data->context->mode);
return nfs_initiate_commit(NFS_CLIENT(inode), data, NFS_PROTO(inode),
data->mds_ops, how,
RPC_TASK_CRED_NOREF | task_flags, localio);

@@ -6,8 +6,9 @@
obj-$(CONFIG_NFS_ACL_SUPPORT) += nfs_acl.o
nfs_acl-objs := nfsacl.o
CFLAGS_localio_trace.o += -I$(src)
obj-$(CONFIG_NFS_COMMON_LOCALIO_SUPPORT) += nfs_localio.o
nfs_localio-objs := nfslocalio.o
nfs_localio-objs := nfslocalio.o localio_trace.o
obj-$(CONFIG_GRACE_PERIOD) += grace.o
obj-$(CONFIG_NFS_V4_2_SSC_HELPER) += nfs_ssc.o

@@ -15,7 +15,7 @@ static const struct {
{ NFS_OK, 0 },
{ NFSERR_PERM, -EPERM },
{ NFSERR_NOENT, -ENOENT },
{ NFSERR_IO, -errno_NFSERR_IO},
{ NFSERR_IO, -EIO },
{ NFSERR_NXIO, -ENXIO },
/* { NFSERR_EAGAIN, -EAGAIN }, */
{ NFSERR_ACCES, -EACCES },
@@ -45,7 +45,6 @@ static const struct {
{ NFSERR_SERVERFAULT, -EREMOTEIO },
{ NFSERR_BADTYPE, -EBADTYPE },
{ NFSERR_JUKEBOX, -EJUKEBOX },
{ -1, -EIO }
};
/**
@@ -59,26 +58,29 @@ int nfs_stat_to_errno(enum nfs_stat status)
{
int i;
for (i = 0; nfs_errtbl[i].stat != -1; i++) {
for (i = 0; i < ARRAY_SIZE(nfs_errtbl); i++) {
if (nfs_errtbl[i].stat == (int)status)
return nfs_errtbl[i].errno;
}
return nfs_errtbl[i].errno;
return -EIO;
}
EXPORT_SYMBOL_GPL(nfs_stat_to_errno);
/*
* We need to translate between nfs v4 status return values and
* the local errno values which may not be the same.
*
* nfs4_errtbl_common[] is used before more specialized mappings
* available in nfs4_errtbl[] or nfs4_errtbl_localio[].
*/
static const struct {
int stat;
int errno;
} nfs4_errtbl[] = {
} nfs4_errtbl_common[] = {
{ NFS4_OK, 0 },
{ NFS4ERR_PERM, -EPERM },
{ NFS4ERR_NOENT, -ENOENT },
{ NFS4ERR_IO, -errno_NFSERR_IO},
{ NFS4ERR_IO, -EIO },
{ NFS4ERR_NXIO, -ENXIO },
{ NFS4ERR_ACCESS, -EACCES },
{ NFS4ERR_EXIST, -EEXIST },
@@ -98,15 +100,20 @@ static const struct {
{ NFS4ERR_BAD_COOKIE, -EBADCOOKIE },
{ NFS4ERR_NOTSUPP, -ENOTSUPP },
{ NFS4ERR_TOOSMALL, -ETOOSMALL },
{ NFS4ERR_SERVERFAULT, -EREMOTEIO },
{ NFS4ERR_BADTYPE, -EBADTYPE },
{ NFS4ERR_LOCKED, -EAGAIN },
{ NFS4ERR_SYMLINK, -ELOOP },
{ NFS4ERR_OP_ILLEGAL, -EOPNOTSUPP },
{ NFS4ERR_DEADLOCK, -EDEADLK },
};
static const struct {
int stat;
int errno;
} nfs4_errtbl[] = {
{ NFS4ERR_SERVERFAULT, -EREMOTEIO },
{ NFS4ERR_LOCKED, -EAGAIN },
{ NFS4ERR_OP_ILLEGAL, -EOPNOTSUPP },
{ NFS4ERR_NOXATTR, -ENODATA },
{ NFS4ERR_XATTR2BIG, -E2BIG },
{ -1, -EIO }
};
/*
@@ -116,7 +123,14 @@ static const struct {
int nfs4_stat_to_errno(int stat)
{
int i;
for (i = 0; nfs4_errtbl[i].stat != -1; i++) {
/* First check nfs4_errtbl_common */
for (i = 0; i < ARRAY_SIZE(nfs4_errtbl_common); i++) {
if (nfs4_errtbl_common[i].stat == stat)
return nfs4_errtbl_common[i].errno;
}
/* Then check nfs4_errtbl */
for (i = 0; i < ARRAY_SIZE(nfs4_errtbl); i++) {
if (nfs4_errtbl[i].stat == stat)
return nfs4_errtbl[i].errno;
}
@@ -132,3 +146,56 @@ int nfs4_stat_to_errno(int stat)
return -stat;
}
EXPORT_SYMBOL_GPL(nfs4_stat_to_errno);
/*
* This table is useful for conversion from local errno to NFS error.
* It provides more logically correct mappings for use with LOCALIO
* (which is focused on converting from errno to NFS status).
*/
static const struct {
int stat;
int errno;
} nfs4_errtbl_localio[] = {
/* Map errors differently than nfs4_errtbl */
{ NFS4ERR_IO, -EREMOTEIO },
{ NFS4ERR_DELAY, -EAGAIN },
{ NFS4ERR_FBIG, -E2BIG },
/* Map errors not handled by nfs4_errtbl */
{ NFS4ERR_STALE, -EBADF },
{ NFS4ERR_STALE, -EOPENSTALE },
{ NFS4ERR_DELAY, -ETIMEDOUT },
{ NFS4ERR_DELAY, -ERESTARTSYS },
{ NFS4ERR_DELAY, -ENOMEM },
{ NFS4ERR_IO, -ETXTBSY },
{ NFS4ERR_IO, -EBUSY },
{ NFS4ERR_SERVERFAULT, -ESERVERFAULT },
{ NFS4ERR_SERVERFAULT, -ENFILE },
{ NFS4ERR_IO, -EUCLEAN },
{ NFS4ERR_PERM, -ENOKEY },
};
/*
* Convert an errno to an NFS error code for LOCALIO.
*/
__u32 nfs_localio_errno_to_nfs4_stat(int errno)
{
int i;
/* First check nfs4_errtbl_common */
for (i = 0; i < ARRAY_SIZE(nfs4_errtbl_common); i++) {
if (nfs4_errtbl_common[i].errno == errno)
return nfs4_errtbl_common[i].stat;
}
/* Then check nfs4_errtbl_localio */
for (i = 0; i < ARRAY_SIZE(nfs4_errtbl_localio); i++) {
if (nfs4_errtbl_localio[i].errno == errno)
return nfs4_errtbl_localio[i].stat;
}
/* If we cannot translate the error, the recovery routines should
* handle it.
* Note: remaining NFSv4 error codes have values > 10000, so should
* not conflict with native Linux error codes.
*/
return NFS4ERR_SERVERFAULT;
}
EXPORT_SYMBOL_GPL(nfs_localio_errno_to_nfs4_stat);
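The changes above drop the `{ -1, -EIO }` sentinel in favor of ARRAY_SIZE()-bounded loops and split the NFSv4 mappings into a common table plus direction-specific tables. A reduced userspace sketch of that lookup order follows. The `demo_*` names, the enum values and the tiny tables are hypothetical, and the struct field is called `err` here because `errno` is a macro in userspace C:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical status codes, chosen only for this demo. */
enum demo_stat {
	DEMO_OK = 0,
	DEMO_ERR_PERM = 1,
	DEMO_ERR_NOENT = 2,
	DEMO_ERR_IO = 5,
	DEMO_ERR_SERVERFAULT = 10006,
	DEMO_ERR_DELAY = 10008,
};

struct demo_map {
	int stat;
	int err;
};

/* Mappings that are valid in both directions. */
static const struct demo_map common_tbl[] = {
	{ DEMO_OK,        0 },
	{ DEMO_ERR_PERM,  -EPERM },
	{ DEMO_ERR_NOENT, -ENOENT },
	{ DEMO_ERR_IO,    -EIO },
};

/* Extra mappings used only when converting errno -> status. */
static const struct demo_map localio_tbl[] = {
	{ DEMO_ERR_DELAY, -EAGAIN },
	{ DEMO_ERR_DELAY, -ETIMEDOUT },
	{ DEMO_ERR_IO,    -EBUSY },
};

/* status -> errno: bounded scan, explicit default instead of a sentinel. */
static int demo_stat_to_errno(int stat)
{
	size_t i;

	for (i = 0; i < ARRAY_SIZE(common_tbl); i++)
		if (common_tbl[i].stat == stat)
			return common_tbl[i].err;
	return -EIO;
}

/* errno -> status: common table first, then the specialized table. */
static int demo_errno_to_stat(int err)
{
	size_t i;

	for (i = 0; i < ARRAY_SIZE(common_tbl); i++)
		if (common_tbl[i].err == err)
			return common_tbl[i].stat;
	for (i = 0; i < ARRAY_SIZE(localio_tbl); i++)
		if (localio_tbl[i].err == err)
			return localio_tbl[i].stat;
	return DEMO_ERR_SERVERFAULT;
}
```

Because the reverse direction may map several errno values to one status (as with `DEMO_ERR_DELAY` here), those entries live in a table of their own and never affect the status-to-errno lookup.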

@@ -0,0 +1,10 @@
// SPDX-License-Identifier: GPL-2.0
/*
* Copyright (c) 2024 Trond Myklebust <trond.myklebust@hammerspace.com>
* Copyright (C) 2024 Mike Snitzer <snitzer@hammerspace.com>
*/
#include <linux/nfs_fs.h>
#include <linux/namei.h>
#define CREATE_TRACE_POINTS
#include "localio_trace.h"

@@ -0,0 +1,56 @@
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Copyright (c) 2024 Trond Myklebust <trond.myklebust@hammerspace.com>
* Copyright (C) 2024 Mike Snitzer <snitzer@hammerspace.com>
*/
#undef TRACE_SYSTEM
#define TRACE_SYSTEM nfs_localio
#if !defined(_TRACE_NFS_COMMON_LOCALIO_H) || defined(TRACE_HEADER_MULTI_READ)
#define _TRACE_NFS_COMMON_LOCALIO_H
#include <linux/tracepoint.h>
#include <trace/misc/fs.h>
#include <trace/misc/nfs.h>
#include <trace/misc/sunrpc.h>
DECLARE_EVENT_CLASS(nfs_local_client_event,
TP_PROTO(
const struct nfs_client *clp
),
TP_ARGS(clp),
TP_STRUCT__entry(
__field(unsigned int, protocol)
__string(server, clp->cl_hostname)
),
TP_fast_assign(
__entry->protocol = clp->rpc_ops->version;
__assign_str(server);
),
TP_printk(
"server=%s NFSv%u", __get_str(server), __entry->protocol
)
);
#define DEFINE_NFS_LOCAL_CLIENT_EVENT(name) \
DEFINE_EVENT(nfs_local_client_event, name, \
TP_PROTO( \
const struct nfs_client *clp \
), \
TP_ARGS(clp))
DEFINE_NFS_LOCAL_CLIENT_EVENT(nfs_localio_enable_client);
DEFINE_NFS_LOCAL_CLIENT_EVENT(nfs_localio_disable_client);
#endif /* _TRACE_NFS_COMMON_LOCALIO_H */
#undef TRACE_INCLUDE_PATH
#define TRACE_INCLUDE_PATH .
#define TRACE_INCLUDE_FILE localio_trace
/* This part must be outside protection */
#include <trace/define_trace.h>

@@ -7,38 +7,67 @@
#include <linux/module.h>
#include <linux/list.h>
#include <linux/nfslocalio.h>
#include <linux/nfs3.h>
#include <linux/nfs4.h>
#include <linux/nfs_fs.h>
#include <net/netns/generic.h>
#include "localio_trace.h"
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("NFS localio protocol bypass support");
static DEFINE_SPINLOCK(nfs_uuid_lock);
static DEFINE_SPINLOCK(nfs_uuids_lock);
/*
* Global list of nfs_uuid_t instances
* that is protected by nfs_uuid_lock.
* that is protected by nfs_uuids_lock.
*/
static LIST_HEAD(nfs_uuids);
/*
* Lock ordering:
* 1: nfs_uuid->lock
* 2: nfs_uuids_lock
* 3: nfs_uuid->list_lock (aka nn->local_clients_lock)
*
* May skip locks in select cases, but never hold multiple
* locks out of order.
*/
void nfs_uuid_init(nfs_uuid_t *nfs_uuid)
{
nfs_uuid->net = NULL;
RCU_INIT_POINTER(nfs_uuid->net, NULL);
nfs_uuid->dom = NULL;
nfs_uuid->list_lock = NULL;
INIT_LIST_HEAD(&nfs_uuid->list);
INIT_LIST_HEAD(&nfs_uuid->files);
spin_lock_init(&nfs_uuid->lock);
nfs_uuid->nfs3_localio_probe_count = 0;
}
EXPORT_SYMBOL_GPL(nfs_uuid_init);
bool nfs_uuid_begin(nfs_uuid_t *nfs_uuid)
{
spin_lock(&nfs_uuid_lock);
/* Is this nfs_uuid already in use? */
if (!list_empty(&nfs_uuid->list)) {
spin_unlock(&nfs_uuid_lock);
spin_lock(&nfs_uuid->lock);
if (rcu_access_pointer(nfs_uuid->net)) {
/* This nfs_uuid is already in use */
spin_unlock(&nfs_uuid->lock);
return false;
}
spin_lock(&nfs_uuids_lock);
if (!list_empty(&nfs_uuid->list)) {
/* This nfs_uuid is already in use */
spin_unlock(&nfs_uuids_lock);
spin_unlock(&nfs_uuid->lock);
return false;
}
uuid_gen(&nfs_uuid->uuid);
list_add_tail(&nfs_uuid->list, &nfs_uuids);
spin_unlock(&nfs_uuid_lock);
spin_unlock(&nfs_uuids_lock);
uuid_gen(&nfs_uuid->uuid);
spin_unlock(&nfs_uuid->lock);
return true;
}
@@ -46,12 +75,16 @@ EXPORT_SYMBOL_GPL(nfs_uuid_begin);
void nfs_uuid_end(nfs_uuid_t *nfs_uuid)
{
if (nfs_uuid->net == NULL) {
spin_lock(&nfs_uuid_lock);
if (nfs_uuid->net == NULL)
if (!rcu_access_pointer(nfs_uuid->net)) {
spin_lock(&nfs_uuid->lock);
if (!rcu_access_pointer(nfs_uuid->net)) {
/* Not local, remove from nfs_uuids */
spin_lock(&nfs_uuids_lock);
list_del_init(&nfs_uuid->list);
spin_unlock(&nfs_uuid_lock);
}
spin_unlock(&nfs_uuids_lock);
}
spin_unlock(&nfs_uuid->lock);
}
}
EXPORT_SYMBOL_GPL(nfs_uuid_end);
@@ -69,68 +102,142 @@ static nfs_uuid_t * nfs_uuid_lookup_locked(const uuid_t *uuid)
static struct module *nfsd_mod;
void nfs_uuid_is_local(const uuid_t *uuid, struct list_head *list,
struct net *net, struct auth_domain *dom,
struct module *mod)
spinlock_t *list_lock, struct net *net,
struct auth_domain *dom, struct module *mod)
{
nfs_uuid_t *nfs_uuid;
spin_lock(&nfs_uuid_lock);
spin_lock(&nfs_uuids_lock);
nfs_uuid = nfs_uuid_lookup_locked(uuid);
if (nfs_uuid) {
kref_get(&dom->ref);
nfs_uuid->dom = dom;
/*
* We don't hold a ref on the net, but instead put
* ourselves on a list so the net pointer can be
* invalidated.
*/
list_move(&nfs_uuid->list, list);
rcu_assign_pointer(nfs_uuid->net, net);
__module_get(mod);
nfsd_mod = mod;
if (!nfs_uuid) {
spin_unlock(&nfs_uuids_lock);
return;
}
spin_unlock(&nfs_uuid_lock);
/*
* We don't hold a ref on the net, but instead put
* ourselves on @list (nn->local_clients) so the net
* pointer can be invalidated.
*/
spin_lock(list_lock); /* list_lock is nn->local_clients_lock */
list_move(&nfs_uuid->list, list);
spin_unlock(list_lock);
spin_unlock(&nfs_uuids_lock);
/* Once nfs_uuid is parented to @list, avoid global nfs_uuids_lock */
spin_lock(&nfs_uuid->lock);
__module_get(mod);
nfsd_mod = mod;
nfs_uuid->list_lock = list_lock;
kref_get(&dom->ref);
nfs_uuid->dom = dom;
rcu_assign_pointer(nfs_uuid->net, net);
spin_unlock(&nfs_uuid->lock);
}
EXPORT_SYMBOL_GPL(nfs_uuid_is_local);
static void nfs_uuid_put_locked(nfs_uuid_t *nfs_uuid)
void nfs_localio_enable_client(struct nfs_client *clp)
{
if (nfs_uuid->net) {
module_put(nfsd_mod);
nfs_uuid->net = NULL;
/* nfs_uuid_is_local() does the actual enablement */
trace_nfs_localio_enable_client(clp);
}
EXPORT_SYMBOL_GPL(nfs_localio_enable_client);
/*
* Cleanup the nfs_uuid_t embedded in an nfs_client.
* This is the long-form of nfs_uuid_init().
*/
static bool nfs_uuid_put(nfs_uuid_t *nfs_uuid)
{
LIST_HEAD(local_files);
struct nfs_file_localio *nfl, *tmp;
spin_lock(&nfs_uuid->lock);
if (unlikely(!rcu_access_pointer(nfs_uuid->net))) {
spin_unlock(&nfs_uuid->lock);
return false;
}
RCU_INIT_POINTER(nfs_uuid->net, NULL);
if (nfs_uuid->dom) {
auth_domain_put(nfs_uuid->dom);
nfs_uuid->dom = NULL;
}
list_del_init(&nfs_uuid->list);
list_splice_init(&nfs_uuid->files, &local_files);
spin_unlock(&nfs_uuid->lock);
/* Walk list of files and ensure their last references dropped */
list_for_each_entry_safe(nfl, tmp, &local_files, list) {
nfs_close_local_fh(nfl);
cond_resched();
}
spin_lock(&nfs_uuid->lock);
BUG_ON(!list_empty(&nfs_uuid->files));
/* Remove client from nn->local_clients */
if (nfs_uuid->list_lock) {
spin_lock(nfs_uuid->list_lock);
BUG_ON(list_empty(&nfs_uuid->list));
list_del_init(&nfs_uuid->list);
spin_unlock(nfs_uuid->list_lock);
nfs_uuid->list_lock = NULL;
}
module_put(nfsd_mod);
spin_unlock(&nfs_uuid->lock);
return true;
}
void nfs_uuid_invalidate_clients(struct list_head *list)
void nfs_localio_disable_client(struct nfs_client *clp)
{
if (nfs_uuid_put(&clp->cl_uuid))
trace_nfs_localio_disable_client(clp);
}
EXPORT_SYMBOL_GPL(nfs_localio_disable_client);
void nfs_localio_invalidate_clients(struct list_head *nn_local_clients,
spinlock_t *nn_local_clients_lock)
{
LIST_HEAD(local_clients);
nfs_uuid_t *nfs_uuid, *tmp;
struct nfs_client *clp;
spin_lock(&nfs_uuid_lock);
list_for_each_entry_safe(nfs_uuid, tmp, list, list)
nfs_uuid_put_locked(nfs_uuid);
spin_unlock(&nfs_uuid_lock);
}
EXPORT_SYMBOL_GPL(nfs_uuid_invalidate_clients);
void nfs_uuid_invalidate_one_client(nfs_uuid_t *nfs_uuid)
{
if (nfs_uuid->net) {
spin_lock(&nfs_uuid_lock);
nfs_uuid_put_locked(nfs_uuid);
spin_unlock(&nfs_uuid_lock);
spin_lock(nn_local_clients_lock);
list_splice_init(nn_local_clients, &local_clients);
spin_unlock(nn_local_clients_lock);
list_for_each_entry_safe(nfs_uuid, tmp, &local_clients, list) {
if (WARN_ON(nfs_uuid->list_lock != nn_local_clients_lock))
break;
clp = container_of(nfs_uuid, struct nfs_client, cl_uuid);
nfs_localio_disable_client(clp);
}
}
EXPORT_SYMBOL_GPL(nfs_uuid_invalidate_one_client);
EXPORT_SYMBOL_GPL(nfs_localio_invalidate_clients);
static void nfs_uuid_add_file(nfs_uuid_t *nfs_uuid, struct nfs_file_localio *nfl)
{
/* Add nfl to nfs_uuid->files if it isn't already */
spin_lock(&nfs_uuid->lock);
if (list_empty(&nfl->list)) {
rcu_assign_pointer(nfl->nfs_uuid, nfs_uuid);
list_add_tail(&nfl->list, &nfs_uuid->files);
}
spin_unlock(&nfs_uuid->lock);
}
/*
* Caller is responsible for calling nfsd_net_put and
* nfsd_file_put (via nfs_to_nfsd_file_put_local).
*/
struct nfsd_file *nfs_open_local_fh(nfs_uuid_t *uuid,
struct rpc_clnt *rpc_clnt, const struct cred *cred,
const struct nfs_fh *nfs_fh, const fmode_t fmode)
const struct nfs_fh *nfs_fh, struct nfs_file_localio *nfl,
const fmode_t fmode)
{
struct net *net;
struct nfsd_file *localio;
@@ -139,7 +246,7 @@ struct nfsd_file *nfs_open_local_fh(nfs_uuid_t *uuid,
* Not running in nfsd context, so must safely get reference on nfsd_serv.
* But the server may already be shutting down, if so disallow new localio.
* uuid->net is NOT a counted reference, but rcu_read_lock() ensures that
* if uuid->net is not NULL, then calling nfsd_serv_try_get() is safe
* if uuid->net is not NULL, then calling nfsd_net_try_get() is safe
* and if it succeeds we will have an implied reference to the net.
*
* Otherwise NFS may not have ref on NFSD and therefore cannot safely
@@ -147,21 +254,62 @@ struct nfsd_file *nfs_open_local_fh(nfs_uuid_t *uuid,
*/
rcu_read_lock();
net = rcu_dereference(uuid->net);
if (!net || !nfs_to->nfsd_serv_try_get(net)) {
if (!net || !nfs_to->nfsd_net_try_get(net)) {
rcu_read_unlock();
return ERR_PTR(-ENXIO);
}
rcu_read_unlock();
/* We have an implied reference to net thanks to nfsd_serv_try_get */
/* We have an implied reference to net thanks to nfsd_net_try_get */
localio = nfs_to->nfsd_open_local_fh(net, uuid->dom, rpc_clnt,
cred, nfs_fh, fmode);
if (IS_ERR(localio))
nfs_to_nfsd_net_put(net);
else
nfs_uuid_add_file(uuid, nfl);
return localio;
}
EXPORT_SYMBOL_GPL(nfs_open_local_fh);
void nfs_close_local_fh(struct nfs_file_localio *nfl)
{
struct nfsd_file *ro_nf = NULL;
struct nfsd_file *rw_nf = NULL;
nfs_uuid_t *nfs_uuid;
rcu_read_lock();
nfs_uuid = rcu_dereference(nfl->nfs_uuid);
if (!nfs_uuid) {
/* regular (non-LOCALIO) NFS will hammer this */
rcu_read_unlock();
return;
}
ro_nf = rcu_access_pointer(nfl->ro_file);
rw_nf = rcu_access_pointer(nfl->rw_file);
if (ro_nf || rw_nf) {
spin_lock(&nfs_uuid->lock);
if (ro_nf)
ro_nf = rcu_dereference_protected(xchg(&nfl->ro_file, NULL), 1);
if (rw_nf)
rw_nf = rcu_dereference_protected(xchg(&nfl->rw_file, NULL), 1);
/* Remove nfl from nfs_uuid->files list */
RCU_INIT_POINTER(nfl->nfs_uuid, NULL);
list_del_init(&nfl->list);
spin_unlock(&nfs_uuid->lock);
rcu_read_unlock();
if (ro_nf)
nfs_to_nfsd_file_put_local(ro_nf);
if (rw_nf)
nfs_to_nfsd_file_put_local(rw_nf);
return;
}
rcu_read_unlock();
}
EXPORT_SYMBOL_GPL(nfs_close_local_fh);
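nfs_uuid_put() above splices nfs_uuid->files onto a private list while holding nfs_uuid->lock, then closes each file after dropping the lock. A simplified userspace sketch of that "detach under the lock, release outside it" pattern, assuming a pthread mutex and a singly linked list in place of the kernel spinlock and list_head; all `demo_*` names are hypothetical:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct demo_file {
	struct demo_file *next;
	int open;			/* 1 while the file counts as open */
};

struct demo_client {
	pthread_mutex_t lock;
	struct demo_file *files;	/* protected by lock */
};

static void demo_client_init(struct demo_client *c)
{
	pthread_mutex_init(&c->lock, NULL);
	c->files = NULL;
}

static void demo_client_add(struct demo_client *c, struct demo_file *f)
{
	f->open = 1;
	pthread_mutex_lock(&c->lock);
	f->next = c->files;
	c->files = f;
	pthread_mutex_unlock(&c->lock);
}

/* Close every tracked file; returns how many were closed. */
static int demo_client_close_all(struct demo_client *c)
{
	struct demo_file *head, *f, *next;
	int closed = 0;

	assert(c != NULL);
	/* Detach the whole list while holding the lock... */
	pthread_mutex_lock(&c->lock);
	head = c->files;
	c->files = NULL;
	pthread_mutex_unlock(&c->lock);

	/* ...then do the potentially slow per-file work without it. */
	for (f = head; f; f = next) {
		next = f->next;
		f->next = NULL;
		f->open = 0;
		closed++;
	}
	return closed;
}
```

Keeping the per-file work outside the lock is what allows the kernel code to call cond_resched() in its loop; the sketch leaves out the RCU and reference-counting details that the real code depends on.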
/*
* The NFS LOCALIO code needs to call into NFSD using various symbols,
* but cannot be statically linked, because that will make the NFS

@@ -39,6 +39,7 @@
#include <linux/fsnotify.h>
#include <linux/seq_file.h>
#include <linux/rhashtable.h>
#include <linux/nfslocalio.h>
#include "vfs.h"
#include "nfsd.h"
@@ -391,7 +392,7 @@ nfsd_file_put(struct nfsd_file *nf)
}
/**
* nfsd_file_put_local - put nfsd_file reference and arm nfsd_serv_put in caller
* nfsd_file_put_local - put nfsd_file reference and arm nfsd_net_put in caller
* @nf: nfsd_file of which to put the reference
*
* First save the associated net to return to caller, then put
@@ -833,6 +834,14 @@ __nfsd_file_cache_purge(struct net *net)
struct nfsd_file *nf;
LIST_HEAD(dispose);
#if IS_ENABLED(CONFIG_NFS_LOCALIO)
if (net) {
struct nfsd_net *nn = net_generic(net, nfsd_net_id);
nfs_localio_invalidate_clients(&nn->local_clients,
&nn->local_clients_lock);
}
#endif
rhltable_walk_enter(&nfsd_file_rhltable, &iter);
do {
rhashtable_walk_start(&iter);
@@ -1222,10 +1231,9 @@ nfsd_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
* a file. The security implications of this should be carefully
* considered before use.
*
* The nfsd_file object returned by this API is reference-counted
* and garbage-collected. The object is retained for a few
* seconds after the final nfsd_file_put() in case the caller
* wants to re-use it.
* The nfsd_file_object returned by this API is reference-counted
* but not garbage-collected. The object is unhashed after the
* final nfsd_file_put().
*
* Return values:
* %nfs_ok - @pnf points to an nfsd_file with its reference
@@ -1247,7 +1255,7 @@ nfsd_file_acquire_local(struct net *net, struct svc_cred *cred,
__be32 beres;
beres = nfsd_file_do_acquire(NULL, net, cred, client,
fhp, may_flags, NULL, pnf, true);
fhp, may_flags, NULL, pnf, false);
put_cred(revert_creds(save_cred));
return beres;
}

@@ -25,10 +25,12 @@
#include "cache.h"
static const struct nfsd_localio_operations nfsd_localio_ops = {
.nfsd_serv_try_get = nfsd_serv_try_get,
.nfsd_serv_put = nfsd_serv_put,
.nfsd_net_try_get = nfsd_net_try_get,
.nfsd_net_put = nfsd_net_put,
.nfsd_open_local_fh = nfsd_open_local_fh,
.nfsd_file_put_local = nfsd_file_put_local,
.nfsd_file_get = nfsd_file_get,
.nfsd_file_put = nfsd_file_put,
.nfsd_file_file = nfsd_file_file,
};
@@ -52,7 +54,7 @@ void nfsd_localio_ops_init(void)
* avoid all the NFS overhead with reads, writes and commits.
*
* On successful return, returned nfsd_file will have its nf_net member
* set. Caller (NFS client) is responsible for calling nfsd_serv_put and
* set. Caller (NFS client) is responsible for calling nfsd_net_put and
* nfsd_file_put (via nfs_to_nfsd_file_put_local).
*/
struct nfsd_file *
@@ -114,6 +116,7 @@ static __be32 localio_proc_uuid_is_local(struct svc_rqst *rqstp)
struct nfsd_net *nn = net_generic(net, nfsd_net_id);
nfs_uuid_is_local(&argp->uuid, &nn->local_clients,
&nn->local_clients_lock,
net, rqstp->rq_client, THIS_MODULE);
return rpc_success;

@@ -134,9 +134,10 @@ struct nfsd_net {
struct svc_info nfsd_info;
#define nfsd_serv nfsd_info.serv
struct percpu_ref nfsd_serv_ref;
struct completion nfsd_serv_confirm_done;
struct completion nfsd_serv_free_done;
struct percpu_ref nfsd_net_ref;
struct completion nfsd_net_confirm_done;
struct completion nfsd_net_free_done;
/*
* clientid and stateid data for construction of net unique COPY
@@ -213,6 +214,7 @@ struct nfsd_net {
#if IS_ENABLED(CONFIG_NFS_LOCALIO)
/* Local clients to be invalidated when net is shut down */
spinlock_t local_clients_lock;
struct list_head local_clients;
#endif
};
@@ -223,8 +225,8 @@ struct nfsd_net {
extern bool nfsd_support_version(int vers);
extern unsigned int nfsd_net_id;
bool nfsd_serv_try_get(struct net *net);
void nfsd_serv_put(struct net *net);
bool nfsd_net_try_get(struct net *net);
void nfsd_net_put(struct net *net);
void nfsd_copy_write_verifier(__be32 verf[2], struct nfsd_net *nn);
void nfsd_reset_write_verifier(struct nfsd_net *nn);

@@ -2217,6 +2217,7 @@ static __net_init int nfsd_net_init(struct net *net)
seqlock_init(&nn->writeverf_lock);
nfsd_proc_stat_init(net);
#if IS_ENABLED(CONFIG_NFS_LOCALIO)
spin_lock_init(&nn->local_clients_lock);
INIT_LIST_HEAD(&nn->local_clients);
#endif
return 0;
@@ -2234,14 +2235,15 @@ out_export_error:
* nfsd_net_pre_exit - Disconnect localio clients from net namespace
* @net: a network namespace that is about to be destroyed
*
* This invalidated ->net pointers held by localio clients
* This invalidates ->net pointers held by localio clients
* while they can still safely access nn->counter.
*/
static __net_exit void nfsd_net_pre_exit(struct net *net)
{
struct nfsd_net *nn = net_generic(net, nfsd_net_id);
nfs_uuid_invalidate_clients(&nn->local_clients);
nfs_localio_invalidate_clients(&nn->local_clients,
&nn->local_clients_lock);
}
#endif

@@ -204,32 +204,32 @@ int nfsd_minorversion(struct nfsd_net *nn, u32 minorversion, enum vers_op change
return 0;
}
bool nfsd_serv_try_get(struct net *net) __must_hold(rcu)
bool nfsd_net_try_get(struct net *net) __must_hold(rcu)
{
struct nfsd_net *nn = net_generic(net, nfsd_net_id);
return (nn && percpu_ref_tryget_live(&nn->nfsd_serv_ref));
return (nn && percpu_ref_tryget_live(&nn->nfsd_net_ref));
}
void nfsd_serv_put(struct net *net) __must_hold(rcu)
void nfsd_net_put(struct net *net) __must_hold(rcu)
{
struct nfsd_net *nn = net_generic(net, nfsd_net_id);
percpu_ref_put(&nn->nfsd_serv_ref);
percpu_ref_put(&nn->nfsd_net_ref);
}
static void nfsd_serv_done(struct percpu_ref *ref)
static void nfsd_net_done(struct percpu_ref *ref)
{
struct nfsd_net *nn = container_of(ref, struct nfsd_net, nfsd_serv_ref);
struct nfsd_net *nn = container_of(ref, struct nfsd_net, nfsd_net_ref);
complete(&nn->nfsd_serv_confirm_done);
complete(&nn->nfsd_net_confirm_done);
}
static void nfsd_serv_free(struct percpu_ref *ref)
static void nfsd_net_free(struct percpu_ref *ref)
{
struct nfsd_net *nn = container_of(ref, struct nfsd_net, nfsd_serv_ref);
struct nfsd_net *nn = container_of(ref, struct nfsd_net, nfsd_net_ref);
complete(&nn->nfsd_serv_free_done);
complete(&nn->nfsd_net_free_done);
}
/*
@@ -426,6 +426,10 @@ static void nfsd_shutdown_net(struct net *net)
if (!nn->nfsd_net_up)
return;
percpu_ref_kill_and_confirm(&nn->nfsd_net_ref, nfsd_net_done);
wait_for_completion(&nn->nfsd_net_confirm_done);
nfsd_export_flush(net);
nfs4_state_shutdown_net(net);
nfsd_reply_cache_shutdown(nn);
@@ -434,7 +438,10 @@ static void nfsd_shutdown_net(struct net *net)
lockd_down(net);
nn->lockd_up = false;
}
percpu_ref_exit(&nn->nfsd_serv_ref);
wait_for_completion(&nn->nfsd_net_free_done);
percpu_ref_exit(&nn->nfsd_net_ref);
nn->nfsd_net_up = false;
nfsd_shutdown_generic();
}
@@ -516,11 +523,6 @@ void nfsd_destroy_serv(struct net *net)
lockdep_assert_held(&nfsd_mutex);
percpu_ref_kill_and_confirm(&nn->nfsd_serv_ref, nfsd_serv_done);
wait_for_completion(&nn->nfsd_serv_confirm_done);
wait_for_completion(&nn->nfsd_serv_free_done);
/* percpu_ref_exit is called in nfsd_shutdown_net */
spin_lock(&nfsd_notifier_lock);
nn->nfsd_serv = NULL;
spin_unlock(&nfsd_notifier_lock);
@@ -621,12 +623,12 @@ int nfsd_create_serv(struct net *net)
if (nn->nfsd_serv)
return 0;
error = percpu_ref_init(&nn->nfsd_serv_ref, nfsd_serv_free,
error = percpu_ref_init(&nn->nfsd_net_ref, nfsd_net_free,
0, GFP_KERNEL);
if (error)
return error;
init_completion(&nn->nfsd_serv_free_done);
init_completion(&nn->nfsd_serv_confirm_done);
init_completion(&nn->nfsd_net_free_done);
init_completion(&nn->nfsd_net_confirm_done);
if (nfsd_max_blksize == 0)
nfsd_max_blksize = nfsd_get_default_max_blksize();

@@ -9,9 +9,10 @@
#include <uapi/linux/nfs.h>
/* Mapping from NFS error code to "errno" error code. */
#define errno_NFSERR_IO EIO
int nfs_stat_to_errno(enum nfs_stat status);
int nfs4_stat_to_errno(int stat);
__u32 nfs_localio_errno_to_nfs4_stat(int errno);
#endif /* _LINUX_NFS_COMMON_H */

@@ -77,6 +77,23 @@ struct nfs_lock_context {
struct rcu_head rcu_head;
};
struct nfs_file_localio {
struct nfsd_file __rcu *ro_file;
struct nfsd_file __rcu *rw_file;
struct list_head list;
void __rcu *nfs_uuid; /* opaque pointer to 'nfs_uuid_t' */
};
static inline void nfs_localio_file_init(struct nfs_file_localio *nfl)
{
#if IS_ENABLED(CONFIG_NFS_LOCALIO)
nfl->ro_file = NULL;
nfl->rw_file = NULL;
INIT_LIST_HEAD(&nfl->list);
nfl->nfs_uuid = NULL;
#endif
}
struct nfs4_state;
struct nfs_open_context {
struct nfs_lock_context lock_context;
@@ -87,15 +104,16 @@ struct nfs_open_context {
struct nfs4_state *state;
fmode_t mode;
int error;
unsigned long flags;
#define NFS_CONTEXT_BAD (2)
#define NFS_CONTEXT_UNLOCK (3)
#define NFS_CONTEXT_FILE_OPEN (4)
int error;
struct list_head list;
struct nfs4_threshold *mdsthreshold;
struct list_head list;
struct rcu_head rcu_head;
struct nfs_file_localio nfl;
};
struct nfs_open_dir_context {

@@ -50,7 +50,6 @@ struct nfs_client {
#define NFS_CS_DS 7 /* - Server is a DS */
#define NFS_CS_REUSEPORT 8 /* - reuse src port on reconnect */
#define NFS_CS_PNFS 9 /* - Server used for pnfs */
#define NFS_CS_LOCAL_IO 10 /* - client is local */
struct sockaddr_storage cl_addr; /* server identifier */
size_t cl_addrlen;
char * cl_hostname; /* hostname of server */
@@ -132,7 +131,7 @@ struct nfs_client {
struct timespec64 cl_nfssvc_boot;
seqlock_t cl_boot_lock;
nfs_uuid_t cl_uuid;
spinlock_t cl_localio_lock;
struct work_struct cl_local_probe_work;
#endif /* CONFIG_NFS_LOCALIO */
};

@@ -1632,6 +1632,7 @@ enum {
NFS_IOHDR_RESEND_PNFS,
NFS_IOHDR_RESEND_MDS,
NFS_IOHDR_UNSTABLE_WRITES,
NFS_IOHDR_ODIRECT,
};
struct nfs_io_completion;

@@ -6,9 +6,6 @@
#ifndef __LINUX_NFSLOCALIO_H
#define __LINUX_NFSLOCALIO_H
/* nfsd_file structure is purposely kept opaque to NFS client */
struct nfsd_file;
#if IS_ENABLED(CONFIG_NFS_LOCALIO)
#include <linux/module.h>
@@ -19,6 +16,9 @@ struct nfsd_file;
#include <linux/nfs.h>
#include <net/net_namespace.h>
struct nfs_client;
struct nfs_file_localio;
/*
* Useful to allow a client to negotiate if localio
* possible with its server.
@@ -27,28 +27,38 @@ struct nfsd_file;
*/
typedef struct {
uuid_t uuid;
unsigned nfs3_localio_probe_count;
/* this struct is over a cacheline, avoid bouncing */
spinlock_t ____cacheline_aligned lock;
struct list_head list;
spinlock_t *list_lock; /* nn->local_clients_lock */
struct net __rcu *net; /* nfsd's network namespace */
struct auth_domain *dom; /* auth_domain for localio */
/* Local files to close when net is shut down or exports change */
struct list_head files;
} nfs_uuid_t;
void nfs_uuid_init(nfs_uuid_t *);
bool nfs_uuid_begin(nfs_uuid_t *);
void nfs_uuid_end(nfs_uuid_t *);
void nfs_uuid_is_local(const uuid_t *, struct list_head *,
void nfs_uuid_is_local(const uuid_t *, struct list_head *, spinlock_t *,
struct net *, struct auth_domain *, struct module *);
void nfs_uuid_invalidate_clients(struct list_head *list);
void nfs_uuid_invalidate_one_client(nfs_uuid_t *nfs_uuid);
void nfs_localio_enable_client(struct nfs_client *clp);
void nfs_localio_disable_client(struct nfs_client *clp);
void nfs_localio_invalidate_clients(struct list_head *nn_local_clients,
spinlock_t *nn_local_clients_lock);
/* localio needs to map filehandle -> struct nfsd_file */
extern struct nfsd_file *
nfsd_open_local_fh(struct net *, struct auth_domain *, struct rpc_clnt *,
const struct cred *, const struct nfs_fh *,
const fmode_t) __must_hold(rcu);
void nfs_close_local_fh(struct nfs_file_localio *);
struct nfsd_localio_operations {
bool (*nfsd_serv_try_get)(struct net *);
void (*nfsd_serv_put)(struct net *);
bool (*nfsd_net_try_get)(struct net *);
void (*nfsd_net_put)(struct net *);
struct nfsd_file *(*nfsd_open_local_fh)(struct net *,
struct auth_domain *,
struct rpc_clnt *,
@@ -56,6 +66,8 @@ struct nfsd_localio_operations {
const struct nfs_fh *,
const fmode_t);
struct net *(*nfsd_file_put_local)(struct nfsd_file *);
struct nfsd_file *(*nfsd_file_get)(struct nfsd_file *);
void (*nfsd_file_put)(struct nfsd_file *);
struct file *(*nfsd_file_file)(struct nfsd_file *);
} ____cacheline_aligned;
@@ -64,17 +76,18 @@ extern const struct nfsd_localio_operations *nfs_to;
struct nfsd_file *nfs_open_local_fh(nfs_uuid_t *,
struct rpc_clnt *, const struct cred *,
const struct nfs_fh *, const fmode_t);
const struct nfs_fh *, struct nfs_file_localio *,
const fmode_t);
static inline void nfs_to_nfsd_net_put(struct net *net)
{
/*
* Once reference to nfsd_serv is dropped, NFSD could be
* unloaded, so ensure safe return from nfsd_file_put_local()
* by always taking RCU.
* Once reference to net (and associated nfsd_serv) is dropped, NFSD
* could be unloaded, so ensure safe return from nfsd_net_put() by
* always taking RCU.
*/
rcu_read_lock();
nfs_to->nfsd_serv_put(net);
nfs_to->nfsd_net_put(net);
rcu_read_unlock();
}
@@ -91,12 +104,19 @@ static inline void nfs_to_nfsd_file_put_local(struct nfsd_file *localio)
}
#else /* CONFIG_NFS_LOCALIO */
struct nfs_file_localio;
static inline void nfs_close_local_fh(struct nfs_file_localio *nfl)
{
}
static inline void nfsd_localio_ops_init(void)
{
}
static inline void nfs_to_nfsd_file_put_local(struct nfsd_file *localio)
struct nfs_client;
static inline void nfs_localio_disable_client(struct nfs_client *clp)
{
}
#endif /* CONFIG_NFS_LOCALIO */
#endif /* __LINUX_NFSLOCALIO_H */

@@ -93,6 +93,7 @@ struct rpc_clnt {
const struct cred *cl_cred;
unsigned int cl_max_connect; /* max number of transports not to the same IP */
struct super_block *pipefs_sb;
atomic_t cl_task_count;
};
/*

@@ -958,12 +958,17 @@ void rpc_shutdown_client(struct rpc_clnt *clnt)
trace_rpc_clnt_shutdown(clnt);
clnt->cl_shutdown = 1;
while (!list_empty(&clnt->cl_tasks)) {
rpc_killall_tasks(clnt);
wait_event_timeout(destroy_wait,
list_empty(&clnt->cl_tasks), 1*HZ);
}
/* wait for tasks still in workqueue or waitqueue */
wait_event_timeout(destroy_wait,
atomic_read(&clnt->cl_task_count) == 0, 1 * HZ);
rpc_release_client(clnt);
}
EXPORT_SYMBOL_GPL(rpc_shutdown_client);
@@ -1139,6 +1144,7 @@ void rpc_task_release_client(struct rpc_task *task)
list_del(&task->tk_task);
spin_unlock(&clnt->cl_lock);
task->tk_client = NULL;
atomic_dec(&clnt->cl_task_count);
rpc_release_client(clnt);
}
@@ -1189,10 +1195,7 @@ void rpc_task_set_client(struct rpc_task *task, struct rpc_clnt *clnt)
task->tk_flags |= RPC_TASK_TIMEOUT;
if (clnt->cl_noretranstimeo)
task->tk_flags |= RPC_TASK_NO_RETRANS_TIMEOUT;
/* Add to the client's list of all tasks */
spin_lock(&clnt->cl_lock);
list_add_tail(&task->tk_task, &clnt->cl_tasks);
spin_unlock(&clnt->cl_lock);
atomic_inc(&clnt->cl_task_count);
}
static void
@@ -1787,9 +1790,14 @@ call_reserveresult(struct rpc_task *task)
if (status >= 0) {
if (task->tk_rqstp) {
task->tk_action = call_refresh;
/* Add to the client's list of all tasks */
spin_lock(&task->tk_client->cl_lock);
if (list_empty(&task->tk_task))
list_add_tail(&task->tk_task, &task->tk_client->cl_tasks);
spin_unlock(&task->tk_client->cl_lock);
return;
}
rpc_call_rpcerror(task, -EIO);
return;
}
@@ -1854,13 +1862,13 @@ call_refreshresult(struct rpc_task *task)
fallthrough;
case -EAGAIN:
status = -EACCES;
fallthrough;
case -EKEYEXPIRED:
if (!task->tk_cred_retry)
break;
task->tk_cred_retry--;
trace_rpc_retry_refresh_status(task);
return;
case -EKEYEXPIRED:
break;
case -ENOMEM:
rpc_delay(task, HZ >> 4);
return;
@@ -3319,8 +3327,11 @@ bool rpc_clnt_xprt_switch_has_addr(struct rpc_clnt *clnt,
EXPORT_SYMBOL_GPL(rpc_clnt_xprt_switch_has_addr);
#if IS_ENABLED(CONFIG_SUNRPC_DEBUG)
static void rpc_show_header(void)
static void rpc_show_header(struct rpc_clnt *clnt)
{
printk(KERN_INFO "clnt[%pISpc] RPC tasks[%d]\n",
(struct sockaddr *)&clnt->cl_xprt->addr,
atomic_read(&clnt->cl_task_count));
printk(KERN_INFO "-pid- flgs status -client- --rqstp- "
"-timeout ---ops--\n");
}
@@ -3352,7 +3363,7 @@ void rpc_show_tasks(struct net *net)
spin_lock(&clnt->cl_lock);
list_for_each_entry(task, &clnt->cl_tasks, tk_task) {
if (!header) {
rpc_show_header();
rpc_show_header(clnt);
header++;
}
rpc_show_task(clnt, task);
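
The hunks above move the `cl_tasks` insertion out of `rpc_task_set_client()` and into `call_reserveresult()`, and add `cl_task_count` so that `rpc_shutdown_client()` can still account for tasks that are not yet on the list. A minimal userspace sketch of that bookkeeping, assuming hypothetical `demo_*` types and helpers (none of these names exist in the kernel, and the kernel's locking with `cl_lock` is deliberately left out):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel's rpc_task/rpc_clnt. */
struct demo_client;

struct demo_task {
	struct demo_task *next;      /* link in the client's visible task list */
	bool on_list;                /* true once a request slot was reserved */
	struct demo_client *client;
};

struct demo_client {
	struct demo_task *tasks;     /* only tasks that already hold a slot */
	atomic_int task_count;       /* every task bound to this client */
};

static void demo_client_init(struct demo_client *c)
{
	c->tasks = NULL;
	atomic_init(&c->task_count, 0);
}

/* Bind a task to the client: count it, but do not list it yet. */
static void demo_task_set_client(struct demo_task *t, struct demo_client *c)
{
	t->next = NULL;
	t->on_list = false;
	t->client = c;
	atomic_fetch_add(&c->task_count, 1);
}

/* Slot reserved: put the task on the list, at most once. */
static void demo_task_slot_reserved(struct demo_task *t)
{
	if (t->on_list)
		return;
	t->next = t->client->tasks;
	t->client->tasks = t;
	t->on_list = true;
}

/* Release: unlink the task if it was listed, then drop the count. */
static void demo_task_release(struct demo_task *t)
{
	struct demo_client *c = t->client;
	struct demo_task **pp;

	for (pp = &c->tasks; t->on_list && *pp; pp = &(*pp)->next) {
		if (*pp == t) {
			*pp = t->next;
			t->on_list = false;
			break;
		}
	}
	assert(atomic_load(&c->task_count) > 0);
	atomic_fetch_sub(&c->task_count, 1);
	t->client = NULL;
}

/* Number of tasks currently visible on the list. */
static size_t demo_listed(struct demo_client *c)
{
	size_t n = 0;

	for (struct demo_task *t = c->tasks; t; t = t->next)
		n++;
	return n;
}

/* Shutdown may proceed only when the list is empty AND the count is zero. */
static bool demo_client_idle(struct demo_client *c)
{
	return c->tasks == NULL && atomic_load(&c->task_count) == 0;
}
```

In this sketch a task that was bound but never reserved a slot keeps `demo_client_idle()` false even though the list is empty, which is the case the extra wait on `cl_task_count` in `rpc_shutdown_client()` appears intended to cover.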

@@ -74,6 +74,9 @@ tasks_stop(struct seq_file *f, void *v)
{
struct rpc_clnt *clnt = f->private;
spin_unlock(&clnt->cl_lock);
seq_printf(f, "clnt[%pISpc] RPC tasks[%d]\n",
(struct sockaddr *)&clnt->cl_xprt->addr,
atomic_read(&clnt->cl_task_count));
}
static const struct seq_operations tasks_seq_operations = {
@@ -179,6 +182,18 @@ xprt_info_show(struct seq_file *f, void *v)
seq_printf(f, "addr: %s\n", xprt->address_strings[RPC_DISPLAY_ADDR]);
seq_printf(f, "port: %s\n", xprt->address_strings[RPC_DISPLAY_PORT]);
seq_printf(f, "state: 0x%lx\n", xprt->state);
seq_printf(f, "netns: %u\n", xprt->xprt_net->ns.inum);
if (xprt->ops->get_srcaddr) {
int ret, buflen;
char buf[INET6_ADDRSTRLEN];
buflen = ARRAY_SIZE(buf);
ret = xprt->ops->get_srcaddr(xprt, buf, buflen);
if (ret < 0)
ret = sprintf(buf, "<closed>");
seq_printf(f, "saddr: %.*s\n", ret, buf);
}
return 0;
}
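
The `saddr` output in `xprt_info_show()` relies on the callback returning the length of the string it wrote, falls back to `<closed>` on a negative return, and prints with a `%.*s` precision so only that many bytes are emitted. A minimal userspace sketch of the same pattern, assuming a hypothetical callback type and `demo_*` helpers (the buffer size and the sample address are illustrative, not taken from the kernel):

```c
#include <stdio.h>
#include <string.h>

#define DEMO_ADDRSTRLEN 64  /* illustrative buffer size, an assumption */

/*
 * Hypothetical callback: writes the source address into buf and returns
 * its length, or a negative value if no address is available.
 */
typedef int (*demo_get_srcaddr_fn)(char *buf, size_t buflen);

/* Format one "saddr:" line, falling back to "<closed>" on error. */
static int demo_format_saddr(demo_get_srcaddr_fn get_srcaddr,
			     char *out, size_t outlen)
{
	char buf[DEMO_ADDRSTRLEN];
	int ret = get_srcaddr(buf, sizeof(buf));

	if (ret < 0)
		ret = snprintf(buf, sizeof(buf), "<closed>");
	/* The precision limits the output to the bytes actually written. */
	return snprintf(out, outlen, "saddr: %.*s\n", ret, buf);
}

/* Sample callbacks for the two cases. */
static int demo_srcaddr_ok(char *buf, size_t buflen)
{
	return snprintf(buf, buflen, "192.0.2.10");
}

static int demo_srcaddr_closed(char *buf, size_t buflen)
{
	(void)buf;
	(void)buflen;
	return -1;
}
```

The sketch uses `snprintf()` for the fallback string rather than the `sprintf()` seen in the diff; that is a choice made for the example, not a claim about what the kernel code should do.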