1
0
Fork 0
mirror of synced 2025-03-06 20:59:54 +01:00
Commit graph

51 commits

Author SHA1 Message Date
Linus Torvalds
f4484d138b Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:
 "55 patches.

  Subsystems affected by this patch series: percpu, procfs, sysctl,
  misc, core-kernel, get_maintainer, lib, checkpatch, binfmt, nilfs2,
  hfs, fat, adfs, panic, delayacct, kconfig, kcov, and ubsan"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (55 commits)
  lib: remove redundant assignment to variable ret
  ubsan: remove CONFIG_UBSAN_OBJECT_SIZE
  kcov: fix generic Kconfig dependencies if ARCH_WANTS_NO_INSTR
  lib/Kconfig.debug: make TEST_KMOD depend on PAGE_SIZE_LESS_THAN_256KB
  btrfs: use generic Kconfig option for 256kB page size limit
  arch/Kconfig: split PAGE_SIZE_LESS_THAN_256KB from PAGE_SIZE_LESS_THAN_64KB
  configs: introduce debug.config for CI-like setup
  delayacct: track delays from memory compact
  Documentation/accounting/delay-accounting.rst: add thrashing page cache and direct compact
  delayacct: cleanup flags in struct task_delay_info and functions use it
  delayacct: fix incomplete disable operation when switch enable to disable
  delayacct: support swapin delay accounting for swapping without blkio
  panic: remove oops_id
  panic: use error_report_end tracepoint on warnings
  fs/adfs: remove unneeded variable make code cleaner
  FAT: use io_schedule_timeout() instead of congestion_wait()
  hfsplus: use struct_group_attr() for memcpy() region
  nilfs2: remove redundant pointer sbufs
  fs/binfmt_elf: use PT_LOAD p_align values for static PIE
  const_structs.checkpatch: add frequently used ops structs
  ...
2022-01-20 10:41:01 +02:00
Isabella Basso
fd0a146240 hash.h: remove unused define directive
Patch series "test_hash.c: refactor into KUnit", v3.

We refactored the lib/test_hash.c file into KUnit as part of the student
group LKCAMP [1] introductory hackathon for kernel development.

This test was pointed to our group by Daniel Latypov [2], so its full
conversion into a pure KUnit test was our goal in this patch series, but
we ran into many problems relating to it not being split as unit tests,
which complicated matters a bit, as the reasoning behind the original
tests is quite cryptic for those unfamiliar with hash implementations.

Some interesting developments we'd like to highlight are:

 - In patch 1/5 we noticed that there was an unused define directive
   that could be removed.

 - In patch 4/5 we noticed how stringhash and hash tests are all under
   the lib/test_hash.c file, which might cause some confusion, and we
   also broke those kernel config entries up.

Overall KUnit developments have been made in the other patches in this
series:

In patches 2/5, 3/5 and 5/5 we refactored the lib/test_hash.c file so as
to make it more compatible with the KUnit style, whilst preserving the
original idea of the maintainer who designed it (i.e.  George Spelvin),
which might be undesirable for unit tests, but we assume it is enough
for a first patch.

This patch (of 5):

Currently, there exist hash_32() and __hash_32() functions, which were
introduced in a patch [1] targeting architecture specific optimizations.
These functions can be overridden on a per-architecture basis to achieve
such optimizations.  They must set their corresponding define directive
(HAVE_ARCH_HASH_32 and HAVE_ARCH__HASH_32, respectively) so that header
files can deal with these overrides properly.

As the supported 32-bit architectures that have their own hash function
implementation (i.e.  m68k, Microblaze, H8/300, pa-risc) have only been
making use of the (more general) __hash_32() function (which only lacks
a right shift operation when compared to the hash_32() function), remove
the define directive corresponding to the arch-specific hash_32()
implementation.

[1] https://lore.kernel.org/lkml/20160525073311.5600.qmail@ns.sciencehorizons.net/

[akpm@linux-foundation.org: hash_32_generic() becomes hash_32()]

Link: https://lkml.kernel.org/r/20211208183711.390454-1-isabbasso@riseup.net
Link: https://lkml.kernel.org/r/20211208183711.390454-2-isabbasso@riseup.net
Reviewed-by: David Gow <davidgow@google.com>
Tested-by: David Gow <davidgow@google.com>
Co-developed-by: Augusto Durães Camargo <augusto.duraes33@gmail.com>
Signed-off-by: Augusto Durães Camargo <augusto.duraes33@gmail.com>
Co-developed-by: Enzo Ferreira <ferreiraenzoa@gmail.com>
Signed-off-by: Enzo Ferreira <ferreiraenzoa@gmail.com>
Signed-off-by: Isabella Basso <isabbasso@riseup.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
Cc: kernel test robot <lkp@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-20 08:52:54 +02:00
Jason Gunthorpe
4922f09209 Linux 5.16-rc5
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmG2fU0eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGC7EH/3R7Rt+OD8Wn8Ss3
 w8V+dBxVwa2u2oMTyUHPxaeOXZ7bi38XlUdLFPOK/76bGwO0a5TmYZqsWdRbGyT0
 HfcYjHsQ0lbJXk/nh2oM47oJxJXVpThIHXJEk0FZ0Y5t+DYjIYlNHzqZymUyhLem
 St74zgWcyT+MXuqY34vB827FJDUnOxhhhi85tObeunaSPAomy9aiYidSC1ARREnz
 iz2VUntP/QnRnKVvL2nUZNzcz1xL5vfCRSKsRGRSv3qW1Y/1M71ylt6JVmSftWq+
 VmMdFxFhdrb1OK/1ct/930Un/UP2NG9EJsWxote2XYlnVSZHzDqH7lUhbqgdCcLz
 1m2tVNY=
 =7wRd
 -----END PGP SIGNATURE-----

Merge tag 'v5.16-rc5' into rdma.git for-next

Required due to dependencies in following patches.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-12-14 20:18:48 -04:00
Pavel Skripkin
84b01721e8 RDMA: Fix use-after-free in rxe_queue_cleanup
On error handling path in rxe_qp_from_init() qp->sq.queue is freed and
then rxe_create_qp() will drop last reference to this object. qp clean up
function will try to free this queue one time and it causes UAF bug.

Fix it by zeroing queue pointer after freeing queue in rxe_qp_from_init().

Fixes: 514aee660d ("RDMA: Globally allocate and release QP memory")
Link: https://lore.kernel.org/r/20211121202239.3129-1-paskripkin@gmail.com
Reported-by: syzbot+aab53008a5adf26abe91@syzkaller.appspotmail.com
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Reviewed-by: Zhu Yanjun <zyjzyj2000@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-11-25 13:15:59 -04:00
Bob Pearson
02827b6708 RDMA/rxe: Cleanup rxe_pool_entry
Currently three different names are used to describe rxe pool elements.
They are referred to as entries, elems or pelems. This patch chooses one
'elem' and changes the other ones.

Link: https://lore.kernel.org/r/20211103050241.61293-3-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-11-19 13:29:14 -04:00
Xiao Yang
1cf2ce8272 RDMA/rxe: Remove the is_user members of struct rxe_sq/rxe_rq/rxe_srq
The is_user members of struct rxe_sq/rxe_rq/rxe_srq are unsed since
commit ae6e843fe0 ("RDMA/rxe: Add memory barriers to kernel queues").
In this case, it is fine to remove them directly.

Link: https://lore.kernel.org/r/20210930094813.226888-2-yangx.jy@fujitsu.com
Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-06 19:45:29 -03:00
Bob Pearson
ae6e843fe0 RDMA/rxe: Add memory barriers to kernel queues
Earlier patches added memory barriers to protect user space to kernel
space communications. The user space queues were previously shown to have
occasional memory synchonization errors which were removed by adding
smp_load_acquire, smp_store_release barriers.  This patch extends that to
the case where queues are used between kernel space threads.

This patch also extends the queue types to include kernel ULP queues which
access the other end of the queues in kernel verbs calls like poll_cq and
post_send/recv.

Link: https://lore.kernel.org/r/20210914164206.19768-2-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-24 10:14:59 -03:00
Zhu Yanjun
ad17bbef3d RDMA/rxe: remove the unnecessary variable
In the struct rxe_qp, the variable send_pkts is never used.  So remove it.

Link: https://lore.kernel.org/r/20210915075128.482919-1-yanjun.zhu@intel.com
Signed-off-by: Zhu Yanjun <zyjzyj2000@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-14 16:22:48 -03:00
Jason Gunthorpe
fdcebbc2ac Linux 5.13-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmDPuyMeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGvxgH/RKvSuRPwkJ2Jcp9
 VLi5kCbqtJlYLq6tB6peSJ8otKgxkcRwC0pIY4LlYIAWYboktLQ5RKp/9nB2h2FN
 aMZUMu6AI/lVJyFMI5MnKnJIUiUq+WXR3lSSlw68vwFLFdzqUZFNq+bqeiVvnIy1
 yqA6naj24Tu/RbYffQoPvdSJcU2SLXRMxwD8HRGiU2d51RaFsOvsZvF+P5TVcsEV
 ZmttJeER21CaI/A809eqaFmyGrUOcZZK9roZEbMwanTZOMw18biEsLu/UH4kBX01
 JC4+RlGxcWjQ5YNZgChsgoOK/CHzc6ITztTntdeDWAvwZjQFzV7pCy4/3BWne3O+
 5178yHM=
 =o8cN
 -----END PGP SIGNATURE-----

Merge tag 'v5.13-rc7' into rdma.git for-next

Linux 5.13-rc7

Needed for dependencies in following patches. Merge conflict in rxe_cmop.c
resolved by compining both patches.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-22 14:43:51 -03:00
Bob Pearson
15ae1375ea RDMA/rxe: Fix qp reference counting for atomic ops
Currently the rdma_rxe driver attempts to protect atomic responder
resources by taking a reference to the qp which is only freed when the
resource is recycled for a new read or atomic operation. This means that
in normal circumstances there is almost always an extra qp reference once
an atomic operation has been executed which prevents cleaning up the qp
and associated pd and cqs when the qp is destroyed.

This patch removes the call to rxe_add_ref() in send_atomic_ack() and the
call to rxe_drop_ref() in free_rd_atomic_resource(). If the qp is
destroyed while a peer is retrying an atomic op it will cause the
operation to fail which is acceptable.

Link: https://lore.kernel.org/r/20210604230558.4812-1-rpearsonhpe@gmail.com
Reported-by: Zhu Yanjun <zyjzyj2000@gmail.com>
Fixes: 86af617641 ("IB/rxe: remove unnecessary skb_clone")
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16 20:20:23 -03:00
Bob Pearson
5bcf5a59c4 RDMA/rxe: Protext kernel index from user space
In order to prevent user space from modifying the index that belongs to
the kernel for shared queues let the kernel use a local copy of the index
and copy any new values of that index to the shared rxe_queue_bus struct.

This adds more switch statements which decreases the performance of the
queue API. Move the type into the parameter list for these functions so
that the compiler can optimize out the switch statements when the explicit
type is known. Modify all the calls in the driver on performance paths to
pass in the explicit queue type.

Link: https://lore.kernel.org/r/20210527194748.662636-4-rpearsonhpe@gmail.com
Link: https://lore.kernel.org/linux-rdma/20210526165239.GP1002214@@nvidia.com/
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-03 15:53:01 -03:00
Bob Pearson
59daff49f2 RDMA/rxe: Add a type flag to rxe_queue structs
To create optimal code only want to use smp_load_acquire() and
smp_store_release() for user indices in rxe_queue APIs since kernel
indices are protected by locks which also act as memory barriers. By
adding a type to the queues we can determine which indices need to be
protected.

Link: https://lore.kernel.org/r/20210527194748.662636-2-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-03 15:53:01 -03:00
Leon Romanovsky
67f29896fd RDMA/rxe: Clear all QP fields if creation failed
rxe_qp_do_cleanup() relies on valid pointer values in QP for the properly
created ones, but in case rxe_qp_from_init() failed it was filled with
garbage and caused tot the following error.

  refcount_t: underflow; use-after-free.
  WARNING: CPU: 1 PID: 12560 at lib/refcount.c:28 refcount_warn_saturate+0x1d1/0x1e0 lib/refcount.c:28
  Modules linked in:
  CPU: 1 PID: 12560 Comm: syz-executor.4 Not tainted 5.12.0-syzkaller #0
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
  RIP: 0010:refcount_warn_saturate+0x1d1/0x1e0 lib/refcount.c:28
  Code: e9 db fe ff ff 48 89 df e8 2c c2 ea fd e9 8a fe ff ff e8 72 6a a7 fd 48 c7 c7 e0 b2 c1 89 c6 05 dc 3a e6 09 01 e8 ee 74 fb 04 <0f> 0b e9 af fe ff ff 0f 1f 84 00 00 00 00 00 41 56 41 55 41 54 55
  RSP: 0018:ffffc900097ceba8 EFLAGS: 00010286
  RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
  RDX: 0000000000040000 RSI: ffffffff815bb075 RDI: fffff520012f9d67
  RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000
  R10: ffffffff815b4eae R11: 0000000000000000 R12: ffff8880322a4800
  R13: ffff8880322a4940 R14: ffff888033044e00 R15: 0000000000000000
  FS:  00007f6eb2be3700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007fdbe5d41000 CR3: 000000001d181000 CR4: 00000000001506e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   __refcount_sub_and_test include/linux/refcount.h:283 [inline]
   __refcount_dec_and_test include/linux/refcount.h:315 [inline]
   refcount_dec_and_test include/linux/refcount.h:333 [inline]
   kref_put include/linux/kref.h:64 [inline]
   rxe_qp_do_cleanup+0x96f/0xaf0 drivers/infiniband/sw/rxe/rxe_qp.c:805
   execute_in_process_context+0x37/0x150 kernel/workqueue.c:3327
   rxe_elem_release+0x9f/0x180 drivers/infiniband/sw/rxe/rxe_pool.c:391
   kref_put include/linux/kref.h:65 [inline]
   rxe_create_qp+0x2cd/0x310 drivers/infiniband/sw/rxe/rxe_verbs.c:425
   _ib_create_qp drivers/infiniband/core/core_priv.h:331 [inline]
   ib_create_named_qp+0x2ad/0x1370 drivers/infiniband/core/verbs.c:1231
   ib_create_qp include/rdma/ib_verbs.h:3644 [inline]
   create_mad_qp+0x177/0x2d0 drivers/infiniband/core/mad.c:2920
   ib_mad_port_open drivers/infiniband/core/mad.c:3001 [inline]
   ib_mad_init_device+0xd6f/0x1400 drivers/infiniband/core/mad.c:3092
   add_client_context+0x405/0x5e0 drivers/infiniband/core/device.c:717
   enable_device_and_get+0x1cd/0x3b0 drivers/infiniband/core/device.c:1331
   ib_register_device drivers/infiniband/core/device.c:1413 [inline]
   ib_register_device+0x7c7/0xa50 drivers/infiniband/core/device.c:1365
   rxe_register_device+0x3d5/0x4a0 drivers/infiniband/sw/rxe/rxe_verbs.c:1147
   rxe_add+0x12fe/0x16d0 drivers/infiniband/sw/rxe/rxe.c:247
   rxe_net_add+0x8c/0xe0 drivers/infiniband/sw/rxe/rxe_net.c:503
   rxe_newlink drivers/infiniband/sw/rxe/rxe.c:269 [inline]
   rxe_newlink+0xb7/0xe0 drivers/infiniband/sw/rxe/rxe.c:250
   nldev_newlink+0x30e/0x550 drivers/infiniband/core/nldev.c:1555
   rdma_nl_rcv_msg+0x36d/0x690 drivers/infiniband/core/netlink.c:195
   rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline]
   rdma_nl_rcv+0x2ee/0x430 drivers/infiniband/core/netlink.c:259
   netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline]
   netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338
   netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927
   sock_sendmsg_nosec net/socket.c:654 [inline]
   sock_sendmsg+0xcf/0x120 net/socket.c:674
   ____sys_sendmsg+0x6e8/0x810 net/socket.c:2350
   ___sys_sendmsg+0xf3/0x170 net/socket.c:2404
   __sys_sendmsg+0xe5/0x1b0 net/socket.c:2433
   do_syscall_64+0x3a/0xb0 arch/x86/entry/common.c:47
   entry_SYSCALL_64_after_hwframe+0x44/0xae

Fixes: 8700e3e7c4 ("Soft RoCE driver")
Link: https://lore.kernel.org/r/7bf8d548764d406dbbbaf4b574960ebfd5af8387.1620717918.git.leonro@nvidia.com
Reported-by: syzbot+36a7f280de4e11c6f04e@syzkaller.appspotmail.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Zhu Yanjun <zyjzyj2000@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-11 16:58:14 -03:00
Bob Pearson
dc78074a80 RDMA/rxe: Fix minor coding style issues
checkpatch -f found 3 warnings in RDMA/rxe

1. a missing space following switch
2. return followed by else
3. use of strlcpy() instead of strscpy().

This patch fixes each of these. In

		...
	} elseif (...) {
		...
		return 0;
	} else
		...

The middle block can be safely moved since it is completely independent of
the other code.

Link: https://lore.kernel.org/r/20210205230525.49068-1-rpearson@hpe.com
Signed-off-by: Bob Pearson <rpearson@hpe.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08 20:42:57 -04:00
Xiao Yang
bad07664a5 RDMA/rxe: Add check for supported QP types
Current rdma_rxe only supports five QP types, attempting to create any
others should return an error - the type check was missed.

Link: https://lore.kernel.org/r/20201216071755.149449-2-yangx.jy@cn.fujitsu.com
Signed-off-by: Xiao Yang <yangx.jy@cn.fujitsu.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-12 16:11:35 -04:00
Bob Pearson
63fa15dbd4 RDMA/rxe: Add SPDX hdrs to rxe source files
Add SPDX headers to all rxe .c and .h files.

Link: https://lore.kernel.org/r/20200827145439.2273-1-rpearson@hpe.com
Signed-off-by: Bob Pearson <rpearson@hpe.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-31 12:20:02 -03:00
Bob Pearson
5f9e2822d1 RDMA/rxe: Fix style warnings
Fixed several minor checkpatch warnings in existing rxe source.

Link: https://lore.kernel.org/r/20200820224638.3212-3-rpearson@hpe.com
Signed-off-by: Bob Pearson <rpearson@hpe.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-27 09:51:08 -03:00
Bart Van Assche
fb3063d319 RDMA/rxe: Fix configuration of atomic queue pair attributes
From the comment above the definition of the roundup_pow_of_two() macro:

     The result is undefined when n == 0.

Hence only pass positive values to roundup_pow_of_two(). This patch fixes
the following UBSAN complaint:

  UBSAN: Undefined behaviour in ./include/linux/log2.h:57:13
  shift exponent 64 is too large for 64-bit type 'long unsigned int'
  Call Trace:
   dump_stack+0xa5/0xe6
   ubsan_epilogue+0x9/0x26
   __ubsan_handle_shift_out_of_bounds.cold+0x4c/0xf9
   rxe_qp_from_attr.cold+0x37/0x5d [rdma_rxe]
   rxe_modify_qp+0x59/0x70 [rdma_rxe]
   _ib_modify_qp+0x5aa/0x7c0 [ib_core]
   ib_modify_qp+0x3b/0x50 [ib_core]
   cma_modify_qp_rtr+0x234/0x260 [rdma_cm]
   __rdma_accept+0x1a7/0x650 [rdma_cm]
   nvmet_rdma_cm_handler+0x1286/0x14cd [nvmet_rdma]
   cma_cm_event_handler+0x6b/0x330 [rdma_cm]
   cma_ib_req_handler+0xe60/0x22d0 [rdma_cm]
   cm_process_work+0x30/0x140 [ib_cm]
   cm_req_handler+0x11f4/0x1cd0 [ib_cm]
   cm_work_handler+0xb8/0x344e [ib_cm]
   process_one_work+0x569/0xb60
   worker_thread+0x7a/0x5d0
   kthread+0x1e6/0x210
   ret_from_fork+0x24/0x30

Link: https://lore.kernel.org/r/20200217205714.26937-1-bvanassche@acm.org
Fixes: 8700e3e7c4 ("Soft RoCE driver")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-02-19 16:56:57 -04:00
Rao Shoaib
363824f92a RDMA/rxe: Compute the maximum sges and inline size based on the WQE size
The SGE buffer size and max_inline data should be derived from the size of
the WQE. Each value individually sets the WQE size, so compute the actual
sizes based on the actual WQE size and configure the QP with the maximums.

Also fix the missing return of the actual maximum capability to the caller.

Link: https://lore.kernel.org/r/1578962480-17814-3-git-send-email-rao.shoaib@oracle.com
Signed-off-by: Rao Shoaib <rao.shoaib@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-15 16:36:41 -04:00
Shamir Rabinovitch
ff23dfa134 IB: Pass only ib_udata in function prototypes
Now when ib_udata is passed to all the driver's object create/destroy APIs
the ib_udata will carry the ib_ucontext for every user command. There is
no need to also pass the ib_ucontext via the functions prototypes.

Make ib_udata the only argument psssed.

Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-01 15:00:47 -03:00
Shamir Rabinovitch
8994445054 IB/{hw,sw}: Remove 'uobject->context' dependency in object creation APIs
Now when we have the udata passed to all the ib_xxx object creation APIs
and the additional macro 'rdma_udata_to_drv_context' to get the
ib_ucontext from ib_udata stored in uverbs_attr_bundle, we can finally
start to remove the dependency of the drivers in the
ib_xxx->uobject->context.

Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-15 15:38:38 -07:00
Kamal Heib
fa40718804 RDMA/rxe: Move rxe_init_av() to rxe_av.c
Move the function rxe_init_av() to rxe_av.c file and use it instead of
calling rxe_av_from_attr() and rxe_av_fill_ip_info(), also remove the
unused rxe_dev parameter from rxe_init_av().

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04 15:57:49 -07:00
Shamir Rabinovitch
e00b64f7c5 RDMA: Cleanup undesired pd->uobject usage
Drivers should be using udata to determine if a method is invoked from
user space or kernel space. A pd does not necessarily say a different
objects is kernel or user.

Transforming the tests to use udata eliminates a large number of uobject
references from the drivers.

Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-18 19:15:48 -07:00
Yuval Shaia
e7521d82b3 IB/rxe: Utilize generic function to validate port number
Utilize rdma_is_port_valid to validate the given port.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-06 21:18:00 -07:00
Kamal Heib
d31131bba5 RDMA: Remove unused parameter from ib_modify_qp_is_ok()
The ll parameter is not used in ib_modify_qp_is_ok(), so remove it.

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-03 16:05:46 -06:00
Zhu Yanjun
721ad7e643 IB/rxe: replace kvfree with vfree
The buf is allocated by vmalloc_user in the function rxe_queue_init.
So it is better to free it by vfree.

Fixes: 8700e3e7c4 ("Soft RoCE driver")
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-30 19:20:38 -06:00
Vijay Immanuel
d3c04a3a68 IB/rxe: vary the source udp port for receive scaling
Select the source udp port number for a QP based on the
source QPN. This provides a better spread of traffic
across NIC RX queues for RC/UC QPs.

Signed-off-by: Vijay Immanuel <vijayi@attalasystems.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-08-30 16:31:50 -04:00
Steve Wise
33023fb85a IB/core: add max_send_sge and max_recv_sge attributes
This patch replaces the ib_device_attr.max_sge with max_send_sge and
max_recv_sge. It allows ulps to take advantage of devices that have very
different send and recv sge depths.  For example cxgb4 has a max_recv_sge
of 4, yet a max_send_sge of 16.  Splitting out these attributes allows
much more efficient use of the SQ for cxgb4 with ulps that use the RDMA_RW
API. Consider a large RDMA WRITE that has 16 scattergather entries.
With max_sge of 4, the ulp would send 4 WRITE WRs, but with max_sge of
16, it can be done with 1 WRITE WR.

Acked-by: Sagi Grimberg <sagi@grimberg.me>
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-18 13:17:28 -06:00
Parav Pandit
47ec386662 RDMA: Convert drivers to use sgid_attr instead of sgid_index
The core code now ensures that all driver callbacks that receive an
rdma_ah_attrs will have a sgid_attr's pointer if there is a GRH present.

Drivers can use this pointer instead of calling a query function with
sgid_index. This simplifies the drivers and also avoids races where a
gid_index lookup may return different data if it is changed.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-06-18 11:11:26 -06:00
Jason Gunthorpe
0c43ab371b RDMA/rxe: Use structs to describe the uABI instead of opencoding
Open coding pointer math is not acceptable for describing the uABI in
RDMA. Provide structs for all the cases.

The udata is casted to the struct as close to the verbs entry point
as possible for maximum clarity. Function signatures and so forth
are revised to allow for this.

Tested-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-15 15:58:02 -06:00
Hernán Gonzalez
c33bab622d IB/rxe: Remove unused variable (char *rxe_qp_state_name[])
Note: This is compile only tested as I have no access to the hw.  This
variable was not used anywhere in the code. Removing it saves 24 bytes.

add/remove: 0/1 grow/shrink: 0/0 up/down: 0/-24 (-24)
Function                                     old     new   delta
rxe_qp_state_name                             24       -     -24
Total: Before=3348732, After=3348708, chg -0.00%

Signed-off-by: Hernán Gonzalez <hernan@vanguardiasur.com.ar>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-02-28 13:57:40 -07:00
Zhu Yanjun
316663c6bf IB/rxe: remove redudant parameter in rxe_av_fill_ip_info
In the function rxe_av_fill_ip_info, the parameter rxe is not used.
So it is removed.

CC: Srinivas Eeda <srinivas.eeda@oracle.com>
CC: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-01-31 16:32:10 -05:00
Zhu Yanjun
9c96f3d4dd IB/rxe: remove unnecessary parameter in rxe_av_to_attr
In the function rxe_av_to_attr, the parameter rxe is
not used. So it is removed.

CC: Srinivas Eeda <srinivas.eeda@oracle.com>
CC: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-01-31 16:32:10 -05:00
Zhu Yanjun
1a241db19c IB/rxe: remove redudant parameter in function
In the function rxe_av_from_attr, the parameter rxe
is not used. So it is removed.

CC: Srinivas Eeda <srinivas.eeda@oracle.com>
CC: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-01-31 16:32:10 -05:00
Bart Van Assche
bb3ffb7ad4 RDMA/rxe: Fix rxe_qp_cleanup()
rxe_qp_cleanup() can sleep so it must be run in thread context and
not in atomic context. This patch avoids that the following bug is
triggered:

Kernel BUG at 00000000560033f3 [verbose debug info unavailable]
BUG: sleeping function called from invalid context at net/core/sock.c:2761
in_atomic(): 1, irqs_disabled(): 0, pid: 7, name: ksoftirqd/0
INFO: lockdep is turned off.
Preemption disabled at:
[<00000000b6e69628>] __do_softirq+0x4e/0x540
CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 4.15.0-rc7-dbg+ #4
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
Call Trace:
 dump_stack+0x85/0xbf
 ___might_sleep+0x177/0x260
 lock_sock_nested+0x1d/0x90
 inet_shutdown+0x2e/0xd0
 rxe_qp_cleanup+0x107/0x140 [rdma_rxe]
 rxe_elem_release+0x18/0x80 [rdma_rxe]
 rxe_requester+0x1cf/0x11b0 [rdma_rxe]
 rxe_do_task+0x78/0xf0 [rdma_rxe]
 tasklet_action+0x99/0x270
 __do_softirq+0xc0/0x540
 run_ksoftirqd+0x1c/0x70
 smpboot_thread_fn+0x1be/0x270
 kthread+0x117/0x130
 ret_from_fork+0x24/0x30

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Moni Shoua <monis@mellanox.com>
Cc: stable@vger.kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-01-18 14:49:20 -05:00
Kees Cook
3bfbea7473 IB/rxe: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Moni Shoua <monis@mellanox.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-25 15:24:49 -04:00
Andrew Boyer
825a51a4af IB/rxe: Fix up rxe_qp_cleanup()
Replace sk_dst_get()/dst_release() in rxe_qp_cleanup() with sk_dst_reset().
sk_dst_get() takes a new reference on dst, so the dst_release() doesn't
actually release the original reference, which was the design intent.

Fixes: 4ed6ad1eb3 ("IB/rxe: Cache dst in QP instead of getting it...")
Signed-off-by: Andrew Boyer <andrew.boyer@dell.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-28 19:12:34 -04:00
Dasaratharaman Chandramouli
d8966fcd4c IB/core: Use rdma_ah_attr accessor functions
Modify core and driver components to use accessor functions
introduced to access individual fields of rdma_ah_attr

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01 14:32:43 -04:00
yonatanc
4ed6ad1eb3 IB/rxe: Cache dst in QP instead of getting it for each send
In RC QP there is no need to resolve the outgoing interface
for each packet, as this does not change during QP life cycle.

Instead cache the interface on the socket and use that one.
This improves performance by 12% by sparing redundant
calls to rxe_find_route.

ib_send_bw -d rxe0  -x 1 -n 9000 -e  -s $((1024 * 1024 )) -l 100

----------------------------------------------------------------------------------------
|        | bytes   | iterations | BW peak[MB/sec] | BW average[MB/sec] | MsgRate[Mpps] |
----------------------------------------------------------------------------------------
| before | 1048576 | 9000       | inf             | 551.21             | 0.000551      |
| after  | 1048576 | 9000       | inf             | 615.54             | 0.000616      |
----------------------------------------------------------------------------------------

Fixes: 8700e3e7c4 ("Soft RoCE driver")
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21 10:45:02 -04:00
Parav Pandit
99fc12f60e IB/rxe: Avoid accessing timers for non RC QPs
This patch avoids RNR NAK timer and retransmit timer initialization and
cleanup for non RC QPs (such as UD QP, GSI QP).

Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21 10:43:28 -04:00
Wei Yongjun
8d8a473380 IB/rxe: use setup_timer to simplify the code
Use setup_timer function instead of initializing timer with the function
and data fields.

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19 09:18:31 -05:00
Doug Ledford
6dd7abae71 Merge branch 'k.o/for-4.10-rc' into HEAD 2017-02-19 09:18:21 -05:00
Yonatan Cohen
2d4b21e0a2 IB/rxe: Prevent from completer to operate on non valid QP
On UD QP completer tasklet is scheduled for each packet sent.

If it is followed by a destroy_qp(), the kernel panic will
happen as the completer tries to operate on a destroyed QP.

Fixes: 8700e3e7c4 ("Soft RoCE driver")
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-24 16:17:32 -05:00
Bart Van Assche
32404fb764 IB/rxe: Let the compiler check the type of the cleanup functions
Change the argument type of these functions from void * into
struct rxe_pool_entry *.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Andrew Boyer <andrew.boyer@dell.com>
Cc: Moni Shoua <monis@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-10 16:52:47 -05:00
Linus Torvalds
296915912d First round of -rc fixes for 4.10 kernel
- Series of qedr fixes
 - Series of rxe fixes
 - One isolated i40iw fix
 - One isolated cma fix
 - One isolated cxgb4 fix
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJYXAGvAAoJELgmozMOVy/dDukQAMMNarWp0U8KfNYRU5tyCBwd
 aIQC1gFT6GUCFys40Z6L84m1D3NpGR+vzVv3grVBeuge73b79zAOHXvVDwJCA+Jl
 QQLG3vZ13C3158sLDiK8zL+4Ob5OfOQ5nQ2spvDfJWpye9SD+pWFcrpqvK02ANRN
 kFHILk1gROBTNi46yBR5hjWOkw7Bua6XLsPxh6xoaDZ43NL0r0xgm43FTnj/19x3
 0zpZYYKP+3C6U7678rqaog9zfXHvadghW5/WBJ/VgfKqEmH89ESx4J2MvbB8DxFD
 1tWAOpr5TNY5jnh8mtUsceDjCzQivc/RWqAu05BspEwcavjSLFyRYr1epR0/4oAd
 PqLSmfORmhpJ8+5Kmn+chtXo3TT4SYGHIzSUbgbEV/ClwX/7UW+w8mfQZ3buUBq/
 cQp/oRnJcsrQIEDFO3AH7P+6Sxy6t3zbSl5oKBUOI1u4RFmC7YBPqo9fQu2Z2mGk
 3+AWQaPr7qgEcFzXBgLzvd4LhTYKsvmiNwrcXi9KjjwQjNEVg15qqF2YtmxEUgi9
 kh3IOcGan3iSblhV/WLrxcOjlPQrPpBOVnTPhUskFtlsrD+032OxeOBpVoU3nCUt
 MjTYWoNTYdw4wHz0w373o0uR4+4nl4a5OmO4Fh6Drmg5hm4Bl9BWy0Kziu93Z1Ay
 Z2utZVWLWhBzn8yJujUz
 =NW9g
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma

Pull rdma fixes from Doug Ledford:
 "First round of -rc fixes for 4.10 kernel:

   - a series of qedr fixes
   - a series of rxe fixes
   - one i40iw fix
   - one cma fix
   - one cxgb4 fix"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
  IB/rxe: Don't check for null ptr in send()
  IB/rxe: Drop future atomic/read packets rather than retrying
  IB/rxe: Use BTH_PSN_MASK when ACKing duplicate sends
  qedr: Always notify the verb consumer of flushed CQEs
  qedr: clear the vendor error field in the work completion
  qedr: post_send/recv according to QP state
  qedr: ignore inline flag in read verbs
  qedr: modify QP state to error when destroying it
  qedr: return correct value on modify qp
  qedr: return error if destroy CQ failed
  qedr: configure the number of CQEs on CQ creation
  i40iw: Set 128B as the only supported RQ WQE size
  IB/cma: Fix a race condition in iboe_addr_get_sgid()
  IB/rxe: Fix a memory leak in rxe_qp_cleanup()
  iw_cxgb4: set correct FetchBurstMax for QPs
2016-12-23 10:38:48 -08:00
Bart Van Assche
e259934d4d IB/rxe: Fix a memory leak in rxe_qp_cleanup()
A socket is associated with every QP by the rxe driver but sock_release()
is never called. Add a call to sock_release() in rxe_qp_cleanup().

Fixes: commit 8700e3e7c48A5 ("Add Soft RoCE driver")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Moni Shoua <monis@mellanox.com>
Cc: Kamal Heib <kamalh@mellanox.com>
Cc: Amir Vadai <amirv@mellanox.com>
Cc: Haggai Eran <haggaie@mellanox.com>
Cc: <stable@vger.kernel.org>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-18 13:35:19 -05:00
Yonatan Cohen
6d931308f5 IB/rxe: Update qp state for user query
The method rxe_qp_error() transitions QP to error state
and make sure the QP is drained. It did not though update
the QP state for user's query.

This patch fixes this.

Fixes: 8700e3e7c4 ("Soft RoCE driver")
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-16 20:03:44 -05:00
Yonatan Cohen
aa75b07b47 IB/rxe: Clear queue buffer when modifying QP to reset
RXE resets the send-q only once in rxe_qp_init_req() when
QP is created, but when the QP is reused after QP reset, the send-q
holds previous garbage data.

This garbage data wrongly fails CQEs that otherwise
should have completed successfully.

Fixes: 8700e3e7c4 ("Soft RoCE driver")
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-16 20:03:44 -05:00
Parav Pandit
e404f945a6 IB/rxe: improved debug prints & code cleanup
1. Debugging qp state transitions and qp errors in loopback and
multiple QP tests is difficult without qp numbers in debug logs.
This patch adds qp number to important debug logs.

2. Instead of having rxe: prefix in few logs and not having in
few logs, using uniform module name prefix using pr_fmt macro.

3. Code cleanup for various warnings reported by checkpatch for
incomplete unsigned data type, line over 80 characters, return
statements.

Signed-off-by: Parav Pandit <pandit.parav@gmail.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-06 13:50:04 -04:00
Parav Pandit
b6bbee0d24 IB/rxe: Properly honor max IRD value for rd/atomic.
This patch honoris the max incoming read request count instead of
outgoing read req count
(a) during modify qp by allocating response queue metadata
(b) during incoming read request processing

Signed-off-by: Parav Pandit <pandit.parav@gmail.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-10-06 13:50:04 -04:00