mirror of synced 2025-03-06 20:59:54 +01:00
Commit graph

13104 commits

Author SHA1 Message Date
Christian König
f30bceab16 RDMA: use dma_resv_wait() instead of extracting the fence
Use dma_resv_wait() instead of extracting the exclusive fence and
waiting on it manually.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Jason Gunthorpe <jgg@nvidia.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Maor Gottlieb <maorg@nvidia.com>
Cc: Gal Pressman <galpress@amazon.com>
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Link: https://patchwork.freedesktop.org/patch/msgid/20220321135856.1331-4-christian.koenig@amd.com
2022-03-24 10:09:20 +01:00
Linus Torvalds
9030fb0bb9 Folio changes for 5.18

Merge tag 'folio-5.18c' of git://git.infradead.org/users/willy/pagecache

Pull folio updates from Matthew Wilcox:

 - Rewrite how munlock works to massively reduce the contention on
   i_mmap_rwsem (Hugh Dickins):

     https://lore.kernel.org/linux-mm/8e4356d-9622-a7f0-b2c-f116b5f2efea@google.com/

 - Sort out the page refcount mess for ZONE_DEVICE pages (Christoph
   Hellwig):

     https://lore.kernel.org/linux-mm/20220210072828.2930359-1-hch@lst.de/

 - Convert GUP to use folios and make pincount available for order-1
   pages. (Matthew Wilcox)

 - Convert a few more truncation functions to use folios (Matthew
   Wilcox)

 - Convert page_vma_mapped_walk to use PFNs instead of pages (Matthew
   Wilcox)

 - Convert rmap_walk to use folios (Matthew Wilcox)

 - Convert most of shrink_page_list() to use a folio (Matthew Wilcox)

 - Add support for creating large folios in readahead (Matthew Wilcox)

* tag 'folio-5.18c' of git://git.infradead.org/users/willy/pagecache: (114 commits)
  mm/damon: minor cleanup for damon_pa_young
  selftests/vm/transhuge-stress: Support file-backed PMD folios
  mm/filemap: Support VM_HUGEPAGE for file mappings
  mm/readahead: Switch to page_cache_ra_order
  mm/readahead: Align file mappings for non-DAX
  mm/readahead: Add large folio readahead
  mm: Support arbitrary THP sizes
  mm: Make large folios depend on THP
  mm: Fix READ_ONLY_THP warning
  mm/filemap: Allow large folios to be added to the page cache
  mm: Turn can_split_huge_page() into can_split_folio()
  mm/vmscan: Convert pageout() to take a folio
  mm/vmscan: Turn page_check_references() into folio_check_references()
  mm/vmscan: Account large folios correctly
  mm/vmscan: Optimise shrink_page_list for non-PMD-sized folios
  mm/vmscan: Free non-shmem folios without splitting them
  mm/rmap: Constify the rmap_walk_control argument
  mm/rmap: Convert rmap_walk() to take a folio
  mm: Turn page_anon_vma() into folio_anon_vma()
  mm/rmap: Turn page_lock_anon_vma_read() into folio_lock_anon_vma_read()
  ...
2022-03-22 17:03:12 -07:00
Dan Carpenter
87e0eacb17 RDMA/nldev: Prevent underflow in nldev_stat_set_counter_dynamic_doit()
This code checks "index" for an upper bound but it does not check for
negatives.  Change the type to unsigned to prevent underflows.

Fixes: 3c3c1f1416 ("RDMA/nldev: Allow optional-counter status configuration through RDMA netlink")
Link: https://lore.kernel.org/r/20220316083948.GC30941@kili
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-18 15:40:54 -03:00
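The fix in 87e0eacb17 works because an unsigned index cannot be negative: a negative input wraps to a very large value, so the existing upper-bound check rejects it. A minimal userspace sketch of that idea follows. The names NUM_COUNTERS, index_ok_signed() and index_ok_unsigned() are hypothetical and are not taken from nldev.c.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_COUNTERS 8u /* hypothetical table size, for illustration only */

/* Buggy shape: a signed index passes the upper-bound check when negative. */
static bool index_ok_signed(int index)
{
	return index < (int)NUM_COUNTERS;
}

/*
 * Fixed shape: with an unsigned type, a negative input wraps to a huge
 * value, so the single upper-bound check is enough to reject it.
 */
static bool index_ok_unsigned(uint32_t index)
{
	return index < NUM_COUNTERS;
}
```

Calling index_ok_signed(-1) accepts the bad index, while index_ok_unsigned((uint32_t)-1) rejects it.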
Max Gurtovoy
2e11a5e459 IB/iser: Fix error flow in case of registration failure
During READ/WRITE preparation, in case of failure in memory registration
using iser_reg_mem_fastreg we must unmap previously mapped iser task.

Link: https://lore.kernel.org/r/20220308145546.8372-5-mgurtovoy@nvidia.com
Reviewed-by: Sergey Gorenko <sergeygo@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Acked-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-18 14:37:49 -03:00
Max Gurtovoy
80303ee244 IB/iser: Generalize map/unmap dma tasks
Avoid code duplication and add the mapping/unmapping of the protection
buffers to the iser_dma_map_task_data/iser_dma_unmap_task_data functions.

Link: https://lore.kernel.org/r/20220308145546.8372-4-mgurtovoy@nvidia.com
Reviewed-by: Sergey Gorenko <sergeygo@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Acked-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-18 14:37:49 -03:00
Max Gurtovoy
ee4efeaea8 IB/iser: Use iser_fr_desc as registration context
After removing the FMR support in iSER, there is only one type of
registration context. Replace the void pointer with the explicit structure
for registration (struct iser_fr_desc).

Link: https://lore.kernel.org/r/20220308145546.8372-3-mgurtovoy@nvidia.com
Reviewed-by: Sergey Gorenko <sergeygo@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Acked-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-18 14:37:49 -03:00
Max Gurtovoy
7f68d7493f IB/iser: Remove iser_reg_data_sg helper function
Open coding it makes the code simpler and more readable.

Link: https://lore.kernel.org/r/20220308145546.8372-2-mgurtovoy@nvidia.com
Reviewed-by: Sergey Gorenko <sergeygo@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Acked-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-18 14:37:49 -03:00
Bob Pearson
3197706abd RDMA/rxe: Use standard names for ref counting
Rename rxe_add_ref() to rxe_get() and rxe_drop_ref() to rxe_put().
Significantly improves readability for new readers.

Link: https://lore.kernel.org/r/20220304000808.225811-10-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-16 10:34:42 -03:00
Bob Pearson
3225717f6d RDMA/rxe: Replace red-black trees by xarrays
Currently the rxe driver uses red-black trees to add indices to the rxe
object pools. Linux xarrays provide a better way to implement the same
functionality for indices. This patch replaces red-black trees by xarrays
for pool objects. Since xarrays already have a spinlock use that in place
of the pool rwlock. Make sure that all changes in the xarray (index) and
kref (ref count) occur atomically.

Link: https://lore.kernel.org/r/20220304000808.225811-9-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-16 10:34:42 -03:00
Bob Pearson
df34dc9e03 RDMA/rxe: Shorten pool names in rxe_pool.c
Replace pool names like "rxe-xx" with "xx". Just reduces clutter.

Link: https://lore.kernel.org/r/20220304000808.225811-8-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-15 20:49:57 -03:00
Bob Pearson
3ccffe8abf RDMA/rxe: Move max_elem into rxe_type_info
Move the maximum number of elements from a parameter in rxe_pool_init to a
member of the rxe_type_info array.

Link: https://lore.kernel.org/r/20220304000808.225811-7-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-15 20:49:57 -03:00
Bob Pearson
b4a47f6836 RDMA/rxe: Replace obj by elem in declaration
Fix a harmless typo replacing obj by elem in the cleanup fields.  This has
no effect but is confusing.

Link: https://lore.kernel.org/r/20220304000808.225811-6-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-15 20:49:57 -03:00
Bob Pearson
3c3e4d582b RDMA/rxe: Delete _locked() APIs for pool objects
Since caller managed locks for indexed objects are no longer used these
APIs are deleted.

Link: https://lore.kernel.org/r/20220304000808.225811-5-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-15 20:49:57 -03:00
Bob Pearson
c9f4c69583 RDMA/rxe: Reverse the sense of RXE_POOL_NO_ALLOC
There is only one remaining object type that allocates its own memory,
that is mr. So the sense of RXE_POOL_NO_ALLOC is changed to
RXE_POOL_ALLOC. Add checks to rxe_alloc() and rxe_add_to_pool() to make
sure the correct call is used for the setting of this flag.

Link: https://lore.kernel.org/r/20220304000808.225811-4-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-15 20:49:56 -03:00
Bob Pearson
8a1a0be894 RDMA/rxe: Replace mr by rkey in responder resources
Currently rxe saves a copy of MR in responder resources for RDMA reads.
Since the responder resources are never freed, just overwritten if more
are needed this MR may not have a reference freed until the QP is
destroyed. This patch uses the rkey instead of the MR and on subsequent
packets of a multipacket read reply message it looks up the MR from the
rkey for each packet. This makes it possible for a user to deregister an
MR or unbind a MW on the fly and get correct behaviour.

Link: https://lore.kernel.org/r/20220304000808.225811-3-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-15 20:49:56 -03:00
Bob Pearson
63221acb0c RDMA/rxe: Fix ref error in rxe_av.c
The commit referenced below can take a reference to the AH which is never
dropped. This only happens in the UD request path. This patch optionally
passes that AH back to the caller so that it can hold the reference while
the AV is being accessed and then drop it. Code to do this is added to
rxe_req.c. The AV is also passed to rxe_prepare in rxe_net.c as an
optimization.

Fixes: e2fe06c908 ("RDMA/rxe: Lookup kernel AH from ah index in UD WQEs")
Link: https://lore.kernel.org/r/20220304000808.225811-2-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-15 20:49:56 -03:00
Yixing Liu
70f9252158 RDMA/hns: Use the reserved loopback QPs to free MR before destroying MPT
Before destroying MPT, the reserved loopback QPs send loopback IOs (one
write operation per SL). Completing these loopback IOs represents that
there isn't any outstanding request in MPT, then it's safe to destroy MPT.

Link: https://lore.kernel.org/r/20220310042835.38634-1-liangwenpeng@huawei.com
Signed-off-by: Yixing Liu <liuyixing1@huawei.com>
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-15 20:19:00 -03:00
Mustafa Ismail
51cad28724 RDMA/irdma: Add support for address handle re-use
Address handles (AH) are a limited HW resource and some user applications
may create large numbers of identical AH's.  Avoid running out of AH's by
reusing existing identical ones.

Link: https://lore.kernel.org/r/20220228183650.290-1-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-15 16:22:55 -03:00
Julia Lawall
2c25e45267 RDMA/qib: Fix typos in comments
Various spelling mistakes in comments. Detected with the help of
Coccinelle.

Link: https://lore.kernel.org/r/20220314115354.144023-23-Julia.Lawall@inria.fr
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-14 20:56:02 -03:00
Yongzhi Liu
087f9c3f23 RDMA/mlx5: Fix memory leak in error flow for subscribe event routine
In case the second xa_insert() fails, the obj_event is not released.  Fix
the error unwind flow to free that memory to avoid a memory leak.

Fixes: 7597385371 ("IB/mlx5: Enable subscription for device events over DEVX")
Link: https://lore.kernel.org/r/1647018361-18266-1-git-send-email-lyz_cs@pku.edu.cn
Signed-off-by: Yongzhi Liu <lyz_cs@pku.edu.cn>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-14 20:41:10 -03:00
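The leak described in 087f9c3f23 is an error-unwind ordering problem: a failure after the second allocation has to release that allocation as well as the first. A rough userspace sketch of the goto-based unwind pattern follows. It is not the mlx5 code; subscribe_sketch(), tracked_alloc(), tracked_free() and live_allocs are hypothetical names, and the boolean parameter stands in for the second xa_insert() failing.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

static int live_allocs; /* counts outstanding allocations in this sketch */

static void *tracked_alloc(void)
{
	void *p = malloc(16);

	if (p)
		live_allocs++;
	return p;
}

static void tracked_free(void *p)
{
	if (p) {
		free(p);
		live_allocs--;
	}
}

/* second_insert_fails simulates the second insertion step failing. */
static int subscribe_sketch(bool second_insert_fails)
{
	void *event, *obj_event;
	int err;

	event = tracked_alloc();
	if (!event)
		return -ENOMEM;

	obj_event = tracked_alloc();
	if (!obj_event) {
		err = -ENOMEM;
		goto err_free_event;
	}

	if (second_insert_fails) {
		err = -EBUSY;
		goto err_free_obj_event; /* the label a leaky version would skip */
	}

	/* Success path: released here only to keep the sketch self-contained. */
	tracked_free(obj_event);
	tracked_free(event);
	return 0;

err_free_obj_event:
	tracked_free(obj_event);
err_free_event:
	tracked_free(event);
	return err;
}
```

After subscribe_sketch(true) returns an error, live_allocs is back to zero, which is the property the fix restores.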
Leon Romanovsky
7922d3de4d Revert "RDMA/core: Fix ib_qp_usecnt_dec() called when error"
This reverts commit 7c4a539ec3, which causes the following error in mlx4.

  Destroy of kernel CQ shouldn't fail
  WARNING: CPU: 4 PID: 18064 at include/rdma/ib_verbs.h:3936 mlx4_ib_dealloc_xrcd+0x12e/0x1b0 [mlx4_ib]
  Modules linked in: bonding ib_ipoib ip_gre ipip tunnel4 geneve rdma_ucm nf_tables ib_umad mlx4_en mlx4_ib ib_uverbs mlx4_core ip6_gre gre ip6_tunnel tunnel6 iptable_raw openvswitch nsh rpcrdma ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_cm ib_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter overlay fuse [last unloaded: mlx4_core]
  CPU: 4 PID: 18064 Comm: ibv_xsrq_pingpo Not tainted 5.17.0-rc7_master_62c6ecb #1
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  RIP: 0010:mlx4_ib_dealloc_xrcd+0x12e/0x1b0 [mlx4_ib]
  Code: 1e 93 08 00 40 80 fd 01 0f 87 fa f1 04 00 83 e5 01 0f 85 2b ff ff ff 48 c7 c7 20 4f b6 a0 c6 05 fd 92 08 00 01 e8 47 c9 82 e2 <0f> 0b e9 11 ff ff ff 0f b6 2d eb 92 08 00 40 80 fd 01 0f 87 b1 f1
  RSP: 0018:ffff8881a4957750 EFLAGS: 00010286
  RAX: 0000000000000000 RBX: ffff8881ac4b6800 RCX: 0000000000000000
  RDX: 0000000000000027 RSI: 0000000000000004 RDI: ffffed103492aedc
  RBP: 0000000000000000 R08: 0000000000000001 R09: ffff8884d2e378eb
  R10: ffffed109a5c6f1d R11: 0000000000000001 R12: ffff888132620000
  R13: ffff8881a4957a90 R14: ffff8881aa2d4000 R15: ffff8881a4957ad0
  FS:  00007f0401747740(0000) GS:ffff8884d2e00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 000055f8ae036118 CR3: 000000012fe94005 CR4: 0000000000370ea0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   <TASK>
   ib_dealloc_xrcd_user+0xce/0x120 [ib_core]
   ib_uverbs_dealloc_xrcd+0xad/0x210 [ib_uverbs]
   uverbs_free_xrcd+0xe8/0x190 [ib_uverbs]
   destroy_hw_idr_uobject+0x7a/0x130 [ib_uverbs]
   uverbs_destroy_uobject+0x164/0x730 [ib_uverbs]
   uobj_destroy+0x72/0xf0 [ib_uverbs]
   ib_uverbs_cmd_verbs+0x19fb/0x3110 [ib_uverbs]
   ib_uverbs_ioctl+0x169/0x260 [ib_uverbs]
   __x64_sys_ioctl+0x856/0x1550
   do_syscall_64+0x3d/0x90
   entry_SYSCALL_64_after_hwframe+0x44/0xae

Fixes: 7c4a539ec3 ("RDMA/core: Fix ib_qp_usecnt_dec() called when error")
Link: https://lore.kernel.org/r/74c11029adaf449b3b9228a77cc82f39e9e892c8.1646851220.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-14 20:39:21 -03:00
Chengguang Xu
aaaf62e066 RDMA/rxe: Remove useless argument for update_state()
The argument 'payload' is not used in update_state(), so just remove it.

Link: https://lore.kernel.org/r/20220307145047.3235675-2-cgxu519@mykernel.net
Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-14 20:36:01 -03:00
Chengguang Xu
7e8e611d6a RDMA/rxe: Change variable and function argument to proper type
The type of the wqe length is u32, so to avoid overflow and implicit
casting, change the variable and the relevant function argument to the
proper type.

Link: https://lore.kernel.org/r/20220307145047.3235675-1-cgxu519@mykernel.net
Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-14 20:36:01 -03:00
Dan Carpenter
6f6dbb819d RDMA/irdma: Prevent some integer underflows
My static checker complains that:

    drivers/infiniband/hw/irdma/ctrl.c:3605 irdma_sc_ceq_init()
    warn: can subtract underflow 'info->dev->hmc_fpm_misc.max_ceqs'?

It appears that "info->dev->hmc_fpm_misc.max_ceqs" comes from the firmware
in irdma_sc_parse_fpm_query_buf() so, yes, there is a chance that it could
be zero.  Even if we trust the firmware, it's easy enough to change the
condition just as a hardening measure.

Fixes: 3f49d68425 ("RDMA/irdma: Implement HW Admin Queue OPs")
Link: https://lore.kernel.org/r/20220307125928.GE16710@kili
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-14 20:31:12 -03:00
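The warning quoted in 6f6dbb819d concerns subtracting from an unsigned limit that may be zero. A minimal sketch of why that matters follows, assuming the original check had the shape "id > max - 1" (the commit message only says the condition was changed). Both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Subtracting form: if max is 0, max - 1 wraps to UINT32_MAX and no id
 * is ever rejected.
 */
static bool id_invalid_subtract(uint32_t id, uint32_t max)
{
	return id > max - 1;
}

/* Comparing form: equivalent for max > 0 and still correct for max == 0. */
static bool id_invalid_compare(uint32_t id, uint32_t max)
{
	return id >= max;
}
```

With max equal to zero, id_invalid_subtract(5, 0) reports the id as valid, while id_invalid_compare(5, 0) rejects it.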
Moshe Shemesh
66771a1c72 net/mlx5: Move debugfs entries to separate struct
Move the debugfs entry pointers under priv to their own struct.
Add get function for device debugfs root.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-09 13:33:02 -08:00
Wenpeng Liang
73f7e05609 RDMA/hns: Refactor the alloc_cqc()
Abstract the alloc_cqc() into several parts and separate the process
unrelated to allocating CQC.

Link: https://lore.kernel.org/r/20220302064830.61706-10-liangwenpeng@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-04 17:36:32 -04:00
Chengchang Tang
b65afbd2a0 RDMA/hns: Refactor the alloc_srqc()
Abstract the alloc_srqc() into several parts and separate the alloc_srqn()
from the alloc_srqc().

Link: https://lore.kernel.org/r/20220302064830.61706-9-liangwenpeng@huawei.com
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-04 17:36:32 -04:00
Wenpeng Liang
904de76c42 RDMA/hns: Clean up the return value check of hns_roce_alloc_cmd_mailbox()
hns_roce_alloc_cmd_mailbox() never returns NULL, so the check should be
IS_ERR(). And the error code should be converted as the function's return
value.

Link: https://lore.kernel.org/r/20220302064830.61706-8-liangwenpeng@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-04 17:36:32 -04:00
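Commit 904de76c42 depends on the kernel convention of returning an error code encoded in the pointer value rather than NULL. The sketch below models that convention in userspace. err_ptr(), ptr_err() and is_err() are simplified stand-ins written for this example, not the kernel's ERR_PTR()/PTR_ERR()/IS_ERR() definitions, and alloc_mailbox_sketch() is a hypothetical allocator.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Encode a negative errno in a pointer value. */
static void *err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* Recover the errno from an error pointer. */
static long ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* Error pointers occupy the top MAX_ERRNO values of the address range. */
static bool is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical allocator: never returns NULL, fails with an encoded errno. */
static void *alloc_mailbox_sketch(bool fail)
{
	void *buf;

	if (fail)
		return err_ptr(-ENOMEM);
	buf = malloc(64);
	return buf ? buf : err_ptr(-ENOMEM);
}

static int use_mailbox_sketch(bool fail)
{
	void *mbox = alloc_mailbox_sketch(fail);

	/* A NULL check would miss the failure; check for an error pointer. */
	if (is_err(mbox))
		return (int)ptr_err(mbox);

	free(mbox);
	return 0;
}
```

The caller propagates the decoded error code as its own return value, which mirrors the second half of the commit message.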
Chengchang Tang
cf7f8f5c1c RDMA/hns: Remove similar code that configures the hardware contexts
Remove duplicate code for creating and destroying hardware contexts via
mailbox.

Link: https://lore.kernel.org/r/20220302064830.61706-7-liangwenpeng@huawei.com
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-04 17:36:31 -04:00
Chengchang Tang
162e29feab RDMA/hns: Refactor mailbox functions
The current mailbox functions have too many parameters, making the code
difficult to maintain. So construct a new structure mbox_msg to pass the
information needed by mailbox.

Link: https://lore.kernel.org/r/20220302064830.61706-6-liangwenpeng@huawei.com
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-04 17:36:31 -04:00
Wenpeng Liang
e50cda2b9f RDMA/hns: Fix the wrong type of parameter "op" of the mailbox
The "op" field of the mailbox occupies 8 bits, so the parameter "op"
should be of type u8.

Link: https://lore.kernel.org/r/20220302064830.61706-5-liangwenpeng@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-04 17:36:31 -04:00
Wenpeng Liang
479dc93ba7 RDMA/hns: Remove redundant parameter "mailbox" in the mailbox
The parameter "out_param" of the mailbox is always null when the context is
destroyed. So remove the function parameter "mailbox".

Link: https://lore.kernel.org/r/20220302064830.61706-4-liangwenpeng@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-04 17:36:31 -04:00
Chengchang Tang
0018ed4bb0 RDMA/hns: Remove fixed parameter "timeout" in the mailbox
The value of the function parameter "timeout" is unique. Therefore,
it is unnecessary to specify the parameter "timeout" value each time.
So remove it.

Link: https://lore.kernel.org/r/20220302064830.61706-3-liangwenpeng@huawei.com
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Haoyue Xu <xuhaoyue1@hisilicon.com>
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-04 17:36:30 -04:00
Chengchang Tang
5a32949d81 RDMA/hns: Remove the unused parameter "op_modifier" in mailbox
The parameter "op_modifier" is only used for HIP06. It is useless for HIP08
and later versions. After removing HIP06, this parameter is no longer used,
so remove it.

Link: https://lore.kernel.org/r/20220302064830.61706-2-liangwenpeng@huawei.com
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Haoyue Xu <xuhaoyue1@hisilicon.com>
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-04 17:36:30 -04:00
Yajun Deng
7c4a539ec3 RDMA/core: Fix ib_qp_usecnt_dec() called when error
ib_destroy_qp() would be called by ib_create_qp_user() on error. The former
calls ib_qp_usecnt_dec(), but ib_qp_usecnt_inc() was not called before.

So move ib_qp_usecnt_inc() into create_qp().

Fixes: d2b10794fc ("RDMA/core: Create clean QP creations interface for uverbs")
Link: https://lore.kernel.org/r/20220303024232.2847388-1-yajun.deng@linux.dev
Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-04 17:30:31 -04:00
Mike Marciniszyn
b135e324d7 IB/hfi1: Allow larger MTU without AIP
The AIP code signals the phys_mtu in the following query_port()
fragment:

	props->phys_mtu = HFI1_CAP_IS_KSET(AIP) ? hfi1_max_mtu :
				ib_mtu_enum_to_int(props->max_mtu);

Using the largest MTU possible should not depend on AIP.

Fix by unconditionally using the hfi1_max_mtu value.

Fixes: 6d72344cf6 ("IB/ipoib: Increase ipoib Datagram mode MTU's upper limit")
Link: https://lore.kernel.org/r/1644348309-174874-1-git-send-email-mike.marciniszyn@cornelisnetworks.com
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-04 17:22:02 -04:00
Jakub Kicinski
80901bff81 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
net/batman-adv/hard-interface.c
  commit 690bb6fb64 ("batman-adv: Request iflink once in batadv-on-batadv check")
  commit 6ee3c393ee ("batman-adv: Demote batadv-on-batadv skip error message")
https://lore.kernel.org/all/20220302163049.101957-1-sw@simonwunderlich.de/

net/smc/af_smc.c
  commit 4d08b7b57e ("net/smc: Fix cleanup when register ULP fails")
  commit 462791bbfa ("net/smc: add sysctl interface for SMC")
https://lore.kernel.org/all/20220302112209.355def40@canb.auug.org.au/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03 11:55:12 -08:00
Christoph Hellwig
dc90f0846d mm: don't include <linux/memremap.h> in <linux/mm.h>
Move the check for the actual pgmap types that need the free at refcount
one behavior into the out of line helper, and thus avoid the need to
pull memremap.h into mm.h.

Link: https://lkml.kernel.org/r/20220210072828.2930359-7-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: "Sierra Guiza, Alejandro (Alex)" <alex.sierra@amd.com>

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Chaitanya Kulkarni <kch@nvidia.com>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-03-03 12:47:33 -05:00
Jakub Kicinski
f2b77012dd Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Saeed Mahameed says:

====================
mlx5-next 2022-22-02

The following PR includes updates to mlx5-next branch:

Headlines:
==========

1) Jakub cleans up unused static inline functions

2) I did some low level firmware command interface return status changes to
provide the caller with full visibility on the error/status returned by
the Firmware.

3) Use the new command interface in RDMA DEVX usecases to avoid flooding
dmesg with some "expected" user error prone use cases.

4) Moshe also uses the new command interface to grab the specific error
code from MFRL register command to provide the exact error reason for
why SW reset couldn't perform internally in FW.

5) From Mark Bloch: Lag, drop packets in hardware when possible

In active-backup mode the inactive interface's packets are dropped by the
bond device. In switchdev where TC rules are offloaded to the FDB
this can lead to packets being hit in the FDB where without offload
they would have been dropped before reaching TC rules in the kernel.

Create a drop rule to make sure packets on inactive ports are dropped
before reaching the FDB.

Listen on NETDEV_CHANGEUPPER / NETDEV_CHANGEINFODATA events and record
the inactive state and offload accordingly.

* 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
  net/mlx5: Add clarification on sync reset failure
  net/mlx5: Add reset_state field to MFRL register
  RDMA/mlx5: Use new command interface API
  net/mlx5: cmdif, Refactor error handling and reporting of async commands
  net/mlx5: Use mlx5_cmd_do() in core create_{cq,dct}
  net/mlx5: cmdif, Add new api for command execution
  net/mlx5: cmdif, cmd_check refactoring
  net/mlx5: cmdif, Return value improvements
  net/mlx5: Lag, offload active-backup drops to hardware
  net/mlx5: Lag, record inactive state of bond device
  net/mlx5: Lag, don't use magic numbers for ports
  net/mlx5: Lag, use local variable already defined to access E-Switch
  net/mlx5: E-switch, add drop rule support to ingress ACL
  net/mlx5: E-switch, remove special uplink ingress ACL handling
  net/mlx5: E-Switch, reserve and use same uplink metadata across ports
  net/mlx5: Add ability to insert to specific flow group
  mlx5: remove unused static inlines
====================

Link: https://lore.kernel.org/r/20220223233930.319301-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-28 16:23:58 -08:00
Yajun Deng
a80501b891 RDMA/core: Remove unnecessary statements
The rdma_zalloc_drv_obj() in __ib_alloc_pd() already zeroes pd, so it is
unnecessary to assign NULL to the object in struct pd.

The uverbs_free_pd() already returns busy if pd->usecnt is nonzero, so
there is no need to add a warning.

Link: https://lore.kernel.org/r/20220223074901.201506-1-yajun.deng@linux.dev
Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-28 13:57:24 -04:00
Mustafa Ismail
17850f2b0b RDMA/irdma: Remove incorrect masking of PD
The PD id is masked with 0x7fff, while PD can be 18 bits for GEN2 HW.
Remove the masking as it should not be needed and can cause incorrect PD
id to be used.

Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Link: https://lore.kernel.org/r/20220225163211.127-4-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-28 12:07:40 -04:00
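The problem fixed by 17850f2b0b is plain bit arithmetic: 0x7fff keeps only the low 15 bits, so an 18-bit PD id loses its upper bits. A small sketch of the effect follows. PD_MASK_15BIT and pd_id_masked() are hypothetical names used only for illustration.

```c
#include <stdint.h>

#define PD_MASK_15BIT 0x7fffu /* the mask named in the commit message */

/* Applies the 15-bit mask that the fix removes. */
static uint32_t pd_id_masked(uint32_t pd_id)
{
	return pd_id & PD_MASK_15BIT;
}
```

For example, the 18-bit id 0x20001 is truncated to 0x0001 by the mask, so two distinct PDs could end up sharing an id.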
Mustafa Ismail
b200189626 RDMA/irdma: Fix Passthrough mode in VM
Using PCI_FUNC macro in a VM, when the device is in passthrough mode does
not provide the real function instance. This means that currently, devices
will not probe unless the instance in the VM matches the instance in the
host.

Fix this by getting the pf_id from the LAN during the probe.

Fixes: 8498a30e1b ("RDMA/irdma: Register auxiliary driver and implement private channel OPs")
Link: https://lore.kernel.org/r/20220225163211.127-3-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-28 12:07:40 -04:00
Mustafa Ismail
6702bc1474 RDMA/irdma: Fix netdev notifications for vlan's
Currently, events on vlan netdevs are being ignored. Fix this by finding
the real netdev and processing the notifications for vlan netdevs.

Fixes: 915cc7ac0f ("RDMA/irdma: Add miscellaneous utility definitions")
Link: https://lore.kernel.org/r/20220225163211.127-2-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-28 12:07:40 -04:00
Zhu Yanjun
ea7596c1e5 RDMA/irdma: Make irdma_create_mg_ctx return a void
The function irdma_create_mg_ctx always returns 0, so make it void and
delete the return value check.

Link: https://lore.kernel.org/r/20220224182832.3896686-1-yanjun.zhu@linux.dev
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-28 11:32:42 -04:00
Jason Gunthorpe
22e9f71072 RDMA/cma: Do not change route.addr.src_addr outside state checks
If the state is not idle then resolve_prepare_src() should immediately
fail and no change to global state should happen. However, it
unconditionally overwrites the src_addr trying to build a temporary any
address.

For instance if the state is already RDMA_CM_LISTEN then this will corrupt
the src_addr and would cause the test in cma_cancel_operation():

           if (cma_any_addr(cma_src_addr(id_priv)) && !id_priv->cma_dev)

Which would manifest as this trace from syzkaller:

  BUG: KASAN: use-after-free in __list_add_valid+0x93/0xa0 lib/list_debug.c:26
  Read of size 8 at addr ffff8881546491e0 by task syz-executor.1/32204

  CPU: 1 PID: 32204 Comm: syz-executor.1 Not tainted 5.12.0-rc8-syzkaller #0
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
  Call Trace:
   __dump_stack lib/dump_stack.c:79 [inline]
   dump_stack+0x141/0x1d7 lib/dump_stack.c:120
   print_address_description.constprop.0.cold+0x5b/0x2f8 mm/kasan/report.c:232
   __kasan_report mm/kasan/report.c:399 [inline]
   kasan_report.cold+0x7c/0xd8 mm/kasan/report.c:416
   __list_add_valid+0x93/0xa0 lib/list_debug.c:26
   __list_add include/linux/list.h:67 [inline]
   list_add_tail include/linux/list.h:100 [inline]
   cma_listen_on_all drivers/infiniband/core/cma.c:2557 [inline]
   rdma_listen+0x787/0xe00 drivers/infiniband/core/cma.c:3751
   ucma_listen+0x16a/0x210 drivers/infiniband/core/ucma.c:1102
   ucma_write+0x259/0x350 drivers/infiniband/core/ucma.c:1732
   vfs_write+0x28e/0xa30 fs/read_write.c:603
   ksys_write+0x1ee/0x250 fs/read_write.c:658
   do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
   entry_SYSCALL_64_after_hwframe+0x44/0xae

This is indicating that an rdma_id_private was destroyed without doing
cma_cancel_listens().

Instead of trying to re-use the src_addr memory to indirectly create an
any address derived from the dst build one explicitly on the stack and
bind to that as any other normal flow would do. rdma_bind_addr() will copy
it over the src_addr once it knows the state is valid.

This is similar to commit bc0bdc5afa ("RDMA/cma: Do not change
route.addr.src_addr.ss_family")

Link: https://lore.kernel.org/r/0-v2-e975c8fd9ef2+11e-syz_cma_srcaddr_jgg@nvidia.com
Cc: stable@vger.kernel.org
Fixes: 732d41c545 ("RDMA/cma: Make the locking for automatic state transition more clear")
Reported-by: syzbot+c94a3675a626f6333d74@syzkaller.appspotmail.com
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-25 16:46:51 -04:00
Zhu Yanjun
884194ef26 RDMA/irdma: Move union irdma_sockaddr to header file
The union irdma_sockaddr is used frequently. So move it to the header
file.

Link: https://lore.kernel.org/r/20220223024252.3873736-4-yanjun.zhu@linux.dev
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-23 20:38:56 -04:00
Zhu Yanjun
8627da62cc RDMA/irdma: Remove the unnecessary variable saddr
Originally the variable saddr was used to check the network type. Now the
variable net_type does the same work, so saddr is removed.

Link: https://lore.kernel.org/r/20220223024252.3873736-3-yanjun.zhu@linux.dev
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-23 20:38:56 -04:00
Zhu Yanjun
80005c43d4 RDMA/irdma: Use net_type to check network type
Use the member variable net_type to check the network type.

Link: https://lore.kernel.org/r/20220223024252.3873736-2-yanjun.zhu@linux.dev
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-23 20:38:56 -04:00
Bob Pearson
6090a0c4c7 RDMA/rxe: Cleanup rxe_mcast.c
Finish adding subroutine comment headers to subroutines in
rxe_mcast.c. Make minor api change cleanups.

Link: https://lore.kernel.org/r/20220223230706.50332-5-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-23 20:29:15 -04:00
Bob Pearson
a181c4c81a RDMA/rxe: Collect cleanup mca code in a subroutine
Collect cleanup code for struct rxe_mca into a subroutine,
__rxe_cleanup_mca() called in rxe_detach_mcg() in rxe_mcast.c.

Link: https://lore.kernel.org/r/20220223230706.50332-4-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-23 20:29:15 -04:00