Instead of 'goto and return', just return directly to
simplify the error handling, and avoid some unnecessary
return value checks.
Link: https://lore.kernel.org/r/20221028075053.3990467-1-xuhaoyue1@hisilicon.com
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Haoyue Xu <xuhaoyue1@hisilicon.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Split rxe_run_task(task, sched) into rxe_run_task(task) and
rxe_sched_task(task).
Link: https://lore.kernel.org/r/20221021200118.2163-5-rpearsonhpe@gmail.com
Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
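A rough sketch of the resulting API (illustrative only: the real task
machinery tracks extra state, and rxe_do_task() here stands in for the
existing tasklet body):

    #include <linux/interrupt.h>

    struct rxe_task {
            struct tasklet_struct tasklet;
            /* ... */
    };

    void rxe_do_task(struct tasklet_struct *t);  /* existing handler */

    /* run the work inline, in the caller's context */
    void rxe_run_task(struct rxe_task *task)
    {
            rxe_do_task(&task->tasklet);
    }

    /* defer the work to softirq (tasklet) context */
    void rxe_sched_task(struct rxe_task *task)
    {
            tasklet_schedule(&task->tasklet);
    }

Call sites then choose explicitly between immediate and deferred execution
instead of passing a sched flag.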
In include/uapi/rdma/rdma_user_rxe.h there are redundant copies of num_sge
in rxe_send_wr, rxe_recv_wqe, and rxe_dma_info. Only the one in
rxe_dma_info is actually used by the rxe kernel driver.
User space would set these values, but the kernel never read them.
This change has no effect on the current ABI, and new or old versions of
rdma-core operate correctly with new or old versions of the kernel rxe
driver.
Link: https://lore.kernel.org/r/20220913222716.18335-1-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Move the setting of pd in mr objects ahead of any possible errors so that
it will always be set when rxe_mr_cleanup() calls rxe_put(mr_pd(mr)),
avoiding seg faults.
Fixes: cf40367961 ("RDMA/rxe: Move mr cleanup code to rxe_mr_cleanup()")
Link: https://lore.kernel.org/r/20220805183153.32007-2-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
rxe_mr and ib_mr have interchangeable members. Remove the device-specific
members and use the ones in the generic struct. Both 'iova' and 'length'
are filled in by the ib_uverbs or ib_core layer after MR registration.
Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
Link: https://lore.kernel.org/r/20220921080844.1616883-2-matsuda-daisuke@fujitsu.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
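Illustratively, accesses that used the rxe-private copies now go through
the embedded struct ib_mr (a sketch; see the actual diff for the full set):

    u64 iova   = mr->ibmr.iova;    /* was: mr->iova   */
    u64 length = mr->ibmr.length;  /* was: mr->length */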
Commit 1e75550648 ("Revert "RDMA/rxe: Create duplicate mapping tables for
FMRs"") brought back the member 'va' to struct rxe_mr. However, it is not
used by anything and thus can be removed.
Fixes: 1e75550648 ("Revert "RDMA/rxe: Create duplicate mapping tables for FMRs"")
Link: https://lore.kernel.org/r/20220829012335.1212697-1-matsuda-daisuke@fujitsu.com
Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Currently the rdma_rxe driver has a security weakness: it gives partially
initialized objects indices, allowing external actors to gain access to
them by sending packets that refer to their index (e.g. qpn, rkey, etc.),
causing unpredictable results.
This patch adds a new API rxe_finalize(obj) which enables looking up pool
objects from indices using rxe_pool_get_index() for AH, QP, MR, and
MW. They are added in create verbs only after the objects are fully
initialized.
It also adds a wait for completion to the destroy/dealloc verbs to ensure
that all references have been dropped before returning to rdma_core, by
implementing a new rxe_pool API rxe_cleanup() which drops a reference to
the object and then waits for all other references to be dropped. When
the last reference is dropped the object is completed by kref. After that
it cleans up the object and, if locally allocated, frees the memory. In the
special case of address handle objects the delay is implemented separately
if the destroy_ah call is not sleepable.
Combined with deferring cleanup code to type-specific cleanup routines,
this allows all pending activity referring to objects to complete before
returning to rdma_core.
Link: https://lore.kernel.org/r/20220612223434.31462-2-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
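A minimal sketch of that wait-for-last-reference pattern, assuming a
simplified pool element (the real rxe_cleanup() also erases the pool index
and special-cases non-sleepable AH destroy):

    #include <linux/kref.h>
    #include <linux/completion.h>

    struct pool_elem {
            struct kref ref_cnt;
            struct completion complete;
            void (*cleanup)(struct pool_elem *elem);
    };

    static void pool_elem_release(struct kref *kref)
    {
            struct pool_elem *elem =
                    container_of(kref, struct pool_elem, ref_cnt);

            /* last reference gone: wake the waiter in cleanup */
            complete(&elem->complete);
    }

    void example_cleanup(struct pool_elem *elem)
    {
            /* drop the reference taken at object creation ... */
            kref_put(&elem->ref_cnt, pool_elem_release);
            /* ... then sleep until all other holders drop theirs */
            wait_for_completion(&elem->complete);

            if (elem->cleanup)
                    elem->cleanup(elem);
    }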
Add a counter to keep track of the number of WQs connected to a CQ and
return an error if destroy_cq() is called while the counter is non-zero.
Link: https://lore.kernel.org/r/20220421014042.26985-8-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
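Roughly (the field name here is an assumption, not the exact patch):

    /* rxe_destroy_cq(), sketch */
    if (atomic_read(&cq->num_wq))
            return -EINVAL;   /* WQs still reference this CQ */

with matching atomic_inc()/atomic_dec() of cq->num_wq where a QP or SRQ
attaches or detaches a work queue.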
Move the code from rxe_qp_destroy() to rxe_qp_do_cleanup(). This allows
flows holding references to qp to complete before the qp object is torn
down.
Link: https://lore.kernel.org/r/20220421014042.26985-5-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Move cleanup code from rxe_destroy_srq() to rxe_srq_cleanup() which is
called after all references are dropped to allow code depending on the srq
object to complete.
Link: https://lore.kernel.org/r/20220421014042.26985-3-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Currently the #define IB_SRQ_INIT_MASK is used to distinguish the
rxe_create_srq verb from the rxe_modify_srq verb so that some code can be
shared between these two subroutines.
This commit splits rxe_srq_chk_attr into two subroutines: rxe_srq_chk_init
and rxe_srq_chk_attr which handle the create_srq and modify_srq verbs
separately.
Link: https://lore.kernel.org/r/20220421014042.26985-2-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Currently the rdma_rxe driver supports SMI type QPs in a few places which
is incorrect. RoCE devices never should support SMI QPs. This commit
removes SMI QP support from the driver.
Link: https://lore.kernel.org/r/20220407185416.16372-1-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Rename rxe_add_ref() to rxe_get() and rxe_drop_ref() to rxe_put().
This significantly improves readability for new readers.
Link: https://lore.kernel.org/r/20220304000808.225811-10-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Currently the rxe driver uses red-black trees to add indices to the rxe
object pools. Linux xarrays provide a better way to implement the same
functionality for indices. This patch replaces the red-black trees by
xarrays for pool objects. Since xarrays already have a spinlock, use that
in place of the pool rwlock. Make sure that all changes in the xarray
(index) and kref (ref count) occur atomically.
Link: https://lore.kernel.org/r/20220304000808.225811-9-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
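The core of the xarray scheme looks roughly like this (a sketch against
the generic xarray/kref APIs, not the exact pool code):

    #include <linux/xarray.h>
    #include <linux/kref.h>

    struct pool_elem {
            struct kref ref_cnt;
    };

    static DEFINE_XARRAY_ALLOC(pool_xa);

    /* xa_alloc() picks a free index and stores the pointer atomically */
    int pool_add(struct pool_elem *elem, u32 *index)
    {
            return xa_alloc(&pool_xa, index, elem, xa_limit_32b,
                            GFP_KERNEL);
    }

    /* look up by index, taking a reference only if the object is live */
    struct pool_elem *pool_get_index(u32 index)
    {
            struct pool_elem *elem;

            xa_lock(&pool_xa);
            elem = xa_load(&pool_xa, index);
            if (elem && !kref_get_unless_zero(&elem->ref_cnt))
                    elem = NULL;
            xa_unlock(&pool_xa);

            return elem;
    }

Holding the xarray's own lock across the load and kref_get_unless_zero()
is what keeps index and refcount changes atomic with respect to removal.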
A previous patch replaced all irqsave locks in rxe with bh locks. This
ran into problems because rdmacm has a bad habit of calling rdma verbs
APIs while IRQs are disabled, which is not allowed with spin_unlock_bh()
and caused programs that use rdmacm to fail. This patch reverts the
changes to the locks that had this problem or got dragged into the same
mess. After this patch blktests/check -q srp now runs correctly.
Link: https://lore.kernel.org/r/20220215194448.44369-1-rpearsonhpe@gmail.com
Fixes: 21adfa7a3c ("RDMA/rxe: Replace irqsave locks with bh locks")
Reported-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Reported-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Tested-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Add code to check if a QP is attached to one or more multicast groups
when destroy_qp is called and return an error if so.
Link: https://lore.kernel.org/r/20220127213755.31697-5-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Move rxe_mcast_attach and rxe_mcast_detach from rxe_verbs.c to
rxe_mcast.c, make them non-static, and add declarations to rxe_loc.h.
Make the subroutines in rxe_mcast.c referenced by these routines static
and remove their declarations from rxe_loc.h.
Link: https://lore.kernel.org/r/20220127213755.31697-3-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Currently three different names are used to describe rxe pool elements.
They are referred to as entries, elems or pelems. This patch settles on
'elem' and changes the other names to match.
Link: https://lore.kernel.org/r/20211103050241.61293-3-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Most of the locks in the rxe driver are _irqsave/restore locks but in fact
there are no interrupt threads that run rxe code or share data with
rxe. There are softirq threads and data sharing so the appropriate lock
type is _bh. This patch replaces all irqsave type locks with bh type
locks.
Link: https://lore.kernel.org/r/20211103050241.61293-2-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
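For illustration, the shape of the conversion (bh locks block softirqs,
which is all rxe needs, without the cost of masking hard IRQs):

    #include <linux/spinlock.h>

    static DEFINE_SPINLOCK(pool_lock);

    /* before */
    unsigned long flags;
    spin_lock_irqsave(&pool_lock, flags);
    /* critical section shared with tasklet (softirq) code */
    spin_unlock_irqrestore(&pool_lock, flags);

    /* after */
    spin_lock_bh(&pool_lock);
    /* same critical section */
    spin_unlock_bh(&pool_lock);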
Make changes to rdma_user_rxe.h to allow indexing AH objects, passing the
index in UD send WRs to the driver and returning the index to the rxe
provider.
Modify rxe_create_ah() to add an index to AH when created and if called
from a new user provider return it to user space. If called from an old
provider mark the AH as not having a useful index. Modify rxe_destroy_ah
to drop the index before deleting the object.
Link: https://lore.kernel.org/r/20211007204051.10086-4-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Move the struct rxe_av av from struct rxe_send_wqe to struct rxe_send_wr,
placing it in wr.ud at the same offset as it was previously. This has the
effect of increasing the size of struct rxe_send_wr while keeping the size
of struct rxe_send_wqe the same. This better reflects the use of this
field which is only used for UD sends. This change has no effect on ABI
compatibility so the modified rxe driver will operate with older versions
of rdma-core.
Link: https://lore.kernel.org/r/20211007204051.10086-2-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The is_user members of struct rxe_sq/rxe_rq/rxe_srq have been unused since
commit ae6e843fe0 ("RDMA/rxe: Add memory barriers to kernel queues"),
so it is fine to remove them directly.
Link: https://lore.kernel.org/r/20210930094813.226888-2-yangx.jy@fujitsu.com
Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
For fast memory regions, create duplicate mapping tables so ib_map_mr_sg()
can build a new mapping table which is then swapped into place
synchronously with the execution of an IB_WR_REG_MR work request.
Currently the rxe driver uses the same table for receiving RDMA operations
and for building new tables in preparation for reusing the MR. This
exposes users to potentially incorrect results.
Link: https://lore.kernel.org/r/20210914164206.19768-5-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Earlier patches added memory barriers to protect user space to kernel
space communications. The user space queues were previously shown to have
occasional memory synchronization errors, which were removed by adding
smp_load_acquire()/smp_store_release() barriers. This patch extends that
to the case where queues are used between kernel space threads.
This patch also extends the queue types to include kernel ULP queues which
access the other end of the queues in kernel verbs calls like poll_cq and
post_send/recv.
Link: https://lore.kernel.org/r/20210914164206.19768-2-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
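The pairing at the heart of it, sketched on a simplified ring (field names
are illustrative):

    #include <linux/types.h>
    #include <asm/barrier.h>

    struct example_queue {
            u32 producer_index;
            u32 consumer_index;
            /* ... ring entries ... */
    };

    /* producer: publish the index only after the entry is written */
    static void advance_producer(struct example_queue *q)
    {
            smp_store_release(&q->producer_index,
                              q->producer_index + 1);
    }

    /* consumer: acquire pairs with the producer's release, so entry
     * contents written before the index update are visible here */
    static bool queue_empty(struct example_queue *q)
    {
            return smp_load_acquire(&q->producer_index) ==
                   q->consumer_index;
    }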
Convert the QP object to follow the IB/core general allocation scheme.
That change allows us to make sure that restrack properly krefs the
memory.
Link: https://lore.kernel.org/r/48e767124758aeecc433360ddd85eaa6325b34d9.1627040189.git.leonro@nvidia.com
Reviewed-by: Gal Pressman <galpress@amazon.com> #efa
Tested-by: Gal Pressman <galpress@amazon.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> #rdma and core
Tested-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Tested-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
This patch collects the code from rxe_register_device() that sets up the
crc32 calculation into a subroutine rxe_icrc_init() in rxe_icrc.c.
Link: https://lore.kernel.org/r/20210707040040.15434-8-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
This is being used to implement both the port and device global stats,
which is causing some confusion in the drivers. For instance EFA and i40iw
both seem to be misusing the device stats.
Split it into two ops so drivers that don't support one or the other can
leave the op NULL'd, making the calling code a little simpler to
understand.
Link: https://lore.kernel.org/r/1955c154197b2a159adc2dc97266ddc74afe420c.1623427137.git.leonro@nvidia.com
Tested-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
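Sketched against struct ib_device_ops (treat the exact member names and
signatures as an assumption based on the merged result):

    struct ib_device_ops {
            /* ... */
            /* device-wide counters; NULL if unsupported */
            struct rdma_hw_stats *(*alloc_hw_device_stats)(
                    struct ib_device *ibdev);
            /* per-port counters; NULL if unsupported */
            struct rdma_hw_stats *(*alloc_hw_port_stats)(
                    struct ib_device *ibdev, u32 port_num);
            /* ... */
    };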
Rxe has two mask bits WR_LOCAL_MASK and WR_REG_MASK with WR_REG_MASK used
to indicate any local operation and WR_LOCAL_MASK unused. This patch
replaces both of these with one mask bit WR_LOCAL_OP_MASK which is
clearer.
Link: https://lore.kernel.org/r/20210608042552.33275-6-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Add the ib_alloc_mw and ib_dealloc_mw verbs APIs. Add a new file rxe_mw.c
focused on MWs, change the 8-bit random key generator, add a cleanup
routine for MWs, and add the verbs routines to ib_device_ops.
Link: https://lore.kernel.org/r/20210608042552.33275-5-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
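The verbs hook-up follows the usual ib_device_ops pattern, roughly:

    static const struct ib_device_ops rxe_dev_ops = {
            /* ... */
            .alloc_mw = rxe_alloc_mw,        /* new, in rxe_mw.c */
            .dealloc_mw = rxe_dealloc_mw,    /* new, in rxe_mw.c */
            /* ... */
    };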
In order to prevent user space from modifying the index that belongs to
the kernel for shared queues let the kernel use a local copy of the index
and copy any new values of that index to the shared rxe_queue_buf struct.
This adds more switch statements which decreases the performance of the
queue API. Move the type into the parameter list for these functions so
that the compiler can optimize out the switch statements when the explicit
type is known. Modify all the calls in the driver on performance paths to
pass in the explicit queue type.
Link: https://lore.kernel.org/r/20210527194748.662636-4-rpearsonhpe@gmail.com
Link: https://lore.kernel.org/linux-rdma/20210526165239.GP1002214@@nvidia.com/
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
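The idea, sketched with illustrative names: when the type is a
compile-time constant at the call site, the compiler folds the switch
away:

    #include <linux/types.h>
    #include <asm/barrier.h>

    enum queue_type {
            QUEUE_TYPE_KERNEL,
            QUEUE_TYPE_TO_USER,
            QUEUE_TYPE_FROM_USER,
    };

    struct example_queue {
            struct shared_buf {
                    u32 producer_index;
            } *buf;                 /* mapped to user space */
            u32 index;              /* kernel-private copy */
    };

    static inline u32 producer_index(struct example_queue *q,
                                     enum queue_type type)
    {
            switch (type) {
            case QUEUE_TYPE_FROM_USER:
                    /* user space owns this index: read the shared copy */
                    return smp_load_acquire(&q->buf->producer_index);
            default:
                    /* kernel owns this index: use the local copy */
                    return q->index;
            }
    }

A call such as producer_index(q, QUEUE_TYPE_FROM_USER) compiles down to
just the one arm.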
The old version of ib_umem_get() needed udata as a parameter, but it is
now unnecessary.
Fixes: c320e527e1 ("IB: Allow calls to ib_umem_get from kernel ULPs")
Link: https://lore.kernel.org/r/1620807142-39157-5-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Reviewed-by: Zhu Yanjun <zyjzyj2000@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
In the original rxe implementation it was intended to use a common object
to represent MRs and MWs but they are different enough to separate these
into two objects.
This allows replacing the mem name with mr for MRs which is more
consistent with the style for the other objects and less likely to be
confusing. This is a long patch that mostly changes mem to mr where it
makes sense and adds a new rxe_mw struct.
Link: https://lore.kernel.org/r/20210325212425.2792-1-rpearson@hpe.com
Signed-off-by: Bob Pearson <rpearson@hpe.com>
Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Current code uses many different types when dealing with a port of an
RDMA device: u8, unsigned int and u32. Switch to u32 to clean up the
logic.
This allows us to make (at least) the core view consistent and use the
same type. Unfortunately not all places can be converted. Many uverbs
functions expect port to be u8, so keep those places in order not to break
UAPIs. HW/Spec defined values must also not be changed.
With the switch to u32 we can now support devices with more than 255
ports. U32_MAX is reserved to make the control logic a bit easier to deal
with. As a device with U32_MAX ports probably isn't going to happen any
time soon, this seems like a non-issue.
When a device with more than 255 ports is created, uverbs will report the
RDMA device as having 255 ports, as this is the maximum currently
supported.
The verbs interface is not changed yet because the IBTA spec limits the
port size in too many places to be u8, and all applications that rely on
verbs won't be able to cope with this change. At this stage, we are
extending only the interfaces that use the vendor channel.
Once the limitation is lifted, mlx5 in switchdev mode will be able to have
thousands of SFs created by the device. As the only instance of an RDMA
device that reports more than 255 ports will be a representor device, and
it exposes itself as a RAW Ethernet only device, CM/MAD/IPoIB and other
ULPs aren't affected by this change, and their sysfs interfaces that are
exposed to userspace can remain unchanged.
While here, clean up some alignment issues and remove unneeded sanity
checks (mainly in rdmavt).
Link: https://lore.kernel.org/r/20210301070420.439400-1-leon@kernel.org
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
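For example, a driver op that took a u8 port now takes u32 (illustrative
signature):

    /* before */
    int (*query_port)(struct ib_device *dev, u8 port_num,
                      struct ib_port_attr *attr);
    /* after */
    int (*query_port)(struct ib_device *dev, u32 port_num,
                      struct ib_port_attr *attr);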
This patch changes the return type of init_send_wqe in rxe_verbs.c to
void since it always returns 0. It also separates out the code that copies
inline data into the send wqe as copy_inline_data_to_wqe().
Link: https://lore.kernel.org/r/20210206002437.2756-1-rpearson@hpe.com
Signed-off-by: Bob Pearson <rpearson@hpe.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
checkpatch -f found 3 warnings in RDMA/rxe:
1. a missing space following switch
2. return followed by else
3. use of strlcpy() instead of strscpy().
This patch fixes each of these. In
...
} else if (...) {
...
return 0;
} else
...
The middle block can be safely moved since it is completely independent of
the other code.
Link: https://lore.kernel.org/r/20210205230525.49068-1-rpearson@hpe.com
Signed-off-by: Bob Pearson <rpearson@hpe.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Replace 'void *' parameters with 'struct rxe_pool_entry *' and use a macro
to allow:
rxe_add_index,
rxe_drop_index,
rxe_add_key,
rxe_drop_key and
rxe_add_to_pool
APIs to be type-safe against changing the position of pelem in the
objects.
Link: https://lore.kernel.org/r/20201216231550.27224-6-rpearson@hpe.com
Signed-off-by: Bob Pearson <rpearson@hpe.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
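The macro trick, roughly (pelem being the embedded struct rxe_pool_entry
member):

    /* callers pass the typed object; the macro locates the embedded
     * pelem, so its position within the struct can change freely */
    #define rxe_add_index(obj) __rxe_add_index(&(obj)->pelem)

    int __rxe_add_index(struct rxe_pool_entry *elem);

and likewise for the other listed APIs.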
Change work and completion queues to use smp_load_acquire() and
smp_store_release() to synchronize between driver and users. This commit
goes with a matching series of commits in the rxe user space provider.
Link: https://lore.kernel.org/r/20201210174258.5234-1-rpearson@hpe.com
Signed-off-by: Bob Pearson <rpearson@hpe.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
This moves siw and rxe to be virtual devices in the device tree:
lrwxrwxrwx 1 root root 0 Nov 6 13:55 /sys/class/infiniband/rxe0 -> ../../devices/virtual/infiniband/rxe0/
Previously they were trying to parent themselves to the physical device of
their attached netdev, which doesn't make a lot of sense.
My hope is this will solve some weird syzkaller hits related to sysfs, as
it could be possible that the parent of a netdev is another netdev, e.g.
under bonding or some other syzkaller-found netdev configuration.
Nesting an ib_device under anything but a physical device is going to
cause inconsistencies in sysfs during destruction.
Link: https://lore.kernel.org/r/0-v1-dcbfc68c4b4a+d6-virtual_dev_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Use the ib_dma_* helpers to skip the DMA translation instead. This
removes the last user of dma_virt_ops and keeps the weird layering
violation inside the RDMA core instead of burdening the DMA mapping
subsystems with it. This also means the software RDMA drivers now don't
have to mess with DMA parameters that are not relevant to them at all, and
that in the future we can use PCI P2P transfers even for software RDMA, as
there is no longer a first fake layer of DMA mapping that the P2P DMA
support would have to go through.
Link: https://lore.kernel.org/r/20201106181941.1878556-8-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
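The helper shape, sketched (ib_uses_virt_dma() is the predicate for
software devices; treat the details here as an assumption, not the exact
upstream code):

    #include <rdma/ib_verbs.h>

    static inline u64 example_dma_map_single(struct ib_device *dev,
                                             void *ptr, size_t size,
                                             enum dma_data_direction dir)
    {
            /* software devices: a "DMA" address is just the pointer */
            if (ib_uses_virt_dma(dev))
                    return (uintptr_t)ptr;
            return dma_map_single(dev->dma_device, ptr, size, dir);
    }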
These two drivers open code the call to POST_SEND and do not use the
rdma-core wrapper to do it, thus their usage was missed during the audit.
Both drivers use this as a doorbell to signal the kernel to start DMA.
Fixes: 628c02bf38 ("RDMA: Remove uverbs cmds from drivers that don't use them")
Link: https://lore.kernel.org/r/0-v1-4608c5610afa+fb-uverbs_cmd_post_send_fix_jgg@nvidia.com
Reported-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>