Add a "scope" parameter to ip_route_output() so that callers don't have
to override the tos parameter with the RTO_ONLINK flag if they want a
local scope.
This will allow converting flowi4_tos to dscp_t in the future, thus
allowing static analysers to flag invalid interactions between
"tos" (the DSCP bits) and ECN.
Only three users ask for local scope (bonding, arp and atm). The others
continue to use RT_SCOPE_UNIVERSE. While there, add a comment to warn
users about the limitations of ip_route_output().
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Acked-by: Leon Romanovsky <leonro@nvidia.com> # infiniband
Signed-off-by: David S. Miller <davem@davemloft.net>
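As a rough illustration of the change described above, the sketch below contrasts a lookup helper that smuggles the on-link request through the tos argument with one that takes an explicit scope. It is a userspace model only: demo_flow, demo_route_old() and demo_route_new() are hypothetical names and not the kernel API, and the numeric values are assumed to mirror RTO_ONLINK and the RT_SCOPE_* constants.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values, assumed to mirror RTO_ONLINK and RT_SCOPE_*. */
#define DEMO_RTO_ONLINK     0x01u
#define DEMO_SCOPE_UNIVERSE 0u
#define DEMO_SCOPE_LINK     253u

/* Hypothetical, much reduced stand-in for a flow key. */
struct demo_flow {
	uint8_t tos;
	uint8_t scope;
};

/* Old style: the caller hides the scope request inside the tos value. */
static struct demo_flow demo_route_old(uint8_t tos)
{
	struct demo_flow fl;

	fl.scope = (tos & DEMO_RTO_ONLINK) ? DEMO_SCOPE_LINK : DEMO_SCOPE_UNIVERSE;
	fl.tos = (uint8_t)(tos & ~DEMO_RTO_ONLINK);
	return fl;
}

/* New style: the scope is its own parameter, tos carries only tos bits. */
static struct demo_flow demo_route_new(uint8_t tos, uint8_t scope)
{
	struct demo_flow fl;

	fl.tos = tos;
	fl.scope = scope;
	return fl;
}
```

With the explicit parameter, tos never has to carry a non-tos flag, which is what would let a stricter type be used for it later.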
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma updates from Jason Gunthorpe:
"Very small update this cycle:
- Minor code improvements in fi, rxe, ipoib, mana, cxgb4, mlx5,
irdma, rxe, rtrs, mana
- Simplify the hns hem mechanism
- Fix EFA's MSI-X allocation in resource constrained configurations
- Fix a KASAN splat in srpt
- Narrow hns's congestion control selection to QPs granularity and
allow userspace to select it
- Solve a parallel module loading race between the CM module and a
driver module
- Flexible array cleanup
- Dump hns's SCC Context to 'rdma res' for debugging
- Make mana build page lists for HW objects that require a 0 offset
correctly
- Stuck CM ID debugging"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (29 commits)
RDMA/cm: add timeout to cm_destroy_id wait
RDMA/mana_ib: Use virtual address in dma regions for MRs
RDMA/mana_ib: Fix bug in creation of dma regions
RDMA/hns: Append SCC context to the raw dump of QPC
RDMA/uverbs: Avoid -Wflex-array-member-not-at-end warnings
RDMA/hns: Support userspace configuring congestion control algorithm with QP granularity
RDMA/rtrs-clt: Check strnlen return len in sysfs mpath_policy_store()
RDMA/uverbs: Remove flexible arrays from struct *_filter
RDMA/device: Fix a race between mad_client and cm_client init
RDMA/hns: Fix mis-modifying default congestion control algorithm
RDMA/rxe: Remove unused 'iova' parameter from rxe_mr_init_user
RDMA/srpt: Do not register event handler until srpt device is fully setup
RDMA/irdma: Remove duplicate assignment
RDMA/efa: Limit EQs to available MSI-X vectors
RDMA/mlx5: Delete unused mlx5_ib_copy_pas prototype
RDMA/cxgb4: Delete unused c4iw_ep_redirect prototype
RDMA/mana_ib: Introduce mana_ib_install_cq_cb helper function
RDMA/mana_ib: Introduce mana_ib_get_netdev helper function
RDMA/mana_ib: Introduce mdev_to_gc helper function
RDMA/hns: Simplify 'struct hns_roce_hem' allocation
...
Remove the unneeded assignment of the qp_num which is already
set in irdma_create_qp().
Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Link: https://lore.kernel.org/r/20240131233953.400483-1-sindhu.devale@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Add IRDMA_AE_LLP_TOO_MANY_RNRS to the list of AE's processed as an
abnormal asynchronous event.
Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Sindhu Devale <sindhu.devale@gmail.com>
Link: https://lore.kernel.org/r/20240131233849.400285-5-sindhu.devale@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
The CQ shadow read threshold is currently not set for GEN 2. This could
cause an invalid CQ overflow condition, so remove the GEN check that
excluded GEN 1.
Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Link: https://lore.kernel.org/r/20240131233849.400285-4-sindhu.devale@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
These two series from Mark and Shun extend RDMA mlx5 API.
Mark's series provides c0 register used to match egress
traffic sent by local device.
Shun's series adds new type for ICM area.
Link: https://lore.kernel.org/all/cover.1701871118.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Virtual QP and CQ require a 4K HW page size but the driver passes
PAGE_SIZE to ib_umem_find_best_pgsz() instead.
Fix this by using the appropriate 4k value in the bitmap passed to
ib_umem_find_best_pgsz().
Fixes: 693a5386ef ("RDMA/irdma: Split mr alloc and free into new functions")
Link: https://lore.kernel.org/r/20231129202143.1434-4-shiraz.saleem@intel.com
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
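To see why the bitmap value matters, here is a heavily simplified userspace sketch of picking a page size from a bitmap of supported sizes. demo_best_pgsz() is a hypothetical helper and only checks the alignment of a single address; the real ib_umem_find_best_pgsz() takes a umem and considers far more than this. The 64K case assumes a kernel built with 64K pages.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_SZ_4K  0x1000ULL
#define DEMO_SZ_64K 0x10000ULL

/*
 * Return the largest page size present in pgsz_bitmap to which addr is
 * aligned, or 0 if no permitted size fits.
 */
static uint64_t demo_best_pgsz(uint64_t pgsz_bitmap, uint64_t addr)
{
	int bit;

	for (bit = 63; bit >= 0; bit--) {
		uint64_t sz = 1ULL << bit;

		if ((pgsz_bitmap & sz) && (addr & (sz - 1)) == 0)
			return sz;
	}
	return 0;
}
```

In this model a 4K-aligned buffer is accepted when the bitmap contains 4K, but rejected when the bitmap only contains a 64K PAGE_SIZE, even though the HW object itself wants 4K pages.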
The SQ is shared between kernel and user by storing the kernel page
pointer and passing that to kmap_atomic().
This requires that the SQ be PAGE_SIZE aligned.
Fix by adding an iWarp specific alignment check.
Fixes: e965ef0e7b ("RDMA/irdma: Split QP handler into irdma_reg_user_mr_type_qp")
Link: https://lore.kernel.org/r/20231129202143.1434-3-shiraz.saleem@intel.com
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Currently, there is no wait for the QP suspend to complete on a modify
to SQD state. Add a wait, after the modify to SQD state, for the Suspend
Complete AE. While we are at it, update the suspend timeout value in
irdma_prep_tc_change to use IRDMA_EVENT_TIMEOUT_MS too.
Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20231114170246.238-3-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Remove the modify to SQD before going to ERROR state. It is not needed.
Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20231114170246.238-2-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Simplify irdma_ieq_check_mpacrc() by using crypto_shash_digest() instead
of an init+update+final sequence. This should also improve performance.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20231029045756.153943-1-ebiggers@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
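The equivalence relied on above, a one-shot digest versus an init+update+final sequence, can be sketched in plain C. The demo_crc_* names are hypothetical and a bitwise CRC32C merely stands in for whichever algorithm the driver requests from the crypto API; none of this is the kernel shash interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_crc_ctx {
	uint32_t state;
};

static void demo_crc_init(struct demo_crc_ctx *ctx)
{
	ctx->state = 0xFFFFFFFFu;
}

/* Bitwise CRC32C (reflected polynomial 0x82F63B78), one byte at a time. */
static void demo_crc_update(struct demo_crc_ctx *ctx, const void *data, size_t len)
{
	const uint8_t *p = data;
	size_t i;
	int k;

	for (i = 0; i < len; i++) {
		ctx->state ^= p[i];
		for (k = 0; k < 8; k++)
			ctx->state = (ctx->state >> 1) ^
				     (0x82F63B78u & (0u - (ctx->state & 1u)));
	}
}

static uint32_t demo_crc_final(const struct demo_crc_ctx *ctx)
{
	return ctx->state ^ 0xFFFFFFFFu;
}

/* One-shot helper: same result as init + update + final over one buffer. */
static uint32_t demo_crc_digest(const void *data, size_t len)
{
	struct demo_crc_ctx ctx;

	demo_crc_init(&ctx);
	demo_crc_update(&ctx, data, len);
	return demo_crc_final(&ctx);
}
```

When the whole buffer is available up front, the one-shot form gives callers a single call site and leaves the implementation free to skip per-step bookkeeping.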
Add support for reregister MR verb API by doing a de-register
followed by a register MR with the new attributes. Reuse resources
like iwmr handle and HW stag where possible.
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20231004151306.228-1-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Documentation/process/license-rules.rst and checkpatch expect the SPDX
identifier syntax for multiple licenses to use capital "OR". Correct it
to keep consistent format and avoid copy-paste issues.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20230823092912.122674-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma updates from Jason Gunthorpe:
"Many small changes across the subsystem, some highlights:
- Usual driver cleanups in qedr, siw, erdma, hfi1, mlx4/5, irdma,
mthca, hns, and bnxt_re
- siw now works over tunnel and other netdevs with a MAC address by
removing assumptions about a MAC/GID from the connection manager
- "Doorbell Pacing" for bnxt_re - this is a best effort scheme to
allow userspace to slow down the doorbell rings if the HW gets full
- irdma egress VLAN priority, better QP/WQ sizing
- rxe bug fixes in queue draining and srq resizing
- Support more ethernet speed options in the core layer
- DMABUF support for bnxt_re
- Multi-stage MTT support for erdma to allow much bigger MR
registrations
- An irdma fix with a CVE that came in too late to go to -rc, missing
bounds checking for 0 length MRs"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (87 commits)
IB/hfi1: Reduce printing of errors during driver shut down
RDMA/hfi1: Move user SDMA system memory pinning code to its own file
RDMA/hfi1: Use list_for_each_entry() helper
RDMA/mlx5: Fix trailing */ formatting in block comment
RDMA/rxe: Fix redundant break statement in switch-case.
RDMA/efa: Fix wrong resources deallocation order
RDMA/siw: Call llist_reverse_order in siw_run_sq
RDMA/siw: Correct wrong debug message
RDMA/siw: Balance the reference of cep->kref in the error path
Revert "IB/isert: Fix incorrect release of isert connection"
RDMA/bnxt_re: Fix kernel doc errors
RDMA/irdma: Prevent zero-length STAG registration
RDMA/erdma: Implement hierarchical MTT
RDMA/erdma: Refactor the storage structure of MTT entries
RDMA/erdma: Renaming variable names and field names of struct erdma_mem
RDMA/hns: Support hns HW stats
RDMA/hns: Dump whole QP/CQ/MR resource in raw
RDMA/irdma: Add missing kernel-doc in irdma_setup_umode_qp()
RDMA/mlx4: Copy union directly
RDMA/irdma: Drop unused kernel push code
...
Currently irdma allows zero-length STAGs to be programmed in HW during
the kernel mode fast register flow. Zero-length MR or STAG registration
disables HW memory length checks.
Close gaps in bounds checking in irdma by preventing zero-length STAG or
MR registrations unless IB_PD_UNSAFE_GLOBAL_RKEY is set.
This addresses the disclosure CVE-2023-25775.
Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Christopher Bednarz <christopher.n.bednarz@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230818144838.1758-1-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
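The rule described above can be sketched as a small validation helper. This is a minimal model under stated assumptions: demo_validate_stag_len() and DEMO_PD_UNSAFE_GLOBAL_RKEY are hypothetical stand-ins, and the actual driver check lives in its own registration paths.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Stand-in for the PD flag that explicitly opts in to an unsafe global rkey. */
#define DEMO_PD_UNSAFE_GLOBAL_RKEY 0x01u

/*
 * Reject a zero-length registration unless the protection domain was
 * created with the unsafe-global-rkey flag.
 */
static int demo_validate_stag_len(uint64_t total_len, unsigned int pd_flags)
{
	if (total_len == 0 && !(pd_flags & DEMO_PD_UNSAFE_GLOBAL_RKEY))
		return -EINVAL;
	return 0;
}
```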
The driver has code blocks for kernel push WQEs but does not
map the doorbell page, rendering this mode non-functional [1].
Remove code associated with this feature from the kernel fast
path as there is currently no plan of record to support this.
This also addresses a sparse issue reported by lkp.
drivers/infiniband/hw/irdma/uk.c:285:24: sparse: sparse: incorrect type in assignment (different base types) @@ expected bool [usertype] push_wqe:1 @@ got restricted __le32 [usertype] *push_db @@
drivers/infiniband/hw/irdma/uk.c:285:24: sparse: expected bool [usertype] push_wqe:1
drivers/infiniband/hw/irdma/uk.c:285:24: sparse: got restricted __le32 [usertype] *push_db
drivers/infiniband/hw/irdma/uk.c:386:24: sparse: sparse: incorrect type in assignment (different base types) @@ expected bool [usertype] push_wqe:1 @@ got restricted __le32 [usertype] *push_db @@
[1] https://lore.kernel.org/linux-rdma/20230815051809.GB22185@unreal/T/#t
Fixes: 272bba19d6 ("RDMA: Remove unnecessary ternary operators")
Fixes: 551c46edc7 ("RDMA/irdma: Add user/kernel shared libraries")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202308110251.BV6BcwUR-lkp@intel.com/
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230816001209.1721-1-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Initializing these pointers to NULL before assigning them from kzalloc()
is unnecessary: if kzalloc() fails it returns NULL, and otherwise the
pointer is overwritten anyway. Remove the initialization.
Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com>
Link: https://lore.kernel.org/r/20230804082102.3361961-1-ruanjinjie@huawei.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
One-element and zero-length arrays are deprecated. So, replace
one-element array in struct irdma_qvlist_info with flexible-array
member.
A patch for this was sent a while ago[1]. However, it seems that, at
the time, the changes were partially folded[2][3], and the actual
flexible-array transformation was omitted. This patch fixes that.
The only binary difference seen before/after changes is shown below:
| drivers/infiniband/hw/irdma/hw.o
| @@ -868,7 +868,7 @@
| drivers/infiniband/hw/irdma/hw.c:484 (discriminator 2)
| size += struct_size(iw_qvlist, qv_info, rf->msix_count);
| 55b: imul $0x45c,%rdi,%rdi
|- 562: add $0x10,%rdi
|+ 562: add $0x4,%rdi
which is, of course, expected as it reflects the mistake made
while folding the patch I've mentioned above.
Worth mentioning is the fact that with this change we save 12 bytes
of memory, as can be inferred from the diff snapshot above. Notice
that:
$ pahole -C irdma_qv_info drivers/infiniband/hw/irdma/hw.o
struct irdma_qv_info {
u32 v_idx; /* 0 4 */
u16 ceq_idx; /* 4 2 */
u16 aeq_idx; /* 6 2 */
u8 itr_idx; /* 8 1 */
/* size: 12, cachelines: 1, members: 4 */
/* padding: 3 */
/* last cacheline: 12 bytes */
};
Link: https://lore.kernel.org/linux-hardening/20210525230038.GA175516@embeddedor/ [1]
Link: https://lore.kernel.org/linux-hardening/bf46b428deef4e9e89b0ea1704b1f0e5@intel.com/ [2]
Link: https://lore.kernel.org/linux-rdma/20210520143809.819-1-shiraz.saleem@intel.com/T/#u [3]
Fixes: 44d9e52977 ("RDMA/irdma: Implement device initialization definitions")
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/ZMpsQrZadBaJGkt4@work
Signed-off-by: Leon Romanovsky <leon@kernel.org>
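The 12-byte difference discussed above can be reproduced in userspace. The sketch below copies the element layout from the pahole output and assumes the list header is a single u32 count (that header layout is an assumption of this example). demo_qvlist_size() is a hypothetical helper that mimics the arithmetic of struct_size() without its overflow checking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Element layout taken from the pahole output: 9 bytes used, 3 padding. */
struct demo_qv_info {
	uint32_t v_idx;
	uint16_t ceq_idx;
	uint16_t aeq_idx;
	uint8_t  itr_idx;
};

/* Deprecated style: one-element trailing array. */
struct demo_qvlist_old {
	uint32_t num_vectors;
	struct demo_qv_info qv_info[1];
};

/* Preferred style: flexible-array member. */
struct demo_qvlist_new {
	uint32_t num_vectors;
	struct demo_qv_info qv_info[];
};

/* Header plus count elements, with no overflow checking. */
static size_t demo_qvlist_size(size_t count)
{
	return sizeof(struct demo_qvlist_new) + count * sizeof(struct demo_qv_info);
}
```

On common ABIs the old header is 16 bytes and the new one is 4 bytes, which lines up with the `add $0x10` to `add $0x4` change in the disassembly.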
Several ternary operators only select between true and false based on
a condition, which is unnecessary in C language semantics. Remove them.
Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com>
Link: https://lore.kernel.org/r/20230731085118.394443-1-ruanjinjie@huawei.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
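A minimal sketch of the pattern in question, using hypothetical helper names. Both functions return the same value because converting any non-zero value to bool already yields true.

```c
#include <assert.h>
#include <stdbool.h>

/* Redundant form: the ternary only restates the truth value. */
static bool demo_flag_set_verbose(unsigned int flags, unsigned int mask)
{
	return (flags & mask) ? true : false;
}

/* Equivalent form: conversion to bool maps any non-zero value to true. */
static bool demo_flag_set(unsigned int flags, unsigned int mask)
{
	return flags & mask;
}
```

The equivalence depends on the destination really being a bool. Assigning to a narrower integer type or a differently typed field would truncate instead of normalizing.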
Remove the description of @free_hwcqp in irdma_destroy_cqp()
to silence the warning:
drivers/infiniband/hw/irdma/hw.c:580: warning: Excess function parameter 'free_hwcqp' description in 'irdma_destroy_cqp'
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=6028
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230731015915.34867-1-yang.lee@linux.alibaba.com
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
HW GEN1 and GEN2 have different min WQ sizes but they are
currently set to the same value.
Use a gen specific attribute min_hw_wq_size and extend ABI to
pass it to user-space.
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230725155525.1081-3-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Currently the attributes cap.max_send_wr and cap.max_recv_wr
sent from user-space during create QP are the provider computed
SQ/RQ depth as opposed to raw values passed from application.
This inhibits computation of an accurate value for max_send_wr
and max_recv_wr for this QP in the kernel which matches the value
returned in user create QP. Also these capabilities need to be
reported from the driver in query QP.
Add support by extending the ABI to allow the raw cap.max_send_wr and
cap.max_recv_wr to be passed from user-space, while keeping compatibility
for the older scheme.
The internal HW depth and shift needed for the WQs need to be computed
now for both kernel and user-mode QPs. Add new helpers to assist with this:
irdma_uk_calc_depth_shift_sq, irdma_uk_calc_depth_shift_rq and
irdma_uk_calc_depth_shift_wq.
Consolidate all the user mode QP setup into a new function
irdma_setup_umode_qp which keeps it with its counterpart
irdma_setup_kmode_qp.
Signed-off-by: Youvaraj Sagar <youvaraj.sagar@intel.com>
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230725155525.1081-2-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
The return value from irdma_netdev_vlan_ipv6() is not used. Rename
the functions and change to a void return.
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230725155505.1069-5-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Add a CQ table based lookup to allow a quick search
for the CQ pointer by CQ ID in case of a CQ related
asynchronous event. The table is implemented in a
similar fashion to the QP table.
Also add reference counters for CQs. This is to prevent
destroying a CQ while an asynchronous event is being processed.
The memory resource table is larger with this update,
and this table doesn't need to be physically contiguous, so use
vzalloc() instead of kzalloc() to allocate the table.
Signed-off-by: Krzysztof Czurylo <krzysztof.czurylo@intel.com>
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230725155505.1069-4-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
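The lookup-plus-refcount idea above can be sketched as follows. This is a single-threaded userspace model with hypothetical names (demo_cq, demo_cq_table, demo_cq_get, demo_cq_put); a real driver would also need locking around the table and a proper refcount type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_CQ 8

struct demo_cq {
	uint32_t id;
	int refcnt;
};

/* Table indexed by CQ ID, so an event handler can find the CQ directly. */
static struct demo_cq *demo_cq_table[DEMO_MAX_CQ];

/* Publish a CQ in the table; the creator holds the initial reference. */
static void demo_cq_add(struct demo_cq *cq)
{
	cq->refcnt = 1;
	demo_cq_table[cq->id] = cq;
}

/* Look up by ID and take a reference while the event is processed. */
static struct demo_cq *demo_cq_get(uint32_t id)
{
	struct demo_cq *cq;

	if (id >= DEMO_MAX_CQ)
		return NULL;
	cq = demo_cq_table[id];
	if (cq)
		cq->refcnt++;
	return cq;
}

/* Drop a reference; returns 1 once the last one is gone and the slot is cleared. */
static int demo_cq_put(struct demo_cq *cq)
{
	if (--cq->refcnt > 0)
		return 0;
	demo_cq_table[cq->id] = NULL;
	return 1;
}
```

In this model a destroy that runs while an event handler still holds its reference only drops the count; the CQ goes away when the handler calls demo_cq_put().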
In case of a failure in irdma_create_cqp, do not call
irdma_destroy_cqp, but cleanup all the allocated resources
in reverse order.
Drop the extra argument in irdma_destroy_cqp as it is no longer needed.
Signed-off-by: Krzysztof Czurylo <krzysztof.czurylo@intel.com>
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230725155505.1069-3-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
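The reverse-order cleanup described above is commonly written with goto labels. The sketch below is a generic model, not the irdma code: demo_create() acquires three hypothetical resources A, B and C, and a trace string records what was acquired (upper case) and released (lower case).

```c
#include <assert.h>
#include <string.h>

/* Trace of acquisitions (upper case) and releases (lower case). */
static char demo_log[16];

static void demo_note(char c)
{
	size_t n = strlen(demo_log);

	if (n + 1 < sizeof(demo_log)) {
		demo_log[n] = c;
		demo_log[n + 1] = '\0';
	}
}

/* Acquire one resource, or fail if this is the step chosen to fail. */
static int demo_step(char tag, int fail_at, int step)
{
	if (step == fail_at)
		return -1;
	demo_note(tag);
	return 0;
}

/* On failure, release only what was acquired, in reverse order. */
static int demo_create(int fail_at)
{
	demo_log[0] = '\0';

	if (demo_step('A', fail_at, 1))
		goto err;
	if (demo_step('B', fail_at, 2))
		goto undo_a;
	if (demo_step('C', fail_at, 3))
		goto undo_b;
	return 0;

undo_b:
	demo_note('b');
undo_a:
	demo_note('a');
err:
	return -1;
}
```

Because each label releases exactly one resource and falls through to the next, a failure at any step never touches state that was not set up yet, which a shared destroy function cannot guarantee.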
Drop the local wqe_idx in irdma_sc_get_next_aeqe and instead
store the wqe_idx in the info structure for all asynchronous events(AE)
received. There is no reason it should be tied to a specific AE source.
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230725155505.1069-2-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Report the correct WC error if a MW bind is performed
on an already valid/bound window.
Fixes: 44d9e52977 ("RDMA/irdma: Implement device initialization definitions")
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230725155439.1057-2-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
The op_type field in the CQ poll info structure is incorrectly
filled in with the queue type as opposed to the op_type
received in the CQEs. The wrong opcode could be decoded
and returned to the ULP.
Copy the op_type field received in the CQE in the CQ poll
info structure.
Fixes: 24419777e9 ("RDMA/irdma: Fix RQ completion opcode")
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230725155439.1057-1-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
The new irdma_iw_get_vlan_prio() function requires IPv6 support to build:
x86_64-linux-ld: drivers/infiniband/hw/irdma/cm.o: in function `irdma_iw_get_vlan_prio':
cm.c:(.text+0x2832): undefined reference to `ipv6_chk_addr'
Add a compile-time check in the same way as elsewhere in this file to avoid
this by conditionally leaving out the ipv6 specific bits.
Fixes: f877f22ac1 ("RDMA/irdma: Implement egress VLAN priority")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20230718193835.3546684-1-arnd@kernel.org
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
CQP completion statistics are read locklessly in irdma_wait_event and
irdma_check_cqp_progress while they can be updated in the completion
thread irdma_sc_ccq_get_cqe_info on another CPU, as KCSAN reports.
Make completion statistics an atomic variable to reflect coherent updates
to it. This also avoids load/store tearing bugs that compiler
optimizations could otherwise introduce.
[77346.170861] BUG: KCSAN: data-race in irdma_handle_cqp_op [irdma] / irdma_sc_ccq_get_cqe_info [irdma]
[77346.171383] write to 0xffff8a3250b108e0 of 8 bytes by task 9544 on cpu 4:
[77346.171483] irdma_sc_ccq_get_cqe_info+0x27a/0x370 [irdma]
[77346.171658] irdma_cqp_ce_handler+0x164/0x270 [irdma]
[77346.171835] cqp_compl_worker+0x1b/0x20 [irdma]
[77346.172009] process_one_work+0x4d1/0xa40
[77346.172024] worker_thread+0x319/0x700
[77346.172037] kthread+0x180/0x1b0
[77346.172054] ret_from_fork+0x22/0x30
[77346.172136] read to 0xffff8a3250b108e0 of 8 bytes by task 9838 on cpu 2:
[77346.172234] irdma_handle_cqp_op+0xf4/0x4b0 [irdma]
[77346.172413] irdma_cqp_aeq_cmd+0x75/0xa0 [irdma]
[77346.172592] irdma_create_aeq+0x390/0x45a [irdma]
[77346.172769] irdma_rt_init_hw.cold+0x212/0x85d [irdma]
[77346.172944] irdma_probe+0x54f/0x620 [irdma]
[77346.173122] auxiliary_bus_probe+0x66/0xa0
[77346.173137] really_probe+0x140/0x540
[77346.173154] __driver_probe_device+0xc7/0x220
[77346.173173] driver_probe_device+0x5f/0x140
[77346.173190] __driver_attach+0xf0/0x2c0
[77346.173208] bus_for_each_dev+0xa8/0xf0
[77346.173225] driver_attach+0x29/0x30
[77346.173240] bus_add_driver+0x29c/0x2f0
[77346.173255] driver_register+0x10f/0x1a0
[77346.173272] __auxiliary_driver_register+0xbc/0x140
[77346.173287] irdma_init_module+0x55/0x1000 [irdma]
[77346.173460] do_one_initcall+0x7d/0x410
[77346.173475] do_init_module+0x81/0x2c0
[77346.173491] load_module+0x1232/0x12c0
[77346.173506] __do_sys_finit_module+0x101/0x180
[77346.173522] __x64_sys_finit_module+0x3c/0x50
[77346.173538] do_syscall_64+0x39/0x90
[77346.173553] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[77346.173634] value changed: 0x0000000000000094 -> 0x0000000000000095
Fixes: 915cc7ac0f ("RDMA/irdma: Add miscellaneous utility definitions")
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230711175253.1289-3-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
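A userspace analogue of the fix, using C11 atomics in place of the kernel's atomic types. All names here (demo_cqp, completed_ops and the helpers) are hypothetical; the sketch only shows that both the writer and the lockless readers go through atomic accessors.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

struct demo_cqp {
	atomic_uint_fast64_t completed_ops;
};

/* Completion side: bump the counter atomically. */
static void demo_cqp_complete(struct demo_cqp *cqp)
{
	atomic_fetch_add(&cqp->completed_ops, 1);
}

/* Waiter side: a single atomic load, so the value cannot be torn. */
static uint64_t demo_cqp_completed(struct demo_cqp *cqp)
{
	return (uint64_t)atomic_load(&cqp->completed_ops);
}

/* Progress check: has the counter moved since the caller last looked? */
static int demo_cqp_progressed(struct demo_cqp *cqp, uint64_t *last_seen)
{
	uint64_t now = demo_cqp_completed(cqp);
	int moved = now != *last_seen;

	*last_seen = now;
	return moved;
}
```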
On code inspection, there are many instances in the driver where
CEQE and AEQE fields written to by HW are read without guaranteeing
that the polarity bit has been read and checked first.
Add a read barrier to avoid reordering of loads on the CEQE/AEQE fields
prior to checking the polarity bit.
Fixes: 3f49d68425 ("RDMA/irdma: Implement HW Admin Queue OPs")
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230711175253.1289-2-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
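The ordering requirement above can be sketched with C11 atomics standing in for the kernel's barrier primitives. demo_eqe, DEMO_VALID_BIT and demo_eqe_poll() are hypothetical, and the entry layout is invented for illustration. The point is only that the payload is loaded after the valid/polarity check, with an acquire fence between the two loads.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_VALID_BIT (1u << 31)

/* Invented entry layout: header word with a valid bit, then a payload. */
struct demo_eqe {
	_Atomic uint32_t hdr;
	uint32_t payload;
};

/*
 * Return true and copy the payload only if the entry's valid bit matches
 * the polarity the consumer currently expects.
 */
static bool demo_eqe_poll(struct demo_eqe *e, bool polarity, uint32_t *payload)
{
	uint32_t hdr = atomic_load_explicit(&e->hdr, memory_order_relaxed);
	bool valid = (hdr & DEMO_VALID_BIT) != 0;

	if (valid != polarity)
		return false;

	/* Keep the payload load from being reordered before the valid check. */
	atomic_thread_fence(memory_order_acquire);
	*payload = e->payload;
	return true;
}
```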
When a VLAN interface is in use, get and use the VLAN
egress mapping.
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230711175318.1301-1-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Delete a duplicate statement from this function implementation.
Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Minjie Du <duminjie@vivo.com>
Acked-by: Alok Prasad <palok@marvell.com>
Link: https://lore.kernel.org/r/20230706022704.1260-1-duminjie@vivo.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Merge tag 'v6.4' into rdma.git for-next
Linux 6.4
Resolve conflicts between rdma rc and next in rxe_cq matching linux-next:
drivers/infiniband/sw/rxe/rxe_cq.c:
https://lore.kernel.org/r/20230622115246.365d30ad@canb.auug.org.au
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Commit df8fc4e934 ("kbuild: Enable -fstrict-flex-arrays=3") triggers a
warning for fortified memset():
In function 'fortify_memset_chk',
inlined from 'irdma_clr_wqes' at drivers/infiniband/hw/irdma/uk.c:103:4:
include/linux/fortify-string.h:493:25: error: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror=attribute-warning]
493 | __write_overflow_field(p_size_field, size);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The problem here is that the inner array only has four 8-byte elements, so
clearing 4096 bytes overflows that. As this structure is part of an outer
array, change the code to pass a pointer to the irdma_qp_quanta instead,
and change the size argument for readability, matching the comment above
it.
Fixes: 551c46edc7 ("RDMA/irdma: Add user/kernel shared libraries")
Link: https://lore.kernel.org/r/20230523111859.2197825-1-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
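The shape of the fix can be modelled in userspace. demo_quanta mirrors the "four 8-byte elements" described above, and demo_clr_wqes() is a hypothetical helper that passes a pointer to the outer array element and expresses the size as a count of elements rather than a bare 4096.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_QUANTA_PER_CLEAR 128

/* Four 8-byte elements: 32 bytes per quanta. */
struct demo_quanta {
	uint64_t elem[4];
};

/*
 * Clear 128 quanta starting at idx. The destination is the outer array
 * element, not its inner elem[] field, so the size is not checked against
 * a 32-byte field.
 */
static void demo_clr_wqes(struct demo_quanta *ring, size_t idx)
{
	memset(&ring[idx], 0, sizeof(*ring) * DEMO_QUANTA_PER_CLEAR);
}
```

The byte count is unchanged (128 * 32 = 4096); only the destination expression and the way the size is spelled differ.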
If the local invalidate fence is indicated in the WR, only the read fence
is currently being set in WQE. Fix this to set both the read and local
fence in the WQE.
Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Link: https://lore.kernel.org/r/20230522155654.1309-4-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
There is a window where the poll cq may use a QP that has been freed.
This can happen if a CQE is polled before irdma_clean_cqes() can clear the
CQE's related to the QP and the destroy QP races to free the QP memory.
The QP structures are then used in irdma_poll_cq. Fix this by moving the
clearing of CQE's before the reference is removed and the QP is destroyed.
Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Link: https://lore.kernel.org/r/20230522155654.1309-3-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Move the initialization of the iw device ops to be under the declaration
of the irdma_iw_dev_ops.
Link: https://lore.kernel.org/r/20230515191142.413633-4-kheib@redhat.com
Signed-off-by: Kamal Heib <kheib@redhat.com>
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The return value from irdma_init_rdma_device() is always 0 - change it to
be void.
Link: https://lore.kernel.org/r/20230515191142.413633-3-kheib@redhat.com
Signed-off-by: Kamal Heib <kheib@redhat.com>
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The return value from irdma_init_iw_device() is always 0 - change it to be
void.
Link: https://lore.kernel.org/r/20230515191142.413633-2-kheib@redhat.com
Signed-off-by: Kamal Heib <kheib@redhat.com>
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma updates from Jason Gunthorpe:
"Usual wide collection of unrelated items in drivers:
- Driver bug fixes and treewide cleanups in hfi1, siw, qib, mlx5,
rxe, usnic, usnic, bnxt_re, ocrdma, iser:
- remove unnecessary NULL checks
- kmap obsolescence
- pci_enable_pcie_error_reporting() obsolescence
- unused variables and macros
- trace event related warnings
- casting warnings
- Code cleanups for irdma and erdma
- EFA reporting of 128 byte PCIe TLP support
- mlx5 more aggressively uses the out of order HW feature
- Big rework of how state machines and tasks work in rxe
- Fix a syzkaller found crash netdev refcount leak in siw
- bnxt_re revises their HW description header
- Congestion control for bnxt_re
- Use mmu_notifiers more safely in hfi1
- mlx5 gets better support for PCIe relaxed ordering inside VMs"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (81 commits)
RDMA/efa: Add rdma write capability to device caps
RDMA/mlx5: Use correct device num_ports when modify DC
RDMA/irdma: Drop spurious WQ_UNBOUND from alloc_ordered_workqueue() call
RDMA/rxe: Fix spinlock recursion deadlock on requester
RDMA/mlx5: Fix flow counter query via DEVX
RDMA/rxe: Protect QP state with qp->state_lock
RDMA/rxe: Move code to check if drained to subroutine
RDMA/rxe: Remove qp->req.state
RDMA/rxe: Remove qp->comp.state
RDMA/rxe: Remove qp->resp.state
RDMA/mlx5: Allow relaxed ordering read in VFs and VMs
net/mlx5: Update relaxed ordering read HCA capabilities
RDMA/mlx5: Check pcie_relaxed_ordering_enabled() in UMR
RDMA/mlx5: Remove pcie_relaxed_ordering_enabled() check for RO write
RDMA: Add ib_virt_dma_to_page()
RDMA/rxe: Fix the error "trying to register non-static key in rxe_cleanup_task"
RDMA/irdma: Slightly optimize irdma_form_ah_cm_frame()
RDMA/rxe: Fix incorrect TASKLET_STATE_SCHED check in rxe_task.c
IB/hfi1: Place struct mmu_rb_handler on cache line start
IB/hfi1: Fix bugs with non-PAGE_SIZE-end multi-iovec user SDMA requests
...
Workqueue is in the process of cleaning up the distinction between unbound
workqueues w/ @nr_active==1 and ordered workqueues. Explicit WQ_UNBOUND
isn't needed for alloc_ordered_workqueue() and will trigger a warning in
the future. Let's remove it. This doesn't cause any functional changes.
Link: https://lore.kernel.org/r/ZEGW-IcFReR1juVM@slm.duckdns.org
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>