1
0
Fork 0
mirror of synced 2025-03-06 20:59:54 +01:00
Commit graph

8739 commits

Author SHA1 Message Date
Leon Romanovsky
d4b2d16571 RDMA/hfi1: Workaround truncation compilation error
Increase name array to be large enough to overcome the following
compilation error.

drivers/infiniband/hw/hfi1/efivar.c: In function ‘read_hfi1_efi_var’:
drivers/infiniband/hw/hfi1/efivar.c:124:44: error: ‘snprintf’ output may be truncated before the last format character [-Werror=format-truncation=]
  124 |         snprintf(name, sizeof(name), "%s-%s", prefix_name, kind);
      |                                            ^
drivers/infiniband/hw/hfi1/efivar.c:124:9: note: ‘snprintf’ output 2 or more bytes (assuming 65) into a destination of size 64
  124 |         snprintf(name, sizeof(name), "%s-%s", prefix_name, kind);
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/infiniband/hw/hfi1/efivar.c:133:52: error: ‘snprintf’ output may be truncated before the last format character [-Werror=format-truncation=]
  133 |                 snprintf(name, sizeof(name), "%s-%s", prefix_name, kind);
      |                                                    ^
drivers/infiniband/hw/hfi1/efivar.c:133:17: note: ‘snprintf’ output 2 or more bytes (assuming 65) into a destination of size 64
  133 |                 snprintf(name, sizeof(name), "%s-%s", prefix_name, kind);
      |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
make[6]: *** [scripts/Makefile.build:243: drivers/infiniband/hw/hfi1/efivar.o] Error 1

Fixes: c03c08d50b ("IB/hfi1: Check upper-case EFI variables")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/238fa39a8fd60e87a5ad7e1ca6584fcdf32e9519.1698159993.git.leonro@nvidia.com
Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-10-25 11:13:57 +03:00
Chengfeng Ye
2f19c4b839 IB/hfi1: Fix potential deadlock on &irq_src_lock and &dd->uctxt_lock
handle_receive_interrupt_napi_sp() running inside interrupt handler
could introduce inverse lock ordering between &dd->irq_src_lock
and &dd->uctxt_lock, if read_mod_write() is preempted by the isr.

          [CPU0]                                        |          [CPU1]
hfi1_ipoib_dev_open()                                   |
--> hfi1_netdev_enable_queues()                         |
--> enable_queues(rx)                                   |
--> hfi1_rcvctrl()                                      |
--> set_intr_bits()                                     |
--> read_mod_write()                                    |
--> spin_lock(&dd->irq_src_lock)                        |
                                                        | hfi1_poll()
                                                        | --> poll_next()
                                                        | --> spin_lock_irq(&dd->uctxt_lock)
                                                        |
                                                        | --> hfi1_rcvctrl()
                                                        | --> set_intr_bits()
                                                        | --> read_mod_write()
                                                        | --> spin_lock(&dd->irq_src_lock)
<interrupt>                                             |
   --> handle_receive_interrupt_napi_sp()               |
   --> set_all_fastpath()                               |
   --> hfi1_rcd_get_by_index()                          |
   --> spin_lock_irqsave(&dd->uctxt_lock)               |

This flaw was found by an experimental static analysis tool I am
developing for irq-related deadlock.

To prevent the potential deadlock, the patch use spin_lock_irqsave()
on &dd->irq_src_lock inside read_mod_write() to prevent the possible
deadlock scenario.

Signed-off-by: Chengfeng Ye <dg573847474@gmail.com>
Link: https://lore.kernel.org/r/20230926101116.2797-1-dg573847474@gmail.com
Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-10-25 11:12:43 +03:00
Yang Li
7a1c2abf9a RDMA/core: Remove NULL check before dev_{put, hold}
The call netdev_{put, hold} of dev_{put, hold} will check NULL,
so there is no need to check before using dev_{put, hold},
remove it to silence the warning:

./drivers/infiniband/core/nldev.c:375:2-9: WARNING: NULL check before dev_{put, hold} functions is not needed.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7047
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231024003815.89742-1-yang.lee@linux.alibaba.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-10-24 18:16:04 +03:00
Colin Ian King
b55706366c RDMA/hfi1: Remove redundant assignment to pointer ppd
Pointer ppd is being assigned a value in a for-loop however it
is never read. The assignment is redundant and can be removed.
Cleans up clang scan build warning:

drivers/infiniband/hw/hfi1/init.c:1030:3: warning: Value stored
to 'ppd' is never read [deadcode.DeadStores]

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://lore.kernel.org/r/20231023141733.667807-1-colin.i.king@gmail.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-10-24 17:33:25 +03:00
Patrisious Haddad
02e7d139e5 RDMA/mlx5: Change the key being sent for MPV device affiliation
Change the key that we send from IB driver to EN driver regarding the
MPV device affiliation, since at that stage the IB device is not yet
initialized, so its index would be zero for different IB devices and
cause wrong associations between unrelated master and slave devices.

Instead use a unique value from inside the core device which is already
initialized at this stage.

Fixes: 0d293714ac ("RDMA/mlx5: Send events from IB driver about device affiliation state")
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Link: https://lore.kernel.org/r/ac7e66357d963fc68d7a419515180212c96d137d.1697705185.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-10-24 17:10:58 +03:00
Nathan Chancellor
9040c0d96f RDMA/bnxt_re: Fix clang -Wimplicit-fallthrough in bnxt_re_handle_cq_async_error()
Clang warns (or errors with CONFIG_WERROR=y):

  drivers/infiniband/hw/bnxt_re/main.c:1114:2: error: unannotated fall-through between switch labels [-Werror,-Wimplicit-fallthrough]
   1114 |         default:
        |         ^
  drivers/infiniband/hw/bnxt_re/main.c:1114:2: note: insert 'break;' to avoid fall-through
   1114 |         default:
        |         ^
        |         break;
  1 error generated.

Clang is a little more pedantic than GCC, which does not warn when
falling through to a case that is just break or return. Clang's version
is more in line with the kernel's own stance in deprecated.rst, which
states that all switch/case blocks must end in either break,
fallthrough, continue, goto, or return. Add the missing break to silence
the warning.

Closes: https://github.com/ClangBuiltLinux/linux/issues/1950
Fixes: b02fd3f79e ("RDMA/bnxt_re: Report async events and errors")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20231020-bnxt_re-implicit-fallthrough-v1-1-b5c19534a6c9@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-10-22 10:31:01 +03:00
Junxian Huang
07f06e0e5c RDMA/hns: Fix init failure of RoCE VF and HIP08
During device init, a struct for HW stats will be allocated. As HW
stats are not supported for VF and HIP08, currently
hns_roce_alloc_hw_port_stats() returns NULL in this case. However,
ib-core considers the returned NULL pointer as memory allocation
failure and returns ENOMEM, eventually leading to the failure of VF
and HIP08 init.

In the case where the driver does not support the .alloc_hw_port_stats()
ops, ib-core will return EOPNOTSUPP and ignore this error code in the
upper layer function. So for VF and HIP08, just don't set the HW stats
ops to ib-core.

Fixes: 5a87279591 ("RDMA/hns: Support hns HW stats")
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Link: https://lore.kernel.org/r/20231017125239.164455-8-huangjunxian6@hisilicon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-10-19 09:49:40 +03:00
Junxian Huang
b4a797b894 RDMA/hns: Fix unnecessary port_num transition in HW stats allocation
The num_ports capability of devices should be compared with the
number of port(i.e. the input param "port_num") but not the port
index(i.e. port_num - 1).

Fixes: 5a87279591 ("RDMA/hns: Support hns HW stats")
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Link: https://lore.kernel.org/r/20231017125239.164455-7-huangjunxian6@hisilicon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-10-19 09:49:40 +03:00
Luoyouming
27c5fd271d RDMA/hns: The UD mode can only be configured with DCQCN
Due to hardware limitations, only DCQCN is supported for UD. Therefore, the
default algorithm for UD is set to DCQCN.

Fixes: f91696f2f0 ("RDMA/hns: Support congestion control type selection according to the FW")
Signed-off-by: Luoyouming <luoyouming@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Link: https://lore.kernel.org/r/20231017125239.164455-6-huangjunxian6@hisilicon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-10-19 09:49:40 +03:00
Luoyouming
5e617c18b1 RDMA/hns: Add check for SL
SL set by users may exceed the capability of devices. So add check
for this situation.

Fixes: fba429fcf9 ("RDMA/hns: Fix missing fields in address vector")
Fixes: 70f9252158 ("RDMA/hns: Use the reserved loopback QPs to free MR before destroying MPT")
Fixes: f0cb411aad ("RDMA/hns: Use new interface to modify QP context")
Signed-off-by: Luoyouming <luoyouming@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Link: https://lore.kernel.org/r/20231017125239.164455-5-huangjunxian6@hisilicon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-10-19 09:49:40 +03:00
Chengchang Tang
b5f9efff10 RDMA/hns: Fix signed-unsigned mixed comparisons
The ib_mtu_enum_to_int() and uverbs_attr_get_len() may returns a negative
value. In this case, mixed comparisons of signed and unsigned types will
throw wrong results.

This patch adds judgement for this situation.

Fixes: 30b707886a ("RDMA/hns: Support inline data in extented sge space for RC")
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Link: https://lore.kernel.org/r/20231017125239.164455-4-huangjunxian6@hisilicon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-10-19 09:49:40 +03:00
Chengchang Tang
c64e9710f9 RDMA/hns: Fix uninitialized ucmd in hns_roce_create_qp_common()
ucmd in hns_roce_create_qp_common() are not initialized. But it works fine
until new member sdb_addr is added to struct hns_roce_ib_create_qp.

If the user-mode driver uses an old version ABI, then the value of the new
member will be undefined after ib_copy_from_udata().

This patch fixes it by initialize this variable to 0. And the default value
of the new member sdb_addr will be 0 which is invalid.

Fixes: 0425e3e6e0 ("RDMA/hns: Support flush cqe for hip08 in kernel space")
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Link: https://lore.kernel.org/r/20231017125239.164455-3-huangjunxian6@hisilicon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-10-19 09:49:40 +03:00
Chengchang Tang
9faef73ef4 RDMA/hns: Fix printing level of asynchronous events
The current driver will print all asynchronous events. Some of the
print levels are set improperly, e.g. SRQ limit reach and SRQ last
wqe reach, which may also occur during normal operation of the software.
Currently, the information of these event is printed as a warning,
which causes a large amount of printing even during normal use of the
application. As a result, the service performance deteriorates.

This patch fixes the printing storms by modifying the print level.

Fixes: b00a92c8f2 ("RDMA/hns: Move all prints out of irq handle")
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Link: https://lore.kernel.org/r/20231017125239.164455-2-huangjunxian6@hisilicon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-10-19 09:49:40 +03:00
Jeff Layton
7e6481cebd
qib: convert to new timestamp accessors
Convert to using the new inode timestamp accessor functions.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Link: https://lore.kernel.org/r/20231004185347.80880-5-jlayton@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-10-18 13:26:16 +02:00
Chandramohan Akula
45cfa8864c RDMA/bnxt_re: Do not report SRQ error in srq notification
In the SRQ notification handler, do not report the SRQ_ERROR
in the default event case, as there was no error.

Signed-off-by: Chandramohan Akula <chandramohan.akula@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1697049097-31992-4-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-10-15 11:48:32 +03:00
Chandramohan Akula
b02fd3f79e RDMA/bnxt_re: Report async events and errors
Report QP, SRQ and CQ async events and errors.

Signed-off-by: Chandramohan Akula <chandramohan.akula@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1697049097-31992-3-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-10-15 11:48:32 +03:00
Chandramohan Akula
d60a779673 RDMA/bnxt_re: Update HW interface headers
Updating the HW structures for the affiliated event and error
reporting. Newly added interface structures will be used in the
followup patch.

Signed-off-by: Chandramohan Akula <chandramohan.akula@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1697049097-31992-2-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-10-15 11:48:32 +03:00
Patrisious Haddad
c1336bb4aa IB/mlx5: Fix rdma counter binding for RAW QP
Previously when we had a RAW QP, we bound a counter to it when it moved
to INIT state, using the counter context inside RQC.

But when we try to modify that counter later in RTS state we used
modify QP which tries to change the counter inside QPC instead of RQC.

Now we correctly modify the counter set_id inside of RQC instead of QPC
for the RAW QP.

Fixes: d14133dd41 ("IB/mlx5: Support set qp counter")
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Mark Zhang <markzhang@nvidia.com>
Link: https://lore.kernel.org/r/2e5ab6713784a8fe997d19c508187a0dfecf2dfc.1696847964.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-10-15 11:04:01 +03:00
Jakub Kicinski
1bc60524ca Merge branch 'mlx5-next' of https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Leon Romanovsky says:

====================
This PR is collected from
https://lore.kernel.org/all/cover.1695296682.git.leon@kernel.org

This series from Patrisious extends mlx5 to support IPsec packet offload
in multiport devices (MPV, see [1] for more details).

These devices have single flow steering logic and two netdev interfaces,
which require extra logic to manage IPsec configurations as they performed
on netdevs.

[1] https://lore.kernel.org/linux-rdma/20180104152544.28919-1-leon@kernel.org/

* 'mlx5-next' of https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
  net/mlx5: Handle IPsec steering upon master unbind/bind
  net/mlx5: Configure IPsec steering for ingress RoCEv2 MPV traffic
  net/mlx5: Configure IPsec steering for egress RoCEv2 MPV traffic
  net/mlx5: Add create alias flow table function to ipsec roce
  net/mlx5: Implement alias object allow and create functions
  net/mlx5: Add alias flow table bits
  net/mlx5: Store devcom pointer inside IPsec RoCE
  net/mlx5: Register mlx5e priv to devcom in MPV mode
  RDMA/mlx5: Send events from IB driver about device affiliation state
  net/mlx5: Introduce ifc bits for migration in a chunk mode

====================

Link: https://lore.kernel.org/r/20231002083832.19746-1-leon@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-13 09:35:34 -07:00
Sindhu Devale
5ac388db27 RDMA/irdma: Add support to re-register a memory region
Add support for reregister MR verb API by doing a de-register
followed by a register MR with the new attributes. Reuse resources
like iwmr handle and HW stag where possible.

Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20231004151306.228-1-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-10-09 08:39:49 +03:00
Leon Romanovsky
16419098e8 IPsec packet offload support in multiport RoCE devices
This series from Patrisious extends mlx5 to support IPsec packet offload
in multiport devices (MPV, see [1] for more details).

These devices have single flow steering logic and two netdev interfaces,
which require extra logic to manage IPsec configurations as they performed
on netdevs.

Thanks

[1] https://lore.kernel.org/linux-rdma/20180104152544.28919-1-leon@kernel.org/

Link: https://lore.kernel.org/all/20231002083832.19746-1-leon@kernel.org
Signed-of-by: Leon Romanovsky <leon@kernel.org>

* mlx5-next: (576 commits)
  net/mlx5: Handle IPsec steering upon master unbind/bind
  net/mlx5: Configure IPsec steering for ingress RoCEv2 MPV traffic
  net/mlx5: Configure IPsec steering for egress RoCEv2 MPV traffic
  net/mlx5: Add create alias flow table function to ipsec roce
  net/mlx5: Implement alias object allow and create functions
  net/mlx5: Add alias flow table bits
  net/mlx5: Store devcom pointer inside IPsec RoCE
  net/mlx5: Register mlx5e priv to devcom in MPV mode
  RDMA/mlx5: Send events from IB driver about device affiliation state
  net/mlx5: Introduce ifc bits for migration in a chunk mode
  Linux 6.6-rc3
  ...
2023-10-04 21:21:49 +03:00
Kees Cook
964168970c IB/hfi1: Annotate struct tid_rb_node with __counted_by
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct tid_rb_node.

[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci

Cc: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20230929180431.3005464-7-keescook@chromium.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-10-02 14:44:54 +03:00
Kees Cook
2aba54a9e0 IB/mthca: Annotate struct mthca_icm_table with __counted_by
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct mthca_icm_table.

[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci

Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20230929180431.3005464-6-keescook@chromium.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-10-02 14:44:54 +03:00
Kees Cook
ed7c64de62 RDMA/usnic: Annotate struct usnic_uiom_chunk with __counted_by
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct usnic_uiom_chunk.

[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci

Cc: Christian Benvenuti <benve@cisco.com>
Cc: Nelson Escobar <neescoba@cisco.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20230929180431.3005464-3-keescook@chromium.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-10-02 14:44:54 +03:00
Leon Romanovsky
c99a7457e5 RDMA/mlx5: Remove not-used cache disable flag
During execution of mlx5_mkey_cache_cleanup(), there is a guarantee
that MR are not registered and/or destroyed. It means that we don't
need newly introduced cache disable flag.

Fixes: 374012b004 ("RDMA/mlx5: Fix mkey cache possible deadlock on cleanup")
Link: https://lore.kernel.org/r/c7e9c9f98c8ae4a7413d97d9349b29f5b0a23dbe.1695921626.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2023-10-02 14:32:44 +03:00
Yangyang Li
c9813b0b99 RDMA/hns: Support SRQ record doorbell
Compared with normal doorbell, using record doorbell can shorten the
process of ringing the doorbell and reduce the latency.

Add a flag HNS_ROCE_CAP_FLAG_SRQ_RECORD_DB to allow FW to
enable/disable SRQ record doorbell.

If the flag above is set, allocate the dma buffer for SRQ record
doorbell and write the buffer address into SRQC during SRQ creation.

For userspace SRQ, add a flag HNS_ROCE_RSP_SRQ_CAP_RECORD_DB to notify
userspace whether the SRQ record doorbell is enabled.

Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Link: https://lore.kernel.org/r/20230926130026.583088-1-huangjunxian6@hisilicon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-10-02 11:47:08 +03:00
Patrisious Haddad
0d293714ac RDMA/mlx5: Send events from IB driver about device affiliation state
Send blocking events from IB driver whenever the device is done being
affiliated or if it is removed from an affiliation.

This is useful since now the EN driver can register to those event and
know when a device is affiliated or not.

Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Link: https://lore.kernel.org/r/a7491c3e483cfd8d962f5f75b9a25f253043384a.1695296682.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-10-02 11:20:59 +03:00
Patrisious Haddad
4f4db19089 IB/mlx5: Adjust mlx5 rate mapping to support 800Gb
Adjust mlx5 function which maps the speed rate from IB spec values
to internal driver values to be able to handle speeds up to 800Gb.

Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Mark Zhang <markzhang@nvidia.com>
Link: https://lore.kernel.org/r/301c803d8486b0df8aefad3fb3cc10dc58671985.1695204156.git.leon@kernel.org
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-09-26 12:38:54 +03:00
Patrisious Haddad
b28ad32442 IB/mlx5: Rename 400G_8X speed to comply to naming convention
Rename 400G_8X speed to comply to naming convention.

Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Mark Zhang <markzhang@nvidia.com>
Link: https://lore.kernel.org/r/ac98447cac8379a43fbdb36d56e5fb2b741a97ff.1695204156.git.leon@kernel.org
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-09-26 12:38:50 +03:00
Patrisious Haddad
948f0bf5ad IB/mlx5: Add support for 800G_8X lane speed
Add a check for 800G_8X speed when querying PTYS and report it back
correctly when needed.

Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Mark Zhang <markzhang@nvidia.com>
Link: https://lore.kernel.org/r/26fd0b6e1fac071c3eb779657bb3d8ba47f47c4f.1695204156.git.leon@kernel.org
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-09-26 12:38:46 +03:00
Or Har-Toov
561b4a3ac6 IB/mlx5: Expose XDR speed through MAD
Under MAD query port, Report NDR speed when NDR is supported in the port
capability mask.

Signed-off-by: Or Har-Toov <ohartoov@nvidia.com>
Reviewed-by: Mark Zhang <markzhang@nvidia.com>
Link: https://lore.kernel.org/r/d30bdec2a66a8a2edd1d84ee61453c58cf346b43.1695204156.git.leon@kernel.org
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-09-26 12:38:43 +03:00
Shay Drory
57e7071683 RDMA/mlx5: Implement mkeys management via LIFO queue
Currently, mkeys are managed via xarray. This implementation leads to
a degradation in cases many MRs are unregistered in parallel, due to xarray
internal implementation, for example: deregistration 1M MRs via 64 threads
is taking ~15% more time[1].

Hence, implement mkeys management via LIFO queue, which solved the
degradation.

[1]
2.8us in kernel v5.19 compare to 3.2us in kernel v6.4

Signed-off-by: Shay Drory <shayd@nvidia.com>
Link: https://lore.kernel.org/r/fde3d4cfab0f32f0ccb231cd113298256e1502c5.1695283384.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-09-26 12:36:18 +03:00
Shay Drory
374012b004 RDMA/mlx5: Fix mkey cache possible deadlock on cleanup
Fix the deadlock by refactoring the MR cache cleanup flow to flush the
workqueue without holding the rb_lock.
This adds a race between cache cleanup and creation of new entries which
we solve by denied creation of new entries after cache cleanup started.

Lockdep:
WARNING: possible circular locking dependency detected
 [ 2785.326074 ] 6.2.0-rc6_for_upstream_debug_2023_01_31_14_02 #1 Not tainted
 [ 2785.339778 ] ------------------------------------------------------
 [ 2785.340848 ] devlink/53872 is trying to acquire lock:
 [ 2785.341701 ] ffff888124f8c0c8 ((work_completion)(&(&ent->dwork)->work)){+.+.}-{0:0}, at: __flush_work+0xc8/0x900
 [ 2785.343403 ]
 [ 2785.343403 ] but task is already holding lock:
 [ 2785.344464 ] ffff88817e8f1260 (&dev->cache.rb_lock){+.+.}-{3:3}, at: mlx5_mkey_cache_cleanup+0x77/0x250 [mlx5_ib]
 [ 2785.346273 ]
 [ 2785.346273 ] which lock already depends on the new lock.
 [ 2785.346273 ]
 [ 2785.347720 ]
 [ 2785.347720 ] the existing dependency chain (in reverse order) is:
 [ 2785.349003 ]
 [ 2785.349003 ] -> #1 (&dev->cache.rb_lock){+.+.}-{3:3}:
 [ 2785.350160 ]        __mutex_lock+0x14c/0x15c0
 [ 2785.350962 ]        delayed_cache_work_func+0x2d1/0x610 [mlx5_ib]
 [ 2785.352044 ]        process_one_work+0x7c2/0x1310
 [ 2785.352879 ]        worker_thread+0x59d/0xec0
 [ 2785.353636 ]        kthread+0x28f/0x330
 [ 2785.354370 ]        ret_from_fork+0x1f/0x30
 [ 2785.355135 ]
 [ 2785.355135 ] -> #0 ((work_completion)(&(&ent->dwork)->work)){+.+.}-{0:0}:
 [ 2785.356515 ]        __lock_acquire+0x2d8a/0x5fe0
 [ 2785.357349 ]        lock_acquire+0x1c1/0x540
 [ 2785.358121 ]        __flush_work+0xe8/0x900
 [ 2785.358852 ]        __cancel_work_timer+0x2c7/0x3f0
 [ 2785.359711 ]        mlx5_mkey_cache_cleanup+0xfb/0x250 [mlx5_ib]
 [ 2785.360781 ]        mlx5_ib_stage_pre_ib_reg_umr_cleanup+0x16/0x30 [mlx5_ib]
 [ 2785.361969 ]        __mlx5_ib_remove+0x68/0x120 [mlx5_ib]
 [ 2785.362960 ]        mlx5r_remove+0x63/0x80 [mlx5_ib]
 [ 2785.363870 ]        auxiliary_bus_remove+0x52/0x70
 [ 2785.364715 ]        device_release_driver_internal+0x3c1/0x600
 [ 2785.365695 ]        bus_remove_device+0x2a5/0x560
 [ 2785.366525 ]        device_del+0x492/0xb80
 [ 2785.367276 ]        mlx5_detach_device+0x1a9/0x360 [mlx5_core]
 [ 2785.368615 ]        mlx5_unload_one_devl_locked+0x5a/0x110 [mlx5_core]
 [ 2785.369934 ]        mlx5_devlink_reload_down+0x292/0x580 [mlx5_core]
 [ 2785.371292 ]        devlink_reload+0x439/0x590
 [ 2785.372075 ]        devlink_nl_cmd_reload+0xaef/0xff0
 [ 2785.372973 ]        genl_family_rcv_msg_doit.isra.0+0x1bd/0x290
 [ 2785.374011 ]        genl_rcv_msg+0x3ca/0x6c0
 [ 2785.374798 ]        netlink_rcv_skb+0x12c/0x360
 [ 2785.375612 ]        genl_rcv+0x24/0x40
 [ 2785.376295 ]        netlink_unicast+0x438/0x710
 [ 2785.377121 ]        netlink_sendmsg+0x7a1/0xca0
 [ 2785.377926 ]        sock_sendmsg+0xc5/0x190
 [ 2785.378668 ]        __sys_sendto+0x1bc/0x290
 [ 2785.379440 ]        __x64_sys_sendto+0xdc/0x1b0
 [ 2785.380255 ]        do_syscall_64+0x3d/0x90
 [ 2785.381031 ]        entry_SYSCALL_64_after_hwframe+0x46/0xb0
 [ 2785.381967 ]
 [ 2785.381967 ] other info that might help us debug this:
 [ 2785.381967 ]
 [ 2785.383448 ]  Possible unsafe locking scenario:
 [ 2785.383448 ]
 [ 2785.384544 ]        CPU0                    CPU1
 [ 2785.385383 ]        ----                    ----
 [ 2785.386193 ]   lock(&dev->cache.rb_lock);
 [ 2785.386940 ]				lock((work_completion)(&(&ent->dwork)->work));
 [ 2785.388327 ]				lock(&dev->cache.rb_lock);
 [ 2785.389425 ]   lock((work_completion)(&(&ent->dwork)->work));
 [ 2785.390414 ]
 [ 2785.390414 ]  *** DEADLOCK ***
 [ 2785.390414 ]
 [ 2785.391579 ] 6 locks held by devlink/53872:
 [ 2785.392341 ]  #0: ffffffff84c17a50 (cb_lock){++++}-{3:3}, at: genl_rcv+0x15/0x40
 [ 2785.393630 ]  #1: ffff888142280218 (&devlink->lock_key){+.+.}-{3:3}, at: devlink_get_from_attrs_lock+0x12d/0x2d0
 [ 2785.395324 ]  #2: ffff8881422d3c38 (&dev->lock_key){+.+.}-{3:3}, at: mlx5_unload_one_devl_locked+0x4a/0x110 [mlx5_core]
 [ 2785.397322 ]  #3: ffffffffa0e59068 (mlx5_intf_mutex){+.+.}-{3:3}, at: mlx5_detach_device+0x60/0x360 [mlx5_core]
 [ 2785.399231 ]  #4: ffff88810e3cb0e8 (&dev->mutex){....}-{3:3}, at: device_release_driver_internal+0x8d/0x600
 [ 2785.400864 ]  #5: ffff88817e8f1260 (&dev->cache.rb_lock){+.+.}-{3:3}, at: mlx5_mkey_cache_cleanup+0x77/0x250 [mlx5_ib]

Fixes: b958451783 ("RDMA/mlx5: Change the cache structure to an RB-tree")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2023-09-26 12:33:53 +03:00
Shay Drory
dab994bcc6 RDMA/mlx5: Fix NULL string error
checkpath is complaining about NULL string, change it to 'Unknown'.

Fixes: 37aa5c36aa ("IB/mlx5: Add UARs write-combining and non-cached mapping")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Link: https://lore.kernel.org/r/8638e5c14fadbde5fa9961874feae917073af920.1695203958.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-09-26 12:29:44 +03:00
Hamdan Igbaria
2fad8f06a5 RDMA/mlx5: Fix mutex unlocking on error flow for steering anchor creation
The mutex was not unlocked on some of the error flows.
Moved the unlock location to include all the error flow scenarios.

Fixes: e1f4a52ac1 ("RDMA/mlx5: Create an indirect flow table for steering anchor")
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Hamdan Igbaria <hamdani@nvidia.com>
Link: https://lore.kernel.org/r/1244a69d783da997c0af0b827c622eb00495492e.1695203958.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-09-26 12:29:40 +03:00
Michael Guralnik
4f14c6c021 RDMA/mlx5: Fix assigning access flags to cache mkeys
After the change to use dynamic cache structure, new cache entries
can be added and the mkey allocation can no longer assume that all
mkeys created for the cache have access_flags equal to zero.

Example of a flow that exposes the issue:
A user registers MR with RO on a HCA that cannot UMR RO and the mkey is
created outside of the cache. When the user deregisters the MR, a new
cache entry is created to store mkeys with RO.

Later, the user registers 2 MRs with RO. The first MR is reused from the
new cache entry. When we try to get the second mkey from the cache we see
the entry is empty so we go to the MR cache mkey allocation flow which
would have allocated a mkey with no access flags, resulting the user getting
a MR without RO.

Fixes: dd1b913fb0 ("RDMA/mlx5: Cache all user cacheable mkeys on dereg MR flow")
Reviewed-by: Edward Srouji <edwards@nvidia.com>
Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Link: https://lore.kernel.org/r/8a802700b82def3ace3f77cd7a9ad9d734af87e7.1695203958.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-09-26 12:29:33 +03:00
David Howells
7ebc540b35
infiniband: Use user_backed_iter() to see if iterator is UBUF/IOVEC
Use user_backed_iter() to see if iterator is UBUF/IOVEC rather than poking
inside the iterator.

Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20230925120309.1731676-5-dhowells@redhat.com
cc: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
cc: Jason Gunthorpe <jgg@ziepe.ca>
cc: Leon Romanovsky <leon@kernel.org>
cc: linux-rdma@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-09-25 14:30:28 +02:00
Christophe JAILLET
d7f393430a IB/mlx4: Fix the size of a buffer in add_port_entries()
In order to be sure that 'buff' is never truncated, its size should be
12, not 11.

When building with W=1, this fixes the following warnings:

  drivers/infiniband/hw/mlx4/sysfs.c: In function ‘add_port_entries’:
  drivers/infiniband/hw/mlx4/sysfs.c:268:34: error: ‘sprintf’ may write a terminating nul past the end of the destination [-Werror=format-overflow=]
    268 |                 sprintf(buff, "%d", i);
        |                                  ^
  drivers/infiniband/hw/mlx4/sysfs.c:268:17: note: ‘sprintf’ output between 2 and 12 bytes into a destination of size 11
    268 |                 sprintf(buff, "%d", i);
        |                 ^~~~~~~~~~~~~~~~~~~~~~
  drivers/infiniband/hw/mlx4/sysfs.c:286:34: error: ‘sprintf’ may write a terminating nul past the end of the destination [-Werror=format-overflow=]
    286 |                 sprintf(buff, "%d", i);
        |                                  ^
  drivers/infiniband/hw/mlx4/sysfs.c:286:17: note: ‘sprintf’ output between 2 and 12 bytes into a destination of size 11
    286 |                 sprintf(buff, "%d", i);
        |                 ^~~~~~~~~~~~~~~~~~~~~~

Fixes: c1e7e46612 ("IB/mlx4: Add iov directory in sysfs under the ib device")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/0bb1443eb47308bc9be30232cc23004c4d4cf43e.1695448530.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-09-23 21:53:24 +03:00
Justin Stitt
cb7ab7854b IB/qib: Replace deprecated strncpy
`strncpy` is deprecated for use on NUL-terminated destination strings [1]
and as such we should prefer more robust and less ambiguous string
interfaces.

We know `txselect_list` is expected to be NUL-terminated based on its
use in `param_get_string()`:
| int param_get_string(char *buffer, const struct kernel_param *kp)
| {
| 	const struct kparam_string *kps = kp->str;
| 	return scnprintf(buffer, PAGE_SIZE, "%s\n", kps->string);
| }

Note that `txselect_list` is assigned to `kp_txselect`'s string field:
| static struct kparam_string kp_txselect = {
| 	.string = txselect_list,
| 	.maxlen = MAX_ATTEN_LEN
| };

Wherein it is then assigned the set and get methods:
| module_param_call(txselect, setup_txselect, param_get_string,
| 		  &kp_txselect, S_IWUSR | S_IRUGO);

Considering the above, a suitable replacement is `strscpy` [2] due to
the fact that it guarantees NUL-termination on the destination buffer
without unnecessarily NUL-padding.

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2]
Link: https://github.com/KSPP/linux/issues/90
Cc: linux-hardening@vger.kernel.org
Signed-off-by: Justin Stitt <justinstitt@google.com>
Link: https://lore.kernel.org/r/20230921-strncpy-drivers-infiniband-hw-qib-qib_iba7322-c-v1-1-373727763f5b@google.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-09-22 13:29:19 +03:00
Justin Stitt
c2d0c5b28a IB/hfi1: Replace deprecated strncpy
`strncpy` is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces.

We see that `buf` is expected to be NUL-terminated based on it's use
within a trace event wherein `is_misc_err_name` and `is_various_name`
map to `is_name` through `is_table`:
| TRACE_EVENT(hfi1_interrupt,
| 	    TP_PROTO(struct hfi1_devdata *dd, const struct is_table *is_entry,
| 		     int src),
| 	    TP_ARGS(dd, is_entry, src),
| 	    TP_STRUCT__entry(DD_DEV_ENTRY(dd)
| 			     __array(char, buf, 64)
| 			     __field(int, src)
| 			     ),
| 	    TP_fast_assign(DD_DEV_ASSIGN(dd);
| 			   is_entry->is_name(__entry->buf, 64,
| 					     src - is_entry->start);
| 			   __entry->src = src;
| 			   ),
| 	    TP_printk("[%s] source: %s [%d]", __get_str(dev), __entry->buf,
| 		      __entry->src)
| );

Considering the above, a suitable replacement is `strscpy_pad` due to
the fact that it guarantees NUL-termination on the destination buffer
while maintaining the NUL-padding behavior that strncpy provides.

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
Link: https://github.com/KSPP/linux/issues/90
Cc: linux-hardening@vger.kernel.org
Signed-off-by: Justin Stitt <justinstitt@google.com>
Link: https://lore.kernel.org/r/20230921-strncpy-drivers-infiniband-hw-hfi1-chip-c-v1-1-37afcf4964d9@google.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-09-22 13:28:54 +03:00
Justin Stitt
f0cc82ca11 RDMA/irdma: Replace deprecated strncpy
`strncpy` is deprecated for use on NUL-terminated destination strings [1]
and as such we should prefer more robust and less ambiguous string
interfaces.

A suitable replacement is `strscpy_pad` due to the fact that it
guarantees NUL-termination on the destination buffer.

It is unclear to me whether `i40iw_client.name` requires NUL-padding but
have opted to keep the NUL-padding behavior that strncpy provides to
ensure no functional change.

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
Link: https://github.com/KSPP/linux/issues/90
Cc: linux-hardening@vger.kernel.org
Signed-off-by: Justin Stitt <justinstitt@google.com>
Link: https://lore.kernel.org/r/20230921-strncpy-drivers-infiniband-hw-irdma-i40iw_if-c-v1-1-22d87aef7186@google.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-09-22 13:27:27 +03:00
Selvin Xavier
a83c692789 RDMA/bnxt_re: Decrement resource stats correctly
rc_qp_count and ud_qp_count is not decremented during qp destroy.
Fix this.

Fixes: cb95709e0d ("bnxt_re: Update the hw counters for resource stats")
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1695199280-13520-3-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-09-21 11:56:23 +03:00
Selvin Xavier
9fc5f9a92f RDMA/bnxt_re: Fix the handling of control path response data
Flag that indicate control path command completion should be cleared
only after copying the command response data. As soon as the is_in_used
flag is clear, the waiting thread can proceed with wrong response
data.  This wrong data is causing multiple issues like wrong lkey
used in data traffic and wrong AH Id etc.

Use a memory barrier to ensure that the response data
is copied and visible to the process waiting on a different
cpu core before clearing the is_in_used flag.

Clear the is_in_used after copying the command response.

Fixes: bcfee4ce3e ("RDMA/bnxt_re: remove redundant cmdq_bitmap")
Signed-off-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1695199280-13520-2-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-09-21 11:56:23 +03:00
wenglianfa
58c49c097f RDMA/hns: Support SRQ restrack ops for hns driver
The SRQ restrack attributes come from the context maintained by ROCEE.

Example:
$ rdma res show srq -jp -dd
[ {
        "ifindex": 0,
        "ifname": "hns_0",
        "srqn": 0,
        "type": "BASIC",
        "lqpn": [ "14-15","22-23" ],
        "pdn": 2,
        "pid": 1224,
        "comm": "ib_send_bw",{
            "drv_srqn": 0,
            "drv_wqe_cnt": 512,
            "drv_max_gs": 2,
            "drv_xrcdn": 0
        }
    } ]

$ rdma res show srq link hns_0 -jpr
[ {
        "ifindex": 0,
        "ifname": "hns_0",
        "data": [ 149,0,0,0,0,0,0,0,0,0,0,0,119,101,120,99,0,
		  46,62,31,0,0,0,0,3,0,0,1,0,58,62,31,0,0,0,0,
		  30,159,15,0,0,0,64,5,0,0,0,0,0,0,0,0,0,0,0,
		  9,0,0,0,0,0,0,0,0 ]
    } ]

Signed-off-by: wenglianfa <wenglianfa@huawei.com>
Link: https://lore.kernel.org/r/20230918131110.3987498-4-huangjunxian6@hisilicon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-09-20 10:51:45 +03:00
Ilpo Järvinen
8bf7187d97 RDMA/hfi1: Use FIELD_GET() to extract Link Width
Use FIELD_GET() to extract PCIe Negotiated Link Width field instead of
custom masking and shifting, and remove extract_width() which only
wraps that FIELD_GET().

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20230919125648.1920-2-ilpo.jarvinen@linux.intel.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dean Luick <dean.luick@cornelisnetworks.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-09-20 10:30:38 +03:00
David Ahern
4ececeb839 IB/hfi1: Remove open coded reference to skb frag offset
frag offset should be obtained from skb_frag_off; update the code.

Signed-off-by: David Ahern <dsahern@kernel.org>
Cc: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Link: https://lore.kernel.org/r/20230911215753.12325-1-dsahern@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-09-18 14:24:15 +03:00
Cheng Xu
b2abdffb50 RDMA/erdma: Fix NULL pointer access in regmr_cmd
Fix the crash of regmr_cmd called by erdma_ib_alloc_mr. The reason is
that mr->mem.mtt is not initialized but it is accessed in regmr_cmd.

The call trace information:

 BUG: kernel NULL pointer dereference, address: 0000000000000000
 <...>
 RIP: 0010:regmr_cmd+0x170/0x1c0 [erdma]
 <...>
Call Trace:
 ? __die+0x20/0x70
 ? page_fault_oops+0x66/0x150
 ? do_user_addr_fault+0x61/0x660
 ? exc_page_fault+0x65/0x140
 ? asm_exc_page_fault+0x22/0x30
 ? regmr_cmd+0x170/0x1c0 [erdma]
 ? preempt_count_add+0x70/0xa0
 ? _raw_spin_lock_irqsave+0x19/0x50
 ? _raw_spin_unlock_irqrestore+0x1b/0x40
 ? erdma_alloc_idx+0x51/0x90 [erdma]
 erdma_get_dma_mr+0xa3/0x120 [erdma]
 __ib_alloc_pd+0xeb/0x1c0 [ib_core]

Fixes: 7244b4aa42 ("RDMA/erdma: Refactor the storage structure of MTT entries")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/all/3d140c1d-524a-4dbe-a51c-aee4f7ecafdb@moroto.mountain/
Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230908060559.80203-1-chengyou@linux.alibaba.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-09-18 10:42:19 +03:00
Dan Carpenter
6b5f0749ce RDMA/erdma: Fix error code in erdma_create_scatter_mtt()
The erdma_create_scatter_mtt() function is supposed to return error
pointers.  Returning NULL will lead to an Oops.

Fixes: ed10435d35 ("RDMA/erdma: Implement hierarchical MTT")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/r/1eb400d5-d8a3-4a8e-b3da-c43c6c377f86@moroto.mountain
Acked-by: Cheng Xu <chengyou@linux.alibaba.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-09-18 09:12:00 +03:00
Artem Chernyshev
8fb8a82086 RDMA/cxgb4: Check skb value for failure to allocate
get_skb() can fail to allocate skb, so check it.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 5be78ee924 ("RDMA/cxgb4: Fix LE hash collision bug for active open connection")
Signed-off-by: Artem Chernyshev <artem.chernyshev@red-soft.ru>
Link: https://lore.kernel.org/r/20230905124048.284165-1-artem.chernyshev@red-soft.ru
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-09-11 14:55:53 +03:00
Krzysztof Kozlowski
3ec648c631 IB: Use capital "OR" for multiple licenses in SPDX
Documentation/process/license-rules.rst and checkpatch expect the SPDX
identifier syntax for multiple licenses to use capital "OR".  Correct it
to keep consistent format and avoid copy-paste issues.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20230823092912.122674-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-09-11 14:14:00 +03:00