mirror of synced 2025-03-06 20:59:54 +01:00
linux/drivers
Yishai Hadas abc7b3f1f0 RDMA/mlx5: Fix a WARN during dereg_mr for DM type
Memory regions (MR) of type DM (device memory) do not have an associated
umem.

In the __mlx5_ib_dereg_mr() -> mlx5_free_priv_descs() flow, the code
takes the wrong branch and attempts to call dma_unmap_single() on a
DMA address that is not mapped.

This results in a WARN [1], as shown below.

The issue is resolved by properly accounting for the DM type and
ensuring the correct branch is selected in mlx5_free_priv_descs().
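
The branch selection described above can be illustrated with a small
userspace model. This is only a sketch, not the driver code: the
struct, enum, and function names below are hypothetical, and the
assumption that the descriptor pointer shares storage with DM-specific
state is inferred from the Fixes: tag ("Use a union inside
mlx5_ib_mr"), not taken from the patch itself.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the MR types discussed in the commit. */
enum model_mr_type { MODEL_MR_TYPE_MEM_REG, MODEL_MR_TYPE_DM };

/*
 * Simplified model of an MR. The union is an assumption: it mimics a
 * layout where the descriptor pointer overlaps with state used by
 * other MR kinds, so reading it for a DM MR can yield a non-NULL value.
 */
struct model_mr {
	enum model_mr_type type;
	void *umem;                 /* NULL for DM MRs (no umem) */
	union {
		void *descs;        /* meaningful only for MRs that own descriptors */
		uintptr_t dm_cookie; /* hypothetical DM-specific state */
	} u;
};

/* Pre-fix style check: any MR without a umem is assumed to own a mapping. */
static bool should_unmap_descs_buggy(const struct model_mr *mr)
{
	return mr->umem == NULL && mr->u.descs != NULL;
}

/* Post-fix style check: DM MRs are excluded before the union is interpreted. */
static bool should_unmap_descs_fixed(const struct model_mr *mr)
{
	return mr->umem == NULL &&
	       mr->type != MODEL_MR_TYPE_DM &&
	       mr->u.descs != NULL;
}
```

In this model, a DM MR whose union holds DM state passes the first
check (and would go on to unmap an address that was never mapped),
while the second check skips it and still accepts an MR that really
owns descriptors.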

[1]
WARNING: CPU: 12 PID: 1346 at drivers/iommu/dma-iommu.c:1230 iommu_dma_unmap_page+0x79/0x90
Modules linked in: ip6table_mangle ip6table_nat ip6table_filter ip6_tables iptable_mangle xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry overlay rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm mlx5_ib ib_uverbs ib_core fuse mlx5_core
CPU: 12 UID: 0 PID: 1346 Comm: ibv_rc_pingpong Not tainted 6.12.0-rc7+ #1631
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:iommu_dma_unmap_page+0x79/0x90
Code: 2b 49 3b 29 72 26 49 3b 69 08 73 20 4d 89 f0 44 89 e9 4c 89 e2 48 89 ee 48 89 df 5b 5d 41 5c 41 5d 41 5e 41 5f e9 07 b8 88 ff <0f> 0b 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 66 0f 1f 44 00
RSP: 0018:ffffc90001913a10 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88810194b0a8 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001
RBP: ffff88810194b0a8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
FS:  00007f537abdd740(0000) GS:ffff88885fb00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f537aeb8000 CR3: 000000010c248001 CR4: 0000000000372eb0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
? __warn+0x84/0x190
? iommu_dma_unmap_page+0x79/0x90
? report_bug+0xf8/0x1c0
? handle_bug+0x55/0x90
? exc_invalid_op+0x13/0x60
? asm_exc_invalid_op+0x16/0x20
? iommu_dma_unmap_page+0x79/0x90
dma_unmap_page_attrs+0xe6/0x290
mlx5_free_priv_descs+0xb0/0xe0 [mlx5_ib]
__mlx5_ib_dereg_mr+0x37e/0x520 [mlx5_ib]
? _raw_spin_unlock_irq+0x24/0x40
? wait_for_completion+0xfe/0x130
? rdma_restrack_put+0x63/0xe0 [ib_core]
ib_dereg_mr_user+0x5f/0x120 [ib_core]
? lock_release+0xc6/0x280
destroy_hw_idr_uobject+0x1d/0x60 [ib_uverbs]
uverbs_destroy_uobject+0x58/0x1d0 [ib_uverbs]
uobj_destroy+0x3f/0x70 [ib_uverbs]
ib_uverbs_cmd_verbs+0x3e4/0xbb0 [ib_uverbs]
? __pfx_uverbs_destroy_def_handler+0x10/0x10 [ib_uverbs]
? lock_acquire+0xc1/0x2f0
? ib_uverbs_ioctl+0xcb/0x170 [ib_uverbs]
? ib_uverbs_ioctl+0x116/0x170 [ib_uverbs]
? lock_release+0xc6/0x280
ib_uverbs_ioctl+0xe7/0x170 [ib_uverbs]
? ib_uverbs_ioctl+0xcb/0x170 [ib_uverbs]
__x64_sys_ioctl+0x1b0/0xa70
do_syscall_64+0x6b/0x140
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f537adaf17b
Code: 0f 1e fa 48 8b 05 1d ad 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ed ac 0c 00 f7 d8 64 89 01 48
RSP: 002b:00007ffff218f0b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007ffff218f1d8 RCX: 00007f537adaf17b
RDX: 00007ffff218f1c0 RSI: 00000000c0181b01 RDI: 0000000000000003
RBP: 00007ffff218f1a0 R08: 00007f537aa8d010 R09: 0000561ee2e4f270
R10: 00007f537aace3a8 R11: 0000000000000246 R12: 00007ffff218f190
R13: 000000000000001c R14: 0000561ee2e4d7c0 R15: 00007ffff218f450
</TASK>

Fixes: f18ec42231 ("RDMA/mlx5: Use a union inside mlx5_ib_mr")
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://patch.msgid.link/2039c22cfc3df02378747ba4d623a558b53fc263.1738587076.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-02-06 03:43:57 -05:00
accel Mainly individually changelogged singleton patches. The patch series in 2025-01-26 17:50:53 -08:00
accessibility
acpi 21 hotfixes. 8 are cc:stable and the remainder address post-6.13 issues. 2025-02-01 09:49:20 -08:00
amba
android Char/Misc/IIO driver updates for 6.14-rc1 2025-01-27 16:51:51 -08:00
ata ata changes for 6.14 part2 2025-01-31 11:07:56 -08:00
atm
auxdisplay auxdisplay for v6.14-1 2025-01-24 08:03:52 -08:00
base RISC-V Patches for the 6.14 Merge Window, Part 1 2025-01-31 15:13:25 -08:00
bcma
block block-6.14-20250131 2025-01-31 11:49:30 -08:00
bluetooth Bluetooth: btnxpuart: Fix glitches seen in dual A2DP streaming 2025-01-29 15:23:49 -05:00
bus Driver core and debugfs updates 2025-01-28 12:25:12 -08:00
cache
cdrom treewide: const qualify ctl_tables where applicable 2025-01-28 13:48:37 +01:00
cdx cdx: disable cdx bus from bus shutdown callback 2025-01-10 15:43:16 +01:00
char treewide: const qualify ctl_tables where applicable 2025-01-28 13:48:37 +01:00
clk The various patchsets are summarized below. Plus of course many 2025-01-26 18:36:23 -08:00
clocksource hyperv: Switch from hyperv-tlfs.h to hyperv/hvhdk.h 2025-01-10 00:54:21 +00:00
comedi
connector
counter
cpufreq cpufreq: airoha: Depends on OF 2025-01-29 10:56:11 +01:00
cpuidle More power management updates for 6.14-rc1 2025-01-30 15:10:34 -08:00
crypto Driver core and debugfs updates 2025-01-28 12:25:12 -08:00
cxl cxl changes for v6.14 2025-01-29 11:23:22 -08:00
dax
dca
devfreq Update devfreq next for v6.14 2025-01-13 20:48:34 +01:00
dio
dma dmaengine updates for v6.14 2025-01-29 14:29:57 -08:00
dma-buf
dpll
edac - The first part of a restructuring of AMD's representation of a northbridge 2025-01-21 09:38:52 -08:00
eisa
extcon Update extcon next for v6.14 2025-01-12 13:44:27 +01:00
firewire Driver core and debugfs updates 2025-01-28 12:25:12 -08:00
firmware sound fixes for 6.14-rc1 2025-01-31 09:17:02 -08:00
fpga FPGA Manager changes for 6.14-rc1 2025-01-09 10:56:57 +01:00
fsi
gnss
gpio gpio fixes for v6.14-rc1 2025-01-30 10:19:30 -08:00
gpu drm fixes for 6.14-rc1 2025-01-31 15:45:41 -08:00
greybus
hid pci-v6.14-changes 2025-01-25 16:03:40 -08:00
hsi
hte
hv treewide: const qualify ctl_tables where applicable 2025-01-28 13:48:37 +01:00
hwmon Driver core and debugfs updates 2025-01-28 12:25:12 -08:00
hwspinlock
hwtracing KVM/arm64 updates for 6.14 2025-01-28 09:01:36 -08:00
i2c i2c-for-6.14-rc1-take2 2025-01-30 17:43:36 -08:00
i3c I3C for 6.14 2025-01-24 15:48:01 -08:00
idle Power management updates for 6.14-rc1 2025-01-22 11:16:14 -08:00
iio IIO: 2nd set of fixes for the 6.13 cycle. 2025-01-16 13:46:08 +01:00
infiniband RDMA/mlx5: Fix a WARN during dereg_mr for DM type 2025-02-06 03:43:57 -05:00
input platform-drivers-x86 for v6.14-1 2025-01-24 07:18:39 -08:00
interconnect interconnect changes for 6.14 2025-01-16 14:01:40 +01:00
iommu hyperv-next for v6.14 2025-01-25 09:22:55 -08:00
ipack
irqchip Merge tag 'irq-core-2025-01-21' into loongarch-next 2025-01-25 18:15:32 +08:00
isdn
leds Driver core and debugfs updates 2025-01-28 12:25:12 -08:00
macintosh The various patchsets are summarized below. Plus of course many 2025-01-26 18:36:23 -08:00
mailbox mailbox: th1520: Fix memory corruption due to incorrect array size 2025-01-18 16:20:55 -06:00
mcb
md block-6.14-20250131 2025-01-31 11:49:30 -08:00
media [GIT PULL for v6.14] media updates 2025-02-01 09:15:01 -08:00
memory spi: Support DTR in spi-mem 2025-01-15 19:07:39 +01:00
memstick Char/Misc/IIO driver updates for 6.14-rc1 2025-01-27 16:51:51 -08:00
message Merge branch '6.13/scsi-fixes' into 6.14/scsi-staging 2025-01-10 15:20:30 -05:00
mfd - Fix race in device_node_get_regmap() using more extensive locking. 2025-01-22 09:16:02 -08:00
misc treewide: const qualify ctl_tables where applicable 2025-01-28 13:48:37 +01:00
mmc MMC core: 2025-01-22 10:39:17 -08:00
most
mtd block-6.14-20250131 2025-01-31 11:49:30 -08:00
mux mux: constify mux class 2025-01-10 10:15:04 +01:00
net First batch of fixes for 6.14. Nothing really stands out, 2025-01-30 12:24:20 -08:00
nfc nfc: mrvl: Don't use "proxy" headers 2025-01-18 17:10:05 -08:00
ntb PCI: Remove devres from pci_intx() 2025-01-18 14:38:49 -06:00
nubus
nvdimm driver core: Constify API device_find_child() and adapt for various usages 2025-01-03 11:19:35 +01:00
nvme block: force noio scope in blk_mq_freeze_queue 2025-01-31 07:20:08 -07:00
nvmem nvmem: core: improve range check for nvmem_cell_write() 2025-01-10 16:16:48 +01:00
of Driver core and debugfs updates 2025-01-28 12:25:12 -08:00
opp Driver core and debugfs updates 2025-01-28 12:25:12 -08:00
parisc
parport
pci PCI: Restore original INTX_DISABLE bit by pcim_intx() 2025-01-27 12:55:12 -06:00
pcmcia
peci
perf treewide: const qualify ctl_tables where applicable 2025-01-28 13:48:37 +01:00
phy phy-for-6.14 2025-01-29 14:32:38 -08:00
pinctrl Pin control changes for the v6.14 kernel cycle: 2025-01-24 07:38:50 -08:00
platform USB / Thunderbolt driver updates for 6.14-rc1 2025-01-27 16:29:16 -08:00
pmdomain pmdomain: airoha: Fix compilation error with Clang-20 and Thumb2 mode 2025-01-21 10:45:24 +01:00
pnp
power power supply and reset changes for the 6.14 series 2025-01-27 15:37:16 -08:00
powercap
pps pps: clients: gpio: Bypass edge's direction check when not needed 2025-01-10 16:12:33 +01:00
ps3
ptp First batch of fixes for 6.14. Nothing really stands out, 2025-01-30 12:24:20 -08:00
pwm Driver core and debugfs updates 2025-01-28 12:25:12 -08:00
rapidio
ras x86/amd_nb: Move SMN access code to a new amd_node driver 2025-01-08 10:59:44 +01:00
regulator regulator: Fixes for v6.14 2025-01-29 11:56:55 -08:00
remoteproc remoteproc: st: Use syscon_regmap_lookup_by_phandle_args 2025-01-15 10:04:27 -07:00
reset soc: driver updates for 6.14 2025-01-24 14:56:59 -08:00
rpmsg driver core: Constify API device_find_child() and adapt for various usages 2025-01-03 11:19:35 +01:00
rtc RTC for 6.13 2025-01-30 17:50:02 -08:00
s390 more s390 updates for 6.14 merge window 2025-01-30 10:48:17 -08:00
sbus
scsi block-6.14-20250131 2025-01-31 11:49:30 -08:00
sh
siox
slimbus Driver core and debugfs updates 2025-01-28 12:25:12 -08:00
soc [GIT PULL for v6.14] media updates 2025-01-25 15:59:46 -08:00
soundwire soundwire updates for 6.14 2025-01-29 14:38:19 -08:00
spi spi: Fix for v6.14 2025-01-24 16:12:12 -08:00
spmi spmi: hisi-spmi-controller: Drop duplicated OF node assignment in spmi_controller_probe() 2025-01-17 12:58:49 +01:00
ssb
staging Driver core and debugfs updates 2025-01-28 12:25:12 -08:00
target SCSI misc on 20250126 2025-01-26 16:12:44 -08:00
tc
tee
thermal Merge branch 'thermal-intel' 2025-01-20 13:10:15 +01:00
thunderbolt Driver core and debugfs updates 2025-01-28 12:25:12 -08:00
tty Driver core and debugfs updates 2025-01-28 12:25:12 -08:00
ufs block-6.14-20250131 2025-01-31 11:49:30 -08:00
uio Char/Misc/IIO driver updates for 6.14-rc1 2025-01-27 16:51:51 -08:00
usb Driver core and debugfs updates 2025-01-28 12:25:12 -08:00
vdpa virtio: features, fixes, cleanups 2025-01-27 15:26:06 -08:00
vfio VFIO updates for v6.14-rc1 2025-01-28 14:16:46 -08:00
vhost vhost/net: Set num_buffers for virtio 1.0 2025-01-27 09:39:25 -05:00
video fbdev fixes and updates for 6.14-rc1: 2025-01-24 11:32:13 -08:00
virt - A segmented Reverse Map table (RMP) is a across-nodes distributed 2025-01-21 09:00:31 -08:00
virtio virtio: features, fixes, cleanups 2025-01-27 15:26:06 -08:00
w1 1-Wire bus drivers for v6.14 2025-01-09 10:54:19 +01:00
watchdog linux-watchdog 6.14-rc1 tag 2025-01-25 16:19:10 -08:00
xen xen: branch for v6.14-rc1 2025-01-29 11:39:20 -08:00
zorro zorro: Constify 'struct bin_attribute' 2025-01-08 18:04:36 +01:00
Kconfig
Makefile