Tony Nguyen says:
====================
ice: fix synchronization between .ndo_bpf() and reset
Larysa Zaremba says:
PF reset can be triggered asynchronously, by tx_timeout or by a user. With some
unfortunate timings both ice_vsi_rebuild() and .ndo_bpf will try to access and
modify XDP rings at the same time, causing system crash.
The first patch factors out rtnl-locked code from VSI rebuild code to avoid
deadlock. The following changes lock rebuild and .ndo_bpf() critical sections
with an internal mutex as well and provide complementary fixes.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
ice: do not bring the VSI up, if it was down before the XDP setup
ice: remove ICE_CFG_BUSY locking from AF_XDP code
ice: check ICE_VSI_DOWN under rtnl_lock when preparing for reset
ice: check for XDP rings instead of bpf program when unconfiguring
ice: protect XDP configuration with a mutex
ice: move netif_queue_set_napi to rtnl-protected sections
====================
Link: https://patch.msgid.link/20240903183034.3530411-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hopefully final fixes for v6.11 and this time only fixes to ath11k
driver. We need to revert hibernation support due to reported
regressions and we have a fix for kernel crash introduced in
v6.11-rc1.
-----BEGIN PGP SIGNATURE-----
iQFFBAABCgAvFiEEiBjanGPFTz4PRfLobhckVSbrbZsFAmbYZsIRHGt2YWxvQGtl
cm5lbC5vcmcACgkQbhckVSbrbZvwbAf/XhgO7uKuZnGfq6NBfyLIaXQjHRe6OSc0
/EJ+c9J3DK5vQUHcw2ALrBYtKPJIrgLIzMG+u4N/4klxLwWCzZpsMLTqNvUV4VRe
eKdD9v/0dZnrKrpmQxgtEoImc028gC5c2K9jDK4e9MU/kBMln9PpH3xyasnV9aLj
2ajkO7xdLooTJ4cdZtdDL9fd5px7IdghNgHiY+24lfuPY2AZV5AUK/AUHZw3CG6K
nXk6+PEO1Xr2P3SAc5yRYg4vXJIy7g8ELAGUj8f3XznvYKpcO+afce8By+H6GecQ
ofsWSLO86ajgu3wMMxyWhXFs8TAsyeE8fLJgNBqIfC05i1IkGfYwBg==
=C23H
-----END PGP SIGNATURE-----
Merge tag 'wireless-2024-09-04' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Kalle Valo says:
====================
wireless fixes for v6.11
Hopefully final fixes for v6.11 and this time only fixes to ath11k
driver. We need to revert hibernation support due to reported
regressions and we have a fix for kernel crash introduced in
v6.11-rc1.
* tag 'wireless-2024-09-04' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
MAINTAINERS: wifi: cw1200: add net-cw1200.h
Revert "wifi: ath11k: support hibernation"
Revert "wifi: ath11k: restore country code during resume"
wifi: ath11k: fix NULL pointer dereference in ath11k_mac_get_eirp_power()
====================
Link: https://patch.msgid.link/20240904135906.5986EC4CECA@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
axienet_dma_err_handler can race with axienet_stop in the following
manner:
CPU 1 CPU 2
====================== ==================
axienet_stop()
napi_disable()
axienet_dma_stop()
axienet_dma_err_handler()
napi_disable()
axienet_dma_stop()
axienet_dma_start()
napi_enable()
cancel_work_sync()
free_irq()
Fix this by setting a flag in axienet_stop telling
axienet_dma_err_handler not to bother doing anything. I chose not to use
disable_work_sync to allow for easier backporting.
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Fixes: 8a3b7a252d ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Link: https://patch.msgid.link/20240903175141.4132898-1-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
generic_ocp_write() asks the parameter "size" must be 4 bytes align.
Therefore, write the bp would fail, if the mac->bp_num is odd. Align the
size to 4 for fixing it. The way may write an extra bp, but the
rtl8152_is_fw_mac_ok() makes sure the value must be 0 for the bp whose
index is more than mac->bp_num. That is, there is no influence for the
firmware.
Besides, I check the return value of generic_ocp_write() to make sure
everything is correct.
Fixes: e5c266a611 ("r8152: set bp in bulk")
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Link: https://patch.msgid.link/20240903063333.4502-1-hayeswang@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Bareudp devices update their stats concurrently.
Therefore they need proper atomic increments.
Fixes: 571912c69f ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/04b7b9d0b480158eb3ab4366ec80aa2ab7e41fcb.1725031794.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Currently napi_disable() gets called during rxq and txq cleanup,
even before napi is enabled and hrtimer is initialized. It causes
kernel panic.
? page_fault_oops+0x136/0x2b0
? page_counter_cancel+0x2e/0x80
? do_user_addr_fault+0x2f2/0x640
? refill_obj_stock+0xc4/0x110
? exc_page_fault+0x71/0x160
? asm_exc_page_fault+0x27/0x30
? __mmdrop+0x10/0x180
? __mmdrop+0xec/0x180
? hrtimer_active+0xd/0x50
hrtimer_try_to_cancel+0x2c/0xf0
hrtimer_cancel+0x15/0x30
napi_disable+0x65/0x90
mana_destroy_rxq+0x4c/0x2f0
mana_create_rxq.isra.0+0x56c/0x6d0
? mana_uncfg_vport+0x50/0x50
mana_alloc_queues+0x21b/0x320
? skb_dequeue+0x5f/0x80
Cc: stable@vger.kernel.org
Fixes: e1b5683ff6 ("net: mana: Move NAPI from EQ to CQ")
Signed-off-by: Souradeep Chakrabarti <schakrabarti@linux.microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver generates a random MAC once on load
and uses it over and over, including on two devices
needing a random MAC at the same time.
Jakub suggested revamping the driver to the modern
API for setting a random MAC rather than fixing
the old stuff.
The bug is as old as the driver.
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Link: https://patch.msgid.link/20240829175201.670718-1-oneukum@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
We have three patch which address two issues in the ath11k driver
which should be addressed for 6.11-rc7:
One patch fixes a NULL pointer dereference while parsing transmit
power envelope (TPE) information, and the other two patches revert the
hibernation support since it is interfering with suspend on some
platforms. Note the cause of the suspend wakeups is still being
investigated, and it is hoped this can be addressed and hibernation
support can be restored in the near future.
-----BEGIN PGP SIGNATURE-----
iIoEABYKADIWIQQ/mtSHzPUi16IfDEksFbugiYzLewUCZtcpwRQcampvaG5zb25A
a2VybmVsLm9yZwAKCRAsFbugiYzLe6uIAP9wa6OX4oQswrUfm5/NOkb3N24gyXfI
RHr4E/Q0gFhB+QD/aksyfPwowRCv/xLNkF0y6jt0WyQEOKtOMlID21ebCA4=
=8VbW
-----END PGP SIGNATURE-----
Merge tag 'ath-current-20240903' of git://git.kernel.org/pub/scm/linux/kernel/git/ath/ath
ath.git patches for v6.11-rc7
We have three patch which address two issues in the ath11k driver
which should be addressed for 6.11-rc7:
One patch fixes a NULL pointer dereference while parsing transmit
power envelope (TPE) information, and the other two patches revert the
hibernation support since it is interfering with suspend on some
platforms. Note the cause of the suspend wakeups is still being
investigated, and it is hoped this can be addressed and hibernation
support can be restored in the near future.
After XDP configuration is completed, we bring the interface up
unconditionally, regardless of its state before the call to .ndo_bpf().
Preserve the information whether the interface had to be brought down and
later bring it up only in such case.
Fixes: efc2214b60 ("ice: Add support for XDP")
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Locking used in ice_qp_ena() and ice_qp_dis() does pretty much nothing,
because ICE_CFG_BUSY is a state flag that is supposed to be set in a PF
state, not VSI one. Therefore it does not protect the queue pair from
e.g. reset.
Remove ICE_CFG_BUSY locking from ice_qp_dis() and ice_qp_ena().
Fixes: 2d4238f556 ("ice: Add support for AF_XDP")
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Consider the following scenario:
.ndo_bpf() | ice_prepare_for_reset() |
________________________|_______________________________________|
rtnl_lock() | |
ice_down() | |
| test_bit(ICE_VSI_DOWN) - true |
| ice_dis_vsi() returns |
ice_up() | |
| proceeds to rebuild a running VSI |
.ndo_bpf() is not the only rtnl-locked callback that toggles the interface
to apply new configuration. Another example is .set_channels().
To avoid the race condition above, act only after reading ICE_VSI_DOWN
under rtnl_lock.
Fixes: 0f9d5027a7 ("ice: Refactor VSI allocation, deletion and rebuild flow")
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
If VSI rebuild is pending, .ndo_bpf() can attach/detach the XDP program on
VSI without applying new ring configuration. When unconfiguring the VSI, we
can encounter the state in which there is an XDP program but no XDP rings
to destroy or there will be XDP rings that need to be destroyed, but no XDP
program to indicate their presence.
When unconfiguring, rely on the presence of XDP rings rather then XDP
program, as they better represent the current state that has to be
destroyed.
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The main threat to data consistency in ice_xdp() is a possible asynchronous
PF reset. It can be triggered by a user or by TX timeout handler.
XDP setup and PF reset code access the same resources in the following
sections:
* ice_vsi_close() in ice_prepare_for_reset() - already rtnl-locked
* ice_vsi_rebuild() for the PF VSI - not protected
* ice_vsi_open() - already rtnl-locked
With an unfortunate timing, such accesses can result in a crash such as the
one below:
[ +1.999878] ice 0000:b1:00.0: Registered XDP mem model MEM_TYPE_XSK_BUFF_POOL on Rx ring 14
[ +2.002992] ice 0000:b1:00.0: Registered XDP mem model MEM_TYPE_XSK_BUFF_POOL on Rx ring 18
[Mar15 18:17] ice 0000:b1:00.0 ens801f0np0: NETDEV WATCHDOG: CPU: 38: transmit queue 14 timed out 80692736 ms
[ +0.000093] ice 0000:b1:00.0 ens801f0np0: tx_timeout: VSI_num: 6, Q 14, NTC: 0x0, HW_HEAD: 0x0, NTU: 0x0, INT: 0x4000001
[ +0.000012] ice 0000:b1:00.0 ens801f0np0: tx_timeout recovery level 1, txqueue 14
[ +0.394718] ice 0000:b1:00.0: PTP reset successful
[ +0.006184] BUG: kernel NULL pointer dereference, address: 0000000000000098
[ +0.000045] #PF: supervisor read access in kernel mode
[ +0.000023] #PF: error_code(0x0000) - not-present page
[ +0.000023] PGD 0 P4D 0
[ +0.000018] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ +0.000023] CPU: 38 PID: 7540 Comm: kworker/38:1 Not tainted 6.8.0-rc7 #1
[ +0.000031] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0014.082620210524 08/26/2021
[ +0.000036] Workqueue: ice ice_service_task [ice]
[ +0.000183] RIP: 0010:ice_clean_tx_ring+0xa/0xd0 [ice]
[...]
[ +0.000013] Call Trace:
[ +0.000016] <TASK>
[ +0.000014] ? __die+0x1f/0x70
[ +0.000029] ? page_fault_oops+0x171/0x4f0
[ +0.000029] ? schedule+0x3b/0xd0
[ +0.000027] ? exc_page_fault+0x7b/0x180
[ +0.000022] ? asm_exc_page_fault+0x22/0x30
[ +0.000031] ? ice_clean_tx_ring+0xa/0xd0 [ice]
[ +0.000194] ice_free_tx_ring+0xe/0x60 [ice]
[ +0.000186] ice_destroy_xdp_rings+0x157/0x310 [ice]
[ +0.000151] ice_vsi_decfg+0x53/0xe0 [ice]
[ +0.000180] ice_vsi_rebuild+0x239/0x540 [ice]
[ +0.000186] ice_vsi_rebuild_by_type+0x76/0x180 [ice]
[ +0.000145] ice_rebuild+0x18c/0x840 [ice]
[ +0.000145] ? delay_tsc+0x4a/0xc0
[ +0.000022] ? delay_tsc+0x92/0xc0
[ +0.000020] ice_do_reset+0x140/0x180 [ice]
[ +0.000886] ice_service_task+0x404/0x1030 [ice]
[ +0.000824] process_one_work+0x171/0x340
[ +0.000685] worker_thread+0x277/0x3a0
[ +0.000675] ? preempt_count_add+0x6a/0xa0
[ +0.000677] ? _raw_spin_lock_irqsave+0x23/0x50
[ +0.000679] ? __pfx_worker_thread+0x10/0x10
[ +0.000653] kthread+0xf0/0x120
[ +0.000635] ? __pfx_kthread+0x10/0x10
[ +0.000616] ret_from_fork+0x2d/0x50
[ +0.000612] ? __pfx_kthread+0x10/0x10
[ +0.000604] ret_from_fork_asm+0x1b/0x30
[ +0.000604] </TASK>
The previous way of handling this through returning -EBUSY is not viable,
particularly when destroying AF_XDP socket, because the kernel proceeds
with removal anyway.
There is plenty of code between those calls and there is no need to create
a large critical section that covers all of them, same as there is no need
to protect ice_vsi_rebuild() with rtnl_lock().
Add xdp_state_lock mutex to protect ice_vsi_rebuild() and ice_xdp().
Leaving unprotected sections in between would result in two states that
have to be considered:
1. when the VSI is closed, but not yet rebuild
2. when VSI is already rebuild, but not yet open
The latter case is actually already handled through !netif_running() case,
we just need to adjust flag checking a little. The former one is not as
trivial, because between ice_vsi_close() and ice_vsi_rebuild(), a lot of
hardware interaction happens, this can make adding/deleting rings exit
with an error. Luckily, VSI rebuild is pending and can apply new
configuration for us in a managed fashion.
Therefore, add an additional VSI state flag ICE_VSI_REBUILD_PENDING to
indicate that ice_xdp() can just hot-swap the program.
Also, as ice_vsi_rebuild() flow is touched in this patch, make it more
consistent by deconfiguring VSI when coalesce allocation fails.
Fixes: 2d4238f556 ("ice: Add support for AF_XDP")
Fixes: efc2214b60 ("ice: Add support for XDP")
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Currently, netif_queue_set_napi() is called from ice_vsi_rebuild() that is
not rtnl-locked when called from the reset. This creates the need to take
the rtnl_lock just for a single function and complicates the
synchronization with .ndo_bpf. At the same time, there no actual need to
fill napi-to-queue information at this exact point.
Fill napi-to-queue information when opening the VSI and clear it when the
VSI is being closed. Those routines are already rtnl-locked.
Also, rewrite napi-to-queue assignment in a way that prevents inclusion of
XDP queues, as this leads to out-of-bounds writes, such as one below.
[ +0.000004] BUG: KASAN: slab-out-of-bounds in netif_queue_set_napi+0x1c2/0x1e0
[ +0.000012] Write of size 8 at addr ffff889881727c80 by task bash/7047
[ +0.000006] CPU: 24 PID: 7047 Comm: bash Not tainted 6.10.0-rc2+ #2
[ +0.000004] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0014.082620210524 08/26/2021
[ +0.000003] Call Trace:
[ +0.000003] <TASK>
[ +0.000002] dump_stack_lvl+0x60/0x80
[ +0.000007] print_report+0xce/0x630
[ +0.000007] ? __pfx__raw_spin_lock_irqsave+0x10/0x10
[ +0.000007] ? __virt_addr_valid+0x1c9/0x2c0
[ +0.000005] ? netif_queue_set_napi+0x1c2/0x1e0
[ +0.000003] kasan_report+0xe9/0x120
[ +0.000004] ? netif_queue_set_napi+0x1c2/0x1e0
[ +0.000004] netif_queue_set_napi+0x1c2/0x1e0
[ +0.000005] ice_vsi_close+0x161/0x670 [ice]
[ +0.000114] ice_dis_vsi+0x22f/0x270 [ice]
[ +0.000095] ice_pf_dis_all_vsi.constprop.0+0xae/0x1c0 [ice]
[ +0.000086] ice_prepare_for_reset+0x299/0x750 [ice]
[ +0.000087] pci_dev_save_and_disable+0x82/0xd0
[ +0.000006] pci_reset_function+0x12d/0x230
[ +0.000004] reset_store+0xa0/0x100
[ +0.000006] ? __pfx_reset_store+0x10/0x10
[ +0.000002] ? __pfx_mutex_lock+0x10/0x10
[ +0.000004] ? __check_object_size+0x4c1/0x640
[ +0.000007] kernfs_fop_write_iter+0x30b/0x4a0
[ +0.000006] vfs_write+0x5d6/0xdf0
[ +0.000005] ? fd_install+0x180/0x350
[ +0.000005] ? __pfx_vfs_write+0x10/0xA10
[ +0.000004] ? do_fcntl+0x52c/0xcd0
[ +0.000004] ? kasan_save_track+0x13/0x60
[ +0.000003] ? kasan_save_free_info+0x37/0x60
[ +0.000006] ksys_write+0xfa/0x1d0
[ +0.000003] ? __pfx_ksys_write+0x10/0x10
[ +0.000002] ? __x64_sys_fcntl+0x121/0x180
[ +0.000004] ? _raw_spin_lock+0x87/0xe0
[ +0.000005] do_syscall_64+0x80/0x170
[ +0.000007] ? _raw_spin_lock+0x87/0xe0
[ +0.000004] ? __pfx__raw_spin_lock+0x10/0x10
[ +0.000003] ? file_close_fd_locked+0x167/0x230
[ +0.000005] ? syscall_exit_to_user_mode+0x7d/0x220
[ +0.000005] ? do_syscall_64+0x8c/0x170
[ +0.000004] ? do_syscall_64+0x8c/0x170
[ +0.000003] ? do_syscall_64+0x8c/0x170
[ +0.000003] ? fput+0x1a/0x2c0
[ +0.000004] ? filp_close+0x19/0x30
[ +0.000004] ? do_dup2+0x25a/0x4c0
[ +0.000004] ? __x64_sys_dup2+0x6e/0x2e0
[ +0.000002] ? syscall_exit_to_user_mode+0x7d/0x220
[ +0.000004] ? do_syscall_64+0x8c/0x170
[ +0.000003] ? __count_memcg_events+0x113/0x380
[ +0.000005] ? handle_mm_fault+0x136/0x820
[ +0.000005] ? do_user_addr_fault+0x444/0xa80
[ +0.000004] ? clear_bhb_loop+0x25/0x80
[ +0.000004] ? clear_bhb_loop+0x25/0x80
[ +0.000002] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ +0.000005] RIP: 0033:0x7f2033593154
Fixes: 080b0c8d6d ("ice: Fix ASSERT_RTNL() warning during certain scenarios")
Fixes: 91fdbce7e8 ("ice: Add support in the driver for associating queue with napi")
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Amritha Nambiar <amritha.nambiar@intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The call of of_get_child_by_name() will cause refcount incremented
for leds, if it succeeds, it should call of_node_put() to decrease
it, fix it.
Fixes: 01e5b728e9 ("net: phy: Add a binding for PHY LEDs")
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20240830022025.610844-1-ruanjinjie@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
We are not using ndev->stats for rx_packets and rx_bytes anymore.
Instead, we use per CPU stats which are collated in
am65_cpsw_nuss_ndo_get_stats().
Fix RX statistics for XDP_TX and XDP_REDIRECT cases.
Fixes: 8acacc40f7 ("net: ethernet: ti: am65-cpsw: Add minimal XDP support")
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Julien Panis <jpanis@baylibre.com>
Reviewed-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
If number of TX queues are set to 1 we get a NULL pointer
dereference during XDP_TX.
~# ethtool -L eth0 tx 1
~# ./xdp-trafficgen udp -A <ipv6-src> -a <ipv6-dst> eth0 -t 2
Transmitting on eth0 (ifindex 2)
[ 241.135257] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000030
Fix this by using actual TX queues instead of max TX queues
when picking the TX channel in am65_cpsw_ndo_xdp_xmit().
Fixes: 8acacc40f7 ("net: ethernet: ti: am65-cpsw: Add minimal XDP support")
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Julien Panis <jpanis@baylibre.com>
Reviewed-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The following XDP_DROP test from [1] stalls the interface after
250 packets.
~# xdb-bench drop -m native eth0
This is because new RX requests are never queued. Fix that.
The below XDP_TX test from [1] fails with a warning
[ 499.947381] XDP_WARN: xdp_update_frame_from_buff(line:277): Driver BUG: missing reserved tailroom
~# xdb-bench tx -m native eth0
Fix that by using PAGE_SIZE during xdp_init_buf().
In XDP_REDIRECT case only 1 packet was processed in rx_poll.
Fix it to process up to budget packets.
Fix all XDP error cases to call trace_xdp_exception() and drop the packet
in am65_cpsw_run_xdp().
[1] xdp-tools suite https://github.com/xdp-project/xdp-tools
Fixes: 8acacc40f7 ("net: ethernet: ti: am65-cpsw: Add minimal XDP support")
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Julien Panis <jpanis@baylibre.com>
Reviewed-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFHBAABCgAxFiEEUEC6huC2BN0pvD5fKDiiPnotvG8FAmbSPaoTHG1rbEBwZW5n
dXRyb25peC5kZQAKCRAoOKI+ei28b01hB/4taX+ezcqTmL0crMQ7N1JCLsWg8kax
A4PTh7Zksj7OPY142Zn3M9D2xyj+ZACQQ9+vSS04Ex+i3CaGPZw4eVk0scxDbU8z
gkotQTk8a/+8dHJG0HMkXoLrp50YECVF2SsaiUXclrfpPDd6WIRadcvf6TUdVsI5
Z7B9tyIad7SEYj8r0iDHje3k1GaYkEqp5mqaB38y5RsDiNXa0mO6uqkbTT8WgooL
KLc8ecB9/igpXIylQghEkfuWpsNAFSG6lZsblhL2/DlE9w5cmrdMo+oEd0+OkJTh
+iyPi6NVaSyW/whmwhePi3RsIsCazGGUG1mKkaLJoOTJDmAGvj8f3fAi
=mZ7/
-----END PGP SIGNATURE-----
Merge tag 'linux-can-fixes-for-6.11-20240830' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2024-08-30
The first patch is by Kuniyuki Iwashima for the CAN BCM protocol that
adds a missing proc entry removal when a device unregistered.
Simon Horman fixes the cleanup in the error cleanup path of the m_can
driver's open function.
Markus Schneider-Pargmann contributes 7 fixes for the m_can driver,
all related to the recently added IRQ coalescing support.
The next 2 patches are by me, target the mcp251xfd driver and fix ring
and coalescing configuration problems when switching from CAN-CC to
CAN-FD mode.
Simon Arlott's patch fixes a possible deadlock in the mcp251x driver.
The last patch is by Martin Jocic for the kvaser_pciefd driver and
fixes a problem with lost IRQs, which result in starvation, under high
load situations.
* tag 'linux-can-fixes-for-6.11-20240830' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
can: kvaser_pciefd: Use a single write when releasing RX buffers
can: mcp251x: fix deadlock if an interrupt occurs during mcp251x_open
can: mcp251xfd: mcp251xfd_ring_init(): check TX-coalescing configuration
can: mcp251xfd: fix ring configuration when switching from CAN-CC to CAN-FD mode
can: m_can: Limit coalescing to peripheral instances
can: m_can: Reset cached active_interrupts on start
can: m_can: disable_all_interrupts, not clear active_interrupts
can: m_can: Do not cancel timer from within timer
can: m_can: Remove m_can_rx_peripheral indirection
can: m_can: Remove coalesing disable in isr during suspend
can: m_can: Reset coalescing during suspend/resume
can: m_can: Release irq on error in m_can_open
can: bcm: Remove proc entry when dev is unregistered.
====================
Link: https://patch.msgid.link/20240830215914.1610393-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This reverts commit 7f0343b7b8.
We are going to revert commit 166a490f59 ("wifi: ath11k: support hibernation"), on
which this commit depends. With that commit reverted, this one is not needed any
more, so revert this commit first.
Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://patch.msgid.link/20240830073420.5790-2-quic_bqiang@quicinc.com
Call rtnl_unlock() on this error path, before returning.
Fixes: bc23aa949a ("igc: Add pcie error handler support")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a clear use-after-free error. We remove it, and rely on checking
the return code of vcap_del_rule.
Reported-by: Dan Carpenter <error27@gmail.com>
Closes: https://lore.kernel.org/kernel-janitors/7bffefc6-219a-4f71-baa0-ad4526e5c198@kili.mountain/
Fixes: c956b9b318 ("net: microchip: sparx5: Adding KUNIT tests of key/action values in VCAP API")
Signed-off-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
0x7d and 0x7e bytes are meant to be escaped in the data portion of
frames, but this didn't occur since next_chunk_len() had an off-by-one
error. That also resulted in the final byte of a payload being written
as a separate tty write op.
The chunk prior to an escaped byte would be one byte short, and the
next call would never test the txpos+1 case, which is where the escaped
byte was located. That meant it never hit the escaping case in
mctp_serial_tx_work().
Example Input: 01 00 08 c8 7e 80 02
Previous incorrect chunks from next_chunk_len():
01 00 08
c8 7e 80
02
With this fix:
01 00 08 c8
7e
80 02
Cc: stable@vger.kernel.org
Fixes: a0c2ccd9b5 ("mctp: Add MCTP-over-serial transport binding")
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Test various edge cases of inputs that contain characters
that need escaping.
This adds a new kunit suite for mctp-serial.
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kvaser's PCIe cards uses the KCAN FPGA IP block which has dual 4K
buffers for incoming messages shared by all (currently up to eight)
channels. While the driver processes messages in one buffer, new
incoming messages are stored in the other and so on.
The design of KCAN is such that a buffer must be fully read and then
released. Releasing a buffer will make the FPGA switch buffers. If the
other buffer contains at least one incoming message the FPGA will also
instantly issue a new interrupt, if not the interrupt will be issued
after receiving the first new message.
With IRQx interrupts, it takes a little time for the interrupt to
happen, enough for any previous ISR call to do it's business and
return, but MSI interrupts are way faster so this time is reduced to
almost nothing.
So with MSI, releasing the buffer HAS to be the very last action of
the ISR before returning, otherwise the new interrupt might be
"masked" by the kernel because the previous ISR call hasn't returned.
And the interrupts are edge-triggered so we cannot loose one, or the
ping-pong reading process will stop.
This is why this patch modifies the driver to use a single write to
the SRB_CMD register before returning.
Signed-off-by: Martin Jocic <martin.jocic@kvaser.com>
Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Link: https://patch.msgid.link/20240830153113.2081440-1-martin.jocic@kvaser.com
Fixes: 26ad340e58 ("can: kvaser_pciefd: Add driver for Kvaser PCIEcan devices")
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2024-08-28 (igb, ice)
This series contains updates to igb and ice drivers.
Daiwei Li restores writing the TSICR (TimeSync Interrupt Cause)
register on 82850 devices to workaround a hardware issue for igb.
Dawid detaches netdev device for reset to avoid ethtool accesses during
reset causing NULL pointer dereferences on ice.
* '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
ice: Add netif_device_attach/detach into PF reset flow
igb: Fix not clearing TimeSync interrupts for 82580
====================
Link: https://patch.msgid.link/20240828225444.645154-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* wfx: fix for open network connection
* iwlwifi: fix for hibernate (due to fast resume feature)
* iwlwifi: fix for a few warnings that were recently added
(had previously been messages not warnings)
Previously broken:
* mwifiex: fix static structures used for per-device data
* iwlwifi: some harmless FW related messages were tagged
too high priority
* iwlwifi: scan buffers weren't checked correctly
* mac80211: SKB leak on beacon error path
* iwlwifi: fix ACPI table interop with certain BIOSes
* iwlwifi: fix locking for link selection
* mac80211: fix SSID comparison in beacon validation
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEpeA8sTs3M8SN2hR410qiO8sPaAAFAmbO9O8ACgkQ10qiO8sP
aACYDA//TwIfA3+SMUducdNfLjRPa8hoxitZqXUdMVHO/gj6b5isz7g15KgL6UV6
NtJyPSLfYPY4UHsXvEs6zreLQMoQgjpk58sq8uE0y9KxjNzeVd9Qy0rq66KqZR5p
N9bM10qx1oT6CkFK8F1GGKKmOVuYm3bLtwcvCZICcMBi1ziuK6A5FoOFdbzXdUET
QaDFc5P6rtNh0WAtsGjoed+9GQ+AhXcBloohfRZvCHCaY8jMdJyAKACpBQmPUSbX
b1WVwp/cLNeUBmkhhfPE2xJmur4WiI2s/Xy94b5naUIn33re1AfAu2msuhQCyoHv
LiF9wG++NW2Gbh7F5vFt4PVPgCKgkhgmPXDy+yFJose4Ay6wpq8nr0XzgG4Y/mWQ
GYElJrZPQnqxnSMa3VEgLBuGas0KOTfAwXIIGZiFDVlAlPGO1zL5fti25pCQbkKH
NYcNuWCUX5jPrCLjx0SL9P7BLpr8eiWtvq9h+8DfkNWrcSFllIJzkCKYbP7ps9cE
KmFp4ilqmKlYKcMLQM8ePgx2vPegQfmPczcqdYAMJcnXmOcQS7s47jI6BV41CVPt
ls2+/cO2BSexKB4yGWSbF8Xk9+67UL8h10Z0OJ5gD7zWESwyRi2N7q5CiA5XOTLu
RrbkH3uPzWLP+A/uXkQBMUEihaJSwfAO67Wrzi91kSdyPWmervo=
=024u
-----END PGP SIGNATURE-----
Merge tag 'wireless-2024-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Johannes Berg says:
====================
Regressions:
* wfx: fix for open network connection
* iwlwifi: fix for hibernate (due to fast resume feature)
* iwlwifi: fix for a few warnings that were recently added
(had previously been messages not warnings)
Previously broken:
* mwifiex: fix static structures used for per-device data
* iwlwifi: some harmless FW related messages were tagged
too high priority
* iwlwifi: scan buffers weren't checked correctly
* mac80211: SKB leak on beacon error path
* iwlwifi: fix ACPI table interop with certain BIOSes
* iwlwifi: fix locking for link selection
* mac80211: fix SSID comparison in beacon validation
* tag 'wireless-2024-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
wifi: iwlwifi: clear trans->state earlier upon error
wifi: wfx: repair open network AP mode
wifi: mac80211: free skb on error path in ieee80211_beacon_get_ap()
wifi: iwlwifi: mvm: don't wait for tx queues if firmware is dead
wifi: iwlwifi: mvm: allow 6 GHz channels in MLO scan
wifi: iwlwifi: mvm: pause TCM when the firmware is stopped
wifi: iwlwifi: fw: fix wgds rev 3 exact size
wifi: iwlwifi: mvm: take the mutex before running link selection
wifi: iwlwifi: mvm: fix iwl_mvm_max_scan_ie_fw_cmd_room()
wifi: iwlwifi: mvm: fix iwl_mvm_scan_fits() calculation
wifi: iwlwifi: lower message level for FW buffer destination
wifi: iwlwifi: mvm: fix hibernation
wifi: mac80211: fix beacon SSID mismatch handling
wifi: mwifiex: duplicate static structs used in driver instances
====================
Link: https://patch.msgid.link/20240828100151.23662-3-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ethtool callbacks can be executed while reset is in progress and try to
access deleted resources, e.g. getting coalesce settings can result in a
NULL pointer dereference seen below.
Reproduction steps:
Once the driver is fully initialized, trigger reset:
# echo 1 > /sys/class/net/<interface>/device/reset
when reset is in progress try to get coalesce settings using ethtool:
# ethtool -c <interface>
BUG: kernel NULL pointer dereference, address: 0000000000000020
PGD 0 P4D 0
Oops: Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 11 PID: 19713 Comm: ethtool Tainted: G S 6.10.0-rc7+ #7
RIP: 0010:ice_get_q_coalesce+0x2e/0xa0 [ice]
RSP: 0018:ffffbab1e9bcf6a8 EFLAGS: 00010206
RAX: 000000000000000c RBX: ffff94512305b028 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff9451c3f2e588 RDI: ffff9451c3f2e588
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: ffff9451c3f2e580 R11: 000000000000001f R12: ffff945121fa9000
R13: ffffbab1e9bcf760 R14: 0000000000000013 R15: ffffffff9e65dd40
FS: 00007faee5fbe740(0000) GS:ffff94546fd80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000020 CR3: 0000000106c2e005 CR4: 00000000001706f0
Call Trace:
<TASK>
ice_get_coalesce+0x17/0x30 [ice]
coalesce_prepare_data+0x61/0x80
ethnl_default_doit+0xde/0x340
genl_family_rcv_msg_doit+0xf2/0x150
genl_rcv_msg+0x1b3/0x2c0
netlink_rcv_skb+0x5b/0x110
genl_rcv+0x28/0x40
netlink_unicast+0x19c/0x290
netlink_sendmsg+0x222/0x490
__sys_sendto+0x1df/0x1f0
__x64_sys_sendto+0x24/0x30
do_syscall_64+0x82/0x160
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7faee60d8e27
Calling netif_device_detach() before reset makes the net core not call
the driver when ethtool command is issued, the attempt to execute an
ethtool command during reset will result in the following message:
netlink error: No such device
instead of NULL pointer dereference. Once reset is done and
ice_rebuild() is executing, the netif_device_attach() is called to allow
for ethtool operations to occur again in a safe manner.
Fixes: fcea6f3da5 ("ice: Add stats and ethtool support")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Igor Bagnucki <igor.bagnucki@intel.com>
Signed-off-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Reviewed-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
82580 NICs have a hardware bug that makes it
necessary to write into the TSICR (TimeSync Interrupt Cause) register
to clear it:
https://lore.kernel.org/all/CDCB8BE0.1EC2C%25matthew.vick@intel.com/
Add a conditional so only for 82580 we write into the TSICR register,
so we don't risk losing events for other models.
Without this change, when running ptp4l with an Intel 82580 card,
I get the following output:
> timed out while polling for tx timestamp increasing tx_timestamp_timeout or
> increasing kworker priority may correct this issue, but a driver bug likely
> causes it
This goes away with this change.
This (partially) reverts commit ee14cc9ea1 ("igb: Fix missing time sync events").
Fixes: ee14cc9ea1 ("igb: Fix missing time sync events")
Closes: https://lore.kernel.org/intel-wired-lan/CAN0jFd1kO0MMtOh8N2Ztxn6f7vvDKp2h507sMryobkBKe=xk=w@mail.gmail.com/
Tested-by: Daiwei Li <daiweili@google.com>
Suggested-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Daiwei Li <daiweili@google.com>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
When sockfd_lookup() fails, gtp_encap_enable_socket() returns a
NULL pointer, but its callers only check for error pointers thus miss
the NULL pointer case.
Fix it by returning an error pointer with the error code carried from
sockfd_lookup().
(I found this bug during code inspection.)
Fixes: 1e3a3abd8b ("gtp: make GTP sockets in gtp_newlink optional")
Cc: Andreas Schultz <aschultz@tpip.net>
Cc: Harald Welte <laforge@gnumonks.org>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Link: https://patch.msgid.link/20240825191638.146748-1-xiyou.wangcong@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a local variable for slave->dev, to prepare for the lock change in
the next patch. There is no functionality change.
Fixes: 9a5605505d ("bonding: Add struct bond_ipesc to manage SA")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Jay Vosburgh <jv@jvosburgh.net>
Link: https://patch.msgid.link/20240823031056.110999-3-jianbol@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add this implementation for bonding, so hardware resources can be
freed from the active slave after xfrm state is deleted. The netdev
used to invoke xdo_dev_state_free callback, is saved in the xfrm state
(xs->xso.real_dev), which is also the bond's active slave. To prevent
it from being freed, acquire netdev reference before leaving RCU
read-side critical section, and release it after callback is done.
And call it when deleting all SAs from old active real interface while
switching current active slave.
Fixes: 9a5605505d ("bonding: Add struct bond_ipesc to manage SA")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Jay Vosburgh <jv@jvosburgh.net>
Link: https://patch.msgid.link/20240823031056.110999-2-jianbol@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
With recent work to the doorbell workaround code a small hole was
introduced that could cause a tx_timeout. This happens if the rx
dbell_deadline goes beyond the netdev watchdog timeout set by the driver
(i.e. 2 seconds). Fix this by changing the netdev watchdog timeout to 5
seconds and reduce the max rx dbell_deadline to 4 seconds.
The test that can reproduce the issue being fixed is a multi-queue send
test via pktgen with the "burst" setting to 1. This causes the queue's
doorbell to be rung on every packet sent to the driver, which may result
in the device missing doorbells due to the high doorbell rate.
Cc: stable@vger.kernel.org
Fixes: 4ded136c78 ("ionic: add work item for missed-doorbell check")
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://patch.msgid.link/20240822192557.9089-1-brett.creeley@amd.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
When the firmware crashes, we first told the op_mode and only then,
changed the transport's state. This is a problem if the op_mode's
nic_error() handler needs to send a host command: it'll see that the
transport's state still reflects that the firmware is alive.
Today, this has no consequences since we set the STATUS_FW_ERROR bit and
that will prevent sending host commands. iwl_fw_dbg_stop_restart_recording
looks at this bit to know not to send a host command for example.
To fix the hibernation, we needed to reset the firmware without having
an error and checking STATUS_FW_ERROR to see whether the firmware is
alive will no longer hold, so this change is necessary as well.
Change the flow a bit.
Change trans->state before calling the op_mode's nic_error() method and
check trans->state instead of STATUS_FW_ERROR. This will keep the
current behavior of iwl_fw_dbg_stop_restart_recording upon firmware
error, and it'll allow us to call iwl_fw_dbg_stop_restart_recording
safely even if STATUS_FW_ERROR is clear, but yet, the firmware is not
alive.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20240825191257.9d7427fbdfd7.Ia056ca57029a382c921d6f7b6a6b28fc480f2f22@changeid
[I missed this was a dependency for the hibernation fix, changed
the commit message a bit accordingly]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
RSN IE missing in beacon is normal in open networks.
Avoid returning -EINVAL in this case.
Steps to reproduce:
$ cat /etc/wpa_supplicant.conf
network={
ssid="testNet"
mode=2
key_mgmt=NONE
}
$ wpa_supplicant -iwlan0 -c /etc/wpa_supplicant.conf
nl80211: Beacon set failed: -22 (Invalid argument)
Failed to set beacon parameters
Interface initialization failed
wlan0: interface state UNINITIALIZED->DISABLED
wlan0: AP-DISABLED
wlan0: Unable to setup interface.
Failed to initialize AP interface
After the change:
$ wpa_supplicant -iwlan0 -c /etc/wpa_supplicant.conf
Successfully initialized wpa_supplicant
wlan0: interface state UNINITIALIZED->ENABLED
wlan0: AP-ENABLED
Cc: stable@vger.kernel.org
Fixes: fe0a7776d4 ("wifi: wfx: fix possible NULL pointer dereference in wfx_set_mfp_ap()")
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@siemens.com>
Reviewed-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://patch.msgid.link/20240823131521.3309073-1-alexander.sverdlin@siemens.com
Crash is seen on AM64x 10M link when connecting / disconnecting multiple
times.
The fix for this is to enable quirk_10m_link_issue for AM64x.
Fixes: b256e13378 ("net: ti: icssg-prueth: Add AM64x icssg support")
Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Link: https://patch.msgid.link/20240823120412.1262536-1-danishanwar@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
There is a WARNING in iwl_trans_wait_tx_queues_empty() (that was
recently converted from just a message), that can be hit if we
wait for TX queues to become empty after firmware died. Clearly,
we can't expect anything from the firmware after it's declared dead.
Don't call iwl_trans_wait_tx_queues_empty() in this case. While it could
be a good idea to stop the flow earlier, the flush functions do some
maintenance work that is not related to the firmware, so keep that part
of the code running even when the firmware is not running.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20240825191257.a7cbd794cee9.I44a739fbd4ffcc46b83844dd1c7b2eb0c7b270f6@changeid
[edit commit message]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
MLO internal scan may include 6 GHz channels. Since the 6 GHz scan
indication is not set, the channel flags are set incorrectly, which
leads to a firmware assert.
Since the MLO scan may include 6 GHz and non 6 GHz channels in one
request, add support for non-PSC 6 GHz channels (PSC channels are
already supported) when the 6 GHz indication is not set.
Fixes: 38b3998dfb ("wifi: iwlwifi: mvm: Introduce internal MLO passive scan")
Signed-off-by: Avraham Stern <avraham.stern@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20240825191257.04807f8213b2.Idd09d4366df92a74853649c1a520b7f0f752d1ac@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Not doing so will make us send a host command to the transport while the
firmware is not alive, which will trigger a WARNING.
bad state = 0
WARNING: CPU: 2 PID: 17434 at drivers/net/wireless/intel/iwlwifi/iwl-trans.c:115 iwl_trans_send_cmd+0x1cb/0x1e0 [iwlwifi]
RIP: 0010:iwl_trans_send_cmd+0x1cb/0x1e0 [iwlwifi]
Call Trace:
<TASK>
iwl_mvm_send_cmd+0x40/0xc0 [iwlmvm]
iwl_mvm_config_scan+0x198/0x260 [iwlmvm]
iwl_mvm_recalc_tcm+0x730/0x11d0 [iwlmvm]
iwl_mvm_tcm_work+0x1d/0x30 [iwlmvm]
process_one_work+0x29e/0x640
worker_thread+0x2df/0x690
? rescuer_thread+0x540/0x540
kthread+0x192/0x1e0
? set_kthread_struct+0x90/0x90
ret_from_fork+0x22/0x30
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20240825191257.5abe71ca1b6b.I97a968cb8be1f24f94652d9b110ecbf6af73f89e@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Check size of WGDS revision 3 is equal to 8 entries size with some header,
but doesn't depend on the number of used entries. Check that used entries
are between min and max but allow more to be present than are used to fix
operation with some BIOSes that have such data.
Fixes: 97f8a3d161 ("iwlwifi: ACPI: support revision 3 WGDS tables")
Signed-off-by: Anjaneyulu <pagadala.yesu.anjaneyulu@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20240825191257.cc71dfc67ec3.Ic27ee15ac6128b275c210b6de88f2145bd83ca7b@changeid
[edit commit message]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
iwl_mvm_select_links is called by the link selection worker and it
requires the mutex.
Take it in the link selection worker.
This logic used to run from iwl_mvm_rx_umac_scan_complete_notif which
had the mvm->mutex held. This was changed to run in a worker holding the
wiphy mutex, but we also need the mvm->mutex.
Fixes: 2e194efa38 ("wifi: iwlwifi: mvm: Fix race in scan completion")
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Reviewed-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20240825191257.0cacecd5db1e.Iaca38a078592b69bdd06549daf63408ccf1810e4@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The calculation should consider also the 6GHz IE's len, fix that.
In addition, in iwl_mvm_sched_scan_start() the scan_fits helper is
called only in case non_psc_incldued is true, but it should be called
regardless, fix that as well.
Signed-off-by: Daniel Gabay <daniel.gabay@intel.com>
Reviewed-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20240825191257.7db825442fd2.I99f4d6587709de02072fd57957ec7472331c6b1d@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Fast resume is a feature that was recently introduced to speed up the
resume time. It basically keeps the firmware alive while the system
is suspended and that avoids starting again the whole device.
This flow can't work for hibernation, since when the system boots,
before the frozen image is loaded, the kernel may touch the device. As a
result, we can't assume the device is in the exact same state as before
the hibernation.
Detect that we are resuming from hibernation through the PCI device and
forbid the fast resume flow. We also need to shut down the device
cleanly when that happens.
In addition, in case the device is power gated during S3, we won't be
able to keep the device alive. Detect this situation with BE200 at least
with the help of the CSR_FUNC_SCRATCH register and reset the device upon
resume if it was power gated during S3.
Fixes: e8bb19c1d5 ("wifi: iwlwifi: support fast resume")
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20240825191257.24eb3b19e74f.I3837810318dbef0a0a773cf4c4fcf89cdc6fdbd3@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The driver must ensure TX descriptor updates are visible
before updating TX pointer and TX clear pointer.
This resolves TX hangs observed on AST2600 when running
iperf3.
Signed-off-by: Jacky Chou <jacky_chou@aspeedtech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The mana_hwc_rx_event_handler() / mana_hwc_handle_resp() calls
complete(&ctx->comp_event) before posting the wqe back. It's
possible that other callers, like mana_create_txq(), start the
next round of mana_hwc_send_request() before the posting of wqe.
And if the HW is fast enough to respond, it can hit no_wqe error
on the HW channel, then the response message is lost. The mana
driver may fail to create queues and open, because of waiting for
the HW response and timed out.
Sample dmesg:
[ 528.610840] mana 39d4:00:02.0: HWC: Request timed out!
[ 528.614452] mana 39d4:00:02.0: Failed to send mana message: -110, 0x0
[ 528.618326] mana 39d4:00:02.0 enP14804s2: Failed to create WQ object: -110
To fix it, move posting of rx wqe before complete(&ctx->comp_event).
Cc: stable@vger.kernel.org
Fixes: ca9c54d2d6 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Long Li <longli@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>