net: Handle napi_schedule() calls from non-interrupt

napi_schedule() is expected to be called either:

* From an interrupt, where raised softirqs are handled on IRQ exit

* From a softirq disabled section, where raised softirqs are handled on
  the next call to local_bh_enable().

* From a softirq handler, where raised softirqs are handled on the next
  round in do_softirq(), or further deferred to a dedicated kthread.

Other bare task contexts may end up ignoring the raised NET_RX vector
until the next random softirq handling opportunity, which may not happen
for a while if the CPU goes idle afterwards with the tick stopped.

Such "misuses" have been detected in several places thanks to messages
of the kind:

	"NOHZ tick-stop error: local softirq work is pending, handler #08!!!"

For example:

	__raise_softirq_irqoff
	__napi_schedule
	rtl8152_runtime_resume.isra.0
	rtl8152_resume
	usb_resume_interface.isra.0
	usb_resume_both
	__rpm_callback
	rpm_callback
	rpm_resume
	__pm_runtime_resume
	usb_autoresume_device
	usb_remote_wakeup
	hub_event
	process_one_work
	worker_thread
	kthread
	ret_from_fork
	ret_from_fork_asm

And also:

* drivers/net/usb/r8152.c::rtl_work_func_t
* drivers/net/netdevsim/netdev.c::nsim_start_xmit

There is a long history of issues of this kind:

	019edd01d1 ("ath10k: sdio: Add missing BH locking around napi_schdule()")
	3300685893 ("idpf: disable local BH when scheduling napi for marker packets")
	e3d5d70cb4 ("net: lan78xx: fix "softirq work is pending" error")
	e55c27ed9c ("mt76: mt7615: add missing bh-disable around rx napi schedule")
	c0182aa985 ("mt76: mt7915: add missing bh-disable around tx napi enable/schedule")
	970be1dff2 ("mt76: disable BH around napi_schedule() calls")
	019edd01d1 ("ath10k: sdio: Add missing BH locking around napi_schdule()")
	30bfec4fec ("can: rx-offload: can_rx_offload_threaded_irq_finish(): add new function to be called from threaded interrupt")
	e63052a5dd ("mlx5e: add add missing BH locking around napi_schdule()")
	83a0c6e589 ("i40e: Invoke softirqs after napi_reschedule")
	bd4ce941c8 ("mlx4: Invoke softirqs after napi_reschedule")
	8cf699ec84 ("mlx4: do not call napi_schedule() without care")
	ec13ee8014 ("virtio_net: invoke softirqs after __napi_schedule")

This shows that relying on the caller to arrange a proper context for
the softirqs to be handled while calling napi_schedule() is very fragile
and error prone. Fixing the callers can also prove challenging when a
caller may run in different kinds of contexts.

Therefore fix this from napi_schedule() itself by waking up ksoftirqd
when softirqs are raised from task contexts.

Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reported-by: Jakub Kicinski <kuba@kernel.org>
Reported-by: Francois Romieu <romieu@fr.zoreil.com>
Closes: https://lore.kernel.org/lkml/354a2690-9bbf-4ccb-8769-fa94707a9340@molgen.mpg.de/
Cc: Breno Leitao <leitao@debian.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250223221708.27130-1-frederic@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
parent 49806fe6e6
commit 77e45145e3
1 changed file with 1 addition and 1 deletion
@@ -4757,7 +4757,7 @@ use_local_napi:
 	 * we have to raise NET_RX_SOFTIRQ.
 	 */
 	if (!sd->in_net_rx_action)
-		__raise_softirq_irqoff(NET_RX_SOFTIRQ);
+		raise_softirq_irqoff(NET_RX_SOFTIRQ);
 }
 
 #ifdef CONFIG_RPS
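
To illustrate why the one-line change matters, here is a toy userspace model of the two raise helpers. It is only a sketch: every name prefixed with model_ or MODEL_ is hypothetical and does not exist in the kernel, the vector number 3 is assumed to mirror NET_RX_SOFTIRQ (which would match the 0x08 mask in the "handler #08" warning), and the real raise_softirq_irqoff() applies more conditions than this simplified check.

```c
#include <stdbool.h>

/* Toy model only: none of these names exist in the kernel. */
enum { MODEL_NET_RX_SOFTIRQ = 3 };	/* assumed to mirror NET_RX_SOFTIRQ */

struct model_cpu {
	unsigned int pending;		/* bitmask of raised softirq vectors */
	bool in_interrupt;		/* hardirq, softirq or BH-disabled section */
	unsigned int ksoftirqd_wakeups;	/* times the model "woke" ksoftirqd */
};

/* Models __raise_softirq_irqoff(): only marks the vector as pending. */
static void model_raise_bare(struct model_cpu *cpu, unsigned int nr)
{
	cpu->pending |= 1u << nr;
}

/*
 * Models raise_softirq_irqoff(): marks the vector as pending and, when
 * called from plain task context, also wakes ksoftirqd so that the
 * vector is guaranteed to be processed.
 */
static void model_raise(struct model_cpu *cpu, unsigned int nr)
{
	model_raise_bare(cpu, nr);
	if (!cpu->in_interrupt)
		cpu->ksoftirqd_wakeups++;
}

/*
 * True when a vector is pending but nothing guarantees it will run:
 * no IRQ exit or local_bh_enable() is coming, and ksoftirqd sleeps.
 */
static bool model_stranded(const struct model_cpu *cpu)
{
	return cpu->pending && !cpu->in_interrupt && !cpu->ksoftirqd_wakeups;
}
```

In this model, calling model_raise_bare() on a zeroed model_cpu leaves pending at 0x08 with nothing scheduled to handle it, which is the situation the tick-stop warning reports. Calling model_raise() from the same state records one ksoftirqd wakeup, and from an interrupt or BH-disabled state it records none, since the existing exit paths already handle the vector.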