List Kuniyuki as an official TCP reviewer.
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250202014728.1005003-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tony Nguyen says:
====================
ice: fix Rx data path for heavy 9k MTU traffic
Maciej Fijalkowski says:
This patchset fixes a pretty nasty issue that was reported by RedHat
folks which occurred after ~30 minutes (this value varied, just trying
here to state that it was not observed immediately but rather after a
considerable longer amount of time) when ice driver was tortured with
jumbo frames via mix of iperf traffic executed simultaneously with
wrk/nginx on client/server sides (HTTP and TCP workloads basically).
The reported splats were spanning across all the bad things that can
happen to the state of page - refcount underflow, use-after-free, etc.
One of these looked as follows:
[ 2084.019891] BUG: Bad page state in process swapper/34 pfn:97fcd0
[ 2084.025990] page:00000000a60ee772 refcount:-1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x97fcd0
[ 2084.035462] flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff)
[ 2084.041990] raw: 0017ffffc0000000 dead000000000100 dead000000000122 0000000000000000
[ 2084.049730] raw: 0000000000000000 0000000000000000 ffffffffffffffff 0000000000000000
[ 2084.057468] page dumped because: nonzero _refcount
[ 2084.062260] Modules linked in: bonding tls sunrpc intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common i10nm_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm mgag200 irqd
[ 2084.137829] CPU: 34 PID: 0 Comm: swapper/34 Kdump: loaded Not tainted 5.14.0-427.37.1.el9_4.x86_64 #1
[ 2084.147039] Hardware name: Dell Inc. PowerEdge R750/0216NK, BIOS 1.13.2 12/19/2023
[ 2084.154604] Call Trace:
[ 2084.157058] <IRQ>
[ 2084.159080] dump_stack_lvl+0x34/0x48
[ 2084.162752] bad_page.cold+0x63/0x94
[ 2084.166333] check_new_pages+0xb3/0xe0
[ 2084.170083] rmqueue_bulk+0x2d2/0x9e0
[ 2084.173749] ? ktime_get+0x35/0xa0
[ 2084.177159] rmqueue_pcplist+0x13b/0x210
[ 2084.181081] rmqueue+0x7d3/0xd40
[ 2084.184316] ? xas_load+0x9/0xa0
[ 2084.187547] ? xas_find+0x183/0x1d0
[ 2084.191041] ? xa_find_after+0xd0/0x130
[ 2084.194879] ? intel_iommu_iotlb_sync_map+0x89/0xe0
[ 2084.199759] get_page_from_freelist+0x11f/0x530
[ 2084.204291] __alloc_pages+0xf2/0x250
[ 2084.207958] ice_alloc_rx_bufs+0xcc/0x1c0 [ice]
[ 2084.212543] ice_clean_rx_irq+0x631/0xa20 [ice]
[ 2084.217111] ice_napi_poll+0xdf/0x2a0 [ice]
[ 2084.221330] __napi_poll+0x27/0x170
[ 2084.224824] net_rx_action+0x233/0x2f0
[ 2084.228575] __do_softirq+0xc7/0x2ac
[ 2084.232155] __irq_exit_rcu+0xa1/0xc0
[ 2084.235821] common_interrupt+0x80/0xa0
[ 2084.239662] </IRQ>
[ 2084.241768] <TASK>
The fix is mostly about reverting what was done in commit 1dc1a7e7f4
("ice: Centrallize Rx buffer recycling") followed by proper timing on
page_count() storage and then removing the ice_rx_buf::act related logic
(which was mostly introduced for purposes from cited commit).
Special thanks to Xu Du for providing reproducer and Jacob Keller for
initial extensive analysis.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
ice: stop storing XDP verdict within ice_rx_buf
ice: gather page_count()'s of each frag right before XDP prog call
ice: put Rx buffers after being done with current frame
====================
Link: https://patch.msgid.link/20250131185415.3741532-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Commit 4094871db1 ("udp: only do GSO if # of segs > 1") avoided GSO
for small packets. But the kernel currently dismisses GSO requests only
after checking MTU/PMTU on gso_size. This means any packets, regardless
of their payload sizes, could be dropped when PMTU becomes smaller than
requested gso_size. We encountered this issue in production and it
caused a reliability problem that new QUIC connection cannot be
established before PMTU cache expired, while non GSO sockets still
worked fine at the same time.
Ideally, do not check any GSO related constraints when payload size is
smaller than requested gso_size, and return EMSGSIZE instead of EINVAL
on MTU/PMTU check failure to be more specific on the error cause.
Fixes: 4094871db1 ("udp: only do GSO if # of segs > 1")
Signed-off-by: Yan Zhai <yan@cloudflare.com>
Suggested-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Disable PCIe AER on the tg3 device on system reboot on a limited
list of Dell PowerEdge systems. This prevents a fatal PCIe AER event
on the tg3 device during the ACPI _PTS (prepare to sleep) method for
S5 on those systems. The _PTS is invoked by acpi_enter_sleep_state_prep()
as part of the kernel's reboot sequence as a result of commit
38f34dba80 ("PM: ACPI: reboot: Reinstate S5 for reboot").
There was an earlier fix for this problem by commit 2ca1c94ce0
("tg3: Disable tg3 device on system reboot to avoid triggering AER").
But it was discovered that this earlier fix caused a reboot hang
when some Dell PowerEdge servers were booted via ipxe. To address
this reboot hang, the earlier fix was essentially reverted by commit
9fc3bc7643 ("tg3: power down device only on SYSTEM_POWER_OFF").
This re-exposed the tg3 PCIe AER on reboot problem.
This fix is not an ideal solution because the root cause of the AER
is in system firmware. Instead, it's a targeted work-around in the
tg3 driver.
Note also that the PCIe AER must be disabled on the tg3 device even
if the system is configured to use "firmware first" error handling.
V3:
- Fix sparse warning on improper comparison of pdev->current_state
- Adhere to netdev comment style
Fixes: 9fc3bc7643 ("tg3: power down device only on SYSTEM_POWER_OFF")
Signed-off-by: Lenny Szubowicz <lszubowi@redhat.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
If XDP traffic runs on a CPU which is greater than or equal to
the number of the Tx queues of the NIC, then vmxnet3_xdp_get_tq()
always picks up queue 0 for transmission as it uses reciprocal scale
instead of simple modulo operation.
vmxnet3_xdp_xmit() and vmxnet3_xdp_xmit_frame() use the above
returned queue without any locking which can lead to race conditions
when multiple XDP xmits run in parallel on different CPU's.
This patch uses a simple module scheme when the current CPU equals or
exceeds the number of Tx queues on the NIC. It also adds locking in
vmxnet3_xdp_xmit() and vmxnet3_xdp_xmit_frame() functions.
Fixes: 54f00cce11 ("vmxnet3: Add XDP support.")
Signed-off-by: Sankararaman Jayaraman <sankararaman.jayaraman@broadcom.com>
Signed-off-by: Ronak Doshi <ronak.doshi@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250131042340.156547-1-sankararaman.jayaraman@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add check for the return value of devm_kzalloc() to guarantee the success
of allocation.
Fixes: 42c2eb6b1f ("ice: Implement devlink-rate API")
Signed-off-by: Jiasheng Jiang <jiashengjiangcool@gmail.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/20250131013832.24805-1-jiashengjiangcool@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Some lwtunnels have a dst cache for post-transformation dst.
If the packet destination did not change we may end up recording
a reference to the lwtunnel in its own cache, and the lwtunnel
state will never be freed.
Discovered by the ioam6.sh test, kmemleak was recently fixed
to catch per-cpu memory leaks. I'm not sure if rpl and seg6
can actually hit this, but in principle I don't see why not.
Fixes: 8cb3bf8bff ("ipv6: ioam: Add support for the ip6ip6 encapsulation")
Fixes: 6c8702c60b ("ipv6: sr: add support for SRH encapsulation and injection with lwtunnels")
Fixes: a7a29f9c36 ("net: ipv6: add rpl sr tunnel")
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250130031519.2716843-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Some Wake-on-LAN modes such as WAKE_FILTER may only be supported by the MAC,
while others might be only supported by the PHY. Make sure that the .get_wol()
returns the union of both rather than only that of the PHY if the PHY supports
Wake-on-LAN.
When disabling Wake-on-LAN, make sure that this is done at both the PHY
and MAC level, rather than doing an early return from the PHY driver.
Fixes: 7e400ff35c ("net: bcmgenet: Add support for PHY-based Wake-on-LAN")
Fixes: 9ee09edc05 ("net: bcmgenet: Properly overlay PHY and MAC Wake-on-LAN capabilities")
Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250129231342.35013-1-florian.fainelli@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Idea behind having ice_rx_buf::act was to simplify and speed up the Rx
data path by walking through buffers that were representing cleaned HW
Rx descriptors. Since it caused us a major headache recently and we
rolled back to old approach that 'puts' Rx buffers right after running
XDP prog/creating skb, this is useless now and should be removed.
Get rid of ice_rx_buf::act and related logic. We still need to take care
of a corner case where XDP program releases a particular fragment.
Make ice_run_xdp() to return its result and use it within
ice_put_rx_mbuf().
Fixes: 2fba7dc515 ("ice: Add support for XDP multi-buffer on Rx side")
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
If we store the pgcnt on few fragments while being in the middle of
gathering the whole frame and we stumbled upon DD bit not being set, we
terminate the NAPI Rx processing loop and come back later on. Then on
next NAPI execution we work on previously stored pgcnt.
Imagine that second half of page was used actively by networking stack
and by the time we came back, stack is not busy with this page anymore
and decremented the refcnt. The page reuse algorithm in this case should
be good to reuse the page but given the old refcnt it will not do so and
attempt to release the page via page_frag_cache_drain() with
pagecnt_bias used as an arg. This in turn will result in negative refcnt
on struct page, which was initially observed by Xu Du.
Therefore, move the page count storage from ice_get_rx_buf() to a place
where we are sure that whole frame has been collected, but before
calling XDP program as it internally can also change the page count of
fragments belonging to xdp_buff.
Fixes: ac07533911 ("ice: Store page count inside ice_rx_buf")
Reported-and-tested-by: Xu Du <xudu@redhat.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Co-developed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Introduce a new helper ice_put_rx_mbuf() that will go through gathered
frags from current frame and will call ice_put_rx_buf() on them. Current
logic that was supposed to simplify and optimize the driver where we go
through a batch of all buffers processed in current NAPI instance turned
out to be broken for jumbo frames and very heavy load that was coming
from both multi-thread iperf and nginx/wrk pair between server and
client. The delay introduced by approach that we are dropping is simply
too big and we need to take the decision regarding page
recycling/releasing as quick as we can.
While at it, address an error path of ice_add_xdp_frag() - we were
missing buffer putting from day 1 there.
As a nice side effect we get rid of annoying and repetitive three-liner:
xdp->data = NULL;
rx_ring->first_desc = ntc;
rx_ring->nr_frags = 0;
by embedding it within introduced routine.
Fixes: 1dc1a7e7f4 ("ice: Centrallize Rx buffer recycling")
Reported-and-tested-by: Xu Du <xudu@redhat.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Co-developed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
but as usual there's a slight concentration of fixes for issues
added in the last two weeks before the MW, and driver bugs
from 6.13 which tend to get discovered upon wider distribution.
Including fixes from IPSec, netfilter and Bluetooth.
Current release - regressions:
- net: revert RTNL changes in unregister_netdevice_many_notify()
- Bluetooth: fix possible infinite recursion of btusb_reset
- eth: adjust locking in some old drivers which protect their state
with spinlocks to avoid sleeping in atomic; core protects
netdev state with a mutex now
Previous releases - regressions:
- eth: mlx5e: make sure we pass node ID, not CPU ID to kvzalloc_node()
- eth: bgmac: reduce max frame size to support just 1500 bytes;
the jumbo frame support would previously cause OOB writes,
but now fails outright
- mptcp: blackhole only if 1st SYN retrans w/o MPC is accepted,
avoid false detection of MPTCP blackholing
Previous releases - always broken:
- mptcp: handle fastopen disconnect correctly
- xfrm: make sure skb->sk is a full sock before accessing its fields
- xfrm: fix taking a lock with preempt disabled for RT kernels
- usb: ipheth: improve safety of packet metadata parsing; prevent
potential OOB accesses
- eth: renesas: fix missing rtnl lock in suspend/resume path
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmebzXsACgkQMUZtbf5S
IrvGBQ//auOF2yY1sg40fBvc6Hr1jpZBcr+uqTL6Qka1uVOvTFY51hAN54lBt32+
ixmcHsD0xdcHrr7VrqSXqurQLiGsdwUpnxZFCj/FymQuMunVysEqudvPeKDVHpsw
JW5c4nJOexEA2viByK9iB23Qq0P3uBoPEnKrbSTVSDvYaXUj6y8Cvt3/vXc+H/tc
T7GaxHH55NNNPkRz34YU3OWcaZsgkQEcdVpZf4tODPmg7J5VQj8SQeMhk/HI0sdO
WKjWB0woZkiQECtamqAOXnv47PXd6igv8NALRPlJcKjs0EszUvuYhD/9MEOeghjI
sjcQn9JnPpG+ca/qFVCSpEEOo2zGVn5dkJT5x26udH+5XHf7Pq+zpJwB6LHo98yF
bGMpIrF6gi2EnBtS/tRjMyBU9Ut9KiUtjXMvn9EsD1U1FGIbz6wyQLlT0pG0hwJb
rEdKfegrcyhWKHOD4vH9ciEg/7lgGfsGyfJDktIMdyailZc6tBcxwbdlc+5jRDA9
0RqGASIXaiX7AC3WOSeQzMgbV+WXdhaX/yrMJL5KBfBTzxG2audnJt1tPN3mbh3z
NM6M2cnMsoX4QLSiaukJaCL7LWHSTlVttZVg8FGXHj1PejMQQBVjGVvzk2UF55UR
gV7X9/VkXhmIDAZgThWtOdPLz+ItksfSKiruhUsXust6JgqRuqc=
=GjBE
-----END PGP SIGNATURE-----
Merge tag 'net-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from IPSec, netfilter and Bluetooth.
Nothing really stands out, but as usual there's a slight concentration
of fixes for issues added in the last two weeks before the merge
window, and driver bugs from 6.13 which tend to get discovered upon
wider distribution.
Current release - regressions:
- net: revert RTNL changes in unregister_netdevice_many_notify()
- Bluetooth: fix possible infinite recursion of btusb_reset
- eth: adjust locking in some old drivers which protect their state
with spinlocks to avoid sleeping in atomic; core protects netdev
state with a mutex now
Previous releases - regressions:
- eth:
- mlx5e: make sure we pass node ID, not CPU ID to kvzalloc_node()
- bgmac: reduce max frame size to support just 1500 bytes; the
jumbo frame support would previously cause OOB writes, but now
fails outright
- mptcp: blackhole only if 1st SYN retrans w/o MPC is accepted, avoid
false detection of MPTCP blackholing
Previous releases - always broken:
- mptcp: handle fastopen disconnect correctly
- xfrm:
- make sure skb->sk is a full sock before accessing its fields
- fix taking a lock with preempt disabled for RT kernels
- usb: ipheth: improve safety of packet metadata parsing; prevent
potential OOB accesses
- eth: renesas: fix missing rtnl lock in suspend/resume path"
* tag 'net-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (88 commits)
MAINTAINERS: add Neal to TCP maintainers
net: revert RTNL changes in unregister_netdevice_many_notify()
net: hsr: fix fill_frame_info() regression vs VLAN packets
doc: mptcp: sysctl: blackhole_timeout is per-netns
mptcp: blackhole only if 1st SYN retrans w/o MPC is accepted
netfilter: nf_tables: reject mismatching sum of field_len with set key length
net: sh_eth: Fix missing rtnl lock in suspend/resume path
net: ravb: Fix missing rtnl lock in suspend/resume path
selftests/net: Add test for loading devbound XDP program in generic mode
net: xdp: Disallow attaching device-bound programs in generic mode
tcp: correct handling of extreme memory squeeze
bgmac: reduce max frame size to support just MTU 1500
vsock/test: Add test for connect() retries
vsock/test: Add test for UAF due to socket unbinding
vsock/test: Introduce vsock_connect_fd()
vsock/test: Introduce vsock_bind()
vsock: Allow retrying on connect() failure
vsock: Keep the binding until socket destruction
Bluetooth: L2CAP: accept zero as a special value for MTU auto-selection
Bluetooth: btnxpuart: Fix glitches seen in dual A2DP streaming
...
We want to encourage use of newer Sphinx - they fixed a performance problem
and the docs build takes less than half the time it used to.
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAmebrwsPHGNvcmJldEBs
d24ubmV0AAoJEBdDWhNsDH5YscsH/RtKdO0Ibwg+VW1IvCcJBB9WXaUft7lgK9CL
KsnNsCJNh334LJ5Q4YT0EKetCwGiz74Xi2GR1qK8SIdYIbzxKeEP+p8IP3lfmbVg
pGxlb/WrOvkgZ1KzjLV4sWQ4jDznIUGB/nBTpG79K6m8niVXtTvCnqCkgAe6UVTu
N8kOqV48aVUOjYmJ+qlspJ4p0zUEddTSrkyy0AKrVU18h0HVrVDC4Ko+W6G3z5Bl
3v4XXjngpOVhs1stgLg/+R185zGOdCVKieOefyYNo3d7yZG1Xw1QMf41+bEpMqRI
6dLS/BEPGk+SdXPyGubxYMqofBpqUbsA+ky3eWboJCeM5iK1HNk=
=Sq+j
-----END PGP SIGNATURE-----
Merge tag 'docs-6.14-2' of git://git.lwn.net/linux
Pull documentation fixes from Jonathan Corbet:
"Two fixes for footnote-related warnings that appeared with Sphinx 8.x.
We want to encourage use of newer Sphinx - they fixed a performance
problem and the docs build takes less than half the time it used to"
* tag 'docs-6.14-2' of git://git.lwn.net/linux:
docs: power: Fix footnote reference for Toshiba Satellite P10-554
Documentation: ublk: Drop Stefan Hajnoczi's message footnote
- Architecutre-specific ftrace recursion trylock tests were
removed in favour of the generic function_graph_enter(),
but s390 got missed. Remove this test for s390 as well.
- Add ftrace_get_symaddr() for s390, which returns the symbol
address from ftrace 'ip' parameter
-----BEGIN PGP SIGNATURE-----
iI0EABYKADUWIQQrtrZiYVkVzKQcYivNdxKlNrRb8AUCZ5udBxccYWdvcmRlZXZA
bGludXguaWJtLmNvbQAKCRDNdxKlNrRb8Ku8AP0UBjHX0VN6f7bIqs5TdrmbVD03
WO7kB/EFnHXZvFdfigD/TRzLmLi1kZtdBCARFDjSiy9wwGbZxYQQuJex23FYQgA=
=MF7E
-----END PGP SIGNATURE-----
Merge tag 's390-6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Alexander Gordeev:
- Architecutre-specific ftrace recursion trylock tests were removed in
favour of the generic function_graph_enter(), but s390 got missed.
Remove this test for s390 as well.
- Add ftrace_get_symaddr() for s390, which returns the symbol address
from ftrace 'ip' parameter
* tag 's390-6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/tracing: Define ftrace_get_symaddr() for s390
s390/fgraph: Fix to remove ftrace_test_recursion_trylock()
- The rework that uncoupled physical and virtual address spaces
inadvertently prevented KASAN shadow mappings from using large
pages. Restore large page mappings for KASAN shadows
- Add decompressor routine physmem_alloc() that may fail,
unlike physmem_alloc_or_die(). This allows callers to
implement fallback paths
- Allow falling back from large pages to smaller pages (1MB or
4KB) if the allocation of 2GB pages in the decompressor can
not be fulfilled
- Add to the decompressor boot print support of "%%" format
string, width and padding hadnling, length modifiers and
decimal conversion specifiers
- Add to the decompressor message severity levels similar to
kernel ones. Support command-line options that control
console output verbosity
- Replaces boot_printk() calls with appropriate loglevel-
specific helpers such as boot_emerg(), boot_warn(), and
boot_debug().
- Collect all boot messages into a ring buffer independent
of the current log level. This is particularly useful for
early crash analysis
- If 'earlyprintk' command line parameter is not specified, store
decompressor boot messages in a ring buffer to be printed later
by the kernel, once the console driver is registered
- Add 'bootdebug' command line parameter to enable printing of
decompressor debug messages when needed. That parameters allows
message supressing and filtering
- Dump boot messages on a decompressor crash, but only if
'bootdebug' command line parameter is enabled
- When CONFIG_PRINTK_TIME is enabled, add timestamps to boot
messages in the same format as regular printk()
- Dump physical memory tracking information on boot:
online ranges, reserved areas and vmem allocations
- Dump virtual memory layout and randomization details
- Improve decompression error reporting and dump the message
ring buffer in case the boot failed and system halted
- Add an exception handler which handles exceptions when FPU
control register is attempted to be set to an invalid value.
Remove '.fixup' section as result of this change
- Use 'A', 'O', and 'R' inline assembly format flags, which
allows recent Clang compilers to generate better FPU code
- Rework uaccess code so it reads better and generates more
efficient code
- Cleanup futex inline assembly code
- Disable KMSAN instrumention for futex inline assemblies, which
contain dereferenced user pointers. Otherwise, shadows for the
user pointers would be accessed
- PFs which are not initially configured but in standby create
only a single-function PCI domain. If they are configured later
on, sibling PFs and their child VFs will not be added to their
PCI domain breaking SR-IOV expectations. Fix that by allowing
initially configured but in standby PFs create multi-function
PCI domains
- Add '-std=gnu11' to decompressor and purgatory CFLAGS to avoid
compile errors caused by kernel's own definitions of 'bool',
'false', and 'true' conflicting with the C23 reserved keywords
- Fix sclp subsystem failure when a sclp console is not present
- Fix misuse of non-NULL terminated strings in vmlogrdr driver
- Various other small improvements, cleanups and fixes
-----BEGIN PGP SIGNATURE-----
iI0EABYKADUWIQQrtrZiYVkVzKQcYivNdxKlNrRb8AUCZ5t0jhccYWdvcmRlZXZA
bGludXguaWJtLmNvbQAKCRDNdxKlNrRb8A4wAP91BcHtV4OTmoMsG4JTwiy9wTzI
zro9IrBSCLPLED7kfQEAzjEspd5gWMFDnIM/KxXmCNHA/nBTXVh/1F+b1F0z+Q8=
=JfEG
-----END PGP SIGNATURE-----
Merge tag 's390-6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull more s390 updates from Alexander Gordeev:
- The rework that uncoupled physical and virtual address spaces
inadvertently prevented KASAN shadow mappings from using large pages.
Restore large page mappings for KASAN shadows
- Add decompressor routine physmem_alloc() that may fail, unlike
physmem_alloc_or_die(). This allows callers to implement fallback
paths
- Allow falling back from large pages to smaller pages (1MB or 4KB) if
the allocation of 2GB pages in the decompressor can not be fulfilled
- Add to the decompressor boot print support of "%%" format string,
width and padding hadnling, length modifiers and decimal conversion
specifiers
- Add to the decompressor message severity levels similar to kernel
ones. Support command-line options that control console output
verbosity
- Replaces boot_printk() calls with appropriate loglevel- specific
helpers such as boot_emerg(), boot_warn(), and boot_debug().
- Collect all boot messages into a ring buffer independent of the
current log level. This is particularly useful for early crash
analysis
- If 'earlyprintk' command line parameter is not specified, store
decompressor boot messages in a ring buffer to be printed later by
the kernel, once the console driver is registered
- Add 'bootdebug' command line parameter to enable printing of
decompressor debug messages when needed. That parameters allows
message suppressing and filtering
- Dump boot messages on a decompressor crash, but only if 'bootdebug'
command line parameter is enabled
- When CONFIG_PRINTK_TIME is enabled, add timestamps to boot messages
in the same format as regular printk()
- Dump physical memory tracking information on boot: online ranges,
reserved areas and vmem allocations
- Dump virtual memory layout and randomization details
- Improve decompression error reporting and dump the message ring
buffer in case the boot failed and system halted
- Add an exception handler which handles exceptions when FPU control
register is attempted to be set to an invalid value. Remove '.fixup'
section as result of this change
- Use 'A', 'O', and 'R' inline assembly format flags, which allows
recent Clang compilers to generate better FPU code
- Rework uaccess code so it reads better and generates more efficient
code
- Cleanup futex inline assembly code
- Disable KMSAN instrumention for futex inline assemblies, which
contain dereferenced user pointers. Otherwise, shadows for the user
pointers would be accessed
- PFs which are not initially configured but in standby create only a
single-function PCI domain. If they are configured later on, sibling
PFs and their child VFs will not be added to their PCI domain
breaking SR-IOV expectations.
Fix that by allowing initially configured but in standby PFs create
multi-function PCI domains
- Add '-std=gnu11' to decompressor and purgatory CFLAGS to avoid
compile errors caused by kernel's own definitions of 'bool', 'false',
and 'true' conflicting with the C23 reserved keywords
- Fix sclp subsystem failure when a sclp console is not present
- Fix misuse of non-NULL terminated strings in vmlogrdr driver
- Various other small improvements, cleanups and fixes
* tag 's390-6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (53 commits)
s390/vmlogrdr: Use array instead of string initializer
s390/vmlogrdr: Use internal_name for error messages
s390/sclp: Initialize sclp subsystem via arch_cpu_finalize_init()
s390/tools: Use array instead of string initializer
s390/vmem: Fix null-pointer-arithmetic warning in vmem_map_init()
s390: Add '-std=gnu11' to decompressor and purgatory CFLAGS
s390/bitops: Use correct constraint for arch_test_bit() inline assembly
s390/pci: Fix SR-IOV for PFs initially in standby
s390/futex: Avoid KMSAN instrumention for user pointers
s390/uaccess: Rename get_put_user_noinstr_attributes to uaccess_kmsan_or_inline
s390/futex: Cleanup futex_atomic_cmpxchg_inatomic()
s390/futex: Generate futex atomic op functions
s390/uaccess: Remove INLINE_COPY_FROM_USER and INLINE_COPY_TO_USER
s390/uaccess: Use asm goto for put_user()/get_user()
s390/uaccess: Remove usage of the oac specifier
s390/uaccess: Replace EX_TABLE_UA_LOAD_MEM exception handling
s390/uaccess: Cleanup noinstr __put_user()/__get_user() inline assembly constraints
s390/uaccess: Remove __put_user_fn()/__get_user_fn() wrappers
s390/uaccess: Move put_user() / __put_user() close to put_user() asm code
s390/uaccess: Use asm goto for __mvc_kernel_nofault()
...
- update gpio-sim selftests to not fail now that we no longer allow
rmdir() on configfs entries of active devices
- remove leftover code from gpio-mxc
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEFp3rbAvDxGAT0sefEacuoBRx13IFAmebRzwACgkQEacuoBRx
13KUEQ/+KAAVHAr3l3bFG9POniRk6fZB2kyWeOUvGGmW1qBHjRF50wKsV0+EjhEi
0jm5HX92tvqTfci/jCkZbw0eIZ72P9QXqgBKo+Hare+7Q0bX/eddBgjNTyaAyjLW
VHwOoP3M7jdsb5vBmgMOIbsJHQ8Q3TscSN5ndFfQ04qZ0ahAczV86rQQRiFaxhle
rr8XrICLlOnLS9YV4yYgJhNulIMvhLhcz70gyYNU6UOfEHNm3ZblBgoOWrPAvd+d
3lZ+4ZDNaAPyBhW2a64FDs1e+/9VpFfrp3CNGecM8tV8swd33atpeNsClPqFPJYR
pLir57iVbH3FiE76PdifR128toaH3+pSrPNuL7Gi2oVsdA5aN2l0y+YuXIFgpP+7
/FCLKEfcY1G/Kp9nai0FG3ltAAkbaULi06dYdgXAC+xv391j1p9EI9iz9ZifpAA8
I1Apa36E+MDgku52eiL16isNsLESE3ky/URsf8zLe8bRUEKCF6YsVoOLe7RDl5CQ
fDgJx7xALIm8vk2jizmTmMIDVgY5leLgXS4pVHvEVBAg8zZe0OH5SRyYan82pv6t
asw4RfEu7J7l5k5fgqWf2p6H4cpU2SuI35YdoTjYReGbExabnkylZTyoRjkGLX7o
NqIktzN4nJzyGM0VAkOVnj5RKTlvCKSVLg0Qk3srXeG42qJ3Cqw=
=Ql6L
-----END PGP SIGNATURE-----
Merge tag 'gpio-fixes-for-v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
- update gpio-sim selftests to not fail now that we no longer allow
rmdir() on configfs entries of active devices
- remove leftover code from gpio-mxc
* tag 'gpio-fixes-for-v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
selftests: gpio: gpio-sim: Fix missing chip disablements
gpio: mxc: remove dead code after switch to DT-only
Most of the filesystem methods where we care about dentry name
and parent have their stability guaranteed by the callers;
->d_revalidate() is the major exception.
It's easy enough for callers to supply stable values for
expected name and expected parent of the dentry being
validated. That kills quite a bit of boilerplate in
->d_revalidate() instances, along with a bunch of races
where they used to access ->d_name without sufficient
precautions.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCZ5gkoQAKCRBZ7Krx/gZQ
6w9FAP4nyxNNWMjE1TwuWR/DNDMYYuw/qn/miZ88B5BUM8hzqgD/W2SjRvcbSaIm
xSIYpbtKgtqNU34P1PU+dBvL8Utz2AE=
=TWY8
-----END PGP SIGNATURE-----
Merge tag 'pull-revalidate' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs d_revalidate updates from Al Viro:
"Provide stable parent and name to ->d_revalidate() instances
Most of the filesystem methods where we care about dentry name and
parent have their stability guaranteed by the callers;
->d_revalidate() is the major exception.
It's easy enough for callers to supply stable values for expected name
and expected parent of the dentry being validated. That kills quite a
bit of boilerplate in ->d_revalidate() instances, along with a bunch
of races where they used to access ->d_name without sufficient
precautions"
* tag 'pull-revalidate' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
9p: fix ->rename_sem exclusion
orangefs_d_revalidate(): use stable parent inode and name passed by caller
ocfs2_dentry_revalidate(): use stable parent inode and name passed by caller
nfs: fix ->d_revalidate() UAF on ->d_name accesses
nfs{,4}_lookup_validate(): use stable parent inode passed by caller
gfs2_drevalidate(): use stable parent inode and name passed by caller
fuse_dentry_revalidate(): use stable parent inode and name passed by caller
vfat_revalidate{,_ci}(): use stable parent inode passed by caller
exfat_d_revalidate(): use stable parent inode passed by caller
fscrypt_d_revalidate(): use stable parent inode passed by caller
ceph_d_revalidate(): propagate stable name down into request encoding
ceph_d_revalidate(): use stable parent inode passed by caller
afs_d_revalidate(): use stable name and parent inode passed by caller
Pass parent directory inode and expected name to ->d_revalidate()
generic_ci_d_compare(): use shortname_storage
ext4 fast_commit: make use of name_snapshot primitives
dissolve external_name.u into separate members
make take_dentry_name_snapshot() lockless
dcache: back inline names with a struct-wrapped array of unsigned long
make sure that DNAME_INLINE_LEN is a multiple of word size
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEjF9xRqF1emXiQiqU1w0aZmrPKyEFAmebYkAACgkQ1w0aZmrP
KyH3kBAAvj0QSXPd3iwuFA0E9gmlZKWMrI/TmmSjI9nthmPAOXdTKOPjG5TmvESn
bWie6PHVNQjtQ+E3fUoGBFZz5P8F6X1hKx4VM0r75fDeHPxEkGuA0dPyTYG/CEjZ
4WwnfZ1YmSO7rXPRqRQa+X50XgJD0PiVYhb5QPAelNbb+pEyE5URnqNkKWBC7xQC
YizcZE5Zvm0/yBf+HegQL9eenwLHJbnC1F+d6LrOnDKOd3qXY41jpG0CJDnkseKv
qQRO1GRdMuivqrd9jicqUFbmhnCK8pVMRVm1NtENJ3I44KAAJlbihKztV7QL2BRt
H4DySZpaAm6CFw5Gut2OOSKNCw8c3bejBEeTaw6JWwA+gG/xqgAh2npyrY2Qsf+s
pLw+pNtfUxOmRnd3HDhjy++AUPfBBNfLREjaUXOUAM8I475ijccMuhdxnhLK/Wg1
F7AST3YRhaRByXTD13OG9S1GwK8ejQ1EKbTV7MnbGa6XzA8jMh+Uhr7gcI7ke20q
nDuxp9/N6iWHF4ncu9YHGyFiVTt7mQ4TyUgsxqJkH1rp3VhgKl3rpemY7G8EoYaW
QQRO4/l39XK32E3fBQ12Z0iunUTtdthkWO2CiCSY1GvXeek13wcu0tunkJ6TZMSx
P0IWoVEWv4dssluExToMQli66M4lL1WfF5VILGncIUlWWkMVoYM=
=81dC
-----END PGP SIGNATURE-----
Merge tag 'nf-25-01-30' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following batch contains one Netfilter fix:
1) Reject mismatching sum of field_len with set key length which allows
to create a set without inconsistent pipapo rule width and set key
length.
* tag 'nf-25-01-30' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nf_tables: reject mismatching sum of field_len with set key length
====================
Link: https://patch.msgid.link/20250130113307.2327470-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Neal Cardwell has been indispensable in TCP reviews
and investigations, especially protocol-related.
Neal is also the author of packetdrill.
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250129191332.2526140-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch reverts following changes:
83419b61d1 net: reduce RTNL hold duration in unregister_netdevice_many_notify() (part 2)
ae646f1a0b net: reduce RTNL hold duration in unregister_netdevice_many_notify() (part 1)
cfa579f666 net: no longer hold RTNL while calling flush_all_backlogs()
This caused issues in layers holding a private mutex:
cleanup_net()
rtnl_lock();
mutex_lock(subsystem_mutex);
unregister_netdevice();
rtnl_unlock(); // LOCKDEP violation
rtnl_lock();
I will revisit this in next cycle, opt-in for the new behavior
from safe contexts only.
Fixes: cfa579f666 ("net: no longer hold RTNL while calling flush_all_backlogs()")
Fixes: ae646f1a0b ("net: reduce RTNL hold duration in unregister_netdevice_many_notify() (part 1)")
Fixes: 83419b61d1 ("net: reduce RTNL hold duration in unregister_netdevice_many_notify() (part 2)")
Reported-by: syzbot+5b9196ecf74447172a9a@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/6789d55f.050a0220.20d369.004e.GAE@google.com/
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250129142726.747726-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Refactored:
unify inode corruption marking and mark them as bad immediately
upon detection of an error in attribute enumeration..
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEh0DEKNP0I9IjwfWEqbAzH4MkB7YFAmebjOoACgkQqbAzH4Mk
B7bjBBAAuNpe2AiRjeYtmp/Ix2sc/kA90i9ztptTY39wPNjUXnfOzJ0GnfAvV1mo
r/VgI3LScP+z++14oRgJzpynWE+s587vJjmST9Sq/P6qBH7cO9sDU7CMOllhW69Z
oQKigyK/isCN5lUBgnbcbNjGpYGyLwIXoiN6WR0iY09fYsby+HM3//lkgMhMJQ8L
aMpO3uQ1KkMsBgCexqR5D1Nh0hpuAOW0dPWO2kCvSOobk+MX+Ko3EVuR5LU2kK3Y
RP2uJrJx3gz/QfwCOeIm6iwo5LVnX0LzZBrOkJ05fx8kSI8vH0m5MiLJd/JFiqsS
Nkpf1tayNpg7BmEx95PgjdhdyJvVbybCW3Nyo+7U2zNQ3eBCuRilS+Ai0jsPTGRt
7VxBS2cZkDOq0iywmEIDGackqowDpc+TPAvSG7SZO+Qq5DF4tbNh4ropN62UbPaW
2H6CbwbD7/JjmesTG/GaGwnVePc8mjsBmjl7ChHaVDVgmOjgFxdxRiNVUQKyFmUm
AaW1Vm4Lq2VObIhgYo5kVPRx9PCEAORohWl16yOJ+qh4fWKyALpnaWBMpJY/oHoi
oc5wCrEM9YoYo0Hpk6myb61P+JYbepv4f0AdVhzVs5ydIh/uH0e/uS0jQr3BcqjR
F0a715SmymVManyHLVROWxboICFg7OLzRWkV6hpL2W864UegRBA=
=qaWP
-----END PGP SIGNATURE-----
Merge tag 'ntfs3_for_6.14' of https://github.com/Paragon-Software-Group/linux-ntfs3
Pull ntfs3 fixes from Konstantin Komarov:
- unify inode corruption marking and mark them as bad immediately upon
detection of an error in attribute enumeration
- folio cleanup
* tag 'ntfs3_for_6.14' of https://github.com/Paragon-Software-Group/linux-ntfs3:
fs/ntfs3: Unify inode corruption marking with _ntfs_bad_inode()
fs/ntfs3: Mark inode as bad as soon as error detected in mi_enum_attr()
ntfs3: Remove an access to page->index
- second half of a fix for a bug that'd been causing oopses on
filesystems using snapshots with memory pressure (key cache fills for
snaphots btrees are tricky)
- build fix for strange compiler configurations that double stack frame
size
- "journal stuck timeout" now takes into account device latency: this
fixes some spurious warnings, and the main remaining source of SRCU
lock hold time warnings (I'm no longer seeing this in my CI, so any
users still seeing this should definitely ping me)
- fix for slow/hanging unmounts (" Improve journal pin flushing")
- some more tracepoint fixes/improvements, to chase down the "rebalance
isn't making progress" issues
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmeajBsACgkQE6szbY3K
bna2BA/9E/+WBDFQHkLJ4kNQBxKL4u1xfav5kKGZ79mUlqruhr3AckLPFzWmQO21
eOJE0NeyvpsvLewDXMGZ8w/Nm3Vdc53X6ATKkQaF/UoTYVWbubmF62sXBzSS8TUh
YIM6s24/CbCi8lT49JAuIaG3OC21KH0X0zOcvyepfmn1aiPNLr4y7zOWKynOhgCt
mAt374ayUDDTgoQmXqPIrGp8eD/C+vUjo1ief+DIGMQmDj4uHpb5iBbmjXm8FF9x
4TcrP1UjWpiWPcHeb98H/CBWOnjDSgOFhYmxhVOvkDpC6XbtPSgKQIOs7tSJ0Nuo
IOzrGuBPVfd2m+wgXsn7zbn0HNOjS76sCo92K1lAdS86k0eqRfXmCxkU6FUphNkA
WCG8WrK0RjHL132iR97dtv36No8ji5mZN1ILPk/h4KRkoKC+9fA8BaJdAGVt+6NP
wZLtZxZkV8BqgXF41HwzHt54YftRPn2kR47Jfu1rPimSUd4Uqy8Yjw2J/fUT7eAd
6JdfiadhAtMRWnFGzmVs4LsEWJ7Ja7GnG7jhjzlACsqbXsTU8k16Wq38IchC6mi+
p+hqq9pLAeosW9Lk/QTGFrq52aQfyOzdUjq1pyCcEYtZFNqjj8GmmHVejxZWiRTo
C6dTEkSIMcBx+9QP8BJ5o+xMR02KABn+8x43TzQxQ2DXj0QamTA=
=QJHL
-----END PGP SIGNATURE-----
Merge tag 'bcachefs-2025-01-29' of git://evilpiepirate.org/bcachefs
Pull bcachefs fixes from Kent Overstreet:
- second half of a fix for a bug that'd been causing oopses on
filesystems using snapshots with memory pressure (key cache fills for
snaphots btrees are tricky)
- build fix for strange compiler configurations that double stack frame
size
- "journal stuck timeout" now takes into account device latency: this
fixes some spurious warnings, and the main remaining source of SRCU
lock hold time warnings (I'm no longer seeing this in my CI, so any
users still seeing this should definitely ping me)
- fix for slow/hanging unmounts (" Improve journal pin flushing")
- some more tracepoint fixes/improvements, to chase down the "rebalance
isn't making progress" issues
* tag 'bcachefs-2025-01-29' of git://evilpiepirate.org/bcachefs:
bcachefs: Improve trace_move_extent_finish
bcachefs: Fix trace_copygc
bcachefs: Journal writes are now IOPRIO_CLASS_RT
bcachefs: Improve journal pin flushing
bcachefs: fix bch2_btree_node_flags
bcachefs: rebalance, copygc enabled are runtime opts
bcachefs: Improve decompression error messages
bcachefs: bset_blacklisted_journal_seq is now AUTOFIX
bcachefs: "Journal stuck" timeout now takes into account device latency
bcachefs: Reduce stack frame size of __bch2_str_hash_check_key()
bcachefs: Fix btree_trans_peek_key_cache()
Matthieu Baerts says:
====================
mptcp: blackhole only if 1st SYN retrans w/o MPC is accepted
Here are two small fixes for issues introduced in v6.12.
- Patch 1: reset the mpc_drop mark for other SYN retransmits, to only
consider an MPTCP blackhole when the first SYN retransmitted without
the MPTCP options is accepted, as initially intended.
- Patch 2: also mention in the doc that the blackhole_timeout sysctl
knob is per-netns, like all the others.
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
====================
Link: https://patch.msgid.link/20250129-net-mptcp-blackhole-fix-v1-0-afe88e5a6d2c@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
All other sysctl entries mention it, and it is a per-namespace sysctl.
So mention it as well.
Fixes: 27069e7cb3 ("mptcp: disable active MPTCP in case of blackhole")
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The Fixes commit mentioned this:
> An MPTCP firewall blackhole can be detected if the following SYN
> retransmission after a fallback to "plain" TCP is accepted.
But in fact, this blackhole was detected if any following SYN
retransmissions after a fallback to TCP was accepted.
That's because 'mptcp_subflow_early_fallback()' will set 'request_mptcp'
to 0, and 'mpc_drop' will never be reset to 0 after.
This is an issue, because some not so unusual situations might cause the
kernel to detect a false-positive blackhole, e.g. a client trying to
connect to a server while the network is not ready yet, causing a few
SYN retransmissions, before reaching the end server.
Fixes: 27069e7cb3 ("mptcp: disable active MPTCP in case of blackhole")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The field length description provides the length of each separated key
field in the concatenation, each field gets rounded up to 32-bits to
calculate the pipapo rule width from pipapo_init(). The set key length
provides the total size of the key aligned to 32-bits.
Register-based arithmetics still allows for combining mismatching set
key length and field length description, eg. set key length 10 and field
description [ 5, 4 ] leading to pipapo width of 12.
Cc: stable@vger.kernel.org
Fixes: 3ce67e3793 ("netfilter: nf_tables: do not allow mismatch field size and set key length")
Reported-by: Noam Rathaus <noamr@ssd-disclosure.com>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Fix the suspend/resume path by ensuring the rtnl lock is held where
required. Calls to sh_eth_close, sh_eth_open and wol operations must be
performed under the rtnl lock to prevent conflicts with ongoing ndo
operations.
Fixes: b71af04676 ("sh_eth: add more PM methods")
Tested-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
- btusb: mediatek: Add locks for usb_driver_claim_interface()
- L2CAP: accept zero as a special value for MTU auto-selection
- btusb: Fix possible infinite recursion of btusb_reset
- Add ABI doc for sysfs reset
- btnxpuart: Fix glitches seen in dual A2DP streaming
-----BEGIN PGP SIGNATURE-----
iQJNBAABCAA3FiEE7E6oRXp8w05ovYr/9JCA4xAyCykFAmealqcZHGx1aXoudm9u
LmRlbnR6QGludGVsLmNvbQAKCRD0kIDjEDILKUn6D/wMzyaazNsfa+/WwzqlyDPZ
17mQWJUrRY7AhH2/vFA7JrfWeVgmihg90qaSocb8LnGnlUvEiX53xhRNspBt//lt
26ZHpDDNq+pKftG3zbR5U76BhpjvjRgzMJO9U7YJAFlNvNiL5pL+UTufIhFU60mR
x53e3ixNAl+62vwT2Re1S2NueMT8a2c8VPnbNI2W2xw59Am+WqQKWBJrthBkK9G6
nt5frOkFkva8J+zI4Rxu7nEzSCdiSXrO1OGae41v+orNzIl9ffyxW69XEuGsleMV
NJ8/ZJZrhWh/lEWqXlZu20+c119pmwXQPM5ONZKv4rSrFtuc1bEPNp6foRoHwfsh
qbtjPg323IJeR4TdIQUdo2VGM8w08GLEP5azYHjJ9MYvOdtjAUZ6YO5nhXmzE6Vt
cnF/OYMnggx202TnUzCT2HUghA3F4SNDRR6NT82xf+vnzbnF5W8ZIBUc6vbtqIxu
dGlkGtWF8waodNvh7F1+2X0El3joi6HOzw07+HLmnpa78Mn5Di3KUknotZ/gpYtn
M/FPEqJvmbMom1XlMYg5g5y+YvSdygllPCI5ewNR/LD/1S3jt6ES7+/pt9/ZAWrl
1WnCN1gJ6aaS7UQNteQNjB90iXxsTkPXqJvRhcr+WZauxqX7msFKuxmjhK++fplj
o/79A7oQ+XZTNw78zcrH4A==
=q2T2
-----END PGP SIGNATURE-----
Merge tag 'for-net-2025-01-29' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Luiz Augusto von Dentz says:
====================
bluetooth pull request for net:
- btusb: mediatek: Add locks for usb_driver_claim_interface()
- L2CAP: accept zero as a special value for MTU auto-selection
- btusb: Fix possible infinite recursion of btusb_reset
- Add ABI doc for sysfs reset
- btnxpuart: Fix glitches seen in dual A2DP streaming
* tag 'for-net-2025-01-29' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
Bluetooth: L2CAP: accept zero as a special value for MTU auto-selection
Bluetooth: btnxpuart: Fix glitches seen in dual A2DP streaming
Bluetooth: Add ABI doc for sysfs reset
Bluetooth: Fix possible infinite recursion of btusb_reset
Bluetooth: btusb: mediatek: Add locks for usb_driver_claim_interface()
====================
Link: https://patch.msgid.link/20250129210057.1318963-1-luiz.dentz@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Add a test to bpf_offload.py for loading a devbound XDP program in
generic mode, checking that it fails correctly.
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250127131344.238147-2-toke@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Device-bound programs are used to support RX metadata kfuncs. These
kfuncs are driver-specific and rely on the driver context to read the
metadata. This means they can't work in generic XDP mode. However, there
is no check to disallow such programs from being attached in generic
mode, in which case the metadata kfuncs will be called in an invalid
context, leading to crashes.
Fix this by adding a check to disallow attaching device-bound programs
in generic mode.
Fixes: 2b3486bc2d ("bpf: Introduce device-bound XDP programs")
Reported-by: Marcus Wichelmann <marcus.wichelmann@hetzner-cloud.de>
Closes: https://lore.kernel.org/r/dae862ec-43b5-41a0-8edf-46c59071cdda@hetzner-cloud.de
Tested-by: Marcus Wichelmann <marcus.wichelmann@hetzner-cloud.de>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20250127131344.238147-1-toke@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Testing with iperf3 using the "pasta" protocol splicer has revealed
a problem in the way tcp handles window advertising in extreme memory
squeeze situations.
Under memory pressure, a socket endpoint may temporarily advertise
a zero-sized window, but this is not stored as part of the socket data.
The reasoning behind this is that it is considered a temporary setting
which shouldn't influence any further calculations.
However, if we happen to stall at an unfortunate value of the current
window size, the algorithm selecting a new value will consistently fail
to advertise a non-zero window once we have freed up enough memory.
This means that this side's notion of the current window size is
different from the one last advertised to the peer, causing the latter
to not send any data to resolve the sitution.
The problem occurs on the iperf3 server side, and the socket in question
is a completely regular socket with the default settings for the
fedora40 kernel. We do not use SO_PEEK or SO_RCVBUF on the socket.
The following excerpt of a logging session, with own comments added,
shows more in detail what is happening:
// tcp_v4_rcv(->)
// tcp_rcv_established(->)
[5201<->39222]: ==== Activating log @ net/ipv4/tcp_input.c/tcp_data_queue()/5257 ====
[5201<->39222]: tcp_data_queue(->)
[5201<->39222]: DROPPING skb [265600160..265665640], reason: SKB_DROP_REASON_PROTO_MEM
[rcv_nxt 265600160, rcv_wnd 262144, snt_ack 265469200, win_now 131184]
[copied_seq 259909392->260034360 (124968), unread 5565800, qlen 85, ofoq 0]
[OFO queue: gap: 65480, len: 0]
[5201<->39222]: tcp_data_queue(<-)
[5201<->39222]: __tcp_transmit_skb(->)
[tp->rcv_wup: 265469200, tp->rcv_wnd: 262144, tp->rcv_nxt 265600160]
[5201<->39222]: tcp_select_window(->)
[5201<->39222]: (inet_csk(sk)->icsk_ack.pending & ICSK_ACK_NOMEM) ? --> TRUE
[tp->rcv_wup: 265469200, tp->rcv_wnd: 262144, tp->rcv_nxt 265600160]
returning 0
[5201<->39222]: tcp_select_window(<-)
[5201<->39222]: ADVERTISING WIN 0, ACK_SEQ: 265600160
[5201<->39222]: [__tcp_transmit_skb(<-)
[5201<->39222]: tcp_rcv_established(<-)
[5201<->39222]: tcp_v4_rcv(<-)
// Receive queue is at 85 buffers and we are out of memory.
// We drop the incoming buffer, although it is in sequence, and decide
// to send an advertisement with a window of zero.
// We don't update tp->rcv_wnd and tp->rcv_wup accordingly, which means
// we unconditionally shrink the window.
[5201<->39222]: tcp_recvmsg_locked(->)
[5201<->39222]: __tcp_cleanup_rbuf(->) tp->rcv_wup: 265469200, tp->rcv_wnd: 262144, tp->rcv_nxt 265600160
[5201<->39222]: [new_win = 0, win_now = 131184, 2 * win_now = 262368]
[5201<->39222]: [new_win >= (2 * win_now) ? --> time_to_ack = 0]
[5201<->39222]: NOT calling tcp_send_ack()
[tp->rcv_wup: 265469200, tp->rcv_wnd: 262144, tp->rcv_nxt 265600160]
[5201<->39222]: __tcp_cleanup_rbuf(<-)
[rcv_nxt 265600160, rcv_wnd 262144, snt_ack 265469200, win_now 131184]
[copied_seq 260040464->260040464 (0), unread 5559696, qlen 85, ofoq 0]
returning 6104 bytes
[5201<->39222]: tcp_recvmsg_locked(<-)
// After each read, the algorithm for calculating the new receive
// window in __tcp_cleanup_rbuf() finds it is too small to advertise
// or to update tp->rcv_wnd.
// Meanwhile, the peer thinks the window is zero, and will not send
// any more data to trigger an update from the interrupt mode side.
[5201<->39222]: tcp_recvmsg_locked(->)
[5201<->39222]: __tcp_cleanup_rbuf(->) tp->rcv_wup: 265469200, tp->rcv_wnd: 262144, tp->rcv_nxt 265600160
[5201<->39222]: [new_win = 262144, win_now = 131184, 2 * win_now = 262368]
[5201<->39222]: [new_win >= (2 * win_now) ? --> time_to_ack = 0]
[5201<->39222]: NOT calling tcp_send_ack()
[tp->rcv_wup: 265469200, tp->rcv_wnd: 262144, tp->rcv_nxt 265600160]
[5201<->39222]: __tcp_cleanup_rbuf(<-)
[rcv_nxt 265600160, rcv_wnd 262144, snt_ack 265469200, win_now 131184]
[copied_seq 260099840->260171536 (71696), unread 5428624, qlen 83, ofoq 0]
returning 131072 bytes
[5201<->39222]: tcp_recvmsg_locked(<-)
// The above pattern repeats again and again, since nothing changes
// between the reads.
[...]
[5201<->39222]: tcp_recvmsg_locked(->)
[5201<->39222]: __tcp_cleanup_rbuf(->) tp->rcv_wup: 265469200, tp->rcv_wnd: 262144, tp->rcv_nxt 265600160
[5201<->39222]: [new_win = 262144, win_now = 131184, 2 * win_now = 262368]
[5201<->39222]: [new_win >= (2 * win_now) ? --> time_to_ack = 0]
[5201<->39222]: NOT calling tcp_send_ack()
[tp->rcv_wup: 265469200, tp->rcv_wnd: 262144, tp->rcv_nxt 265600160]
[5201<->39222]: __tcp_cleanup_rbuf(<-)
[rcv_nxt 265600160, rcv_wnd 262144, snt_ack 265469200, win_now 131184]
[copied_seq 265600160->265600160 (0), unread 0, qlen 0, ofoq 0]
returning 54672 bytes
[5201<->39222]: tcp_recvmsg_locked(<-)
// The receive queue is empty, but no new advertisement has been sent.
// The peer still thinks the receive window is zero, and sends nothing.
// We have ended up in a deadlock situation.
Note that well behaved endpoints will send win0 probes, so the problem
will not occur.
Furthermore, we have observed that in these situations this side may
send out an updated 'th->ack_seq´ which is not stored in tp->rcv_wup
as it should be. Backing ack_seq seems to be harmless, but is of
course still wrong from a protocol viewpoint.
We fix this by updating the socket state correctly when a packet has
been dropped because of memory exhaustion and we have to advertize
a zero window.
Further testing shows that the connection recovers neatly from the
squeeze situation, and traffic can continue indefinitely.
Fixes: e2142825c1 ("net: tcp: send zero-window ACK when no memory")
Cc: Menglong Dong <menglong8.dong@gmail.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Link: https://patch.msgid.link/20250127231304.1465565-1-jmaloy@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
bgmac allocates new replacement buffer before handling each received
frame. Allocating & DMA-preparing 9724 B each time consumes a lot of CPU
time. Ideally bgmac should just respect currently set MTU but it isn't
the case right now. For now just revert back to the old limited frame
size.
This change bumps NAT masquerade speed by ~95%.
Since commit 8218f62c9c ("mm: page_frag: use initial zero offset for
page_frag_alloc_align()"), the bgmac driver fails to open its network
interface successfully and runs out of memory in the following call
stack:
bgmac_open
-> bgmac_dma_init
-> bgmac_dma_rx_skb_for_slot
-> netdev_alloc_frag
BGMAC_RX_ALLOC_SIZE = 10048 and PAGE_FRAG_CACHE_MAX_SIZE = 32768.
Eventually we land into __page_frag_alloc_align() with the following
parameters across multiple successive calls:
__page_frag_alloc_align: fragsz=10048, align_mask=-1, size=32768, offset=0
__page_frag_alloc_align: fragsz=10048, align_mask=-1, size=32768, offset=10048
__page_frag_alloc_align: fragsz=10048, align_mask=-1, size=32768, offset=20096
__page_frag_alloc_align: fragsz=10048, align_mask=-1, size=32768, offset=30144
So in that case we do indeed have offset + fragsz (40192) > size (32768)
and so we would eventually return NULL. Reverting to the older 1500
bytes MTU allows the network driver to be usable again.
Fixes: 8c7da63978 ("bgmac: configure MTU and add support for frames beyond 8192 byte size")
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
[florian: expand commit message about recent commits]
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250127175159.1788246-1-florian.fainelli@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Deliberately fail a connect() attempt; expect error. Then verify that
subsequent attempt (using the same socket) can still succeed, rather than
fail outright.
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Luigi Leonardi <leonardi@redhat.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://patch.msgid.link/20250128-vsock-transport-vs-autobind-v3-6-1cf57065b770@rbox.co
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Fail the autobind, then trigger a transport reassign. Socket might get
unbound from unbound_sockets, which then leads to a reference count
underflow.
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://patch.msgid.link/20250128-vsock-transport-vs-autobind-v3-5-1cf57065b770@rbox.co
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
sk_err is set when a (connectible) connect() fails. Effectively, this makes
an otherwise still healthy SS_UNCONNECTED socket impossible to use for any
subsequent connection attempts.
Clear sk_err upon trying to establish a connection.
Fixes: d021c34405 ("VSOCK: Introduce VM Sockets")
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Luigi Leonardi <leonardi@redhat.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://patch.msgid.link/20250128-vsock-transport-vs-autobind-v3-2-1cf57065b770@rbox.co
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Preserve sockets bindings; this includes both resulting from an explicit
bind() and those implicitly bound through autobind during connect().
Prevents socket unbinding during a transport reassignment, which fixes a
use-after-free:
1. vsock_create() (refcnt=1) calls vsock_insert_unbound() (refcnt=2)
2. transport->release() calls vsock_remove_bound() without checking if
sk was bound and moved to bound list (refcnt=1)
3. vsock_bind() assumes sk is in unbound list and before
__vsock_insert_bound(vsock_bound_sockets()) calls
__vsock_remove_bound() which does:
list_del_init(&vsk->bound_table); // nop
sock_put(&vsk->sk); // refcnt=0
BUG: KASAN: slab-use-after-free in __vsock_bind+0x62e/0x730
Read of size 4 at addr ffff88816b46a74c by task a.out/2057
dump_stack_lvl+0x68/0x90
print_report+0x174/0x4f6
kasan_report+0xb9/0x190
__vsock_bind+0x62e/0x730
vsock_bind+0x97/0xe0
__sys_bind+0x154/0x1f0
__x64_sys_bind+0x6e/0xb0
do_syscall_64+0x93/0x1b0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
Allocated by task 2057:
kasan_save_stack+0x1e/0x40
kasan_save_track+0x10/0x30
__kasan_slab_alloc+0x85/0x90
kmem_cache_alloc_noprof+0x131/0x450
sk_prot_alloc+0x5b/0x220
sk_alloc+0x2c/0x870
__vsock_create.constprop.0+0x2e/0xb60
vsock_create+0xe4/0x420
__sock_create+0x241/0x650
__sys_socket+0xf2/0x1a0
__x64_sys_socket+0x6e/0xb0
do_syscall_64+0x93/0x1b0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
Freed by task 2057:
kasan_save_stack+0x1e/0x40
kasan_save_track+0x10/0x30
kasan_save_free_info+0x37/0x60
__kasan_slab_free+0x4b/0x70
kmem_cache_free+0x1a1/0x590
__sk_destruct+0x388/0x5a0
__vsock_bind+0x5e1/0x730
vsock_bind+0x97/0xe0
__sys_bind+0x154/0x1f0
__x64_sys_bind+0x6e/0xb0
do_syscall_64+0x93/0x1b0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
refcount_t: addition on 0; use-after-free.
WARNING: CPU: 7 PID: 2057 at lib/refcount.c:25 refcount_warn_saturate+0xce/0x150
RIP: 0010:refcount_warn_saturate+0xce/0x150
__vsock_bind+0x66d/0x730
vsock_bind+0x97/0xe0
__sys_bind+0x154/0x1f0
__x64_sys_bind+0x6e/0xb0
do_syscall_64+0x93/0x1b0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
refcount_t: underflow; use-after-free.
WARNING: CPU: 7 PID: 2057 at lib/refcount.c:28 refcount_warn_saturate+0xee/0x150
RIP: 0010:refcount_warn_saturate+0xee/0x150
vsock_remove_bound+0x187/0x1e0
__vsock_release+0x383/0x4a0
vsock_release+0x90/0x120
__sock_release+0xa3/0x250
sock_close+0x14/0x20
__fput+0x359/0xa80
task_work_run+0x107/0x1d0
do_exit+0x847/0x2560
do_group_exit+0xb8/0x250
__x64_sys_exit_group+0x3a/0x50
x64_sys_call+0xfec/0x14f0
do_syscall_64+0x93/0x1b0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
Fixes: c0cfa2d8a7 ("vsock: add multi-transports support")
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://patch.msgid.link/20250128-vsock-transport-vs-autobind-v3-1-1cf57065b770@rbox.co
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
- SoundWire multi lane support to use multiple lanes if supported
- Stream handling of DEPREPARED state
- amd wake register programming for power off mode
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE+vs47OPLdNbVcHzyfBQHDyUjg0cFAmeac7MACgkQfBQHDyUj
g0daUA//VD+ousUjyUrm07rNYOw+ZtwVChj+aVGSrauuuGxNOsp+RXrd+lLDjWgc
ruHwegb8s9Dynq6RVJcuXQ8bKlop8FX6QMn2lewV/+dYkOk4PBwVr3gKER2lKnZV
WZ4hI/bFcJyCmIf6v1pIwzc1XKiCLuZzhnPIbKTcL3cLEKssaqMTYQAsnZw15gCY
GUccgFYGYe0+eNdt7BtxlAaDZN3X6ankgwxIh1/kXHz7Tiu5fbqWxBnmBAxBQtHx
shK4mnK5ccWrNZYzcq0+CWvYErVg496FyZm6ZNOCZKqaOfvXwtC4ZEyEkDDCb2Ep
h2eIaLaMIYOzDZt/zoV/pcwhgYtEMiyrmhsLRqgywAjweA1tf+wD/ti6fKN9L4tf
YQ0Xbe13aSoFy9fQWF/26rf0WVE0oU38w6KlZ/uM/xY6gcYnzZgdOnlMNnmOZKfg
GY5DvqMrUgscG6WAzwtu+Ro4zXURQWgqBY3+e4d1IVNk2u9q7pCfV5E2QhRri3q6
217+QPdPEqmQwad9vowoPfPpFpCXvs/duv7SYdU0Wm6h55ShC22hho6n2VcxQoZ7
U0DpilHWjbQxnBLLrgm3dcXC8d7V+AnA7CDk1vR1H2vDy+WzVY1xPmBGAyn1idUC
Ly7oowR1yt56iqVig5cGzA1GjAsBqRX+KLyPL7EU9xQjWCBq80Q=
=PnzK
-----END PGP SIGNATURE-----
Merge tag 'soundwire-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire
Pull soundwire updates from Vinod Koul:
- SoundWire multi lane support to use multiple lanes if supported
- Stream handling of DEPREPARED state
- AMD wake register programming for power off mode
* tag 'soundwire-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
soundwire: amd: clear wake enable register for power off mode
soundwire: generic_bandwidth_allocation: count the bandwidth of active streams only
SoundWire: pass stream to compute_params()
soundwire: generic_bandwidth_allocation: add lane in sdw_group_params
soundwire: generic_bandwidth_allocation: select data lane
soundwire: generic_bandwidth_allocation: check required freq accurately
soundwire: generic_bandwidth_allocation: correct clk_freq check in sdw_select_row_col
Soundwire: generic_bandwidth_allocation: set frame shape on fly
Soundwire: stream: program BUSCLOCK_SCALE
Soundwire: add sdw_slave_get_scale_index helper
soundwire: generic_bandwidth_allocation: skip DEPREPARED streams
soundwire: stream: set DEPREPARED state earlier
soundwire: add lane_used_bandwidth in struct sdw_bus
soundwire: mipi_disco: read lane mapping properties from ACPI
soundwire: add lane field in sdw_port_runtime
soundwire: bus: Move irq mapping cleanup into devres
New support:
- TI J722S CSI BCDMA controller support
- Intel idxd Panther Lake family platforms
- Allwinner F1C100s suniv DMA
- Qualcomm QCS615, QCS8300, SM8750, SA8775P GPI dma controller support
- AMD ae4dma controller support and reorganisation of amd driver
Updates:
- Channel page support for Nvidia Tegra210 adma driver
- Freescale support for S32G based platforms
- Yamilfy atmel dma bindings
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE+vs47OPLdNbVcHzyfBQHDyUjg0cFAmeaabAACgkQfBQHDyUj
g0cIog/6ApWNxyRcC85htR4/K9xRqyTTn640nXQ4wP482ECTWJPkEzfvpphIandp
wme+5/a6VoEdj2ft9OB5dK5XuW3bImKJRMG/saymz3KaHq9geOddOnPsvDaDp8ji
FfvrvFWd8AIAFjVBRX8hBTGofPCS3LAeczgUJs7HRRU0f1sPZIR79JcbelGMfy3l
44a0YyuEMBXgtWMaiDgEjldV05Ba1fuXkg8XdjWXaWO8LIZR7d8LmN7w8i8e6CJf
Gekxa6i0xCNUTh9iX4VtqToTszNZCLo+lQ6VwoUoKjnE9xXATbEwMi9Lbe50pTbF
LlD/qdWpqw5faGgxvgXaUV6DG/TgobgtSy/dO0YaKLlFTBlyzcCW3a2c7XN0Ho5l
urToRH+l8QJubqz8/YVnPgSsUtgDrx9kTcoYJ3y8iNo/Cdpi2m98x8h6N+deaEjq
YhvgJp8XmxNYL9jZpVt8ZvUV5QPClUriQDyPSjMMKeTQrRggl3so4xfq+cNF/gBr
zR/MQjLgE5x5BDzjhLnTofJWT/ZL4cM0tWec6VSW1S1JR01CYroWv2/+0ESzclsq
sJHwTDyyXSEFsgESe3i1rXnvQAAxujchleSUHLc+vAW9ULkhzJtGZjDNUBVNda2C
wunr7z6pAK+mk+AAU2RQJLmEYCpamDOL65IEeT1zPkrQ3MQXK0A=
=r4Bb
-----END PGP SIGNATURE-----
Merge tag 'dmaengine-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine
Pull dmaengine updates from Vinod Koul:
"A bunch of new device support and updates to few drivers, biggest of
them amd ones.
New support:
- TI J722S CSI BCDMA controller support
- Intel idxd Panther Lake family platforms
- Allwinner F1C100s suniv DMA
- Qualcomm QCS615, QCS8300, SM8750, SA8775P GPI dma controller support
- AMD ae4dma controller support and reorganisation of amd driver
Updates:
- Channel page support for Nvidia Tegra210 adma driver
- Freescale support for S32G based platforms
- Yamilfy atmel dma bindings"
* tag 'dmaengine-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (45 commits)
dmaengine: idxd: Enable Function Level Reset (FLR) for halt
dmaengine: idxd: Refactor halt handler
dmaengine: idxd: Add idxd_device_config_save() and idxd_device_config_restore() helpers
dmaengine: idxd: Binding and unbinding IDXD device and driver
dmaengine: idxd: Add idxd_pci_probe_alloc() helper
dt-bindings: dma: atmel: Convert to json schema
dt-bindings: dma: st-stm32-dmamux: Add description for dma-cell values
dmaengine: qcom: gpi: Add GPI immediate DMA support for SPI protocol
dt-bindings: dma: adi,axi-dmac: deprecate adi,channels node
dt-bindings: dma: adi,axi-dmac: convert to yaml schema
dmaengine: mv_xor: switch to for_each_child_of_node_scoped()
dmaengine: bcm2835-dma: Prevent suspend if DMA channel is busy
dmaengine: tegra210-adma: Support channel page
dt-bindings: dma: Support channel page to nvidia,tegra210-adma
dmaengine: ti: k3-udma: Add support for J722S CSI BCDMA
dt-bindings: dma: ti: k3-bcdma: Add J722S CSI BCDMA
dmaengine: ti: edma: fix OF node reference leaks in edma_driver
dmaengine: ti: edma: make the loop condition simpler in edma_probe()
dmaengine: fsl-edma: read/write multiple registers in cyclic transactions
dmaengine: fsl-edma: add support for S32G based platforms
...
One of the possible ways to enable the input MTU auto-selection for L2CAP
connections is supposed to be through passing a special "0" value for it
as a socket option. Commit [1] added one of those into avdtp. However, it
simply wouldn't work because the kernel still treats the specified value
as invalid and denies the setting attempt. Recorded BlueZ logs include the
following:
bluetoothd[496]: profiles/audio/avdtp.c:l2cap_connect() setsockopt(L2CAP_OPTIONS): Invalid argument (22)
[1]: ae5be371a9
Found by Linux Verification Center (linuxtesting.org).
Fixes: 4b6e228e29 ("Bluetooth: Auto tune if input MTU is set to 0")
Cc: stable@vger.kernel.org
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
This fixes a regression caused by previous commit for fixing truncated
ACL data, which is causing some intermittent glitches when running two
A2DP streams.
serdev_device_write_buf() is the root cause of the glitch, which is
reverted, and the TX work will continue to write until the queue is empty.
This change fixes both issues. No A2DP streaming glitches or truncated
ACL data issue observed.
Fixes: 8023dd2204 ("Bluetooth: btnxpuart: Fix driver sending truncated data")
Fixes: 689ca16e52 ("Bluetooth: NXP: Add protocol support for NXP Bluetooth chipsets")
Signed-off-by: Neeraj Sanjay Kale <neeraj.sanjaykale@nxp.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
The functionality was implemented in commit 0f8a001374 ("Bluetooth:
Allow reset via sysfs")
Fixes: 0f8a001374 ("Bluetooth: Allow reset via sysfs")
Signed-off-by: Hsin-chen Chuang <chharry@chromium.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>