1
0
Fork 0
mirror of synced 2025-03-06 20:59:54 +01:00
Commit graph

1335548 commits

Author SHA1 Message Date
Linus Torvalds
94b481f767 bcachefs fixes for 6.14-rc2
- add a SubmittingPatches to clarify that patches submitted for bcachefs
   do, in fact, need to be tested
 - discard path now correctly issues journal flushes when needed, this
   fixes performance issues when the filesystem is nearly full and we're
   bottlenecked on copygc
 - fix a bug that could cause the pending rebalance work accounting to be
   off when devices are being onlined/offlined; users should report if
   they are still seeing this
 - and a few more trivial ones
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmelgKQACgkQE6szbY3K
 bnY5gg/9EETvYT1Mdw2a0c/OhT61qxbLUO3T4FAfPRtBUi72Y0OP05FUf5jZFLw8
 dmxK8qpZVW3Xs5FORMvezEyVMnu3FmlUcTiGaJsYxAGQnPB/IFG5c2sJYeznDJP7
 rX5vOHUVJYfrcfukNZZzqB9mhC8VFO8BhXGggT1IFanc6Fnkoys3YnWS6F6a9oVa
 Lv67aSi2uJVQXbneN3r5iFxh5DzgyJetm1v2g0VPQGif4zfOqhSdBUFnbJ42It3Y
 1N7//fVh94cS+DTxSDQaqNNVj3k9FI9aJiFWs9p08VNMQ3OlIloaNTrythUtRVF1
 zkSwwMoVVuHwmtaf+5IujetXO3RoSoIMUGoOMXH6UIh9TQioFoXxBrxFp7X61/q3
 NVk9hi2U5WYkGXDWOHbVCMBFVAbY2MckHrkM0+GXG4ICVmSmVtFXVhyb4Dpeich9
 uvR4lTMoxa4lTiAoesIYa4puhAFnq27MCcq5c3DXbT63efU3eU7GCeSWUT85ZqfV
 nHmm/ZaQJClN66K+ikK/L29MjvzlmWxiGTF0Tk5JusiHbbzKIcDZsnQNonGzoPKk
 pbR1FWNl/mX1P9gk6lOLJMscSIrOKWWC/wI1pI6x3x84gvtG0oZW8Zg+J3XkexlD
 NzhFkOkf1GZAvg8eYklFlzAS1+14pBxgnibg9fbsDqPG74RlNRM=
 =yics
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2025-02-06.2' of git://evilpiepirate.org/bcachefs

Pull bcachefs fixes from Kent Overstreet:
 "Nothing major, things continue to be fairly quiet over here.

   - add a SubmittingPatches to clarify that patches submitted for
     bcachefs do, in fact, need to be tested

   - discard path now correctly issues journal flushes when needed, this
     fixes performance issues when the filesystem is nearly full and
     we're bottlenecked on copygc

   - fix a bug that could cause the pending rebalance work accounting to
     be off when devices are being onlined/offlined; users should report
     if they are still seeing this

   - and a few more trivial ones"

* tag 'bcachefs-2025-02-06.2' of git://evilpiepirate.org/bcachefs:
  bcachefs: bch2_bkey_sectors_need_rebalance() now only depends on bch_extent_rebalance
  bcachefs: Fix rcu imbalance in bch2_fs_btree_key_cache_exit()
  bcachefs: Fix discard path journal flushing
  bcachefs: fix deadlock in journal_entry_open()
  bcachefs: fix incorrect pointer check in __bch2_subvolume_delete()
  bcachefs docs: SubmittingPatches.rst
2025-02-07 09:16:07 -08:00
Hector Martin
1b3291f000 MAINTAINERS: Remove myself
I no longer have any faith left in the kernel development process or
community management approach.

Apple/ARM platform development will continue downstream. If I feel like
sending some patches upstream in the future myself for whatever subtree
I may, or I may not. Anyone who feels like fighting the upstreaming
fight themselves is welcome to do so.

Signed-off-by: Hector Martin <marcan@marcan.st>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-02-07 09:12:33 -08:00
Pavel Machek
511121a48b MAINTAINERS: Move Pavel to kernel.org address
I need to filter my emails better, switch to pavel@kernel.org address
to help with that.

Signed-off-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-02-07 09:12:33 -08:00
Kent Overstreet
4be214c269 bcachefs: bch2_bkey_sectors_need_rebalance() now only depends on bch_extent_rebalance
Previously, bch2_bkey_sectors_need_rebalance() called
bch2_target_accepts_data(), checking whether the target is writable.

However, this means that adding or removing devices from a target would
change the value of bch2_bkey_sectors_need_rebalance() for an existing
extent; this needs to be invariant so that the extent trigger can
correctly maintain rebalance_work accounting.

Instead, check target_accepts_data() in io_opts_to_rebalance_opts(),
before creating the bch_extent_rebalance entry.

This fixes (one?) cause of rebalance_work accounting being off.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-02-06 22:35:11 -05:00
Kent Overstreet
3539880ef1 bcachefs: Fix rcu imbalance in bch2_fs_btree_key_cache_exit()
Spotted by sparse.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-02-06 22:35:11 -05:00
Kent Overstreet
9e9033522a bcachefs: Fix discard path journal flushing
The discard path is supposed to issue journal flushes when there's too
many buckets empty buckets that need a journal commit before they can be
written to again, but at some point this code seems to have been lost.

Bring it back with a new optimization to make sure we don't issue too
many journal flushes: the journal now tracks the sequence number of the
most recent flush in progress, which the discard path uses when deciding
which buckets need a journal flush.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-02-06 22:35:11 -05:00
Jeongjun Park
2ef995df0c bcachefs: fix deadlock in journal_entry_open()
In the previous commit b3d82c2f27, code was added to prevent journal sequence
overflow. Among them, the code added to journal_entry_open() uses the
bch2_fs_fatal_err_on() function to handle errors.

However, __journal_res_get() , which calls journal_entry_open() , calls
journal_entry_open() while holding journal->lock , but bch2_fs_fatal_err_on()
internally tries to acquire journal->lock , which results in a deadlock.

So we need to add a locked helper to handle fatal errors even when the
journal->lock is held.

Fixes: b3d82c2f27 ("bcachefs: Guard against journal seq overflow")
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-02-06 22:35:11 -05:00
Jeongjun Park
6b37037d6d bcachefs: fix incorrect pointer check in __bch2_subvolume_delete()
For some unknown reason, checks on struct bkey_s_c_snapshot and struct
bkey_s_c_snapshot_tree pointers are missing.

Therefore, I think it would be appropriate to fix the incorrect pointer checking
through this patch.

Fixes: 4bd06f07bc ("bcachefs: Fixes for snapshot_tree.master_subvol")
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-02-06 22:35:11 -05:00
Kent Overstreet
fdfd0ad828 bcachefs docs: SubmittingPatches.rst
Add an (initial?) patch submission checklist, focusing mainly on
testing.

Yes, all patches must be tested, and that starts (but does not end) with
the patch author.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-02-06 22:35:11 -05:00
Linus Torvalds
bb066fe812 pci-v6.14-fixes-2
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmelDIoUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vxzhQ/8CDqrVGq+nWgXs9NGYBfbWrIH39Lh
 k7PZA/GHvDSaMaFsCZ3tba7HtYNjcVjn6GPCYwHDjPY3oRmMwb63rSfbdtslR6l6
 CJRMETWVU9LIi9vGxb0nFBRkBE9+tHrIZXAndJZmfimDnYAHraM7S1F63+AvcyqA
 1AxJk8Vb2EtiDvbBYhwq4ef716zuzdC9b45N9fLodtROKpd14DPvicM1TNu2eMaT
 D8NhIiQQtHwPqqj0S4ai0O4xyfHVvJsQijmpyaADFwYNtqVptsou7MG2v9iZR8KX
 jYVjvcHlqwJc2D1VKSA6nsU+cV8ChIBrwdoHgWw4fS87QSuOd8WupLmszQJTj2xw
 Hj+8Z1S911PuQr97lO97l8BGxU0Talv3DY1o3B7I4N5bbFJgCut5fLU8YjPxUmEa
 PxY7YkHVEhP2PSuw9Em5u15MWF972OI9hz86Nl8F701KzgDu2SGSWq3yUpz1w+EZ
 DZSXJnPH2jGhc6bMlSEmVse36wKigETnbNOyEuFAe6iJ7ZHrU1uqS5GLTnuCkn3V
 +Y4x3txq8GGlNpEt9gR/0r2C7bUNTreN28/x2ot1Be9P0Wc2Xj9OTglzbirg2KBq
 n2Uw70lhqCaELM/2QGsxW2bI0r1nIPlQ2t64hbBQ6g6bS43Xi5OqiGI2m9qycILZ
 x2wnC2qU+NvDLqw=
 =Nh1D
 -----END PGP SIGNATURE-----

Merge tag 'pci-v6.14-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull pci fixes from Bjorn Helgaas:

 - When saving a device's state, always save the upstream bridge's PM L1
   Substates configuration as well because the bridge never saves its
   own state, and restoring a device needs the state for both ends; this
   was a regression that caused link and power management errors after
   suspend/resume (Ilpo Järvinen)

 - Correct TPH Control Register write, where we wrote the ST Mode where
   the THP Requester Enable value was intended (Robin Murphy)

* tag 'pci-v6.14-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
  PCI/TPH: Restore TPH Requester Enable correctly
  PCI/ASPM: Fix L1SS saving
2025-02-06 12:32:03 -08:00
Linus Torvalds
5b734b49de xen: branch for v6.14-rc2
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCZ6Ti7AAKCRCAXGG7T9hj
 vqTVAP4iQCeMzxPJoHteWm9ihOyIpHZ+5Kimhle/irUmAYC5OwD/St7EdmY3MiZd
 sMRNrW3dFBvOQkgnysRw7OOaP8GUfgY=
 =DT9D
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-6.14-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:
 "Three fixes for xen_hypercall_hvm() that was introduced in the 6.13
  cycle"

* tag 'for-linus-6.14-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  x86/xen: remove unneeded dummy push from xen_hypercall_hvm()
  x86/xen: add FRAME_END to xen_hypercall_hvm()
  x86/xen: fix xen_hypercall_hvm() to not clobber %rbx
2025-02-06 12:25:35 -08:00
Linus Torvalds
3cf0a98fea Current release - regressions:
- core: harmonize tstats and dstats
 
   - ipv6: fix dst refleaks in rpl, seg6 and ioam6 lwtunnels
 
   - eth: tun: revert fix group permission check
 
   - eth: stmmac: revert "specify hardware capability value when FIFO size isn't specified"
 
 Previous releases - regressions:
 
   - udp: gso: do not drop small packets when PMTU reduces
 
   - rxrpc: fix race in call state changing vs recvmsg()
 
   - eth: ice: fix Rx data path for heavy 9k MTU traffic
 
   - eth: vmxnet3: fix tx queue race condition with XDP
 
 Previous releases - always broken:
 
   - sched: pfifo_tail_enqueue: drop new packet when sch->limit == 0
 
   - ethtool: ntuple: fix rss + ring_cookie check
 
   - rxrpc: fix the rxrpc_connection attend queue handling
 
 Misc:
 
   - recognize Kuniyuki Iwashima as a maintainer
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmekqVQSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkLgcP/3sewa114G/vERakXPptWLnCRtXLMaMk
 E8ihlBj9qI0hD51Qi+NRYKI/IJqA5ZB/p4I5cBz7xI8d5VOQYhSuCdCnwMEZPAfG
 IQS8VpcFjcdj1e7cjlFu/s5PzF6cRRvinhWQpE5YpuE2TFCl+SJ8XAupLvi8zCe4
 wo1Pet4dhKOKtrrhqiCeSNUer0/MSeLrCIB7mHha279l0D8Wx+m+h8OJMQahXI0t
 IJWNkxPyrIqx0ksR8Uy9SVAIrNlR5iEeqcwztLUtqxzs+b/s7OJcWmgqXgSMTOb5
 RUr8Bthm3DQyXuFoR5bFA/oaoJb+iZvOjcEqQaYU2dF5zcRfdY/omhQmNq6s/7/w
 7KV7C/RIrnhwLjkEd2IX5pzvgM9VuO9ewk65+sZ82o0wukhKn6RIyJYHzvwbc0rH
 rQMGAoz/uF4eiuqfS+uh86+VJfjBkG/sgdw0JSAt7dN14Fadg14+Ocz6apjwNuzJ
 0QUWABxKSpvktufycyoo5s/bPqLMBxI/W3XZwIzbJ+dupRavPxkWpT/AwI//ZTDe
 rUTl2SEd4Ruliy1w41TXgubRxuW07M7bm4AxUrLzAzTx+aX7AdBeIyZp83x3oW1p
 nNUSdJ0m7A0Z1k28tdAP/gjD/vjcQ0J5YQ6MvZ1RBQl7MI+GLYD2irWbbrlI+g+6
 HKQcXhv/7pG4
 =t5WM
 -----END PGP SIGNATURE-----

Merge tag 'net-6.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Interestingly the recent kmemleak improvements allowed our CI to catch
  a couple of percpu leaks addressed here.

  We (mostly Jakub, to be accurate) are working to increase review
  coverage over the net code-base tweaking the MAINTAINER entries.

  Current release - regressions:

   - core: harmonize tstats and dstats

   - ipv6: fix dst refleaks in rpl, seg6 and ioam6 lwtunnels

   - eth: tun: revert fix group permission check

   - eth: stmmac: revert "specify hardware capability value when FIFO
     size isn't specified"

  Previous releases - regressions:

   - udp: gso: do not drop small packets when PMTU reduces

   - rxrpc: fix race in call state changing vs recvmsg()

   - eth: ice: fix Rx data path for heavy 9k MTU traffic

   - eth: vmxnet3: fix tx queue race condition with XDP

  Previous releases - always broken:

   - sched: pfifo_tail_enqueue: drop new packet when sch->limit == 0

   - ethtool: ntuple: fix rss + ring_cookie check

   - rxrpc: fix the rxrpc_connection attend queue handling

  Misc:

   - recognize Kuniyuki Iwashima as a maintainer"

* tag 'net-6.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (34 commits)
  Revert "net: stmmac: Specify hardware capability value when FIFO size isn't specified"
  MAINTAINERS: add a sample ethtool section entry
  MAINTAINERS: add entry for ethtool
  rxrpc: Fix race in call state changing vs recvmsg()
  rxrpc: Fix call state set to not include the SERVER_SECURING state
  net: sched: Fix truncation of offloaded action statistics
  tun: revert fix group permission check
  selftests/tc-testing: Add a test case for qdisc_tree_reduce_backlog()
  netem: Update sch->q.qlen before qdisc_tree_reduce_backlog()
  selftests/tc-testing: Add a test case for pfifo_head_drop qdisc when limit==0
  pfifo_tail_enqueue: Drop new packet when sch->limit == 0
  selftests: mptcp: connect: -f: no reconnect
  net: rose: lock the socket in rose_bind()
  net: atlantic: fix warning during hot unplug
  rxrpc: Fix the rxrpc_connection attend queue handling
  net: harmonize tstats and dstats
  selftests: drv-net: rss_ctx: don't fail reconfigure test if queue offset not supported
  selftests: drv-net: rss_ctx: add missing cleanup in queue reconfigure
  ethtool: ntuple: fix rss + ring_cookie check
  ethtool: rss: fix hiding unsupported fields in dumps
  ...
2025-02-06 09:14:54 -08:00
Robin Murphy
6f64b83d9f PCI/TPH: Restore TPH Requester Enable correctly
When we reenable TPH after changing a Steering Tag value, we need the
actual TPH Requester Enable value, not the ST Mode (which only happens to
work out by chance for non-extended TPH in interrupt vector mode).

Link: https://lore.kernel.org/r/13118098116d7bce07aa20b8c52e28c7d1847246.1738759933.git.robin.murphy@arm.com
Fixes: d2e8a34876 ("PCI/TPH: Add Steering Tag support")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Wei Huang <wei.huang2@amd.com>
2025-02-06 10:30:11 -06:00
Ilpo Järvinen
7507eb3e7b PCI/ASPM: Fix L1SS saving
Commit 1db806ec06 ("PCI/ASPM: Save parent L1SS config in
pci_save_aspm_l1ss_state()") aimed to perform L1SS config save for both the
Upstream Port and its upstream bridge when handling an Upstream Port, which
matches what the L1SS restore side does. However, parent->state_saved can
be set true at an earlier time when the upstream bridge saved other parts
of its state. Then later when attempting to save the L1SS config while
handling the Upstream Port, parent->state_saved is true in
pci_save_aspm_l1ss_state() resulting in early return and skipping saving
bridge's L1SS config because it is assumed to be already saved. Later on
restore, junk is written into L1SS config which causes issues with some
devices.

Remove parent->state_saved check and unconditionally save L1SS config also
for the upstream bridge from an Upstream Port which ought to be harmless
from correctness point of view. With the Upstream Port check now present,
saving the L1SS config more than once for the bridge is no longer a problem
(unlike when the parent->state_saved check got introduced into the fix
during its development).

Link: https://lore.kernel.org/r/20250131152913.2507-1-ilpo.jarvinen@linux.intel.com
Fixes: 1db806ec06 ("PCI/ASPM: Save parent L1SS config in pci_save_aspm_l1ss_state()")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219731
Reported-by: Niklāvs Koļesņikovs <pinkflames.linux@gmail.com>
Reported by: Rafael J. Wysocki <rafael@kernel.org>
Closes: https://lore.kernel.org/r/CAJZ5v0iKmynOQ5vKSQbg1J_FmavwZE-nRONovOZ0mpMVauheWg@mail.gmail.com
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Closes: https://lore.kernel.org/r/d7246feb-4f3f-4d0c-bb64-89566b170671@molgen.mpg.de
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Niklāvs Koļesņikovs <pinkflames.linux@gmail.com>
Tested-by: Paul Menzel <pmenzel@molgen.mpg.de> # Dell XPS 13 9360
2025-02-06 09:54:38 -06:00
Russell King (Oracle)
2a64c96356 Revert "net: stmmac: Specify hardware capability value when FIFO size isn't specified"
This reverts commit 8865d22656, which caused breakage for platforms
which are not using xgmac2 or gmac4. Only these two cores have the
capability of providing the FIFO sizes from hardware capability fields
(which are provided in priv->dma_cap.[tr]x_fifo_size.)

All other cores can not, which results in these two fields containing
zero. We also have platforms that do not provide a value in
priv->plat->[tr]x_fifo_size, resulting in these also being zero.

This causes the new tests introduced by the reverted commit to fail,
and produce e.g.:

	stmmaceth f0804000.eth: Can't specify Rx FIFO size

An example of such a platform which fails is QEMU's npcm750-evb.
This uses dwmac1000 which, as noted above, does not have the capability
to provide the FIFO sizes from hardware.

Therefore, revert the commit to maintain compatibility with the way
the driver used to work.

Reported-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/4e98f967-f636-46fb-9eca-d383b9495b86@roeck-us.net
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Tested-by: Steven Price <steven.price@arm.com>
Fixes: 8865d22656 ("net: stmmac: Specify hardware capability value when FIFO size isn't specified")
Link: https://patch.msgid.link/E1tfeyR-003YGJ-Gb@rmk-PC.armlinux.org.uk
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-06 11:53:54 +01:00
Jakub Kicinski
82b02a7c45 MAINTAINERS: add a sample ethtool section entry
I feel like we don't do a good enough keeping authors of driver
APIs around. The ethtool code base was very nicely compartmentalized
by Michal. Establish a precedent of creating MAINTAINERS entries
for "sections" of the ethtool API. Use Andrew and cable test as
a sample entry. The entry should ideally cover 3 elements:
a core file, test(s), and keywords. The last one is important
because we intend the entries to cover core code *and* reviews
of drivers implementing given API!

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250204215750.169249-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-06 10:32:49 +01:00
Jakub Kicinski
1e3835a8ae MAINTAINERS: add entry for ethtool
Michal did an amazing job converting ethtool to Netlink, but never
added an entry to MAINTAINERS for himself. Create a formal entry
so that we can delegate (portions) of this code to folks.

Over the last 3 years majority of the reviews have been done by
Andrew and I. I suppose Michal didn't want to be on the receiving
end of the flood of patches.

Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20250204215729.168992-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-06 10:32:49 +01:00
Jakub Kicinski
884af6ab1e Merge branch 'rxrpc-call-state-fixes'
David Howells says:

====================
rxrpc: Call state fixes

Here some call state fixes for AF_RXRPC.

 (1) Fix the state of a call to not treat the challenge-response cycle as
     part of an incoming call's state set.  The problem is that it makes
     handling received of the final packet in the receive phase difficult
     as that wants to change the call state - but security negotiations may
     not yet be complete.

 (2) Fix a race between the changing of the call state at the end of the
     request reception phase of a service call, recvmsg() collecting the last
     data and sendmsg() trying to send the reply before the I/O thread has
     advanced the call state.

Link: https://lore.kernel.org/20250203110307.7265-2-dhowells@redhat.com
====================

Link: https://patch.msgid.link/20250204230558.712536-1-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-05 18:47:51 -08:00
David Howells
2d7b30aef3 rxrpc: Fix race in call state changing vs recvmsg()
There's a race in between the rxrpc I/O thread recording the end of the
receive phase of a call and recvmsg() examining the state of the call to
determine whether it has completed.

The problem is that call->_state records the I/O thread's view of the call,
not the application's view (which may lag), so that alone is not
sufficient.  To this end, the application also checks whether there is
anything left in call->recvmsg_queue for it to pick up.  The call must be
in state RXRPC_CALL_COMPLETE and the recvmsg_queue empty for the call to be
considered fully complete.

In rxrpc_input_queue_data(), the latest skbuff is added to the queue and
then, if it was marked as LAST_PACKET, the state is advanced...  But this
is two separate operations with no locking around them.

As a consequence, the lack of locking means that sendmsg() can jump into
the gap on a service call and attempt to send the reply - but then get
rejected because the I/O thread hasn't advanced the state yet.

Simply flipping the order in which things are done isn't an option as that
impacts the client side, causing the checks in rxrpc_kernel_check_life() as
to whether the call is still alive to race instead.

Fix this by moving the update of call->_state inside the skb queue
spinlocked section where the packet is queued on the I/O thread side.

rxrpc's recvmsg() will then automatically sync against this because it has
to take the call->recvmsg_queue spinlock in order to dequeue the last
packet.

rxrpc's sendmsg() doesn't need amending as the app shouldn't be calling it
to send a reply until recvmsg() indicates it has returned all of the
request.

Fixes: 93368b6bd5 ("rxrpc: Move call state changes from recvmsg to I/O thread")
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
Link: https://patch.msgid.link/20250204230558.712536-3-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-05 18:47:46 -08:00
David Howells
41b996ce83 rxrpc: Fix call state set to not include the SERVER_SECURING state
The RXRPC_CALL_SERVER_SECURING state doesn't really belong with the other
states in the call's state set as the other states govern the call's Rx/Tx
phase transition and govern when packets can and can't be received or
transmitted.  The "Securing" state doesn't actually govern the reception of
packets and would need to be split depending on whether or not we've
received the last packet yet (to mirror RECV_REQUEST/ACK_REQUEST).

The "Securing" state is more about whether or not we can start forwarding
packets to the application as recvmsg will need to decode them and the
decoding can't take place until the challenge/response exchange has
completed.

Fix this by removing the RXRPC_CALL_SERVER_SECURING state from the state
set and, instead, using a flag, RXRPC_CALL_CONN_CHALLENGING, to track
whether or not we can queue the call for reception by recvmsg() or notify
the kernel app that data is ready.  In the event that we've already
received all the packets, the connection event handler will poke the app
layer in the appropriate manner.

Also there's a race whereby the app layer sees the last packet before rxrpc
has managed to end the rx phase and change the state to one amenable to
allowing a reply.  Fix this by queuing the packet after calling
rxrpc_end_rx_phase().

Fixes: 17926a7932 ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
Link: https://patch.msgid.link/20250204230558.712536-2-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-05 18:47:46 -08:00
Ido Schimmel
811b8f534f net: sched: Fix truncation of offloaded action statistics
In case of tc offload, when user space queries the kernel for tc action
statistics, tc will query the offloaded statistics from device drivers.
Among other statistics, drivers are expected to pass the number of
packets that hit the action since the last query as a 64-bit number.

Unfortunately, tc treats the number of packets as a 32-bit number,
leading to truncation and incorrect statistics when the number of
packets since the last query exceeds 0xffffffff:

$ tc -s filter show dev swp2 ingress
filter protocol all pref 1 flower chain 0
filter protocol all pref 1 flower chain 0 handle 0x1
  skip_sw
  in_hw in_hw_count 1
        action order 1: mirred (Egress Redirect to device swp1) stolen
        index 1 ref 1 bind 1 installed 58 sec used 0 sec
        Action statistics:
        Sent 1133877034176 bytes 536959475 pkt (dropped 0, overlimits 0 requeues 0)
[...]

According to the above, 2111-byte packets were redirected which is
impossible as only 64-byte packets were transmitted and the MTU was
1500.

Fix by treating packets as a 64-bit number:

$ tc -s filter show dev swp2 ingress
filter protocol all pref 1 flower chain 0
filter protocol all pref 1 flower chain 0 handle 0x1
  skip_sw
  in_hw in_hw_count 1
        action order 1: mirred (Egress Redirect to device swp1) stolen
        index 1 ref 1 bind 1 installed 61 sec used 0 sec
        Action statistics:
        Sent 1370624380864 bytes 21416005951 pkt (dropped 0, overlimits 0 requeues 0)
[...]

Which shows that only 64-byte packets were redirected (1370624380864 /
21416005951 = 64).

Fixes: 3804070235 ("net/sched: Enable netdev drivers to update statistics of offloaded actions")
Reported-by: Joe Botha <joe@atomic.ac>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250204123839.1151804-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-05 18:32:06 -08:00
Willem de Bruijn
a70c7b3cbc tun: revert fix group permission check
This reverts commit 3ca459eaba.

The blamed commit caused a regression when neither tun->owner nor
tun->group is set. This is intended to be allowed, but now requires
CAP_NET_ADMIN.

Discussion in the referenced thread pointed out that the original
issue that prompted this patch can be resolved in userspace.

The relaxed access control may also make a device accessible when it
previously wasn't, while existing users may depend on it to not be.

This is a clean pure git revert, except for fixing the indentation on
the gid_valid line that checkpatch correctly flagged.

Fixes: 3ca459eaba ("tun: fix group permission check")
Link: https://lore.kernel.org/netdev/CAFqZXNtkCBT4f+PwyVRmQGoT3p1eVa01fCG_aNtpt6dakXncUg@mail.gmail.com/
Signed-off-by: Willem de Bruijn <willemb@google.com>
Cc: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Stas Sergeev <stsp2@yandex.ru>
Link: https://patch.msgid.link/20250204161015.739430-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-05 18:22:11 -08:00
Jakub Kicinski
02b71dc115 Merge branch 'net_sched-two-security-bug-fixes-and-test-cases'
Cong Wang says:

====================
net_sched: two security bug fixes and test cases

This patchset contains two bug fixes reported in security mailing list,
and test cases for both of them.
====================

Link: https://patch.msgid.link/20250204005841.223511-1-xiyou.wangcong@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-05 18:15:00 -08:00
Cong Wang
91aadc16ee selftests/tc-testing: Add a test case for qdisc_tree_reduce_backlog()
Integrate the test case provided by Mingi Cho into TDC.

All test results:

1..4
ok 1 ca5e - Check class delete notification for ffff:
ok 2 e4b7 - Check class delete notification for root ffff:
ok 3 33a9 - Check ingress is not searchable on backlog update
ok 4 a4b9 - Test class qlen notification

Cc: Mingi Cho <mincho@theori.io>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Link: https://patch.msgid.link/20250204005841.223511-5-xiyou.wangcong@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-05 18:15:00 -08:00
Cong Wang
638ba50893 netem: Update sch->q.qlen before qdisc_tree_reduce_backlog()
qdisc_tree_reduce_backlog() notifies parent qdisc only if child
qdisc becomes empty, therefore we need to reduce the backlog of the
child qdisc before calling it. Otherwise it would miss the opportunity
to call cops->qlen_notify(), in the case of DRR, it resulted in UAF
since DRR uses ->qlen_notify() to maintain its active list.

Fixes: f8d4bc4550 ("net/sched: netem: account for backlog updates from child qdisc")
Cc: Martin Ottens <martin.ottens@fau.de>
Reported-by: Mingi Cho <mincho@theori.io>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Link: https://patch.msgid.link/20250204005841.223511-4-xiyou.wangcong@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-05 18:14:46 -08:00
Quang Le
3fe5648d1d selftests/tc-testing: Add a test case for pfifo_head_drop qdisc when limit==0
When limit == 0, pfifo_tail_enqueue() must drop new packet and
increase dropped packets count of the qdisc.

All test results:

1..16
ok 1 a519 - Add bfifo qdisc with system default parameters on egress
ok 2 585c - Add pfifo qdisc with system default parameters on egress
ok 3 a86e - Add bfifo qdisc with system default parameters on egress with handle of maximum value
ok 4 9ac8 - Add bfifo qdisc on egress with queue size of 3000 bytes
ok 5 f4e6 - Add pfifo qdisc on egress with queue size of 3000 packets
ok 6 b1b1 - Add bfifo qdisc with system default parameters on egress with invalid handle exceeding maximum value
ok 7 8d5e - Add bfifo qdisc on egress with unsupported argument
ok 8 7787 - Add pfifo qdisc on egress with unsupported argument
ok 9 c4b6 - Replace bfifo qdisc on egress with new queue size
ok 10 3df6 - Replace pfifo qdisc on egress with new queue size
ok 11 7a67 - Add bfifo qdisc on egress with queue size in invalid format
ok 12 1298 - Add duplicate bfifo qdisc on egress
ok 13 45a0 - Delete nonexistent bfifo qdisc
ok 14 972b - Add prio qdisc on egress with invalid format for handles
ok 15 4d39 - Delete bfifo qdisc twice
ok 16 d774 - Check pfifo_head_drop qdisc enqueue behaviour when limit == 0

Signed-off-by: Quang Le <quanglex97@gmail.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Link: https://patch.msgid.link/20250204005841.223511-3-xiyou.wangcong@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-05 18:14:46 -08:00
Quang Le
647cef20e6 pfifo_tail_enqueue: Drop new packet when sch->limit == 0
Expected behaviour:
In case we reach scheduler's limit, pfifo_tail_enqueue() will drop a
packet in scheduler's queue and decrease scheduler's qlen by one.
Then, pfifo_tail_enqueue() enqueue new packet and increase
scheduler's qlen by one. Finally, pfifo_tail_enqueue() return
`NET_XMIT_CN` status code.

Weird behaviour:
In case we set `sch->limit == 0` and trigger pfifo_tail_enqueue() on a
scheduler that has no packet, the 'drop a packet' step will do nothing.
This means the scheduler's qlen still has value equal 0.
Then, we continue to enqueue new packet and increase scheduler's qlen by
one. In summary, we can leverage pfifo_tail_enqueue() to increase qlen by
one and return `NET_XMIT_CN` status code.

The problem is:
Let's say we have two qdiscs: Qdisc_A and Qdisc_B.
 - Qdisc_A's type must have '->graft()' function to create parent/child relationship.
   Let's say Qdisc_A's type is `hfsc`. Enqueue packet to this qdisc will trigger `hfsc_enqueue`.
 - Qdisc_B's type is pfifo_head_drop. Enqueue packet to this qdisc will trigger `pfifo_tail_enqueue`.
 - Qdisc_B is configured to have `sch->limit == 0`.
 - Qdisc_A is configured to route the enqueued's packet to Qdisc_B.

Enqueue packet through Qdisc_A will lead to:
 - hfsc_enqueue(Qdisc_A) -> pfifo_tail_enqueue(Qdisc_B)
 - Qdisc_B->q.qlen += 1
 - pfifo_tail_enqueue() return `NET_XMIT_CN`
 - hfsc_enqueue() check for `NET_XMIT_SUCCESS` and see `NET_XMIT_CN` => hfsc_enqueue() don't increase qlen of Qdisc_A.

The whole process lead to a situation where Qdisc_A->q.qlen == 0 and Qdisc_B->q.qlen == 1.
Replace 'hfsc' with other type (for example: 'drr') still lead to the same problem.
This violate the design where parent's qlen should equal to the sum of its childrens'qlen.

Bug impact: This issue can be used for user->kernel privilege escalation when it is reachable.

Fixes: 57dbb2d83d ("sched: add head drop fifo queue")
Reported-by: Quang Le <quanglex97@gmail.com>
Signed-off-by: Quang Le <quanglex97@gmail.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Link: https://patch.msgid.link/20250204005841.223511-2-xiyou.wangcong@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-05 18:13:58 -08:00
Matthieu Baerts (NGI0)
5368a67307 selftests: mptcp: connect: -f: no reconnect
The '-f' parameter is there to force the kernel to emit MPTCP FASTCLOSE
by closing the connection with unread bytes in the receive queue.

The xdisconnect() helper was used to stop the connection, but it does
more than that: it will shut it down, then wait before reconnecting to
the same address. This causes the mptcp_join's "fastclose test" to fail
all the time.

This failure is due to a recent change, with commit 218cc16632
("selftests: mptcp: avoid spurious errors on disconnect"), but that went
unnoticed because the test is currently ignored. The recent modification
only shown an existing issue: xdisconnect() doesn't need to be used
here, only the shutdown() part is needed.

Fixes: 6bf41020b7 ("selftests: mptcp: update and extend fastclose test-cases")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250204-net-mptcp-sft-conn-f-v1-1-6b470c72fffa@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-05 17:54:32 -08:00
Juergen Gross
aaf5eefd37 x86/xen: remove unneeded dummy push from xen_hypercall_hvm()
Stack alignment of the kernel in 64-bit mode is 8, not 16, so the
dummy push in xen_hypercall_hvm() for aligning the stack to 16 bytes
can be removed.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2025-02-05 17:23:22 +01:00
Juergen Gross
0bd797b801 x86/xen: add FRAME_END to xen_hypercall_hvm()
xen_hypercall_hvm() is missing a FRAME_END at the end, add it.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202502030848.HTNTTuo9-lkp@intel.com/
Fixes: b4845bb638 ("x86/xen: add central hypercall functions")
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2025-02-05 17:23:20 +01:00
Juergen Gross
98a5cfd232 x86/xen: fix xen_hypercall_hvm() to not clobber %rbx
xen_hypercall_hvm(), which is used when running as a Xen PVH guest at
most only once during early boot, is clobbering %rbx. Depending on
whether the caller relies on %rbx to be preserved across the call or
not, this clobbering might result in an early crash of the system.

This can be avoided by using an already saved register instead of %rbx.

Fixes: b4845bb638 ("x86/xen: add central hypercall functions")
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2025-02-05 17:23:15 +01:00
Linus Torvalds
92514ef226 for-6.14-rc1-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmejbmkACgkQxWXV+ddt
 WDuXuQ/8DETuhqww7hwrDgHyDHRgj/783Oy1+V/jJPQZ1hpjAbSJBU7aKxneAMMc
 Gj4ExbDfjd8kRAvsE71/1McJqVBd6CiuBG/k65vjqVS4lM8mEasVefCa4OsOR3iB
 gGzQeNbrDzIs3IOg8hM1l1iPDkI/AyOyeysD/qNMdWO4mcKEMwrBYkQJhL6+DzfE
 BKVP+NAdK4iv/W/EngAqvcd7mq2RdxeR9nBesHnKTzPCVFf6bBT3b3Qem/ovY6Td
 ZNQLjx9GXumDj0jSiyRI5rxjSYOrqSU4JV+1C7ghOwBj2uD2SZxAvghuRSuUTKnN
 9/9x1RNO7i+FE3GHzYWShhHkNuuLZspmi2J1neSttQ5Jy/PFJR/tUAED5zY6Nl0X
 73GWSzLlmaJ+E77DkTJksBYmRrnwtA6plMKD4umDCHw0lNu/0GCvxtZY1T1ZiKy2
 yK37Ja559+k7RPIdKoyHo81A7num4gLeBZTqd9F/XPjU26b57Qnqk1LetgGeT8xk
 IZFQtz9HdmvLwwbxWwNvp1ttRf+1dj1lpVnb5n6r0d8Uyta9tgQpDvwhNjluBsEx
 AQxK9yUZ5kAXEEbEyDwoOsZ5yjwkaMzqRpQWWappb0jCLm5dADI87odFRaOSlgWS
 WoXL6Vbod8G5vaEVwPDl6yuSS+609c7M8ftBlgvcx3XZ5/N6y2M=
 =YjCE
 -----END PGP SIGNATURE-----

Merge tag 'for-6.14-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

 - add lockdep annotation for relocation root to fix a splat warning
   while merging roots

 - fix assertion failure when splitting ordered extent after transaction
   abort

 - don't print 'qgroup inconsistent' message when rescan process updates
   qgroup data sooner than the subvolume deletion process

 - fix use-after-free (accessing the error number) when attempting to
   join an aborted transaction

 - avoid starting new transaction if not necessary when cleaning qgroup
   during subvolume drop

* tag 'for-6.14-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: avoid starting new transaction when cleaning qgroup during subvolume drop
  btrfs: fix use-after-free when attempting to join an aborted transaction
  btrfs: do not output error message if a qgroup has been already cleaned up
  btrfs: fix assertion failure when splitting ordered extent after transaction abort
  btrfs: fix lockdep splat while merging a relocation root
2025-02-05 08:13:07 -08:00
Eric Dumazet
a1300691ae net: rose: lock the socket in rose_bind()
syzbot reported a soft lockup in rose_loopback_timer(),
with a repro calling bind() from multiple threads.

rose_bind() must lock the socket to avoid this issue.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: syzbot+7ff41b5215f0c534534e@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/67a0f78d.050a0220.d7c5a.00a0.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://patch.msgid.link/20250203170838.3521361-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-04 14:03:58 -08:00
Jacob Moroni
028676bb18 net: atlantic: fix warning during hot unplug
Firmware deinitialization performs MMIO accesses which are not
necessary if the device has already been removed. In some cases,
these accesses happen via readx_poll_timeout_atomic which ends up
timing out, resulting in a warning at hw_atl2_utils_fw.c:112:

[  104.595913] Call Trace:
[  104.595915]  <TASK>
[  104.595918]  ? show_regs+0x6c/0x80
[  104.595923]  ? __warn+0x8d/0x150
[  104.595925]  ? aq_a2_fw_deinit+0xcf/0xe0 [atlantic]
[  104.595934]  ? report_bug+0x182/0x1b0
[  104.595938]  ? handle_bug+0x6e/0xb0
[  104.595940]  ? exc_invalid_op+0x18/0x80
[  104.595942]  ? asm_exc_invalid_op+0x1b/0x20
[  104.595944]  ? aq_a2_fw_deinit+0xcf/0xe0 [atlantic]
[  104.595952]  ? aq_a2_fw_deinit+0xcf/0xe0 [atlantic]
[  104.595959]  aq_nic_deinit.part.0+0xbd/0xf0 [atlantic]
[  104.595964]  aq_nic_deinit+0x17/0x30 [atlantic]
[  104.595970]  aq_ndev_close+0x2b/0x40 [atlantic]
[  104.595975]  __dev_close_many+0xad/0x160
[  104.595978]  dev_close_many+0x99/0x170
[  104.595979]  unregister_netdevice_many_notify+0x18b/0xb20
[  104.595981]  ? __call_rcu_common+0xcd/0x700
[  104.595984]  unregister_netdevice_queue+0xc6/0x110
[  104.595986]  unregister_netdev+0x1c/0x30
[  104.595988]  aq_pci_remove+0xb1/0xc0 [atlantic]

Fix this by skipping firmware deinitialization altogether if the
PCI device is no longer present.

Tested with an AQC113 attached via Thunderbolt by performing
repeated unplug cycles while traffic was running via iperf.

Fixes: 97bde5c4f9 ("net: ethernet: aquantia: Support for NIC-specific code")
Signed-off-by: Jacob Moroni <mail@jakemoroni.com>
Reviewed-by: Igor Russkikh <irusskikh@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250203143604.24930-3-mail@jakemoroni.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-04 14:00:35 -08:00
Linus Torvalds
5c8c229261 Fixes for kthreads
- Properly handle return value when allocation fails for the preferred
   affinity.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEd76+gtGM8MbftQlOhSRUR1COjHcFAmeiMlMACgkQhSRUR1CO
 jHdkNg//bFAleXDdw1FcDIzpsTjQa9fCG7GiyCdFGlNoJl1IbIwVQtvZjLZhR9Ep
 xzW1V/s2YGIG9rlwDbEq2DBE/PUc/jWjhlhpuqrLE66VkXx7TPqipVMwgubBZKJr
 HCO4L4ubT3ZE/e/EBrRMhxcWqbDdTmmbG1zRjg/lv8iVH4o26vTm2ZdNCLAET3UA
 bq1VQiTFzJDo+iW1hEJoe52wroZmfQUmx1SdtXL11iHCFpxx93ZJYYLNTZ/Iweyl
 lbuRoLRD2//627mFxWI2gOzBwM3u++DgaDCqv4tOG1wBv3tMceOIHbA7K3caCrE7
 SKXnJ1jxj3A0hiC4azc+5+QeFE++hYLIJZeujRyze4Q1iNvc1hzwijBRqc6h6j7N
 rPgyalKPfKt80fzLUbR0P7psI5tJMNNyZnloQ+m9hl33n9ogFw6vGWZ0uQLRNoP1
 l1aGXIc2YGENH3sIFrAwfhuYGG/+D5h+mHXqc/UhuKCplN8zFdeFLtRPMhJk4hKn
 qfaJrluSw2r04pzhX3OxYD/q9tOp356oRUNWMwYIBLojwIbRizLUOAmfeo0xuWje
 JmraK3gEsCZd0qMa6S7qGF7H/gGlHudG73Ti85POeXizaUGZF7cxaDq4+nM/4SWI
 uWhDcbTCpHZzwyNBASSrx0xYnWvYh49Au9XYY1veNrHQDD4ssQ0=
 =lIYQ
 -----END PGP SIGNATURE-----

Merge tag 'kthreads-fixes-2025-02-04' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks

Pull kthreads fix from Frederic Weisbecker:

 - Properly handle return value when allocation fails for the preferred
   affinity

* tag 'kthreads-fixes-2025-02-04' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks:
  kthread: Fix return value on kzalloc() failure in kthread_affine_preferred()
2025-02-04 11:01:48 -08:00
Linus Torvalds
d009de7d54 Livepatching fixup for 6.14-rc2
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmeh4BEACgkQUqAMR0iA
 lPJLtxAAixRnu2x7piazO5vWcoJLub1xrxsYwDzUY+2yyuHeqdNAmQbhbWHaMeZ6
 ZsJ/jUBmIcrkwxVN6maQagPqYct5e82JpGLL67fE4qfaqaFlnL3lnBttA6XNXTAD
 2QH2ovAHWn/Szlr3WcbmSYKZyttbfKVxgvp4IwmAk5hlRgVVGoSgxMGsZZHiV2PX
 IJCoeV9UGYbQEhVDsU9dtPWGgAHgiH2eAQdR0gWHpXrRVgGTubI3GhiUyPkW6yTy
 sZ6aIxVKb7atQEeAfiPvJV+3ubCUc2w7YAv5qNYqkmz2PUpvgvVnkr0RdR/Vmeq1
 6PtCG0POpnPYVTzhvjyhOmQQlGY3/RePh+krUGvIeOZnceqShjIuRdfMlM99AqSj
 cUMlnhUhfN+Fo4p3QL7QIc/rnbYBqpoG7gKd2b1iI265gRQ/PabGp2NeZt3vF0dI
 BvAJetrnga/kZIWHxjQ1XUd9QenSSX9jK9916kRYBY/bDBi3o4xXjSVN0uKfoIY/
 iYml6iuekmpKp4ZxcsZAJ5iaP7Su5dYepef6RcEpoRvr+1XJ8Bgf6evc/UDgBPXM
 cXK1V1q0lZ/b8tDfIvftiWuq6ViWc7u8rWnjAW5MHfYUxeVzQXWR8l7eBM13tnVZ
 bb9zQGxT6xECGj28xEd6wJU4dzZIldZA8Y74RgKUOYut6/9gs04=
 =P6Tm
 -----END PGP SIGNATURE-----

Merge tag 'livepatching-for-6.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching

Pull livepatching fix from Petr Mladek:

 - Fix livepatching selftests for util-linux-2.40.x

* tag 'livepatching-for-6.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching:
  selftests: livepatch: handle PRINTK_CALLER in check_result()
2025-02-04 09:52:01 -08:00
Linus Torvalds
f5a2601378 platform-drivers-x86 for v6.14-2
Fixes:
 
  - ideapad-laptop: Pass a correct pointer to the driver data
 
  - intel/ifs: Provide a link to the IFS test images
 
  - intel/pmc: Use large enough type when decoding LTR value
 
 The following is an automated shortlog grouped by driver:
 
 ideapad-laptop:
  -  pass a correct pointer to the driver data
 
 intel/ifs:
  -  Update documentation with image download path
 
 intel: pmc:
  -  fix ltr decode in pmc_core_ltr_show()
 -----BEGIN PGP SIGNATURE-----
 
 iHQEABYIAB0WIQSCSUwRdwTNL2MhaBlZrE9hU+XOMQUCZ6HR1QAKCRBZrE9hU+XO
 MZ66AP9MoxGhJJgR9A9+Rr+zwRLsbudl4XGJqGcDGFR4HeeVTwD3Q0mSBHvNW+XQ
 yzwkyP+sySzQvoNu62n3GTsOnv75CA==
 =b/yB
 -----END PGP SIGNATURE-----

Merge tag 'platform-drivers-x86-v6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform driver fixes from Ilpo Järvinen:

 - ideapad-laptop: Pass a correct pointer to the driver data

 - intel/ifs: Provide a link to the IFS test images

 - intel/pmc: Use large enough type when decoding LTR value

* tag 'platform-drivers-x86-v6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/x86/intel/ifs: Update documentation with image download path
  platform/x86/intel: pmc: fix ltr decode in pmc_core_ltr_show()
  platform/x86: ideapad-laptop: pass a correct pointer to the driver data
2025-02-04 09:50:02 -08:00
David Howells
4241a702e0 rxrpc: Fix the rxrpc_connection attend queue handling
The rxrpc_connection attend queue is never used because conn::attend_link
is never initialised and so is always NULL'd out and thus always appears to
be busy.  This requires the following fix:

 (1) Fix this the attend queue problem by initialising conn::attend_link.

And, consequently, two further fixes for things masked by the above bug:

 (2) Fix rxrpc_input_conn_event() to handle being invoked with a NULL
     sk_buff pointer - something that can now happen with the above change.

 (3) Fix the RXRPC_SKB_MARK_SERVICE_CONN_SECURED message to carry a pointer
     to the connection and a ref on it.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
cc: netdev@vger.kernel.org
Fixes: f2cce89a07 ("rxrpc: Implement a mechanism to send an event notification to a connection")
Link: https://patch.msgid.link/20250203110307.7265-3-dhowells@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-04 15:30:28 +01:00
Jithu Joseph
a787ab73e2
platform/x86/intel/ifs: Update documentation with image download path
The documentation previously listed the path to download In Field Scan
(IFS) test images as "TBD".

Update the documentation to include the correct image download
location. Also move the download link to the appropriate section within
the documentation.

Reported-by: Anisse Astier <anisse@astier.eu>
Signed-off-by: Jithu Joseph <jithu.joseph@intel.com>
Link: https://lore.kernel.org/r/20250131205315.1585663-1-jithu.joseph@intel.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-02-04 10:00:45 +02:00
Paolo Abeni
d3ed6dee73 net: harmonize tstats and dstats
After the blamed commits below, some UDP tunnel use dstats for
accounting. On the xmit path, all the UDP-base tunnels ends up
using iptunnel_xmit_stats() for stats accounting, and the latter
assumes the relevant (tunnel) network device uses tstats.

The end result is some 'funny' stat report for the mentioned UDP
tunnel, e.g. when no packet is actually dropped and a bunch of
packets are transmitted:

gnv2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue \
		state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether ee:7d:09:87:90:ea brd ff:ff:ff:ff:ff:ff
    RX:  bytes packets errors dropped  missed   mcast
         14916      23      0      15       0       0
    TX:  bytes packets errors dropped carrier collsns
             0    1566      0       0       0       0

Address the issue ensuring the same binary layout for the overlapping
fields of dstats and tstats. While this solution is a bit hackish, is
smaller and with no performance pitfall compared to other alternatives
i.e. supporting both dstat and tstat in iptunnel_xmit_stats() or
reverting the blamed commit.

With time we should possibly move all the IP-based tunnel (and virtual
devices) to dstats.

Fixes: c77200c074 ("bareudp: Handle stats using NETDEV_PCPU_STAT_DSTATS.")
Fixes: 6fa6de3022 ("geneve: Handle stats using NETDEV_PCPU_STAT_DSTATS.")
Fixes: be226352e8 ("vxlan: Handle stats using NETDEV_PCPU_STAT_DSTATS.")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Link: https://patch.msgid.link/2e1c444cf0f63ae472baff29862c4c869be17031.1738432804.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-03 18:39:59 -08:00
Jakub Kicinski
2fc9956b31 Merge branch 'ethtool-rss-minor-fixes-for-recent-rss-changes'
Jakub Kicinski says:

====================
ethtool: rss: minor fixes for recent RSS changes

Make sure RSS_GET messages are consistent in do and dump.
Fix up a recently added safety check for RSS + queue offset.
Adjust related tests so that they pass on devices which
don't support RSS + queue offset.
====================

Link: https://patch.msgid.link/20250201013040.725123-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-03 18:39:23 -08:00
Jakub Kicinski
c3da585509 selftests: drv-net: rss_ctx: don't fail reconfigure test if queue offset not supported
Vast majority of drivers does not support queue offset.
Simply return if the rss context + queue ntuple fails.

Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250201013040.725123-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-03 18:39:23 -08:00
Jakub Kicinski
de379dfd9a selftests: drv-net: rss_ctx: add missing cleanup in queue reconfigure
Commit under Fixes adds ntuple rules but never deletes them.

Fixes: 29a4bc1fe9 ("selftest: extend test_rss_context_queue_reconfigure for action addition")
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250201013040.725123-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-03 18:39:23 -08:00
Jakub Kicinski
2b91cc1214 ethtool: ntuple: fix rss + ring_cookie check
The info.flow_type is for RXFH commands, ntuple flow_type is inside
the flow spec. The check currently does nothing, as info.flow_type
is 0 (or even uninitialized by user space) for ETHTOOL_SRXCLSRLINS.

Fixes: 9e43ad7a1e ("net: ethtool: only allow set_rxnfc with rss + ring_cookie if driver opts in")
Reviewed-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250201013040.725123-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-03 18:38:58 -08:00
Jakub Kicinski
244f8aa46f ethtool: rss: fix hiding unsupported fields in dumps
Commit ec6e57beaf ("ethtool: rss: don't report key if device
doesn't support it") intended to stop reporting key fields for
additional rss contexts if device has a global hashing key.

Later we added dump support and the filtering wasn't properly
added there. So we end up reporting the key fields in dumps
but not in dos:

  # ./pyynl/cli.py --spec netlink/specs/ethtool.yaml --do rss-get \
		--json '{"header": {"dev-index":2}, "context": 1 }'
  {
     "header": { ... },
     "context": 1,
     "indir": [0, 1, 2, 3, ...]]
  }

  # ./pyynl/cli.py --spec netlink/specs/ethtool.yaml --dump rss-get
  [
     ... snip context 0 ...
     { "header": { ... },
       "context": 1,
       "indir": [0, 1, 2, 3, ...],
 ->    "input_xfrm": 255,
 ->    "hfunc": 1,
 ->    "hkey": "000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"
     }
  ]

Hide these fields correctly.

The drivers/net/hw/rss_ctx.py selftest catches this when run on
a device with single key, already:

  # Check| At /root/./ksft-net-drv/drivers/net/hw/rss_ctx.py, line 381, in test_rss_context_dump:
  # Check|     ksft_ne(set(data.get('hkey', [1])), {0}, "key is all zero")
  # Check failed {0} == {0} key is all zero
  not ok 8 rss_ctx.test_rss_context_dump

Fixes: f6122900f4 ("ethtool: rss: support dumping RSS contexts")
Reviewed-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250201013040.725123-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-03 18:38:52 -08:00
Yu-Chun Lin
1b0332a426 kthread: Fix return value on kzalloc() failure in kthread_affine_preferred()
kthread_affine_preferred() incorrectly returns 0 instead of -ENOMEM
when kzalloc() fails. Return 'ret' to ensure the correct error code is
propagated.

Fixes: 4d13f4304f ("kthread: Implement preferred affinity")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202501301528.t0cZVbnq-lkp@intel.com/
Signed-off-by: Yu-Chun Lin <eleanor15x@gmail.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2025-02-04 01:42:27 +01:00
Linus Torvalds
0de63bb7d9 Fix a braino in d_revalidate series
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCZ6Ey3AAKCRBZ7Krx/gZQ
 61CzAP9fn68btv0R7pFJjbw5yAnrQHRDJ1i7PjI2NKLz05AwnQD/eD5oNakTbqMd
 LRCNb1awaEaEwnv4kf4vHEi9ykBEXQE=
 =0QhL
 -----END PGP SIGNATURE-----

Merge tag 'pull-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull d_revalidate fix from Al Viro:
 "Fix a braino in d_revalidate series: check ->d_op for NULL"

* tag 'pull-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fix braino in "9p: fix ->rename_sem exclusion"
2025-02-03 13:39:55 -08:00
Jakub Kicinski
0e6dc66b5c Merge branch 'maintainers-recognize-kuniyuki-iwashima-as-a-maintainer'
Jakub Kicinski says:

====================
MAINTAINERS: recognize Kuniyuki Iwashima as a maintainer

Kuniyuki Iwashima has been a prolific contributor and trusted reviewer
for some core portions of the networking stack for a couple of years now.
Formalize some obvious areas of his expertise and list him as a maintainer.
====================

Link: https://patch.msgid.link/20250202014728.1005003-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-03 13:27:53 -08:00
Jakub Kicinski
8a2e22f665 MAINTAINERS: add entry for UNIX sockets
Add a MAINTAINERS entry for UNIX socket, Kuniyuki has been
the de-facto maintainer of this code for a while.

Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250202014728.1005003-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-03 13:27:51 -08:00
Jakub Kicinski
ae0585b04a MAINTAINERS: add a general entry for BSD sockets
Create a MAINTAINERS entry for BSD sockets. List the top 3
reviewers as maintainers. The entry is meant to cover core
socket code (of which there isn't much) but also reviews
of any new socket families.

Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250202014728.1005003-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-03 13:27:50 -08:00