Some instructions update the CPU execution mode, which requires the
emulation mode to be updated as well.
Extract this code, and make assign_eip_far use it.
assign_eip_far now reads CS, instead of getting it via a parameter,
which is ok, because callers always assign CS to the same value
before calling this function.
No functional change is intended.
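
As a rough illustration of the kind of mode recalculation being extracted, here is a standalone C model. The enum, the struct, and the name recalc_mode() are hypothetical stand-ins rather than the emulator's actual definitions; the sketch only assumes the usual architectural inputs (CR0.PE, EFLAGS.VM, EFER.LMA, CS.L, CS.D).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the emulator's mode values. */
enum emul_mode { MODE_REAL, MODE_VM86, MODE_PROT16, MODE_PROT32, MODE_PROT64 };

/* Hypothetical snapshot of the state that determines the CPU mode. */
struct cpu_state {
	bool cr0_pe;    /* CR0.PE: protected mode enabled */
	bool eflags_vm; /* EFLAGS.VM: virtual-8086 mode */
	bool efer_lma;  /* EFER.LMA: long mode active */
	bool cs_l;      /* CS.L: 64-bit code segment */
	bool cs_d;      /* CS.D: default operand size is 32 bits */
};

/* Derive the emulation mode from the current register state. */
static enum emul_mode recalc_mode(const struct cpu_state *s)
{
	assert(s != NULL);

	if (!s->cr0_pe)
		return MODE_REAL;
	if (s->eflags_vm)
		return MODE_VM86;
	if (s->efer_lma && s->cs_l)
		return MODE_PROT64;
	return s->cs_d ? MODE_PROT32 : MODE_PROT16;
}
```

Because the helper reads CS from the state itself, a caller only needs to load CS first and then recompute the mode, which mirrors the calling convention described above.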
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20221025124741.228045-12-mlevitsk@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
SYSEXIT is one of the instructions that can change the
processor mode, thus ctxt->mode should be updated after it.
Note that this is likely a benign bug, because the only problematic
mode change is from 32 bit to 64 bit, which can lead to truncation of RIP,
and that change is not possible with sysexit, since sysexit running in
32 bit mode is limited to its 32 bit version.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20221025124741.228045-11-mlevitsk@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tag "guest_saw_irq" as "volatile" to ensure that the compiler will never
optimize away lookups. Relying on the compiler thinking that the flag
is global and thus might change also works, but it's subtle, less robust,
and looks like a bug at first glance, e.g. risks being "fixed" and
breaking the test.
Make the flag "static" as well since convincing the compiler it's global
is no longer necessary.
Alternatively, the flag could be accessed with {READ,WRITE}_ONCE(), but
literally every access would need the wrappers, and eking out performance
isn't exactly top priority for selftests.
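
A minimal userspace sketch of the same idea, assuming a signal handler stands in for the guest's interrupt handler. The names saw_irq, fake_irq_handler() and wait_for_fake_irq() are made up for illustration and are not taken from the selftest.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/* static + volatile: file-local, yet every access really touches memory. */
static volatile sig_atomic_t saw_irq;

/* Stand-in for the interrupt handler that sets the flag asynchronously. */
static void fake_irq_handler(int sig)
{
	(void)sig;
	saw_irq = 1;
}

/* Hypothetical helper: deliver a fake "interrupt" and wait for the flag. */
static bool wait_for_fake_irq(void)
{
	saw_irq = 0;
	if (signal(SIGINT, fake_irq_handler) == SIG_ERR)
		return false;
	if (raise(SIGINT) != 0)
		return false;
	while (!saw_irq)
		; /* the load cannot be hoisted out of the loop */
	return true;
}
```

Without the volatile qualifier, a compiler that can prove nothing in the loop writes the flag would be free to read it once and spin forever.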
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221013211234.1318131-17-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tests for races between shinfo_cache (de)activation and hypercall+ioctl()
processing. KVM has had bugs where activating the shared info cache
multiple times and/or with concurrent users results in lock corruption,
NULL pointer dereferences, and other fun.
For the timer injection testcase (#22), re-arm the timer until the IRQ
is successfully injected. If the timer expires while the shared info
is deactivated (invalid), KVM will drop the event.
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221013211234.1318131-16-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
`hostname` must be set to NULL after it is freed in the
`cifs_put_tcp_session` function; otherwise, when the `cifsd` thread
attempts to resolve the hostname and reconnect to the host, it
dereferences the stale pointer.
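
The free-then-clear pattern can be sketched in plain userspace C as follows. The struct and both helpers are hypothetical simplifications (free() stands in for kfree(), and no locking is modelled); they only illustrate why the NULL check on the reconnect side needs the pointer to be cleared.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for the server structure. */
struct server_info {
	char *hostname;
};

/* Release the hostname and clear the pointer so later checks see NULL. */
static void put_hostname(struct server_info *server)
{
	assert(server != NULL);
	free(server->hostname);
	server->hostname = NULL;
}

/* Mirrors the shape of the reconnect-side check: 0 if usable, -1 if not. */
static int hostname_usable(const struct server_info *server)
{
	if (!server->hostname)
		return -1;
	if (server->hostname[0] == '\0')
		return -1;
	return 0;
}
```

If put_hostname() only freed the buffer, hostname_usable() would pass the first check and read freed memory in the second one.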
Here is one practical backtrace example for reference:
Task 477
---------------------------
do_mount
path_mount
do_new_mount
vfs_get_tree
smb3_get_tree
smb3_get_tree_common
cifs_smb3_do_mount
cifs_mount
mount_put_conns
cifs_put_tcp_session
--> kfree(server->hostname)
cifsd
---------------------------
kthread
cifs_demultiplex_thread
cifs_reconnect
reconn_set_ipaddr_from_hostname
--> if (!server->hostname)
--> if (server->hostname[0] == '\0') // !! UAF fault here
CIFS: VFS: cifs_mount failed w/return code = -112
mount error(112): Host is down
BUG: KASAN: use-after-free in reconn_set_ipaddr_from_hostname+0x2ba/0x310
Read of size 1 at addr ffff888108f35380 by task cifsd/480
CPU: 2 PID: 480 Comm: cifsd Not tainted 6.1.0-rc2-00106-gf705792f89dd-dirty #25
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x68/0x85
print_report+0x16c/0x4a3
kasan_report+0x95/0x190
reconn_set_ipaddr_from_hostname+0x2ba/0x310
__cifs_reconnect.part.0+0x241/0x800
cifs_reconnect+0x65f/0xb60
cifs_demultiplex_thread+0x1570/0x2570
kthread+0x2c5/0x380
ret_from_fork+0x22/0x30
</TASK>
Allocated by task 477:
kasan_save_stack+0x1e/0x40
kasan_set_track+0x21/0x30
__kasan_kmalloc+0x7e/0x90
__kmalloc_node_track_caller+0x52/0x1b0
kstrdup+0x3b/0x70
cifs_get_tcp_session+0xbc/0x19b0
mount_get_conns+0xa9/0x10c0
cifs_mount+0xdf/0x1970
cifs_smb3_do_mount+0x295/0x1660
smb3_get_tree+0x352/0x5e0
vfs_get_tree+0x8e/0x2e0
path_mount+0xf8c/0x1990
do_mount+0xee/0x110
__x64_sys_mount+0x14b/0x1f0
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Freed by task 477:
kasan_save_stack+0x1e/0x40
kasan_set_track+0x21/0x30
kasan_save_free_info+0x2a/0x50
__kasan_slab_free+0x10a/0x190
__kmem_cache_free+0xca/0x3f0
cifs_put_tcp_session+0x30c/0x450
cifs_mount+0xf95/0x1970
cifs_smb3_do_mount+0x295/0x1660
smb3_get_tree+0x352/0x5e0
vfs_get_tree+0x8e/0x2e0
path_mount+0xf8c/0x1990
do_mount+0xee/0x110
__x64_sys_mount+0x14b/0x1f0
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
The buggy address belongs to the object at ffff888108f35380
which belongs to the cache kmalloc-16 of size 16
The buggy address is located 0 bytes inside of
16-byte region [ffff888108f35380, ffff888108f35390)
The buggy address belongs to the physical page:
page:00000000333f8e58 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888108f350e0 pfn:0x108f35
flags: 0x200000000000200(slab|node=0|zone=2)
raw: 0200000000000200 0000000000000000 dead000000000122 ffff8881000423c0
raw: ffff888108f350e0 000000008080007a 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff888108f35280: fa fb fc fc fa fb fc fc fa fb fc fc fa fb fc fc
ffff888108f35300: fa fb fc fc fa fb fc fc fa fb fc fc fa fb fc fc
>ffff888108f35380: fa fb fc fc fa fb fc fc fa fb fc fc fa fb fc fc
^
ffff888108f35400: fa fb fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff888108f35480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
Fixes: 7be3248f31 ("cifs: To match file servers, make sure the server hostname matches")
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
- Extend Wa_1607297627 to Alderlake-P (José Roberto de Souza)
- Keep PCI autosuspend control 'on' by default on all dGPU (Anshuman Gupta)
- Reset frl trained flag before restarting FRL training (Ankit Nautiyal)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Y1o+teE2Z11pT1MN@tursulin-desk
Commit 78e5a33994 ("cpumask: fix checking valid cpu range") has
started issuing warnings[*] when cpu indices equal to nr_cpu_ids - 1
are passed to cpumask_next* functions. seq_read_iter() and cpuinfo's
start and next seq operations implement a pattern like
n = cpumask_next(n - 1, mask);
show(n);
while (1) {
++n;
n = cpumask_next(n - 1, mask);
if (n >= nr_cpu_ids)
break;
show(n);
}
which will issue the warning when reading /proc/cpuinfo. Ensure no
warning is generated by validating the cpu index before calling
cpumask_next().
[*] Warnings will only appear with DEBUG_PER_CPU_MAPS enabled.
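
A simplified model of the guarded iteration, for illustration only. NR_IDS, mask_next() and range_warnings are hypothetical stand-ins for nr_cpu_ids, cpumask_next() and the debug warning; the warning condition is modelled on the description above, not copied from the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define NR_IDS 8u /* hypothetical stand-in for nr_cpu_ids */

static unsigned int range_warnings; /* counts simulated range warnings */

/*
 * Simplified "next set bit after n" helper that complains when asked to
 * search past the last valid index. n == -1 starts the search at bit 0.
 */
static unsigned int mask_next(int n, uint32_t mask)
{
	assert(n >= -1);
	if (n != -1 && (unsigned int)n >= NR_IDS - 1)
		range_warnings++;
	for (unsigned int i = (unsigned int)(n + 1); i < NR_IDS; i++)
		if (mask & (1u << i))
			return i;
	return NR_IDS;
}

/* Visit every set bit, validating the index before each lookup. */
static unsigned int count_visited(uint32_t mask)
{
	unsigned int visited = 0;
	unsigned int n = 0;

	while (n < NR_IDS) { /* the check that keeps the warning away */
		n = mask_next((int)n - 1, mask);
		if (n >= NR_IDS)
			break;
		visited++; /* stands in for show(n) */
		n++;
	}
	return visited;
}
```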
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Acked-by: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20221014155845.1986223-2-ajones@ventanamicro.com/
Fixes: 78e5a33994 ("cpumask: fix checking valid cpu range")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Conor Dooley <conor@kernel.org> says:
From: Conor Dooley <conor.dooley@microchip.com>
This came up due to a report from Kevin @ kernel-ci, who had been
running a mixed configuration of GNU binutils and clang. Their compiler
was relatively recent & supports Zicbom but binutils @ 2.35.2 did not.
Our current checks for extension support only cover the compiler, but it
appears to me that we need to check both the compiler & linker support
in case of "pot-luck" configurations that mix different versions of
LD,AS,CC etc.
Linker support does not seem possible to actually check, since the ISA
string is emitted into the object files - so I put in version checks for
that. The checks have gotten a bit ugly since 32 & 64 bit support need
to be checked independently but ahh well.
As I was going, I fell into the trap of there being duplicated checks
for CC support in both the Makefile and Kconfig, so as part of renaming
the Kconfig symbol to TOOLCHAIN_HAS_FOO, I dropped the extra checks in
the Makefile. This has the added advantage of the TOOLCHAIN_HAS_FOO
symbol for Zihintpause appearing in .config.
I pushed out a version of this that specifically checked for assembler
support for LKP to test & it looked /okay/ - but I did some more testing
today and realised that this is redundant & have since dropped the as
check.
I tested locally with a fair few different combinations, to try and
cover each of AS, LD, CC missing support for the extension.
* b4-shazam-merge:
riscv: fix detection of toolchain Zihintpause support
riscv: fix detection of toolchain Zicbom support
Link: https://lore.kernel.org/r/20221006173520.1785507-1-conor@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
It is not sufficient to check if a toolchain supports a particular
extension without checking if the linker supports that extension
too. For example, Clang 15 supports Zihintpause but GNU binutils
2.35.2 does not, leading to build errors like so:
riscv64-linux-gnu-ld: -march=rv64i2p0_m2p0_a2p0_c2p0_zihintpause2p0: Invalid or unknown z ISA extension: 'zihintpause'
Add a TOOLCHAIN_HAS_ZIHINTPAUSE which checks if each of the compiler,
assembler and linker support the extension. Replace the ifdef in the
vdso with one depending on this new symbol.
Fixes: 8eb060e101 ("arch/riscv: add Zihintpause support")
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20221006173520.1785507-3-conor@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
It is not sufficient to check if a toolchain supports a particular
extension without checking if the linker supports that extension too.
For example, Clang 15 supports Zicbom but GNU binutils 2.35.2 does
not, leading to build errors like so:
riscv64-linux-gnu-ld: -march=rv64i2p0_m2p0_a2p0_c2p0_zicbom1p0_zihintpause2p0: Invalid or unknown z ISA extension: 'zicbom'
Convert CC_HAS_ZICBOM to TOOLCHAIN_HAS_ZICBOM & check if the linker
also supports Zicbom.
Reported-by: Kevin Hilman <khilman@baylibre.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/1714
Link: https://storage.kernelci.org/next/master/next-20220920/riscv/defconfig+CONFIG_EFI=n/clang-16/logs/kernel.log
Fixes: 1631ba1259 ("riscv: Add support for non-coherent devices using zicbom extension")
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20221006173520.1785507-2-conor@kernel.org
[Palmer: Check for ld-2.38, not 2.39, as 2.38 no longer errors.]
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Hi Atish,
It seems that the panic is due to the missing memcpy during kasan_init.
Could you please check whether this patch is helpful?
When doing kasan_populate, the new allocated base_pud/base_p4d should
contain kasan_early_shadow_{pud, p4d}'s content. Add the missing memcpy
to avoid page fault when read/write kasan shadow region.
Tested on:
- qemu with sv57 and CONFIG_KASAN on.
- qemu with sv48 and CONFIG_KASAN on.
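
The missing step can be pictured with a small userspace model. ENTRIES, entry_t, early_shadow_table and alloc_table_from_early() are hypothetical names for this sketch; it only shows that a freshly allocated table has to inherit the early table's entries before it replaces it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ENTRIES 512 /* hypothetical number of entries per table */

typedef uint64_t entry_t;

/* Stand-in for an early shadow table that is already populated. */
static entry_t early_shadow_table[ENTRIES];

/* Allocate a replacement table that starts out as a copy of the early one. */
static entry_t *alloc_table_from_early(void)
{
	entry_t *table = malloc(sizeof(early_shadow_table));

	if (!table)
		return NULL;
	/* Without this copy the new table would hold garbage entries. */
	memcpy(table, early_shadow_table, sizeof(early_shadow_table));
	return table;
}
```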
Signed-off-by: Qinglin Pan <panqinglin2020@iscas.ac.cn>
Tested-by: Atish Patra <atishp@rivosinc.com>
Fixes: 8fbdccd2b1 ("riscv: mm: Support kasan for sv57")
Link: https://lore.kernel.org/r/20221009083050.3814850-1-panqinglin2020@iscas.ac.cn
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Current release - regressions:
- ipa: fix bugs in the register conversion for IPA v3.1 and v3.5.1
Current release - new code bugs:
- mptcp: fix abba deadlock on fastopen
- eth: stmmac: rk3588: allow multiple gmac controllers in one system
Previous releases - regressions:
- ip: rework the fix for dflt addr selection for connected nexthop
- net: couple more fixes for misinterpreting bits in struct page after
the signature was added
Previous releases - always broken:
- ipv6: ensure sane device mtu in tunnels
- openvswitch: switch from WARN to pr_warn on a user-triggerable path
- ethtool: eeprom: fix null-deref on genl_info in dump
- ieee802154: more return code fixes for corner cases in dgram_sendmsg
- mac802154: fix link-quality-indicator recording
- eth: mlx5: fixes for IPsec, PTP timestamps, OvS and conntrack offload
- eth: fec: limit register access on i.MX6UL
- eth: bcm4908_enet: update TX stats after actual transmission
- can: rcar_canfd: improve IRQ handling for RZ/G2L
Misc:
- genetlink: piggy back on the newly added resv_op_start to enforce
more sanity checks on new commands
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Merge tag 'net-6.1-rc3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from 802.15.4 (Zigbee et al).
Current release - regressions:
- ipa: fix bugs in the register conversion for IPA v3.1 and v3.5.1
Current release - new code bugs:
- mptcp: fix abba deadlock on fastopen
- eth: stmmac: rk3588: allow multiple gmac controllers in one system
Previous releases - regressions:
- ip: rework the fix for dflt addr selection for connected nexthop
- net: couple more fixes for misinterpreting bits in struct page
after the signature was added
Previous releases - always broken:
- ipv6: ensure sane device mtu in tunnels
- openvswitch: switch from WARN to pr_warn on a user-triggerable path
- ethtool: eeprom: fix null-deref on genl_info in dump
- ieee802154: more return code fixes for corner cases in
dgram_sendmsg
- mac802154: fix link-quality-indicator recording
- eth: mlx5: fixes for IPsec, PTP timestamps, OvS and conntrack
offload
- eth: fec: limit register access on i.MX6UL
- eth: bcm4908_enet: update TX stats after actual transmission
- can: rcar_canfd: improve IRQ handling for RZ/G2L
Misc:
- genetlink: piggy back on the newly added resv_op_start to enforce
more sanity checks on new commands"
* tag 'net-6.1-rc3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (57 commits)
net: enetc: survive memory pressure without crashing
kcm: do not sense pfmemalloc status in kcm_sendpage()
net: do not sense pfmemalloc status in skb_append_pagefrags()
net/mlx5e: Fix macsec sci endianness at rx sa update
net/mlx5e: Fix wrong bitwise comparison usage in macsec_fs_rx_add_rule function
net/mlx5e: Fix macsec rx security association (SA) update/delete
net/mlx5e: Fix macsec coverity issue at rx sa update
net/mlx5: Fix crash during sync firmware reset
net/mlx5: Update fw fatal reporter state on PCI handlers successful recover
net/mlx5e: TC, Fix cloned flow attr instance dests are not zeroed
net/mlx5e: TC, Reject forwarding from internal port to internal port
net/mlx5: Fix possible use-after-free in async command interface
net/mlx5: ASO, Create the ASO SQ with the correct timestamp format
net/mlx5e: Update restore chain id for slow path packets
net/mlx5e: Extend SKB room check to include PTP-SQ
net/mlx5: DR, Fix matcher disconnect error flow
net/mlx5: Wait for firmware to enable CRS before pci_restore_state
net/mlx5e: Do not increment ESN when updating IPsec ESN state
netdevsim: remove dir in nsim_dev_debugfs_init() when creating ports dir failed
netdevsim: fix memory leak in nsim_drv_probe() when nsim_dev_resources_register() failed
...
- Fix an ancient signal action copy race. (Bernd Edlinger)
- Fix a memory leak in ELF loader, when under memory pressure. (Li Zetao)
Merge tag 'execve-v6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull execve fixes from Kees Cook:
- Fix an ancient signal action copy race (Bernd Edlinger)
- Fix a memory leak in ELF loader, when under memory pressure (Li
Zetao)
* tag 'execve-v6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
fs/binfmt_elf: Fix memory leak in load_elf_binary()
exec: Copy oldsighand->action under spin-lock
When holding a delegation, the NFS client optimizes away setting the
attributes of a file from the GETATTR in the compound after CLONE, and for
a zero-length CLONE we will end up setting the inode's size to zero in
nfs42_copy_dest_done(). Handle this case by computing the resulting count
from the server's reported size after CLONE's GETATTR.
Suggested-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Fixes: 94d202d5ca ("NFSv42: Copy offload should update the file size when appropriate")
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
If a zero length is passed to kmalloc() it returns 0x10, which is
not a valid address. gss_unwrap_resp_integ() subsequently crashes
when it attempts to dereference that pointer.
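
The message does not say how the crash is avoided, so the following is only one plausible shape of a guard, written as a userspace sketch. FAKE_ZERO_SIZE_PTR, fake_kmalloc() and read_first_byte() are invented for illustration; the fake allocator merely imitates the zero-length behaviour described above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the special value returned for zero-size requests. */
#define FAKE_ZERO_SIZE_PTR ((void *)16)

/* Hypothetical allocator that mimics kmalloc()'s zero-length behaviour. */
static void *fake_kmalloc(size_t len)
{
	return len ? malloc(len) : FAKE_ZERO_SIZE_PTR;
}

/* Reject a zero length up front instead of dereferencing the result. */
static int read_first_byte(size_t len, unsigned char *out)
{
	unsigned char *buf;

	assert(out != NULL);
	if (len == 0)
		return -EINVAL;
	buf = fake_kmalloc(len);
	if (!buf)
		return -ENOMEM;
	buf[0] = 0x5a;
	*out = buf[0];
	free(buf);
	return 0;
}
```

Note that a plain NULL check would not catch this case, because the zero-size value is non-NULL.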
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
There's a small window where a LOCK sent during a delegation return can
race with another OPEN on the client, but the open stateid has not yet been
updated. In this case, the client doesn't handle the OLD_STATEID error
from the server and will lose this lock, emitting:
"NFS: nfs4_handle_delegation_recall_error: unhandled error -10024".
Fix this by sending the task through the nfs4 error handling in
nfs4_lock_done() when we may have to reconcile our stateid with what the
server believes it to be. For this case, the result is a retry of the
LOCK operation with the updated stateid.
Reported-by: Gonzalo Siero Humet <gsierohu@redhat.com>
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
There is a null-ptr-deref when the xps sysfs allocation fails:
BUG: KASAN: null-ptr-deref in sysfs_do_create_link_sd+0x40/0xd0
Read of size 8 at addr 0000000000000030 by task gssproxy/457
CPU: 5 PID: 457 Comm: gssproxy Not tainted 6.0.0-09040-g02357b27ee03 #9
Call Trace:
<TASK>
dump_stack_lvl+0x34/0x44
kasan_report+0xa3/0x120
sysfs_do_create_link_sd+0x40/0xd0
rpc_sysfs_client_setup+0x161/0x1b0
rpc_new_client+0x3fc/0x6e0
rpc_create_xprt+0x71/0x220
rpc_create+0x1d4/0x350
gssp_rpc_create+0xc3/0x160
set_gssp_clnt+0xbc/0x140
write_gssp+0x116/0x1a0
proc_reg_write+0xd6/0x130
vfs_write+0x177/0x690
ksys_write+0xb9/0x150
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x46/0xb0
When the xprt_switch sysfs allocation fails, the xprt and switch sysfs
entries should not be added to it, since doing so can cause a
null-ptr-deref. Also initialize 'xps_sysfs' to NULL to avoid an oops
when destroying it.
Fixes: 2a338a5431 ("sunrpc: add a symlink from rpc-client directory to the xprt_switch")
Fixes: d408ebe04a ("sunrpc: add add sysfs directory per xprt under each xprt_switch")
Fixes: baea99445d ("sunrpc: add xprt_switch direcotry to sunrpc's sysfs")
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Currently, we are only guaranteed to send RECLAIM_COMPLETE if we have
open state to recover. Fix the client to always send RECLAIM_COMPLETE
after setting up the lease.
Fixes: fce5c838e1 ("nfs41: RECLAIM_COMPLETE functionality")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
If RECLAIM_COMPLETE sets the NFS4CLNT_BIND_CONN_TO_SESSION flag, then we
need to loop back in order to handle it.
Fixes: 0048fdd066 ("NFSv4.1: RECLAIM_COMPLETE must handle NFS4ERR_CONN_NOT_BOUND_TO_SESSION")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
If the server reboots while we are engaged in a delegation return, and
there is a pNFS layout with return-on-close set, then the current code
can end up deadlocking in pnfs_roc() when nfs_inode_set_delegation()
tries to return the old delegation.
Now that delegreturn actually uses its own copy of the stateid, it
should be safe to just always update the delegation stateid in place.
Fixes: 078000d02d ("pNFS: We want return-on-close to complete when evicting the inode")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
The 'nfs_server' and 'mount_server' structures include a union of
'struct sockaddr' (with the older 16 bytes max address size) and
'struct sockaddr_storage' which is large enough to hold all the
supported sa_family types (128 bytes max size). The runtime memcpy()
buffer overflow checker is seeing attempts to write beyond the 16
bytes as an overflow, but the actual expected size is that of 'struct
sockaddr_storage'. Plumb the use of 'struct sockaddr_storage' more
completely through-out NFS, which results in adjusting the memcpy()
buffers to the correct union members. Avoids this false positive run-time
warning under CONFIG_FORTIFY_SOURCE:
memcpy: detected field-spanning write (size 28) of single field "&ctx->nfs_server.address" at fs/nfs/namespace.c:178 (size 16)
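
A small userspace sketch of copying through the larger union member. The union name and set_address() are hypothetical; the point is only that the destination named in memcpy() should be the member whose size matches what may be written.

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical stand-in for the union embedded in the NFS structures. */
union server_address {
	struct sockaddr sa;         /* small, legacy-sized view */
	struct sockaddr_storage ss; /* large enough for every family */
};

/* Copy an address of any supported family through the storage member. */
static int set_address(union server_address *dst,
		       const struct sockaddr_storage *src, size_t len)
{
	assert(dst != NULL && src != NULL);
	if (len > sizeof(dst->ss))
		return -1;
	memcpy(&dst->ss, src, len); /* destination field is big enough */
	return 0;
}
```

Writing the same bytes via `&dst->sa` would land in the same memory, but a field-aware bounds checker sees a write larger than that single field.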
Reported-by: kernel test robot <yujie.liu@intel.com>
Link: https://lore.kernel.org/all/202210110948.26b43120-yujie.liu@intel.com
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: Anna Schumaker <anna@kernel.org>
Cc: linux-nfs@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Fix the following coccicheck warning:
fs/nfs/dir.c:2494:2-7: WARNING:
NULL check before some freeing functions is not needed.
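
For illustration, the same cleanup in userspace C, where free() stands in for the kernel's freeing helpers (kfree() likewise accepts NULL). The struct and function names are made up and do not correspond to the code at the reported location.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical context whose name pointer may legitimately be NULL. */
struct dir_ctx {
	char *name;
};

/* free(NULL) is a no-op, so no "if (ctx->name)" guard is required. */
static void dir_ctx_release(struct dir_ctx *ctx)
{
	assert(ctx != NULL);
	free(ctx->name);
	ctx->name = NULL;
}
```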
Signed-off-by: Yushan Zhou <katrinzhou@tencent.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Merge tag 'media/v6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
"A bunch of patches addressing issues in the vivid driver and adding
new checks in V4L2 to validate the input parameters from some ioctls"
* tag 'media/v6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
media: vivid.rst: loop_video is set on the capture devnode
media: vivid: set num_in/outputs to 0 if not supported
media: vivid: drop GFP_DMA32
media: vivid: fix control handler mutex deadlock
media: videodev2.h: V4L2_DV_BT_BLANKING_HEIGHT should check 'interlaced'
media: v4l2-dv-timings: add sanity checks for blanking values
media: vivid: dev->bitmap_cap wasn't freed in all cases
media: vivid: s_fbuf: add more sanity checks
enter_exception64() performs an MTE check, which involves dereferencing
vcpu->kvm. While vcpu has already been fixed up to be a HYP VA pointer,
kvm is still a pointer in the kernel VA space.
This only affects nVHE configurations with MTE enabled, as in other
cases, the pointer is either valid (VHE) or not dereferenced (!MTE).
Fix this by first converting kvm to a HYP VA pointer.
Fixes: ea7fc1bb1c ("KVM: arm64: Introduce MTE VM feature")
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Steven Price <steven.price@arm.com>
[maz: commit message tidy-up]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20221027120945.29679-1-ryan.roberts@arm.com
Fix a memory leak that was introduced by a change that went into -rc1.
Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt
Pull fscrypt fix from Eric Biggers:
"Fix a memory leak that was introduced by a change that went into -rc1"
* tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
fscrypt: fix keyring memory leak on mount failure
Under memory pressure, enetc_refill_rx_ring() may fail, and when called
during the enetc_open() -> enetc_setup_rxbdr() procedure, this is not
checked for.
An extreme case of memory pressure will result in exactly zero buffers
being allocated for the RX ring, and in such a case it is expected that
hardware drops all RX packets due to lack of buffers.
This does not happen, because the reset-default value of the consumer
and producer index is 0, and this makes the ENETC think that all buffers
have been initialized and that it owns them (when in reality none were).
The hardware guide explains this best:
| Configure the receive ring producer index register RBaPIR with a value
| of 0. The producer index is initially configured by software but owned
| by hardware after the ring has been enabled. Hardware increments the
| index when a frame is received which may consume one or more BDs.
| Hardware is not allowed to increment the producer index to match the
| consumer index since it is used to indicate an empty condition. The ring
| can hold at most RBLENR[LENGTH]-1 received BDs.
|
| Configure the receive ring consumer index register RBaCIR. The
| consumer index is owned by software and updated during operation of the
| of the BD ring by software, to indicate that any receive data occupied
| in the BD has been processed and it has been prepared for new data.
| - If consumer index and producer index are initialized to the same
| value, it indicates that all BDs in the ring have been prepared and
| hardware owns all of the entries.
| - If consumer index is initialized to producer index plus N, it would
| indicate N BDs have been prepared. Note that hardware cannot start if
| only a single buffer is prepared due to the restrictions described in
| (2).
| - Software may write consumer index to match producer index anytime
| while the ring is operational to indicate all received BDs prior have
| been processed and new BDs prepared for hardware.
Normally, the value of rx_ring->rcir (consumer index) is brought in sync
with the rx_ring->next_to_use software index, but this only happens if
page allocation ever succeeded.
When PI==CI==0, the hardware appears to receive frames and write them to
DMA address 0x0 (?!), then set the READY bit in the BD.
The enetc_clean_rx_ring() function (and its XDP derivative) is naturally
not prepared to handle such a condition. It will attempt to process
those frames using the rx_swbd structure associated with index i of the
RX ring, but that structure is not fully initialized (enetc_new_page()
does all of that). So what happens next is undefined behavior.
To operate using no buffer, we must initialize the CI to PI + 1, which
will block the hardware from advancing the PI any further, and drop
everything.
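
The index rule quoted from the hardware guide can be modelled in a few lines of C. This is a toy model with hypothetical names (rx_ring_model, hw_receive_frame(), ring_set_empty()), not the driver's code or register layout; it only encodes "hardware may not move the producer index onto the consumer index".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an RX ring's index registers. */
struct rx_ring_model {
	unsigned int len; /* number of BDs in the ring */
	unsigned int pi;  /* producer index, advanced by hardware */
	unsigned int ci;  /* consumer index, written by software */
};

/* Hardware may not advance PI so that it would equal CI. */
static bool hw_receive_frame(struct rx_ring_model *r)
{
	unsigned int next;

	assert(r != NULL && r->len > 1);
	next = (r->pi + 1) % r->len;
	if (next == r->ci)
		return false; /* frame dropped: no buffer available */
	r->pi = next;
	return true;
}

/* Publish "no buffers prepared" by parking CI one slot ahead of PI. */
static void ring_set_empty(struct rx_ring_model *r)
{
	assert(r != NULL && r->len > 1);
	r->ci = (r->pi + 1) % r->len;
}
```

In this model, PI == CI == 0 lets the simulated hardware accept frames even though nothing was prepared, while parking CI at PI + 1 makes every frame get dropped.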
The issue was seen while adding support for zero-copy AF_XDP sockets,
where buffer memory comes from user space, which can even decide to
supply no buffers at all (example: "xdpsock --txonly"). However, the bug
is present also with the network stack code, even though it would take a
very determined person to trigger a page allocation failure at the
perfect time (a series of ifup/ifdown under memory pressure should
eventually reproduce it given enough retries).
Fixes: d4fd0404c1 ("enetc: Introduce basic PF and VF ENETC ethernet drivers")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Link: https://lore.kernel.org/r/20221027182925.3256653-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Similar to changes done in TCP in blamed commit.
We should not sense pfmemalloc status in sendpage() methods.
Fixes: 3261400639 ("tcp: TX zerocopy should not sense pfmemalloc status")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20221027040637.1107703-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In the rx sa update operation, the cited commit passes the sci object
attribute in the wrong endianness, not as expected by the HW. This
effectively creates a malformed hw sa context when an rx sa is updated;
consequently, the HW produces unexpected MACsec packets that use this
sa.
Fix by passing sci to create macsec object with the correct endianness,
while at it add __force u64 to prevent sparse check error of type
"sparse: error: incorrect type in assignment".
Fixes: aae3454e4d ("net/mlx5e: Add MACsec offload Rx command support")
Signed-off-by: Raed Salem <raeds@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-16-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The cited commit produces a sparse check error of type
"sparse: error: restricted __be64 degrades to integer". The
offending line wrongly performed a bitwise operation between two
operands of different storage sizes, one 64 bits wide and the other
16 bits wide, which caused the above sparse error. Furthermore, using
a bitwise operation here is wrong in the first place, as the constant
MACSEC_PORT_ES is not a bit field.
Fix by using the right mask to get the lower 16 bits of the sci number,
and use the comparison operator '==' instead of the bitwise '&' operator.
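
The difference between the two operators can be shown with a small sketch. EXAMPLE_PORT_ES is a hypothetical constant chosen for this illustration, and the sketch deliberately ignores byte order, which the real code has to handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_PORT_ES 0x0001u /* hypothetical constant, not a bit flag */

/* Buggy shape: '&' matches any value that shares a set bit. */
static bool port_matches_bitwise(uint64_t sci)
{
	return (sci & EXAMPLE_PORT_ES) != 0;
}

/* Fixed shape: mask out the low 16 bits, then compare for equality. */
static bool port_matches_equal(uint64_t sci)
{
	return (uint16_t)(sci & 0xffffu) == EXAMPLE_PORT_ES;
}
```

A port value of 0x0003 passes the bitwise test although it is not equal to the constant, which is exactly the class of false match that the equality comparison rules out.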
Fixes: 3b20949cb2 ("net/mlx5e: Add MACsec RX steering rules")
Signed-off-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-15-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The cited commit adds support for updating/deleting a MACsec Rx SA.
Naturally, these operations need to check that the SA in question exists
before updating/deleting it, and return an error code otherwise. However,
they do just the opposite, i.e. return an error if the SA exists.
Fix by changing the check to return an error when the SA in question does
not exist, and adjust the error message and code accordingly.
Fixes: aae3454e4d ("net/mlx5e: Add MACsec offload Rx command support")
Signed-off-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-14-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The cited commit at update rx sa operation passes object attributes
to MACsec object create function without initializing/setting all
attributes fields leaving some of them with garbage values, therefore
violating the implicit assumption at create object function, which
assumes that all input object attributes fields are set.
Fix by initializing the object attributes struct to zero, thus leaving
unset fields with the legal zero value.
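A minimal sketch of the pattern, assuming a made-up attributes struct
(struct example_obj_attrs and its fields are hypothetical and do not
mirror the driver's struct): zero the whole struct first, then set only
the fields that the update operation knows about.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical attributes struct, for illustration only. */
struct example_obj_attrs {
	uint64_t sci;
	uint32_t next_pn;
	uint32_t epn_msb;
	uint8_t epn_enabled;
};

/* Build attributes for an update that only knows next_pn. */
static struct example_obj_attrs example_make_update_attrs(uint32_t next_pn)
{
	struct example_obj_attrs attrs;

	/* Without this, every field not set below holds stack garbage. */
	memset(&attrs, 0, sizeof(attrs));
	attrs.next_pn = next_pn;
	return attrs;
}
```

An empty or zero initializer at the declaration has the same effect on
the named fields; memset() is used here only to make the step explicit.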
Fixes: aae3454e4d ("net/mlx5e: Add MACsec offload Rx command support")
Signed-off-by: Raed Salem <raeds@nvidia.com>
Reviewed-by: Lior Nahmanson <liorna@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-13-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When setting Bluefield to DPU NIC mode using the mlxconfig tool + sync
firmware reset flow, we run into a scenario where the host was not
eswitch manager at the time of mlx5 driver load but becomes eswitch manager
after the sync firmware reset flow. This results in null pointer
access of mpfs structure during mac filter add. This change prevents null
pointer access but mpfs table entries will not be added.
Fixes: 5ec697446f ("net/mlx5: Add support for devlink reload action fw activate")
Signed-off-by: Suresh Devarakonda <ramad@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Bodong Wang <bodong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-12-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Updating the devlink health fw fatal reporter state to "healthy" requires
explicitly calling devlink_health_reporter_state_update() after recovery
was done by the PCI error handler. This is needed when the fw_fatal
reporter was triggered due to a PCI error: poll health is called and sets
the reporter state to error, health recovery fails (since EEH didn't
re-enable the PCI), and the PCI handlers keep on with the recover flow and
succeed later without devlink acknowledgment. Fix this by adding a devlink
state update at the end of the PCI handler recovery process.
Fixes: 6181e5cb75 ("devlink: add support for reporter recovery completion")
Signed-off-by: Roy Novich <royno@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-11-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
On multi table split, the driver creates a new attr instance with
data copied from the prev attr instance, zeroing the action flags.
The dests properties also need to be reset to avoid incorrect dests per attr.
Fixes: 8300f22526 ("net/mlx5e: Create new flow attr for multi table actions")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-10-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reject TC rules that forward from internal port to internal port
as it is not supported.
This includes rules that explicitly have an internal port as
the filter device as well as rules that apply on tunnel interfaces,
as the route device for the tunnel interface can be an internal
port.
Fixes: 27484f7170 ("net/mlx5e: Offload tc rules that redirect to ovs internal port")
Signed-off-by: Ariel Levkovich <lariel@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-9-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
mlx5 SQs must select the timestamp format explicitly according to the
active clock mode. Select the current active timestamp mode so ASO SQ
create will succeed.
This fixes the following error prints when trying to create ipsec ASO SQ
while the timestamp format is real time mode:
mlx5_cmd_out_err:778:(pid 34874): CREATE_SQ(0x904) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0xd61c0b), err(-22)
mlx5_aso_create_sq:285:(pid 34874): Failed to open aso wq sq, err=-22
mlx5e_ipsec_init:436:(pid 34874): IPSec initialization failed, -22
Fixes: cdd04f4d4d ("net/mlx5: Add support to create SQ and CQ for ASO")
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reported-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-7-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Currently encap slow path rules just forward to software without
setting the chain id miss register, so driver doesn't restore
the chain, and packets hitting this rule will restart from tc chain
0 instead of continuing to the chain the encap rule was on.
Fix this by setting the chain id miss register to the chain id mapping.
Fixes: 8f1e0b97cc ("net/mlx5: E-Switch, Mark miss packets with new chain id mapping")
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-6-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When tx_port_ts is set, the driver diverts all UDP traffic over PTP port
to a dedicated PTP-SQ. The SKBs are cached until the wire-CQE arrives.
When the packet size is greater than MTU, the firmware might drop it and
the packet won't be transmitted to the wire, hence the wire-CQE won't
reach the driver. In this case the SKBs are accumulated in the SKB fifo.
Add a room check that considers the PTP-SQ SKB fifo. When the SKB fifo is
full, the driver stops the queue, resulting in a TX timeout. Devlink
TX-reporter can recover from it.
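The room check can be sketched as a counter comparison on a ring fifo.
This is an assumption-laden illustration, not the driver's fifo: struct
example_skb_fifo, its fields and example_fifo_has_room() are hypothetical
names, and the counters are assumed to be free-running 16 bit values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical fifo bookkeeping: free-running producer/consumer counters. */
struct example_skb_fifo {
	uint16_t pc;   /* incremented when an SKB is pushed */
	uint16_t cc;   /* incremented when the wire-CQE releases an SKB */
	uint16_t size; /* number of slots in the fifo */
};

/* True while at least one slot is free; safe across counter wraparound. */
static bool example_fifo_has_room(const struct example_skb_fifo *fifo)
{
	return (uint16_t)(fifo->pc - fifo->cc) < fifo->size;
}
```

If wire-CQEs stop arriving, cc stops advancing, the check eventually
returns false and the caller can stop the queue instead of overrunning
the fifo.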
Fixes: 1880bc4e4a ("net/mlx5e: Add TX port timestamp support")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-5-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When a 2nd flow rule arrives, it will merge together with the
1st one if the matcher criteria is the same.
If the merge fails, the driver will roll back the merge contents and
reject the 2nd rule. At the rollback stage, the matcher can't be
disconnected unconditionally, otherwise the 1st rule can't be
hit anymore.
Add logic to check if the matcher should be disconnected or not.
Fixes: cc2295cd54 ("net/mlx5: DR, Improve steering for empty or RX/TX-only matchers")
Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-4-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
After firmware reset, the driver should verify that the firmware already
enabled CRS and became responsive to pci config cycles before restoring
pci state. Fix that by waiting till device_id is readable through PCI again.
Fixes: eabe8e5e88 ("net/mlx5: Handle sync reset now event")
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-3-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
An offloaded SA stops receiving after about 2^32 + replay_window
packets. For example, when SA reaches <seq-hi 0x1, seq 0x2c>, all
subsequent packets get dropped with SA-icv-failure (integrity_failed).
To reproduce the bug:
- ConnectX-6 Dx with crypto enabled (FW 22.30.1004)
- ipsec.conf:
nic-offload = yes
replay-window = 32
esn = yes
salifetime=24h
- Run netperf for a long time to send more than 2^32 packets
netperf -H <device-under-test> -t TCP_STREAM -l 20000
When 2^32 + replay_window packets are received, the replay window
moves from the 2nd half of subspace (overlap=1) to the 1st half
(overlap=0). The driver then updates the 'esn' value in NIC
(i.e. seq_hi) as follows.
seq_hi = xfrm_replay_seqhi(seq_bottom)
new esn in NIC = seq_hi + 1
The +1 increment is wrong, as seq_hi already contains the correct
seq_hi. For example, when seq_hi=1, the driver actually tells NIC to
use seq_hi=2 (esn). This incorrect esn value causes all subsequent
packets to fail integrity checks (SA-icv-failure). So, do not
increment.
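The effect of the extra increment can be sketched with the 64 bit
extended sequence number, whose upper 32 bit take part in the integrity
check when ESN is in use. The helper names below are hypothetical and
the code only models the arithmetic, not the driver or the NIC.

```c
#include <stdint.h>

/* Full 64 bit extended sequence number from its two 32 bit halves. */
static uint64_t example_full_seq(uint32_t seq_hi, uint32_t seq_lo)
{
	return ((uint64_t)seq_hi << 32) | seq_lo;
}

/* Buggy update: seq_hi is already correct, so +1 overshoots. */
static uint32_t example_nic_esn_buggy(uint32_t seq_hi)
{
	return seq_hi + 1;
}

/* Fixed update: program the NIC with seq_hi as is. */
static uint32_t example_nic_esn_fixed(uint32_t seq_hi)
{
	return seq_hi;
}
```

For the <seq-hi 0x1, seq 0x2c> case from above, the sender uses
0x10000002c, the fixed update makes the receiver reconstruct the same
value, and the buggy update makes it reconstruct 0x20000002c, so the
integrity check cannot pass.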
Fixes: cb01008390 ("net/mlx5: IPSec, Add support for ESN")
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Acked-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-2-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Zhengchao Shao says:
====================
fix some issues in netdevsim driver
When strace tool is used to perform memory injection, memory leaks and
files not removed issues are found. Fix them.
====================
Link: https://lore.kernel.org/r/20221026014642.116261-1-shaozhengchao@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Remove dir in nsim_dev_debugfs_init() when creating the ports dir fails.
Otherwise, the netdevsim device will not be created next time. Kernel
reports an error: debugfs: Directory 'netdevsim1' with parent 'netdevsim'
already present!
Fixes: ab1d0cc004 ("netdevsim: change debugfs tree topology")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>