tx_channel_offset is calculated in efx_allocate_msix_channels, but it is
also calculated again in efx_set_channels because it was originally done
there, and when efx_allocate_msix_channels was introduced it was
forgotten to be removed from efx_set_channels.
Moreover, the old calculation is wrong when using
efx_separate_tx_channels because now we can have XDP channels after the
TX channels, so n_channels - n_tx_channels doesn't point to the first TX
channel.
Remove the old calculation from efx_set_channels, and add the
initialization of this variable if MSI or legacy interrupts are used,
next to the initialization of the rest of the related variables, where
it was missing.
This has been already done for sfc, do it also for sfc_siena.
Fixes: 3990a8fffb ("sfc: allocate channels for XDP tx queues")
Reported-by: Tianhao Zhao <tizhao@redhat.com>
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Normally, all channels have RX and TX queues, but this is not true if
modparam efx_separate_tx_channels=1 is used. In that cases, some
channels only have RX queues and others only TX queues (or more
preciselly, they have them allocated, but not initialized).
Fix efx_channel_has_tx_queues to return the correct value for this case
too.
This has been already done for sfc, do it also for sfc_siena.
Messages shown at probe time before the fix:
sfc 0000:03:00.0 ens6f0np0: MC command 0x82 inlen 544 failed rc=-22 (raw=0) arg=0
------------[ cut here ]------------
netdevice: ens6f0np0: failed to initialise TXQ -1
WARNING: CPU: 1 PID: 626 at drivers/net/ethernet/sfc/ef10.c:2393 efx_ef10_tx_init+0x201/0x300 [sfc]
[...] stripped
RIP: 0010:efx_ef10_tx_init+0x201/0x300 [sfc]
[...] stripped
Call Trace:
efx_init_tx_queue+0xaa/0xf0 [sfc]
efx_start_channels+0x49/0x120 [sfc]
efx_start_all+0x1f8/0x430 [sfc]
efx_net_open+0x5a/0xe0 [sfc]
__dev_open+0xd0/0x190
__dev_change_flags+0x1b3/0x220
dev_change_flags+0x21/0x60
[...] stripped
Messages shown at remove time before the fix:
sfc 0000:03:00.0 ens6f0np0: failed to flush 10 queues
sfc 0000:03:00.0 ens6f0np0: failed to flush queues
Fixes: 8700aff089 ("sfc: fix channel allocation with brute force")
Reported-by: Tianhao Zhao <tizhao@redhat.com>
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Tested-by: Íñigo Huguet <ihuguet@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
tx_channel_offset is calculated in efx_allocate_msix_channels, but it is
also calculated again in efx_set_channels because it was originally done
there, and when efx_allocate_msix_channels was introduced it was
forgotten to be removed from efx_set_channels.
Moreover, the old calculation is wrong when using
efx_separate_tx_channels because now we can have XDP channels after the
TX channels, so n_channels - n_tx_channels doesn't point to the first TX
channel.
Remove the old calculation from efx_set_channels, and add the
initialization of this variable if MSI or legacy interrupts are used,
next to the initialization of the rest of the related variables, where
it was missing.
Fixes: 3990a8fffb ("sfc: allocate channels for XDP tx queues")
Reported-by: Tianhao Zhao <tizhao@redhat.com>
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Normally, all channels have RX and TX queues, but this is not true if
modparam efx_separate_tx_channels=1 is used. In that cases, some
channels only have RX queues and others only TX queues (or more
preciselly, they have them allocated, but not initialized).
Fix efx_channel_has_tx_queues to return the correct value for this case
too.
Messages shown at probe time before the fix:
sfc 0000:03:00.0 ens6f0np0: MC command 0x82 inlen 544 failed rc=-22 (raw=0) arg=0
------------[ cut here ]------------
netdevice: ens6f0np0: failed to initialise TXQ -1
WARNING: CPU: 1 PID: 626 at drivers/net/ethernet/sfc/ef10.c:2393 efx_ef10_tx_init+0x201/0x300 [sfc]
[...] stripped
RIP: 0010:efx_ef10_tx_init+0x201/0x300 [sfc]
[...] stripped
Call Trace:
efx_init_tx_queue+0xaa/0xf0 [sfc]
efx_start_channels+0x49/0x120 [sfc]
efx_start_all+0x1f8/0x430 [sfc]
efx_net_open+0x5a/0xe0 [sfc]
__dev_open+0xd0/0x190
__dev_change_flags+0x1b3/0x220
dev_change_flags+0x21/0x60
[...] stripped
Messages shown at remove time before the fix:
sfc 0000:03:00.0 ens6f0np0: failed to flush 10 queues
sfc 0000:03:00.0 ens6f0np0: failed to flush queues
Fixes: 8700aff089 ("sfc: fix channel allocation with brute force")
Reported-by: Tianhao Zhao <tizhao@redhat.com>
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Tested-by: Íñigo Huguet <ihuguet@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Siena only supports software TSO. This means more code can be deleted,
as pointed out by the Smatch static checker warning:
drivers/net/ethernet/sfc/siena/tx.c:184 __efx_siena_enqueue_skb()
warn: duplicate check 'segments' (previous on line 158)
Fixes: 956f2d86cb ("sfc/siena: Remove build references to missing functionality")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/kernel-janitors/YoH5tJMnwuGTrn1Z@kili/
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://lore.kernel.org/r/165294463549.23865.4557617334650441347.stgit@palantir17.mph.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Both sfc/efx_channels.h and sfc/siena/efx_channels.h used the same
wrapper #ifndef EFX_CHANNELS_H, this patch changes the siena define to be
EFX_SIENA_CHANNELS_H to avoid build system confusion.
This fixes the following build break:
drivers/net/ethernet/sfc/ptp.c:2191:28:
error: ‘efx_copy_channel’ undeclared here (not in a function); did you mean ‘efx_ptp_channel’?
2191 | .copy = efx_copy_channel,
Fixes: 6e173d3b4a ("sfc: Copy shared files needed for Siena (part 1)")
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Cc: Edward Cree <ecree.xilinx@gmail.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20220518065820.131611-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The code for gso_max_size was added originally to allow for debugging and
workaround of buggy devices that couldn't support TSO with blocks 64K in
size. The original reason for limiting it to 64K was because that was the
existing limits of IPv4 and non-jumbogram IPv6 length fields.
With the addition of Big TCP we can remove this limit and allow the value
to potentially go up to UINT_MAX and instead be limited by the tso_max_size
value.
So in order to support this we need to go through and clean up the
remaining users of the gso_max_size value so that the values will cap at
64K for non-TCPv6 flows. In addition we can clean up the GSO_MAX_SIZE value
so that 64K becomes GSO_LEGACY_MAX_SIZE and UINT_MAX will now be the upper
limit for GSO_MAX_SIZE.
v6: (edumazet) fixed a compile error if CONFIG_IPV6=n,
in a new sk_trim_gso_size() helper.
netif_set_tso_max_size() caps the requested TSO size
with GSO_MAX_SIZE.
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove napi_weight statics which are set to 64 and never modified,
remnants of the out-of-tree napi_weight module param.
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20220512205603.1536771-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If CONFIG_PTP_1588_CLOCK=m and CONFIG_SFC_SIENA=y, the siena driver will fail to link:
drivers/net/ethernet/sfc/siena/ptp.o: In function `efx_ptp_remove_channel':
ptp.c:(.text+0xa28): undefined reference to `ptp_clock_unregister'
drivers/net/ethernet/sfc/siena/ptp.o: In function `efx_ptp_probe_channel':
ptp.c:(.text+0x13a0): undefined reference to `ptp_clock_register'
ptp.c:(.text+0x1470): undefined reference to `ptp_clock_unregister'
drivers/net/ethernet/sfc/siena/ptp.o: In function `efx_ptp_pps_worker':
ptp.c:(.text+0x1d29): undefined reference to `ptp_clock_event'
drivers/net/ethernet/sfc/siena/ptp.o: In function `efx_siena_ptp_get_ts_info':
ptp.c:(.text+0x301b): undefined reference to `ptp_clock_index'
To fix this build error, make SFC_SIENA depends on PTP_1588_CLOCK.
Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: d48523cb88 ("sfc: Copy shared files needed for Siena (part 2)")
Signed-off-by: Ren Zhijie <renzhijie2@huawei.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20220513012721.140871-1-renzhijie2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
They were removed in the first series since they were not used for EF10.
Put that code back for Siena, with the prototypes in siena_sriov.h
since that file is a more applicable place for it.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Change the clock name and work queue names to differentiate them from
the names used in sfc.ko.
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a Siena Kconfig option and use it in stead of the sfc one.
Rename the internal variable for the 'mcdi_logging_default' module
parameter to avoid a naming conflict with the one in sfc.ko.
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a Siena Kconfig option and use it in stead of the sfc one.
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a Siena Kconfig option and use it in stead of the sfc one.
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a Siena Kconfig option and use it in stead of the sfc one.
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In the NIC ->probe() callback, ->mtd_probe() callback is called.
If NIC has 2 ports, ->probe() is called twice and ->mtd_probe() too.
In the ->mtd_probe(), which is efx_ef10_mtd_probe() it allocates and
initializes mtd partiion.
But mtd partition for sfc is shared data.
So that allocated mtd partition data from last called
efx_ef10_mtd_probe() will not be used.
Therefore it must be freed.
But it doesn't free a not used mtd partition data in efx_ef10_mtd_probe().
kmemleak reports:
unreferenced object 0xffff88811ddb0000 (size 63168):
comm "systemd-udevd", pid 265, jiffies 4294681048 (age 348.586s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<ffffffffa3767749>] kmalloc_order_trace+0x19/0x120
[<ffffffffa3873f0e>] __kmalloc+0x20e/0x250
[<ffffffffc041389f>] efx_ef10_mtd_probe+0x11f/0x270 [sfc]
[<ffffffffc0484c8a>] efx_pci_probe.cold.17+0x3df/0x53d [sfc]
[<ffffffffa414192c>] local_pci_probe+0xdc/0x170
[<ffffffffa4145df5>] pci_device_probe+0x235/0x680
[<ffffffffa443dd52>] really_probe+0x1c2/0x8f0
[<ffffffffa443e72b>] __driver_probe_device+0x2ab/0x460
[<ffffffffa443e92a>] driver_probe_device+0x4a/0x120
[<ffffffffa443f2ae>] __driver_attach+0x16e/0x320
[<ffffffffa4437a90>] bus_for_each_dev+0x110/0x190
[<ffffffffa443b75e>] bus_add_driver+0x39e/0x560
[<ffffffffa4440b1e>] driver_register+0x18e/0x310
[<ffffffffc02e2055>] 0xffffffffc02e2055
[<ffffffffa3001af3>] do_one_initcall+0xc3/0x450
[<ffffffffa33ca574>] do_init_module+0x1b4/0x700
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Fixes: 8127d661e7 ("sfc: Add support for Solarflare SFC9100 family")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Link: https://lore.kernel.org/r/20220512054709.12513-1-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Make the (un)load message more specific to differentiate it from
the sfc.ko messages.
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The implementation of each is quite short. This means sriov.c is
not needed any more.
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
For siena use efx_siena_ as the function prefix.
efx_nic_update_stats_atomic is only used in efx_common.c, so move
it there.
efx_nic_copy_stats is not used in Siena, so it is removed.
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
For siena use efx_siena_ as the function prefix.
Several functions are not used in Siena, so they are removed.
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
For siena use efx_siena_ as the function prefix.
This patch covers selftest.h, ptp.h, net_driver.h and ethtool_common.h.
efx_ethtool_fill_self_tests() can become static.
Some functions in ptp.c can also become static.
Rename loopback_mode in net_driver.h.
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
For siena use efx_siena_ as the function prefix.
Several functions are not used in Siena, so they are removed.
Use a Siena specific variable name for module parameter
efx_separate_tx_channels.
Move efx_fini_tx_queue() to avoid a forward declaration of
efx_dequeue_buffer().
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When building with allyesconfig there are many identical
symbol names.
For siena use efx_siena_ as the function and variable prefix
to avoid build errors.
efx_mtd_remove_partition can become static as it is no longer called
from other files.
efx_ticks_to_usecs and efx_xmit_done_single are not used in Siena, so
they are removed.
Several functions are only used inside efx_channels.c for Siena so
they can become static.
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Functionality not supported or needed on Siena includes:
- Anything for EF100
- EF10 specifics such as register access, PIO and TSO offload.
Also only bind to Siena NICs.
Remove EF10 specifics from nic.h.
The functions that start with efx_farch_ will be removed from sfc.ko
with a subsequent patch.
Add the efx_ prefix to siena_prepare_flush() to make it consistent
with the other APIs.
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
These are the files starting with m through w.
No changes are done, those will be done with subsequent commits.
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
These are the files starting with b through i.
No changes are done, those will be done with subsequent commits.
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
It fixes memory leak in ring buffer change logic.
When ring buffer size is changed(ethtool -G eth0 rx 4096), sfc driver
works like below.
1. stop all channels and remove ring buffers.
2. allocates new buffer array.
3. allocates rx buffers.
4. start channels.
While the above steps are working, it skips some steps if the channel
doesn't have a ->copy callback function.
Due to ptp channel doesn't have ->copy callback, these above steps are
skipped for ptp channel.
It eventually makes some problems.
a. ptp channel's ring buffer size is not changed, it works only
1024(default).
b. memory leak.
The reason for memory leak is to use the wrong ring buffer values.
There are some values, which is related to ring buffer size.
a. efx->rxq_entries
- This is global value of rx queue size.
b. rx_queue->ptr_mask
- used for access ring buffer as circular ring.
- roundup_pow_of_two(efx->rxq_entries) - 1
c. rx_queue->max_fill
- efx->rxq_entries - EFX_RXD_HEAD_ROOM
These all values should be based on ring buffer size consistently.
But ptp channel's values are not.
a. efx->rxq_entries
- This is global(for sfc) value, always new ring buffer size.
b. rx_queue->ptr_mask
- This is always 1023(default).
c. rx_queue->max_fill
- This is new ring buffer size - EFX_RXD_HEAD_ROOM.
Let's assume we set 4096 for rx ring buffer,
normal channel ptp channel
efx->rxq_entries 4096 4096
rx_queue->ptr_mask 4095 1023
rx_queue->max_fill 4086 4086
sfc driver allocates rx ring buffers based on these values.
When it allocates ptp channel's ring buffer, 4086 ring buffers are
allocated then, these buffers are attached to the allocated array.
But ptp channel's ring buffer array size is still 1024(default)
and ptr_mask is still 1023 too.
So, 3062 ring buffers will be overwritten to the array.
This is the reason for memory leak.
Test commands:
ethtool -G <interface name> rx 4096
while :
do
ip link set <interface name> up
ip link set <interface name> down
done
In order to avoid this problem, it adds ->copy callback to ptp channel
type.
So that rx_queue->ptr_mask value will be updated correctly.
Fixes: 7c236c43b8 ("sfc: Add support for IEEE-1588 PTP")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Switch all Ethernet drivers which use custom napi weights
to the new API.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Drivers should call the TSO setting helper, GSO is controllable
by user space.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
For Siena we do not need new messages that were defined
for the EF100 architecture. Several debug messages have
also been removed.
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Disable the build of Siena code until later in this patch series.
Prevent sfc.ko from binding to Siena NICs.
efx_init_sriov/efx_fini_sriov is only used for Siena. Remove calls
to those.
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch extends the EF100 PF driver by adding .sriov_configure()
which would allow users to enable and disable virtual functions
using the sriov sysfs.
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/r/75e74d9e-14ce-0524-9668-5ab735a7cf62@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The strings are only used in efx_common.c so the definitions
can be static in there.
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
It is defined both in efx.h and tx_common.h.
Remove the definition in efx.h.
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
This means we can remove them from efx_channel.h and avoid
naming conflicts later.
efx_channel_dummy_op_void() cannot be static as it is
used in ef100_nic.c.
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The 8000 series and newer NICs all get hardware timestamps from the MAC
and can provide timestamps on a normal TX queue, rather than via a slow
path through the MC. As such we can use this path for any packet where a
hardware timestamp is requested.
This also enables support for PTP over transports other than IPv4+UDP.
Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: Edward Cree <ecree@xilinx.com>
Link: https://lore.kernel.org/r/510652dc-54b4-0e11-657e-e37ee3ca26a9@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Even if an IOMMU might be present for some PCI segment in the system,
that doesn't necessarily mean it provides translation for the device
we care about. It appears that what we care about here is specifically
whether DMA mapping ops involve any IOMMU overhead or not, so check for
translation actually being active for our device.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/r/7350f957944ecfce6cce90f422e3992a1f428775.1649166055.git.robin.murphy@arm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In some cases, xdp tx_queue can get used before initialization.
1. interface up/down
2. ring buffer size change
When CPU cores are lower than maximum number of channels of sfc driver,
it creates new channels only for XDP.
When an interface is up or ring buffer size is changed, all channels
are initialized.
But xdp channels are always initialized later.
So, the below scenario is possible.
Packets are received to rx queue of normal channels and it is acted
XDP_TX and tx_queue of xdp channels get used.
But these tx_queues are not initialized yet.
If so, TX DMA or queue error occurs.
In order to avoid this problem.
1. initializes xdp tx_queues earlier than other rx_queue in
efx_start_channels().
2. checks whether tx_queue is initialized or not in efx_xdp_tx_buffers().
Splat looks like:
sfc 0000:08:00.1 enp8s0f1np1: TX queue 10 spurious TX completion id 250
sfc 0000:08:00.1 enp8s0f1np1: resetting (RECOVER_OR_ALL)
sfc 0000:08:00.1 enp8s0f1np1: MC command 0x80 inlen 100 failed rc=-22
(raw=22) arg=789
sfc 0000:08:00.1 enp8s0f1np1: has been disabled
Fixes: f28100cb9c ("sfc: fix lack of XDP TX queues - error XDP TX failed (-22)")
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the page_ring is not used page_ptr_mask is 0.
Do not dereference page_ring[0] in this case.
Fixes: 2768935a46 ("sfc: reuse pages to avoid DMA mapping/unmapping costs")
Reported-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On such systems cpumask_of_node() returns NULL, which bitmap
operations are not happy with.
Fixes: c265b569a4 ("sfc: default config to 1 channel/core in local NUMA node only")
Fixes: 09a99ab16c ("sfc: set affinity hints in local NUMA node only")
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Reviewed-by: Íñigo Huguet <ihuguet@redhat.com>
Link: https://lore.kernel.org/r/164857006953.8140.3265568858101821256.stgit@palantir17.mph.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
seqno could be read as a stale value outside of the lock. The lock is
already acquired to protect the modification of seqno against a possible
race condition. Place the reading of this value also inside this locking
to protect it against a possible race condition.
Signed-off-by: Niels Dossche <dossche.niels@gmail.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Affinity hints were being set to CPUs in local NUMA node first, and then
in other CPUs. This was creating 2 unintended issues:
1. Channels created to be assigned each to a different physical core
were assigned to hyperthreading siblings because of being in same
NUMA node.
Since the patch previous to this one, this did not longer happen
with default rss_cpus modparam because less channels are created.
2. XDP channels could be assigned to CPUs in different NUMA nodes,
decreasing performance too much (to less than half in some of my
tests).
This patch sets the affinity hints spreading the channels only in local
NUMA node's CPUs. A fallback for the case that no CPU in local NUMA node
is online has been added too.
Example of CPUs being assigned in a non optimal way before this and the
previous patch (note: in this system, xdp-8 to xdp-15 are created
because num_possible_cpus == 64, but num_present_cpus == 32 so they're
never used):
$ lscpu | grep -i numa
NUMA node(s): 2
NUMA node0 CPU(s): 0-7,16-23
NUMA node1 CPU(s): 8-15,24-31
$ grep -H . /proc/irq/*/0000:07:00.0*/../smp_affinity_list
/proc/irq/141/0000:07:00.0-0/../smp_affinity_list:0
/proc/irq/142/0000:07:00.0-1/../smp_affinity_list:1
/proc/irq/143/0000:07:00.0-2/../smp_affinity_list:2
/proc/irq/144/0000:07:00.0-3/../smp_affinity_list:3
/proc/irq/145/0000:07:00.0-4/../smp_affinity_list:4
/proc/irq/146/0000:07:00.0-5/../smp_affinity_list:5
/proc/irq/147/0000:07:00.0-6/../smp_affinity_list:6
/proc/irq/148/0000:07:00.0-7/../smp_affinity_list:7
/proc/irq/149/0000:07:00.0-8/../smp_affinity_list:16
/proc/irq/150/0000:07:00.0-9/../smp_affinity_list:17
/proc/irq/151/0000:07:00.0-10/../smp_affinity_list:18
/proc/irq/152/0000:07:00.0-11/../smp_affinity_list:19
/proc/irq/153/0000:07:00.0-12/../smp_affinity_list:20
/proc/irq/154/0000:07:00.0-13/../smp_affinity_list:21
/proc/irq/155/0000:07:00.0-14/../smp_affinity_list:22
/proc/irq/156/0000:07:00.0-15/../smp_affinity_list:23
/proc/irq/157/0000:07:00.0-xdp-0/../smp_affinity_list:8
/proc/irq/158/0000:07:00.0-xdp-1/../smp_affinity_list:9
/proc/irq/159/0000:07:00.0-xdp-2/../smp_affinity_list:10
/proc/irq/160/0000:07:00.0-xdp-3/../smp_affinity_list:11
/proc/irq/161/0000:07:00.0-xdp-4/../smp_affinity_list:12
/proc/irq/162/0000:07:00.0-xdp-5/../smp_affinity_list:13
/proc/irq/163/0000:07:00.0-xdp-6/../smp_affinity_list:14
/proc/irq/164/0000:07:00.0-xdp-7/../smp_affinity_list:15
/proc/irq/165/0000:07:00.0-xdp-8/../smp_affinity_list:24
/proc/irq/166/0000:07:00.0-xdp-9/../smp_affinity_list:25
/proc/irq/167/0000:07:00.0-xdp-10/../smp_affinity_list:26
/proc/irq/168/0000:07:00.0-xdp-11/../smp_affinity_list:27
/proc/irq/169/0000:07:00.0-xdp-12/../smp_affinity_list:28
/proc/irq/170/0000:07:00.0-xdp-13/../smp_affinity_list:29
/proc/irq/171/0000:07:00.0-xdp-14/../smp_affinity_list:30
/proc/irq/172/0000:07:00.0-xdp-15/../smp_affinity_list:31
CPUs assignments after this and previous patch, so normal channels
created only one per core in NUMA node and affinities set only to local
NUMA node:
$ grep -H . /proc/irq/*/0000:07:00.0*/../smp_affinity_list
/proc/irq/116/0000:07:00.0-0/../smp_affinity_list:0
/proc/irq/117/0000:07:00.0-1/../smp_affinity_list:1
/proc/irq/118/0000:07:00.0-2/../smp_affinity_list:2
/proc/irq/119/0000:07:00.0-3/../smp_affinity_list:3
/proc/irq/120/0000:07:00.0-4/../smp_affinity_list:4
/proc/irq/121/0000:07:00.0-5/../smp_affinity_list:5
/proc/irq/122/0000:07:00.0-6/../smp_affinity_list:6
/proc/irq/123/0000:07:00.0-7/../smp_affinity_list:7
/proc/irq/124/0000:07:00.0-xdp-0/../smp_affinity_list:16
/proc/irq/125/0000:07:00.0-xdp-1/../smp_affinity_list:17
/proc/irq/126/0000:07:00.0-xdp-2/../smp_affinity_list:18
/proc/irq/127/0000:07:00.0-xdp-3/../smp_affinity_list:19
/proc/irq/128/0000:07:00.0-xdp-4/../smp_affinity_list:20
/proc/irq/129/0000:07:00.0-xdp-5/../smp_affinity_list:21
/proc/irq/130/0000:07:00.0-xdp-6/../smp_affinity_list:22
/proc/irq/131/0000:07:00.0-xdp-7/../smp_affinity_list:23
/proc/irq/132/0000:07:00.0-xdp-8/../smp_affinity_list:0
/proc/irq/133/0000:07:00.0-xdp-9/../smp_affinity_list:1
/proc/irq/134/0000:07:00.0-xdp-10/../smp_affinity_list:2
/proc/irq/135/0000:07:00.0-xdp-11/../smp_affinity_list:3
/proc/irq/136/0000:07:00.0-xdp-12/../smp_affinity_list:4
/proc/irq/137/0000:07:00.0-xdp-13/../smp_affinity_list:5
/proc/irq/138/0000:07:00.0-xdp-14/../smp_affinity_list:6
/proc/irq/139/0000:07:00.0-xdp-15/../smp_affinity_list:7
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Handling channels from CPUs in different NUMA node can penalize
performance, so better configure only one channel per core in the same
NUMA node than the NIC, and not per each core in the system.
Fallback to all other online cores if there are not online CPUs in local
NUMA node.
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>