This was left as TODO in commit 01e23b2b3b ("net: enetc: add support
for preemptible traffic classes") since it's relatively complicated.
Where this makes a difference is with a configuration as follows:
ethtool --set-mm eno0 pmac-enabled on tx-enabled on verify-enabled on
Preemptible packets should only be sent when the MAC Merge TX direction
becomes active (i.o.w. when the verification process succeeds, aka when
the link partner confirms it can process preemptible traffic). But the
tc qdisc with the preemptible traffic classes is offloaded completely
asynchronously w.r.t. the MM becoming active.
The ENETC manual does suggest that this should be handled in the driver:
"On startup, software should wait for the verification process to
complete (MMCSR[VSTS]=011) before initiating traffic".
Adding the necessary logic allows future selftests to uphold the claim
that an inactive or disabled MAC Merge layer should never send data
packets through the pMAC.
This change moves enetc_set_ptcfpr() from enetc.c to enetc_ethtool.c,
where its only caller is now - enetc_mm_commit_preemptible_tcs().
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The MMCSR register contains 2 fields with overlapping meaning:
- LPA (Local preemption active):
This read-only status bit indicates whether preemption is active for
this port. This bit will be set if preemption is both enabled and has
completed the verification process.
- TXSTS (Merge status):
This read-only status field provides the state of the MAC Merge sublayer
transmit status as defined in IEEE Std 802.3-2018 Clause 99.
00 Transmit preemption is inactive
01 Transmit preemption is active
10 Reserved
11 Reserved
However none of these 2 fields offer reliable reporting to software.
When connecting ENETC to a link partner which is not capable of Frame
Preemption, the expectation is that ENETC's verification should fail
(VSTS=4) and its MM TX direction should be inactive (LPA=0, TXSTS=00)
even though the MM TX is enabled (ME=1). But surprise, the LPA bit of
MMCSR stays set even if VSTS=4 and ME=1.
OTOH, the TXSTS field has the opposite problem. I cannot get its value
to change from 0, even when connecting to a link partner capable of
frame preemption, which does respond to its verification frames (ME=1
and VSTS=3, "SUCCEEDED").
The only option with such buggy hardware seems to be to reimplement the
formula for calculating tx-active in software, which is for tx-enabled
to be true, and for the verify-status to be either SUCCEEDED, or
DISABLED.
Without reliable tx-active reporting, we have no good indication when
to commit the preemptible traffic classes to hardware, which makes it
possible (but not desirable) to send preemptible traffic to a link
partner incapable of receiving it. However, currently we do not have the
logic to wait for TX to be active yet, so the impact is limited.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Current enetc_set_mm() is designed to set the priv->active_offloads bit
ENETC_F_QBU for enetc_mm_link_state_update() to act on, but if the link
is already up, it modifies the ENETC_MMCSR_ME ("Merge Enable") bit
directly.
The problem is that it only *sets* ENETC_MMCSR_ME if the link is up, it
doesn't *clear* it if needed. So subsequent enetc_get_mm() calls still
see tx-enabled as true, up until a link down event, which is when
enetc_mm_link_state_update() will get called.
This is not a functional issue as far as I can assess. It has only come
up because I'd like to uphold a simple API rule in core ethtool code:
the pMAC cannot be disabled if TX is going to be enabled. Currently,
the fact that TX remains enabled for longer than expected (after the
enetc_set_mm() call that disables it) is going to violate that rule,
which is how it was caught.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
PFs which support the MAC Merge layer also have a set of 8 registers
called "Port traffic class N frame preemption register (PTC0FPR - PTC7FPR)".
Through these, a traffic class (group of TX rings of same dequeue
priority) can be mapped to the eMAC or to the pMAC.
There's nothing particularly spectacular here. We should probably only
commit the preemptible TCs to hardware once the MAC Merge layer became
active, but unlike Felix, we don't have an IRQ that notifies us of that.
We'd have to sleep for up to verifyTime (127 ms) to wait for a
resolution coming from the verification state machine; not only from the
ndo_setup_tc() code path, but also from enetc_mm_link_state_update().
Since it's relatively complicated and has a relatively small benefit,
I'm not doing it.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Ferenc Fejes <fejes@inf.elte.hu>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
To gain access to the larger encapsulating structure which has the type
tc_mqprio_qopt_offload, rename just the "qopt" field as "qopt".
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Ferenc Fejes <fejes@inf.elte.hu>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
I have observed an issue where the RX direction of the LS1028A ENETC pMAC
seems unresponsive. The minimal procedure to reproduce the issue is:
1. Connect ENETC port 0 with a loopback RJ45 cable to one of the Felix
switch ports (0).
2. Bring the ports up (MAC Merge layer is not enabled on either end).
3. Send a large quantity of unidirectional (express) traffic from Felix
to ENETC. I tried altering frame size and frame count, and it doesn't
appear to be specific to either of them, but rather, to the quantity
of octets received. Lowering the frame count, the minimum quantity of
packets to reproduce relatively consistently seems to be around 37000
frames at 1514 octets (w/o FCS) each.
4. Using ethtool --set-mm, enable the pMAC in the Felix and in the ENETC
ports, in both RX and TX directions, and with verification on both
ends.
5. Wait for verification to complete on both sides.
6. Configure a traffic class as preemptible on both ends.
7. Send some packets again.
The issue is at step 5, where the verification process of ENETC ends
(meaning that Felix responds with an SMD-R and ENETC sees the response),
but the verification process of Felix never ends (it remains VERIFYING).
If step 3 is skipped or if ENETC receives less traffic than
approximately that threshold, the test runs all the way through
(verification succeeds on both ends, preemptible traffic passes fine).
If, between step 4 and 5, the step below is also introduced:
4.1. Disable and re-enable PM0_COMMAND_CONFIG bit RX_EN
then again, the sequence of steps runs all the way through, and
verification succeeds, even if there was the previous RX traffic
injected into ENETC.
Traffic sent *by* the ENETC port prior to enabling the MAC Merge layer
does not seem to influence the verification result, only received
traffic does.
The LS1028A manual does not mention any relationship between
PM0_COMMAND_CONFIG and MMCSR, and the hardware people don't seem to
know for now either.
The bit that is toggled to work around the issue is also toggled
by enetc_mac_enable(), called from phylink's mac_link_down() and
mac_link_up() methods - which is how the workaround was found:
verification would work after a link down/up.
Fixes: c7b9e80869 ("net: enetc: add support for MAC Merge layer")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230411192645.1896048-1-vladimir.oltean@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
A number of MDIO drivers make use of devm_mdiobus_alloc_size(). This
is only available when CONFIG_MDIO_DEVRES is enabled. Add missing
depends or selects, depending on if there are circular dependencies or
not. This avoids linker errors, especially for randconfig builds.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230409150204.2346231-1-andrew@lunn.ch
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Not all fec MDIO bus drivers support C45 mode transactions. The older fec
hardware block in many ColdFire SoCs does not appear to support them, at
least according to most of the different ColdFire SoC reference manuals.
The bits used to generate C45 access on the iMX parts, in the OP field
of the MMFR register, are documented as generating non-compliant MII
frames (it is not documented as to exactly how they are non-compliant).
Commit 8d03ad1ab0 ("net: fec: Separate C22 and C45 transactions")
means the fec driver will always register c45 MDIO read and write
methods. During probe these will always be accessed now generating
non-compliant MII accesses on ColdFire based devices.
Add a quirk define, FEC_QUIRK_HAS_MDIO_C45, that can be used to
distinguish silicon that supports MDIO C45 framing or not. Add this to
all the existing iMX quirks, so they will be behave as they do now (*).
(*) it seems that some iMX parts may not support C45 transactions either.
The iMX25 and iMX50 Reference Manuals contain similar wording to
the ColdFire Reference Manuals on this.
Fixes: 8d03ad1ab0 ("net: fec: Separate C22 and C45 transactions")
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
Reviewed-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230404052207.3064861-1-gerg@linux-m68k.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The Autoneg bit in the advertising bitmap and state->an_enabled are
always identical. Thus, we will be removing state->an_enabled.
Use the Autoneg bit in the advertising bitmap to indicate whether
autonegotiation should be used, rather than using the an_enabled
member which will be going away. This means we use the same condition
as phylink_mii_c22_pcs_config().
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When running "ethtool -S eno0 --groups rmon" without an explicit "--src
emac|pmac" argument, the kernel will not report
rx-rmon-etherStatsPkts64to64Octets, rx-rmon-etherStatsPkts65to127Octets,
etc. This is because on ETHTOOL_MAC_STATS_SRC_AGGREGATE, we do not
populate the "ranges" argument.
ocelot_port_get_rmon_stats() does things differently and things work
there. I had forgotten to make sure that the code is structured the same
way in both drivers, so do that now.
Fixes: cf52bd238b ("net: enetc: add support for MAC Merge statistics counters")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230321232831.1200905-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
It is preferred to use typed property access functions (i.e.
of_property_read_<type> functions) rather than low-level
of_get_property/of_find_property functions for reading properties.
Convert reading boolean properties to of_property_read_bool().
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for net/can
Acked-by: Kalle Valo <kvalo@kernel.org>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Acked-by: Francois Romieu <romieu@fr.zoreil.com>
Reviewed-by: Wei Fang <wei.fang@nxp.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Core support
- New devm_of_phy_optional_get() API with users and conversion
- New support:
- Mediatek MT7986 tphy support
- Qualcomm SM8550 UFS, PCIe, combo phy support, SM6115 / SM4250
USB3 phy support, SM6350 combo phy support, SM6125 UFS PHY
support amd SM8350 & SM8450 combo phy support
- Qualcomm SNPS eUSB2 eUSB2 repeater drivers
- Allwinner F1C100s USB PHY support
- Tegra xusb support for Tegra234
- Updates:
- Yaml conversion for Qualcomm pcie2 phy and usb-hsic-phy
- G4 mode support in Qualcomm UFS phy and support for various SoCs
- Yaml conversion for Meson usb2 phy
- TI Type C support for usb phy for j721
- Yaml conversion for Tegra xusb binding
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE+vs47OPLdNbVcHzyfBQHDyUjg0cFAmP4vg0ACgkQfBQHDyUj
g0dJYxAA0DL9X6IFeK8U5blp/ZWmXBqbhRmnsf0KGsSRJMSgHkPruhOWfTCAvXnc
2+xp8wiE0MdLI7xseJWZi7Q2r10f5LtU55rzbL3mI7MWd/g2WTlKiXCDrpa4fY/Z
pxo+892vUJh3+I2+Sjf0JnIY89MV/sqSLXsFeKDtvp7J9lMjA98TV6m+YDVTXn22
SW3hjaB8ochSQV1HEMdEJWsrZc3lmszLdQM+qz3PafyQRbhc1A98Vkf0X/sWR/Ot
p0FCXlNnY3O272dnrU0V5yv7wwWqjVDN5+Q3vk3AbSlo9ERLVwchayUzxi8EIS7t
cPmxhsyMoEmsSIPx4z47vLt1NQoqiaKNM7XCrn13Z0fE9fbTW8Trx8VBXcIUsE98
hT6IxrjRFGJOta8koOssBqSjuwP4QBIZiwXL2YEujj3hGqyRefOCN5XBek7dVyDe
ctwJsIKBCG8Wh87dFldYLrJgQKR9svZXDjxVADpYMUpPM2v02DCWhUyM50ODowZf
Yl7bP8dXtn2UBIybbhNTZg29PbrATk73tcr73GZeX8JTOK2vpsZ3+fUsdxPYzed3
lF2vw361E2ry1DtgmH7XMXevDFvKJ/aks5FIAKebc1tlAPPGYVIkBqyQprAQmlS3
tDQ+6+jQLAr14iSaVQd9MC3obNqbJYHf1WEU3rKtDy3MB0flbqo=
=2g27
-----END PGP SIGNATURE-----
Merge tag 'phy-for-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy
Pull phy updates from Vinod Koul:
"This features a bunch of new device support, a couple of new drivers,
yaml conversion and updates of a few drivers.
Core support:
- New devm_of_phy_optional_get() API with users and conversion
New hardware support:
- Mediatek MT7986 phy support
- Qualcomm SM8550 UFS, PCIe, combo phy support, SM6115 / SM4250 USB3
phy support, SM6350 combo phy support, SM6125 UFS PHY support amd
SM8350 & SM8450 combo phy support
- Qualcomm SNPS eUSB2 eUSB2 repeater drivers
- Allwinner F1C100s USB PHY support
- Tegra xusb support for Tegra234
Updates:
- Yaml conversion for Qualcomm pcie2 phy and usb-hsic-phy
- G4 mode support in Qualcomm UFS phy and support for various SoCs
- Yaml conversion for Meson usb2 phy
- TI Type C support for usb phy for j721
- Yaml conversion for Tegra xusb binding"
* tag 'phy-for-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: (106 commits)
phy: qcom: phy-qcom-snps-eusb2: Add support for eUSB2 repeater
phy: qcom: Add QCOM SNPS eUSB2 repeater driver
dt-bindings: phy: qcom,snps-eusb2-phy: Add phys property for the repeater
dt-bindings: phy: Add qcom,snps-eusb2-repeater schema file
dt-bindings: phy: amlogic,g12a-usb3-pcie-phy: add missing optional phy-supply property
phy: rockchip-typec: Fix unsigned comparison with less than zero
phy: rockchip-typec: fix tcphy_get_mode error case
phy: qcom: snps-eusb2: Add missing headers
phy: qcom-qmp-combo: Add support for SM8550
phy: qcom-qmp: Add v6 DP register offsets
phy: qcom-qmp: pcs-usb: Add v6 register offsets
dt-bindings: phy: qcom,sc8280xp-qmp-usb43dp: Document SM8550 compatible
phy: qcom: Add QCOM SNPS eUSB2 driver
dt-bindings: phy: Add qcom,snps-eusb2-phy schema file
phy: qcom-qmp-pcie: Add support for SM8550 g3x2 and g4x2 PCIEs
phy: qcom-qmp: qserdes-lane-shared: Add v6 register offsets
phy: qcom-qmp: qserdes-txrx: Add v6.20 register offsets
phy: qcom-qmp: pcs-pcie: Add v6.20 register offsets
phy: qcom-qmp: pcs-pcie: Add v6 register offsets
phy: qcom-qmp: pcs: Add v6.20 register offsets
...
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCY+bZrwAKCRDbK58LschI
gzi4AP4+TYo0jnSwwkrOoN9l4f5VO9X8osmj3CXfHBv7BGWVxAD/WnvA3TDZyaUd
agIZTkRs6BHF9He8oROypARZxTeMLwM=
=nO1C
-----END PGP SIGNATURE-----
Daniel Borkmann says:
====================
pull-request: bpf-next 2023-02-11
We've added 96 non-merge commits during the last 14 day(s) which contain
a total of 152 files changed, 4884 insertions(+), 962 deletions(-).
There is a minor conflict in drivers/net/ethernet/intel/ice/ice_main.c
between commit 5b246e533d ("ice: split probe into smaller functions")
from the net-next tree and commit 66c0e13ad2 ("drivers: net: turn on
XDP features") from the bpf-next tree. Remove the hunk given ice_cfg_netdev()
is otherwise there a 2nd time, and add XDP features to the existing
ice_cfg_netdev() one:
[...]
ice_set_netdev_features(netdev);
netdev->xdp_features = NETDEV_XDP_ACT_BASIC | NETDEV_XDP_ACT_REDIRECT |
NETDEV_XDP_ACT_XSK_ZEROCOPY;
ice_set_ops(netdev);
[...]
Stephen's merge conflict mail:
https://lore.kernel.org/bpf/20230207101951.21a114fa@canb.auug.org.au/
The main changes are:
1) Add support for BPF trampoline on s390x which finally allows to remove many
test cases from the BPF CI's DENYLIST.s390x, from Ilya Leoshkevich.
2) Add multi-buffer XDP support to ice driver, from Maciej Fijalkowski.
3) Add capability to export the XDP features supported by the NIC.
Along with that, add a XDP compliance test tool,
from Lorenzo Bianconi & Marek Majtyka.
4) Add __bpf_kfunc tag for marking kernel functions as kfuncs,
from David Vernet.
5) Add a deep dive documentation about the verifier's register
liveness tracking algorithm, from Eduard Zingerman.
6) Fix and follow-up cleanups for resolve_btfids to be compiled
as a host program to avoid cross compile issues,
from Jiri Olsa & Ian Rogers.
7) Batch of fixes to the BPF selftest for xdp_hw_metadata which resulted
when testing on different NICs, from Jesper Dangaard Brouer.
8) Fix libbpf to better detect kernel version code on Debian, from Hao Xiang.
9) Extend libbpf to add an option for when the perf buffer should
wake up, from Jon Doron.
10) Follow-up fix on xdp_metadata selftest to just consume on TX
completion, from Stanislav Fomichev.
11) Extend the kfuncs.rst document with description on kfunc
lifecycle & stability expectations, from David Vernet.
12) Fix bpftool prog profile to skip attaching to offline CPUs,
from Tonghao Zhang.
====================
Link: https://lore.kernel.org/r/20230211002037.8489-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add PF driver support for the following:
- Viewing the standardized MAC Merge layer counters.
- Viewing the standardized Ethernet MAC and RMON counters associated
with the pMAC.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20230206094531.444988-2-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add PF driver support for viewing and changing the MAC Merge sublayer
parameters, and seeing the verification state machine's current state.
The verification handshake with the link partner is driven by hardware.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20230206094531.444988-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
We assume that the mqprio queue configuration from taprio has a simple
1:1 mapping between prio and traffic class, and one TX queue per TC.
That might not be the case. Actually parse and act upon the mqprio
config.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Regardless of the requested queue count per traffic class, the enetc
driver allocates a number of TX rings equal to the number of TCs, and
hardcodes a queue configuration of "1@0 1@1 ... 1@max-tc". Other
configurations are silently ignored and treated the same.
Improve that by allowing what the user requests to be actually
fulfilled. This allows more than one TX ring per traffic class.
For example:
$ tc qdisc add dev eno0 root handle 1: mqprio num_tc 4 \
map 0 0 1 1 2 2 3 3 queues 2@0 2@2 2@4 2@6
[ 146.267648] fsl_enetc 0000:00:00.0 eno0: TX ring 0 prio 0
[ 146.273451] fsl_enetc 0000:00:00.0 eno0: TX ring 1 prio 0
[ 146.283280] fsl_enetc 0000:00:00.0 eno0: TX ring 2 prio 1
[ 146.293987] fsl_enetc 0000:00:00.0 eno0: TX ring 3 prio 1
[ 146.300467] fsl_enetc 0000:00:00.0 eno0: TX ring 4 prio 2
[ 146.306866] fsl_enetc 0000:00:00.0 eno0: TX ring 5 prio 2
[ 146.313261] fsl_enetc 0000:00:00.0 eno0: TX ring 6 prio 3
[ 146.319622] fsl_enetc 0000:00:00.0 eno0: TX ring 7 prio 3
$ tc qdisc del dev eno0 root
[ 178.238418] fsl_enetc 0000:00:00.0 eno0: TX ring 0 prio 0
[ 178.244369] fsl_enetc 0000:00:00.0 eno0: TX ring 1 prio 0
[ 178.251486] fsl_enetc 0000:00:00.0 eno0: TX ring 2 prio 0
[ 178.258006] fsl_enetc 0000:00:00.0 eno0: TX ring 3 prio 0
[ 178.265038] fsl_enetc 0000:00:00.0 eno0: TX ring 4 prio 0
[ 178.271557] fsl_enetc 0000:00:00.0 eno0: TX ring 5 prio 0
[ 178.277910] fsl_enetc 0000:00:00.0 eno0: TX ring 6 prio 0
[ 178.284281] fsl_enetc 0000:00:00.0 eno0: TX ring 7 prio 0
$ tc qdisc add dev eno0 root handle 1: mqprio num_tc 8 \
map 0 1 2 3 4 5 6 7 queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 1
[ 186.113162] fsl_enetc 0000:00:00.0 eno0: TX ring 0 prio 0
[ 186.118764] fsl_enetc 0000:00:00.0 eno0: TX ring 1 prio 1
[ 186.124374] fsl_enetc 0000:00:00.0 eno0: TX ring 2 prio 2
[ 186.130765] fsl_enetc 0000:00:00.0 eno0: TX ring 3 prio 3
[ 186.136404] fsl_enetc 0000:00:00.0 eno0: TX ring 4 prio 4
[ 186.142049] fsl_enetc 0000:00:00.0 eno0: TX ring 5 prio 5
[ 186.147674] fsl_enetc 0000:00:00.0 eno0: TX ring 6 prio 6
[ 186.153305] fsl_enetc 0000:00:00.0 eno0: TX ring 7 prio 7
The driver used to set TC_MQPRIO_HW_OFFLOAD_TCS, near which there is
this comment in the UAPI header:
TC_MQPRIO_HW_OFFLOAD_TCS, /* offload TCs, no queue counts */
which is what enetc was doing up until now (and no longer is; we offload
queue counts too), remove that assignment.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The enetc driver does not validate the mqprio queue configuration, so it
currently allows things like this:
$ tc qdisc add dev swp0 root handle 1: mqprio num_tc 8 \
map 0 1 2 3 4 5 6 7 queues 3@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 1
But also things like this, completely omitting the queue configuration:
$ tc qdisc add dev eno0 root handle 1: mqprio num_tc 8 \
map 0 1 2 3 4 5 6 7 hw 1
By requesting validation via the mqprio capability structure, this is no
longer allowed, and we bring what is accepted by hardware in line with
what is accepted by software.
The check that num_tc <= real_num_tx_queues also becomes superfluous and
can be dropped, because mqprio_validate_queue_counts() validates that no
TXQ range exceeds real_num_tx_queues. That is a stronger check, because
there is at least 1 TXQ per TC, so there are at least as many TXQs as TCs.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently it can happen that an mqprio qdisc is installed with num_tc 8,
and this will reserve 8 (out of 8) TXQs for the network stack. Then we
can attach an XDP program, and this will crop 2 TXQs, leaving just 6 for
mqprio. That's not what the user requested, and we should fail it.
On the other hand, if mqprio isn't requested, we still give the 8 TXQs
to the network stack (with hashing among a single traffic class), but
then, cropping 2 TXQs for XDP is fine, because the user didn't
explicitly ask for any number of TXQs, so no expectations are violated.
Simply put, the logic that mqprio should impose a minimum number of TXQs
for the network never existed. Let's say (more or less arbitrarily) that
without mqprio, the driver expects a minimum number of TXQs equal to the
number of CPUs (on NXP LS1028A, that is either 1, or 2). And with mqprio,
mqprio gives the minimum required number of TXQs.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Since the blamed net-next commit, enetc_setup_xdp_prog() no longer goes
through enetc_open(), and therefore, the function which was supposed to
detect whether a BPF program exists (in order to crop some TX queues
from network stack usage), enetc_num_stack_tx_queues(), no longer gets
called.
We can move the netif_set_real_num_rx_queues() call to enetc_alloc_msix()
(probe time), since it is a runtime invariant. We can do the same thing
with netif_set_real_num_tx_queues(), and let enetc_reconfigure_xdp_cb()
explicitly recalculate and change the number of stack TX queues.
Fixes: c33bfaf91c ("net: enetc: set up XDP program under enetc_reconfigure()")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
enetc_reconfigure() was modified in commit c33bfaf91c ("net: enetc:
set up XDP program under enetc_reconfigure()") to take an optional
callback that runs while the netdev is down, but this callback currently
cannot fail.
Code up the error handling so that the interface is restarted with the
old resources if the callback fails.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
We keep a pointer to the xdp_prog in the private netdev structure as
well; what's replicated per RX ring is done so just for more convenient
access from the NAPI poll procedure.
Simplify enetc_num_stack_tx_queues() by looking at priv->xdp_prog rather
than iterating through the information replicated per RX ring.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Use the new devm_of_phy_optional_get() helper instead of open-coding the
same operation.
As devm_of_phy_optional_get() returns NULL if either the PHY cannot be
found, or if support for the PHY framework is not enabled, it is no
longer needed to check for -ENODEV or -ENOSYS.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Sean Anderson <sean.anderson@seco.com>
Link: https://lore.kernel.org/r/f2d801cd73cca36a7162819289480d7fc91fcc7e.1674584626.git.geert+renesas@glider.be
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Conversion to gpiod API done in commit 468ba54bd6 ("fec: convert
to gpio descriptor") clashed with gpiolib applying the same quirk to the
reset GPIO polarity (introduced in commit b02c85c945). This results in
the reset line being left active/device being left in reset state when
reset line is "active low".
Remove handling of 'phy-reset-active-high' property from the driver and
rely on gpiolib to apply needed adjustments to avoid ending up with the
double inversion/flipped logic.
Fixes: 468ba54bd6 ("fec: convert to gpio descriptor")
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230201215320.528319-2-dmitry.torokhov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Conversion of the driver to gpiod API done in 468ba54bd6 ("fec:
convert to gpio descriptor") incorrectly made reset line mandatory and
resulted in aborting driver probe in cases where reset line was not
specified (note: this way of specifying PHY reset line is actually
deprecated).
Switch to using devm_gpiod_get_optional() and skip manipulating reset
line if it can not be located.
Fixes: 468ba54bd6 ("fec: convert to gpio descriptor")
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reported-by: Marc Kleine-Budde <mkl@pengutronix.de>
Tested-by: Marc Kleine-Budde <mkl@pengutronix.de>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20230201215320.528319-1-dmitry.torokhov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
A summary of the flags being set for various drivers is given below.
Note that XDP_F_REDIRECT_TARGET and XDP_F_FRAG_TARGET are features
that can be turned off and on at runtime. This means that these flags
may be set and unset under RTNL lock protection by the driver. Hence,
READ_ONCE must be used by code loading the flag value.
Also, these flags are not used for synchronization against the availability
of XDP resources on a device. It is merely a hint, and hence the read
may race with the actual teardown of XDP resources on the device. This
may change in the future, e.g. operations taking a reference on the XDP
resources of the driver, and in turn inhibiting turning off this flag.
However, for now, it can only be used as a hint to check whether device
supports becoming a redirection target.
Turn 'hw-offload' feature flag on for:
- netronome (nfp)
- netdevsim.
Turn 'native' and 'zerocopy' features flags on for:
- intel (i40e, ice, ixgbe, igc)
- mellanox (mlx5).
- stmmac
- netronome (nfp)
Turn 'native' features flags on for:
- amazon (ena)
- broadcom (bnxt)
- freescale (dpaa, dpaa2, enetc)
- funeth
- intel (igb)
- marvell (mvneta, mvpp2, octeontx2)
- mellanox (mlx4)
- mtk_eth_soc
- qlogic (qede)
- sfc
- socionext (netsec)
- ti (cpsw)
- tap
- tsnep
- veth
- xen
- virtio_net.
Turn 'basic' (tx, pass, aborted and drop) features flags on for:
- netronome (nfp)
- cavium (thunder)
- hyperv.
Turn 'redirect_target' feature flag on for:
- amanzon (ena)
- broadcom (bnxt)
- freescale (dpaa, dpaa2)
- intel (i40e, ice, igb, ixgbe)
- ti (cpsw)
- marvell (mvneta, mvpp2)
- sfc
- socionext (netsec)
- qlogic (qede)
- mellanox (mlx5)
- tap
- veth
- virtio_net
- xen
Reviewed-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Co-developed-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Marek Majtyka <alardam@gmail.com>
Link: https://lore.kernel.org/r/3eca9fafb308462f7edb1f58e451d59209aa07eb.1675245258.git.lorenzo@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
When memory allocation fails in lynx_pcs_create() and it returns NULL,
there remains a dangling reference to the mdiodev returned by
of_mdio_find_device() which is leaked as soon as memac_pcs_create()
returns empty-handed.
Fixes: a7c2a32e7f ("net: fman: memac: Use lynx pcs driver")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Sean Anderson <sean.anderson@seco.com>
Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Link: https://lore.kernel.org/r/20230130193051.563315-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The driver can be trivially converted, as it only triggers the gpio
pin briefly to do a reset, and it already only supports DT.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make sure that xdp_do_flush() is always executed before
napi_complete_done(). This is important for two reasons. First, a
redirect to an XSKMAP assumes that a call to xdp_do_redirect() from
napi context X on CPU Y will be followed by a xdp_do_flush() from the
same napi context and CPU. This is not guaranteed if the
napi_complete_done() is executed before xdp_do_flush(), as it tells
the napi logic that it is fine to schedule napi context X on another
CPU. Details from a production system triggering this bug using the
veth driver can be found following the first link below.
The second reason is that the XDP_REDIRECT logic in itself relies on
being inside a single NAPI instance through to the xdp_do_flush() call
for RCU protection of all in-kernel data structures. Details can be
found in the second link below.
Fixes: d678be1dc1 ("dpaa2-eth: add XDP_REDIRECT support")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20221220185903.1105011-1-sbohrer@cloudflare.com
Link: https://lore.kernel.org/all/20210624160609.292325-1-toke@redhat.com/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Make sure that xdp_do_flush() is always executed before
napi_complete_done(). This is important for two reasons. First, a
redirect to an XSKMAP assumes that a call to xdp_do_redirect() from
napi context X on CPU Y will be followed by a xdp_do_flush() from the
same napi context and CPU. This is not guaranteed if the
napi_complete_done() is executed before xdp_do_flush(), as it tells
the napi logic that it is fine to schedule napi context X on another
CPU. Details from a production system triggering this bug using the
veth driver can be found following the first link below.
The second reason is that the XDP_REDIRECT logic in itself relies on
being inside a single NAPI instance through to the xdp_do_flush() call
for RCU protection of all in-kernel data structures. Details can be
found in the second link below.
Fixes: a1e031ffb4 ("dpaa_eth: add XDP_REDIRECT support")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20221220185903.1105011-1-sbohrer@cloudflare.com
Link: https://lore.kernel.org/all/20210624160609.292325-1-toke@redhat.com/
Acked-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The pMAC (ENETC_PFPMR_PMACE) is probably unconditionally enabled in the
enetc driver to allow RX of preemptible packets and not see them as
error frames. I don't know why TX preemption (ENETC_MMCSR_ME) is enabled
though. With no way to say which traffic classes are preemptible (all
are express by default), no preemptible frames would be transmitted
anyway.
Lastly, it may have been believed that the register write lock-step mode
(now deleted) needed the pMAC to be enabled at all times. I don't know
if that's true. However, I've checked that driver writes to PM1
registers do propagate through to the ENETC IP even when the pMAC is
disabled.
With such incomplete support for frame preemption, it's best to just
remove whatever exists right now and come with something more coherent
later.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently the enetc driver duplicates its writes to the PM0 registers
also to PM1, but it doesn't do this consistently - for example we write
to ENETC_PM0_MAXFRM but not to ENETC_PM1_MAXFRM.
Create enetc_port_mac_wr() which writes both the PM0 and PM1 register
with the same value (if frame preemption is supported on this port).
Also create enetc_port_mac_rd() which reads from PM0 - the assumption
being that PM1 contains just the same value.
This will be necessary when we enable the MAC Merge layer properly, and
the pMAC becomes operational.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The MWLM bit (MAC write lock-step mode) allows register writes to the
pMAC to be auto-performed whenever the corresponding eMAC register is
written by the driver. This allows their configuration to remain
in sync.
The driver has set this bit since the initial commit, but it doesn't do
anything, since the hardware feature doesn't work (and the bit has been
removed from more recent versions of the documentation).
The driver does attempt, more or less, to keep those MAC registers in
sync by writing the same value once to e.g. ENETC_PM0_CMD_CFG (eMAC) and
once to ENETC_PM1_CMD_CFG (pMAC). Because the lockstep feature doesn't
work, that's what it will stick to.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a preliminary patch which replaces the hardcoded 0x1000 present
in other PM1 (port MAC 1, aka pMAC) register definitions, which is an
offset to the PM0 (port MAC 0, aka eMAC) equivalent register.
This definition will be used in more places by future code.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Similar to other TSN features, query the Station Interface capability
register to see whether preemption is supported on this port or not.
On LS1028A, preemption is available on ports 0 and 2, but not on 1
and 3.
This will allow us in the future to write the pMAC registers only on the
ENETC ports where a pMAC actually exists.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The build system is complaining about the following:
enetc.o is added to multiple modules: fsl-enetc fsl-enetc-vf
enetc_cbdr.o is added to multiple modules: fsl-enetc fsl-enetc-vf
enetc_ethtool.o is added to multiple modules: fsl-enetc fsl-enetc-vf
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Deciding if to probe of PHYs using C45 is now determine by if the bus
provides the C45 read method. This makes probe_capabilities redundant
so remove it.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
napi_synchronize() from enetc_stop() waits until the softirq has
finished execution and no longer wants to be rescheduled. However under
high traffic load, this will never happen, and the interface can never
be closed.
The problem is the fact that the NAPI poll routine is written to update
the consumer index which makes the device want to put more buffers in
the RX ring, which restarts the madness again.
Browsing around, it seems that some drivers like i40e keep a bit
(__I40E_VSI_DOWN) which they use as communication between the control
path and the data path. But that isn't my first choice, because
complications ensue - since the enetc hardirq may trigger while we are
in a theoretical ENETC_DOWN state, it may happen that enetc_msix() masks
it, but enetc_poll() never unmasks it. To prevent a stall in that case,
one would need to schedule all NAPI instances when ENETC_DOWN gets
cleared, to process what's pending.
I find it more desirable for the control path - enetc_stop() - to just
quiesce the RX ring and let the softirq finish what remains there,
without any explicit communication, just by making hardware not provide
any more packets.
This seems possible with the Enable bit of the RX BD ring (RBaMR[EN]).
I can't seem to find an exact definition of what this bit does, but when
the RX ring is disabled, the port seems to no longer update the producer
index, and not react to software updates of the consumer index.
In fact, the RBaMR[EN] bit is already toggled by the driver, but too
late for what we want:
enetc_close()
-> enetc_stop()
-> napi_synchronize()
-> enetc_clear_bdrs()
-> enetc_clear_rxbdr()
The enetc_clear_bdrs() function contains not only logic to disable the
RX and TX rings, but also logic to wait for the TX ring stop being busy.
We split enetc_clear_bdrs() into enetc_disable_bdrs() and
enetc_wait_bdrs(). One needs to run before napi_synchronize() and the
other after (NAPI also processes TX completions, so we maximize our
chances of not waiting for the ENETC_TBSR_BUSY bit - unless a packet is
stuck for some reason, ofc).
We also split off enetc_enable_bdrs() from enetc_setup_bdrs(), and call
this from the mirror position in enetc_start() compared to enetc_stop(),
i.e. right before netif_tx_start_all_queues().
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Offloading a BPF program to the RX path of the driver suffers from the
same problems as the PTP reconfiguration - improper error checking can
leave the driver in an invalid state, and the link on the PHY is lost.
Reuse the enetc_reconfigure() procedure, but here, we need to run some
code in the middle of the ring reconfiguration procedure - while the
interface is still down. Introduce a callback which makes that possible.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Follow the convention from this driver, which is to name "struct
net_device *" as "ndev", and the convention from other drivers, to name
"struct netdev_bpf *" as "bpf".
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The crude enetc_stop() -> enetc_open() mechanism suffers from 2
problems:
1. improper error checking
2. it involves phylink_stop() -> phylink_start() which loses the link
Right now, the driver is prepared to offer a better alternative: a ring
reconfiguration procedure which takes the RX BD size (normal or
extended) as argument. It allocates new resources (failing if that
fails), stops the traffic, and assigns the new resources to the rings.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
We want to introduce a fast interface reconfiguration procedure, which
involves temporarily stopping the rings.
But we want enetc_start() and enetc_stop() to not restart PHY autoneg,
because that can take a few seconds until it completes again.
So we need part of enetc_start() and enetc_stop(), but not all of them.
Move phylink_start() right next to phylink_of_phy_connect(), and
phylink_stop() right next to phylink_disconnect_phy(), both still in
ndo_open() and ndo_stop().
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>