1
0
Fork 0
mirror of synced 2025-03-06 20:59:54 +01:00
Commit graph

520 commits

Author SHA1 Message Date
Vladimir Oltean
16bbebd356 net: mscc: ocelot: fix last VCAP IS1/IS2 filter persisting in hardware when deleted
ocelot_vcap_filter_del() works by moving the next filters over the
current one, and then deleting the last filter by calling vcap_entry_set()
with a del_filter which was specially created by memsetting its memory
to zeroes. vcap_entry_set() then programs this to the TCAM and action
RAM via the cache registers.

The problem is that vcap_entry_set() is a dispatch function which looks
at del_filter->block_id. But since del_filter is zeroized memory, the
block_id is 0, or otherwise said, VCAP_ES0. So practically, what we do
is delete the entry at the same TCAM index from VCAP ES0 instead of IS1
or IS2.

The code was not always like this. vcap_entry_set() used to simply be
is2_entry_set(), and then, the logic used to work.

Restore the functionality by populating the block_id of the del_filter
based on the VCAP block of the filter that we're deleting. This makes
vcap_entry_set() know what to do.

Fixes: 1397a2eb52 ("net: mscc: ocelot: create TCAM skeleton from tc filter chains")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-05 19:15:15 -07:00
Vladimir Oltean
e1846cff2f net: mscc: ocelot: mark traps with a bool instead of keeping them in a list
Since the blamed commit, VCAP filters can appear on more than one list.
If their action is "trap", they are chained on ocelot->traps via
filter->trap_list. This is in addition to their normal placement on the
VCAP block->rules list head.

Therefore, when we free a VCAP filter, we must remove it from all lists
it is a member of, including ocelot->traps.

There are at least 2 bugs which are direct consequences of this design
decision.

First is the incorrect usage of list_empty(), meant to denote whether
"filter" is chained into ocelot->traps via filter->trap_list.
This does not do the correct thing, because list_empty() checks whether
"head->next == head", but in our case, head->next == head->prev == NULL.
So we dereference NULL pointers and die when we call list_del().

Second is the fact that not all places that should remove the filter
from ocelot->traps do so. One example is ocelot_vcap_block_remove_filter(),
which is where we have the main kfree(filter). By keeping freed filters
in ocelot->traps we end up in a use-after-free in
felix_update_trapping_destinations().

Attempting to fix all the buggy patterns is a whack-a-mole game which
makes the driver unmaintainable. Actually this is what the previous
patch version attempted to do:
https://patchwork.kernel.org/project/netdevbpf/patch/20220503115728.834457-3-vladimir.oltean@nxp.com/

but it introduced another set of bugs, because there are other places in
which create VCAP filters, not just ocelot_vcap_filter_create():

- ocelot_trap_add()
- felix_tag_8021q_vlan_add_rx()
- felix_tag_8021q_vlan_add_tx()

Relying on the convention that all those code paths must call
INIT_LIST_HEAD(&filter->trap_list) is not going to scale.

So let's do what should have been done in the first place and keep a
bool in struct ocelot_vcap_filter which denotes whether we are looking
at a trapping rule or not. Iterating now happens over the main VCAP IS2
block->rules. The advantage is that we no longer risk having stale
references to a freed filter, since it is only present in that list.

Fixes: e42bd4ed09 ("net: mscc: ocelot: keep traps in a list")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-05 19:15:14 -07:00
Vladimir Oltean
91d350d661 net: mscc: ocelot: don't use magic numbers for OCELOT_POLICER_DISCARD
OCELOT_POLICER_DISCARD helps "kill dropped packets dead" since a
PERMIT/DENY mask mode with a port mask of 0 isn't enough to stop the CPU
port from receiving packets removed from the forwarding path.

The hardcoded initialization done for it in ocelot_vcap_init() is
confusing. All we need from it is to have a rate and a burst size of 0.

Reuse qos_policer_conf_set() for that purpose.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-04 20:42:15 -07:00
Vladimir Oltean
8e90c499bd net: mscc: ocelot: drop port argument from qos_policer_conf_set
The "port" argument is used for nothing else except printing on the
error path. Print errors on behalf of the policer index, which is less
confusing anyway.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-04 20:42:15 -07:00
Vladimir Oltean
09fd1e0d14 net: mscc: ocelot: use list_for_each_entry in ocelot_vcap_filter_add_to_block
Unify the code paths for adding to an empty list and to a list with
elements by keeping a "pos" list_head element that indicates where to
insert. Initialize "pos" with the list head itself in case
list_for_each_entry() doesn't iterate over any element.

Note that list_for_each_safe() isn't needed because no element is
removed from the list while iterating.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-04 20:42:15 -07:00
Vladimir Oltean
3825a0d027 net: mscc: ocelot: add to tail of empty list in ocelot_vcap_filter_add_to_block
This makes no functional difference but helps in minimizing the delta
for a future change.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-04 20:42:14 -07:00
Vladimir Oltean
0a448bba50 net: mscc: ocelot: use list_add_tail in ocelot_vcap_filter_add_to_block()
list_add(..., pos->prev) and list_add_tail(..., pos) are equivalent, use
the later form to unify with the case where the list is empty later.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-04 20:42:14 -07:00
Colin Foster
2f187bfa6f net: ethernet: ocelot: remove the need for num_stats initializer
There is a desire to share the oclot_stats_layout struct outside of the
current vsc7514 driver. In order to do so, the length of the array needs to
be known at compile time, and defined in the struct ocelot and struct
felix_info.

Since the array is defined in a .c file and would be declared in the header
file via:
extern struct ocelot_stat_layout[];
the size of the array will not be known at compile time to outside modules.

To fix this, remove the need for defining the number of stats at compile
time and allow this number to be determined at initialization.

Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-30 13:38:49 +01:00
Jakub Kicinski
0e55546b18 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
include/linux/netdevice.h
net/core/dev.c
  6510ea973d ("net: Use this_cpu_inc() to increment net->core_stats")
  794c24e992 ("net-core: rx_otherhost_dropped to core_stats")
https://lore.kernel.org/all/20220428111903.5f4304e0@canb.auug.org.au/

drivers/net/wan/cosa.c
  d48fea8401 ("net: cosa: fix error check return value of register_chrdev()")
  89fbca3307 ("net: wan: remove support for COSA and SRP synchronous serial boards")
https://lore.kernel.org/all/20220428112130.1f689e5e@canb.auug.org.au/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28 13:02:01 -07:00
Vladimir Oltean
1fcb8fb352 net: mscc: ocelot: don't add VID 0 to ocelot->vlans when leaving VLAN-aware bridge
DSA, through dsa_port_bridge_leave(), first notifies the port of the
fact that it left a bridge, then, if that bridge was VLAN-aware, it
notifies the port of the change in VLAN awareness state, towards
VLAN-unaware mode.

So ocelot_port_vlan_filtering() can be called when ocelot_port->bridge
is NULL, and this makes ocelot_add_vlan_unaware_pvid() create a struct
ocelot_bridge_vlan with a vid of 0 and an "untagged" setting of true on
that port.

In a way this structure correctly reflects the reality, but by design,
VID 0 (OCELOT_STANDALONE_PVID) was not meant to be kept in the bridge
VLAN list of the driver, but managed separately.

Having OCELOT_STANDALONE_PVID in ocelot->vlans makes us trip up on
several sanity checks that did not expect to have this VID there.
For example, after we leave a VLAN-aware bridge and we re-join it, we
can no longer program egress-tagged VLANs to hardware:

 # ip link add br0 type bridge vlan_filtering 1 && ip link set br0 up
 # ip link set swp0 master br0
 # ip link set swp0 nomaster
 # ip link set swp0 master br0
 # bridge vlan add dev swp0 vid 100
Error: mscc_ocelot_switch_lib: Port with more than one egress-untagged VLAN cannot have egress-tagged VLANs.

But this configuration is in fact supported by the hardware, since we
could use OCELOT_PORT_TAG_NATIVE. According to its comment:

/* all VLANs except the native VLAN and VID 0 are egress-tagged */

yet when assessing the eligibility for this mode, we do not check for
VID 0 in ocelot_port_uses_native_vlan(), instead we just ensure that
ocelot_port_num_untagged_vlans() == 1. This is simply because VID 0
doesn't have a bridge VLAN structure.

The way I identify the problem is that ocelot_port_vlan_filtering(false)
only means to call ocelot_add_vlan_unaware_pvid() when we dynamically
turn off VLAN awareness for a bridge we are under, and the PVID changes
from the bridge PVID to a reserved PVID based on the bridge number.

Since OCELOT_STANDALONE_PVID is statically added to the VLAN table
during ocelot_vlan_init() and never removed afterwards, calling
ocelot_add_vlan_unaware_pvid() for it is not intended and does not serve
any purpose.

Fix the issue by avoiding the call to ocelot_add_vlan_unaware_pvid(vid=0)
when we're resetting VLAN awareness after leaving the bridge, to become
a standalone port.

Fixes: 54c3198460 ("net: mscc: ocelot: enforce FDB isolation when VLAN-unaware")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25 11:47:55 +01:00
Vladimir Oltean
9323ac3670 net: mscc: ocelot: ignore VID 0 added by 8021q module
Both the felix DSA driver and ocelot switchdev driver declare
dev->features & NETIF_F_HW_VLAN_CTAG_FILTER under certain circumstances*,
so the 8021q module will add VID 0 to our RX filter when the port goes
up, to ensure 802.1p traffic is not dropped.

We treat VID 0 as a special value (OCELOT_STANDALONE_PVID) which
deliberately does not have a struct ocelot_bridge_vlan associated with
it. Instead, this gets programmed to the VLAN table in ocelot_vlan_init().

If we allow external calls to modify VID 0, we reach the following
situation:

 # ip link add br0 type bridge vlan_filtering 1 && ip link set br0 up
 # ip link set swp0 master br0
 # ip link set swp0 up # this adds VID 0 to ocelot->vlans with untagged=false
bridge vlan
port              vlan-id
swp0              1 PVID Egress Untagged # the bridge also adds VID 1
br0               1 PVID Egress Untagged
 # bridge vlan add dev swp0 vid 100 untagged
Error: mscc_ocelot_switch_lib: Port with egress-tagged VLANs cannot have more than one egress-untagged (native) VLAN.

This configuration should have been accepted, because
ocelot_port_manage_port_tag() should select OCELOT_PORT_TAG_NATIVE.
Yet it isn't, because we have an entry in ocelot->vlans which says
VID 0 should be egress-tagged, something the hardware can't do.

Fix this by suppressing additions/deletions on VID 0 and managing this
VLAN exclusively using OCELOT_STANDALONE_PVID.

*DSA toggles it when the port becomes VLAN-aware by joining a VLAN-aware
bridge. Ocelot declares it unconditionally for some reason.

Fixes: 54c3198460 ("net: mscc: ocelot: enforce FDB isolation when VLAN-unaware")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25 11:47:55 +01:00
Haowen Bai
985e254c73 net: mscc: ocelot: Remove useless code
payload only memset but no use at all, so we drop them.

Signed-off-by: Haowen Bai <baihaowen@meizu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25 11:19:16 +01:00
Vladimir Oltean
4cf35a2b62 net: mscc: ocelot: fix broken IP multicast flooding
When the user runs:
bridge link set dev $br_port mcast_flood on

this command should affect not only L2 multicast, but also IPv4 and IPv6
multicast.

In the Ocelot switch, unknown multicast gets flooded according to
different PGIDs according to its type, and PGID_MC only handles L2
multicast. Therefore, by leaving PGID_MCIPV4 and PGID_MCIPV6 at their
default value of 0, unknown IP multicast traffic is never flooded.

Fixes: 421741ea56 ("net: mscc: ocelot: offload bridge port flags to device")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220415151950.219660-1-vladimir.oltean@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-19 10:33:33 +02:00
Vladimir Oltean
f2a0e216be net: mscc: ocelot: offload per-flow mirroring using tc-mirred and VCAP IS2
Per-flow mirroring with the VCAP IS2 TCAM (in itself handled as an
offload for tc-flower) is done by setting the MIRROR_ENA bit from the
action vector of the filter. The packet is mirrored to the port mask
configured in the ANA:ANA:MIRRORPORTS register (the same port mask as
the destinations for port-based mirroring).

Functionality was tested with:

tc qdisc add dev swp3 clsact
tc filter add dev swp3 ingress protocol ip \
	flower skip_sw ip_proto icmp \
	action mirred egress mirror dev swp1

and pinging through swp3, while seeing that the ICMP replies are
mirrored towards swp1.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17 17:42:47 -07:00
Vladimir Oltean
c3d427eac9 net: mscc: ocelot: establish functions for handling VCAP aux resources
Some VCAP filters utilize resources which are global to the switch, like
for example VCAP IS2 policers take an index into a global policer pool.

In commit c9a7fe1238 ("net: mscc: ocelot: add action of police on
vcap_is2"), Xiaoliang expressed this by hooking into the low-level
ocelot_vcap_filter_add_to_block() and ocelot_vcap_block_remove_filter()
functions, and allocating/freeing the policers from there.

Evaluating the code, there probably isn't a better place, but we'll need
to do something similar for the mirror ports, and the code will start to
look even more hacked up than it is right now.

Create two ocelot_vcap_filter_{add,del}_aux_resources() functions to
contain the madness, and pollute less the body of other functions such
as ocelot_vcap_filter_add_to_block().

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17 17:42:47 -07:00
Vladimir Oltean
ccb6ed426f net: mscc: ocelot: add port mirroring support using tc-matchall
Ocelot switches perform port-based ingress mirroring if
ANA:PORT:PORT_CFG field SRC_MIRROR_ENA is set, and egress mirroring if
the port is in ANA:ANA:EMIRRORPORTS.

Both ingress-mirrored and egress-mirrored frames are copied to the port
mask from ANA:ANA:MIRRORPORTS.

So the choice of limiting to a single mirror port via ocelot_mirror_get()
and ocelot_mirror_put() may seem bizarre, but the hardware model doesn't
map very well to the user space model. If the user wants to mirror the
ingress of swp1 towards swp2 and the ingress of swp3 towards swp4, we'd
have to program ANA:ANA:MIRRORPORTS with BIT(2) | BIT(4), and that would
make swp1 be mirrored towards swp4 too, and swp3 towards swp2. But there
are no tc-matchall rules to describe those actions.

Now, we could offload a matchall rule with multiple mirred actions, one
per desired mirror port, and force the user to stick to the multi-action
rule format for subsequent matchall filters. But both DSA and ocelot
have the flow_offload_has_one_action() check for the matchall offload,
plus the fact that it will get cumbersome to cross-check matchall
mirrors with flower mirrors (which will be added in the next patch).

As a result, we limit the configuration to a single mirror port, with
the possibility of lifting the restriction in the future.

Frames injected from the CPU don't get egress-mirrored, since they are
sent with the BYPASS bit in the injection frame header, and this
bypasses the analyzer module (effectively also the mirroring logic).
I don't know what to do/say about this.

Functionality was tested with:

tc qdisc add dev swp3 clsact
tc filter add dev swp3 ingress \
	matchall skip_sw \
	action mirred egress mirror dev swp1

and pinging through swp3, while seeing that the ICMP replies are
mirrored towards swp1.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17 17:42:46 -07:00
Vladimir Oltean
4fa7210802 net: mscc: ocelot: refactor policer work out of ocelot_setup_tc_cls_matchall
In preparation for adding port mirroring support to the ocelot driver,
the dispatching function ocelot_setup_tc_cls_matchall() must be free of
action-specific code. Move port policer creation and deletion to
separate functions.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17 17:42:46 -07:00
Jakub Kicinski
e243f39685 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17 13:56:58 -07:00
Vladimir Oltean
8e0341aefc net: mscc: ocelot: fix backwards compatibility with single-chain tc-flower offload
ACL rules can be offloaded to VCAP IS2 either through chain 0, or, since
the blamed commit, through a chain index whose number encodes a specific
PAG (Policy Action Group) and lookup number.

The chain number is translated through ocelot_chain_to_pag() into a PAG,
and through ocelot_chain_to_lookup() into a lookup number.

The problem with the blamed commit is that the above 2 functions don't
have special treatment for chain 0. So ocelot_chain_to_pag(0) returns
filter->pag = 224, which is in fact -32, but the "pag" field is an u8.

So we end up programming the hardware with VCAP IS2 entries having a PAG
of 224. But the way in which the PAG works is that it defines a subset
of VCAP IS2 filters which should match on a packet. The default PAG is
0, and previous VCAP IS1 rules (which we offload using 'goto') can
modify it. So basically, we are installing filters with a PAG on which
no packet will ever match. This is the hardware equivalent of adding
filters to a chain which has no 'goto' to it.

Restore the previous functionality by making ACL filters offloaded to
chain 0 go to PAG 0 and lookup number 0. The choice of PAG is clearly
correct, but the choice of lookup number isn't "as before" (which was to
leave the lookup a "don't care"). However, lookup 0 should be fine,
since even though there are ACL actions (policers) which have a
requirement to be used in a specific lookup, that lookup is 0.

Fixes: 226e9cd82a ("net: mscc: ocelot: only install TCAM entries into a specific lookup and PAG")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220316192117.2568261-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17 09:34:52 -07:00
Vladimir Oltean
72f56fdb97 net: mscc: ocelot: fix build error due to missing IEEE_8021QAZ_MAX_TCS
IEEE_8021QAZ_MAX_TCS is defined in include/uapi/linux/dcbnl.h, which is
included by net/dcbnl.h. Then, linux/netdevice.h conditionally includes
net/dcbnl.h if CONFIG_DCB is enabled.

Therefore, when CONFIG_DCB is disabled, this indirect dependency is
broken.

There isn't a good reason to include net/dcbnl.h headers into the ocelot
switch library which exports low-level hardware API, so replace
IEEE_8021QAZ_MAX_TCS with OCELOT_NUM_TC which has the same value.

Fixes: 978777d0fb ("net: dsa: felix: configure default-prio and dscp priorities")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220315131215.273450-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-15 11:47:11 -07:00
Vladimir Oltean
978777d0fb net: dsa: felix: configure default-prio and dscp priorities
Follow the established programming model for this driver and provide
shims in the felix DSA driver which call the implementations from the
ocelot switch lib. The ocelot switchdev driver wasn't integrated with
dcbnl due to lack of hardware availability.

The switch doesn't have any fancy QoS classification enabled by default.
The provided getters will create a default-prio app table entry of 0,
and no dscp entry. However, the getters have been made to actually
retrieve the hardware configuration rather than static values, to be
future proof in case DSA will need this information from more call paths.

For default-prio, there is a single field per port, in ANA_PORT_QOS_CFG,
called QOS_DEFAULT_VAL.

DSCP classification is enabled per-port, again via ANA_PORT_QOS_CFG
(field QOS_DSCP_ENA), and individual DSCP values are configured as
trusted or not through register ANA_DSCP_CFG (replicated 64 times).
An untrusted DSCP value falls back to other QoS classification methods.
If trusted, the selected ANA_DSCP_CFG register also holds the QoS class
in the QOS_DSCP_VAL field.

The hardware also supports DSCP remapping (DSCP value X is translated to
DSCP value Y before the QoS class is determined based on the app table
entry for Y) and DSCP packet rewriting. The dcbnl framework, for being
so flexible in other useless areas, doesn't appear to support this.
So this functionality has been left out.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-14 10:36:15 +00:00
Vladimir Oltean
c3cde44f3c net: mscc: ocelot: use pretty names for IPPROTO_UDP and IPPROTO_TCP
Hardcoding these IP protocol numbers in is2_entry_set() obscures the
purpose of the code, so replace the magic numbers with the definitions
from linux/in.h.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-04 09:21:23 +00:00
Vladimir Oltean
c5a0edaeb9 net: mscc: ocelot: use list_for_each_entry in ocelot_vcap_block_remove_filter
Simplify ocelot_vcap_block_remove_filter by using list_for_each_entry
instead of list_for_each.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-04 09:21:23 +00:00
Vladimir Oltean
ac45520960 net: mscc: ocelot: accept configuring bridge port flags on the NPI port
In order for the Felix DSA driver to be able to turn on/off flooding
towards its CPU port, we need to redirect calls on the NPI port to
actually act upon the index in the analyzer block that corresponds to
the CPU port module. This was never necessary until now because DSA
(or the bridge) never called ocelot_port_bridge_flags() for the NPI
port.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-03 14:15:31 +00:00
Jianbo Liu
d97b4b105c flow_offload: reject offload for all drivers with invalid police parameters
As more police parameters are passed to flow_offload, driver can check
them to make sure hardware handles packets in the way indicated by tc.
The conform-exceed control should be drop/pipe or drop/ok. Besides,
for drop/ok, the police should be the last action. As hardware can't
configure peakrate/avrate/overhead, offload should not be supported if
any of them is configured.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-28 11:12:20 +00:00
Vladimir Oltean
54c3198460 net: mscc: ocelot: enforce FDB isolation when VLAN-unaware
Currently ocelot uses a pvid of 0 for standalone ports and ports under a
VLAN-unaware bridge, and the pvid of the bridge for ports under a
VLAN-aware bridge. Standalone ports do not perform learning, but packets
received on them are still subject to FDB lookups. So if the MAC DA that
a standalone port receives has been also learned on a VLAN-unaware
bridge port, ocelot will attempt to forward to that port, even though it
can't, so it will drop packets.

So there is a desire to avoid that, and isolate the FDBs of different
bridges from one another, and from standalone ports.

The ocelot switch library has two distinct entry points: the felix DSA
driver and the ocelot switchdev driver.

We need to code up a minimal bridge_num allocation in the ocelot
switchdev driver too, this is copied from DSA with the exception that
ocelot does not care about DSA trees, cross-chip bridging etc. So it
only looks at its own ports that are already in the same bridge.

The ocelot switchdev driver uses the bridge_num it has allocated itself,
while the felix driver uses the bridge_num allocated by DSA. They are
both stored inside ocelot_port->bridge_num by the common function
ocelot_port_bridge_join() which receives the bridge_num passed by value.

Once we have a bridge_num, we can only use it to enforce isolation
between VLAN-unaware bridges. As far as I can see, ocelot does not have
anything like a FID that further makes VLAN 100 from a port be different
to VLAN 100 from another port with regard to FDB lookup. So we simply
deny multiple VLAN-aware bridges.

For VLAN-unaware bridges, we crop the 4000-4095 VLAN region and we
allocate a VLAN for each bridge_num. This will be used as the pvid of
each port that is under that VLAN-unaware bridge, for as long as that
bridge is VLAN-unaware.

VID 0 remains only for standalone ports. It is okay if all standalone
ports use the same VID 0, since they perform no address learning, the
FDB will contain no entry in VLAN 0, so the packets will always be
flooded to the only possible destination, the CPU port.

The CPU port module doesn't need to be member of the VLANs to receive
packets, but if we use the DSA tag_8021q protocol, those packets are
part of the data plane as far as ocelot is concerned, so there it needs
to. Just ensure that the DSA tag_8021q CPU port is a member of all
reserved VLANs when it is created, and is removed when it is deleted.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-27 11:06:14 +00:00
Vladimir Oltean
961d8b6990 net: dsa: felix: support FDB entries on offloaded LAG interfaces
This adds the logic in the Felix DSA driver and Ocelot switch library.
For Ocelot switches, the DEST_IDX that is the output of the MAC table
lookup is a logical port (equal to physical port, if no LAG is used, or
a dynamically allocated number otherwise). The allocation we have in
place for LAG IDs is different from DSA's, so we can't use that:
- DSA allocates a continuous range of LAG IDs starting from 1
- Ocelot appears to require that physical ports and LAG IDs are in the
  same space of [0, num_phys_ports), and additionally, ports that aren't
  in a LAG must have physical port id == logical port id

The implication is that an FDB entry towards a LAG might need to be
deleted and reinstalled when the LAG ID changes.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24 21:31:44 -08:00
Jakub Kicinski
6b5567b1b2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17 11:44:20 -08:00
Vladimir Oltean
9d75b88185 net: mscc: ocelot: annotate which traps need PTP timestamping
The ocelot switch library does not need this information, but the felix
DSA driver does.

As a reminder, the VSC9959 switch in LS1028A doesn't have an IRQ line
for packet extraction, so to be notified that a PTP packet needs to be
dequeued, it receives that packet also over Ethernet, by setting up a
packet trap. The Felix driver needs to install special kinds of traps
for packets in need of RX timestamps, such that the packets are
replicated both over Ethernet and over the CPU port module.

But the Ocelot switch library sets up more than one trap for PTP event
messages; it also traps PTP general messages, MRP control messages etc.
Those packets don't need PTP timestamps, so there's no reason for the
Felix driver to send them to the CPU port module.

By knowing which traps need PTP timestamps, the Felix driver can
adjust the traps installed using ocelot_trap_add() such that only those
will actually get delivered to the CPU port module.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-17 14:06:51 +00:00
Vladimir Oltean
e42bd4ed09 net: mscc: ocelot: keep traps in a list
When using the ocelot-8021q tagging protocol, the CPU port isn't
configured as an NPI port, but is a regular port. So a "trap to CPU"
operation is actually a "redirect" operation. So DSA needs to set up the
trapping action one way or another, depending on the tagging protocol in
use.

To ease DSA's work of modifying the action, keep all currently installed
traps in a list, so that DSA can live-patch them when the tagging
protocol changes.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-17 14:06:51 +00:00
Vladimir Oltean
b9bace6e53 net: mscc: ocelot: use a single VCAP filter for all MRP traps
The MRP assist code installs a VCAP IS2 trapping rule for each port, but
since the key and the action is the same, just the ingress port mask
differs, there isn't any need to do this. We can save some space in the
TCAM by using a single filter and adjusting the ingress port mask.

Reuse the ocelot_trap_add() and ocelot_trap_del() functions for this
purpose.

Now that the cookies are no longer per port, we need to change the
allocation scheme such that MRP traps use a fixed number.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-17 14:06:51 +00:00
Vladimir Oltean
36fac35b29 net: mscc: ocelot: delete OCELOT_MRP_CPUQ
MRP frames are configured to be trapped to the CPU queue 7, and this
number is reflected in the extraction header. However, the information
isn't used anywhere, so just leave MRP frames to go to CPU queue 0
unless needed otherwise.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-17 14:06:51 +00:00
Vladimir Oltean
c518afec28 net: mscc: ocelot: consolidate cookie allocation for private VCAP rules
Every use case that needed VCAP filters (in order: DSA tag_8021q, MRP,
PTP traps) has hardcoded filter identifiers that worked well enough for
that use case alone. But when two or more of those use cases would be
used together, some of those identifiers would overlap, leading to
breakage.

Add definitions for each cookie and centralize them in ocelot_vcap.h,
such that the overlaps are more obvious.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-17 14:06:51 +00:00
Vladimir Oltean
e3c02b7c65 net: mscc: ocelot: use a consistent cookie for MRP traps
The driver uses an identifier equal to (ocelot->num_phys_ports + port)
for MRP traps installed when the system is in the role of an MRC, and an
identifier equal to (port) otherwise.

Use the same identifier in both cases as a consolidation for the various
cookie values spread throughout the driver.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-17 14:06:50 +00:00
Vladimir Oltean
ef57640575 net: mscc: ocelot: fix use-after-free in ocelot_vlan_del()
ocelot_vlan_member_del() will free the struct ocelot_bridge_vlan, so if
this is the same as the port's pvid_vlan which we access afterwards,
what we're accessing is freed memory.

Fix the bug by determining whether to clear ocelot_port->pvid_vlan prior
to calling ocelot_vlan_member_del().

Fixes: d4004422f6 ("net: mscc: ocelot: track the port pvid using a pointer")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-15 14:38:20 +00:00
Colin Foster
d87b1c08f3 net: mscc: ocelot: use bulk reads for stats
Create and utilize bulk regmap reads instead of single access for gathering
stats. The background reading of statistics happens frequently, and over
a few contiguous memory regions.

High speed PCIe buses and MMIO access will probably see negligible
performance increase. Lower speed buses like SPI and I2C could see
significant performance increase, since the bus configuration and register
access times account for a large percentage of data transfer time.

Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-14 13:24:29 +00:00
Colin Foster
40f3a5c815 net: mscc: ocelot: add ability to perform bulk reads
Regmap supports bulk register reads. Ocelot does not. This patch adds
support for Ocelot to invoke bulk regmap reads. That will allow any driver
that performs consecutive reads over memory regions to optimize that
access.

Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-14 13:24:29 +00:00
Colin Foster
e27d785e60 net: mscc: ocelot: remove unnecessary stat reading from ethtool
The ocelot_update_stats function only needs to read from one port, yet it
was updating the stats for all ports. Update to only read the stats that
are necessary.

Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-14 13:24:29 +00:00
Colin Foster
7fbf6795d1 net: mscc: ocelot: fix mutex lock error during ethtool stats read
An ongoing workqueue populates the stats buffer. At the same time, a user
might query the statistics. While writing to the buffer is mutex-locked,
reading from the buffer wasn't. This could lead to buggy reads by ethtool.

This patch fixes the former blamed commit, but the bug was introduced in
the latter.

Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Fixes: 1e1caa9735 ("ocelot: Clean up stats update deferred work")
Fixes: a556c76adc ("net: mscc: Add initial Ocelot switch support")
Reported-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/all/20220210150451.416845-2-colin.foster@in-advantage.com/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-10 11:44:41 -08:00
Vladimir Oltean
59085208e4 net: mscc: ocelot: fix all IP traffic getting trapped to CPU with PTP over IP
The filters for the PTP trap keys are incorrectly configured, in the
sense that is2_entry_set() only looks at trap->key.ipv4.dport or
trap->key.ipv6.dport if trap->key.ipv4.proto or trap->key.ipv6.proto is
set to IPPROTO_TCP or IPPROTO_UDP.

But we don't do that, so is2_entry_set() goes through the "else" branch
of the IP protocol check, and ends up installing a rule for "Any IP
protocol match" (because msk is also 0). The UDP port is ignored.

This means that when we run "ptp4l -i swp0 -4", all IP traffic is
trapped to the CPU, which hinders bridging.

Fix this by specifying the IP protocol in the VCAP IS2 filters for PTP
over UDP.

Fixes: 96ca08c058 ("net: mscc: ocelot: set up traps for PTP packets")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-05 15:27:01 +00:00
Tom Rix
baa59504c1 net: mscc: ocelot: fix using match before it is set
Clang static analysis reports this issue
ocelot_flower.c:563:8: warning: 1st function call argument
  is an uninitialized value
    !is_zero_ether_addr(match.mask->dst)) {
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The variable match is used before it is set.  So move the
block.

Fixes: 75944fda1d ("net: mscc: ocelot: offload ingress skbedit and vlan actions to VCAP IS1")
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-19 14:34:43 +00:00
Horatiu Vultur
c0b7f7d7e0 net: ocelot: Fix the call to switchdev_bridge_port_offload
In the blamed commit, the call to the function
switchdev_bridge_port_offload was passing the wrong argument for
atomic_nb. It was ocelot_netdevice_nb instead of ocelot_swtchdev_nb.
This patch fixes this issue.

Fixes: 4e51bf44a0 ("net: bridge: move the switchdev object replay helpers to "push" mode")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-17 13:04:25 +00:00
Vladimir Oltean
80f15f3bef net: mscc: ocelot: don't dereference NULL pointers with shared tc filters
The following command sequence:

tc qdisc del dev swp0 clsact
tc qdisc add dev swp0 ingress_block 1 clsact
tc qdisc add dev swp1 ingress_block 1 clsact
tc filter add block 1 flower action drop
tc qdisc del dev swp0 clsact

produces the following NPD:

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000014
pc : vcap_entry_set+0x14/0x70
lr : ocelot_vcap_filter_del+0x198/0x234
Call trace:
 vcap_entry_set+0x14/0x70
 ocelot_vcap_filter_del+0x198/0x234
 ocelot_cls_flower_destroy+0x94/0xe4
 felix_cls_flower_del+0x70/0x84
 dsa_slave_setup_tc_block_cb+0x13c/0x60c
 dsa_slave_setup_tc_block_cb_ig+0x20/0x30
 tc_setup_cb_reoffload+0x44/0x120
 fl_reoffload+0x280/0x320
 tcf_block_playback_offloads+0x6c/0x184
 tcf_block_unbind+0x80/0xe0
 tcf_block_setup+0x174/0x214
 tcf_block_offload_cmd.isra.0+0x100/0x13c
 tcf_block_offload_unbind+0x5c/0xa0
 __tcf_block_put+0x54/0x174
 tcf_block_put_ext+0x5c/0x74
 clsact_destroy+0x40/0x60
 qdisc_destroy+0x4c/0x150
 qdisc_put+0x70/0x90
 qdisc_graft+0x3f0/0x4c0
 tc_get_qdisc+0x1cc/0x364
 rtnetlink_rcv_msg+0x124/0x340

The reason is that the driver isn't prepared to receive two tc filters
with the same cookie. It unconditionally creates a new struct
ocelot_vcap_filter for each tc filter, and it adds all filters with the
same identifier (cookie) to the ocelot_vcap_block.

The problem is here, in ocelot_vcap_filter_del():

	/* Gets index of the filter */
	index = ocelot_vcap_block_get_filter_index(block, filter);
	if (index < 0)
		return index;

	/* Delete filter */
	ocelot_vcap_block_remove_filter(ocelot, block, filter);

	/* Move up all the blocks over the deleted filter */
	for (i = index; i < block->count; i++) {
		struct ocelot_vcap_filter *tmp;

		tmp = ocelot_vcap_block_find_filter_by_index(block, i);
		vcap_entry_set(ocelot, i, tmp);
	}

what will happen is ocelot_vcap_block_get_filter_index() will return the
index (@index) of the first filter found with that cookie. This is _not_
the index of _this_ filter, but the other one with the same cookie,
because ocelot_vcap_filter_equal() gets fooled.

Then later, ocelot_vcap_block_remove_filter() is coded to remove all
filters that are ocelot_vcap_filter_equal() with the passed @filter.
So unexpectedly, both filters get deleted from the list.

Then ocelot_vcap_filter_del() will attempt to move all the other filters
up, again finding them by index (@i). The block count is 2, @index was 0,
so it will attempt to move up filter @i=0 and @i=1. It assigns tmp =
ocelot_vcap_block_find_filter_by_index(block, i), which is now a NULL
pointer because ocelot_vcap_block_remove_filter() has removed more than
one filter.

As far as I can see, this problem has been there since the introduction
of tc offload support, however I cannot test beyond the blamed commit
due to hardware availability. In any case, any fix cannot be backported
that far, due to lots of changes to the code base.

Therefore, let's go for the correct solution, which is to not call
ocelot_vcap_filter_add() and ocelot_vcap_filter_del(), unless the filter
is actually unique and not shared. For the shared filters, we should
just modify the ingress port mask and call ocelot_vcap_filter_replace(),
a function introduced by commit 95706be13b ("net: mscc: ocelot: create
a function that replaces an existing VCAP filter"). This way,
block->rules will only contain filters with unique cookies, by design.

Fixes: 07d985eef0 ("net: dsa: felix: Wire up the ocelot cls_flower methods")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-15 22:31:50 +00:00
Vladimir Oltean
33cb0ff30c net: mscc: ocelot: don't let phylink re-enable TX PAUSE on the NPI port
Since commit b39648079d ("net: mscc: ocelot: disable flow control on
NPI interface"), flow control should be disabled on the DSA CPU port
when used in NPI mode.

However, the commit blamed in the Fixes: tag below broke this, because
it allowed felix_phylink_mac_link_up() to overwrite SYS_PAUSE_CFG_PAUSE_ENA
for the DSA CPU port.

This issue became noticeable since the device tree update from commit
8fcea7be57 ("arm64: dts: ls1028a: mark internal links between Felix
and ENETC as capable of flow control").

The solution is to check whether this is the currently configured NPI
port from ocelot_phylink_mac_link_up(), and to not modify the statically
disabled PAUSE frame transmission if it is.

When the port is configured for lossless mode as opposed to tail drop
mode, but the link partner (DSA master) doesn't observe the transmitted
PAUSE frames, the switch termination throughput is much worse, as can be
seen below.

Before:

root@debian:~# iperf3 -c 192.168.100.2
Connecting to host 192.168.100.2, port 5201
[  5] local 192.168.100.1 port 37504 connected to 192.168.100.2 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  28.4 MBytes   238 Mbits/sec  357   22.6 KBytes
[  5]   1.00-2.00   sec  33.6 MBytes   282 Mbits/sec  426   19.8 KBytes
[  5]   2.00-3.00   sec  34.0 MBytes   285 Mbits/sec  343   21.2 KBytes
[  5]   3.00-4.00   sec  32.9 MBytes   276 Mbits/sec  354   22.6 KBytes
[  5]   4.00-5.00   sec  32.3 MBytes   271 Mbits/sec  297   18.4 KBytes
^C[  5]   5.00-5.06   sec  2.05 MBytes   270 Mbits/sec   45   19.8 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-5.06   sec   163 MBytes   271 Mbits/sec  1822             sender
[  5]   0.00-5.06   sec  0.00 Bytes  0.00 bits/sec                  receiver

After:

root@debian:~# iperf3 -c 192.168.100.2
Connecting to host 192.168.100.2, port 5201
[  5] local 192.168.100.1 port 49470 connected to 192.168.100.2 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   112 MBytes   941 Mbits/sec  259    143 KBytes
[  5]   1.00-2.00   sec   110 MBytes   920 Mbits/sec  329    144 KBytes
[  5]   2.00-3.00   sec   112 MBytes   936 Mbits/sec  255    144 KBytes
[  5]   3.00-4.00   sec   110 MBytes   927 Mbits/sec  355    105 KBytes
[  5]   4.00-5.00   sec   110 MBytes   926 Mbits/sec  350    156 KBytes
[  5]   5.00-6.00   sec   110 MBytes   925 Mbits/sec  305    148 KBytes
[  5]   6.00-7.00   sec   110 MBytes   924 Mbits/sec  320    143 KBytes
[  5]   7.00-8.00   sec   110 MBytes   925 Mbits/sec  273   97.6 KBytes
[  5]   8.00-9.00   sec   109 MBytes   913 Mbits/sec  299    141 KBytes
[  5]   9.00-10.00  sec   110 MBytes   922 Mbits/sec  287    146 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.08 GBytes   926 Mbits/sec  3032             sender
[  5]   0.00-10.00  sec  1.08 GBytes   925 Mbits/sec                  receiver

Fixes: de274be32c ("net: dsa: felix: set TX flow control according to the phylink_mac_link_up resolution")
Reported-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-13 12:52:15 +00:00
Vladimir Oltean
5cad43a52e net: dsa: felix: add port fast age support
Add support for flushing the MAC table on a given port in the ocelot
switch library, and use this functionality in the felix DSA driver.

This operation is needed when a port leaves a bridge to become
standalone, and when the learning is disabled, and when the STP state
changes to a state where no FDB entry should be present.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220107144229.244584-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-07 18:58:25 -08:00
Vladimir Oltean
a14e6b69f3 net: mscc: ocelot: fix incorrect balancing with down LAG ports
Assuming the test setup described here:
https://patchwork.kernel.org/project/netdevbpf/cover/20210205130240.4072854-1-vladimir.oltean@nxp.com/
(swp1 and swp2 are in bond0, and bond0 is in a bridge with swp0)

it can be seen that when swp1 goes down (on either board A or B), then
traffic that should go through that port isn't forwarded anywhere.

A dump of the PGID table shows the following:

PGID_DST[0] = ports 0
PGID_DST[1] = ports 1
PGID_DST[2] = ports 2
PGID_DST[3] = ports 3
PGID_DST[4] = ports 4
PGID_DST[5] = ports 5
PGID_DST[6] = no ports
PGID_AGGR[0] = ports 0, 1, 2, 3, 4, 5
PGID_AGGR[1] = ports 0, 1, 2, 3, 4, 5
PGID_AGGR[2] = ports 0, 1, 2, 3, 4, 5
PGID_AGGR[3] = ports 0, 1, 2, 3, 4, 5
PGID_AGGR[4] = ports 0, 1, 2, 3, 4, 5
PGID_AGGR[5] = ports 0, 1, 2, 3, 4, 5
PGID_AGGR[6] = ports 0, 1, 2, 3, 4, 5
PGID_AGGR[7] = ports 0, 1, 2, 3, 4, 5
PGID_AGGR[8] = ports 0, 1, 2, 3, 4, 5
PGID_AGGR[9] = ports 0, 1, 2, 3, 4, 5
PGID_AGGR[10] = ports 0, 1, 2, 3, 4, 5
PGID_AGGR[11] = ports 0, 1, 2, 3, 4, 5
PGID_AGGR[12] = ports 0, 1, 2, 3, 4, 5
PGID_AGGR[13] = ports 0, 1, 2, 3, 4, 5
PGID_AGGR[14] = ports 0, 1, 2, 3, 4, 5
PGID_AGGR[15] = ports 0, 1, 2, 3, 4, 5
PGID_SRC[0] = ports 1, 2
PGID_SRC[1] = ports 0
PGID_SRC[2] = ports 0
PGID_SRC[3] = no ports
PGID_SRC[4] = no ports
PGID_SRC[5] = no ports
PGID_SRC[6] = ports 0, 1, 2, 3, 4, 5

Whereas a "good" PGID configuration for that setup should have looked
like this:

PGID_DST[0] = ports 0
PGID_DST[1] = ports 1, 2
PGID_DST[2] = ports 1, 2
PGID_DST[3] = ports 3
PGID_DST[4] = ports 4
PGID_DST[5] = ports 5
PGID_DST[6] = no ports
PGID_AGGR[0] = ports 0, 2, 3, 4, 5
PGID_AGGR[1] = ports 0, 2, 3, 4, 5
PGID_AGGR[2] = ports 0, 2, 3, 4, 5
PGID_AGGR[3] = ports 0, 2, 3, 4, 5
PGID_AGGR[4] = ports 0, 2, 3, 4, 5
PGID_AGGR[5] = ports 0, 2, 3, 4, 5
PGID_AGGR[6] = ports 0, 2, 3, 4, 5
PGID_AGGR[7] = ports 0, 2, 3, 4, 5
PGID_AGGR[8] = ports 0, 2, 3, 4, 5
PGID_AGGR[9] = ports 0, 2, 3, 4, 5
PGID_AGGR[10] = ports 0, 2, 3, 4, 5
PGID_AGGR[11] = ports 0, 2, 3, 4, 5
PGID_AGGR[12] = ports 0, 2, 3, 4, 5
PGID_AGGR[13] = ports 0, 2, 3, 4, 5
PGID_AGGR[14] = ports 0, 2, 3, 4, 5
PGID_AGGR[15] = ports 0, 2, 3, 4, 5
PGID_SRC[0] = ports 1, 2
PGID_SRC[1] = ports 0
PGID_SRC[2] = ports 0
PGID_SRC[3] = no ports
PGID_SRC[4] = no ports
PGID_SRC[5] = no ports
PGID_SRC[6] = ports 0, 1, 2, 3, 4, 5

In other words, in the "bad" configuration, the attempt is to remove the
inactive swp1 from the destination ports via PGID_DST. But when a MAC
table entry is learned, it is learned towards PGID_DST 1, because that
is the logical port id of the LAG itself (it is equal to the lowest
numbered member port). So when swp1 becomes inactive, if we set
PGID_DST[1] to contain just swp1 and not swp2, the packet will not have
any chance to reach the destination via swp2.

The "correct" way to remove swp1 as a destination is via PGID_AGGR
(remove swp1 from the aggregation port groups for all aggregation
codes). This means that PGID_DST[1] and PGID_DST[2] must still contain
both swp1 and swp2. This makes the MAC table still treat packets
destined towards the single-port LAG as "multicast", and the inactive
ports are removed via the aggregation code tables.

The change presented here is a design one: the ocelot_get_bond_mask()
function used to take an "only_active_ports" argument. We don't need
that. The only call site that specifies only_active_ports=true,
ocelot_set_aggr_pgids(), must retrieve the entire bonding mask, because
it must program that into PGID_DST. Additionally, it must also clear the
inactive ports from the bond mask here, which it can't do if bond_mask
just contains the active ports:

	ac = ocelot_read_rix(ocelot, ANA_PGID_PGID, i);
	ac &= ~bond_mask;  <---- here
	/* Don't do division by zero if there was no active
	 * port. Just make all aggregation codes zero.
	 */
	if (num_active_ports)
		ac |= BIT(aggr_idx[i % num_active_ports]);
	ocelot_write_rix(ocelot, ac, ANA_PGID_PGID, i);

So it becomes the responsibility of ocelot_set_aggr_pgids() to take
ocelot_port->lag_tx_active into consideration when populating the
aggr_idx array.

Fixes: 23ca3b727e ("net: mscc: ocelot: rebalance LAGs on link up/down events")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220107164332.402133-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-07 18:54:59 -08:00
Baowen Zheng
5a9959008f flow_offload: add index to flow_action_entry structure
Add index to flow_action_entry structure and delete index from police and
gate child structure.

We make this change to offload tc action for driver to identify a tc
action.

Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-19 14:08:47 +00:00
Clément Léger
8438699512 net: ocelot: add support to get port mac from device-tree
Add support to get mac from device-tree using of_get_ethdev_address.

Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Clément Léger <clement.leger@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-15 10:29:41 +00:00
Hangbin Liu
9c9211a3fc net_tstamp: add new flag HWTSTAMP_FLAG_BONDED_PHC_INDEX
Since commit 94dd016ae5 ("bond: pass get_ts_info and SIOC[SG]HWTSTAMP
ioctl to active device") the user could get bond active interface's
PHC index directly. But when there is a failover, the bond active
interface will change, thus the PHC index is also changed. This may
break the user's program if they did not update the PHC timely.

This patch adds a new hwtstamp_config flag HWTSTAMP_FLAG_BONDED_PHC_INDEX.
When the user wants to get the bond active interface's PHC, they need to
add this flag and be aware the PHC index may be changed.

With the new flag. All flag checks in current drivers are removed. Only
the checking in net_hwtstamp_validate() is kept.

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-14 12:28:24 +00:00
Clément Léger
3cfcda2aee net: ocelot: use dma_unmap_addr to get tx buffer dma_addr
dma_addr was declared using DEFINE_DMA_UNMAP_ADDR() which requires to
use dma_unmap_addr() to access it.

Reported-by: kernel test robot <lkp@intel.com>
Fixes: 753a026cfe ("net: ocelot: add FDMA support")
Signed-off-by: Clément Léger <clement.leger@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-13 14:51:21 +00:00