Change the check for unsupported control flags, to use the new helper
flow_rule_is_supp_control_flags().
Since the helper was based on the sfc code, nothing really changes.
Compile-tested, and compiled objects are identical.
Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
Reviewed-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20240417140712.100905-1-ast@fiberby.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reduce the length of netlink error messages as they are likely to be
truncated anyway. Additionally, reword netlink error messages so they
are more consistent with previous messages.
Fixes: 9dbc8d2b9a ("sfc: add decrement ipv6 hop limit by offloading set hop limit actions")
Fixes: 3c9561c0a5 ("sfc: support TC decap rules matching on enc_ip_tos")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202310202136.4u7bv0hp-lkp@intel.com/
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Reviewed-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20231020140149.30490-1-pieter.jansen-van-vuuren@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If an IP address and/or L4 port for NAPT is available from a CT match,
the MAE will perform the edits; if no CT lookup has been performed for
this packet, the CT lookup did not return a match, or the matched CT
entry did not include NAPT, the action will have no effect.
Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a foreign LHS rule (TC rule from a tunnel netdev which requests
conntrack lookup) matches on inner headers or enc_key_id, these matches
cannot be performed by the Outer Rule table, as the keys are only
available after the tunnel type has been identified (by the OR lookup)
and the rest of the headers parsed accordingly.
Offload such rules with an Action Rule, using the LOOKUP_CONTROL section
of the AR response to specify the conntrack and/or recirculation actions,
combined with an Outer Rule which performs only the usual Encap Match
duties.
This processing flow, as it requires two AR lookups per packet, is less
performant than OR-CT-AR, so only use it where necessary.
Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There were a few places where no extack error message was set, or the
extack was not forwarded to callees, potentially resulting in a return
of -EOPNOTSUPP with no additional information.
Make sure to populate the error message in these cases. In practice
this does us no good as TC indirect block callbacks don't come with an
extack to fill in; but maybe they will someday and when debugging it's
possible to provide a fake extack and emit its message to the console.
Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Normally, if a TC filter on a tunnel netdev does not match on any
encap fields, we decline to offload it, as it cannot meet our
requirement for a <sip,dip,dport> tuple for the encap match.
However, if the rule has a nonzero chain_index, then for a packet to
reach the rule, it must already have matched a LHS rule which will
have included an encap match and determined the tunnel type, so in
that case we can offload the right-hand-side rule.
Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow a tunnel netdevice (such as a vxlan) to offload conntrack lookups,
in much the same way as efx netdevs.
To ensure this rule does not overlap with other tunnel rules on the same
sip,dip,dport tuple, register a pseudo encap match of a new type
(EFX_TC_EM_PSEUDO_OR), which unlike PSEUDO_MASK may only be referenced
once (because an actual Outer Rule in hardware exists, although its
fw_id is not recorded in the encap match entry).
Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Several places in TC offload code assumed that the return from
rhashtable_lookup_get_insert_fast() was always either NULL or a valid
pointer to an existing entry, but in fact that function can return an
error pointer. In that case, perform the usual cleanup of the newly
created entry, then pass up the error, rather than attempting to take a
reference on the old entry.
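The three-way return handling described above can be sketched in userspace as follows. This is an illustrative model only: the demo_* helpers stand in for the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() and for rhashtable_lookup_get_insert_fast(), and the one-slot "table" is a hypothetical simplification, not the driver's data structure.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel's ERR_PTR(), IS_ERR(), PTR_ERR(). */
#define DEMO_MAX_ERRNO 4095
static void *demo_err_ptr(long err) { return (void *)(intptr_t)err; }
static int demo_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}
static long demo_ptr_err(const void *p) { return (long)(intptr_t)p; }

struct demo_entry {
	int key;
	int refcnt;
};

static struct demo_entry *demo_slot;	/* hypothetical one-slot "table" */

/* Model of the lookup-or-insert call: NULL means 'new' was inserted, a
 * valid pointer is an existing entry with the same key, and an error
 * pointer means the insert failed (here: the single slot is taken).
 */
static struct demo_entry *demo_lookup_get_insert(struct demo_entry *new)
{
	if (demo_slot) {
		if (demo_slot->key == new->key)
			return demo_slot;
		return demo_err_ptr(-ENOSPC);
	}
	demo_slot = new;
	return NULL;
}

/* Find or create the entry for 'key'; returns an error pointer on failure. */
static struct demo_entry *demo_get(int key)
{
	struct demo_entry *new, *old;

	new = calloc(1, sizeof(*new));
	if (!new)
		return demo_err_ptr(-ENOMEM);
	new->key = key;
	old = demo_lookup_get_insert(new);
	if (demo_is_err(old)) {
		/* Insert failed: clean up the new entry, pass the error up. */
		free(new);
		return old;
	}
	if (old) {
		/* Duplicate: drop ours and take a reference on the old one. */
		free(new);
		assert(old->refcnt > 0);
		old->refcnt++;
		return old;
	}
	new->refcnt = 1;
	return new;
}
```

The point of the sketch is the ordering of the checks: the error-pointer test has to come before the NULL test, otherwise an error pointer is treated as an existing entry and dereferenced.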
Fixes: d902e1a737 ("sfc: bare bones TC offload on EF100")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20230919183949.59392-1-edward.cree@amd.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Extend the pedit add actions to handle this case for ipv6. Similar to ipv4
dec ttl, decrementing ipv6 hop limit can be achieved by adding 0xff to the
hop limit field.
Co-developed-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce pedit add actions and use them to achieve decrement ttl offload.
Decrement ttl can be achieved by adding 0xff to the ttl field.
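As a quick illustration of why adding 0xff decrements an 8-bit field: the sum wraps modulo 256, so ttl + 255 is congruent to ttl - 1. The helper below is a hypothetical userspace sketch of that arithmetic, not driver code.

```c
#include <assert.h>
#include <stdint.h>

/* Add 0xff to an 8-bit field. The result wraps modulo 256, which is
 * the same as subtracting one.
 */
static uint8_t demo_add_0xff(uint8_t ttl)
{
	return (uint8_t)(ttl + 0xff);
}

/* Small usage example. */
static void demo_add_0xff_example(void)
{
	assert(demo_add_0xff(64) == 63);
	assert(demo_add_0xff(1) == 0);
}
```

Note that in this model a ttl of 0 wraps to 255; how that case is treated in hardware is not covered by the commit message.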
Co-developed-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Offload pedit set ipv6 hop limit, where the hop limit has already been
matched and the new value is one less, by translating it to a decrement.
Co-developed-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Offload pedit set ipv4 ttl field, where the ttl field has already been
matched and the new value is one less, by translating it to a decrement.
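A minimal sketch of the condition described above, under stated assumptions: the function name is hypothetical, and requiring a full 0xff mask on the ttl match is an assumption made here so that the old value is known exactly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Can "pedit set ttl = new_val" be treated as a decrement?
 * Assumed conditions: the rule matched the ttl exactly (full mask) and
 * the new value is one less than the matched value.
 */
static bool demo_set_is_decrement(uint8_t match_val, uint8_t match_mask,
				  uint8_t new_val)
{
	if (match_mask != 0xff)
		return false;	/* old ttl not fully known */
	if (match_val == 0)
		return false;	/* nothing to decrement */
	return new_val == match_val - 1;
}

/* Small usage example. */
static void demo_set_is_decrement_example(void)
{
	assert(demo_set_is_decrement(64, 0xff, 63));
	assert(!demo_set_is_decrement(64, 0xff, 62));
}
```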
Co-developed-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce the first pedit set offload functionality for the sfc driver.
In addition to this, add offload functionality for both mac source and
destination pedit set actions.
Co-developed-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce the initial ethernet pedit set action infrastructure in
preparation for adding mac src and dst pedit action offloads.
Co-developed-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cross-merge networking fixes after downstream PR.
Conflicts:
drivers/net/ethernet/sfc/tc.c
  fa165e1949 ("sfc: don't unregister flow_indr if it was never registered")
  3bf969e88a ("sfc: add MAE table machinery for conntrack table")
  https://lore.kernel.org/all/20230818112159.7430e9b4@canb.auug.org.au/
No adjacent changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In efx_init_tc(), move the setting of efx->tc->up after the
flow_indr_dev_register() call, so that if it fails, efx_fini_tc()
won't call flow_indr_dev_unregister().
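The ordering can be modelled in userspace as below. All demo_* names are hypothetical, and the sketch assumes (as the message implies) that the teardown path only unregisters when the 'up' flag is set.

```c
#include <assert.h>
#include <stdbool.h>

struct demo_tc {
	bool up;
	int unregister_calls;	/* counts calls to the modelled unregister */
};

/* Stand-in for the registration call; 'fail' forces an error return. */
static int demo_register(bool fail)
{
	return fail ? -1 : 0;
}

static int demo_init_tc(struct demo_tc *tc, bool fail)
{
	int rc;

	assert(tc);
	rc = demo_register(fail);
	if (rc)
		return rc;	/* 'up' stays false, so fini will not unregister */
	tc->up = true;		/* set only after registration succeeded */
	return 0;
}

static void demo_fini_tc(struct demo_tc *tc)
{
	assert(tc);
	if (!tc->up)
		return;
	tc->unregister_calls++;	/* stand-in for the unregister call */
	tc->up = false;
}
```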
Fixes: 5b2e12d51b ("sfc: bind indirect blocks for TC offload on EF100")
Suggested-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Reviewed-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/r/a81284d7013aba74005277bd81104e4cfbea3f6f.1692114888.git.ecree.xilinx@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Handle the (comparatively) simple case of a -trk rule on an efx netdev
(i.e. not a tunnel decap rule) with ct and goto chain actions.
Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Parse ct_state trk/est, mark and zone out of flower keys, and plumb
them through to the hardware, performing some minor translations.
Nothing can actually hit them yet as we're not offloading any DO_CT
actions.
Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Map the chain_index to an 8-bit recirc_id for use by the hardware.
Currently nothing in the driver is offloading 'goto chain' actions,
so these rules cannot yet be hit.
Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bind a stub callback to the netfilter flow table.
Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Access to the connection tracking table in EF100 hardware is through
a "generic" table mechanism, whereby a firmware call at probe time
gives the driver a description of the field widths and offsets, so
that the driver can then construct key and response bitstrings at
runtime.
Probe the NIC for this information and populate the needed metadata
into a new meta_ct field of struct efx_tc_state.
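To illustrate what building a key bitstring from a field description involves, here is a small userspace sketch. The struct layout, the names, and the little-endian bit numbering (bit 0 is the least significant bit of byte 0) are assumptions made for illustration; the actual table descriptors come from firmware and are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical field descriptor: lowest bit number and width in bits. */
struct demo_field {
	unsigned int lbn;
	unsigned int width;
};

/* Write the low 'width' bits of 'val' into 'buf' at bit offset 'lbn'.
 * Bits that would fall outside the buffer are silently dropped.
 */
static void demo_set_field(uint8_t *buf, size_t len,
			   const struct demo_field *f, uint64_t val)
{
	assert(buf && f);
	for (unsigned int i = 0; i < f->width && i < 64; i++) {
		unsigned int bit = f->lbn + i;

		if (bit / 8 >= len)
			break;
		if ((val >> i) & 1)
			buf[bit / 8] |= (uint8_t)(1u << (bit % 8));
		else
			buf[bit / 8] &= (uint8_t)~(1u << (bit % 8));
	}
}
```

With descriptors obtained at probe time, the same routine can serve every field of the key, which is the benefit of a generic table mechanism over hard-coded layouts.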
Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As the 32 bits of dissector->used_keys are exhausted,
increase the size to 64 bits.
This is a base change for the ESP/AH flow dissector patch.
Please find patch and discussions at
https://lore.kernel.org/netdev/ZMDNjD46BvZ5zp5I@corigine.com/T/#t
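A small sketch of why the width matters once key numbers reach 32. The key IDs below are hypothetical placeholders, not the real FLOW_DISSECTOR_KEY_* values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical key IDs for illustration only. */
enum { DEMO_KEY_LOW = 3, DEMO_KEY_HIGH = 32 };

/* 1ULL keeps the shift in 64-bit arithmetic; shifting a 32-bit 1 by 32
 * would be undefined behaviour.
 */
static uint64_t demo_key_bit(unsigned int key_id)
{
	return 1ULL << key_id;
}

static int demo_uses_key(uint64_t used_keys, unsigned int key_id)
{
	return (used_keys & demo_key_bit(key_id)) != 0;
}

/* Small usage example. */
static void demo_used_keys_example(void)
{
	uint64_t used = demo_key_bit(DEMO_KEY_LOW) | demo_key_bit(DEMO_KEY_HIGH);

	assert(demo_uses_key(used, DEMO_KEY_HIGH));
	/* Truncating to 32 bits loses key 32 but keeps key 3. */
	assert(!demo_uses_key((uint32_t)used, DEMO_KEY_HIGH));
}
```

Any driver that builds masks against used_keys needs the same 64-bit shift, which is why a change like this touches many drivers at once.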
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Reviewed-by: Petr Machata <petrm@nvidia.com> # for mlxsw
Tested-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Martin Habets <habetsm.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When processing counter updates, if any action set using the newly
incremented counter includes an encap action, prod the corresponding
neighbour entry to indicate to the neighbour cache that the entry
is still in use and passing traffic.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20230621121504.17004-1-edward.cree@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
For each neighbour we're interested in, create a struct efx_neigh_binder
object which has a list of all the encap_actions using it. When we
receive a neighbour update (through the netevent notifier), find the
corresponding efx_neigh_binder and update all its users.
Since the actual generation of encap headers is still only a stub, the
resulting rules still get left on fallback actions.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Create software objects to manage the metadata for encap actions that
can be attached to TC rules. However, since we don't yet have the
neighbour information (needed to generate the Ethernet header),
all rules with encap actions are marked as "unready" and thus insert
the fallback action into hardware rather than actually offloading the
encapsulation action.
Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When offloading a TC encap action, the action information for the
hardware might not be "ready": if there's currently no neighbour entry
available for the destination address, we can't construct the Ethernet
header to prepend to the packet. In this case, we still offload the
flow rule, but with its action-set-list ID pointing at a "fallback"
action which simply delivers the packet to its default destination (as
though no flow rule had matched), thus allowing software TC to handle
it. Later, when we receive a neighbour update that allows us to
construct the encap header, the rule will become "ready" and we will
update its action-set-list ID in hardware to point at the actual
offloaded actions.
This patch sets up these fallback ASLs, but does not yet use them.
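The ready/fallback switch described above can be sketched as follows. This is a userspace model with hypothetical names; hw_asl_id stands for whatever ID is currently programmed into the hardware rule.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct demo_rule {
	bool encap_ready;	/* encap header can be constructed */
	uint32_t acts_id;	/* ASL holding the real offloaded actions */
	uint32_t fallback_id;	/* ASL that delivers to the default dest */
	uint32_t hw_asl_id;	/* ID currently programmed in "hardware" */
};

/* Pick the action-set-list ID the rule should point at right now. */
static uint32_t demo_pick_asl(const struct demo_rule *r)
{
	assert(r);
	return r->encap_ready ? r->acts_id : r->fallback_id;
}

/* Insert the rule; an unready rule is still offloaded, with the fallback. */
static void demo_rule_insert(struct demo_rule *r)
{
	r->hw_asl_id = demo_pick_asl(r);
}

/* A neighbour update changes readiness and repoints the rule. */
static void demo_neigh_update(struct demo_rule *r, bool have_neigh)
{
	r->encap_ready = have_neigh;
	r->hw_asl_id = demo_pick_asl(r);
}
```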
Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Failure ladders weren't exactly unwinding what the function had done up
to that point; most seriously, when we encountered an already offloaded
rule, the failure path tried to remove the new rule from the hashtable,
which would in fact remove the already-present 'old' rule (since it has
the same key) from the table, and leak its resources.
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <error27@gmail.com>
Closes: https://lore.kernel.org/r/202305200745.xmIlkqjH-lkp@intel.com/
Fixes: d902e1a737 ("sfc: bare bones TC offload on EF100")
Fixes: 17654d84b4 ("sfc: add offloading of 'foreign' TC (decap) rules")
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230530202527.53115-1-edward.cree@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When writing error messages to extack for pseudo collisions, we can't
use encap->type as encap has already been freed. Fortunately the
same value is stored in local variable em_type, so use that instead.
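The pattern is the usual one of copying what the error path needs before the object is freed. A minimal userspace sketch, with hypothetical names and message text:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

struct demo_encap {
	int type;
};

/* Free 'encap' and then report a collision. The message uses a local
 * copy of the type, taken while 'encap' was still valid.
 */
static int demo_report_collision(struct demo_encap *encap, char *msg,
				 size_t len)
{
	int em_type;

	assert(encap && msg);
	em_type = encap->type;
	free(encap);
	/* 'encap' must not be dereferenced from here on. */
	snprintf(msg, len, "pseudo collision (type %d)", em_type);
	return em_type;
}
```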
Fixes: 3c9561c0a5 ("sfc: support TC decap rules matching on enc_ip_tos")
Reported-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow efx_tc_encap_match entries to include a udp_sport and a
udp_sport_mask. As with enc_ip_tos, use pseudos to enforce that all
encap matches within a given <src_ip,dst_ip,udp_dport> tuple have
the same udp_sport_mask.
Note that since we use a single layer of pseudos for both fields, two
matches that differ in (say) udp_sport value aren't permitted to have
different ip_tos_mask, even though this would technically be safe.
Current userland TC does not support setting enc_src_port; this patch
was tested with an iproute2 patched to support it.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow efx_tc_encap_match entries to include an ip_tos and ip_tos_mask.
To avoid partially-overlapping Outer Rules (which can lead to undefined
behaviour in the hardware), store extra "pseudo" entries in our
encap_match hashtable, which are used to enforce that all Outer Rule
entries within a given <src_ip,dst_ip,udp_dport> tuple (or IPv6
equivalent) have the same ip_tos_mask.
The "direct" encap_match entry takes a reference on the "pseudo",
allowing it to be destroyed when all "direct" entries using it are
removed.
efx_tc_em_pseudo_type is an enum rather than just a bool because in
future an additional pseudo-type will be added to support Conntrack
offload.
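A simplified model of the pseudo-entry scheme: one pseudo per tuple records the ip_tos_mask in use, direct entries take references on it, and a direct entry with a different mask is refused. Names and layout are hypothetical; the real entries live in a hashtable keyed on the tuple.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pseudo entry for one <src_ip,dst_ip,udp_dport> tuple. */
struct demo_pseudo {
	bool in_use;
	uint8_t ip_tos_mask;
	unsigned int refcnt;
};

/* Attach a direct match with the given mask to the tuple's pseudo.
 * Returns false if it would create a partially-overlapping rule.
 */
static bool demo_pseudo_get(struct demo_pseudo *p, uint8_t ip_tos_mask)
{
	assert(p);
	if (!p->in_use) {
		p->in_use = true;
		p->ip_tos_mask = ip_tos_mask;
		p->refcnt = 1;
		return true;
	}
	if (p->ip_tos_mask != ip_tos_mask)
		return false;
	p->refcnt++;
	return true;
}

/* Drop a direct entry's reference; the pseudo goes away with the last one. */
static void demo_pseudo_put(struct demo_pseudo *p)
{
	assert(p);
	if (p->in_use && --p->refcnt == 0)
		p->in_use = false;
}
```

Once the last direct entry is released, the tuple is free to be used with a different mask.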
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently tc.c will block them before they get here, but a following
patch will change that.
Use the extack message from efx_mae_check_encap_match_caps() instead
of writing a new one, since there's now more being fed in than just
an IP version.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When force-freeing leftover entries from our match_action_ht, call
efx_tc_delete_rule(), which releases all the rule's resources, rather
than open-coding it. The open-coded version was missing a call to
release the rule's encap match (if any).
It probably doesn't matter as everything's being torn down anyway, but
it's cleaner this way and prevents further error messages potentially
being logged by efx_tc_encap_match_free() later on.
Move efx_tc_flow_free() further down the file to avoid introducing a
forward declaration of efx_tc_delete_rule().
Fixes: 17654d84b4 ("sfc: add offloading of 'foreign' TC (decap) rules")
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A 'foreign' rule is one for which the net_dev is not the sfc netdevice
or any of its representors. The driver registers indirect flow blocks
for tunnel netdevs so that it can offload decap rules. For example:
    tc filter add dev vxlan0 parent ffff: protocol ipv4 flower \
        enc_src_ip 10.1.0.2 enc_dst_ip 10.1.0.1 \
        enc_key_id 1000 enc_dst_port 4789 \
        action tunnel_key unset \
        action mirred egress redirect dev $REPRESENTOR
When notified of a rule like this, register an encap match on the IP
and dport tuple (creating an Outer Rule table entry) and insert an MAE
action rule to perform the decapsulation and deliver to the representee.
Move efx_tc_delete_rule() below efx_tc_flower_release_encap_match() to
avoid the need for a forward declaration.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a hashtable to detect duplicate and conflicting matches. If a match
is not a duplicate, call MAE functions to add it to or remove it from the
OR table.
The calling code is not added yet, so mark the new functions as unused.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Translate the fields from flow dissector into struct efx_tc_match.
In efx_tc_flower_replace(), reject filters that match on them, because
only 'foreign' filters (i.e. those for which the ingress dev is not
the sfc netdev or any of its representors, e.g. a tunnel netdev) can
use them.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Includes an explanation of the lifetime of the 'cursor' action-set `act`.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
EF100 can pop and/or push up to two VLAN tags.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230309115904.56442-1-edward.cree@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
On FLOW_CLS_STATS, look up the MAE counter by TC cookie, and report the
change in packet and byte count since the last time FLOW_CLS_STATS read
them.
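Reporting a delta rather than an absolute value means the counter has to remember what it last reported. A minimal sketch, with hypothetical field names:

```c
#include <assert.h>
#include <stdint.h>

struct demo_counter {
	uint64_t packets, bytes;		/* running totals */
	uint64_t old_packets, old_bytes;	/* totals at the last stats read */
};

/* Report the change since the previous read, then remember the totals. */
static void demo_stats_delta(struct demo_counter *c, uint64_t *pkts,
			     uint64_t *bytes)
{
	assert(c && pkts && bytes);
	*pkts = c->packets - c->old_packets;
	*bytes = c->bytes - c->old_bytes;
	c->old_packets = c->packets;
	c->old_bytes = c->bytes;
}
```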
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently the only actions supported are COUNT and DELIVER, which can only
happen in the right order; but when more actions are added, it will be
necessary to check that they are only used in the same order in which the
hardware performs them (since the hardware API takes an action *set* in
which the order is implicit). For instance, a VLAN pop must not follow a
VLAN push. Most practical use-cases should be unaffected by these
restrictions.
Add a function efx_tc_flower_action_order_ok() that checks whether it is
appropriate to add a specified action to the existing action-set.
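The idea of such an order check can be sketched as below. This is not the driver's function: the enum, its ordering and the "each action at most once" rule are simplifying assumptions. Only "pop before push" and "count before deliver" are taken from the text above; the relative position of the VLAN and COUNT actions is assumed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical hardware order: earlier enumerators are performed first. */
enum demo_action_type {
	DEMO_ACT_VLAN_POP,
	DEMO_ACT_VLAN_PUSH,
	DEMO_ACT_COUNT,
	DEMO_ACT_DELIVER,
};

struct demo_action_set {
	int last;	/* highest-order action added so far, -1 if none */
};

/* May 'new_type' be added to 'act' without violating hardware order? */
static bool demo_action_order_ok(const struct demo_action_set *act,
				 enum demo_action_type new_type)
{
	assert(act);
	return (int)new_type > act->last;
}

/* Add the action if the order allows it; report whether it was added. */
static bool demo_action_add(struct demo_action_set *act,
			    enum demo_action_type new_type)
{
	if (!demo_action_order_ok(act, new_type))
		return false;
	act->last = (int)new_type;
	return true;
}
```

Because an action set has no explicit ordering of its own, rejecting out-of-order additions is what keeps the offloaded behaviour identical to the software rule.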
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The only actions that expect stats (that sfc HW supports) are gact shot
(drop), mirred redirect and mirred mirror. Since these are 'deliverish'
actions that end an action-set, we only require at most one counter per
action-set.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently there is no counter-allocating machinery to connect the
resulting counter update values to; that will be added in a
subsequent patch.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Start and stop MAE counter streaming, and grant credits.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Support matching on UDP/TCP source and destination ports and TCP flags,
with masking if supported by the hardware.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Support matching on IP protocol, Type of Service, Time To Live, source
and destination addresses, with masking if supported by the hardware.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Support matching on EtherType, VLANs and ethernet source/destination
addresses, with masking if supported by the hardware.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Since we can now get a formatted message back to the user with
NL_SET_ERR_MSG_FMT_MOD(), there's no need for our special logging.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>