In efx_init_tc(), move the setting of efx->tc->up after the
flow_indr_dev_register() call, so that if it fails, efx_fini_tc()
won't call flow_indr_dev_unregister().
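
The ordering can be illustrated with a minimal user-space sketch. The
struct, the helper names and the simulated return code below are
hypothetical stand-ins, not the driver's real code; only the idea (set
the "up" flag after registration succeeds, and gate teardown on it) is
taken from the description above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver's TC state. */
struct tc_state {
	bool up;		/* true only once registration has succeeded */
	int unregister_calls;	/* counts teardown calls, for illustration */
};

/* Stand-in for flow_indr_dev_register(): returns 0 or a negative errno. */
static int fake_register(int simulated_rc)
{
	return simulated_rc;
}

/* Init: mark the state "up" only after registration has succeeded. */
static int tc_init(struct tc_state *tc, int simulated_rc)
{
	int rc = fake_register(simulated_rc);

	if (rc)
		return rc;	/* tc->up stays false */
	tc->up = true;
	return 0;
}

/* Teardown: unregister only if init got as far as registering. */
static void tc_fini(struct tc_state *tc)
{
	if (!tc->up)
		return;
	tc->unregister_calls++;	/* stand-in for flow_indr_dev_unregister() */
	tc->up = false;
}
```

With this ordering, a failed tc_init() followed by tc_fini() never
reaches the unregister step.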
Fixes: 5b2e12d51b ("sfc: bind indirect blocks for TC offload on EF100")
Suggested-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Reviewed-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/r/a81284d7013aba74005277bd81104e4cfbea3f6f.1692114888.git.ecree.xilinx@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When processing counter updates, if any action set using the newly
incremented counter includes an encap action, prod the corresponding
neighbour entry to indicate to the neighbour cache that the entry
is still in use and passing traffic.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20230621121504.17004-1-edward.cree@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
For each neighbour we're interested in, create a struct efx_neigh_binder
object which has a list of all the encap_actions using it. When we
receive a neighbour update (through the netevent notifier), find the
corresponding efx_neigh_binder and update all its users.
Since the actual generation of encap headers is still only a stub, the
resulting rules still get left on fallback actions.
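
A rough user-space sketch of the binder idea follows. The struct
layouts and function names are hypothetical and much simpler than the
real efx_neigh_binder; the sketch only shows one object per neighbour
holding a list of users, with an update fanned out to all of them.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical user of a neighbour entry. */
struct encap_action {
	bool ready;			/* has a usable destination MAC */
	unsigned char dst_mac[MAC_LEN];
	struct encap_action *next;	/* next user of the same neighbour */
};

/* Hypothetical per-neighbour object. */
struct neigh_binder {
	unsigned int dst_ip;		/* key: neighbour IPv4 address */
	struct encap_action *users;	/* singly linked list of users */
};

/* Attach an encap action to the binder for its neighbour. */
static void binder_add_user(struct neigh_binder *nb, struct encap_action *ea)
{
	ea->next = nb->users;
	nb->users = ea;
}

/*
 * Neighbour update: push the new MAC (or an invalidation, if mac is
 * NULL) to every user. Returns the number of users updated.
 */
static int binder_update(struct neigh_binder *nb, const unsigned char *mac)
{
	struct encap_action *ea;
	int n = 0;

	for (ea = nb->users; ea; ea = ea->next) {
		ea->ready = mac != NULL;
		if (mac)
			memcpy(ea->dst_mac, mac, MAC_LEN);
		n++;
	}
	return n;
}
```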
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Create software objects to manage the metadata for encap actions that
can be attached to TC rules. However, since we don't yet have the
neighbour information (needed to generate the Ethernet header),
all rules with encap actions are marked as "unready" and thus insert
the fallback action into hardware rather than actually offloading the
encapsulation action.
Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When offloading a TC encap action, the action information for the
hardware might not be "ready": if there's currently no neighbour entry
available for the destination address, we can't construct the Ethernet
header to prepend to the packet. In this case, we still offload the
flow rule, but with its action-set-list ID pointing at a "fallback"
action which simply delivers the packet to its default destination (as
though no flow rule had matched), thus allowing software TC to handle
it. Later, when we receive a neighbour update that allows us to
construct the encap header, the rule will become "ready" and we will
update its action-set-list ID in hardware to point at the actual
offloaded actions.
This patch sets up these fallback ASLs, but does not yet use them.
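
The ready/unready switch described above might look roughly like the
following sketch. All names and fields here are hypothetical, chosen
for illustration; the real driver tracks this state differently and
talks to hardware via MCDI rather than assigning a struct field.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical view of an offloaded flow rule. */
struct flow_rule_sketch {
	uint32_t real_asl_id;		/* ASL holding the offloaded actions */
	uint32_t fallback_asl_id;	/* ASL delivering to the default destination */
	int unready_encaps;		/* encap actions still lacking a neighbour entry */
	uint32_t hw_asl_id;		/* ASL the hardware rule currently points at */
};

/* A rule is "ready" only when none of its encap actions is unready. */
static uint32_t rule_pick_asl(const struct flow_rule_sketch *r)
{
	return r->unready_encaps ? r->fallback_asl_id : r->real_asl_id;
}

/*
 * Repoint the hardware rule if its readiness changed.
 * Returns true if the hardware rule had to be rewritten.
 */
static bool rule_sync_hw(struct flow_rule_sketch *r)
{
	uint32_t want = rule_pick_asl(r);

	if (r->hw_asl_id == want)
		return false;
	r->hw_asl_id = want;	/* stand-in for updating the rule in hardware */
	return true;
}
```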
Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Failure ladders weren't exactly unwinding what the function had done up
to that point; most seriously, when we encountered an already offloaded
rule, the failure path tried to remove the new rule from the hashtable,
which would in fact remove the already-present 'old' rule (since it has
the same key) from the table, and leak its resources.
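
The shape of a correct failure ladder can be sketched in user space as
below. The table, the rule struct and the simulated hardware failure
are hypothetical simplifications (the driver uses an rhashtable); the
point is that the "key already present" path must skip the label that
removes the entry from the table.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define TABLE_SIZE 8

/* Hypothetical rule keyed by cookie. */
struct rule {
	unsigned long cookie;
};

static struct rule *table[TABLE_SIZE];

static struct rule *table_find(unsigned long cookie)
{
	int i;

	for (i = 0; i < TABLE_SIZE; i++)
		if (table[i] && table[i]->cookie == cookie)
			return table[i];
	return NULL;
}

static int table_insert(struct rule *r)
{
	int i;

	for (i = 0; i < TABLE_SIZE; i++) {
		if (!table[i]) {
			table[i] = r;
			return 0;
		}
	}
	return -ENOSPC;
}

static void table_remove(struct rule *r)
{
	int i;

	for (i = 0; i < TABLE_SIZE; i++)
		if (table[i] == r)
			table[i] = NULL;
}

/* Each label undoes exactly the steps completed before the failure. */
static int rule_add(unsigned long cookie, int simulate_hw_failure)
{
	struct rule *r = calloc(1, sizeof(*r));
	int rc;

	if (!r)
		return -ENOMEM;
	r->cookie = cookie;

	if (table_find(cookie)) {
		rc = -EEXIST;
		goto err_free;	/* never inserted: must not remove by key */
	}
	rc = table_insert(r);
	if (rc)
		goto err_free;
	if (simulate_hw_failure) {
		rc = -EIO;
		goto err_remove;
	}
	return 0;

err_remove:
	table_remove(r);
err_free:
	free(r);
	return rc;
}
```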
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <error27@gmail.com>
Closes: https://lore.kernel.org/r/202305200745.xmIlkqjH-lkp@intel.com/
Fixes: d902e1a737 ("sfc: bare bones TC offload on EF100")
Fixes: 17654d84b4 ("sfc: add offloading of 'foreign' TC (decap) rules")
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230530202527.53115-1-edward.cree@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When writing error messages to extack for pseudo collisions, we can't
use encap->type as encap has already been freed. Fortunately the
same value is stored in local variable em_type, so use that instead.
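
As a generic illustration of the pattern (not the driver's code; the
struct, function and message text are made up), the value needed for
the message is held in a local before the object is freed:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for an encap match object. */
struct encap_match_sketch {
	int type;
};

/*
 * Formats a message after the object has been freed, using only the
 * local copy of the type. Returns the type, or -1 on allocation failure.
 */
static int report_collision(char *buf, size_t len, int type_in)
{
	struct encap_match_sketch *encap = malloc(sizeof(*encap));
	int em_type;

	if (!encap)
		return -1;
	encap->type = type_in;
	em_type = encap->type;	/* copy taken while encap is still valid */
	free(encap);
	/* Reading encap->type here would be a use-after-free. */
	snprintf(buf, len, "pseudo collision, type %d", em_type);
	return em_type;
}
```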
Fixes: 3c9561c0a5 ("sfc: support TC decap rules matching on enc_ip_tos")
Reported-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow efx_tc_encap_match entries to include a udp_sport and a
udp_sport_mask. As with enc_ip_tos, use pseudos to enforce that all
encap matches within a given <src_ip,dst_ip,udp_dport> tuple have
the same udp_sport_mask.
Note that since we use a single layer of pseudos for both fields, two
matches that differ in (say) udp_sport value aren't permitted to have
different ip_tos_mask, even though this would technically be safe.
Current userland TC does not support setting enc_src_port; this patch
was tested with an iproute2 patched to support it.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow efx_tc_encap_match entries to include an ip_tos and ip_tos_mask.
To avoid partially-overlapping Outer Rules (which can lead to undefined
behaviour in the hardware), store extra "pseudo" entries in our
encap_match hashtable, which are used to enforce that all Outer Rule
entries within a given <src_ip,dst_ip,udp_dport> tuple (or IPv6
equivalent) have the same ip_tos_mask.
The "direct" encap_match entry takes a reference on the "pseudo",
allowing it to be destroyed when all "direct" entries using it are
removed.
efx_tc_em_pseudo_type is an enum rather than just a bool because in
future an additional pseudo-type will be added to support Conntrack
offload.
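
A simplified sketch of the pseudo mechanism, for IPv4 only. The names,
the linked list (instead of a hashtable) and the -EEXIST return code
are assumptions made for illustration: one refcounted pseudo per tuple
records the ip_tos_mask that every direct entry under it must share.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical pseudo entry, one per <src_ip,dst_ip,udp_dport> tuple. */
struct pseudo {
	uint32_t src_ip, dst_ip;
	uint16_t udp_dport;
	uint8_t tos_mask;	/* mask all direct entries must share */
	unsigned int refcnt;	/* direct entries holding a reference */
	struct pseudo *next;
};

static struct pseudo *pseudos;

static struct pseudo *pseudo_find(uint32_t src, uint32_t dst, uint16_t dport)
{
	struct pseudo *p;

	for (p = pseudos; p; p = p->next)
		if (p->src_ip == src && p->dst_ip == dst &&
		    p->udp_dport == dport)
			return p;
	return NULL;
}

/* Take a reference for a new direct entry; reject a differing mask. */
static int pseudo_get(uint32_t src, uint32_t dst, uint16_t dport,
		      uint8_t tos_mask, struct pseudo **out)
{
	struct pseudo *p = pseudo_find(src, dst, dport);

	if (p) {
		if (p->tos_mask != tos_mask)
			return -EEXIST;	/* would partially overlap */
		p->refcnt++;
		*out = p;
		return 0;
	}
	p = calloc(1, sizeof(*p));
	if (!p)
		return -ENOMEM;
	p->src_ip = src;
	p->dst_ip = dst;
	p->udp_dport = dport;
	p->tos_mask = tos_mask;
	p->refcnt = 1;
	p->next = pseudos;
	pseudos = p;
	*out = p;
	return 0;
}

/* Drop a reference; the pseudo goes away with its last direct entry. */
static void pseudo_put(struct pseudo *p)
{
	struct pseudo **pp;

	if (--p->refcnt)
		return;
	for (pp = &pseudos; *pp; pp = &(*pp)->next) {
		if (*pp == p) {
			*pp = p->next;
			break;
		}
	}
	free(p);
}
```

Once the last direct entry drops its reference, a different mask can be
used for the same tuple.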
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently tc.c will block them before they get here, but a following
patch will change that.
Use the extack message from efx_mae_check_encap_match_caps() instead
of writing a new one, since there's now more being fed in than just
an IP version.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When force-freeing leftover entries from our match_action_ht, call
efx_tc_delete_rule(), which releases all the rule's resources, rather
than open-coding it. The open-coded version was missing a call to
release the rule's encap match (if any).
It probably doesn't matter as everything's being torn down anyway, but
it's cleaner this way and prevents further error messages potentially
being logged by efx_tc_encap_match_free() later on.
Move efx_tc_flow_free() further down the file to avoid introducing a
forward declaration of efx_tc_delete_rule().
Fixes: 17654d84b4 ("sfc: add offloading of 'foreign' TC (decap) rules")
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A 'foreign' rule is one for which the net_dev is not the sfc netdevice
or any of its representors. The driver registers indirect flow blocks
for tunnel netdevs so that it can offload decap rules. For example:
    tc filter add dev vxlan0 parent ffff: protocol ipv4 flower \
        enc_src_ip 10.1.0.2 enc_dst_ip 10.1.0.1 \
        enc_key_id 1000 enc_dst_port 4789 \
        action tunnel_key unset \
        action mirred egress redirect dev $REPRESENTOR
When notified of a rule like this, register an encap match on the IP
and dport tuple (creating an Outer Rule table entry) and insert an MAE
action rule to perform the decapsulation and deliver to the representee.
Moved efx_tc_delete_rule() below efx_tc_flower_release_encap_match() to
avoid the need for a forward declaration.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a hashtable to detect duplicate and conflicting matches. If a match
is not a duplicate, call MAE functions to add/remove it from the Outer
Rule (OR) table.
The calling code is not added yet, so mark the new functions as unused.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Translate the fields from flow dissector into struct efx_tc_match.
In efx_tc_flower_replace(), reject filters that match on them, because
only 'foreign' filters (i.e. those for which the ingress dev is not
the sfc netdev or any of its representors, e.g. a tunnel netdev) can
use them.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Includes an explanation of the lifetime of the 'cursor' action-set `act`.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
EF100 can pop and/or push up to two VLAN tags.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230309115904.56442-1-edward.cree@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
On FLOW_CLS_STATS, look up the MAE counter by TC cookie, and report the
change in packet and byte count since the last time FLOW_CLS_STATS read
them.
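
The "change since last read" bookkeeping could be sketched as follows.
The struct and function names are hypothetical; the sketch only shows
keeping the totals seen at the previous read and reporting the
difference.

```c
#include <stdint.h>

/* Hypothetical counter with running totals and a snapshot. */
struct counter_sketch {
	uint64_t packets, bytes;		/* running totals */
	uint64_t old_packets, old_bytes;	/* totals at the last stats read */
};

struct stats_delta {
	uint64_t packets, bytes;
};

/* Report the increase since the previous call, then move the snapshot. */
static struct stats_delta counter_read_delta(struct counter_sketch *c)
{
	struct stats_delta d = {
		c->packets - c->old_packets,
		c->bytes - c->old_bytes,
	};

	c->old_packets = c->packets;
	c->old_bytes = c->bytes;
	return d;
}
```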
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently the only actions supported are COUNT and DELIVER, which can only
happen in the right order; but when more actions are added, it will be
necessary to check that they are only used in the same order in which the
hardware performs them (since the hardware API takes an action *set* in
which the order is implicit). For instance, a VLAN pop must not follow a
VLAN push. Most practical use-cases should be unaffected by these
restrictions.
Add a function efx_tc_flower_action_order_ok() that checks whether it is
appropriate to add a specified action to the existing action-set.
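
One simple way to express such a check, as a sketch only: give each
action a rank in an assumed hardware order and require ranks to
increase. The enum below and its ordering are illustrative (chosen to
be consistent with the VLAN pop/push example above), not the driver's
actual list, and the sketch permits each action at most once.

```c
#include <stdbool.h>

/* Hypothetical hardware execution order; lower values execute first. */
enum action_type {
	ACT_VLAN_POP,
	ACT_VLAN_PUSH,
	ACT_COUNT,
	ACT_DELIVER,
};

struct action_set_sketch {
	int last;	/* rank of the latest action added, -1 if empty */
};

/* An action may be added only if it comes later in hardware order. */
static bool action_order_ok(const struct action_set_sketch *as,
			    enum action_type next)
{
	return (int)next > as->last;
}

/* Add an action if the order allows it; returns false otherwise. */
static bool action_add(struct action_set_sketch *as, enum action_type next)
{
	if (!action_order_ok(as, next))
		return false;
	as->last = (int)next;
	return true;
}
```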
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The only actions that expect stats (that sfc HW supports) are gact shot
(drop), mirred redirect and mirred mirror. Since these are 'deliverish'
actions that end an action-set, we only require at most one counter per
action-set.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently there is no counter-allocating machinery to connect the
resulting counter update values to; that will be added in a
subsequent patch.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Start and stop MAE counter streaming, and grant credits.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Support matching on UDP/TCP source and destination ports and TCP flags,
with masking if supported by the hardware.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Support matching on IP protocol, Type of Service, Time To Live, source
and destination addresses, with masking if supported by the hardware.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Support matching on EtherType, VLANs and ethernet source/destination
addresses, with masking if supported by the hardware.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Since we can now get a formatted message back to the user with
NL_SET_ERR_MSG_FMT_MOD(), there's no need for our special logging.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This is the absolute minimum viable TC implementation to get traffic to
VFs and allow them to be tested; it supports no match fields besides
ingress port, no actions besides mirred and drop, and no stats.
Example usage:
    tc filter add dev $PF parent ffff: flower skip_sw \
        action mirred egress mirror dev $VFREP
    tc filter add dev $VFREP parent ffff: flower skip_sw \
        action mirred egress redirect dev $PF
gives a VF unfiltered access to the network out the physical port ($PF
acts here as a physical port representor).
More matches, actions, and counters will be added in subsequent patches.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Different versions of EF100 firmware and FPGA bitstreams support different
matching capabilities in the Match-Action Engine. Probe for these at
start of day; subsequent patches will validate TC offload requests
against the reported capabilities.
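
As a sketch of what validating a request against a probed capability
might involve (the three capability levels and all names here are
assumptions for illustration, not the firmware's actual capability
encoding):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-field capability levels. */
enum field_cap {
	CAP_UNSUPPORTED,	/* field cannot be matched on at all */
	CAP_EXACT_ONLY,		/* only an all-ones mask is accepted */
	CAP_MASKABLE,		/* any mask is accepted */
};

/* Check a requested 32-bit match mask against the field's capability. */
static bool match_mask_ok(enum field_cap cap, uint32_t mask)
{
	if (mask == 0)
		return true;	/* field is not being matched on */
	switch (cap) {
	case CAP_MASKABLE:
		return true;
	case CAP_EXACT_ONLY:
		return mask == UINT32_MAX;
	default:
		return false;
	}
}
```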
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nothing inserts into this table yet, but we have code to remove rules
on FLOW_CLS_DESTROY or at driver teardown time, in both cases also
attempting to remove the corresponding hardware rules.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bind indirect blocks for recognised tunnel netdevices.
Currently these connect to a stub efx_tc_flower() that only returns
-EOPNOTSUPP; subsequent patches will implement flower offloads to the
Match-Action Engine.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bind direct blocks for the MAE-admin PF and each VF representor.
Currently these connect to a stub efx_tc_flower() that only returns
-EOPNOTSUPP; subsequent patches will implement flower offloads to the
Match-Action Engine.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Representors do not want to be subject to the PF's Ethernet address
filters, since traffic from VFs will typically have a destination
either elsewhere on the link segment or on an overlay network.
So, create a dynamic m-port with promiscuous and all-multicast
filters, and set it as the egress port of representor default rules.
Since the m-port is an alias of the calling PF's own m-port, traffic
will still be delivered to the PF's RXQs, but it will be subject to
the VNRX filter rules installed on the dynamic m-port (specified by
the v-port ID field of the filter spec).
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Default rules are low-priority switching rules which the hardware uses
in the absence of higher-priority rules. Each representor requires a
corresponding rule matching traffic from its representee VF and
delivering to the PF (where a check on INGRESS_MPORT in
__ef100_rx_packet() will direct it to the representor). No rule is
required in the reverse direction, because representor TX uses a TX
override descriptor to bypass the MAE and deliver directly to the VF.
Since inserting any rule into the MAE disables the firmware's own
default rules, also insert a pair of rules to connect the PF to the
physical network port and vice-versa.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>