linux

mirror of synced 2025-03-06 20:59:54 +01:00

Author	SHA1	Message	Date
Amit Cohen	b34f4de6d3	selftests: mlxsw: qos_pfc: Adjust the test to support 8 lanes 'qos_pfc' test checks PFC behavior. The idea is to limit the traffic using a shaper somewhere in the flow of the packets. In this area, the buffer is smaller than the buffer at the beginning of the flow, so it fills up until there is no more space left. The test configures there PFC which is supposed to notice that the headroom is filling up and send PFC Xoff to indicate the transmitter to stop sending traffic for the priorities sharing this PG. The Xon/Xoff threshold is auto-configured and always equal to 2*(MTU rounded up to cell size). Even after sending the PFC Xoff packet, traffic will keep arriving until the transmitter receives and processes the PFC packet. This amount of traffic is known as the PFC delay allowance. Currently the buffer for the delay traffic is configured as 100KB. The MTU in the test is 10KB, therefore the threshold for Xoff is about 20KB. This allows 80KB extra to be stored in this buffer. 8-lane ports use two buffers among which the configured buffer is split, the Xoff threshold then applies to each buffer in parallel. The test does not take into account the behavior of 8-lane ports, when the ports are configured to 400Gbps with 8 lanes or 800Gbps with 8 lanes, packets are dropped and the test fails. Check if the relevant ports use 8 lanes, in such case double the size of the buffer, as the headroom is split half-half. Cc: Shuah Khan <shuah@kernel.org> Fixes: `bfa804784e` ("selftests: mlxsw: Add a PFC test") Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/23ff11b7dff031eb04a41c0f5254a2b636cd8ebb.1705502064.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-01-18 09:48:09 -08:00
Amit Cohen	40cc674baf	selftests: mlxsw: qos_pfc: Remove wrong description In the diagram of the topology, $swp3 and $swp4 are described as 1Gbps ports. This is wrong information, the test does not configure such speed. Cc: Shuah Khan <shuah@kernel.org> Fixes: `bfa804784e` ("selftests: mlxsw: Add a PFC test") Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/0087e2d416aff7e444d15f7c2958fc1d438dc27e.1705502064.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-01-18 09:48:09 -08:00
Ido Schimmel	483ae90d8f	mlxsw: spectrum_acl_tcam: Fix stack corruption When tc filters are first added to a net device, the corresponding local port gets bound to an ACL group in the device. The group contains a list of ACLs. In turn, each ACL points to a different TCAM region where the filters are stored. During forwarding, the ACLs are sequentially evaluated until a match is found. One reason to place filters in different regions is when they are added with decreasing priorities and in an alternating order so that two consecutive filters can never fit in the same region because of their key usage. In Spectrum-2 and newer ASICs the firmware started to report that the maximum number of ACLs in a group is more than 16, but the layout of the register that configures ACL groups (PAGT) was not updated to account for that. It is therefore possible to hit stack corruption [1] in the rare case where more than 16 ACLs in a group are required. Fix by limiting the maximum ACL group size to the minimum between what the firmware reports and the maximum ACLs that fit in the PAGT register. Add a test case to make sure the machine does not crash when this condition is hit. [1] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: mlxsw_sp_acl_tcam_group_update+0x116/0x120 [...] dump_stack_lvl+0x36/0x50 panic+0x305/0x330 __stack_chk_fail+0x15/0x20 mlxsw_sp_acl_tcam_group_update+0x116/0x120 mlxsw_sp_acl_tcam_group_region_attach+0x69/0x110 mlxsw_sp_acl_tcam_vchunk_get+0x492/0xa20 mlxsw_sp_acl_tcam_ventry_add+0x25/0xe0 mlxsw_sp_acl_rule_add+0x47/0x240 mlxsw_sp_flower_replace+0x1a9/0x1d0 tc_setup_cb_add+0xdc/0x1c0 fl_hw_replace_filter+0x146/0x1f0 fl_change+0xc17/0x1360 tc_new_tfilter+0x472/0xb90 rtnetlink_rcv_msg+0x313/0x3b0 netlink_rcv_skb+0x58/0x100 netlink_unicast+0x244/0x390 netlink_sendmsg+0x1e4/0x440 ____sys_sendmsg+0x164/0x260 ___sys_sendmsg+0x9a/0xe0 __sys_sendmsg+0x7a/0xc0 do_syscall_64+0x40/0xe0 entry_SYSCALL_64_after_hwframe+0x63/0x6b Fixes: `c3ab435466` ("mlxsw: spectrum: Extend to support Spectrum-2 ASIC") Reported-by: Orel Hagag <orelh@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/2d91c89afba59c22587b444994ae419dbea8d876.1705502064.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-01-18 09:48:08 -08:00
Amit Cohen	6d6eeabcfa	mlxsw: spectrum_acl_erp: Fix error flow of pool allocation failure Lately, a bug was found when many TC filters are added - at some point, several bugs are printed to dmesg [1] and the switch is crashed with segmentation fault. The issue starts when gen_pool_free() fails because of unexpected behavior - a try to free memory which is already freed, this leads to BUG() call which crashes the switch and makes many other bugs. Trying to track down the unexpected behavior led to a bug in eRP code. The function mlxsw_sp_acl_erp_table_alloc() gets a pointer to the allocated index, sets the value and returns an error code. When gen_pool_alloc() fails it returns address 0, we track it and return -ENOBUFS outside, BUT the call for gen_pool_alloc() already override the index in erp_table structure. This is a problem when such allocation is done as part of table expansion. This is not a new table, which will not be used in case of allocation failure. We try to expand eRP table and override the current index (non-zero) with zero. Then, it leads to an unexpected behavior when address 0 is freed twice. Note that address 0 is valid in erp_table->base_index and indeed other tables use it. gen_pool_alloc() fails in case that there is no space left in the pre-allocated pool, in our case, the pool is limited to ACL_MAX_ERPT_BANK_SIZE, which is read from hardware. When more than max erp entries are required, we exceed the limit and return an error, this error leads to "Failed to migrate vregion" print. Fix this by changing erp_table->base_index only in case of a successful allocation. Add a test case for such a scenario. Without this fix it causes segmentation fault: $ TESTS="max_erp_entries_test" ./tc_flower.sh ./tc_flower.sh: line 988: 1560 Segmentation fault tc filter del dev $h2 ingress chain $i protocol ip pref $i handle $j flower &>/dev/null [1]: kernel BUG at lib/genalloc.c:508! invalid opcode: 0000 [#1] PREEMPT SMP CPU: 6 PID: 3531 Comm: tc Not tainted 6.7.0-rc5-custom-ga6893f479f5e #1 Hardware name: Mellanox Technologies Ltd. MSN4700/VMOD0010, BIOS 5.11 07/12/2021 RIP: 0010:gen_pool_free_owner+0xc9/0xe0 ... Call Trace: <TASK> __mlxsw_sp_acl_erp_table_other_dec+0x70/0xa0 [mlxsw_spectrum] mlxsw_sp_acl_erp_mask_destroy+0xf5/0x110 [mlxsw_spectrum] objagg_obj_root_destroy+0x18/0x80 [objagg] objagg_obj_destroy+0x12c/0x130 [objagg] mlxsw_sp_acl_erp_mask_put+0x37/0x50 [mlxsw_spectrum] mlxsw_sp_acl_ctcam_region_entry_remove+0x74/0xa0 [mlxsw_spectrum] mlxsw_sp_acl_ctcam_entry_del+0x1e/0x40 [mlxsw_spectrum] mlxsw_sp_acl_tcam_ventry_del+0x78/0xd0 [mlxsw_spectrum] mlxsw_sp_flower_destroy+0x4d/0x70 [mlxsw_spectrum] mlxsw_sp_flow_block_cb+0x73/0xb0 [mlxsw_spectrum] tc_setup_cb_destroy+0xc1/0x180 fl_hw_destroy_filter+0x94/0xc0 [cls_flower] __fl_delete+0x1ac/0x1c0 [cls_flower] fl_destroy+0xc2/0x150 [cls_flower] tcf_proto_destroy+0x1a/0xa0 ... mlxsw_spectrum3 0000:07:00.0: Failed to migrate vregion mlxsw_spectrum3 0000:07:00.0: Failed to migrate vregion Fixes: `f465261aa1` ("mlxsw: spectrum_acl: Implement common eRP core") Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/4cfca254dfc0e5d283974801a24371c7b6db5989.1705502064.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-01-18 09:48:08 -08:00
Benjamin Poirier	dd2d40acdb	selftests: bonding: Add more missing config options As a followup to commit `03fb8565c8` ("selftests: bonding: add missing build configs"), add more networking-specific config options which are needed for bonding tests. For testing, I used the minimal config generated by virtme-ng and I added the options in the config file. All bonding tests passed. Fixes: `bbb774d921` ("net: Add tests for bonding and team address list management") # for ipv6 Fixes: `6cbe791c0f` ("kselftest: bonding: add num_grat_arp test") # for tc options Fixes: `222c94ec0a` ("selftests: bonding: add tests for ether type changes") # for nlmon Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Link: https://lore.kernel.org/r/20240116154926.202164-1-bpoirier@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-01-18 11:59:26 +01:00
Jakub Kicinski	39369c9a6e	selftests: netdevsim: add a config file netdevsim tests aren't very well integrated with kselftest, which has its advantages and disadvantages. But regardless of the intended integration - a config file to know what kernel to build is very useful, add one. Fixes: `fc4c93f145` ("selftests: add basic netdevsim devlink flash testing") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20240116154311.1945801-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-01-18 11:51:02 +01:00
Jakub Kicinski	03fb8565c8	selftests: bonding: add missing build configs bonding tests also try to create bridge, veth and dummy interfaces. These are not currently listed in config. Fixes: `bbb774d921` ("net: Add tests for bonding and team address list management") Fixes: `c078290a2b` ("selftests: include bonding tests into the kselftest infra") Acked-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Link: https://lore.kernel.org/r/20240116020201.1883023-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-01-16 06:58:53 -08:00
Jakub Kicinski	4697381bd0	selftests: netdevsim: correct expected FEC strings ethtool CLI has changed its output. Make the test compatible. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20240114224748.1210578-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-01-16 12:42:29 +01:00
Jakub Kicinski	2c4ca79772	selftests: netdevsim: sprinkle more udevadm settle Number of tests are failing when netdev renaming is active on the system. Add udevadm settle in logic determining the names. Fixes: `242aaf03dc` ("selftests: add a test for ethtool pause stats") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240114224726.1210532-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-01-16 12:16:04 +01:00
Benjamin Poirier	c2518da8e6	selftests: bonding: Change script interpreter The tests changed by this patch, as well as the scripts they source, use features which are not part of POSIX sh (ex. 'source' and 'local'). As a result, these tests fail when /bin/sh is dash such as on Debian. Change the interpreter to bash so that these tests can run successfully. Fixes: `d43eff0b85` ("selftests: bonding: up/down delay w/ slave link flapping") Tested-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-01-16 09:06:15 +01:00
Jakub Kicinski	e63c1822ac	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/ethernet/broadcom/bnxt/bnxt.c `e009b2efb7` ("bnxt_en: Remove mis-applied code from bnxt_cfg_ntp_filters()") `0f2b214779` ("bnxt_en: Fix compile error without CONFIG_RFS_ACCEL") https://lore.kernel.org/all/20240105115509.225aa8a2@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-01-04 18:06:46 -08:00
Hangbin Liu	61fa2493ca	selftests: bonding: do not set port down when adding to bond Similar to commit `be80942465` ("selftests: bonding: do not set port down before adding to bond"). The bond-arp-interval-causes-panic test failed after commit `a4abfa627c` ("net: rtnetlink: Enslave device before bringing it up") as the kernel will set the port down _after_ adding to bond if setting port down specifically. Fix it by removing the link down operation when adding to bond. Fixes: `2ffd57327f` ("selftests: bonding: cause oops in bond_rr_gen_slave_id") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Tested-by: Benjamin Poirier <benjamin.poirier@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-01-02 14:17:05 +00:00
Ido Schimmel	af51d6bd0b	selftests: mlxsw: Add PCI reset test Test that PCI reset works correctly by verifying that only the expected reset methods are supported and that after issuing the reset the ifindex of the port changes. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-11-18 17:38:51 +00:00
Jiri Pirko	6151ff9c75	selftests: netdevsim: use suitable existing dummy file for flash test The file name used in flash test was "dummy" because at the time test was written, drivers were responsible for file request and as netdevsim didn't do that, name was unused. However, the file load request is now done in devlink code and therefore the file has to exist. Use first random file from /lib/firmware for this purpose. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-10-13 10:43:13 +01:00
Zhengchao Shao	bf68583624	selftests: bonding: create directly devices in the target namespaces If failed to set link1_1 to netns client, we should delete link1_1 in the cleanup path. But if set link1_1 to netns client successfully, delete link1_1 will report warning. So it will be safer creating directly the devices in the target namespaces. Reported-by: Hangbin Liu <liuhangbin@gmail.com> Closes: https://lore.kernel.org/all/ZNyJx1HtXaUzOkNA@Laptop-X1/ Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Acked-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-08-28 10:24:08 +01:00
Jakub Kicinski	57ce6427e0	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR. Conflicts: include/net/inet_sock.h `f866fbc842` ("ipv4: fix data-races around inet->inet_id") `c274af2242` ("inet: introduce inet->inet_flags") https://lore.kernel.org/all/679ddff6-db6e-4ff6-b177-574e90d0103d@tessares.net/ Adjacent changes: drivers/net/bonding/bond_alb.c `e74216b8de` ("bonding: fix macvlan over alb bond support") `f11e5bd159` ("bonding: support balance-alb with openvswitch") drivers/net/ethernet/broadcom/bgmac.c `d6499f0b7c` ("net: bgmac: Return PTR_ERR() for fixed_phy_register()") `23a14488ea` ("net: bgmac: Fix return value check for fixed_phy_register()") drivers/net/ethernet/broadcom/genet/bcmmii.c `32bbe64a13` ("net: bcmgenet: Fix return value check for fixed_phy_register()") `acf50d1adb` ("net: bcmgenet: Return PTR_ERR() for fixed_phy_register()") net/sctp/socket.c `f866fbc842` ("ipv4: fix data-races around inet->inet_id") `b09bde5c35` ("inet: move inet->mc_loop to inet->inet_frags") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-24 10:51:39 -07:00
Hangbin Liu	246af950b9	selftests: bonding: add macvlan over bond testing Add a macvlan over bonding test with mode active-backup, balance-tlb and balance-alb. ]# ./bond_macvlan.sh TEST: active-backup: IPv4: client->server [ OK ] TEST: active-backup: IPv6: client->server [ OK ] TEST: active-backup: IPv4: client->macvlan_1 [ OK ] TEST: active-backup: IPv6: client->macvlan_1 [ OK ] TEST: active-backup: IPv4: client->macvlan_2 [ OK ] TEST: active-backup: IPv6: client->macvlan_2 [ OK ] TEST: active-backup: IPv4: macvlan_1->macvlan_2 [ OK ] TEST: active-backup: IPv6: macvlan_1->macvlan_2 [ OK ] TEST: active-backup: IPv4: server->client [ OK ] TEST: active-backup: IPv6: server->client [ OK ] TEST: active-backup: IPv4: macvlan_1->client [ OK ] TEST: active-backup: IPv6: macvlan_1->client [ OK ] TEST: active-backup: IPv4: macvlan_2->client [ OK ] TEST: active-backup: IPv6: macvlan_2->client [ OK ] TEST: active-backup: IPv4: macvlan_2->macvlan_2 [ OK ] TEST: active-backup: IPv6: macvlan_2->macvlan_2 [ OK ] [...] TEST: balance-alb: IPv4: client->server [ OK ] TEST: balance-alb: IPv6: client->server [ OK ] TEST: balance-alb: IPv4: client->macvlan_1 [ OK ] TEST: balance-alb: IPv6: client->macvlan_1 [ OK ] TEST: balance-alb: IPv4: client->macvlan_2 [ OK ] TEST: balance-alb: IPv6: client->macvlan_2 [ OK ] TEST: balance-alb: IPv4: macvlan_1->macvlan_2 [ OK ] TEST: balance-alb: IPv6: macvlan_1->macvlan_2 [ OK ] TEST: balance-alb: IPv4: server->client [ OK ] TEST: balance-alb: IPv6: server->client [ OK ] TEST: balance-alb: IPv4: macvlan_1->client [ OK ] TEST: balance-alb: IPv6: macvlan_1->client [ OK ] TEST: balance-alb: IPv4: macvlan_2->client [ OK ] TEST: balance-alb: IPv6: macvlan_2->client [ OK ] TEST: balance-alb: IPv4: macvlan_2->macvlan_2 [ OK ] TEST: balance-alb: IPv6: macvlan_2->macvlan_2 [ OK ] Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-08-24 10:07:13 +02:00
Hangbin Liu	27aa43f83c	selftest: bond: add new topo bond_topo_2d1c.sh Add a new testing topo bond_topo_2d1c.sh which is used more commonly. Make bond_topo_3d1c.sh just source bond_topo_2d1c.sh and add the extra link. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-08-24 10:07:13 +02:00
Hangbin Liu	be80942465	selftests: bonding: do not set port down before adding to bond Before adding a port to bond, it need to be set down first. In the lacpdu test the author set the port down specifically. But commit `a4abfa627c` ("net: rtnetlink: Enslave device before bringing it up") changed the operation order, the kernel will set the port down _after_ adding to bond. So all the ports will be down at last and the test failed. In fact, the veth interfaces are already inactive when added. This means there's no need to set them down again before adding to the bond. Let's just remove the link down operation. Fixes: `a4abfa627c` ("net: rtnetlink: Enslave device before bringing it up") Reported-by: Zhengchao Shao <shaozhengchao@huawei.com> Closes: https://lore.kernel.org/netdev/a0ef07c7-91b0-94bd-240d-944a330fcabd@huawei.com/ Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://lore.kernel.org/r/20230817082459.1685972-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-21 19:05:42 -07:00
Ido Schimmel	f520489e99	selftests: mlxsw: Fix test failure on Spectrum-4 Remove assumptions about shared buffer cell size and instead query the cell size from devlink. Adjust the test to send small packets that fit inside a single cell. Tested on Spectrum-{1,2,3,4}. Fixes: `4735402173` ("mlxsw: spectrum: Extend to support Spectrum-4 ASIC") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/f7dfbf3c4d1cb23838d9eb99bab09afaa320c4ca.1692268427.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-18 19:41:06 -07:00
Zhengchao Shao	e56e220d73	selftests: bonding: remove redundant delete action of device link1_1 When run command "ip netns delete client", device link1_1 has been deleted. So, it is no need to delete link1_1 again. Remove it. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-08-16 07:07:16 +01:00
Petr Machata	aae5bb8d18	selftests: mlxsw: router_bridge_lag: Add a new selftest Add a selftest to verify enslavement to a LAG with upper after fresh devlink reload. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/373a7754daa4dac32759a45095f47b08a2a869c8.1691498735.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-09 15:27:51 -07:00
Petr Machata	67d5ffb9ed	selftests: mlxsw: rif_bridge: Add a new selftest This test verifies driver behavior with regards to creation of RIFs for a bridge as LAGs are added or removed to/from it, and ports added or removed to/from the LAG. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-08-02 09:18:18 +01:00
Petr Machata	6b3f46837c	selftests: mlxsw: rif_lag_vlan: Add a new selftest This test verifies driver behavior with regards to creation of RIFs for LAG VLAN uppers as ports are added or removed to/from the LAG. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-08-02 09:18:18 +01:00
Petr Machata	4308967d98	selftests: mlxsw: rif_lag: Add a new selftest This test verifies driver behavior with regards to creation of RIFs for a LAG as ports are added or removed to/from it. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-08-02 09:18:18 +01:00
Petr Machata	d7eb1f1751	selftests: mlxsw: rtnetlink: Drop obsolete tests Support for enslaving ports to LAGs with uppers will be added in the following patches. Selftests to make sure it actually does the right thing are ready and will be sent as a follow-up. Similarly, ordering of MACVLAN creation and RIF creation will be relaxed and it will be permitted to create a MACVLAN first. Thus these two tests are obsolete. Drop them. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-07-21 08:54:04 +01:00
Ido Schimmel	0a1a818d8a	selftests: mlxsw: Test port range registers' occupancy Test that filters that match on the same port range, but with different combination of IPv4/IPv6 and TCP/UDP all use the same port range register by observing port range registers' occupancy via devlink-resource. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://lore.kernel.org/r/0a2eb63b234fb062ff011e80231868cc80000c81.1689092769.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-07-12 16:57:18 -07:00
Ido Schimmel	45c5a38476	selftests: mlxsw: Add scale test for port ranges Query the maximum number of supported port range registers using devlink-resource and test that this number can be reached by configuring tc filters with different port ranges. Test that an error is returned in case the maximum number is exceeded. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://lore.kernel.org/r/48eee181270d9f291e09d1858c7b26a3f7fcc164.1689092769.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-07-12 16:57:18 -07:00
Petr Machata	664bc72dd2	selftests: mlxsw: one_armed_router: Use port MAC for bridge address In a future patch, mlxsw will start adding RIFs to uppers of front panel port netdevices, if they have an IP address. At the time that the front panel port is enslaved to the bridge, the bridge MAC address does not have the same prefix as other interfaces in the system. On Nvidia Spectrum-1 machines all the RIFs have to have the same 38-bit MAC address prefix. Since the bridge does not obey this limitation, the RIF cannot be created, and the enslavement attempt is vetoed on the grounds of the configuration not being offloadable. The bridge eventually inherits MAC address from its first member, after the enslavement is acked. A number of (mainly VXLAN) selftests already work around the problem by setting the MAC address to whatever it will eventually be anyway. Do the same for this selftest. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-21 14:02:52 -07:00
Petr Machata	5541577521	selftests: mlxsw: vxlan: Disable IPv6 autogen on bridges In a future patch, mlxsw will start adding RIFs to uppers of front panel port netdevices, if they have an IP address. At the time that the front panel port is enslaved to the bridge (this holds for all bridges used here), the bridge MAC address does not have the same prefix as other interfaces in the system. On Nvidia Spectrum-1 machines all the RIFs have to have the same 38-bit MAC address prefix. Since the bridge does not obey this limitation, the RIF cannot be created, and the enslavement attempt is vetoed on the grounds of the configuration not being offloadable. The selftest itself however checks various aspects of VXLAN offloading and the bridges do not need to participate in routing traffic. The IP addresses or the RIFs are irrelevant. Fix by disabling automatic IPv6 address generation for the HW-offloaded bridges in this selftest, thus exempting them from mlxsw router attention. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-21 14:02:52 -07:00
Petr Machata	08035d8e35	selftests: mlxsw: spectrum: q_in_vni_veto: Disable IPv6 autogen on a bridge In a future patch, mlxsw will start adding RIFs to uppers of front panel port netdevices, if they have an IP address. At the time that the front panel port is enslaved to the bridge, the bridge MAC address does not have the same prefix as other interfaces in the system. On Nvidia Spectrum-1 machines all the RIFs have to have the same 38-bit MAC address prefix. Since the bridge does not obey this limitation, the RIF cannot be created, and the enslavement attempt is vetoed on the grounds of the configuration not being offloadable. The selftest itself however checks vetoing of a different aspect of the configuration and the bridge does not need to participate in routing traffic. The IP address or the RIF are irrelevant. Fix by disabling automatic IPv6 address generation for the HW-offloaded bridge in this selftest, thus exempting it from mlxsw router attention. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-21 14:02:52 -07:00
Petr Machata	ea2d5f757e	selftests: mlxsw: qos_mc_aware: Disable IPv6 autogen on bridges In a future patch, mlxsw will start adding RIFs to uppers of front panel port netdevices, if they have an IP address. At the time that the front panel port is enslaved to the bridge (this holds for both bridges used here), the bridge MAC address does not have the same prefix as other interfaces in the system. On Nvidia Spectrum-1 machines all the RIFs have to have the same 38-bit MAC address prefix. Since the bridge does not obey this limitation, the RIF cannot be created, and the enslavement attempt is vetoed on the grounds of the configuration not being offloadable. The selftest itself however checks traffic prioritization and scheduling, and the bridges serve for their L2 forwarding capabilities, and do not need to participate in routing traffic. The IP addresses or the RIFs are irrelevant. Fix by disabling automatic IPv6 address generation for the HW-offloaded bridges in this selftest, thus exempting them from mlxsw router attention. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-21 14:02:52 -07:00
Petr Machata	ec7023e674	selftests: mlxsw: qos_ets_strict: Disable IPv6 autogen on bridges In a future patch, mlxsw will start adding RIFs to uppers of front panel port netdevices, if they have an IP address. At the time that the front panel port is enslaved to the bridge (this holds for both bridges used here), the bridge MAC address does not have the same prefix as other interfaces in the system. On Nvidia Spectrum-1 machines all the RIFs have to have the same 38-bit MAC address prefix. Since the bridge does not obey this limitation, the RIF cannot be created, and the enslavement attempt is vetoed on the grounds of the configuration not being offloadable. The selftest itself however checks traffic prioritization and scheduling, and the bridges serve for their L2 forwarding capabilities, and do not need to participate in routing traffic. The IP addresses or the RIFs are irrelevant. Fix by disabling automatic IPv6 address generation for the HW-offloaded bridges in this selftest, thus exempting them from mlxsw router attention. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-21 14:02:52 -07:00
Petr Machata	6349f9bbbf	selftests: mlxsw: qos_dscp_bridge: Disable IPv6 autogen on a bridge In a future patch, mlxsw will start adding RIFs to uppers of front panel port netdevices, if they have an IP address. At the time that the front panel port is enslaved to the bridge, the bridge MAC address does not have the same prefix as other interfaces in the system. On Nvidia Spectrum-1 machines all the RIFs have to have the same 38-bit MAC address prefix. Since the bridge does not obey this limitation, the RIF cannot be created, and the enslavement attempt is vetoed on the grounds of the configuration not being offloadable. The selftest itself however checks DCB DSCP-based prioritization, and the bridge serves for its L2 forwarding capabilities, and does not need to participate in routing traffic. The IP address or the RIF are irrelevant. Fix by disabling automatic IPv6 address generation for the HW-offloaded bridge in this selftest, thus exempting it from mlxsw router attention. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-21 14:02:52 -07:00
Petr Machata	32b3a7bf85	selftests: mlxsw: mirror_gre_scale: Disable IPv6 autogen on a bridge In a future patch, mlxsw will start adding RIFs to uppers of front panel port netdevices, if they have an IP address. At the time that the front panel port is enslaved to the bridge, the bridge MAC address does not have the same prefix as other interfaces in the system. On Nvidia Spectrum-1 machines all the RIFs have to have the same 38-bit MAC address prefix. Since the bridge does not obey this limitation, the RIF cannot be created, and the enslavement attempt is vetoed on the grounds of the configuration not being offloadable. The selftest itself however checks how many mirroring sessions a machine is capable of offloading. The IP address or the RIF are irrelevant. Fix by disabling automatic IPv6 address generation for the HW-offloaded bridge in this selftest, thus exempting it from mlxsw router attention. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-21 14:02:52 -07:00
Petr Machata	a758dc469a	selftests: mlxsw: extack: Disable IPv6 autogen on bridges In a future patch, mlxsw will start adding RIFs to uppers of front panel port netdevices, if they have an IP address. At the time that the front panel port is enslaved to the bridge (this holds for all bridges used here), the bridge MAC address does not have the same prefix as other interfaces in the system. On Nvidia Spectrum-1 machines all the RIFs have to have the same 38-bit MAC address prefix. Since the bridge does not obey this limitation, the RIF cannot be created, and the enslavement attempt is vetoed on the grounds of the configuration not being offloadable. The selftest itself however checks whether a different vetoed aspect of the configuration provides an extack. The IP address or the RIF are irrelevant. Fix by disabling automatic IPv6 address generation for the HW-offloaded bridges in this selftest, thus exempting them from mlxsw router attention. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-21 14:02:52 -07:00
Petr Machata	8cfdd300a5	selftests: mlxsw: q_in_q_veto: Disable IPv6 autogen on bridges In a future patch, mlxsw will start adding RIFs to uppers of front panel port netdevices, if they have an IP address. The swp enslavement to the 802.1ad bridge is not allowed, because RIFs are not allowed to be created for 802.1ad bridges, but the address indicates one needs to be created. Thus the veto selftests fail already during the port enslavement. Then the attempt to create a VLAN on top of the same bridge is not vetoed, because the bridge is not related to mlxsw, and the selftest fails. Fix by disabling automatic IPv6 address generation for the bridges in this selftest, thus exempting them from the mlxsw router attention. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-21 14:02:52 -07:00
Petr Machata	34ad708d1b	selftests: mlxsw: egress_vid_classification: Fix the diagram The topology diagram implies that $swp1 and $swp2 are members of the bridge br0, when in fact only their uppers, $swp1.10 and $swp2.10 are. Adjust the diagram. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-06-05 11:29:49 +01:00
Petr Machata	204cc3d04f	selftests: mlxsw: ingress_rif_conf_1d: Fix the diagram The topology diagram implies that $swp1 and $swp2 are members of the bridge br0, when in fact only their uppers, $swp1.10 and $swp2.10 are. Adjust the diagram. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-06-05 11:29:49 +01:00
Jakub Kicinski	bc88ba0cad	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes. No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-05-11 09:06:55 -07:00
Linus Torvalds	6e27831b91	Networking fixes for 6.4-rc2, including fixes from netfilter Current release - regressions: - mtk_eth_soc: fix NULL pointer dereference Previous releases - regressions: - core: - skb_partial_csum_set() fix against transport header magic value - fix load-tearing on sk->sk_stamp in sock_recv_cmsgs(). - annotate sk->sk_err write from do_recvmmsg() - add vlan_get_protocol_and_depth() helper - netlink: annotate accesses to nlk->cb_running - netfilter: always release netdev hooks from notifier Previous releases - always broken: - core: deal with most data-races in sk_wait_event() - netfilter: fix possible bug_on with enable_hooks=1 - eth: bonding: fix send_peer_notif overflow - eth: xpcs: fix incorrect number of interfaces - eth: ipvlan: fix out-of-bounds caused by unclear skb->cb - eth: stmmac: Initialize MAC_ONEUS_TIC_COUNTER register Signed-off-by: Paolo Abeni <pabeni@redhat.com> -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmRcxawSHHBhYmVuaUBy ZWRoYXQuY29tAAoJECkkeY3MjxOkv6wQAJgfOBlDAkZNKHzwtMuFiLxECeEMWY9h wJCyiq0qXnz9p5ZqjdmTmA8B+jUp9VkpgN5Z3lid5hXDfzDrvXL1KGZW4pc4ooz9 GUzrp0EUzO5UsyrlZRS9vJ9mbCGN5M1ZWtWH93g8OzGJPRnLs0Q/Tr4IFTBVKzVb GmJPy/ZYWYDjnvx3BgewRDuYeH3Rt9lsIt4Pxq/E+D8W3ypvVM0m3GvrO5eEzMeu EfeilAdmJGJUufeoGguKt0hheqILS3kNCjQO25XS2Lq1OqetnR/wqTwXaaVxL2du Eb2ca7wKkihDpl2l8bQ3ss6vqM0HEpZ63Y2PJaNBS8ASdLsMq4n2L6j2JMfT8hWY RG3nJS7F2UFLyYmCJjNL1/I+Z9XeMyFKnHORzHK1dAkMlhd+8NauKWAxdjlxMbxX p1msyTl54bG0g6FrU/zAirCWNAAZYCPdZG/XvA/2Jj9mdy64OlGlv/QdJvfjcx+C L6nkwZfwXU7QUwKeeTfP8abte2SLrXIxkJrnNEAntPnFOSmd16+/yvQ8JVlbWTMd JugJrSAIxjOglIr/1fsnUuV+Ab+JDYQv/wkoyzvtcY2tjhTAHzgmTwwSfeYiCTJE rEbjyVvVgMcLTUIk/R9QC5/k6nX/7/KRDHxPOMBX4boOsuA0ARVjzt8uKRvv/7cS dRV98RwvCKvD =MoPD -----END PGP SIGNATURE----- Merge tag 'net-6.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from netfilter. Current release - regressions: - mtk_eth_soc: fix NULL pointer dereference Previous releases - regressions: - core: - skb_partial_csum_set() fix against transport header magic value - fix load-tearing on sk->sk_stamp in sock_recv_cmsgs(). - annotate sk->sk_err write from do_recvmmsg() - add vlan_get_protocol_and_depth() helper - netlink: annotate accesses to nlk->cb_running - netfilter: always release netdev hooks from notifier Previous releases - always broken: - core: deal with most data-races in sk_wait_event() - netfilter: fix possible bug_on with enable_hooks=1 - eth: bonding: fix send_peer_notif overflow - eth: xpcs: fix incorrect number of interfaces - eth: ipvlan: fix out-of-bounds caused by unclear skb->cb - eth: stmmac: Initialize MAC_ONEUS_TIC_COUNTER register" * tag 'net-6.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (31 commits) af_unix: Fix data races around sk->sk_shutdown. af_unix: Fix a data race of sk->sk_receive_queue->qlen. net: datagram: fix data-races in datagram_poll() net: mscc: ocelot: fix stat counter register values ipvlan:Fix out-of-bounds caused by unclear skb->cb docs: networking: fix x25-iface.rst heading & index order gve: Remove the code of clearing PBA bit tcp: add annotations around sk->sk_shutdown accesses net: add vlan_get_protocol_and_depth() helper net: pcs: xpcs: fix incorrect number of interfaces net: deal with most data-races in sk_wait_event() net: annotate sk->sk_err write from do_recvmmsg() netlink: annotate accesses to nlk->cb_running kselftest: bonding: add num_grat_arp test selftests: forwarding: lib: add netns support for tc rule handle stats get Documentation: bonding: fix the doc of peer_notif_delay bonding: fix send_peer_notif overflow net: ethernet: mtk_eth_soc: fix NULL pointer dereference selftests: nft_flowtable.sh: check ingress/egress chain too selftests: nft_flowtable.sh: monitor result file sizes ...	2023-05-11 08:42:47 -05:00
Liang Li	2f0f556713	selftests: bonding: delete unnecessary line "ip link set dev "$devbond1" nomaster" This line code in bond-eth-type-change.sh is unnecessary. Because $devbond1 was not added to any master device. Signed-off-by: Liang Li <liali@redhat.com> Acked-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-05-10 10:06:42 +01:00
Hangbin Liu	6cbe791c0f	kselftest: bonding: add num_grat_arp test TEST: num_grat_arp (active-backup miimon num_grat_arp 10) [ OK ] TEST: num_grat_arp (active-backup miimon num_grat_arp 20) [ OK ] TEST: num_grat_arp (active-backup miimon num_grat_arp 30) [ OK ] TEST: num_grat_arp (active-backup miimon num_grat_arp 50) [ OK ] Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-05-10 09:27:20 +01:00
Linus Torvalds	f085df1be6	Disable building BPF based features by default for v6.4. We need to better polish building with BPF skels, so revert back to making it an experimental feature that has to be explicitely enabled using BUILD_BPF_SKEL=1. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCZFbCXwAKCRCyPKLppCJ+ J7cHAP97erKY4hBXArjpfzcvpFmboh/oqhbTLntyIpS6TEnOyQEAyervAPGIjQYC DCo4foyXmOWn3dhNtK9M+YiRl3o2SgQ= =7G78 -----END PGP SIGNATURE----- Merge tag 'perf-tools-for-v6.4-3-2023-05-06' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tool updates from Arnaldo Carvalho de Melo: "Third version of perf tool updates, with the build problems with with using a 'vmlinux.h' generated from the main build fixed, and the bpf skeleton build disabled by default. Build: - Require libtraceevent to build, one can disable it using NO_LIBTRACEEVENT=1. It is required for tools like 'perf sched', 'perf kvm', 'perf trace', etc. libtraceevent is available in most distros so installing 'libtraceevent-devel' should be a one-time event to continue building perf as usual. Using NO_LIBTRACEEVENT=1 produces tooling that is functional and sufficient for lots of users not interested in those libtraceevent dependent features. - Allow Python support in 'perf script' when libtraceevent isn't linked, as not all features requires it, for instance Intel PT does not use tracepoints. - Error if the python interpreter needed for jevents to work isn't available and NO_JEVENTS=1 isn't set, preventing a build without support for JSON vendor events, which is a rare but possible condition. The two check error messages: $(error ERROR: No python interpreter needed for jevents generation. Install python or build with NO_JEVENTS=1.) $(error ERROR: Python interpreter needed for jevents generation too old (older than 3.6). Install a newer python or build with NO_JEVENTS=1.) - Make libbpf 1.0 the minimum required when building with out of tree, distro provided libbpf. - Use libsdtc++'s and LLVM's libcxx's __cxa_demangle, a portable C++ demangler, add 'perf test' entry for it. - Make binutils libraries opt in, as distros disable building with it due to licensing, they were used for C++ demangling, for instance. - Switch libpfm4 to opt-out rather than opt-in, if libpfm-devel (or equivalent) isn't installed, we'll just have a build warning: Makefile.config:1144: libpfm4 not found, disables libpfm4 support. Please install libpfm4-dev - Add a feature test for scandirat(), that is not implemented so far in musl and uclibc, disabling features that need it, such as scanning for tracepoints in /sys/kernel/tracing/events. perf BPF filters: - New feature where BPF can be used to filter samples, for instance: $ sudo ./perf record -e cycles --filter 'period > 1000' true $ sudo ./perf script perf-exec 2273949 546850.708501: 5029 cycles: ffffffff826f9e25 finish_wait+0x5 ([kernel.kallsyms]) perf-exec 2273949 546850.708508: 32409 cycles: ffffffff826f9e25 finish_wait+0x5 ([kernel.kallsyms]) perf-exec 2273949 546850.708526: 143369 cycles: ffffffff82b4cdbf xas_start+0x5f ([kernel.kallsyms]) perf-exec 2273949 546850.708600: 372650 cycles: ffffffff8286b8f7 __pagevec_lru_add+0x117 ([kernel.kallsyms]) perf-exec 2273949 546850.708791: 482953 cycles: ffffffff829190de __mod_memcg_lruvec_state+0x4e ([kernel.kallsyms]) true 2273949 546850.709036: 501985 cycles: ffffffff828add7c tlb_gather_mmu+0x4c ([kernel.kallsyms]) true 2273949 546850.709292: 503065 cycles: 7f2446d97c03 _dl_map_object_deps+0x973 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) - In addition to 'period' (PERF_SAMPLE_PERIOD), the other PERF_SAMPLE_ can be used for filtering, and also some other sample accessible values, from tools/perf/Documentation/perf-record.txt: Essentially the BPF filter expression is: <term> <operator> <value> (("," \| "\|\|") <term> <operator> <value>)* The <term> can be one of: ip, id, tid, pid, cpu, time, addr, period, txn, weight, phys_addr, code_pgsz, data_pgsz, weight1, weight2, weight3, ins_lat, retire_lat, p_stage_cyc, mem_op, mem_lvl, mem_snoop, mem_remote, mem_lock, mem_dtlb, mem_blk, mem_hops The <operator> can be one of: ==, !=, >, >=, <, <=, & The <value> can be one of: <number> (for any term) na, load, store, pfetch, exec (for mem_op) l1, l2, l3, l4, cxl, io, any_cache, lfb, ram, pmem (for mem_lvl) na, none, hit, miss, hitm, fwd, peer (for mem_snoop) remote (for mem_remote) na, locked (for mem_locked) na, l1_hit, l1_miss, l2_hit, l2_miss, any_hit, any_miss, walk, fault (for mem_dtlb) na, by_data, by_addr (for mem_blk) hops0, hops1, hops2, hops3 (for mem_hops) perf lock contention: - Show lock type with address. - Track and show mmap_lock, siglock and per-cpu rq_lock with address. This is done for mmap_lock by following the current->mm pointer: $ sudo ./perf lock con -abl -- sleep 10 contended total wait max wait avg wait address symbol ... 16344 312.30 ms 2.22 ms 19.11 us ffff8cc702595640 17686 310.08 ms 1.49 ms 17.53 us ffff8cc7025952c0 3 84.14 ms 45.79 ms 28.05 ms ffff8cc78114c478 mmap_lock 3557 76.80 ms 68.75 us 21.59 us ffff8cc77ca3af58 1 68.27 ms 68.27 ms 68.27 ms ffff8cda745dfd70 9 54.53 ms 7.96 ms 6.06 ms ffff8cc7642a48b8 mmap_lock 14629 44.01 ms 60.00 us 3.01 us ffff8cc7625f9ca0 3481 42.63 ms 140.71 us 12.24 us ffffffff937906ac vmap_area_lock 16194 38.73 ms 42.15 us 2.39 us ffff8cd397cbc560 11 38.44 ms 10.39 ms 3.49 ms ffff8ccd6d12fbb8 mmap_lock 1 5.43 ms 5.43 ms 5.43 ms ffff8cd70018f0d8 1674 5.38 ms 422.93 us 3.21 us ffffffff92e06080 tasklist_lock 581 4.51 ms 130.68 us 7.75 us ffff8cc9b1259058 5 3.52 ms 1.27 ms 703.23 us ffff8cc754510070 112 3.47 ms 56.47 us 31.02 us ffff8ccee38b3120 381 3.31 ms 73.44 us 8.69 us ffffffff93790690 purge_vmap_area_lock 255 3.19 ms 36.35 us 12.49 us ffff8d053ce30c80 - Update default map size to 16384. - Allocate single letter option -M for --map-nr-entries, as it is proving being frequently used. - Fix struct rq lock access for older kernels with BPF's CO-RE (Compile once, run everywhere). - Fix problems found with MSAn. perf report/top: - Add inline information when using --call-graph=fp or lbr, as was already done to the --call-graph=dwarf callchain mode. - Improve the 'srcfile' sort key performance by really using an optimization introduced in 6.2 for the 'srcline' sort key that avoids calling addr2line for comparision with each sample. perf sched: - Make 'perf sched latency/map/replay' to use "sched:sched_waking" instead of "sched:sched_waking", consistent with 'perf record' since `d566a9c2d4` ("perf sched: Prefer sched_waking event when it exists"). perf ftrace: - Make system wide the default target for latency subcommand, run the following command then generate some network traffic and press control+C: # perf ftrace latency -T __kfree_skb ^C DURATION \| COUNT \| GRAPH \| 0 - 1 us \| 27 \| ############# \| 1 - 2 us \| 22 \| ########### \| 2 - 4 us \| 8 \| #### \| 4 - 8 us \| 5 \| ## \| 8 - 16 us \| 24 \| ############ \| 16 - 32 us \| 2 \| # \| 32 - 64 us \| 1 \| \| 64 - 128 us \| 0 \| \| 128 - 256 us \| 0 \| \| 256 - 512 us \| 0 \| \| 512 - 1024 us \| 0 \| \| 1 - 2 ms \| 0 \| \| 2 - 4 ms \| 0 \| \| 4 - 8 ms \| 0 \| \| 8 - 16 ms \| 0 \| \| 16 - 32 ms \| 0 \| \| 32 - 64 ms \| 0 \| \| 64 - 128 ms \| 0 \| \| 128 - 256 ms \| 0 \| \| 256 - 512 ms \| 0 \| \| 512 - 1024 ms \| 0 \| \| 1 - ... s \| 0 \| \| # perf top: - Add --branch-history (LBR: Last Branch Record) option, just like already available for 'perf record'. - Fix segfault in thread__comm_len() where thread->comm was being used outside thread->comm_lock. perf annotate: - Allow configuring objdump and addr2line in ~/.perfconfig., so that you can use alternative binaries, such as llvm's. perf kvm: - Add TUI mode for 'perf kvm stat report'. Reference counting: - Add reference count checking infrastructure to check for use after free, done to the 'cpumap', 'namespaces', 'maps' and 'map' structs, more to come. To build with it use -DREFCNT_CHECKING=1 in the make command line to build tools/perf. Documented at: https://perf.wiki.kernel.org/index.php/Reference_Count_Checking - The above caught, for instance, fix, present in this series: - Fix maps use after put in 'perf test "Share thread maps"': 'maps' is copied from leader, but the leader is put on line 79 and then 'maps' is used to read the reference count below - so a use after put, with the put of maps happening within thread__put. Fixed by reversing the order of puts so that the leader is put last. - Also several fixes were made to places where reference counts were not being held. - Make this one of the tests in 'make -C tools/perf build-test' to regularly build test it and to make sure no direct access to the reference counted structs are made, doing that via accessors to check the validity of the struct pointer. ARM64: - Fix 'perf report' segfault when filtering coresight traces by sparse lists of CPUs. - Add support for 'simd' as a sort field for 'perf report', to show ARM's NEON SIMD's predicate flags: "partial" and "empty". arm64 vendor events: - Add N1 metrics. Intel vendor events: - Add graniterapids, grandridge and sierraforrest events. - Refresh events for: alderlake, aldernaken, broadwell, broadwellde, broadwellx, cascadelakx, haswell, haswellx, icelake, icelakex, jaketown, meteorlake, knightslanding, sandybridge, sapphirerapids, silvermont, skylake, tigerlake and westmereep-dp - Refresh metrics for alderlake-n, broadwell, broadwellde, broadwellx, haswell, haswellx, icelakex, ivybridge, ivytown and skylakex. perf stat: - Implement --topdown using JSON metrics. - Add TopdownL1 JSON metric as a default if present, but disable it for now for some Intel hybrid architectures, a series of patches addressing this is being reviewed and will be submitted for v6.5. - Use metrics for --smi-cost. - Update topdown documentation. Vendor events (JSON) infrastructure: - Add support for computing and printing metric threshold values. For instance, here is one found in thesapphirerapids json file: { "BriefDescription": "Percentage of cycles spent in System Management Interrupts.", "MetricExpr": "((msr@aperf@ - cycles) / msr@aperf@ if msr@smi@ > 0 else 0)", "MetricGroup": "smi", "MetricName": "smi_cycles", "MetricThreshold": "smi_cycles > 0.1", "ScaleUnit": "100%" }, - Test parsing metric thresholds with the fake PMU in 'perf test pmu-events'. - Support for printing metric thresholds in 'perf list'. - Add --metric-no-threshold option to 'perf stat'. - Add rand (reverse and) and has_pmem (optane memory) support to metrics. - Sort list of input files to avoid depending on the order from readdir() helping in obtaining reproducible builds. S/390: - Add common metrics: - CPI (cycles per instruction), prbstate (ratio of instructions executed in problem state compared to total number of instructions), l1mp (Level one instruction and data cache misses per 100 instructions). - Add cache metrics for z13, z14, z15 and z16. - Add metric for TLB and cache. ARM: - Add raw decoding for SPE (Statistical Profiling Extension) v1.3 MTE (Memory Tagging Extension) and MOPS (Memory Operations) load/store. Intel PT hardware tracing: - Add event type names UINTR (User interrupt delivered) and UIRET (Exiting from user interrupt routine), documented in table 32-50 "CFE Packet Type and Vector Fields Details" in the Intel Processor Trace chapter of The Intel SDM Volume 3 version 078. - Add support for new branch instructions ERETS and ERETU. - Fix CYC timestamps after standalone CBR ARM CoreSight hardware tracing: - Allow user to override timestamp and contextid settings. - Fix segfault in dso lookup. - Fix timeless decode mode detection. - Add separate decode paths for timeless and per-thread modes. auxtrace: - Fix address filter entire kernel size. Miscellaneous: - Fix use-after-free and unaligned bugs in the PLT handling routines. - Use zfree() to reduce chances of use after free. - Add missing 0x prefix for addresses printed in hexadecimal in 'perf probe'. - Suppress massive unsupported target platform errors in the unwind code. - Fix return incorrect build_id size in elf_read_build_id(). - Fix 'perf scripts intel-pt-events.py' IPC output for Python 2 . - Add missing new parameter in kfree_skb tracepoint to the python scripts using it. - Add 'perf bench syscall fork' benchmark. - Add support for printing PERF_MEM_LVLNUM_UNC (Uncached access) in 'perf mem'. - Fix wrong size expectation for perf test 'Setup struct perf_event_attr' caused by the patch adding perf_event_attr::config3. - Fix some spelling mistakes" * tag 'perf-tools-for-v6.4-3-2023-05-06' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (365 commits) Revert "perf build: Make BUILD_BPF_SKEL default, rename to NO_BPF_SKEL" Revert "perf build: Warn for BPF skeletons if endian mismatches" perf metrics: Fix SEGV with --for-each-cgroup perf bpf skels: Stop using vmlinux.h generated from BTF, use subset of used structs + CO-RE perf stat: Separate bperf from bpf_profiler perf test record+probe_libc_inet_pton: Fix call chain match on x86_64 perf test record+probe_libc_inet_pton: Fix call chain match on s390 perf tracepoint: Fix memory leak in is_valid_tracepoint() perf cs-etm: Add fix for coresight trace for any range of CPUs perf build: Fix unescaped # in perf build-test perf unwind: Suppress massive unsupported target platform errors perf script: Add new parameter in kfree_skb tracepoint to the python scripts using it perf script: Print raw ip instead of binary offset for callchain perf symbols: Fix return incorrect build_id size in elf_read_build_id() perf list: Modify the warning message about scandirat(3) perf list: Fix memory leaks in print_tracepoint_events() perf lock contention: Rework offset calculation with BPF CO-RE perf lock contention: Fix struct rq lock access perf stat: Disable TopdownL1 on hybrid perf stat: Avoid SEGV on counter->name ...	2023-05-07 11:32:18 -07:00
Petr Machata	8fcac79270	selftests: forwarding: generalize bail_on_lldpad from mlxsw mlxsw selftests often invoke a bail_on_lldpad() helper to make sure LLDPAD is not running, to prevent conflicts between the QoS configuration applied through TC or DCB command line tool, and the DCB configuration that LLDPAD might apply. This helper might be useful to others. Move the function to lib.sh, and parameterize to make reusable in other contexts. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-04-20 20:03:21 -07:00
Petr Machata	54e906f163	selftests: forwarding: sch_tbf_*: Add a pre-run hook The driver-specific wrappers of these selftests invoke bail_on_lldpad to make sure that LLDPAD doesn't trample the configuration. The function bail_on_lldpad is going to move to lib.sh in the next patch. With that, it won't be visible for the wrappers before sourcing the framework script. And after sourcing it, it is too late: the selftest will have run by then. One option might be to source NUM_NETIFS=0 lib.sh from the wrapper, but even if that worked (it might, it might not), that seems cumbersome. lib.sh is doing fair amount of stuff, and even if it works today, it does not look particularly solid as a solution. Instead, introduce a hook, sch_tbf_pre_hook(), that when available, gets invoked. Move the bail to the hook. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-04-20 20:03:21 -07:00
Hangbin Liu	2e825f8acc	selftests: bonding: add arp validate test This patch add bonding arp validate tests with mode active backup, monitor arp_ip_target and ns_ip6_target. It also checks mii_status to make sure all slaves are UP. Acked-by: Jonathan Toppins <jtoppins@redhat.com> Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-04-07 08:47:20 +01:00
Hangbin Liu	481b56e039	selftests: bonding: re-format bond option tests To improve the testing process for bond options, A new bond topology lib is added to our testing setup. The current option_prio.sh file will be renamed to bond_options.sh so that all bonding options can be tested here. Specifically, for priority testing, we will run all tests using modes 1, 5, and 6. These changes will help us streamline the testing process and ensure that our bond options are rigorously evaluated. Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Jonathan Toppins <jtoppins@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-04-07 08:47:20 +01:00
Patrice Duroux	7f8d3fbe09	perf tests test_bridge_fdb_stress.sh: Fix redirection of stderr to stdin It's not 2&>1, the correct is 2>&1. Signed-off-by: Patrice Duroux <patrice.duroux@gmail.com> Cc: linux-kselftest@vger.kernel.org Link: https://lore.kernel.org/r/20230303193058.21274-1-patrice.duroux@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-04 09:39:55 -03:00
Nikolay Aleksandrov	222c94ec0a	selftests: bonding: add tests for ether type changes Add new network selftests for the bonding device which exercise the ether type changing call paths. They also test for the recent syzbot bug[1] which causes a warning and results in wrong device flags (IFF_SLAVE missing). The test adds three bond devices and a nlmon device, enslaves one of the bond devices to the other and then uses the nlmon device for successful and unsuccesful enslaves both of which change the bond ether type. Thus we can test for both MASTER and SLAVE flags at the same time. If the flags are properly restored we get: TEST: Change ether type of an enslaved bond device with unsuccessful enslave [ OK ] TEST: Change ether type of an enslaved bond device with successful enslave [ OK ] [1] https://syzkaller.appspot.com/bug?id=391c7b1f6522182899efba27d891f1743e8eb3ef Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Acked-by: Jonathan Toppins <jtoppins@redhat.com> Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-03-17 07:56:41 +00:00

1 2 3 4 5 ...

380 commits