linux

mirror of synced 2025-03-06 20:59:54 +01:00

Author	SHA1	Message	Date
Maciej Fijalkowski	5c57e507f2	ice: skip NULL check against XDP prog in ZC path Whole zero-copy variant of clean Rx IRQ is executed when xsk_pool is attached to rx_ring and it can happen only when XDP program is present on interface. Therefore it is safe to assume that program is always !NULL and there is no need for checking it in ice_run_xdp_zc. Reviewed-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-12 10:28:40 -08:00
Maciej Fijalkowski	43a925e49d	ice: remove redundant checks in ice_change_mtu dev_validate_mtu checks that mtu value specified by user is not less than min mtu and not greater than max allowed mtu. It is being done before calling the ndo_change_mtu exposed by driver, so remove these redundant checks in ice_change_mtu. Reviewed-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-12 10:20:13 -08:00
Maciej Fijalkowski	29b82f2a09	ice: move skb pointer from rx_buf to rx_ring Similar thing has been done in i40e, as there is no real need for having the sk_buff pointer in each rx_buf. Non-eop frames can be simply handled on that pointer moved upwards to rx_ring. Reviewed-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-12 10:18:40 -08:00
Maciej Fijalkowski	59c97d1b51	ice: simplify ice_run_xdp There's no need for 'result' variable, we can directly return the internal status based on action returned by xdp prog. Reviewed-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-12 10:14:30 -08:00
Tony Nguyen	741106f7bd	ice: Improve MSI-X fallback logic Currently if the driver is unable to get all the MSI-X vectors it wants, it falls back to the minimum configuration which equates to a single Tx/Rx traffic queue pair. Instead of using the minimum configuration, if given more vectors than the minimum, utilize those vectors for additional traffic queues after accounting for other interrupts. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>	2021-02-08 16:27:01 -08:00
Mitch Williams	fe6cd89050	ice: Fix trivial error message This message indicates an error on close, not open. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-08 16:27:01 -08:00
Bruce Allan	7a63dae0fa	ice: remove unnecessary casts Casting a void * rvalue in an assignment is unnecessary in C; remove the casts. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-08 16:27:01 -08:00
Chinh T Cao	fc2d1165d4	ice: Refactor DCB related variables out of the ice_port_info struct Refactor the DCB related variables out of the ice_port_info_struct. The goal is to make the ice_port_info struct cleaner. Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com> Co-developed-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-08 16:27:01 -08:00
Jesse Brandeburg	1d9f7ca324	ice: fix writeback enable logic The writeback enable logic was incorrectly implemented (due to misunderstanding what the side effects of the implementation would be during polling). Fix this logic issue, while implementing a new feature allowing the user to control the writeback frequency using the knobs for controlling interrupt throttling that we already have. Basically if you leave adaptive interrupts enabled, the writeback frequency will be varied even if busy_polling or if napi-poll is in use. If the interrupt rates are set to a fixed value by ethtool -C and adaptive is off, the driver will allow the user-set interrupt rate to guide how frequently the hardware will complete descriptors to the driver. Effectively the user will get a control over the hardware efficiency, allowing the choice between immediate interrupts or delayed up to a maximum of the interrupt rate, even when interrupts are disabled during polling. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Co-developed-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-08 16:27:01 -08:00
Ben Shelton	4f8a14976a	ice: Use PSM clock frequency to calculate RL profiles The core clock frequency is currently hardcoded at 446 MHz for the RL profile calculations. This causes issues since not all devices use that clock frequency. Read the GLGEN_CLKSTAT_SRC register to determine which PSM clock frequency is selected. This ensures that the rate limiter profile calculations will be correct. Signed-off-by: Ben Shelton <benjamin.h.shelton@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-08 16:27:01 -08:00
Kiran Patil	b126bd6bcd	ice: create scheduler aggregator node config and move VSIs Create set scheduler aggregator node and move for VSIs into respective scheduler node. Max children per aggregator node is 64. There are two types of aggregator node(s) created. 1. dedicated node for PF and _CTRL VSIs 2. dedicated node(s) for VFs. As part of reset and rebuild, aggregator nodes are recreated and VSIs are moved to respective aggregator node. Having related VSIs in respective tree avoid starvation between PF and VF w.r.t Tx bandwidth. Co-developed-by: Tarun Singh <tarun.k.singh@intel.com> Signed-off-by: Tarun Singh <tarun.k.singh@intel.com> Co-developed-by: Victor Raj <victor.raj@intel.com> Signed-off-by: Victor Raj <victor.raj@intel.com> Co-developed-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Kiran Patil <kiran.patil@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-08 16:27:01 -08:00
Dave Ertman	df006dd4b1	ice: Add initial support framework for LAG Add the framework and initial implementation for receiving and processing netdev bonding events. This is only the software support and the implementation of the HW offload for bonding support will be coming at a later time. There are some architectural gaps that need to be closed before that happens. Because this is a software only solution that supports in kernel bonding, SR-IOV is not supported with this implementation. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-08 16:27:01 -08:00
Michal Swiatkowski	c7a219048e	ice: Remove xsk_buff_pool from VSI structure Current implementation of netdev already contains xsk_buff_pools. We no longer have to contain these structures in ice_vsi. Refactor the code to operate on netdev-provided xsk_buff_pools. Move scheduling napi on each queue to a separate function to simplify setup function. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-08 16:27:01 -08:00
Dave Ertman	34295a3696	ice: implement new LLDP filter command There is an issue with some NVMs where an already existent LLDP filter is blocking the creation of a filter to allow LLDP packets to be redirected to the default VSI for the interface. This is blocking all LLDP functionality based in the kernel when the FW LLDP agent is disabled (e.g. software based DCBx). Implement the new AQ command to allow adding VSI destinations to existent filters on NVM versions that support the new command. The new lldp_fltr_ctrl AQ command supports Rx filters only, so the code flow for adding filters to disable Tx of control frames will remain intact. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-08 16:27:01 -08:00
Brett Creeley	382e0a6880	ice: log message when trusted VF goes in/out of promisc mode Currently there is no message printed on the host when a VF goes in and out of promiscuous mode. This is causing confusion because this is the expected behavior based on i40e. Fix this. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-08 16:27:01 -08:00
Bruce Allan	12aae8f1d8	ice: remove dead code The check for a NULL pf pointer is moot since the earlier declaration and assignment of struct device *dev already de-referenced the pointer. Also, the only caller of ice_set_dflt_mib() already ensures pf is not NULL. Cc: Dave Ertman <david.m.ertman@intel.com> Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-05 11:44:48 -08:00
Bruce Allan	11404310d5	ice: use flex_array_size where possible Use the flex_array_size() helper with the recently added flexible array members in structures. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-05 11:44:42 -08:00
Gustavo A. R. Silva	e94c0df984	ice: Replace one-element array with flexible-array member There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. Refactor the code according to the use of a flexible-array member in struct ice_res_tracker, instead of a one-element array and use the struct_size() helper to calculate the size for the allocations. Also, notice that the code below suggests that, currently, two too many bytes are being allocated with devm_kzalloc(), as the total number of entries (pf->irq_tracker->num_entries) for pf->irq_tracker->list[] is _vectors_ and sizeof(pf->irq_tracker) also includes the size of the one-element array _list_ in struct ice_res_tracker. drivers/net/ethernet/intel/ice/ice_main.c:3511: 3511 / populate SW interrupts pool with number of OS granted IRQs. */ 3512 pf->num_avail_sw_msix = (u16)vectors; 3513 pf->irq_tracker->num_entries = (u16)vectors; 3514 pf->irq_tracker->end = pf->irq_tracker->num_entries; With this change, the right amount of dynamic memory is now allocated because, contrary to one-element arrays which occupy at least as much space as a single object of the type, flexible-array members don't occupy such space in the containing structure. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays Built-tested-by: kernel test robot <lkp@intel.com> Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-05 11:44:32 -08:00
Jacob Keller	e67fbcfbb4	ice: display stored UNDI firmware version via devlink info Just as we recently added support for other stored firmware flash versions, support display of the stored UNDI Option ROM version via devlink info. To do this, we need to introduce a new ice_get_inactive_orom_ver function. This is a little trickier than with other flash versions. The Option ROM version data was being read from a special "Boot Configuration" block of the NVM Preserved Field Area. This block only contains the active Option ROM version data. It is populated when the device firmware finishes updating the Option ROM. This method is ineffective at reading the stored Option ROM version data. Instead of reading from this section of the flash, replace this version extraction with one which locates the Combo Version information from within the Option ROM binary. This data is stored within the Option ROM at a 512 byte offset, in a simple structured format. The structure uses a simple modulo 256 checksum for integrity verification. Scan through the Option ROM to locate the CIVD data section, and extract the Combo Version. Refactor ice_get_orom_ver_info so that it takes the bank select enumeration parameter. Use this to implement ice_get_inactive_orom_ver. Although all ice devices have a Boot Configuration block in the NVM PFA, not all devices have a valid Option ROM. In this case, the old ice_get_orom_ver_info would "succeed" but report a version of all zeros. The new implementation would fail to locate the $CIV section in the Option ROM and report an error. Thus, we must ensure that ice_init_nvm does not fail if ice_get_orom_ver_info fails. Use the new ice_get_inactive_orom_ver to allow reporting the Option ROM versions for a pending update via devlink info. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-05 11:44:16 -08:00
Jacob Keller	e120a9ab45	ice: display stored netlist versions via devlink info Add a function to read the inactive netlist bank for version information. To support this, refactor how we read the netlist version data. Instead of using the firmware AQ interface with a module ID, read from the flash as a flat NVM, using ice_read_flash_module. This change requires a slight adjustment to the offset values used, as reading from the flat NVM includes the type field (which was stripped by firmware previously). Cleanup the macro names and move them to ice_type.h. For clarity in how we calculate the offsets and so that programmers can easily map the offset value to the data sheet, use a wrapper macro to account for the offset adjustments. Use the newly added ice_get_inactive_netlist_ver function to extract the version data from the pending netlist module update. Add the stored variants of "fw.netlist", and "fw.netlist.build" to the info version map array. With this change, we now report the "fw.netlist" and "fw.netlist.build" versions into the stored section of the devlink info report. As with the main NVM module versions, if there is no pending update, we report the currently active values as stored. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-05 11:43:37 -08:00
Jacob Keller	2c4fe41d72	ice: display some stored NVM versions via devlink info The devlink info interface supports drivers reporting "stored" versions. These versions indicate the version of an update that has been downloaded to the device, but is not yet active. The code for extracting the NVM version recently changed to enable support for reading from either the active or the inactive bank. Use this to implement ice_get_inactive_nvm_ver, which will read the NVM version data from the inactive section of flash. When reporting the versions via devlink info, first read the device capabilities. Determine if there is a pending flash update, and if so, extract relevant version information from the inactive flash. Store these within the info context structure. When reporting "stored" firmware versions, devlink documentation indicates that we ought to always report a stored value, even if there is no pending update. In this common case, the stored version should match the running version. This means that each stored version should by default fallback to the same value as reported by the running handler. To support this, modify the version structure to have both a "getter" and a "fallback". Modify the control loop so that it will use the "fallback" function if the "getter" function does not report a version. To report versions for which we can read the stored value, use a new "stored()" macro. This macro will insert two entries into the version list. The first entry is the traditional running version. The second is the stored version, implemented with a fallback to the active version. This is a little tricky, but reduces the overall duplication of elements in the entry list, and ensures that running and stored values remain consistent. To avoid some duplication, add a combined() macro that will insert both the running and stored versions into the version entry list. Using this new support, add pending version reporter functions for "fw.psid.api" and "fw.bundle_id". This enables reporting the stored values for some of versions in the NVM module of the flash. Reporting management versions is not implemented by this patch. The active management version is reported to the driver via the AdminQ mailbox during load. Although the version must be in the firmware binary somewhere, accessing this from the inactive firmware is not trivial and has not been implemented in this change. Future changes will introduce support for reading the UNDI Option ROM version and the version associated with the Netlist module. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-05 11:41:09 -08:00
Jacob Keller	0ce50c7066	ice: introduce function for reading from flash modules When reading from the flash memory of the device, the ice driver has two interfaces available to it. First, it can use a mediated interface via firmware that allows specifying a module ID. This allows reading from specific modules of the active flash bank. The second interface available is to perform flat reads. This allows complete access to the entire flash. However, using it requires the software to handle calculating module location and interpret pointer addresses. While most data required is accessible through the convenient first interface, certain flash contents are not. This includes the CSS header information associated with the Option ROM and NVM banks, as well as any access to the "inactive" banks used as scratch space for performing flash updates. In order to access all of the relevant flash contents, software must use the flat reads. Rather than forcing all flows to perform flat read calculations, introduce a new abstraction for reading from the flash: ice_read_flash_module. This function provides an abstraction for reading from either the active or inactive flash bank at the requested module. This interface is very similar to the abstraction provided via firmware, but allows access to additional modules, as well as providing a mechanism to request access to both flash banks. At first glance, it might make sense for this abstraction to allow specifying precisely which bank (1st or 2nd) the caller wishes to read. This is simpler to implement but more difficult to use. In practice, most callers only know whether they want the active bank, or the inactive bank. Rather than force callers to determine for themselves which bank to read from, implement ice_read_flash_module in terms of "active" vs "inactive". This significantly simplifies the implementation at the caller level and is a more useful abstraction over the flash contents. Make use of this new interface to refactor reading of the main NVM version information. Instead of using the firmware's mediated ShadowRAM function, use the ice_read_flash_module abstraction. To do this, notice that most reads of the NVM are going to be in 2-byte word chunks. To simplify using ice_read_flash_module for this case, ice_read_nvm_module is introduced. This is a simple wrapper around ice_read_flash_module which takes the correct pointer address for the NVM bank, and forces the 2-byte word format onto the caller. When reading the NVM versions, some fields are read from the Shadow RAM. The Shadow RAM is the first 64KB of flash memory, and is populated during device load. Most fields are copied from a section within the active NVM bank. In order to read this data from both the active and inactive NVM banks, we need to read not from the first 64KB of flash, but instead from the correct offset into the NVM bank. Introduce ice_read_nvm_sr_copy for this purpose. This function wraps around ice_read_nvm_module and has the same interface as the ice_read_sr_word, with the exception of allowing the caller to specify whether to read the active or inactive flash bank. With this change, it is now trivial to refactor ice_get_nvm_ver_info to read using the software mediated ice_read_flash_module interface instead of relying on the firmware mediated interface. This will be used in the following change to implement support for stored versions in the devlink info report. Additionally, the overall ice_read_flash_module interface will be used and extended to support all three major flash banks, and additionally to support reading the flash image security revision information. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-05 11:35:41 -08:00
Jacob Keller	1fa95e0120	ice: cache NVM module bank information The ice flash contains two copies of each of the NVM, Option ROM, and Netlist modules. Each bank has a pointer word and a size word. In order to correctly read from the active flash bank, the driver must calculate the offset manually. During NVM initialization, read the Shadow RAM control word and determine which bank is active for each NVM module. Additionally, cache the size and pointer values for use in calculating the correct offset. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-05 11:35:41 -08:00
Jacob Keller	74789085d9	ice: introduce context struct for info report The ice driver uses an array of structures which link an info name with a function that formats the associated version data into a string. All existing format functions simply format already captured static data from the driver hw structure. Future changes will introduce format functions for reporting the versions of flash sections stored but not yet applied. This type of version data is not stored as a member of the hw structure. This is because (a) it might not yet exist in the case there is no pending flash update, and (b) even if it does, it might change such as if an update is canceled or replaced by a new update before finalizing. We could simply have each format function gather its own data upon being called. However, in some cases the raw binary version data is a combination of multiple different reported fields. Additionally, the current interface doesn't have a way for the function to indicate that the version doesn't exist. Refactor this function interface to take a new ice_info_ctx structure instead of the buffer pointer and length. This context structure allows for future extensions to pre-gather version data that is stored within the context struct instead of the hw struct. Allocate this context structure initially at the start of ice_devlink_info_get. We use dynamic allocation instead of a local stack variable in order to avoid using too much kernel stack once we extend it with additional data structures. Modify the main loop that drives the info reporting so that the version buffer string is always cleared between each format. Explicitly check that the format function actually filled in a version string of non-zero length. If the string is not provided, simply skip this version without reporting an error. This allows for introducing format functions of versions which may or may not be present, such as the version of a pending update that has not yet been activated. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-05 11:35:41 -08:00
Jacob Keller	9af368fa9c	ice: create flash_info structure and separate NVM version The ice_nvm_info structure has become somewhat of a dumping ground for all of the fields related to flash version. It holds the NVM version and EETRACK id, the OptionROM info structure, the flash size, the ShadowRAM size, and more. A future change is going to add the ability to read the NVM version and EETRACK ID from the inactive NVM bank. To make this simpler, it is useful to have these NVM version info fields extracted to their own structure. Rename ice_nvm_info into ice_flash_info, and create a separate ice_nvm_info structure that will contain the eetrack and NVM map version. Move the netlist_ver structure into ice_flash_info and rename it ice_netlist_info for consistency. Modify the static ice_get_orom_ver_info to take the option rom structure as a pointer. This makes it more obvious what portion of the hw struct is being modified. Do the same for ice_get_netlist_ver_info. Introduce a new ice_get_nvm_ver_info function, which will be similar to ice_get_orom_ver_info and ice_get_netlist_ver_info, used to keep the NVM version extraction code co-located. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-05 11:35:34 -08:00
Jacob Keller	08e1294daa	ice: report timeout length for erasing during devlink flash When erasing, notify userspace of how long we will potentially take to erase a module. Doing so allows userspace to report the timeout, giving a clear indication of the upper time bound of the operation. Since we're re-using the erase timeout value, make it a macro rather than a magic number. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Shannon Nelson <snelson@pensando.io> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-05 09:34:24 -08:00
Alexander Lobakin	a79afa78e6	net: use the new dev_page_is_reusable() instead of private versions Now we can remove a bunch of identical functions from the drivers and make them use common dev_page_is_reusable(). All {,un}likely() checks are omitted since it's already present in this helper. Also update some comments near the call sites. Suggested-by: David Rientjes <rientjes@google.com> Suggested-by: Jakub Kicinski <kuba@kernel.org> Cc: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Alexander Lobakin <alobakin@pm.me> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-04 18:20:14 -08:00
Jakub Kicinski	c358f95205	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net drivers/net/can/dev.c `b552766c87` ("can: dev: prevent potential information leak in can_fill_info()") `3e77f70e73` ("can: dev: move driver related infrastructure into separate subdir") `0a042c6ec9` ("can: dev: move netlink related code into seperate file") Code move. drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c `57ac4a31c4` ("net/mlx5e: Correctly handle changing the number of queues when the interface is down") `214baf2287` ("net/mlx5e: Support HTB offload") Adjacent code changes net/switchdev/switchdev.c `20776b465c` ("net: switchdev: don't set port_obj_info->handled true when -EOPNOTSUPP") `ffb68fc58e` ("net: switchdev: remove the transaction structure from port object notifiers") `bae33f2b5a` ("net: switchdev: remove the transaction structure from port attributes") Transaction parameter gets dropped otherwise keep the fix. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 17:09:31 -08:00
Brett Creeley	f3fe97f643	ice: Fix MSI-X vector fallback logic The current MSI-X enablement logic tries to enable best-case MSI-X vectors and if that fails we only support a bare-minimum set. This includes a single MSI-X for 1 Tx and 1 Rx queue and a single MSI-X for the OICR interrupt. Unfortunately, the driver fails to load when we don't get as many MSI-X as requested for a couple reasons. First, the code to allocate MSI-X in the driver tries to allocate num_online_cpus() MSI-X for LAN traffic without caring about the number of MSI-X actually enabled/requested from the kernel for LAN traffic. So, when calling ice_get_res() for the PF VSI, it returns failure because the number of available vectors is less than requested. Fix this by not allowing the PF VSI to allocation more than pf->num_lan_msix MSI-X vectors and pf->num_lan_msix Rx/Tx queues. Limiting the number of queues is done because we don't want more than 1 Tx/Rx queue per interrupt due to performance conerns. Second, the driver assigns pf->num_lan_msix = 2, to account for LAN traffic and the OICR. However, pf->num_lan_msix is only meant for LAN MSI-X. This is causing a failure when the PF VSI tries to allocate/reserve the minimum pf->num_lan_msix because the OICR MSI-X has already been reserved, so there may not be enough MSI-X vectors left. Fix this by setting pf->num_lan_msix = 1 for the failure case. Then the ICE_MIN_MSIX accounts for the LAN MSI-X and the OICR MSI-X needed for the failure case. Update the related defines used in ice_ena_msix_range() to align with the above behavior and remove the unused RDMA defines because RDMA is currently not supported. Also, remove the now incorrect comment. Fixes: `152b978a1f` ("ice: Rework ice_ena_msix_range") Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-01-26 10:44:17 -08:00
Brett Creeley	943b881e35	ice: Don't allow more channels than LAN MSI-X available Currently users could create more channels than LAN MSI-X available. This is happening because there is no check against pf->num_lan_msix when checking the max allowed channels and will cause performance issues if multiple Tx and Rx queues are tied to a single MSI-X. Fix this by not allowing more channels than LAN MSI-X available in pf->num_lan_msix. Fixes: `87324e747f` ("ice: Implement ethtool ops for channels") Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-01-26 10:44:17 -08:00
Nick Nunley	13ed5e8a9b	ice: update dev_addr in ice_set_mac_address even if HW filter exists Fix the driver to copy the MAC address configured in ndo_set_mac_address into dev_addr, even if the MAC filter already exists in HW. In some situations (e.g. bonding) the netdev's dev_addr could have been modified outside of the driver, with no change to the HW filter, so the driver cannot assume that they match. Fixes: `757976ab16` ("ice: Fix check for removing/adding mac filters") Signed-off-by: Nick Nunley <nicholas.d.nunley@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-01-26 10:43:49 -08:00
Nick Nunley	1b0b0b581b	ice: Implement flow for IPv6 next header (extension header) This patch is based on a similar change to i40e by Slawomir Laba: "i40e: Implement flow for IPv6 next header (extension header)". When a packet contains an IPv6 header with next header which is an extension header and not a protocol one, the kernel function skb_transport_header called with such sk_buff will return a pointer to the extension header and not to the TCP one. The above explained call caused a problem with packet processing for skb with encapsulation for tunnel with ICE_TX_CTX_EIPT_IPV6. The extension header was not skipped at all. The ipv6_skip_exthdr function does check if next header of the IPV6 header is an extension header and doesn't modify the l4_proto pointer if it points to a protocol header value so its safe to omit the comparison of exthdr and l4.hdr pointers. The ipv6_skip_exthdr can return value -1. This means that the skipping process failed and there is something wrong with the packet so it will be dropped. Fixes: `a4e82a81f5` ("ice: Add support for tunnel offloads") Signed-off-by: Nick Nunley <nicholas.d.nunley@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-01-26 10:43:49 -08:00
Henry Tieman	29e2d9eb82	ice: fix FDir IPv6 flexbyte The packet classifier would occasionally misrecognize an IPv6 training packet when the next protocol field was 0. The correct value for unspecified protocol is IPPROTO_NONE. Fixes: `165d80d6ad` ("ice: Support IPv6 Flow Director filters") Signed-off-by: Henry Tieman <henry.w.tieman@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-01-26 10:43:49 -08:00
Jakub Kicinski	2d9116be76	Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2021-01-16 1) Extend atomic operations to the BPF instruction set along with x86-64 JIT support, that is, atomic{,64}_{xchg,cmpxchg,fetch_{add,and,or,xor}}, from Brendan Jackman. 2) Add support for using kernel module global variables (__ksym externs in BPF programs) retrieved via module's BTF, from Andrii Nakryiko. 3) Generalize BPF stackmap's buildid retrieval and add support to have buildid stored in mmap2 event for perf, from Jiri Olsa. 4) Various fixes for cross-building BPF sefltests out-of-tree which then will unblock wider automated testing on ARM hardware, from Jean-Philippe Brucker. 5) Allow to retrieve SOL_SOCKET opts from sock_addr progs, from Daniel Borkmann. 6) Clean up driver's XDP buffer init and split into two helpers to init per- descriptor and non-changing fields during processing, from Lorenzo Bianconi. 7) Minor misc improvements to libbpf & bpftool, from Ian Rogers. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (41 commits) perf: Add build id data in mmap2 event bpf: Add size arg to build_id_parse function bpf: Move stack_map_get_build_id into lib bpf: Document new atomic instructions bpf: Add tests for new BPF atomic operations bpf: Add bitwise atomic instructions bpf: Pull out a macro for interpreting atomic ALU operations bpf: Add instructions for atomic_[cmp]xchg bpf: Add BPF_FETCH field / create atomic_fetch_add instruction bpf: Move BPF_STX reserved field check into BPF_STX verifier code bpf: Rename BPF_XADD and prepare to encode other atomics in .imm bpf: x86: Factor out a lookup table for some ALU opcodes bpf: x86: Factor out emission of REX byte bpf: x86: Factor out emission of ModR/M for *(reg + off) tools/bpftool: Add -Wall when building BPF programs bpf, libbpf: Avoid unused function warning on bpf_tail_call_static selftests/bpf: Install btf_dump test cases selftests/bpf: Fix installation of urandom_read selftests/bpf: Move generated test files to $(TEST_GEN_FILES) selftests/bpf: Fix out-of-tree build ... ==================== Link: https://lore.kernel.org/r/20210116012922.17823-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-15 17:57:26 -08:00
Eric Dumazet	f73fc40327	ice: drop dead code in ice_receive_skb() napi_gro_receive() can never return GRO_DROP GRO_DROP can only be returned from napi_gro_frags() which is the other NAPI GRO entry point. Followup patch will remove GRO_DROP, because drivers are not supposed to call napi_gro_frags() if prior napi_get_frags() has failed. Note that I have left the gro_dropped variable. I leave to ice maintainers the decision to further remove it from ethtool -S results. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-09 14:24:25 -08:00
Lorenzo Bianconi	be9df4aff6	net, xdp: Introduce xdp_prepare_buff utility routine Introduce xdp_prepare_buff utility routine to initialize per-descriptor xdp_buff fields (e.g. xdp_buff pointers). Rely on xdp_prepare_buff() in all XDP capable drivers. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Shay Agroskin <shayagr@amazon.com> Acked-by: Martin Habets <habetsm.xilinx@gmail.com> Acked-by: Camelia Groza <camelia.groza@nxp.com> Acked-by: Marcin Wojtas <mw@semihalf.com> Link: https://lore.kernel.org/bpf/45f46f12295972a97da8ca01990b3e71501e9d89.1608670965.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2021-01-08 13:39:24 -08:00
Lorenzo Bianconi	43b5169d83	net, xdp: Introduce xdp_init_buff utility routine Introduce xdp_init_buff utility routine to initialize xdp_buff fields const over NAPI iterations (e.g. frame_sz or rxq pointer). Rely on xdp_init_buff in all XDP capable drivers. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Shay Agroskin <shayagr@amazon.com> Acked-by: Martin Habets <habetsm.xilinx@gmail.com> Acked-by: Camelia Groza <camelia.groza@nxp.com> Acked-by: Marcin Wojtas <mw@semihalf.com> Link: https://lore.kernel.org/bpf/7f8329b6da1434dc2b05a77f2e800b29628a8913.1608670965.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2021-01-08 13:39:24 -08:00
Jakub Kicinski	30bfce1094	net: remove ndo_udp_tunnel_* callbacks All UDP tunnel port management is now routed via udp_tunnel_nic infra directly. Remove the old callbacks. Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-07 12:53:29 -08:00
Björn Töpel	8d14768a79	ice, xsk: clear the status bits for the next_to_use descriptor On the Rx side, the next_to_use index points to the next item in the HW ring to be refilled/allocated, and next_to_clean points to the next item to potentially be processed. When the HW Rx ring is fully refilled, i.e. no packets has been processed, the next_to_use will be next_to_clean - 1. When the ring is fully processed next_to_clean will be equal to next_to_use. The latter case is where a bug is triggered. If the next_to_use bits are not cleared, and the "fully processed" state is entered, a stale descriptor can be processed. The skb-path correctly clear the status bit for the next_to_use descriptor, but the AF_XDP zero-copy path did not do that. This change adds the status bits clearing of the next_to_use descriptor. Fixes: `2d4238f556` ("ice: Add support for AF_XDP") Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-12-16 10:51:07 -08:00
Björn Töpel	5bb0c4b5eb	ice, xsk: Move Rx allocation out of while-loop Instead doing the check for allocation in each loop, move it outside the while loop and do it every NAPI loop. This change boosts the xdpsock rxdrop scenario with 15% more packets-per-second. Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/r/20201211085410.59350-1-bjorn.topel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-12-14 17:54:42 -08:00
Jakub Kicinski	46d5e62dd3	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net xdp_return_frame_bulk() needs to pass a xdp_buff to __xdp_return(). strlcpy got converted to strscpy but here it makes no functional difference, so just keep the right code. Conflicts: net/netfilter/nf_tables_api.c Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-12-11 22:29:38 -08:00
Björn Töpel	1beb7830d3	ice: avoid premature Rx buffer reuse The page recycle code, incorrectly, relied on that a page fragment could not be freed inside xdp_do_redirect(). This assumption leads to that page fragments that are used by the stack/XDP redirect can be reused and overwritten. To avoid this, store the page count prior invoking xdp_do_redirect(). Fixes: `efc2214b60` ("ice: Add support for XDP") Reported-and-analyzed-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-12-09 15:26:58 -08:00
Simon Perron Caissy	5b13886da8	ice: Add space to unknown speed Add space to the end of 'Unknown' string in order to avoid concatenation with 'bps' string when formatting netdev log message. Signed-off-by: Simon Perron Caissy <simon.perron.caissy@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-12-09 08:11:55 -08:00
Jacob Keller	9228d8b261	ice: join format strings to same line as ice_debug When printing messages with ice_debug, align the printed string to the origin line of the message in order to ease debugging and tracking messages back to their source. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-12-09 08:11:55 -08:00
Bruce Allan	34d8461a65	ice: silence static analysis warning sparse warns about cast to/from restricted types which is not an actual problem; silence the warning. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-12-09 08:11:55 -08:00
Bruce Allan	32e6deb297	ice: cleanup misleading comment The maximum Admin Queue buffer size and NVM shadow RAM sector size are both 4 Kilobytes. Some comments refer to those as 4Kb which can be confused with 4 Kilobits. Update the comments to use the commonly used KB symbol instead. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-12-09 08:11:55 -08:00
Nick Nunley	bcf68ea1e5	ice: Remove vlan_ena from vsi structure vlan_ena was introduced to track whether VLAN filters are enabled on the device, but 1) checking for num_vlan > 1 already gives us this information, and is currently used in this way throughout the code 2) the logic for vlan_ena is broken when multiple VLANs are active Just remove vlan_ena and use num_vlan instead. Signed-off-by: Nick Nunley <nicholas.d.nunley@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-12-09 08:11:54 -08:00
Jeb Cramer	956542cae5	ice: Remove gate to OROM init Remove the gate that prevents the OROM and netlist info from being populated. The NVM now has the appropriate section for software to reference the versioning info. Signed-off-by: Jeb Cramer <jeb.j.cramer@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-12-09 08:11:54 -08:00
Jeb Cramer	c21125c997	ice: Enable Support for FW Override (E82X) The driver is able to override the firmware when it comes to supporting a more lenient link mode. This feature was limited to E810 devices. It is now extended to E82X devices. Signed-off-by: Jeb Cramer <jeb.j.cramer@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-12-09 08:11:54 -08:00
Paul M Stillwell Jr	f2651a91b9	ice: don't always return an error for Get PHY Abilities AQ command There are times when the driver shouldn't return an error when the Get PHY abilities AQ command (0x0600) returns an error. Instead the driver should log that the error occurred and continue on. This allows the driver to load even though the AQ command failed. The user can then later determine the reason for the failure and correct it. Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-12-09 08:11:54 -08:00

... 10 11 12 13 14 ...

1355 commits