Since the network driver and RDMA driver operate on the same PCI function,
we need to create an interface to allow the RDMA driver to share resources
with the network driver.
1. Create a new bnxt_en_dev struct which will be returned by
bnxt_ulp_probe() upon success. After that, all calls from the RDMA driver
to bnxt_en will pass a pointer to this struct.
2. This struct contains additional function pointers to register, request
msix, send fw messages, register for async events.
3. If the RDMA driver wants to enable RDMA on the function, it needs to
call the function pointer bnxt_register_device(). A ulp_ops structure
is passed for RCU protected upcalls from bnxt_en to the RDMA driver.
4. The RDMA driver can call firmware APIs using the bnxt_send_fw_msg()
function pointer.
5. 1 stats context is reserved when the RDMA driver registers. MSIX
and completion rings are reserved when the RDMA driver calls
bnxt_request_msix() function pointer.
6. When the RDMA driver calls bnxt_unregister_device(), all RDMA resources
will be cleaned up.
v2: Fixed 2 uninitialized variable warnings.
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver register function with firmware consists of passing version
information and registering for async events. To support the RDMA driver,
the async events that we need to register may change. Separate the
driver register function into 2 parts so that we can just update the
async events for the RDMA driver.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the device supports RDMA, we'll setup network default rings so that
there are enough minimum resources for RDMA, if possible. However, the
user can still increase network rings to the max if he wants. The actual
RDMA resources won't be reserved until the RDMA driver registers.
v2: Fix compile warning when BNXT_CONFIG_SRIOV is not set.
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
All available remaining completion rings not used by the PF should be
made available for the VFs so that there are enough rings in the VF to
support RDMA. The earlier workaround code of capping the rings by the
statistics context is removed.
When SRIOV is disabled, call a new function bnxt_restore_pf_fw_resources()
to restore FW resources. Later on we need to add some logic to account
for RDMA resources.
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that MSIX is enabled in bnxt_init_one(), resources may be allocated by
the RDMA driver before the network device is opened. So we cannot do
function reset in bnxt_open() which will clear all the resources.
The proper place to do function reset now is in bnxt_init_one().
If we get AER, we'll do function reset as well.
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To better support the new RDMA driver, we need to move pci_enable_msix()
from bnxt_open() to bnxt_init_one(). This way, MSIX vectors are available
to the RDMA driver whether the network device is up or down.
Part of the existing bnxt_setup_int_mode() function is now refactored into
a new bnxt_init_int_mode(). bnxt_init_int_mode() is called during
bnxt_init_one() to enable MSIX. The remaining logic in
bnxt_setup_int_mode() to map the IRQs to the completion rings is called
during bnxt_open().
v2: Fixed compile warning when CONFIG_BNXT_SRIOV is not set.
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
By refactoring existing code into this new function. The new function
will be used in subsequent patches.
v2: Fixed compile warning when CONFIG_BNXT_SRIOV is not set.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Function bnxt_hwrm_stat_ctx_alloc() always returns 0, even if the call
to _hwrm_send_message() fails. It may be better to propagate the errors
to the caller of bnxt_hwrm_stat_ctx_alloc().
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188661
Signed-off-by: Pan Bian <bianpan2016@163.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Report PFC statistics to ethtool -S and DCBNL.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Support only IEEE DCBX initially. Add IEEE DCBNL ops and functions to
get and set the hardware DCBX parameters. The DCB code is conditional on
Kconfig CONFIG_BNXT_DCB.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Latest interface has the latest DCB command structs. Get and store the
max number of lossless TCs the hardware can support.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a new function bnxt_setup_mq_tc() to handle MQPRIO. This new function
will be called during ETS setup when we add DCBNL in the next patch.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
udplite conflict is resolved by taking what 'net-next' did
which removed the backlog receive method assignment, since
it is no longer necessary.
Two entries were added to the non-priv ethtool operations
switch statement, one in 'net' and one in 'net-next, so
simple overlapping changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
When busy polling while a link is down (during a link-flap test), TX
timeouts were observed as well as the following messages in the ring
buffer:
bnxt_en 0008:01:00.2 enP8p1s0f2d2: Resp cmpl intr err msg: 0x51
bnxt_en 0008:01:00.2 enP8p1s0f2d2: hwrm_ring_free tx failed. rc:-1
bnxt_en 0008:01:00.2 enP8p1s0f2d2: Resp cmpl intr err msg: 0x51
bnxt_en 0008:01:00.2 enP8p1s0f2d2: hwrm_ring_free rx failed. rc:-1
These were resolved by checking for link status and returning if link
was not up.
Signed-off-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Tested-by: Rob Miller <rob.miller@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Knowing that:
#define TUNNEL_DST_PORT_FREE_REQ_TUNNEL_TYPE_VXLAN (0x1UL << 0)
#define TUNNEL_DST_PORT_FREE_REQ_TUNNEL_TYPE_GENEVE (0x5UL << 0)
and that 'bnxt_hwrm_tunnel_dst_port_alloc()' is only called with one of
these 2 constants, the TUNNEL_DST_PORT_ALLOC_REQ_TUNNEL_TYPE_GENEVE can not
trigger.
Replace the bit test that overlap by an equality test, just as in
'bnxt_hwrm_tunnel_dst_port_free()' above.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
All conflicts were simple overlapping changes except perhaps
for the Thunder driver.
That driver has a change_mtu method explicitly for sending
a message to the hardware. If that fails it returns an
error.
Normally a driver doesn't need an ndo_change_mtu method becuase those
are usually just range changes, which are now handled generically.
But since this extra operation is needed in the Thunder driver, it has
to stay.
However, if the message send fails we have to restore the original
MTU before the change because the entire call chain expects that if
an error is thrown by ndo_change_mtu then the MTU did not change.
Therefore code is added to nicvf_change_mtu to remember the original
MTU, and to restore it upon nicvf_update_hw_max_frs() failue.
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a missing synchronize_net() call to avoid potential use after free,
since we explicitly call napi_hash_del() to factorize the RCU grace
period.
Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Michael Chan <michael.chan@broadcom.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The newer chips have proper support for 4-tuple UDP RSS.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On some dual port NICs, the speed setting on one port can affect the
available speed on the other port. Add logic to detect these changes
and adjust the advertised speed settings when necessary.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the new FORCE_LINK_DWN bit to shutdown link during close.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the physical link is down and the VF virtual link is set to "enable",
the current code does not always work. If the link is down but the
cable is attached, the firmware returns LINK_SIGNAL instead of
NO_LINK. The current code is treating LINK_SIGNAL as link up.
The fix is to treat link as down when the link_status != LINK.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The logic is missing the check on whether the tx and rx rings are sharing
completion rings or not.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is automatically done from netif_napi_add(), and we want to not
export napi_hash_add() anymore in the following patch.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Michael Chan <michael.chan@broadcom.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tg3: min_mtu 60, max_mtu 9000/1500
bnxt: min_mtu 60, max_mtu 9000
bnx2x: min_mtu 46, max_mtu 9600
- Fix up ETH_OVREHEAD -> ETH_OVERHEAD while we're in here, remove
duplicated defines from bnx2x_link.c.
bnx2: min_mtu 46, max_mtu 9000
- Use more standard ETH_* defines while we're at it.
bcm63xx_enet: min_mtu 46, max_mtu 2028
- compute_hw_mtu was made largely pointless, and thus merged back into
bcm_enet_change_mtu.
b44: min_mtu 60, max_mtu 1500
CC: netdev@vger.kernel.org
CC: Michael Chan <michael.chan@broadcom.com>
CC: Sony Chacko <sony.chacko@qlogic.com>
CC: Ariel Elior <ariel.elior@qlogic.com>
CC: Dept-HSGLinuxNICDev@qlogic.com
CC: Siva Reddy Kallam <siva.kallam@broadcom.com>
CC: Prashant Sreedharan <prashant@broadcom.com>
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce new rtnl UAPI that exposes a list of vlans per VF, giving
the ability for user-space application to specify it for the VF, as an
option to support 802.1ad.
We adjusted IP Link tool to support this option.
For future use cases, the new UAPI supports multiple vlans. For now we
limit the list size to a single vlan in kernel.
Add IFLA_VF_VLAN_LIST in addition to IFLA_VF_VLAN to keep backward
compatibility with older versions of IP Link tool.
Add a vlan protocol parameter to the ndo_set_vf_vlan callback.
We kept 802.1Q as the drivers' default vlan protocol.
Suitable ip link tool command examples:
Set vf vlan protocol 802.1ad:
ip link set eth0 vf 1 vlan 100 proto 802.1ad
Set vf to VST (802.1Q) mode:
ip link set eth0 vf 1 vlan 100 proto 802.1Q
Or by omitting the new parameter
ip link set eth0 vf 1 vlan 100
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bnxt_hwrm_fw_set_time() now returns -EOPNOTSUPP when built for kernel
without RTC_LIB. Setting the firmware time is not critical to the
successful completion of the firmware update process.
Signed-off-by: Rob Swindell <Rob.Swindell@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The VF link state can be changed via the 'ip link set' cmd.
Currently, the new link state does not take effect immediately.
The fix is for the PF to send a link change async event to the
designated VF after a VF link state change. This async event will
trigger the VF to update the link status.
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Restart autoneg if autoneg is enabled.
Signed-off-by: Deepak Khungar <deepak.khungar@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The hardware has a limitation that it won't pass host to BMC loopback
packets below 52-bytes.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After generating the random MAC address for VF, call the firmware to
approve it. This step serves 2 purposes. Some hypervisor (e.g. ESX)
wants to approve the MAC address. 2nd, the call will setup the
proper forwarding database in the internal switch.
We need to unlock the hwrm_cmd_lock mutex before calling bnxt_approve_mac().
We can do that because we are at the end of the function and all the
previous firmware response data has been copied.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Re-arrange the code so that the generation of the random MAC address for
the VF is at the end of the function. The next patch will add one more step
to call bnxt_approve_mac() to get the firmware to approve the random MAC
address.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The existing code is inconsistent in reporting and accepting the combined
channel count. bnxt_get_channels() reports maximum combined as the
maximum rx count. bnxt_set_channels() accepts combined count that
cannot be bigger than max rx or max tx.
For example, if max rx = 2 and max tx = 1, we report max supported
combined to be 2. But if the user tries to set combined to 2, it will
fail because 2 is bigger than max tx which is 1.
Fix the code to be consistent. Max allowed combined = max(max_rx, max_tx).
We will accept a combined channel count <= max(max_rx, max_tx).
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using Ethtool flashdev command, entire NVM package (*.pkg) files
may now be staged into the "update" area of the NVM and subsequently
verified and installed by the firmware using the newly introduced
command: NVM_INSTALL_UPDATE.
We also introduce use of the new firmware command FW_SET_TIME so that the
NVM-resident package installation log contains valid time-stamps.
Signed-off-by: Rob Swindell <Rob.Swindell@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove "Single-port/Dual-port" from the device names. Dual-port devices
will appear as 2 separate devices, so no need to call each a dual-port
device. Use a more generic name for VF devices belonging to the same
chip fanmily. Add some remaining NPAR device IDs.
Signed-off-by: David Christensen <david.christensen@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
And remove redundant definitions of the same flags.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a code path where we are calling __iowrite64_copy() on
an address that is not 64-bit aligned. This causes an exception on
some architectures such as arm64. Fix that code path by using
__iowrite32_copy().
Reported-by: JD Zheng <jiandong.zheng@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add 5741X/5731X NPAR device IDs and dual media SFP/10GBase-T device IDs.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If there are not enough resources to enable ntuple filtering,
log a warning message.
v2: Use single message and add missing newline.
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Include the destination MAC address in the ntuple filter structure. The
current code assumes that the destination MAC address is always the MAC
address of the NIC. This may not be true if there are macvlans, for
example. Add destination MAC address checking and configure the filter
correctly using the correct index for the destination MAC address.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
txr->dev_state was not consistently manipulated with the acquisition of
the per-queue lock, after further inspection the lock does not seem
necessary, either the value is read as BNXT_DEV_STATE_CLOSING or 0.
Reported-by: coverity (CID 1339583)
Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A bridge device in NS2 has the same device ID as the ethernet controller.
Add check to avoid probing the bridge device.
Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allocate special vnic for dropping packets not matching the RX filters.
First vnic is for normal RX packets and the driver will drop all
packets on the 2nd vnic.
Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allocate napi for special vnic, packets arriving on this
napi will simply be dropped and the buffers will be replenished back
to the HW.
Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The hardware is unable to drop rx packets not matching the RX filters. To
workaround it, we create a special VNIC and configure the hardware to
direct all packets not matching the filters to it. We then setup the
driver to drop packets received on this VNIC.
This patch creates the infrastructure for this VNIC, reserves a
completion ring, and rx rings. Only shared completion ring mode is
supported. The next 2 patches add a NAPI to handle packets from this
VNIC and the setup of the VNIC.
Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nitro A0 has a hardware bug in the rx path. The workaround is to create
a special COS context as a path for non-RSS (non-IP) packets. Without this
workaround, the chip may stall when receiving RSS and non-RSS packets.
Add infrastructure to allow 2 contexts (RSS and CoS) per VNIC. Allocate
and configure the CoS context for Nitro A0.
Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>