1
0
Fork 0
mirror of synced 2025-03-06 20:59:54 +01:00
Commit graph

841 commits

Author SHA1 Message Date
Tom Herbert
a188222b6e net: Rename NETIF_F_ALL_CSUM to NETIF_F_CSUM_MASK
The name NETIF_F_ALL_CSUM is a misnomer. This does not correspond to the
set of features for offloading all checksums. This is a mask of the
checksum offload related features bits. It is incorrect to set both
NETIF_F_HW_CSUM and NETIF_F_IP_CSUM or NETIF_F_IPV6 at the same time for
features of a device.

This patch:
  - Changes instances of NETIF_F_ALL_CSUM to NETIF_F_CSUM_MASK (where
    NETIF_F_ALL_CSUM is being used as a mask).
  - Changes bonding, sfc/efx, ipvlan, macvlan, vlan, and team drivers to
    use NEITF_F_HW_CSUM in features list instead of NETIF_F_ALL_CSUM.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 16:50:08 -05:00
Tom Herbert
53692b1de4 sctp: Rename NETIF_F_SCTP_CSUM to NETIF_F_SCTP_CRC
The SCTP checksum is really a CRC and is very different from the
standards 1's complement checksum that serves as the checksum
for IP protocols. This offload interface is also very different.
Rename NETIF_F_SCTP_CSUM to NETIF_F_SCTP_CRC to highlight these
differences. The term CSUM should be reserved in the stack to refer
to the standard 1's complement IP checksum.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 16:49:58 -05:00
Alexander Duyck
e1d0a2af2b ixgbe: Fix VLAN promisc in relation to SR-IOV
This patch is a follow-on for enabling VLAN promiscuous and allowing the PF
to add VLANs without adding a VLVF entry.  What this patch does is go
through and free the VLVF registers if they are not needed as the VLAN
belongs only to the PF which is the default pool.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-12-12 01:54:36 -08:00
Alexander Duyck
1636956491 ixgbe: Add support for VLAN promiscuous with SR-IOV
This patch adds support for VLAN promiscuous with SR-IOV enabled.

The code prior to this patch was only adding the PF to VLANs that the VF
had added.  As such enabling promiscuous mode would actually not add any
additional VLAN filters so visibility was limited.  This lead to a number
of issues as the bridge and OVS would expect us to accept all VLAN tagged
packets when promiscuous mode was enabled, and instead we would filter out
most if not all depending on the configuration of the PF.

With this patch what we do is set all the bits in the VFTA and all of the
VLVF bits associated with the pool belonging to the PF.  By doing this the
PF is guaranteed to receive all VLAN tagged traffic associated with the RAR
filters assigned to the PF.  In addition we will clean up those same bits
in the event of promiscuous mode being disabled.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-12-12 01:49:25 -08:00
Alexander Duyck
b6488b662b ixgbe: Add support for adding/removing VLAN on PF bypassing the VLVF
This patch adds support for bypassing the VLVF entry creation when the PF
is adding a new VLAN.  The advantage to doing this is that we can then save
the VLVF entries for the VFs which must have them in order to function,
versus the PF which can fall back on the default pool entry.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-12-12 01:29:28 -08:00
Alexander Duyck
530fd82a9f ixgbe: Return error on failure to allocate mac_table
Add a check to make certain mac_table was actually allocated and is not
NULL.  If it is NULL return -ENOMEM and allow the probe routine to fail
rather then causing a NULL pointer dereference further down the line.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-12-12 00:44:18 -08:00
Alexander Duyck
bf4d67d94c ixgbe: Reset interface after enabling SR-IOV
Enabling SR-IOV and then bringing the interface up was resulting in the PF
MAC addresses getting into a bad state.  Specifically the MAC address was
enabled for both VF 0 and the PF.  This resulted in some odd behaviors such
as VF 0 receiving a copy of the PFs traffic, which in turn enables the
ability for VF 0 to spoof the PF.

A workaround for this issue appears to be to bring up the interface first
and then enable SR-IOV as this way the reset is then triggered in the
existing code.

In order to correct this I have added a change to ixgbe_setup_tc where if
the interface is down we still will at least call ixgbe_reset so that the
MAC addresses for the device are reset to the correct pools.

Steps to reproduce issue:
modprobe ixgbe
echo 7 > /sys/bus/pci/devices/0000\:01\:00.1/sriov_numvfs
ifconfig enp1s0f1 up
ethregs -s 1:00.1 | grep MPSAR | grep -v 00000000

Result:
	MPSAR[0]               00000081
	MPSAR[254]             00000001

Expected Result, behavior after patch:
	MPSAR[0]               00000080
	MPSAR[254]             00000080

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-12-03 12:41:48 -08:00
Mark Rustad
36a92d7190 ixgbe: Handle extended IPv6 headers in Tx path
Check for and handle IPv6 extended headers so that Tx checksum
offload can be done. Also use skb_checksum_help for unexpected
cases. Thanks to Tom Herbert for noticing these problems. Thanks
to Alexander Duyck for recognizing problems with the first version
of this patch and recognizing how to coalesce error conditions
into a single location.

Reported-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-12-03 02:32:54 -08:00
Mark Rustad
988d13073f ixgbe: Save VF info and take references
Save VF device pointers and take references to speed accesses used
to monitor the device behavior to avoid slot resets. The saved
information avoids lock contention during the search used to access
each of the VFs.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-12-03 02:32:14 -08:00
Mark Rustad
a9763f3cb5 ixgbe: Update PTP to support X550EM_x devices
The X550EM_x devices handle clocking differently, so update the
PTP implementation to accommodate them. This involves significant
changes to ixgbe's PTP code to accommodate the new range of
behaviors including things like non-power-of-2 clock wrapping.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-12-03 02:32:12 -08:00
Alexander Duyck
2f9be16655 ixgbe: Allow FDB entries access to more RAR filters
This change makes it so that we allow the PF to make use of all free RAR
entries for FDB use if needed.

Previously the code limited us to 16 unicast entries, however this was
shared between MACVLAN which wasn't limited and the FDB code which was.  So
instead of treating the FDB code as a second class citizen I have updated
it so that it has access to just as many entries as the MACVLAN filters.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-12-03 02:32:11 -08:00
Alexander Duyck
0f079d2283 ixgbe: Use __dev_uc_sync and __dev_uc_unsync for unicast addresses
This change replaces the ixgbe_write_uc_addr_list call in ixgbe_set_rx_mode
with a call to __dev_uc_sync instead.  This works much better with the MAC
addr list code that was already in place and solves an issue in which you
couldn't remove an FDB address without having to reset the port.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-12-03 02:32:10 -08:00
Alexander Duyck
c9f53e63c2 ixgbe: Refactor MAC address configuration code
In the process of tracking down a memory leak when adding/removing FDB
entries I had to go through the MAC address configuration code for ixgbe.
In the process of doing so I found a number of issues that impacted
readability and performance.  This change updates the code in general to
clean it up so it becomes clear what each step is doing.  From what I can
tell there a couple of bugs cleaned up in this code.

First is the fact that the MAC addresses were being double counted for the
PF.  As a result once entries up to 63 had been used you could no longer
add additional filters.

A simple test case for this:
  for i in `seq 0 96`
  do
    ip link add link ens8 name mv$i type macvlan
    ip link set dev mv$i up
  done

Test script:
  ethregs -s 0:8.0 | grep -e "RAH" | grep 8000....$

When things are working correctly RAL/H registers 1 - 97 will be consumed.
In the failing case it will stop at 63 and prevent any further filters from
being added.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-12-03 02:32:09 -08:00
Mark Rustad
780484d853 ixgbe: Use private workqueue to avoid certain possible hangs
Use a private workqueue to avoid hangs that were otherwise possible
when performing stress tests, such as creating and destroying many
VFS repeatedly.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-12-03 02:32:07 -08:00
Alexander Duyck
ef2662b2a8 ixgbe/ixgbevf: use napi_schedule_irqoff()
The ixgbe_intr and ixgbe/ixgbevf_msix_clean_rings functions run from hard
interrupt context or with interrupts already disabled in netpoll.

They can use napi_schedule_irqoff() instead of napi_schedule()

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-11-23 11:08:50 -08:00
Alexander Duyck
5d6002b7b8 ixgbe: Fix handling of NAPI budget when multiple queues are enabled per vector
This patch corrects an issue in which the polling routine would increase
the budget for Rx to at least 1 per queue if multiple queues were present.
This would result in Rx packets being processed when the budget was 0 which
is meant to indicate that no Rx can be handled.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-11-23 11:08:49 -08:00
Jean Sacren
a897a2adb6 ixgbe: fix multiple kernel-doc errors
The commit dfaf891dd3 ("ixgbe: Refactor the RSS configuration code")
introduced a few kernel-doc errors:

1) The function name is missing;
2) The format is wrong;
3) The short description is redundant.

Fix all the above for the correct execution of the kernel doc.

Signed-off-by: Jean Sacren <sakiwit@gmail.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-11-23 11:08:48 -08:00
Mark Rustad
cc1f88ba16 ixgbe: Delete redundant include file
Delete a redundant include of net/vxlan.h.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-11-23 11:08:48 -08:00
Eric Dumazet
93f93a4404 net: move skb_mark_napi_id() into core networking stack
We would like to automatically provide busy polling support
to all NAPI drivers, without them having to implement anything.

skb_mark_napi_id() can be called from napi_gro_receive() and
napi_get_frags().

Few drivers are still calling skb_mark_napi_id() because
they use netif_receive_skb(). They should eventually call
napi_gro_receive() instead. I will leave this to drivers
maintainers.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-18 16:17:41 -05:00
Hiroshi Shimamoto
54011e4db8 ixgbe: Add new ndo to trust VF
Implements the new netdev op to trust VF in ixgbe.

The administrator can turn on and off VF trusted by ip command which
supports trust message.
 # ip link set dev eth0 vf 1 trust on
or
 # ip link set dev eth0 vf 1 trust off

Send a ping to reset VF on changing the status of trusting.
VF driver will reconfigure its features on reset.

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-10-23 05:48:53 -07:00
Jesse Brandeburg
32b3e08fff drivers/net/intel: use napi_complete_done()
As per Eric Dumazet's previous patches:
(see commit (24d2e4a507) - tg3: use napi_complete_done())

Quoting verbatim:
Using napi_complete_done() instead of napi_complete() allows
us to use /sys/class/net/ethX/gro_flush_timeout

GRO layer can aggregate more packets if the flush is delayed a bit,
without having to set too big coalescing parameters that impact
latencies.
</end quote>

Tested
configuration: low latency via ethtool -C ethx adaptive-rx off
				rx-usecs 10 adaptive-tx off tx-usecs 15
workload: streaming rx using netperf TCP_MAERTS

igb:
MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.1 () port 0 AF_INET : demo
...
Interim result:  941.48 10^6bits/s over 1.000 seconds ending at 1440193171.589

Alignment      Offset         Bytes    Bytes       Recvs   Bytes    Sends
Local  Remote  Local  Remote  Xfered   Per                 Per
Recv   Send    Recv   Send             Recv (avg)          Send (avg)
    8       8      0       0 1176930056  1475.36    797726   16384.00  71905

MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.1 () port 0 AF_INET : demo
...
Interim result:  941.49 10^6bits/s over 0.997 seconds ending at 1440193142.763

Alignment      Offset         Bytes    Bytes       Recvs   Bytes    Sends
Local  Remote  Local  Remote  Xfered   Per                 Per
Recv   Send    Recv   Send             Recv (avg)          Send (avg)
    8       8      0       0 1175182320  50476.00     23282   16384.00  71816

i40e:
Hard to test because the traffic is incoming so fast (24Gb/s) that GRO
always receives 87kB, even at the highest interrupt rate.

Other drivers were only compile tested.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-10-16 04:33:46 -07:00
Emil Tantilov
72bfd32d2f ixgbe: disable LRO by default
This patch disables LRO by default in favor of GRO.

LRO is incompatible with forwarding and is disabled when forwarding
is turned on which makes the default offloads of the driver
inconsistent. LRO can still be enabled via ethtool.

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-10-15 01:49:03 -07:00
Emil Tantilov
f079fa005a ixgbe: add flow control ethertype to the anti-spoofing filter
This patch makes sure that flow control packets initiated by the VF are
dropped and reported as spoofed.

Flow control packets can be used to limit the throughput or as DOS
attack when generated from a VF. Flow control is not supported per VF
hence any pause frames generated from a VF are considered malicious.

Also cleaned up indentation and some redundant comments.

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-10-13 23:46:49 -07:00
Mark Rustad
21dd560162 ixgbe: Advance version to 4.2.1
With the addition of X550em_x SFP+ support, the driver is now
functionally equivalent to what will be the 4.2.1 driver when
released, so change the version to match.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-09-23 23:09:34 -07:00
Mark Rustad
c5846ba445 ixgbe: X540 thermal warning interrupt not a GPI
The X540 thermal interrupt (IXGBE_EIMS_TS) is not an SDP, so it
doesn't need to be enabled in ixgbe_setup_gpie(). In fact the
value is simply not for the GPIE register at all.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-09-23 23:07:12 -07:00
Mark Rustad
9de7605ea2 ixgbe: Correct several flaws with with DCA setup
This change does two things. First, it makes it so that we always
set the relaxed ordering bits related to the DCA registers even if
DCA is not enabled. Second, it moves the configuration out of the
ixgbe_down function and into the ixgbe_configure function before
enabling the Rx and Tx rings. This ensures that DCA is configured
correctly before starting to process packets.

Thanks to Alex Duyck for this fix.

CC: Alex Duyck <aduyck@mirantis.com>
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-09-23 23:00:03 -07:00
Mark Rustad
018d7146ee ixgbe: Add new X550EM SFP+ device ID
Add new device ID for X550EM device with SFPs.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-09-23 22:57:41 -07:00
Mark Rustad
f961ddae16 ixgbe: Add small packet padding support for X550
This patch sets RDRXCTL.PSP when the driver is in SRIOV mode which
enables padding of small packets.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-09-23 22:52:55 -07:00
Mark Rustad
052a1a7243 ixgbe: Correct setting of RDRXCTL register for X550* devices
Setting the X550* RDRXCTL register should fall through into X540
and 82599, not 82598.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-09-23 22:50:32 -07:00
Mark Rustad
58e7cd24d4 ixgbe: Limit SFP polling rate
Reduce the frequency of polling for SFP modules. Because the
service task sometimes runs at high rates, we can poll for
SFPs too often. When an SFP is not present, the I2C timeouts
that result are very costly. So, prevent SFP polling from
being done more than once every two seconds. To reduce latency,
the poll time is cleared in a couple of cases to permit the
next service task execution to poll the SFP module.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-09-23 22:43:22 -07:00
Mark Rustad
cbd45ec7aa ixgbe: Add X550EM support for SFP insertion interrupt
Add support for the SFP insertion interrupt on X550EM devices with
SFPs.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-09-22 15:58:30 -07:00
Mark Rustad
29a8dca199 ixgbe: Accept SFP not present errors on all devices
When an SFP not present error is returned by the reset_hw method,
accept it and go on, since an SFP can still be inserted. Previously
it was only accepted for 82598 devices.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-09-22 15:58:29 -07:00
Don Skidmore
a023bbd0b1 ixgbe: Add SFP+ detection for X550 hardware
This patch is part of the future enablement of X550 SFP+ support.  This
HW uses different SDP so the interrupts need to be set up accordingly.

Signed-off-by: Donald C Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-09-22 15:58:26 -07:00
Alexander Duyck
8ac34f10a5 ixgbe: Limit lowest interrupt rate for adaptive interrupt moderation to 12K
This patch updates the lowest limit for adaptive interrupt interrupt
moderation to roughly 12K interrupts per second.

The way I came about reaching 12K as the desired interrupt rate is by
testing with UDP flows.  Specifically I had a simple test that ran a
netperf UDP_STREAM test at varying sizes.  What I found was as the packet
sizes increased the performance fell steadily behind until we were only
able to receive at ~4Gb/s with a message size of 65507.  A bit of digging
found that we were dropping packets for the socket in the network stack,
and looking at things further what I found was I could solve it by increasing
the interrupt rate, or increasing the rmem_default/rmem_max.  What I found was
that when the interrupt coalescing resulted in more data being processed
per interrupt than could be stored in the socket buffer we started losing
packets and the performance dropped.  So I reached 12K based on the
following math.

rmem_default = 212992
skb->truesize = 2994
212992 / 2994 = 71.14 packets to fill the buffer

packet rate at 1514 packet size is 812744pps
71.14 / 812744 = 87.9us to fill socket buffer

From there it was just a matter of choosing the interrupt rate and
providing a bit of wiggle room which is why I decided to go with 12K
interrupts per second as that uses a value of 84us.

The data below is based on VM to VM over a direct assigned ixgbe interface.
The test run was:
	netperf -H <ip> -t UDP_STREAM"

Socket  Message  Elapsed      Messages                   CPU      Service
Size    Size     Time         Okay Errors   Throughput   Util     Demand
bytes   bytes    secs            #      #   10^6bits/sec % SS     us/KB
Before:
212992   65507   60.00     1100662      0     9613.4     10.89    0.557
212992           60.00      473474            4135.4     11.27    0.576

After:
212992   65507   60.00     1100413      0     9611.2     10.73    0.549
212992           60.00      974132            8508.3     11.69    0.598

Using bare metal the data is similar but not as dramatic as the throughput
increases from about 8.5Gb/s to 9.5Gb/s.

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-09-15 17:05:12 -07:00
Alex Williamson
6b010e9b1f ixgbe: Teardown SR-IOV before unregister_netdev()
When the .remove() callback for a PF is called, SR-IOV support for the
device is disabled, which requires unbinding and removing the VFs.
The VFs may be in-use either by the host kernel or userspace, such as
assigned to a VM through vfio-pci.  In this latter case, the VFs may
be removed either by shutting down the VM or hot-unplugging the
devices from the VM.  Unfortunately in the case of a Windows 2012 R2
guest, hot-unplug is broken due to the ordering of the PF driver
teardown.  Disabling SR-IOV prior to unregister_netdev() avoids this
issue.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-09-15 17:05:07 -07:00
Don Skidmore
4ccc650cc8 ixgbe: fix issue with SFP events with new X550 devices
Add checks for systems that don't have SFP's to avoid incorrectly
acting on interrupts that are falsely interpreted as SFP events.
This also includes a modified check generating the EICR mask to be
more forward-looking.

Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-09-15 17:02:53 -07:00
Alex Williamson
7837e2867f ixgbe: Remove bimodal SR-IOV disabling
When unbinding an SR-IOV device with VFs configured from ixgbe, the
driver behaves in one of two ways.  If max_vfs was specified, the
SR-IOV state is disabled, removing the VFs.  The occurs regardless of
whether the VF count was later modified through sysfs.  If however
max_vfs is zero, such as by not specifying the module parameter, the
VFs persist after the PF is unbound from ixgbe.  If the PF is then
bound to vfio-pci to be assigned to a VM, the PF is non-functional.

>From the comment, commit da36b64736 ("ixgbe: Implement PCI SR-IOV
sysfs callback operation") clearly intended this alternate behavior,
but probably didn't realize the PF doesn't work in this mode.

This bimodal behavior is confusing to users and results in a state
where the PF is broken for other uses unless the user sets
sriov_numvfs to zero prior to unbinding the device.  Remove this
behavior so that VFs are removed and the PF is functional for other
uses after unbind, regardless of the way VFs are enabled.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-09-01 17:17:41 -07:00
Mark Rustad
454adb008d ixgbe: Add support for reporting 2.5G link speed
Now that we can do 2.5G link speed, we need to be able to report it.
Also change the nested triadic involved in creating the log message
to instead use a simpler switch statement to set a string pointer.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-09-01 17:15:20 -07:00
Emil Tantilov
7e3f5c8881 ixgbe: fix bounds checking in ixgbe_setup_tc for 82598
This patch resolves an issue where users were not able to dynamically
set number of queues for 82598 via ethtool -L

Reported-by: Tal Abudi <talabudi@gmail.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-09-01 17:13:00 -07:00
Tom Barbette
1c7cf0784e ixgbe: support for ethtool set_rxfh
Allows to change the rxfh indirection table and/or key using
ethtool interface.

Signed-off-by: Tom Barbette <tom.barbette@ulg.ac.be>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-09-01 17:10:38 -07:00
Maninder Singh
bc52f951e3 ixgbe: use kzalloc for allocating one thing
Use kzalloc rather than kcalloc(1..

The semantic patch that makes this change is as follows:

// <smpl>
@@
@@

- kcalloc(1,
+ kzalloc(
          ...)
// </smpl>

and removing checkpatch below CHECK:
CHECK: Prefer kzalloc(sizeof(*fwd_adapter)...) over
kzalloc(sizeof(struct ixgbe_fwd_adapter)...)

Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
Reviewed-by: Vaneet Narang <v.narang@samsung.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-09-01 17:01:17 -07:00
Don Skidmore
f9328bc6a7 ixgbe: add new bus type for intergrated I/O interface (IOSF)
With this patch we add support for a new bus type ixgbe_bus_type_internal.
X550em devices use IOSF and not PCIe bus so this new type is to accommodate
them.

Signed-off-by: Donald C Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-09-01 16:56:35 -07:00
Don Skidmore
6ac7439459 ixgbe: Add support for entering low power link up state
When the device is closing or suspending, call ixgbe_enter_lplu to
enter low power link up state on devices that support it. When this
is done, prevent the phy from being reset in the ixgbe_down path
so that link is present when calling ixgbe_enter_lplu.

Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-09-01 16:51:55 -07:00
Mark Rustad
67359c3c9f ixgbe: Add support for VXLAN RX offloads
Add support for VXLAN RX offloads for the X55x devices that support
them.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-09-01 16:49:37 -07:00
Mark Rustad
f467bc0602 ixgbe: Add support for UDP-encapsulated tx checksum offload
By using GSO for UDP-encapsulated packets, all ixgbe devices can
be directed to generate checksums for the inner headers because
the outer UDP checksum can be zero. So point the machinery at the
inner headers and have the hardware generate the checksum.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-09-01 16:47:19 -07:00
David S. Miller
0d36938bb8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-08-27 21:45:31 -07:00
Michal Hocko
2f064f3485 mm: make page pfmemalloc check more robust
Commit c48a11c7ad ("netvm: propagate page->pfmemalloc to skb") added
checks for page->pfmemalloc to __skb_fill_page_desc():

        if (page->pfmemalloc && !page->mapping)
                skb->pfmemalloc = true;

It assumes page->mapping == NULL implies that page->pfmemalloc can be
trusted.  However, __delete_from_page_cache() can set set page->mapping
to NULL and leave page->index value alone.  Due to being in union, a
non-zero page->index will be interpreted as true page->pfmemalloc.

So the assumption is invalid if the networking code can see such a page.
And it seems it can.  We have encountered this with a NFS over loopback
setup when such a page is attached to a new skbuf.  There is no copying
going on in this case so the page confuses __skb_fill_page_desc which
interprets the index as pfmemalloc flag and the network stack drops
packets that have been allocated using the reserves unless they are to
be queued on sockets handling the swapping which is the case here and
that leads to hangs when the nfs client waits for a response from the
server which has been dropped and thus never arrive.

The struct page is already heavily packed so rather than finding another
hole to put it in, let's do a trick instead.  We can reuse the index
again but define it to an impossible value (-1UL).  This is the page
index so it should never see the value that large.  Replace all direct
users of page->pfmemalloc by page_is_pfmemalloc which will hide this
nastiness from unspoiled eyes.

The information will get lost if somebody wants to use page->index
obviously but that was the case before and the original code expected
that the information should be persisted somewhere else if that is
really needed (e.g.  what SLAB and SLUB do).

[akpm@linux-foundation.org: fix blooper in slub]
Fixes: c48a11c7ad ("netvm: propagate page->pfmemalloc to skb")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Debugged-by: Vlastimil Babka <vbabka@suse.com>
Debugged-by: Jiri Bohac <jbohac@suse.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: <stable@vger.kernel.org>	[3.6+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-08-21 14:30:10 -07:00
Jacob Keller
56d1392f2f ixgbe: TRIVIAL fix up double 'the' and comment style
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-08-18 14:06:07 -07:00
Fan Du
7edda4b871 ixgbe: Specify Rx hash type WRT Rx desc RSS type
RSS could be leveraged by taking account L4 src/dst ports
as ingredients, thus ingress skb Rx hash type should honor
such the real configuration.

Signed-off-by: Fan Du <fan.du@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-07-17 19:59:06 -07:00
Scott Feldman
7d4f8d871a switchdev; add VLAN support for port's bridge_getlink
One more missing piece of the puzzle.  Add vlan dump support to switchdev
port's bridge_getlink.  iproute2 "bridge vlan show" cmd already knows how
to show the vlans installed on the bridge and the device , but (until now)
no one implemented the port vlan part of the netlink PF_BRIDGE:RTM_GETLINK
msg.  Before this patch, "bridge vlan show":

	$ bridge -c vlan show
	port    vlan ids
	sw1p1    30-34			<< bridge side vlans
		 57

	sw1p1				<< device side vlans (missing)

	sw1p2    57

	sw1p2

	sw1p3

	sw1p4

	br0     None

(When the port is bridged, the output repeats the vlan list for the vlans
on the bridge side of the port and the vlans on the device side of the
port.  The listing above show no vlans for the device side even though they
are installed).

After this patch:

	$ bridge -c vlan show
	port    vlan ids
	sw1p1    30-34			<< bridge side vlan
		 57

	sw1p1    30-34			<< device side vlans
		 57
		 3840 PVID

	sw1p2    57

	sw1p2    57
		 3840 PVID

	sw1p3    3842 PVID

	sw1p4    3843 PVID

	br0     None

I re-used ndo_dflt_bridge_getlink to add vlan fill call-back func.
switchdev support adds an obj dump for VLAN objects, using the same
call-back scheme as FDB dump.  Support included for both compressed and
un-compressed vlan dumps.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-23 06:56:18 -07:00