linux

mirror of synced 2025-03-06 20:59:54 +01:00

Author	SHA1	Message	Date
Jacob Keller	8664109467	fm10k: consistently use Intel(R) for driver names Update every header file and other locations to consistently use Intel(R) instead of just Intel. Also update copyright year of files which we modified. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-20 23:07:34 -07:00
Jacob Keller	540a5d8590	fm10k: fix possible null pointer deref after kcalloc When writing a new default redirection table, we needed to populate a new RSS table using ethtool_rxfh_indir_default. We populated this table into a region of memory allocated using kcalloc, but never checked this for NULL. Fix this by moving the default table generation into fm10k_write_reta. If this function is passed a table, use it. Otherwise, generate the default table using ethtool_rxfh_indir_default, 4 at at time. Fixes: `0ea7fae440` ("fm10k: use ethtool_rxfh_indir_default for default redirection table") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-20 23:07:33 -07:00
Jacob Keller	9de6a1a6b8	fm10k: drop 1588 support The 1588 support within fm10k does not work correctly with the current version of the switch management software, and likely never worked correctly to begin with. Remove support for PTP/1588. Update copyright year for all these files while we're touching them. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-20 23:06:22 -07:00
Jacob Keller	2d0f76bedb	fm10k: use DRV_SUMMARY to reduce code duplication Use DRV_SUMMARY, similar to DRV_VERSION so that we don't have to duplicate the driver summary in multiple places. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-20 23:06:21 -07:00
Alexander Duyck	144d830558	fm10k: Add support for bulk Tx cleanup & cleanup boolean logic This patch enables bulk free in Tx cleanup for fm10k and cleans up the boolean logic in the polling routines for fm10k in the hopes of avoiding any mix-ups similar to what occurred with i40e and i40evf. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-20 23:06:20 -07:00
Jacob Keller	0ea7fae440	fm10k: use ethtool_rxfh_indir_default for default redirection table The fm10k driver used its own code for generating a default indirection table on device load, which was not the same as the default generated by ethtool when indir_size of 0 is passed to SRXFH. Take advantage of ethtool_rxfh_indir_default() and simplify code to write the redirection table to reduce some code duplication. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-05 12:49:50 -07:00
Jacob Keller	4be37c42a4	fm10k: correctly clean up when init_queueing_scheme fails Fix a kernel panic that occurs during surprise removal. Clear the interface queue counts upon fm10k_init_msix_capability failure. This prevents further code (fm10k_update_stats etc.) from attempting to access unallocated queue vector or ring memory. [ 628.692648] BUG: unable to handle kernel NULL pointer dereference at 0000000000000068 [ 628.692805] IP: [<ffffffffa0475caf>] fm10k_update_stats+0x7f/0x2c0 [fm10k] [ 628.693173] PGD 0 [ 628.693759] Oops: 0000 [#1] SMP [ 628.699321] CPU: 10 PID: 8164 Comm: kworker/10:0 Tainted: G OE ------------ 3.10.0-327.el7.x86_64 #1 [ 628.700096] Hardware name: Supermicro X9DAi/X9DAi, BIOS 3.2 05/09/2015 [ 628.700894] Workqueue: pciehp-1 pciehp_power_thread [ 628.701686] task: ffff88086559c500 ti: ffff8808593c0000 task.ti: ffff8808593c0000 [ 628.702493] RIP: 0010:[<ffffffffa0475caf>] [<ffffffffa0475caf>] fm10k_update_stats+0x7f/0x2c0 [fm10k] [ 628.703310] RSP: 0018:ffff8808593c3b00 EFLAGS: 00010282 [ 628.704132] RAX: 0000000000000000 RBX: ffff880860760000 RCX: 0000000000000000 [ 628.704963] RDX: ffff880860760b08 RSI: 0000000000000000 RDI: 0000000000000000 [ 628.705794] RBP: ffff8808593c3b40 R08: 0000000000000000 R09: 0000000000000000 [ 628.706604] R10: 0000000000000000 R11: ffff880860760c40 R12: 0000000000000080 [ 628.707420] R13: ffff8808607608c0 R14: ffff880860779ec0 R15: ffff880860779f40 [ 628.708238] FS: 0000000000000000(0000) GS:ffff88086f000000(0000) knlGS:0000000000000000 [ 628.709071] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 628.709923] CR2: 0000000000000068 CR3: 000000000194a000 CR4: 00000000001407e0 [ 628.710752] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 628.711596] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 628.712438] Stack: [ 628.713255] ffff880860764458 ffff8808607608c0 ffff880860760000 ffff880860760000 [ 628.714088] 0000000000000080 ffff8808607608c0 ffff880860779ec0 ffff880860779f40 [ 628.714925] ffff8808593c3b88 ffffffffa04780c5 ffff880860764458 0000000a8163cb5b [ 628.715752] Call Trace: [ 628.716560] [<ffffffffa04780c5>] fm10k_down+0x155/0x1f0 [fm10k] [ 628.717367] [<ffffffffa0479958>] fm10k_close+0x28/0xd0 [fm10k] [ 628.718184] [<ffffffff81526365>] __dev_close_many+0x85/0xd0 [ 628.718986] [<ffffffff815264d8>] dev_close_many+0x98/0x120 [ 628.719764] [<ffffffff81527ab8>] rollback_registered_many+0xa8/0x230 [ 628.720527] [<ffffffff81527c80>] rollback_registered+0x40/0x70 [ 628.721294] [<ffffffff81529198>] unregister_netdevice_queue+0x48/0x80 [ 628.722052] [<ffffffff815291ec>] unregister_netdev+0x1c/0x30 [ 628.722816] [<ffffffffa04762b8>] fm10k_remove+0xd8/0xe0 [fm10k] [ 628.723581] [<ffffffff81328c7b>] pci_device_remove+0x3b/0xb0 [ 628.724340] [<ffffffff813f5fbf>] __device_release_driver+0x7f/0xf0 [ 628.725088] [<ffffffff813f6053>] device_release_driver+0x23/0x30 [ 628.725814] [<ffffffff81321fe4>] pci_stop_bus_device+0x94/0xa0 [ 628.726535] [<ffffffff813220d2>] pci_stop_and_remove_bus_device+0x12/0x20 [ 628.727249] [<ffffffff8133de40>] pciehp_unconfigure_device+0xb0/0x1b0 [ 628.727964] [<ffffffff8133d822>] pciehp_disable_slot+0x52/0xd0 [ 628.728664] [<ffffffff8133d98a>] pciehp_power_thread+0xea/0x150 [ 628.729358] [<ffffffff8109d5fb>] process_one_work+0x17b/0x470 [ 628.730036] [<ffffffff8109e3cb>] worker_thread+0x11b/0x400 [ 628.730730] [<ffffffff8109e2b0>] ? rescuer_thread+0x400/0x400 [ 628.731385] [<ffffffff810a5aef>] kthread+0xcf/0xe0 [ 628.732036] [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140 [ 628.732674] [<ffffffff81645858>] ret_from_fork+0x58/0x90 [ 628.733289] [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140 [ 628.733883] Code: 83 e8 01 48 8d 97 40 02 00 00 45 31 c0 4c 8d 9c c7 48 02 0 [ 628.735202] RIP [<ffffffffa0475caf>] fm10k_update_stats+0x7f/0x2c0 [fm10k] [ 628.735732] RSP <ffff8808593c3b00> [ 628.736285] CR2: 0000000000000068 [ 628.736846] ---[ end trace 9156088b311aff42 ]--- Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-05 12:49:34 -07:00
Jacob Keller	b3525696ad	fm10k: base queue scheme covered by RSS In fm10k_set_num_queues, we previously assigned the base template. This would always be overwritten by either fm10k_set_qos_queues or fm10k_set_rss_queues. In either case, we don't need the base values, so we can just remove them. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-05 12:49:07 -07:00
Bruce Allan	fcdb0a9951	fm10k: cleanup remaining right-bit-shifted 1 Use BIT() macro instead. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-05 12:44:30 -07:00
Bruce Allan	1aab144c50	fm10k: Move constants to the right of binary operators The semantic patch that makes this change is available in scripts/coccinelle/misc/compare_const_fl.cocci. More information about semantic patching is available at http://coccinelle.lip6.fr/ Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-05 12:39:55 -07:00
Linus Torvalds	1200b6809d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: "Highlights: 1) Support more Realtek wireless chips, from Jes Sorenson. 2) New BPF types for per-cpu hash and arrap maps, from Alexei Starovoitov. 3) Make several TCP sysctls per-namespace, from Nikolay Borisov. 4) Allow the use of SO_REUSEPORT in order to do per-thread processing of incoming TCP/UDP connections. The muxing can be done using a BPF program which hashes the incoming packet. From Craig Gallek. 5) Add a multiplexer for TCP streams, to provide a messaged based interface. BPF programs can be used to determine the message boundaries. From Tom Herbert. 6) Add 802.1AE MACSEC support, from Sabrina Dubroca. 7) Avoid factorial complexity when taking down an inetdev interface with lots of configured addresses. We were doing things like traversing the entire address less for each address removed, and flushing the entire netfilter conntrack table for every address as well. 8) Add and use SKB bulk free infrastructure, from Jesper Brouer. 9) Allow offloading u32 classifiers to hardware, and implement for ixgbe, from John Fastabend. 10) Allow configuring IRQ coalescing parameters on a per-queue basis, from Kan Liang. 11) Extend ethtool so that larger link mode masks can be supported. From David Decotigny. 12) Introduce devlink, which can be used to configure port link types (ethernet vs Infiniband, etc.), port splitting, and switch device level attributes as a whole. From Jiri Pirko. 13) Hardware offload support for flower classifiers, from Amir Vadai. 14) Add "Local Checksum Offload". Basically, for a tunneled packet the checksum of the outer header is 'constant' (because with the checksum field filled into the inner protocol header, the payload of the outer frame checksums to 'zero'), and we can take advantage of that in various ways. From Edward Cree" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1548 commits) bonding: fix bond_get_stats() net: bcmgenet: fix dma api length mismatch net/mlx4_core: Fix backward compatibility on VFs phy: mdio-thunder: Fix some Kconfig typos lan78xx: add ndo_get_stats64 lan78xx: handle statistics counter rollover RDS: TCP: Remove unused constant RDS: TCP: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket net: smc911x: convert pxa dma to dmaengine team: remove duplicate set of flag IFF_MULTICAST bonding: remove duplicate set of flag IFF_MULTICAST net: fix a comment typo ethernet: micrel: fix some error codes ip_tunnels, bpf: define IP_TUNNEL_OPTS_MAX and use it bpf, dst: add and use dst_tclassid helper bpf: make skb->tc_classid also readable net: mvneta: bm: clarify dependencies cls_bpf: reset class and reuse major in da ldmvsw: Checkpatch sunvnet.c and sunvnet_common.c ldmvsw: Add ldmvsw.c driver code ...	2016-03-19 10:05:34 -07:00
Joonsoo Kim	fe896d1878	mm: introduce page reference manipulation functions The success of CMA allocation largely depends on the success of migration and key factor of it is page reference count. Until now, page reference is manipulated by direct calling atomic functions so we cannot follow up who and where manipulate it. Then, it is hard to find actual reason of CMA allocation failure. CMA allocation should be guaranteed to succeed so finding offending place is really important. In this patch, call sites where page reference is manipulated are converted to introduced wrapper function. This is preparation step to add tracepoint to each page reference manipulation function. With this facility, we can easily find reason of CMA allocation failure. There is no functional change in this patch. In addition, this patch also converts reference read sites. It will help a second step that renames page._count to something else and prevents later attempt to direct access to it (Suggested by Andrew). Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Michal Nazarewicz <mina86@mina86.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Minchan Kim <minchan@kernel.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-03-17 15:09:34 -07:00
Keller, Jacob E	1012014ef5	fm10k: don't reinitialize RSS flow table when RXFH configured Also print an error message incase we do have to reconfigure as this should no longer happen anymore due to ethtool changes. If it somehow does occur, user should be made aware of it. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-02-16 15:19:49 -05:00
Bruce Allan	07146e2ea8	fm10k: don't initialize fm10k_workqueue at global level Cleans up checkpatch GLOBAL_INITIALIZERS error Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-12-22 04:20:19 -08:00
Bruce Allan	a4fcad656e	fm10k: whitespace cleanups Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-12-13 15:28:21 -08:00
Alexander Duyck	587731e684	fm10k: Cleanup MSI-X interrupts in case of failure If the q_vector allocation fails we should free the resources associated with the MSI-X vector table. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-12-13 15:28:04 -08:00
Jacob Keller	e3b6e95d07	fm10k: bump driver version We haven't bumped the driver version in a while despite many fixes being pulled in from the out-of-tree Sourceforge driver. Update the version to match. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-12-13 15:27:53 -08:00
Jacob Keller	03d13a51fb	fm10k: TRIVIAL cleanup order at top of fm10k_xmit_frame Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-12-05 23:55:24 -08:00
Jacob Keller	242722dd3d	fm10k: Update adaptive ITR algorithm The existing adaptive ITR algorithm is overly restrictive. It throttles incorrectly for various traffic rates, and does not produce good performance. The algorithm now allows for more interrupts per second, and does some calculation to help improve for smaller packet loads. In addition, take into account the new itr_scale from the hardware which indicates how much to scale due to PCIe link speed. Reported-by: Matthew Vick <matthew.vick@intel.com> Reported-by: Alex Duyck <alexander.duyck@gmail.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-12-05 23:55:21 -08:00
Jacob Keller	584373f5b9	fm10k: introduce ITR_IS_ADAPTIVE macro Define a macro for identifying when the itr value is dynamic or adaptive. The concept was taken from i40e. This helps make clear what the check is, and reduces the line length to something more reasonable in a few places. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-12-05 23:55:21 -08:00
Alexander Duyck	9f87298647	fm10k: Fix handling of NAPI budget when multiple queues are enabled per vector This patch corrects an issue in which the polling routine would increase the budget for Rx to at least 1 per queue if multiple queues were present. This would result in Rx packets being processed when the budget was 0 which is meant to indicate that no Rx can be handled. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-11-23 11:08:49 -08:00
Jesse Brandeburg	32b3e08fff	drivers/net/intel: use napi_complete_done() As per Eric Dumazet's previous patches: (see commit (`24d2e4a507`) - tg3: use napi_complete_done()) Quoting verbatim: Using napi_complete_done() instead of napi_complete() allows us to use /sys/class/net/ethX/gro_flush_timeout GRO layer can aggregate more packets if the flush is delayed a bit, without having to set too big coalescing parameters that impact latencies. </end quote> Tested configuration: low latency via ethtool -C ethx adaptive-rx off rx-usecs 10 adaptive-tx off tx-usecs 15 workload: streaming rx using netperf TCP_MAERTS igb: MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.1 () port 0 AF_INET : demo ... Interim result: 941.48 10^6bits/s over 1.000 seconds ending at 1440193171.589 Alignment Offset Bytes Bytes Recvs Bytes Sends Local Remote Local Remote Xfered Per Per Recv Send Recv Send Recv (avg) Send (avg) 8 8 0 0 1176930056 1475.36 797726 16384.00 71905 MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.1 () port 0 AF_INET : demo ... Interim result: 941.49 10^6bits/s over 0.997 seconds ending at 1440193142.763 Alignment Offset Bytes Bytes Recvs Bytes Sends Local Remote Local Remote Xfered Per Per Recv Send Recv Send Recv (avg) Send (avg) 8 8 0 0 1175182320 50476.00 23282 16384.00 71816 i40e: Hard to test because the traffic is incoming so fast (24Gb/s) that GRO always receives 87kB, even at the highest interrupt rate. Other drivers were only compile tested. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-10-16 04:33:46 -07:00
Jacob Keller	b4a5127b03	fm10k: do not use enum as boolean Check for actual value NETREG_UNINITIALIZED in case it ever changes from the current value of zero. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-10-13 23:51:37 -07:00
Jacob Keller	80043f3bf5	fm10k: add support for extra debug statistics Add a private ethtool flag to enable display of these statistics, which are generally less useful. However, sometimes it can be useful for debugging purposes. The most useful portion is the ability to see what the PF thinks the VF mailboxes look like. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-09-22 15:58:27 -07:00
Jacob Keller	e71c931842	fm10k: send traffic on default VID to VLAN device if we have one This patch ensures that VLAN traffic on the default VID will go to the corresponding VLAN device if it exists. To do this, mask the rx_ring VID if we have an active VLAN on that VID. For this to work correctly, we need to update fm10k_process_skb_fields to correctly mask off the VLAN_PRIO_MASK bits and compare them separately, otherwise we incorrectly compare the priority bits with the cleared flag. This also happens to fix a related bug where having priority bits set causes us to incorrectly classify traffic. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-09-15 17:06:04 -07:00
Alexander Duyck	aae072e363	fm10k: Don't assume page fragments are page size This change pulls out the optimization that assumed that all fragments would be limited to page size. That hasn't been the case for some time now and to assume this is incorrect as the TCP allocator can provide up to a 32K page fragment. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Acked-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-09-15 17:05:37 -07:00
Michal Hocko	2f064f3485	mm: make page pfmemalloc check more robust Commit `c48a11c7ad` ("netvm: propagate page->pfmemalloc to skb") added checks for page->pfmemalloc to __skb_fill_page_desc(): if (page->pfmemalloc && !page->mapping) skb->pfmemalloc = true; It assumes page->mapping == NULL implies that page->pfmemalloc can be trusted. However, __delete_from_page_cache() can set set page->mapping to NULL and leave page->index value alone. Due to being in union, a non-zero page->index will be interpreted as true page->pfmemalloc. So the assumption is invalid if the networking code can see such a page. And it seems it can. We have encountered this with a NFS over loopback setup when such a page is attached to a new skbuf. There is no copying going on in this case so the page confuses __skb_fill_page_desc which interprets the index as pfmemalloc flag and the network stack drops packets that have been allocated using the reserves unless they are to be queued on sockets handling the swapping which is the case here and that leads to hangs when the nfs client waits for a response from the server which has been dropped and thus never arrive. The struct page is already heavily packed so rather than finding another hole to put it in, let's do a trick instead. We can reuse the index again but define it to an impossible value (-1UL). This is the page index so it should never see the value that large. Replace all direct users of page->pfmemalloc by page_is_pfmemalloc which will hide this nastiness from unspoiled eyes. The information will get lost if somebody wants to use page->index obviously but that was the case before and the original code expected that the information should be persisted somewhere else if that is really needed (e.g. what SLAB and SLUB do). [akpm@linux-foundation.org: fix blooper in slub] Fixes: `c48a11c7ad` ("netvm: propagate page->pfmemalloc to skb") Signed-off-by: Michal Hocko <mhocko@suse.com> Debugged-by: Vlastimil Babka <vbabka@suse.com> Debugged-by: Jiri Bohac <jbohac@suse.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: David Miller <davem@davemloft.net> Acked-by: Mel Gorman <mgorman@suse.de> Cc: <stable@vger.kernel.org> [3.6+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-08-21 14:30:10 -07:00
Alexander Duyck	1a8782e59f	fm10k: fold fm10k_pull_tail into fm10k_add_rx_frag This change folds the fm10k_pull_tail call into fm10k_add_rx_frag. The advantage to doing this is that the fragment doesn't have to be modified after it is added to the skb. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-06-17 14:20:18 -07:00
Alexander Duyck	59486329b4	fm10k: Do not assume budget will never be 0 for NAPI The netpoll path will call napi->poll with a budget of 0 in order to clean the Tx rings only. This change updates the fm10k driver so that it will correctly support that instead of cleaning 1 Rx frame if a budget of 0 is received. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-04 00:06:55 -04:00
Jeff Kirsher	f4f88c6d3b	fm10k: Bump driver version to 0.15.2 With the recent driver changes, bump the version. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>	2015-04-14 15:57:02 -07:00
Jeff Kirsher	b382bb1b3e	fm10k: use separate workqueue for fm10k driver Since we run the watchdog periodically, which might take a while and potentially monopolize the system default workqueue, create our own separate work queue. This also helps reduce and stabilize latency between scheduling the work in our interrupt and actually performing the work. Still use a timer for the regular scheduled interval but queue the work onto its own work queue. It seemed overkill to create a single workqueue per interface, so we just spawn a single work queue for all interfaces upon driver load. For this reason, use a multi-threaded workqueue with one thread per processor, rather than single threaded queue. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-04-14 15:38:06 -07:00
Jeff Kirsher	5deaf50bfb	fm10k: remove extraneous "Reset interface" message Since we already print this message when a reset is requested via the RESET_REQUESTED flag, we do not need to print it before setting the flag. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-04-14 15:35:48 -07:00
Jeff Kirsher	de44519916	fm10k: fix unused warnings The were several functions which had parameters which were never or sometimes used in functions. To resolve possible compiler warnings, use __always_unused or __maybe_unused kernel macros to resolve. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-04-14 15:23:36 -07:00
Matthew Vick	eca3204765	fm10k: Resolve various spelling errors and checkpatch warnings Fix a few silly typos in the code and checkpatch warnings in support of general code cleanliness. Signed-off-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-03-03 01:07:23 -08:00
Matthew Vick	5bf33dc687	fm10k: Implement ndo_features_check The introduction of ndo_features_check allows drivers to report their offload capabilities per-skb. Implement this in fm10k to take advantage of this new functionality. Reported-by: Joe Stringer <joestringer@nicira.com> Signed-off-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-03-03 01:07:22 -08:00
Matthew Vick	8c1a90aa49	fm10k: Modify tunnel length header check when offloading The FM10000 host interface can only support up to 184 bytes when performing tunnel offloads. Because of this, a check was added to prevent the driver from attempting to feed a header to the hardware too big for it to parse. Make this check a little more robust by calculating the inner L4 header length based on whether it is TCP or UDP. Cc: Joe Stringer <joestringer@nicira.com> Signed-off-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-03-03 01:07:21 -08:00
Joe Stringer	b66b6d9f6d	fm10k: Check tunnel header length in encap offload fm10k supports up to 184 bytes of inner+outer headers. Add an initial check to fail encap offload if these are too large. Signed-off-by: Joe Stringer <joestringer@nicira.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-01-22 18:10:21 -08:00
Alexander Duyck	ba5b8dcdb8	fm10k: Clean-up page reuse code This patch cleans up the page reuse code getting it into a state where all the workarounds needed are in place as well as cleaning up a few minor oversights such as using __free_pages instead of put_page to drop a locally allocated page. It also cleans up how we clear the descriptor status bits. Previously they were zeroed as a part of clearing the hdr_addr. However the hdr_addr is a 64 bit field and 64 bit writes can be a bit more expensive on on 32 bit systems. Since we are no longer using the header split feature the upper 32 bits of the address no longer need to be cleared. As a result we can just clear the status bits and leave the length and VLAN fields as-is which should provide more information in debugging. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-01-22 18:10:17 -08:00
Jiri Pirko	df8a39defa	net: rename vlan_tx_* helpers since "tx" is misleading there The same macros are used for rx as well. So rename it. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-13 17:51:08 -05:00
Alexander Duyck	124b74c18e	fm10k/igb/ixgbe: Use dma_rmb on Rx descriptor reads This change makes it so that dma_rmb is used when reading the Rx descriptor. The advantage of dma_rmb is that it allows for a much lower cost barrier on x86, powerpc, arm, and arm64 architectures than a traditional memory barrier when dealing with reads that only have to synchronize to coherent memory. In addition I have updated the code so that it just checks to see if any bits have been set instead of just the DD bit since the DD bit will always be set as a part of a descriptor write-back so we just need to check for a non-zero value being present at that memory location rather than just checking for any specific bit. This allows the code itself to appear much cleaner and allows the compiler more room to optimize. Cc: Matthew Vick <matthew.vick@intel.com> Cc: Don Skidmore <donald.c.skidmore@intel.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-11 21:15:06 -05:00
Alexander Duyck	67fd893ee0	ethernet/intel: Use napi_alloc_skb This change replaces calls to netdev_alloc_skb_ip_align with napi_alloc_skb. The advantage of napi_alloc_skb is currently the fact that the page allocation doesn't make use of any irq disable calls. There are few spots where I couldn't replace the calls as the buffer allocation routine is called as a part of init which is outside of the softirq context. Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-10 13:31:57 -05:00
Alexander Duyck	a94d9e224e	ethernet/intel: Use eth_skb_pad and skb_put_padto helpers Update the Intel Ethernet drivers to use eth_skb_pad() and skb_put_padto instead of doing their own implementations of the function. Also this cleans up two other spots where skb_pad was called but the length and tail pointers were being manipulated directly instead of just having the padding length added via __skb_put. Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-08 20:47:42 -05:00
Alexander Duyck	42b17f0955	fm10k/igb/ixgbe: Replace __skb_alloc_page with dev_alloc_page The Intel drivers were pretty much just using the plain vanilla GFP flags in their calls to __skb_alloc_page so this change makes it so that they use dev_alloc_page which just uses GFP_ATOMIC for the gfp_flags value. Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Matthew Vick <matthew.vick@intel.com> Cc: Don Skidmore <donald.c.skidmore@intel.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-12 00:00:14 -05:00
Alexander Duyck	2c2b2f0cb9	fm10k: Add skb->xmit_more support This change adds support for skb->xmit_more based on the changes that were made to igb to support the feature. The main changes are moving up the check for maybe_stop_tx so that we can check netif_xmit_stopped to determine if we must write the tail because we can add no further buffers. Acked-by: Matthew Vick <matthew.vick@intel.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 13:09:14 -04:00
Eric Dumazet	42b0270b40	fm10k: fix race accessing page->_count This is illegal to use atomic_set(&page->_count, 2) even if we 'own' the page. Other entities in the kernel need to use get_page_unless_zero() to get a reference to the page before testing page properties, so we could loose a refcount increment. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-10 15:37:28 -04:00
Alexander Duyck	a211e0136c	fm10k: Add support for PTP This change adds support for the Linux PTP Hardware clock and timestamping functionality provided by the hardware. There are actually two cases that this timestamping is meant to support. The first case would be an ordinary clock scenario. In this configuration the host interface does not have access to BAR 4. However all of the host interfaces should be locked into the same boundary clock region and as such they are all on the same clock anyway. With this being the case they can synchronize among themselves and only need to adjust the offset since they are all on the same clock with the same frequency. The second case is a boundary clock scenario. This is a special case and would require both BAR 4 access, and a means of presenting a netdev per boundary region. The current plan is to use DSA at some point in the future to provide these interfaces, but the DSA portion is still under development. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2014-09-23 03:59:23 -07:00
Alexander Duyck	7461fd913a	fm10k: Add support for debugfs This patch adds limited debugfs support for the driver. Most of the functionality needed for dumping registers is already provided via ethtool. The only thing we saw that we really neeed was the ability to dump the descriptor rings so as such this patch will add a fm10k directory containing a listing of directories each one with a unique PCI Bus, Device, and Function number. Each of those BDF directories will have a list of q_vectors, and the q_vectors will contain a file for each of the Rx/Tx rings that are a part of the vector. For example: # ls -RD /sys/kernel/debug/fm10k/ /sys/kernel/debug/fm10k/: 0000:01:00.0 /sys/kernel/debug/fm10k/0000:01:00.0: q_vector.000 q_vector.001 q_vector.002 q_vector.003 /sys/kernel/debug/fm10k/0000:01:00.0/q_vector.000: rx_ring.000 tx_ring.000 /sys/kernel/debug/fm10k/0000:01:00.0/q_vector.001: rx_ring.001 tx_ring.001 /sys/kernel/debug/fm10k/0000:01:00.0/q_vector.002: rx_ring.002 tx_ring.002 /sys/kernel/debug/fm10k/0000:01:00.0/q_vector.003: rx_ring.003 tx_ring.003 # cat /sys/kernel/debug/fm10k/0000:01:00.0/q_vector.000/rx_ring.000 DES DATA RSS STATERR LENGTH VLAN DGLORT SGLORT TIMESTAMP --------------------------------------------------------------------------- 000 0x00000000 0x00000000 0x00000003 0x002a 0x0000 0x0000 0x0000 0x13951807dc4fedf0 001 0x00000000 0x00000000 0x00000003 0x002a 0x0000 0x0000 0x0000 0x1395180906c9f2c8 002 0x3731c000 0x00000000 0x00000000 0x0000 0x0000 0x0000 0x0000 0x0000000000000000 003 0x3731d000 0x00000000 0x00000000 0x0000 0x0000 0x0000 0x0000 0x0000000000000000 004 0xaab3a000 0x00000000 0x00000000 0x0000 0x0000 0x0000 0x0000 0x0000000000000000 ... # cat /sys/kernel/debug/fm10k/0000:01:00.0/q_vector.000/tx_ring.000 DES BUFFER_ADDRESS LENGTH VLAN MSS HDRLEN FLAGS --------------------------------------------------------- 000 0x00000000aa8a1002 0x005a 0x0000 0x0000 0x0000 0xc0 001 0x00000000aa8a2002 0x005a 0x0000 0x0000 0x0000 0xc0 002 0x000000006bc13202 0x004e 0x0000 0x0000 0x0000 0xc0 003 0x000000006bc13c02 0x002a 0x0000 0x0000 0x0000 0xe1 004 0x000000006bc13602 0x0062 0x0000 0x0000 0x0000 0xc0 Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2014-09-23 03:59:22 -07:00
Alexander Duyck	5cd5e2e982	fm10k: Add support for MACVLAN acceleration This patch adds support for L2 MACVLAN by making use of the fact that the RRC provides a unique tag per filter called a Global Resource Tag, or GLORT. In the case of this offload what I have done is assigned a linear block of these so that each GLORT represents one of the MACVLAN netdevs. By doing this I can share the Rx queues and Tx queues for all of the MACVLAN netdevs while allowing them to be demuxed in the Rx cleanup path. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2014-09-23 03:59:20 -07:00
Alexander Duyck	76a540d472	fm10k: Add support for netdev offloads This patch adds support for basic offloads including TSO, Tx checksum, Rx checksum, Rx hash, and the same features applied to VXLAN/NVGRE tunnels. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2014-09-23 03:59:19 -07:00
Alexander Duyck	aa3ac82268	fm10k: Add support for multiple queues This patch takes the driver from supporting a single queue to supporting multiple queues. The upper queue limit for the PF is 128 queues and the upper limit for the VF is (128 / num_vfs) rounded down to nearest power of 2. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2014-09-23 03:59:19 -07:00

1 2 3

105 commits