linux

mirror of synced 2025-03-06 20:59:54 +01:00

Author	SHA1	Message	Date
Linus Torvalds	2ef32ad224	virtio: features, fixes, cleanups Several new features here: - virtio-net is finally supported in vduse. - Virtio (balloon and mem) interaction with suspend is improved - vhost-scsi now handles signals better/faster. Fixes, cleanups all over the place. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmZN570PHG1zdEByZWRo YXQuY29tAAoJECgfDbjSjVRp2JUH/1K3fZOHymop6Y5Z3USFS7YdlF+dniedY/vg TKyWERkXOlxq1d9DVxC0mN7tk72DweuWI0YJjLXofrEW1VuW29ecSbyFXxpeWJls b7ErffxDAFRas5jkMCngD8TuFnbEegU0mGP5kbiHpEndBydQ2hH99Gg0x7swW+cE xsvU5zonCCLwLGIP2DrVrn9qGOHtV6o8eZfVKDVXfvicn3lFBkUSxlwEYsO9RMup aKxV4FT2Pb1yBicwBK4TH1oeEXqEGy1YLEn+kAHRbgoC/5L0/LaiqrkzwzwwOIPj uPGkacf8CIbX0qZo5EzD8kvfcYL1xhU3eT9WBmpp2ZwD+4bINd4= =nax1 -----END PGP SIGNATURE----- Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost Pull virtio updates from Michael Tsirkin: "Several new features here: - virtio-net is finally supported in vduse - virtio (balloon and mem) interaction with suspend is improved - vhost-scsi now handles signals better/faster And fixes, cleanups all over the place" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (48 commits) virtio-pci: Check if is_avq is NULL virtio: delete vq in vp_find_vqs_msix() when request_irq() fails MAINTAINERS: add Eugenio Pérez as reviewer vhost-vdpa: Remove usage of the deprecated ida_simple_xx() API vp_vdpa: don't allocate unused msix vectors sound: virtio: drop owner assignment fuse: virtio: drop owner assignment scsi: virtio: drop owner assignment rpmsg: virtio: drop owner assignment nvdimm: virtio_pmem: drop owner assignment wifi: mac80211_hwsim: drop owner assignment vsock/virtio: drop owner assignment net: 9p: virtio: drop owner assignment net: virtio: drop owner assignment net: caif: virtio: drop owner assignment misc: nsm: drop owner assignment iommu: virtio: drop owner assignment drm/virtio: drop owner assignment gpio: virtio: drop owner assignment firmware: arm_scmi: virtio: drop owner assignment ...	2024-05-23 12:04:36 -07:00
Krzysztof Kozlowski	9a03bfa108	net: virtio: drop owner assignment virtio core already sets the .owner, so driver does not need to. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Message-Id: <20240331-module-owner-virtio-v2-17-98f04bfaf46a@linaro.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-05-22 08:31:17 -04:00
Daniel Jurgens	fa033def41	virtio_net: Fix missed rtnl_unlock The rtnl_lock would stay locked if allocating promisc_allmulti failed. Also changed the allocation to GFP_KERNEL. Fixes: `ff7c7d9f52` ("virtio_net: Remove command data from control_buf") Reported-by: Eric Dumazet <edumaset@google.com> Link: https://lore.kernel.org/netdev/CANn89iLazVaUCvhPm6RPJJ0owra_oFnx7Fhc8d60gV-65ad3WQ@mail.gmail.com/ Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: Brett Creeley <brett.creeley@amd.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20240515163125.569743-1-danielj@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-16 19:37:41 -07:00
Xuan Zhuo	9719f039d3	virtio_net: remove the misleading comment We call the build_skb() actually without copying data. The comment is misleading. So remove it. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20240511031404.30903-5-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-13 17:07:41 -07:00
Xuan Zhuo	defd28aa5a	virtio_net: rx remove premapped failover code Now, the premapped mode can be enabled unconditionally. So we can remove the failover code for merge and small mode. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://lore.kernel.org/r/20240511031404.30903-4-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-13 17:07:41 -07:00
Xuan Zhuo	a377ae542d	virtio_net: big mode skip the unmap check The virtio-net big mode did not enable premapped mode, so we did not need to check the unmap. And the subsequent commit will remove the failover code for failing enable premapped for merge and small mode. So we need to remove the checking do_dma code in the big mode path. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20240511031404.30903-3-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-13 17:07:41 -07:00
Daniel Jurgens	c39add9b24	virtio_net: Add TX stopped and wake counters Add a tx queue stop and wake counters, they are useful for debugging. $ ./tools/net/ynl/cli.py --spec netlink/specs/netdev.yaml \ --dump qstats-get --json '{"scope": "queue"}' ... {'ifindex': 13, 'queue-id': 0, 'queue-type': 'tx', 'tx-bytes': 14756682850, 'tx-packets': 226465, 'tx-stop': 113208, 'tx-wake': 113208}, {'ifindex': 13, 'queue-id': 1, 'queue-type': 'tx', 'tx-bytes': 18167675008, 'tx-packets': 278660, 'tx-stop': 8632, 'tx-wake': 8632}] Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Link: https://lore.kernel.org/r/20240510201927.1821109-3-danielj@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-13 14:58:36 -07:00
Daniel Jurgens	b49bd37f0b	virtio_net: Fix memory leak in virtnet_rx_mod_work The pointer delcaration was missing the __free(kfree). Fixes: `ff7c7d9f52` ("virtio_net: Remove command data from control_buf") Reported-by: Jens Axboe <axboe@kernel.dk> Closes: https://lore.kernel.org/netdev/0674ca1b-020f-4f93-94d0-104964566e3f@kernel.dk/ Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Tested-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/20240509183634.143273-1-danielj@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-10 18:20:07 -07:00
Daniel Jurgens	f8befdb21b	virtio_net: Remove rtnl lock protection of command buffers The rtnl lock is no longer needed to protect the control buffer and command VQ. Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Heng Qi <hengqi@linux.alibaba.com> Tested-by: Heng Qi <hengqi@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-07 11:42:00 +02:00
Daniel Jurgens	4d4ac2ecec	virtio_net: Add a lock for per queue RX coalesce Once the RTNL locking around the control buffer is removed there can be contention on the per queue RX interrupt coalescing data. Use a mutex per queue. A mutex is required because virtnet_send_command can sleep. Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: Heng Qi <hengqi@linux.alibaba.com> Tested-by: Heng Qi <hengqi@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-07 11:42:00 +02:00
Daniel Jurgens	650d77c51e	virtio_net: Do DIM update for specified queue only Since we no longer have to hold the RTNL lock here just do updates for the specified queue. Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: Heng Qi <hengqi@linux.alibaba.com> Tested-by: Heng Qi <hengqi@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-07 11:42:00 +02:00
Daniel Jurgens	6f45ab3e04	virtio_net: Add a lock for the command VQ. The command VQ will no longer be protected by the RTNL lock. Use a mutex to protect the control buffer header and the VQ. Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Heng Qi <hengqi@linux.alibaba.com> Tested-by: Heng Qi <hengqi@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-07 11:42:00 +02:00
Daniel Jurgens	ff7c7d9f52	virtio_net: Remove command data from control_buf Allocate memory for the data when it's used. Ideally the struct could be on the stack, but we can't DMA stack memory. With this change only the header and status memory are shared between commands, which will allow using a tighter lock than RTNL. Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Heng Qi <hengqi@linux.alibaba.com> Tested-by: Heng Qi <hengqi@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-07 11:42:00 +02:00
Daniel Jurgens	fce29030c5	virtio_net: Store RSS setting in virtnet_info Stop storing RSS setting in the control buffer. This is prep work for removing RTNL lock protection of the control buffer. Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Heng Qi <hengqi@linux.alibaba.com> Tested-by: Heng Qi <hengqi@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-07 11:42:00 +02:00
Xuan Zhuo	d888f04c09	virtio-net: support queue stat To enhance functionality, we now support reporting statistics through the netdev-generic netlink (netdev-genl) queue stats interface. However, this does not extend to all statistics, so a new field, qstat_offset, has been introduced. This field determines which statistics should be reported via netdev-genl queue stats. Given that queue stats are retrieved individually per queue, it's necessary for the virtnet_get_hw_stats() function to be capable of fetching statistics for a specific queue. As the document https://docs.kernel.org/next/networking/statistics.html#notes-for-driver-authors We should not duplicate the stats which get reported via the netlink API in ethtool. If the stats are for queue stat, that will not be reported by ethtool -S. python3 ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml --dump qstats-get --json '{"scope": "queue"}' [{'ifindex': 2, 'queue-id': 0, 'queue-type': 'rx', 'rx-bytes': 157844011, 'rx-csum-bad': 0, 'rx-csum-none': 0, 'rx-csum-unnecessary': 2195386, 'rx-hw-drop-overruns': 0, 'rx-hw-drop-ratelimits': 0, 'rx-hw-drops': 12964, 'rx-packets': 598929}, {'ifindex': 2, 'queue-id': 0, 'queue-type': 'tx', 'tx-bytes': 1938511, 'tx-csum-none': 0, 'tx-hw-drop-errors': 0, 'tx-hw-drop-ratelimits': 0, 'tx-hw-drops': 0, 'tx-needs-csum': 61263, 'tx-packets': 15515}] Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-04-30 10:51:33 +02:00
Xuan Zhuo	d806e1ff79	virtio_net: add the total stats field Now, we just show the stats of every queue. But for the user, the total values of every stat may are valuable. NIC statistics: rx_packets: 373522 rx_bytes: 85919736 rx_drops: 0 rx_xdp_packets: 0 rx_xdp_tx: 0 rx_xdp_redirects: 0 rx_xdp_drops: 0 rx_kicks: 11125 rx_hw_notifications: 0 rx_hw_packets: 1325870 rx_hw_bytes: 263348963 rx_hw_interrupts: 0 rx_hw_drops: 1451 rx_hw_drop_overruns: 0 rx_hw_csum_valid: 1325870 rx_hw_needs_csum: 1325870 rx_hw_csum_none: 0 rx_hw_csum_bad: 0 rx_hw_ratelimit_packets: 0 rx_hw_ratelimit_bytes: 0 tx_packets: 10050 tx_bytes: 1230176 tx_xdp_tx: 0 tx_xdp_tx_drops: 0 tx_kicks: 10050 tx_timeouts: 0 tx_hw_notifications: 0 tx_hw_packets: 32281 tx_hw_bytes: 4315590 tx_hw_interrupts: 0 tx_hw_drops: 0 tx_hw_drop_malformed: 0 tx_hw_csum_none: 0 tx_hw_needs_csum: 32281 tx_hw_ratelimit_packets: 0 tx_hw_ratelimit_bytes: 0 rx0_packets: 373522 rx0_bytes: 85919736 rx0_drops: 0 rx0_xdp_packets: 0 rx0_xdp_tx: 0 rx0_xdp_redirects: 0 rx0_xdp_drops: 0 rx0_kicks: 11125 rx0_hw_notifications: 0 rx0_hw_packets: 1325870 rx0_hw_bytes: 263348963 rx0_hw_interrupts: 0 rx0_hw_drops: 1451 rx0_hw_drop_overruns: 0 rx0_hw_csum_valid: 1325870 rx0_hw_needs_csum: 1325870 rx0_hw_csum_none: 0 rx0_hw_csum_bad: 0 rx0_hw_ratelimit_packets: 0 rx0_hw_ratelimit_bytes: 0 tx0_packets: 10050 tx0_bytes: 1230176 tx0_xdp_tx: 0 tx0_xdp_tx_drops: 0 tx0_kicks: 10050 tx0_timeouts: 0 tx0_hw_notifications: 0 tx0_hw_packets: 32281 tx0_hw_bytes: 4315590 tx0_hw_interrupts: 0 tx0_hw_drops: 0 tx0_hw_drop_malformed: 0 tx0_hw_csum_none: 0 tx0_hw_needs_csum: 32281 tx0_hw_ratelimit_packets: 0 tx0_hw_ratelimit_bytes: 0 Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-04-30 10:51:33 +02:00
Xuan Zhuo	d86769b9d2	virtio_net: device stats helpers support driver stats In the last commit, we introduced some helpers for device stats. And the drivers stats are realized by the open code. This commit make the helpers to support driver stats. Then we can have the unify helper for device and driver stats. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-04-30 10:51:33 +02:00
Xuan Zhuo	941168f8b4	virtio_net: support device stats As the spec `42f3899898` make virtio-net support getting the stats from the device by ethtool -S <eth0>. NIC statistics: rx0_packets: 582951 rx0_bytes: 155307077 rx0_drops: 0 rx0_xdp_packets: 0 rx0_xdp_tx: 0 rx0_xdp_redirects: 0 rx0_xdp_drops: 0 rx0_kicks: 17007 rx0_hw_packets: 2179409 rx0_hw_bytes: 510015040 rx0_hw_notifications: 0 rx0_hw_interrupts: 0 rx0_hw_needs_csum: 2179409 rx0_hw_ratelimit_bytes: 0 tx0_packets: 15361 tx0_bytes: 1918970 tx0_xdp_tx: 0 tx0_xdp_tx_drops: 0 tx0_kicks: 15361 tx0_timeouts: 0 tx0_hw_packets: 32272 tx0_hw_bytes: 4311698 tx0_hw_notifications: 0 tx0_hw_interrupts: 0 tx0_hw_ratelimit_bytes: 0 The follow stats are hidden, there are exported by the queue stat API in the subsequent comment. VIRTNET_STATS_DESC_RX(basic, drops) VIRTNET_STATS_DESC_RX(basic, drop_overruns), VIRTNET_STATS_DESC_TX(basic, drops), VIRTNET_STATS_DESC_TX(basic, drop_malformed), VIRTNET_STATS_DESC_RX(csum, csum_valid), VIRTNET_STATS_DESC_RX(csum, csum_none), VIRTNET_STATS_DESC_RX(csum, csum_bad), VIRTNET_STATS_DESC_TX(csum, needs_csum), VIRTNET_STATS_DESC_TX(csum, csum_none), VIRTNET_STATS_DESC_RX(gso, gso_packets), VIRTNET_STATS_DESC_RX(gso, gso_bytes), VIRTNET_STATS_DESC_RX(gso, gso_packets_coalesced), VIRTNET_STATS_DESC_RX(gso, gso_bytes_coalesced), VIRTNET_STATS_DESC_TX(gso, gso_packets), VIRTNET_STATS_DESC_TX(gso, gso_bytes), VIRTNET_STATS_DESC_TX(gso, gso_segments), VIRTNET_STATS_DESC_TX(gso, gso_segments_bytes), VIRTNET_STATS_DESC_RX(speed, ratelimit_packets), VIRTNET_STATS_DESC_TX(speed, ratelimit_packets), Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-04-30 10:51:33 +02:00
Xuan Zhuo	de6df26ffc	virtio_net: remove "_queue" from ethtool -S The key size of ethtool -S is controlled by this macro. ETH_GSTRING_LEN 32 That includes the \0 at the end. So the max length of the key name must is 31. But the length of the prefix "rx_queue_0_" is 11. If the queue num is larger than 10, the length of the prefix is 12. So the key name max is 19. That is too short. We will introduce some keys such as "gso_packets_coalesced". So we should change the prefix to "rx0_". Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-04-30 10:51:33 +02:00
Xuan Zhuo	aff5b0e605	virtio_net: introduce ability to get reply info from device As the spec `42f3899898` Based on the description provided in the above specification, we have enabled the virtio-net driver to support acquiring some response information from the device via the CVQ (Control Virtqueue). Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-04-30 10:51:32 +02:00
Liang Chen	aa37f8916d	virtio_net: Support RX hash XDP hint The RSS hash report is a feature that's part of the virtio specification. Currently, virtio backends like qemu, vdpa (mlx5), and potentially vhost (still a work in progress as per [1]) support this feature. While the capability to obtain the RSS hash has been enabled in the normal path, it's currently missing in the XDP path. Therefore, we are introducing XDP hints through kfuncs to allow XDP programs to access the RSS hash. 1. https://lore.kernel.org/all/20231015141644.260646-1-akihiko.odaki@daynix.com/#r Signed-off-by: Liang Chen <liangchen.linux@gmail.com> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Heng Qi <hengqi@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20240417071822.27831-1-liangchen.linux@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-04-18 13:54:04 +02:00
Breno Leitao	059a49aa2e	virtio_net: Do not send RSS key if it is not supported There is a bug when setting the RSS options in virtio_net that can break the whole machine, getting the kernel into an infinite loop. Running the following command in any QEMU virtual machine with virtionet will reproduce this problem: # ethtool -X eth0 hfunc toeplitz This is how the problem happens: 1) ethtool_set_rxfh() calls virtnet_set_rxfh() 2) virtnet_set_rxfh() calls virtnet_commit_rss_command() 3) virtnet_commit_rss_command() populates 4 entries for the rss scatter-gather 4) Since the command above does not have a key, then the last scatter-gatter entry will be zeroed, since rss_key_size == 0. sg_buf_size = vi->rss_key_size; 5) This buffer is passed to qemu, but qemu is not happy with a buffer with zero length, and do the following in virtqueue_map_desc() (QEMU function): if (!sz) { virtio_error(vdev, "virtio: zero sized buffers are not allowed"); 6) virtio_error() (also QEMU function) set the device as broken vdev->broken = true; 7) Qemu bails out, and do not repond this crazy kernel. 8) The kernel is waiting for the response to come back (function virtnet_send_command()) 9) The kernel is waiting doing the following : while (!virtqueue_get_buf(vi->cvq, &tmp) && !virtqueue_is_broken(vi->cvq)) cpu_relax(); 10) None of the following functions above is true, thus, the kernel loops here forever. Keeping in mind that virtqueue_is_broken() does not look at the qemu `vdev->broken`, so, it never realizes that the vitio is broken at QEMU side. Fix it by not sending RSS commands if the feature is not available in the device. Fixes: `c7114b1249` ("drivers/net/virtio_net: Added basic RSS support.") Cc: stable@vger.kernel.org Cc: qemu-devel@nongnu.org Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Heng Qi <hengqi@linux.alibaba.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-04-07 15:43:14 +01:00
Xuan Zhuo	5da7137de7	virtio_net: rename free_old_xmit_skbs to free_old_xmit Since free_old_xmit_skbs not only deals with skb, but also xdp frame and subsequent added xsk, so change the name of this function to free_old_xmit. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20240229072044.77388-19-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-03-19 03:19:22 -04:00
Xuan Zhuo	b1dc24aba7	virtio_net: unify the code for recycling the xmit ptr There are two completely similar and independent implementations. This is inconvenient for the subsequent addition of new types. So extract a function from this piece of code and call this function uniformly to recover old xmit ptr. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20240229072044.77388-18-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-03-19 03:19:22 -04:00
Jason Wang	0d197a1471	virtio-net: add cond_resched() to the command waiting loop Adding cond_resched() to the command waiting loop for a better co-operation with the scheduler. This allows to give CPU a breath to run other task(workqueue) instead of busy looping when preemption is not allowed on a device whose CVQ might be slow. Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20230720083839.481487-3-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>	2024-03-19 03:19:22 -04:00
Jason Wang	b9f7425239	virtio-net: convert rx mode setting to use workqueue This patch convert rx mode setting to be done in a workqueue, this is a must for allow to sleep when waiting for the cvq command to response since current code is executed under addr spin lock. Note that we need to disable and flush the workqueue during freeze, this means the rx mode setting is lost after resuming. This is not the bug of this patch as we never try to restore rx mode setting during resume. Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20230720083839.481487-2-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>	2024-03-19 03:19:22 -04:00
Zhu Yanjun	e3fe8d28c6	virtio_net: Fix "‘%d’ directive writing between 1 and 11 bytes into a region of size 10" warnings Fix the warnings when building virtio_net driver. " drivers/net/virtio_net.c: In function ‘init_vqs’: drivers/net/virtio_net.c:4551:48: warning: ‘%d’ directive writing between 1 and 11 bytes into a region of size 10 [-Wformat-overflow=] 4551 \| sprintf(vi->rq[i].name, "input.%d", i); \| ^~ In function ‘virtnet_find_vqs’, inlined from ‘init_vqs’ at drivers/net/virtio_net.c:4645:8: drivers/net/virtio_net.c:4551:41: note: directive argument in the range [-2147483643, 65534] 4551 \| sprintf(vi->rq[i].name, "input.%d", i); \| ^~~~~~~~~~ drivers/net/virtio_net.c:4551:17: note: ‘sprintf’ output between 8 and 18 bytes into a destination of size 16 4551 \| sprintf(vi->rq[i].name, "input.%d", i); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/net/virtio_net.c: In function ‘init_vqs’: drivers/net/virtio_net.c:4552:49: warning: ‘%d’ directive writing between 1 and 11 bytes into a region of size 9 [-Wformat-overflow=] 4552 \| sprintf(vi->sq[i].name, "output.%d", i); \| ^~ In function ‘virtnet_find_vqs’, inlined from ‘init_vqs’ at drivers/net/virtio_net.c:4645:8: drivers/net/virtio_net.c:4552:41: note: directive argument in the range [-2147483643, 65534] 4552 \| sprintf(vi->sq[i].name, "output.%d", i); \| ^~~~~~~~~~~ drivers/net/virtio_net.c:4552:17: note: ‘sprintf’ output between 9 and 19 bytes into a destination of size 16 4552 \| sprintf(vi->sq[i].name, "output.%d", i); " Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Link: https://lore.kernel.org/r/20240104020902.2753599-1-yanjun.zhu@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-01-11 16:54:34 -08:00
Jakub Kicinski	e63c1822ac	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/ethernet/broadcom/bnxt/bnxt.c `e009b2efb7` ("bnxt_en: Remove mis-applied code from bnxt_cfg_ntp_filters()") `0f2b214779` ("bnxt_en: Fix compile error without CONFIG_RFS_ACCEL") https://lore.kernel.org/all/20240105115509.225aa8a2@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-01-04 18:06:46 -08:00
Xuan Zhuo	2311e06b9b	virtio_net: fix missing dma unmap for resize For rq, we have three cases getting buffers from virtio core: 1. virtqueue_get_buf{,_ctx} 2. virtqueue_detach_unused_buf 3. callback for virtqueue_resize But in commit 295525e29a5b("virtio_net: merge dma operations when filling mergeable buffers"), I missed the dma unmap for the #3 case. That will leak some memory, because I did not release the pages referred by the unused buffers. If we do such script, we will make the system OOM. while true do ethtool -G ens4 rx 128 ethtool -G ens4 rx 256 free -m done Fixes: `295525e29a` ("virtio_net: merge dma operations when filling mergeable buffers") Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/20231226094333.47740-1-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-01-03 16:45:02 -08:00
Ahmed Zaki	fb6e30a725	net: ethtool: pass a pointer to parameters to get/set_rxfh ethtool ops The get/set_rxfh ethtool ops currently takes the rxfh (RSS) parameters as direct function arguments. This will force us to change the API (and all drivers' functions) every time some new parameters are added. This is part 1/2 of the fix, as suggested in [1]: - First simplify the code by always providing a pointer to all params (indir, key and func); the fact that some of them may be NULL seems like a weird historic thing or a premature optimization. It will simplify the drivers if all pointers are always present. - Then make the functions take a dev pointer, and a pointer to a single struct wrapping all arguments. The set_* should also take an extack. Link: https://lore.kernel.org/netdev/20231121152906.2dd5f487@kernel.org/ [1] Suggested-by: Jakub Kicinski <kuba@kernel.org> Suggested-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com> Link: https://lore.kernel.org/r/20231213003321.605376-2-ahmed.zaki@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-13 22:07:16 -08:00
Heng Qi	6208799553	virtio-net: support rx netdim By comparing the traffic information in the complete napi processes, let the virtio-net driver automatically adjust the coalescing moderation parameters of each receive queue. Signed-off-by: Heng Qi <hengqi@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-13 12:49:05 +00:00
Heng Qi	1db43c0818	virtio-net: extract virtqueue coalescig cmd for reuse Extract commands to set virtqueue coalescing parameters for reuse by ethtool -Q, vq resize and netdim. Signed-off-by: Heng Qi <hengqi@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-13 12:49:05 +00:00
Heng Qi	d7180080dd	virtio-net: separate rx/tx coalescing moderation cmds This patch separates the rx and tx global coalescing moderation commands to support netdim switches in subsequent patches. Signed-off-by: Heng Qi <hengqi@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-13 12:49:04 +00:00
Heng Qi	7949c06ad9	virtio-net: returns whether napi is complete rx netdim needs to count the traffic during a complete napi process, and start updating and comparing samples to make decisions after the napi ends. Let virtqueue_napi_complete() return true if napi is done, otherwise vice versa. Signed-off-by: Heng Qi <hengqi@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-13 12:49:04 +00:00
Eric Dumazet	61217d8f63	virtio_net: use u64_stats_t infra to avoid data-races syzbot reported a data-race in virtnet_poll / virtnet_stats [1] u64_stats_t infra has very nice accessors that must be used to avoid potential load-store tearing. [1] BUG: KCSAN: data-race in virtnet_poll / virtnet_stats read-write to 0xffff88810271b1a0 of 8 bytes by interrupt on cpu 0: virtnet_receive drivers/net/virtio_net.c:2102 [inline] virtnet_poll+0x6c8/0xb40 drivers/net/virtio_net.c:2148 __napi_poll+0x60/0x3b0 net/core/dev.c:6527 napi_poll net/core/dev.c:6594 [inline] net_rx_action+0x32b/0x750 net/core/dev.c:6727 __do_softirq+0xc1/0x265 kernel/softirq.c:553 invoke_softirq kernel/softirq.c:427 [inline] __irq_exit_rcu kernel/softirq.c:632 [inline] irq_exit_rcu+0x3b/0x90 kernel/softirq.c:644 common_interrupt+0x7f/0x90 arch/x86/kernel/irq.c:247 asm_common_interrupt+0x26/0x40 arch/x86/include/asm/idtentry.h:636 __sanitizer_cov_trace_const_cmp8+0x0/0x80 kernel/kcov.c:306 jbd2_write_access_granted fs/jbd2/transaction.c:1174 [inline] jbd2_journal_get_write_access+0x94/0x1c0 fs/jbd2/transaction.c:1239 __ext4_journal_get_write_access+0x154/0x3f0 fs/ext4/ext4_jbd2.c:241 ext4_reserve_inode_write+0x14e/0x200 fs/ext4/inode.c:5745 __ext4_mark_inode_dirty+0x8e/0x440 fs/ext4/inode.c:5919 ext4_evict_inode+0xaf0/0xdc0 fs/ext4/inode.c:299 evict+0x1aa/0x410 fs/inode.c:664 iput_final fs/inode.c:1775 [inline] iput+0x42c/0x5b0 fs/inode.c:1801 do_unlinkat+0x2b9/0x4f0 fs/namei.c:4405 __do_sys_unlink fs/namei.c:4446 [inline] __se_sys_unlink fs/namei.c:4444 [inline] __x64_sys_unlink+0x30/0x40 fs/namei.c:4444 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd read to 0xffff88810271b1a0 of 8 bytes by task 2814 on cpu 1: virtnet_stats+0x1b3/0x340 drivers/net/virtio_net.c:2564 dev_get_stats+0x6d/0x860 net/core/dev.c:10511 rtnl_fill_stats+0x45/0x320 net/core/rtnetlink.c:1261 rtnl_fill_ifinfo+0xd0e/0x1120 net/core/rtnetlink.c:1867 rtnl_dump_ifinfo+0x7f9/0xc20 net/core/rtnetlink.c:2240 netlink_dump+0x390/0x720 net/netlink/af_netlink.c:2266 netlink_recvmsg+0x425/0x780 net/netlink/af_netlink.c:1992 sock_recvmsg_nosec net/socket.c:1027 [inline] sock_recvmsg net/socket.c:1049 [inline] ____sys_recvmsg+0x156/0x310 net/socket.c:2760 ___sys_recvmsg net/socket.c:2802 [inline] __sys_recvmsg+0x1ea/0x270 net/socket.c:2832 __do_sys_recvmsg net/socket.c:2842 [inline] __se_sys_recvmsg net/socket.c:2839 [inline] __x64_sys_recvmsg+0x46/0x50 net/socket.c:2839 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd value changed: 0x000000000045c334 -> 0x000000000045c376 Fixes: `3fa2a1df90` ("virtio-net: per cpu 64 bit stats (v2)") Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-10-27 10:58:15 +01:00
Jakub Kicinski	041c3466f3	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR. net/mac80211/key.c `02e0e426a2` ("wifi: mac80211: fix error path key leak") `2a8b665e6b` ("wifi: mac80211: remove key_mtx") `7d6904bf26` ("Merge wireless into wireless-next") https://lore.kernel.org/all/20231012113648.46eea5ec@canb.auug.org.au/ Adjacent changes: drivers/net/ethernet/ti/Kconfig `a602ee3176` ("net: ethernet: ti: Fix mixed module-builtin object") `98bdeae950` ("net: cpmac: remove driver to prepare for platform removal") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-10-19 13:29:01 -07:00
Xuan Zhuo	5720c43d52	virtio_net: fix the missing of the dma cpu sync Commit `295525e29a` ("virtio_net: merge dma operations when filling mergeable buffers") unmaps the buffer with DMA_ATTR_SKIP_CPU_SYNC when the dma->ref is zero. We do that with DMA_ATTR_SKIP_CPU_SYNC, because we do not want to do the sync for the entire page_frag. But that misses the sync for the current area. This patch does cpu sync regardless of whether the ref is zero or not. Fixes: `295525e29a` ("virtio_net: merge dma operations when filling mergeable buffers") Reported-by: Michael Roth <michael.roth@amd.com> Closes: http://lore.kernel.org/all/20230926130451.axgodaa6tvwqs3ut@amd.com Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2023-10-15 11:49:57 -07:00
Heng Qi	c4e33cf261	virtio-net: a tiny comment update Update a comment because virtio-net now supports both VIRTIO_NET_F_NOTF_COAL and VIRTIO_NET_F_VQ_NOTF_COAL. Signed-off-by: Heng Qi <hengqi@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-10-11 07:13:50 +01:00
Heng Qi	f61fe5f081	virtio-net: fix the vq coalescing setting for vq resize According to the definition of virtqueue coalescing spec[1]: Upon disabling and re-enabling a transmit virtqueue, the device MUST set the coalescing parameters of the virtqueue to those configured through the VIRTIO_NET_CTRL_NOTF_COAL_TX_SET command, or, if the driver did not set any TX coalescing parameters, to 0. Upon disabling and re-enabling a receive virtqueue, the device MUST set the coalescing parameters of the virtqueue to those configured through the VIRTIO_NET_CTRL_NOTF_COAL_RX_SET command, or, if the driver did not set any RX coalescing parameters, to 0. We need to add this setting for vq resize (ethtool -G) where vq_reset happens. [1] https://lists.oasis-open.org/archives/virtio-dev/202303/msg00415.html Fixes: `394bd87764` ("virtio_net: support per queue interrupt coalesce command") Cc: Gavin Li <gavinl@nvidia.com> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-10-11 07:13:50 +01:00
Heng Qi	bfb2b36091	virtio-net: fix per queue coalescing parameter setting When the user sets a non-zero coalescing parameter to 0 for a specific virtqueue, it does not work as expected, so let's fix this. Fixes: `394bd87764` ("virtio_net: support per queue interrupt coalesce command") Reported-by: Xiaoming Zhao <zxm377917@alibaba-inc.com> Cc: Gavin Li <gavinl@nvidia.com> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-10-11 07:13:50 +01:00
Heng Qi	e9420838ab	virtio-net: consistently save parameters for per-queue When using .set_coalesce interface to set all queue coalescing parameters, we need to update both per-queue and global save values. Fixes: `394bd87764` ("virtio_net: support per queue interrupt coalesce command") Cc: Gavin Li <gavinl@nvidia.com> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-10-11 07:13:50 +01:00
Heng Qi	134674c187	virtio-net: fix mismatch of getting tx-frames Since virtio-net allows switching napi_tx for per txq, we have to get the specific txq's result now. Fixes: `394bd87764` ("virtio_net: support per queue interrupt coalesce command") Cc: Gavin Li <gavinl@nvidia.com> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-10-11 07:13:50 +01:00
Heng Qi	3014a0d548	virtio-net: initially change the value of tx-frames Background: 1. Commit `0c465be183` ("virtio_net: ethtool tx napi configuration") uses tx-frames to toggle napi_tx (0 off and 1 on) if notification coalescing is not supported. 2. Commit `31c03aef9b` ("virtio_net: enable napi_tx by default") enables napi_tx for all txqs by default. Status: When virtio-net supports notification coalescing, after initialization, tx-frames is 0 and napi_tx is true. Problem: When the user only wants to set rx coalescing params using ethtool -C eth0 rx-usecs 10, or ethtool -Q eth0 queue_mask 0x1 -C rx-usecs 10, these cmds will carry tx-frames as 0, causing the napi_tx switching condition is satisfied. Then the user gets: netlink error: Device or resource busy. The same happens when trying to set rx-frames, adaptive_rx, adaptive_tx... How to fix: When notification coalescing feature is negotiated, initially make the value of tx-frames to be consistent with napi_tx. For compatibility with the past, it is still supported to use tx-frames to toggle napi_tx. Reported-by: Xiaoming Zhao <zxm377917@alibaba-inc.com> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-10-11 07:13:50 +01:00
Eric Dumazet	d12a26b74f	virtio_net: avoid data-races on dev->stats fields Use DEV_STATS_INC() and DEV_STATS_READ() which provide atomicity on paths that can be used concurrently. Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-10-01 16:33:01 +01:00
Linus Torvalds	e4f1b8202f	virtio: features a small pull request this time around, mostly because the vduse network got postponed to next relase so we can be sure we got the security store right. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmT1BMAPHG1zdEByZWRo YXQuY29tAAoJECgfDbjSjVRpYJUH+QHNhfn0JC/yE1IySwDwpmdgr73aaGik1LgV ObHi48ucRMtxB+QpXLjPWAlQhVVzZv1wBK+Up9QxW8e9USJrSeI/MWfoHtXOFnGe 1JdmNr+XQM/uDngZ+mjI4ZUwRkA61iOcTR7gEDdfBUOr+Yl6R7Na/+kKtTDiDMfy O8bOCLYVyJNiny2eSMmXH0mb4oPplkne4PzW4i/+ssKNoHlBmUIcx0jqj/qUVpSR ozr0SpyhlXKSEQGAtNxwR4PONeMDOOdkRBhxHW5N5QgnP9P7HQ57Ar39Vz7+Kc0i 6vO2g1gpYV1naQr9BCg8hIF9r68rjgi4IOSghmfpWWUL0yNURtU= =z/Df -----END PGP SIGNATURE----- Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost Pull virtio updates from Michael Tsirkin: "A small pull request this time around, mostly because the vduse network got postponed to next relase so we can be sure we got the security store right" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio_ring: fix avail_wrap_counter in virtqueue_add_packed virtio_vdpa: build affinity masks conditionally virtio_net: merge dma operations when filling mergeable buffers virtio_ring: introduce dma sync api for virtqueue virtio_ring: introduce dma map api for virtqueue virtio_ring: introduce virtqueue_reset() virtio_ring: separate the logic of reset/enable from virtqueue_resize virtio_ring: correct the expression of the description of virtqueue_resize() virtio_ring: skip unmap for premapped virtio_ring: introduce virtqueue_dma_dev() virtio_ring: support add premapped buf virtio_ring: introduce virtqueue_set_dma_premapped() virtio_ring: put mapping error check in vring_map_one_sg virtio_ring: check use_dma_api before unmap desc for indirect vdpa_sim: offer VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK vdpa: add get_backend_features vdpa operation vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature vdpa: add VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK flag vdpa/mlx5: Remove unused function declarations	2023-09-04 10:43:44 -07:00
Xuan Zhuo	295525e29a	virtio_net: merge dma operations when filling mergeable buffers Currently, the virtio core will perform a dma operation for each buffer. Although, the same page may be operated multiple times. This patch, the driver does the dma operation and manages the dma address based the feature premapped of virtio core. This way, we can perform only one dma operation for the pages of the alloc frag. This is beneficial for the iommu device. kernel command line: intel_iommu=on iommu.passthrough=0 \| strict=0 \| strict=1 Before \| 775496pps \| 428614pps After \| 1109316pps \| 742853pps Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Message-Id: <20230810123057.43407-13-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-09-03 18:10:24 -04:00
Feng Liu	dae64749db	virtio_net: Introduce skb_vnet_common_hdr to avoid typecasting The virtio_net driver currently deals with different versions and types of virtio net headers, such as virtio_net_hdr_mrg_rxbuf, virtio_net_hdr_v1_hash, etc. Due to these variations, the code relies on multiple type casts to convert memory between different structures, potentially leading to bugs when there are changes in these structures. Introduces the "struct skb_vnet_common_hdr" as a unifying header structure using a union. With this approach, various virtio net header structures can be converted by accessing different members of this structure, thus eliminating the need for type casting and reducing the risk of potential bugs. For example following code: static struct sk_buff page_to_skb(struct virtnet_info vi, struct receive_queue rq, struct page page, unsigned int offset, unsigned int len, unsigned int truesize, unsigned int headroom) { [...] struct virtio_net_hdr_mrg_rxbuf hdr; [...] hdr_len = vi->hdr_len; [...] ok: hdr = skb_vnet_hdr(skb); memcpy(hdr, hdr_p, hdr_len); [...] } When VIRTIO_NET_F_HASH_REPORT feature is enabled, hdr_len = 20 But the sizeof(hdr) is 12, memcpy(hdr, hdr_p, hdr_len); will copy 20 bytes to the hdr, which make a potential risk of bug. And this risk can be avoided by introducing struct skb_vnet_common_hdr. Change log v1->v2 feedback from Willem de Bruijn <willemdebruijn.kernel@gmail.com> feedback from Simon Horman <horms@kernel.org> 1. change to use net-next tree. 2. move skb_vnet_common_hdr inside kernel file instead of the UAPI header. v2->v3 feedback from Willem de Bruijn <willemdebruijn.kernel@gmail.com> 1. fix typo in commit message. 2. add original struct virtio_net_hdr into union 3. remove virtio_net_hdr_mrg_rxbuf variable in receive_buf; Signed-off-by: Feng Liu <feliu@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-08-23 09:40:18 +01:00
Jakub Kicinski	7ff57803d2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/ethernet/sfc/tc.c `fa165e1949` ("sfc: don't unregister flow_indr if it was never registered") `3bf969e88a` ("sfc: add MAE table machinery for conntrack table") https://lore.kernel.org/all/20230818112159.7430e9b4@canb.auug.org.au/ No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-18 12:44:56 -07:00
Linus Torvalds	0e8860d212	Including fixes from ipsec and netfilter. No known outstanding regressions. Fixes to fixes: - virtio-net: set queues after driver_ok, avoid a potential race added by recent fix - Revert "vlan: Fix VLAN 0 memory leak", it may lead to a warning when VLAN 0 is registered explicitly - nf_tables: - fix false-positive lockdep splat in recent fixes - don't fail inserts if duplicate has expired (fix test failures) - fix races between garbage collection and netns dismantle Current release - new code bugs: - mlx5: Fix mlx5_cmd_update_root_ft() error flow Previous releases - regressions: - phy: fix IRQ-based wake-on-lan over hibernate / power off Previous releases - always broken: - sock: fix misuse of sk_under_memory_pressure() preventing system from exiting global TCP memory pressure if a single cgroup is under pressure - fix the RTO timer retransmitting skb every 1ms if linear option is enabled - af_key: fix sadb_x_filter validation, amment netlink policy - ipsec: fix slab-use-after-free in decode_session6() - macb: in ZynqMP resume always configure PS GTR for non-wakeup source Misc: - netfilter: set default timeout to 3 secs for sctp shutdown send and recv state (from 300ms), align with protocol timers Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmTemA4ACgkQMUZtbf5S IrtCThAAj+t35QM5BgGZowmrx9U4yF+kacDkdPztxlT8a/b+famrTtnZJ8USW+PF VCk3Eu8JXheuyAOMArHyM84/crS6wim6mzGcXaucusA3981PFzoqdgCLLf9emAJ2 j9vzKrnHBtdd5fj8Exwq70KN4CzXyrzRgqwr2EXBK9lH59HjX0+J7o+trbDxNmFK RZJE2oDCqf939iRGG3PhJryKYBmrQaMtdonNpSU5PiiRT0HnVYcEtdWcOXK7d53D onpoaPdawcsqsns5c5Qj01E1OdyM8X54BEGkl/S4FmSw5jF9Bp6btmTcxYYtdb7E M3CeYROZ0Kt8KcKKje/o1AzdGqWq8Hnxfwy+2WulZAHMucshg0JPm6Ev74WRondw NGYriKJSdORSO8idK9K/i7pnjZXYr9gU50lpPUFU+QzSdd+zv+U11arjAodwI9Wi pW+dFi3UR7J01LidaxclvHmWnZ7d5sSzE2khpqb0xd0+PagRGesl8qnKyoDJNS1P IHsOrRh9aXLzEZjud/rVG+sUobQvc1oiHW+hvbJ04GLKoli9U5poGT2fcaa4O67M T7JcN5oGDF+PIHJKgTEN7pfX2epY33gmofKUhbt/OPOqnvZOVbTu7/ojjuJZ8Lc5 SF8AvTe+lECcX8Htjq30PoVfai+FT6AhnZzK0H9K4HMfUB9O32Q= =Ze13 -----END PGP SIGNATURE----- Merge tag 'net-6.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from ipsec and netfilter. No known outstanding regressions. Fixes to fixes: - virtio-net: set queues after driver_ok, avoid a potential race added by recent fix - Revert "vlan: Fix VLAN 0 memory leak", it may lead to a warning when VLAN 0 is registered explicitly - nf_tables: - fix false-positive lockdep splat in recent fixes - don't fail inserts if duplicate has expired (fix test failures) - fix races between garbage collection and netns dismantle Current release - new code bugs: - mlx5: Fix mlx5_cmd_update_root_ft() error flow Previous releases - regressions: - phy: fix IRQ-based wake-on-lan over hibernate / power off Previous releases - always broken: - sock: fix misuse of sk_under_memory_pressure() preventing system from exiting global TCP memory pressure if a single cgroup is under pressure - fix the RTO timer retransmitting skb every 1ms if linear option is enabled - af_key: fix sadb_x_filter validation, amment netlink policy - ipsec: fix slab-use-after-free in decode_session6() - macb: in ZynqMP resume always configure PS GTR for non-wakeup source Misc: - netfilter: set default timeout to 3 secs for sctp shutdown send and recv state (from 300ms), align with protocol timers" * tag 'net-6.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (49 commits) ice: Block switchdev mode when ADQ is active and vice versa qede: fix firmware halt over suspend and resume net: do not allow gso_size to be set to GSO_BY_FRAGS sock: Fix misuse of sk_under_memory_pressure() sfc: don't fail probe if MAE/TC setup fails sfc: don't unregister flow_indr if it was never registered net: dsa: mv88e6xxx: Wait for EEPROM done before HW reset net/mlx5: Fix mlx5_cmd_update_root_ft() error flow net/mlx5e: XDP, Fix fifo overrun on XDP_REDIRECT i40e: fix misleading debug logs iavf: fix FDIR rule fields masks validation ipv6: fix indentation of a config attribute mailmap: add entries for Simon Horman broadcom: b44: Use b44_writephy() return value net: openvswitch: reject negative ifindex team: Fix incorrect deletion of ETH_P_8021AD protocol vid from slaves net: phy: broadcom: stub c45 read/write for 54810 netfilter: nft_dynset: disallow object maps netfilter: nf_tables: GC transaction race with netns dismantle netfilter: nf_tables: fix GC transaction races with netns and netlink event exit path ...	2023-08-18 06:52:23 +02:00
Jason Wang	51b813176f	virtio-net: set queues after driver_ok Commit `25266128fe` ("virtio-net: fix race between set queues and probe") tries to fix the race between set queues and probe by calling _virtnet_set_queues() before DRIVER_OK is set. This violates virtio spec. Fixing this by setting queues after virtio_device_ready(). Note that rtnl needs to be held for userspace requests to change the number of queues. So we are serialized in this way. Fixes: `25266128fe` ("virtio-net: fix race between set queues and probe") Reported-by: Dragos Tatulea <dtatulea@nvidia.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-08-11 09:50:14 +01:00

1 2 3 4 5 ...

716 commits