Commit eaf9f17b86 ("wifi: ath12k: relocate ath12k_dp_pdev_pre_alloc()
call") moves ath12k_dp_pdev_pre_alloc() from ath12k_core_start() to
ath12k_mac_allocate(), resulting in ath12k_mac_flush() failure in
recovery scenarios:
[ 6849.684104] ath12k_pci 0000:04:00.0: pdev 0 successfully recovered
[ 6854.907320] ath12k_pci 0000:04:00.0: failed to flush transmit queue 0
[ 6860.027353] ath12k_pci 0000:04:00.0: failed to flush transmit queue 0
[ 6865.143385] ath12k_pci 0000:04:00.0: failed to flush transmit queue 0
This is because, with ath12k_dp_pdev_pre_alloc() moved to ath12k_mac_allocate(),
dp->num_tx_pending is not reset due to ATH12K_FLAG_REGISTERED set in
recovery scenarios.
So a possible fix would be to reset that counter at some proper point,
just like the old design. But considering that the counter tracks number
of packets pending to be freed or returned to mac80211, forcefully reset
it might make it hard to expose some real issues. For example if somehow
ath12k fails to free/return some TX packets, we don't know that because
no warnings any more.
That is to say we should not reset that counter during recovery (which is
already done due to above commit), instead should decrease it each time
a packet is freed/returned. Currently almost each related function has
this logic implemented, except ath12k_dp_cc_cleanup(). So add the same
there to fix this issue.
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0-03427-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.15378.4
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1
Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240426015434.94840-1-quic_bqiang@quicinc.com
Move the data path Tx and Rx descriptor primary page table CMEM
configuration into a helper function. This will make the code more
scalable for configuring partner device in support of multi-device MLO.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.1.1-00188-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240411102226.4045323-5-quic_periyasa@quicinc.com
Currently, the Rx descriptor is placed before the Tx descriptor in the
primary page table of the hardware cookie conversion configuration. The
Tx and Rx descriptor offsets are implicitly hardcoded. To allow for easy
displacement of Tx and Rx descriptors, introduce Tx and Rx offset based
cookie conversion initializationi. Additionally, should consider
validating the respective offset ranges while retrieving the Tx and Rx
descriptors. This change will be utilize by the next patch in the series.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.1.1-00188-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240411102226.4045323-3-quic_periyasa@quicinc.com
Currently, in the Rx data path cookie conversion initialization procedure,
the primary page table index (ppt_idx) is computed within the secondary
page table iteration. However this is invariant, and hence the ppt_idx
should be calculated outside of the iteration to avoid repeated execution
of the statement.
Found in code review.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.1.1-00188-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240411102226.4045323-2-quic_periyasa@quicinc.com
Rename the "ATH12k" word to "ATH12K" for consistent capitalization in the
word.
Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240405144524.1157122-1-quic_periyasa@quicinc.com
When a packet arrives in Rx rings, the RX descriptor moves from the used
list to the free list. Then, the rxdma ring gets replenished, where the Rx
descriptor again moves from the free list to the used list. At the end, the
descriptor came to the used list with unnecessary list movement. The
descriptor used list is maintained in the Rxdma ring structure, which
creates lock contention for the list operations (add, delete) in the Rx
data path. Optimize the Rx data path by removing the used list from the
common Rxdma ring and maintain as a local variable in the Rx ring handler
itself, which avoid lock contention. Now, to find the used list descriptor
during descriptor cleanup, we need to check the in_use flag for each Rx
descriptor.
This is a simple UDP UL throughput test case results on x86+NUC device
with QCN9274 card, which clearly shows 8% to 12% improvement in the CPU
idle for the same ingress traffic.
Before:
Average: CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
Average: all 0.24 0.00 12.54 0.08 0.00 23.33 0.00 0.00 0.00 63.81
After:
Average: CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
Average: all 0.34 0.00 4.60 0.00 0.00 19.59 0.00 0.00 0.00 75.47
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1
Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240320010615.91331-3-quic_periyasa@quicinc.com
Most of the RX descriptors fields are currently not used in the
ath12k driver. Hence add support to selectively subscribe to the
required quad words (64 bits) within msdu_end and mpdu_start of
rx_desc.
Add compact rx_desc structures and configure the bit mask for Rx TLVs
(msdu_end, mpdu_start, mpdu_end) via registers. With these registers
SW can configure to DMA the partial TLV struct to Rx buffer.
Each TLV type has its own register to configure the mask value.
The mask value configured in register will indicate if a particular
QWORD has to be written to rx buffer or not i.e., if Nth bit is enabled
in the mask Nth QWORD will be written and it will not be written if the
bit is disabled in mask. While 0th bit indicates whether TLV tag will be
written or not.
Advantages of Qword subscription of TLVs
- Avoid multiple cache-line misses as the all the required fields
of the TLV are within 128 bytes.
- Memory optimization as TLVs + DATA + SHINFO can fit in 2k buffer
even for 64 bit kernel.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.1.1-00188-QCAHKSWPL_SILICONZ-1
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
Signed-off-by: Karthikeyan Kathirvel <quic_kathirve@quicinc.com>
Co-developed-by: Raj Kumar Bhagat <quic_rajkbhag@quicinc.com>
Signed-off-by: Raj Kumar Bhagat <quic_rajkbhag@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240129065724.2310207-10-quic_rajkbhag@quicinc.com
With word mask subscription support, the rx_desc structure will
change. The fields in this structure rx_desc will be reduced to only
the required fields. To make word mask subscription changes compatible
with the older firmware version (firmware that does not support word
mask subscription), two different structures of rx_desc will be
required for the same hardware.
The hardware param hal_desc_sz value cannot be constant for the same
hardware. It depends on the size of rx_desc structure which may
change based on firmware capability to support word mask subscription.
Hence, remove hal_desc_sz from hardware param and add hal_rx_ops
to get the size of rx_desc in run time.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.1.1-00188-QCAHKSWPL_SILICONZ-1
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
Signed-off-by: Raj Kumar Bhagat <quic_rajkbhag@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240129065724.2310207-9-quic_rajkbhag@quicinc.com
Currently Rxdma replenish require HW conversion argument which is
unnecessary argument since ath12k driver configures the Rxdma only
in HW conversion. To optimize the rx data path per packet, avoid
the explicit unnecessary argument and condition check in the rx
replenish.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.1.1-00125-QCAHKSWPL_SILICONZ-1
Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20231111043934.20485-4-quic_periyasa@quicinc.com
Currently all Rxdma replenish callers pass the same return
buffer manager id argument, so make it implicitly. To
optimize the rx data path per packet, avoid the explicit
unnecessary argument in Rxdma replenish function.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.1.1-00125-QCAHKSWPL_SILICONZ-1
Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20231111043934.20485-3-quic_periyasa@quicinc.com
Currently all Rxdma replenish callers pass zero for the mac id
argument, so make it as zero implicitly. To optimize the rx
data path per packet, avoid the explicit unnecessary argument
in Rxdma replenish function.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.1.1-00125-QCAHKSWPL_SILICONZ-1
Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20231111043934.20485-2-quic_periyasa@quicinc.com
When max virtual ap interfaces are configured in all the bands with
ACS and hostapd restart is done every 60s, a crash is observed at
random times.
In the above scenario, a fragmented packet is received for self peer,
for which rx_tid and rx_frags are not initialized in datapath.
While handling this fragment, crash is observed as the rx_frag list
is uninitialized and when we walk in ath12k_dp_rx_h_sort_frags,
skb null leads to exception.
To address this, before processing received fragments we check
dp_setup_done flag is set to ensure that peer has completed its
dp peer setup for fragment queue, else ignore processing the
fragments.
Call trace:
PC points to "ath12k_dp_process_rx_err+0x4e8/0xfcc [ath12k]"
LR points to "ath12k_dp_process_rx_err+0x480/0xfcc [ath12k]".
The Backtrace obtained is as follows:
ath12k_dp_process_rx_err+0x4e8/0xfcc [ath12k]
ath12k_dp_service_srng+0x78/0x260 [ath12k]
ath12k_pci_write32+0x990/0xb0c [ath12k]
__napi_poll+0x30/0xa4
net_rx_action+0x118/0x270
__do_softirq+0x10c/0x244
irq_exit+0x64/0xb4
__handle_domain_irq+0x88/0xac
gic_handle_irq+0x74/0xbc
el1_irq+0xf0/0x1c0
arch_cpu_idle+0x10/0x18
do_idle+0x104/0x248
cpu_startup_entry+0x20/0x64
rest_init+0xd0/0xdc
arch_call_rest_init+0xc/0x14
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1
Signed-off-by: Harshitha Prem <quic_hprem@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20230821130343.29495-2-quic_hprem@quicinc.com
Currently when ath12k_dp_cc_desc_init() is called we allocate
memory to rx_descs and tx_descs. In ath12k_dp_cc_cleanup(), during
descriptor cleanup rx_descs and tx_descs memory is not freed.
This is cause of memory leak. These allocated memory should be
freed in ath12k_dp_cc_cleanup.
In ath12k_dp_cc_desc_init(), we can save base address of rx_descs
and tx_descs. In ath12k_dp_cc_cleanup(), we can free rx_descs and
tx_descs memory using their base address.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1
Signed-off-by: Rajat Soni <quic_rajson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20230718053510.30894-1-quic_rajson@quicinc.com
Sparse warns:
drivers/net/wireless/ath/ath12k/dp.c:1471:15: warning: memset with byte count of 278528
There's no need to use memset() here, instead call dma_alloc_coherent() with __GFP_ZERO.
While at it, remove an extra line before the error handler.
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20230222164014.860-1-kvalo@kernel.org
ath12k is a new mac80211 driver for Qualcomm Wi-Fi 7 devices, first
supporting QCN9274 and WCN7850 PCI devices. QCN9274 supports both AP
and station; WCN7850 supports only station mode. Monitor mode is not
(yet) supported. Only PCI bus devices are supported.
ath12k is forked from an earlier version of ath11k. It was simpler to
have a "clean start" for the new generation and not try to share the
code with ath11k. This makes maintenance easier and avoids major
changes in ath11k, which would have significantly increased the risk
of regressions in existing setups.
ath12k uses le32 and cpu_to_le32() macros to handle endian
conversions, instead of using the firmware byte swap feature utilized
by ath11k. There is only one kernel module, named ath12k.ko.
Currently ath12k only supports HE mode (IEEE 802.11ax) or older, but
work is ongoing to add EHT mode (IEEE 802.11be) support.
The size of the driver is ~41 kLOC and 45 files. To make the review
easier, this initial version of ath12k does not support Device Tree,
debugfs or any other extra features. Those will be added later, after
ath12k is accepted to upstream.
The driver is build tested by Intel's kernel test robot with both GCC
and Clang. Sparse reports no warnings. The driver is mostly free of
checkpatch warnings, albeit few of the warnings are omitted on
purpose, list of them here:
https://github.com/qca/qca-swiss-army-knife/blob/master/tools/scripts/ath12k/ath12k-check#L52
The driver has had multiple authors who are listed in alphabetical
order below.
Co-developed-by: Balamurugan Selvarajan <quic_bselvara@quicinc.com>
Signed-off-by: Balamurugan Selvarajan <quic_bselvara@quicinc.com>
Co-developed-by: Baochen Qiang <quic_bqiang@quicinc.com>
Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com>
Co-developed-by: Bhagavathi Perumal S <quic_bperumal@quicinc.com>
Signed-off-by: Bhagavathi Perumal S <quic_bperumal@quicinc.com>
Co-developed-by: Carl Huang <quic_cjhuang@quicinc.com>
Signed-off-by: Carl Huang <quic_cjhuang@quicinc.com>
Co-developed-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Co-developed-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
Co-developed-by: P Praneesh <quic_ppranees@quicinc.com>
Signed-off-by: P Praneesh <quic_ppranees@quicinc.com>
Co-developed-by: Pradeep Kumar Chitrapu <quic_pradeepc@quicinc.com>
Signed-off-by: Pradeep Kumar Chitrapu <quic_pradeepc@quicinc.com>
Co-developed-by: Ramya Gnanasekar <quic_rgnanase@quicinc.com>
Signed-off-by: Ramya Gnanasekar <quic_rgnanase@quicinc.com>
Co-developed-by: Sriram R <quic_srirrama@quicinc.com>
Signed-off-by: Sriram R <quic_srirrama@quicinc.com>
Co-developed-by: Vasanthakumar Thiagarajan <quic_vthiagar@quicinc.com>
Signed-off-by: Vasanthakumar Thiagarajan <quic_vthiagar@quicinc.com>
Co-developed-by: Wen Gong <quic_wgong@quicinc.com>
Signed-off-by: Wen Gong <quic_wgong@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>