The default credit size is 1792 bytes, but the IP mtu is 1500 bytes,
then it has about 290 bytes's waste for each data packet on sdio
transfer path for TX bundle, it will reduce the transmission utilization
ratio for data packet.
This patch enable the small credit size in firmware, firmware will use
the new credit size 1556 bytes, it will increase the transmission
utilization ratio for data packet on TX patch. It results in significant
performance improvement on TX path.
This patch only effect sdio chip, it will not effect PCI, SNOC etc.
Tested with QCA6174 SDIO with firmware
WLAN.RMH.4.4.1-00017-QCARMSWP-1.
Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200410061400.14231-3-wgong@codeaurora.org
The transmission utilization ratio for sdio bus for small packet is
slow, because the space and time cost for sdio bus is same for large
length packet and small length packet. So the speed of data for large
length packet is higher than small length.
Test result of different length of data:
data packet(byte) cost time(us) calculated rate(Mbps)
256 28 73
512 33 124
1024 35 234
1792 45 318
14336 168 682
28672 333 688
57344 660 695
This patch change the TX packet from single packet to a large length
bundle packet, max size is 32, it results in significant performance
improvement on TX path.
Also there's a fourth thread "ath10k_tx_complete_wq" added to ath10k as it
improves TCP RX throughput (values in Mbps):
TCP-RX TCP-TX UDP-RX UDP-TX
use workqueue_tx_complete 423 357 448 412
change it to ar->workqueue 410 360 449 414
change it to ar->workqueue_aux 405 339 446 401
This patch only effect sdio chip, it will not effect PCI, SNOC etc.
It only enable bundle for sdio chip.
Tested with QCA6174 SDIO with firmware
WLAN.RMH.4.4.1-00017-QCARMSWP-1.
Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200410061400.14231-2-wgong@codeaurora.org
For sdio chip, it is high latency bus, all the TX packet's content will
be tranferred from HOST memory to firmware memory via sdio bus, then it
need much more memory in firmware than low latency bus chip, for low
latency chip, such as PCI-E, it only need to transfer the TX descriptor
via PCI-E bus to firmware memory. For sdio chip, reduce the complexity of
TX logic will help TX efficiency since its memory is limited, and it will
reduce the TX circle's time of each packet and then firmware will have more
memory for TX since TX complete also need memeory.
This patch disable TX complete indication from firmware for htt data
packet, it will not have TX complete indication from firmware to ath10k.
It will cut the cost of bus bandwidth of TX complete and make the TX
logic of firmware simpler, it results in significant performance
improvement on TX path.
Udp TX throughout is 130Mbps without this patch, and it arrives
400Mbps with this patch.
The downside of this patch is the command "iw wlan0 station dump" will
show 0 for "tx retries" and "tx failed" since all tx packet's status
is success.
This patch only effect sdio chip, it will not effect PCI, SNOC etc.
Tested with QCA6174 SDIO with firmware
WLAN.RMH.4.4.1-00017-QCARMSWPZ-1
Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200212080415.31265-2-wgong@codeaurora.org
For max bundle size 32, the bundle mask is not same with 8/16.
Change it to match the max bundle size of htc. Otherwise it
will not match with firmware, for example, when bundle count
is 17, then flags of ath10k_htc_hdr is 0x4, if without this
patch, it will be considered as non-bundled packet because it
does not have mask 0xF0, then trigger error message later:
payload length 56747 exceeds max htc length: 4088.
htc->max_msgs_per_htc_bundle is the min value of
HTC_HOST_MAX_MSG_PER_RX_BUNDLE and
msg->ready_ext.max_msgs_per_htc_bundle of ath10k_htc_wait_target,
it will be sent to firmware later in ath10k_htc_start, then
firmware will use it as the final max rx bundle count, in
WLAN.RMH.4.4.1-00029, msg->ready_ext.max_msgs_per_htc_bundle
is 32, it is same with HTC_HOST_MAX_MSG_PER_RX_BUNDLE, so the
final max rx bundle count will be set to 32 in firmware.
This patch only effect sdio chips.
Tested with QCA6174 SDIO with firmware WLAN.RMH.4.4.1-00029.
Signed-off-by: Wen Gong <wgong@codeaurora.org>
Fixes: 224776520e ("ath10k: change max RX bundle size from 8 to 32 for sdio")
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
The max bundle size support by firmware is 32, change it from 8 to 32
will help performance. This results in significant performance
improvement on RX path.
The real max rx bundle is decided in ath10k_htc_wait_target(),
it is the min value of HTC_HOST_MAX_MSG_PER_RX_BUNDLE and the value reported
from firmware. So this change shouldn't cause any regressions with other
hardware supported by ath10k.
Tested with QCA6174 SDIO with firmware WLAN.RMH.4.4.1-00017-QCARMSWPZ-1.
Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Use SPDX identifiers everywhere in ath10k.
Makefile was incorrectly marked in commit b24413180f ("License cleanup: add
SPDX GPL-2.0 license identifier to files with no license"), fix that as well.
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
ath10k maintains common txqs list for all stations. This txq
management can be removed by migrating to mac80211 txq APIs
and let mac80211 handle txqs reordering based on reported airtime.
By doing this, txq fairness maintained in ath10k i.e processing
N frames per txq is removed. By adapting to mac80211 APIs,
ath10k will support mac80211 based airtime fairness algorithm.
Tested on QCA4019 with firmware version 10.4-3.2.1.1-00015
Tested on QCA9984 with firmware version 10.4-3.9.0.1-00005
Tested-by: Venkateswara Naralasetty <vnaralas@codeaurora.org>
Co-developed-by: Rajkumar Manoharan <rmanohar@codeaurora.org>
Signed-off-by: Rajkumar Manoharan <rmanohar@codeaurora.org>
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Without this, when receiving a packet that has this flag set
from firmware, we will read invalid trailer data from the packet,
which will be shown as various errors, e.g. "sdio mbox lookahead
is zero" or "invalid rx packet" or "payload length x exceeds max
htc length".
Co-Developed-by: Niklas Cassel <niklas.cassel@linaro.org>
Signed-off-by: Alagu Sankar <alagusankar@silex-india.com>
Signed-off-by: Niklas Cassel <niklas.cassel@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
The hardcoded values used in ath10k_mac_tx_push_pending and
ath10k_mac_op_wake_tx_queue set an upper limit of how many packets that
can be consumed from the TX queue.
HTC_HOST_MAX_MSG_PER_TX_BUNDLE is a proper name for this constant, as
the value effectively limits the number of messages that can be consumed
in one step. Thus, the value is an upper limit of the number of messages
that can be added to a TX message bundle.
Signed-off-by: Erik Stromdahl <erik.stromdahl@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
This define is only used for RX bundling so it is more descriptive if
RX is added to the define-name.
Signed-off-by: Erik Stromdahl <erik.stromdahl@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
WCN3990 target uses 3 Copy engine(CE1/CE9/CE10) in RX path
and CE 11 for pktlog.
Add data path HTC ep services and PKTLOG services for WCN3990.
Signed-off-by: Govind Singh <govinds@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Added support for extended ready message.
The extended ready message contains the maximum bundle
count supported by SDIO chipsets.
It is transmitted by SDIO chipset only and replaces the
"standard" ready message in this case.
Signed-off-by: Erik Stromdahl <erik.stromdahl@gmail.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
The RX trailer parsing is now capable of parsing lookahead reports.
A lookahead contains the first 4 bytes of the next HTC message
(that will be read in the next SDIO read operation).
Lookaheads are used by the SDIO/mbox HIF layer to determine if
the next message is part of a bundle, which endpoint it belongs
to and how long it is.
Signed-off-by: Erik Stromdahl <erik.stromdahl@gmail.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Changed ath10k_htc_notify_tx_completion and
ath10k_htc_process_trailer from static to non static.
These functions are needed by SDIO/mbox.
Signed-off-by: Erik Stromdahl <erik.stromdahl@gmail.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Simplified transmit credit distribution code somewhat.
Since the WMI control service will get assigned all credits
there is no need for having a credit_allocation array in
struct ath10k_htc.
Signed-off-by: Erik Stromdahl <erik.stromdahl@gmail.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Removed tx_credits_per_max_message and tx_credit_size
from struct ath10k_htc_ep since they are not used
anywhere in the code.
They are just written, never read.
Signed-off-by: Erik Stromdahl <erik.stromdahl@gmail.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
<linux/semaphore.h> is not used anymore, so just remove the include.
Signed-off-by: Chaehyun Lim <chaehyun.lim@gmail.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Fix checkpatch warnings about use of spaces with operators:
spaces preferred around that '*' (ctx:VxV)
This has been recently added to checkpatch.
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Since polling for tx completion is handled whenever target to host
messages are received, removing the unnecessary polling mechanism for
send completion at HTC level.
Reviewed-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Since polling for received messages not supported, remove unused
dl_is_polled.
Reviewed-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Register receive callbacks for every copy engines (CE) separately
instead of having common receive handler. Some of the copy engines
receives different type of messages (i.e HTT/HTC/pktlog) from target.
Hence to service them accordingly, register per copy engine receive
callbacks.
Reviewed-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Register send completion callbacks for every copy engines (CE) separately
instead of having common completion handler. Since some of the copy
engines delivers different type of messages, per-CE callbacks help to
service them differently.
Reviewed-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Export HTC layer tx and rx handlers. This will be used by HIF layer
for per-CE data processing. Instead of callback mechanism, HIF will
call appropriate upper layers API directly.
Reviewed-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
This makes it a lot easier to log and debug
messages if there's more than 1 ath10k device on a
system.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
This is not necessary anymore. There are no more
uncontrolled htc tx entry points.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
The patch removes HTC endpoint tx workers in
favour of direct command submission. This makes a
lot more sense for data path.
mac80211 queues are effectively stopped/woken up
in a more timely fashion preventing build up of
frames. It's possible to push more traffic than
the device/system is able to handle and have no
hiccups or performance degradation with UDP
traffic.
WMI commands will now report errors properly and
possibly block as they actively can wait for tx
credits to become available.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
This will allow higher layers to anticipate and
act upon TX credits renewal. This will be
important for some future rework of WMI command
submission.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
It was possible to submit new HTC commands
after/while HTC stopped. This led to memory
corruption in some rare cases.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
This reduces number of allocations and simplifies
memory managemnt.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Here's a new mac80211 driver for Qualcomm Atheros 802.11ac QCA98xx devices.
A major difference from ath9k is that there's now a firmware and
that's why we had to implement a new driver.
The wiki page for the driver is:
http://wireless.kernel.org/en/users/Drivers/ath10k
The driver has had many authors, they are listed here alphabetically:
Bartosz Markowski <bartosz.markowski@tieto.com>
Janusz Dziedzic <janusz.dziedzic@tieto.com>
Kalle Valo <kvalo@qca.qualcomm.com>
Marek Kwaczynski <marek.kwaczynski@tieto.com>
Marek Puzyniak <marek.puzyniak@tieto.com>
Michal Kazior <michal.kazior@tieto.com>
Sujith Manoharan <c_manoha@qca.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>