When interface stopped while running intensive Rx traffic, the following oops
observed:
[89846.734683] Call trace:
[89846.737117] [<ffffffc00083aa64>] dev_gro_receive+0xac/0x358
[89846.742674] [<ffffffc00083ae94>] napi_gro_receive+0x24/0xa4
[89846.748251] [<ffffffbffc1c2f88>] $x+0xec/0x1f8 [wil6210] wil_netif_rx_any
[89846.753547] [<ffffffbffc1c4830>] $x+0x34/0x54 [wil6210] wil_release_reorder_frame
[89846.758755] [<ffffffbffc1c48ac>] wil_release_reorder_frames+0x5c/0x78 [wil6210]
[89846.766044] [<ffffffbffc1c4bf8>] wil_tid_ampdu_rx_free+0x20/0x48 [wil6210]
[89846.772901] [<ffffffbffc1bedc8>] $x+0x190/0x1e8 [wil6210]
[89846.778285] [<ffffffbffc1c0ed4>] wmi_event_worker+0x230/0x2f8 [wil6210]
[89846.784865] [<ffffffc0000b0bc8>] process_one_work+0x278/0x3fc
[89846.790591] [<ffffffc0000b1218>] worker_thread+0x200/0x330
[89846.796060] [<ffffffc0000b6664>] kthread+0xac/0xb8
[89846.800836] Code: b940c661 f9406a62 8b010041 f9400026 (f8636882)
[89846.807008] ---[ end trace d6fdc17cd27d18f6 ]---
Reason is the following: when removing Rx vring
(wil_netdev_ops.ndo_stop -> wil_stop -> wil_down -> __wil_down -> wil_rx_fini),
Rx interrupt occurs. It trigger Rx NAPI, calling wil_rx_handle() that reaps
(already cleaned) buffer, causing skb referring to garbage memory being set into reorder buffer.
Then, network stack trying to access this buffer and fails.
Prevent Rx NAPI from being scheduled if device going to stop. Bit wil_status_napi_en reflects
NAPI enablement state, check it when triggering Rx NAPI.
Testing shows that check for wil_status_napi_en sometimes gets negative, and new error message
get printed - in this case kernel oops would be observed. Original oops is no more reproducible.
This change requires also changes in the AP flows.
Properly enable/disable NAPI for the AP. Make sure Rx VRING is disabled
when resetting target.
For this, promote __wil_up() and __wil_down() to the module scope, and use it
in the relevant flows.
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
To better reflect real action performed, rename:
s/wil6210_disable_irq/wil_mask_irq/
s/wil6210_enable_irq/wil_unmask_irq/
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
re-use of wmi_ready for both FW ready event and for wmi_call was causing
false "FW not ready" indication in case wmi_call() was invoked while reset
took place.
add wmi_call completion variable instead of re-using wmi_ready.
Signed-off-by: Dedy Lansky <qca_dlansky@qca.qualcomm.com>
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
While handling Rx packet, BACK event arrives and frees tid_ampdu_rx array.
This causes kernel panic while accessing already freed spinlock
The fix is to remove tid_ampdu_rx[]'s spinlock and instead use single
sta's spinlock to guard the whole tid_ampdu_rx array.
Signed-off-by: Dedy Lansky <qca_dlansky@qca.qualcomm.com>
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When CONFIG_DYNAMIC_DEBUG is not defined, print_hex_dump_debug
is mapped directly to print_hex_dump which might cause
printout to exist all the time
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
New module (wil_platform) for handling platform specific tasks
Signed-off-by: Dedy Lansky <qca_dlansky@qca.qualcomm.com>
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
- parentheses, indentation, typos
- seq_puts() instead of seq_printf() with single argument
- sizeof(var) vs. sizeof(type)
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Firmware download implemented but is still experimental feature;
flag controlling it added, no_fw_load. It is true by default,
use no_fw_load=N to activate feature.
Reset flows also got some adjustment for the fw download to work
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When reading 'bf' file on debugfs, query beam forming status from firmware.
Ignore CID's that return error or return all zeros.
Remove obsolete code that used to maintain statistics on per-device basis,
as now it is reported be per-CID and current.
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Fix Copyright headers in all files changed in 2014, to mention 2014
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
expose reading RGF_MAC_MTRL_COUNTER_0 in debugfs
Signed-off-by: Dedy Lansky <qca_dlansky@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
New register area defined in the firmware
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Firmware sets this register with the offset of the firmware trace area
within the peripheral memory region. Critical for the firmware trace
to work
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Use single data source for all information regarding the firmware
memory map. With this change "ucode_xxx" regions disappears since
they are in fact part of larger "upper area" region
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
New hardware release appears; it require some changes to properly support it.
Introduce struct wil_board and "board" attribute in wil6210_priv;
keep hardware variant information in this structure.
fill it on probe(). Used in the reset flow.
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Track number of interrupts and Tx/Rx packets;
expose through debugfs 'info'. Reset upon read.
Used to analyse effectivness of interrupt coalescing and NAPI.
Read twice with some interval like
cat info > /dev/null; sleep 1; cat info
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Expose operational frequency and link info
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
For performance monitoring, trace time intervals when Tx vring
is idle/not idle. Use CPU cycle counter for this, because jiffies is
too rough, and other precise time measurement methods involve
overhead while get_cycles() should be fast.
This used to provide some estimation for percentage when Tx vring
was idle, i.e. when hardware is under-utilized.
Estimation is not precise because of many reasons - CPU frequency scaling,
grt_cycles() may be per core etc. But still, it is good estimation
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Provide 2 files on the debugfs:
- "rxon": write channel (1..4) to open Rx on it, 0 to rxoff
- "tx_mgmt": write binary frame, starting from MAC header
one need to care about turning receiver on/off before/after tx_mgmt
Correct sequence is:
echo $channel > rxon
cat mfmt_frame > tx_mgmt
echo 0 > rxon
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
If scan has not finished in some reasonable time (10sec), interpret it as
if firmware error occurs but was not reported. Firmware should report
scan completion for every scan request, so it is error condition indeed.
Perform firmware recovery procedure.
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
In case there is something fundamentally wrong with the firmware
(example: RF cable disconnected), FW will always crash immediately
after reset. This leads to infinite fw error recovery loop.
Count consecutive unsuccessful error recovery attempts in a short period
of time, and stop doing recovery after some reasonable count.
It is still possible to manually reset fw doing
interface down/up sequence.
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When using scatter-gather, more descriptor entries get used.
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Sometimes, due to the race between Rx path and WMI_BA_STATUS_EVENTID WMI event,
few frames may be passed to the stack before reorder buffer allocated.
Then, after BACK establishment, it start getting frames with sequence number ahead of
SSN, and it get interpreted as missing frames. Then, BACK mechanism will wait
for missing frames; data traffic will be stopped. In case of interface configured
for DHCP, this data delay causes DHCP failure.
Relax checking for sequence number; use sequence of 1-st frame handled by the buffer
as SSN for this buffer.
This is work-around, real fix should be done when proper BACK mechanism implemented.
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When disconnecting some CID, corresponded Tx vring get released. During vring
release, all descriptors get freed. It is possible that Tx NAPI working on the same
vring simultaneously. If it happens, descriptor may be double freed.
To protect from the race above, make sure NAPI won't process the same vring.
Introduce 'enabled' flag in the struct vring_tx_data. Proceed with Tx NAPI only if
'enabled' flag set. Prior to Tx vring release, clear this flag and make sure NAPI
get synchronized.
NAPI enablement status protected by wil->mutex, add protection where it was
missing and check for it.
During reset, disconnect all peers first, then proceed with the Rx vring. It allows for
the disconnect flow to observe proper 'wil->status' and correctly notify cfg80211 about
connection status change
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Use 'real' indication for hardware state.
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
upon fw error interrupt - in STA mode, disconnect/cancel scan and
then reset FW/HW
added module param - no_fw_recovery which is false by default
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
wil_reset() removes vring's
At the same time NAPI may be active performing Rx/Tx completion.
If this happens, Rx/Tx polling functions going to access already removed vrings
Make sure NAPI is idle and won't be started prior to vring removal.
For this, track NAPI enabled state
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Configure hardware to perform full reset on "power good". This mean,
reset HW on system boot. This improves card stability.
By default this is off.
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Put all registers in order for easier navigation;
fix naming to reflect hardware cluster
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Hardware bug triggered by the MSI init while INTx asserted for some reason.
De-assert INTx prior to MSI set-up
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Support for new chip revision. Revision read from the
internal register, PCIE config's "revision id" register
do not indicate HW version properly
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Introduce enum to describe mapping type; allow 'none' in addition to
'single' and 'page'; this is preparation for GSO
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When setting fragmented skb for Tx, assign skb to the last descriptor
and set number of fragments in the 1-st one
On Tx complete, HW sets "DU" bit in Tx descriptor only for the last
descriptor; so search for it using number of fragments field.
Middle descriptors may have "DU" bit not set by the hardware.
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When connection established, as reported by WMI_CONNECT_EVENTID,
4-way handshaking required for the secure connection is not done
yet. It is indicated by another WMI event. Wait for it and only then
allow data traffic. In case of non-secure connection, FW reports
"data port open" immediately after connection.
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
implement del_station() method in the struct cfg80211_ops
It allows to disconnect single peer from the AP
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Calculate statistics per connection, report with "iw station dump"
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When running multiple connections, hardware can't do BACK reordering
and it should be done on the host.
Model after mac80211's implementation. Drop RCU for now;
to be re-added when BACK will be stabilized
BACK handshaking is not implemented yet in the hardware,
pretend it was done to support the way FW operating
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Enable multiple (up to 8 - HW/FW limitation) simultaneous connections.
Each connection has its own CID (connection ID) that describes chip's
beam-forming entity. Tx Vring should refer to correct CID for frame to reach
its destination.
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Implement management frame passing. In order to receive frame on the other
side, remain_on_channel() should be implemented as well
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Use hardware capabilities to limit IRQ generation to about 15 per msec
It corresponds to about 7 packets/IRQ when running iperf with default
parameters at 1.3Gbps
Do not enable this feature in the sniffer (monitor) mode, because
interrupt moderation cause timestamp accuracy deterioration.
For the sniffer flow, it is important to get precise timestamp.
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Enable adding more data to the SW context.
For now, add flag "mapped_as_page", to separate decisions on free-ing skb
and type of DMA mapping.
This allows linking skb itself to any descriptor of fragmented skb.
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
No more need for special processing of EAPOL, FW can now send EAPOL frames
using normal Tx queue for TID 0
This fixes "schedule while atomic" bug - start_xmit called in softirq context;
while WMI mechanism that was used may sleep.
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
sme_state is private wdev's variable.
Track connection state internally
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Introduce NAPI for Rx and Tx completion.
This fixes packet reordering that happens when Rx handled right in
the IRQ: netif_rx puts packet in 'percpu' queue, then network stack
fetches packets from 'percpu' queues for processing, with different
pattern of queue switching. As result, network stack see packets
in different order. This causes hard to understand TCP throughput
degradation in about 30min
Complete polling if only one packet was processed - this eliminates
empty polls that would be otherwise done at the end of each burst
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>