To avoid preventing the display from coming up before the rootfs is
mounted, without resorting to packing fw in the initrd, the GPU has
this limbo state where the device is probed, but we aren't ready to
start sending commands to it. This is particularly problematic for
a6xx, since the GMU (which requires fw to be loaded) is the one that
is controlling the power/clk/icc votes.
So defer enabling runpm until we are ready to call gpu->hw_init(),
as that is a point where we know we have all the needed fw and are
ready to start sending commands to the coproc's.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/489337/
Link: https://lore.kernel.org/r/20220613182036.2567963-1-robdclark@gmail.com
The restriction to 4G was strictly to work around 64b math bug in some
versions of SQE firmware. This appears to be fixed in a650+ SQE fw, so
allow a larger address space size on these devices.
Also, add a modparam override for debugging and igt.
v2: Send the right version of the patch (ie. the one that actually
compiles)
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Patchwork: https://patchwork.freedesktop.org/patch/487601/
Link: https://lore.kernel.org/r/20220529180428.2577832-1-robdclark@gmail.com
Leading spaces are not something checkpatch likes, and it says so when
they are present. Use tabs consistently to indent function body and
unwrap a 83-char-long line, as 100 is cool nowadays.
Signed-off-by: Konrad Dybcio <konrad.dybcio@somainline.org>
Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/487592/
Link: https://lore.kernel.org/r/20220528160353.157870-4-konrad.dybcio@somainline.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
There are various SKUs of A619, ranging from 565 MHz to 850 MHz, depending
on the bin. Add support for distinguishing them, so that proper frequency
ranges can be applied, depending on the HW.
Signed-off-by: Konrad Dybcio <konrad.dybcio@somainline.org>
Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/487590/
Link: https://lore.kernel.org/r/20220528160353.157870-3-konrad.dybcio@somainline.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
From testing on sc7180-trogdor devices, reading the GMU registers
needs the GMU clocks to be enabled. Those clocks get turned on in
a6xx_gmu_resume(). Confusingly enough, that function is called as a
result of the runtime_pm of the GPU "struct device", not the GMU
"struct device". Unfortunately the current a6xx_gpu_busy() grabs a
reference to the GMU's "struct device".
The fact that we were grabbing the wrong reference was easily seen to
cause crashes that happen if we change the GPU's pm_runtime usage to
not use autosuspend. It's also believed to cause some long tail GPU
crashes even with autosuspend.
We could look at changing it so that we do pm_runtime_get_if_in_use()
on the GPU's "struct device", but then we run into a different
problem. pm_runtime_get_if_in_use() will return 0 for the GPU's
"struct device" the whole time when we're in the "autosuspend
delay". That is, when we drop the last reference to the GPU but we're
waiting a period before actually suspending then we'll think the GPU
is off. One reason that's bad is that if the GPU didn't actually turn
off then the cycle counter doesn't lose state and that throws off all
of our calculations.
Let's change the code to keep track of the suspend state of
devfreq. msm_devfreq_suspend() is always called before we actually
suspend the GPU and msm_devfreq_resume() after we resume it. This
means we can use the suspended state to know if we're powered or not.
NOTE: one might wonder when exactly our status function is called when
devfreq is supposed to be disabled. The stack crawl I captured was:
msm_devfreq_get_dev_status
devfreq_simple_ondemand_func
devfreq_update_target
qos_notifier_call
qos_max_notifier_call
blocking_notifier_call_chain
pm_qos_update_target
freq_qos_apply
apply_constraint
__dev_pm_qos_update_request
dev_pm_qos_update_request
msm_devfreq_idle_work
Fixes: eadf79286a ("drm/msm: Check for powered down HW in the devfreq callbacks")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Patchwork: https://patchwork.freedesktop.org/patch/489124/
Link: https://lore.kernel.org/r/20220610124639.v4.1.Ie846c5352bc307ee4248d7cab998ab3016b85d06@changeid
Signed-off-by: Rob Clark <robdclark@chromium.org>
Prior to the last commit, this could result in setting the GPU
written fence value back to an older value, if we had missed
updating completed_fence prior to suspend. This was mostly
harmless as the GPU would eventually overwrite it again with
the correct value. But we should just not do this. Instead
just leave a sanity check that the fence looks plausible (in
case the GPU scribbled on memory).
Reported-by: Steev Klimaszewski <steev@kali.org>
Fixes: 95d1deb02a ("drm/msm/gem: Add fenced vma unpin")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Tested-by: Steev Klimaszewski <steev@kali.org>
Patchwork: https://patchwork.freedesktop.org/patch/490138/
Link: https://lore.kernel.org/r/20220618161120.3451993-2-robdclark@gmail.com
Following commit 17e822f759 ("drm/msm: fix unbalanced
pm_runtime_enable in adreno_gpu_{init, cleanup}"), any call to
adreno_unbind() will disable runtime PM twice, as indicated by the call
trees below:
adreno_unbind()
-> pm_runtime_force_suspend()
-> pm_runtime_disable()
adreno_unbind()
-> gpu->funcs->destroy() [= aNxx_destroy()]
-> adreno_gpu_cleanup()
-> pm_runtime_disable()
Note that pm_runtime_force_suspend() is called right before
gpu->funcs->destroy() and both functions are called unconditionally.
With recent addition of the eDP AUX bus code, this problem manifests
itself when the eDP panel cannot be found yet and probing is deferred.
On the first probe attempt, we disable runtime PM twice as described
above. This then causes any later probe attempt to fail with
[drm:adreno_load_gpu [msm]] *ERROR* Couldn't power up the GPU: -13
preventing the driver from loading.
As there seem to be scenarios where the aNxx_destroy() functions are not
called from adreno_unbind(), simply removing pm_runtime_disable() from
inside adreno_unbind() does not seem to be the proper fix. This is what
commit 17e822f759 ("drm/msm: fix unbalanced pm_runtime_enable in
adreno_gpu_{init, cleanup}") intended to fix. Therefore, instead check
whether runtime PM is still enabled, and only disable it in that case.
Fixes: 17e822f759 ("drm/msm: fix unbalanced pm_runtime_enable in adreno_gpu_{init, cleanup}")
Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com>
Tested-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Link: https://lore.kernel.org/r/20220606211305.189585-1-luzmaximilian@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJihZ6hAAoJEJMTUIYYBUIphx8IAJZ+8FJ+qjCne0GaVFGczjFr
cz1PVE7CAXWlgA+1V/xSw7Awt0oPhrB400SRh/2bea8awHkKZxtcnBIA1apEoDjo
+qyhHMpwcItCHQM5xOKKGZb0KU+MCK9/6PH5VjD8k1gX52OoxQ4e0uUng7ljVgCe
F/LBBpW6oY6ZAiZxaxieXSD36bNkwhYXdYGWdJFQPQF+N4gYPmSWOmijaRnkJtnv
o4VS5HnTN3FMgKu+h2Rr7r4A5s7WygVJM7K+sbJZI/F+OCFv3lftkU2JOXGtS2oc
5sInJB972aafkD3lOzTPyARiHrtvLpsSGPXok24b3suRN0pHRYtvAOdrI0m/sUg=
=y6Lb
-----END PGP SIGNATURE-----
Merge tag 'msm-next-5.19-fixes' of https://gitlab.freedesktop.org/abhinavk/msm into drm-next
5.19 fixes for msm-next
- Limiting WB modes to max sspp linewidth
- Fixing the supported rotations to add 180 back for IGT
- Fix to handle pm_runtime_get_sync() errors to avoid unclocked access
in the bind() path for dpu driver
- Fix the irq_free() without request issue which was a big-time
hitter in the CI-runs.
Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Abhinav Kumar <quic_abhinavk@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/b011d51d-d634-123e-bf5f-27219ee33151@quicinc.com
of_parse_phandle() returns a node pointer with refcount
incremented, we should use of_node_put() on it when not need anymore.
a6xx_gmu_init() passes the node to of_find_device_by_node()
and of_dma_configure(), of_find_device_by_node() will takes its
reference, of_dma_configure() doesn't need the node after usage.
Add missing of_node_put() to avoid refcount leak.
Fixes: 4b565ca5a2 ("drm/msm: Add A6XX device support")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Link: https://lore.kernel.org/r/20220512121955.56937-1-linmq006@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
- Fourcc modifier for tiled but not compressed layouts
- Support for userspace allocated IOVA (GPU virtual address)
- Devfreq clamp_to_idle fix
- DPU: DSC (Display Stream Compression) support
- DPU: inline rotation support on SC7280
- DPU: update DP timings to follow vendor recommendations
- DP, DPU: add support for wide bus (on newer chipsets)
- DP: eDP support
- Merge DPU1 and MDP5 MDSS driver, make dpu/mdp device the master
component
- MDSS: optionally reset the IP block at the bootup to drop
bootloader state
- Properly register and unregister internal bridges in the DRM framework
- Complete DPU IRQ cleanup
- DP: conversion to use drm_bridge and drm_bridge_connector
- eDP: drop old eDP parts again
- DPU: writeback support
- Misc small fixes
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGvJCr_1D8d0dgmyQC5HD4gmXeZw=bFV_CNCfceZbpMxRw@mail.gmail.com
Check if 'aspace' is set before using it as it will stay null without
IOMMU, such as on msm8974.
Fixes: bc2112583a ("drm/msm/gpu: Track global faults per address-space")
Signed-off-by: Luca Weiss <luca@z3ntu.xyz>
Link: https://lore.kernel.org/r/20220421203455.313523-1-luca@z3ntu.xyz
Signed-off-by: Rob Clark <robdclark@chromium.org>
Move tracking and busy time calculation to msm_devfreq_get_dev_status.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Cc: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20220416003314.59211-2-olvaffe@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
The motivation at this point is mainly native userspace mesa driver in a
VM guest. The one remaining synchronous "hotpath" is buffer allocation,
because guest needs to wait to know the bo's iova before it can start
emitting cmdstream/state that references the new bo. By allocating the
iova in the guest userspace, we no longer need to wait for a response
from the host, but can just rely on the allocation request being
processed before the cmdstream submission. Allocation failures (OoM,
etc) would just be treated as context-lost (ie. GL_GUILTY_CONTEXT_RESET)
or subsequent allocations (or readpix, etc) can raise GL_OUT_OF_MEMORY.
v2: Fix inuse check
v3: Change mismatched iova case to -EBUSY
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Link: https://lore.kernel.org/r/20220411215849.297838-11-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
Get rid of all the unnecessary conversion between address/size and page
offsets. It just confuses things.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20220411215849.297838-6-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
The ring seqno counter duplicates the fence-context last_fence counter.
They end up getting incremented in lock-step, on the same scheduler
thread, but the split just makes things less obvious.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20220411215849.297838-3-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
In the cause of using the GPU via virtgpu, the host side process is
really a sort of proxy, and not terribly interesting from the PoV of
crash/fault logging. Add a way to override these per process so that
we can see the guest process's name.
v2: Handle kmalloc failure, add comment to explain kstrdup returns
NULL if passed NULL [Dan Carpenter]
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20220317165144.222101-4-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
The 64b value field is already suffient to hold a pointer instead of
immediate, but we also need a length field.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20220317165144.222101-2-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
When building with CONFIG_PM=y and CONFIG_PM_SLEEP=n (such as ARCH=riscv
allmodconfig), the following warnings/errors occur:
drivers/gpu/drm/msm/adreno/adreno_device.c:679:12: error: 'adreno_system_resume' defined but not used [-Werror=unused-function]
679 | static int adreno_system_resume(struct device *dev)
| ^~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/msm/adreno/adreno_device.c:655:12: error: 'adreno_system_suspend' defined but not used [-Werror=unused-function]
655 | static int adreno_system_suspend(struct device *dev)
| ^~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
These functions are only used in SET_SYSTEM_SLEEP_PM_OPS(), which
evaluates to empty when CONFIG_PM_SLEEP is not set, making these
functions unused.
To resolve this, use the SYSTEM_SLEEP_PM_OPS() and RUNTIME_PM_OPS()
macros, which were introduced in commit 1a3c7bb088 ("PM: core: Add new
*_PM_OPS macros, deprecate old ones"). They are designed to avoid these
compiler warnings while still guarding their use on
CONFIG_PM{,_SLEEP}=y.
Fixes: 7e4167c9e0 ("drm/msm/gpu: Park scheduler threads for system suspend")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20220411181249.2758344-1-nathan@kernel.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
The fourth param is size, rather than range_end.
Note that we could increase the address space size if we had a way to
prevent buffers from spanning a 4G split, mostly just to avoid fw bugs
with 64b math.
Fixes: 84c31ee16f ("drm/msm/a6xx: Add support for per-instance pagetables")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20220407202836.1211268-1-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
The mutex wasn't really protecting anything before. Before the previous
patch we could still be racing with the scheduler's kthread, as that is
not necessarily frozen yet. Now that we've parked the sched threads,
the only race is with jobs retiring, and that is harmless, ie.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20220310234611.424743-4-robdclark@gmail.com
In the system suspend path, we don't want to be racing with the
scheduler kthreads pushing additional queued up jobs to the hw
queue (ringbuffer). So park them first. While we are at it,
move the wait for active jobs to complete into the new system-
suspend path.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20220310234611.424743-3-robdclark@gmail.com
These casts need to happen before the shift. The only time it would
matter would be if "rev.core" is >= 128. In that case the sign bit
would be extended and we do not want that.
Fixes: afab9d91d8 ("drm/msm/adreno: Expose speedbin to userspace")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Link: https://lore.kernel.org/r/20220307133105.GA17534@kili
Signed-off-by: Rob Clark <robdclark@chromium.org>
Any app controlled perfcntr collection (GL_AMD_performance_monitor, etc)
does not require counters to maintain state across context switches. So
clear them if systemwide profiling is not active.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20220304005317.776110-5-robdclark@gmail.com
Add a SYSPROF param for system profiling tools like Mesa's pps-producer
(perfetto) to control behavior related to system-wide performance
counter collection. In particular, for profiling, one wants to ensure
that GPU context switches do not effect perfcounter state, and might
want to suppress suspend (which would cause counters to lose state).
v2: Swap the order in msm_file_private_set_sysprof() [sboyd] and
initialize the sysprof_active refcount to one (because the under/
overflow checking in refcount_t doesn't expect a 0->1 transition)
meaning that values greater than 1 means sysprof is active.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20220304005317.776110-4-robdclark@gmail.com
It was always expected to have a use for this some day, so we left a
placeholder. Now we do. (And I expect another use in the not too
distant future when we start allowing userspace to allocate GPU iova.)
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20220304005317.776110-3-robdclark@gmail.com
Update headers from mesa commit:
commit 7e63fa2bb13cf14b765ad06d046789ee1879b5ef
Author: Rob Clark <robclark@freedesktop.org>
AuthorDate: Wed Mar 2 17:11:10 2022 -0800
freedreno/registers: Add a couple regs we need for kernel
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15221>
Signed-off-by: Rob Clark <robdclark@chromium.org>
[for display bits:]
Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Link: https://lore.kernel.org/r/20220304005317.776110-2-robdclark@gmail.com
Other processes don't need to know about faults that they are isolated
from by virtue of address space isolation. They are only interested in
whether some of their state might have been corrupted.
But to be safe, also track unattributed faults. This case should really
never happen unless there is a kernel bug (and that would never happen,
right?)
v2: Instead of adding a new param, just change the behavior of the
existing param to match what userspace actually wants [anholt]
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5934
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20220201161618.778455-3-robdclark@gmail.com
Reviewed-by: Emma Anholt <emma@anholt.net>
Signed-off-by: Rob Clark <robdclark@chromium.org>
msm_ioremap() functions take additional argument dbgname which is now
unused.
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Link: https://lore.kernel.org/r/20220105232700.444170-2-dmitry.baryshkov@linaro.org
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
System suspend uses pm_runtime_force_suspend(), which cheekily bypasses
the runpm reference counts. This doesn't actually work so well when the
GPU is active. So add a reasonable delay waiting for the GPU to become
idle.
Alternatively we could just return -EBUSY in this case, but that has the
disadvantage of causing system suspend to fail.
v2: s/ret/remaining [sboyd], and switch to using active_submits count
to ensure we aren't racing with submit cleanup (and devfreq idle
work getting scheduled, etc)
v3: fix inverted logic
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://lore.kernel.org/r/20220108180913.814448-2-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
A CP_PROTECT entry for SMMU registers is missing for A540. According to
downstream sources its length is same as on A530 - 0x20000 bytes.
On all other revisions SMMU region length is 0x10000 bytes. Despite
this, we setup region of length 0x20000 on all revisions. This doesn't
cause any issues on those GPUs. As for preventing accesses to the region
from protected mode it was tested to work the same.
This patch drops the "if" condition in setup of CP_PROTECT entry because
it already includes all supported revisions except A540.
Signed-off-by: Vladimir Lypak <vladimir.lypak@gmail.com>
Link: https://lore.kernel.org/r/20211212160333.980343-2-vladimir.lypak@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
I am seeing some crash logs which imply that we are trying to use
crashdumper hw to read back GPU state when the GPU isn't initialized.
This doesn't go well (for example, GPU could be in 32b address mode
and ignoring the upper bits of buffer that it is trying to dump state
to).
I'm not *quite* sure how we get into this state in the first place,
but lets not make a bad situation worse by triggering iova fault
crashes.
While we're at it, also add the information about whether the GPU is
initialized to the devcore dump to make this easier to see in the
logs (which makes the WARN_ON() redundant and even harmful because
it fills up the small bit of dmesg we get with the crash report).
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20211209193118.1163248-1-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
In preparation for registering the mdss interrupt controller earlier,
move the allocation of msm_drm_private from component bind time to
msm_drv probe; this also allows us to use the devm variant of kzalloc.
Since it is not right to allocate the drm_device at probe time (as
it should exist only when all components are bound, and taken down
when components get cleaned up), the only way to make this happen is
to pass a pointer to msm_drm_private as driver data (like done in
many other DRM drivers), instead of one to drm_device like it's
currently done in this driver.
This is also simplifying some bind/unbind functions around drm/msm,
as some of them are using drm_device just to grab a pointer to the
msm_drm_private structure, which we now retrieve in one call.
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://lore.kernel.org/r/20211201105210.24970-2-angelogioacchino.delregno@collabora.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
If you don't realize is_a650_family() also encompasses a660 family,
you'd think that the debug buffer is double allocated. Add a comment
to make this more clear.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20211124214151.1427022-11-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
It appears to be a GMU fw build option whether it does anything with
debug and log buffers, but if they are all zeros it won't add anything
to the devcore size.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20211124214151.1427022-10-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
This also includes a history of start index of the last 8 messages on
each queue, since parsing backwards to decode recently sent HFI messages
is hard(ish).
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20211124214151.1427022-9-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>