1
0
Fork 0
mirror of synced 2025-03-06 20:59:54 +01:00
Commit graph

32 commits

Author SHA1 Message Date
Jonathan Kim
9bd443cb74 drm/amdgpu: fix debug wait on idle for gfx9.4.1
Wait calls for amd_ip_block_type not amd_hw_ip_block_type.

Reported-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 12:37:50 -04:00
Jonathan Kim
e0f85f4690 drm/amdkfd: add debug set and clear address watch points operation
Shader read, write and atomic memory operations can be alerted to the
debugger as an address watch exception.

Allow the debugger to pass in a watch point to a particular memory
address per device.

Note that there exists only 4 watch points per devices to date, so have
the KFD keep track of what watch points are allocated or not.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 12:36:46 -04:00
Jonathan Kim
aea1b4738b drm/amdkfd: add debug wave launch mode operation
Allow the debugger to set wave behaviour on to either normally operate,
halt at launch, trap on every instruction, terminate immediately or
stall on allocation.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 12:36:39 -04:00
Jonathan Kim
101827e130 drm/amdkfd: add debug wave launch override operation
This operation allows the debugger to override the enabled HW
exceptions on the device.

On debug devices that only support the debugging of a single process,
the HW exceptions are global and set through the SPI_GDBG_TRAP_MASK
register.
Because they are global, only address watch exceptions are allowed to
be enabled.  In other words, the debugger must preserve all non-address
watch exception states in normal mode operation by barring a full
replacement override or a non-address watch override request.

For multi-process debugging, all HW exception overrides are per-VMID so
all exceptions can be overridden or fully replaced.

In order for the debugger to know what is permissible, returned the
supported override mask back to the debugger along with the previously
enable overrides.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 12:36:37 -04:00
Jonathan Kim
7cee6a6824 drm/amdgpu: add configurable grace period for unmap queues
The HWS schedule allows a grace period for wave completion prior to
preemption for better performance by avoiding CWSR on waves that can
potentially complete quickly. The debugger, on the other hand, will
want to inspect wave status immediately after it actively triggers
preemption (a suspend function to be provided).

To minimize latency between preemption and debugger wave inspection, allow
immediate preemption by setting the grace period to 0.

Note that setting the preepmtion grace period to 0 will result in an
infinite grace period being set due to a CP FW bug so set it to 1 for now.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 12:35:31 -04:00
Jonathan Kim
01f648202c drm/amdgpu: add gfx9.4.1 hw debug mode enable and disable calls
On GFX9.4.1, the implicit wait count instruction on s_barrier is
disabled by default in the driver during normal operation for
performance requirements.

There is a hardware bug in GFX9.4.1 where if the implicit wait count
instruction after an s_barrier instruction is disabled, any wave that
hits an exception may step over the s_barrier when returning from the
trap handler with the barrier logic having no ability to be
aware of this, thereby causing other waves to wait at the barrier
indefinitely resulting in a shader hang.  This bug has been corrected
for GFX9.4.2 and onward.

Since the debugger subscribes to hardware exceptions, in order to avoid
this bug, the debugger must enable implicit wait count on s_barrier
for a debug session and disable it on detach.

In order to change this setting in the in the device global SQ_CONFIG
register, the GFX pipeline must be idle.  GFX9.4.1 as a compute device
will either dispatch work through the compute ring buffers used for
image post processing or through the hardware scheduler by the KFD.

Have the KGD suspend and drain the compute ring buffer, then suspend the
hardware scheduler and block any future KFD process job requests before
changing the implicit wait count setting.  Once set, resume all work.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 12:35:15 -04:00
Mukul Joshi
5bdd3eb253 drm/amdkfd: Remove unused old debugger implementation
Cleanup the kfd code by removing the unused old debugger
implementation.
The address watch was only ever implemented in the upstream
driver for GFXv7 (Kaveri). The user mode tools runtime using
this API was never open-sourced. Work on the old debugger
prototype that used this API has been discontinued years ago.
Only a small piece of resetting wavefronts is kept and
is moved to kfd_device_queue_manager.c.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-09 16:57:51 -05:00
Graham Sider
56c5977eae drm/amdkfd: replace/remove remaining kgd_dev references
Remove get_amdgpu_device and other remaining kgd_dev references aside
from declaration/kfd struct entry and initialization.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:02 -05:00
Graham Sider
420185fdad drm/amdkfd: replace kgd_dev in hqd/mqd kfd2kgd funcs
Modified definitions:

- hqd_load
- hiq_mqd_load
- hqd_sdma_load
- hqd_dump
- hqd_sdma_dump
- hqd_is_occupied
- hqd_destroy
- hqd_sdma_is_occupied
- hqd_sdma_destroy

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:01 -05:00
Mukul Joshi
f270921a17 drm/amdkfd: CWSR with sw scheduler on Aldebaran and Arcturus
Program trap handler settings to enable CWSR with software scheduler
on Aldebaran and Arcturus.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-24 15:35:33 -04:00
Souptick Joarder
5f5cb2afd6 drm/amdgpu: Added missing prototype
Kernel test robot throws below warning ->

>> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c:125:5: warning:
>> no previous prototype for 'kgd_arcturus_hqd_sdma_load'
>> [-Wmissing-prototypes]
     125 | int kgd_arcturus_hqd_sdma_load(struct kgd_dev *kgd, void
*mqd,

>> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c:195:5: warning:
>> no previous prototype for 'kgd_arcturus_hqd_sdma_dump'
>> [-Wmissing-prototypes]
     195 | int kgd_arcturus_hqd_sdma_dump(struct kgd_dev *kgd,

>> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c:227:6: warning:
>> no previous prototype for 'kgd_arcturus_hqd_sdma_is_occupied'
>> [-Wmissing-prototypes]
     227 | bool kgd_arcturus_hqd_sdma_is_occupied(struct kgd_dev *kgd,
void *mqd)
         |      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c:246:5: warning:
>> no previous prototype for 'kgd_arcturus_hqd_sdma_destroy'
>> [-Wmissing-prototypes]
     246 | int kgd_arcturus_hqd_sdma_destroy(struct kgd_dev *kgd, void
*mqd,
         |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Added prototype for these functions.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-28 23:35:49 -04:00
Jonathan Kim
5073506c7e drm/amdkfd: add aldebaran kfd2kgd callbacks to kfd device (v2)
Create dedicated Aldebaran kfd2kgd callbacks to prepare
for new per-vmid register instructions for debug trap
setting functions and sending host traps.

v2: rebase (Alex)

Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Reviewed-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:59:28 -04:00
Yong Zhao
36e22d59dd drm/amdkfd: Add Aldebaran KFD support
Add initial KFD support.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-10 00:02:13 -05:00
Ramesh Errabolu
aeee2a48ec drm/amd/amdgpu: Enable arcturus devices to access the method kgd_gfx_v9_get_cu_occupancy that is already defined
[Why]
Allow user to know number of compute units (CU) that are in use at any
given moment.

[How]
Remove the keyword static for the method kgd_gfx_v9_get_cu_occupancy

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-11-04 17:10:17 -05:00
Felix Kuehling
332f6e1e98 drm/amdkfd: call amdgpu_amdkfd_get_hive_id directly
No need to use a function pointer because the implementation is not
ASIC-specific.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26 16:40:18 -04:00
Oak Zeng
9fb1506eb6 drm/amdgpu: Use function pointer for some mmhub functions
Add more function pointers to amdgpu_mmhub_funcs. ASIC specific
implementation of most mmhub functions are called from a general
function pointer, instead of calling different function for
different ASIC. Simplify the code by deleting duplicate functions

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14 16:22:40 -04:00
Christoph Hellwig
9bf5b9eb23 kernel: move use_mm/unuse_mm to kthread.c
Patch series "improve use_mm / unuse_mm", v2.

This series improves the use_mm / unuse_mm interface by better documenting
the assumptions, and my taking the set_fs manipulations spread over the
callers into the core API.

This patch (of 3):

Use the proper API instead.

Link: http://lkml.kernel.org/r/20200404094101.672954-1-hch@lst.de

These helpers are only for use with kernel threads, and I will tie them
more into the kthread infrastructure going forward.  Also move the
prototypes to kthread.h - mmu_context.h was a little weird to start with
as it otherwise contains very low-level MM bits.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: Jason Wang <jasowang@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: http://lkml.kernel.org/r/20200404094101.672954-1-hch@lst.de
Link: http://lkml.kernel.org/r/20200416053158.586887-1-hch@lst.de
Link: http://lkml.kernel.org/r/20200404094101.672954-5-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-10 19:14:18 -07:00
Joe Perches
2541f95c17 AMD KFD: Use fallthrough;
Convert the various uses of fallthrough comments to fallthrough;

Done via script
Link: https://lore.kernel.org/lkml/b56602fcf79f849e733e7b521bb0e17895d390fa.1582230379.git.joe@perches.com/

Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-03-13 11:52:35 -04:00
Yong Zhao
fd7d08bad7 drm/amdkfd: Make get_tile_config() generic
Given we can query all the asic specific information from amdgpu_gfx_config,
we can make get_tile_config() generic.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-28 16:59:20 -05:00
Aaron Liu
35cd89d5a6 drm/amdkfd: use kiq to load the mqd of hiq queue for gfx v9 (v6)
There is an issue that CP will check the HIQ queue to be configured and mapped
with KIQ ring, otherwise, it will be unable to read back the secure buffer while
the gfxoff is enabled even with trusted IP blocks.

v1 -> v2:
- Fix to remove surplus set_resources packets.
- Fill the whole configuration in MQD.
- Change the author as Aaron because he addressed the key point of this issue.
- Add kiq ring lock.

v2 -> v3:
- Free the lock while in error return case.
- Remove the programming only needed by the queue is unmapped.

v3 -> v4:
- Remove doorbell programming because it's used for restarting queue.
- Remove CP scheduler programming because map_queue packet will handle this.

v4 -> v5:
- Remove cp_hqd_active because mec ucode will enable it while use map_queues.
- Revise goto out_unlock.
- Correct the right doorbell offset for HIQ that kfd driver assigned in the
  packet.

v5 -> v6:
- Merge Arcturus fix into this patch because it will get oops in Arcturus
  platform.

Reported-by: Lisa Saturday <Lisa.Saturday@amd.com>
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-and-Tested-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:34:50 -05:00
Alex Sierra
d175e9acf6 drm/amdgpu: flush TLB functions removal from kfd2kgd interface
[Why]
kfd2kgd interface will be deprecated. This removal only covers TLB
invalidation for now. They have been replaced in amdgpu_amdkfd API.

[How]
TLB invalidate functions removed from the different amdkfd_gfx_v*
versions.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-16 13:34:42 -05:00
Yong Zhao
a434b94c5a drm/amdkfd: Improve function get_sdma_rlc_reg_offset() (v2)
The SOC15_REG_OFFSET() macro needs to dereference adev->reg_offset[IP]
pointer, which is sometimes NULL when there are fewer than 8 sdma engines.
Avoid that by not initializing the array regardless.

v2: squash in warning fixes

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:17:05 -05:00
Yong Zhao
ad5901df88 drm/amdkfd: Use Arcturus specific set_vm_context_page_table_base()
Since Arcturus has it own function pointer, we can move Arcturus
specific logic to there rather than leaving it entangled with
other GFX9 chips.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:09:06 -05:00
Yong Zhao
55695b36c1 drm/amdkfd: Delete unnecessary pr_fmt switch
Given amdkfd.ko has been merged into amdgpu.ko, this switch is no
longer useful.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-30 11:06:51 -04:00
Yong Zhao
e392c887df drm/amdkfd: Use array to probe kfd2kgd_calls
This is the same idea as the kfd device info probe and move all the
probe control together for easy maintenance.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:05 -05:00
Yong Zhao
56fc40aba4 drm/amdkfd: Eliminate get_atc_vmid_pasid_mapping_valid
get_atc_vmid_pasid_mapping_valid() is very similar to
get_atc_vmid_pasid_mapping_pasid(), so they can be merged into a new
function get_atc_vmid_pasid_mapping_info() to reduce register access
times. More importantly, getting the PASID and the valid bit atomically
with a single read fixes some potential race conditions where the
mapping changes between the two reads.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:04 -05:00
Yong Zhao
b55a8b8b41 drm/amdkfd: Use better name for sdma queue non HWS path
The old name is prone to confusion. The register offset is for a RLC queue
rather than a SDMA engine. The value is not a base address, but a
register offset.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:03 -05:00
Yong Zhao
9941a6bfbd drm/amdkfd: Delete useless SDMA register setting on non HWS path
HW folks have confirm that we should not touch RESUME_CTX of
SDMA*_GFX_CONTEXT_CNTL when manipulating RLC queues.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:03 -05:00
Yong Zhao
c637b36aea drm/amdkfd: Fix NULL pointer dereference for set_scratch_backing_va()
Currently this function pointer is missing for GFX10. Considering it is
a void function since GFX9, fix it by checking the function pointer
before dereferencing it.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:01 -05:00
Yong Zhao
812330eb69 drm/amdkfd: Add an error print if SDMA RLC is not idle
The message will be useful when troubleshooting the issues.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03 09:11:00 -05:00
Alex Deucher
32978d8cfd drm/amdgpu: drop drmP.h in amdgpu_amdkfd_arcturus.c
Unused.

Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-31 14:32:56 -05:00
Oak Zeng
3e205a0849 drm/amdkfd: Implement kfd2kgd_calls for Arcturus
Arcturus shares most of the kfd2kgd_calls with gfx9. But due to
SDMA register address change, it can't share SDMA related functions.
Export gfx9 kfd2kgd_calls and implement SDMA related functions
for Arcturus.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:04 -05:00