1
0
Fork 0
mirror of synced 2025-03-06 20:59:54 +01:00
Commit graph

49 commits

Author SHA1 Message Date
Shashank Sharma
664c3b03f9 drm/amdgpu: cleanup MES process level doorbells
MES allocates process level doorbells, but there is no userspace
client to consume it. It was only being used for the MES ring
tests (in kernel), and was written by kernel doorbell write.

The previous patch of this series has changed the MES ring test code to
use kernel level MES doorbells. This patch now cleans up the process level
doorbell allocation code which is not required.

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-07 17:14:07 -04:00
Shashank Sharma
e3cbb1f404 drm/amdgpu: use doorbell mgr for MES kernel doorbells
This patch:
- Removes the existing doorbell management code, and its variables
  from the doorbell_init function, it will be done in doorbell
  manager now.
- uses the doorbell page created for MES kernel level needs (doorbells
  for MES self tests)
- current MES code was allocating MES doorbells in MES process context,
  but those were getting written using kernel doorbell calls. This patch
  instead allocates a MES kernel doorbell for this (in add_hw_queue).

V2: Create an extra page of doorbells for MES during kernel doorbell
    creation (Alex)
V4: Move MES doorbell size and page offset objects in this patch from
    patch 6.

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Christian Koenig <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-07 17:14:07 -04:00
Daniel Vetter
3d00c59d14 amd-drm-next-6.6-2023-07-28:
amdgpu:
 - Lots of checkpatch cleanups
 - GFX 9.4.3 updates
 - Add USB PD and IFWI flashing documentation
 - GPUVM updates
 - RAS fixes
 - DRR fixes
 - FAMS fixes
 - Virtual display fixes
 - Soft IH fixes
 - SMU13 fixes
 - Rework PSP firmware loading for other IPs
 - Kernel doc fixes
 - DCN 3.0.1 fixes
 - LTTPR fixes
 - DP MST fixes
 - DCN 3.1.6 fixes
 - SubVP fixes
 - Display bandwidth calculation fixes
 - VCN4 secure submission fixes
 - Allow building DC on RISC-V
 - Add visible FB info to bo_print_info
 - HBR3 fixes
 - Add PSP 14.0 support
 - GFX9 MCBP fix
 - GMC10 vmhub index fix
 - GMC11 vmhub index fix
 - Create a new doorbell manager
 - SR-IOV fixes
 
 amdkfd:
 - Cleanup CRIU dma-buf handling
 - Use KIQ to unmap HIQ
 - GFX 9.4.3 debugger updates
 - GFX 9.4.2 debugger fixes
 - Enable cooperative groups fof gfx11
 - SVM fixes
 
 radeon:
 - Lots of checkpatch cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCZMQ0vAAKCRC93/aFa7yZ
 2EOOAQCrsNf1IEynXVj0gVYOWFDpBCdaDkw+gXR73nOlwBeZzgD8DAoismXYDY95
 pkKlx/HL5O8qyZ25Lc9ZlgsJnTpnpw4=
 =c/Jk
 -----END PGP SIGNATURE-----

Merge tag 'amd-drm-next-6.6-2023-07-28' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

amd-drm-next-6.6-2023-07-28:

amdgpu:
- Lots of checkpatch cleanups
- GFX 9.4.3 updates
- Add USB PD and IFWI flashing documentation
- GPUVM updates
- RAS fixes
- DRR fixes
- FAMS fixes
- Virtual display fixes
- Soft IH fixes
- SMU13 fixes
- Rework PSP firmware loading for other IPs
- Kernel doc fixes
- DCN 3.0.1 fixes
- LTTPR fixes
- DP MST fixes
- DCN 3.1.6 fixes
- SubVP fixes
- Display bandwidth calculation fixes
- VCN4 secure submission fixes
- Allow building DC on RISC-V
- Add visible FB info to bo_print_info
- HBR3 fixes
- Add PSP 14.0 support
- GFX9 MCBP fix
- GMC10 vmhub index fix
- GMC11 vmhub index fix
- Create a new doorbell manager
- SR-IOV fixes

amdkfd:
- Cleanup CRIU dma-buf handling
- Use KIQ to unmap HIQ
- GFX 9.4.3 debugger updates
- GFX 9.4.2 debugger fixes
- Enable cooperative groups fof gfx11
- SVM fixes

radeon:
- Lots of checkpatch cleanups

Merge conflicts:
- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
	The switch to drm eu helpers in 8a206685d3 ("drm/amdgpu: use
	drm_exec for GEM and CSA handling v2") clashed with the
	cosmetic cleanups from 30953c4d00 ("drm/amdgpu: Fix style
	issues in amdgpu_gem.c"). I
	kept the former since the cleanup up code is gone.
- drivers/gpu/drm/amd/amdgpu/atom.c.
	adf64e2142 ("drm/amd: Avoid reading the VBIOS part number
	twice") removed code that 992b8fe106 ("drm/radeon: Replace
	all non-returning strlcpy with strscpy") polished.

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230728214228.8102-1-alexander.deucher@amd.com
[sima: some merge conflict wrangling as noted]
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2023-08-04 11:10:18 +02:00
Jonathan Kim
7a1c5c6753 drm/amdkfd: enable cooperative groups for gfx11
MES can concurrently schedule queues on the device that require
exclusive device access if marked exclusively_scheduled without the
requirement of GWS.  Similar to the F32 HWS, MES will manage
quality of service for these queues.
Use this for cooperative groups since cooperative groups are device
occupancy limited.

Since some GFX11 devices can only be debugged with partial CUs, do not
allow the debugging of cooperative groups on these devices as the CU
occupancy limit will change on attach.

In addition, zero initialize the MES add queue submission vector for MES
initialization tests as we do not want these to be cooperative
dispatches.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-25 13:35:43 -04:00
Guchun Chen
5003ca63bc drm/amdgpu: fix slab-out-of-bounds issue in amdgpu_vm_pt_create
Recent code set xcp_id stored from file private data when opening
device to amdgpu bo for accounting memory usage etc, but not all
VMs are attached to this fpriv structure like the vm cases in
amdgpu_mes_self_test, otherwise, KASAN will complain below out
of bound access. And more importantly, VM code should not touch
fpriv structure, so drop fpriv code handling from amdgpu_vm_pt.

[   77.292314] BUG: KASAN: slab-out-of-bounds in amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu]
[   77.293845] Read of size 4 at addr ffff888102c48a48 by task modprobe/1069
[   77.294146] Call Trace:
[   77.294178]  <TASK>
[   77.294208]  dump_stack_lvl+0x49/0x63
[   77.294260]  print_report+0x16f/0x4a6
[   77.294307]  ? amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu]
[   77.295979]  ? kasan_complete_mode_report_info+0x3c/0x200
[   77.296057]  ? amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu]
[   77.297556]  kasan_report+0xb4/0x130
[   77.297609]  ? amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu]
[   77.299202]  __asan_load4+0x6f/0x90
[   77.299272]  amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu]
[   77.300796]  ? amdgpu_init+0x6e/0x1000 [amdgpu]
[   77.302222]  ? amdgpu_vm_pt_clear+0x750/0x750 [amdgpu]
[   77.303721]  ? preempt_count_sub+0x18/0xc0
[   77.303786]  amdgpu_vm_init+0x39e/0x870 [amdgpu]
[   77.305186]  ? amdgpu_vm_wait_idle+0x90/0x90 [amdgpu]
[   77.306683]  ? kasan_set_track+0x25/0x30
[   77.306737]  ? kasan_save_alloc_info+0x1b/0x30
[   77.306795]  ? __kasan_kmalloc+0x87/0xa0
[   77.306852]  amdgpu_mes_self_test+0x169/0x620 [amdgpu]

v2: without specifying xcp partition for PD/PT bo, the xcp id is -1.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2686
Fixes: 3ebfd221c1 ("drm/amdkfd: Store xcp partition id to amdgpu bo")
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18 11:18:16 -04:00
Christian König
2acc73f81f drm/amdgpu: use drm_exec for MES testing
Start using the new component here as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230711133122.3710-6-christian.koenig@amd.com
2023-07-12 14:14:44 +02:00
Jonathan Kim
09d49e14ea drm/amdkfd: fix and enable debugging for gfx11
There are a couple of fixes required to enable gfx11 debugging.

First, ADD_QUEUE.trap_en is an inappropriate place to toggle
a per-process register so move it to SET_SHADER_DEBUGGER.trap_en.
When ADD_QUEUE.skip_process_ctx_clear is set, MES will prioritize
the SET_SHADER_DEBUGGER.trap_en setting.

Second, to preserve correct save/restore priviledged wave states
in coordination with the trap enablement setting, resume suspended
waves early in the disable call.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 12:48:19 -04:00
Jonathan Kim
a9818854ea drm/amdgpu: expose debug api for mes
Similar to the F32 HWS, the RS64 HWS for GFX11 now supports a multi-process
debug API.

The skip_process_ctx_clear ADD_QUEUE requirement is to prevent the MES
from clearing the process context when the first queue is added to the
scheduler in order to maintain debug mode settings during queue preemption
and restore.  The MES clears the process context in this case due to an
unresolved FW caching bug during normal mode operations.
During debug mode, the KFD will hold a reference to the target process
so the process context should never go stale and MES can afford to skip
this requirement.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 12:35:43 -04:00
Guchun Chen
93ab59ac6d drm/amdgpu: switch to unified amdgpu_ring_test_helper
This will simplify code.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 10:57:13 -04:00
Jack Xiao
97998b893c drm/amd/amdgpu: introduce gc_*_mes_2.bin v2
To avoid new mes fw running with old driver, rename
mes schq fw to gc_*_mes_2.bin.

v2: add MODULE_FIRMWARE declaration
v3: squash in fixup patch

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-11 18:03:21 -04:00
Jack Xiao
5ee33d905f drm/amd/amdgpu: limit one queue per gang
Limit one queue per gang in mes self test,
due to mes schq fw change.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22 01:07:04 -04:00
Lee Jones
0b9ff428de drm/amd/amdgpu/amdgpu_mes: Ensure amdgpu_bo_create_kernel()'s return value is checked
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c: In function ‘amdgpu_mes_ctx_alloc_meta_data’:
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c:1099:13: warning: variable ‘r’ set but not used [-Wunused-but-set-variable]

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Jack Xiao <Jack.Xiao@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22 00:48:00 -04:00
Mario Limonciello
11e0b0067e drm/amd: Use amdgpu_ucode_* helpers for MES
The `amdgpu_ucode_request` helper will ensure that the return code for
missing firmware is -ENODEV so that early_init can fail.

The `amdgpu_ucode_release` helper provides symmetry for releasing firmware.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-01-09 17:02:18 -05:00
Mario Limonciello
cc42e76e7d drm/amd: Load MES microcode during early_init
Add an early_init phase to MES for fetching and validating microcode
from the filesystem.

If MES microcode is required but not available during early init, the
firmware framebuffer will have already been released and the screen will
freeze.

Move the request for MES microcode into the early_init phase
so that if it's not available, early_init will fail.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-01-09 17:02:18 -05:00
Yifan Zhang
0af4ed0c32 drm/amdgpu/mes: zero the sdma_hqd_mask of 2nd SDMA engine for SDMA 6.0.1
there is only one SDMA engine in SDMA 6.0.1, the sdma_hqd_mask has to be
zeroed for the 2nd engine, otherwise MES scheduler will consider 2nd
engine exists and map/unmap SDMA queues to the non-existent engine.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-14 15:00:34 -04:00
Le Ma
2d7a1f7183 drm/amdgpu/mes: ring aggregatged doorbell when mes queue is unmapped
Ring aggregated doorbel to make unmapped queue scheduled in mes firmware.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Jack Xiao <Jack.Xiao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-13 11:25:17 -04:00
Le Ma
0fe6906203 drm/amdgpu/mes: init aggregated doorbell
Allocate and enable aggregated doorbell.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Jack Xiao <Jack.Xiao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-13 11:25:17 -04:00
Jack Xiao
737dad0b5d drm/amdgpu/mes: fix bo va unmap issue in mes
Need reserve buffers before unmap mes ctx bo va.

v2: fix removal of dma_resv_excl_fence() (Alex)
v3: fix dma_resv_usage (Alex)

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-12 10:03:20 -04:00
Jack Xiao
35ba8850b6 drm/amdgpu/mes: fix mes submission in atomic context
For some cases (accessing registers, unmap legacy queue), it needs
access mes in atomic context. Use spinlock to protect agaist mes
ring buffer race condition.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-08 18:25:56 -04:00
Jianglei Nie
c3c483391b drm/amdgpu/mes: Fix an error handling path in amdgpu_mes_self_test()
if amdgpu_mes_ctx_alloc_meta_data() fails, we should call amdgpu_vm_fini()
to handle amdgpu_vm_init().

Add a new lable before amdgpu_vm_init() and goto this lable when
amdgpu_mes_ctx_alloc_meta_data() fails.

Signed-off-by: Jianglei Nie <niejianglei2021@163.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-05 16:18:07 -04:00
Jack Xiao
adc0e6ab0d drm/amdgpu/mes: add mes register access interface
Add mes register access routines:
1. read register
2. write register
3. wait register
4. write and wait register

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-30 15:28:18 -04:00
Jack Xiao
fe4e9ff987 drm/amdgpu: add mc wptr addr support for mes
MES requires mc wptr address for usermode queues.
Export bo gart address for mc wptr address.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-28 11:24:05 -04:00
Graham Sider
a957995618 drm/amdgpu: Update mes_v11_api_def.h
Update MES API to support oversubscription without aggregated doorbell
for usermode queues.

v2: Change oversubscription_no_aggregated_en to is_kfd_process (align
with MES)

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Jack Xiao <Jack.Xiao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-23 17:22:31 -04:00
Alex Deucher
948ceec7c4 drm/amdgpu/mes: fix format specifier for size_t
To avoid a warning on 32 bit.

Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-10 17:53:11 -04:00
Jack Xiao
18ee4ce63e drm/amdgpu: add mes unmap legacy queue routine
For mes kiq has been taken over by mes sched, drv can't directly
use mes kiq to unmap queues. drv has to use mes sched api to
unmap legacy queue.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-04 10:43:54 -04:00
Mukul Joshi
464913c0dd drm/amdgpu/mes: Update the doorbell function signatures
Update the function signatures for process doorbell allocations
with MES enabled to make them more generic. KFD would need to
access these functions to allocate/free doorbells when MES is
enabled.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Acked-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-04 10:43:53 -04:00
Jack Xiao
da1c0338f0 drm/amdgpu/mes: disable mes sdma queue test
Disable mes sdma queue test on sienna cichlid+,
for fw hasn't supported to map sdma queue.
The test can be enabled if fw supports.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-04 10:43:53 -04:00
Jack Xiao
7c18b40e22 drm/amdgpu/mes: fix vm csa update issue
Need reserve VM buffers before update VM csa.

v2: rebase fixes

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-04 10:43:53 -04:00
Jack Xiao
6624d16103 drm/amdgpu/mes: implement mes self test
Add mes self test to verify its fundamental functionality by
running ring test and ib test of mes kernel queue.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-04 10:43:52 -04:00
Jack Xiao
cdb7476d96 drm/amdgpu/mes: add ring/ib test for mes self test
Run the ring test and ib test for mes self test.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-04 10:43:52 -04:00
Jack Xiao
f1d93c9c27 drm/amdgpu/mes: create gang and queues for mes self test
Create gang and queues for mes self test.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-04 10:43:52 -04:00
Jack Xiao
a22f760a02 drm/amdgpu/mes: map ctx metadata for mes self test
Map ctx metadata for mes self test.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-04 10:43:52 -04:00
Jack Xiao
e3652b0976 drm/amdgpu/mes: add helper functions to alloc/free ctx metadata
Add the helper functions to allocate/free context metadata.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-04 10:43:52 -04:00
Jack Xiao
9cc654c8ce drm/amdgpu/mes: implement removing mes ring
Remove the mes ring and its resources.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-04 10:43:52 -04:00
Jack Xiao
d0c423b647 drm/amdgpu/mes: use ring for kernel queue submission
Use ring as the front end for kernel queue submission.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-04 10:43:52 -04:00
Jack Xiao
11ec5b3605 drm/amdgpu/mes: add helper function to get the ctx meta data offset
Add the helper function to get the corresponding ctx meta data offset.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-04 10:43:52 -04:00
Jack Xiao
1a27aacb6e drm/amdgpu/mes: add helper function to convert ring to queue property
Add the helper function to convert ring to queue property.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-04 10:43:52 -04:00
Jack Xiao
bcc4e1e1d4 drm/amdgpu/mes: implement removing mes queue
Remove the MES queue from MES scheduling and free its resources.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-04 10:43:52 -04:00
Jack Xiao
be5609de15 drm/amdgpu/mes: implement adding mes queue
Allocate related resources for the queue and add it to mes
for scheduling.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-04 10:43:51 -04:00
Jack Xiao
5fa963d0fc drm/amdgpu/mes: initialize mqd from queue properties
Add helper function to initialize mqd from queue properties.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-04 10:43:51 -04:00
Jack Xiao
ea756bd5cc drm/amdgpu/mes: implement resuming all gangs
Implement resuming all gangs.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-04 10:43:51 -04:00
Jack Xiao
c8bb10572c drm/amdgpu/mes: implement suspending all gangs
Implement suspending all gangs.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-04 10:43:51 -04:00
Jack Xiao
b0306e5840 drm/amdgpu/mes: implement removing mes gang
Free the mes gang and its resources.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-04 10:43:51 -04:00
Jack Xiao
5d0f619f72 drm/amdgpu/mes: implement adding mes gang
Gang is a group of the same type queue, which is the scheduling
unit of mes hardware scheduler.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-04 10:43:51 -04:00
Jack Xiao
063a38d662 drm/amdgpu/mes: implement destroying mes process
Destroy the mes process, which free resources of the process.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-04 10:43:51 -04:00
Jack Xiao
48dcd2b751 drm/amdgpu/mes: implement creating mes process v2
Create a mes process which contains process-related resources,
like vm, doorbell bitmap, process ctx bo and etc.

v2: move the simple variable to the end

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-04 10:43:51 -04:00
Jack Xiao
0bf478f01a drm/amdgpu/mes: relocate status_fence slot allocation
Move the status_fence slot allocation from ip specific function
to general mes function.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-04 10:43:51 -04:00
Jack Xiao
b04c1d6468 drm/amdgpu/mes: initialize/finalize common mes structure v2
Initialize/finalize common mes structure.

v2: add mutex_init for adev->mes.mutex

Cc: Le Ma <le.ma@amd.com>
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-04 10:43:51 -04:00
Jack Xiao
32de57e9ef drm/amdgpu/mes: manage mes doorbell allocation
It is used to manage the doorbell allocation of mes processes and queues.
Driver calls into process doorbell allocation to get the slice doorbell
for the process, then the doorbell for a queue is allocated from the
process doorbell slice.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-04 10:43:51 -04:00