1
0
Fork 0
mirror of synced 2025-03-06 20:59:54 +01:00
Commit graph

13994 commits

Author SHA1 Message Date
Lijo Lazar
c4b9dc5313 drm/amdgpu: Add SMU v13.0.6 default reset methods
For APUs with SMU v13.0.6, mode-2 reset is kept as default and for
others mode-1 is the default reset method.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Tested-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:31:37 -04:00
Lee Jones
04cef5f583 drm/amd/amdgpu/amdgpu_doorbell_mgr: Correct misdocumented param 'doorbell_index'
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell_mgr.c:123: warning: Function parameter or member 'doorbell_index' not described in 'amdgpu_doorbell_index_on_bar'
 drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell_mgr.c:123: warning: Excess function parameter 'db_index' description in 'amdgpu_doorbell_index_on_bar'

Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:27:23 -04:00
Lee Jones
a728342ae4 drm/amd/amdgpu/imu_v11_0: Increase buffer size to ensure all possible values can be stored
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/amd/amdgpu/imu_v11_0.c: In function ‘imu_v11_0_init_microcode’:
 drivers/gpu/drm/amd/amdgpu/imu_v11_0.c:52:54: warning: ‘_imu.bin’ directive output may be truncated writing 8 bytes into a region of size between 4 and 33 [-Wformat-truncation=]
 drivers/gpu/drm/amd/amdgpu/imu_v11_0.c:52:9: note: ‘snprintf’ output between 16 and 45 bytes into a destination of size 40

Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:27:18 -04:00
Lee Jones
ac84d99a11 drm/amd/amdgpu/amdgpu_sdma: Increase buffer size to account for all possible values
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c: In function ‘amdgpu_sdma_init_microcode’:
 drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c:217:64: warning: ‘.bin’ directive output may be truncated writing 4 bytes into a region of size between 0 and 32 [-Wformat-truncation=]
 drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c:217:17: note: ‘snprintf’ output between 13 and 52 bytes into a destination of size 40
 drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c:215:66: warning: ‘snprintf’ output may be truncated before the last format character [-Wformat-truncation=]
 drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c:215:17: note: ‘snprintf’ output between 12 and 41 bytes into a destination of size 40

Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:27:06 -04:00
Lee Jones
3dd8a754a5 drm/amd/amdgpu/amdgpu_ras: Increase buffer size to account for all possible values
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c: In function ‘amdgpu_ras_sysfs_create’:
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:1406:20: warning: ‘_err_count’ directive output may be truncated writing 10 bytes into a region of size between 1 and 32 [-Wformat-truncation=]
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:1405:9: note: ‘snprintf’ output between 11 and 42 bytes into a destination of size 32

Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:27:03 -04:00
Lee Jones
8057a9d656 drm/amd/amdgpu/amdgpu_device: Provide suitable description for param 'xcc_id'
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:516: warning: Function parameter or member 'xcc_id' not described in 'amdgpu_mm_wreg_mmio_rlc'

Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:26:58 -04:00
Christophe JAILLET
415b7ba36a drm/amdgpu: Use kvzalloc() to simplify code
kvzalloc() can be used instead of kvmalloc() + memset() + explicit NULL
assignments.

It is less verbose and more future proof.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:26:47 -04:00
Christophe JAILLET
5f5c75bf16 drm/amdgpu: Remove amdgpu_bo_list_array_entry()
Now that there is an explicit flexible array at the end of 'struct
amdgpu_bo_list', it can be used to remove amdgpu_bo_list_array_entry() and
simplify some macro.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:26:45 -04:00
Christophe JAILLET
a23abe1fbd drm/amdgpu: Remove a redundant sanity check
The case where 'num_entries' is too big, is already handled by
struct_size(), because kvmalloc() would fail.

It will return -ENOMEM instead of -EINVAL, but it is only related to a
unlikely to happen sanity check.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:26:38 -04:00
Christophe JAILLET
ff49bd2c74 drm/amdgpu: Explicitly add a flexible array at the end of 'struct amdgpu_bo_list'
'struct amdgpu_bo_list' is really used as if it was ended by a flex array.
So make it more explicit and add a 'struct amdgpu_bo_list_entry entries[]'
field at the end of the structure.

This way, struct_size() can be used when it is allocated.
It is less verbose.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:26:32 -04:00
Hawking Zhang
ec70578c83 drm/amdgpu: Allow issue disable gfx ras cmd to firmware
Disable gfx ras command is needed in some use cases

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:25:54 -04:00
Lijo Lazar
e370f8f389 drm/amdgpu: Add bootloader wait for PSP v13
Implement the wait for bootloader call back for PSP v13.0 ASICs. Only
for ASICs with PSP v13.0.6, it needs an additional check for VBIOS
mailbox status.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Tested-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:25:44 -04:00
Hamza Mahfooz
1c6b6bd078 drm/amdgpu: register a dirty framebuffer callback for fbcon
fbcon requires that we implement &drm_framebuffer_funcs.dirty.
Otherwise, the framebuffer might take a while to flush (which would
manifest as noticeable lag). However, we can't enable this callback for
non-fbcon cases since it may cause too many atomic commits to be made at
once. So, implement amdgpu_dirtyfb() and only enable it for fbcon
framebuffers (we can use the "struct drm_file file" parameter in the
callback to check for this since it is only NULL when called by fbcon,
at least in the mainline kernel) on devices that support atomic KMS.

Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: stable@vger.kernel.org # 6.1+
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2519
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:25:34 -04:00
Mangesh Gadre
7caebc8f99 drm/amdgpu: Updated TCP/UTCL1 programming
Update TCP/UTCL1 thrashing control settings

v2: updated rev_id check

Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:20:14 -04:00
Srinivasan Shanmugam
f54e1d47e0 drm/amdgpu: Fix kcalloc over kzalloc in 'gmc_v9_0_init_mem_ranges'
Replace kzalloc(n * sizeof(...), ...) with kcalloc(n, sizeof(...), ...)
since kcalloc is the preferred API in case of allocating with multiply.

Fixes the below:

WARNING: Prefer kcalloc over kzalloc with multiply

Cc: Guchun Chen <guchun.chen@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:20:02 -04:00
Philip Yang
5d44a766f7 drm/amdkfd: Share the original BO for GTT mapping
If mGPUs is on same IOMMU group, or is ram direct mapped, then mGPUs
can share the original BO for GTT mapping dma address, without creating
new BO from export/import dmabuf.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:19:07 -04:00
Hawking Zhang
2c0f880abc drm/amdgpu: Fix the return for gpu mode1_reset
amdgpu_device_mode1_reset will return gpu mode1_reset
succeed (ret = 0) as long as wait_for_bootloader call
succeed, regardless of the status reported by smu or
psp firmware. This results to driver continue executing
recovery even smu or psp fail to perform mode1 reset.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:01:44 -04:00
benl
96271dd4d5 drm/amdgpu: add gfxhub 11.5.0 support
Add initial gfxhub 11.5 support.

Signed-off-by: benl <ben.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:01:15 -04:00
Prike Liang
b90975fa5b drm/amdgpu: enable gmc11 for GC 11.5.0
Add to IP discovery table.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:01:13 -04:00
Lang Yu
aba2be4147 drm/amdgpu: add mmhub 3.3.0 support
Add initial implementation for mmhub 3.3.0.

v2: squash in client id fix (Alex)

Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:01:09 -04:00
Prike Liang
b5549a2df0 drm/amdgpu/discovery: enable gfx11 for GC 11.5.0
Add to IP discovery table.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:01:03 -04:00
Lang Yu
d3ff0189c1 drm/amdgpu/discovery: enable mes block for gc 11.5.0
Add to IP discovery table.

Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:00:59 -04:00
Aaron Liu
10c9d86918 drm/amdgpu: add mes firmware support for gc_11_5_0
Add scheduler and kiq firmware support for gc_11_5_0.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:00:57 -04:00
Aaron Liu
d717da1775 drm/amdgpu: add imu firmware support for gc_11_5_0
Add imu firmware support for gc_11_5_0.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:00:54 -04:00
Aaron Liu
8e42b463df drm/amdgpu: add golden setting for gc_11_5_0
Initialize golden setting for gc_11_5_0.

v2: squash in latest golden updates (Alex)
v3: squash in checkpatch fix (Alex)

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:00:47 -04:00
Prike Liang
15e7cbd91d drm/amdgpu/gfx11: initialize gfx11.5.0
Initalize gfx 11.5.0 and set gfx hw configuration.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:00:44 -04:00
Prike Liang
dd5a326155 drm/amdgpu/gmc11: initialize GMC for GC 11.5.0 memory support
Initialize vram attribute and VMHUB for GC 11.5.0.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:00:39 -04:00
Prike Liang
d9d6833442 drm/amdgpu/discovery: add nbio 7.11.0 support
Add to IP discovery table.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:00:30 -04:00
benl
e44d856eaa drm/amdgpu: add nbio 7.11 support
Add initial nbio 7.11 implementation.

Signed-off-by: benl <ben.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:00:27 -04:00
Prike Liang
bb7249ee45 drm/amdgpu/discovery: enable soc21 support
Add 11.5.0 to IP discovery table.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:00:22 -04:00
Prike Liang
0d1db799e7 drm/amdgpu/soc21: add initial GC 11.5.0 soc21 support
Disable clock gating and power gating on the early bring up phase.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:00:18 -04:00
Prike Liang
2c8a7ca164 drm/amdgpu: add new AMDGPU_FAMILY definition
add GC 11.5.0 family

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:00:15 -04:00
Lang Yu
f56c1941eb drm/amdgpu: use 6.1.0 register offset for HDP CLK_CNTL
Use 6.1.0 register offset and remove unused variable.

v2: clean up logic (Alex)

Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:00:12 -04:00
Mangesh Gadre
559259362e drm/amdgpu: Remove SRAM clock gater override by driver
rlc firmware does required setting, driver need not do it.

Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 14:59:30 -04:00
Lijo Lazar
15c5c5f575 drm/amdgpu: Add bootloader status check
Add a function to wait till bootloader has reached steady state.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Tested-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 14:59:24 -04:00
Horace Chen
0bc119fa2e drm/amdkfd: use correct method to get clock under SRIOV
[What]
Current SRIOV still using adev->clock.default_XX which gets from
atomfirmware. But these fields are abandoned in atomfirmware long ago.
Which may cause function to return a 0 value.

[How]
We don't need to check whether SR-IOV. For SR-IOV one-vf-mode,
pm is enabled and VF is able to read dpm clock
from pmfw, so we can use dpm clock interface directly. For
multi-VF mode, VF pm is disabled, so driver can just react as pm
disabled. One-vf-mode is introduced from GFX9 so it shall not have
any backward compatibility issue.

Signed-off-by: Horace Chen <horace.chen@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 14:59:21 -04:00
Lijo Lazar
36b0f88988 drm/amdgpu: Unset baco dummy mode on nbio v7.9
BACO dummy mode could be set under reset conditions and that affects
framebuffer access. Check If baco dummy mode is set, unset it if so.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Tested-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 14:59:16 -04:00
YiPeng Chai
80578f1641 drm/amdgpu: Enable ras for mp0 v13_0_6 sriov
Enable ras for mp0 v13_0_6 sriov

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 14:58:16 -04:00
Samir Dhume
00481158ca drm/amdgpu/jpeg - skip change of power-gating state for sriov
Powergating is handled in the host driver.

Reviewed-by: Zhigang Luo <zhigang.luo@amd.com>
Signed-off-by: Samir Dhume <samir.dhume@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 14:57:59 -04:00
Lijo Lazar
f8a499aed2 drm/amdgpu: Keep reset handlers shared
Instead of maintaining a list per device, keep the reset handlers common
per ASIC family. A pointer to the list of handlers is maintained in
reset control.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Tested-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 14:57:54 -04:00
Le Ma
e240020ad1 drm/amdgpu: update gc_info v2_1 from discovery
Several new fields are exposed in gc_info v2_1

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Shiwu Zhang <shiwu.zhang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 14:57:32 -04:00
Le Ma
f489a41998 drm/amdgpu: update mall info v2 from discovery
Mall info v2 is introduced in ip discovery

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Shiwu Zhang <shiwu.zhang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 14:57:29 -04:00
Candice Li
46963ed585 drm/amdgpu: Only support RAS EEPROM on dGPU platform
RAS EEPROM device is only supported on dGPU platform for smu v13_0_6.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 14:57:26 -04:00
Chen Jiahao
d903af1a91 drm/amd/amdgpu: Use kmemdup to simplify kmalloc and memcpy logic
Using kmemdup() helper function rather than implementing it again
with kmalloc() + memcpy(), which improves the code readability.

Signed-off-by: Chen Jiahao <chenjiahao16@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 14:56:47 -04:00
Ilpo Järvinen
ce7d88110b drm/amdgpu: Use RMW accessors for changing LNKCTL
Don't assume that only the driver would be accessing LNKCTL. ASPM policy
changes can trigger write to LNKCTL outside of driver's control.  And in
the case of upstream bridge, the driver does not even own the device it's
changing the registers for.

Use RMW capability accessors which do proper locking to avoid losing
concurrent updates to the register value.

Suggested-by: Lukas Wunner <lukas@wunner.de>
Fixes: a2e73f56fa ("drm/amdgpu: Add support for CIK parts")
Fixes: 62a3755341 ("drm/amdgpu: add si implementation v10")
Link: https://lore.kernel.org/r/20230717120503.15276-6-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-21 14:11:35 -05:00
Lijo Lazar
e20ff05170 drm/amdgpu: Add memory vendor information
For ASICs with GC v9.4.3, determine the vendor information from scratch
register.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-16 11:38:11 -04:00
Mario Limonciello
0dee726395 drm/amd: flush any delayed gfxoff on suspend entry
DCN 3.1.4 is reported to hang on s2idle entry if graphics activity
is happening during entry.  This is because GFXOFF was scheduled as
delayed but RLC gets disabled in s2idle entry sequence which will
hang GFX IP if not already in GFXOFF.

To help this problem, flush any delayed work for GFXOFF early in
s2idle entry sequence to ensure that it's off when RLC is changed.

commit 4b31b92b14 ("drm/amdgpu: complete gfxoff allow signal during
suspend without delay") modified power gating flow so that if called
in s0ix that it ensured that GFXOFF wasn't put in work queue but
instead processed immediately.

This is dead code due to commit 10cb67eb8a ("drm/amdgpu: skip
CG/PG for gfx during S0ix") because GFXOFF will now not be explicitly
called as part of the suspend entry code.  Remove that dead code.

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-16 11:35:14 -04:00
Tim Huang
603b9a575d drm/amdgpu: skip fence GFX interrupts disable/enable for S0ix
GFX v11.0.1 reported fence fallback timer expired issue on
SDMA and GFX rings after S0ix resume. This is generated by
EOP interrupts are disabled when S0ix suspend but fails to
re-enable when resume because of the GFX is in GFXOFF.

[  203.349571] [drm] Fence fallback timer expired on ring sdma0
[  203.349572] [drm] Fence fallback timer expired on ring gfx_0.0.0
[  203.861635] [drm] Fence fallback timer expired on ring gfx_0.0.0

For S0ix, GFX is in GFXOFF state, avoid to touch the GFX registers
to configure the fence driver interrupts for rings that belong to GFX.
The interrupts configuration will be restored by GFXOFF exit.

Signed-off-by: Tim Huang <Tim.Huang@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-16 11:34:57 -04:00
Lijo Lazar
b5cdadedaa drm/amdgpu: Remove gfxoff check in GFX v9.4.3
GFXOFF feature is not there for GFX 9.4.3 ASICs.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-16 11:34:50 -04:00
James Zhu
400a39f1ec drm/amdgpu: skip xcp drm device allocation when out of drm resource
Return 0 when drm device alloc failed with -ENOSPC in
order to  allow amdgpu drive loading. But the xcp without
drm device node assigned won't be visiable in user space.
This helps amdgpu driver loading on system which has more
than 64 nodes, the current limitation.

The proposal to add more drm nodes is discussed in public,
which will support up to 2^20 nodes totally.
kernel drm:
https://lore.kernel.org/lkml/20230724211428.3831636-1-michal.winiarski@intel.com/T/
libdrm:
https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/305

Signed-off-by: James Zhu <James.Zhu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-16 11:34:11 -04:00