In multi container use case, reset time is important, so skip ring
tests and cp halt wait during ip suspending for reset as they are
going to fail and cost more time on reset
v2: add a hang flag to indicate the reset comes from a job timeout,
skip ring test and cp halt wait in this case
v3: move hang flag to adev
Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For some hang caused by slow tests, engine cannot be stopped which
may cause resume failure after reset. In this case, force halt
engine by reverting context addresses
Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Save and restore gfxhub regs as they will be reset during mode 2
Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Introduce amdgpu_reset_level debugfs in order to help debug and
test specific type of reset. Also helps blocking unwanted type of
resets.
By default, mode2 reset will not be enabled
v2: make this debugfs in adev and use debugfs_create_u32
Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
- introduce AMDGPU_SKIP_MODE2_RESET flag
- let mode2 reset fallback to default reset method if failed
v2: move this part out from the asic specific part
Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To meet the requirement for multi container usecase which needs
a quicker reset and not causing VRAM lost, adding the Mode2
reset handler for sienna_cichlid.
v2: move skip mode2 flag part separately
v3: remove the use of asic_reset_res
Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add secure display TA load for Renoir
Signed-off-by: Shane Xiao <shane.xiao@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There is no point in returning an int here. It only returns 0 which
the caller never uses. Therefore return void and remove the unnecessary
assignment.
Addresses-Coverity: 1504988 ("Unused value")
Fixes: 8da1170a16 ("drm/amdgpu: add VCN4 ip block support")
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Suggested-by: Ruijing Dong <ruijing.dong@amd.com>
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Khalid Masum <khalid.masum.92@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Local variable 'rq' is initialized by an address
of field of drm_sched_job, so it does not make
sense to compare 'rq' with NULL.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Andrey Strachuk <strochuk@ispras.ru>
Fixes: 7c6e68c777 ("drm/amdgpu: Avoid HW GPU reset for RAS.")
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
During multi-vf executing benchmark (Luxmark) observed kiq error timeout.
It happenes because all of VFs do the tlb invalidation at the same time.
Although each VF has the invalidate register set, from hardware side
the invalidate requests are queue to execute.
[How]
In case of 12 VF increase timeout on 12*100ms
Signed-off-by: Dusica Milinkovic <Dusica.Milinkovic@amd.com>
Acked-by: Shaoyun Liu <shaoyun.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Enable AMD_CG_SUPPORT_IH_CG support.
Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Enable AMD_CG_SUPPORT_ATHUB_MGCG and AMD_CG_SUPPORT_ATHUB_LS support.
Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Enable AMD_CG_SUPPORT_HDP_MGCG and AMD_CG_SUPPORT_HDP_LS support.
Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Enable AMD_CG_SUPPORT_MC_MGCG and AMD_CG_SUPPORT_MC_LS support.
Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add ATHUB IP v3.0.1 in athub_v3_0_set_clockgating.
The regATHUB_MISC_CNTL has different offset for ATHUB IP v3.0.1,
so need to add IP version checking to use the right REG offset.
Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add set/get_clockgating for HDP IP v5.2.1.
Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add set/get_clockgating for MMHUB IP v3.0.1.
Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
add ih cg and hdp sd on smu_v13_0_7
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Some stability issues were reported with these features.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fix a double free and an uninitialized pointer read error. Both tmp and
new are pointing at same address and both are freed which leads to
double free. Adding a check to verify if new and tmp are free in the
error_free label fixes the double free issue. new is not initialized to
null which also leads to a free on an uninitialized pointer.
Reviewed-by: André Almeida <andrealmeid@igalia.com>
Suggested by: S. Amaranath <Amaranath.Somalapuram@amd.com>
Signed-off-by: Sebin Sebastian <mailmesebin00@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We changed it for all asics due to a hibernation regression
on hawaii, but the workaround breaks suspend on a polaris12.
Just disable it for hawaii.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216119
Fixes: 3a4b1cc28f ("drm/amdgpu/display: disable prefer_shadow for generic fb helpers")
Reviewed-and-tested-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Add AMD_PG_SUPPORT_GFX_PG support.
Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Enable GFX Power Gating control for GC IP v11.0.1.
Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For typesafety, use container_of() instead of implicit cast from struct
ttm_tt to struct amdgpu_ttm_tt.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Enable translate_further on Arcturus and Aldebaran server chips
in order to increase the UTCL2 reach from 8 GiB to 64 GiB,
which is more in line with the amount of framebuffer DRAM in
the devices.
Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Kent Russell <kent.russell@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Remove redundant reference of header file dev_printk.h
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
A list of devices to be reset is already created in
amdgpu_device_gpu_recover function. Creating another list with the
same nodes is incorrect and not supported in list_head. Instead, pass
the device list as part of reset context.
Fixes: 9e08564727 (drm/amdgpu: Refactor mode2 reset logic for v13.0.2)
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Update the vram width algorithm for vram_info v3_0 to align with the
changes of latest IFWI.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 5.19.x
Separately accumulate a statistic of rounded up allocations to use
to report availability, with a view to increasing the likelihood a
buffer object can be successfully allocated at exactly the size
reported by the availability API.
Signed-off-by: Daniel Phillips <daniel.phillips@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rounding up allocations in the allocation path caused test regressions,
so now just round in the availability path.
Signed-off-by: Daniel Phillips <daniel.phillips@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The builds for alpha and mips allmodconfig fails with the error:
drivers/gpu/drm/amd/amdgpu/psp_v13_0.c:534:23: error: implicit declaration of function 'vmalloc'; did you mean 'kvmalloc'? [-Werror=implicit-function-declaration]
drivers/gpu/drm/amd/amdgpu/psp_v13_0.c:534:21: error: assignment to 'void *' from 'int' makes pointer from integer without a cast [-Werror=int-conversion]
drivers/gpu/drm/amd/amdgpu/psp_v13_0.c:545:33: error: implicit declaration of function 'vfree'; did you mean 'kvfree'? [-Werror=implicit-function-declaration]
Add the header file for vmalloc and vfree.
Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch will enable support for psp 13.0.4 blcok.
Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch will add files for PSP 13.0.4.
Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch corrects RLC_RLCS_BOOTLOAD_STATUS offset and index for
GC 11.0.1.
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Hives that require psp topology info to be reflected will leak hive
reference so fix it.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Shaoyun Liu <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
mes self test rely on vm mapping, move it after
drm sched re-started so that vm mapping can work
during gpu reset.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-and-tested-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This extra call trace dump comes out in every gpu reset.
And it gives people a wrong impression that something
went wrong. Although actually there was not.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Not enable VCN pg because encode issue
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Enable support for vcn_4_0_2 video codec
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add function to support VCN_4_0_2 doorbell
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The return value should be set in vcn4 boot poll.
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
No need to reference the BO here, dmabuf framework will handle that.
Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>