Fixes the rlc reference clock used for GPU timestamps.
Value is 100Mhz. Confirmed with hardware team.
v2: reword commit message.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1480
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
This fixes incorrect TCC harvesting info reported to userspace.
The impact was a very very tiny performance degradation (unnecessary
GL2 cache flushes).
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Remove NULL checks before vfree() to fix these warnings:
drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:102:2-8: WARNING: NULL
check before some freeing functions is not needed.
Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Extend retry times of KIQ to avoid starvation situation caused by
long time full access of GPU by other VFs.
Signed-off-by: Jiawei Gu <Jiawei.Gu@amd.com>
Reviewed-by: Emily.Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
As dimgrey_cavefish driver is stable enough, set gpu recovery as default
in HW hang for dimgrey_cavefish.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Jiansong Chen <Jiansong.Chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If device suspend fails when we attempt to runtime suspend,
reset the runpm flag.
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
the flag used by kfd is not actually related to fbcon, it just happens
to align. Use the runpm flag instead so that we can decouple it from
the fbcon flag.
v2: fix resume as well
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
These are already called in amdgpu_device_suspend/resume which
are already called in the same functions.
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use generic PCI reset for GPU reset if the user specifies
PCI reset as the reset mechanism. This should in general
only be used for validation.
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use generic PCI reset for GPU reset if the user specifies
PCI reset as the reset mechanism. This should in general
only be used for validation.
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use generic PCI reset for GPU reset if the user specifies
PCI reset as the reset mechanism. This should in general
only be used for validation.
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This allows us to use generic PCI reset mechanisms (FLR, SBR) as
a reset mechanism to verify that the generic PCI reset mechanisms
are working properly.
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Drop duplicate reset method logging, whitespace changes.
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Drop duplicate reset method logging, whitespace changes.
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Drop duplicate reset method logging, whitespace changes.
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To achieve the best QoS for high priority compute jobs it is
required to limit waves on other compute pipes as well.
This patch will set min value in non high priority
mmSPI_WCL_PIPE_PERCENT_CS[0-3] registers to minimize the
impact of normal/low priority compute jobs over high priority
compute jobs.
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why & How]
In order to get appropriate timing for registers which
read/write is vertical line sensitive, add new IRQ source variable.
This interrupt is triggered by specific vertical line,
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
simplify the list operation.
Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add functions to support enable/disable rom clock gating and get rom
clock gating status.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Switch to smuio callbacks: use smuio v11_0_6 callbacks for
Sienna_cichlid and forward ASIC, use smuio v11_0 callbacks for the
other NV family ASIC.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Implement smuio v11_0_6 callbacks which will used by Sienna_Cichlid and
forward ASIC.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Support to program ASPM and LTR for Sienna Cichlid and forward ASIC.
Disable ASPM for Sienna Cichlid and forward ASIC by default.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Enable DCS
V1: Enable Async DCS.
V2: Add the ppfeaturemask bit to enable from the modprobe parameter.
V3:
1. add the flag to skip APU support.
2. remove the hunk for workload selection since
it doesn't impact the function.
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The hw interface changed on arcturus so the old numbering
scheme doesn't work.
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Enable gfx wave limiting for gfx jobs before pushing high priority
compute jobs so that high priority compute jobs gets more resources
to finish early.
v2: use ring priority instead of job priority.
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Wave limiting can be use to load balance high priority
compute jobs along with gfx jobs. When enabled, this will reserve
~75% of waves for compute jobs.
We do not need this from gfx10 onwards because >=gfx10 has
asynchronous compute tunneling to replace wave limit requirement.
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For high priority compute to work properly we need to enable
wave limiting on gfx pipe. Wave limiting is done through writing
into mmSPI_WCL_PIPE_PERCENT_GFX register. Enable only one high
priority compute queue to avoid race condition between multiple
high priority compute queues writing that register simultaneously.
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch consist of below related changes:
1 Rename ring->priority to ring->hw_prio.
2 Assign correct hardware ring priority.
3 Remove ring->priority_mutex as ring priority remains unchanged
after initialization.
4 Remove unused ring->num_jobs.
v3: remove ring->num_jobs.
v2: remove ring->priority_mutex.
Fixes: 33abcb1f5a ("drm/amdgpu: set compute queue priority at mqd_init")
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Some newer APUs can scanout directly from GTT, that saves us from
allocating another bounce buffer in VRAM and enables freesync in such
configurations.
Without this patch creating a framebuffer from the imported BO will
fail and userspace will fall back to a copy.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For Vangogh:
The offset of the CGTS_TCC_DISABLE is 0x5006 by calculation.
The offset of the CGTS_USER_TCC_DISABLE is 0x5007 by calculation.
Signed-off-by: chen gong <curry.gong@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We cannot modify initial_domain every time while the retry starts. That
will cause the busy waiting that unable to switch to GTT while the vram
is not enough.
Fixes: f8aab60422 ("drm/amdgpu: Initialise drm_gem_object_funcs for imported BOs")
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Some newer APUs can scanout directly from GTT, that saves us from
allocating another bounce buffer in VRAM and enables freesync in such
configurations.
Without this patch creating a framebuffer from the imported BO will
fail and userspace will fall back to a copy.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For Vangogh:
The offset of the CGTS_TCC_DISABLE is 0x5006 by calculation.
The offset of the CGTS_USER_TCC_DISABLE is 0x5007 by calculation.
Signed-off-by: chen gong <curry.gong@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Flag TTM_PL_FLAG_CONTIGUOUS is only valid for VRAM domain. So fix the
false positive by checking memory type too.
Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Enable 1:1 mapping between VRAM of a DRM node and a scatterlist node
[How]
Ensure construction of DRM node to not exceed specified limit
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We cannot modify initial_domain every time while the retry starts. That
will cause the busy waiting that unable to switch to GTT while the vram
is not enough.
Fixes: f8aab60422 ("drm/amdgpu: Initialise drm_gem_object_funcs for imported BOs")
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The purpose of this patch is to add a missing device ID for Sienna Cichlid.
The missing ID "0x73A1" is now added to the "amdgpu_drv.c" file.
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Move all the dummy functions in amdgpu_amdkfd.c to
amdgpu_amdkfd.h as inline functions.
Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Until the issues in the SMU firmware are fixed.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
The double negative makes it hard to read "if (!ACPI_FAILURE(status))".
Replace it with "if (ACPI_SUCCESS(status))".
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Currently the ! operator is incorrectly being used to flip bits on
mask values. Fix this by using the bit-wise ~ operator instead.
Addresses-Coverity: ("Logical vs. bitwise operator")
Fixes: 3c9a7b7d6e ("drm/amdgpu: update mmhub mgcg&ls for mmhub_v2_3")
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
when vram lost happened in guest, try to write vram can lead to
kernel stuck.
[How]
When the readback data is invalid, don't do write work, directly
reschedule a new work.
Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Reviewed-by: Monk Liu<monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fix a racing issue when jobs on 2 rings timeout simultaneously.
If 2 rings timed out at the same time, the
amdgpu_device_gpu_recover will be reentered. Then the
adev->gmc.xgmi.head will be grabbed by 2 local linked list,
which may cause wild pointer issue in iterating.
lock the device earily to prevent the node be added to 2
different lists.
also increase karma for the skipped job since the job is also
timed out and should be guilty.
Signed-off-by: Horace Chen <horace.chen@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Enable pinning of VRAM without forcing it to be contiguous. When memory is
already pinned, make sure it's contiguous if requested.
Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Starting from vangogh, the ATCL2 and DAGB0 registers relative
to mgcg/ls has changed.
For MGCG:
Replace mmMM_ATC_L2_MISC_CG with mmMM_ATC_L2_CGTT_CLK_CTRL.
For MGLS:
Replace mmMM_ATC_L2_MISC_CG with mmMM_ATC_L2_CGTT_CLK_CTRL.
Add DAGB0_(WR/RD)_CGTT_CLK_CTRL registers.
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
GCR_GENERAL_CNTL is defined differently in gc_10_1_0_offset.h and
gc_10_3_0_offset.h. Update GCR_GENERAL_CNTL for Vangogh.
Signed-off-by: Jinzhou Su <Jinzhou.Su@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>