1
0
Fork 0
mirror of synced 2025-03-06 20:59:54 +01:00
Commit graph

8637 commits

Author SHA1 Message Date
Asher.Song
0c61ac8134 drm/amdgpu:disable VCN for Navi12 SKU
Navi12 0x7360/C7 SKU has no video support, so remove it.

Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Asher.Song <Asher.Song@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-03-03 22:48:33 -05:00
Alex Deucher
31ada99bdd drm/amdgpu: Only check for S0ix if AMD_PMC is configured
The S0ix check only makes sense if the AMD PMC driver is
present.  We need to use the legacy S3 pathes when the
PMC driver is not present.

Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-03-03 22:46:55 -05:00
Prike Liang
b092b19602 drm/amdgpu: fix shutdown and poweroff process failed with s0ix
In the shutdown and poweroff opt on the s0i3 system we still need
un-gate the gfx clock gating and power gating before destory amdgpu device.

Fixes: 628c36d7b2 ("drm/amdgpu: update amdgpu device suspend/resume sequence for s0i3 support")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1499
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-02-24 09:48:46 -05:00
Alex Deucher
6e80fb8ab0 drm/amdgpu: Set reference clock to 100Mhz on Renoir (v2)
Fixes the rlc reference clock used for GPU timestamps.
Value is 100Mhz.  Confirmed with hardware team.

v2: reword commit message.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1480
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-02-18 16:43:09 -05:00
Marek Olšák
4112c00354 drm/amdgpu: fix CGTS_TCC_DISABLE register offset on gfx10.3
This fixes incorrect TCC harvesting info reported to userspace.
The impact was a very very tiny performance degradation (unnecessary
GL2 cache flushes).

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-02-18 16:42:55 -05:00
Tian Tao
802b8c8355 drm/amdgpu: fix unnecessary NULL check warnings
Remove NULL checks before vfree() to fix these warnings:
drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:102:2-8: WARNING: NULL
check before some freeing functions is not needed.

Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:49:33 -05:00
Jiawei Gu
006cc1a213 drm/amdgpu: extend MAX_KIQ_REG_TRY to 1000
Extend retry times of KIQ to avoid starvation situation caused by
long time full access of GPU by other VFs.

Signed-off-by: Jiawei Gu <Jiawei.Gu@amd.com>
Reviewed-by: Emily.Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:48:49 -05:00
Tao Zhou
27859ee3df drm/amdgpu: enable gpu recovery for dimgrey_cavefish
As dimgrey_cavefish driver is stable enough, set gpu recovery as default
in HW hang for dimgrey_cavefish.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Jiansong Chen <Jiansong.Chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:30:25 -05:00
Alex Deucher
cef8b03bbc drm/amdgpu: reset runpm flag if device suspend fails
If device suspend fails when we attempt to runtime suspend,
reset the runpm flag.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:30:13 -05:00
Alex Deucher
ad887af9b6 drm/amdgpu: use runpm flag rather than fbcon for kfd runtime suspend (v2)
the flag used by kfd is not actually related to fbcon, it just happens
to align.  Use the runpm flag instead so that we can decouple it from
the fbcon flag.

v2: fix resume as well

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:30:07 -05:00
Alex Deucher
a8d3d80a8c drm/amdgpu: drop extra drm_kms_helper_poll_enable/disable calls
These are already called in amdgpu_device_suspend/resume which
are already called in the same functions.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:30:03 -05:00
Alex Deucher
f172865a36 drm/amdgpu/nv: add PCI reset support
Use generic PCI reset for GPU reset if the user specifies
PCI reset as the reset mechanism.  This should in general
only be used for validation.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:30:01 -05:00
Alex Deucher
1176a1e0b9 drm/amdgpu/soc15: add PCI reset support
Use generic PCI reset for GPU reset if the user specifies
PCI reset as the reset mechanism.  This should in general
only be used for validation.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:29:59 -05:00
Alex Deucher
ffbfd081b4 drm/amdgpu/si: add PCI reset support
Use generic PCI reset for GPU reset if the user specifies
PCI reset as the reset mechanism.  This should in general
only be used for validation.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:29:57 -05:00
Alex Deucher
af484df800 drm/amdgpu: add generic pci reset as an option
This allows us to use generic PCI reset mechanisms (FLR, SBR) as
a reset mechanism to verify that the generic PCI reset mechanisms
are working properly.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:29:54 -05:00
Alex Deucher
d5ab066917 drm/amdgpu/vi: minor clean up of reset code
Drop duplicate reset method logging, whitespace changes.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:29:52 -05:00
Alex Deucher
44ab8bb0bb drm/amdgpu/cik: minor clean up of reset code
Drop duplicate reset method logging, whitespace changes.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:29:50 -05:00
Alex Deucher
25bd55276b drm/amdgpu/si: minor clean up of reset code
Drop duplicate reset method logging, whitespace changes.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:29:40 -05:00
Nirmoy Das
f8bf645018 drm/amdgpu: enable wave limit on non high prio cs pipes
To achieve the best QoS for high priority compute jobs it is
required to limit waves on other compute pipes as well.
This patch will set min value in non high priority
mmSPI_WCL_PIPE_PERCENT_CS[0-3] registers to minimize the
impact of normal/low priority compute jobs over high priority
compute jobs.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:29:08 -05:00
Wayne Lin
11f1a5538b drm/amdgpu: Add otg vertical IRQ Source
[Why & How]
In order to get appropriate timing for registers which
read/write is vertical line sensitive, add new IRQ source variable.
This interrupt is triggered by specific vertical line,

Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:28:55 -05:00
Kevin Wang
be8901c2ee drm/amdgpu: optimize list operation in amdgpu_xgmi
simplify the list operation.

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:28:49 -05:00
Likun Gao
1001f2a1f3 drm/amdgpu: support rom clockgating related function for NV family
Add functions to support enable/disable rom clock gating and get rom
clock gating status.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:28:36 -05:00
Likun Gao
0bf7f2dcb9 drm/amdgpu: switch to use smuio callbacks for NV family
Switch to smuio callbacks: use smuio v11_0_6 callbacks for
Sienna_cichlid and forward ASIC, use smuio v11_0 callbacks for the
other NV family ASIC.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:28:27 -05:00
Likun Gao
1deb98534c drm/amdgpu: implement smuio v11_0_6 callbacks
Implement smuio v11_0_6 callbacks which will used by Sienna_Cichlid and
forward ASIC.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:28:21 -05:00
Likun Gao
e1edaeafeb drm/amdgpu: support ASPM for some specific ASIC
Support to program ASPM and LTR for Sienna Cichlid and forward ASIC.
Disable ASPM for Sienna Cichlid and forward ASIC by default.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:28:04 -05:00
Kenneth Feng
680602d6c2 drm/amd/pm: enable DCS
Enable DCS

V1: Enable Async DCS.
V2: Add the ppfeaturemask bit to enable from the modprobe parameter.
V3:
1. add the flag to skip APU support.
2. remove the hunk for workload selection since
it doesn't impact the function.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:27:57 -05:00
Alex Deucher
e83db77487 drm/amdgpu/gmc9: fix mmhub client mapping for arcturus
The hw interface changed on arcturus so the old numbering
scheme doesn't work.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:27:47 -05:00
Nirmoy Das
22e4f31529 drm/amdgpu: enable gfx wave limiting for high priority compute jobs
Enable gfx wave limiting for gfx jobs before pushing high priority
compute jobs so that high priority compute jobs gets more resources
to finish early.

v2: use ring priority instead of job priority.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:27:11 -05:00
Nirmoy Das
0a52a6cacc drm/amdgpu: add wave limit functionality for gfx8,9
Wave limiting can be use to load balance high priority
compute jobs along with gfx jobs. When enabled, this will reserve
~75% of waves for compute jobs.

We do not need this from gfx10 onwards because >=gfx10 has
asynchronous compute tunneling to replace wave limit requirement.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:27:04 -05:00
Nirmoy Das
8c0225d792 drm/amdgpu: enable only one high prio compute queue
For high priority compute to work properly we need to enable
wave limiting on gfx pipe. Wave limiting is done through writing
into mmSPI_WCL_PIPE_PERCENT_GFX register. Enable only one high
priority compute queue to avoid race condition between multiple
high priority compute queues writing that register simultaneously.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:26:56 -05:00
Nirmoy Das
ebdd2e9d1a drm/amdgpu: cleanup struct amdgpu_ring
This patch consist of below related changes:

1 Rename ring->priority to ring->hw_prio.
2 Assign correct hardware ring priority.
3 Remove ring->priority_mutex as ring priority remains unchanged
  after initialization.
4 Remove unused ring->num_jobs.

v3: remove ring->num_jobs.
v2: remove ring->priority_mutex.

Fixes: 33abcb1f5a ("drm/amdgpu: set compute queue priority at mqd_init")
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-09 15:26:41 -05:00
Christian König
dd017d01c3 drm/amdgpu: enable freesync for A+A configs
Some newer APUs can scanout directly from GTT, that saves us from
allocating another bounce buffer in VRAM and enables freesync in such
configurations.

Without this patch creating a framebuffer from the imported BO will
fail and userspace will fall back to a copy.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-02 12:06:54 -05:00
chen gong
2cb96b2387 drm/amdgpu/gfx10: update CGTS_TCC_DISABLE and CGTS_USER_TCC_DISABLE register offsets for VGH
For Vangogh:
The offset of the CGTS_TCC_DISABLE is 0x5006 by calculation.
The offset of the CGTS_USER_TCC_DISABLE is 0x5007 by calculation.

Signed-off-by: chen gong <curry.gong@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-02 12:06:29 -05:00
xinhui pan
e1a4b67aac drm/amdgpu: Fix a false positive when pin non-VRAM memory
Flag TTM_PL_FLAG_CONTIGUOUS is only valid for VRAM domain. So fix the
false positive by checking memory type too.

Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-02 12:06:21 -05:00
Ramesh Errabolu
b131c363c8 drm/amdgpu: Limit the maximum size of contiguous VRAM that can be encapsulated by an instance of DRM memory node
[Why]
Enable 1:1 mapping between VRAM of a DRM node and a scatterlist node

[How]
Ensure construction of DRM node to not exceed specified limit

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-02 12:05:29 -05:00
Huang Rui
875440fd7d drm/amdkfd: fix null pointer panic while free buffer in kfd
In drm_gem_object_free, it will call funcs of drm buffer obj. So
kfd_alloc should use amdgpu_gem_object_create instead of
amdgpu_bo_create to initialize the funcs as amdgpu_gem_object_funcs.

[  396.231390] amdgpu: Release VA 0x7f76b4ada000 - 0x7f76b4add000
[  396.231394] amdgpu:   remove VA 0x7f76b4ada000 - 0x7f76b4add000 in entry 0000000085c24a47
[  396.231408] BUG: kernel NULL pointer dereference, address: 0000000000000000
[  396.231445] #PF: supervisor read access in kernel mode
[  396.231466] #PF: error_code(0x0000) - not-present page
[  396.231484] PGD 0 P4D 0
[  396.231495] Oops: 0000 [#1] SMP NOPTI
[  396.231509] CPU: 7 PID: 1352 Comm: clinfo Tainted: G           OE     5.11.0-rc2-custom #1
[  396.231537] Hardware name: AMD Celadon-RN/Celadon-RN, BIOS WCD0401N_Weekly_20_04_0 04/01/2020
[  396.231563] RIP: 0010:drm_gem_object_free+0xc/0x22 [drm]
[  396.231606] Code: eb ec 48 89 c3 eb e7 0f 1f 44 00 00 55 48 89 e5 48 8b bf 00 06 00 00 e8 72 0d 01 00 5d c3 0f 1f 44 00 00 48 8b 87 40 01 00 00 <48> 8b 00 48 85 c0 74 0b 55 48 89 e5 e8 54 37 7c db 5d c3 0f 0b c3
[  396.231666] RSP: 0018:ffffb4704177fcf8 EFLAGS: 00010246
[  396.231686] RAX: 0000000000000000 RBX: ffff993a0d0cc400 RCX: 0000000000003113
[  396.231711] RDX: 0000000000000001 RSI: e9cda7a5d0791c6d RDI: ffff993a333a9058
[  396.231736] RBP: ffffb4704177fdd0 R08: ffff993a03855858 R09: 0000000000000000
[  396.231761] R10: ffff993a0d1f7158 R11: 0000000000000001 R12: 0000000000000000
[  396.231785] R13: ffff993a0d0cc428 R14: 0000000000003000 R15: ffffb4704177fde0
[  396.231811] FS:  00007f76b5730740(0000) GS:ffff993b275c0000(0000) knlGS:0000000000000000
[  396.231840] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  396.231860] CR2: 0000000000000000 CR3: 000000016d2e2000 CR4: 0000000000350ee0
[  396.231885] Call Trace:
[  396.231897]  ? amdgpu_amdkfd_gpuvm_free_memory_of_gpu+0x24c/0x25f [amdgpu]
[  396.232056]  ? __dynamic_dev_dbg+0xcd/0x100
[  396.232076]  kfd_ioctl_free_memory_of_gpu+0x91/0x102 [amdgpu]
[  396.232214]  kfd_ioctl+0x211/0x35b [amdgpu]
[  396.232341]  ? kfd_ioctl_get_queue_wave_state+0x52/0x52 [amdgpu]

Fixes: 246cb7e49a ("drm/amdgpu: Introduce GEM object functions")
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: Changfeng <changzhu@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-02 10:47:47 -05:00
Huang Rui
c5f85696cb drm/amdgpu: fix the issue that retry constantly once the buffer is oversize
We cannot modify initial_domain every time while the retry starts. That
will cause the busy waiting that unable to switch to GTT while the vram
is not enough.

Fixes: f8aab60422 ("drm/amdgpu: Initialise drm_gem_object_funcs for imported BOs")

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-02 10:47:47 -05:00
Ori Messinger
d26bbbcc16 amdgpu: Add Missing Sienna Cichlid DID
The purpose of this patch is to add a missing device ID for Sienna Cichlid.
The missing ID "0x73A1" is now added to the "amdgpu_drv.c" file.

Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-02-01 11:51:46 -05:00
Lang Yu
cd63989e0e drm/amd/amdkfd: adjust dummy functions' placement
Move all the dummy functions in amdgpu_amdkfd.c to
amdgpu_amdkfd.h as inline functions.

Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-01-28 14:58:27 -05:00
Alex Deucher
33cf440d59 drm/amdgpu: disable gpu reset on Vangogh for now
Until the issues in the SMU firmware are fixed.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
2021-01-28 14:58:10 -05:00
Colin Ian King
5993e79398 drm/amdgpu: Fix masking binary not operator on two mask operations
Currently the ! operator is incorrectly being used to flip bits on
mask values. Fix this by using the bit-wise ~ operator instead.

Addresses-Coverity: ("Logical vs. bitwise operator")
Fixes: 3c9a7b7d6e ("drm/amdgpu: update mmhub mgcg&ls for mmhub_v2_3")
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-01-25 17:47:05 -05:00
Jingwen Chen
64dcf2f01d drm/amd/amdgpu: add error handling to amdgpu_virt_read_pf2vf_data
[Why]
when vram lost happened in guest, try to write vram can lead to
kernel stuck.

[How]
When the readback data is invalid, don't do write work, directly
reschedule a new work.

Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Reviewed-by: Monk Liu<monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-01-25 17:46:48 -05:00
Horace Chen
91fb309d82 drm/amdgpu: race issue when jobs on 2 ring timeout
Fix a racing issue when jobs on 2 rings timeout simultaneously.

If 2 rings timed out at the same time, the
amdgpu_device_gpu_recover will be reentered. Then the
adev->gmc.xgmi.head will be grabbed by 2 local linked list,
which may cause wild pointer issue in iterating.

lock the device earily to prevent the node be added to 2
different lists.

also increase karma for the skipped job since the job is also
timed out and should be guilty.

Signed-off-by: Horace Chen <horace.chen@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-01-25 17:45:16 -05:00
Felix Kuehling
eda1068dc9 drm/amdgpu: Make contiguous pinning optional
Enable pinning of VRAM without forcing it to be contiguous. When memory is
already pinned, make sure it's contiguous if requested.

Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-01-25 17:45:10 -05:00
Huang Rui
dcb820d185 drm/amdgpu: remove gpu info firmware of green sardine
The ip discovery is supported on green sardine, it doesn't need gpu info
firmware anymore.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-01-21 09:55:23 -05:00
Feifei Xu
2b3a1f515f drm/amdgpu:Add pcie gen5 support in pcie capability.
Add PCIE_SPEED_32_0GT and PCIE GEN5 support for amdgpu.

Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-01-21 09:54:56 -05:00
Jinzhou Su
366468ff6c drm/amdgpu: Allow GfxOff on Vangogh as default
Send allow GfxOff message to SMU to enter GfxOff
mode as default.

Signed-off-by: Jinzhou Su <Jinzhou.Su@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-01-21 09:54:50 -05:00
Aaron Liu
3c9a7b7d6e drm/amdgpu: update mmhub mgcg&ls for mmhub_v2_3
Starting from vangogh, the ATCL2 and DAGB0 registers relative
to mgcg/ls has changed.

For MGCG:
Replace mmMM_ATC_L2_MISC_CG with mmMM_ATC_L2_CGTT_CLK_CTRL.

For MGLS:
Replace mmMM_ATC_L2_MISC_CG with mmMM_ATC_L2_CGTT_CLK_CTRL.
Add DAGB0_(WR/RD)_CGTT_CLK_CTRL registers.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-01-21 09:54:07 -05:00
Jinzhou Su
860cc26a01 drm/amdgpu: Add RLC_PG_DELAY_3 for Vangogh
Driver should enable the CGPG feature for RLC in safe mode to
prevent any misalignment or conflict in middle of any power
feature entry/exit sequence.
Achieved by setting RLC_PG_CNTL.GFX_POWER_GATING_ENABLE = 0x1,
and RLC_PG_DELAY_3.CGCG_ACTIVE_BEFORE_CGPG to the desired CGPG
hysteresis value in refclk count.

Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Jinzhou Su <Jinzhou.Su@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-01-21 09:53:33 -05:00
Jinzhou Su
91067d8959 drm/amdgpu: modify GCR_GENERAL_CNTL for Vangogh
GCR_GENERAL_CNTL is defined differently in gc_10_1_0_offset.h and
gc_10_3_0_offset.h. Update GCR_GENERAL_CNTL for Vangogh.

Signed-off-by: Jinzhou Su <Jinzhou.Su@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-01-20 16:38:23 -05:00