1
0
Fork 0
mirror of synced 2025-03-06 20:59:54 +01:00
Commit graph

10198 commits

Author SHA1 Message Date
Guchun Chen
e17e27f9bd drm/amdgpu: handle the case of pci_channel_io_frozen only in amdgpu_pci_resume
In current code, when a PCI error state pci_channel_io_normal is detectd,
it will report PCI_ERS_RESULT_CAN_RECOVER status to PCI driver, and PCI
driver will continue the execution of PCI resume callback report_resume by
pci_walk_bridge, and the callback will go into amdgpu_pci_resume
finally, where write lock is releasd unconditionally without acquiring
such lock first. In this case, a deadlock will happen when other threads
start to acquire the read lock.

To fix this, add a member in amdgpu_device strucutre to cache
pci_channel_state, and only continue the execution in amdgpu_pci_resume
when it's pci_channel_io_frozen.

Fixes: c9a6b82f45 ("drm/amdgpu: Implement DPC recovery")
Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-05 12:23:56 -04:00
Christian König
127aedf979 drm/amdgpu: print warning and taint kernel if lockup timeout is disabled
Make sure that we notice this in error reports.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-05 12:23:42 -04:00
Christian König
c8365dbda0 drm/amdgpu: revert "Add autodump debugfs node for gpu reset v8"
This reverts commit 728e7e0cd6.

Further discussion reveals that this feature is severely broken
and needs to be reverted ASAP.

GPU reset can never be delayed by userspace even for debugging or
otherwise we can run into in kernel deadlocks.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-05 12:22:36 -04:00
Yifan Zhang
286826d7d9 drm/amdgpu: init iommu after amdkfd device init
This patch is to fix clinfo failure in Raven/Picasso:

Number of platforms: 1
  Platform Profile: FULL_PROFILE
  Platform Version: OpenCL 2.2 AMD-APP (3364.0)
  Platform Name: AMD Accelerated Parallel Processing
  Platform Vendor: Advanced Micro Devices, Inc.
  Platform Extensions: cl_khr_icd cl_amd_event_callback

  Platform Name: AMD Accelerated Parallel Processing Number of devices: 0

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Tested-by: James Zhu <James.Zhu@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-05 12:22:29 -04:00
Lijo Lazar
1d617c029f drm/amdgpu: During s0ix don't wait to signal GFXOFF
In the rare event when GFX IP suspend coincides with a s0ix entry, don't
schedule a delayed work, instead signal PMFW immediately to allow GFXOFF
entry. GFXOFF is a prerequisite for s0ix entry. PMFW needs to be
signaled about GFXOFF status before amd-pmc module passes OS HINT
to PMFW telling that everything is ready for a safe s0ix entry.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1712

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-10-05 10:55:07 -04:00
Lang Yu
b072ef1215 drm/amdkfd: fix a potential ttm->sg memory leak
Memory is allocated for ttm->sg by kmalloc in kfd_mem_dmamap_userptr,
but isn't freed by kfree in kfd_mem_dmaunmap_userptr. Free it!

Fixes: 264fb4d332 ("drm/amdgpu: Add multi-GPU DMA mapping helpers")

Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-05 10:53:45 -04:00
Alex Deucher
630e959f25 drm/amdgpu/gmc9: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:02 -04:00
Guo Zhengkui
8001ba85d0 drm/amdgpu: remove some repeated includings
Remove two repeated includings in line 46 and 47.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:02 -04:00
Lijo Lazar
d04287d062 drm/amdgpu: During s0ix don't wait to signal GFXOFF
In the rare event when GFX IP suspend coincides with a s0ix entry, don't
schedule a delayed work, instead signal PMFW immediately to allow GFXOFF
entry. GFXOFF is a prerequisite for s0ix entry. PMFW needs to be
signaled about GFXOFF status before amd-pmc module passes OS HINT
to PMFW telling that everything is ready for a safe s0ix entry.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1712

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:02 -04:00
Alex Deucher
4b3a624c4c drm/amdgpu: consolidate case statements
IP_VERSION(11, 0, 13) does the exact same thing as
IP_VERSION(11, 0, 12) so squash them together.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:02 -04:00
James Zhu
c60511493b drm/amdgpu/jpeg: add jpeg2.6 start/end
Add jpeg2.6 start/end with updated PCTL0_MMHUB_DEEPSLEEP_IB address.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.lilu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:02 -04:00
James Zhu
d4b0ee65de drm/amdgpu/jpeg2: move jpeg2 shared macro to header file
Move jpeg2 shared macro to header file

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.lilu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:02 -04:00
Lang Yu
546dc20fed drm/amdkfd: fix a potential ttm->sg memory leak
Memory is allocated for ttm->sg by kmalloc in kfd_mem_dmamap_userptr,
but isn't freed by kfree in kfd_mem_dmaunmap_userptr. Free it!

Fixes: 264fb4d332 ("drm/amdgpu: Add multi-GPU DMA mapping helpers")

Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:02 -04:00
Alex Deucher
a79d3709c4 drm/amdgpu: add an option to override IP discovery table from a file
If you set amdgpu.discovery=2 you can force the the driver to
fetch the IP discovery table from a file rather than from the
table shipped on the device.  This is useful for debugging and
for device bring up and emulation when the tables may be in flux.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:02 -04:00
Alex Deucher
5b983db8c3 drm/amdkfd: clean up parameters in kgd2kfd_probe
We can get the pdev and asic type from the adev.  No need
to pass them explicitly.

v2: squash in build fix for !CONFIG_HSA_AMD from Anson

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:02 -04:00
Alex Deucher
6d46d419af drm/amdgpu: add support for SRIOV in IP discovery path
Handle SRIOV requirements when adding IP blocks.

v2: add comment about UVD/VCE support on vega20 SR-IOV

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher
b05b9c591f drm/amdgpu: clean up set IP function
Split into several smaller per IP functions to make it
easier to handle ordering issues for things like
SR-IOV in a follow up patch.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher
1d789535a0 drm/amdgpu: convert IP version array to include instances
Allow us to query instances versions more cleanly.

Instancing support is not consistent unfortunately. SDMA is a
good example.  Sienna cichlid has 4 total SDMA instances, each
enumerated separately (HWIDs 42, 43, 68, 69).  Arcturus has 8
total SDMA instances, but they are enumerated as multiple
instances of the same HWIDs (4x HWID 42, 4x HWID 43).  UMC
is another example.  On most chips there are multiple
instances with the same HWID.  This allows us to support both
forms.

v2: rebase
v3: clarify instancing support

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher
d0761fd24e drm/amdgpu: set CHIP_IP_DISCOVERY as the asic type by default
For new chips with no explicit entry in the PCI ID list.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher
3ae695d691 drm/amdgpu: add new asic_type for IP discovery
Add a new asic type for asics where we don't have an
explicit entry in the PCI ID list.  We don't need
an asic type for these asics, other than something higher
than the existing ones, so just use this for all new
asics.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher
aa9f8cc349 drm/amdgpu/ucode: add default behavior
Default to PSP ucode loading unless the user specifies
direct.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher
f174161517 drm/amdgpu: get VCN harvest information from IP discovery table
Use the table rather than asic specific harvest registers.

v2: remove harvesting register checking

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher
1b592d00b4 drm/amdgpu/vcn: remove manual instance setting
Handled by IP discovery now.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher
fe323f039d drm/amdgpu/sdma: remove manual instance setting
Handled by IP discovery now.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher
5c3720be7d drm/amdgpu: get VCN and SDMA instances from IP discovery table
Rather than hardcoding it.  We already have the number of VCN
instances from a previous patch, so just update the VCN
instances for chips with static tables.

v2: squash in checks for SDMA3,4 (Guchun)
v3: clarify VCN changes

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher
5eceb20192 drm/amdgpu: add VCN1 hardware IP
So we can store the VCN IP revision for each instance of VCN.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher
75a07bcd1d drm/amdgpu/soc15: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher
0b64a5a852 drm/amdgpu/vcn2.5: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher
96b8dd4423 drm/amdgpu/amdgpu_vcn: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

v2: squash in fix for navy flounder and sienna cichlid

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher
1fcc208cd7 drm/amdgpu/psp_v13.0: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher
e47868ea15 drm/amdgpu/psp_v11.0: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher
82d05736c4 drm/amdgpu/amdgpu_psp: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher
9d0cb2c318 drm/amdgpu/gfx9.0: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher
24be2d7004 drm/amdgpu/hdp4.0: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher
43bf00f21e drm/amdgpu/sdma4.0: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher
f7f12b2582 drm/amdgpu: default to true in amdgpu_device_asic_has_dc_support
We are not going to support any new chips with the old
non-DC code so make it the default.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher
9878844094 drm/amdgpu: drive all vega asics from the IP discovery table
Rather than hardcoding based on asic_type, use the IP
discovery table to configure the driver.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher
91e9db33be drm/amdgpu/soc15: get rev_id in soc15_common_early_init
for consistency with other SoCs.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher
d4c6e870bd drm/amdgpu: add initial IP discovery support for vega based parts
Hardcode the IP versions for asics without IP discovery tables
and then enumerate the asics based on the IP versions.

TODO: fix SR-IOV support

v2: Squash in HDP fix for Renoir

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher
994470b252 drm/amdgpu/soc15: export common IP functions
So they can be driven by IP discovery table.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:59 -04:00
Alex Deucher
5f93148955 drm/amdgpu: add DCI HWIP
So we can track grab the appropriate DCE info out of the
IP discovery table.  This is a separare IP from DCN.

Acked-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:59 -04:00
Alex Deucher
75aa18415a drm/amdgpu: drive all navi asics from the IP discovery table
Rather than hardcoding based on asic_type, use the IP
discovery table to configure the driver.

v2: rebase

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:59 -04:00
Alex Deucher
3e67f4f2e2 drm/amdgpu/nv: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:59 -04:00
Alex Deucher
7c69d6153e drm/amdgpu/navi10_ih: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:59 -04:00
Alex Deucher
258fa17d1a drm/amdgpu/athub2.1: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:59 -04:00
Alex Deucher
13ebe284a2 drm/amdgpu/athub2.0: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:59 -04:00
Alex Deucher
4edbbfde89 drm/amdgpu/vcn3.0: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:59 -04:00
Alex Deucher
bc7c3d1d8a drm/amdgpu/mmhub2.1: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:59 -04:00
Alex Deucher
ce2d99a84f drm/amdgpu/mmhub2.0: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:59 -04:00
Alex Deucher
fac1772374 drm/amdgpu/gfxhub2.1: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:59 -04:00