linux

mirror of synced 2025-03-06 20:59:54 +01:00

Author	SHA1	Message	Date
Alex Deucher	524cf3ab85	drm/amdgpu: drive nav10 from the IP discovery table Rather than hardcoding based on asic_type, use the IP discovery table to configure the driver. Only tested on Navi10 so far. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:22:58 -04:00
Alex Deucher	63352b7f98	drm/amdgpu: Use IP discovery to drive setting IP blocks by default Drive the asic setup from the IP discovery table rather than hardcoded settings based on asic type. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:22:58 -04:00
Alex Deucher	5db9d0657e	drm/amdgpu/gmc10.0: convert to IP version checking Use IP versions rather than asic_type to differentiate IP version specific features. v2: squash in gmc fixes v3: rebase Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:22:58 -04:00
Alex Deucher	eb4fd29afd	drm/amdgpu: bind to any 0x1002 PCI diplay class device Bind to all 0x1002 GPU devices. For now we explicitly return -ENODEV for generic bindings. Remove this check once IP discovery based checking is in place. v2: rebase (Alex) Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:22:58 -04:00
Alex Deucher	bdbeb0dde4	drm/amdgpu: filter out radeon PCI device IDs Once we claim all 0x1002 PCI display class devices, we will need to filter out devices owned by radeon. v2: rename radeon id array to make it more clear that the devices are not supported by amdgpu. add r128, mach64 pci ids as well Acked-by: Christian König <christian.koenig@amd.com> (v1) Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:22:58 -04:00
Alex Deucher	4b0ad84254	drm/amdgpu/gfx10: convert to IP version checking Use IP versions rather than asic_type to differentiate IP version specific features. v2: rebase, squash in navi10 fixes (Alex) Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:22:58 -04:00
Alex Deucher	8f4bb1e784	drm/amdgpu/sdma5.2: convert to IP version checking Use IP versions rather than asic_type to differentiate IP version specific features. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:22:58 -04:00
Alex Deucher	02200e910c	drm/amdgpu/sdma5.0: convert to IP version checking Use IP versions rather than asic_type to differentiate IP version specific features. v2: rebase Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:22:58 -04:00
Alex Deucher	795d08391b	drm/amdgpu: add initial IP enumeration via IP discovery table Add initial support for all navi based parts. v2: rebase Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:22:58 -04:00
Alex Deucher	a1f62df75b	drm/amdgpu/nv: export common IP functions So they can be driven by IP dicovery table. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:22:58 -04:00
Alex Deucher	1534db5549	drm/amdgpu: add XGMI HWIP So we can track grab the appropriate XGMI info out of the IP discovery table. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:22:58 -04:00
Alex Deucher	54d2b1f402	drm/amdgpu: fill in IP versions from IP discovery table Prerequisite for using IP versions in the driver rather than asic type. v2: Use IP_VERSION() macro instead of new function Reviewed-by: Christian König <christian.koenig@amd.com> (v1) Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:22:58 -04:00
Alex Deucher	5f52e9a780	drm/amdgpu: store HW IP versions in the driver structure So we can check the IP versions directly rather than using asic type. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:22:58 -04:00
Alex Deucher	81d1bf01e4	drm/amdgpu: add debugfs access to the IP discovery table Useful for debugging and new asic validation. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:22:58 -04:00
Alex Deucher	f76f795a8f	drm/amdgpu: move headless sku check into harvest function Consolidate harvesting information. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:22:58 -04:00
John Clements	eb601e61d3	drm/amdgpu: resolve RAS query bug clear error count when persistant harvesting is not enabled Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:22:57 -04:00
Tao Zhou	c749094923	amd/amdkfd: add ras page retirement handling for sq/sdma (v3) In ras poison mode, page retirement will be handled by the irq handler of the module which consumes corrupted data. v2: rename ras_process_cb to ras_poison_consumption_handler. move the handler's implementation from ASIC specific file to common file. v3: call gpu reset for xGMI connected mode. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:22:57 -04:00
Prike Liang	e5d59cfa33	drm/amdgpu: force exit gfxoff on sdma resume for rmb s0ix In the s2idle stress test sdma resume fail occasionally,in the failed case GPU is in the gfxoff state.This issue may introduce by firmware miss handle doorbell S/R and now temporary fix the issue by forcing exit gfxoff for sdma resume. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:22:57 -04:00
Zhan Liu	3f68c01be9	drm/amd/display: add cyan_skillfish display support [Why] add display related cyan_skillfish files in. makefile controlled by CONFIG_DRM_AMD_DC_DCN201 flag. v2: squash in clang fixes from Harry, Nathan v3: squash in missing CONFIG_DRM_AMD_DC check (Alex) Signed-off-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Zhan Liu <zhan.liu@amd.com> Reviewed-by: Charlene Liu <charlene.liu@amd.com> Acked-by: Jun Lei <jun.lei@amd.com> Acked-by: Harry Wentland <harry.wentland@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-10-04 15:22:57 -04:00
Sean Paul	6f67e6fd4d	Revert "drm/amd: cleanup: drm_modeset_lock_all() --> DRM_MODESET_LOCK_ALL_BEGIN()" This reverts commit `299f040e85`. This patchset breaks on intel platforms and was previously NACK'd by Ville. Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Fernando Ramos <greenfoo@u92.eu> Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Sean Paul <seanpaul@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20211002154542.15800-2-sean@poorly.run	2021-10-04 09:34:55 -04:00
Tom Lendacky	e9d1d2bb75	treewide: Replace the use of mem_encrypt_active() with cc_platform_has() Replace uses of mem_encrypt_active() with calls to cc_platform_has() with the CC_ATTR_MEM_ENCRYPT attribute. Remove the implementation of mem_encrypt_active() across all arches. For s390, since the default implementation of the cc_platform_has() matches the s390 implementation of mem_encrypt_active(), cc_platform_has() does not need to be implemented in s390 (the config option ARCH_HAS_CC_PLATFORM is not set). Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210928191009.32551-9-bp@alien8.de	2021-10-04 11:47:24 +02:00
Fernando Ramos	299f040e85	drm/amd: cleanup: drm_modeset_lock_all() --> DRM_MODESET_LOCK_ALL_BEGIN() As requested in Documentation/gpu/todo.rst, replace driver calls to drm_modeset_lock_all() with DRM_MODESET_LOCK_ALL_BEGIN() and DRM_MODESET_LOCK_ALL_END() Signed-off-by: Fernando Ramos <greenfoo@u92.eu> Signed-off-by: Sean Paul <seanpaul@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20210924064324.229457-16-greenfoo@u92.eu	2021-10-01 13:00:57 -04:00
Andrey Grodzovsky	5c67ff3a4c	drm/amdgpu: Add a UAPI flag for hot plug/unplug To support libdrm tests. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-09-29 17:30:00 -04:00
Andrey Grodzovsky	894c6890a2	drm/amdgpu: drm/amdgpu: Handle IOMMU enabled case Handle all DMA IOMMU group related dependencies before the group is removed and we try to access it after free. v2: Move the actul handling function to TTM Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-09-29 17:30:00 -04:00
Ernst Sjöstrand	5039f52988	drm/amd/amdgpu: Validate ip discovery blob We use the number_instance index that we get from the fw discovery blob to index into an array for example. Update error messages (Alex) Signed-off-by: Ernst Sjöstrand <ernstp@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-09-29 17:30:00 -04:00
Arnd Bergmann	335aea75b0	drm/amdgpu: fix warning for overflow check The overflow check in amdgpu_bo_list_create() causes a warning with clang-14 on 64-bit architectures, since the limit can never be exceeded. drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c:74:18: error: result of comparison of constant 256204778801521549 with expression of type 'unsigned int' is always false [-Werror,-Wtautological-constant-out-of-range-compare] if (num_entries > (SIZE_MAX - sizeof(struct amdgpu_bo_list)) ~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The check remains useful for 32-bit architectures, so just avoid the warning by using size_t as the type for the count. Fixes: `920990cb08` ("drm/amdgpu: allocate the bo_list array after the list") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-09-29 17:30:00 -04:00
Simon Ser	2f350ddadc	drm/amdgpu: check tiling flags when creating FB on GFX8- On GFX9+, format modifiers are always enabled and ensure the frame-buffers can be scanned out at ADDFB2 time. On GFX8-, format modifiers are not supported and no other check is performed. This means ADDFB2 IOCTLs will succeed even if the tiling isn't supported for scan-out, and will result in garbage displayed on screen [1]. Fix this by adding a check for tiling flags for GFX8 and older. The check is taken from radeonsi in Mesa (see how is_displayable is populated in gfx6_compute_surface). Changes in v2: use drm_WARN_ONCE instead of drm_WARN (Michel) [1]: https://github.com/swaywm/wlroots/issues/3185 Signed-off-by: Simon Ser <contact@emersion.fr> Acked-by: Michel Dänzer <mdaenzer@redhat.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Harry Wentland <hwentlan@amd.com> Cc: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-09-29 17:30:00 -04:00
Matthew Auld	43d46f0b78	drm/ttm: s/FLAG_SG/FLAG_EXTERNAL/ It covers more than just ttm_bo_type_sg usage, like with say dma-buf, since one other user is userptr in amdgpu, and in the future we might have some more. Hence EXTERNAL is likely a more suitable name. v2(Christian): - Rename these to TTM_TT_FLAGS_* - Fix up all the holes in the flag values Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Christian König <christian.koenig@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210929132629.353541-1-matthew.auld@intel.com Signed-off-by: Christian König <christian.koenig@amd.com>	2021-09-29 16:17:56 +02:00
Matthew Auld	21856e1e34	drm/ttm: move ttm_tt_{add, clear}_mapping into amdgpu Now that setting page->index shouldn't be needed anymore, we are just left with setting page->mapping, and here it looks like amdgpu is the only user, where pointing the page->mapping at the dev_mapping is used to verify that the pages do indeed belong to the device, if userspace later tries to touch them. v2(Christian): - Drop the functions altogether and just inline modifying the page->mapping Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210927114114.152310-3-matthew.auld@intel.com Signed-off-by: Christian König <christian.koenig@amd.com>	2021-09-29 13:55:09 +02:00
Prike Liang	26db706a6d	drm/amdgpu: force exit gfxoff on sdma resume for rmb s0ix In the s2idle stress test sdma resume fail occasionally,in the failed case GPU is in the gfxoff state.This issue may introduce by firmware miss handle doorbell S/R and now temporary fix the issue by forcing exit gfxoff for sdma resume. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2021-09-28 14:40:27 -04:00
Simon Ser	98122e63a7	drm/amdgpu: check tiling flags when creating FB on GFX8- On GFX9+, format modifiers are always enabled and ensure the frame-buffers can be scanned out at ADDFB2 time. On GFX8-, format modifiers are not supported and no other check is performed. This means ADDFB2 IOCTLs will succeed even if the tiling isn't supported for scan-out, and will result in garbage displayed on screen [1]. Fix this by adding a check for tiling flags for GFX8 and older. The check is taken from radeonsi in Mesa (see how is_displayable is populated in gfx6_compute_surface). Changes in v2: use drm_WARN_ONCE instead of drm_WARN (Michel) [1]: https://github.com/swaywm/wlroots/issues/3185 Signed-off-by: Simon Ser <contact@emersion.fr> Acked-by: Michel Dänzer <mdaenzer@redhat.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Harry Wentland <hwentlan@amd.com> Cc: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2021-09-28 14:40:19 -04:00
Hawking Zhang	9f52c25f59	drm/amdgpu: correct initial cp_hqd_quantum for gfx9 didn't read the value of mmCP_HQD_QUANTUM from correct register offset Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Le Ma <Le.Ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2021-09-28 14:39:29 -04:00
Leslie Shi	66805763a9	drm/amdgpu: fix gart.bo pin_count leak gmc_v{9,10}_0_gart_disable() isn't called matched with correspoding gart_enbale function in SRIOV case. This will lead to gart.bo pin_count leak on driver unload. Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-09-28 14:38:16 -04:00
Hawking Zhang	e794747622	drm/amdgpu: correct initial cp_hqd_quantum for gfx9 didn't read the value of mmCP_HQD_QUANTUM from correct register offset Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Le Ma <Le.Ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-09-28 09:30:08 -04:00
Tao Zhou	f524dd54a7	drm/amdgpu: skip umc ras irq handling in poison mode (v2) In ras poison mode, umc uncorrectable error will be ignored until the corrupted data consumed by another ras module (such as gfx, sdma). v2: update the debug message and replace dev_warn with dev_info. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-09-28 09:30:07 -04:00
Tao Zhou	e43488493c	drm/amdgpu: set poison supported flag for RAS (v2) Add RAS poison supported flag and tell PSP RAS TA about the info. v2: rename poison mode to poison supported, we can also disable poison mode even we support it. print value of poison supported if ras feature enablement fails. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-09-28 09:30:07 -04:00
Tao Zhou	aaca8c3861	drm/amdgpu: add poison mode query for UMC Add ras poison mode query interface for UMC. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-09-28 09:30:06 -04:00
Tao Zhou	ca5c636dc6	drm/amdgpu: add poison mode query for DF (v2) Add ras poison mode query interface for DF. v2: replace RREG32_PCIE with RREG32_SOC15. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-09-28 09:30:06 -04:00
Candice Li	77ec28eac2	drm/amdgpu: Update PSP TA Invoke to use common TA context as input Updated invoke to use new common TA structure similarily to load/unload. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-09-28 09:30:06 -04:00
Leslie Shi	71cf9e72b3	drm/amdgpu: fix gart.bo pin_count leak gmc_v{9,10}_0_gart_disable() isn't called matched with correspoding gart_enbale function in SRIOV case. This will lead to gart.bo pin_count leak on driver unload. Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-09-28 09:30:05 -04:00
Dave Airlie	1e3944578b	Merge tag 'amd-drm-next-5.16-2021-09-27' of https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-5.16-2021-09-27: amdgpu: - RAS improvements - BACO fixes - Yellow Carp updates - Misc code cleanups - Initial DP 2.0 support - VCN priority handling - Cyan Skillfish updates - Rework IB handling for multimedia engine tests - Backlight fixes - DCN 3.1 power saving improvements - Runtime PM fixes - Modifier support for DCC image stores for gfx 10.3 - Hotplug fixes - Clean up stack related warnings in display code - DP alt mode fixes - Display rework for better handling FP code - Debugfs fixes amdkfd: - SVM fixes - DMA map fixes radeon: - AGP fix From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210927212653.4575-1-alexander.deucher@amd.com Signed-off-by: Dave Airlie <airlied@redhat.com>	2021-09-28 17:08:26 +10:00
Alex Deucher	2485e2753e	drm/amdgpu: make soc15_common_ip_funcs static It's not used outside of soc15.c Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-09-23 16:35:27 -04:00
Candice Li	9080a18fc5	drm/amdgpu: Remove all code paths under the EAGAIN path in RAS late init All code paths under the EAGAIN path in RAS late init are unused. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-09-23 16:35:13 -04:00
John Clements	73490d2658	drm/amdgpu: Consolidate RAS cmd warning messages Explicity post warning if cmd is issued against unsupported IP Update to latest RAS TA interface Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-09-23 16:35:04 -04:00
John Clements	640ae42efb	drm/amdgpu: Updated RAS infrastructure Update RAS infrastructure to support RAS query for MCA subblocks Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-09-23 16:34:43 -04:00
Guchun Chen	6effad8abe	drm/amdgpu: move amdgpu_virt_release_full_gpu to fini_early stage adev->rmmio is set to be NULL in amdgpu_device_unmap_mmio to prevent access after pci_remove, however, in SRIOV case, amdgpu_virt_release_full_gpu will still use adev->rmmio for access after amdgpu_device_unmap_mmio. The patch is to move such SRIOV calling earlier to fini_early stage. Fixes: `07775fc138` ("drm/amdgpu: Unmap all MMIO mappings") Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-09-23 16:34:28 -04:00
Andrey Grodzovsky	ebe86a57c8	drm/amdgpu: Fix resume failures when device is gone Problem: When device goes into suspend and unplugged during it then all HW programming during resume fails leading to a bad SW during pci remove handling which follows. Because device is first resumed and only later removed we cannot rely on drm_dev_enter/exit here. Fix: Use a flag we use for PCIe error recovery to avoid accessing registres. This allows to successfully complete pm resume sequence and finish pci remove. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-09-23 15:17:29 -04:00
Andrey Grodzovsky	c03509cbc0	drm/amdgpu: Fix MMIO access page fault Add more guards to MMIO access post device unbind/unplug Bug: https://bugs.archlinux.org/task/72092?project=1&order=dateopened&sort=desc&pagenum=1 Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-09-23 15:17:29 -04:00
Andrey Grodzovsky	d82e2c249c	drm/amdgpu: Fix crash on device remove/driver unload Crash: BUG: unable to handle page fault for address: 00000000000010e1 RIP: 0010:vega10_power_gate_vce+0x26/0x50 [amdgpu] Call Trace: pp_set_powergating_by_smu+0x16a/0x2b0 [amdgpu] amdgpu_dpm_set_powergating_by_smu+0x92/0xf0 [amdgpu] amdgpu_dpm_enable_vce+0x2e/0xc0 [amdgpu] vce_v4_0_hw_fini+0x95/0xa0 [amdgpu] amdgpu_device_fini_hw+0x232/0x30d [amdgpu] amdgpu_driver_unload_kms+0x5c/0x80 [amdgpu] amdgpu_pci_remove+0x27/0x40 [amdgpu] pci_device_remove+0x3e/0xb0 device_release_driver_internal+0x103/0x1d0 device_release_driver+0x12/0x20 pci_stop_bus_device+0x79/0xa0 pci_stop_and_remove_bus_device_locked+0x1b/0x30 remove_store+0x7b/0x90 dev_attr_store+0x17/0x30 sysfs_kf_write+0x4b/0x60 kernfs_fop_write_iter+0x151/0x1e0 Why: VCE/UVD had dependency on SMC block for their suspend but SMC block is the first to do HW fini due to some constraints How: Since the original patch was dealing with suspend issues move the SMC block dependency back into suspend hooks as was done in V1 of the original patches. Keep flushing idle work both in suspend and HW fini seuqnces since it's essential in both cases. Fixes: `859e465927` ("drm/amdgpu: add missing cleanups for more ASICs on UVD/VCE suspend") Fixes: `bf756fb833` ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend") Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-09-23 15:17:29 -04:00
xinhui pan	0a2267809f	drm/amdgpu: Fix uvd ib test timeout when use pre-allocated BO Now we use same BO for create/destroy msg. So destroy will wait for the fence returned from create to be signaled. The default timeout value in destroy is 10ms which is too short. Lets wait both fences with the specific timeout. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-09-23 15:17:29 -04:00

... 6 7 8 9 10 ...

10198 commits