linux

mirror of synced 2025-03-06 20:59:54 +01:00

Author	SHA1	Message	Date
Alex Deucher	7b1c6263ea	drm/amdgpu: don't use pci_is_thunderbolt_attached() It's only valid on Intel systems with the Intel VSEC. Use dev_is_removable() instead. This should do the right thing regardless of the platform. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925 Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2023-11-03 11:59:44 -04:00
Kenneth Feng	5f38ac54e6	drm/amd/pm: fix the high voltage and temperature issue fix the high voltage and temperature issue after the driver is unloaded on smu 13.0.0, smu 13.0.7 and smu 13.0.10 v2 - fix the code format and make sure it is used on the unload case only. Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-31 16:40:16 -04:00
Yifan Zhang	a17f574ab4	drm/amdgpu: remove amdgpu_mes_self_test in gpu recover gpu tlb flush is skipped if reset sem is held, it makes mes_self_test fail since it involves add_hw_queue/remove_hw_queue which needs tlb flush functional. Remove mes_self_test in gpu recover sequence. This patch is to fix the recover failure in gfx11. [ 1831.768292] [drm] ring sdma_32769.3.3 was added [ 1831.768313] [drm] ring gfx_32769.1.1 ib test pass [ 1831.768337] [drm] ring compute_32769.2.2 ib test pass [ 1831.768399] amdgpu 0000:c2:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:8 pasid:32769, for process pid 0 thread pid 0) [ 1831.768434] amdgpu 0000:c2:00.0: amdgpu: in page starting at address 0x0000aec200000000 from client 10 [ 1831.768456] amdgpu 0000:c2:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00800A30 [ 1831.768473] amdgpu 0000:c2:00.0: amdgpu: Faulty UTCL2 client ID: CPC (0x5) [ 1831.768489] amdgpu 0000:c2:00.0: amdgpu: MORE_FAULTS: 0x0 [ 1831.768501] amdgpu 0000:c2:00.0: amdgpu: WALKER_ERROR: 0x0 [ 1831.768513] amdgpu 0000:c2:00.0: amdgpu: PERMISSION_FAULTS: 0x3 [ 1831.768521] amdgpu 0000:c2:00.0: amdgpu: MAPPING_ERROR: 0x0 [ 1831.768529] amdgpu 0000:c2:00.0: amdgpu: RW: 0x0 [ 1831.931229] amdgpu 0000:c2:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] ERROR ring sdma_32769.3.3 test failed (-110) [ 1832.062917] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] ERROR MES failed to response msg=3 [ 1832.063107] [drm:amdgpu_mes_remove_hw_queue [amdgpu]] ERROR failed to remove hardware queue, queue id = 3 Fixes: `e2e3788850` ("drm/amdgpu: rework lock handling for flush_tlb v2") Reported-by: Li Ma <li.ma@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-31 16:40:15 -04:00
Dave Airlie	631808095a	amd-drm-next-6.7-2023-10-27: amdgpu: - RAS fixes - Seamless boot fixes - NBIO 7.7 fix - SMU 14.0 fixes - GC 11.5 fixes - DML2 fixes - ASPM fixes - VPE fixes - Misc code cleanups - SRIOV fixes - Add some missing copyright notices - DCN 3.5 fixes - FAMS fixes - Backlight fix - S/G display fix - fdinfo cleanups - EXT_COHERENT fixes for APU and NUMA systems amdkfd: - Misc fixes - Misc code cleanups - SVM fixes -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCZTwWOAAKCRC93/aFa7yZ 2Lz3AP0cInVS4qZYTZCh2O/k5AoidJjcmRl2DVm8OdowBPCa4wEAoNsekTIQnZsI Ru4SoVKhT2bs1LEMOcdzexsVrwlaxA8= =G3QX -----END PGP SIGNATURE----- Merge tag 'amd-drm-next-6.7-2023-10-27' of https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.7-2023-10-27: amdgpu: - RAS fixes - Seamless boot fixes - NBIO 7.7 fix - SMU 14.0 fixes - GC 11.5 fixes - DML2 fixes - ASPM fixes - VPE fixes - Misc code cleanups - SRIOV fixes - Add some missing copyright notices - DCN 3.5 fixes - FAMS fixes - Backlight fix - S/G display fix - fdinfo cleanups - EXT_COHERENT fixes for APU and NUMA systems amdkfd: - Misc fixes - Misc code cleanups - SVM fixes Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231027200343.57132-1-alexander.deucher@amd.com	2023-10-31 12:37:19 +10:00
Dave Airlie	915b6d034b	Merge tag 'drm-misc-next-2023-10-27' of git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v6.7-rc1: drm-misc-next-2023-10-19 + following: UAPI Changes: Cross-subsystem Changes: - Convert fbdev drivers to use fbdev i/o mem helpers. Core Changes: - Use cross-references for macros in docs. - Make drm_client_buffer_addb use addfb2. - Add NV20 and NV30 YUV formats. - Documentation updates for create_dumb ioctl. - CI fixes. - Allow variable number of run-queues in scheduler. Driver Changes: - Rename drm/ast constants. - Make ili9882t its own driver. - Assorted fixes in ivpu, vc4, bridge/synopsis, amdgpu. - Add planar formats to rockchip. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/3d92fae8-9b1b-4165-9ca8-5fda11ee146b@linux.intel.com	2023-10-31 10:47:50 +10:00
Mario Limonciello	2757a848cb	drm/amd: Explicitly disable ASPM when dynamic switching disabled Currently there are separate but related checks: * amdgpu_device_should_use_aspm() * amdgpu_device_aspm_support_quirk() * amdgpu_device_pcie_dynamic_switching_supported() Simplify into checking whether DPM was enabled or not in the auto case. This works because amdgpu_device_pcie_dynamic_switching_supported() populates that value. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-26 18:41:23 -04:00
Mario Limonciello	1a6513de49	drm/amd: Move AMD_IS_APU check for ASPM into top level function There is no need for every ASIC driver to perform the same check. Move the duplicated code into amdgpu_device_should_use_aspm(). Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-26 18:41:23 -04:00
Mario Limonciello	fbf1035b03	drm/amd: Disable PP_PCIE_DPM_MASK when dynamic speed switching not supported Rather than individual ASICs checking for the quirk, set the quirk at the driver level. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-26 18:41:23 -04:00
Alex Deucher	b70438004a	drm/amdgpu: move buffer funcs setting up a level Rather than doing this in the IP code for the SDMA paging engine, move it up to the core device level init level. This should fix the scheduler init ordering. v2: drop extra parens v3: drop SDMA helpers v4: Added a Fixes tag because amdgpu dereferences an uninitialized scheduler without this patch, and this patch fixes this. (Luben) Tested-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20231025171928.3318505-1-alexander.deucher@amd.com Acked-by: Christian König <christian.koenig@amd.com> Fixes: `56e449603f` ("drm/sched: Convert the GPU scheduler to variable number of run-queues") Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>	2023-10-26 16:04:24 -04:00
Luben Tuikov	56e449603f	drm/sched: Convert the GPU scheduler to variable number of run-queues The GPU scheduler has now a variable number of run-queues, which are set up at drm_sched_init() time. This way, each driver announces how many run-queues it requires (supports) per each GPU scheduler it creates. Note, that run-queues correspond to scheduler "priorities", thus if the number of run-queues is set to 1 at drm_sched_init(), then that scheduler supports a single run-queue, i.e. single "priority". If a driver further sets a single entity per run-queue, then this creates a 1-to-1 correspondence between a scheduler and a scheduled entity. Cc: Lucas Stach <l.stach@pengutronix.de> Cc: Russell King <linux+etnaviv@armlinux.org.uk> Cc: Qiang Yu <yuq825@gmail.com> Cc: Rob Clark <robdclark@gmail.com> Cc: Abhinav Kumar <quic_abhinavk@quicinc.com> Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Cc: Danilo Krummrich <dakr@redhat.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Emma Anholt <emma@anholt.net> Cc: etnaviv@lists.freedesktop.org Cc: lima@lists.freedesktop.org Cc: linux-arm-msm@vger.kernel.org Cc: freedreno@lists.freedesktop.org Cc: nouveau@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20231023032251.164775-1-luben.tuikov@amd.com	2023-10-26 12:03:47 -04:00
André Almeida	69619868d3	drm/amdgpu: Move coredump code to amdgpu_reset file Giving that we use codedump just for device resets, move it's functions and structs to a more semantic file, the amdgpu_reset.{c, h}. Signed-off-by: André Almeida <andrealmeid@igalia.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:29 -04:00
André Almeida	2d6a2a28cd	drm/amdgpu: Encapsulate all device reset info To better organize struct amdgpu_device, keep all reset information related fields together in a separated struct. Signed-off-by: André Almeida <andrealmeid@igalia.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:28 -04:00
Tao Zhou	21226f02d7	drm/amdgpu: replace reset_error_count with amdgpu_ras_reset_error_count Simplify the code. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:28 -04:00
Mario Limonciello	cb11ca3233	drm/amd: Add concept of running prepare_suspend() sequence for IP blocks If any IP blocks allocate memory during their hw_fini() sequence this can cause the suspend to fail under memory pressure. Introduce a new phase that IP blocks can use to allocate memory before suspend starts so that it can potentially be evicted into swap instead. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:00:58 -04:00
Mario Limonciello	5095d54181	drm/amd: Evict resources during PM ops prepare() callback Linux PM core has a prepare() callback run before suspend. If the system is under high memory pressure, the resources may need to be evicted into swap instead. If the storage backing for swap is offlined during the suspend() step then such a call may fail. So move this step into prepare() to move evict majority of resources and update all non-pmops callers to call the same callback. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2362 Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:00:18 -04:00
Stanley.Yang	80285ae1ec	drm/amdgpu: Fix potential null pointer derefernce The amdgpu_ras_get_context may return NULL if device not support ras feature, so add check before using. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 16:52:46 -04:00
Lijo Lazar	8a2b51392a	drm/amdgpu: Refactor FRU product information Keep FRU related information together in a separate structure. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 16:52:08 -04:00
Lijo Lazar	4798db85b7	Documentation/amdgpu: Add board info details Add documentation for board info sysfs attribute. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-05 17:59:35 -04:00
Lijo Lazar	76da73f026	drm/amdgpu: Add sysfs attribute to get board info Add a sysfs attribute which shows the board form factor like OAM or CEM. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-05 17:59:35 -04:00
Mario Limonciello	c4c8955b8a	drm/amd: Fix detection of _PR3 on the PCIe root port On some systems with Navi3x dGPU will attempt to use BACO for runtime PM but fails to resume properly. This is because on these systems the root port goes into D3cold which is incompatible with BACO. This happens because in this case dGPU is connected to a bridge between root port which causes BOCO detection logic to fail. Fix the intent of the logic by looking at root port, not the immediate upstream bridge for _PR3. Cc: stable@vger.kernel.org Suggested-by: Jun Ma <Jun.Ma2@amd.com> Tested-by: David Perry <David.Perry@amd.com> Fixes: `b10c1c5b3a` ("drm/amdgpu: add check for ACPI power resources") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-03 15:43:11 -04:00
Tao Zhou	1934907234	drm/amdgpu: exit directly if gpu reset fails No need to perform the full reset operation in case of gpu reset failure. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 15:44:36 -04:00
Mario Limonciello	3657a1d5ac	drm/amd: Limit seamless boot by default to APUs A hang is reported on DCN 3.2 with seamless boot enabled. As the benefits come from an eDP setup, limit it to only enabled by default with APUs. Suggested-by: Alexander.Deucher@amd.com Reported-by: feifei.xu@amd.com Closes: https://lore.kernel.org/amd-gfx/85b427f6-11ec-4249-bf6f-eadf9c375f88@amd.com/T/#m2887e919d7c01b2a4860d2261b366d22e070f309 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 15:35:31 -04:00
Lijo Lazar	c45e38f217	drm/amdgpu: Restore partition mode after reset On a full device reset, PSP FW gets unloaded. Hence restore the partition mode by placing a new request. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:54:51 -04:00
André Almeida	a769178585	drm/amdgpu: Rework coredump to use memory dynamically Instead of storing coredump information inside amdgpu_device struct, move if to a proper separated struct and allocate it dynamically. This will make it easier to further expand the logged information. Signed-off-by: André Almeida <andrealmeid@igalia.com> Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 16:24:07 -04:00
Mario Limonciello	7f4ce7b50a	drm/amd: Enable seamless boot by default on newer ASICs Seamless boot can technically be supported as far back as DCN1 but to avoid regressions on older hardware, enable it for DCN3 and later. If users report using the module parameter that it works on older ASICs as well, this can be adjusted. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 12:24:46 -04:00
Mario Limonciello	5dc270d366	drm/amd: Add a module parameter for seamless boot The module parameter can be used to test more easily enabling seamless boot support on additional ASICs. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 12:24:39 -04:00
Mario Limonciello	bb0f84293e	drm/amd: Move seamless boot check out of display This will allow base driver to dictate whether seamless should be enabled. No intended functional changes. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 12:24:21 -04:00
Lijo Lazar	4e8303cf2c	drm/amdgpu: Use function for IP version check Use an inline function for version check. Gives more flexibility to handle any format changes. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 12:23:28 -04:00
Hamza Mahfooz	169ed4ece8	Revert "drm/amd: Disable S/G for APUs when 64GB or more host memory" This reverts commit `70e64c4d52`. Since, we now have an actual fix for this issue, we can get rid of this workaround as it can cause pin failures if enough VRAM isn't carved out by the BIOS. Cc: stable@vger.kernel.org # 6.1+ Acked-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 18:18:17 -04:00
Hamza Mahfooz	601c63ad8e	Revert "drm/amd: Disable S/G for APUs when 64GB or more host memory" This reverts commit `70e64c4d52`. Since, we now have an actual fix for this issue, we can get rid of this workaround as it can cause pin failures if enough VRAM isn't carved out by the BIOS. Cc: stable@vger.kernel.org # 6.1+ Acked-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:12:20 -04:00
Candice Li	d57e24aa56	drm/amdgpu: Update amdgpu_device_indirect_r/wreg_ext Only calculate pcie_index_hi for register address greater than 32bits. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-06 14:34:26 -04:00
Candice Li	a76b2870bd	drm/amdgpu: Add RREG64_PCIE_EXT/WREG64_PCIE_EXT functions Add 64bits register access support on register whose address is greater than 32bits. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-06 14:34:18 -04:00
André Almeida	6d1b345548	drm/amdgpu: Allocate coredump memory in a nonblocking way During a GPU reset, a normal memory reclaim could block to reclaim memory. Giving that coredump is a best effort mechanism, it shouldn't disturb the reset path. Change its memory allocation flag to a nonblocking one. Signed-off-by: André Almeida <andrealmeid@igalia.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 18:10:11 -04:00
Hawking Zhang	7d4424373d	drm/amdgpu: Fix the return for gpu mode1_reset amdgpu_device_mode1_reset will return gpu mode1_reset succeed (ret = 0) as long as wait_for_bootloader call succeed, regardless of the status reported by smu or psp firmware. This results to driver continue executing recovery even smu or psp fail to perform mode1 reset. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 18:02:10 -04:00
Lijo Lazar	7656168a8a	drm/amdgpu: Add bootloader status check Add a function to wait till bootloader has reached steady state. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Tested-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 17:58:58 -04:00
Evan Quan	90bcb9b595	drm/amdgpu: revise the device initialization sequences By placing the sysfs interfaces creation after `.late_int`. Since some operations performed during `.late_init` may affect how the sysfs interfaces should be created. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:35:33 -04:00
Evan Quan	3e38b634f9	drm/amd/pm: introduce a new set of OD interfaces There will be multiple interfaces(sysfs files) exposed with each representing a single OD functionality. And all those interface will be arranged in a tree liked hierarchy with the top dir as "gpu_od". Meanwhile all functionalities for the same component will be arranged under the same directory. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-31 16:35:26 -04:00
André Almeida	d68ccdb263	drm/amdgpu: Allocate coredump memory in a nonblocking way During a GPU reset, a normal memory reclaim could block to reclaim memory. Giving that coredump is a best effort mechanism, it shouldn't disturb the reset path. Change its memory allocation flag to a nonblocking one. Signed-off-by: André Almeida <andrealmeid@igalia.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:51:16 -04:00
Lee Jones	8057a9d656	drm/amd/amdgpu/amdgpu_device: Provide suitable description for param 'xcc_id' Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:516: warning: Function parameter or member 'xcc_id' not described in 'amdgpu_mm_wreg_mmio_rlc' Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:26:58 -04:00
Hawking Zhang	2c0f880abc	drm/amdgpu: Fix the return for gpu mode1_reset amdgpu_device_mode1_reset will return gpu mode1_reset succeed (ret = 0) as long as wait_for_bootloader call succeed, regardless of the status reported by smu or psp firmware. This results to driver continue executing recovery even smu or psp fail to perform mode1 reset. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 15:01:44 -04:00
Lijo Lazar	15c5c5f575	drm/amdgpu: Add bootloader status check Add a function to wait till bootloader has reached steady state. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Tested-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-30 14:59:24 -04:00
Mario Limonciello	0dee726395	drm/amd: flush any delayed gfxoff on suspend entry DCN 3.1.4 is reported to hang on s2idle entry if graphics activity is happening during entry. This is because GFXOFF was scheduled as delayed but RLC gets disabled in s2idle entry sequence which will hang GFX IP if not already in GFXOFF. To help this problem, flush any delayed work for GFXOFF early in s2idle entry sequence to ensure that it's off when RLC is changed. commit `4b31b92b14` ("drm/amdgpu: complete gfxoff allow signal during suspend without delay") modified power gating flow so that if called in s0ix that it ensured that GFXOFF wasn't put in work queue but instead processed immediately. This is dead code due to commit `10cb67eb8a` ("drm/amdgpu: skip CG/PG for gfx during S0ix") because GFXOFF will now not be explicitly called as part of the suspend entry code. Remove that dead code. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Tim Huang <tim.huang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-16 11:35:14 -04:00
Jiadong Zhu	1e9e15dcf4	drm/amdgpu: disable mcbp if parameter zero is set The parameter amdgpu_mcbp shall have priority against the default value calculated from the chip version. User could disable mcbp by setting the parameter mcbp as zero. v2: do not trigger preemption in sw ring muxer when mcbp is disabled. Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 17:44:02 -04:00
Srinivasan Shanmugam	4c452b5c7d	drm/amdgpu: Fix missing comment for mb() in 'amdgpu_device_aper_access' This patch adds the missing code comment for memory barrier WARNING: memory barrier without comment + mb(); WARNING: memory barrier without comment + mb(); Cc: Guchun Chen <guchun.chen@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-15 17:40:44 -04:00
Alex Deucher	c99a2e7ae2	drm/amdkfd: drop IOMMUv2 support Now that we use the dGPU path for all APUs, drop the IOMMUv2 support. v2: drop the now unused queue manager functions for gfx7/8 APUs Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Tested-by: Mike Lothian <mike@fireburn.co.uk> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-11 14:47:25 -04:00
Emily Deng	f734b2133c	drm/amdgpu/irq: Move irq resume to the beginning Need to move irq resume to the beginning of reset sriov, or if one interrupt occurs before irq resume, then the irq won't work anymore. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:46:04 -04:00
Lijo Lazar	7957ec80ef	drm/amdgpu: Add FRU sysfs nodes only if needed Create sysfs nodes for FRU data only if FRU data is available. Move the logic to FRU specific file. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-09 09:45:55 -04:00
Mario Limonciello	70e64c4d52	drm/amd: Disable S/G for APUs when 64GB or more host memory Users report a white flickering screen on multiple systems that is tied to having 64GB or more memory. When S/G is enabled pages will get pinned to both VRAM carve out and system RAM leading to this. Until it can be fixed properly, disable S/G when 64GB of memory or more is detected. This will force pages to be pinned into VRAM. This should fix white screen flickers but if VRAM pressure is encountered may lead to black screens. It's a trade-off for now. Fixes: `81d0bcf990` ("drm/amdgpu: make display pinning more flexible (v2)") Cc: Hamza Mahfooz <Hamza.Mahfooz@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: <stable@vger.kernel.org> # 6.1.y: `bf0207e172` ("drm/amdgpu: add S/G display parameter") Cc: <stable@vger.kernel.org> # 6.4.y Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2735 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2354 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-08-07 16:32:54 -04:00
Srinivasan Shanmugam	b8920e1e0d	drm/amdgpu: Fix ENOSYS means 'invalid syscall nr' in amdgpu_device.c ENOSYS should be used for nonexistent syscalls only, replace ENOSYS with EOPNOTSUPP for reset handlers that are not implemented for respective ASIC. WARNING: ENOSYS means 'invalid syscall nr' and nothing else + if (r == -ENOSYS) WARNING: ENOSYS means 'invalid syscall nr' and nothing else + if (r == -ENOSYS) And other following style fixes in amdgpu_device.c: WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'. WARNING: Block comments should align the * on each line WARNING: Missing a blank line after declarations WARNING: braces {} are not necessary for single statement blocks Cc: Lijo Lazar <lijo.lazar@amd.com> Cc: Kent Russell <kent.russell@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-27 14:59:29 -04:00
Horace Chen	83f24a8f05	drm/amdgpu: set sw state to gfxoff after SR-IOV reset [Why] Current SR-IOV will not set GC to off state, while it is a real GC hard reset. Whthout GFX off flag, driver may do gfxhub invalidation before firmware load and gfxhub gart enable. This operation may cause CP to become busy because GC is not in the right state for invalidation. [How] Add a function for SR-IOV to clean up some sw state before recover. Set adev->gfx.is_poweron to false to prevent gfxhub invalidation before gfx firmware autoload complete. Signed-off-by: Horace Chen <horace.chen@amd.com> Reviewed-by: HaiJun Chang <HaiJun.Chang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:35:23 -04:00

1 2 3 4 5 ...

1244 commits