linux

mirror of synced 2025-03-06 20:59:54 +01:00

Author	SHA1	Message	Date
Dennis Li	cbfd17f7ba	drm/amdgpu: fix the nullptr issue when reenter GPU recovery in single gpu system, if driver reenter gpu recovery, amdgpu_device_lock_adev will return false, but hive is nullptr now. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-24 12:23:54 -04:00
Dennis Li	6049db43d6	drm/amdgpu: change reset lock from mutex to rw_semaphore clients don't need reset-lock for synchronization when no GPU recovery. v2: change to return the return value of down_read_killable. v3: if GPU recovery begin, VF ignore FLR notification. Reviewed-by: Monk Liu <monk.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-24 12:23:48 -04:00
Dennis Li	53b3f8f40e	drm/amdgpu: refine codes to avoid reentering GPU recovery if other threads have holden the reset lock, recovery will fail to try_lock. Therefore we introduce atomic hive->in_reset and adev->in_gpu_reset, to avoid reentering GPU recovery. v2: drop "? true : false" in the definition of amdgpu_in_reset Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-24 12:22:56 -04:00
jqdeng	9a1cddd637	drm/amdgpu: Fix repeatly flr issue Only for no job running test case need to do recover in flr notification. For having job in mirror list, then let guest driver to hit job timeout, and then do recover. Signed-off-by: jqdeng <Emily.Deng@amd.com> Acked-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-18 18:22:02 -04:00
Christian König	f1403342eb	drm/amdgpu: revert "fix system hang issue during GPU reset" The whole approach wasn't thought through till the end. We already had a reset lock like this in the past and it caused the same problems like this one. Completely revert the patch for now and add individual trylock protection to the hardware access functions as necessary. This reverts commit `df9c8d1aa2`. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-14 16:22:40 -04:00
Evan Quan	f10bb940d8	drm/amd/powerplay: optimize the interface for mgpu fan boost enablement Cover the implementation details from outside(of power). Also preparing for expanding this to swSMU. Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-14 16:22:40 -04:00
Dennis Li	72e14ebf9f	drm/amdgpu: annotate a false positive recursive locking [ 584.110304] ============================================ [ 584.110590] WARNING: possible recursive locking detected [ 584.110876] 5.6.0-deli-v5.6-2848-g3f3109b0e75f #1 Tainted: G OE [ 584.111164] -------------------------------------------- [ 584.111456] kworker/38:1/553 is trying to acquire lock: [ 584.111721] ffff9b15ff0a47a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.112112] but task is already holding lock: [ 584.112673] ffff9b1603d247a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.113068] other info that might help us debug this: [ 584.113689] Possible unsafe locking scenario: [ 584.114350] CPU0 [ 584.114685] ---- [ 584.115014] lock(&adev->reset_sem); [ 584.115349] lock(&adev->reset_sem); [ 584.115678] * DEADLOCK * [ 584.116624] May be due to missing lock nesting notation [ 584.117284] 4 locks held by kworker/38:1/553: [ 584.117616] #0: ffff9ad635c1d348 ((wq_completion)events){+.+.}, at: process_one_work+0x21f/0x630 [ 584.117967] #1: ffffac708e1c3e58 ((work_completion)(&con->recovery_work)){+.+.}, at: process_one_work+0x21f/0x630 [ 584.118358] #2: ffffffffc1c2a5d0 (&tmp->hive_lock){+.+.}, at: amdgpu_device_gpu_recover+0xae/0x1030 [amdgpu] [ 584.118786] #3: ffff9b1603d247a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.119222] stack backtrace: [ 584.119990] CPU: 38 PID: 553 Comm: kworker/38:1 Kdump: loaded Tainted: G OE 5.6.0-deli-v5.6-2848-g3f3109b0e75f #1 [ 584.120782] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019 [ 584.121223] Workqueue: events amdgpu_ras_do_recovery [amdgpu] [ 584.121638] Call Trace: [ 584.122050] dump_stack+0x98/0xd5 [ 584.122499] __lock_acquire+0x1139/0x16e0 [ 584.122931] ? trace_hardirqs_on+0x3b/0xf0 [ 584.123358] ? cancel_delayed_work+0xa6/0xc0 [ 584.123771] lock_acquire+0xb8/0x1c0 [ 584.124197] ? amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.124599] down_write+0x49/0x120 [ 584.125032] ? amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.125472] amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.125910] ? amdgpu_ras_error_query+0x1b8/0x2a0 [amdgpu] [ 584.126367] amdgpu_ras_do_recovery+0x159/0x190 [amdgpu] [ 584.126789] process_one_work+0x29e/0x630 [ 584.127208] worker_thread+0x3c/0x3f0 [ 584.127621] ? __kthread_parkme+0x61/0x90 [ 584.128014] kthread+0x12f/0x150 [ 584.128402] ? process_one_work+0x630/0x630 [ 584.128790] ? kthread_park+0x90/0x90 [ 584.129174] ret_from_fork+0x3a/0x50 Each adev has owned lock_class_key to avoid false positive recursive locking. v2: 1. register adev->lock_key into lockdep, otherwise lockdep will report the below warning [ 1216.705820] BUG: key ffff890183b647d0 has not been registered! [ 1216.705924] ------------[ cut here ]------------ [ 1216.705972] DEBUG_LOCKS_WARN_ON(1) [ 1216.705997] WARNING: CPU: 20 PID: 541 at kernel/locking/lockdep.c:3743 lockdep_init_map+0x150/0x210 v3: change to use down_write_nest_lock to annotate the false dead-lock warning. Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-14 16:22:40 -04:00
Thomas Zimmermann	534b1f9071	Merge drm/drm-next into drm-misc-next Backmerging drm-next into drm-misc-next for nouveau and panel updates. Resolves a conflict between ttm and nouveau, where struct ttm_mem_res got renamed to struct ttm_resource. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>	2020-08-12 20:42:08 +02:00
Liu ChengZhe	6b6124bb4a	drm/amdgpu: fix PSP autoload twice in FLR Assigning false to block->status.hw overwrites PSP's previous hardware status, which causes the PSP to Resume operation after hardware init. Remove this assignment and let the PSP execute Resume operation when it is told to. v2: Remove the braces. v3: Modify the description. Signed-off-by: Liu ChengZhe <ChengZhe.Liu@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-06 16:44:41 -04:00
Colin Ian King	c16ce56240	drm/amdgpu: fix spelling mistake "paramter" -> "parameter" There is a spelling mistake in a dev_warn message. Fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-06 15:43:26 -04:00
Dave Airlie	6c28aed6e5	drm/amdgfx/ttm: use wrapper to get ttm memory managers Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200804025632.3868079-38-airlied@gmail.com	2020-08-06 12:32:03 +10:00
Alex Deucher	72de33f8f7	drm/amdgpu: move IP discovery data to mman It's related to the memory manager so move it there. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:29 -04:00
Monk Liu	a300de40f6	drm/amdgpu: introduce a new parameter to configure how many KCQ we want(v5) what: the MQD's save and restore of KCQ (kernel compute queue) cost lots of clocks during world switch which impacts a lot to multi-VF performance how: introduce a paramter to control the number of KCQ to avoid performance drop if there is no kernel compute queue needed notes: this paramter only affects gfx 8/9/10 v2: refine namings v3: choose queues for each ring to that try best to cross pipes evenly. v4: fix indentation some cleanupsin the gfx_compute_queue_acquire() v5: further fix on indentations more cleanupsin gfx_compute_queue_acquire() TODO: in the future we will let hypervisor driver to set this paramter automatically thus no need for user to configure it through modprobe in virtual machine Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:27:29 -04:00
Guchun Chen	e8fbaf0342	drm/amdgpu: break GPU recovery once it's in bad state(v4) When GPU executes recovery and retriving bad GPU tag from external eerpom device, the recovery will be broken and error message is printed as well for user's awareness. v2: Refine warning message in threshold reaching case, and fix spelling typo. v3: Fix explicit calling of bad gpu. v4: Rename function names. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:26:54 -04:00
Guchun Chen	b82e65a935	drm/amdgpu: break driver init process when it's bad GPU(v5) When retrieving bad gpu tag from eeprom, GPU init should fail as the GPU needs to be retired for further check. v2: Fix spelling typo, correct the condition to detect bad gpu tag and refine error message. v3: Refine function argument name. v4: Fix missing check of returning value of i2c initialization error case. v5: Use dev_err to print PCI information in dmesg instead of DRM_ERROR. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:26:31 -04:00
Liu ChengZhe	392cf6a739	drm/amdgpu: fix PSP autoload twice in FLR Assigning false to block->status.hw overwrites PSP's previous hardware status, which causes the PSP to Resume operation after hardware init. Remove this assignment and let the PSP execute Resume operation when it is told to. v2: Remove the braces. v3: Modify the description. Signed-off-by: Liu ChengZhe <ChengZhe.Liu@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-30 15:36:44 -04:00
Mauro Rossi	64200c468f	drm/amdgpu: enable DC support for SI parts (v2) [Why] amdgpu_device.c requires changes for SI chipsets support si.c require changes for Display Manager IP block enabling [How] amdgpu_device.c: add SI families in amdgpu_device_asic_has_dc_support() si.c: changes in si_set_ip_blocks() for Display Manager IP blocks enablement (v1) NOTE: As per Kaveri and older amdgpu.dc=1 kernel cmdline is required (v2) fix for bc011f9350 ("drm/amdgpu: Change SI/CI gfx/sdma/smu init sequence") remove CHIP_HAINAN support since it does not have physical DCE6 module Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-28 09:22:48 -04:00
Dennis Li	df9c8d1aa2	drm/amdgpu: fix system hang issue during GPU reset when GPU hang, driver has multi-paths to enter amdgpu_device_gpu_recover, the atomic adev->in_gpu_reset and hive->in_reset are used to avoid re-entering GPU recovery. During GPU reset and resume, it is unsafe that other threads access GPU, which maybe cause GPU reset failed. Therefore the new rw_semaphore adev->reset_sem is introduced, which protect GPU from being accessed by external threads during recovery. v2: 1. add rwlock for some ioctls, debugfs and file-close function. 2. change to use dqm->is_resetting and dqm_lock for protection in kfd driver. 3. remove try_lock and change adev->in_gpu_reset as atomic, to avoid re-enter GPU recovery for the same GPU hang. v3: 1. change back to use adev->reset_sem to protect kfd callback functions, because dqm_lock couldn't protect all codes, for example: free_mqd must be called outside of dqm_lock; [ 1230.176199] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019 [ 1230.177221] Call Trace: [ 1230.178249] dump_stack+0x98/0xd5 [ 1230.179443] amdgpu_virt_kiq_reg_write_reg_wait+0x181/0x190 [amdgpu] [ 1230.180673] gmc_v9_0_flush_gpu_tlb+0xcc/0x310 [amdgpu] [ 1230.181882] amdgpu_gart_unbind+0xa9/0xe0 [amdgpu] [ 1230.183098] amdgpu_ttm_backend_unbind+0x46/0x180 [amdgpu] [ 1230.184239] ? ttm_bo_put+0x171/0x5f0 [ttm] [ 1230.185394] ttm_tt_unbind+0x21/0x40 [ttm] [ 1230.186558] ttm_tt_destroy.part.12+0x12/0x60 [ttm] [ 1230.187707] ttm_tt_destroy+0x13/0x20 [ttm] [ 1230.188832] ttm_bo_cleanup_memtype_use+0x36/0x80 [ttm] [ 1230.189979] ttm_bo_put+0x1be/0x5f0 [ttm] [ 1230.191230] amdgpu_bo_unref+0x1e/0x30 [amdgpu] [ 1230.192522] amdgpu_amdkfd_free_gtt_mem+0xaf/0x140 [amdgpu] [ 1230.193833] free_mqd+0x25/0x40 [amdgpu] [ 1230.195143] destroy_queue_cpsch+0x1a7/0x270 [amdgpu] [ 1230.196475] pqm_destroy_queue+0x105/0x260 [amdgpu] [ 1230.197819] kfd_ioctl_destroy_queue+0x37/0x70 [amdgpu] [ 1230.199154] kfd_ioctl+0x277/0x500 [amdgpu] [ 1230.200458] ? kfd_ioctl_get_clock_counters+0x60/0x60 [amdgpu] [ 1230.201656] ? tomoyo_file_ioctl+0x19/0x20 [ 1230.202831] ksys_ioctl+0x98/0xb0 [ 1230.204004] __x64_sys_ioctl+0x1a/0x20 [ 1230.205174] do_syscall_64+0x5f/0x250 [ 1230.206339] entry_SYSCALL_64_after_hwframe+0x49/0xbe 2. remove try_lock and introduce atomic hive->in_reset, to avoid re-enter GPU recovery. v4: 1. remove an unnecessary whitespace change in kfd_chardev.c 2. remove comment codes in amdgpu_device.c 3. add more detailed comment in commit message 4. define a wrap function amdgpu_in_reset v5: 1. Fix some style issues. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com> Suggested-by: Lijo Lazar <Lijo.Lazar@amd.com> Suggested-by: Luben Tukov <luben.tuikov@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-27 16:21:37 -04:00
Bhawanpreet Lakha	a6c5308f2a	drm/amd/display: add DC support for navy flounder Plumb DC support for navy flounder through. Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-15 13:27:26 -04:00
Jiansong Chen	41f446bf52	drm/amdgpu: set asic family and ip blocks for navy_flounder Add the asic family and IP blocks for navy flounder. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-15 12:45:48 -04:00
Jiansong Chen	120eb83336	drm/amdgpu: add navy_flounder gpu info firmware Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-15 12:45:43 -04:00
Jiansong Chen	ddd8fbe77d	drm/amdgpu: add navy_flounder asic type Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-15 12:45:39 -04:00
Wenhui Sheng	bb5c7235ea	drm/amdgpu: RAS emergency restart logic refine If we are in RAS triggered situation and BACO isn't support, emergency restart is needed, and this code is only needed for some specific cases(vega20 with given smu fw version). After we add smu mode1 reset for sienna cichlid, we need to share AMD_RESET_METHOD_MODE1 with psp mode1 reset, so in amdgpu_device_gpu_recover, we need differentiate which mode1 reset we are using, then decide if it's a full reset and then decide if emergency restart is needed, the logic will become much more complex. After discussion with Hawking, move emergency restart logic to an independent function. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Wenhui Sheng <Wenhui.Sheng@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-15 12:41:47 -04:00
Nirmoy Das	2b9f78481b	drm/amdgpu: minor cleanup of phase1 suspend code Cleanup of phase1 suspend code to reduce unnecessary indentation. Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-10 17:42:42 -04:00
Likun Gao	131a3c7474	drm/amdgpu: enable gpu recovery for sienna cichlid Enable gpu recovery for sienna cichlid by default to trigger gpu recovery once needed. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-10 17:41:06 -04:00
Hawking Zhang	e78b579d2d	Revert "drm/amdgpu: support access regs outside of mmio bar" This reverts commit `2eee0229f6`. Fallback to a stable base until we have a correct new one Signed-off-by:Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-02 12:02:56 -04:00
Monk Liu	2c738637ba	drm/amdgpu: make IB test synchronize with init for SRIOV(v2) issue: originally we kickoff IB test asynchronously with driver's init, thus the IB test may still running when the driver loading done (modprobe amdgpu done). if we shutdown VM immediately after amdgpu driver loaded then GPU may hang because the IB test is still running fix: flush the delayed_init routine at the bottom of device_init to avoid driver loading done prior to the IB test completes Signed-off-by: Monk Liu <Monk.Liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-02 12:02:55 -04:00
Wenhui Sheng	e3a4d51c76	drm/amdgpu: merge atombios init block After we move request full access before set ip blocks, we can merge atombios init block of SRIOV and baremetal together. Signed-off-by: Wenhui Sheng <Wenhui.Sheng@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-02 12:02:51 -04:00
Wenhui Sheng	00a979f3d6	drm/amdgpu: invoke req full access early enough From SIENNA_CICHLID, HW introduce a new protection feature which can control the FB, doorbell and MMIO write access for VF, so guest driver should request full access before ip discovery, or we couldn't access ip discovery data in FB. Signed-off-by: Wenhui Sheng <Wenhui.Sheng@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-02 12:02:51 -04:00
Nirmoy Das	75e1658ea0	drm/amdgpu: call release_firmware() without a NULL check The release_firmware() function is NULL tolerant so we do not need to check for NULL param before calling it. Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-01 01:59:27 -04:00
Kevin Wang	5d5bd5e32e	drm/amdgpu: restrict the hw sched jobs number to power of two the module parameter sched_hw_submission is probably from user mode, and the kernel need to check whether it is legal. 1. align hw sched jobs to power of 2 and set minimum number is 2. 2. use kernel api is_power_of_2() to simplify driver code. Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-01 01:59:23 -04:00
Colton Lewis	282fd22b46	drm/amd: correct trivial kernel-doc inconsistencies Silence documentation warnings by correcting kernel-doc comments. ./drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:3388: warning: Excess function parameter 'suspend' description in 'amdgpu_device_suspend' ./drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:3485: warning: Excess function parameter 'resume' description in 'amdgpu_device_resume' ./drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c:418: warning: Excess function parameter 'tbo' description in 'amdgpu_vram_mgr_del' ./drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c:418: warning: Excess function parameter 'place' description in 'amdgpu_vram_mgr_del' ./drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c:279: warning: Excess function parameter 'tbo' description in 'amdgpu_gtt_mgr_del' ./drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c:279: warning: Excess function parameter 'place' description in 'amdgpu_gtt_mgr_del' ./drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:332: warning: Function parameter or member 'hdcp_workqueue' not described in 'amdgpu_display_manager' ./drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:332: warning: Function parameter or member 'cached_dc_state' not described in 'amdgpu_display_manager' Signed-off-by: Colton Lewis <colton.w.lewis@protonmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-01 01:59:19 -04:00
Alex Deucher	b7221f2b46	drm/amdgpu: skip BAR resizing if the bios already did it No need to do it again. Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-01 01:59:17 -04:00
Stanley.Yang	5278a159cf	drm/amdgpu: support reserve bad page for virt (v3) v1: rename some functions name, only init ras error handler data for supported asic. v2: fix potential memory leak. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-01 01:59:17 -04:00
Tianci.Yin	cc375d8c52	drm/amdgpu: temporarily read bounding box from gpu_info fw for navi12 The bounding box is still needed by Navi12, temporarily read it from gpu_info firmware. Should be droped when DAL no longer needs it. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-01 01:59:15 -04:00
Jerry (Fangzhi) Zuo	81d9bfb8c5	drm/amdgpu/dc: Add missing Sienna_Cichlid chip id Signed-off-by: Jerry (Fangzhi) Zuo <Jerry.Zuo@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-01 01:59:10 -04:00
Likun Gao	11e8aef52e	drm/amdgpu: set asic family and ip blocks for sienna_cichlid Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-06-03 13:51:57 -04:00
Likun Gao	c0a43457dc	drm/amdgpu: add sienna_cichlid gpu info firmware v2 gpu info fw contains chip specific parameters. v2: fix fw_name Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-06-03 13:51:57 -04:00
Likun Gao	ccaf72d3c2	drm/amdgpu: add sienna_cichlid asic type Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-06-03 13:51:56 -04:00
Alex Deucher	4292b0b202	drm/amdgpu: clean up discovery testing Rather than checking of the variable is enabled and the chip is the right family check for the presence of the discovery table. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-05-29 13:55:07 -04:00
Alex Deucher	258620d0b3	drm/amdgpu: skip gpu_info firmware if discovery info is available The GPU info firmware is only applicable at bring up when the IP discovery table is not present. If it's available, use that first and then fallback to parsing the gpu info firmware. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-05-29 13:55:07 -04:00
Evan Quan	b265bdbd9f	drm/amdgpu: added a sysfs interface for thermal throttling related V4 User can check and set the enablement of throttling logging and the interval between each logging. V2: simplify the sysfs interface(no string parsing) V3: add proper lock protection on updating throttling_logging_rs.interval V4: documentation cosmetic per Luben's suggestion Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-05-29 13:55:07 -04:00
Alex Deucher	da87c30b17	drm/amdgpu: put some case statments in family order SI and CIK came before VI and newer asics. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-05-28 14:00:49 -04:00
Alex Deucher	e1ad2d53bc	drm/amdgpu: simplify CZ/ST and KV/KB/ML checks Just check for APU. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-05-28 14:00:49 -04:00
Alex Deucher	70534d1ee8	drm/amdgpu: simplify raven and renoir checks Just check for APU. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-05-28 14:00:49 -04:00
Alex Deucher	54f78a7655	drm/amdgpu: add apu flags (v2) Add some APU flags to simplify handling of different APU variants. It's easier to understand the special cases if we use names flags rather than checking device ids and silicon revisions. v2: rebase on latest code Acked-by: Evan Quan <evan.quan@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-05-22 13:41:53 -04:00
Alex Deucher	6e29c227a4	drm/amdgpu: move gpu_info parsing after common early init We need to get the silicon revision id before we parse the firmware in order to load the correct gpu info firmware for raven2 variants. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1103 Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-05-22 13:41:32 -04:00
Alex Deucher	6ba57b7a8f	drm/amdgpu: move discovery gfx config fetching Move it into the fw_info function since it's logically part of the same functionality. Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-05-22 13:41:22 -04:00
Jiange Zhao	728e7e0cd6	drm/amdgpu: Add autodump debugfs node for gpu reset v8 When GPU got timeout, it would notify an interested part of an opportunity to dump info before actual GPU reset. A usermode app would open 'autodump' node under debugfs system and poll() for readable/writable. When a GPU reset is due, amdgpu would notify usermode app through wait_queue_head and give it 10 minutes to dump info. After usermode app has done its work, this 'autodump' node is closed. On node closure, amdgpu gets to know the dump is done through the completion that is triggered in release(). There is no write or read callback because necessary info can be obtained through dmesg and umr. Messages back and forth between usermode app and amdgpu are unnecessary. v2: (1) changed 'registered' to 'app_listening' (2) add a mutex in open() to prevent race condition v3 (chk): grab the reset lock to avoid race in autodump_open, rename debugfs file to amdgpu_autodump, provide autodump_read as well, style and code cleanups v4: add 'bool app_listening' to differentiate situations, so that the node can be reopened; also, there is no need to wait for completion when no app is waiting for a dump. v5: change 'bool app_listening' to 'enum amdgpu_autodump_state' add 'app_state_mutex' for race conditions: (1)Only 1 user can open this file node (2)wait_dump() can only take effect after poll() executed. (3)eliminated the race condition between release() and wait_dump() v6: removed 'enum amdgpu_autodump_state' and 'app_state_mutex' removed state checking in amdgpu_debugfs_wait_dump Improve on top of version 3 so that the node can be reopened. v7: move reinit_completion into open() so that only one user can open it. v8: remove complete_all() from amdgpu_debugfs_wait_dump(). Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-05-18 11:23:37 -04:00
Nirmoy Das	77f3a5cd70	drm/amdgpu: cleanup sysfs file handling Create sysfs file using attributes. Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-05-08 14:32:24 -04:00

... 2 3 4 5 6 ...

918 commits