linux

mirror of synced 2025-03-06 20:59:54 +01:00

Author	SHA1	Message	Date
Evan Quan	7dd8c205ea	drm/amdgpu: code cleanup around gpu reset Make code more readable. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:49 -04:00
Evan Quan	9e94d22c00	drm/amdgpu: optimize the gpu reset for XGMI setup V2 This is basically just some code cosmetic. The current design for XGMI setup gput reset is to operate on current device(adev) first and then on other devices from the hive(by another 'for' loop). But actually we can do some sort to the device list(to put current device 1st position) and handle all the devices in a single 'for' loop. V2: added missing hive->hive_lock protection Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:49 -04:00
Evan Quan	52fb44cf30	drm/amdgpu: correct cancel_delayed_work_sync on gpu reset As for XGMI setup, it should be performed on other devices from the hive also. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:49 -04:00
Evan Quan	a2f63ee8b5	drm/amdgpu: correct fbdev suspend on gpu reset As for XGMI setup, it needs to be performed on all the devices from the same hive. Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:49 -04:00
Bernard Zhao	10f39758b8	drm/amdgpu: cleanup coding style in amdkfd a bit Make the code a bit more readable by using a common error handling pattern. Signed-off-by: Bernard Zhao <bernard@vivo.com> Reviewed-by: Christian König <christian.koenig@amd.com>. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:49 -04:00
Kevin Wang	e05185b341	drm/amdgpu: clean up unused variable about ring lru clean up unused variable: 1. ring_lru_list 2. ring_lru_list_lock related-commit: drm/amdgpu: remove ring lru handling Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:49 -04:00
Dennis Li	4cc1178e16	drm/amdgpu: replace DRM prefix with PCI device info for gfx/mmhub Prefix RAS message printing in gfx/mmhub with PCI device info, which assists the debug in multiple GPU case. Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:49 -04:00
Jiawei	7aba19182e	drm/amdgpu: disble vblank when unloading sriov driver disble vblank in dce_vitual_crtc_commit(), which is skipped under sriov before Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Jiawei <Jiawei.Gu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:49 -04:00
Yong Zhao	d69b8971e5	drm/amdgpu: Print CU information by default during initialization This is convenient for multiple teams to obtain the information. Also, add device info by using dev_info(). Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:49 -04:00
Yong Zhao	e1046a1f70	drm/amdgpu: Adjust the SDMA doorbell info printing Turn off the printing by default because it is not very useful, while adding more details. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:49 -04:00
Jonathan Kim	d84a430d9f	drm/amdgpu: fix race between pstate and remote buffer map Vega20 arbitrates pstate at hive level and not device level. Last peer to remote buffer unmap could drop P-State while another process is still remote buffer mapped. With this fix, P-States still needs to be disabled for now as SMU bug was discovered on synchronous P2P transfers. This should be fixed in the next FW update. Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:46 -04:00
James Zhu	4f610503f0	Revert "drm/amdgpu: Disable gfx off if VCN is busy" This reverts commit `3fded222f4` This is work around for vcn1 only. Currently vcn1 has separate begin_use and idle work handle. Signed-off-by: James Zhu <James.Zhu@amd.com> Tested-by: changzhu <Changfeng.Zhu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:46 -04:00
Guchun Chen	12c17b9d62	drm/amdgpu: fix kernel page fault issue by ras recovery on sGPU When running ras uncorrectable error injection and triggering GPU reset on sGPU, below issue is observed. It's caused by the list uninitialized when accessing. [ 80.047227] BUG: unable to handle page fault for address: ffffffffc0f4f750 [ 80.047300] #PF: supervisor write access in kernel mode [ 80.047351] #PF: error_code(0x0003) - permissions violation [ 80.047404] PGD 12c20e067 P4D 12c20e067 PUD 12c210067 PMD 41c4ee067 PTE 404316061 [ 80.047477] Oops: 0003 [#1] SMP PTI [ 80.047516] CPU: 7 PID: 377 Comm: kworker/7:2 Tainted: G OE 5.4.0-rc7-guchchen #1 [ 80.047594] Hardware name: System manufacturer System Product Name/TUF Z370-PLUS GAMING II, BIOS 0411 09/21/2018 [ 80.047888] Workqueue: events amdgpu_ras_do_recovery [amdgpu] Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: John Clements <John.Clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:46 -04:00
Kent Russell	69d0c18dda	drm/amdgpu: Disable FRU read on Arcturus Update the list with supported Arcturus chips, but disable for now until final list is confirmed. Ideally we can poll atombios for FRU support, instead of maintaining this list of chips, but this will enable serial number reading for supported ASICs for the time-being. Signed-off-by: Kent Russell <kent.russell@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:46 -04:00
Rajneesh Bhardwaj	53c9c89ac1	drm/amdgpu/gmc: Fix spelling mistake. Fixes a minor typo in the file. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:46 -04:00
Kent Russell	fdd21e62b0	Revert "drm/amdgpu: use the BAR if possible in amdgpu_device_vram_access v2" This reverts commit `c12b84d6e0`. The original patch causes a RAS event and subsequent kernel hard-hang when running the KFDMemoryTest.PtraceAccessInvisibleVram on VG20 and Arcturus dmesg output at hang time: [drm] RAS event of type ERREVENT_ATHUB_INTERRUPT detected! amdgpu 0000:67:00.0: GPU reset begin! Evicting PASID 0x8000 queues Started evicting pasid 0x8000 qcm fence wait loop timeout expired The cp might be in an unrecoverable state due to an unsuccessful queues preemption Failed to evict process queues Failed to suspend process 0x8000 Finished evicting pasid 0x8000 Started restoring pasid 0x8000 Finished restoring pasid 0x8000 [drm] UVD VCPU state may lost due to RAS ERREVENT_ATHUB_INTERRUPT amdgpu: [powerplay] Failed to send message 0x26, response 0x0 amdgpu: [powerplay] Failed to set soft min gfxclk ! amdgpu: [powerplay] Failed to upload DPM Bootup Levels! amdgpu: [powerplay] Failed to send message 0x7, response 0x0 amdgpu: [powerplay] [DisableAllSMUFeatures] Failed to disable all smu features! amdgpu: [powerplay] [DisableDpmTasks] Failed to disable all smu features! amdgpu: [powerplay] [PowerOffAsic] Failed to disable DPM! [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] ERROR suspend of IP block <powerplay> failed -5 Signed-off-by: Kent Russell <kent.russell@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:46 -04:00
Alex Deucher	079c72ad3a	drm/amdgpu/gfx9: add gfxoff quirk Fix screen corruption with firefox. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=207171 Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:46 -04:00
John Clements	7f70443fd8	drm/amdgpu: set mp1 state before reload Set MP1 state to prepare for unload before reloading SMU FW Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:46 -04:00
John Clements	40e611bdd1	drm/amdgpu: update psp fw loading sequence Added dedicated function to check if particular fw should be skipped from loading. Added dedicated function for SMU FW loading via PSP Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:45 -04:00
Prike Liang	ced1ba9761	drm/amdgpu: fix the hw hang during perform system reboot and reset The system reboot failed as some IP blocks enter power gate before perform hw resource destory. Meanwhile use unify interface to set device CGPG to ungate state can simplify the amdgpu poweroff or reset ungate guard. Fixes: `487eca11a3` ("drm/amdgpu: fix gfx hang during suspend with video playback (v2)") Signed-off-by: Prike Liang <Prike.Liang@amd.com> Tested-by: Mengbing Wang <Mengbing.Wang@amd.com> Tested-by: Paul Menzel <pmenzel@molgen.mpg.de> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:45 -04:00
Dave Airlie	1aa63ddf72	drm-misc-next for 5.8: UAPI Changes: - drm: error out with EBUSY when device has existing master - drm: rework SET_MASTER and DROP_MASTER perm handling Cross-subsystem Changes: - fbdev: savage: fix -Wextra build warning - video: omap2: Use scnprintf() for avoiding potential buffer overflow Core Changes: - Remove drm_pci.h - drm_pci_{alloc/free)() are now legacy - Introduce managed DRM resourcesA - Allow drivers to subclass struct drm_framebuffer - Introduce struct drm_afbc_framebuffer and helpers - fbdev: remove return value from generic fbdev setup - Introduce simple-encoder helper - vram-helpers: set fence on plane - dp_mst: ACT timeout improvements - dp_mst: Remove drm_dp_mst_has_audio() - TTM: ttm_trace_dma_{map/unmap}() cleanups - dma-buf: add flag for PCIP2P support - EDID: Various improvements - Encoder: cleanup semantics of possible_clones and possible_crtcs - VBLANK documentation updates - Writeback documentation updates Driver Changes: - Convert several drivers to i2c_new_client_device() - Drop explicit drm_mode_config_cleanup() calls from drivers - Auto-release device structures with drmm_add_final_kfree() - Init bfdev console after registering DRM device - Make various .debugfs functions return 0 unconditionally; ignore errors - video: Use scnprintf() to avoid buffer overflows - Convert drivers to simple encoders - drm/amdgpu: note that we can handle peer2peer DMA-buf - drm/amdgpu: add support for exporting VRAM using DMA-buf v3 - drm/kirin: Revert change to register connectors - drm/lima: Add optional devfreq and cooling device support - drm/lima: Various improvements wrt. task handling - drm/panel: nt39016: Support multiple modes and 50Hz - drm/panel: Support Leadtek LTK050H3146W - drm/rockchip: Add support for afbc - drm/virtio: Various cleanups - drm/hisilicon/hibmc: Enforce 128-byte stride alignment - drm/qxl: Fix notify port address of cursor ring buffer - drm/sun4i: Improvements to format handling - drm/bridge: dw-hdmi: Various improvements -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAl6VfAAACgkQaA3BHVML eiNjBwgAtzRaqrKX3c4aL4NCBmfWzqxvKN0fVcx8tHtjhmrPTLITsHCM+wfcD2qC lkr/RMYJT02pNPGnX3jamQk0q/2GKGagChVZgORRsdYOOf5IqGIjvllhkg+U+7YV X0pHAfvGk2VyriHYj3s/cnwi9OwZ2UFjdS+f/u2Qp9jQYG/k8u9CCSnzgratY99I bI4jZi6JIoRkwuBpBEc9NbrduenKhcYNgPLDiYXY2TFmVz89NwITPnLyf5FWG5zd HsQ+dfIS9eoIxL3DTRgBZrPMvrqgiUjztB7cM4bdE0ttwTS7MW6M50/iV553qb9k DZ1+/pWFFyZLOPUYc3EK/QYdu8R3QA== =MQkd -----END PGP SIGNATURE----- Merge tag 'drm-misc-next-2020-04-14' of git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for 5.8: UAPI Changes: - drm: error out with EBUSY when device has existing master - drm: rework SET_MASTER and DROP_MASTER perm handling Cross-subsystem Changes: - mm: export two symbols from slub/slob - fbdev: savage: fix -Wextra build warning - video: omap2: Use scnprintf() for avoiding potential buffer overflow Core Changes: - Remove drm_pci.h - drm_pci_{alloc/free)() are now legacy - Introduce managed DRM resourcesA - Allow drivers to subclass struct drm_framebuffer - Introduce struct drm_afbc_framebuffer and helpers - fbdev: remove return value from generic fbdev setup - Introduce simple-encoder helper - vram-helpers: set fence on plane - dp_mst: ACT timeout improvements - dp_mst: Remove drm_dp_mst_has_audio() - TTM: ttm_trace_dma_{map/unmap}() cleanups - dma-buf: add flag for PCIP2P support - EDID: Various improvements - Encoder: cleanup semantics of possible_clones and possible_crtcs - VBLANK documentation updates - Writeback documentation updates Driver Changes: - Convert several drivers to i2c_new_client_device() - Drop explicit drm_mode_config_cleanup() calls from drivers - Auto-release device structures with drmm_add_final_kfree() - Init bfdev console after registering DRM device - Make various .debugfs functions return 0 unconditionally; ignore errors - video: Use scnprintf() to avoid buffer overflows - Convert drivers to simple encoders - drm/amdgpu: note that we can handle peer2peer DMA-buf - drm/amdgpu: add support for exporting VRAM using DMA-buf v3 - drm/kirin: Revert change to register connectors - drm/lima: Add optional devfreq and cooling device support - drm/lima: Various improvements wrt. task handling - drm/panel: nt39016: Support multiple modes and 50Hz - drm/panel: Support Leadtek LTK050H3146W - drm/rockchip: Add support for afbc - drm/virtio: Various cleanups - drm/hisilicon/hibmc: Enforce 128-byte stride alignment - drm/qxl: Fix notify port address of cursor ring buffer - drm/sun4i: Improvements to format handling - drm/bridge: dw-hdmi: Various improvements Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20200414090738.GA16827@linux-uq9g	2020-04-22 10:41:35 +10:00
Thomas Zimmermann	08d99b2c23	Merge drm/drm-next into drm-misc-next Backmerging required to pull topic/phy-compliance. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>	2020-04-17 08:12:22 +02:00
Dave Airlie	1e6adfe565	Merge tag 'amd-drm-fixes-5.7-2020-04-15' of git://people.freedesktop.org/~agd5f/linux into drm-fixes amd-drm-fixes-5.7-2020-04-15: amdgpu: - gfx10 fix - SMU7 overclocking fix - RAS fix - GPU reset fix - Fix a regression in a previous s/r fix - Add a gfxoff quirk Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200415221631.3924-1-alexander.deucher@amd.com	2020-04-16 11:24:10 +10:00
Alex Deucher	974229db7e	drm/amdgpu/gfx9: add gfxoff quirk Fix screen corruption with firefox. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=207171 Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2020-04-14 12:48:43 -04:00
Prike Liang	b2a7e9735a	drm/amdgpu: fix the hw hang during perform system reboot and reset The system reboot failed as some IP blocks enter power gate before perform hw resource destory. Meanwhile use unify interface to set device CGPG to ungate state can simplify the amdgpu poweroff or reset ungate guard. Fixes: `487eca11a3` ("drm/amdgpu: fix gfx hang during suspend with video playback (v2)") Signed-off-by: Prike Liang <Prike.Liang@amd.com> Tested-by: Mengbing Wang <Mengbing.Wang@amd.com> Tested-by: Paul Menzel <pmenzel@molgen.mpg.de> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2020-04-14 12:48:01 -04:00
Evan Quan	028cfb2444	drm/amdgpu: fix wrong vram lost counter increment V2 Vram lost counter is wrongly increased by two during baco reset. V2: assumed vram lost for mode1 reset on all ASICs Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-13 12:07:09 -04:00
Jason Yan	8e2f842063	drm/amdgpu: remove dead code in si_dpm.c This code is dead, let's remove it. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-13 12:02:42 -04:00
Aurabindo Pillai	dd4fa6c1b8	drm/amd/amdgpu: remove hardcoded module name in prints Let format prefixes take care of printing the module name through pr_fmt and dev_fmt definitions. Signed-off-by: Aurabindo Pillai <mail@aurabindo.in> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-13 12:02:40 -04:00
Aurabindo Pillai	539489fc91	drm/amd/amdgpu: add print prefix for dev_* variants Define dev_fmt macro for informative print messages Signed-off-by: Aurabindo Pillai <mail@aurabindo.in> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-13 12:02:37 -04:00
Aurabindo Pillai	d57229b1da	drm/amd/amdgpu: add prefix for pr_* prints amdgpu uses lots of pr_* calls for printing error messages. With this prefix, errors shall be more obvious to the end use regarding its origin, and may help debugging. Prefix format: [xxx.xxxxx] amdgpu: ... Signed-off-by: Aurabindo Pillai <mail@aurabindo.in> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-13 12:02:35 -04:00
Alex Deucher	a4c2468027	drm/amdgpu/ring: simplify scheduler setup logic Set up a GPU scheduler based on the ring flag rather than the ring type. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-13 12:02:26 -04:00
Alex Deucher	a783910d5c	drm/amdgpu/kiq: add no_scheduler flag to KIQ We don't want a GPU scheduler for this ring. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-13 12:02:24 -04:00
Alex Deucher	cb3d108501	drm/amdgpu/ring: add no_scheduler flag This allows IPs to flag whether a specific ring requires a GPU scheduler or not. E.g., sometimes instances of an IP are asymmetric and have different capabilities. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-13 12:02:19 -04:00
Evan Quan	dadce777e0	drm/amdgpu: fix wrong vram lost counter increment V2 Vram lost counter is wrongly increased by two during baco reset. V2: assumed vram lost for mode1 reset on all ASICs Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-13 12:02:08 -04:00
Guchun Chen	ed72aa21c7	drm/amdgpu: replace DRM prefix with PCI device info for GFX RAS Prefix RAS message printing in GFX IP with PCI device info, which assists the debug in multiple GPU case. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-13 12:02:03 -04:00
Yintian Tao	d32709dac6	drm/amdgpu: resume kiq access debugfs If there is no GPU hang, user still can access debugfs through kiq. Signed-off-by: Yintian Tao <yttao@amd.com> Reviewed-by: Monk Liu <Monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-13 12:01:56 -04:00
Guchun Chen	6952e99cfd	drm/amdgpu: refine ras related message print Prefix ras related kernel message logging with PCI device info by replacing DRM_INFO/WARN/ERROR with dev_info/warn/err. This can clearly tell user about GPU device information where ras is. And add some other ras message printing to make it more clear and friendly as well. Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-13 12:01:50 -04:00
Guchun Chen	1f3ef0efba	drm/amdgpu: add uncorrectable error count print in UMC ecc irq cb Uncorrectable error count printing is missed when issuing UMC UE injection. When going to the error count log function in GPU recover work thread, there is no chance to get correct error count value by last error injection and print, because the error status register is automatically cleared after reading in UMC ecc irq callback. So add such message printing in UMC ecc irq cb to be consistent with other RAS error interrupt cases. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-13 12:01:44 -04:00
Yintian Tao	95a2f91738	drm/amdgpu: restrict debugfs register access under SR-IOV Under bare metal, there is no more else to take care of the GPU register access through MMIO. Under Virtualization, to access GPU register is implemented through KIQ during run-time due to world-switch. Therefore, under SR-IOV user can only access debugfs to r/w GPU registers when meets all three conditions below. - amdgpu_gpu_recovery=0 - TDR happened - in_gpu_reset=0 v2: merge amdgpu_virt_can_access_debugfs() into amdgpu_virt_enable_access_debugfs() v3: drop ret variable in amdgpu_virt_enable_access_debugfs() and directly return result Signed-off-by: Yintian Tao <yttao@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-13 12:01:04 -04:00
Linus Torvalds	21c5b3c6d7	drm fixes for 5.7-rc1 (part two) legacy: - fix drm_local_map.offset type ttm: - temporarily disable hugepages to debug amdgpu problems. prime: - fix sg extraction amdgpu: - Various Renoir fixes - Fix gfx clockgating sequence on gfx10 - RAS fixes - Avoid MST property creation after registration - Various cursor/viewport fixes - Fix a confusing log message about optional firmwares i915: - Flush all the reloc_gpu batch (Chris) - Ignore readonly failures when updating relocs (Chris) - Fill all the unused space in the GGTT (Chris) - Return the right vswing table (Jose) - Don't enable DDI IO power on a TypeC port in TBT mode for ICL+ (Imre) analogix_dp: - probe fix virtio: - oob fix in object create -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJej453AAoJEAx081l5xIa+r7YQAIX48cUROehoNDhzEHnAxJuU WZXNHKaMCaDPzAs6SyCtHiPFWH6trBR5McE2dXfg6qc33lnzROFNp5PLB7qb4O+q +3QkJ5cGd1bohT7vn3omP9FxxZeD4H4bE/zat+yUPwMWJYSUz4m6w6Ya0rIPa8HS d9nRL0Y6wGBhm8/E0WCB6fe5G96D1JOFGLhfbVajGlDW/I+eBiS5WEyrtlIW698K Q7lTNOXKEi9kFEZiW39RbKwW3YwqOEiQf1k0KbvUqctq4qLskHD3MgJpmkAiGVPH mSnTYPPyATILVGKcmEHR3oB9wuYsoPhNgGCVhhm1MppI8GVUzfk6uqOfdK8UNfDU IRAZ05AynmMMFNu/4Fw0SyR1sbj4OtAiG0hWaJ6Ou9MBzhERGXfT3+/BzeHsR4MJ +fVIbOArSCAeFTAkqcLVbMKjivAJjullpsj36DFn+lXmHnxB98zAkSNT5dQcDjzl bp6FhJXm7pWYx8SvkGRneESqLdr2WVgyZmP6u+kgzZ5pPubWSDqY1IFu1exb5Sne bf7HoPzQ6LyD5KgX5WdoJ5++bcvQ9G4/qDF96NY6emCMKcwnOaAzvtErxQLpFeWP dZwnxHXXtxY4Z4r5bFURPeWX3rWX5f/cCZ8B7mTUDSTa4hgzV8yUX3ZQBc+9Knja zuvnpm4j1BXFqOg0Xfsu =ZTAf -----END PGP SIGNATURE----- Merge tag 'drm-next-2020-04-10' of git://anongit.freedesktop.org/drm/drm Pull more drm fixes from Dave Airlie: "As expected, more fixes did turn up in the latter part of the week. The drm_local_map build regression fix is here, along with temporary disabling of the hugepage work due to some amdgpu related crashes. Otherwise it's just a bunch of i915, and amdgpu fixes. legacy: - fix drm_local_map.offset type ttm: - temporarily disable hugepages to debug amdgpu problems. prime: - fix sg extraction amdgpu: - Various Renoir fixes - Fix gfx clockgating sequence on gfx10 - RAS fixes - Avoid MST property creation after registration - Various cursor/viewport fixes - Fix a confusing log message about optional firmwares i915: - Flush all the reloc_gpu batch (Chris) - Ignore readonly failures when updating relocs (Chris) - Fill all the unused space in the GGTT (Chris) - Return the right vswing table (Jose) - Don't enable DDI IO power on a TypeC port in TBT mode for ICL+ (Imre) analogix_dp: - probe fix virtio: - oob fix in object create" * tag 'drm-next-2020-04-10' of git://anongit.freedesktop.org/drm/drm: (34 commits) drm/ttm: Temporarily disable the huge_fault() callback drm/bridge: analogix_dp: Split bind() into probe() and real bind() drm/legacy: Fix type for drm_local_map.offset drm/amdgpu/display: fix warning when compiling without debugfs drm/amdgpu: unify fw_write_wait for new gfx9 asics drm/amd/powerplay: error out on forcing clock setting not supported drm/amdgpu: fix gfx hang during suspend with video playback (v2) drm/amd/display: Check for null fclk voltage when parsing clock table drm/amd/display: Acknowledge wm_optimized_required drm/amd/display: Make cursor source translation adjustment optional drm/amd/display: Calculate scaling ratios on every medium/full update drm/amd/display: Program viewport when source pos changes for DCN20 hw seq drm/amd/display: Fix incorrect cursor pos on scaled primary plane drm/amd/display: change default pipe_split policy for DCN1 drm/amd/display: Translate cursor position by source rect drm/amd/display: Update stream adjust in dc_stream_adjust_vmin_vmax drm/amd/display: Avoid create MST prop after registration drm/amdgpu/psp: dont warn on missing optional TA's drm/amdgpu: update RAS related dmesg print drm/amdgpu: resolve mGPU RAS query instability ...	2020-04-10 12:38:28 -07:00
Likun Gao	bfa5807d4d	Revert "drm/amdgpu: change SH MEM alignment mode for gfx10" This reverts commit `b74fb888f4`. Revert the auto alignment mode set of SH MEM config, as it will result to OCL Conformance Test fail. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-09 10:44:41 -04:00
John Clements	9a785c7ad1	drm/amdgpu: increased atom cmd timeout added macro to define timeout Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-09 10:43:33 -04:00
Aurabindo Pillai	ad36d71b3f	amdgpu_kms: Remove unnecessary condition check Execution will only reach here if the asserted condition is true. Hence there is no need for the additional check. Signed-off-by: Aurabindo Pillai <mail@aurabindo.in> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-09 10:43:18 -04:00
Aaron Liu	ba714a56fc	drm/amdgpu: unify fw_write_wait for new gfx9 asics Make the fw_write_wait default case true since presumably all new gfx9 asics will have updated firmware. That is using unique WAIT_REG_MEM packet with opration=1. Signed-off-by: Aaron Liu <aaron.liu@amd.com> Tested-by: Aaron Liu <aaron.liu@amd.com> Tested-by: Yuxian Dai <Yuxian.Dai@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-09 10:43:18 -04:00
Hawking Zhang	2eee0229f6	drm/amdgpu: support access regs outside of mmio bar add indirect access support to registers outside of mmio bar. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-09 10:43:18 -04:00
Hawking Zhang	f384ff95f6	drm/amdgpu: retire AMDGPU_REGS_KIQ flag all the register access through kiq is redirected to amdgpu_kiq_rreg/amdgpu_kiq_wreg Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-09 10:43:18 -04:00
Hawking Zhang	ec59847e74	drm/amdgpu: retire RREG32_IDX/WREG32_IDX those are not needed anymore Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-09 10:43:18 -04:00
Hawking Zhang	3c888c1635	drm/amdgpu: retire indirect mmio reg support from cgs not needed anymore Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-09 10:43:18 -04:00
Hawking Zhang	46e840ed10	drm/amdgpu: replace indirect mmio access in non-dc code path all the mmCUR_CONTROL instances are in mmr range and can be accessd directly by using RREG32/WREG32 Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-09 10:43:18 -04:00
Hawking Zhang	dec0520aff	drm/amdgpu: remove inproper workaround for vega10 the workaround is not needed for soc15 ASICs except for vega10. it is even not needed with latest vega10 vbios. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-09 10:43:18 -04:00

... 10 11 12 13 14 ...

7773 commits