1
0
Fork 0
mirror of synced 2025-03-06 20:59:54 +01:00
Commit graph

11668 commits

Author SHA1 Message Date
YiPeng Chai
f5c7e77970 drm/amdgpu: Adjust removal control flow for smu v13_0_2
Adjust removal control flow for smu v13_0_2:
   During amdgpu uninstallation, when removing the first
device, the kernel needs to first send a mode1reset message
to all gpu devices. Otherwise, smu initialization will fail
the next time amdgpu is installed.

V2:
1. Update commit comments.
2. Remove the global variable amdgpu_device_remove_cnt
   and add a variable to the structure amdgpu_hive_info.
3. Use hive to detect the first removed device instead of
   a global variable.

V3:
 1. Update commit comments.
 2. Split a patch into multiple patches.
 3. The current patch does:
    a. Add a work mode of AMDGPU_RESET_FOR_DEVICE_REMOVE into
       the existing gpu recover path, which make all devices
       in hive list only have HW reset but no resume (except
       the base IP).
    b. Call AMDGPU_RESET_FOR_DEVICE_REMOVE and
       AMDGPU_NEED_FULL_RESET mode of amdgpu_device_gpu_recover
       in amdgpu_pci_remove when removing the first device in
       hive list.
    c. When removing the first device, the IP blocks keyword
       function call sequence is as follows:
.suspend->mode1reset->.resume(basic ip)->.hw_fini->.early_fini->.sw_fini.
   ^                           |
   |-<----------<---------<----|
	The first three sequences are because of a call to
        amdgpu_device_gpu_recover. The three sequences will be
        executed in a loop until all devices in the hive list
        are iterated.
        The sequences starting from .hw_fini only apply to the
        first device. Since .suspend has been called before,
        except the resumed phase1 basic ip blocks, all other ip
        blocks .hw_fini of current device will do nothing.
     d. When removing other devices, the calling sequences is the
        same as legacy:
	   .hw_fini -> .early_fini -> .sw_fini.
	Since .suspend has been called when removing the first device,
        except the resumed phase1 basic ip blocks, all of other ip
        blocks .hw_fini of current device will do nothing.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-19 15:17:20 -04:00
Yifan Zhang
10faf07871 drm/amdgpu: add MES and MES-KIQ version in debugfs
This patch addes MES and MES-KIQ version in debugfs.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-19 15:10:04 -04:00
Yang Li
7f89f9973c drm/amd/display: clean up some inconsistent indentings
clean up some inconsistent indentings

Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2178
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-19 15:07:54 -04:00
Hawking Zhang
670c6edfbb drm/amdgpu: add rlcv/rlcp version info to debugfs
amdgpu_firmware_info debugfs will show rlcv/rlcp
ucode version info

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-19 15:07:40 -04:00
Hawking Zhang
d5c6ad7296 drm/amdgpu: support print rlc v2_x ucode hdr
add rlc v2_x support to print_rlc_hdr helper

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-19 15:07:30 -04:00
Hawking Zhang
ed2eee42d3 drm/amdgpu: save rlcv/rlcp ucode version in amdgpu_gfx
cache rlcv/rlcvp ucode version info in amdgpu_gfx
structure

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-19 15:07:24 -04:00
Mukul Joshi
876552e5d5 drm/amdgpu: Update PTE flags with TF enabled
This patch updates the PTE flags when translate further (TF) is
enabled:
- With translate_further enabled, invalid PTEs can be 0. Reading
  consecutive invalid PTEs as 0 is considered a fault. To prevent
  this, ensure invalid PTEs have at least 1 bit set.
- The current invalid PTE flags settings to translate a retry fault
  into a no-retry fault, doesn't work with TF enabled. As a result,
  update invalid PTE flags settings which works for both TF enabled
  and disabled case.

Fixes: 352e683b72 ("drm/amdgpu: Enable translate_further to extend UTCL2 reach")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-19 15:06:13 -04:00
Philip Yang
3913f0179b drm/amdgpu: SDMA update use unlocked iterator
SDMA update page table may be called from unlocked context, this
generate below warning. Use unlocked iterator to handle this case.

WARNING: CPU: 0 PID: 1475 at
drivers/dma-buf/dma-resv.c:483 dma_resv_iter_next
Call Trace:
 dma_resv_iter_first+0x43/0xa0
 amdgpu_vm_sdma_update+0x69/0x2d0 [amdgpu]
 amdgpu_vm_ptes_update+0x29c/0x870 [amdgpu]
 amdgpu_vm_update_range+0x2f6/0x6c0 [amdgpu]
 svm_range_unmap_from_gpus+0x115/0x300 [amdgpu]
 svm_range_cpu_invalidate_pagetables+0x510/0x5e0 [amdgpu]
 __mmu_notifier_invalidate_range_start+0x1d3/0x230
 unmap_vmas+0x140/0x150
 unmap_region+0xa8/0x110

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-19 15:05:54 -04:00
Yifan Zhang
0af4ed0c32 drm/amdgpu/mes: zero the sdma_hqd_mask of 2nd SDMA engine for SDMA 6.0.1
there is only one SDMA engine in SDMA 6.0.1, the sdma_hqd_mask has to be
zeroed for the 2nd engine, otherwise MES scheduler will consider 2nd
engine exists and map/unmap SDMA queues to the non-existent engine.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-14 15:00:34 -04:00
Alex Deucher
a8671493d2 drm/amdgpu: make sure to init common IP before gmc
Move common IP init before GMC init so that HDP gets
remapped before GMC init which uses it.

This fixes the Unsupported Request error reported through
AER during driver load. The error happens as a write happens
to the remap offset before real remapping is done.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373

The error was unnoticed before and got visible because of the commit
referenced below. This doesn't fix anything in the commit below, rather
fixes the issue in amdgpu exposed by the commit. The reference is only
to associate this commit with below one so that both go together.

Fixes: 8795e182b0 ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")

Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2022-09-14 14:21:49 -04:00
Alex Deucher
e3163bc8ff drm/amdgpu: move nbio sdma_doorbell_range() into sdma code for vega
This mirrors what we do for other asics and this way we are
sure the sdma doorbell range is properly initialized.

There is a comment about the way doorbells on gfx9 work that
requires that they are initialized for other IPs before GFX
is initialized.  However, the statement says that it applies to
multimedia as well, but the VCN code currently initializes
doorbells after GFX and there are no known issues there.  In my
testing at least I don't see any problems on SDMA.

This is a prerequisite for fixing the Unsupported Request error
reported through AER during driver load.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373

The error was unnoticed before and got visible because of the commit
referenced below. This doesn't fix anything in the commit below, rather
fixes the issue in amdgpu exposed by the commit. The reference is only
to associate this commit with below one so that both go together.

Fixes: 8795e182b0 ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")

Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2022-09-14 14:21:43 -04:00
Alex Deucher
dc1d85cb79 drm/amdgpu: move nbio ih_doorbell_range() into ih code for vega
This mirrors what we do for other asics and this way we are
sure the ih doorbell range is properly initialized.

There is a comment about the way doorbells on gfx9 work that
requires that they are initialized for other IPs before GFX
is initialized.  In this case IH is initialized before GFX,
so there should be no issue.

This is a prerequisite for fixing the Unsupported Request error
reported through AER during driver load.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373

The error was unnoticed before and got visible because of the commit
referenced below. This doesn't fix anything in the commit below, rather
fixes the issue in amdgpu exposed by the commit. The reference is only
to associate this commit with below one so that both go together.

Fixes: 8795e182b0 ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")

Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2022-09-14 14:21:25 -04:00
Alex Deucher
c1c39032a0 drm/amdgpu: make sure to init common IP before gmc
Move common IP init before GMC init so that HDP gets
remapped before GMC init which uses it.

This fixes the Unsupported Request error reported through
AER during driver load. The error happens as a write happens
to the remap offset before real remapping is done.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373

The error was unnoticed before and got visible because of the commit
referenced below. This doesn't fix anything in the commit below, rather
fixes the issue in amdgpu exposed by the commit. The reference is only
to associate this commit with below one so that both go together.

Fixes: 8795e182b0 ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")

Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-14 12:38:52 -04:00
Alex Deucher
59c43748c7 drm/amdgpu: move nbio sdma_doorbell_range() into sdma code for vega
This mirrors what we do for other asics and this way we are
sure the sdma doorbell range is properly initialized.

There is a comment about the way doorbells on gfx9 work that
requires that they are initialized for other IPs before GFX
is initialized.  However, the statement says that it applies to
multimedia as well, but the VCN code currently initializes
doorbells after GFX and there are no known issues there.  In my
testing at least I don't see any problems on SDMA.

This is a prerequisite for fixing the Unsupported Request error
reported through AER during driver load.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373

The error was unnoticed before and got visible because of the commit
referenced below. This doesn't fix anything in the commit below, rather
fixes the issue in amdgpu exposed by the commit. The reference is only
to associate this commit with below one so that both go together.

Fixes: 8795e182b0 ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")

Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-14 12:38:52 -04:00
Alex Deucher
db10109793 drm/amdgpu: move nbio ih_doorbell_range() into ih code for vega
This mirrors what we do for other asics and this way we are
sure the ih doorbell range is properly initialized.

There is a comment about the way doorbells on gfx9 work that
requires that they are initialized for other IPs before GFX
is initialized.  In this case IH is initialized before GFX,
so there should be no issue.

This is a prerequisite for fixing the Unsupported Request error
reported through AER during driver load.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373

The error was unnoticed before and got visible because of the commit
referenced below. This doesn't fix anything in the commit below, rather
fixes the issue in amdgpu exposed by the commit. The reference is only
to associate this commit with below one so that both go together.

Fixes: 8795e182b0 ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")

Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-14 12:38:52 -04:00
Maxime Ripard
3f1a3a28e9 Immutable backlight-detect-refactor branch between acpi, drm-* and pdx86
Tag (immutable branch) with v6.0-rc1 + the (acpi/x86) backlight
 detect refactor work. For merging into the acpi, drm-* and pdx86
 subsystems.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEEuvA7XScYQRpenhd+kuxHeUQDJ9wFAmMVsogUHGhkZWdvZWRl
 QHJlZGhhdC5jb20ACgkQkuxHeUQDJ9yy6wgAlig+7hkq940L62lTpj0g2gNQv8zc
 HCsMpnU7dnJcZYaEvIjouZhf33ZbN52c0fQq2JWjt7fFX04LLyIiyrJ26Lc293JR
 ++yXpJcVoewRGqApy/P3Z05TKUCLll5bexvK4t8isnhOtEXD/nDPWKTLIV2Kd1DK
 nLY4KgRznXZ85RhYheUEdidZ7Lwlzt1JVBMq7tpnzu3nVdDExyZmqlqCUITcLynu
 ysuASQGr0D2i+1vb9eifHIA3xsQO0S37Bv62aBMBKxB6B8Fz1DYr8VA2YvoT82Hv
 IFT0hzCCZ/63Ljga05O78TwraxAQX0RvZWqjqGgnZg6fIBh2hxUiqeQY6g==
 =SA1R
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iHUEABMIAB0WIQTXEe0+DlZaRlgM8LOIQ8rmN6G3ywUCYyG6jgAKCRCIQ8rmN6G3
 y3JSAQCKELIhrWPrqAixxWn3OWcaU9PHb4Uhf9Qg3sLL2Bm2TAEAuT8iJ0Kssakg
 GKWScS5nj+8kbFm0J067JLOOYjvTxEI=
 =42X0
 -----END PGP SIGNATURE-----

Merge tag 'backlight-detect-refactor-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 into drm-misc-next

Immutable backlight-detect-refactor branch between acpi, drm-* and pdx86

Tag (immutable branch) with v6.0-rc1 + the (acpi/x86) backlight
detect refactor work. For merging into the acpi, drm-* and pdx86
subsystems.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>

# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEEuvA7XScYQRpenhd+kuxHeUQDJ9wFAmMVsogUHGhkZWdvZWRl
# QHJlZGhhdC5jb20ACgkQkuxHeUQDJ9yy6wgAlig+7hkq940L62lTpj0g2gNQv8zc
# HCsMpnU7dnJcZYaEvIjouZhf33ZbN52c0fQq2JWjt7fFX04LLyIiyrJ26Lc293JR
# ++yXpJcVoewRGqApy/P3Z05TKUCLll5bexvK4t8isnhOtEXD/nDPWKTLIV2Kd1DK
# nLY4KgRznXZ85RhYheUEdidZ7Lwlzt1JVBMq7tpnzu3nVdDExyZmqlqCUITcLynu
# ysuASQGr0D2i+1vb9eifHIA3xsQO0S37Bv62aBMBKxB6B8Fz1DYr8VA2YvoT82Hv
# IFT0hzCCZ/63Ljga05O78TwraxAQX0RvZWqjqGgnZg6fIBh2hxUiqeQY6g==
# =SA1R
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 05 Sep 2022 09:25:44 AM IST
# gpg:                using RSA key BAF03B5D2718411A5E9E177E92EC4779440327DC
# gpg:                issuer "hdegoede@redhat.com"
# gpg: Can't check signature: No public key
From: Hans de Goede <hdegoede@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/261afe3d-7790-e945-adf6-a2c96c9b1eff@redhat.com
2022-09-14 12:27:10 +01:00
Alex Deucher
221bb3a9c3 drm/amdgpu: fix warning about missing imu prototype
for imu_v11_0_3_program_rlc_ram(). Include imu_v11_0_3.h
in imu_v11_0_3.c.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-13 14:33:01 -04:00
Christian König
d4e8ad908b drm/amdgpu: reorder CS code
Sort the functions in the order they are called

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-13 14:33:01 -04:00
Christian König
88c98d54b2 drm/amdgpu: cleanup CS init/fini and pass1
Cleanup the coding style and function names to represent the data
they process. Only initialize and cleanup the CS structure in
init/fini.

Check the size of the IB chunk in pass1.

v2: fix job initialisation order and use correct scheduler instance
v3: try to move all functional changes into a separate patch.
v4: move reordering and pass2 out of this patch as well

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-13 14:33:01 -04:00
Christian König
4247084057 drm/amdgpu: use DMA_RESV_USAGE_BOOKKEEP v2
Use DMA_RESV_USAGE_BOOKKEEP for VM page table updates and KFD preemption fence.

v2: actually update all usages for KFD

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-13 14:33:01 -04:00
Christian König
dd80d9c8ee drm/amdgpu: revert "partial revert "remove ctx->lock" v2"
This reverts commit 94f4c4965e.

We found that the bo_list is missing a protection for its list entries.
Since that is fixed now this workaround can be removed again.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-13 14:33:01 -04:00
Christian König
736ec9fadd drm/amdgpu: move setting the job resources
Move setting the job resources into amdgpu_job.c

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-13 14:33:01 -04:00
Christian König
9b94c609cc drm/amdgpu: remove SRIOV and MCBP dependencies from the CS
We should not have any different CS constrains based
on the execution environment.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-13 14:33:01 -04:00
Candice Li
6da15a236c drm/amdgpu: Skip reset error status for psp v13_0_0
No need to reset error status since only umc ras supported on psp v13_0_0.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-13 14:32:59 -04:00
Candice Li
1582252946 drm/amdgpu: Add EEPROM I2C address for smu v13_0_0
Set correct EEPROM I2C address for smu v13_0_0.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-13 14:32:58 -04:00
John Clements
c3db1b9065 drm/amdgpu: added support for ras driver loading
copy ras driver to psp if present

Signed-off-by: John Clements <john.clements@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-13 14:32:58 -04:00
Alex Deucher
7a87040ca6 drm/amdgpu: add HDP remap functionality to nbio 7.7
Was missing before and would have resulted in a write to
a non-existant register. Normally APUs don't use HDP, but
other asics could use this code and APUs do use the HDP
when used in passthrough.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-13 14:32:58 -04:00
Jingyu Wang
e7c94bfb74 drm/amdgpu: cleanup coding style in amdgpu_amdkfd_gpuvm.c
Fix everything checkpatch.pl complained about in amdgpu_amdkfd_gpuvm.c

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Jingyu Wang <jingyuwang_vip@163.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-13 14:32:58 -04:00
Jingyu Wang
2f4ca1ba6c drm/amdgpu: cleanup coding style in amdgpu_amdkfd.c
Fix everything checkpatch.pl complained about in amdgpu_amdkfd.c

Signed-off-by: Jingyu Wang <jingyuwang_vip@163.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-13 14:32:58 -04:00
Jingyu Wang
826f03b8ac drm/amdgpu: cleanup coding style in amdgpu_sync.c file
This is a patch to the amdgpu_sync.c file that fixes some warnings found by the checkpatch.pl tool

Signed-off-by: Jingyu Wang <jingyuwang_vip@163.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-13 14:32:58 -04:00
Jingyu Wang
556bdae320 drm/amdgpu: cleanup coding style in amdgpu_acpi.c
Fix everything checkpatch.pl complained about in amdgpu_acpi.c

Signed-off-by: Jingyu Wang <jingyuwang_vip@163.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-13 14:32:58 -04:00
zhang songyi
79c0d7ddcb drm/amdgpu: Remove the unneeded result variable
Return the sdma_v6_0_start() directly instead of storing it in another
redundant variable.

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: zhang songyi <zhang.songyi@zte.com.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-13 14:32:58 -04:00
Vignesh Chander
2efc30f016 drm/amdgpu: Fix hive reference count leak
both get_xgmi_hive and put_xgmi_hive can be skipped since the
reset domain is not necessary for VF

Signed-off-by: Vignesh Chander <Vignesh.Chander@amd.com>
Reviewed-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-13 14:32:57 -04:00
Candice Li
86875d558b drm/amdgpu: Skip reset error status for psp v13_0_0
No need to reset error status since only umc ras supported on psp v13_0_0.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-13 14:26:59 -04:00
Alex Deucher
8c5708d3da drm/amdgpu: add HDP remap functionality to nbio 7.7
Was missing before and would have resulted in a write to
a non-existant register. Normally APUs don't use HDP, but
other asics could use this code and APUs do use the HDP
when used in passthrough.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-13 14:26:59 -04:00
Yang Wang
36de13fdb0 drm/amdgpu: change the alignment size of TMR BO to 1M
align TMR BO size TO tmr size is not necessary,
modify the size to 1M to avoid re-create BO fail
when serious VRAM fragmentation.

v2:
add new macro PSP_TMR_ALIGNMENT for TMR BO alignment size

Signed-off-by: Yang Wang <KevinYang.Wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-13 14:26:59 -04:00
Candice Li
df2c6e0c95 drm/amdgpu: Enable full reset when RAS is supported on gc v11_0_0
Enable full reset for RAS supported configuration on gc v11_0_0.

v2: simplify the code.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-13 14:25:39 -04:00
Hamza Mahfooz
66f99628eb drm/amdgpu: use dirty framebuffer helper
Currently, we aren't handling DRM_IOCTL_MODE_DIRTYFB. So, use
drm_atomic_helper_dirtyfb() as the dirty callback in the amdgpu_fb_funcs
struct.

Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-13 14:25:39 -04:00
Lijo Lazar
6c20490663 drm/amdgpu: Don't enable LTR if not supported
As per PCIE Base Spec r4.0 Section 6.18
'Software must not enable LTR in an Endpoint unless the Root Complex
and all intermediate Switches indicate support for LTR.'

This fixes the Unsupported Request error reported through AER during
ASPM enablement.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=216455

The error was unnoticed before and got visible because of the commit
referenced below. This doesn't fix anything in the commit below, rather
fixes the issue in amdgpu exposed by the commit. The reference is only
to associate this commit with below one so that both go together.

Fixes: 8795e182b0 ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")

Reported-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-13 14:25:39 -04:00
Yang Wang
cd3a49af58 drm/amdgpu: change the alignment size of TMR BO to 1M
align TMR BO size TO tmr size is not necessary,
modify the size to 1M to avoid re-create BO fail
when serious VRAM fragmentation.

v2:
add new macro PSP_TMR_ALIGNMENT for TMR BO alignment size

Signed-off-by: Yang Wang <KevinYang.Wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-13 12:56:16 -04:00
Candice Li
34dfca8908 drm/amdgpu: Enable full reset when RAS is supported on gc v11_0_0
Enable full reset for RAS supported configuration on gc v11_0_0.

v2: simplify the code.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-13 12:56:16 -04:00
Candice Li
d27ec594b4 drm/amdgpu: Rely on MCUMC_STATUS for umc v8_10 correctable error counter only
Only check MCUMC_STATUS for CE counter for umc v8_10.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-13 12:56:16 -04:00
shaoyunl
46c676600c drm/amdgpu: Use per device reset_domain for XGMI on sriov configuration
For SRIOV configuration, host driver control the reset method(either FLR or
heavier chain reset). The host will notify the guest individually with FLR
message if individual GPU within the hive need to be reset. So for guest
side, no need to use hive->reset_domain to replace the original per
device reset_domain

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-13 12:54:23 -04:00
Hamza Mahfooz
542ab49173 drm/amdgpu: use dirty framebuffer helper
Currently, we aren't handling DRM_IOCTL_MODE_DIRTYFB. So, use
drm_atomic_helper_dirtyfb() as the dirty callback in the amdgpu_fb_funcs
struct.

Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-13 12:54:23 -04:00
Lijo Lazar
dd6aeb4e5f drm/amdgpu: Don't enable LTR if not supported
As per PCIE Base Spec r4.0 Section 6.18
'Software must not enable LTR in an Endpoint unless the Root Complex
and all intermediate Switches indicate support for LTR.'

This fixes the Unsupported Request error reported through AER during
ASPM enablement.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=216455

The error was unnoticed before and got visible because of the commit
referenced below. This doesn't fix anything in the commit below, rather
fixes the issue in amdgpu exposed by the commit. The reference is only
to associate this commit with below one so that both go together.

Fixes: 8795e182b0 ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")

Reported-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-13 12:54:22 -04:00
Dave Airlie
47519d8224 Merge tag 'amd-drm-next-6.1-2022-09-08' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-6.1-2022-09-08:

amdgpu:
- Mode2 reset for RDNA2
- Lots of new DC documentation
- Add documentation about different asic families
- DSC improvements
- Aldebaran fixes
- Misc spelling and grammar fixes
- GFXOFF stats support for vangogh
- DC frame size fixes
- NBIO 7.7 updates
- DCN 3.2 updates
- DCN 3.1.4 Updates
- SMU 13.x updates
- Misc bug fixes
- Rework DC register offset handling
- GC 11.x updates
- PSP 13.x updates
- SDMA 6.x updates
- GMC 11.x updates
- SR-IOV updates
- PSP fixes for TA unloading
- DSC passthrough support
- Misc code cleanups

amdkfd:
- ISA fixes for some GC 10.3 IPs
- Misc code cleanups

radeon:
- Delayed work flush fix
- Use time_after for some jiffies calculations

drm:
- DSC passthrough aux support

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220908155202.57862-1-alexander.deucher@amd.com
2022-09-12 19:17:41 +10:00
Dave Airlie
fb34d8a04e Merge tag 'drm-misc-next-2022-09-09' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for v6.1-rc1:

[airlied - fix sun4i_tv build]

UAPI Changes:
- Hide unregistered connectors from GETCONNECTOR ioctl.
- drm/virtio no longer advertises LINEAR modifier, as it doesn't work.
-

Cross-subsystem Changes:
- Fix GPF in udmabuf failure path.

Core Changes:
- Rework TTM placement to use intersect/compatible functions.
- Drop legacy DP-MST support.
- More DP-MST related fixes, and move all state into atomic.
- Make DRM_MIPI_DBI select DRM_KMS_HELPER.
- Add audio_infoframe packing for DP.
- Add logging when some atomic check functions fail.
- Assorted documentation updates and fixes.

Driver Changes:
- Assorted cleanups and fixes in msm, lcdif, nouveau, virtio,
  panel/ilitek, bridge/icn6211, tve200, gma500, bridge/*, panfrost, via,
  bochs, qxl, sun4i.
- Add add AUO B133UAN02.1, IVO M133NW4J-R3, Innolux N120ACA-EA1 eDP panels.
- Improve DP-MST modeset state handling in amdgpu, nouveau, i915.
- Drop DP-MST from radeon driver, it was broken and only user of legacy
  DP-MST.
- Handle unplugging better in vc4.
- Simplify drm cmdparser tests.
- Add DP support to ti-sn65dsi86.
- Add MT8195 DP support to mediatek.
- Support RGB565, XRGB64, and ARGB64 formats in vkms.
- Convert sun4i tv support to atomic.
- Refactor vc4/vec TV Modesetting, and fix timings.
- Use atomic helpers instead of simple display helpers in ssd130x.

Maintainer changes:
- Add Douglas Anderson as reviewer for panel-edp.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/a489485b-3ebc-c734-0f80-aed963d89efe@linux.intel.com
2022-09-11 22:03:07 +10:00
Guchun Chen
aac4cec1ec drm/amdgpu: prevent toc firmware memory leak
It's missed in psp fini.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-07 22:52:43 -04:00
Yifan Zhang
d832db12af drm/amdgpu: correct doorbell range/size value for CSDMA_DOORBELL_RANGE
current function mixes CSDMA_DOORBELL_RANGE and SDMA0_DOORBELL_RANGE
range/size manipulation, while these 2 registers have difference size
field mask. Remove range/size manipulation for SDMA0_DOORBELL_RANGE.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Xiaojian Du <Xiaojian.Du@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-07 22:52:32 -04:00
Yifan Zhang
ae0448bc88 drm/amdkfd: print address in hex format rather than decimal
Addresses should be printed in hex format.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-07 22:52:19 -04:00