1
0
Fork 0
mirror of synced 2025-03-06 20:59:54 +01:00
Commit graph

10937 commits

Author SHA1 Message Date
Linus Torvalds
ab18b7b36a drm next for 5.19-rc1 (part 2/fixes)
msm:
 - Limiting WB modes to max sspp linewidth
 - Fixing the supported rotations to add 180 back for IGT
 - Fix to handle pm_runtime_get_sync() errors to avoid unclocked access
   in the bind() path for dpu driver
 - Fix the irq_free() without request issue which was a big-time
   hitter in the CI-runs.
 
 amdgpu:
 - Update fdinfo to the common drm format
 - uapi: Add VM_NOALLOC GPUVM attribute to prevent buffers for going into the MALL
   Add AMDGPU_GEM_CREATE_DISCARDABLE flag to create buffers that can be discarded on eviction
   Mesa code which uses these: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16466
 - Link training fixes
 - DPIA fixes
 - Misc code cleanups
 - Aux fixes
 - Hotplug fixes
 - More FP clean up
 - Misc GFX9/10 fixes
 - Fix a possible memory leak in SMU shutdown
 - SMU 13 updates
 - RAS fixes
 - TMZ fixes
 - GC 11 updates
 - SMU 11 metrics fixes
 - Fix coverage blend mode for overlay plane
 - Note DDR vs LPDDR memory
 - Fuzz fix for CS IOCTL
 - Add new PCI DID
 
 amdkfd:
 - Clean up hive setup
 - Misc fixes
 
 tegra:
 - add some prelim 5.20 work to avoid inter-tree mess
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmKZfhwACgkQDHTzWXnE
 hr5v1A/+KLUDtz1vB/JGHYnqJYeGicSs48Pe2OtSW2YX6rI2IwaQBnYjXtm8J8NF
 47Lw35KjzYLS9NRst9/+zhcyij6s573dibWVTi9NG5O0vYMoww6EvD7ORqPAv1JP
 WNl1ek9DmkorEtcjJWwWDpMUP34WrN1qCHRT1p6nkcIJ28pp+x4fwt3cPi3fmEmk
 hH5/pjwymvaWX9m/HBRi/esone/uLVVIDThX769spyY7c1jKG3BKO0/hGrvY/qHs
 0QRIG4eroB2b6KEzg6gO4a8NX7PHamBtImT9Nqw+o8pBzYhuez39ITM5sMkCrChj
 kvCpF+YXwUWyr+4SbkKLGHZqerpgz219kHOh1Z7VOwCSQomIljkmdoKHUJ50QMKk
 pNzF/U7ng1zgPhTJVj/o3wymeE9y1X0W3p/yLXS1jXt+Nl4AjsBWSZLd0Nn+HKck
 ZpqwhIfH5vuFjpN+u2F3pVaRCjNM2UXavS6CD1thIOXDQ4gQ+p2rndHL/yWIyVFb
 HXkQu0PBSKzF6ys4ikvXxsT7xM0EJEa680dEjJuMhhC5LgDzw3A5jLdSfhIr2nal
 Z+CWW1NKukAarI2rPOUotO8GYsY3bMF3o/02wFRAmDwPwXUWvy/mULv8vSl63ocH
 6B/YOitiGKz6tpOqRwBgjkFCTebMwKc7Mbt2JVZKiG73BMKHcFQ=
 =IC8q
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2022-06-03-1' of git://anongit.freedesktop.org/drm/drm

Pull more drm updates from Dave Airlie:
 "This is mostly regular fixes, msm and amdgpu. There is a tegra patch
  that is bit of prep work for a 5.20 feature to avoid some inter-tree
  syncs, and a couple of late addition amdgpu uAPI changes but best to
  get those in early, and the userspace pieces are ready.

  msm:
   - Limiting WB modes to max sspp linewidth
   - Fixing the supported rotations to add 180 back for IGT
   - Fix to handle pm_runtime_get_sync() errors to avoid unclocked
     access in the bind() path for dpu driver
   - Fix the irq_free() without request issue which was a big-time
     hitter in the CI-runs.

  amdgpu:
   - Update fdinfo to the common drm format
   - uapi:
       - Add VM_NOALLOC GPUVM attribute to prevent buffers for going
         into the MALL
       - Add AMDGPU_GEM_CREATE_DISCARDABLE flag to create buffers that
         can be discarded on eviction
       - Mesa code which uses these:
           https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16466
   - Link training fixes
   - DPIA fixes
   - Misc code cleanups
   - Aux fixes
   - Hotplug fixes
   - More FP clean up
   - Misc GFX9/10 fixes
   - Fix a possible memory leak in SMU shutdown
   - SMU 13 updates
   - RAS fixes
   - TMZ fixes
   - GC 11 updates
   - SMU 11 metrics fixes
   - Fix coverage blend mode for overlay plane
   - Note DDR vs LPDDR memory
   - Fuzz fix for CS IOCTL
   - Add new PCI DID

  amdkfd:
   - Clean up hive setup
   - Misc fixes

  tegra:
   - add some prelim 5.20 work to avoid inter-tree mess"

* tag 'drm-next-2022-06-03-1' of git://anongit.freedesktop.org/drm/drm: (57 commits)
  drm/msm/dpu: Move min BW request and full BW disable back to mdss
  drm/msm/dpu: Fix pointer dereferenced before checking
  drm/msm/dpu: Remove unused code
  drm/msm/disp/dpu1: remove superfluous init
  drm/msm/dp: Always clear mask bits to disable interrupts at dp_ctrl_reset_irq_ctrl()
  gpu: host1x: Add context bus
  drm/amdgpu: add drm-client-id to fdinfo v2
  drm/amdgpu: Convert to common fdinfo format v5
  drm/amdgpu: bump minor version number
  drm/amdgpu: add AMDGPU_VM_NOALLOC v2
  drm/amdgpu: add AMDGPU_GEM_CREATE_DISCARDABLE
  drm/amdgpu: add beige goby PCI ID
  drm/amd/pm: Return auto perf level, if unsupported
  drm/amdkfd: fix typo in comment
  drm/amdgpu/gfx: fix typos in comments
  drm/amdgpu/cs: make commands with 0 chunks illegal behaviour.
  drm/amdgpu: differentiate between LP and non-LP DDR memory
  drm/amdgpu: Resolve pcie_bif RAS recovery bug
  drm/amdgpu: clean up asd on the ta_firmware_header_v2_0
  drm/amdgpu/discovery: validate VCN and SDMA instances
  ...
2022-06-03 09:49:29 -07:00
Linus Torvalds
98931dd95f Yang Shi has improved the behaviour of khugepaged collapsing of readonly
file-backed transparent hugepages.
 
 Johannes Weiner has arranged for zswap memory use to be tracked and
 managed on a per-cgroup basis.
 
 Munchun Song adds a /proc knob ("hugetlb_optimize_vmemmap") for runtime
 enablement of the recent huge page vmemmap optimization feature.
 
 Baolin Wang contributes a series to fix some issues around hugetlb
 pagetable invalidation.
 
 Zhenwei Pi has fixed some interactions between hwpoisoned pages and
 virtualization.
 
 Tong Tiangen has enabled the use of the presently x86-only
 page_table_check debugging feature on arm64 and riscv.
 
 David Vernet has done some fixup work on the memcg selftests.
 
 Peter Xu has taught userfaultfd to handle write protection faults against
 shmem- and hugetlbfs-backed files.
 
 More DAMON development from SeongJae Park - adding online tuning of the
 feature and support for monitoring of fixed virtual address ranges.  Also
 easier discovery of which monitoring operations are available.
 
 Nadav Amit has done some optimization of TLB flushing during mprotect().
 
 Neil Brown continues to labor away at improving our swap-over-NFS support.
 
 David Hildenbrand has some fixes to anon page COWing versus
 get_user_pages().
 
 Peng Liu fixed some errors in the core hugetlb code.
 
 Joao Martins has reduced the amount of memory consumed by device-dax's
 compound devmaps.
 
 Some cleanups of the arch-specific pagemap code from Anshuman Khandual.
 
 Muchun Song has found and fixed some errors in the TLB flushing of
 transparent hugepages.
 
 Roman Gushchin has done more work on the memcg selftests.
 
 And, of course, many smaller fixes and cleanups.  Notably, the customary
 million cleanup serieses from Miaohe Lin.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCYo52xQAKCRDdBJ7gKXxA
 jtJFAQD238KoeI9z5SkPMaeBRYSRQmNll85mxs25KapcEgWgGQD9FAb7DJkqsIVk
 PzE+d9hEfirUGdL6cujatwJ6ejYR8Q8=
 =nFe6
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2022-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:
 "Almost all of MM here. A few things are still getting finished off,
  reviewed, etc.

   - Yang Shi has improved the behaviour of khugepaged collapsing of
     readonly file-backed transparent hugepages.

   - Johannes Weiner has arranged for zswap memory use to be tracked and
     managed on a per-cgroup basis.

   - Munchun Song adds a /proc knob ("hugetlb_optimize_vmemmap") for
     runtime enablement of the recent huge page vmemmap optimization
     feature.

   - Baolin Wang contributes a series to fix some issues around hugetlb
     pagetable invalidation.

   - Zhenwei Pi has fixed some interactions between hwpoisoned pages and
     virtualization.

   - Tong Tiangen has enabled the use of the presently x86-only
     page_table_check debugging feature on arm64 and riscv.

   - David Vernet has done some fixup work on the memcg selftests.

   - Peter Xu has taught userfaultfd to handle write protection faults
     against shmem- and hugetlbfs-backed files.

   - More DAMON development from SeongJae Park - adding online tuning of
     the feature and support for monitoring of fixed virtual address
     ranges. Also easier discovery of which monitoring operations are
     available.

   - Nadav Amit has done some optimization of TLB flushing during
     mprotect().

   - Neil Brown continues to labor away at improving our swap-over-NFS
     support.

   - David Hildenbrand has some fixes to anon page COWing versus
     get_user_pages().

   - Peng Liu fixed some errors in the core hugetlb code.

   - Joao Martins has reduced the amount of memory consumed by
     device-dax's compound devmaps.

   - Some cleanups of the arch-specific pagemap code from Anshuman
     Khandual.

   - Muchun Song has found and fixed some errors in the TLB flushing of
     transparent hugepages.

   - Roman Gushchin has done more work on the memcg selftests.

  ... and, of course, many smaller fixes and cleanups. Notably, the
  customary million cleanup serieses from Miaohe Lin"

* tag 'mm-stable-2022-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (381 commits)
  mm: kfence: use PAGE_ALIGNED helper
  selftests: vm: add the "settings" file with timeout variable
  selftests: vm: add "test_hmm.sh" to TEST_FILES
  selftests: vm: check numa_available() before operating "merge_across_nodes" in ksm_tests
  selftests: vm: add migration to the .gitignore
  selftests/vm/pkeys: fix typo in comment
  ksm: fix typo in comment
  selftests: vm: add process_mrelease tests
  Revert "mm/vmscan: never demote for memcg reclaim"
  mm/kfence: print disabling or re-enabling message
  include/trace/events/percpu.h: cleanup for "percpu: improve percpu_alloc_percpu event trace"
  include/trace/events/mmflags.h: cleanup for "tracing: incorrect gfp_t conversion"
  mm: fix a potential infinite loop in start_isolate_page_range()
  MAINTAINERS: add Muchun as co-maintainer for HugeTLB
  zram: fix Kconfig dependency warning
  mm/shmem: fix shmem folio swapoff hang
  cgroup: fix an error handling path in alloc_pagecache_max_30M()
  mm: damon: use HPAGE_PMD_SIZE
  tracing: incorrect isolate_mote_t cast in mm_vmscan_lru_isolate
  nodemask.h: fix compilation error with GCC12
  ...
2022-05-26 12:32:41 -07:00
Christian König
9bdc1992c9 drm/amdgpu: add drm-client-id to fdinfo v2
This is enough to get gputop working :)

v2: rebase and some addition cleanup

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> (v1)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:34 -04:00
Christian König
af0b541670 drm/amdgpu: Convert to common fdinfo format v5
Convert fdinfo format to one documented in drm-usage-stats.rst.

It turned out that the existing implementation was actually completely
nonsense. The calculated percentages indeed represented the usage of the
engine, but with varying time slices.

So 10% usage for application A could mean something completely different
than 10% usage for application B.

Completely nuke that and just use the now standardized nanosecond
interface.

v2: drop the documentation change for now, nuke percentage calculation
v3: only account for each hw_ip, move the time_spend to the ctx mgr.
v4: move general ctx changes into separate patch, rework the fdinfo to
    ctx_mgr interface so that all usages are calculated at once, drop
    some unecessary and dangerous refcount dance.
v5: add one more comment how we calculate the time spend

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:34 -04:00
Christian König
08cffb3eb7 drm/amdgpu: bump minor version number
Increase the minor version number to indicate that the new flags are
available.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:34 -04:00
Christian König
b6c65a2c92 drm/amdgpu: add AMDGPU_VM_NOALLOC v2
Add the AMDGPU_VM_NOALLOC flag to let userspace control MALL allocation.

v2: also add the flag to the allowed flags.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:34 -04:00
Christian König
fab2cc8335 drm/amdgpu: add AMDGPU_GEM_CREATE_DISCARDABLE
Add a AMDGPU_GEM_CREATE_DISCARDABLE flag to note that the content of a BO
doesn't needs to be preserved during eviction.

KFD was already using a similar functionality for SVM BOs so replace the
internal flag with the new UAPI.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:34 -04:00
Alex Deucher
62e9bd2003 drm/amdgpu: add beige goby PCI ID
Add a beige goby PCI ID.

Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2022-05-26 14:56:33 -04:00
Julia Lawall
6bd8d4b7d5 drm/amdkfd: fix typo in comment
Spelling mistake (triple letters) in comment.
Detected with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:33 -04:00
Julia Lawall
ab5a7fb6d2 drm/amdgpu/gfx: fix typos in comments
Spelling mistakes (triple letters) in comments.
Detected with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:33 -04:00
Dave Airlie
31ab27b14d drm/amdgpu/cs: make commands with 0 chunks illegal behaviour.
Submitting a cs with 0 chunks, causes an oops later, found trying
to execute the wrong userspace driver.

MESA_LOADER_DRIVER_OVERRIDE=v3d glxinfo

[172536.665184] BUG: kernel NULL pointer dereference, address: 00000000000001d8
[172536.665188] #PF: supervisor read access in kernel mode
[172536.665189] #PF: error_code(0x0000) - not-present page
[172536.665191] PGD 6712a0067 P4D 6712a0067 PUD 5af9ff067 PMD 0
[172536.665195] Oops: 0000 [#1] SMP NOPTI
[172536.665197] CPU: 7 PID: 2769838 Comm: glxinfo Tainted: P           O      5.10.81 #1-NixOS
[172536.665199] Hardware name: To be filled by O.E.M. To be filled by O.E.M./CROSSHAIR V FORMULA-Z, BIOS 2201 03/23/2015
[172536.665272] RIP: 0010:amdgpu_cs_ioctl+0x96/0x1ce0 [amdgpu]
[172536.665274] Code: 75 18 00 00 4c 8b b2 88 00 00 00 8b 46 08 48 89 54 24 68 49 89 f7 4c 89 5c 24 60 31 d2 4c 89 74 24 30 85 c0 0f 85 c0 01 00 00 <48> 83 ba d8 01 00 00 00 48 8b b4 24 90 00 00 00 74 16 48 8b 46 10
[172536.665276] RSP: 0018:ffffb47c0e81bbe0 EFLAGS: 00010246
[172536.665277] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[172536.665278] RDX: 0000000000000000 RSI: ffffb47c0e81be28 RDI: ffffb47c0e81bd68
[172536.665279] RBP: ffff936524080010 R08: 0000000000000000 R09: ffffb47c0e81be38
[172536.665281] R10: ffff936524080010 R11: ffff936524080000 R12: ffffb47c0e81bc40
[172536.665282] R13: ffffb47c0e81be28 R14: ffff9367bc410000 R15: ffffb47c0e81be28
[172536.665283] FS:  00007fe35e05d740(0000) GS:ffff936c1edc0000(0000) knlGS:0000000000000000
[172536.665284] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[172536.665286] CR2: 00000000000001d8 CR3: 0000000532e46000 CR4: 00000000000406e0
[172536.665287] Call Trace:
[172536.665322]  ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu]
[172536.665332]  drm_ioctl_kernel+0xaa/0xf0 [drm]
[172536.665338]  drm_ioctl+0x201/0x3b0 [drm]
[172536.665369]  ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu]
[172536.665372]  ? selinux_file_ioctl+0x135/0x230
[172536.665399]  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
[172536.665403]  __x64_sys_ioctl+0x83/0xb0
[172536.665406]  do_syscall_64+0x33/0x40
[172536.665409]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2018
Signed-off-by: Dave Airlie <airlied@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:33 -04:00
Alex Deucher
d534ca7128 drm/amdgpu: differentiate between LP and non-LP DDR memory
Some applications want to know whether the memory is LP or
not.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:33 -04:00
Candice Li
a5457087eb drm/amdgpu: Resolve pcie_bif RAS recovery bug
Check shared buf instead of init flag for xgmi ta shared buf init
during xgmi ta initialization.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:33 -04:00
Prike Liang
1c75524146 drm/amdgpu: clean up asd on the ta_firmware_header_v2_0
On the psp13 series use ta_firmware_header_v2_0 and the asd firmware
was buildin ta, so needn't request asd firmware separately.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:33 -04:00
Alex Deucher
a0ccc717c4 drm/amdgpu/discovery: validate VCN and SDMA instances
Validate the VCN and SDMA instances against the driver
structure sizes to make sure we don't get into a
situation where the firmware reports more instances than
the driver supports.

Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:33 -04:00
Evan Quan
caa5eadc14 drm/amdgpu: suppress some compile warnings
Suppress two compile warnings about "no previous prototype".

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:33 -04:00
Sunil Khatri
49b74d12d1 drm/amdgpu: add support of tmz for GC 10.3.7
Add support of IP GC 10.3.7 in amdgpu_gmc_tmz_set.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:32 -04:00
Sunil Khatri
0ef3dc7e97 drm/amdgpu: change code name to ip version for tmz set
Use IP version rather then code name of IPs for
tmz set.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:32 -04:00
Sunil Khatri
4d33e7040d drm/amdgpu: move amdgpu_gmc_tmz_set after ip_version populated
To enable TMZ feature based on IP version needs adev->ip_version
populated but its empty. Move amdgpu_gmc_tmz_set to a place where
ip_version is populated.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:32 -04:00
Stanley.Yang
950d64250f drm/amdgpu: support ras on SRIOV
support umc/gfx/sdma ras on guest side

Changed from V1:
    move sriov judgment in amdgpu_ras_interrupt_fatal_error_handler

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:32 -04:00
Haohui Mai
10784fec9c drm/amdgpu/gfx10: rework KIQ programming
Make sure the queue is not longer active before programming
the kiq EOP registers.

Signed-off-by: Haohui Mai <ricetons@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:31 -04:00
Haohui Mai
842035543c drm/amdgpu: Set CP_HQD_PQ_CONTROL.RPTR_BLOCK_SIZE correctly
Remove the accidental shifts on the values of RPTR_BLOCK_SIZE
in gfx_v8-v11. The bug essentially always programs the
corresponding fields to zero instead of the correct value.
The hardware clamps the min value to 5 so this resulted in a
value of 5 being programmed.

Signed-off-by: Haohui Mai <ricetons@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:31 -04:00
Christian König
69493c034d drm/amdgpu: cleanup ctx implementation
Let each context have a pointer to the ctx manager and properly
initialize the adev pointer inside the context manager.

Reduce the BUG_ON() in amdgpu_ctx_add_fence() into a WARN_ON() and
directly return the sequence number instead of writing into a parmeter.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:31 -04:00
Haohui Mai
2c2dd0555f drm/amdgpu: Clean up of initializing doorbells for gfx_v9 and gfx_v10
Clean up redundant, copy-paste code blocks during the initialization of
the doorbells in mqd_init().

Signed-off-by: Haohui Mai <ricetons@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26 14:56:31 -04:00
Dave Airlie
00df0514ab Merge tag 'amd-drm-next-5.19-2022-05-18' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.19-2022-05-18:

amdgpu:
- Misc code cleanups
- Additional SMU 13.x enablement
- Smartshift fixes
- GFX11 fixes
- Support for SMU 13.0.4
- SMU mutex fix
- Suspend/resume fix

amdkfd:
- static checker fix
- Doorbell/MMIO resource handling fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220518205621.5741-1-alexander.deucher@amd.com
2022-05-19 14:09:54 +10:00
Mario Limonciello
0223e51647 drm/amd: Don't reset dGPUs if the system is going to s2idle
An A+A configuration on ASUS ROG Strix G513QY proves that the ASIC
reset for handling aborted suspend can't work with s2idle.

This functionality was introduced in commit daf8de0874 ("drm/amdgpu:
always reset the asic in suspend (v2)").  A few other commits have
gone on top of the ASIC reset, but this still doesn't work on the A+A
configuration in s2idle.

Avoid doing the reset on dGPUs specifically when using s2idle.

Fixes: daf8de0874 ("drm/amdgpu: always reset the asic in suspend (v2)")
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2008
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-18 15:20:18 -04:00
Luben Tuikov
5ad25ace7c drm/amdgpu: Unmap legacy queue when MES is enabled
This fixes a kernel oops when MES is not enabled.

Reported-by: Kenny Ho <Kenny.Ho@amd.com>
Suggested-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Fixes: 18ee4ce63e ("drm/amdgpu: add mes unmap legacy queue routine")
Fixes: 3d879e81f0 ("drm/amdgpu: add init support for GFX11 (v2)")
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-18 15:20:18 -04:00
Xiaojian Du
0d6ec07a95 drm/amdgpu/discovery: add SMU v13.0.4 into the IP discovery list
This patch will add SMU v13.0.4 into the IP discovery list.

Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-16 10:02:58 -04:00
Jack Xiao
7bd3114b1c drm/amdgpu/gfx11: fix mes mqd settings
Use the correct Memory Queue Descriptor (MQD)
structure for GC 11.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-16 10:02:57 -04:00
Jack Xiao
2fc092d4c7 drm/amdgpu/gfx11: fix me field handling in map_queue packet
Select the correct microengine (me) when using the
map_queue packet.  There are different me's for GFX,
compute, and scheduling.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-16 10:02:57 -04:00
Lang Yu
7226f40af6 drm/amdkfd: allocate MMIO/DOORBELL BOs with AMDGPU_GEM_CREATE_PREEMPTIBLE
MMIO/DOORBELL BOs' backing resources(bus address resources that are
used to talk to the GPU) are not managed by GTT manager, but they
are counted by GTT manager. That makes no sense.

With AMDGPU_GEM_CREATE_PREEMPTIBLE flag, such BOs will be managed by
PREEMPT manager(for preemptible contexts, e.g., KFD). Then they won't
be evicted and don't need to be pinned as well.

But we still leave these BOs pinned to indicate that the underlying
resource never moves.

Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-16 10:02:57 -04:00
Haohui Mai
b992a19085 drm/amdgpu: Ensure the DMA engine is deactivated during set ups
Setting the HALT bit of SDMA_F32_CNTL in all paths before programming
the ring buffer of the SDMA engine.

Signed-off-by: Haohui Mai <ricetons@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-16 10:02:57 -04:00
Alex Deucher
505c170b62 drm/amdgpu/ctx: only reset stable pstate if the user changed it (v2)
Check if the requested stable pstate matches the current one before
changing it.  This avoids changing the stable pstate on context
destroy if the user never changed it in the first place via the
IOCTL.

v2: compare the current and requested rather than setting a flag (Lijo)

Fixes: 8cda7a4f96 ("drm/amdgpu/UAPI: add new CTX OP to get/set stable pstates")
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-16 10:02:57 -04:00
Jiapeng Chong
0a360aeb86 drm/amdgpu: clean up some inconsistent indenting
Eliminate the follow smatch warning:

drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c:35 nbio_v7_7_get_rev_id() warn:
inconsistent indenting.

drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c:214 nbio_v7_7_init_registers()
warn: inconsistent indenting.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-13 13:14:37 -04:00
Florian Rommel
5b4494896c mmap locking API: fix missed mmap_sem references in comments
Commit c1e8d7c6a7 ("mmap locking API: convert mmap_sem comments") missed
replacing some references of mmap_sem by mmap_lock due to misspelling
(mm_sem instead of mmap_sem).

Link: https://lkml.kernel.org/r/20220503113333.214124-1-mail@florommel.de
Signed-off-by: Florian Rommel <mail@florommel.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-13 07:20:07 -07:00
Wan Jiabing
81c5495910 drm/amdgpu: Remove duplicated argument in vcn_v4_0
Fix following coccicheck warning:
./drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c:724:4-36: duplicated argument to & or |

Remove duplicated UVD_SUVD_CGC_GATE__SRE_H264_MASK.

Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-10 17:53:13 -04:00
Philip Yang
5be323562c drm/amdgpu: vm flush needed after updating PDEs
If page table PDEs is evicted and restored, after updating PDEs, need
increase vm->tlb_seq, then amdgpu_vm_flush will flush TLB before command
submission.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-10 17:53:13 -04:00
James Zhu
7865f22a5a drm/amdgpu/vcn: include header for vcn_dec_sw_ring_emit_fence
Fixed warning: no previous prototype for 'vcn_dec_sw_ring_emit_fence'.

v2: regenerate patch after git rebase.
v3: update commit message.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-10 17:53:13 -04:00
Mohammad Zafar Ziya
0ae99221f3 drm/amdgpu/vcn: Add vcn ras poison consumption event handling
Add vcn ras poison consumption event handling

V2: Removed default poison consumption handling function cb

Signed-off-by: Mohammad Zafar Ziya <Mohammadzafar.ziya@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-10 17:53:13 -04:00
Mohammad Zafar Ziya
7e0357fcf8 drm/amdgpu/jpeg: add jpeg ras poison consumption handling
Add jpeg ras poison event callback and consumption handling

V2: Removed the default poison consumption cb handle

Signed-off-by: Mohammad Zafar Ziya <Mohammadzafar.ziya@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-10 17:53:13 -04:00
Tao Zhou
b63ac5d303 drm/amdgpu: refine RAS poison consumption handler
Qeury ras status before ras poison consumption handling, add more
comment and log.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-and-tested-by: Mohammad Zafar Ziya <Mohammadzafar.ziya@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-10 17:53:12 -04:00
Tao Zhou
3678060682 drm/amdgpu: enable RAS IH for poison consumption
Enable RAS IH if poison consumption handler is implemented.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Mohammad Zafar Ziya <Mohammadzafar.ziya@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-10 17:53:12 -04:00
Likun Gao
362c3c7014 drm/amdgpu: support memory power gating for lsdma 6.0.2
Support memory power gating control for lsdma 6.0.2.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-10 17:53:12 -04:00
Likun Gao
41967850e4 drm/amdgpu: support memory power gating for lsdma
Support memory power gating control for LSDMA.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-10 17:53:12 -04:00
Likun Gao
74c9b2e704 drm/amdgpu: add LSDMA block for LSDMA v6.0.2
Add LSDMA ip block for LSDMA v6.0.2.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-10 17:53:12 -04:00
Likun Gao
04de4afc13 drm/amdgpu: add LSDMA block for LSDMA v6.0.0
Add LSDMA ip block for LSDMA v6.0.0.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-10 17:53:12 -04:00
Likun Gao
d9b9aaae3a drm/amdgpu: support fill mem for LSDMA
Support constant data filling in PIO mode for LSDMA.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-10 17:53:11 -04:00
Likun Gao
f932ffbbf6 drm/amdgpu: support mem copy for LSDMA
Support memory to memory linear copy in PIO mode for LSDMA.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-10 17:53:11 -04:00
Likun Gao
1b49133042 drm/amdgpu: add lsdma block
Add Light SDMA (LSDMA) block and related function. LSDMA
is a small instance of SDMA mainly for kernel driver use.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-10 17:53:11 -04:00
Dan Carpenter
0d6355844b drm/amdgpu/gfx11: unlock on error in gfx_v11_0_kiq_resume()
Add a missing amdgpu_bo_unreserve(ring->mqd_obj) to an error path in
gfx_v11_0_kiq_resume().

Fixes: 3d879e81f0 ("drm/amdgpu: add init support for GFX11 (v2)")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-10 17:53:11 -04:00