linux

mirror of synced 2025-03-06 20:59:54 +01:00

Author	SHA1	Message	Date
Dave Airlie	0a17875064	Merge tag 'amd-drm-fixes-5.19-2022-06-08' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-5.19-2022-06-08: amdgpu: - DCN 3.1 golden settings fix - eDP fixes - DMCUB fixes - GFX11 fixes and cleanups - VCN fix for yellow carp - GMC11 fixes - RAS fixes - GPUVM TLB flush fixes - SMU13 fixes - VCN3 AV1 regression fix - VCN2 JPEG fix - Other misc fixes amdkfd: - MMU notifier fix - Support for more GC 10.3.x families - Pinned BO handling fix - Partial migration bug fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220608203008.6187-1-alexander.deucher@amd.com	2022-06-09 17:22:49 +10:00
Yifan Zhang	431d071286	drm/amdgpu/mes: only invalid/prime icache when finish loading both pipe MES FWs. invalid/prime icahce operation takes effect both pipes cuconrrently, therefore CP_MES_IC_BASE_LO/HI and CP_MES_MDBASE_LO/HI both have to be set before prime icache. Otherwise MES hardware gets garbage data in above regsters and causes page fault [ 470.873200] amdgpu 0000:33:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:217 vmid:0 pasid:0, for process pid 0 thread pid 0) [ 470.873222] amdgpu 0000:33:00.0: amdgpu: in page starting at address 0x000092cb89b00000 from client 10 [ 470.873234] amdgpu 0000:33:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000BB3 [ 470.873242] amdgpu 0000:33:00.0: amdgpu: Faulty UTCL2 client ID: CPC (0x5) [ 470.873247] amdgpu 0000:33:00.0: amdgpu: MORE_FAULTS: 0x1 [ 470.873251] amdgpu 0000:33:00.0: amdgpu: WALKER_ERROR: 0x1 [ 470.873256] amdgpu 0000:33:00.0: amdgpu: PERMISSION_FAULTS: 0xb [ 470.873260] amdgpu 0000:33:00.0: amdgpu: MAPPING_ERROR: 0x1 [ 470.873264] amdgpu 0000:33:00.0: amdgpu: RW: 0x0 Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-06-08 15:39:16 -04:00
Mohammad Zafar Ziya	578eb31776	drm/amdgpu/jpeg2: Add jpeg vmid update under IB submit Add jpeg vmid update under IB submit Signed-off-by: Mohammad Zafar Ziya <Mohammadzafar.ziya@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-06-08 11:24:50 -04:00
Christian König	84205d0093	drm/amdgpu: always flush the TLB on gfx8 The TLB on GFX8 stores each block of 8 PTEs where any of the valid bits are set. Fixes: `5255e146c9` ("drm/amdgpu: rework TLB flushing") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-06-08 11:24:13 -04:00
Christian König	1d2afeb798	drm/amdgpu: fix limiting AV1 to the first instance on VCN3 The job is not yet initialized here. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2037 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: `cdc7893fc9` ("drm/amdgpu: use job and ib structures directly in CS parsers") Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-06-08 11:24:13 -04:00
Joseph Greathouse	b3f9234e10	drm/amdgpu: Add MODE register to wave debug info in gfx11 All other chips, from gfx6-gfx10, now include the MODE register at the end of the wave debug state. This appears to have been missed in gfx11, so this patch adds in MODE to the debug state for gfx11. Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-06-07 16:18:48 -04:00
Guchun Chen	41782d7056	Revert "drm/amdgpu: Ensure the DMA engine is deactivated during set ups" This reverts commit `b992a19085`. This causes regression in GPU reset related test. Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: ricetons@gmail.com Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-06-07 16:18:07 -04:00
Evan Quan	12d6c18cfa	drm/amdgpu: suppress the compile warning about 64 bit type Suppress the compile warning below: drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c:1292 gfx_v11_0_rlc_backdoor_autoload_copy_ucode() warn: should '1 << id' be a 64 bit type? Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-06-03 16:43:36 -04:00
Lang Yu	4fac4fcf45	drm/amdkfd: add pinned BOs to kfd_bo_list The kfd_bo_list is used to restore process BOs after evictions. As page tables could be destroyed during evictions, we should also update pinned BOs' page tables during restoring to make sure they are valid. So for pinned BOs, 1, Validate them and update their page tables. 2, Don't add eviction fence for them. v2: - Don't handle pinned ones specially in BO validation.(Felix) Signed-off-by: Lang Yu <Lang.Yu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-06-03 16:27:17 -04:00
Philip Yang	4d1e5f12b7	drm/amdgpu: Update PDEs flush TLB if PTB/PDB moved Flush TLBs when existing PDEs are updated because a PTB or PDB moved, but avoids unnecessary TLB flushes when new PDBs or PTBs are added to the page table, which commonly happens when memory is mapped for the first time. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-06-03 16:27:17 -04:00
Sunil Khatri	371017309a	drm/amdgpu: enable tmz by default for GC 10.3.7 Add IP GC 10.3.7 in the list of target to have tmz enabled by default. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 5.18.x	2022-06-03 16:27:00 -04:00
Linus Torvalds	ab18b7b36a	drm next for 5.19-rc1 (part 2/fixes) msm: - Limiting WB modes to max sspp linewidth - Fixing the supported rotations to add 180 back for IGT - Fix to handle pm_runtime_get_sync() errors to avoid unclocked access in the bind() path for dpu driver - Fix the irq_free() without request issue which was a big-time hitter in the CI-runs. amdgpu: - Update fdinfo to the common drm format - uapi: Add VM_NOALLOC GPUVM attribute to prevent buffers for going into the MALL Add AMDGPU_GEM_CREATE_DISCARDABLE flag to create buffers that can be discarded on eviction Mesa code which uses these: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16466 - Link training fixes - DPIA fixes - Misc code cleanups - Aux fixes - Hotplug fixes - More FP clean up - Misc GFX9/10 fixes - Fix a possible memory leak in SMU shutdown - SMU 13 updates - RAS fixes - TMZ fixes - GC 11 updates - SMU 11 metrics fixes - Fix coverage blend mode for overlay plane - Note DDR vs LPDDR memory - Fuzz fix for CS IOCTL - Add new PCI DID amdkfd: - Clean up hive setup - Misc fixes tegra: - add some prelim 5.20 work to avoid inter-tree mess -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmKZfhwACgkQDHTzWXnE hr5v1A/+KLUDtz1vB/JGHYnqJYeGicSs48Pe2OtSW2YX6rI2IwaQBnYjXtm8J8NF 47Lw35KjzYLS9NRst9/+zhcyij6s573dibWVTi9NG5O0vYMoww6EvD7ORqPAv1JP WNl1ek9DmkorEtcjJWwWDpMUP34WrN1qCHRT1p6nkcIJ28pp+x4fwt3cPi3fmEmk hH5/pjwymvaWX9m/HBRi/esone/uLVVIDThX769spyY7c1jKG3BKO0/hGrvY/qHs 0QRIG4eroB2b6KEzg6gO4a8NX7PHamBtImT9Nqw+o8pBzYhuez39ITM5sMkCrChj kvCpF+YXwUWyr+4SbkKLGHZqerpgz219kHOh1Z7VOwCSQomIljkmdoKHUJ50QMKk pNzF/U7ng1zgPhTJVj/o3wymeE9y1X0W3p/yLXS1jXt+Nl4AjsBWSZLd0Nn+HKck ZpqwhIfH5vuFjpN+u2F3pVaRCjNM2UXavS6CD1thIOXDQ4gQ+p2rndHL/yWIyVFb HXkQu0PBSKzF6ys4ikvXxsT7xM0EJEa680dEjJuMhhC5LgDzw3A5jLdSfhIr2nal Z+CWW1NKukAarI2rPOUotO8GYsY3bMF3o/02wFRAmDwPwXUWvy/mULv8vSl63ocH 6B/YOitiGKz6tpOqRwBgjkFCTebMwKc7Mbt2JVZKiG73BMKHcFQ= =IC8q -----END PGP SIGNATURE----- Merge tag 'drm-next-2022-06-03-1' of git://anongit.freedesktop.org/drm/drm Pull more drm updates from Dave Airlie: "This is mostly regular fixes, msm and amdgpu. There is a tegra patch that is bit of prep work for a 5.20 feature to avoid some inter-tree syncs, and a couple of late addition amdgpu uAPI changes but best to get those in early, and the userspace pieces are ready. msm: - Limiting WB modes to max sspp linewidth - Fixing the supported rotations to add 180 back for IGT - Fix to handle pm_runtime_get_sync() errors to avoid unclocked access in the bind() path for dpu driver - Fix the irq_free() without request issue which was a big-time hitter in the CI-runs. amdgpu: - Update fdinfo to the common drm format - uapi: - Add VM_NOALLOC GPUVM attribute to prevent buffers for going into the MALL - Add AMDGPU_GEM_CREATE_DISCARDABLE flag to create buffers that can be discarded on eviction - Mesa code which uses these: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16466 - Link training fixes - DPIA fixes - Misc code cleanups - Aux fixes - Hotplug fixes - More FP clean up - Misc GFX9/10 fixes - Fix a possible memory leak in SMU shutdown - SMU 13 updates - RAS fixes - TMZ fixes - GC 11 updates - SMU 11 metrics fixes - Fix coverage blend mode for overlay plane - Note DDR vs LPDDR memory - Fuzz fix for CS IOCTL - Add new PCI DID amdkfd: - Clean up hive setup - Misc fixes tegra: - add some prelim 5.20 work to avoid inter-tree mess" * tag 'drm-next-2022-06-03-1' of git://anongit.freedesktop.org/drm/drm: (57 commits) drm/msm/dpu: Move min BW request and full BW disable back to mdss drm/msm/dpu: Fix pointer dereferenced before checking drm/msm/dpu: Remove unused code drm/msm/disp/dpu1: remove superfluous init drm/msm/dp: Always clear mask bits to disable interrupts at dp_ctrl_reset_irq_ctrl() gpu: host1x: Add context bus drm/amdgpu: add drm-client-id to fdinfo v2 drm/amdgpu: Convert to common fdinfo format v5 drm/amdgpu: bump minor version number drm/amdgpu: add AMDGPU_VM_NOALLOC v2 drm/amdgpu: add AMDGPU_GEM_CREATE_DISCARDABLE drm/amdgpu: add beige goby PCI ID drm/amd/pm: Return auto perf level, if unsupported drm/amdkfd: fix typo in comment drm/amdgpu/gfx: fix typos in comments drm/amdgpu/cs: make commands with 0 chunks illegal behaviour. drm/amdgpu: differentiate between LP and non-LP DDR memory drm/amdgpu: Resolve pcie_bif RAS recovery bug drm/amdgpu: clean up asd on the ta_firmware_header_v2_0 drm/amdgpu/discovery: validate VCN and SDMA instances ...	2022-06-03 09:49:29 -07:00
Candice Li	2a46096335	drm/amdgpu: Resolve RAS GFX error count issue after cold boot on Arcturus Adjust the sequence for ras late init and separate ras reset error status from query status. v2: squash in fix from Candice Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-06-01 15:58:45 -04:00
Stanley.Yang	28caf8c467	drm/amdgpu: fix ras supported check Fix aldebaran ras supported check on SRIOV guest side, the previous check conditicon block all ras feature on baremetal Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-06-01 15:58:09 -04:00
sunliming	8365ed22d0	drm/amdgpu: make gfx_v11_0_rlc_stop static This symbol is not used outside of gfx_v11_0.c, so marks it static. Fixes the following w1 warning: drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c:1945:6: warning: no previous prototype for function 'gfx_v11_0_rlc_stop' [-Wmissing-prototypes]. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: sunliming <sunliming@kylinos.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-06-01 15:57:12 -04:00
sunliming	418214ddcf	drm/amdgpu: fix a missing break in gfx_v11_0_handle_priv_fault Fixes the following w1 warning: drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c:5873:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: sunliming <sunliming@kylinos.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-06-01 15:57:08 -04:00
Roman Li	ae969b62e7	drm/amdgpu: fix aper_base for APU [Why] Wrong fb offset results in dmub f/w errors and white screen. [drm:dc_dmub_srv_wait_idle [amdgpu]] ERROR Error waiting for DMUB idle: status=3 [How] Read aper_base from mmhub because GC is off by default v2: use BAR for passthrough (Alex) Signed-off-by: Roman Li <Roman.Li@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-06-01 15:56:59 -04:00
Alex Deucher	97e5030554	drm/amdgpu: update VCN codec support for Yellow Carp Supports AV1. Mesa already has support for this and doesn't rely on the kernel caps for yellow carp, so this was already working from an application perspective. Fixes: `554398174d` ("amdgpu/nv.c - Added video codec support for Yellow Carp") Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2002 Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-06-01 15:56:49 -04:00
Jiapeng Chong	11594fa114	drm/amdgpu: make program_imu_rlc_ram static This symbol is not used outside of imu_v11_0.c, so marks it static. Fixes the following w1 warning: drivers/gpu/drm/amd/amdgpu/imu_v11_0.c:302:6: warning: no previous prototype for ‘program_imu_rlc_ram’ [-Wmissing-prototypes]. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-06-01 15:56:49 -04:00
Linus Torvalds	98931dd95f	Yang Shi has improved the behaviour of khugepaged collapsing of readonly file-backed transparent hugepages. Johannes Weiner has arranged for zswap memory use to be tracked and managed on a per-cgroup basis. Munchun Song adds a /proc knob ("hugetlb_optimize_vmemmap") for runtime enablement of the recent huge page vmemmap optimization feature. Baolin Wang contributes a series to fix some issues around hugetlb pagetable invalidation. Zhenwei Pi has fixed some interactions between hwpoisoned pages and virtualization. Tong Tiangen has enabled the use of the presently x86-only page_table_check debugging feature on arm64 and riscv. David Vernet has done some fixup work on the memcg selftests. Peter Xu has taught userfaultfd to handle write protection faults against shmem- and hugetlbfs-backed files. More DAMON development from SeongJae Park - adding online tuning of the feature and support for monitoring of fixed virtual address ranges. Also easier discovery of which monitoring operations are available. Nadav Amit has done some optimization of TLB flushing during mprotect(). Neil Brown continues to labor away at improving our swap-over-NFS support. David Hildenbrand has some fixes to anon page COWing versus get_user_pages(). Peng Liu fixed some errors in the core hugetlb code. Joao Martins has reduced the amount of memory consumed by device-dax's compound devmaps. Some cleanups of the arch-specific pagemap code from Anshuman Khandual. Muchun Song has found and fixed some errors in the TLB flushing of transparent hugepages. Roman Gushchin has done more work on the memcg selftests. And, of course, many smaller fixes and cleanups. Notably, the customary million cleanup serieses from Miaohe Lin. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCYo52xQAKCRDdBJ7gKXxA jtJFAQD238KoeI9z5SkPMaeBRYSRQmNll85mxs25KapcEgWgGQD9FAb7DJkqsIVk PzE+d9hEfirUGdL6cujatwJ6ejYR8Q8= =nFe6 -----END PGP SIGNATURE----- Merge tag 'mm-stable-2022-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "Almost all of MM here. A few things are still getting finished off, reviewed, etc. - Yang Shi has improved the behaviour of khugepaged collapsing of readonly file-backed transparent hugepages. - Johannes Weiner has arranged for zswap memory use to be tracked and managed on a per-cgroup basis. - Munchun Song adds a /proc knob ("hugetlb_optimize_vmemmap") for runtime enablement of the recent huge page vmemmap optimization feature. - Baolin Wang contributes a series to fix some issues around hugetlb pagetable invalidation. - Zhenwei Pi has fixed some interactions between hwpoisoned pages and virtualization. - Tong Tiangen has enabled the use of the presently x86-only page_table_check debugging feature on arm64 and riscv. - David Vernet has done some fixup work on the memcg selftests. - Peter Xu has taught userfaultfd to handle write protection faults against shmem- and hugetlbfs-backed files. - More DAMON development from SeongJae Park - adding online tuning of the feature and support for monitoring of fixed virtual address ranges. Also easier discovery of which monitoring operations are available. - Nadav Amit has done some optimization of TLB flushing during mprotect(). - Neil Brown continues to labor away at improving our swap-over-NFS support. - David Hildenbrand has some fixes to anon page COWing versus get_user_pages(). - Peng Liu fixed some errors in the core hugetlb code. - Joao Martins has reduced the amount of memory consumed by device-dax's compound devmaps. - Some cleanups of the arch-specific pagemap code from Anshuman Khandual. - Muchun Song has found and fixed some errors in the TLB flushing of transparent hugepages. - Roman Gushchin has done more work on the memcg selftests. ... and, of course, many smaller fixes and cleanups. Notably, the customary million cleanup serieses from Miaohe Lin" * tag 'mm-stable-2022-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (381 commits) mm: kfence: use PAGE_ALIGNED helper selftests: vm: add the "settings" file with timeout variable selftests: vm: add "test_hmm.sh" to TEST_FILES selftests: vm: check numa_available() before operating "merge_across_nodes" in ksm_tests selftests: vm: add migration to the .gitignore selftests/vm/pkeys: fix typo in comment ksm: fix typo in comment selftests: vm: add process_mrelease tests Revert "mm/vmscan: never demote for memcg reclaim" mm/kfence: print disabling or re-enabling message include/trace/events/percpu.h: cleanup for "percpu: improve percpu_alloc_percpu event trace" include/trace/events/mmflags.h: cleanup for "tracing: incorrect gfp_t conversion" mm: fix a potential infinite loop in start_isolate_page_range() MAINTAINERS: add Muchun as co-maintainer for HugeTLB zram: fix Kconfig dependency warning mm/shmem: fix shmem folio swapoff hang cgroup: fix an error handling path in alloc_pagecache_max_30M() mm: damon: use HPAGE_PMD_SIZE tracing: incorrect isolate_mote_t cast in mm_vmscan_lru_isolate nodemask.h: fix compilation error with GCC12 ...	2022-05-26 12:32:41 -07:00
Christian König	9bdc1992c9	drm/amdgpu: add drm-client-id to fdinfo v2 This is enough to get gputop working :) v2: rebase and some addition cleanup Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> (v1) Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-26 14:56:34 -04:00
Christian König	af0b541670	drm/amdgpu: Convert to common fdinfo format v5 Convert fdinfo format to one documented in drm-usage-stats.rst. It turned out that the existing implementation was actually completely nonsense. The calculated percentages indeed represented the usage of the engine, but with varying time slices. So 10% usage for application A could mean something completely different than 10% usage for application B. Completely nuke that and just use the now standardized nanosecond interface. v2: drop the documentation change for now, nuke percentage calculation v3: only account for each hw_ip, move the time_spend to the ctx mgr. v4: move general ctx changes into separate patch, rework the fdinfo to ctx_mgr interface so that all usages are calculated at once, drop some unecessary and dangerous refcount dance. v5: add one more comment how we calculate the time spend Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Cc: Daniel Vetter <daniel@ffwll.ch> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-26 14:56:34 -04:00
Christian König	08cffb3eb7	drm/amdgpu: bump minor version number Increase the minor version number to indicate that the new flags are available. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-26 14:56:34 -04:00
Christian König	b6c65a2c92	drm/amdgpu: add AMDGPU_VM_NOALLOC v2 Add the AMDGPU_VM_NOALLOC flag to let userspace control MALL allocation. v2: also add the flag to the allowed flags. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-26 14:56:34 -04:00
Christian König	fab2cc8335	drm/amdgpu: add AMDGPU_GEM_CREATE_DISCARDABLE Add a AMDGPU_GEM_CREATE_DISCARDABLE flag to note that the content of a BO doesn't needs to be preserved during eviction. KFD was already using a similar functionality for SVM BOs so replace the internal flag with the new UAPI. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-26 14:56:34 -04:00
Alex Deucher	62e9bd2003	drm/amdgpu: add beige goby PCI ID Add a beige goby PCI ID. Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-05-26 14:56:33 -04:00
Julia Lawall	6bd8d4b7d5	drm/amdkfd: fix typo in comment Spelling mistake (triple letters) in comment. Detected with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-26 14:56:33 -04:00
Julia Lawall	ab5a7fb6d2	drm/amdgpu/gfx: fix typos in comments Spelling mistakes (triple letters) in comments. Detected with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-26 14:56:33 -04:00
Dave Airlie	31ab27b14d	drm/amdgpu/cs: make commands with 0 chunks illegal behaviour. Submitting a cs with 0 chunks, causes an oops later, found trying to execute the wrong userspace driver. MESA_LOADER_DRIVER_OVERRIDE=v3d glxinfo [172536.665184] BUG: kernel NULL pointer dereference, address: 00000000000001d8 [172536.665188] #PF: supervisor read access in kernel mode [172536.665189] #PF: error_code(0x0000) - not-present page [172536.665191] PGD 6712a0067 P4D 6712a0067 PUD 5af9ff067 PMD 0 [172536.665195] Oops: 0000 [#1] SMP NOPTI [172536.665197] CPU: 7 PID: 2769838 Comm: glxinfo Tainted: P O 5.10.81 #1-NixOS [172536.665199] Hardware name: To be filled by O.E.M. To be filled by O.E.M./CROSSHAIR V FORMULA-Z, BIOS 2201 03/23/2015 [172536.665272] RIP: 0010:amdgpu_cs_ioctl+0x96/0x1ce0 [amdgpu] [172536.665274] Code: 75 18 00 00 4c 8b b2 88 00 00 00 8b 46 08 48 89 54 24 68 49 89 f7 4c 89 5c 24 60 31 d2 4c 89 74 24 30 85 c0 0f 85 c0 01 00 00 <48> 83 ba d8 01 00 00 00 48 8b b4 24 90 00 00 00 74 16 48 8b 46 10 [172536.665276] RSP: 0018:ffffb47c0e81bbe0 EFLAGS: 00010246 [172536.665277] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [172536.665278] RDX: 0000000000000000 RSI: ffffb47c0e81be28 RDI: ffffb47c0e81bd68 [172536.665279] RBP: ffff936524080010 R08: 0000000000000000 R09: ffffb47c0e81be38 [172536.665281] R10: ffff936524080010 R11: ffff936524080000 R12: ffffb47c0e81bc40 [172536.665282] R13: ffffb47c0e81be28 R14: ffff9367bc410000 R15: ffffb47c0e81be28 [172536.665283] FS: 00007fe35e05d740(0000) GS:ffff936c1edc0000(0000) knlGS:0000000000000000 [172536.665284] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [172536.665286] CR2: 00000000000001d8 CR3: 0000000532e46000 CR4: 00000000000406e0 [172536.665287] Call Trace: [172536.665322] ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu] [172536.665332] drm_ioctl_kernel+0xaa/0xf0 [drm] [172536.665338] drm_ioctl+0x201/0x3b0 [drm] [172536.665369] ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu] [172536.665372] ? selinux_file_ioctl+0x135/0x230 [172536.665399] amdgpu_drm_ioctl+0x49/0x80 [amdgpu] [172536.665403] __x64_sys_ioctl+0x83/0xb0 [172536.665406] do_syscall_64+0x33/0x40 [172536.665409] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2018 Signed-off-by: Dave Airlie <airlied@redhat.com> Cc: stable@vger.kernel.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-26 14:56:33 -04:00
Alex Deucher	d534ca7128	drm/amdgpu: differentiate between LP and non-LP DDR memory Some applications want to know whether the memory is LP or not. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-26 14:56:33 -04:00
Candice Li	a5457087eb	drm/amdgpu: Resolve pcie_bif RAS recovery bug Check shared buf instead of init flag for xgmi ta shared buf init during xgmi ta initialization. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-26 14:56:33 -04:00
Prike Liang	1c75524146	drm/amdgpu: clean up asd on the ta_firmware_header_v2_0 On the psp13 series use ta_firmware_header_v2_0 and the asd firmware was buildin ta, so needn't request asd firmware separately. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-26 14:56:33 -04:00
Alex Deucher	a0ccc717c4	drm/amdgpu/discovery: validate VCN and SDMA instances Validate the VCN and SDMA instances against the driver structure sizes to make sure we don't get into a situation where the firmware reports more instances than the driver supports. Reviewed-by: Guchun Chen <guchun.chen@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-26 14:56:33 -04:00
Evan Quan	caa5eadc14	drm/amdgpu: suppress some compile warnings Suppress two compile warnings about "no previous prototype". Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-26 14:56:33 -04:00
Sunil Khatri	49b74d12d1	drm/amdgpu: add support of tmz for GC 10.3.7 Add support of IP GC 10.3.7 in amdgpu_gmc_tmz_set. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-26 14:56:32 -04:00
Sunil Khatri	0ef3dc7e97	drm/amdgpu: change code name to ip version for tmz set Use IP version rather then code name of IPs for tmz set. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-26 14:56:32 -04:00
Sunil Khatri	4d33e7040d	drm/amdgpu: move amdgpu_gmc_tmz_set after ip_version populated To enable TMZ feature based on IP version needs adev->ip_version populated but its empty. Move amdgpu_gmc_tmz_set to a place where ip_version is populated. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-26 14:56:32 -04:00
Stanley.Yang	950d64250f	drm/amdgpu: support ras on SRIOV support umc/gfx/sdma ras on guest side Changed from V1: move sriov judgment in amdgpu_ras_interrupt_fatal_error_handler Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-26 14:56:32 -04:00
Haohui Mai	10784fec9c	drm/amdgpu/gfx10: rework KIQ programming Make sure the queue is not longer active before programming the kiq EOP registers. Signed-off-by: Haohui Mai <ricetons@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-26 14:56:31 -04:00
Haohui Mai	842035543c	drm/amdgpu: Set CP_HQD_PQ_CONTROL.RPTR_BLOCK_SIZE correctly Remove the accidental shifts on the values of RPTR_BLOCK_SIZE in gfx_v8-v11. The bug essentially always programs the corresponding fields to zero instead of the correct value. The hardware clamps the min value to 5 so this resulted in a value of 5 being programmed. Signed-off-by: Haohui Mai <ricetons@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-26 14:56:31 -04:00
Christian König	69493c034d	drm/amdgpu: cleanup ctx implementation Let each context have a pointer to the ctx manager and properly initialize the adev pointer inside the context manager. Reduce the BUG_ON() in amdgpu_ctx_add_fence() into a WARN_ON() and directly return the sequence number instead of writing into a parmeter. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-26 14:56:31 -04:00
Haohui Mai	2c2dd0555f	drm/amdgpu: Clean up of initializing doorbells for gfx_v9 and gfx_v10 Clean up redundant, copy-paste code blocks during the initialization of the doorbells in mqd_init(). Signed-off-by: Haohui Mai <ricetons@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-26 14:56:31 -04:00
Dave Airlie	00df0514ab	Merge tag 'amd-drm-next-5.19-2022-05-18' of https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-5.19-2022-05-18: amdgpu: - Misc code cleanups - Additional SMU 13.x enablement - Smartshift fixes - GFX11 fixes - Support for SMU 13.0.4 - SMU mutex fix - Suspend/resume fix amdkfd: - static checker fix - Doorbell/MMIO resource handling fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220518205621.5741-1-alexander.deucher@amd.com	2022-05-19 14:09:54 +10:00
Mario Limonciello	0223e51647	drm/amd: Don't reset dGPUs if the system is going to s2idle An A+A configuration on ASUS ROG Strix G513QY proves that the ASIC reset for handling aborted suspend can't work with s2idle. This functionality was introduced in commit `daf8de0874` ("drm/amdgpu: always reset the asic in suspend (v2)"). A few other commits have gone on top of the ASIC reset, but this still doesn't work on the A+A configuration in s2idle. Avoid doing the reset on dGPUs specifically when using s2idle. Fixes: `daf8de0874` ("drm/amdgpu: always reset the asic in suspend (v2)") Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2008 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-18 15:20:18 -04:00
Luben Tuikov	5ad25ace7c	drm/amdgpu: Unmap legacy queue when MES is enabled This fixes a kernel oops when MES is not enabled. Reported-by: Kenny Ho <Kenny.Ho@amd.com> Suggested-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Fixes: `18ee4ce63e` ("drm/amdgpu: add mes unmap legacy queue routine") Fixes: `3d879e81f0` ("drm/amdgpu: add init support for GFX11 (v2)") Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-18 15:20:18 -04:00
Xiaojian Du	0d6ec07a95	drm/amdgpu/discovery: add SMU v13.0.4 into the IP discovery list This patch will add SMU v13.0.4 into the IP discovery list. Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-16 10:02:58 -04:00
Jack Xiao	7bd3114b1c	drm/amdgpu/gfx11: fix mes mqd settings Use the correct Memory Queue Descriptor (MQD) structure for GC 11. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-16 10:02:57 -04:00
Jack Xiao	2fc092d4c7	drm/amdgpu/gfx11: fix me field handling in map_queue packet Select the correct microengine (me) when using the map_queue packet. There are different me's for GFX, compute, and scheduling. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-16 10:02:57 -04:00
Lang Yu	7226f40af6	drm/amdkfd: allocate MMIO/DOORBELL BOs with AMDGPU_GEM_CREATE_PREEMPTIBLE MMIO/DOORBELL BOs' backing resources(bus address resources that are used to talk to the GPU) are not managed by GTT manager, but they are counted by GTT manager. That makes no sense. With AMDGPU_GEM_CREATE_PREEMPTIBLE flag, such BOs will be managed by PREEMPT manager(for preemptible contexts, e.g., KFD). Then they won't be evicted and don't need to be pinned as well. But we still leave these BOs pinned to indicate that the underlying resource never moves. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-16 10:02:57 -04:00
Haohui Mai	b992a19085	drm/amdgpu: Ensure the DMA engine is deactivated during set ups Setting the HALT bit of SDMA_F32_CNTL in all paths before programming the ring buffer of the SDMA engine. Signed-off-by: Haohui Mai <ricetons@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-16 10:02:57 -04:00

1 2 3 4 5 ...

10955 commits