1
0
Fork 0
mirror of synced 2025-03-06 20:59:54 +01:00
Commit graph

52 commits

Author SHA1 Message Date
Colin Ian King
e6a7746ef9 drm/amdkfd: Fix spelling mistake "detroyed" -> "destroyed"
There is a spelling mistake in a pr_debug message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-19 15:07:46 -04:00
Philip Yang
3a87606089 drm/amdkfd: Migrate in CPU page fault use current mm
migrate_vma_setup shows below warning because we don't hold another
process mm mmap_lock. We should use current vmf->vma->vm_mm instead, the
caller already hold current mmap lock inside CPU page fault handler.

 WARNING: CPU: 10 PID: 3054 at include/linux/mmap_lock.h:155 find_vma
 Call Trace:
  walk_page_range+0x76/0x150
  migrate_vma_setup+0x18a/0x640
  svm_migrate_vram_to_ram+0x245/0xa10 [amdgpu]
  svm_migrate_to_ram+0x36f/0x470 [amdgpu]
  do_swap_page+0xcfe/0xec0
  __handle_mm_fault+0x96b/0x15e0
  handle_mm_fault+0x13f/0x3e0
  do_user_addr_fault+0x1e7/0x690

Fixes: e1f84eef31 ("drm/amdkfd: handle CPU fault on COW mapping")
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-13 14:32:05 -04:00
Philip Yang
c969c5fd21 drm/amdkfd: Remove prefault before migrating to VRAM
Prefaulting potentially allocates system memory pages before a
migration. This adds unnecessary overhead. Instead we can skip
unallocated pages in the migration and just point migrate->dst to a
0-initialized VRAM page directly. Then the VRAM page will be inserted
to the PTE. A subsequent CPU page fault will migrate the page back to
system memory.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-13 12:54:23 -04:00
Philip Yang
e1f84eef31 drm/amdkfd: handle CPU fault on COW mapping
If CPU page fault in a page with zone_device_data svm_bo from another
process, that means it is COW mapping in the child process and the
range is migrated to VRAM by parent process. Migrate the parent
process range back to system memory to recover the CPU page fault.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-13 12:54:23 -04:00
Linus Torvalds
6614a3c316 - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe
Lin, Yang Shi, Anshuman Khandual and Mike Rapoport
 
 - Some kmemleak fixes from Patrick Wang and Waiman Long
 
 - DAMON updates from SeongJae Park
 
 - memcg debug/visibility work from Roman Gushchin
 
 - vmalloc speedup from Uladzislau Rezki
 
 - more folio conversion work from Matthew Wilcox
 
 - enhancements for coherent device memory mapping from Alex Sierra
 
 - addition of shared pages tracking and CoW support for fsdax, from
   Shiyang Ruan
 
 - hugetlb optimizations from Mike Kravetz
 
 - Mel Gorman has contributed some pagealloc changes to improve latency
   and realtime behaviour.
 
 - mprotect soft-dirty checking has been improved by Peter Xu
 
 - Many other singleton patches all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCYuravgAKCRDdBJ7gKXxA
 jpqSAQDrXSdII+ht9kSHlaCVYjqRFQz/rRvURQrWQV74f6aeiAD+NHHeDPwZn11/
 SPktqEUrF1pxnGQxqLh1kUFUhsVZQgE=
 =w/UH
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2022-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:
 "Most of the MM queue. A few things are still pending.

  Liam's maple tree rework didn't make it. This has resulted in a few
  other minor patch series being held over for next time.

  Multi-gen LRU still isn't merged as we were waiting for mapletree to
  stabilize. The current plan is to merge MGLRU into -mm soon and to
  later reintroduce mapletree, with a view to hopefully getting both
  into 6.1-rc1.

  Summary:

   - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe
     Lin, Yang Shi, Anshuman Khandual and Mike Rapoport

   - Some kmemleak fixes from Patrick Wang and Waiman Long

   - DAMON updates from SeongJae Park

   - memcg debug/visibility work from Roman Gushchin

   - vmalloc speedup from Uladzislau Rezki

   - more folio conversion work from Matthew Wilcox

   - enhancements for coherent device memory mapping from Alex Sierra

   - addition of shared pages tracking and CoW support for fsdax, from
     Shiyang Ruan

   - hugetlb optimizations from Mike Kravetz

   - Mel Gorman has contributed some pagealloc changes to improve
     latency and realtime behaviour.

   - mprotect soft-dirty checking has been improved by Peter Xu

   - Many other singleton patches all over the place"

 [ XFS merge from hell as per Darrick Wong in

   https://lore.kernel.org/all/YshKnxb4VwXycPO8@magnolia/ ]

* tag 'mm-stable-2022-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (282 commits)
  tools/testing/selftests/vm/hmm-tests.c: fix build
  mm: Kconfig: fix typo
  mm: memory-failure: convert to pr_fmt()
  mm: use is_zone_movable_page() helper
  hugetlbfs: fix inaccurate comment in hugetlbfs_statfs()
  hugetlbfs: cleanup some comments in inode.c
  hugetlbfs: remove unneeded header file
  hugetlbfs: remove unneeded hugetlbfs_ops forward declaration
  hugetlbfs: use helper macro SZ_1{K,M}
  mm: cleanup is_highmem()
  mm/hmm: add a test for cross device private faults
  selftests: add soft-dirty into run_vmtests.sh
  selftests: soft-dirty: add test for mprotect
  mm/mprotect: fix soft-dirty check in can_change_pte_writable()
  mm: memcontrol: fix potential oom_lock recursion deadlock
  mm/gup.c: fix formatting in check_and_migrate_movable_page()
  xfs: fail dax mount if reflink is enabled on a partition
  mm/memcontrol.c: remove the redundant updating of stats_flush_threshold
  userfaultfd: don't fail on unrecognized features
  hugetlb_cgroup: fix wrong hugetlb cgroup numa stat
  ...
2022-08-05 16:32:45 -07:00
Philip Yang
4959e609de drm/amdkfd: Set svm range max pages
This will be used to split giant svm range into smaller ranges, to
support VRAM overcommitment by giant range and improve GPU retry fault
recover on giant range.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-28 16:05:14 -04:00
Alex Sierra
c83dee9b63 drm/amdkfd: add SPM support for SVM
When CPU is connected throug XGMI, it has coherent access to VRAM
resource.  In this case that resource is taken from a table in the device
gmc aperture base.  This resource is used along with the device type,
which could be DEVICE_PRIVATE or DEVICE_COHERENT to create the device page
map region.

Also, MIGRATE_VMA_SELECT_DEVICE_COHERENT flag is selected for coherent
type case during migration to device.

Link: https://lkml.kernel.org/r/20220715150521.18165-8-alex.sierra@amd.com
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-17 17:14:28 -07:00
Philip Yang
acac270d09 drm/amdkfd: Add migration SMI event
For migration start and end event, output timestamp when migration
starts, ends, svm range address and size, GPU id of migration source and
destination and svm range attributes,

Migration trigger could be prefetch, CPU or GPU page fault and TTM
eviction.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-30 15:31:04 -04:00
Philip Yang
88467db6e2 drm/amdkfd: Fix partial migration bugs
Migration range from system memory to VRAM, if system page can not be
locked or unmapped, we do partial migration and leave some pages in
system memory. Several bugs found to copy pages and update GPU mapping
for this situation:

1. copy to vram should use migrate->npage which is total pages of range
as npages, not migrate->cpages which is number of pages can be migrated.

2. After partial copy, set VRAM res cursor as j + 1, j is number of
system pages copied plus 1 page to skip copy.

3. copy to ram, should collect all continuous VRAM pages and copy
together.

4. Call amdgpu_vm_update_range, should pass in offset as bytes, not
as number of pages.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2022-06-03 16:43:22 -04:00
Dave Airlie
b900352f9d Merge tag 'amd-drm-next-5.19-2022-04-29' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.19-2022-04-29:

amdgpu
- RAS updates
- SI dpm deadlock fix
- Misc code cleanups
- HDCP fixes
- PSR fixes
- DSC fixes
- SDMA doorbell cleanups
- S0ix fix
- DC FP fix
- Zen dom0 regression fix for APUs
- IP discovery updates
- Initial SoC21 support
- Support for new vbios tables
- Runtime PM fixes
- Add PSP TA debugfs interface

amdkfd:
- Misc code cleanups
- Ignore bogus MEC signals more efficiently
- SVM fixes
- Use bitmap helpers

radeon:
- Misc code cleanups
- Spelling/grammer fixes

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220429144853.5742-1-alexander.deucher@amd.com
2022-05-06 15:05:27 +10:00
Yang Wang
cc9d82fc96 drm/amdkfd: use kvcalloc() instead of kvmalloc() in kfd_migrate
simplify programming with existing functions.

Signed-off-by: Yang Wang <KevinYang.Wang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-04-22 14:49:54 -04:00
Linus Torvalds
b14ffae378 drm for 5.18-rc1
dma-buf:
 - rename dma-buf-map to iosys-map
 
 core:
 - move buddy allocator to core
 - add pci/platform init macros
 - improve EDID parser deep color handling
 - EDID timing type 7 support
 - add GPD Win Max quirk
 - add yes/no helpers to string_helpers
 - flatten syncobj chains
 - add nomodeset support to lots of drivers
 - improve fb-helper clipping support
 - add default property value interface
 
 fbdev:
 - improve fbdev ops speed
 
 ttm:
 - add a backpointer from ttm bo->ttm resource
 
 dp:
 - move displayport headers
 - add a dp helper module
 
 bridge:
 - anx7625 atomic support, HDCP support
 
 panel:
 - split out panel-lvds and lvds bindings
 - find panels in OF subnodes
 
 privacy:
 - add chromeos privacy screen support
 
 fb:
 - hot unplug fw fb on forced removal
 
 simpledrm:
 - request region instead of marking ioresource busy
 - add panel oreintation property
 
 udmabuf:
 - fix oops with 0 pages
 
 amdgpu:
 - power management code cleanup
 - Enable freesync video mode by default
 - RAS code cleanup
 - Improve VRAM access for debug using SDMA
 - SR-IOV rework special register access and fixes
 - profiling power state request ioctl
 - expose IP discovery via sysfs
 - Cyan skillfish updates
 - GC 10.3.7, SDMA 5.2.7, DCN 3.1.6 updates
 - expose benchmark tests via debugfs
 - add module param to disable XGMI for testing
 - GPU reset debugfs register dumping support
 
 amdkfd:
 - CRIU support
 - SDMA queue fixes
 
 radeon:
 - UVD suspend fix
 - iMac backlight fix
 
 i915:
 - minimal parallel submission for execlists
 - DG2-G12 subplatform added
 - DG2 programming workarounds
 - DG2 accelerated migration support
 - flat CCS and CCS engine support for XeHP
 - initial small BAR support
 - drop fake LMEM support
 - ADL-N PCH support
 - bigjoiner updates
 - introduce VMA resources and async unbinding
 - register definitions cleanups
 - multi-FBC refactoring
 - DG1 OPROM over SPI support
 - ADL-N platform enabling
 - opregion mailbox #5 support
 - DP MST ESI improvements
 - drm device based logging
 - async flip optimisation for DG2
 - CPU arch abstraction fixes
 - improve GuC ADS init to work on aarch64
 - tweak TTM LRU priority hint
 - GuC 69.0.3 support
 - remove short term execbuf pins
 
 nouveau:
 - higher DP/eDP bitrates
 - backlight fixes
 
 msm:
 - dpu + dp support for sc8180x
 - dp support for sm8350
 - dpu + dsi support for qcm2290
 - 10nm dsi phy tuning support
 - bridge support for dp encoder
 - gpu support for additional 7c3 SKUs
 
 ingenic:
 - HDMI support for JZ4780
 - aux channel EDID support
 
 ast:
 - AST2600 support
 - add wide screen support
 - create DP/DVI connectors
 
 omapdrm:
 - fix implicit dma_buf fencing
 
 vc4:
 - add CSC + full range support
 - better display firmware handoff
 
 panfrost:
 - add initial dual-core GPU support
 
 stm:
 - new revision support
 - fb handover support
 
 mediatek:
 - transfer display binding document to yaml format.
 - add mt8195 display device binding.
 - allow commands to be sent during video mode.
 - add wait_for_event for crtc disable by cmdq.
 
 tegra:
 - YUV format support
 
 rcar-du:
 - LVDS support for M3-W+ (R8A77961)
 
 exynos:
 - BGR pixel format for FIMD device
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmI71h4ACgkQDHTzWXnE
 hr6wKg//SvKFiEOhptua8Ao8XYkhXpg1/tgdAs4D7bZ0YgJyF4Im0RuFOKMmF3mN
 0Y8AwguqrsmrOAFbK8B1WEysB66DmGlZN/V2Q75X7fui8xs4uGF2Fcxyr+265zhf
 vONPwAoxYr+KXqwOI1p1BP2QEL6bJTdu+nrXRsXIBIrWnw8ehXJlw3fDhgvG5QBn
 RPdbU7lQnd47hdYxkbe5SiZvWnPC46dJmpqsRJir0xjskR6juU36f34C4IKhTGwO
 NDPeWVgusVXtIC/F4X6RebCWG0f66h+CUFa9zeYIleI/2/5yZWXfcw6Obx8HgPkt
 gieiI0R4TpkVxeHCApCQ5UpxWgfSOXdoDoyw172bKQw7JCHVEkSwenyMEEwNet6r
 SCJrRmlB1PBI/iTWmhm9qgrU46ZZyAnQoTlCsXGzJncdP3hzGlA1embl00yfEl7f
 wzM35N20qd5T4VKUEF8QYF0fLZYmKw4cWVASu4hQ3qmGal6frilphz2J8JK8hQNq
 KhFqNbVTnZsQNr9LBCbrf0kOPaMzpmW+2vQG9ApdAb1N3gNPZT7ctti0Xq5N2OUR
 AipWFAsDPS2NPADKmBtDU55PgFH9MqUIsoHHXLV4Qi76dvCqYoN68qRQxrL7rpSu
 b0gr0YKU2QcIB/uytjOPHcgtI5Xvrh+q8JPz/dJ38/Esgjmk4wo=
 =uRsT
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2022-03-24' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Dave Airlie:
 "Lots of work all over, Intel improving DG2 support, amdkfd CRIU
  support, msm new hw support, and faster fbdev support.

  dma-buf:
   - rename dma-buf-map to iosys-map

  core:
   - move buddy allocator to core
   - add pci/platform init macros
   - improve EDID parser deep color handling
   - EDID timing type 7 support
   - add GPD Win Max quirk
   - add yes/no helpers to string_helpers
   - flatten syncobj chains
   - add nomodeset support to lots of drivers
   - improve fb-helper clipping support
   - add default property value interface

  fbdev:
   - improve fbdev ops speed

  ttm:
   - add a backpointer from ttm bo->ttm resource

  dp:
   - move displayport headers
   - add a dp helper module

  bridge:
   - anx7625 atomic support, HDCP support

  panel:
   - split out panel-lvds and lvds bindings
   - find panels in OF subnodes

  privacy:
   - add chromeos privacy screen support

  fb:
   - hot unplug fw fb on forced removal

  simpledrm:
   - request region instead of marking ioresource busy
   - add panel oreintation property

  udmabuf:
   - fix oops with 0 pages

  amdgpu:
   - power management code cleanup
   - Enable freesync video mode by default
   - RAS code cleanup
   - Improve VRAM access for debug using SDMA
   - SR-IOV rework special register access and fixes
   - profiling power state request ioctl
   - expose IP discovery via sysfs
   - Cyan skillfish updates
   - GC 10.3.7, SDMA 5.2.7, DCN 3.1.6 updates
   - expose benchmark tests via debugfs
   - add module param to disable XGMI for testing
   - GPU reset debugfs register dumping support

  amdkfd:
   - CRIU support
   - SDMA queue fixes

  radeon:
   - UVD suspend fix
   - iMac backlight fix

  i915:
   - minimal parallel submission for execlists
   - DG2-G12 subplatform added
   - DG2 programming workarounds
   - DG2 accelerated migration support
   - flat CCS and CCS engine support for XeHP
   - initial small BAR support
   - drop fake LMEM support
   - ADL-N PCH support
   - bigjoiner updates
   - introduce VMA resources and async unbinding
   - register definitions cleanups
   - multi-FBC refactoring
   - DG1 OPROM over SPI support
   - ADL-N platform enabling
   - opregion mailbox #5 support
   - DP MST ESI improvements
   - drm device based logging
   - async flip optimisation for DG2
   - CPU arch abstraction fixes
   - improve GuC ADS init to work on aarch64
   - tweak TTM LRU priority hint
   - GuC 69.0.3 support
   - remove short term execbuf pins

  nouveau:
   - higher DP/eDP bitrates
   - backlight fixes

  msm:
   - dpu + dp support for sc8180x
   - dp support for sm8350
   - dpu + dsi support for qcm2290
   - 10nm dsi phy tuning support
   - bridge support for dp encoder
   - gpu support for additional 7c3 SKUs

  ingenic:
   - HDMI support for JZ4780
   - aux channel EDID support

  ast:
   - AST2600 support
   - add wide screen support
   - create DP/DVI connectors

  omapdrm:
   - fix implicit dma_buf fencing

  vc4:
   - add CSC + full range support
   - better display firmware handoff

  panfrost:
   - add initial dual-core GPU support

  stm:
   - new revision support
   - fb handover support

  mediatek:
   - transfer display binding document to yaml format.
   - add mt8195 display device binding.
   - allow commands to be sent during video mode.
   - add wait_for_event for crtc disable by cmdq.

  tegra:
   - YUV format support

  rcar-du:
   - LVDS support for M3-W+ (R8A77961)

  exynos:
   - BGR pixel format for FIMD device"

* tag 'drm-next-2022-03-24' of git://anongit.freedesktop.org/drm/drm: (1529 commits)
  drm/i915/display: Do not re-enable PSR after it was marked as not reliable
  drm/i915/display: Fix HPD short pulse handling for eDP
  drm/amdgpu: Use drm_mode_copy()
  drm/radeon: Use drm_mode_copy()
  drm/amdgpu: Use ternary operator in `vcn_v1_0_start()`
  drm/amdgpu: Remove pointless on stack mode copies
  drm/amd/pm: fix indenting in __smu_cmn_reg_print_error()
  drm/amdgpu/dc: fix typos in comments
  drm/amdgpu: fix typos in comments
  drm/amd/pm: fix typos in comments
  drm/amdgpu: Add stolen reserved memory for MI25 SRIOV.
  drm/amdgpu: Merge get_reserved_allocation to get_vbios_allocations.
  drm/amdkfd: evict svm bo worker handle error
  drm/amdgpu/vcn: fix vcn ring test failure in igt reload test
  drm/amdgpu: only allow secure submission on rings which support that
  drm/amdgpu: fixed the warnings reported by kernel test robot
  drm/amd/display: 3.2.177
  drm/amd/display: [FW Promotion] Release 0.0.108.0
  drm/amd/display: Add save/restore PANEL_PWRSEQ_REF_DIV2
  drm/amd/display: Wait for hubp read line for Pollock
  ...
2022-03-24 16:19:43 -07:00
Philip Yang
9527b9caf8 drm/amdkfd: evict svm bo worker handle error
Migrate vram to ram may fail to find the vma if process is exiting
and vma is removed, evict svm bo worker sets prange->svm_bo to NULL
and warn svm_bo ref count != 1 only if migrating vram to ram
successfully.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-03-15 15:01:12 -04:00
Christoph Hellwig
27674ef6c7 mm: remove the extra ZONE_DEVICE struct page refcount
ZONE_DEVICE struct pages have an extra reference count that complicates
the code for put_page() and several places in the kernel that need to
check the reference count to see that a page is not being used (gup,
compaction, migration, etc.). Clean up the code so the reference count
doesn't need to be treated specially for ZONE_DEVICE pages.

Note that this excludes the special idle page wakeup for fsdax pages,
which still happens at refcount 1.  This is a separate issue and will
be sorted out later.  Given that only fsdax pages require the
notifiacation when the refcount hits 1 now, the PAGEMAP_OPS Kconfig
symbol can go away and be replaced with a FS_DAX check for this hook
in the put_page fastpath.

Based on an earlier patch from Ralph Campbell <rcampbell@nvidia.com>.

Link: https://lkml.kernel.org/r/20220210072828.2930359-8-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: "Sierra Guiza, Alejandro (Alex)" <alex.sierra@amd.com>

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Chaitanya Kulkarni <kch@nvidia.com>
Cc: Christian Knig <christian.koenig@amd.com>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-03-03 12:47:33 -05:00
Christoph Hellwig
730ff52194 mm: remove pointless includes from <linux/hmm.h>
hmm.h pulls in the world for no good reason at all.  Remove the
includes and push a few ones into the users instead.

Link: https://lkml.kernel.org/r/20220210072828.2930359-4-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Tested-by: "Sierra Guiza, Alejandro (Alex)" <alex.sierra@amd.com>

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Christian Knig <christian.koenig@amd.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-03-03 12:47:33 -05:00
Rajneesh Bhardwaj
2243f4937a drm/amdkfd: Fix leftover errors and warnings
A bunch of errors and warnings are leftover KFD over the years, attempt
to fix the errors and most warnings reported by checkpatch tool. Still a
few warnings remain which may be false positives so ignore them for now.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-14 15:08:40 -05:00
Alex Sierra
7f161df1a5 drm/amdkfd: replace err by dbg print at svm vram migration
Avoid spam the kernel log on application memory allocation failures.
__func__ argument was also removed from dev_fmt macro due to
parameter conflicts with dynamic_dev_dbg.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.comi>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-11 16:20:24 -05:00
Christian König
1b08dfb889 drm/amdgpu: remove gart.ready flag
That's just a leftover from old radeon days and was preventing CS and GART
bindings before the hardware was initialized. But nowdays that is
perfectly valid.

The only thing we need to warn about are GART binding before the table
is even allocated.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-19 22:32:47 -05:00
Philip Yang
69879b3083 drm/amdkfd: fix svm_bo release invalid wait context warning
Add svm_range_bo_unref_async to schedule work to wait for svm_bo
eviction work done and then free svm_bo. __do_munmap put_page
is atomic context, call svm_range_bo_unref_async to avoid warning
invalid wait context. Other non atomic context call svm_range_bo_unref.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-16 13:43:46 -05:00
Isabella Basso
bbe04dec5c drm/amd: fix improper docstring syntax
This fixes various warnings relating to erroneous docstring syntax, of
which some are listed below:

 warning: Function parameter or member 'adev' not described in
 'amdgpu_atomfirmware_ras_rom_addr'
 ...
 warning: expecting prototype for amdgpu_atpx_validate_functions().
 Prototype was for amdgpu_atpx_validate() instead
 ...
 warning: Excess function parameter 'mem' description in 'amdgpu_preempt_mgr_new'
 ...
 warning: Cannot understand  * @kfd_get_cu_occupancy - Collect number of
 waves in-flight on this device
 ...
 warning: This comment starts with '/**', but isn't a kernel-doc
 comment. Refer Documentation/doc-guide/kernel-doc.rst

Signed-off-by: Isabella Basso <isabbasso@riseup.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-13 16:32:34 -05:00
Graham Sider
046e674b96 drm/amdkfd: convert misc checks to IP version checking
Switch to IP version checking instead of asic_type on various KFD
version checks.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 17:09:46 -05:00
Linus Torvalds
304ac8032d drm next/fixes for 5.16-rc1
bridge:
 - HPD improvments for lt9611uxc
 - eDP aux-bus support for ps8640
 - LVDS data-mapping selection support
 
 ttm:
 - remove huge page functionality (needs reworking)
 - fix a race condition during BO eviction
 
 panels:
 - add some new panels
 
 fbdev:
 - fix double-free
 - remove unused scrolling acceleration
 - CONFIG_FB dep improvements
 
 locking:
 - improve contended locking logging
 - naming collision fix
 
 dma-buf:
 - add dma_resv_for_each_fence iterator
 - fix fence refcounting bug
 - name locking fixesA
 
 prime:
 - fix object references during mmap
 
 nouveau:
 - various code style changes
 - refcount fix
 - device removal fixes
 - protect client list with a mutex
 - fix CE0 address calculation
 
 i915:
 - DP rates related fixes
 - Revert disabling dual eDP that was causing state readout problems
 - put the cdclk vtables in const data
 - Fix DVO port type for older platforms
 - Fix blankscreen by turning DP++ TMDS output buffers on encoder->shutdown
 - CCS FBs related fixes
 - Fix recursive lock in GuC submission
 - Revert guc_id from i915_request tracepoint
 - Build fix around dmabuf
 
 amdgpu:
 - GPU reset fix
 - Aldebaran fix
 - Yellow Carp fixes
 - DCN2.1 DMCUB fix
 - IOMMU regression fix for Picasso
 - DSC display fixes
 - BPC display calculation fixes
 - Other misc display fixes
 - Don't allow partial copy from user for DC debugfs
 - SRIOV fixes
 - GFX9 CSB pin count fix
 - Various IP version check fixes
 - DP 2.0 fixes
 - Limit DCN1 MPO fix to DCN1
 
 amdkfd:
 - SVM fixes
 - Fix gfx version for renoir
 - Reset fixes
 
 udl:
 - timeout fix
 
 imx:
 - circular locking fix
 
 virtio:
 - NULL ptr deref fix
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmGN3YwACgkQDHTzWXnE
 hr6aZQ/+Pobf1VE7V3wPUcopxccJYmgBvG/uY8EDyjA8qaxHs2pQqGN2IooOGxr6
 F8G1N94Hem/PCDn3T8JI2Tqw5z4sy4UwLahEWISurFCen1IMAfA7hYfutp9X3O7X
 8h7b+PgkvVruEAHF7z0kqnWGPHmcro29cIHNkXVRjnJuz+Gmn1XRfo6Jj65n6D7u
 NfMeU4/lWRR3767oJQzTqyAYtGxsKaZT3/tBD5WggZBzEKC7hqhAl8EUoOLWwojo
 fDqwiEpLXpraPRIQH8trkXVHhzPeLAmG916WwS8JG3CEk9mUQ+I7Jshhd8cw+bsQ
 XPuk3OBfU9mtuiGgNzrLP3xXJZs/QN3EkpKZWLefTnJY+C4BgiP2RifTnghmwV31
 6/7Pr83CX/cn3BRd7r0xaeBZYvVYBZmwoZcsZFJBM8SVjd/ofKUfAmCzZZKheio2
 5qa6bj9DQoyjEoFAULh23plcX6hvATGP7wzfRTnJ9AlAJ0KyEjVJ3r0qE6jHMDc/
 uzcTAnKIWCxt9kSgE5qwLQtxLBaBpr/iOniZbCqGkPjiZeMzqP/ug1AKVP7kk39x
 FxZVT8ZOKk8Xt4iLZx8jmHi2KKheXYZi9LqieoTrJd44qMXDOmR9DCtQX9FZuWJS
 EJAlMj6sCowAZdODPZMVpoMc3Gti9nZ2Fpu7mLrRcMk1gKfjKwo=
 =qMNk
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2021-11-12' of git://anongit.freedesktop.org/drm/drm

Pull more drm updates from Dave Airlie:
 "I missed a drm-misc-next pull for the main pull last week. It wasn't
  that major and isn't the bulk of this at all. This has a bunch of
  fixes all over, a lot for amdgpu and i915.

  bridge:
   - HPD improvments for lt9611uxc
   - eDP aux-bus support for ps8640
   - LVDS data-mapping selection support

  ttm:
   - remove huge page functionality (needs reworking)
   - fix a race condition during BO eviction

  panels:
   - add some new panels

  fbdev:
   - fix double-free
   - remove unused scrolling acceleration
   - CONFIG_FB dep improvements

  locking:
   - improve contended locking logging
   - naming collision fix

  dma-buf:
   - add dma_resv_for_each_fence iterator
   - fix fence refcounting bug
   - name locking fixesA

  prime:
   - fix object references during mmap

  nouveau:
   - various code style changes
   - refcount fix
   - device removal fixes
   - protect client list with a mutex
   - fix CE0 address calculation

  i915:
   - DP rates related fixes
   - Revert disabling dual eDP that was causing state readout problems
   - put the cdclk vtables in const data
   - Fix DVO port type for older platforms
   - Fix blankscreen by turning DP++ TMDS output buffers on encoder->shutdown
   - CCS FBs related fixes
   - Fix recursive lock in GuC submission
   - Revert guc_id from i915_request tracepoint
   - Build fix around dmabuf

  amdgpu:
   - GPU reset fix
   - Aldebaran fix
   - Yellow Carp fixes
   - DCN2.1 DMCUB fix
   - IOMMU regression fix for Picasso
   - DSC display fixes
   - BPC display calculation fixes
   - Other misc display fixes
   - Don't allow partial copy from user for DC debugfs
   - SRIOV fixes
   - GFX9 CSB pin count fix
   - Various IP version check fixes
   - DP 2.0 fixes
   - Limit DCN1 MPO fix to DCN1

  amdkfd:
   - SVM fixes
   - Fix gfx version for renoir
   - Reset fixes

  udl:
   - timeout fix

  imx:
   - circular locking fix

  virtio:
   - NULL ptr deref fix"

* tag 'drm-next-2021-11-12' of git://anongit.freedesktop.org/drm/drm: (126 commits)
  drm/ttm: Double check mem_type of BO while eviction
  drm/amdgpu: add missed support for UVD IP_VERSION(3, 0, 64)
  drm/amdgpu: drop jpeg IP initialization in SRIOV case
  drm/amd/display: reject both non-zero src_x and src_y only for DCN1x
  drm/amd/display: Add callbacks for DMUB HPD IRQ notifications
  drm/amd/display: Don't lock connection_mutex for DMUB HPD
  drm/amd/display: Add comment where CONFIG_DRM_AMD_DC_DCN macro ends
  drm/amdkfd: Fix retry fault drain race conditions
  drm/amdkfd: lower the VAs base offset to 8KB
  drm/amd/display: fix exit from amdgpu_dm_atomic_check() abruptly
  drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov
  drm/amdgpu: fix uvd crash on Polaris12 during driver unloading
  drm/i915/adlp/fb: Prevent the mapping of redundant trailing padding NULL pages
  drm/i915/fb: Fix rounding error in subsampled plane size calculation
  drm/i915/hdmi: Turn DP++ TMDS output buffers back on in encoder->shutdown()
  drm/locking: fix __stack_depot_* name conflict
  drm/virtio: Fix NULL dereference error in virtio_gpu_poll
  drm/amdgpu: fix SI handling in amdgpu_device_asic_has_dc_support()
  drm/amdgpu: Fix dangling kfd_bo pointer for shared BOs
  drm/amd/amdkfd: Don't sent command to HWS on kfd reset
  ...
2021-11-12 12:11:07 -08:00
Alistair Popple
ab09243aa9 mm/migrate.c: remove MIGRATE_PFN_LOCKED
MIGRATE_PFN_LOCKED is used to indicate to migrate_vma_prepare() that a
source page was already locked during migrate_vma_collect().  If it
wasn't then the a second attempt is made to lock the page.  However if
the first attempt failed it's unlikely a second attempt will succeed,
and the retry adds complexity.  So clean this up by removing the retry
and MIGRATE_PFN_LOCKED flag.

Destination pages are also meant to have the MIGRATE_PFN_LOCKED flag
set, but nothing actually checks that.

Link: https://lkml.kernel.org/r/20211025041608.289017-1-apopple@nvidia.com
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-11 09:34:35 -08:00
Alex Sierra
a6283010e2 drm/amdkfd: avoid recursive lock in migrations back to RAM
[Why]:
When we call hmm_range_fault to map memory after a migration, we don't
expect memory to be migrated again as a result of hmm_range_fault. The
driver ensures that all memory is in GPU-accessible locations so that
no migration should be needed. However, there is one corner case where
hmm_range_fault can unexpectedly cause a migration from DEVICE_PRIVATE
back to system memory due to a write-fault when a system memory page in
the same range was mapped read-only (e.g. COW). Ranges with individual
pages in different locations are usually the result of failed page
migrations (e.g. page lock contention). The unexpected migration back
to system memory causes a deadlock from recursive locking in our
driver.

[How]:
Creating a task reference new member under svm_range_list struct.
Setting this with "current" reference, right before the hmm_range_fault
is called. This member is checked against "current" reference at
svm_migrate_to_ram callback function. If equal, the migration will be
ignored.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05 14:10:58 -04:00
Felix Kuehling
740a451b07 drm/amdkfd: Handle incomplete migration to system memory
If some pages fail to migrate to system memory, don't update
prange->actual_loc = 0. This prevents endless CPU page faults after
partial migration failures due to contested page locks.

Migration to RAM must be complete during migrations from VRAM to VRAM and
during evictions. Implement retry and fail if the migration to RAM fails.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03 12:22:07 -04:00
Philip Yang
33c6bd989d drm/amdkfd: debug message to count successfully migrated pages
Not all migrate.cpages returned from migrate_vma_setup can be migrated,
for example non anonymous page, or out of device memory. So after
migrate_vma_pages returns, add debug message to count pages are
successfully migrated which has MIGRATE_PFN_VALID and
MIGRATE_PFN_MIGRATE flag set.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-21 23:38:35 -04:00
Philip Yang
75fa98d6e4 drm/amdkfd: clarify the origin of cpages returned by migration functions
cpages is only updated by migrate_vma_setup. So capture its value at
that point to clarify the significance of the number. The next patch
will add counting of actually migrated pages after migrate_vma_pages for
debug purposes.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-21 23:38:29 -04:00
Philip Yang
ca432dcc27 drm/amdkfd: handle svm partial migration cpages 0
migrate_vma_setup may return cpages 0, means 0 page can be migrated,
treat this as error case to skip the rest of vma migration steps.

Change svm_migrate_vma_to_vram and svm_migrate_vma_to_ram to return the
number of pages migrated successfully or error code. The caller add up
all the successful migration pages and update prange->actual_loc only if
the total migrated pages is not 0.

This also removes the warning message "VRAM BO missing during
validation" if migration cpages is 0.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-13 14:15:46 -04:00
Philip Yang
a273bc9937 drm/amdkfd: ratelimited svm debug messages
No function change, use pr_debug_ratelimited to avoid per page debug
message overflowing dmesg buf and console log.

use dev_err to show error message from unexpected situation, to provide
clue to help debug without enabling dynamic debug log. Define dev_fmt to
output function name in error message.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-13 14:15:40 -04:00
Yang Li
0de5472a01 drm/amdkfd: fix resource_size.cocci warnings
Use resource_size function on resource object
instead of explicit computation.

Clean up coccicheck warning:
./drivers/gpu/drm/amd/amdkfd/kfd_migrate.c:905:10-13: ERROR: Missing
resource_size with res

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Reviewed-by: Amos Kong <kongjianjun@gmail.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-29 17:30:00 -04:00
Philip Yang
22f4f4faf3 drm/amdkfd: fix svm_migrate_fini warning
Device manager releases device-specific resources when a driver
disconnects from a device, devm_memunmap_pages and
devm_release_mem_region calls in svm_migrate_fini are redundant.

It causes below warning trace after patch "drm/amdgpu: Split
amdgpu_device_fini into early and late", so remove function
svm_migrate_fini.

BUG: https://gitlab.freedesktop.org/drm/amd/-/issues/1718

WARNING: CPU: 1 PID: 3646 at drivers/base/devres.c:795
devm_release_action+0x51/0x60
Call Trace:
    ? memunmap_pages+0x360/0x360
    svm_migrate_fini+0x2d/0x60 [amdgpu]
    kgd2kfd_device_exit+0x23/0xa0 [amdgpu]
    amdgpu_amdkfd_device_fini_sw+0x1d/0x30 [amdgpu]
    amdgpu_device_fini_sw+0x45/0x290 [amdgpu]
    amdgpu_driver_release_kms+0x12/0x30 [amdgpu]
    drm_dev_release+0x20/0x40 [drm]
    release_nodes+0x196/0x1e0
    device_release_driver_internal+0x104/0x1d0
    driver_detach+0x47/0x90
    bus_remove_driver+0x7a/0xd0
    pci_unregister_driver+0x3d/0x90
    amdgpu_exit+0x11/0x20 [amdgpu]

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 16:34:57 -04:00
Philip Yang
586d71a427 drm/amdkfd: handle svm migrate init error
If svm migration init failed to create pgmap for device memory, set
pgmap type to 0 to disable device SVM support capability.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 16:34:49 -04:00
Alex Sierra
7981ec6549 drm/amdkfd: Maintain svm_bo reference in page->zone_device_data
Each zone-device page holds a reference to the SVM BO that manages its
backing storage. This is necessary to correctly hold on to the BO in
case zone_device pages are shared with a child-process.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:05:41 -04:00
Alex Sierra
3bf8282c6b drm/amdkfd: add invalid pages debug at vram migration
This is for debug purposes only.
It conditionally generates partial migrations to test mixed
CPU/GPU memory domain pages in a prange easily.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:05:41 -04:00
Alex Sierra
6ffecc946f drm/amdkfd: skip migration for pages already in VRAM
Migration skipped for pages that are already in VRAM
domain. These could be the result of previous partial
migrations to SYS RAM, and prefetch back to VRAM.
Ex. Coherent pages in VRAM that were not written/invalidated after
a copy-on-write.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:05:41 -04:00
Alex Sierra
1ade5f84cc drm/amdkfd: skip invalid pages during migrations
Invalid pages can be the result of pages that have been migrated
already due to copy-on-write procedure or pages that were never
migrated to VRAM in first place. This is not an issue anymore,
as pranges now support mixed memory domains (CPU/GPU).

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:05:41 -04:00
Alex Sierra
a010d98a78 drm/amdkfd: set owner ref to svm range prefault
svm_range_prefault is called right before migrations to VRAM,
to make sure pages are resident in system memory before the migration.
With partial migrations, this reference is used by hmm range get pages
to avoid migrating pages that are already in the same VRAM domain.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:05:41 -04:00
Alex Sierra
3a61dae854 drm/amdkfd: device pgmap owner at the svm migrate init
GPUs in the same XGMI hive have direct access to all
members'VRAM. When mapping memory to a GPU, we don't need
hmm_range_fault to fault device-private pages in the same
hive back to the host. Identifying the page owner as the hive,
rather than the individual GPU, accomplishes this.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:05:41 -04:00
Philip Yang
d4ebc20070 drm/amdkfd: implement counters for vm fault and migration
Add helper function to get process device data structure from adev to
update counters.

Update vm faults, page_in, page_out counters will no be executed in
parallel, use WRITE_ONCE to avoid any form of compiler optimizations.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-30 00:18:23 -04:00
Christian König
2fdcb55dfc drm/amdkfd: use resource cursor in svm_migrate_copy_to_vram v2
Access to the mm_node is now forbidden. So instead of hand wiring that
use the cursor functionality.

v2: fix handling as pointed out by Philip.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by and Tested-by: Philip Yang <philip.yang@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210602100914.46246-5-christian.koenig@amd.com
2021-06-04 15:16:46 +02:00
Philip Yang
04fe3fd10e drm/amdkfd: handle errors returned by svm_migrate_copy_to_vram/ram
If migration copy failed because process is killed, or out of VRAM or
system memory, pass error code back to caller to handle error
gracefully.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-10 18:08:32 -04:00
Felix Kuehling
2e4ec25162 drm/amdkfd: Make svm_migrate_put_sys_page static
This function is only used in this source file.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-10 18:06:44 -04:00
Philip Yang
c0f76fc8ad drm/amdkfd: fix double free device pgmap resource
Use devm_memunmap_pages instead of memunmap_pages to release pgmap
and remove pgmap from device action, to avoid double free pgmap when
unloading driver module.

Release device memory region if failed to create device memory pages
structure.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-28 23:36:04 -04:00
Colin Ian King
a40eb089b4 drm/amdkfd: remove redundant initialization to variable r
The variable r is being initialized with a value that is never read
and it is being updated later with a new value. The initialization is
redundant and can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-23 17:17:50 -04:00
Felix Kuehling
1a3b2b5dca drm/amdkfd: multiple gpu migrate vram to vram
If prefetch range to gpu with acutal location is another gpu, or GPU
retry fault restore pages to migrate the range with acutal location is
gpu, then migrate from one gpu to another gpu.

Use system memory as bridge because sdma engine may not able to access
another gpu vram, use sdma of source gpu to migrate to system memory,
then use sdma of destination gpu to migrate from system memory to gpu.

Print out gpuid or gpuidx in debug messages.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-20 21:50:22 -04:00
Felix Kuehling
cda0f85bfa drm/amdkfd: refine migration policy with xnack on
With xnack on, GPU vm fault handler decide the best restore location,
then migrate range to the best restore location and update GPU mapping
to recover the GPU vm fault.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-20 21:50:03 -04:00
Felix Kuehling
90d7d3eda5 drm/amdkfd: invalidate tables on page retry fault
GPU page tables are invalidated by unmapping prange directly at
the mmu notifier, when page fault retry is enabled through
amdgpu_noretry global parameter. The restore page table is
performed at the page fault handler.

If xnack is on, we update GPU mappings after migration to avoid
unnecessary GPUVM faults.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-20 21:48:50 -04:00
Felix Kuehling
48ff079b28 drm/amdkfd: HMM migrate vram to ram
If CPU page fault happens, HMM pgmap_ops callback migrate_to_ram start
migrate memory from vram to ram in steps:

1. migrate_vma_pages get vram pages, and notify HMM to invalidate the
pages, HMM interval notifier callback evict process queues
2. Allocate system memory pages
3. Use svm copy memory to migrate data from vram to ram
4. migrate_vma_pages copy pages structure from vram pages to ram pages
5. Return VM_FAULT_SIGBUS if migration failed, to notify application
6. migrate_vma_finalize put vram pages, page_free callback free vram
pages and vram nodes
7. Restore work wait for migration is finished, then update GPU page
table mapping to system memory, and resume process queues

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-20 21:48:43 -04:00
Felix Kuehling
0b0e518d61 drm/amdkfd: HMM migrate ram to vram
Register svm range with same address and size but perferred_location
is changed from CPU to GPU or from GPU to CPU, trigger migration the svm
range from ram to vram or from vram to ram.

If svm range prefetch location is GPU with flags
KFD_IOCTL_SVM_FLAG_HOST_ACCESS, validate the svm range on ram first,
then migrate it from ram to vram.

After migrating to vram is done, CPU access will have cpu page fault,
page fault handler migrate it back to ram and resume cpu access.

Migration steps:

1. migrate_vma_pages get svm range ram pages, notify the
interval is invalidated and unmap from CPU page table, HMM interval
notifier callback evict process queues
2. Allocate new pages in vram using TTM
3. Use svm copy memory to sdma copy data from ram to vram
4. migrate_vma_pages copy ram pages structure to vram pages structure
5. migrate_vma_finalize put ram pages to free ram pages and memory
6. Restore work wait for migration is finished, then update GPUs page
table mapping to new vram pages, resume process queues

If migrate_vma_setup failed to collect all ram pages of range, retry 3
times until success to start migration.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-20 21:48:30 -04:00
Philip Yang
50ea50cf6f drm/amdkfd: copy memory through gart table
Use sdma linear copy to migrate data between ram and vram. The sdma
linear copy command uses kernel buffer function queue to access system
memory through gart table.

Use reserved gart table window 0 to map system page address, and vram
page address is direct mapping. Use the same kernel buffer function to
fill in gart table mapping, so this is serialized with memory copy by
sdma job submit. We only need wait for the last memory copy sdma fence
for larger buffer migration.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-20 21:48:23 -04:00