linux

mirror of synced 2025-03-06 20:59:54 +01:00

Author	SHA1	Message	Date
Adrián Larumbe	b12f3ea7c1	drm/panfrost: Replace fdinfo's profiling debugfs knob with sysfs Debugfs isn't always available in production builds that try to squeeze every single byte out of the kernel image, but we still need a way to toggle the timestamp and cycle counter registers so that jobs can be profiled for fdinfo's drm engine and cycle calculations. Drop the debugfs knob and replace it with a sysfs file that accomplishes the same functionality, and document its ABI in a separate file. Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240306015819.822128-2-adrian.larumbe@collabora.com	2024-03-11 13:27:10 +01:00
AngeloGioacchino Del Regno	157ad4ccff	drm/panfrost: Synchronize and disable interrupts before powering off To make sure that we don't unintentionally perform any unclocked and/or unpowered R/W operation on GPU registers, before turning off clocks and regulators we must make sure that no GPU, JOB or MMU ISR execution is pending: doing that requires to add a mechanism to synchronize the interrupts on suspend. Add functions panfrost_{gpu,job,mmu}_suspend_irq() which will perform interrupts masking and ISR execution synchronization, and then call those in the panfrost_device_runtime_suspend() handler in the exact sequence of job (may require mmu!) -> mmu -> gpu. As a side note, JOB and MMU suspend_irq functions needed some special treatment: as their interrupt handlers will unmask interrupts, it was necessary to add an `is_suspended` bitmap which is used to address the possible corner case of unintentional IRQ unmasking because of ISR execution after a call to synchronize_irq(). At resume, clear each is_suspended bit in the reset path of JOB/MMU to allow unmasking the interrupts. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231204114215.54575-4-angelogioacchino.delregno@collabora.com	2023-12-05 11:39:59 +01:00
AngeloGioacchino Del Regno	b98e9a84d3	drm/panfrost: Add gpu_irq, mmu_irq to struct panfrost_device In preparation for adding a IRQ synchronization mechanism for PM suspend, add gpu_irq and mmu_irq variables to struct panfrost_device and change functions panfrost_gpu_init() and panfrost_mmu_init() to use those. Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231204114215.54575-3-angelogioacchino.delregno@collabora.com	2023-12-05 11:39:57 +01:00
Maxime Ripard	3bf3e21c15	Merge drm/drm-next into drm-misc-next Let's kickstart the v6.8 release cycle. Signed-off-by: Maxime Ripard <mripard@kernel.org>	2023-11-15 10:56:44 +01:00
AngeloGioacchino Del Regno	889a2b06f8	drm/panfrost: Implement ability to turn on/off regulators in suspend Some platforms/SoCs can power off the GPU entirely by completely cutting off power, greatly enhancing battery time during system suspend: add a new pm_feature GPU_PM_VREG_OFF to allow turning off the GPU regulators during full suspend only on selected platforms. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231109102543.42971-6-angelogioacchino.delregno@collabora.com	2023-11-10 14:12:16 +00:00
AngeloGioacchino Del Regno	56e76c0179	drm/panfrost: Implement ability to turn on/off GPU clocks in suspend Currently, the GPU is being internally powered off for runtime suspend and turned back on for runtime resume through commands sent to it, but note that the GPU doesn't need to be clocked during the poweroff state, hence it is possible to save some power on selected platforms. Add suspend and resume handlers for full system sleep and then add a new panfrost_gpu_pm enumeration and a pm_features variable in the panfrost_compatible structure: BIT(GPU_PM_CLK_DIS) will be used to enable this power saving technique only on SoCs that are able to safely use it. Note that this was implemented only for the system sleep case and not for runtime PM because testing on one of my MediaTek platforms showed issues when turning on and off clocks aggressively (in PM runtime) resulting in a full system lockup. Doing this only for full system sleep never showed issues during my testing by suspending and resuming the system continuously for more than 100 cycles. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231109102543.42971-4-angelogioacchino.delregno@collabora.com	2023-11-10 14:11:57 +00:00
Linus Torvalds	ecae0bd517	Many singleton patches against the MM code. The patch series which are included in this merge do the following: - Kemeng Shi has contributed some compation maintenance work in the series "Fixes and cleanups to compaction". - Joel Fernandes has a patchset ("Optimize mremap during mutual alignment within PMD") which fixes an obscure issue with mremap()'s pagetable handling during a subsequent exec(), based upon an implementation which Linus suggested. - More DAMON/DAMOS maintenance and feature work from SeongJae Park i the following patch series: mm/damon: misc fixups for documents, comments and its tracepoint mm/damon: add a tracepoint for damos apply target regions mm/damon: provide pseudo-moving sum based access rate mm/damon: implement DAMOS apply intervals mm/damon/core-test: Fix memory leaks in core-test mm/damon/sysfs-schemes: Do DAMOS tried regions update for only one apply interval - In the series "Do not try to access unaccepted memory" Adrian Hunter provides some fixups for the recently-added "unaccepted memory' feature. To increase the feature's checking coverage. "Plug a few gaps where RAM is exposed without checking if it is unaccepted memory". - In the series "cleanups for lockless slab shrink" Qi Zheng has done some maintenance work which is preparation for the lockless slab shrinking code. - Qi Zheng has redone the earlier (and reverted) attempt to make slab shrinking lockless in the series "use refcount+RCU method to implement lockless slab shrink". - David Hildenbrand contributes some maintenance work for the rmap code in the series "Anon rmap cleanups". - Kefeng Wang does more folio conversions and some maintenance work in the migration code. Series "mm: migrate: more folio conversion and unification". - Matthew Wilcox has fixed an issue in the buffer_head code which was causing long stalls under some heavy memory/IO loads. Some cleanups were added on the way. Series "Add and use bdev_getblk()". - In the series "Use nth_page() in place of direct struct page manipulation" Zi Yan has fixed a potential issue with the direct manipulation of hugetlb page frames. - In the series "mm: hugetlb: Skip initialization of gigantic tail struct pages if freed by HVO" has improved our handling of gigantic pages in the hugetlb vmmemmep optimizaton code. This provides significant boot time improvements when significant amounts of gigantic pages are in use. - Matthew Wilcox has sent the series "Small hugetlb cleanups" - code rationalization and folio conversions in the hugetlb code. - Yin Fengwei has improved mlock()'s handling of large folios in the series "support large folio for mlock" - In the series "Expose swapcache stat for memcg v1" Liu Shixin has added statistics for memcg v1 users which are available (and useful) under memcg v2. - Florent Revest has enhanced the MDWE (Memory-Deny-Write-Executable) prctl so that userspace may direct the kernel to not automatically propagate the denial to child processes. The series is named "MDWE without inheritance". - Kefeng Wang has provided the series "mm: convert numa balancing functions to use a folio" which does what it says. - In the series "mm/ksm: add fork-exec support for prctl" Stefan Roesch makes is possible for a process to propagate KSM treatment across exec(). - Huang Ying has enhanced memory tiering's calculation of memory distances. This is used to permit the dax/kmem driver to use "high bandwidth memory" in addition to Optane Data Center Persistent Memory Modules (DCPMM). The series is named "memory tiering: calculate abstract distance based on ACPI HMAT" - In the series "Smart scanning mode for KSM" Stefan Roesch has optimized KSM by teaching it to retain and use some historical information from previous scans. - Yosry Ahmed has fixed some inconsistencies in memcg statistics in the series "mm: memcg: fix tracking of pending stats updates values". - In the series "Implement IOCTL to get and optionally clear info about PTEs" Peter Xu has added an ioctl to /proc/<pid>/pagemap which permits us to atomically read-then-clear page softdirty state. This is mainly used by CRIU. - Hugh Dickins contributed the series "shmem,tmpfs: general maintenance" - a bunch of relatively minor maintenance tweaks to this code. - Matthew Wilcox has increased the use of the VMA lock over file-backed page faults in the series "Handle more faults under the VMA lock". Some rationalizations of the fault path became possible as a result. - In the series "mm/rmap: convert page_move_anon_rmap() to folio_move_anon_rmap()" David Hildenbrand has implemented some cleanups and folio conversions. - In the series "various improvements to the GUP interface" Lorenzo Stoakes has simplified and improved the GUP interface with an eye to providing groundwork for future improvements. - Andrey Konovalov has sent along the series "kasan: assorted fixes and improvements" which does those things. - Some page allocator maintenance work from Kemeng Shi in the series "Two minor cleanups to break_down_buddy_pages". - In thes series "New selftest for mm" Breno Leitao has developed another MM self test which tickles a race we had between madvise() and page faults. - In the series "Add folio_end_read" Matthew Wilcox provides cleanups and an optimization to the core pagecache code. - Nhat Pham has added memcg accounting for hugetlb memory in the series "hugetlb memcg accounting". - Cleanups and rationalizations to the pagemap code from Lorenzo Stoakes, in the series "Abstract vma_merge() and split_vma()". - Audra Mitchell has fixed issues in the procfs page_owner code's new timestamping feature which was causing some misbehaviours. In the series "Fix page_owner's use of free timestamps". - Lorenzo Stoakes has fixed the handling of new mappings of sealed files in the series "permit write-sealed memfd read-only shared mappings". - Mike Kravetz has optimized the hugetlb vmemmap optimization in the series "Batch hugetlb vmemmap modification operations". - Some buffer_head folio conversions and cleanups from Matthew Wilcox in the series "Finish the create_empty_buffers() transition". - As a page allocator performance optimization Huang Ying has added automatic tuning to the allocator's per-cpu-pages feature, in the series "mm: PCP high auto-tuning". - Roman Gushchin has contributed the patchset "mm: improve performance of accounted kernel memory allocations" which improves their performance by ~30% as measured by a micro-benchmark. - folio conversions from Kefeng Wang in the series "mm: convert page cpupid functions to folios". - Some kmemleak fixups in Liu Shixin's series "Some bugfix about kmemleak". - Qi Zheng has improved our handling of memoryless nodes by keeping them off the allocation fallback list. This is done in the series "handle memoryless nodes more appropriately". - khugepaged conversions from Vishal Moola in the series "Some khugepaged folio conversions". -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZULEMwAKCRDdBJ7gKXxA jhQHAQCYpD3g849x69DmHnHWHm/EHQLvQmRMDeYZI+nx/sCJOwEAw4AKg0Oemv9y FgeUPAD1oasg6CP+INZvCj34waNxwAc= =E+Y4 -----END PGP SIGNATURE----- Merge tag 'mm-stable-2023-11-01-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "Many singleton patches against the MM code. The patch series which are included in this merge do the following: - Kemeng Shi has contributed some compation maintenance work in the series 'Fixes and cleanups to compaction' - Joel Fernandes has a patchset ('Optimize mremap during mutual alignment within PMD') which fixes an obscure issue with mremap()'s pagetable handling during a subsequent exec(), based upon an implementation which Linus suggested - More DAMON/DAMOS maintenance and feature work from SeongJae Park i the following patch series: mm/damon: misc fixups for documents, comments and its tracepoint mm/damon: add a tracepoint for damos apply target regions mm/damon: provide pseudo-moving sum based access rate mm/damon: implement DAMOS apply intervals mm/damon/core-test: Fix memory leaks in core-test mm/damon/sysfs-schemes: Do DAMOS tried regions update for only one apply interval - In the series 'Do not try to access unaccepted memory' Adrian Hunter provides some fixups for the recently-added 'unaccepted memory' feature. To increase the feature's checking coverage. 'Plug a few gaps where RAM is exposed without checking if it is unaccepted memory' - In the series 'cleanups for lockless slab shrink' Qi Zheng has done some maintenance work which is preparation for the lockless slab shrinking code - Qi Zheng has redone the earlier (and reverted) attempt to make slab shrinking lockless in the series 'use refcount+RCU method to implement lockless slab shrink' - David Hildenbrand contributes some maintenance work for the rmap code in the series 'Anon rmap cleanups' - Kefeng Wang does more folio conversions and some maintenance work in the migration code. Series 'mm: migrate: more folio conversion and unification' - Matthew Wilcox has fixed an issue in the buffer_head code which was causing long stalls under some heavy memory/IO loads. Some cleanups were added on the way. Series 'Add and use bdev_getblk()' - In the series 'Use nth_page() in place of direct struct page manipulation' Zi Yan has fixed a potential issue with the direct manipulation of hugetlb page frames - In the series 'mm: hugetlb: Skip initialization of gigantic tail struct pages if freed by HVO' has improved our handling of gigantic pages in the hugetlb vmmemmep optimizaton code. This provides significant boot time improvements when significant amounts of gigantic pages are in use - Matthew Wilcox has sent the series 'Small hugetlb cleanups' - code rationalization and folio conversions in the hugetlb code - Yin Fengwei has improved mlock()'s handling of large folios in the series 'support large folio for mlock' - In the series 'Expose swapcache stat for memcg v1' Liu Shixin has added statistics for memcg v1 users which are available (and useful) under memcg v2 - Florent Revest has enhanced the MDWE (Memory-Deny-Write-Executable) prctl so that userspace may direct the kernel to not automatically propagate the denial to child processes. The series is named 'MDWE without inheritance' - Kefeng Wang has provided the series 'mm: convert numa balancing functions to use a folio' which does what it says - In the series 'mm/ksm: add fork-exec support for prctl' Stefan Roesch makes is possible for a process to propagate KSM treatment across exec() - Huang Ying has enhanced memory tiering's calculation of memory distances. This is used to permit the dax/kmem driver to use 'high bandwidth memory' in addition to Optane Data Center Persistent Memory Modules (DCPMM). The series is named 'memory tiering: calculate abstract distance based on ACPI HMAT' - In the series 'Smart scanning mode for KSM' Stefan Roesch has optimized KSM by teaching it to retain and use some historical information from previous scans - Yosry Ahmed has fixed some inconsistencies in memcg statistics in the series 'mm: memcg: fix tracking of pending stats updates values' - In the series 'Implement IOCTL to get and optionally clear info about PTEs' Peter Xu has added an ioctl to /proc/<pid>/pagemap which permits us to atomically read-then-clear page softdirty state. This is mainly used by CRIU - Hugh Dickins contributed the series 'shmem,tmpfs: general maintenance', a bunch of relatively minor maintenance tweaks to this code - Matthew Wilcox has increased the use of the VMA lock over file-backed page faults in the series 'Handle more faults under the VMA lock'. Some rationalizations of the fault path became possible as a result - In the series 'mm/rmap: convert page_move_anon_rmap() to folio_move_anon_rmap()' David Hildenbrand has implemented some cleanups and folio conversions - In the series 'various improvements to the GUP interface' Lorenzo Stoakes has simplified and improved the GUP interface with an eye to providing groundwork for future improvements - Andrey Konovalov has sent along the series 'kasan: assorted fixes and improvements' which does those things - Some page allocator maintenance work from Kemeng Shi in the series 'Two minor cleanups to break_down_buddy_pages' - In thes series 'New selftest for mm' Breno Leitao has developed another MM self test which tickles a race we had between madvise() and page faults - In the series 'Add folio_end_read' Matthew Wilcox provides cleanups and an optimization to the core pagecache code - Nhat Pham has added memcg accounting for hugetlb memory in the series 'hugetlb memcg accounting' - Cleanups and rationalizations to the pagemap code from Lorenzo Stoakes, in the series 'Abstract vma_merge() and split_vma()' - Audra Mitchell has fixed issues in the procfs page_owner code's new timestamping feature which was causing some misbehaviours. In the series 'Fix page_owner's use of free timestamps' - Lorenzo Stoakes has fixed the handling of new mappings of sealed files in the series 'permit write-sealed memfd read-only shared mappings' - Mike Kravetz has optimized the hugetlb vmemmap optimization in the series 'Batch hugetlb vmemmap modification operations' - Some buffer_head folio conversions and cleanups from Matthew Wilcox in the series 'Finish the create_empty_buffers() transition' - As a page allocator performance optimization Huang Ying has added automatic tuning to the allocator's per-cpu-pages feature, in the series 'mm: PCP high auto-tuning' - Roman Gushchin has contributed the patchset 'mm: improve performance of accounted kernel memory allocations' which improves their performance by ~30% as measured by a micro-benchmark - folio conversions from Kefeng Wang in the series 'mm: convert page cpupid functions to folios' - Some kmemleak fixups in Liu Shixin's series 'Some bugfix about kmemleak' - Qi Zheng has improved our handling of memoryless nodes by keeping them off the allocation fallback list. This is done in the series 'handle memoryless nodes more appropriately' - khugepaged conversions from Vishal Moola in the series 'Some khugepaged folio conversions'" [ bcachefs conflicts with the dynamically allocated shrinkers have been resolved as per Stephen Rothwell in https://lore.kernel.org/all/20230913093553.4290421e@canb.auug.org.au/ with help from Qi Zheng. The clone3 test filtering conflict was half-arsed by yours truly ] * tag 'mm-stable-2023-11-01-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (406 commits) mm/damon/sysfs: update monitoring target regions for online input commit mm/damon/sysfs: remove requested targets when online-commit inputs selftests: add a sanity check for zswap Documentation: maple_tree: fix word spelling error mm/vmalloc: fix the unchecked dereference warning in vread_iter() zswap: export compression failure stats Documentation: ubsan: drop "the" from article title mempolicy: migration attempt to match interleave nodes mempolicy: mmap_lock is not needed while migrating folios mempolicy: alloc_pages_mpol() for NUMA policy without vma mm: add page_rmappable_folio() wrapper mempolicy: remove confusing MPOL_MF_LAZY dead code mempolicy: mpol_shared_policy_init() without pseudo-vma mempolicy trivia: use pgoff_t in shared mempolicy tree mempolicy trivia: slightly more consistent naming mempolicy trivia: delete those ancient pr_debug()s mempolicy: fix migrate_pages(2) syscall return nr_failed kernfs: drop shared NUMA mempolicy hooks hugetlbfs: drop shared NUMA mempolicy pretence mm/damon/sysfs-test: add a unit test for damon_sysfs_set_targets() ...	2023-11-02 19:38:47 -10:00
Qi Zheng	e11c4f3acb	drm/panfrost: dynamically allocate the drm-panfrost shrinker In preparation for implementing lockless slab shrink, use new APIs to dynamically allocate the drm-panfrost shrinker, so that it can be freed asynchronously via RCU. Then it doesn't need to wait for RCU read-side critical section when releasing the struct panfrost_device. Link: https://lkml.kernel.org/r/20230911094444.68966-23-zhengqi.arch@bytedance.com Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Reviewed-by: Steven Price <steven.price@arm.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Rob Herring <robh@kernel.org> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Cc: David Airlie <airlied@gmail.com> Cc: Abhinav Kumar <quic_abhinavk@quicinc.com> Cc: Alasdair Kergon <agk@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Anna Schumaker <anna@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bob Peterson <rpeterso@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Carlos Llamas <cmllamas@google.com> Cc: Chandan Babu R <chandan.babu@oracle.com> Cc: Chao Yu <chao@kernel.org> Cc: Chris Mason <clm@fb.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Christian Koenig <christian.koenig@amd.com> Cc: Chuck Lever <cel@kernel.org> Cc: Coly Li <colyli@suse.de> Cc: Dai Ngo <Dai.Ngo@oracle.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: "Darrick J. Wong" <djwong@kernel.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Sterba <dsterba@suse.com> Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Cc: Gao Xiang <hsiangkao@linux.alibaba.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Huang Rui <ray.huang@amd.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Jan Kara <jack@suse.cz> Cc: Jason Wang <jasowang@redhat.com> Cc: Jeff Layton <jlayton@kernel.org> Cc: Jeffle Xu <jefflexu@linux.alibaba.com> Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kent Overstreet <kent.overstreet@gmail.com> Cc: Kirill Tkhai <tkhai@ya.ru> Cc: Marijn Suijten <marijn.suijten@somainline.org> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Mike Snitzer <snitzer@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Nadav Amit <namit@vmware.com> Cc: Neil Brown <neilb@suse.de> Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> Cc: Olga Kornievskaia <kolga@netapp.com> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Richard Weinberger <richard@nod.at> Cc: Rob Clark <robdclark@gmail.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Sean Paul <sean@poorly.run> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Song Liu <song@kernel.org> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Talpey <tom@talpey.com> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: Yue Hu <huyue2@coolpad.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-10-04 10:32:25 -07:00
Adrián Larumbe	f11b0417ee	drm/panfrost: Add fdinfo support GPU load metrics The drm-stats fdinfo tags made available to user space are drm-engine, drm-cycles, drm-max-freq and drm-curfreq, one per job slot. This deviates from standard practice in other DRM drivers, where a single set of key:value pairs is provided for the whole render engine. However, Panfrost has separate queues for fragment and vertex/tiler jobs, so a decision was made to calculate bus cycles and workload times separately. Maximum operating frequency is calculated at devfreq initialisation time. Current frequency is made available to user space because nvtop uses it when performing engine usage calculations. It is important to bear in mind that both GPU cycle and kernel time numbers provided are at best rough estimations, and always reported in excess from the actual figure because of two reasons: - Excess time because of the delay between the end of a job processing, the subsequent job IRQ and the actual time of the sample. - Time spent in the engine queue waiting for the GPU to pick up the next job. To avoid race conditions during enablement/disabling, a reference counting mechanism was introduced, and a job flag that tells us whether a given job increased the refcount. This is necessary, because user space can toggle cycle counting through a debugfs file, and a given job might have been in flight by the time cycle counting was disabled. The main goal of the debugfs cycle counter knob is letting tools like nvtop or IGT's gputop switch it at any time, to avoid power waste in case no engine usage measuring is necessary. Also add a documentation file explaining the possible values for fdinfo's engine keystrings and Panfrost-specific drm-curfreq-<keystr> pairs. Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230929181616.2769345-3-adrian.larumbe@collabora.com	2023-10-04 13:04:15 +02:00
Alyssa Rosenzweig	a769a7af9d	drm/panfrost: Increase MAX_PM_DOMAINS to 5 Increase the MAX_PM_DOMAINS constant from 3 to 5, to support the extra power domains required by the Mali-G57 on the MT8192. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Chen-Yu Tsai <wenst@chromium.org> Tested-by: Chen-Yu Tsai <wenst@chromium.org> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230316102041.210269-9-angelogioacchino.delregno@collabora.com	2023-03-23 10:43:17 +01:00
Paul Cercueil	53d36818ae	drm: panfrost: Remove #ifdef guards for PM related functions Use the EXPORT_GPL_RUNTIME_DEV_PM_OPS() and pm_ptr() macros to handle the PM callbacks. These macros allow the PM functions to be automatically dropped by the compiler when CONFIG_PM is disabled, without having to use #ifdef guards. This has the advantage of always compiling these functions in, independently of any Kconfig option. Thanks to that, bugs and other regressions are subsequently easier to catch. Signed-off-by: Paul Cercueil <paul@crapouillou.net> Reviewed-by: Steven Price <steven.price@arm.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221129191942.138244-3-paul@crapouillou.net	2022-12-12 13:07:01 +00:00
Steven Price	030761e097	drm/panfrost: Queue jobs on the hardware The hardware has a set of '_NEXT' registers that can hold a second job while the first is executing. Make use of these registers to enqueue a second job per slot. v5: * Fix a comment in panfrost_job_init() v3: * Fix the done/err job dequeuing logic to get a valid active state * Only enable the second slot on GPUs supporting jobchain disambiguation * Split interrupt handling in sub-functions Signed-off-by: Steven Price <steven.price@arm.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-16-boris.brezillon@collabora.com	2021-07-01 08:53:37 +02:00
Boris Brezillon	30b5d4ed5b	drm/panfrost: Kill in-flight jobs on FD close If the process who submitted these jobs decided to close the FD before the jobs are done it probably means it doesn't care about the result. v5: * Add a panfrost_exception_is_fault() helper and the DRM_PANFROST_EXCEPTION_MAX_NON_FAULT value v4: * Don't disable/restore irqs when taking the job_lock (not needed since this lock is never taken from an interrupt context) v3: * Set fence error to ECANCELED when a TERMINATED exception is received Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-15-boris.brezillon@collabora.com	2021-07-01 08:53:36 +02:00
Boris Brezillon	2905db2764	drm/panfrost: Don't reset the GPU on job faults unless we really have to If we can recover from a fault without a reset there's no reason to issue one. v3: * Drop the mention of Valhall requiring a reset on JOB_BUS_FAULT * Set the fence error to -EINVAL instead of having per-exception error codes Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-14-boris.brezillon@collabora.com	2021-07-01 08:53:35 +02:00
Boris Brezillon	ed7a34c57d	drm/panfrost: Disable the AS on unhandled page faults If we don't do that, we have to wait for the job timeout to expire before the fault jobs gets killed. v3: * Make sure the AS is re-enabled when new jobs are submitted to the context Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-12-boris.brezillon@collabora.com	2021-07-01 08:53:33 +02:00
Boris Brezillon	a11c471123	drm/panfrost: Simplify the reset serialization logic Now that we can pass our own workqueue to drm_sched_init(), we can use an ordered workqueue on for both the scheduler timeout tdr and our own reset work (which we use when the reset is not caused by a fault/timeout on a specific job, like when we have AS_ACTIVE bit stuck). This guarantees that the timeout handlers and reset handler can't run concurrently which drastically simplifies the locking. v5: * Don't call cancel_delayed_timeout() in the reset path (those works are canceled in drm_sched_stop()) v4: * Actually pass the reset workqueue to drm_sched_init() * Don't call cancel_work_sync() in panfrost_reset(). It will deadlock since it might be called from the reset work, which is executing and cancel_work_sync() will wait for the handler to return. Checking the reset pending status should avoid spurious resets v3: * New patch Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-10-boris.brezillon@collabora.com	2021-07-01 08:53:32 +02:00
Boris Brezillon	229f45788e	drm/panfrost: Expose a helper to trigger a GPU reset Expose a helper to trigger a GPU reset so we can easily trigger reset operations outside the job timeout handler. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-8-boris.brezillon@collabora.com	2021-07-01 08:53:30 +02:00
Boris Brezillon	7319965fa1	drm/panfrost: Do the exception -> string translation using a table Do the exception -> string translation using a table. This way we get rid of those magic numbers and can easily add new fields if we need to attach extra information to exception types. v4: * Don't expose exception type to userspace * Merge the enum definition and the enum -> string table declaration in the same patch v3: * Drop the error field Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-7-boris.brezillon@collabora.com	2021-07-01 08:53:29 +02:00
Boris Brezillon	6ef2f37f40	drm/panfrost: Drop the pfdev argument passed to panfrost_exception_name() Currently unused. We'll add it back if we need per-GPU definitions. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-6-boris.brezillon@collabora.com	2021-07-01 08:53:28 +02:00
Boris Brezillon	7fdc48cc63	drm/panfrost: Make sure MMU context lifetime is not bound to panfrost_priv Jobs can be in-flight when the file descriptor is closed (either because the process did not terminate properly, or because it didn't wait for all GPU jobs to be finished), and apparently panfrost_job_close() does not cancel already running jobs. Let's refcount the MMU context object so it's lifetime is no longer bound to the FD lifetime and running jobs can finish properly without generating spurious page faults. Reported-by: Icecream95 <ixn@keemail.me> Fixes: `7282f7645d` ("drm/panfrost: Implement per FD address spaces") Cc: <stable@vger.kernel.org> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210621133907.1683899-2-boris.brezillon@collabora.com	2021-06-24 09:25:56 +02:00
Alyssa Rosenzweig	3e2926f875	drm/panfrost: Add AFBC_FEATURES parameter The value of the AFBC_FEATURES register is required by userspace to determine AFBC support on Bifrost. A user on our IRC channel (#panfrost) reported a workload that raised a fault on one system's Mali G31 but worked flawlessly with another system's Mali G31. We determined the cause to be missing AFBC support on one vendor's Mali implementation -- it turns out AFBC is optional on Bifrost! Whether AFBC is supported or not is exposed in the AFBC_FEATURES register on Bifrost, which reads back as 0 on Midgard. A zero value indicates AFBC is fully supported, provided the architecture itself supports AFBC, allowing backwards-compatibility with Midgard. Bits 0 and 15 indicate that AFBC support is absent for texturing and rendering respectively. The user experiencing the fault reports that AFBC_FEATURES reads back 0x10001 on their system, confirming the architectural lack of AFBC. Userspace needs this parameter to know to disable AFBC on that chip, and perhaps others. v2: Fix typo from copy-paste fail. v3: Bump the UABI version. This commit was cherry-picked from another series so chalking this up to a rebase fail. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Cc: Rob Herring <robh@kernel.org> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210604130011.3203-1-alyssa.rosenzweig@collabora.com	2021-06-04 16:16:04 +01:00
Boris Brezillon	5bc5cc2819	drm/panfrost: Move the GPU reset bits outside the timeout handler We've fixed many races in panfrost_job_timedout() but some remain. Instead of trying to fix it again, let's simplify the logic and move the reset bits to a separate work scheduled when one of the queue reports a timeout. v5: - Simplify panfrost_scheduler_stop() (Steven Price) - Always restart the queue in panfrost_scheduler_start() even if the status is corrupted (Steven Price) v4: - Rework the logic to prevent a race between drm_sched_start() (reset work) and drm_sched_job_timedout() (timeout work) - Drop Steven's R-b - Add dma_fence annotation to the panfrost_reset() function (Daniel Vetter) v3: - Replace the atomic_cmpxchg() by an atomic_xchg() (Robin Murphy) - Add Steven's R-b v2: - Use atomic_cmpxchg() to conditionally schedule the reset work (Steven Price) Fixes: `1a11a88cfd` ("drm/panfrost: Fix job timeout handling") Cc: <stable@vger.kernel.org> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201105151704.2010667-1-boris.brezillon@collabora.com	2020-11-16 10:27:30 +00:00
Robin Murphy	268af50f38	drm/panfrost: Support cache-coherent integrations When the GPU's ACE-Lite interface is fully wired up and capable of snooping CPU caches, it may be described as "dma-coherent" in devicetree, which will already inform the DMA layer not to perform unnecessary cache maintenance. However, we still need to ensure that the GPU uses the appropriate cacheable outer-shareable attributes in order to generate the requisite snoop signals, and that CPU mappings don't create a mismatch by using a non-cacheable type either. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Tested-by: Neil Armstrong <narmstrong@baylibre.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Link: https://patchwork.freedesktop.org/patch/msgid/7024ce18c1cb1a226e918037d49175571db0b436.1600780574.git.robin.murphy@arm.com	2020-10-30 10:08:08 +01:00
Neil Armstrong	91e89097b8	drm/panfrost: add support for vendor quirk The T820, G31 & G52 GPUs integrated by Amlogic in the respective GXM, G12A/SM1 & G12B SoCs needs a quirk in the PWR registers after each reset. This adds a callback in the device compatible struct of permit this. Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> [Steven: Fix typo in commit log] Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200916150147.25753-2-narmstrong@baylibre.com	2020-09-21 10:13:50 +01:00
Clément Péron	512f21227f	drm/panfrost: dynamically alloc regulators We will later introduce regulators managed by OPP. Only alloc regulators when it's needed. This also help use to release the regulators only when they are allocated. Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Signed-off-by: Clément Péron <peron.clem@gmail.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200710095409.407087-10-peron.clem@gmail.com	2020-08-07 10:11:26 -06:00
Clément Péron	9bfacfc82f	drm/panfrost: introduce panfrost_devfreq struct Introduce a proper panfrost_devfreq to deal with devfreq variables. Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Signed-off-by: Clément Péron <peron.clem@gmail.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200710095409.407087-5-peron.clem@gmail.com	2020-08-07 10:11:26 -06:00
Nicolas Boichat	506629c868	drm/panfrost: Add support for multiple power domains When there is a single power domain per device, the core will ensure the power domain is switched on (so it is technically equivalent to having not power domain specified at all). However, when there are multiple domains, as in MT8183 Bifrost GPU, we need to handle them in driver code. Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200207052627.130118-6-drinkcat@chromium.org	2020-02-25 14:35:35 -06:00
Nicolas Boichat	3e1399bccf	drm/panfrost: Add support for multiple regulators Some GPUs, namely, the bifrost/g72 part on MT8183, have a second regulator for their SRAM, let's add support for that. We extend the framework in a generic manner so that we could support more than 2 regulators, if required. Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Reviewed-by: Steven Price <steven.price@arm.com> Reviwed-by: Mark Brown <broonie@kernel.org> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200207052627.130118-5-drinkcat@chromium.org	2020-02-25 14:35:31 -06:00
Steven Price	9e62b885f7	drm/panfrost: Simplify devfreq utilisation tracking Instead of tracking per-slot utilisation track a single value for the entire GPU. Ultimately it doesn't matter if the GPU is busy with only vertex or a combination of vertex and fragment processing - if it's busy then it's busy and devfreq should be scaling appropriately. This also makes way for being able to submit multiple jobs per slot which requires more values than the original boolean per slot. Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Signed-off-by: Steven Price <steven.price@arm.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20191025134143.14324-3-steven.price@arm.com	2019-10-29 13:01:51 -05:00
Steven Price	221bc77914	drm/panfrost: Use generic code for devfreq Use dev_pm_opp_set_rate() instead of open coding the devfreq integration, simplifying the code. Reviewed-by: Mark Brown <broonie@kernel.org> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Signed-off-by: Steven Price <steven.price@arm.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20191025134143.14324-2-steven.price@arm.com	2019-10-29 13:01:36 -05:00
Rob Herring	45d0dbd15a	drm/panfrost: Remove unnecessary hwaccess_lock spin_lock With the introduction of the as_lock to serialize address space registers, the hwaccess_lock is only used within the job code and is not protecting anything. panfrost_job_hw_submit() only accesses registers for 1 job slot and it's already serialized by drm_sched. Fixes: `7282f7645d` ("drm/panfrost: Implement per FD address spaces") Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Signed-off-by: Rob Herring <robh@kernel.org> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190826223317.28509-9-robh@kernel.org	2019-08-30 09:53:52 -05:00
Rob Herring	e316f08f1a	drm/panfrost: Remove unnecessary mmu->lock mutex There's no need to serialize io-pgtable calls and the as_lock is sufficient to serialize flush operations, so we can remove the per page table lock. Fixes: `7282f7645d` ("drm/panfrost: Implement per FD address spaces") Suggested-by: Robin Murphy <robin.murphy@arm.com> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Signed-off-by: Rob Herring <robh@kernel.org> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190826223317.28509-4-robh@kernel.org	2019-08-30 09:52:47 -05:00
Rob Herring	7282f7645d	drm/panfrost: Implement per FD address spaces Up until now, a single shared GPU address space was used. This is not ideal as there's no protection between processes and doesn't work for supporting the same GPU/CPU VA feature. Most importantly, this will hopefully mitigate Alyssa's fear of WebGL, whatever that is. Most of the changes here are moving struct drm_mm and struct panfrost_mmu objects from the per device struct to the per FD struct. The critical function is panfrost_mmu_as_get() which handles allocating and switching the h/w address spaces. There's 3 states an AS can be in: free, allocated, and in use. When a job runs, it requests an address space and then marks it not in use when job is complete(but stays assigned). The first time thru, we find a free AS in the alloc_mask and assign the AS to the FD. Then the next time thru, we most likely already have our AS and we just mark it in use with a ref count. We need a ref count because we have multiple job slots. If the job/FD doesn't have an AS assigned and there are no free ones, then we pick an allocated one not in use from our LRU list and switch the AS from the old FD to the new one. Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Steven Price <steven.price@arm.com> Signed-off-by: Rob Herring <robh@kernel.org> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190813150115.30338-1-robh@kernel.org	2019-08-19 11:34:57 -05:00
Rob Herring	73e467f60a	drm/panfrost: Consolidate reset handling Runtime PM resume and job timeouts both call the same sequence of functions, so consolidate them to a common function. This will make changing the reset related code easier. The MMU also needs some re-initialization on reset, so rework its call. In the process, we hide the address space details within the MMU code in preparation to support multiple address spaces. Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190808222200.13176-7-robh@kernel.org	2019-08-12 14:20:46 -06:00
Rob Herring	013b651013	drm/panfrost: Add madvise and shrinker support Add support for madvise and a shrinker similar to other drivers. This allows userspace to mark BOs which can be freed when there is memory pressure. Unlike other implementations, we don't depend on struct_mutex. The driver maintains a list of BOs which can be freed when the shrinker is called. Access to the list is serialized with the shrinker_lock. Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190805143358.21245-2-robh@kernel.org	2019-08-08 15:57:36 -06:00
Steven Price	4bced8bea0	drm/panfrost: Export all GPU feature registers Midgard/Bifrost GPUs have a bunch of feature registers providing details of what the hardware supports. Panfrost already reads these, this patch exports them all to user space so that the jobs created by the user space driver can be tuned for the particular hardware implementation. Signed-off-by: Steven Price <steven.price@arm.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190724105626.53552-1-steven.price@arm.com	2019-07-25 16:10:52 -06:00
Boris Brezillon	7786fd1087	drm/panfrost: Expose performance counters through unstable ioctls Expose performance counters through 2 driver specific ioctls: one to enable/disable the perfcnt block, and one to dump the counter values. There are discussions to expose global performance monitors (those counters that can't be retrieved on a per-job basis) in a consistent way, but this is likely to take time to settle on something that works for various HW/users. The ioctls are marked unstable so we can get rid of them when the time comes. We initally went for a debugfs-based interface, but this was making the transition to per-FD address space more complicated (we need to specify the namespace the GPU has to use when dumping the perf counters), hence the decision to switch back to driver specific ioctls which are passed the FD they operate on and thus will have a dedicated address space attached to them. Other than that, the implementation is pretty simple: it basically dumps all counters and copy the values to a userspace buffer. The parsing is left to userspace which has to know the specific layout that's used by the GPU (layout differs on a per-revision basis). Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190618081648.17297-5-boris.brezillon@collabora.com	2019-06-18 09:23:48 -06:00
Boris Brezillon	1e51348013	drm/panfrost: Add an helper to check the GPU generation All models with an ID >= 0x1000 are Bifrost GPUs for now (might change with new gens). Suggested-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190618081648.17297-4-boris.brezillon@collabora.com	2019-06-18 09:23:36 -06:00
Boris Brezillon	92f0ad0b1d	drm/panfrost: Add a module parameter to expose unstable ioctls We plan to expose performance counters through 2 driver specific ioctls until there's a solution to expose them in a generic way. In order to be able to deprecate those ioctls when this new infrastructure is in place we add an unsafe module parameter that will keep those ioctls hidden unless it's set to true (which also has the effect of tainting the kernel). All unstable ioctl handlers should use panfrost_unstable_ioctl_check() to check whether they're supposed to handle the request or reject it with ENOSYS. Suggested-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190618081648.17297-3-boris.brezillon@collabora.com	2019-06-18 09:23:23 -06:00
Clément Péron	b681af0bc1	drm: panfrost: add optional bus_clock Allwinner H6 has an ARM Mali-T720 MP2 which required a bus_clock. Add an optional bus_clock at the init of the panfrost driver. Signed-off-by: Clément Péron <peron.clem@gmail.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190521161102.29620-2-peron.clem@gmail.com	2019-05-22 14:23:29 -05:00
Tomeu Vizoso	aa20236784	drm/panfrost: Prevent concurrent resets If a job times out in slot 0 while a reset is performed because a job timed out in slot 1, the drm-sched core can get into a deadlock. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190418084305.45021-1-tomeu.vizoso@collabora.com	2019-04-18 09:27:34 -05:00
Rob Herring	f3ba91228e	drm/panfrost: Add initial panfrost driver This adds the initial driver for panfrost which supports Arm Mali Midgard and Bifrost family of GPUs. Currently, only the T860 and T760 Midgard GPUs have been tested. v2: - Add GPU reset on job hangs (Tomeu) - Add RuntimePM and devfreq support (Tomeu) - Fix T760 support (Tomeu) - Add a TODO file (Rob, Tomeu) - Support multiple in fences (Tomeu) - Drop support for shared fences (Tomeu) - Fill in MMU de-init (Rob) - Move register definitions back to single header (Rob) - Clean-up hardcoded job submit todos (Rob) - Implement feature setup based on features/issues (Rob) - Add remaining Midgard DT compatible strings (Rob) v3: - Add support for reset lines (Neil) - Add a MAINTAINERS entry (Rob) - Call dma_set_mask_and_coherent (Rob) - Do MMU invalidate on map and unmap. Restructure to do a single operation per map/unmap call. (Rob) - Add a missing explicit padding to struct drm_panfrost_create_bo (Rob) - Fix 0-day error: "panfrost_devfreq.c:151:9-16: ERROR: PTR_ERR applied after initialization to constant on line 150" - Drop HW_FEATURE_AARCH64_MMU conditional (Rob) - s/DRM_PANFROST_PARAM_GPU_ID/DRM_PANFROST_PARAM_GPU_PROD_ID/ (Rob) - Check drm_gem_shmem_prime_import_sg_table() error code (Rob) - Re-order power on sequence (Rob) - Move panfrost_acquire_object_fences() before scheduling job (Rob) - Add NULL checks on array pointers in job clean-up (Rob) - Rework devfreq (Tomeu) - Fix devfreq init with no regulator (Rob) - Various WS and comments clean-up (Rob) Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <maxime.ripard@bootlin.com> Cc: Sean Paul <sean@poorly.run> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Lyude Paul <lyude@redhat.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Marty E. Plummer <hanetzer@startmail.com> Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190409205427.6943-4-robh@kernel.org	2019-04-12 12:56:46 -05:00

42 commits