- gvt-next stuff mostly with refactor for the new MDEV interface.
i915 Changes:
- PSR fixes and improvements (Jouni)
- DP DSC fixes (Vinod, Jouni)
- More general display cleanups (Jani)
- More display collor management cleanup targetting degamma (Ville)
- remove circ_buf.h includes (Jiri)
- wait power off delay at driver remove to optimize probe (Jani)
- More audio cleanup targeting the ELD precompute readout (Ville)
- Enable DC power states on all eDP ports (Imre)
- RPL-P stepping info (Matt Atwood)
- MTL enabling patches (RK)
- Removal of DG2 force_probe (Matt)
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEbSBwaO7dZQkcLOKj+mJfZA7rE8oFAmN3+28ACgkQ+mJfZA7r
E8rGZQf9HiTNW1yM1/YGq7ezD6/XN2sbCaH/iVNd0+FHxz4HHaTJaEVQ1xEBTdIb
poMQiWl64ig3t2LuDrRGwlsVmFYV3vK5byt758sMpLb/M3cWbbB2swsW4YTbORM6
qcatv9eYyuoiylwj6fVLFPMXpwNTyCdbngbLZt+tTcl1oxaZYp0PVXBkzlKFP02U
KL4p+Dv4OHSjP38beIzUaHz3IJJAik/ttsxa1JhhC8F9UUJs7kuo6GXMY1OzRNpc
jKj3+F0OtDRA2xaw7YpbLI8yt+pWzerqFsnZDAvvx8Js7SQLkyNtjkuHusp8OP77
x99MlyOfbSfBnkbdm4Rhm7IIPTiUnw==
=SHn9
-----END PGP SIGNATURE-----
Merge tag 'drm-intel-next-2022-11-18' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
GVT Changes:
- gvt-next stuff mostly with refactor for the new MDEV interface.
i915 Changes:
- PSR fixes and improvements (Jouni)
- DP DSC fixes (Vinod, Jouni)
- More general display cleanups (Jani)
- More display collor management cleanup targetting degamma (Ville)
- remove circ_buf.h includes (Jiri)
- wait power off delay at driver remove to optimize probe (Jani)
- More audio cleanup targeting the ELD precompute readout (Ville)
- Enable DC power states on all eDP ports (Imre)
- RPL-P stepping info (Matt Atwood)
- MTL enabling patches (RK)
- Removal of DG2 force_probe (Matt)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Y3f71obyEkImXoUF@intel.com
All calls to i915_vma_move_to_active are surrounded by vma lock
and/or there are multiple local helpers for it in particular tests.
Let's replace it by common helper.
The patch should not introduce functional changes.
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221019215906.295296-3-andrzej.hajda@intel.com
Since almost all calls to i915_vma_move_to_active are prepended with
i915_request_await_object, let's call the latter from
_i915_vma_move_to_active by default and add flag allowing bypassing it.
Adjust all callers accordingly.
The patch should not introduce functional changes.
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221019215906.295296-2-andrzej.hajda@intel.com
Turns out many of the files that need i915_reg.h get it implicitly via
{display/intel_de.h, gt/intel_context.h} -> i915_trace.h -> i915_irq.h
-> i915_reg.h. Since i915_trace.h doesn't actually need i915_irq.h,
makes sense to drop it, but that requires adding quite a few new
includes all over the place.
Prefer including i915_reg.h where needed instead of adding another
implicit include, because eventually we'll want to split up i915_reg.h
and only include the specific registers at each place.
Also some places actually needed i915_irq.h too.
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/6e78a2e0ac1bffaf5af3b5ccc21dff05e6518cef.1668008071.git.jani.nikula@intel.com
We rely on page_sizes.sg in setup_scratch_page() reporting the correct
value if the underlying sgl is not contiguous, however in
get_pages_internal() we are only looking at the layout of the created
pages when calculating the sg_page_sizes, and not the final sgl, which
could in theory be completely different. In such a situation we might
incorrectly think we have a 64K scratch page, when it is actually only
4K or similar split over multiple non-contiguous entries, which could
lead to broken behaviour when touching the scratch space within the
padding of a 64K GTT page-table. For most of the other backends we
already just call i915_sg_dma_sizes() on the final mapping, so rather
just move that into __i915_gem_object_set_pages() to avoid such issues
coming back to bite us later.
v2: Update missing conversion in gvt
Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Stuart Summers <stuart.summers@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221108103238.165447-1-matthew.auld@intel.com
Driver Changes:
- Fix for #7306: [Arc A380] white flickering when using arc as a
secondary gpu (Matt A)
- Add Wa_18017747507 for DG2 (Wayne)
- Avoid spurious WARN on DG1 due to incorrect cache_dirty flag
(Niranjana, Matt A)
- Corrections to CS timestamp support for Gen5 and earlier (Ville)
- Fix a build error used with clang compiler on hwmon (GG)
- Improvements to LMEM handling with RPM (Anshuman, Matt A)
- Cleanups in dmabuf code (Mike)
- Selftest improvements (Matt A)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Y2N11wu175p6qeEN@jlahtine-mobl.ger.corp.intel.com
Using PAGE_SIZE here potentially hides issues so bump that to something
larger. This should also make it possible for iommu to coalesce entries
for us. With that in place verify we can write from the GPU using the
importers sg_table, followed by checking that our writes match when read
from the CPU side.
v2: Switch over to igt_gpu_fill_dw(), which looks to be more widely
supported than the migrate stuff (at least OOTB).
References: https://gitlab.freedesktop.org/drm/intel/-/issues/7306
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Michael J. Ruhl <michael.j.ruhl@intel.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221028155029.494736-2-matthew.auld@intel.com
Since a7c01fa93a ("signal: break out of wait loops on kthread_stop()")
kthread_stop() started asserting a pending signal which wreaks havoc with
a few of our selftests. Mainly because they are not fully expecting to
handle signals, but also cutting the intended test runtimes short due
signal_pending() now returning true (via __igt_timeout), which therefore
breaks both the patterns of:
kthread_run()
..sleep for igt_timeout_ms to allow test to exercise stuff..
kthread_stop()
And check for errors recorded in the thread.
And also:
Main thread | Test thread
---------------+------------------------------
kthread_run() |
kthread_stop() | do stuff until __igt_timeout
| -- exits early due signal --
Where this kthread_stop() was assume would have a "join" semantics, which
it would have had if not the new signal assertion issue.
To recap, threads are now likely to catch a previously impossible
ERESTARTSYS or EINTR, marking the test as failed, or have a pointlessly
short run time.
To work around this start using kthread_work(er) API which provides
an explicit way of waiting for threads to exit. And for cases where
parent controls the test duration we add explicit signaling which threads
will now use instead of relying on kthread_should_stop().
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221020130841.3845791-1-tvrtko.ursulin@linux.intel.com
Prepare i915 driver to the common dynamic dma-buf locking convention
by starting to use the unlocked versions of dma-buf API functions
and handling cases where importer now holds the reservation lock.
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221017172229.42269-7-dmitry.osipenko@collabora.com
It turns out that on production DG2/ATS HW we should have support for
PS64. This feature allows to provide a 64K TLB hint at the PTE level,
which is a lot more flexible than the current method of enabling 64K GTT
pages for the entire page-table, since that leads to all kinds of
annoying restrictions, as documented in:
commit caa574ffc4
Author: Matthew Auld <matthew.auld@intel.com>
Date: Sat Feb 19 00:17:49 2022 +0530
drm/i915/uapi: document behaviour for DG2 64K support
On discrete platforms like DG2, we need to support a minimum page size
of 64K when dealing with device local-memory. This is quite tricky for
various reasons, so try to document the new implicit uapi for this.
With PS64, we can now drop the 2M GTT alignment restriction, and instead
only require 64K or larger when dealing with lmem. We still use the
compact-pt layout when possible, but only when we are certain that this
doesn't interfere with userspace.
Note that this is a change in uAPI behaviour, but hopefully shouldn't be
a concern (IGT is at least able to autodetect the alignment), since we
are only making the GTT alignment constraint less restrictive.
Based on a patch from CQ Tang.
v2: update the comment wrt scratch page
v3: (Nirmoy)
- Fix the selftest to actually use the random size, plus some comment
improvements, also drop the rem stuff.
Reported-by: Michal Mrozek <michal.mrozek@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Stuart Summers <stuart.summers@intel.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Yang A Shi <yang.a.shi@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221004114915.221708-1-matthew.auld@intel.com
Daniele needs 84d4333c1e ("misc/mei: Add NULL check to component match
callback functions") in order to merge the DG2 HuC patches.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
The inline function has no place in i915_drv.h. Move it away, un-inline,
and untangle some header dependencies while at it.
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220914163514.1837467-1-jani.nikula@intel.com
drm/i915 feature pull for v6.1:
Features and functionality:
- Early Meteorlake (MTL) enabling (José, Radhakrishna, Clint, Imre, Vandita, Ville, Jani)
- Support more HDMI pixel clock frequencies on DG2 (Clint)
- Sanity check PCI BARs (Piotr Piórkowski)
- Enable DC5 on DG2 (Anusha)
- DG2 DMC firmware version bump to v2.07 (Madhumitha)
- New ADL-S PCI ID (José)
Refactoring and cleanups:
- Add display sub-struct to struct drm_i915_private (Jani)
- Add initial runtime info to device info (Jani)
- Split out HDCP and backlight registers to separate files (Jani)
Fixes:
- Skip wm/ddb readout for disabled pipes (Ville)
- HDMI port timing quirk for GLK ECS Liva Q2 (Diego Santa Cruz)
- Fix bw init null pointer dereference (Łukasz Bartosik)
- Disable PPS power hook for DP AUX backlight (Jouni)
- Avoid warnings on registering multiple backlight devices (Arun)
- Fix dual-link DSI backlight and CABC ports for display 11+ (Jani)
- Fix Type-C PHY ownership programming in HDMI legacy mode (Imre)
- Fix unclaimed register access while loading PIPEDMC-C/D (Imre)
- Bump up CDCLK for DG2 (Stan)
- Prune modes that require HDMI 2.1 FRL (Ankit)
- Disable FBC when PSR1 is enabled in display 12-13 (Matt)
- Fix TGL+ HDMI transcoder clock and DDI BUF disable order (Imre)
- Disable PSR before disable pipe (José)
- Disable DMC handlers during firmware loading/disabling on display 12+ (Imre)
- Disable clock gating for PIPEDMC-A/B as a workaround (Imre)
Merges:
- Two drm-next backmerges (Rodrigo, Jani)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87k06rfaku.fsf@intel.com
So far, different views (normal, partial, rotated and remapped)
into the same object are only supported for GGTT mappings.
But with the upcoming VM_BIND feature, PPGTT will also use the
partial view mapping. Hence rename ggtt_view to more generic
gtt_view.
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220901183854.3446-1-niranjana.vishwanathapura@intel.com
This will help in an upcoming patch where the live selftest wrappers
are extended to do more.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: John Harrison <john.c.harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220817020511.2180747-2-alan.previn.teres.alexis@intel.com
If the move or clear operation somehow fails, and the memory underneath
is not cleared, like when moving to lmem, then we currently fallback to
memcpy or memset. However with small-BAR systems this fallback might no
longer be possible. For now we use the set_wedged sledgehammer if we
ever encounter such a scenario, and mark the object as borked to plug
any holes where access to the memory underneath can happen. Add some
basic selftests to exercise this.
v2:
- In the selftests make sure we grab the runtime pm around the reset.
Also make sure we grab the reset lock before checking if the device
is wedged, since the wedge might still be in-progress and hence the
bit might not be set yet.
- Don't wedge or put the object into an unknown state, if the request
construction fails (or similar). Just returning an error and
skipping the fallback should be safe here.
- Make sure we wedge each gt. (Thomas)
- Peek at the unknown_state in io_reserve, that way we don't have to
export or hand roll the fault_wait_for_idle. (Thomas)
- Add the missing read-side barriers for the unknown_state. (Thomas)
- Some kernel-doc fixes. (Thomas)
v3:
- Tweak the ordering of the set_wedged, also add FIXME.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-11-matthew.auld@intel.com
We should always be explicit and allocate a fence slot before adding a
new fence.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-10-matthew.auld@intel.com
It's not supported, and just skips later anyway. With small-BAR things
get more complicated since all of stolen is likely not even CPU
accessible, hence not passing I915_BO_ALLOC_GPU_ONLY just results in the
object create failing.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-9-matthew.auld@intel.com
Update the selftest to include Tile 4 mode and switch to Tile 4 on
platforms that supports Tile 4 but no Tile Y and vice versa.
Also switch to XY_FAST_COPY_BLT on platforms that supports it.
v4: update commit message to reflect the code changes properly.
v3: add a function to find X-tile availability for a platform.
v2: disable Tile X for iGPU in fastblit and
fix checkpath --strict warnings.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5879
Signed-off-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
Co-developed-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220516082015.32020-1-nirmoy.das@intel.com
In order to get the GSC Support merged on drm-intel-gt-next
in a clean fashion we needed this ATS-M patch to avoid
conflict in i915_pci.c:
commit 412c942bdf ("drm/i915/ats-m: add ATS-M platform info")
--
Fixing a silent conflict on drivers/gpu/drm/i915/gt/intel_gt_gmch.c:
- if (!intel_vtd_active(i915))
+ if (!i915_vtd_active(i915))
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
drm/drm-next has a build fix for the NewVision NV3052C panel
(drivers/gpu/drm/panel/panel-newvision-nv3052c.c), which needs to be
merged back to drm-misc-next, as it was failing to build there.
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Sync up with v5.18-rc1, in particular to get 5e3094cfd9
("drm/i915/xehpsdv: Add has_flat_ccs to device info").
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Instead of distingting between shared and exclusive fences specify
the fence usage while adding fences.
Rework all drivers to use this interface instead and deprecate the old one.
v2: some kerneldoc comments suggested by Daniel
v3: fix a missing case in radeon
v4: rebase on nouveau changes, fix lockdep and temporary disable warning
v5: more documentation updates
v6: separate internal dma_resv changes from this patch, avoids to
disable warning temporary, rebase on upstream changes
v7: fix missed case in lima driver, minimize changes to i915_gem_busy_ioctl
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220407085946.744568-3-christian.koenig@amd.com
This change adds the dma_resv_usage enum and allows us to specify why a
dma_resv object is queried for its containing fences.
Additional to that a dma_resv_usage_rw() helper function is added to aid
retrieving the fences for a read or write userspace submission.
This is then deployed to the different query functions of the dma_resv
object and all of their users. When the write paratermer was previously
true we now use DMA_RESV_USAGE_WRITE and DMA_RESV_USAGE_READ otherwise.
v2: add KERNEL/OTHER in separate patch
v3: some kerneldoc suggestions by Daniel
v4: some more kerneldoc suggestions by Daniel, fix missing cases lost in
the rebase pointed out by Bas.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220407085946.744568-2-christian.koenig@amd.com
Audit all the users of dma_resv_add_excl_fence() and make sure they
reserve a shared slot also when only trying to add an exclusive fence.
This is the next step towards handling the exclusive fence like a
shared one.
v2: fix missed case in amdgpu
v3: and two more radeon, rename function
v4: add one more case to TTM, fix i915 after rebase
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220406075132.3263-2-christian.koenig@amd.com
With the upcoming multitile support each tile will have its own
local memory. Mark the current LMEM with the suffix '0' to
emphasise that it belongs to the root tile.
Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220318233938.149744-2-andi.shyti@linux.intel.com
vms are not getting properly closed. Rather than fixing that,
Remove the vm open count and instead rely on the vm refcount.
The vm open count existed solely to break the strong references the
vmas had on the vms. Now instead make those references weak and
ensure vmas are destroyed when the vm is destroyed.
Unfortunately if the vm destructor and the object destructor both
wants to destroy a vma, that may lead to a race in that the vm
destructor just unbinds the vma and leaves the actual vma destruction
to the object destructor. However in order for the object destructor
to ensure the vma is unbound it needs to grab the vm mutex. In order
to keep the vm mutex alive until the object destructor is done with
it, somewhat hackishly grab a vm_resv refcount that is released late
in the vma destruction process, when the vm mutex is no longer needed.
v2: Address review-comments from Niranjana
- Clarify that the struct i915_address_space::skip_pte_rewrite is a hack
and should ideally be replaced in an upcoming patch.
- Remove an unneeded continue in clear_vm_list and update comment.
v3:
- Documentation update
- Commit message formatting
Co-developed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220304082641.308069-2-thomas.hellstrom@linux.intel.com
Include linux/highmem.h and linux/swap.h explicitly where needed so we
can drop the linux/i2c.h include from i915_drv.h where it pulled in the
dependencies implicitly.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220303181931.1661767-5-jani.nikula@intel.com
Remove the local yesno() implementation and adopt the str_yes_no() from
linux/string_helpers.h.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220225234631.3725943-1-lucas.demarchi@intel.com
The compute engine handles the same commands the render engine can
(except 3D pipeline), so it makes sense that CCS is more similar to RCS
than non-render engines.
The CCS context state (lrc) is also similar to the render one, so reuse
it. Note that the compute engine has its own CTX_R_PWR_CLK_STATE
register.
In order to avoid having multiple RCS && CCS checks, add the following
engine flag:
- I915_ENGINE_HAS_RCS_REG_STATE - use the render (larger) reg state ctx.
BSpec: 46260
Original-author: Michel Thierry
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220301231549.1817978-6-matthew.d.roper@intel.com
Exercise each of the migration scenarios, verifying that the final
placement and buffer contents match our expectations.
v2(Thomas): Replace for_i915_gem_ww() block with simpler object_lock()
v3:
- For testing purposes allow forcing the io_size such that we can
exercise the allocation + migration path on devices that don't have the
small BAR limit.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220228123607.580432-4-matthew.auld@intel.com
If we have to contend with non-mappable LMEM, then we need to ensure the
object fits within the mappable portion, like in the selftests, where we
later try to CPU access the pages. However if it can't then we need to
gracefully handle this, without throwing an error.
Also it looks like TTM will return -ENOMEM, in ttm_bo_mem_space() after
exhausting all possible placements.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220228123607.580432-3-matthew.auld@intel.com
It's unclear what reference the initial vma kref reference refers to.
A vma can have multiple weak references, the object vma list,
the vm's bound list and the GT's closed_list, and the initial vma
reference can be put from lookups of all these lists.
With the current implementation this means
that any holder of yet another vma refcount (currently only
i915_gem_object_unbind()) needs to be holding two of either
*) An object refcount,
*) A vm open count
*) A vma open count
in order for us to not risk leaking a reference by having the
initial vma reference being put twice.
Address this by re-introducing i915_vma_destroy() which removes all
weak references of the vma and *then* puts the initial vma refcount.
This makes a strong vma reference hold on to the vma unconditionally.
Perhaps a better name would be i915_vma_revoke() or i915_vma_zombify(),
since other callers may still hold a refcount, but with the prospect of
being able to replace the vma refcount with the object lock in the near
future, let's stick with i915_vma_destroy().
Finally this commit fixes a race in that previously i915_vma_release() and
now i915_vma_destroy() could destroy a vma without taking the vm->mutex
after an advisory check that the vma mm_node was not allocated.
This would race with the ungrab_vma() function creating a trace similar
to the below one. This was fixed in one of the __i915_vma_put() callsites
in
commit bc1922e5d3 ("drm/i915: Fix a race between vma / object destruction and unbinding")
but although not seemingly triggered by CI, that
is not sufficient. This patch is needed to fix that properly.
[823.012188] Console: switching to colour dummy device 80x25
[823.012422] [IGT] gem_ppgtt: executing
[823.016667] [IGT] gem_ppgtt: starting subtest blt-vs-render-ctx0
[852.436465] stack segment: 0000 [#1] PREEMPT SMP NOPTI
[852.436480] CPU: 0 PID: 3200 Comm: gem_ppgtt Not tainted 5.16.0-CI-CI_DRM_11115+ #1
[852.436489] Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-P DDR5 RVP, BIOS ADLPFWI1.R00.2422.A00.2110131104 10/13/2021
[852.436499] RIP: 0010:ungrab_vma+0x9/0x80 [i915]
[852.436711] Code: ef e8 4b 85 cf e0 e8 36 a3 d6 e0 8b 83 f8 9c 00 00 85 c0 75 e1 5b 5d 41 5c 41 5d c3 e9 d6 fd 14 00 55 53 48 8b af c0 00 00 00 <8b> 45 00 85 c0 75 03 5b 5d c3 48 8b 85 a0 02 00 00 48 89 fb 48 8b
[852.436727] RSP: 0018:ffffc90006db7880 EFLAGS: 00010246
[852.436734] RAX: 0000000000000000 RBX: ffffc90006db7598 RCX: 0000000000000000
[852.436742] RDX: ffff88815349e898 RSI: ffff88815349e858 RDI: ffff88810a284140
[852.436748] RBP: 6b6b6b6b6b6b6b6b R08: ffff88815349e898 R09: ffff88815349e8e8
[852.436754] R10: 0000000000000001 R11: 0000000051ef1141 R12: ffff88810a284140
[852.436762] R13: 0000000000000000 R14: ffff88815349e868 R15: ffff88810a284458
[852.436770] FS: 00007f5c04b04e40(0000) GS:ffff88849f000000(0000) knlGS:0000000000000000
[852.436781] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[852.436788] CR2: 00007f5c04b38fe0 CR3: 000000010a6e8001 CR4: 0000000000770ef0
[852.436797] PKRU: 55555554
[852.436801] Call Trace:
[852.436806] <TASK>
[852.436811] i915_gem_evict_for_node+0x33c/0x3c0 [i915]
[852.437014] i915_gem_gtt_reserve+0x106/0x130 [i915]
[852.437211] i915_vma_pin_ww+0x8f4/0xb60 [i915]
[852.437412] eb_validate_vmas+0x688/0x860 [i915]
[852.437596] i915_gem_do_execbuffer+0xc0e/0x25b0 [i915]
[852.437770] ? deactivate_slab+0x5f2/0x7d0
[852.437778] ? _raw_spin_unlock_irqrestore+0x50/0x60
[852.437789] ? i915_gem_execbuffer2_ioctl+0xc6/0x2c0 [i915]
[852.437944] ? init_object+0x49/0x80
[852.437950] ? __lock_acquire+0x5e6/0x2580
[852.437963] i915_gem_execbuffer2_ioctl+0x116/0x2c0 [i915]
[852.438129] ? i915_gem_do_execbuffer+0x25b0/0x25b0 [i915]
[852.438300] drm_ioctl_kernel+0xac/0x140
[852.438310] drm_ioctl+0x201/0x3d0
[852.438316] ? i915_gem_do_execbuffer+0x25b0/0x25b0 [i915]
[852.438490] __x64_sys_ioctl+0x6a/0xa0
[852.438498] do_syscall_64+0x37/0xb0
[852.438507] entry_SYSCALL_64_after_hwframe+0x44/0xae
[852.438515] RIP: 0033:0x7f5c0415b317
[852.438523] Code: b3 66 90 48 8b 05 71 4b 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 41 4b 2d 00 f7 d8 64 89 01 48
[852.438542] RSP: 002b:00007ffd765039a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[852.438553] RAX: ffffffffffffffda RBX: 000055e4d7829dd0 RCX: 00007f5c0415b317
[852.438562] RDX: 00007ffd76503a00 RSI: 00000000c0406469 RDI: 0000000000000017
[852.438571] RBP: 00007ffd76503a00 R08: 0000000000000000 R09: 0000000000000081
[852.438579] R10: 00000000ffffff7f R11: 0000000000000246 R12: 00000000c0406469
[852.438587] R13: 0000000000000017 R14: 00007ffd76503a00 R15: 0000000000000000
[852.438598] </TASK>
[852.438602] Modules linked in: snd_hda_codec_hdmi i915 mei_hdcp x86_pkg_temp_thermal snd_hda_intel snd_intel_dspcfg drm_buddy coretemp crct10dif_pclmul crc32_pclmul snd_hda_codec ttm ghash_clmulni_intel snd_hwdep snd_hda_core e1000e drm_dp_helper ptp snd_pcm mei_me drm_kms_helper pps_core mei syscopyarea sysfillrect sysimgblt fb_sys_fops prime_numbers intel_lpss_pci smsc75xx usbnet mii
[852.440310] ---[ end trace e52cdd2fe4fd911c ]---
v2: Fix typos in the commit message.
Fixes: 7e00897be8 ("drm/i915: Add object locking to i915_gem_evict_for_node and i915_gem_evict_something, v2.")
Fixes: bc1922e5d3 ("drm/i915: Fix a race between vma / object destruction and unbinding")
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220222133209.587978-1-thomas.hellstrom@linux.intel.com
With small LMEM-BAR we need to be able to differentiate between the
total size of LMEM, and how much of it is CPU mappable. The end goal is
to be able to utilize the entire range, even if part of is it not CPU
accessible.
v2: also update intelfb_create
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220225145502.331818-1-matthew.auld@intel.com
UAPI Changes:
- Weak parallel submission support for execlists
Minimal implementation of the parallel submission support for
execlists backend that was previously only implemented for GuC.
Support one sibling non-virtual engine.
Core Changes:
- Two backmerges of drm/drm-next for header file renames/changes and
i915_regs reorganization
Driver Changes:
- Add new DG2 subplatform: DG2-G12 (Matt R)
- Add new DG2 workarounds (Matt R, Ram, Bruce)
- Handle pre-programmed WOPCM registers for DG2+ (Daniele)
- Update guc shim control programming on XeHP SDV+ (Daniele)
- Add RPL-S C0/D0 stepping information (Anusha)
- Improve GuC ADS initialization to work on ARM64 on dGFX (Lucas)
- Fix KMD and GuC race on accessing PMU busyness (Umesh)
- Use PM timestamp instead of RING TIMESTAMP for reference in PMU with GuC (Umesh)
- Report error on invalid reset notification from GuC (John)
- Avoid WARN splat by holding RPM wakelock during PXP unbind (Juston)
- Fixes to parallel submission implementation (Matt B.)
- Improve GuC loading status check/error reports (John)
- Tweak TTM LRU priority hint selection (Matt A.)
- Align the plane_vma to min_page_size of stolen mem (Ram)
- Introduce vma resources and implement async unbinding (Thomas)
- Use struct vma_resource instead of struct vma_snapshot (Thomas)
- Return some TTM accel move errors instead of trying memcpy move (Thomas)
- Fix a race between vma / object destruction and unbinding (Thomas)
- Remove short-term pins from execbuf (Maarten)
- Update to GuC version 69.0.3 (John, Michal Wa.)
- Improvements to GT reset paths in GuC backend (Matt B.)
- Use shrinker_release_pages instead of writeback in shmem object hooks (Matt A., Tvrtko)
- Use trylock instead of blocking lock when freeing GEM objects (Maarten)
- Allocate intel_engine_coredump_alloc with ALLOW_FAIL (Matt B.)
- Fixes to object unmapping and purging (Matt A)
- Check for wedged device in GuC backend (John)
- Avoid lockdep splat by locking dpt_obj around set_cache_level (Maarten)
- Allow dead vm to unbind vma's without lock (Maarten)
- s/engine->i915/i915/ for DG2 engine workarounds (Matt R)
- Use to_gt() helper for GGTT accesses (Michal Wi.)
- Selftest improvements (Matt B., Thomas, Ram)
- Coding style and compiler warning fixes (Matt B., Jasmine, Andi, Colin, Gustavo, Dan)
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Yg4i2aCZvvee5Eai@jlahtine-mobl.ger.corp.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Fixed conflicts while applying, using the fixups/drm-intel-gt-next.patch
from drm-rerere's 1f2b1742abdd ("2022y-02m-23d-16h-07m-57s UTC: drm-tip
rerere cache update")]
discrete cards optimise 64K GTT pages for local-memory, since everything
should be allocated at 64K granularity. We say goodbye to sparse
entries, and instead get a compact 256B page-table for 64K pages,
which should be more cache friendly. 4K pages for local-memory
are no longer supported by the HW.
v4: don't return uninitialized err in igt_ppgtt_compact
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220218184752.7524-8-ramalingam.c@intel.com
For local-memory objects we need to align the GTT addresses
to 64K, both for the ppgtt and ggtt.
We need to support vm->min_alignment > 4K, depending
on the vm itself and the type of object we are inserting.
With this in mind update the GTT selftests to take this
into account.
For compact-pt we further align and pad lmem object GTT addresses
to 2MB to ensure PDEs contain consistent page sizes as
required by the HW.
v3:
* use needs_compact_pt flag to discriminate between
64K and 64K with compact-pt
* add i915_vm_obj_min_alignment
* use i915_vm_obj_min_alignment to round up vma reservation
if compact-pt instead of hard coding
v5:
* fix i915_vm_obj_min_alignment for internal objects which
have no memory region
v6:
* tiled_blits_create correctly pick largest required alignment
v8:
* i915_vm_min_alignment protect against array overflow for mock region
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220218184752.7524-7-ramalingam.c@intel.com