The surface_state_base is an offset into the batch, so we need to pass
the correct batch address for STATE_BASE_ADDRESS.
Fixes: 47f8253d2b ("drm/i915/gen7: Clear all EU/L3 residual contexts")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.7+
Link: https://patchwork.freedesktop.org/patch/msgid/20210210122728.20097-1-chris@chris-wilson.co.uk
(cherry picked from commit 1914911f4aa08ddc05bae71d3516419463e0c567)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Flush; invalidate; change registers; invalidate; flush.
Will this finally work on every device? Or will Baytrail complain again?
On the positive side, we immediately see the benefit of having hsw-gt1 in
CI.
Fixes: ace44e13e5 ("drm/i915/gt: Clear CACHE_MODE prior to clearing residuals")
Testcase: igt/gem_render_tiled_blits # hsw-gt1
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210125220247.31701-1-chris@chris-wilson.co.uk
(cherry picked from commit d30bbd62b1bfd9e0a33c3583c5a9e5d66f60cbd7)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Enabling atomic operations in L3 leads to unrecoverable GPU hangs, as
the machine stops responding milliseconds after receipt of the reset
request [GDRT]. By disabling the cached atomics, the hang do not occur
and we presume the GPU would reset normally for similar hangs.
Sadly this is a shotgun approach, but since the impact is critical it is
better to err on the safe side and work back from there.
Reported-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110998
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Jason Ekstrand <jason@jlesktrand.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20210125220152.24070-1-chris@chris-wilson.co.uk
Cc: stable@vger.kernel.org
(cherry picked from commit b267c7ae0ad5b437b068f46919b17f85000154b4)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
If we enable_breadcrumbs for a request while that request is being
removed from HW; we may see that the request is active as we take the
ce->signal_lock and proceed to attach the request to ce->signals.
However, during unsubmission after marking the request as inactive, we
see that the request has not yet been added to ce->signals and so skip
the removal. Pull the check during cancel_breadcrumbs under the same
spinlock as enabling so that we the two tests are consistent in
enable/cancel.
Otherwise, we may insert a request onto ce->signals that we expect should
not be there:
intel_context_remove_breadcrumbs:488 GEM_BUG_ON(!__i915_request_is_complete(rq))
While updating, we can note that we are always called with
irqs-disabled, due to the engine->active.lock being held at the single
caller, and so remove the irqsave/restore making it symmetric to
enable_breadcrumbs.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2931
Fixes: c18636f763 ("drm/i915: Remove requirement for holding i915_request.lock for breadcrumbs")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: <stable@vger.kernel.org> # v5.10+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210119162057.31097-1-chris@chris-wilson.co.uk
(cherry picked from commit e7004ea4f5)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
If while we are cancelling the breadcrumb signaling, we find that the
request is already completed, move it to the irq signaler and let it be
signaled.
v2: Tweak reference counting so that we only acquire a new reference on
adding to a signal list, as opposed to a hidden i915_request_put of the
caller's reference.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201126140407.31952-5-chris@chris-wilson.co.uk
(cherry picked from commit 85cc2917a3)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Since we do a bare context switch with no restore, the clear residual
kernel runs on dirty state, and we must be careful to avoid executing
with bad state from context registers inherited from a malicious client.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2955
Fixes: 09aa9e4586 ("drm/i915/gt: Restore clear-residual mitigations for Ivybridge, Baytrail")
Testcase: igt/gem_ctx_isolation # ivb,vlv
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Reviewed-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210117093015.29143-1-chris@chris-wilson.co.uk
(cherry picked from commit ace44e13e5)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Now that we are careful to always force-restore contexts upon rewinding
(where necessary), we can restore our optimisation to skip over
completed active execlists when dequeuing.
Referenecs: 35f3fd8182 ("drm/i915/execlists: Workaround switching back to a completed context")
References: 8ab3a3812a ("drm/i915/gt: Incrementally check for rewinding")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210120121718.26435-2-chris@chris-wilson.co.uk
The obj->stolen is currently used to identify an object allocated from
stolen memory. This dates back to when there were just 1.5 types of
objects, an object backed by shmemfs and an object backed by shmemfs
with a contiguous physical address. Now that we have several different
types of objects, we no longer want to treat stolen objects as a special
case.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210119214336.1463-3-chris@chris-wilson.co.uk
There is a module parameter for controlling what GuC/HuC features are
enabled. Setting to -1 means 'use the default'. However, the default
was not well defined, out of date and needs to be different across
platforms.
The default is now to disable both GuC and HuC on legacy platforms
where legacy means TGL/RKL and anything prior to Gen12. For new
platforms, the default is to load HuC but not enable GuC submission
as that has not landed yet.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20210113220724.2484897-1-John.C.Harrison@Intel.com
If we enable_breadcrumbs for a request while that request is being
removed from HW; we may see that the request is active as we take the
ce->signal_lock and proceed to attach the request to ce->signals.
However, during unsubmission after marking the request as inactive, we
see that the request has not yet been added to ce->signals and so skip
the removal. Pull the check during cancel_breadcrumbs under the same
spinlock as enabling so that we the two tests are consistent in
enable/cancel.
Otherwise, we may insert a request onto ce->signals that we expect should
not be there:
intel_context_remove_breadcrumbs:488 GEM_BUG_ON(!__i915_request_is_complete(rq))
While updating, we can note that we are always called with
irqs-disabled, due to the engine->active.lock being held at the single
caller, and so remove the irqsave/restore making it symmetric to
enable_breadcrumbs.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2931
Fixes: c18636f763 ("drm/i915: Remove requirement for holding i915_request.lock for breadcrumbs")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: <stable@vger.kernel.org> # v5.10+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210119162057.31097-1-chris@chris-wilson.co.uk
In a few places we always end up mapping the pool object with the FORCE
constraint(to prevent hitting -EBUSY) which will destroy the cached
mapping if it has a different type. As a simple first step, make the
mapping type part of the pool interface, where the behaviour is to only
give out pool objects which match the requested mapping type.
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20210119133106.66294-4-matthew.auld@intel.com
Take advantage of calling xcs_resume under a forcewake by using direct
mmio access. In particular, we can avoid the sleeping variants to allow
resume to be called from softirq context, required for engine resets.
v2: Keep the posting read at the start of resume as a guardian memory
barrier.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210119110802.22228-5-chris@chris-wilson.co.uk
During the reset of ring submission, we first stop the engine by
clearing the HEAD/TAIL and marking the ring as disabled. However, it
would be safer to disable the ring (after emptying) before resetting the
HEAD/TAIL.
Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210119110802.22228-4-chris@chris-wilson.co.uk
CI reports that Baytail requires one more invalidate after CACHE_MODE
for it to be happy.
Fixes: ace44e13e5 ("drm/i915/gt: Clear CACHE_MODE prior to clearing residuals")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210119110802.22228-1-chris@chris-wilson.co.uk
Similar to commit 49b20dbf74 ("drm/i915/gt: Perform an arbitration
check before busywaiting"), also add a check prior to the busywait
on gen8+, as we have now seen (because we added a selftest to add fault
injection into the engine resets) the same engine reset failure leading
to an indefinite wait on the ring-stop semaphore. So not a Tigerlake
specific bug after all, though it still seems odd behaviour for the
busywait as we do get the arbitration point elsewhere on a miss.
Testcase: igt_reset_fail_engine
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210117110418.3361-1-chris@chris-wilson.co.uk
Since we do a bare context switch with no restore, the clear residual
kernel runs on dirty state, and we must be careful to avoid executing
with bad state from context registers inherited from a malicious client.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2955
Fixes: 008ead6ef8 ("drm/i915/gt: Restore clear-residual mitigations for Ivybridge, Baytrail")
Fixes: 09aa9e4586 ("drm/i915/gt: Restore clear-residual mitigations for Ivybridge, Baytrail")
Testcase: igt/gem_ctx_isolation # ivb,vlv
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Reviewed-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210117093015.29143-1-chris@chris-wilson.co.uk
Since we allow removing the timeline map at runtime, there is a risk
that rq->hwsp points into a stale page. To control that risk, we hold
the RCU read lock while reading *rq->hwsp, but we missed a couple of
important barriers. First, the unpinning / removal of the timeline map
must be after all RCU readers into that map are complete, i.e. after an
rcu barrier (in this case courtesy of call_rcu()). Secondly, we must
make sure that the rq->hwsp we are about to dereference under the RCU
lock is valid. In this case, we make the rq->hwsp pointer safe during
i915_request_retire() and so we know that rq->hwsp may become invalid
only after the request has been signaled. Therefore is the request is
not yet signaled when we acquire rq->hwsp under the RCU, we know that
rq->hwsp will remain valid for the duration of the RCU read lock.
This is a very small window that may lead to either considering the
request not completed (causing a delay until the request is checked
again, any wait for the request is not affected) or dereferencing an
invalid pointer.
Fixes: 3adac4689f ("drm/i915: Introduce concept of per-timeline (context) HWSP")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: <stable@vger.kernel.org> # v5.1+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201218122421.18344-1-chris@chris-wilson.co.uk
(cherry picked from commit 9bb36cf660)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210118101755.476744-1-chris@chris-wilson.co.uk
On error we unpin and free the wa_ctx.vma, but do not clear any of the
derived flags. During lrc_init, we look at the flags and attempt to
dereference the wa_ctx.vma if they are set. To protect the error path
where we try to limp along without the wa_ctx, make sure we clear those
flags!
Reported-by: Matt Roper <matthew.d.roper@intel.com>
Fixes: 604a8f6f1e ("drm/i915/lrc: Only enable per-context and per-bb buffers if set")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v4.15+
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210108204026.20682-1-chris@chris-wilson.co.uk
(cherry-picked from 5b4dc95cf7)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210118095332.458813-1-chris@chris-wilson.co.uk
As context-in/out is now always serialised, we do not have to worry
about concurrent enabling/disable of the busy-stats and can reduce the
atomic_t active to a plain unsigned int, and the seqlock to a seqcount.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210115142331.24458-3-chris@chris-wilson.co.uk
Since schedule-in/out is now entirely serialised by the tasklet bitlock,
we do not need to worry about concurrent in/out operations and so reduce
the atomic operations to plain instructions.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210115142331.24458-1-chris@chris-wilson.co.uk
Give more flexibility to the caller, if they already have an allocated
object, in case they wish to apply some transformation to the object
prior to handing it over to the region specific initialisation step,
like in gem_create_ext where we would like to first apply the extensions
to the object.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20210114182402.840247-3-matthew.auld@intel.com
When we know that we are inside the timeline mutex, or inside the
submission flow (under active.lock or the holder's rcu lock), we know
that the rq->hwsp is stable and we can use the simpler direct version.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210114135612.13210-1-chris@chris-wilson.co.uk
Backmerging to get a common base for merging topic branches between
drm-intel-next and drm-intel-gt-next.
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
UAPI Changes:
- Deprecate I915_PMU_LAST and optimize state tracking (Tvrtko)
Avoid relying on last item ABI marker in i915_drm.h, add a
comment to mark as deprecated.
Cross-subsystem Changes:
Core Changes:
Driver Changes:
- Restore clear residuals security mitigations for Ivybridge and
Baytrail (Chris)
- Close#1858: Allow sysadmin to choose applied GPU security mitigations
through i915.mitigations=... similar to CPU (Chris)
- Fix for #2024: GPU hangs on HSW GT1 (Chris)
- Fix for #2707: Driver hang when editing UVs in Blender (Chris, Ville)
- Fix for #2797: False positive GuC loading error message (Chris)
- Fix for #2859: Missing GuC firmware for older Cometlakes (Chris)
- Lessen probability of GPU hang due to DMAR faults [reason 7,
next page table ptr is invalid] on Tigerlake (Chris)
- Fix REVID macros for TGL to fetch correct stepping (Aditya)
- Limit frequency drop to RPe on parking (Chris, Edward)
- Limit W/A 1406941453 to TGL, RKL and DG1 (Swathi)
- Make W/A 22010271021 permanent on DG1 (Lucas)
- Implement W/A 16011163337 to prevent a HS/DS hang on DG1 (Swathi)
- Only disable preemption on gen8 render engines (Chris)
- Disable arbitration around Braswell's PDP updates (Chris)
- Disable arbitration on no-preempt requests (Chris)
- Check for arbitration after writing start seqno before busywaiting (Chris)
- Retain default context state across shrinking (Venkata, CQ)
- Fix mismatch between misplaced vma check and vma insert for 32-bit
addressing userspaces (Chris, CQ)
- Propagate error for vmap() failure instead kernel NULL deref (Chris)
- Propagate error from cancelled submit due to context closure
immediately (Chris)
- Fix RCU race on HWSP tracking per request (Chris)
- Clear CMD parser shadow and GPU reloc batches (Matt A)
- Populate logical context during first pin (Maarten)
- Optimistically prune dma-resv from the shrinker (Chris)
- Fix for virtual engine ownership race (Chris)
- Remove timeslice suppression to restore fairness for virtual engines (Chris)
- Rearrange IVB/HSW workarounds properly between GT and engine (Chris)
- Taint the reset mutex with the shrinker (Chris)
- Replace direct submit with direct call to tasklet (Chris)
- Multiple corrections to virtual engine dequeue and breadcrumbs code (Chris)
- Avoid wakeref from potentially hard IRQ context in PMU (Tvrtko)
- Use raw clock for RC6 time estimation in PMU (Tvrtko)
- Differentiate OOM failures from invalid map types (Chris)
- Fix Gen9 to have 64 MOCS entries similar to Gen11 (Chris)
- Ignore repeated attempts to suspend request flow across reset (Chris)
- Remove livelock from "do_idle_maps" VT-d W/A (Chris)
- Cancel the preemption timeout early in case engine reset fails (Chris)
- Code flow optimization in the scheduling code (Chris)
- Clear the execlists timers upon reset (Chris)
- Drain the breadcrumbs just once (Chris, Matt A)
- Track the overall GT awake/busy time (Chris)
- Tweak submission tasklet flushing to avoid starvation (Chris)
- Track timelines created using the HWSP to restore on resume (Chris)
- Use cmpxchg64 for 32b compatilibity for active tracking (Chris)
- Prefer recycling an idle GGTT fence to avoid GPU wait (Chris)
- Restructure GT code organization for clearer split between GuC
and execlists (Chris, Daniele, John, Matt A)
- Remove GuC code that will remain unused by new interfaces (Matt B)
- Restructure the CS timestamp clocks code to local to GT (Chris)
- Fix error return paths in perf code (Zhang)
- Replace idr_init() by idr_init_base() in perf (Deepak)
- Fix shmem_pin_map error path (Colin)
- Drop redundant free_work worker for GEM contexts (Chris, Mika)
- Increase readability and understandability of intel_workarounds.c (Lucas)
- Defer enabling the breadcrumb interrupt to after submission (Chris)
- Deal with buddy alloc block sizes beyond 4G (Venkata, Chris)
- Encode fence specific waitqueue behaviour into the wait.flags (Chris)
- Don't cancel the breadcrumb interrupt shadow too early (Chris)
- Cancel submitted requests upon context reset (Chris)
- Use correct locks in GuC code (Tvrtko)
- Prevent use of engine->wa_ctx after error (Chris, Matt R)
- Fix build warning on 32-bit (Arnd)
- Avoid memory leak if platform would have more than 16 W/A (Tvrtko)
- Avoid unnecessary #if CONFIG_PM in PMU code (Chris, Tvrtko)
- Improve debugging output (Chris, Tvrtko, Matt R)
- Make file local variables static (Jani)
- Avoid uint*_t types in i915 (Jani)
- Selftest improvements (Chris, Matt A, Dan)
- Documentation fixes (Chris, Jose)
Signed-off-by: Dave Airlie <airlied@redhat.com>
# Conflicts:
# drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
# drivers/gpu/drm/i915/gt/intel_breadcrumbs_types.h
# drivers/gpu/drm/i915/gt/intel_lrc.c
# drivers/gpu/drm/i915/gvt/mmio_context.h
# drivers/gpu/drm/i915/i915_drv.h
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210114152232.GA21588@jlahtine-mobl.ger.corp.intel.com
Remove the extraneous inlines. The only split by the compiler that
looked dubious was execlists_schedule_out, so push the code around
slightly to move all the work into the out-of-line function.
In a normal build, bloat-o-meter shows that only the
execlists_schedule_out is contentious:
add/remove: 1/0 grow/shrink: 0/2 up/down: 803/-1532 (-729)
Function old new delta
__execlists_schedule_out - 803 +803
execlists_submission_tasklet 6488 5766 -722
execlists_reset_csb.constprop 1587 777 -810
Total: Before=1605815, After=1605086, chg -0.05%
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210113151112.15212-1-chris@chris-wilson.co.uk
In the legacy ringbuffer submission, we still had an open-coded version
of intel_engine_stop_cs() with one additional verification step. Transfer
that verification to intel_engine_stop_cs() itself, and call it.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210113204709.15020-1-chris@chris-wilson.co.uk
Since we are system_highpri_wq, we expected the heartbeat to be
scheduled promptly. However, we see delays of over 10ms upsetting our
assertions. Accept this as inevitable and bump the minimum error
threshold to 20ms (from 6 jiffies).
<6> [616.784749] rcs0: Heartbeat delay: 3570us [2802, 9188]
<6> [616.807790] bcs0: Heartbeat delay: 2111us [745, 4372]
<6> [616.853776] vcs0: Heartbeat delay: 6485us [2424, 11637]
<3> [616.859296] vcs0: Heartbeat delay was 6485us, expected less than 6000us
<3> [616.860901] i915/intel_heartbeat_live_selftests: live_heartbeat_fast failed with error -22
v2: More context from CI.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210113163115.5740-2-chris@chris-wilson.co.uk
Initialize all required entries from guc_set_default_submission, instead
of calling the execlists function. The previously inherited setup has
been copied over from the execlist code and simplified by removing the
execlists submission-specific parts.
v2: move setting of relative_mmio flag to engine_setup_common (Chris)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: John Harrison <john.c.harrison@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> #v1
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20210113021236.8164-5-daniele.ceraolospurio@intel.com
Instead of starting the engine in execlists submission mode and then
switching to GuC, start directly in GuC submission mode. The initial
setup functions have been copied over from the execlists code
and simplified by removing the execlists submission-specific parts.
v2: remove unneeded unexpected starting state check (Chris)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: John Harrison <john.c.harrison@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20210113021236.8164-4-daniele.ceraolospurio@intel.com
GuC owns the execlists state and the context IDs used for submission, so
the status of the ports and the CSB entries are not something we control
or can decode from the i915 side, therefore we can avoid dumping it. A
follow-up patch will also stop setting the csb pointers when using GuC
submission.
GuC dumps all the required events in the GuC logs when verbosity is set
high enough.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20210113021236.8164-3-daniele.ceraolospurio@intel.com
Delete GuC code unused in future patches that rewrite the GuC interface
to work with the new firmware. Most of the code deleted relates to
workqueues or execlist port. The code is safe to remove because we still
don't allow GuC submission to be enabled, even when overriding the
modparam, so it currently can't be reached.
The defines + structs for the process descriptor and workqueue remain.
Although the new GuC interface does not require either of these for the
normal submission path multi-lrc submission does. The usage of the
process descriptor and workqueue for multi-lrc will be quite different
from the code that is deleted in this patch. A future patch will
implement multi-lrc submission.
v2: add a code in the commit message about the code being safe to
remove (Chris)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <john.c.harrison@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20210113021236.8164-2-daniele.ceraolospurio@intel.com
Device local-memory should be thought of as part the GT, which means it
should also sit under gt/.
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20210112164300.356524-1-matthew.auld@intel.com
The clear-residuals mitigation is a relatively heavy hammer and under some
circumstances the user may wish to forgo the context isolation in order
to meet some performance requirement. Introduce a generic module
parameter to allow selectively enabling/disabling different mitigations.
To disable just the clear-residuals mitigation (on Ivybridge, Baytrail,
or Haswell) use the module parameter: i915.mitigations=auto,!residuals
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1858
Fixes: 47f8253d2b ("drm/i915/gen7: Clear all EU/L3 residual contexts")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: stable@vger.kernel.org # v5.7
Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210111225220.3483-3-chris@chris-wilson.co.uk
(cherry picked from commit f7452c7cbd)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
The mitigation is required for all gen7 platforms, now that it does not
cause GPU hangs, restore it for Ivybridge and Baytrail.
Fixes: 47f8253d2b ("drm/i915/gen7: Clear all EU/L3 residual contexts")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Cc: Bloomfield Jon <jon.bloomfield@intel.com>
Reviewed-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210111225220.3483-2-chris@chris-wilson.co.uk
(cherry picked from commit 008ead6ef8)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
MEDIA_STATE_VFE only accepts the 'maximum number of threads' in the
range [0, n-1] where n is #EU * (#threads/EU) with the number of threads
based on plaform and the number of EU based on the number of slices and
subslices. This is a fixed number per platform/gt, so appropriately
limit the number of threads we spawn to match the device.
v2: Oversaturate the system with tasks to force execution on every HW
thread; if the thread idles it is returned to the pool and may be reused
again before an unused thread.
v3: Fix more state commands, which was causing Baytrail to barf.
v4: STATE_CACHE_INVALIDATE requires a stall on Ivybridge
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2024
Fixes: 47f8253d2b ("drm/i915/gen7: Clear all EU/L3 residual contexts")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Randy Wright <rwright@hpe.com>
Cc: stable@vger.kernel.org # v5.7+
Reviewed-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210111225220.3483-1-chris@chris-wilson.co.uk
(cherry picked from commit eebfb32e26)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
During igt_reset_nop_engine, it was observed that an unexpected failed
engine reset lead to us busywaiting on the stop-ring semaphore (set
during the reset preparations) on the first request afterwards. There was
no explicit MI_ARB_CHECK in this sequence as the presumption was that
the failed MI_SEMAPHORE_WAIT would itself act as an arbitration point.
It did not in this circumstance, so force it.
This patch is based on the assumption that the MI_SEMAPHORE_WAIT failure
to arbitrate is a rare Tigerlake bug, similar to the lite-restore vs
semaphore issues previously seen in the CS. The explicit MI_ARB_CHECK
should always ensure that there is at least one arbitration point in the
request before the MI_SEMAPHORE_WAIT to trigger the IDLE->ACTIVE event.
Upon processing that event, we will clear the stop-ring flag and release
the semaphore from its busywait.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210112100759.32698-2-chris@chris-wilson.co.uk
On the off chance that we need to arbitrate before launching the
payload, perform the check after we signal the request is ready to
start. Assuming instantaneous processing of the CS event, the request
will then be treated as having started when we make the decisions as to
how to process that CS event.
v2: More commentary about the users of i915_request_started() as a
reminder about why we are marking the initial breadcrumb.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210112100759.32698-1-chris@chris-wilson.co.uk
The clear-residuals mitigation is a relatively heavy hammer and under some
circumstances the user may wish to forgo the context isolation in order
to meet some performance requirement. Introduce a generic module
parameter to allow selectively enabling/disabling different mitigations.
To disable just the clear-residuals mitigation (on Ivybridge, Baytrail,
or Haswell) use the module parameter: i915.mitigations=auto,!residuals
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1858
Fixes: 47f8253d2b ("drm/i915/gen7: Clear all EU/L3 residual contexts")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: stable@vger.kernel.org # v5.7
Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210111225220.3483-3-chris@chris-wilson.co.uk