1
0
Fork 0
mirror of synced 2025-03-06 20:59:54 +01:00
Commit graph

53 commits

Author SHA1 Message Date
Matthew Brost
6b540bf6f1 drm/i915/guc: Implement multi-lrc submission
Implement multi-lrc submission via a single workqueue entry and single
H2G. The workqueue entry contains an updated tail value for each
request, of all the contexts in the multi-lrc submission, and updates
these values simultaneously. As such, the tasklet and bypass path have
been updated to coalesce requests into a single submission.

v2:
 (John Harrison)
  - s/wqe/wqi
  - Use FIELD_PREP macros
  - Add GEM_BUG_ONs ensures length fits within field
  - Add comment / white space to intel_guc_write_barrier
 (Kernel test robot)
  - Make need_tasklet a static function
v3:
 (Docs)
  - A comment for submission_stall_reason
v4:
 (Kernel test robot)
  - Initialize return value in bypass tasklt submit function
 (John Harrison)
  - Add comment near work queue defs
  - Add BUILD_BUG_ON to ensure WQ_SIZE is a power of 2
  - Update write_barrier comment to talk about work queue
v5:
 (John Harrison)
  - Fix typo in work queue comment

Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211014172005.27155-13-matthew.brost@intel.com
2021-10-15 10:37:40 -07:00
Michal Wajdeczko
fb2d2de353 drm/i915/guc: Move and improve error message for missed CTB reply
If we timeout waiting for a CT reply we print very simple error
message. Improve that and by moving error reporting to the caller
we can use CT_ERROR instead of DRM_ERROR and report just fence
as error code will be reported later anyway.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210926184545.1407-5-michal.wajdeczko@intel.com
2021-10-01 12:04:24 -07:00
Michal Wajdeczko
0e9deac513 drm/i915/guc: Print error name on CTB send failure
Instead of plain error value (%d) print more user friendly error
name (%pe).

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210926184545.1407-4-michal.wajdeczko@intel.com
2021-10-01 12:04:24 -07:00
Michal Wajdeczko
0de9765da5 drm/i915/guc: Print error name on CTB (de)registration failure
Instead of plain error value (%d) print more user friendly error
name (%pe).

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210926184545.1407-3-michal.wajdeczko@intel.com
2021-10-01 12:04:23 -07:00
Michal Wajdeczko
217ecd310d drm/i915/guc: Verify result from CTB (de)register action
In commit b839a869df ("drm/i915/guc: Add support for data
reporting in GuC responses") we missed the hypothetical case
that GuC might return positive non-zero value as success data.

While that would be lucky treated as error case, and at the
end will result in reporting valid -EIO, in the meantime this
value will be passed to ERR_PTR that could be misleading.

v2: rebased

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210926184545.1407-2-michal.wajdeczko@intel.com
2021-10-01 12:04:23 -07:00
Matthew Brost
d67e3d5a5d drm/i915/guc: Process all G2H message at once in work queue
Rather than processing 1 G2H at a time and re-queuing the work queue if
more messages exist, process all the G2H in a single pass of the work
queue.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-6-matthew.brost@intel.com
2021-09-13 11:30:29 -07:00
John Harrison
d75dc57fee drm/i915/guc: Don't complain about reset races
It is impossible to seal all race conditions of resets occurring
concurrent to other operations. At least, not without introducing
excesive mutex locking. Instead, don't complain if it occurs. In
particular, don't complain if trying to send a H2G during a reset.
Whatever the H2G was about should get redone once the reset is over.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-17-matthew.brost@intel.com
2021-07-27 17:31:57 -07:00
Matthew Brost
f7957e603c drm/i915/guc: Handle engine reset failure notification
GuC will notify the driver, via G2H, if it fails to
reset an engine. We recover by resorting to a full GPU
reset.

v2:
 (John Harrison):
  - s/drm_dbg/drm_err

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Fernando Pacheco <fernando.pacheco@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-14-matthew.brost@intel.com
2021-07-27 17:31:52 -07:00
Matthew Brost
1e0fd2b5da drm/i915/guc: Handle context reset notification
GuC will issue a reset on detecting an engine hang and will notify
the driver via a G2H message. The driver will service the notification
by resetting the guilty context to a simple state or banning it
completely.

v2:
 (John Harrison)
  - Move msg[0] lookup after length check
v3:
 (John Harrison)
  - s/drm_dbg/drm_err

Cc: Matthew Brost <matthew.brost@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-13-matthew.brost@intel.com
2021-07-27 17:31:50 -07:00
Matthew Brost
28ff6520a3 drm/i915/guc: Update GuC debugfs to support new GuC
Update GuC debugfs to support the new GuC structures.

v2:
 (John Harrison)
  - Remove intel_lrc_reg.h include from i915_debugfs.c
 (Michal)
  - Rename GuC debugfs functions

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210721215101.139794-17-matthew.brost@intel.com
2021-07-22 10:07:27 -07:00
Matthew Brost
b97060a99b drm/i915/guc: Update intel_gt_wait_for_idle to work with GuC
When running the GuC the GPU can't be considered idle if the GuC still
has contexts pinned. As such, a call has been added in
intel_gt_wait_for_idle to idle the UC and in turn the GuC by waiting for
the number of unpinned contexts to go to zero.

v2: rtimeout -> remaining_timeout
v3: Drop unnecessary includes, guc_submission_busy_loop ->
guc_submission_send_busy_loop, drop negatie timeout trick, move a
refactor of guc_context_unpin to earlier path (John H)
v4: Add stddef.h back into intel_gt_requests.h, sort circuit idle
function if not in GuC submission mode

Cc: John Harrison <john.c.harrison@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210721215101.139794-16-matthew.brost@intel.com
2021-07-22 10:07:23 -07:00
Matthew Brost
f4eb1f3fe9 drm/i915/guc: Ensure G2H response has space in buffer
Ensure G2H response has space in the buffer before sending H2G CTB as
the GuC can't handle any backpressure on the G2H interface.

v2:
 (Matthew)
  - s/INTEL_GUC_SEND/INTEL_GUC_CT_SEND
v3:
 (Matthew)
  - Add G2H credit accounting to blocking path, add g2h_release_space
    helper
 (John H)
  - CTB_G2H_BUFFER_SIZE / 4 == G2H_ROOM_BUFFER_SIZE

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210721215101.139794-15-matthew.brost@intel.com
2021-07-22 10:07:21 -07:00
Matthew Brost
e0717063cc drm/i915/guc: Defer context unpin until scheduling is disabled
With GuC scheduling, it isn't safe to unpin a context while scheduling
is enabled for that context as the GuC may touch some of the pinned
state (e.g. LRC). To ensure scheduling isn't enabled when an unpin is
done, a call back is added to intel_context_unpin when pin count == 1
to disable scheduling for that context. When the response CTB is
received it is safe to do the final unpin.

Future patches may add a heuristic / delay to schedule the disable
call back to avoid thrashing on schedule enable / disable.

v2:
 (John H)
  - s/drm_dbg/drm_err
 (Daneiel)
  - Clean up sched state function

Cc: John Harrison <john.c.harrison@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210721215101.139794-9-matthew.brost@intel.com
2021-07-22 10:07:13 -07:00
Matthew Brost
3a4cdf1982 drm/i915/guc: Implement GuC context operations for new inteface
Implement GuC context operations which includes GuC specific operations
alloc, pin, unpin, and destroy.

v2:
 (Daniel Vetter)
  - Use msleep_interruptible rather than cond_resched in busy loop
 (Michal)
  - Remove C++ style comment
v3:
 (Matthew Brost)
  - Drop GUC_ID_START
 (John Harrison)
  - Fix a bunch of typos
  - Use drm_err rather than drm_dbg for G2H errors
 (Daniele)
  - Fix ;; typo
  - Clean up sched state functions
  - Add lockdep for guc_id functions
  - Don't call __release_guc_id when guc_id is invalid
  - Use MISSING_CASE
  - Add comment in guc_context_pin
  - Use shorter path to rpm
 (Daniele / CI)
  - Don't call release_guc_id on an invalid guc_id in destroy
v4:
 (Daniel Vetter)
  - Add FIXME comment

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210721215101.139794-7-matthew.brost@intel.com
2021-07-22 10:07:08 -07:00
John Harrison
3101e9952b drm/i915/guc: Module load failure test for CT buffer creation
Add several module failure load inject points in the CT buffer creation
code path.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210708162055.129996-8-matthew.brost@intel.com
2021-07-13 13:50:06 -07:00
Matthew Brost
75452167a2 drm/i915/guc: Optimize CTB writes and reads
CTB writes are now in the path of command submission and should be
optimized for performance. Rather than reading CTB descriptor values
(e.g. head, tail) which could result in accesses across the PCIe bus,
store shadow local copies and only read/write the descriptor values when
absolutely necessary. Also store the current space in the each channel
locally.

v2:
 (Michal)
  - Add additional sanity checks for head / tail pointers
  - Use GUC_CTB_HDR_LEN rather than magic 1
v3:
 (Michal / John H)
  - Drop redundant check of head value
v4:
 (John H)
  - Drop redundant checks of tail / head values
v5:
 (Michal)
  - Address more nits
v6:
 (Michal)
  - Add GEM_BUG_ON sanity check on ctb->space

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210708162055.129996-7-matthew.brost@intel.com
2021-07-13 13:50:05 -07:00
Matthew Brost
b43b995048 drm/i915/guc: Add stall timer to non blocking CTB send function
Implement a stall timer which fails H2G CTBs once a period of time
with no forward progress is reached to prevent deadlock.

v2:
 (Michal)
  - Improve error message in ct_deadlock()
  - Set broken when ct_deadlock() returns true
  - Return -EPIPE on ct_deadlock()
v3:
 (Michal)
  - Add ms to stall timer comment
 (Matthew)
  - Move broken check to intel_guc_ct_send()

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210708162055.129996-6-matthew.brost@intel.com
2021-07-13 13:50:03 -07:00
Matthew Brost
1681924d8b drm/i915/guc: Add non blocking CTB send function
Add non blocking CTB send function, intel_guc_send_nb. GuC submission
will send CTBs in the critical path and does not need to wait for these
CTBs to complete before moving on, hence the need for this new function.

The non-blocking CTB now must have a flow control mechanism to ensure
the buffer isn't overrun. A lazy spin wait is used as we believe the
flow control condition should be rare with a properly sized buffer.

The function, intel_guc_send_nb, is exported in this patch but unused.
Several patches later in the series make use of this function.

v2:
 (Michal)
  - Use define for H2G room calculations
  - Move INTEL_GUC_SEND_NB define
 (Daniel Vetter)
  - Use msleep_interruptible rather than cond_resched
v3:
 (Michal)
  - Move includes to following patch
  - s/INTEL_GUC_SEND_NB/INTEL_GUC_CT_SEND_NB/g
v4:
 (John H)
  - Update comment, add type local variable

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210708162055.129996-5-matthew.brost@intel.com
2021-07-13 13:50:02 -07:00
Matthew Brost
c26e289f1d drm/i915/guc: Increase size of CTB buffers
With the introduction of non-blocking CTBs more than one CTB can be in
flight at a time. Increasing the size of the CTBs should reduce how
often software hits the case where no space is available in the CTB
buffer.

Cc: John Harrison <john.c.harrison@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210708162055.129996-4-matthew.brost@intel.com
2021-07-13 13:50:01 -07:00
Matthew Brost
dd9c0f3cbb drm/i915/guc: Improve error message for unsolicited CT response
Improve the error message when a unsolicited CT response is received by
printing fence that couldn't be found, the last fence, and all requests
with a response outstanding.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210708162055.129996-3-matthew.brost@intel.com
2021-07-13 13:50:00 -07:00
Matthew Brost
1ccf7294b7 drm/i915/guc: Relax CTB response timeout
In upcoming patch we will allow more CTB requests to be sent in
parallel to the GuC for processing, so we shouldn't assume any more
that GuC will always reply without 10ms.

Use bigger value hardcoded value of 1s instead.

v2: Add CONFIG_DRM_I915_GUC_CTB_TIMEOUT config option
v3:
 (Daniel Vetter)
  - Use hardcoded value of 1s rather than config option
v4:
 (Michal)
  - Use defines for timeout values

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210708162055.129996-2-matthew.brost@intel.com
2021-07-13 13:49:58 -07:00
Michal Wajdeczko
572f2a5cd9 drm/i915/guc: Update firmware to v62.0.0
Most of the changes to the 62.0.0 firmware revolved around CTB
communication channel. Conform to the new (stable) CTB protocol.

v2:
 (Michal)
  Add values back to kernel DOC for actions
 (Docs)
  Add 'CT buffer' back in to fix warning

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
[mattrope: Tweaked kerneldoc while pushing as suggested by Daniele/Michal]
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210616001302.84233-3-matthew.brost@intel.com
2021-06-18 15:31:01 -07:00
Michal Wajdeczko
8d99e09c5d drm/i915/guc: Always copy CT message to new allocation
Since most of future CT traffic will be based on G2H requests,
instead of copying incoming CT message to static buffer and then
create new allocation for such request, always copy incoming CT
message to new allocation. Also by doing it while reading CT
header, we can safely fallback if that atomic allocation fails.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Cc: Piotr Piórkowski <piotr.piorkowski@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210603051630.2635-19-matthew.brost@intel.com
2021-06-04 10:42:15 +02:00
Michal Wajdeczko
65dd4ed0f4 drm/i915/guc: Don't receive all G2H messages in irq handler
In irq handler try to receive just single G2H message, let other
messages to be received from tasklet.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210603051630.2635-18-matthew.brost@intel.com
2021-06-04 10:41:51 +02:00
Michal Wajdeczko
2e496ac200 drm/i915/guc: Stop using mutex while sending CTB messages
We are no longer using descriptor to hold G2H replies and we are
protecting access to the descriptor and command buffer by the
separate spinlock, so we can stop using mutex.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210603051630.2635-17-matthew.brost@intel.com
2021-06-04 10:40:50 +02:00
Matthew Brost
d35ca60087 drm/i915/guc: Ensure H2G buffer updates visible before tail update
Ensure H2G buffer updates are visible before descriptor tail updates by
inserting a barrier between the H2G buffer update and the tail. The
barrier is simple wmb() for SMEM and is register write for LMEM. This is
needed if more than 1 H2G can be inflight at once.

If this barrier is not inserted it is possible the descriptor tail
update is scene by the GuC before H2G buffer update which results in the
GuC reading a corrupt H2G value. This can bring down the H2G channel
among other bad things.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210603051630.2635-16-matthew.brost@intel.com
2021-06-04 10:39:04 +02:00
Michal Wajdeczko
7c567bbf6f drm/i915/guc: Start protecting access to CTB descriptors
We want to stop using guc.send_mutex while sending CTB messages
so we have to start protecting access to CTB send descriptor.

For completeness protect also CTB receive descriptor.

Add spinlock to struct intel_guc_ct_buffer and start using it.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210603051630.2635-15-matthew.brost@intel.com
2021-06-04 10:35:44 +02:00
Michal Wajdeczko
df12d1c301 drm/i915/guc: Update sizes of CTB buffers
Future GuC will require CTB buffers sizes to be multiple of 4K.
Make these changes now as this shouldn't impact us too much.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Cc: John Harrison <john.c.harrison@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210603230408.54856-2-matthew.brost@intel.com
2021-06-04 10:20:13 +02:00
Michal Wajdeczko
b43f0fc8b8 drm/i915/guc: Replace CTB array with explicit members
Upcoming GuC firmware will always require just two CTBs and we
also plan to configure them with different sizes, so definining
them as array is no longer suitable.

v2: Use %p for ptrdiff print
v3: Use %tx for ptrdiff print

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210603230408.54856-1-matthew.brost@intel.com
2021-06-04 10:16:58 +02:00
Michal Wajdeczko
480c6fe120 drm/i915/guc: Don't repeat CTB layout calculations
We can retrieve offsets to cmds buffers and descriptor from
actual pointers that we already keep locally.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210603051630.2635-11-matthew.brost@intel.com
2021-06-03 23:36:08 +02:00
Michal Wajdeczko
99b2f5f51c drm/i915/guc: Only rely on own CTB size
In upcoming GuC firmware, CTB size will be removed from the CTB
descriptor so we must keep it locally for any calculations.

While around, improve some debug messages and helpers.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210603051630.2635-10-matthew.brost@intel.com
2021-06-03 23:35:57 +02:00
Michal Wajdeczko
882be6e0b7 drm/i915/guc: Stop using fence/status from CTB descriptor
Stop using fence/status from CTB descriptor as future GuC ABI will
no longer support replies over CTB descriptor.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210603051630.2635-8-matthew.brost@intel.com
2021-06-03 23:32:11 +02:00
Daniele Ceraolo Spurio
6fb086e5e6 drm/i915/guc: use probe_error log for CT enablement failure
We have a couple of failure injection points in the CT enablement path,
so we need to use i915_probe_error() to select the appropriate log level.
A new macro (CT_PROBE_ERROR) has been added to the set of CT logging
macros to be used in this scenario and upcoming ones.

While adding the new macros, fix the underlying logging mechanics used
by the existing ones (DRM_DEV_* -> drm_*) and move the inlines to
before they're used inside the macros.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210603051630.2635-3-matthew.brost@intel.com
2021-06-03 23:30:57 +02:00
John Harrison
0f41e31a7b drm/i915/guc: Clear pointers on free
Clear out some pointers when objects have been de-allocated. This
makes it much easier to track down use-after-free type issues.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201028145826.2949180-4-John.C.Harrison@Intel.com
2020-10-29 13:46:28 +02:00
Michal Wajdeczko
e85de17703 drm/i915/guc: Introduce guc_is_ready
We already have guc_is_running function, but it only reflects
firmware status, while to fully use GuC we need to know if we've
already established communication with it.

v2: also s/intel_guc_is_running/intel_guc_is_fw_running (Chris)

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200131153706.109528-1-michal.wajdeczko@intel.com
2020-01-31 23:42:59 +00:00
Michal Wajdeczko
4c22abfbcb drm/i915/guc: Don't GEM_BUG_ON on corrupted H2G CTB
We should never BUG_ON on any corruption in CTB descriptor as
data there can be also modified by the GuC. Instead we can
use flag "is_in_error" to indicate that we will not process
any further messages over this CTB (until reset).

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200120191817.50164-1-michal.wajdeczko@intel.com
2020-01-24 21:08:24 +00:00
Michal Wajdeczko
77b20896d5 drm/i915/guc: Introduce CT_DEBUG
As we now have "ct" available almost in all functions we can
start using dev variants of logs also for debug.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200117082039.65644-6-michal.wajdeczko@intel.com
2020-01-17 14:02:05 +00:00
Michal Wajdeczko
d624d40177 drm/i915/guc: Switch to CT_ERROR in ct_read
As we now have "ct" available in ct_read function we can switch
from generic DRM_ERROR to our custom CT_ERROR.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200117082039.65644-5-michal.wajdeczko@intel.com
2020-01-17 14:02:05 +00:00
Michal Wajdeczko
235198d7c9 drm/i915/guc: Don't pass CTB while reading
Since we only have one RECV buffer we don't need to explicitly pass
it to the read function.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200117082039.65644-4-michal.wajdeczko@intel.com
2020-01-17 14:02:05 +00:00
Michal Wajdeczko
6a327cb186 drm/i915/guc: Don't pass CTB while writing
Since we only have one SEND buffer we don't need to explicitly pass
it to the write function.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200117082039.65644-3-michal.wajdeczko@intel.com
2020-01-17 14:02:05 +00:00
Michal Wajdeczko
1b9fc94a77 drm/i915/guc: Don't GEM_BUG_ON on corrupted G2H CTB
We should never BUG_ON on any corruption in CTB descriptor as
data there can be also modified by the GuC. Instead we can
use flag "is_in_error" to indicate that we will not process
any further messages over this CTB (until reset). While here
move descriptor error reporting to the function that actually
touches that descriptor.

Note that unexpected content of the specific CT messages, that
still complies with generic CT message format, shall not trigger
disabling whole CTB, as that might just indicate new unsupported
message types.

v2: drop redundant message (Daniele)

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200117082039.65644-2-michal.wajdeczko@intel.com
2020-01-17 14:00:45 +00:00
Michal Wajdeczko
88a57514cf drm/i915/guc: Use correct name for last CT fence
While we have function that returns "next fence" that can be used
by new CT request, we internally store value of the last used fence.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200111231114.59208-5-michal.wajdeczko@intel.com
2020-01-14 13:07:39 +00:00
Michal Wajdeczko
59a46ad9f8 drm/i915/guc: Update CTB helpers to use CT_ERROR
Update GuC CTB action helpers to benefit from new CT_ERROR macro.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200111231114.59208-4-michal.wajdeczko@intel.com
2020-01-14 13:07:39 +00:00
Michal Wajdeczko
18c8832523 drm/i915/guc: Introduce CT_ERROR
We should start using dev variants of error logging and
to simplify that introduce helper macro that will do any
necessary conversions to obtain pointer to device struct.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200111231114.59208-3-michal.wajdeczko@intel.com
2020-01-14 13:06:55 +00:00
Michal Wajdeczko
d8186dd239 drm/i915/guc: Simpler CT message size calculation
We need CT message size in bytes so just use that in helper var.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200111231114.59208-2-michal.wajdeczko@intel.com
2020-01-14 13:06:55 +00:00
Daniele Ceraolo Spurio
8c69bd74a0 drm/i915/guc: Remove function pointers for send/receive calls
Since we started using CT buffers on all gens, the function pointers can
only be set to either the _nop() or the _ct() functions. Since the
_nop() case applies to when the CT are disabled, we can just handle that
case in the _ct() functions and call them directly.

v2: keep intel_guc_send() and make the CT send/receive functions work on
    intel_guc_ct. (Michal)

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191217012316.13271-5-daniele.ceraolospurio@intel.com
2019-12-17 15:22:49 -08:00
Daniele Ceraolo Spurio
7524c365c3 drm/i915/guc/ct: Group request-related variables in a sub-structure
For better isolation of the request tracking from the rest of the
CT-related data.

v2: split to separate patch, move next_fence to substructure (Michal)

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191217012316.13271-4-daniele.ceraolospurio@intel.com
2019-12-17 15:22:49 -08:00
Daniele Ceraolo Spurio
9ab28cd20c drm/i915/guc/ct: Stop expecting multiple CT channels
The GuC supports having multiple CT buffer pairs and we designed our
implementation with that in mind. However, the different channels are not
processed in parallel within the GuC, so there is very little advantage
in having multiple channels (independent locks?), compared to the
drawbacks (one channel can starve the other if messages keep being
submitted to it). Given this, it is unlikely we'll ever add a second
channel and therefore we can simplify our code by removing the
flexibility.

v2: split substructure grouping to separate patch, improve docs (Michal)

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191217012316.13271-3-daniele.ceraolospurio@intel.com
2019-12-17 15:22:47 -08:00
Daniele Ceraolo Spurio
7f5390c433 drm/i915/guc/ct: Drop guards in enable/disable calls
We track the status of the GuC much more closely now and we expect the
enable/disable functions to be correctly called only once. If this isn't
true we do want to flag it as a flow failure (via the BUG_ON in the ctch
functions) and not silently ignore the call.

Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191217012316.13271-2-daniele.ceraolospurio@intel.com
2019-12-17 15:22:47 -08:00
Daniele Ceraolo Spurio
e627ad50a2 drm/i915/guc: Merge communication_stop and communication_disable
The only difference from the GuC POV between guc_communication_stop and
guc_communication_disable is that the former can be called after GuC
has been reset. Instead of having two separate paths, we can just skip
the call into GuC in the disabling path and re-use that.

Note that by using the disable() path instead of the stop() one there
are two additional changes in SW side for the stop path:

- interrupts are now disabled before disabling the CT, which is ok
  because we do not want interrupts with CT disabled;
- guc_get_mmio_msg() is called in the stop case as well, which is ok
  because if there are errors before the reset we do want to record
  them.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191217012316.13271-1-daniele.ceraolospurio@intel.com
2019-12-17 15:22:46 -08:00