This data is used to know which engines/classes are reachable on a given
channel's runlist, and needs to be replaced with something that doesn't
rely on subdev index.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Will be used by common code in subsequent commits to lookup driver
engine state from HW engine ID.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Will be used by common code in subsequent commits to replace arrays
indexed by subdev index.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
None of the chipsets we use this on have instanced engines, so this is fine.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
This switches to using the subdev list for lookup, and otherwise should
be a no-op aside from switching the function signatures.
Callers will be transitioned to split type+inst individually.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Much easier to store this to avoid having to reconstruct a string for a
specific subdev, taking into account whether it's instanced or not.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Previous hardware allowed a MMU fault to be generated by software to
trigger a context switch for engine recovery. Turing has the capability
to preempt all work from a specific runlist processor and removed the
registers currently used for triggering MMU faults. Attempting to access
these non-existent registers results in further errors, so use the
runlist preemption register instead.
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Some of the low level FIFO interrupt status bits have changed for
Turing. Update the handling of these to match the hardware.
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Turing requires some changes to FIFO interrupt handling due to changes
in HW register layout. It also requires some changes to implement robust
channel (RC) recovery. This preparatory patch moves the functions
requiring changes into nvkm/engine/fifo/tu102.c so they can be altered
without affecting gk104 and other users. It should not introduce any
functional changes.
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Pascal was particularly incorrect, as the register changed to be more in the
same format as the MMU fault buffers are.
Shouldn't have impacted much more than confusing MMU fault log messages.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Would like to be able to reuse gf100_fifo_intr_fault() for (some of) the
later chipsets too, as it's identical.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The bulk SPDX addition made all these files into GPL-2.0 licensed files.
However the remainder of the project is MIT-licensed, these files
were simply missing the boiler plate and got caught up in the global update.
Fixes: 96ac6d4351 (treewide: Add SPDX license identifier - Kbuild)
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The bulk SPDX addition made all these files into GPL-2.0 licensed files.
However the remainder of the project is MIT-licensed, these files
(primarily header files) were simply missing the boiler plate and got
caught up in the global update.
Fixes: b24413180f (License cleanup: add SPDX GPL-2.0 license identifier to files with no license)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Add SPDX license identifiers to all Make/Kconfig files which:
- Have no license information of any form
These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:
GPL-2.0
Reported-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.
This patch aims to suppress 29 missing-break-in-switch false positives.
Addresses-Coverity-ID: 1456891 ("Missing break in switch")
Addresses-Coverity-ID: 1324063 ("Missing break in switch")
Addresses-Coverity-ID: 1324063 ("Missing break in switch")
Addresses-Coverity-ID: 141432 ("Missing break in switch")
Addresses-Coverity-ID: 141433 ("Missing break in switch")
Addresses-Coverity-ID: 141434 ("Missing break in switch")
Addresses-Coverity-ID: 141435 ("Missing break in switch")
Addresses-Coverity-ID: 141436 ("Missing break in switch")
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
The token will also contain runlist ID on Turing, so instead expose it as
an opaque value from NVKM so the client doesn't need to care.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The GPU saves off some stuff to the address specified in this part of RAMFC
when the channel faults, so we should probably point it at a valid address.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This is needed for Turing, but we're supposed to wait for completion after
re-writing the value on older GPUs anyway.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Newer HW doesn't appear to send this event, which will cause long delays
in runlist updates if they don't complete immediately.
RM doesn't use these events anywhere, and an NVGPU commit message notes
that polling is the preferred method even on HW that supports the event.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
We didn't used to be aware that runlist/engine IDs weren't the same thing,
or that there was such variability in configuration between GPUs.
By exposing this information to a client, and giving it explicit control
of which runlist it's allocating a channel on, we're able to make better
choices.
The immediate effect of this is that on GPUs where CE0 is the "GRCE", we
will now be allocating a copy engine running asynchronously to GR for BO
migrations - as intended.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>