linux

mirror of synced 2025-03-06 20:59:54 +01:00

Author	SHA1	Message	Date
Ben Skeggs	cde540211a	drm/nouveau/fifo/gk104-: fix parsing of mmu fault data Pascal was particularly incorrect, as the register changed to be more in the same format as the MMU fault buffers are. Shouldn't have impacted much more than confusing MMU fault log messages. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2019-08-23 12:55:32 +10:00
Ben Skeggs	cf9518b50a	drm/nouveau/fifo/gf1xx: convert to using nvkm_fault_data Would like to be able to reuse gf100_fifo_intr_fault() for (some of) the later chipsets too, as it's identical. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2019-08-23 12:55:32 +10:00
Ben Skeggs	f7cc47e436	drm/nouveau/fifo/gm200-: read pbdma count more directly The trick we used (and still use for older GPUs) doesn't work on Turing. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2018-12-11 15:37:48 +10:00
Ben Skeggs	f37a302e67	drm/nouveau/fifo/gk104-: virtualise pbdma enable function Turing will require different code. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2018-12-11 15:37:48 +10:00
Ben Skeggs	fb80ad15f8	drm/nouveau/fifo/gk104-: group pbdma functions together We're about to be adding more of them. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2018-12-11 15:37:48 +10:00
Ben Skeggs	efa44c664f	drm/nouveau/fifo/gk104-: separate runlist building from committing to hw We will need to bash different registers on Turing. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2018-12-11 15:37:48 +10:00
Ben Skeggs	302daab1a7	drm/nouveau/fifo/gf100-: call into BAR to reset BARs after MMU fault This is needed for Turing, but we're supposed to wait for completion after re-writing the value on older GPUs anyway. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2018-12-11 15:37:47 +10:00
Kees Cook	6396bb2215	treewide: kzalloc() -> kcalloc() The kzalloc() function has a 2-factor argument form, kcalloc(). This patch replaces cases of: kzalloc(a * b, gfp) with: kcalloc(a, b, gfp) as well as handling cases of: kzalloc(a * b * c, gfp) with: kzalloc(array3_size(a, b, c), gfp) as it's slightly less ugly than: kzalloc_array(array_size(a, b), c, gfp) This does, however, attempt to ignore constant size factors like: kzalloc(4 * 1024, gfp) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ type TYPE; expression THING, E; @@ ( kzalloc( - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) \| kzalloc( - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression COUNT; typedef u8; typedef __u8; @@ ( kzalloc( - sizeof(u8) * (COUNT) + COUNT , ...) \| kzalloc( - sizeof(__u8) * (COUNT) + COUNT , ...) \| kzalloc( - sizeof(char) * (COUNT) + COUNT , ...) \| kzalloc( - sizeof(unsigned char) * (COUNT) + COUNT , ...) \| kzalloc( - sizeof(u8) * COUNT + COUNT , ...) \| kzalloc( - sizeof(__u8) * COUNT + COUNT , ...) \| kzalloc( - sizeof(char) * COUNT + COUNT , ...) \| kzalloc( - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( - kzalloc + kcalloc ( - sizeof(TYPE) * (COUNT_ID) + COUNT_ID, sizeof(TYPE) , ...) \| - kzalloc + kcalloc ( - sizeof(TYPE) * COUNT_ID + COUNT_ID, sizeof(TYPE) , ...) \| - kzalloc + kcalloc ( - sizeof(TYPE) * (COUNT_CONST) + COUNT_CONST, sizeof(TYPE) , ...) \| - kzalloc + kcalloc ( - sizeof(TYPE) * COUNT_CONST + COUNT_CONST, sizeof(TYPE) , ...) \| - kzalloc + kcalloc ( - sizeof(THING) * (COUNT_ID) + COUNT_ID, sizeof(THING) , ...) 
\| - kzalloc + kcalloc ( - sizeof(THING) * COUNT_ID + COUNT_ID, sizeof(THING) , ...) \| - kzalloc + kcalloc ( - sizeof(THING) * (COUNT_CONST) + COUNT_CONST, sizeof(THING) , ...) \| - kzalloc + kcalloc ( - sizeof(THING) * COUNT_CONST + COUNT_CONST, sizeof(THING) , ...) ) // 2-factor product, only identifiers. @@ identifier SIZE, COUNT; @@ - kzalloc + kcalloc ( - SIZE * COUNT + COUNT, SIZE , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( kzalloc( - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| kzalloc( - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| kzalloc( - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| kzalloc( - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| kzalloc( - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) \| kzalloc( - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) \| kzalloc( - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) \| kzalloc( - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( kzalloc( - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) \| kzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) \| kzalloc( - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) \| kzalloc( - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) 
\| kzalloc( - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) \| kzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ identifier STRIDE, SIZE, COUNT; @@ ( kzalloc( - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) \| kzalloc( - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) \| kzalloc( - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| kzalloc( - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) \| kzalloc( - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| kzalloc( - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| kzalloc( - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| kzalloc( - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products, // when they're not all constants... @@ expression E1, E2, E3; constant C1, C2, C3; @@ ( kzalloc(C1 * C2 * C3, ...) \| kzalloc( - (E1) * E2 * E3 + array3_size(E1, E2, E3) , ...) \| kzalloc( - (E1) * (E2) * E3 + array3_size(E1, E2, E3) , ...) \| kzalloc( - (E1) * (E2) * (E3) + array3_size(E1, E2, E3) , ...) \| kzalloc( - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants, // keeping sizeof() as the second factor argument. @@ expression THING, E1, E2; type TYPE; constant C1, C2, C3; @@ ( kzalloc(sizeof(THING) * C2, ...) \| kzalloc(sizeof(TYPE) * C2, ...) \| kzalloc(C1 * C2 * C3, ...) \| kzalloc(C1 * C2, ...) \| - kzalloc + kcalloc ( - sizeof(TYPE) * (E2) + E2, sizeof(TYPE) , ...) \| - kzalloc + kcalloc ( - sizeof(TYPE) * E2 + E2, sizeof(TYPE) , ...) \| - kzalloc + kcalloc ( - sizeof(THING) * (E2) + E2, sizeof(THING) , ...) 
\| - kzalloc + kcalloc ( - sizeof(THING) * E2 + E2, sizeof(THING) , ...) \| - kzalloc + kcalloc ( - (E1) * E2 + E1, E2 , ...) \| - kzalloc + kcalloc ( - (E1) * (E2) + E1, E2 , ...) \| - kzalloc + kcalloc ( - E1 * E2 + E1, E2 , ...) ) Signed-off-by: Kees Cook <keescook@chromium.org>	2018-06-12 16:19:22 -07:00
Ben Skeggs	37e1c45a58	drm/nouveau/fifo/gv100: initial support Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2018-05-18 15:01:46 +10:00
Ben Skeggs	334cc26d4d	drm/nouveau/fifo/gp100-: force individual channels into a channel group RM does this for some reason, and is enforced in HW on Volta. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2018-05-18 15:01:22 +10:00
Ben Skeggs	79bb4b617f	drm/nouveau/fifo/gk208-: write pbdma timeout regs during initialisation Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2018-05-18 15:01:22 +10:00
Ben Skeggs	8c4e9f9dff	drm/nouveau/fifo/gk110-: support writing channel group runlist entries Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2018-05-18 15:01:22 +10:00
Ben Skeggs	4f2fc25c0f	drm/nouveau/fifo/gk104-: poll for runlist update completion Newer HW doesn't appear to send this event, which will cause long delays in runlist updates if they don't complete immediately. RM doesn't use these events anywhere, and an NVGPU commit message notes that polling is the preferred method even on HW that supports the event. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2018-05-18 15:01:21 +10:00
Ben Skeggs	665870837a	drm/nouveau/fifo/gk104-: add interfaces to support different runlist layouts This will be required to support features on newer hardware. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2018-05-18 15:01:21 +10:00
Ben Skeggs	f9360c3aa6	drm/nouveau/fifo/gk104-: simplify definition of channel classes Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2018-05-18 15:01:21 +10:00
Ben Skeggs	cc36205085	drm/nouveau/fifo/gk104-: support querying engines available on each runlist Will be used to improve channel runlist selection. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2018-05-18 15:01:21 +10:00
Ben Skeggs	ddc669e256	drm/nouveau/fifo/gk104-: allow fault recovery code to be called by other subdevs This will be required to support Volta. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2018-05-18 15:01:21 +10:00
Ben Skeggs	01f349fcad	drm/nouveau/fifo/gf100-: use new interfaces for vmm operations Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2017-11-02 13:32:31 +10:00
Ben Skeggs	997a89003c	drm/nouveau/core/memory: add reference counting We need to be able to prevent memory from being freed while it's still mapped in a GPU's address-space. Will be used by upcoming MMU changes. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2017-11-02 13:32:22 +10:00
Ben Skeggs	19a82e492c	drm/nouveau/core/memory: change map interface to support upcoming mmu changes Map flags (access, kind, etc) are currently defined in either the VMA, or the memory object, which turns out to not be ideal for things like suballocated buffers, etc. These will become per-map flags instead, so we need to support passing these arguments in nvkm_memory_map(). Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2017-11-02 13:32:22 +10:00
Ben Skeggs	570889dc50	drm/nouveau/bar: modify interface to bar1 vmm mapping Upcoming changes will remove the nvkm_vmm pointer from nvkm_vma, instead requiring it to be explicitly specified on each operation. It's not currently possible to get this information for BAR1 mappings, so let's fix that ahead of time. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2017-11-02 13:32:18 +10:00
Dan Carpenter	2579b8b0ec	drm/nouveau/fifo/gk104-: Silence a locking warning Presumably we can never actually hit this return, but static checkers complain that we should unlock before we return. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2017-05-17 09:09:41 +10:00
Ben Skeggs	3ebef76a1d	drm/nouveau/fifo/gk104-: trigger mmu fault before attempting engine recovery Greatly improves the chances of recovering the GPU from a CTXSW_TIMEOUT. Tested with piglit's arb_shader_image_load_store-atomicity, which causes GR to hang in such a way that recovery failed (CTXSW_TIMEOUT continually re-triggers). Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2017-02-17 17:38:15 +10:00
Ben Skeggs	03f16f5f27	drm/nouveau/fifo/gk104-: ACK SCHED_ERROR before attempting CTXSW_TIMEOUT recovery Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2017-02-17 17:38:15 +10:00
Ben Skeggs	91b9d659ab	drm/nouveau/fifo/gk104-: directly use new recovery code for ctxsw timeout Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2017-02-17 17:38:14 +10:00
Ben Skeggs	3534821df5	drm/nouveau/fifo/gk104-: directly use new recovery code for mmu faults Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2017-02-17 17:38:14 +10:00
Ben Skeggs	eaa5ed65ee	drm/nouveau/fifo/gk104-: reset all engines a killed channel is still active on Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2017-02-17 17:38:13 +10:00
Ben Skeggs	0faaa47d44	drm/nouveau/fifo/gk104-: refactor recovery code This will serve as a basis for implementing some improvements to how we recover the GPU from channel errors. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2017-02-17 17:38:13 +10:00
Ben Skeggs	ec5c6bda19	drm/nouveau/fifo/gk104-: better detection of chid when parsing engine status The previous commit simply changes the interface, but should result in the same behaviour as previously. This commit has been split out from it as it can result in a different channel being selected. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2017-02-17 17:38:12 +10:00
Ben Skeggs	b88917fe0f	drm/nouveau/fifo/gk104-: separate out engine status parsing We'll be wanting to reuse this logic in more places. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2017-02-17 17:38:12 +10:00
Ben Skeggs	ff9f29abf0	drm/nouveau/fifo/gf100-: provide notification to user if channel is killed There are instances (such as non-recoverable GPU page faults) where NVKM decides that a channel's context is no longer viable, and will be removed from the runlist. This commit notifies the owner of the channel when this happens, so it has the opportunity to take some kind of recovery action instead of hanging. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2017-02-17 17:38:08 +10:00
Ben Skeggs	d2ee360564	drm/nouveau/core/memory: distinguish between coherent/non-coherent targets Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2017-02-17 15:15:01 +10:00
Ben Skeggs	ec884f74f1	drm/nouveau/fifo/gf100-: recover from host mmu faults This has been on the TODO list for a while now, recovering from things such as attempting to execute a push buffer or touch a semaphore in an unmapped memory area. The only thing required on the HW side here is that the offending channel is removed from the runlist, and not a full reset of PFIFO. This used to be a bit messier to handle before the rework to make use of engine topology info, but is apparently now trivial. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2016-12-13 11:38:51 +10:00
Ben Skeggs	1fe8c02fbc	drm/nouveau/fifo/gk104-: translate engidx into human-readable name in debug output Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2016-07-14 11:53:25 +10:00
Ben Skeggs	952eb819e3	drm/nouveau/top: take nvkm_device as argument to public functions Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2016-07-14 11:53:25 +10:00
Ben Skeggs	289e082706	drm/nouveau/fifo/gk104-: identify mmu engine ids for host faults It appears these don't map to PBDMAs (at least on Kepler, it may or may not be valid for Fermi - this hasn't been checked), but to runlists. This drops the NVKM_ENGINE_FIFO data from the entries too, as resetting all of PFIFO is not the way to handle such faults. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2016-05-20 14:43:04 +10:00
Ben Skeggs	e50d0237fc	drm/nouveau/fifo/gk104-: implement support for PTOP fault info Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2016-05-20 14:43:04 +10:00
Ben Skeggs	91419acf78	drm/nouveau/fifo/gk104-: abstract mmu fault data structures Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2016-05-20 14:43:04 +10:00
Ben Skeggs	98ac3f061a	drm/nouveau/fifo/gk104-: subclass func Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2016-05-20 14:43:04 +10:00
Ben Skeggs	e93e198d46	drm/nouveau/fifo/gk104-: use device info from top subdev Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2016-05-20 14:43:04 +10:00
Ben Skeggs	b4c5fc4b85	drm/nouveau/fifo/gk104: submit NOP after all PBDMA_INTR_0, not just DEVICE Prevents the same interrupt from re-triggering forever. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2016-03-14 10:13:47 +10:00
Ben Skeggs	608fd040b7	drm/nouveau/fifo/gk104: add nvdec plumbing Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2016-03-14 10:13:46 +10:00
Ben Skeggs	9e4fff3205	drm/nouveau/fifo/gk104: add nvenc plumbing Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2016-03-14 10:13:46 +10:00
Ben Skeggs	19f89279fa	drm/nouveau/fifo/gk104: make use of topology info during fault recovery Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2016-03-14 10:13:42 +10:00
Ben Skeggs	af83a67779	drm/nouveau/fifo/gk104: make use of topology info when handling ctxsw timeout Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2016-03-14 10:13:41 +10:00
Ben Skeggs	41e5171ba8	drm/nouveau/fifo/gk104: read device topology information from hw Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2016-03-14 10:13:41 +10:00
Ben Skeggs	69aa40e276	drm/nouveau/fifo/gk104: cosmetic engine->runlist changes Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2016-03-14 10:13:40 +10:00
Ben Skeggs	acdf7d4f7e	drm/nouveau/fifo/gk104: don't attempt recovery of unknown mmu engines Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2016-03-14 10:13:40 +10:00
Ben Skeggs	55252da161	drm/nouveau/fifo/gk104: identify fault-recovery members more clearly Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2016-03-14 10:13:39 +10:00
Ben Skeggs	6d39b83f13	drm/nouveau/fifo/gk104: rename spoon to pbdma, and move detection to oneinit Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2016-03-14 10:13:39 +10:00
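
Note on the "treewide: kzalloc() -> kcalloc()" commit (6396bb2215) in the listing above: the two-factor form lets the allocator check the multiplication for overflow, rather than receiving a product that may already have wrapped. The following is a minimal userspace sketch of that idea, not the kernel implementation. The names zalloc_array() and size_mul3() are hypothetical stand-ins for kcalloc() and array3_size(), and calloc() stands in for the kernel allocator.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical userspace stand-in for kcalloc(n, size, gfp): returns a
 * zeroed array, or NULL when n * size would overflow size_t.
 */
static void *zalloc_array(size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;	/* product does not fit in size_t */
	return calloc(n, size);
}

/*
 * Hypothetical stand-in for array3_size(a, b, c): returns a * b * c, or
 * SIZE_MAX if any step overflows, so that a following allocation of that
 * size fails instead of being silently undersized.
 */
static size_t size_mul3(size_t a, size_t b, size_t c)
{
	if (b != 0 && a > SIZE_MAX / b)
		return SIZE_MAX;
	a *= b;
	if (c != 0 && a > SIZE_MAX / c)
		return SIZE_MAX;
	return a * c;
}
```

With these helpers, an open-coded malloc(count * sizeof(*p)) followed by a memset would become zalloc_array(count, sizeof(*p)), which mirrors the direction of the conversion the commit message describes.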

82 commits