1
0
Fork 0
mirror of synced 2025-03-06 20:59:54 +01:00
Commit graph

107 commits

Author SHA1 Message Date
Ryan Grimm
6f963ec2d6 cxl: Fix device_node reference counting
When unbinding and rebinding the driver on a system with a card in PHB0, this
error condition is reached after a few attempts:

ERROR: Bad of_node_put() on /pciex@3fffe40000000
CPU: 0 PID: 3040 Comm: bash Not tainted 3.18.0-rc3-12545-g3627ffe #152
Call Trace:
[c000000721acb5c0] [c00000000086ef94] .dump_stack+0x84/0xb0 (unreliable)
[c000000721acb640] [c00000000073a0a8] .of_node_release+0xd8/0xe0
[c000000721acb6d0] [c00000000044bc44] .kobject_release+0x74/0xe0
[c000000721acb760] [c0000000007394fc] .of_node_put+0x1c/0x30
[c000000721acb7d0] [c000000000545cd8] .cxl_probe+0x1a98/0x1d50
[c000000721acb900] [c0000000004845a0] .local_pci_probe+0x40/0xc0
[c000000721acb980] [c000000000484998] .pci_device_probe+0x128/0x170
[c000000721acba30] [c00000000052400c] .driver_probe_device+0xac/0x2a0
[c000000721acbad0] [c000000000522468] .bind_store+0x108/0x160
[c000000721acbb70] [c000000000521448] .drv_attr_store+0x38/0x60
[c000000721acbbe0] [c000000000293840] .sysfs_kf_write+0x60/0xa0
[c000000721acbc50] [c000000000292500] .kernfs_fop_write+0x140/0x1d0
[c000000721acbcf0] [c000000000208648] .vfs_write+0xd8/0x260
[c000000721acbd90] [c000000000208b18] .SyS_write+0x58/0x100
[c000000721acbe30] [c000000000009258] syscall_exit+0x0/0x98

We are missing a call to of_node_get(). pnv_pci_to_phb_node() should
call of_node_get() otherwise np's reference count isn't incremented and
it might go away. Rename pnv_pci_to_phb_node() to pnv_pci_get_phb_node()
so it's clear it calls of_node_get().

Signed-off-by: Ryan Grimm <grimm@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-02-02 14:51:31 +11:00
Ryan Grimm
62fa19d4b4 cxl: Add ability to reset the card
Adds reset to sysfs which will PERST the card. If load_image_on_perst is set
to "user" or "factory", the PERST will cause that image to be loaded.

load_image_on_perst is set to "user" for production.

"none" could be used for debugging. The PSL trace arrays are preserved which
then can be read through debugfs.

PERST also triggers CAPP recovery. An HMI comes in, which is handled by EEH.
EEH unbinds the driver, calls into Sapphire to reinitialize the PHB, then
rebinds the driver.

Signed-off-by: Ryan Grimm <grimm@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-01-22 17:31:52 +11:00
Ryan Grimm
1212aa1c8c cxl: Enable CAPP recovery
Turning snoops on is the last step in CAPP recovery. Sapphire is expected to
have reinitialized the PHB and done the previous recovery steps.

Add mode argument to opal call to do this. Driver can turn snoops off although
it does not currently.

Signed-off-by: Ryan Grimm <grimm@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-01-22 17:31:52 +11:00
Ryan Grimm
4beb5421ba cxl: Use image state defaults for reloading FPGA
Select defaults such that a PERST causes flash image reload.  Select which
image based on what the card is set up to load.

CXL_VSEC_PERST_LOADS_IMAGE selects whether PERST assertion causes flash image
load.

CXL_VSEC_PERST_SELECT_USER selects which image is loaded on the next PERST.

cxl_update_image_control writes these bits into the VSEC.

Signed-off-by: Ryan Grimm <grimm@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-01-22 17:31:51 +11:00
Ian Munsie
d6a6af2c18 cxl: Disable AFU debug flag
Upon inspection of the implementation specific registers, it was
discovered that the high bit of the implementation specific RXCTL
register was enabled, which enables the DEADB00F debug feature.

The debug feature causes MMIO reads to a disabled AFU to respond with
0xDEADB00F instead of all Fs. In general this should not be visible as
the kernel will only allow MMIO access to enabled AFUs, but there may be
some circumstances where an AFU may become disabled while it is use.
One such case would be an AFU designed to only be used in the dedicated
process mode and to disable itself after it has completed it's work
(however even in that case the effects of this debug flag would be
limited as the userspace application must have completed any required
MMIO accesses before the AFU disables itself with or without the flag).

This patch removes the debug flag and replaces the magic value
programmed into this register with a preprocessor define so it is
clearer what the rest of this initialisation does.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-29 15:45:43 +11:00
Ian Munsie
ee41d11d53 cxl: Change contexts_lock to a mutex to fix sleep while atomic bug
We had a known sleep while atomic bug if a CXL device was forcefully
unbound while it was in use. This could occur as a result of EEH, or
manually induced with something like this while the device was in use:

echo 0000:01:00.0 > /sys/bus/pci/drivers/cxl-pci/unbind

The issue was that in this code path we iterated over each context and
forcefully detached it with the contexts_lock spin lock held, however
the detach also needed to take the spu_mutex, and call schedule.

This patch changes the contexts_lock to a mutex so that we are not in
atomic context while doing the detach, thereby avoiding the sleep while
atomic.

Also delete the related TODO comment, which suggested an alternate
solution which turned out to not be workable.

Cc: stable@vger.kernel.org
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-12 13:06:47 +11:00
Ian Munsie
f204e0b8ce cxl: Driver code for powernv PCIe based cards for userspace access
This is the core of the cxl driver.

It adds support for using cxl cards in the powernv environment only (ie POWER8
bare metal). It allows access to cxl accelerators by userspace using the
/dev/cxl/afuM.N char devices.

The kernel driver has no knowledge of the function implemented by the
accelerator. It provides services to userspace via the /dev/cxl/afuM.N
devices. When a program opens this device and runs the start work IOCTL, the
accelerator will have coherent access to that processes memory using the same
virtual addresses. That process may mmap the device to access any MMIO space
the accelerator provides.  Also, reads on the device will allow interrupts to
be received. These services are further documented in a later patch in
Documentation/powerpc/cxl.txt.

Documentation of the cxl hardware architecture and userspace API is provided in
subsequent patches.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-08 20:15:57 +11:00