VMID introduced an extra increment of cmd_pending, causing double-counting
of the I/O. The normal increment ios performed in lpfc_get_scsi_buf.
Link: https://lore.kernel.org/r/20220701211425.2708-5-jsmart2021@gmail.com
Fixes: 33c79741de ("scsi: lpfc: vmid: Introduce VMID in I/O path")
Cc: <stable@vger.kernel.org> # v5.14+
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Calls to starget_to_rport() may return NULL. Add check for NULL rport
before dereference.
Link: https://lore.kernel.org/r/20220603174329.63777-5-jsmart2021@gmail.com
Fixes: bb21fc9911 ("scsi: lpfc: Use fc_block_rport()")
Cc: <stable@vger.kernel.org> # v5.18
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Mostly small bug fixes plus other trivial updates. The major change
of note is moving ufs out of scsi and a minor update to lpfc vmid
handling.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYptvayYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishb1rAPwJWSl1
kB9Qt1xPa1JK0Z7To2LRkrQ4uxYbAnpTLEP6UgEA3YuBccXRppcYQe5CXCrcZryz
BtTbtI3ApiV40xC/SRk=
=Uit3
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull more SCSI updates from James Bottomley:
"Mostly small bug fixes plus other trivial updates.
The major change of note is moving ufs out of scsi and a minor update
to lpfc vmid handling"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (24 commits)
scsi: qla2xxx: Remove unused 'ql_dm_tgt_ex_pct' parameter
scsi: qla2xxx: Remove setting of 'req' and 'rsp' parameters
scsi: mpi3mr: Fix kernel-doc
scsi: lpfc: Add support for ATTO Fibre Channel devices
scsi: core: Return BLK_STS_TRANSPORT for ALUA transitioning
scsi: sd_zbc: Prevent zone information memory leak
scsi: sd: Fix potential NULL pointer dereference
scsi: mpi3mr: Rework mrioc->bsg_device model to fix warnings
scsi: myrb: Fix up null pointer access on myrb_cleanup()
scsi: core: Unexport scsi_bus_type
scsi: sd: Don't call blk_cleanup_disk() in sd_probe()
scsi: ufs: ufshcd: Delete unnecessary NULL check
scsi: isci: Fix typo in comment
scsi: pmcraid: Fix typo in comment
scsi: smartpqi: Fix typo in comment
scsi: qedf: Fix typo in comment
scsi: esas2r: Fix typo in comment
scsi: storvsc: Fix typo in comment
scsi: ufs: Split the drivers/scsi/ufs directory
scsi: qla1280: Remove redundant variable
...
This series consists of a small set of driver updates (lpfc, ufs,
mpt3sas mpi3mr, iscsi target). Apart from that this is mostly small
fixes with very few core changes (the biggest one being VPD caching.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYo2WnyYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishfEiAP4zvniL
xidsiCXGQ4pWF4QW3UxukXpGh5xFREhNCYT9+QEA+DyilCALOI+ZT5GKu2V6gkby
R29ve48/NAWl3fwYjMQ=
=GPL1
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This consists of a small set of driver updates (lpfc, ufs, mpt3sas
mpi3mr, iscsi target). Apart from that this is mostly small fixes with
very few core changes (the biggest one being VPD caching)"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (177 commits)
scsi: target: tcmu: Avoid holding XArray lock when calling lock_page
scsi: elx: efct: Remove NULL check after calling container_of()
scsi: dpt_i2o: Drop redundant spinlock initialization
scsi: qedf: Remove redundant variable op
scsi: hisi_sas: Fix memory ordering in hisi_sas_task_deliver()
scsi: fnic: Replace DMA mask of 64 bits with 47 bits
scsi: mpi3mr: Add target device related sysfs attributes
scsi: mpi3mr: Add shost related sysfs attributes
scsi: elx: efct: Remove redundant memset() statement
scsi: megaraid_sas: Remove redundant memset() statement
scsi: mpi3mr: Return error if dma_alloc_coherent() fails
scsi: hisi_sas: Fix rescan after deleting a disk
scsi: hisi_sas: Use sas_ata_wait_after_reset() in IT nexus reset
scsi: libsas: Refactor sas_ata_hard_reset()
scsi: mpt3sas: Update driver version to 42.100.00.00
scsi: mpt3sas: Fix junk chars displayed while printing ChipName
scsi: ipr: Use kobj_to_dev()
scsi: mpi3mr: Fix a NULL vs IS_ERR() bug in mpi3mr_bsg_init()
scsi: bnx2fc: Avoid using get_cpu() in bnx2fc_cmd_alloc()
scsi: libfc: Remove get_cpu() semantics in fc_exch_em_alloc()
...
Rework lpfc_vmid_get_appid() arguments to remove scsi_cmnd dependency. The
function is now callable by the NVMe I/O path. Fix up SCSI call path to
accommodate the arg change.
Link: https://lore.kernel.org/r/20220519123110.17361-4-jsmart2021@gmail.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Co-developed-by: Gaurav Srivastava <gaurav.srivastava@broadcom.com>
Signed-off-by: Gaurav Srivastava <gaurav.srivastava@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Remove VMID code from its SCSI-specific location and move to a new file
solely for VMID code.
Link: https://lore.kernel.org/r/20220519123110.17361-3-jsmart2021@gmail.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Co-developed-by: Gaurav Srivastava <gaurav.srivastava@broadcom.com>
Signed-off-by: Gaurav Srivastava <gaurav.srivastava@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently, VMID registration is configured via module parameters. This
could lead to VMID compatibility issues if two ports are connected to
different brands of switches, as the two brands implement VMID differently.
Make logical changes so that VMID registration is based on common service
parameters from FLOGI_ACC with fabric rather than module parameters.
Link: https://lore.kernel.org/r/20220506035519.50908-9-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
No need to have these helpers inline. Also remove the stubs and just use
an IS_ENABLED for the get side (the set side already is only built
conditionlly).
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20220420042723.1010598-5-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The lpfc_iocbq data structure has void * pointers that are overloaded to be
as many as 8 different data types and the driver translates the void * by
casting. This patch removes the void * pointers by declaring the specific
types needed by the driver. It also expands the context_un to include more
seldom used pointer types to save structure bytes. It also groups the u8
types together to pack the 8 bytes needed. This work allows the lpfc_iocbq
data structure to be more strongly typed and keeps it from being allocated
from the 512 byte slab.
[mkp: rolled in zeroday fix]
Link: https://lore.kernel.org/r/20220412222008.126521-21-jsmart2021@gmail.com
Reported-by: kernel test robot <lkp@intel.com>
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The following was seen with CMF enabled:
BUG: using smp_processor_id() in preemptible
code: systemd-udevd/31711
kernel: caller is lpfc_update_cmf_cmd+0x214/0x420 [lpfc]
kernel: CPU: 12 PID: 31711 Comm: systemd-udevd
kernel: Call Trace:
kernel: <TASK>
kernel: dump_stack_lvl+0x44/0x57
kernel: check_preemption_disabled+0xbf/0xe0
kernel: lpfc_update_cmf_cmd+0x214/0x420 [lpfc]
kernel: lpfc_nvme_fcp_io_submit+0x23b4/0x4df0 [lpfc]
this_cpu_ptr() calls smp_processor_id() in a preemptible context.
Fix by using per_cpu_ptr() with raw_smp_processor_id() instead.
Link: https://lore.kernel.org/r/20220412222008.126521-16-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
During stress I/O tests with 500+ vports, hard LOCKUP call traces are
observed.
CPU A:
native_queued_spin_lock_slowpath+0x192
_raw_spin_lock_irqsave+0x32
lpfc_handle_fcp_err+0x4c6
lpfc_fcp_io_cmd_wqe_cmpl+0x964
lpfc_sli4_fp_handle_cqe+0x266
__lpfc_sli4_process_cq+0x105
__lpfc_sli4_hba_process_cq+0x3c
lpfc_cq_poll_hdler+0x16
irq_poll_softirq+0x76
__softirqentry_text_start+0xe4
irq_exit+0xf7
do_IRQ+0x7f
CPU B:
native_queued_spin_lock_slowpath+0x5b
_raw_spin_lock+0x1c
lpfc_abort_handler+0x13e
scmd_eh_abort_handler+0x85
process_one_work+0x1a7
worker_thread+0x30
kthread+0x112
ret_from_fork+0x1f
Diagram of lockup:
CPUA CPUB
---- ----
lpfc_cmd->buf_lock
phba->hbalock
lpfc_cmd->buf_lock
phba->hbalock
Fix by reordering the taking of the lpfc_cmd->buf_lock and phba->hbalock in
lpfc_abort_handler routine so that it tries to take the lpfc_cmd->buf_lock
first before phba->hbalock.
Link: https://lore.kernel.org/r/20220412222008.126521-7-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
During heavy I/O stress tests with 100+ vports and cable pulls, it may take
a while before the vport logs back into the fabric to resume I/O.
Currently, the driver immediately fails the I/O with DID_ERROR.
Change behavior to return DID_REQUEUE, and rely on SCSI layer's max retry
of 5 before erroring out the I/O.
Link: https://lore.kernel.org/r/20220412222008.126521-6-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There was a merge error in ther 14.2.0.0 patches that resulted in the SLI4
path using the SLI3 issue_abort_iotag() routine. This resulted in txcmplq
corruption.
Fix to use the SLI4 routine when SLI4.
Link: https://lore.kernel.org/r/20220323205545.81814-2-jsmart2021@gmail.com
Fixes: 31a59f7570 ("scsi: lpfc: SLI path split: Refactor Abort paths")
Cc: <stable@vger.kernel.org> # v5.2+
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Update copyrights to 2022 for files modified in the 14.2.0.0 patch set.
Link: https://lore.kernel.org/r/20220225022308.16486-18-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch refactors the Abort paths to use SLI-4 as the primary interface.
- Introduce generic lpfc_sli_prep_abort_xri jump table routine
- Consolidate lpfc_sli4_issue_abort_iotag and lpfc_sli_issue_abort_iotag
into a single generic lpfc_sli_issue_abort_iotag routine
- Consolidate lpfc_sli4_abort_fcp_cmpl and lpfc_sli_abort_fcp_cmpl into a
single generic lpfc_sli_abort_fcp_cmpl routine
- Remove unused routine lpfc_get_iocb_from_iocbq
- Conversion away from using SLI-3 iocb structures to set/access fields in
common routines. Use the new generic get/set routines that were added.
This move changes code from indirect structure references to using local
variables with the generic routines.
- Refactor routines when setting non-generic fields, to have both SLI3 and
SLI4 specific sections. This replaces the set-as-SLI3 then translate to
SLI4 behavior of the past.
Link: https://lore.kernel.org/r/20220225022308.16486-15-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch refactors the SCSI paths to use SLI-4 as the primary interface.
- Conversion away from using SLI-3 iocb structures to set/access fields in
common routines. Use the new generic get/set routines that were added.
This move changes code from indirect structure references to using local
variables with the generic routines.
- Refactor routines when setting non-generic fields, to have both SLI3 and
SLI4 specific sections. This replaces the set-as-SLI3 then translate to
SLI4 behavior of the past.
Link: https://lore.kernel.org/r/20220225022308.16486-14-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently, SLI3 and SLI4 data paths use the same lpfc_iocbq structure.
This is a "common" structure but many of the components refer to sli-rev
specific entities which can lead the developer astray as to what they
actually mean, should be set to, or when they should be used.
This first patch prepares the lpfc_iocbq structure so that elements common
to both SLI3 and SLI4 data paths are more appropriately named, making it
clear they apply generically.
Fieldnames based on 'iocb' (sli3) or 'wqe' (sli4) which are actually
generic to the paths are renamed to 'cmd':
- iocb_flag is renamed to cmd_flag
- lpfc_vmid_iocb_tag is renamed to lpfc_vmid_tag
- fabric_iocb_cmpl is renamed to fabric_cmd_cmpl
- wait_iocb_cmpl is renamed to wait_cmd_cmpl
- iocb_cmpl and wqe_cmpl are combined and renamed to cmd_cmpl
- rsvd2 member is renamed to num_bdes due to pre-existing usage
The structure name itself will retain the iocb reference as changing to a
more relevant "job" or "cmd" title induces many hundreds of line changes
for only a name change.
lpfc_post_buffer is also renamed to lpfc_sli3_post_buffer to indicate use
in the SLI3 path only.
Link: https://lore.kernel.org/r/20220225022308.16486-2-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
We only need the rport structure for lpfc_chk_tgt_mapped().
Link: https://lore.kernel.org/r/20220301143718.40913-6-hare@suse.de
Cc: James Smart <james.smart@broadcom.com>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Instead of passing in a scsi_cmnd we should be using the rport; we already
have the target and LUN ID as parameters, so there's no need to pass the
scsi_cmnd too.
Link: https://lore.kernel.org/r/20220301143718.40913-5-hare@suse.de
Cc: James Smart <james.smart@broadcom.com>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Use fc_block_rport() instead of fc_block_scsi_eh() as the SCSI command will
be removed as argument for SCSI EH functions.
Link: https://lore.kernel.org/r/20220301143718.40913-4-hare@suse.de
Cc: James Smart <james.smart@broadcom.com>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The default SCSI EH action for a non-existing EH callback is to return
FAILED, so having a callback just returning FAILED is pointless.
Link: https://lore.kernel.org/r/20220301143718.40913-3-hare@suse.de
Cc: James Smart <james.smart@broadcom.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
lpfc_bus_reset_handler() is really just a loop calling
lpfc_target_reset_handler() over all targets, which is what the error
handler will be doing anyway.
Link: https://lore.kernel.org/r/20220301143718.40913-2-hare@suse.de
Cc: James Smart <james.smart@broadcom.com>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
During heavy I/O testing with issue_lip to bounce the link, occasionally
I/O is terminated with status 3 result 9, which means the RPI is suspended.
The I/O is completed and this type of error will result in immediate retry
by the SCSI layer. The retry count expires and the I/O fails and returns
error to the application.
To avoid these quick retry/retries exhausted scenarios change the return
code given to the midlayer to DID_REQUEUE rather than DID_ERROR. This gets
them retried, and eventually succeed when the link recovers.
Link: https://lore.kernel.org/r/20211204002644.116455-3-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A link bounce to a slow fabric may observe FDISC response delays lasting
longer than devloss tmo. Current logic decrements the final fabric node
kref during a devloss tmo event. This results in a NULL ptr dereference
crash if the FDISC completes for that fabric node after devloss tmo.
Fix by adding the NLP_IN_RECOV_POST_DEV_LOSS flag, which is set when
devloss tmo triggers and we've noticed that fabric node recovery has
already started or finished in between the time lpfc_dev_loss_tmo_callbk
queues lpfc_dev_loss_tmo_handler. If fabric node recovery succeeds, then
the driver reverses the devloss tmo marked kref put with a kref get. If
fabric node recovery fails, then the final kref put relies on the ELS
timing out or the REG_LOGIN cmpl routine.
Link: https://lore.kernel.org/r/20211020211417.88754-8-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A prior patch introduced HBA_NEEDS_CFG_PORT flag logic, but in
lpfc_sli_brdrestart_s3() code path, right after HBA_NEEDS_CFG_PORT is set,
the phba->hba_flag is cleared in lpfc_sli_brdreset().
Fix by calling lpfc_sli_chipset_init() to wait for successful restart of
the HBA in lpfc_host_reset_handler() after lpfc_sli_brdrestart().
lpfc_sli_chipset_init() sets the HBA_NEEDS_CFG_PORT flag so that the
lpfc_sli_hba_setup() routine from lpfc_online() will execute
lpfc_sli_config_port() initialization step when the brdrestart is
successful.
Link: https://lore.kernel.org/r/20211020211417.88754-3-jsmart2021@gmail.com
Fixes: d2f2547efd ("scsi: lpfc: Fix auto sli_mode and its effect on CONFIG_PORT for SLI3")
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In cases when lpfc_enable_pci_dev() fails, lpfc_printf_log() with
LOG_TRACE_EVENT set will call lpfc_dmp_dbg() which uses the
phba->port_list_lock.
However, phba->port_list_lock does not get initialized until
lpfc_setup_driver_resource_phase1(). Thus, any initialization routine with
LOG_TRACE_EVENT log message prior to lpfc_setup_driver_resource_phase1()
will crash.
Revert LOG_TRACE_EVENT back to LOG_INIT for all log messages in routines
prior to lpfc_setup_driver_resource_phase1().
Link: https://lore.kernel.org/r/20211020211417.88754-2-jsmart2021@gmail.com
CC: Zheyu Ma <zheyuma97@gmail.com>
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
struct device supports attribute groups directly but does not support
struct device_attribute directly. Hence switch to attribute groups.
Link: https://lore.kernel.org/r/20211012233558.4066756-28-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Merge the 5.15/scsi-fixes branch into the staging tree to resolve UFS
conflict reported by sfr.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The PBDE feature, setting payload buffer address explicitly in the WQE so
it doesn't have to be fetched from the SGL, only makes sense when there is
a single buffer for the I/O. When there are multiple buffers it actually
hurts performance as the SGL subsequently has to be fetched.
Rework the SGL logic to only use PBDE when a single buffer.
Link: https://lore.kernel.org/r/20210910233159.115896-14-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If the congestion management framework dynamically enables, it may do so
while I/O is in flight. The updates of cmf info due to inflight I/O
completing may happen before values have been initialized.
Fix by ensure cmf_max_bytes_per_interval is initialized when checking
bandwidth utilization for SCSI layer blocking.
Link: https://lore.kernel.org/r/20210910233159.115896-12-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Injecting errors on the PCI slot while the driver is handling NVMe I/O will
cause crashes and hangs.
There are several rather difficult scenarios occurring. The main issue is
that the adapter can report a PCI error before or simultaneously to the PCI
subsystem reporting the error. Both paths have different entry points and
currently there is no interlock between them. Thus multiple teardown paths
are competing and all heck breaks loose.
Complicating things is the NVMs path. To a large degree, I/O was able to be
shutdown for a full FC port on the SCSI stack. But on NVMe, there isn't a
similar call. At best, it works on a per-controller basis, but even at the
controller level, it's a controller "reset" call. All of which means I/O is
still flowing on different CPUs with reset paths expecting hw access
(mailbox commands) to execute properly.
The following modifications are made:
- A new flag is set in PCI error entrypoints so the driver can track being
called by that path.
- An interlock is added in the SLI hw error path and the PCI error path
such that only one of the paths proceeds with the teardown logic.
- RPI cleanup is patched such that RPIs are marked unregistered w/o mbx
cmds in cases of hw error.
- If entering the SLI port re-init calls, a case where SLI error teardown
was quick and beat the PCI calls now reporting error, check whether the
SLI port is still live on the PCI bus.
- In the PCI reset code to bring the adapter back, recheck the IRQ
settings. Different checks for SLI3 vs SLI4.
- In I/O completions, that may be called as part of the cleanup or
underway just before the hw error, check the state of the adapter. If
in error, shortcut handling that would expect further adapter
completions as the hw error won't be sending them.
- In routines waiting on I/O completions, which may have been in progress
prior to the hw error, detect the device is being torn down and abort
from their waits and just give up. This points to a larger issue in the
driver on ref-counting for data structures, as it doesn't have
ref-counting on q and port structures. We'll do this fix for now as it
would be a major rework to be done differently.
- Fix the NVMe cleanup to simulate NVMe I/O completions if I/O is being
failed back due to hw error.
- In I/O buf allocation, done at the start of new I/Os, check hw state and
fail if hw error.
Link: https://lore.kernel.org/r/20210910233159.115896-10-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fix the following coccicheck REVIEW:
./drivers/scsi/lpfc/lpfc_scsi.c:1498:9-12 REVIEW Unneeded variable
Link: https://lore.kernel.org/r/20210831114058.17817-1-lv.ruyi@zte.com.cn
Reported-by: Zeal Robot <zealci@zte.com.cm>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Chi Minghao <chi.minghao@zte.com.cn>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The Kernel test robot flagged the following warning:
".../lpfc_init.c:7788:35: error: 'struct lpfc_sli4_hba' has no member
named 'c_stat'"
Reviewing this issue highlighted that one of the recent patches caused the
driver to no longer compile cleanly if CONFIG_DEBUG_FS is not set.
Correct the different areas that are failing to compile.
Link: https://lore.kernel.org/r/20210908050927.37275-1-jsmart2021@gmail.com
Fixes: 02243836ad ("scsi: lpfc: Add support for the CM framework")
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Build-tested-by: Nathan Chancellor <nathan@kernel.org>
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Use the SCSI midlayer interfaces to query protection interval, reference
tag, per-command DIX flags, and logical block count.
Link: https://lore.kernel.org/r/20210817025014.12085-3-martin.petersen@oracle.com
CC: James Smart <james.smart@broadcom.com>
CC: Dick Kennedy <dick.kennedy@broadcom.com>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The driver provides overwatch of the cm behavior by maintaining a set of rx
I/O statistics. This information is also used in later updating of the cm
statistics buffer.
Link: https://lore.kernel.org/r/20210816162901.121235-11-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Complete the enablement of the cm framework feature in the adapter. Perform
the following:
- Detect the presence of the congestion management framework feature.
When the cm framework is present:
- Issue the SET_FEATURE command to enable the feature.
- Register the cm statistics buffer with the adapter.
- Read the cm enablement buffer to determine the cm framework state for cm
management.
When cm management is enabled:
- Monitor all FPIN and congestion signalling events, incrementing
counters.
- Regularly sync with the adapter to communicate congestion events and to
receive an rx request limit.
- Monitor requests for rx data and ensure that no more than the
adapter prescribed limit is issued on the link. If the limit is
exceeded, SCSI and/or NVMe traffic is temporarily suspended.
- Maintain the minute, hourly, daily statistics buffer.
- Monitor for congestion enablement change events, causing a reread of the
enablement buffer and acting on any change in enablement.
And:
- Add teardown logic, including buffer deregistration, on adapter
detachment or reset.
Link: https://lore.kernel.org/r/20210816162901.121235-10-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Prepare for removal of the request pointer by using scsi_cmd_to_rq()
instead. This patch does not change any functionality.
Link: https://lore.kernel.org/r/20210809230355.8186-28-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Support for Topology and RAS logging capabilities were qualified by PCIe
device ID checks necessitating additional driver changes for new device
IDs.
Reduce reliance on specific PCIe device IDs by substituting checks for SLI
family information. This automatically picks up support on the newest
hardware.
Link: https://lore.kernel.org/r/20210722221721.74388-4-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Previous logic accidentally overrides the status variable to FAILURE when
target reset status is SUCCESS.
Refactor the non-SUCCESS logic of lpfc_vmid_vport_cleanup(), which resolves
the false override.
Link: https://lore.kernel.org/r/20210707184351.67872-7-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Integration with VMID patches resulted in a build error when
CONFIG_DEBUG_FS is disabled and driver option CONFIG_SCSI_LPFC_DEBUG_FS is
disabled.
It results in an undefined variable:
lpfc_scsi:5595:3: error: 'uuid' undeclared (first use in this function); did you mean 'upid'?
Link: https://lore.kernel.org/r/20210618171842.79710-1-jsmart2021@gmail.com
Fixes: 33c79741de ("scsi: lpfc: vmid: Introduce VMID in I/O path")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Introduce the VMID in the I/O path. Check if the VMID is enabled and if
the I/O belongs to a VM or not.
Link: https://lore.kernel.org/r/20210608043556.274139-14-muneendra.kumar@broadcom.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Gaurav Srivastava <gaurav.srivastava@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Implement routines to save, retrieve, and remove the VMIDs from the data
structure. A hash table is used to save the VMIDs and the corresponding
UUIDs associated with the application/VMs.
Link: https://lore.kernel.org/r/20210608043556.274139-9-muneendra.kumar@broadcom.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Gaurav Srivastava <gaurav.srivastava@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add supporting datastructures for mailbox command which helps in
determining if the firmware supports appid. Allocate resources for VMID at
initialization time and clean them up on removal.
Link: https://lore.kernel.org/r/20210608043556.274139-7-muneendra.kumar@broadcom.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Gaurav Srivastava <gaurav.srivastava@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Introduce scsi_build_sense() as a wrapper around scsi_build_sense_buffer()
to format the buffer and set the correct SCSI status.
Link: https://lore.kernel.org/r/20210427083046.31620-8-hare@suse.de
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Default behavior for the driver, when aborting an I/O, is to terminate the
I/O with the adapter. The adapter will initiate an ABTS to terminate the
exchange on the link and mark the exchange is terminated so that no further
use of the sgl or any traffic for the exchange is worked on. Completion on
the Abort is then posted to the driver, which as the I/O is terminated can
complete the I/O to the OS. This completion may occur prior to the ABTS
handshake completing on the wire. The ABTS handshake can take a long time
to complete with timeouts and retries reaching 60+ seconds. Note: if
retries fail, LOGO occurs.
Some devices want to ensure that the ABTS handshake fully completes (this
device has fully ack'd it) before the I/O completion is posted back to the
OS, where a failed I/O may be retried via a different path.
To support this behavior, an option was added to the driver to change I/O
completion from the Abort cmd completion to the Exchange termination (aka
ABTS) completion.
Link: https://lore.kernel.org/r/20210514195559.119853-10-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Code inspection showed lpfc was using three different pointer formats when
logging discovery object pointers.
Standardize the pointer format to x%px.
Note: %px use is limited to discovery objects in order to aid core
analysis.
Link: https://lore.kernel.org/r/20210412013127.2387-14-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fixes the following W=1 kernel build warning(s):
drivers/scsi/lpfc/lpfc_scsi.c:746: warning: expecting prototype for lpfc_release_scsi_buf(). Prototype was for lpfc_release_scsi_buf_s3() instead
drivers/scsi/lpfc/lpfc_scsi.c:979: warning: expecting prototype for App checking is required for(). Prototype was for BG_ERR_CHECK() instead
drivers/scsi/lpfc/lpfc_scsi.c:3701: warning: Function parameter or member 'vport' not described in 'lpfc_scsi_prep_cmnd_buf'
drivers/scsi/lpfc/lpfc_scsi.c:3701: warning: Excess function parameter 'phba' description in 'lpfc_scsi_prep_cmnd_buf'
drivers/scsi/lpfc/lpfc_scsi.c:3717: warning: Function parameter or member 'fcpi_parm' not described in 'lpfc_send_scsi_error_event'
drivers/scsi/lpfc/lpfc_scsi.c:3717: warning: Excess function parameter 'rsp_iocb' description in 'lpfc_send_scsi_error_event'
drivers/scsi/lpfc/lpfc_scsi.c:3837: warning: Function parameter or member 'fcpi_parm' not described in 'lpfc_handle_fcp_err'
drivers/scsi/lpfc/lpfc_scsi.c:3837: warning: expecting prototype for lpfc_handler_fcp_err(). Prototype was for lpfc_handle_fcp_err() instead
drivers/scsi/lpfc/lpfc_scsi.c:4021: warning: Function parameter or member 'wcqe' not described in 'lpfc_fcp_io_cmd_wqe_cmpl'
drivers/scsi/lpfc/lpfc_scsi.c:4021: warning: Excess function parameter 'pwqeOut' description in 'lpfc_fcp_io_cmd_wqe_cmpl'
drivers/scsi/lpfc/lpfc_scsi.c:4621: warning: Function parameter or member 'vport' not described in 'lpfc_scsi_prep_cmnd_buf_s3'
drivers/scsi/lpfc/lpfc_scsi.c:4621: warning: Excess function parameter 'phba' description in 'lpfc_scsi_prep_cmnd_buf_s3'
drivers/scsi/lpfc/lpfc_scsi.c:4698: warning: Function parameter or member 'vport' not described in 'lpfc_scsi_prep_cmnd_buf_s4'
drivers/scsi/lpfc/lpfc_scsi.c:4698: warning: Excess function parameter 'phba' description in 'lpfc_scsi_prep_cmnd_buf_s4'
drivers/scsi/lpfc/lpfc_scsi.c:4954: warning: expecting prototype for lpfc_taskmgmt_def_cmpl(). Prototype was for lpfc_tskmgmt_def_cmpl() instead
drivers/scsi/lpfc/lpfc_scsi.c:5094: warning: expecting prototype for lpfc_poll_rearm_time(). Prototype was for lpfc_poll_rearm_timer() instead
Link: https://lore.kernel.org/r/20210312094738.2207817-4-lee.jones@linaro.org
Cc: James Smart <james.smart@broadcom.com>
Cc: Dick Kennedy <dick.kennedy@broadcom.com>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
For the files modified in 2021 via the 12.8.0.7 and 12.8.0.8 patch sets,
update the copyright for 2021.
Link: https://lore.kernel.org/r/20210301171821.3427-23-jsmart2021@gmail.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>