'scsi_scan_type' is only used in one source file. Hence declare it static.
Link: https://lore.kernel.org/r/20211129194609.3466071-3-bvanassche@acm.org
Fixes: a19a93e4c6 ("scsi: core: pm: Rely on the device driver core for async power management")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Suppress the following kernel-doc warning:
drivers/scsi/scsi_scan.c:129: warning: Function parameter or member 'dev' not described in 'scsi_enable_async_suspend'
Link: https://lore.kernel.org/r/20211129194609.3466071-2-bvanassche@acm.org
Fixes: a19a93e4c6 ("scsi: core: pm: Rely on the device driver core for async power management")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The variable 'page' is set but never used throughout qedi_alloc_bdq().
Therefore remove it.
Link: https://lore.kernel.org/r/20211126201708.27140-2-f.fainelli@gmail.com
Reported-by: kernel test robot <lkp@intel.com>
Acked-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If the UFS Device WLUN is runtime suspended and is in the same power mode,
link state, and b_rpm_dev_flush_capable (BKOP or WB buffer flush etc)
state, then it can remain runtime suspended instead of being runtime
resumed and then system suspended.
The following patch has cleared the way for that to happen:
scsi: core: pm: Only runtime resume if necessary
So amend the logic accordingly.
Note, the ufs-hisi driver uses different RPM and SPM, but it is made
explicit by a new parameter to suspend prepare.
Link: https://lore.kernel.org/r/20211027130614.406985-2-adrian.hunter@intel.com
Reviewed-by: Asutosh Das <asutoshd@codeaurora.org>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Now that blk_execute_rq does not take a gendisk argument there is no need
to pass it through the scsi_ioctl callchain either.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20211126121802.2090656-6-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Remove the gendisk aregument to blk_execute_rq and blk_execute_rq_nowait
given that it is unused now. Also convert the boolean at_head parameter
to actually use the bool type while touching the prototype.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20211126121802.2090656-5-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Just use the disk attached to the request_queue instead.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20211126121802.2090656-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
All modern drivers can support extra partitions using the extended
dev_t. In fact except for the ioctl method drivers never even see
partitions in normal operation.
So remove the GENHD_FL_EXT_DEVT and allow extra partitions for all
block devices that do support partitions, and require those that
do not support partitions to explicit disallow them using
GENHD_FL_NO_PART.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211122130625.1136848-12-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
GENHD_FL_CD marks a gendisk as a vaguely CD-ROM like device.
Besides being used internally inside of sunvdc.c an xen-blkfront it
is used by xen-blkback as a hint to claim a device exported to a
guest is a CD-ROM like device. Just check for disk->cdi instead
which is the right indicator for "real" CD-ROM or DVD drivers. This
will miss the paravirtualized guest drivers, but those make little
sense to report anyway.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211122130625.1136848-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
GENHD_FL_BLOCK_EVENTS_ON_EXCL_WRITE is all about the event reporting
mechanism, so move it to the event_flags field.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211122130625.1136848-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
blk_rq_err_bytes is only used by the scsi midlayer, so move it there.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20211117061404.331732-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Misc fixes all over the place.
Revert of virtio used length validation series: the approach taken does
not seem to work, breaking too many guests in the process. We'll need to
do length validation using some other approach.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmGe0sEPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRp8WEH/imDIq1iduDeAuvFnmrm5eEO9w3wzXCT4NiG
8Pla241FzQ1pEFEAne16KP0+SlLhj7P0oc5FR8vkYvxxuyneDbCzcS2M1kYMOpA1
ry28PuObAnekzE/WXxvC031ozB5Zb/FL54gmw+/1EdAOdMGL0CdQ1aJxREBHRTBo
p4ZHr83GA2D2C/IyKCsgQ8cB9ZrMqImTQQ4vRD89HoFBp+GH2u2Di1iyXEWuOqdI
n1+7M9jjbyW8A+N1bkOicpShS/6UcyJQOOcg8kvUQOV6srVkYhfaiWC/CbOP2g73
8PKK+/K2Htf92s6RdvDUPSKmvqGR/4KPZWPtWThXBYXGgWul0uI=
=q6tO
-----END PGP SIGNATURE-----
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull vhost,virtio,vdpa bugfixes from Michael Tsirkin:
"Misc fixes all over the place.
Revert of virtio used length validation series: the approach taken
does not seem to work, breaking too many guests in the process. We'll
need to do length validation using some other approach"
[ This merge also ends up reverting commit f7a36b03a7 ("vsock/virtio:
suppress used length validation"), which came in through the
networking tree in the meantime, and was part of that whole used
length validation series - Linus ]
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
vdpa_sim: avoid putting an uninitialized iova_domain
vhost-vdpa: clean irqs before reseting vdpa device
virtio-blk: modify the value type of num in virtio_queue_rq()
vhost/vsock: cleanup removing `len` variable
vhost/vsock: fix incorrect used length reported to the guest
Revert "virtio_ring: validate used buffer length"
Revert "virtio-net: don't let virtio core to validate used length"
Revert "virtio-blk: don't let virtio core to validate used length"
Revert "virtio-scsi: don't let virtio core to validate used buffer length"
This reverts commit c57911ebfb.
Attempts to validate length in the core did not work out. We'll drop
them for now, so revert the dependent changes in drivers.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
A commit introduced formal regstration of all Fabric nodes to the SCSI
transport as well as REG/UNREG RPI mailbox requests. The commit introduced
the NLP_RELEASE_RPI flag for rports set in the lpfc_cmpl_els_logo_acc()
routine to help clean up the RPIs. This new code caused the driver to
release the RPI value used for the remote port and marked the RPI invalid.
When the driver later attempted to re-login, it would use the invalid RPI
and the adapter rejected the PLOGI request. As no login occurred, the
devloss timer on the rport expired and connectivity was lost.
This patch corrects the code by removing the snippet that requests the rpi
to be unregistered. This change only occurs on a node that is already
marked to be rediscovered. This puts the code back to its original
behavior, preserving the already-assigned rpi value (registered or not)
which can be used on the re-login attempts.
Link: https://lore.kernel.org/r/20211123165646.62740-1-jsmart2021@gmail.com
Fixes: fe83e3b9b4 ("scsi: lpfc: Fix node handling for Fabric Controller and Domain Controller")
Cc: <stable@vger.kernel.org> # v5.14+
Co-developed-by: Paul Ely <paul.ely@broadcom.com>
Signed-off-by: Paul Ely <paul.ely@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When a reset is requested the position of the write pointer is updated but
the data in the corresponding zone is not cleared. Instead scsi_debug
returns any data written before the write pointer was reset. This is an
error and prevents using scsi_debug for stale page cache testing of the
BLKRESETZONE ioctl.
Zero written data in the zone when resetting the write pointer.
Link: https://lore.kernel.org/r/20211122061223.298890-1-shinichiro.kawasaki@wdc.com
Fixes: f0d1cf9378 ("scsi: scsi_debug: Add ZBC zone commands")
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This fixes an issue added in commit 4edd8cd4e8 ("scsi: core: sysfs: Fix
hang when device state is set via sysfs") where if userspace is requesting
to set the device state to SDEV_RUNNING when the state is already
SDEV_RUNNING, we return -EINVAL instead of count. The commmit above set ret
to count for this case, when it should have set it to 0.
Link: https://lore.kernel.org/r/20211120164917.4924-1-michael.christie@oracle.com
Fixes: 4edd8cd4e8 ("scsi: core: sysfs: Fix hang when device state is set via sysfs")
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
For updating the IOC firmware's timestamp with system timestamp, the driver
issues the Mpi26IoUnitControlRequest message. While framing the
Mpi26IoUnitControlRequest, the driver should copy the lower 32 bits of the
current timestamp into IOCParameterValue field and the higher 32 bits into
Reserved7 field.
Link: https://lore.kernel.org/r/20211117123215.25487-1-sreekanth.reddy@broadcom.com
Fixes: f98790c003 ("scsi: mpt3sas: Sync time periodically between driver and firmware")
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
While determining the SAS address of a drive, the driver checks whether the
handle number is less than the HBA phy count or not. If the handle number
is less than the HBA phy count then driver assumes that this handle belongs
to HBA and hence it assigns the HBA SAS address.
During IOC firmware downgrade operation, if the number of HBA phys is
reduced and the OS drive's device handle drops below the phy count while
determining the drive's SAS address, the driver ends up using the HBA's SAS
address. This leads to a mismatch of drive's SAS address and hence the
driver unregisters the OS drive and the system goes into read-only mode.
Update the IOC's num_phys to the HBA phy count provided by actual loaded
firmware.
Link: https://lore.kernel.org/r/20211117105058.3505-1-sreekanth.reddy@broadcom.com
Fixes: a5e99fda01 ("scsi: mpt3sas: Update hba_port objects after host reset")
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There's no reason to have a double space between "UFS" and "Temperature",
hence drop it.
Link: https://lore.kernel.org/r/20211106164741.1571206-1-geert@linux-m68k.org
Fixes: e88e2d3220 ("scsi: ufs: core: Probe for temperature notification support")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The build only descends into drivers/scsi/ufs/ if SCSI_UFSHCD is enabled.
Hence all later config symbols should depend on SCSI_UFSHCD to prevent
asking the user about config symbols for driver code that won't be built
anyway. Unfortunately not all symbols have that dependency.
Fix this by wrapping them all into a big if/endif block. Remove the now
superfluous explicit dependencies on SCSI_UFSHCD from all symbols that
already had it.
Link: https://lore.kernel.org/r/20211106164650.1571068-1-geert@linux-m68k.org
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
pm8001_mpi_build_cmd() prepares and sends all commands to a controller.
Having pm80xx_mpi_build_cmd tracepoint can help us with latency issues.
this patch depends on patch "scsi: pm80xx: Add tracepoints".
Link: https://lore.kernel.org/r/20211115215750.131696-3-changyuanl@google.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Co-developed-by: Igor Pylypiv <ipylypiv@google.com>
Signed-off-by: Igor Pylypiv <ipylypiv@google.com>
Signed-off-by: Changyuan Lyu <changyuanl@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Tracepoints for tracking controller and ATA commands issued and completed.
Link: https://lore.kernel.org/r/20211115215750.131696-2-changyuanl@google.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Co-developed-by: Akshat Jain <akshatzen@google.com>
Signed-off-by: Akshat Jain <akshatzen@google.com>
Signed-off-by: Changyuan Lyu <changyuanl@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
We used to allocate X bytes while we only need X bits.
Link: https://lore.kernel.org/r/20211101232825.2350233-5-ipylypiv@google.com
Reviewed-by: Vishakha Channapattan <vishakhavc@google.com>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Igor Pylypiv <ipylypiv@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Starting from commit 05c6c029a4 ("scsi: pm80xx: Increase number of
supported queues") driver initializes only max_q_num queues. Do not use an
invalid queue if the WARN_ON condition is true.
Link: https://lore.kernel.org/r/20211101232825.2350233-4-ipylypiv@google.com
Fixes: 7640e1eb8c ("scsi: pm80xx: Make mpi_build_cmd locking consistent")
Reviewed-by: Vishakha Channapattan <vishakhavc@google.com>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Igor Pylypiv <ipylypiv@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Address-of operator cannot return NULL.
Link: https://lore.kernel.org/r/20211101232825.2350233-3-ipylypiv@google.com
Reviewed-by: Vishakha Channapattan <vishakhavc@google.com>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Igor Pylypiv <ipylypiv@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Phy ID is located in the least significant byte of the 4-byte field.
mpi_phy_stop_resp() already applies such mask.
Link: https://lore.kernel.org/r/20211101232825.2350233-2-ipylypiv@google.com
Reviewed-by: Vishakha Channapattan <vishakhavc@google.com>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Igor Pylypiv <ipylypiv@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In some scenarios START STOP UNIT may time out. The default recovery
time of 30 seconds is relatively large. Modifying rq_timeout to adjust
the START STOP UNIT timeout value will affect the regular I/O.
Commit 9728c0814e ("[SCSI] make scsi_eh_try_stu use block timeout")
switched to rq_timeout for the START STOP UNIT command. However commit
0816c9251a ("[SCSI] Allow error handling timeout to be specified")
introduced an explicit eh_timeout parameter. It makes more sense to
use this value as the timeout for START STOP UNIT.
Link: https://lore.kernel.org/r/1636507412-21678-1-git-send-email-brookxu.cn@gmail.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Wu Bo <wubo40@huawei.com>
Signed-off-by: Chunguang Xu <brookxu@tencent.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Simplify the scsi_host_alloc() implementation by setting the shost_class
.dev_groups member instead of copying all host attribute group pointers
into the shost_dev_attr_groups[] array.
Link: https://lore.kernel.org/r/20211116223115.2103031-1-bvanassche@acm.org
Cc: Steffen Maier <maier@linux.ibm.com>
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Suggested-by: Benjamin Block <bblock@linux.ibm.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
While looping over shost's sdev list it is possible that one
of the drives is getting removed and its sas_target object is
freed but its sdev object remains intact.
Consequently, a kernel panic can occur while the driver is trying to access
the sas_address field of sas_target object without also checking the
sas_target object for NULL.
Link: https://lore.kernel.org/r/20211117104909.2069-1-sreekanth.reddy@broadcom.com
Fixes: f92363d123 ("[SCSI] mpt3sas: add new driver supporting 12GB SAS")
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This was found by coccicheck:
./drivers/scsi/ufs/ufs-mediatek.c, 211, 1-7, ERROR missing put_device;
call of_find_device_by_node on line 1185, but without a corresponding
object release within this function.
Link: https://lore.kernel.org/r/20211110105133.150171-1-ye.guojin@zte.com.cn
Reported-by: Zeal Robot <zealci@zte.com.cn>
Reviewed-by: Peter Wang <peter.wang@mediatek.com>
Signed-off-by: Ye Guojin <ye.guojin@zte.com.cn>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The > comparison needs to be >= to prevent accessing one element beyond the
end of the app_reply->ports[] array.
Link: https://lore.kernel.org/r/20211109115219.GE16587@kili
Fixes: 7878f22a2e ("scsi: qla2xxx: edif: Add getfcinfo and statistic bsgs")
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fix the following sparse warnings in ufshpb_set_hpb_read_to_upiu():
sparse warnings: (new ones prefixed by >>)
drivers/scsi/ufs/ufshpb.c:335:27: sparse: sparse: cast from restricted __be64
drivers/scsi/ufs/ufshpb.c:335:25: sparse: expected restricted __be64 [usertype] ppn_tmp
drivers/scsi/ufs/ufshpb.c:335:25: sparse: got unsigned long long [usertype]
Link: https://lore.kernel.org/r/20211111222452.384089-1-huobean@gmail.com
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Simplify the nested conditionals in the function by using a label for the
error path. Introduce local "shost" to avoid repeated "sdev->shost" usage.
Also remove scsi_eh_complete_abort() since there is now only one place it
would be called.
Link: https://lore.kernel.org/r/20211029194311.17504-3-emilne@redhat.com
Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The SCM changes set the flags in mcp->out_mb instead of mcp->in_mb so the
data was not actually being read into the mcp->mb[] array from the adapter.
Link: https://lore.kernel.org/r/20211108183012.13895-1-emilne@redhat.com
Fixes: 9f2475fe74 ("scsi: qla2xxx: SAN congestion management implementation")
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
hba->outstanding_tasks, which is read under host_lock spinlock, tells the
interrupt handler what task management tags are in use by the driver. The
doorbell register bits indicate which tags are in use by the hardware. A
doorbell bit that is 0 is because the bit has yet to be set by the driver,
or because the task is complete. It is only possible to disambiguate the 2
cases, if reading/writing the doorbell register is synchronized with
reading/writing hba->outstanding_tasks.
For that reason, reading REG_UTP_TASK_REQ_DOOR_BELL must be done under
spinlock.
Link: https://lore.kernel.org/r/20211108064815.569494-3-adrian.hunter@intel.com
Fixes: f5ef336fd2 ("scsi: ufs: core: Fix task management completion")
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
__ufshcd_issue_tm_cmd() clears req->end_io_data after timing out, which
races with the completion function ufshcd_tmc_handler() which expects
req->end_io_data to have a value.
Note __ufshcd_issue_tm_cmd() and ufshcd_tmc_handler() are already
synchronized using hba->tmf_rqs and hba->outstanding_tasks under the
host_lock spinlock.
It is also not necessary (nor typical) to clear req->end_io_data because
the block layer does it before allocating out requests e.g. via
blk_get_request().
So fix by not clearing it.
Link: https://lore.kernel.org/r/20211108064815.569494-2-adrian.hunter@intel.com
Fixes: f5ef336fd2 ("scsi: ufs: core: Fix task management completion")
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This fixes a regression added with:
commit f0f82e2476 ("scsi: core: Fix capacity set to zero after
offlinining device")
The problem is that after iSCSI recovery, iscsid will call into the kernel
to set the dev's state to running, and with that patch we now call
scsi_rescan_device() with the state_mutex held. If the SCSI error handler
thread is just starting to test the device in scsi_send_eh_cmnd() then it's
going to try to grab the state_mutex.
We are then stuck, because when scsi_rescan_device() tries to send its I/O
scsi_queue_rq() calls -> scsi_host_queue_ready() -> scsi_host_in_recovery()
which will return true (the host state is still in recovery) and I/O will
just be requeued. scsi_send_eh_cmnd() will then never be able to grab the
state_mutex to finish error handling.
To prevent the deadlock move the rescan-related code to after we drop the
state_mutex.
This also adds a check for if we are already in the running state. This
prevents extra scans and helps the iscsid case where if the transport class
has already onlined the device during its recovery process then we don't
need userspace to do it again plus possibly block that daemon.
Link: https://lore.kernel.org/r/20211105221048.6541-3-michael.christie@oracle.com
Fixes: f0f82e2476 ("scsi: core: Fix capacity set to zero after offlinining device")
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: lijinlin <lijinlin3@huawei.com>
Cc: Wu Bo <wubo40@huawei.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Reviewed-by: Wu Bo <wubo40@huawei.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
We can race where iscsi_session_recovery_timedout() has woken up the error
handler thread and it's now setting the devices to offline, and
session_recovery_timedout()'s call to scsi_target_unblock() is also trying
to set the device's state to transport-offline. We can then get a mix of
states.
For the case where we can't relogin we want the devices to be in
transport-offline so when we have repaired the connection
__iscsi_unblock_session() can set the state back to running.
Set the device state then call into libiscsi to wake up the error handler.
Link: https://lore.kernel.org/r/20211105221048.6541-2-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The following has been observed on a test setup:
WARNING: CPU: 4 PID: 250 at drivers/scsi/ufs/ufshcd.c:2737 ufshcd_queuecommand+0x468/0x65c
Call trace:
ufshcd_queuecommand+0x468/0x65c
scsi_send_eh_cmnd+0x224/0x6a0
scsi_eh_test_devices+0x248/0x418
scsi_eh_ready_devs+0xc34/0xe58
scsi_error_handler+0x204/0x80c
kthread+0x150/0x1b4
ret_from_fork+0x10/0x30
That warning is triggered by the following statement:
WARN_ON(lrbp->cmd);
Fix this warning by clearing lrbp->cmd from the abort handler.
Link: https://lore.kernel.org/r/20211104181059.4129537-1-bvanassche@acm.org
Fixes: 7a3e97b0dc ("[SCSI] ufshcd: UFS Host controller driver")
Reviewed-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This series is all the stragglers that didn't quite make the first
merge window pull. It's mostly minor updates and bug fixes of merge
window code but it also has two driver updates: ufs and qla2xxx.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYY5mOyYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishXpjAQDboVkH
7RQblJf8AKDMjN2baSIrmbk7qEUqzRgo6Ef3egEAi044Gx4KqBwzBLiCREcFW/Mt
F95pt5udsLypGhpfZlE=
=fiv8
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull more SCSI updates from James Bottomley:
"This series is all the stragglers that didn't quite make the first
merge window pull. It's mostly minor updates and bug fixes of merge
window code but it also has two driver updates: ufs and qla2xxx"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (46 commits)
scsi: scsi_debug: Don't call kcalloc() if size arg is zero
scsi: core: Remove command size deduction from scsi_setup_scsi_cmnd()
scsi: scsi_ioctl: Validate command size
scsi: ufs: ufshpb: Properly handle max-single-cmd
scsi: core: Avoid leaving shost->last_reset with stale value if EH does not run
scsi: bsg: Fix errno when scsi_bsg_register_queue() fails
scsi: sr: Remove duplicate assignment
scsi: ufs: ufs-exynos: Introduce ExynosAuto v9 virtual host
scsi: ufs: ufs-exynos: Multi-host configuration for ExynosAuto v9
scsi: ufs: ufs-exynos: Support ExynosAuto v9 UFS
scsi: ufs: ufs-exynos: Add pre/post_hce_enable drv callbacks
scsi: ufs: ufs-exynos: Factor out priv data init
scsi: ufs: ufs-exynos: Add EXYNOS_UFS_OPT_SKIP_CONFIG_PHY_ATTR option
scsi: ufs: ufs-exynos: Support custom version of ufs_hba_variant_ops
scsi: ufs: ufs-exynos: Add setup_clocks callback
scsi: ufs: ufs-exynos: Add refclkout_stop control
scsi: ufs: ufs-exynos: Simplify drv_data retrieval
scsi: ufs: ufs-exynos: Change pclk available max value
scsi: ufs: Add quirk to enable host controller without PH configuration
scsi: ufs: Add quirk to handle broken UIC command
...
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmGKqAcQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpojrD/4yA+GgV+jWeIepYWvU81TQFpt9AJmzWbrY
uryj4dy7EdMjun+JkAP8k4qreqvTZRsJMkr9dhmS4qaM8/Vt8K/RU/0n/lxNVmqc
1//ZaTS6DURVAc52GHIXD3q4cv8pHofTZZlrj1Hgz35shlOayStGJtktH5f8uQl4
5Yxjh+HKr15Chym+fKlbR6T7BgVxxNyhT9q89BgUwMAJX+1KRVtwtkyVK5IbObFy
zOeiC+n9niQ6iJHcLoqb7LjfBOs/VjdNOQYGSCAnrBxuQ8GnEP2xDw2nvFlOPE12
5tWEwTgAX7381ilbL6VvNTlTafIs/Axt8mI0cY/OMW7ApiHwO3rXjQSqA4yrnKCJ
h6M1QavqThd2DtMnOi0U5wwgtD2UjS+CMpK5XFxeIyl6GqTgZcaWm3VqRnG68KZD
r5+o99GKWCHy0cckxq2WiWJouReeNZ9u9R6HNDw0Vb8UNyWgBR+v2MkX+SHS/c85
2gXm10hwBH7BFnC4X8ceiuT/bm7xm9S6D/3LCVitlUTBRfqobsQEQjSciPeoOtL0
rRSTKob7jtokiB2q01wx3q1jnUMpxE1fqJkpLjUvebTzw+a+xfPwy0nNTGq0XXIv
WMVRRpSWCZm04Ru0q/K8cj0GOyur5x+ilefZ1V+/sRU5dVmGuJgbJUxei1HPC6eV
z9Rn0aFv4g==
=1GPi
-----END PGP SIGNATURE-----
Merge tag 'for-5.16/block-2021-11-09' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
- Set of fixes for the batched tag allocation (Ming, me)
- add_disk() error handling fix (Luis)
- Nested queue quiesce fixes (Ming)
- Shared tags init error handling fix (Ye)
- Misc cleanups (Jean, Ming, me)
* tag 'for-5.16/block-2021-11-09' of git://git.kernel.dk/linux-block:
nvme: wait until quiesce is done
scsi: make sure that request queue queiesce and unquiesce balanced
scsi: avoid to quiesce sdev->request_queue two times
blk-mq: add one API for waiting until quiesce is done
blk-mq: don't free tags if the tag_set is used by other device in queue initialztion
block: fix device_add_disk() kobject_create_and_add() error handling
block: ensure cached plug request matches the current queue
block: move queue enter logic into blk_mq_submit_bio()
block: make bio_queue_enter() fast-path available inline
block: split request allocation components into helpers
block: have plug stored requests hold references to the queue
blk-mq: update hctx->nr_active in blk_mq_end_request_batch()
blk-mq: add RQF_ELV debug entry
blk-mq: only try to run plug merge if request has same queue with incoming bio
block: move RQF_ELV setting into allocators
dm: don't stop request queue after the dm device is suspended
block: replace always false argument with 'false'
block: assign correct tag before doing prefetch of request
blk-mq: fix redundant check of !e expression
For fixing queue quiesce race between driver and block layer(elevator
switch, update nr_requests, ...), we need to support concurrent quiesce
and unquiesce, which requires the two call balanced.
It isn't easy to audit that in all scsi drivers, especially the two may
be called from different contexts, so do it in scsi core with one
per-device atomic variable to balance quiesce and unquiesce.
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Fixes: e70feb8b3e ("blk-mq: support concurrent queue quiesce/unquiesce")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20211109071144.181581-4-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
For fixing queue quiesce race between driver and block layer(elevator
switch, update nr_requests, ...), we need to support concurrent quiesce
and unquiesce, which requires the two to be balanced.
blk_mq_quiesce_queue() calls blk_mq_quiesce_queue_nowait() for updating
quiesce depth and marking the flag, then scsi_internal_device_block() calls
blk_mq_quiesce_queue_nowait() two times actually.
Fix the double quiesce and keep quiesce and unquiesce balanced.
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Fixes: e70feb8b3e ("blk-mq: support concurrent queue quiesce/unquiesce")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20211109071144.181581-3-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This series consists of the usual driver updates (ufs, smartpqi, lpfc,
target, megaraid_sas, hisi_sas, qla2xxx) and minor updates and bug
fixes. Notable core changes are the removal of scsi->tag which caused
some churn in obsolete drivers and a sweep through all drivers to call
scsi_done() directly instead of scsi->done() which removes a pointer
indirection from the hot path and a move to register core sysfs files
earlier, which means they're available to KOBJ_ADD processing, which
necessitates switching all drivers to using attribute groups.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYYUfBCYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishbUJAQDZt4oc
vUx9JpyrdHxxTCuOzVFd8W1oJn0k5ltCBuz4yAD8DNbGhGm93raMSJ3FOOlzLEbP
RG8vBdpxMudlvxAPi/A=
=BSFz
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This consists of the usual driver updates (ufs, smartpqi, lpfc,
target, megaraid_sas, hisi_sas, qla2xxx) and minor updates and bug
fixes.
Notable core changes are the removal of scsi->tag which caused some
churn in obsolete drivers and a sweep through all drivers to call
scsi_done() directly instead of scsi->done() which removes a pointer
indirection from the hot path and a move to register core sysfs files
earlier, which means they're available to KOBJ_ADD processing, which
necessitates switching all drivers to using attribute groups"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (279 commits)
scsi: lpfc: Update lpfc version to 14.0.0.3
scsi: lpfc: Allow fabric node recovery if recovery is in progress before devloss
scsi: lpfc: Fix link down processing to address NULL pointer dereference
scsi: lpfc: Allow PLOGI retry if previous PLOGI was aborted
scsi: lpfc: Fix use-after-free in lpfc_unreg_rpi() routine
scsi: lpfc: Correct sysfs reporting of loop support after SFP status change
scsi: lpfc: Wait for successful restart of SLI3 adapter during host sg_reset
scsi: lpfc: Revert LOG_TRACE_EVENT back to LOG_INIT prior to driver_resource_setup()
scsi: ufs: ufshcd-pltfrm: Fix memory leak due to probe defer
scsi: ufs: mediatek: Avoid sched_clock() misuse
scsi: mpt3sas: Make mpt3sas_dev_attrs static
scsi: scsi_transport_sas: Add 22.5 Gbps link rate definitions
scsi: target: core: Stop using bdevname()
scsi: aha1542: Use memcpy_{from,to}_bvec()
scsi: sr: Add error handling support for add_disk()
scsi: sd: Add error handling support for add_disk()
scsi: target: Perform ALUA group changes in one step
scsi: target: Replace lun_tg_pt_gp_lock with rcu in I/O path
scsi: target: Fix alua_tg_pt_gps_count tracking
scsi: target: Fix ordered tag handling
...
If the size arg to kcalloc() is zero, it returns ZERO_SIZE_PTR. Because of
that, for a following NULL pointer check to work on the returned pointer,
kcalloc() must not be called with the size arg equal to zero. Return early
without error before the kcalloc() call if size arg is zero.
BUG: KASAN: null-ptr-deref in memcpy include/linux/fortify-string.h:191 [inline]
BUG: KASAN: null-ptr-deref in sg_copy_buffer+0x138/0x240 lib/scatterlist.c:974
Write of size 4 at addr 0000000000000010 by task syz-executor.1/22789
CPU: 1 PID: 22789 Comm: syz-executor.1 Not tainted 5.15.0-syzk #1
Hardware name: Red Hat KVM, BIOS 1.13.0-2
Call Trace:
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x89/0xb5 lib/dump_stack.c:106
__kasan_report mm/kasan/report.c:446 [inline]
kasan_report.cold.14+0x112/0x117 mm/kasan/report.c:459
check_region_inline mm/kasan/generic.c:183 [inline]
kasan_check_range+0x1a3/0x210 mm/kasan/generic.c:189
memcpy+0x3b/0x60 mm/kasan/shadow.c:66
memcpy include/linux/fortify-string.h:191 [inline]
sg_copy_buffer+0x138/0x240 lib/scatterlist.c:974
do_dout_fetch drivers/scsi/scsi_debug.c:2954 [inline]
do_dout_fetch drivers/scsi/scsi_debug.c:2946 [inline]
resp_verify+0x49e/0x930 drivers/scsi/scsi_debug.c:4276
schedule_resp+0x4d8/0x1a70 drivers/scsi/scsi_debug.c:5478
scsi_debug_queuecommand+0x8c9/0x1ec0 drivers/scsi/scsi_debug.c:7533
scsi_dispatch_cmd drivers/scsi/scsi_lib.c:1520 [inline]
scsi_queue_rq+0x16b0/0x2d40 drivers/scsi/scsi_lib.c:1699
blk_mq_dispatch_rq_list+0xb9b/0x2700 block/blk-mq.c:1639
__blk_mq_sched_dispatch_requests+0x28f/0x590 block/blk-mq-sched.c:325
blk_mq_sched_dispatch_requests+0x105/0x190 block/blk-mq-sched.c:358
__blk_mq_run_hw_queue+0xe5/0x150 block/blk-mq.c:1761
__blk_mq_delay_run_hw_queue+0x4f8/0x5c0 block/blk-mq.c:1838
blk_mq_run_hw_queue+0x18d/0x350 block/blk-mq.c:1891
blk_mq_sched_insert_request+0x3db/0x4e0 block/blk-mq-sched.c:474
blk_execute_rq_nowait+0x16b/0x1c0 block/blk-exec.c:62
blk_execute_rq+0xdb/0x360 block/blk-exec.c:102
sg_scsi_ioctl drivers/scsi/scsi_ioctl.c:621 [inline]
scsi_ioctl+0x8bb/0x15c0 drivers/scsi/scsi_ioctl.c:930
sg_ioctl_common+0x172d/0x2710 drivers/scsi/sg.c:1112
sg_ioctl+0xa2/0x180 drivers/scsi/sg.c:1165
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:874 [inline]
__se_sys_ioctl fs/ioctl.c:860 [inline]
__x64_sys_ioctl+0x19d/0x220 fs/ioctl.c:860
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3a/0x80 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
Link: https://lore.kernel.org/r/1636056397-13151-1-git-send-email-george.kennedy@oracle.com
Reported-by: syzkaller <syzkaller@googlegroups.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: George Kennedy <george.kennedy@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
No need to deduce command size in scsi_setup_scsi_cmnd() anymore as
appropriate checks have been added to scsi_fill_sghdr_rq() function and the
cmd_len should never be zero here. The code to do that wasn't correct
anyway, as it used uninitialized cmd->cmnd, which caused a null-ptr-deref
if the command size was zero as in the trace below. Fix this by removing
the unneeded code.
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 0 PID: 1822 Comm: repro Not tainted 5.15.0 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-4.fc34 04/01/2014
Call Trace:
blk_mq_dispatch_rq_list+0x7c7/0x12d0
__blk_mq_sched_dispatch_requests+0x244/0x380
blk_mq_sched_dispatch_requests+0xf0/0x160
__blk_mq_run_hw_queue+0xe8/0x160
__blk_mq_delay_run_hw_queue+0x252/0x5d0
blk_mq_run_hw_queue+0x1dd/0x3b0
blk_mq_sched_insert_request+0x1ff/0x3e0
blk_execute_rq_nowait+0x173/0x1e0
blk_execute_rq+0x15c/0x540
sg_io+0x97c/0x1370
scsi_ioctl+0xe16/0x28e0
sd_ioctl+0x134/0x170
blkdev_ioctl+0x362/0x6e0
block_ioctl+0xb0/0xf0
vfs_ioctl+0xa7/0xf0
do_syscall_64+0x3d/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
---[ end trace 8b086e334adef6d2 ]---
Kernel panic - not syncing: Fatal exception
Link: https://lore.kernel.org/r/20211103170659.22151-2-tadeusz.struk@linaro.org
Fixes: 2ceda20f0a ("scsi: core: Move command size detection out of the fast path")
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: James E.J. Bottomley <jejb@linux.ibm.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: <linux-scsi@vger.kernel.org>
Cc: <linux-kernel@vger.kernel.org>
Cc: <stable@vger.kernel.org> # 5.15, 5.14, 5.10
Reported-by: syzbot+5516b30f5401d4dcbcae@syzkaller.appspotmail.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>