linux

mirror of synced 2025-03-06 20:59:54 +01:00

Author	SHA1	Message	Date
Johannes Berg	2fcb4090cd	Revert "um: allocate a guard page to helper threads" This reverts commit `ef4459a6da` ("um: allocate a guard page to helper threads"), it's broken in multiple ways: 1) the free no longer matches the alloc; and 2) more importantly, the set_memory_ro() causes allocation of page tables for the normal memory that doesn't have any, and that later causes corruption and crashes (usually but not always in vfree()). We could fix the first bug and use vmalloc() to work around the second, but set_memory_ro() actually doesn't do anything either so I'll just revert that as well. Reported-by: Benjamin Berg <benjamin@sipsolutions.net> Fixes: `ef4459a6da` ("um: allocate a guard page to helper threads") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2021-01-26 22:11:38 +01:00
Hajime Tazaki	94c41b3a7c	um: ubd: fix command line handling of ubd This commit fixes a regression to handle command line parameters of ubd. With a simple line "./linux ubd0="./disk-ext4.img", it fails at ubd_setup_common(). The commit adds additional checks to the variables in order to properly parse the paremeters which previously worked. Fixes: `ef3ba87cb7` ("um: ubd: Set device serial attribute from cmdline") Cc: Christopher Obbard <chris.obbard@collabora.com> Signed-off-by: Hajime Tazaki <thehajime@gmail.com> Acked-by: Christopher Obbard <chris.obbard@collabora.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2021-01-26 22:10:44 +01:00
Johannes Berg	ef4459a6da	um: allocate a guard page to helper threads We've been running into stack overflows in helper threads corrupting memory (e.g. because somebody put printf() or os_info() there), so to avoid those causing hard-to-debug issues later on, allocate a guard page for helper thread stacks and mark it read-only. Unfortunately, the crash dump at that point is useless as the stack tracer will try to backtrace the kernel thread, not the helper thread, but at least we don't survive to a random issue caused by corruption. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2020-12-13 22:38:06 +01:00
Johannes Berg	36d46a5907	um: Support dynamic IRQ allocation It's cumbersome and error-prone to keep adding fixed IRQ numbers, and for proper device wakeup support for the virtio/vhost-user support we need to have different IRQs for each device. Even if in theory two IRQs (with and without wake) might be sufficient, it's much easier to reason about it when we have dynamic number assignment. It also makes it easier to add new devices that may dynamically exist or depending on the configuration, etc. Add support for this, up to 64 IRQs (the same limit as epoll FDs we have right now). Since it's not easy to port all the existing places to dynamic allocation (some data is statically initialized) keep the low numbers are reserved for the existing hard-coded IRQ numbers. Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2020-12-13 22:22:08 +01:00
Christopher Obbard	ef3ba87cb7	um: ubd: Set device serial attribute from cmdline Adds the ability to set the UBD device serial number from the commandline, disabling the serial number functionality by default. In some cases it may be useful to set a serial to the UBD device, such that downstream users (i.e. udev) can use this information to better describe the hardware to the user from the UML cmdline. In our case we use this parameter to create some entries under /dev/disk/by-ubd-id/ for each of the UBD devices passed through the UML cmdline. Signed-off-by: Christopher Obbard <chris.obbard@collabora.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2020-12-13 22:22:01 +01:00
Gabriel Krisman Bertazi	fc6b6a872d	um: ubd: Submit all data segments atomically Internally, UBD treats each physical IO segment as a separate command to be submitted in the execution pipe. If the pipe returns a transient error after a few segments have already been written, UBD will tell the block layer to requeue the request, but there is no way to reclaim the segments already submitted. When a new attempt to dispatch the request is done, those segments already submitted will get duplicated, causing the WARN_ON below in the best case, and potentially data corruption. In my system, running a UML instance with 2GB of RAM and a 50M UBD disk, I can reproduce the WARN_ON by simply running mkfs.fvat against the disk on a freshly booted system. There are a few ways to around this, like reducing the pressure on the pipe by reducing the queue depth, which almost eliminates the occurrence of the problem, increasing the pipe buffer size on the host system, or by limiting the request to one physical segment, which causes the block layer to submit way more requests to resolve a single operation. Instead, this patch modifies the format of a UBD command, such that all segments are sent through a single element in the communication pipe, turning the command submission atomic from the point of view of the block layer. The new format has a variable size, depending on the number of elements, and looks like this: +------------+-----------+-----------+------------ \| cmd_header \| segment 0 \| segment 1 \| segment ... +------------+-----------+-----------+------------ With this format, we push a pointer to cmd_header in the submission pipe. This has the advantage of reducing the memory footprint of executing a single request, since it allow us to merge some fields in the header. It is possible to reduce even further each segment memory footprint, by merging bitmap_words and cow_offset, for instance, but this is not the focus of this patch and is left as future work. One issue with the patch is that for a big number of segments, we now perform one big memory allocation instead of multiple small ones, but I wasn't able to trigger any real issues or -ENOMEM because of this change, that wouldn't be reproduced otherwise. This was tested using fio with the verify-crc32 option, and by running an ext4 filesystem over this UBD device. The original WARN_ON was: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 0 at lib/refcount.c:28 refcount_warn_saturate+0x13f/0x141 refcount_t: underflow; use-after-free. Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 5.5.0-rc6-00002-g2a5bb2cf75c8 #346 Stack: 6084eed0 6063dc77 00000009 6084ef60 00000000 604b8d9f 6084eee0 6063dcbc 6084ef40 6006ab8d e013d780 1c00000000 Call Trace: [<600a0c1c>] ? printk+0x0/0x94 [<6004a888>] show_stack+0x13b/0x155 [<6063dc77>] ? dump_stack_print_info+0xdf/0xe8 [<604b8d9f>] ? refcount_warn_saturate+0x13f/0x141 [<6063dcbc>] dump_stack+0x2a/0x2c [<6006ab8d>] __warn+0x107/0x134 [<6008da6c>] ? wake_up_process+0x17/0x19 [<60487628>] ? blk_queue_max_discard_sectors+0x0/0xd [<6006b05f>] warn_slowpath_fmt+0xd1/0xdf [<6006af8e>] ? warn_slowpath_fmt+0x0/0xdf [<600acc14>] ? raw_read_seqcount_begin.constprop.0+0x0/0x15 [<600619ae>] ? os_nsecs+0x1d/0x2b [<604b8d9f>] refcount_warn_saturate+0x13f/0x141 [<6048bc8f>] refcount_sub_and_test.constprop.0+0x2f/0x37 [<6048c8de>] blk_mq_free_request+0xf1/0x10d [<6048ca06>] __blk_mq_end_request+0x10c/0x114 [<6005ac0f>] ubd_intr+0xb5/0x169 [<600a1a37>] __handle_irq_event_percpu+0x6b/0x17e [<600a1b70>] handle_irq_event_percpu+0x26/0x69 [<600a1bd9>] handle_irq_event+0x26/0x34 [<600a1bb3>] ? handle_irq_event+0x0/0x34 [<600a5186>] ? unmask_irq+0x0/0x37 [<600a57e6>] handle_edge_irq+0xbc/0xd6 [<600a131a>] generic_handle_irq+0x21/0x29 [<60048f6e>] do_IRQ+0x39/0x54 [...] ---[ end trace c6e7444e55386c0f ]--- Cc: Christopher Obbard <chris.obbard@collabora.com> Reported-by: Martyn Welch <martyn@collabora.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Tested-by: Christopher Obbard <chris.obbard@collabora.com> Acked-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2020-12-13 22:21:55 +01:00
Gabriel Krisman Bertazi	e355b2f55e	um: ubd: Retry buffer read on any kind of error Should bulk_req_safe_read return an error, we want to retry the read, otherwise, even though no IO will be done, os_write_file might still end up writing garbage to the pipe. Cc: Martyn Welch <martyn.welch@collabora.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2020-03-29 23:21:37 +02:00
Gabriel Krisman Bertazi	6e682d53fc	um: ubd: Prevent buffer overrun on command completion On the hypervisor side, when completing commands and the pipe is full, we retry writing only the entries that failed, by offsetting io_req_buffer, but we don't reduce the number of bytes written, which can cause a buffer overrun of io_req_buffer, and write garbage to the pipe. Cc: Martyn Welch <martyn.welch@collabora.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2020-03-29 23:21:33 +02:00
Linus Torvalds	33c84e89ab	SCSI misc on 20200129 This series is slightly unusual because it includes Arnd's compat ioctl tree here: `1c46a2cf2d` Merge tag 'block-ioctl-cleanup-5.6' into 5.6/scsi-queue Excluding Arnd's changes, this is mostly an update of the usual drivers: megaraid_sas, mpt3sas, qla2xxx, ufs, lpfc, hisi_sas. There are a couple of core and base updates around error propagation and atomicity in the attribute container base we use for the SCSI transport classes. The rest is minor changes and updates. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXjHQJyYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishZZ8AQC02N+v iUnTl1YxGPjIWBbnHuUxN2Qbb9D3C6gAT1LkigEArlk163K3A1XEQHF/VNCdAz/f 01XYTd3p1VHuegIBHlk= =Cn52 -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "This series is slightly unusual because it includes Arnd's compat ioctl tree here: `1c46a2cf2d` Merge tag 'block-ioctl-cleanup-5.6' into 5.6/scsi-queue Excluding Arnd's changes, this is mostly an update of the usual drivers: megaraid_sas, mpt3sas, qla2xxx, ufs, lpfc, hisi_sas. There are a couple of core and base updates around error propagation and atomicity in the attribute container base we use for the SCSI transport classes. The rest is minor changes and updates" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (149 commits) scsi: hisi_sas: Rename hisi_sas_cq.pci_irq_mask scsi: hisi_sas: Add prints for v3 hw interrupt converge and automatic affinity scsi: hisi_sas: Modify the file permissions of trigger_dump to write only scsi: hisi_sas: Replace magic number when handle channel interrupt scsi: hisi_sas: replace spin_lock_irqsave/spin_unlock_restore with spin_lock/spin_unlock scsi: hisi_sas: use threaded irq to process CQ interrupts scsi: ufs: Use UFS device indicated maximum LU number scsi: ufs: Add max_lu_supported in struct ufs_dev_info scsi: ufs: Delete is_init_prefetch from struct ufs_hba scsi: ufs: Inline two functions into their callers scsi: ufs: Move ufshcd_get_max_pwr_mode() to ufshcd_device_params_init() scsi: ufs: Split ufshcd_probe_hba() based on its called flow scsi: ufs: Delete struct ufs_dev_desc scsi: ufs: Fix ufshcd_probe_hba() reture value in case ufshcd_scsi_add_wlus() fails scsi: ufs-mediatek: enable low-power mode for hibern8 state scsi: ufs: export some functions for vendor usage scsi: ufs-mediatek: add dbg_register_dump implementation scsi: qla2xxx: Fix a NULL pointer dereference in an error path scsi: qla1280: Make checking for 64bit support consistent scsi: megaraid_sas: Update driver version to 07.713.01.00-rc1 ...	2020-01-29 18:16:16 -08:00
Arnd Bergmann	ab0cf1e425	compat_ioctl: ubd, aoe: use blkdev_compat_ptr_ioctl These drivers implement the HDIO_GET_IDENTITY and CDROMVOLREAD ioctl commands, which are compatible between 32-bit and 64-bit user space and traditionally handled by compat_blkdev_driver_ioctl(). As a prerequisite to removing that function, make both drivers use blkdev_compat_ptr_ioctl() as their .compat_ioctl callback. Reviewed-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2020-01-03 09:33:02 +01:00
Arnd Bergmann	853bc0ab34	um: ubd: use 64-bit time_t where possible The ubd code suffers from a possible y2038 overflow on 32-bit architectures, both for the cow header and the os_file_modtime() function. Replace time_t with time64_t to extend the ubd_kern side as much as possible. Whether this makes a difference for the user side depends on the host libc implementation that may use either 32-bit or 64-bit time_t. For the cow file format, the header contains an unsigned 32-bit timestamp, which is good until y2106, passing this through a 'long long' gives us a consistent interpretation between 32-bit and 64-bit um kernels. Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2019-12-18 18:07:31 +01:00
Anton Ivanov	d848074b2f	um-ubd: Entrust re-queue to the upper layers Fixes crashes due to ubd requeue logic conflicting with the block-mq logic. Crash is reproducible in 5.0 - 5.3. Fixes: `53766defb8` ("um: Clean-up command processing in UML UBD driver") Cc: stable@vger.kernel.org # v5.0+ Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-10-29 10:07:41 -06:00
Alex Dewar	dbddf429dc	um: Add SPDX headers for files in arch/um/drivers Convert files to use SPDX header. All files are licensed under the GPLv2. Signed-off-by: Alex Dewar <alex.dewar@gmx.co.uk> Signed-off-by: Richard Weinberger <richard@nod.at>	2019-09-15 21:37:16 +02:00
Daniel Walter	9ca55299f2	um: Do not unlock mutex that is not hold. Return error instead of trying to unlock a mutex that is not hold. Signed-off-by: Daniel Walter <dwalter@google.com> Reviewed-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Acked-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2019-05-07 23:18:28 +02:00
Anton Ivanov	aea05eb56e	um: Fix for a possible OOPS in ubd initialization If the ubd device failed to allocate a queue during initialization it tried call blk_cleanup_queue resulting in an oops. This patch simplifies the cleanup logic and ensures that blk_queue_cleanup is called only if there is a valid queue. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2019-03-06 22:39:18 +01:00
Anton Ivanov	940b241d90	um: Remove obsolete reenable_XX calls reenable_fd has been a NOP since the introduction of the EPOLL based interrupt controller. reenable_channel() is no longer needed as the flow control is now handled via the write IRQs on the channel. Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2018-12-27 22:48:35 +01:00
Anton Ivanov	50109b5a03	um: Add support for DISCARD in the UBD Driver Support for DISCARD and WRITE_ZEROES in the ubd driver using fallocate. DISCARD is enabled by default and can be disabled using a new UBD command line flag. If the underlying fs on which the UBD image is stored does not support DISCARD the support for both DISCARD and WRITE_ZEROES is turned off. Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2018-12-27 22:48:20 +01:00
Anton Ivanov	a41421edb9	um: Remove unsafe printks from the io thread Printk out of the io thread has been proven to be unsafe. This is not surprising as the thread is part of the UML hypervisor code. It is not supposed to invoke any kernel code/resources. It is necesssary to pass the error to the block IO layer and let it Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2018-12-27 22:48:20 +01:00
Anton Ivanov	53766defb8	um: Clean-up command processing in UML UBD driver Clean-up command processing and return BLK_STS_NOTSUP for uknown commands. Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2018-12-27 22:48:20 +01:00
Anton Ivanov	a43c83161a	um: Switch to block-mq constants in the UML UBD driver Switch to block mq-constants for both commands, error codes and various computations. Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2018-12-27 22:48:20 +01:00
Anton Ivanov	0033dfd92a	ubd: fix missing initialization of io_req The SYNC path doesn't initialize io_req->error, which can cause random errors. Before the conversion to blk-mq, we always completed requests with BLK_STS_OK status, but now we actually look at the error field and this issue becomes apparent. Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> [axboe: fixed up commit message to explain what is actually going on] Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-11-08 06:16:09 -07:00
Jens Axboe	6961cd4d0f	ubd: fix missing lock around request issue We need to hold the device lock (and disable interrupts) while writing new commands, or we could be interrupted while that is happening and read invalid requests in the completion path. Fixes: `4e6da0fe80` ("um: Convert ubd driver to blk-mq") Tested-by: Richard Weinberger <richard@nod.at> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-11-07 14:34:05 -07:00
Christoph Hellwig	ecb0a83e31	ubd: remove use of blk_rq_map_sg There is no good reason to create a scatterlist in the ubd driver, it can just iterate the request directly. Signed-off-by: Christoph Hellwig <hch@lst.de> [rw: Folded in improvements as discussed with hch and jens] Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-10-18 15:13:12 -06:00
Richard Weinberger	4e6da0fe80	um: Convert ubd driver to blk-mq Convert the driver to the modern blk-mq framework. As byproduct we get rid of our open coded restart logic and let blk-mq handle it. Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-10-14 12:48:50 -06:00
Hannes Reinecke	fef912bf86	block: genhd: add 'groups' argument to device_add_disk Update device_add_disk() to take an 'groups' argument so that individual drivers can register a device with additional sysfs attributes. This avoids race condition the driver would otherwise have if these groups were to be created with sysfs_add_groups(). Signed-off-by: Martin Wilck <martin.wilck@suse.com> Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-09-28 08:30:28 -06:00
Kees Cook	6da2ec5605	treewide: kmalloc() -> kmalloc_array() The kmalloc() function has a 2-factor argument form, kmalloc_array(). This patch replaces cases of: kmalloc(a * b, gfp) with: kmalloc_array(a * b, gfp) as well as handling cases of: kmalloc(a * b * c, gfp) with: kmalloc(array3_size(a, b, c), gfp) as it's slightly less ugly than: kmalloc_array(array_size(a, b), c, gfp) This does, however, attempt to ignore constant size factors like: kmalloc(4 * 1024, gfp) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The tools/ directory was manually excluded, since it has its own implementation of kmalloc(). The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ type TYPE; expression THING, E; @@ ( kmalloc( - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) \| kmalloc( - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression COUNT; typedef u8; typedef __u8; @@ ( kmalloc( - sizeof(u8) * (COUNT) + COUNT , ...) \| kmalloc( - sizeof(__u8) * (COUNT) + COUNT , ...) \| kmalloc( - sizeof(char) * (COUNT) + COUNT , ...) \| kmalloc( - sizeof(unsigned char) * (COUNT) + COUNT , ...) \| kmalloc( - sizeof(u8) * COUNT + COUNT , ...) \| kmalloc( - sizeof(__u8) * COUNT + COUNT , ...) \| kmalloc( - sizeof(char) * COUNT + COUNT , ...) \| kmalloc( - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( - kmalloc + kmalloc_array ( - sizeof(TYPE) * (COUNT_ID) + COUNT_ID, sizeof(TYPE) , ...) \| - kmalloc + kmalloc_array ( - sizeof(TYPE) * COUNT_ID + COUNT_ID, sizeof(TYPE) , ...) \| - kmalloc + kmalloc_array ( - sizeof(TYPE) * (COUNT_CONST) + COUNT_CONST, sizeof(TYPE) , ...) \| - kmalloc + kmalloc_array ( - sizeof(TYPE) * COUNT_CONST + COUNT_CONST, sizeof(TYPE) , ...) \| - kmalloc + kmalloc_array ( - sizeof(THING) * (COUNT_ID) + COUNT_ID, sizeof(THING) , ...) \| - kmalloc + kmalloc_array ( - sizeof(THING) * COUNT_ID + COUNT_ID, sizeof(THING) , ...) \| - kmalloc + kmalloc_array ( - sizeof(THING) * (COUNT_CONST) + COUNT_CONST, sizeof(THING) , ...) \| - kmalloc + kmalloc_array ( - sizeof(THING) * COUNT_CONST + COUNT_CONST, sizeof(THING) , ...) ) // 2-factor product, only identifiers. @@ identifier SIZE, COUNT; @@ - kmalloc + kmalloc_array ( - SIZE * COUNT + COUNT, SIZE , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( kmalloc( - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| kmalloc( - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| kmalloc( - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| kmalloc( - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| kmalloc( - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) \| kmalloc( - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) \| kmalloc( - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) \| kmalloc( - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( kmalloc( - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) \| kmalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) \| kmalloc( - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) \| kmalloc( - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) \| kmalloc( - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) \| kmalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ identifier STRIDE, SIZE, COUNT; @@ ( kmalloc( - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) \| kmalloc( - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) \| kmalloc( - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| kmalloc( - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) \| kmalloc( - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| kmalloc( - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| kmalloc( - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| kmalloc( - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products, // when they're not all constants... @@ expression E1, E2, E3; constant C1, C2, C3; @@ ( kmalloc(C1 * C2 * C3, ...) \| kmalloc( - (E1) * E2 * E3 + array3_size(E1, E2, E3) , ...) \| kmalloc( - (E1) * (E2) * E3 + array3_size(E1, E2, E3) , ...) \| kmalloc( - (E1) * (E2) * (E3) + array3_size(E1, E2, E3) , ...) \| kmalloc( - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants, // keeping sizeof() as the second factor argument. @@ expression THING, E1, E2; type TYPE; constant C1, C2, C3; @@ ( kmalloc(sizeof(THING) * C2, ...) \| kmalloc(sizeof(TYPE) * C2, ...) \| kmalloc(C1 * C2 * C3, ...) \| kmalloc(C1 * C2, ...) \| - kmalloc + kmalloc_array ( - sizeof(TYPE) * (E2) + E2, sizeof(TYPE) , ...) \| - kmalloc + kmalloc_array ( - sizeof(TYPE) * E2 + E2, sizeof(TYPE) , ...) \| - kmalloc + kmalloc_array ( - sizeof(THING) * (E2) + E2, sizeof(THING) , ...) \| - kmalloc + kmalloc_array ( - sizeof(THING) * E2 + E2, sizeof(THING) , ...) \| - kmalloc + kmalloc_array ( - (E1) * E2 + E1, E2 , ...) \| - kmalloc + kmalloc_array ( - (E1) * (E2) + E1, E2 , ...) \| - kmalloc + kmalloc_array ( - E1 * E2 + E1, E2 , ...) ) Signed-off-by: Kees Cook <keescook@chromium.org>	2018-06-12 16:19:22 -07:00
Christoph Hellwig	3f3942aca6	proc: introduce proc_create_single{,_data} Variants of proc_create{,_data} that directly take a seq_file show callback and drastically reduces the boilerplate code in the callers. All trivial callers converted over. Signed-off-by: Christoph Hellwig <hch@lst.de>	2018-05-16 07:23:35 +02:00
Anton Ivanov	ff6a17989c	Epoll based IRQ controller 1. Removes the need to walk the IRQ/Device list to determine who triggered the IRQ. 2. Improves scalability (up to several times performance improvement for cases with 10s of devices). 3. Improves UML baseline IO performance for one disk + one NIC use case by up to 10%. 4. Introduces write poll triggered IRQs. 5. Prerequisite for introducing high performance mmesg family of functions in network IO. 6. Fixes RNG shutdown which was leaking a file descriptor Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2018-02-19 19:38:51 +01:00
Christoph Hellwig	2a842acab1	block: introduce new block status code type Currently we use nornal Linux errno values in the block layer, and while we accept any error a few have overloaded magic meanings. This patch instead introduces a new blk_status_t value that holds block layer specific status codes and explicitly explains their meaning. Helpers to convert from and to the previous special meanings are provided for now, but I suspect we want to get rid of them in the long run - those drivers that have a errno input (e.g. networking) usually get errnos that don't know about the special block layer overloads, and similarly returning them to userspace will usually return somethings that strictly speaking isn't correct for file system operations, but that's left as an exercise for later. For now the set of errors is a very limited set that closely corresponds to the previous overloaded errno values, but there is some low hanging fruite to improve it. blk_status_t (ab)uses the sparse __bitwise annotations to allow for sparse typechecking, so that we can easily catch places passing the wrong values. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-06-09 09:27:32 -06:00
Anton Ivanov	f88f0bdfc3	um: UBD Improvements UBD at present is extremely slow because it handles only one request at a time in the IO thread and IRQ handler. The single request at a time is replaced by handling multiple requests as well as necessary workarounds for short reads/writes. Resulting performance improvement in disk IO - 30% Signed-off-by: Anton Ivanov <aivanov@kot-begemot.co.uk> Signed-off-by: Richard Weinberger <richard@nod.at>	2016-12-14 22:46:55 +01:00
Dan Williams	d72a57835c	um: track 'parent' device in a local variable In preparation for the removal of 'driverfs_dev' from 'struct gendisk' use a local variable to track the parented vs un-parented case in ubd_disk_register(). Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2016-06-22 10:19:16 -07:00
Mike Christie	3a5e02ced1	block, drivers: add REQ_OP_FLUSH operation This adds a REQ_OP_FLUSH operation that is sent to request_fn based drivers by the block layer's flush code, instead of sending requests with the request->cmd_flags REQ_FLUSH bit set. Signed-off-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2016-06-07 13:41:38 -06:00
Jens Axboe	f935a8ce0a	um: switch to using blk_queue_write_cache() Signed-off-by: Jens Axboe <axboe@fb.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2016-04-12 16:00:39 -06:00
Anton Ivanov	8c6157b6b3	um: Update UBD to use pread/pwrite family of functions This decreases the number of syscalls per read/write by half. Signed-off-by: Anton Ivanov <aivanov@brocade.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2016-01-10 21:49:48 +01:00
Thorsten Knabe	2a2361228c	um: ubd: Fix for processes stuck in D state forever Starting with Linux 3.12 processes get stuck in D state forever in UserModeLinux under sync heavy workloads. This bug was introduced by commit `805f11a0d5` (um: ubd: Add REQ_FLUSH suppport). Fix bug by adding a check if FLUSH request was successfully submitted to the I/O thread and keeping the FLUSH request on the request queue on submission failures. Fixes: `805f11a0d5` (um: ubd: Add REQ_FLUSH suppport) Signed-off-by: Thorsten Knabe <linux@thorsten-knabe.de> Cc: stable@kernel.org # >= 3.12 Signed-off-by: Richard Weinberger <richard@nod.at>	2014-10-13 21:45:55 +02:00
Richard Weinberger	91d44ff860	um: Cleanup SIGTERM handling Richard reported that some UML processes survive if the UML main process receives a SIGTERM. This issue was caused by a wrongly placed signal(SIGTERM, SIG_DFL) in init_new_thread_signals(). It disabled the UML exit handler accidently for some processes. The correct solution is to disable the fatal handler for all UML helper threads/processes. Such that last_ditch_exit() does not get called multiple times and all processes can exit due to SIGTERM. Reported-and-tested-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2013-09-07 10:56:58 +02:00
Richard Weinberger	bc1d72e73b	um: ubd: Introduce submit_request() Just a clean-up patch to remove the open coded variants and to ensure that all requests are submitted the same way. Signed-off-by: Richard Weinberger <richard@nod.at>	2013-09-07 10:56:55 +02:00
Richard Weinberger	805f11a0d5	um: ubd: Add REQ_FLUSH suppport UML's block device driver does not support write barriers, to support this this patch adds REQ_FLUSH suppport. Every time the block layer sends a REQ_FLUSH we fsync() now our backing file to guarantee data consistency. Reported-and-tested-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2013-09-07 10:56:49 +02:00
Al Viro	db2a144bed	block_device_operations->release() should return void The value passed is 0 in all but "it can never happen" cases (and those only in a couple of drivers) and it would've been lost on the way out anyway, even if something tried to pass something meaningful. Just don't bother. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-05-07 02:16:21 -04:00
Al Viro	37185b3324	um: get rid of pointless include "..." where include <...> will do Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Richard Weinberger <richard@nod.at>	2012-10-09 22:28:45 +02:00
Martin Pärtel	d4afcba95f	um: fix ubd_file_size for read-only files Made ubd_file_size not request write access. Fixes use of read-only images. Signed-off-by: Martin Pärtel <martin.partel@gmail.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2012-08-02 00:44:49 +02:00
Al Viro	8ea3c06a2e	um: clean up the includes in ubd Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Richard Weinberger <richard@nod.at>	2012-03-25 00:29:52 +01:00
Yong Zhang	c0b79a90b1	um: irq: Remove IRQF_DISABLED Since commit [`e58aa3d2`: genirq: Run irq handlers with interrupts disabled], We run all interrupt handlers with interrupts disabled and we even check and yell when an interrupt handler returns with interrupts enabled (see commit [`b738a50a`: genirq: Warn when handler enables interrupts]). So now this flag is a NOOP and can be removed. Signed-off-by: Yong Zhang <yong.zhang0@gmail.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2012-03-25 00:29:52 +01:00
Richard Weinberger	8535639810	um: fix ubd cow size ubd_file_size() cannot use ubd_dev->cow.file because at this time ubd_dev->cow.file is not initialized. Therefore, ubd_file_size() will always report a wrong disk size when COW files are used. Reading from /dev/ubd* would crash the kernel. We have to read the correct disk size from the COW file's backing file. Signed-off-by: Richard Weinberger <richard@nod.at> CC: stable@kernel.org	2011-11-02 14:15:42 +01:00
Al Viro	0a9e70b1cd	um: kill shared/mem_kern.h ... nothing declared there exists Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Richard Weinberger <richard@nod.at>	2011-11-02 14:15:10 +01:00
Al Viro	dbddc51bc8	um: don't include kern.h unless it's needed Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Richard Weinberger <richard@nod.at>	2011-11-02 14:15:07 +01:00
Thomas Gleixner	22e6500458	um: Replace deprecated spinlock initialization SPIN_LOCK_UNLOCK is deprecated. Use the lockdep capable variant instead. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jeff Dike <jdike@addtoit.com>	2011-01-27 12:30:37 +01:00
Linus Torvalds	5704e44d28	Merge branch 'config' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl * 'config' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl: BKL: introduce CONFIG_BKL. dabusb: remove the BKL sunrpc: remove the big kernel lock init/main.c: remove BKL notations blktrace: remove the big kernel lock rtmutex-tester: make it build without BKL dvb-core: kill the big kernel lock dvb/bt8xx: kill the big kernel lock tlclk: remove big kernel lock fix rawctl compat ioctls breakage on amd64 and itanic uml: kill big kernel lock parisc: remove big kernel lock cris: autoconvert trivial BKL users alpha: kill big kernel lock isapnp: BKL removal s390/block: kill the big kernel lock hpet: kill BKL, add compat_ioctl	2010-10-22 10:43:11 -07:00
Arnd Bergmann	9a181c5861	uml: kill big kernel lock Three uml device drivers still use the big kernel lock, but all of them can be safely converted to using a per-driver mutex instead. Most likely this is not even necessary, so after further review these can and should be removed as well. The exec system call no longer requires the BKL either, so remove it from there, too. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Jeff Dike <jdike@addtoit.com> Cc: user-mode-linux-devel@lists.sourceforge.net	2010-10-19 11:29:42 +02:00
Tejun Heo	47526903fe	ubd: fix incorrect sector handling during request restart Commit `f81f2f7c` (ubd: drop unnecessary rq->sector manipulation) dropped request->sector manipulation in preparation for global request handling cleanup; unfortunately, it incorrectly assumed that the updated sector wasn't being used. ubd tries to issue as many requests as possible to io_thread. When issuing fails due to memory pressure or other reasons, the device is put on the restart list and issuing stops. On IO completion, devices on the restart list are scanned and IO issuing is restarted. ubd issues IOs sg-by-sg and issuing can be stopped in the middle of a request, so each device on the restart queue needs to remember where to restart in its current request. ubd needs to keep track of the issue position itself because, * blk_rq_pos(req) is now updated by the block layer to keep track of _completion_ position. * Multiple io_req's for the current request may be in flight, so it's difficult to tell where blk_rq_pos(req) currently is. Add ubd->rq_pos to keep track of the issue position and use it to correctly restart io_req issue. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Richard Weinberger <richard@nod.at> Tested-by: Richard Weinberger <richard@nod.at> Tested-by: Chris Frey <cdfrey@foursquare.net> Cc: stable@kernel.org Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-10-15 12:56:21 +02:00

1 2 3

143 commits