Ring buffer fixes for v6.14:

- Enable resize on mmap() error

  When a process mmaps a ring buffer, its size is locked and resizing is
  disabled. But if the user passes in a wrong parameter, the mmap() can
  fail after the resize was disabled, and the mmap() exits with an error
  without re-enabling ring buffer resizing. This prevents the ring buffer
  from ever being resized after that. Re-enable resizing of the ring
  buffer on mmap() error.

- Have resizing return a proper error and not always -ENOMEM

  If the ring buffer is mmapped by one task and another task tries to
  resize the buffer, it will error with -ENOMEM. This is confusing to the
  user, as there may be plenty of memory available. Have it return the
  error that actually happens (in this case -EBUSY) so the user can
  understand why the resize failed.

- Test the sub-buffer array to validate the persistent memory buffer

  On boot up, the initialization of the persistent memory buffer does a
  validation check to see if the content of the data is valid, and if so,
  it uses the memory as is; otherwise it re-initializes it. There's meta
  data in this persistent memory that keeps track of which sub-buffer is
  the reader page, and an array that states the order of the sub-buffers.
  The values in this array are indexes into the sub-buffers. The validator
  checks that all the entries in the array are within the sub-buffer list
  index, but it does not check for duplicates. While working on this code,
  the array got corrupted and had duplicates, where not all the
  sub-buffers were accounted for. This passed the validator, as all
  entries were valid, but the linked list was incorrect and could have
  caused a crash. In this case the corruption only produced incorrect
  data, but it could have been more severe. To fix this, create a bitmask
  that covers all the sub-buffer indexes and clear it to all zeros. While
  iterating the array and checking its values, set the bit corresponding
  to the index found in the array. If the bit was already set, the entry
  is a duplicate; mark the buffer as invalid and reset it.

- Prevent mmap()ing the persistent ring buffer

  The persistent ring buffer uses vmap() to map the persistent memory.
  Currently, the mmap() logic only uses virt_to_page() to get the page
  from the ring buffer memory and uses that to map to user space. This
  works for a normal ring buffer because it uses alloc_page() to allocate
  its memory, but because the persistent ring buffer uses vmap(), it
  causes a kernel crash. Fixing this to work with vmap() is not hard, but
  since mmap() on persistent memory buffers never worked, just have the
  mmap() return -ENODEV (what was returned before mmap() was implemented,
  as persistent memory ring buffers never supported mmap). Normal buffers
  will still allow mmap(). Implementing mmap() for persistent memory ring
  buffers can wait until the next merge window.

- Fix polling on persistent ring buffers

  There's a "buffer_percent" option (default set to 50) that is used to
  make reads of the ring buffer binary data block until the buffer fills
  to that percentage. The field "pages_touched" is incremented every time
  a new sub-buffer has content added to it. This field is used in the
  calculation of how much content is in the buffer, and if that exceeds
  the "buffer_percent", the task polling on the buffer is woken. As
  persistent ring buffers can be created from the content of a previous
  boot, the "pages_touched" field was not updated. This means that if a
  task were to poll on the persistent buffer, it would block even if the
  buffer was completely full. It would block even if the "buffer_percent"
  was zero, because with "pages_touched" as zero, the buffer would be
  calculated as having no content. Update "pages_touched" when
  initializing the persistent ring buffer from a previous boot.
Merge tag 'trace-ring-buffer-v6.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull trace ring buffer fixes from Steven Rostedt.

* tag 'trace-ring-buffer-v6.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  ring-buffer: Update pages_touched to reflect persistent buffer content
  tracing: Do not allow mmap() of persistent ring buffer
  ring-buffer: Validate the persistent meta data subbuf array
  tracing: Have the error of __tracing_resize_ring_buffer() passed to user
  ring-buffer: Unlock resize on mmap error
commit 5784d8c93e
2 changed files with 31 additions and 9 deletions
kernel/trace/ring_buffer.c

@@ -1672,7 +1672,8 @@ static void *rb_range_buffer(struct ring_buffer_per_cpu *cpu_buffer, int idx)
  * must be the same.
  */
 static bool rb_meta_valid(struct ring_buffer_meta *meta, int cpu,
-			  struct trace_buffer *buffer, int nr_pages)
+			  struct trace_buffer *buffer, int nr_pages,
+			  unsigned long *subbuf_mask)
 {
 	int subbuf_size = PAGE_SIZE;
 	struct buffer_data_page *subbuf;
@@ -1680,6 +1681,9 @@ static bool rb_meta_valid(struct ring_buffer_meta *meta, int cpu,
 	unsigned long buffers_end;
 	int i;
 
+	if (!subbuf_mask)
+		return false;
+
 	/* Check the meta magic and meta struct size */
 	if (meta->magic != RING_BUFFER_META_MAGIC ||
 	    meta->struct_size != sizeof(*meta)) {
@@ -1712,6 +1716,8 @@ static bool rb_meta_valid(struct ring_buffer_meta *meta, int cpu,
 
 	subbuf = rb_subbufs_from_meta(meta);
 
+	bitmap_clear(subbuf_mask, 0, meta->nr_subbufs);
+
 	/* Is the meta buffers and the subbufs themselves have correct data? */
 	for (i = 0; i < meta->nr_subbufs; i++) {
 		if (meta->buffers[i] < 0 ||
@@ -1725,6 +1731,12 @@ static bool rb_meta_valid(struct ring_buffer_meta *meta, int cpu,
 			return false;
 		}
 
+		if (test_bit(meta->buffers[i], subbuf_mask)) {
+			pr_info("Ring buffer boot meta [%d] array has duplicates\n", cpu);
+			return false;
+		}
+
+		set_bit(meta->buffers[i], subbuf_mask);
 		subbuf = (void *)subbuf + subbuf_size;
 	}
 
@@ -1838,6 +1850,11 @@ static void rb_meta_validate_events(struct ring_buffer_per_cpu *cpu_buffer)
 				cpu_buffer->cpu);
 			goto invalid;
 		}
+
+		/* If the buffer has content, update pages_touched */
+		if (ret)
+			local_inc(&cpu_buffer->pages_touched);
+
 		entries += ret;
 		entry_bytes += local_read(&head_page->page->commit);
 		local_set(&cpu_buffer->head_page->entries, ret);
@@ -1889,17 +1906,22 @@ static void rb_meta_init_text_addr(struct ring_buffer_meta *meta)
 static void rb_range_meta_init(struct trace_buffer *buffer, int nr_pages)
 {
 	struct ring_buffer_meta *meta;
+	unsigned long *subbuf_mask;
 	unsigned long delta;
 	void *subbuf;
 	int cpu;
 	int i;
 
+	/* Create a mask to test the subbuf array */
+	subbuf_mask = bitmap_alloc(nr_pages + 1, GFP_KERNEL);
+	/* If subbuf_mask fails to allocate, then rb_meta_valid() will return false */
+
 	for (cpu = 0; cpu < nr_cpu_ids; cpu++) {
 		void *next_meta;
 
 		meta = rb_range_meta(buffer, nr_pages, cpu);
 
-		if (rb_meta_valid(meta, cpu, buffer, nr_pages)) {
+		if (rb_meta_valid(meta, cpu, buffer, nr_pages, subbuf_mask)) {
 			/* Make the mappings match the current address */
 			subbuf = rb_subbufs_from_meta(meta);
 			delta = (unsigned long)subbuf - meta->first_buffer;
@@ -1943,6 +1965,7 @@ static void rb_range_meta_init(struct trace_buffer *buffer, int nr_pages)
 			subbuf += meta->subbuf_size;
 		}
 	}
+	bitmap_free(subbuf_mask);
 }
 
 static void *rbm_start(struct seq_file *m, loff_t *pos)
@@ -7126,6 +7149,7 @@ int ring_buffer_map(struct trace_buffer *buffer, int cpu,
 		kfree(cpu_buffer->subbuf_ids);
 		cpu_buffer->subbuf_ids = NULL;
 		rb_free_meta_page(cpu_buffer);
+		atomic_dec(&cpu_buffer->resize_disabled);
 	}
 
  unlock:

kernel/trace/trace.c

@@ -5977,8 +5977,6 @@ static int __tracing_resize_ring_buffer(struct trace_array *tr,
 ssize_t tracing_resize_ring_buffer(struct trace_array *tr,
 				  unsigned long size, int cpu_id)
 {
-	int ret;
-
 	guard(mutex)(&trace_types_lock);
 
 	if (cpu_id != RING_BUFFER_ALL_CPUS) {
@@ -5987,11 +5985,7 @@ ssize_t tracing_resize_ring_buffer(struct trace_array *tr,
 			return -EINVAL;
 	}
 
-	ret = __tracing_resize_ring_buffer(tr, size, cpu_id);
-	if (ret < 0)
-		ret = -ENOMEM;
-
-	return ret;
+	return __tracing_resize_ring_buffer(tr, size, cpu_id);
 }
 
 static void update_last_data(struct trace_array *tr)
@@ -8285,6 +8279,10 @@ static int tracing_buffers_mmap(struct file *filp, struct vm_area_struct *vma)
 	struct trace_iterator *iter = &info->iter;
 	int ret = 0;
 
+	/* Currently the boot mapped buffer is not supported for mmap */
+	if (iter->tr->flags & TRACE_ARRAY_FL_BOOT)
+		return -ENODEV;
+
 	ret = get_snapshot_map(iter->tr);
 	if (ret)
 		return ret;