linux

mirror of synced 2025-03-06 20:59:54 +01:00

Author	SHA1	Message	Date
Christoph Hellwig	680895d6ef	sysctl: switch to use uuid_t Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>	2017-06-05 16:59:15 +02:00
Frederic Weisbecker	f99973e18b	nohz: Fix buggy tick delay on IRQ storms When the tick is stopped and we reach the dynticks evaluation code on IRQ exit, we perform a soft tick restart if we observe an expired timer from there. It means we program the nearest possible tick but we stay in dynticks mode (ts->tick_stopped = 1) because we may need to stop the tick again after that expired timer is handled. Now this solution works most of the time but if we suffer an IRQ storm and those interrupts trigger faster than the hardware clockevents min delay, our tick won't fire until that IRQ storm is finished. Here is the problem: on IRQ exit we reprog the timer to at least NOW() + min_clockevents_delay. Another IRQ fires before the tick so we reschedule again to NOW() + min_clockevents_delay, etc... The tick is eternally rescheduled min_clockevents_delay ahead. A solution is to simply remove this soft tick restart. After all the normal dynticks evaluation path can handle 0 delay just fine. And by doing that we benefit from the optimization branch which avoids clock reprogramming if the clockevents deadline hasn't changed since the last reprog. This fixes our issue because we don't do repetitive clock reprog that always add hardware min delay. As a side effect it should even optimize the 0 delay path in general. Reported-and-tested-by: Octavian Purdila <octavian.purdila@nxp.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1496328429-13317-1-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-06-05 09:33:50 +02:00
Alexei Starovoitov	f91840a32d	perf, bpf: Add BPF support to all perf_event types Allow BPF_PROG_TYPE_PERF_EVENT program types to attach to all perf_event types, including HW_CACHE, RAW, and dynamic pmu events. Only tracepoint/kprobe events are treated differently which require BPF_PROG_TYPE_TRACEPOINT/BPF_PROG_TYPE_KPROBE program types accordingly. Also add support for reading all event counters using bpf_perf_event_read() helper. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 21:58:01 -04:00
Thomas Gleixner	f2c45807d3	alarmtimer: Switch over to generic set/get/rearm routine All required callbacks are in place. Switch the alarm timer based posix interval timer callbacks to the common implementation and remove the incorrect private implementation. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/20170530211657.825471962@linutronix.de	2017-06-04 15:40:32 +02:00
Thomas Gleixner	b3bf6f369d	alarmtimer: Implement arm callback Preparatory change to utilize the common posix timer mechanisms. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/20170530211657.747567162@linutronix.de	2017-06-04 15:40:31 +02:00
Thomas Gleixner	e344c9e76b	alarmtimer: Implement try_to_cancel callback Preparatory change to utilize the common posix timer mechanisms. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/20170530211657.670026824@linutronix.de	2017-06-04 15:40:31 +02:00
Thomas Gleixner	d653d8457c	alarmtimer: Implement remaining callback Preparatory change to utilize the common posix timer mechanisms. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/20170530211657.592676753@linutronix.de	2017-06-04 15:40:31 +02:00
Thomas Gleixner	e7561f1633	alarmtimer: Implement forward callback Preparatory change to utilize the common posix timer mechanisms. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/20170530211657.513694229@linutronix.de	2017-06-04 15:40:30 +02:00
Thomas Gleixner	b3db80f77a	alarmtimer: Implement timer_rearm() callback Preparatory change to utilize the common posix timer mechanisms. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/20170530211657.434598989@linutronix.de	2017-06-04 15:40:30 +02:00
Thomas Gleixner	eae1c4ae27	posix-timers: Make use of cancel/arm callbacks Replace the hrtimer calls by calls to the new try_to_cancel()/arm() kclock callbacks and move the hrtimer specific implementation into the corresponding callback functions. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/20170530211657.355396667@linutronix.de	2017-06-04 15:40:29 +02:00
Thomas Gleixner	525b8ed916	posix-timers: Add cancel/arm callbacks Add timer_try_to_cancel() and timer_arm() callbacks to kclock which allow to make common_timer_set() usable by both hrtimer and alarmtimer based clocks. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/20170530211657.278022962@linutronix.de	2017-06-04 15:40:28 +02:00
Thomas Gleixner	eabdec0438	posix-timers: Zero settings value in common code Zero out the settings struct in the common code so the callbacks do not have to do it themself. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/20170530211657.200870713@linutronix.de	2017-06-04 15:40:28 +02:00
Thomas Gleixner	91d57bae08	posix-timers: Make use of forward/remaining callbacks Replace the hrtimer calls by calls to the new forward/remaining kclock callbacks and move the hrtimer specific implementation into the corresponding callback functions. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/20170530211657.121437232@linutronix.de	2017-06-04 15:40:27 +02:00
Thomas Gleixner	63841b2a69	posix-timers: Add forward/remaining callbacks Add two callbacks to kclock which allow using common_)timer_get() for both hrtimer and alarm timer based clocks. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/20170530211657.044915536@linutronix.de	2017-06-04 15:40:27 +02:00
Thomas Gleixner	21e55c1f83	posix-timers: Add active flag to k_itimer Keep track of the activation state of posix timers. This is a preparatory change for making common_timer_get() usable by both hrtimer and alarm timer implementations. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/20170530211656.967783982@linutronix.de	2017-06-04 15:40:26 +02:00
Thomas Gleixner	f37fb0aa4f	posix-timers: Use timer_rearm() callback in posixtimer_rearm() Use the new timer_rearm() callback to replace the conditional hardcoded calls into the hrtimer and cpu timer code. This allows later to bring the same logic to alarmtimers. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/20170530211656.889661919@linutronix.de	2017-06-04 15:40:26 +02:00
Thomas Gleixner	96fe3b072f	posix-timers: Rename do_schedule_next_timer That function is a misnomer. Rename it with a proper prefix to posixtimer_rearm(). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/20170530211656.811362578@linutronix.de	2017-06-04 15:40:25 +02:00
Thomas Gleixner	3080294589	posix-timers: Add timer_rearm() callback Add a timer_rearm() callback which is used to make the rescheduling of posix interval timers independent of the underlying clock implementation. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/20170530211656.732632167@linutronix.de	2017-06-04 15:40:25 +02:00
Thomas Gleixner	d97bb75ddd	posix-timers: Store k_clock pointer in k_itimer Having the k_clock pointer in the k_itimer struct avoids the lookup in several code pathes and makes the next steps of unification of the hrtimer and alarmtimer based posix timers simpler. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/20170530211656.641222072@linutronix.de	2017-06-04 15:40:25 +02:00
Thomas Gleixner	80105cd0e6	posix-timers: Move interval out of the union Preparatory patch to unify the alarm timer and hrtimer based posix interval timer handling. The interval is used as a criteria for rearming decisions so moving it out of the clock specific data structures allows later unification. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/20170530211656.563922908@linutronix.de	2017-06-04 15:40:24 +02:00
Thomas Gleixner	af888d677a	posix-timers: Unify overrun/requeue_pending handling hrtimer based posix-timers and posix-cpu-timers handle the update of the rearming and overflow related status fields differently. Move that update to the common rearming code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/20170530211656.484936964@linutronix.de	2017-06-04 15:40:24 +02:00
Thomas Gleixner	bab0aae9dc	posix-timers: Move posix-timer internals to core None of these declarations is required outside of kernel/time. Move them to an internal header. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: John Stultz <john.stultz@linaro.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170530211656.394803853@linutronix.de	2017-06-04 15:40:23 +02:00
Thomas Gleixner	6631fa12c1	posix-timers: Avoid gazillions of forward declarations Move it below the actual implementations as there are new callbacks coming which would require even more forward declarations. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/20170530211656.238209952@linutronix.de	2017-06-04 15:40:23 +02:00
Thomas Gleixner	3a06c7ac24	posix-clocks: Remove interval timer facility and mmap/fasync callbacks The only user of this facility is ptp_clock, which does not implement any of those functions. Remove them to prevent accidental users. Especially the interval timer interfaces are now more or less impossible to implement because the necessary infrastructure has been confined to the core code. Aside of that it's really complex to make these callbacks implemented according to spec as the alarm timer implementation demonstrates. If at all then a nanosleep callback might be a reasonable extension. For now keep just what ptp_clock needs. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/20170530211656.145036286@linutronix.de	2017-06-04 15:40:22 +02:00
Thomas Gleixner	a81129e5a1	posix-timers: Remove unused export of posix_timer_event() Since the removal of the mmtimer driver the export is not longer needed. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/20170530211656.052744418@linutronix.de	2017-06-04 15:40:22 +02:00
Thomas Gleixner	18c700c4e3	alarmtimer: Remove pointless config conditional Having a IF_ENABLED(CONFIG_POSIX_TIMERS) inside of a #ifdef CONFIG_POSIX_TIMERS section is pointless. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/20170530211655.975218056@linutronix.de	2017-06-04 15:40:22 +02:00
Thomas Gleixner	978267b643	Merge branch 'timers/urgent' into WIP.timers Pick up urgent fixes to avoid conflicts.	2017-06-04 15:21:52 +02:00
Thomas Gleixner	ff86bf0c65	alarmtimer: Rate limit periodic intervals The alarmtimer code has another source of potentially rearming itself too fast. Interval timers with a very samll interval have a similar CPU hog effect as the previously fixed overflow issue. The reason is that alarmtimers do not implement the normal protection against this kind of problem which the other posix timer use: timer expires -> queue signal -> deliver signal -> rearm timer This scheme brings the rearming under scheduler control and prevents permanently firing timers which hog the CPU. Bringing this scheme to the alarm timer code is a major overhaul because it lacks all the necessary mechanisms completely. So for a quick fix limit the interval to one jiffie. This is not problematic in practice as alarmtimers are usually backed by an RTC for suspend which have 1 second resolution. It could be therefor argued that the resolution of this clock should be set to 1 second in general, but that's outside the scope of this fix. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kostya Serebryany <kcc@google.com> Cc: syzkaller <syzkaller@googlegroups.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20170530211655.896767100@linutronix.de	2017-06-04 15:21:18 +02:00
Thomas Gleixner	f4781e76f9	alarmtimer: Prevent overflow of relative timers Andrey reported a alartimer related RCU stall while fuzzing the kernel with syzkaller. The reason for this is an overflow in ktime_add() which brings the resulting time into negative space and causes immediate expiry of the timer. The following rearm with a small interval does not bring the timer back into positive space due to the same issue. This results in a permanent firing alarmtimer which hogs the CPU. Use ktime_add_safe() instead which detects the overflow and clamps the result to KTIME_SEC_MAX. Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kostya Serebryany <kcc@google.com> Cc: syzkaller <syzkaller@googlegroups.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20170530211655.802921648@linutronix.de	2017-06-04 15:21:18 +02:00
Christoph Hellwig	31ea70e030	posix-timers: Move the do_schedule_next_timer declaration Having it in asm-generic/siginfo.h doesn't make any sense as it is in no way architecture specific. Move it to posix-timers.h instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-ia64@vger.kernel.org Cc: Arnd Bergmann <arnd@arndb.de> Cc: sparclinux@vger.kernel.org Cc: "David S. Miller" <davem@davemloft.net> Link: http://lkml.kernel.org/r/20170603190102.28866-4-hch@lst.de	2017-06-04 15:11:46 +02:00
Thomas Gleixner	04c848d398	genirq: Warn when IRQ_NOAUTOEN is used with shared interrupts Shared interrupts do not go well with disabling auto enable: 1) The sharing interrupt might request it while it's still disabled and then wait for interrupts forever. 2) The interrupt might have been requested by the driver sharing the line before IRQ_NOAUTOEN has been set. So the driver which expects that disabled state after calling request_irq() will not get what it wants. Even worse, when it calls enable_irq() later, it will trigger the unbalanced enable_irq() warning. Reported-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: dianders@chromium.org Cc: jeffy <jeffy.chen@rock-chips.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: tfiga@chromium.org Link: http://lkml.kernel.org/r/20170531100212.210682135@linutronix.de	2017-06-04 14:38:41 +02:00
Thomas Gleixner	201d7f47f3	genirq: Handle NOAUTOEN interrupt setup proper If an interrupt is marked NOAUTOEN then request_irq() installs the action, but does not enable the interrupt via startup_irq(). The interrupt is enabled via enable_irq() later from the driver. enable_irq() calls irq_enable(). That means that for interrupts which have a irq_startup() callback this callback is never invoked. Neither is irq_domain_activate_irq() invoked for such interrupts. If an interrupt depends on irq_startup() or irq_domain_activate_irq() then the enable via irq_enable() is not enough. Add a status flag IRQD_IRQ_STARTED_UP and use this to select the proper mechanism in enable_irq(). Use the flag also to avoid pointless calls into the low level functions. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: dianders@chromium.org Cc: jeffy <jeffy.chen@rock-chips.com> Cc: Brian Norris <briannorris@chromium.org> Cc: tfiga@chromium.org Link: http://lkml.kernel.org/r/20170531100212.130986205@linutronix.de	2017-06-04 14:35:13 +02:00
Sebastian Andrzej Siewior	40da1b11f0	cpu/hotplug: Drop the device lock on error If a custom CPU target is specified and that one is not available _or_ can't be interrupted then the code returns to userland without dropping a lock as notices by lockdep: \|echo 133 > /sys/devices/system/cpu/cpu7/hotplug/target \| ================================================ \| [ BUG: lock held when returning to user space! ] \| ------------------------------------------------ \| bash/503 is leaving the kernel with locks still held! \| 1 lock held by bash/503: \| #0: (device_hotplug_lock){+.+...}, at: [<ffffffff815b5650>] lock_device_hotplug_sysfs+0x10/0x40 So release the lock then. Fixes: `757c989b99` ("cpu/hotplug: Make target state writeable") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20170602142714.3ogo25f2wbq6fjpj@linutronix.de	2017-06-03 09:35:04 +02:00
Alexander Levin	e5aeee51f6	perf/core: Don't release cred_guard_mutex if not taken If we failed to acquire task's cred_guard_mutex we shouldn't proceed to release it in the error path. Fixes: `a63fbed776` ("perf/tracing/cpuhotplug: Fix locking order") Signed-off-by: Alexander Levin <alexander.levin@verizon.com> Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: mathieu.desnoyers@efficios.com Cc: mhiramat@kernel.org Cc: paulmck@linux.vnet.ibm.com Cc: bigeasy@linutronix.de Link: http://lkml.kernel.org/r/20170603033903.12056-1-alexander.levin@verizon.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2017-06-03 09:28:45 +02:00
Chenbo Feng	80b7d81912	bpf: Remove the capability check for cgroup skb eBPF program Currently loading a cgroup skb eBPF program require a CAP_SYS_ADMIN capability while attaching the program to a cgroup only requires the user have CAP_NET_ADMIN privilege. We can escape the capability check when load the program just like socket filter program to make the capability requirement consistent. Change since v1: Change the code style in order to be compliant with checkpatch.pl preference Signed-off-by: Chenbo Feng <fengc@google.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-02 14:24:40 -04:00
Chenbo Feng	fb9a307d11	bpf: Allow CGROUP_SKB eBPF program to access sk_buff This allows cgroup eBPF program to classify packet based on their protocol or other detail information. Currently program need CAP_NET_ADMIN privilege to attach a cgroup eBPF program, and A process with CAP_NET_ADMIN can already see all packets on the system, for example, by creating an iptables rules that causes the packet to be passed to userspace via NFLOG. Signed-off-by: Chenbo Feng <fengc@google.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-02 14:24:40 -04:00
Linus Torvalds	035f1456f9	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching Pull livepatching fix from Jiri Kosina: "Kconfig dependency fix for livepatching infrastructure from Miroslav Benes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching: livepatch: Make livepatch dependent on !TRIM_UNUSED_KSYMS	2017-06-02 08:59:17 -07:00
Alexei Starovoitov	b870aa901f	bpf: use different interpreter depending on required stack size 16 __bpf_prog_run() interpreters for various stack sizes add .text but not a lot comparing to run-time stack savings text data bss dec hex filename 26350 10328 624 37302 91b6 kernel/bpf/core.o.before_split 25777 10328 624 36729 8f79 kernel/bpf/core.o.after_split 26970 10328 624 37922 9422 kernel/bpf/core.o.now Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-31 19:29:48 -04:00
Alexei Starovoitov	80a58d0255	bpf: reconcile bpf_tail_call and stack_depth The next set of patches will take advantage of stack_depth tracking, so make sure that the program that does bpf_tail_call() has stack depth large enough for the callee. We could have tracked the stack depth of the prog_array owner program and only allow insertion of the programs with stack depth less than the owner, but it will break existing applications. Some of them have trivial root bpf program that only does multiple bpf_tail_calls and at init time the prog array is empty. In the future we may add a flag to do such tracking optionally, but for now play simple and safe. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-31 19:29:47 -04:00
Alexei Starovoitov	8726679a0f	bpf: teach verifier to track stack depth teach verifier to track bpf program stack depth Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-31 19:29:47 -04:00
Alexei Starovoitov	f696b8f471	bpf: split bpf core interpreter split __bpf_prog_run() interpreter into stack allocation and execution parts. The code section shrinks which helps interpreter performance in some cases. text data bss dec hex filename 26350 10328 624 37302 91b6 kernel/bpf/core.o.before 25777 10328 624 36729 8f79 kernel/bpf/core.o.after Very short programs got slower (due to extra function call): Before: test_bpf: #89 ALU64_ADD_K: 1 + 2 = 3 jited:0 7 PASS test_bpf: #90 ALU64_ADD_K: 3 + 0 = 3 jited:0 8 PASS test_bpf: #91 ALU64_ADD_K: 1 + 2147483646 = 2147483647 jited:0 7 PASS test_bpf: #92 ALU64_ADD_K: 4294967294 + 2 = 4294967296 jited:0 11 PASS test_bpf: #93 ALU64_ADD_K: 2147483646 + -2147483647 = -1 jited:0 7 PASS After: test_bpf: #89 ALU64_ADD_K: 1 + 2 = 3 jited:0 11 PASS test_bpf: #90 ALU64_ADD_K: 3 + 0 = 3 jited:0 11 PASS test_bpf: #91 ALU64_ADD_K: 1 + 2147483646 = 2147483647 jited:0 11 PASS test_bpf: #92 ALU64_ADD_K: 4294967294 + 2 = 4294967296 jited:0 14 PASS test_bpf: #93 ALU64_ADD_K: 2147483646 + -2147483647 = -1 jited:0 10 PASS Longer programs got faster: Before: test_bpf: #266 BPF_MAXINSNS: Ctx heavy transformations jited:0 20286 20513 PASS test_bpf: #267 BPF_MAXINSNS: Call heavy transformations jited:0 31853 31768 PASS test_bpf: #268 BPF_MAXINSNS: Jump heavy test jited:0 9815 PASS test_bpf: #269 BPF_MAXINSNS: Very long jump backwards jited:0 6 PASS test_bpf: #270 BPF_MAXINSNS: Edge hopping nuthouse jited:0 13959 PASS test_bpf: #271 BPF_MAXINSNS: Jump, gap, jump, ... jited:0 210 PASS test_bpf: #272 BPF_MAXINSNS: ld_abs+get_processor_id jited:0 21724 PASS test_bpf: #273 BPF_MAXINSNS: ld_abs+vlan_push/pop jited:0 19118 PASS After: test_bpf: #266 BPF_MAXINSNS: Ctx heavy transformations jited:0 19008 18827 PASS test_bpf: #267 BPF_MAXINSNS: Call heavy transformations jited:0 29238 28450 PASS test_bpf: #268 BPF_MAXINSNS: Jump heavy test jited:0 9485 PASS test_bpf: #269 BPF_MAXINSNS: Very long jump backwards jited:0 12 PASS test_bpf: #270 BPF_MAXINSNS: Edge hopping nuthouse jited:0 13257 PASS test_bpf: #271 BPF_MAXINSNS: Jump, gap, jump, ... jited:0 213 PASS test_bpf: #272 BPF_MAXINSNS: ld_abs+get_processor_id jited:0 19389 PASS test_bpf: #273 BPF_MAXINSNS: ld_abs+vlan_push/pop jited:0 19583 PASS For real world production programs the difference is noise. This patch is first step towards reducing interpreter stack consumption. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-31 19:29:47 -04:00
Alexei Starovoitov	71189fa9b0	bpf: free up BPF_JMP \| BPF_CALL \| BPF_X opcode free up BPF_JMP \| BPF_CALL \| BPF_X opcode to be used by actual indirect call by register and use kernel internal opcode to mark call instruction into bpf_tail_call() helper. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-31 19:29:47 -04:00
Richard Guy Briggs	7786f6b6df	audit: add ambient capabilities to CAPSET and BPRM_FCAPS records Capabilities were augmented to include ambient capabilities in v4.3 commit `58319057b7` ("capabilities: ambient capabilities"). Add ambient capabilities to the audit BPRM_FCAPS and CAPSET records. The record contains fields "old_pp", "old_pi", "old_pe", "new_pp", "new_pi", "new_pe" so in keeping with the previous record normalizations, change the "new_*" variants to simply drop the "new_" prefix. A sample of the replaced BPRM_FCAPS record: RAW: type=BPRM_FCAPS msg=audit(1491468034.252:237): fver=2 fp=0000000000200000 fi=0000000000000000 fe=1 old_pp=0000000000000000 old_pi=0000000000000000 old_pe=0000000000000000 old_pa=0000000000000000 pp=0000000000200000 pi=0000000000000000 pe=0000000000200000 pa=0000000000000000 INTERPRET: type=BPRM_FCAPS msg=audit(04/06/2017 04:40:34.252:237): fver=2 fp=sys_admin fi=none fe=chown old_pp=none old_pi=none old_pe=none old_pa=none pp=sys_admin pi=none pe=sys_admin pa=none A sample of the replaced CAPSET record: RAW: type=CAPSET msg=audit(1491469502.371:242): pid=833 cap_pi=0000003fffffffff cap_pp=0000003fffffffff cap_pe=0000003fffffffff cap_pa=0000000000000000 INTERPRET: type=CAPSET msg=audit(04/06/2017 05:05:02.371:242) : pid=833 cap_pi=chown,dac_override,dac_read_search,fowner,fsetid,kill, setgid,setuid,setpcap,linux_immutable,net_bind_service,net_broadcast, net_admin,net_raw,ipc_lock,ipc_owner,sys_module,sys_rawio,sys_chroot, sys_ptrace,sys_pacct,sys_admin,sys_boot,sys_nice,sys_resource,sys_time, sys_tty_config,mknod,lease,audit_write,audit_control,setfcap, mac_override,mac_admin,syslog,wake_alarm,block_suspend,audit_read cap_pp=chown,dac_override,dac_read_search,fowner,fsetid,kill,setgid, setuid,setpcap,linux_immutable,net_bind_service,net_broadcast, net_admin,net_raw,ipc_lock,ipc_owner,sys_module,sys_rawio,sys_chroot, sys_ptrace,sys_pacct,sys_admin,sys_boot,sys_nice,sys_resource, sys_time,sys_tty_config,mknod,lease,audit_write,audit_control,setfcap, mac_override,mac_admin,syslog,wake_alarm,block_suspend,audit_read cap_pe=chown,dac_override,dac_read_search,fowner,fsetid,kill,setgid, setuid,setpcap,linux_immutable,net_bind_service,net_broadcast, net_admin,net_raw,ipc_lock,ipc_owner,sys_module,sys_rawio,sys_chroot, sys_ptrace,sys_pacct,sys_admin,sys_boot,sys_nice,sys_resource, sys_time,sys_tty_config,mknod,lease,audit_write,audit_control,setfcap, mac_override,mac_admin,syslog,wake_alarm,block_suspend,audit_read cap_pa=none See: https://github.com/linux-audit/audit-kernel/issues/40 Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Acked-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2017-05-30 17:36:11 -04:00
Frederic Weisbecker	7c25904508	nohz: Reset next_tick cache even when the timer has no regs Handle tick interrupts whose regs are NULL, out of general paranoia. It happens when hrtimer_interrupt() is called from non-interrupt contexts, such as hotplug CPU down events. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-05-30 18:35:32 +02:00
Al Viro	bcfe8ad8ef	do_sigaltstack(): lift copying to/from userland into callers Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-05-27 15:38:17 -04:00
Al Viro	613763a1f0	take compat_sys_old_getrlimit() to native syscall ... and sanitize the ifdefs in there Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-05-27 15:38:06 -04:00
Linus Torvalds	39b8ab31bc	Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixlet from Thomas Gleixner: "Silence dmesg spam by making the posix cpu timer printks depend on print_fatal_signals" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: posix-timers: Make signal printks conditional	2017-05-27 09:14:24 -07:00
Linus Torvalds	805f286907	Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fix from Thomas Gleixner: "A fix for a state leak which was introduced in the recent rework of futex/rtmutex interaction" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: futex,rt_mutex: Fix rt_mutex_cleanup_proxy_lock()	2017-05-27 08:59:37 -07:00
Linus Torvalds	d024baa58a	Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull kthread fix from Thomas Gleixner: "A single fix which prevents a use after free when kthread fork fails" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: kthread: Fix use-after-free if kthread fork fails	2017-05-27 08:52:27 -07:00
Linus Torvalds	77d6465695	There's been a few memory issues found with ftrace. One was simply a memory leak where not all was being freed that should have been in releasing a file pointer on set_graph_function. Then Thomas found that the ftrace trampolines were marked for read/write as well as execute. To shrink the possible attack surface, he added calls to set them to ro. Which also uncovered some other issues with freeing module allocated memory that had its permissions changed. Kprobes had a similar issue which is fixed and a selftest was added to trigger that issue again. -----BEGIN PGP SIGNATURE----- iQExBAABCAAbBQJZKOiVFBxyb3N0ZWR0QGdvb2RtaXMub3JnAAoJEMm5BfJq2Y3L vBoH/jxVozuAEVCv+Nbj6fhRxe4emjo0lZZb32EbEaSV/nUQGqHIZFdDQtbt+ld+ sn06/BSMBI+L4BqLj1BCAW0e/zIn/4birIg53SX5jQwc3AlhUG7HS2d+RJZZCrp9 Zofq9L6xZ4Hl2XjkPXqwEgtrwxQtkIPLlJqeYDJ6BVrlPfOPEwB7bfR7B684wiYT 6h2Qo7f/ZQzgJ1sK8N2IjHEnAgE08KCYcj4IB4WHJk6SqQz3bv1Y00WBg2UQihVT TPPSVhYLnrSw53fxyALqZbHo2DvnQf1TnNadWxvSIpbvgm/T5GG60FDtvHgNfbwz yKuKAog+P9xBLkoAcfvODLY9O5s= =75TZ -----END PGP SIGNATURE----- Merge tag 'trace-v4.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull ftrace fixes from Steven Rostedt: "There's been a few memory issues found with ftrace. One was simply a memory leak where not all was being freed that should have been in releasing a file pointer on set_graph_function. Then Thomas found that the ftrace trampolines were marked for read/write as well as execute. To shrink the possible attack surface, he added calls to set them to ro. Which also uncovered some other issues with freeing module allocated memory that had its permissions changed. Kprobes had a similar issue which is fixed and a selftest was added to trigger that issue again" * tag 'trace-v4.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: x86/ftrace: Make sure that ftrace trampolines are not RWX x86/mm/ftrace: Do not bug in early boot on irqs_disabled in cpu_flush_range() selftests/ftrace: Add a testcase for many kprobe events kprobes/x86: Fix to set RWX bits correctly before releasing trampoline ftrace: Fix memory leak in ftrace_graph_release()	2017-05-27 08:30:30 -07:00

... 9 10 11 12 13 ...

25348 commits