linux

mirror of synced 2025-03-06 20:59:54 +01:00

Author	SHA1	Message	Date
Heiko Carstens	2f2728f6de	mm/compat: convert to COMPAT_SYSCALL_DEFINE with changing parameter types In order to allow the COMPAT_SYSCALL_DEFINE macro generate code that performs proper zero and sign extension convert all 64 bit parameters to their corresponding 32 bit compat counterparts. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2014-03-06 16:30:47 +01:00
Heiko Carstens	ca2c405ab9	kexec/compat: convert to COMPAT_SYSCALL_DEFINE with changing parameter types In order to allow the COMPAT_SYSCALL_DEFINE macro generate code that performs proper zero and sign extension convert all 64 bit parameters to their corresponding 32 bit compat counterparts. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2014-03-06 16:30:46 +01:00
Heiko Carstens	62a6fa9768	kernel/compat: convert to COMPAT_SYSCALL_DEFINE Convert all compat system call functions where all parameter types have a size of four or less than four bytes, or are pointer types to COMPAT_SYSCALL_DEFINE. The implicit casts within COMPAT_SYSCALL_DEFINE will perform proper zero and sign extension to 64 bit of all parameters if needed. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2014-03-06 15:35:10 +01:00
Roman Pen	af5040da01	blktrace: fix accounting of partially completed requests trace_block_rq_complete does not take into account that request can be partially completed, so we can get the following incorrect output of blkparser: C R 232 + 240 [0] C R 240 + 232 [0] C R 248 + 224 [0] C R 256 + 216 [0] but should be: C R 232 + 8 [0] C R 240 + 8 [0] C R 248 + 8 [0] C R 256 + 8 [0] Also, the whole output summary statistics of completed requests and final throughput will be incorrect. This patch takes into account real completion size of the request and fixes wrong completion accounting. Signed-off-by: Roman Pen <r.peniaev@gmail.com> CC: Steven Rostedt <rostedt@goodmis.org> CC: Frederic Weisbecker <fweisbec@gmail.com> CC: Ingo Molnar <mingo@redhat.com> CC: linux-kernel@vger.kernel.org Cc: stable@kernel.org Signed-off-by: Jens Axboe <axboe@fb.com>	2014-03-05 16:11:21 -07:00
Thomas Gleixner	8f945a3325	genirq: Move kstat_incr_irqs_this_cpu() to core No more users outside the core code. Put it into the poison cabinet. That also gets rid of the linux/irq.h include in kernel_stat.h Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20140223212739.124207133@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-03-04 17:37:54 +01:00
Thomas Gleixner	792d0018a5	genirq: Add a kstat helper to increment irq stats There is a common pattern all over the place: kstat_incr_irqs_this_cpu(irq, irq_to_desc(irq)); This results in a call to core code anyway. So provide a function which does the same thing in core. While at it, replace the butt ugly macro with an inline. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20140223212737.422068876@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-03-04 17:37:53 +01:00
Viresh Kumar	38edbb0b91	timer: Make sure TIMER_FLAG_MASK bits are free in allocated base Currently we are using two lowest bit of base for internal purpose and so they both should be zero in the allocated address. The code was doing the right thing before this patch came in: commit `c5f66e99b` (timer: Implement TIMER_IRQSAFE) Tejun probably forgot to update this piece of code which checks if the lowest 'n' bits are zero or not and so wasn't updated according to the new flag. Lets use TIMER_FLAG_MASK in the calculations here. [ tglx: Massaged changelog ] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: linaro-kernel@lists.linaro.org Cc: fweisbec@gmail.com Cc: tj@kernel.org Cc: peterz@infradead.org Link: http://lkml.kernel.org/r/9144e10d7e854a0aa8a673332adec356d81a923c.1393576981.git.viresh.kumar@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-03-04 12:30:29 +01:00
Viresh Kumar	c24a4a3694	timer: Check failure of timer_cpu_notify() before calling init_timer_stats() timer_cpu_notify() should return NOTIFY_OK and nothing else. Anything else would trigger a BUG_ON(). Return value of this routine is already checked correctly but is done after issuing a call to init_timer_stats(). The right order would be to check the error case first and then call init_timer_stats(). Lets do it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: linaro-kernel@lists.linaro.org Cc: fweisbec@gmail.com Cc: tj@kernel.org Cc: peterz@infradead.org Link: http://lkml.kernel.org/r/c439f5b6bbc2047e1662f4d523350531425bcf9d.1393576981.git.viresh.kumar@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-03-04 12:30:29 +01:00
Steven Rostedt (Red Hat)	7dec935a3a	tracepoint: Do not waste memory on mods with no tracepoints No reason to allocate tp_module structures for modules that have no tracepoints. This just wastes memory. Fixes: `b75ef8b44b` "Tracepoint: Dissociate from module mutex" Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2014-03-03 21:23:08 -05:00
Steven Rostedt (Red Hat)	45ab2813d4	tracing: Do not add event files for modules that fail tracepoints If a module fails to add its tracepoints due to module tainting, do not create the module event infrastructure in the debugfs directory. As the events will not work and worse yet, they will silently fail, making the user wonder why the events they enable do not display anything. Having a warning on module load and the events not visible to the users will make the cause of the problem much clearer. Link: http://lkml.kernel.org/r/20140227154923.265882695@goodmis.org Fixes: `6d723736e4` "tracing/events: add support for modules to TRACE_EVENT" Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: stable@vger.kernel.org # 2.6.31+ Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2014-03-03 21:11:05 -05:00
Li Zefan	b8dadcb58d	cpuset: use rcu_read_lock() to protect task_cs() We no longer use task_lock() to protect tsk->cgroups. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Li Zefan <lizefan@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2014-03-03 17:28:36 -05:00
Linus Torvalds	0c0bd34a14	Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Misc fixes, most of them SCHED_DEADLINE fallout" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/deadline: Prevent rt_time growth to infinity sched/deadline: Switch CPU's presence test order sched/deadline: Cleanup RT leftovers from {inc/dec}_dl_migration sched: Fix double normalization of vruntime	2014-03-03 10:49:24 -08:00
Heiko Carstens	03b8c7b623	futex: Allow architectures to skip futex_atomic_cmpxchg_inatomic() test If an architecture has futex_atomic_cmpxchg_inatomic() implemented and there is no runtime check necessary, allow to skip the test within futex_init(). This allows to get rid of some code which would always give the same result, and also allows the compiler to optimize a couple of if statements away. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Finn Thain <fthain@telegraphics.com.au> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Link: http://lkml.kernel.org/r/20140302120947.GA3641@osiris Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-03-03 11:32:08 +01:00
Rashika Kheria	64e8d20bd3	kernel: Include appropriate header file in time/timekeeping_debug.c Include appropriate header file kernel/time/timekeeping_internal.h in kernel/time/timekeeping_debug.c because it has prototype declaration of function defined in kernel/time/timekeeping_debug.c. This eliminates the following warning in kernel/time/timekeeping_debug.c: kernel/time/timekeeping_debug.c:68:6: warning: no previous prototype for ‘tk_debug_account_sleep_time’ [-Wmissing-prototypes] Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: John Stultz <john.stultz@linaro.org>	2014-03-02 20:52:58 -08:00
Linus Torvalds	3154da34be	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Misc fixes, most of them on the tooling side" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf tools: Fix strict alias issue for find_first_bit perf tools: fix BFD detection on opensuse perf: Fix hotplug splat perf/x86: Fix event scheduling perf symbols: Destroy unused symsrcs perf annotate: Check availability of annotate when processing samples	2014-03-02 11:37:07 -06:00
Eric W. Biederman	6f285b19d0	audit: Send replies in the proper network namespace. In perverse cases of file descriptor passing the current network namespace of a process and the network namespace of a socket used by that socket may differ. Therefore use the network namespace of the appropiate socket to ensure replies always go to the appropiate socket. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2014-02-28 19:44:55 -08:00
Sebastian Capella	421a5fa1a6	PM / hibernate: use name_to_dev_t to parse resume Use the name_to_dev_t call to parse the device name echo'd to to /sys/power/resume. This imitates the method used in hibernate.c in software_resume, and allows the resume partition to be specified using other equivalent device formats as well. By allowing /sys/debug/resume to accept the same syntax as the resume=device parameter, we can parse the resume=device in the init script and use the resume device directly from the kernel command line. Signed-off-by: Sebastian Capella <sebastian.capella@linaro.org> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-01 01:02:52 +01:00
Rashika Kheria	6c5be29165	PM / wakeup: Include appropriate header file in kernel/power/wakelock.c Include appropriate header file kernel/power/power.h in kernel/power/wakelock.c because it has prototype declaration of function defined in kernel/power/wakelock.c. This eliminates the following warning in kernel/power/wakelock.c: kernel/power/wakelock.c:34:9: warning: no previous prototype for ‘pm_show_wakelocks’ [-Wmissing-prototypes] kernel/power/wakelock.c:184:5: warning: no previous prototype for ‘pm_wake_lock’ [-Wmissing-prototypes] kernel/power/wakelock.c:232:5: warning: no previous prototype for ‘pm_wake_unlock’ [-Wmissing-prototypes] Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-01 01:02:09 +01:00
Rashika Kheria	04b7346975	PM / sleep: Move prototype declaration to header file kernel/power/power.h Move prototype declaration of function to header file kernel/power/power.h because it is used by more than one file. This eliminates the following warning in kernel/power/snapshot.c: kernel/power/snapshot.c:1588:16: warning: no previous prototype for ‘swsusp_save’ [-Wmissing-prototypes] Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-01 01:01:25 +01:00
Ingo Molnar	d27c8438ee	Merge branch 'timers/core' into sched/idle Avoid heavy conflicts caused by WIP patches in drivers/cpuidle/cpuidle.c, by merging these into a single base. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-02-28 13:58:25 +01:00
Eric W. Biederman	48095d991d	audit: Use struct net not pid_t to remember the network namespce to reply in In struct audit_netlink_list and audit_reply add a reference to the network namespace of the caller and remove the userspace pid of the caller. This cleanly remembers the callers network namespace, and removes a huge class of races and nasty failure modes that can occur when attempting to relook up the callers network namespace from a pid_t (including the caller's network namespace changing, pid wraparound, and the pid simply not being present). Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2014-02-28 04:04:33 -08:00
Thomas Gleixner	bce1936951	Merge branch 'timers.2014.02.25a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into timers/core Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-02-28 11:53:48 +01:00
Ingo Molnar	62c206bd51	Merge branch 'rcu/next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU updates from Paul E. McKenney: * Update RCU documentation. These were posted to LKML at https://lkml.org/lkml/2014/2/17/555. * Miscellaneous fixes. These were posted to LKML at https://lkml.org/lkml/2014/2/17/530. Note that two of these are RCU changes to other maintainer's trees: `add1f09954` (fs) and `8857563b81` (notifer), both of which substitute rcu_access_pointer() for rcu_dereference_raw(). * Real-time latency fixes. These were posted to LKML at https://lkml.org/lkml/2014/2/17/544. * Torture-test changes, including refactoring of rcutorture and introduction of a vestigial locktorture. These were posted to LKML at https://lkml.org/lkml/2014/2/17/599. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-02-28 08:38:30 +01:00
Rashika Kheria	864f32a52b	kernel: Mark function as static in kernel/seccomp.c Mark function as static in kernel/seccomp.c because it is not used outside this file. This eliminates the following warning in kernel/seccomp.c: kernel/seccomp.c:296:6: warning: no previous prototype for ?seccomp_attach_user_filter? [-Wmissing-prototypes] Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Will Drewry <wad@chromium.org> Signed-off-by: James Morris <james.l.morris@oracle.com>	2014-02-28 13:54:27 +11:00
Linus Torvalds	8d7531825c	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull filesystem fixes from Jan Kara: "Notification, writeback, udf, quota fixes The notification patches are (with one exception) a fallout of my fsnotify rework which went into -rc1 (I've extented LTP to cover these cornercases to avoid similar breakage in future). The UDF patch is a nasty data corruption Al has recently reported, the revert of the writeback patch is due to possibility of violating sync(2) guarantees, and a quota bug can lead to corruption of quota files in ocfs2" * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: fsnotify: Allocate overflow events with proper type fanotify: Handle overflow in case of permission events fsnotify: Fix detection whether overflow event is queued Revert "writeback: do not sync data dirtied after sync start" quota: Fix race between dqput() and dquot_scan_active() udf: Fix data corruption on file type conversion inotify: Fix reporting of cookies for inotify events	2014-02-27 10:37:22 -08:00
Li Zefan	99afb0fd5f	cpuset: fix a race condition in __cpuset_node_allowed_softwall() It's not safe to access task's cpuset after releasing task_lock(). Holding callback_mutex won't help. Cc: <stable@vger.kernel.org> Signed-off-by: Li Zefan <lizefan@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2014-02-27 09:39:54 -05:00
Li Zefan	4729583006	cpuset: fix a locking issue in cpuset_migrate_mm() I can trigger a lockdep warning: # mount -t cgroup -o cpuset xxx /cgroup # mkdir /cgroup/cpuset # mkdir /cgroup/tmp # echo 0 > /cgroup/tmp/cpuset.cpus # echo 0 > /cgroup/tmp/cpuset.mems # echo 1 > /cgroup/tmp/cpuset.memory_migrate # echo $$ > /cgroup/tmp/tasks # echo 1 > /cgruop/tmp/cpuset.mems =============================== [ INFO: suspicious RCU usage. ] 3.14.0-rc1-0.1-default+ #32 Not tainted ------------------------------- include/linux/cgroup.h:682 suspicious rcu_dereference_check() usage! ... [<ffffffff81582174>] dump_stack+0x72/0x86 [<ffffffff810b8f01>] lockdep_rcu_suspicious+0x101/0x140 [<ffffffff81105ba1>] cpuset_migrate_mm+0xb1/0xe0 ... We used to hold cgroup_mutex when calling cpuset_migrate_mm(), but now we hold cpuset_mutex, which causes task_css() to complain. This is not a false-positive but a real issue. Holding cpuset_mutex won't prevent a task from migrating to another cpuset, and it won't prevent the original task->cgroup from destroying during this change. Fixes: `5d21cc2db0` (cpuset: replace cgroup_mutex locking with cpuset internal locking) Cc: <stable@vger.kernel.org> # 3.9+ Signed-off-by: Li Zefan <lizefan@huawei.com> Sigend-off-by: Tejun Heo <tj@kernel.org>	2014-02-27 09:35:59 -05:00
Rashika Kheria	64be38ab03	genirq: Include missing header file in irqdomain.c Include appropriate header file include/linux/of_irq.h in kernel/irq/irqdomain.c because it contains prototype definition of function define in kernel/irq/irqdomain.c. This eliminates the following warning in kernel/irq/irqdomain.c: kernel/irq/irqdomain.c:468:14: warning: no previous prototype for ‘irq_create_of_mapping’ [-Wmissing-prototypes] Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Link: http://lkml.kernel.org/r/eb89aebea7ff1a46122918ac389ebecf8248be9a.1393493276.git.rashika.kheria@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-02-27 13:29:35 +01:00
Peter Zijlstra	4a2345937c	perf: Optimize group_sched_in() Use the ctx pmu instead of the event pmu. When a group leader is a software event but the group contains hardware events, the entire group is on the hardware PMU. Using the hardware PMU for the transaction makes most sense since that's the most expensive one to programm (and software PMUs generally don't have TXN support anyway). Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-sctoo9t2f3nn2c9g568928q3@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-02-27 12:43:26 +01:00
Mark Rutland	fdded676c3	perf: Remove redundant PMU assignment Currently perf_branch_stack_sched_in iterates over the set of pmus, checks that each pmu has a flush_branch_stack callback, then overwrites the pmu before calling the callback. This is either redundant or broken. In systems with a single hw pmu, pmu == cpuctx->ctx.pmu, and thus the assignment is redundant. In systems with multiple hw pmus (i.e. multiple pmus with task_ctx_nr == perf_hw_context) the pmus share the same perf_cpu_context. Thus the assignment can cause one of the pmus to flush its branch stack repeatedly rather than causing each of the pmus to flush their branch stacks. Worse still, if only some pmus have the callback the assignment can result in a branch to NULL. This patch removes the redundant assignment. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1392054264-23570-3-git-send-email-mark.rutland@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-02-27 12:43:24 +01:00
Mark Rutland	9e3170411e	perf: Fix prototype of find_pmu_context() For some reason find_pmu_context() is defined as returning void * rather than a __percpu struct perf_cpu_context . As all the requisite types are defined in advance there's no reason to keep it that way. This patch modifies the prototype of pmu_find_context to return a __percpu struct perf_cpu_context . Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Reviewed-by: Dave Martin <Dave.Martin@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/1392054264-23570-2-git-send-email-mark.rutland@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-02-27 12:43:21 +01:00
Ingo Molnar	ff5a7088f0	Merge branch 'perf/urgent' into perf/core Merge the latest fixes before queueing up new changes. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-02-27 12:41:17 +01:00
Dongsheng Yang	2b3942e4bb	trace: Replace hardcoding of 19 with MAX_NICE Use MAX_NICE instead of the value 19 for ring_buffer_benchmark. Signed-off-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/1393251121-25534-1-git-send-email-yangds.fnst@cn.fujitsu.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-02-27 12:41:03 +01:00
Peter Zijlstra	37e117c07b	sched: Guarantee task priority in pick_next_task() Michael spotted that the idle_balance() push down created a task priority problem. Previously, when we called idle_balance() before pick_next_task() it wasn't a problem when -- because of the rq->lock droppage -- an rt/dl task slipped in. Similarly for pre_schedule(), rt pre-schedule could have a dl task slip in. But by pulling it into the pick_next_task() loop, we'll not try a higher task priority again. Cure this by creating a re-start condition in pick_next_task(); and triggering this from pick_next_task_{rt,fair}(). It also fixes a live-lock where we get stuck in pick_next_task_fair() due to idle_balance() seeing !0 nr_running but there not actually being any fair tasks about. Reported-by: Michael Wang <wangyun@linux.vnet.ibm.com> Fixes: `38033c37fa` ("sched: Push down pre_schedule() and idle_balance()") Tested-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Juri Lelli <juri.lelli@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20140224121218.GR15586@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-02-27 12:41:02 +01:00
Peter Zijlstra	06d50c65b1	sched/idle: Remove stale old file Commit `cf37b6b484` ("sched/idle: Move cpu/idle.c to sched/idle.c") said to simply move a file; somehow it got mangled and created an old version of the file and forgot to remove the old file. Fix this fail; add the lost change and remove the now identical old file. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: rjw@rjwysocki.net Cc: nicolas.pitre@linaro.org Cc: preeti@linux.vnet.ibm.com Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Link: http://lkml.kernel.org/r/20140224172207.GC9987@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-02-27 12:41:01 +01:00
Dietmar Eggemann	f5f9739d7a	sched: Put rq's sched_avg under CONFIG_FAIR_GROUP_SCHED The struct sched_avg of struct rq is only used in case group scheduling is enabled inside __update_tg_runnable_avg() to update per-cpu representation of a task group. I.e. that there is no need to maintain the runnable avg of a rq in the !CONFIG_FAIR_GROUP_SCHED case. This patch guards struct sched_avg of struct rq and update_rq_runnable_avg() with CONFIG_FAIR_GROUP_SCHED. There is an extra empty definition for update_rq_runnable_avg() necessary for the !CONFIG_FAIR_GROUP_SCHED && CONFIG_SMP case. The function print_cfs_group_stats() which prints out struct sched_avg of struct rq is already guarded with CONFIG_FAIR_GROUP_SCHED. Reviewed-by: Ben Segall <bsegall@google.com> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/530DCDC5.1060406@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-02-27 12:41:00 +01:00
Peter Zijlstra	e3703f8cdf	perf: Fix hotplug splat Drew Richardson reported that he could make the kernel go boom when hotplugging while having perf events active. It turned out that when you have a group event, the code in __perf_event_exit_context() fails to remove the group siblings from the context. We then proceed with destroying and freeing the event, and when you re-plug the CPU and try and add another event to that CPU, things go boom because you've still got dead entries there. Reported-by: Drew Richardson <drew.richardson@arm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/n/tip-k6v5wundvusvcseqj1si0oz0@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-02-27 12:38:03 +01:00
Juri Lelli	faa5993736	sched/deadline: Prevent rt_time growth to infinity Kirill Tkhai noted: Since deadline tasks share rt bandwidth, we must care about bandwidth timer set. Otherwise rt_time may grow up to infinity in update_curr_dl(), if there are no other available RT tasks on top level bandwidth. RT task were in fact throttled right after they got enqueued, and never executed again (rt_time never again went below rt_runtime). Peter then proposed to accrue DL execution on rt_time only when rt timer is active, and proposed a patch (this patch is a slight modification of that) to implement that behavior. While this solves Kirill problem, it has a drawback. Indeed, Kirill noted again: It looks we may get into a situation, when all CPU time is shared between RT and DL tasks: rt_runtime = n rt_period = 2n \| RT working, DL sleeping \| DL working, RT sleeping \| ----------------------------------------------------------- \| (1) duration = n \| (2) duration = n \| (repeat) \|--------------------------\|------------------------------\| \| (rt_bw timer is running) \| (rt_bw timer is not running) \| No time for fair tasks at all. While this can happen during the first period, if rq is always backlogged, RT tasks won't have the opportunity to execute anymore: rt_time reached rt_runtime during (1), suppose after (2) RT is enqueued back, it gets throttled since rt timer didn't fire, replenishment is from now on eaten up by DL tasks that accrue their execution on rt_time (while rt timer is active - we have an RT task waiting for replenishment). FAIR tasks are not touched after this first period. Ok, this is not ideal, and the situation is even worse! What above (the nice case), practically never happens in reality, where your rt timer is not aligned to tasks periods, tasks are in general not periodic, etc.. Long story short, you always risk to overload your system. This patch is based on Peter's idea, but exploits an additional fact: if you don't have RT tasks enqueued, it makes little sense to continue incrementing rt_time once you reached the upper limit (DL tasks have their own mechanism for throttling). This cures both problems: - no matter how many DL instances in the past, you'll have an rt_time slightly above rt_runtime when an RT task is enqueued, and from that point on (after the first replenishment), the task will normally execute; - you can still eat up all bandwidth during the first period, but not anymore after that, remember that DL execution will increment rt_time till the upper limit is reached. The situation is still not perfect! But, we have a simple solution for now, that limits how much you can jeopardize your system, as we keep working towards the right answer: RT groups scheduled using deadline servers. Reported-by: Kirill Tkhai <tkhai@yandex.ru> Signed-off-by: Juri Lelli <juri.lelli@gmail.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20140225151515.617714e2f2cd6c558531ba61@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-02-27 12:29:41 +01:00
Juri Lelli	eec751ed41	sched/deadline: Switch CPU's presence test order Commit `82b9580` ("sched/deadline: Test for CPU's presence explicitly") changed how we check if a CPU returned by cpudeadline machinery is valid. But, we don't want to call cpu_present() if best_cpu is equal to -1. So, switch the order of tests inside WARN_ON(). Signed-off-by: Juri Lelli <juri.lelli@gmail.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: boris.ostrovsky@oracle.com Cc: konrad.wilk@oracle.com Cc: rostedt@goodmis.org Link: http://lkml.kernel.org/r/1393238832-9100-1-git-send-email-juri.lelli@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-02-27 12:29:40 +01:00
Kirill Tkhai	3908ac13b5	sched/deadline: Cleanup RT leftovers from {inc/dec}_dl_migration In deadline class we do not have group scheduling. So, let's remove unnecessary X = X; equations. Signed-off-by: Kirill Tkhai <ktkhai@parallels.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Juri Lelli <juri.lelli@gmail.com> Link: http://lkml.kernel.org/r/1393343543.4089.5.camel@tkhai Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-02-27 12:29:39 +01:00
George McCollister	791c9e0292	sched: Fix double normalization of vruntime dequeue_entity() is called when p->on_rq and sets se->on_rq = 0 which appears to guarentee that the !se->on_rq condition is met. If the task has done set_current_state(TASK_INTERRUPTIBLE) without schedule() the second condition will be met and vruntime will be incorrectly adjusted twice. In certain cases this can result in the task's vruntime never increasing past the vruntime of other tasks on the CFS' run queue, starving them of CPU time. This patch changes switched_from_fair() to use !p->on_rq instead of !se->on_rq. I'm able to cause a task with a priority of 120 to starve all other tasks with the same priority on an ARM platform running 3.2.51-rt72 PREEMPT RT by writing one character at time to a serial tty (16550 UART) in a tight loop. I'm also able to verify making this change corrects the problem on that platform and kernel version. Signed-off-by: George McCollister <george.mccollister@gmail.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1392767811-28916-1-git-send-email-george.mccollister@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-02-27 12:29:38 +01:00
Chuansheng Liu	c685689fd2	genirq: Remove racy waitqueue_active check We hit one rare case below: T1 calling disable_irq(), but hanging at synchronize_irq() always; The corresponding irq thread is in sleeping state; And all CPUs are in idle state; After analysis, we found there is one possible scenerio which causes T1 is waiting there forever: CPU0 CPU1 synchronize_irq() wait_event() spin_lock() atomic_dec_and_test(&threads_active) insert the __wait into queue spin_unlock() if(waitqueue_active) atomic_read(&threads_active) wake_up() Here after inserted the __wait into queue on CPU0, and before test if queue is empty on CPU1, there is no barrier, it maybe cause it is not visible for CPU1 immediately, although CPU0 has updated the queue list. It is similar for CPU0 atomic_read() threads_active also. So we'd need one smp_mb() before waitqueue_active.that, but removing the waitqueue_active() check solves it as wel l and it makes things simple and clear. Signed-off-by: Chuansheng Liu <chuansheng.liu@intel.com> Cc: Xiaoming Wang <xiaoming.wang@intel.com> Link: http://lkml.kernel.org/r/1393212590-32543-1-git-send-email-chuansheng.liu@intel.com Cc: stable@vger.kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-02-27 10:54:16 +01:00
Bjorn Helgaas	5edb93b89f	resource: Add resource_contains() We have two identical copies of resource_contains() already, and more places that could use it. This moves it to ioport.h where it can be shared. resource_contains(struct resource r1, struct resource r2) returns true iff r1 and r2 are the same type (most callers already checked this separately) and the r1 address range completely contains r2. In addition, the new resource_contains() checks that both r1 and r2 have addresses assigned to them. If a resource is IORESOURCE_UNSET, it doesn't have a valid address and can't contain or be contained by another resource. Some callers already check this or for res->start. No functional change. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2014-02-26 14:42:09 -07:00
Paul E. McKenney	f5604f67fe	Merge branch 'torture.2014.02.23a' into HEAD torture.2014.02.23a: locktorture addition and rcutorture changes	2014-02-26 06:38:59 -08:00
Paul E. McKenney	322efba5b6	Merge branches 'doc.2014.02.24a', 'fixes.2014.02.26a' and 'rt.2014.02.17b' into HEAD doc.2014.02.24a: Documentation changes fixes.2014.02.26a: Miscellaneous fixes rt.2014.02.17b: Response-time-related changes	2014-02-26 06:36:09 -08:00
Paul Gortmaker	5cb5c6e18f	rcu: Ensure kernel/rcu/rcu.h can be sourced/used stand-alone The kbuild test bot uncovered an implicit dependence on the trace header being present before rcu.h in ia64 allmodconfig that looks like this: In file included from kernel/ksysfs.c:22:0: kernel/rcu/rcu.h: In function '__rcu_reclaim': kernel/rcu/rcu.h:107:3: error: implicit declaration of function 'trace_rcu_invoke_kfree_callback' [-Werror=implicit-function-declaration] kernel/rcu/rcu.h:112:3: error: implicit declaration of function 'trace_rcu_invoke_callback' [-Werror=implicit-function-declaration] cc1: some warnings being treated as errors Looking at other rcu.h users, we can find that they all were sourcing the trace header in advance of rcu.h itself, as seen in the context of this diff. There were also some inconsistencies as to whether it was or wasn't sourced based on the parent tracing Kconfig. Rather than "fix" it at each use site, and have inconsistent use based on whether "#ifdef CONFIG_RCU_TRACE" was used or not, lets just source the trace header just once, in the actual consumer of it, which is rcu.h itself. We include it unconditionally, as build testing shows us that is a hard requirement for some files. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2014-02-26 06:35:18 -08:00
Paul Gortmaker	7a75474318	rcu: Fix sparse warning for rcu_expedited from kernel/ksysfs.c This commit fixes the follwoing warning: kernel/ksysfs.c:143:5: warning: symbol 'rcu_expedited' was not declared. Should it be static? Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> [ paulmck: Moved the declaration to include/linux/rcupdate.h to avoid including the RCU-internal rcu.h file outside of RCU. ] Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2014-02-26 06:35:16 -08:00
Paul E. McKenney	8857563b81	notifier: Substitute rcu_access_pointer() for rcu_dereference_raw() (Trivial patch.) If the code is looking at the RCU-protected pointer itself, but not dereferencing it, the rcu_dereference() functions can be downgraded to rcu_access_pointer(). This commit makes this downgrade in __blocking_notifier_call_chain() which simply compares the RCU-protected pointer against NULL with no dereferencing. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2014-02-26 06:35:13 -08:00
Vijaya Kumar K	d498d4b47f	KGDB: make kgdb_breakpoint() as noinline The function kgdb_breakpoint() sets up break point at compile time by calling arch_kgdb_breakpoint(); Though this call is surrounded by wmb() barrier, the compile can still re-order the break point, because this scheduling barrier is not a code motion barrier in gcc. Making kgdb_breakpoint() as noinline solves this problem of code reording around break point instruction and also avoids problem of being called as inline function from other places More details about discussion on this can be found here http://comments.gmane.org/gmane.linux.ports.arm.kernel/269732 Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com> Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: Jason Wessel <jason.wessel@windriver.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-02-26 11:16:25 +00:00
Oleg Nesterov	aea369b959	timers: Make internal_add_timer() update ->next_timer if ->active_timers == 0 The internal_add_timer() function updates base->next_timer only if timer->expires < base->next_timer. This is correct, but it also makes sense to do the same if we add the first non-deferrable timer. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Tested-by: Mike Galbraith <bitbucket@online.de>	2014-02-25 12:39:01 -08:00

... 3 4 5 6 7 ...

17785 commits