linux

mirror of synced 2025-03-06 20:59:54 +01:00

Author	SHA1	Message	Date
Ming Lei	53b615ccca	PM / Runtime: Introduce trace points for tracing rpm_* functions This patch introduces 3 trace points to prepare for tracing rpm_idle/rpm_suspend/rpm_resume functions, so we can use these trace points to replace the current dev_dbg(). Signed-off-by: Ming Lei <ming.lei@canonical.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>	2011-09-27 22:53:27 +02:00
Joe Perches	bfb9035c98	treewide: Correct spelling of successfully in comments Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2011-09-27 18:08:04 +02:00
Wang Xingchao	f4cfb33ed9	sched: Remove redundant test in check_preempt_tick() The caller already checks for nr_running > 1, therefore we don't have to do so again. Signed-off-by: Wang Xingchao <xingchao.wang@intel.com> Reviewed-by: Paul Turner <pjt@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1316194552-12019-1-git-send-email-xingchao.wang@intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-26 13:25:49 +02:00
Ingo Molnar	ed3982cf37	Merge commit 'v3.1-rc7' into perf/core Merge reason: Pick up the latest upstream fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-26 12:54:28 +02:00
Simon Kirby	6ebbe7a07b	sched: Fix up wchan borkage Commit `c259e01a1e` ("sched: Separate the scheduler entry for preemption") contained a boo-boo wrecking wchan output. It forgot to put the new schedule() function in the __sched section and thereby doesn't get properly ignored for things like wchan. Tested-by: Simon Kirby <sim@hostway.ca> Cc: stable@kernel.org # 2.6.39+ Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20110923000346.GA25425@hostway.ca Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-26 12:51:08 +02:00
Russell King	c825dda905	Merge branch 'for_3_2/for-rmk/arm_cpu_pm' of git://gitorious.org/omap-sw-develoment/linux-omap-dev into devel-stable	2011-09-26 09:36:36 +01:00
Oleg Nesterov	f9d81f61c8	ptrace: PTRACE_LISTEN forgets to unlock ->siglock If PTRACE_LISTEN fails after lock_task_sighand() it doesn't drop ->siglock. Reported-by: Matt Fleming <matt.fleming@intel.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-09-25 11:02:00 -07:00
Colin Cross	6f3eaec87b	cpu_pm: call notifiers during suspend Implements syscore_ops in cpu_pm to call the cpu and cpu cluster notifiers during suspend and resume, allowing drivers receiving the notifications to avoid implementing syscore_ops. Signed-off-by: Colin Cross <ccross@android.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Reviewed-by: Kevin Hilman <khilman@ti.com> Tested-and-Acked-by: Shawn Guo <shawn.guo@linaro.org> Tested-by: Vishwanath BS <vishwanath.bs@ti.com>	2011-09-23 12:05:29 +05:30
Colin Cross	ab10023e00	cpu_pm: Add cpu power management notifiers During some CPU power modes entered during idle, hotplug and suspend, peripherals located in the CPU power domain, such as the GIC, localtimers, and VFP, may be powered down. Add a notifier chain that allows drivers for those peripherals to be notified before and after they may be reset. Notified drivers can include VFP co-processor, interrupt controller and its PM extensions, local CPU timers context save/restore which shouldn't be interrupted. Hence CPU PM event APIs must be called with interrupts disabled. Signed-off-by: Colin Cross <ccross@android.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Reviewed-by: Kevin Hilman <khilman@ti.com> Tested-and-Acked-by: Shawn Guo <shawn.guo@linaro.org> Tested-by: Kevin Hilman <khilman@ti.com> Tested-by: Vishwanath BS <vishwanath.bs@ti.com>	2011-09-23 12:05:29 +05:30
Steven Rostedt	e36de1de4a	tracing: Fix preemptirqsoff tracer to not stop at preempt off If irqs are disabled when preemption count reaches zero, the preemptirqsoff tracer should not flag that as the end. When interrupts are enabled and preemption count is not zero the preemptirqsoff correctly continues its tracing. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-09-22 11:11:51 -04:00
hank	cbbc719fcc	time: Change jiffies_to_clock_t() argument type to unsigned long The parameter's origin type is long. On an i386 architecture, it can easily be larger than 0x80000000, causing this function to convert it to a sign-extended u64 type. Change the type to unsigned long so we get the correct result. Signed-off-by: hank <pyu@redhat.com> Cc: John Stultz <john.stultz@linaro.org> Cc: <stable@kernel.org> [ build fix ] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-21 10:28:51 +02:00
Russell King	f70cac8d9c	Merge branch 'kprobes-test' of git://git.yxit.co.uk/linux into devel-stable	2011-09-21 08:48:33 +01:00
Rob Herring	eef24afb28	irq: Fix check for already initialized irq_domain in irq_domain_add The sanity check in irq_domain_add() tests desc->irq_data != NULL or irq_data->domain != NULL. This prevents adding an irq_domain to an irq descriptor when irq_data exists, which is true when the irq descriptor exists. This went unnoticed so far as the simple domain code did not enter this code path because domain->nr_irqs is always 0 for the simple domains. Split the check for irq_data == NULL out and have a separate warning for it. [ tglx: Made the check for irq_data == NULL separate ] Signed-off-by: Rob Herring <rob.herring@calxeda.com> Cc: Grant Likely <grant.likely@secretlab.ca> Cc: marc.zyngier@arm.com Cc: thomas.abraham@linaro.org Cc: jamie@jamieiles.com Cc: b-cousson@ti.com Cc: shawn.guo@linaro.org Cc: linux-arm-kernel@lists.infradead.org Cc: devicetree-discuss@lists.ozlabs.org Link: http://lkml.kernel.org/r/1316017900-19918-3-git-send-email-robherring2@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-09-20 12:16:22 +02:00
Anton Blanchard	590e4d8571	sched: Allow SD_NODES_PER_DOMAIN to be overridden We want to override the default value of SD_NODES_PER_DOMAIN on ppc64, so move it into linux/topology.h. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-09-20 15:53:21 +10:00
Linus Torvalds	9d037a7776	Merge branch 'irq-fixes-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip * 'irq-fixes-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip: x86, iommu: Mark DMAR IRQ as non-threaded genirq: Make irq_shutdown() symmetric vs. irq_startup again	2011-09-19 17:23:41 -07:00
Linus Torvalds	58c3c3aa01	Make taskstats round statistics down to nearest 1k bytes/events Even with just the interface limited to admin, there really is little reason to give byte-per-byte counts for taskstats. So round it down to something less intrusive. Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-09-19 17:10:57 -07:00
Linus Torvalds	1a51410abe	Make TASKSTATS require root access Ok, this isn't optimal, since it means that 'iotop' needs admin capabilities, and we may have to work on this some more. But at the same time it is very much not acceptable to let anybody just read anybody elses IO statistics quite at this level. Use of the GENL_ADMIN_PERM suggested by Johannes Berg as an alternative to checking the capabilities by hand. Reported-by: Vasiliy Kulikov <segoon@openwall.com> Cc: Johannes Berg <johannes.berg@intel.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-09-19 17:04:37 -07:00
Steven Rostedt	6249687f76	tracing: Add a counter clock for those that do not trust clocks When debugging tight race conditions, it can be helpful to have a synchronized tracing method. Although in most cases the global clock provides this functionality, if timings is not the issue, it is more comforting to know that the order of events really happened in a precise order. Instead of using a clock, add a "counter" that is simply an incrementing atomic 64bit counter that orders the events as they are perceived to happen. The trace_clock_counter() is added from the attempt by Peter Zijlstra trying to convert the trace_clock_global() to it. I took Peter's counter code and made trace_clock_counter() instead, and added it to the choice of clocks. Just echo counter > /debug/tracing/trace_clock to activate it. Requested-by: Thomas Gleixner <tglx@linutronix.de> Requested-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-By: Valdis Kletnieks <valdis.kletnieks@vt.edu> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-09-19 11:35:58 -04:00
Thomas Gleixner	cba9bd22a5	watchdog: Drop FIFO policy in exit path When the watchdog thread exits it runs through the exit path with FIFO priority. There is no point in doing so. Switch back to SCHED_NORMAL before exiting. Cc: Don Zickus <dzickus@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1109121337461.2723@ionos Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-18 14:34:07 +02:00
Ingo Molnar	bfa322c48d	Merge branch 'linus' into sched/core Merge reason: We are queueing up a dependent patch. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-18 14:01:39 +02:00
Peter Zijlstra	0119fee449	lockdep: Comment all warnings Andrew requested I comment all the lockdep WARN()s to help other people figure out wth is wrong.. Requested-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1315301493.3191.9.camel@twins Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-18 13:58:57 +02:00
Shawn Bohrer	3be209a8e2	sched/rt: Migrate equal priority tasks to available CPUs Commit `43fa5460fe` ("sched: Try not to migrate higher priority RT tasks") also introduced a change in behavior which keeps RT tasks on the same CPU if there is an equal priority RT task currently running even if there are empty CPUs available. This can cause unnecessary wakeup latencies, and can prevent the scheduler from balancing all RT tasks across available CPUs. This change causes an RT task to search for a new CPU if an equal priority RT task is already running on wakeup. Lower priority tasks will still have to wait on higher priority tasks, but the system should still balance out because there is always the possibility that if there are both a high and low priority RT tasks on a given CPU that the high priority task could wakeup while the low priority task is running and force it to search for a better runqueue. Signed-off-by: Shawn Bohrer <sbohrer@rgmadvisors.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Tested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: stable@kernel.org # 37+ Link: http://lkml.kernel.org/r/1315837684-18733-1-git-send-email-sbohrer@rgmadvisors.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-18 13:48:56 +02:00
Jiri Kosina	e060c38434	Merge branch 'master' into for-next Fast-forward merge with Linus to be able to merge patches based on more recent version of the tree.	2011-09-15 15:08:18 +02:00
Bart Van Assche	ca4a04cf3d	futex: Fix spelling in a source code comment Change a single occurrence of "unlcoked" into "unlocked". Signed-off-by: Bart Van Assche <bvanassche@acm.org> Cc: Darren Hart <dvhltc@us.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2011-09-15 14:37:17 +02:00
Vitaliy Ivanov	7cfdaf38d4	futex: uninitialized warning corrections The variables here are really not used uninitialized. kernel/futex.c: In function 'fixup_pi_state_owner.clone.17': kernel/futex.c:1582:6: warning: 'curval' may be used uninitialized in this function kernel/futex.c: In function 'handle_futex_death': kernel/futex.c:2486:6: warning: 'nval' may be used uninitialized in this function kernel/futex.c: In function 'do_futex': kernel/futex.c:863:11: warning: 'curval' may be used uninitialized in this function kernel/futex.c:828:6: note: 'curval' was declared here kernel/futex.c:898:5: warning: 'oldval' may be used uninitialized in this function kernel/futex.c:890:6: note: 'oldval' was declared here Signed-off-by: Vitaliy Ivanov <vitalivanov@gmail.com> Acked-by: Darren Hart <dvhart@linux.intel.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2011-09-15 14:23:07 +02:00
Vitaliy Ivanov	124ff4e53a	async: uninitialized warning corrections The variables here are really not used uninitialized. kernel/async.c: In function 'async_synchronize_cookie_domain': kernel/async.c:270:10: warning: 'starttime.tv64' may be used uninitialized in this function kernel/async.c: In function 'async_run_entry_fn': kernel/async.c:122:10: warning: 'calltime.tv64' may be used uninitialized in this function Signed-off-by: Vitaliy Ivanov <vitalivanov@gmail.com> Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Signed-off-by: Viresh Kumar <viresh.kumar@st.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2011-09-15 14:22:28 +02:00
Thomas Tuttle	fa2563e41c	workqueue: lock cwq access in drain_workqueue Take cwq->gcwq->lock to avoid racing between drain_workqueue checking to make sure the workqueues are empty and cwq_dec_nr_in_flight decrementing and then incrementing nr_active when it activates a delayed work. We discovered this when a corner case in one of our drivers resulted in us trying to destroy a workqueue in which the remaining work would always requeue itself again in the same workqueue. We would hit this race condition and trip the BUG_ON on workqueue.c:3080. Signed-off-by: Thomas Tuttle <ttuttle@chromium.org> Acked-by: Tejun Heo <tj@kernel.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-09-14 18:09:38 -07:00
Thomas Gleixner	4523f6ada8	alarmtimers: Fix error handling commit `8bc0daf` (alarmtimers: Rework RTC device selection using class interface) did not implement required error checks. Add them. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-09-14 10:54:29 +02:00
Thomas Gleixner	757455d41c	locking, latencytop: Annotate latency_lock as raw The latency_lock lock can be taken in the guts of the scheduler code and therefore cannot be preempted on -rt - annotate it. In mainline this change documents the low level nature of the lock - otherwise there's no functional difference. Lockdep and Sparse checking will work as usual. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-13 11:12:02 +02:00
Thomas Gleixner	2737c49f29	locking, timer_stats: Annotate table_lock as raw The table_lock lock can be taken in atomic context and therefore cannot be preempted on -rt - annotate it. In mainline this change documents the low level nature of the lock - otherwise there's no functional difference. Lockdep and Sparse checking will work as usual. Reported-by: Andreas Sundebo <kernel@sundebo.dk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Andreas Sundebo <kernel@sundebo.dk> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-13 11:12:00 +02:00
Thomas Gleixner	8292c9e15c	locking, semaphores: Annotate inner lock as raw There is no reason to have the spin_lock protecting the semaphore preemptible on -rt. Annotate it as a raw_spinlock. In mainline this change documents the low level nature of the lock - otherwise there's no functional difference. Lockdep and Sparse checking will work as usual. ( On rt this also solves lockdep complaining about the rt_mutex.wait_lock being not initialized. ) Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-13 11:11:57 +02:00
Thomas Gleixner	ee30a7b2fc	locking, sched: Annotate thread_group_cputimer as raw The thread_group_cputimer lock can be taken in atomic context and therefore cannot be preempted on -rt - annotate it. In mainline this change documents the low level nature of the lock - otherwise there's no functional difference. Lockdep and Sparse checking will work as usual. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-13 11:11:55 +02:00
Thomas Gleixner	07354eb1a7	locking, printk: Annotate logbuf_lock as raw The logbuf_lock lock can be taken in atomic context and therefore cannot be preempted on -rt - annotate it. In mainline this change documents the low level nature of the lock - otherwise there's no functional difference. Lockdep and Sparse checking will work as usual. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> [ merged and fixed it ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-13 11:11:54 +02:00
Thomas Gleixner	5389f6fad2	locking, tracing: Annotate tracing locks as raw The tracing locks can be taken in atomic context and therefore cannot be preempted on -rt - annotate it. In mainline this change documents the low level nature of the lock - otherwise there's no functional difference. Lockdep and Sparse checking will work as usual. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-13 11:11:52 +02:00
Thomas Gleixner	cdcc136ffd	locking, sched, cgroups: Annotate release_list_lock as raw The release_list_lock can be taken in atomic context and therefore cannot be preempted on -rt - annotate it. In mainline this change documents the low level nature of the lock - otherwise there's no functional difference. Lockdep and Sparse checking will work as usual. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-13 11:11:49 +02:00
Thomas Gleixner	ec484608c5	locking, kprobes: Annotate the hash locks and kretprobe.lock as raw The kprobe locks can be taken in atomic context and therefore cannot be preempted on -rt - annotate it. In mainline this change documents the low level nature of the lock - otherwise there's no functional difference. Lockdep and Sparse checking will work as usual. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-13 11:11:45 +02:00
Thomas Gleixner	9fb6033625	clocksource: Make watchdog reset lockless KGDB needs to trylock watchdog_lock when trying to reset the clocksource watchdog after the system has been stopped to avoid a potential deadlock. When the trylock fails TSC usually becomes unstable. We can be more clever by using an atomic counter and checking it in the clocksource_watchdog callback. We restart the watchdog whenever the counter is > 0 and only decrement the counter when we ran through a full update cycle. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <johnstul@us.ibm.com> Acked-by: Jason Wessel <jason.wessel@windriver.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1109121326280.2723@ionos Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-09-13 09:58:29 +02:00
Thomas Gleixner	0fa914c632	rtmutex: Cleanup the debug code Use the existing lock debugging macros. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-09-12 13:42:32 +02:00
Santosh Shilimkar	60f96b41f7	genirq: Add IRQCHIP_SKIP_SET_WAKE flag Some irq chips need the irq_set_wake() functionality, but do not require a irq_set_wake() callback. Instead of forcing an empty callback to be implemented add a flag which notes this fact. Check for the flag in set_irq_wake_real() and return success when set. Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Cc: Thomas Gleixner <tglx@linutronix.de>	2011-09-12 09:52:49 +02:00
Geert Uytterhoeven	ed585a6516	genirq: Make irq_shutdown() symmetric vs. irq_startup again If an irq_chip provides .irq_shutdown(), but neither of .irq_disable() or .irq_mask(), free_irq() crashes when jumping to NULL. Fix this by only trying .irq_disable() and .irq_mask() if there's no .irq_shutdown() provided. This revives the symmetry with irq_startup(), which tries .irq_startup(), .irq_enable(), and irq_unmask(), and makes it consistent with the comment for irq_chip.irq_shutdown() in <linux/irq.h>, which says: * @irq_shutdown: shut down the interrupt (defaults to ->disable if NULL) This is also how __free_irq() behaved before the big overhaul, cfr. e.g. `3b56f0585f` ("genirq: Remove bogus conditional"), where the core interrupt code always overrode .irq_shutdown() to .irq_disable() if .irq_shutdown() was NULL. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: linux-m68k@lists.linux-m68k.org Link: http://lkml.kernel.org/r/1315742394-16036-2-git-send-email-geert@linux-m68k.org Cc: stable@kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-09-12 09:38:53 +02:00
Peter Zijlstra	e8abccb719	posix-cpu-timers: Cure SMP accounting oddities David reported: Attached below is a watered-down version of rt/tst-cpuclock2.c from GLIBC. Just build it with "gcc -o test test.c -lpthread -lrt" or similar. Run it several times, and you will see cases where the main thread will measure a process clock difference before and after the nanosleep which is smaller than the cpu-burner thread's individual thread clock difference. This doesn't make any sense since the cpu-burner thread is part of the top-level process's thread group. I've reproduced this on both x86-64 and sparc64 (using both 32-bit and 64-bit binaries). For example: [davem@boricha build-x86_64-linux]$ ./test process: before(0.001221967) after(0.498624371) diff(497402404) thread: before(0.000081692) after(0.498316431) diff(498234739) self: before(0.001223521) after(0.001240219) diff(16698) [davem@boricha build-x86_64-linux]$ The diff of 'process' should always be >= the diff of 'thread'. I make sure to wrap the 'thread' clock measurements the most tightly around the nanosleep() call, and that the 'process' clock measurements are the outer-most ones. 
--- #include <unistd.h> #include <stdio.h> #include <stdlib.h> #include <time.h> #include <fcntl.h> #include <string.h> #include <errno.h> #include <pthread.h> static pthread_barrier_t barrier; static void *chew_cpu(void *arg) { pthread_barrier_wait(&barrier); while (1) __asm__ __volatile__("" : : : "memory"); return NULL; } int main(void) { clockid_t process_clock, my_thread_clock, th_clock; struct timespec process_before, process_after; struct timespec me_before, me_after; struct timespec th_before, th_after; struct timespec sleeptime; unsigned long diff; pthread_t th; int err; err = clock_getcpuclockid(0, &process_clock); if (err) return 1; err = pthread_getcpuclockid(pthread_self(), &my_thread_clock); if (err) return 1; pthread_barrier_init(&barrier, NULL, 2); err = pthread_create(&th, NULL, chew_cpu, NULL); if (err) return 1; err = pthread_getcpuclockid(th, &th_clock); if (err) return 1; pthread_barrier_wait(&barrier); err = clock_gettime(process_clock, &process_before); if (err) return 1; err = clock_gettime(my_thread_clock, &me_before); if (err) return 1; err = clock_gettime(th_clock, &th_before); if (err) return 1; sleeptime.tv_sec = 0; sleeptime.tv_nsec = 500000000; nanosleep(&sleeptime, NULL); err = clock_gettime(th_clock, &th_after); if (err) return 1; err = clock_gettime(my_thread_clock, &me_after); if (err) return 1; err = clock_gettime(process_clock, &process_after); if (err) return 1; diff = process_after.tv_nsec - process_before.tv_nsec; printf("process: before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n", process_before.tv_sec, process_before.tv_nsec, process_after.tv_sec, process_after.tv_nsec, diff); diff = th_after.tv_nsec - th_before.tv_nsec; printf("thread: before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n", th_before.tv_sec, th_before.tv_nsec, th_after.tv_sec, th_after.tv_nsec, diff); diff = me_after.tv_nsec - me_before.tv_nsec; printf("self: before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n", me_before.tv_sec, me_before.tv_nsec, me_after.tv_sec, 
me_after.tv_nsec, diff); return 0; } This is due to us using p->se.sum_exec_runtime in thread_group_cputime() where we iterate the thread group and sum all data. This does not take time since the last schedule operation (tick or otherwise) into account. We can cure this by using task_sched_runtime() at the cost of having to take locks. This also means we can (and must) do away with thread_group_sched_runtime() since the modified thread_group_cputime() is now more accurate and would deadlock when called from thread_group_sched_runtime(). Reported-by: David Miller <davem@davemloft.net> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1314874459.7945.22.camel@twins Cc: stable@kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-09-08 15:25:52 +02:00
Martin Schwidefsky	65516f8a7c	clockevents: Add direct ktime programming function There is at least one architecture (s390) with a sane clockevent device that can be programmed with the equivalent of a ktime. No need to create a delta against the current time, the ktime can be used directly. A new clock device function 'set_next_ktime' is introduced that is called with the unmodified ktime for the timer if the clock event device has the CLOCK_EVT_FEAT_KTIME bit set. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: john stultz <johnstul@us.ibm.com> Link: http://lkml.kernel.org/r/20110823133142.815350967@de.ibm.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-09-08 11:10:56 +02:00
Martin Schwidefsky	d1748302f7	clockevents: Make minimum delay adjustments configurable The automatic increase of the min_delta_ns of a clockevents device should be done in the clockevents code as the minimum delay is an attribute of the clockevents device. In addition not all architectures want the automatic adjustment, on a massively virtualized system it can happen that the programming of a clock event fails several times in a row because the virtual cpu has been rescheduled quickly enough. In that case the minimum delay will erroneously be increased with no way back. The new config symbol GENERIC_CLOCKEVENTS_MIN_ADJUST is used to enable the automatic adjustment. The config option is selected only for x86. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: john stultz <johnstul@us.ibm.com> Link: http://lkml.kernel.org/r/20110823133142.494157493@de.ibm.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-09-08 11:10:56 +02:00
Heiko Carstens	29c158e81c	nohz: Remove "Switched to NOHz mode" debugging messages When performing cpu hotplug tests the kernel printk log buffer gets flooded with pointless "Switched to NOHz mode..." messages. Especially when afterwards analyzing a dump this might have removed more interesting stuff out of the buffer. Assuming that switching to NOHz mode simply works just remove the printk. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Link: http://lkml.kernel.org/r/20110823112046.GB2540@osiris.boeblingen.de.ibm.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-09-08 11:10:55 +02:00
Michal Hocko	09a1d34f85	nohz: Make idle/iowait counter update conditional get_cpu_{idle,iowait}_time_us update idle/iowait counters unconditionally if the given CPU is in the idle loop. This doesn't work well outside of CPU governors which are singletons so nobody (except for IRQ) can race with them. We will need to use both functions from /proc/stat handler to properly handle nohz idle/iowait times. Make the update depend on a non NULL last_update_time argument. Signed-off-by: Michal Hocko <mhocko@suse.cz> Cc: Dave Jones <davej@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Alexey Dobriyan <adobriyan@gmail.com> Link: http://lkml.kernel.org/r/11f23179472635ce52e78921d47a20216b872f23.1314172057.git.mhocko@suse.cz Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-09-08 11:10:55 +02:00
Michal Hocko	6beea0cda8	nohz: Fix update_ts_time_stat idle accounting update_ts_time_stat currently updates idle time even if we are in iowait loop at the moment. The only real users of the idle counter (via get_cpu_idle_time_us) are CPU governors and they expect to get cumulative time for both idle and iowait times. The value (idle_sleeptime) is also printed to userspace by print_cpu but it prints both idle and iowait times so the idle part is misleading. Let's clean this up and fix update_ts_time_stat to account both counters properly and update consumers of idle to consider iowait time as well. If we do this we might use get_cpu_{idle,iowait}_time_us from other contexts as well and we will get expected values. Signed-off-by: Michal Hocko <mhocko@suse.cz> Cc: Dave Jones <davej@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Alexey Dobriyan <adobriyan@gmail.com> Link: http://lkml.kernel.org/r/e9c909c221a8da402c4da07e4cd968c3218f8eb1.1314172057.git.mhocko@suse.cz Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-09-08 11:10:55 +02:00
Linus Torvalds	79016f6488	Merge branch 'timers-fixes-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip * 'timers-fixes-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip: rtc: twl: Fix registration vs. init order rtc: Initialized rtc_time->tm_isdst rtc: Fix RTC PIE frequency limit rtc: rtc-twl: Remove lockdep related local_irq_enable() rtc: rtc-twl: Switch to using threaded irq rtc: ep93xx: Fix 'rtc' may be used uninitialized warning alarmtimers: Avoid possible denial of service with high freq periodic timers alarmtimers: Memset itimerspec passed into alarm_timer_get alarmtimers: Avoid possible null pointer traversal	2011-09-07 13:03:48 -07:00
Linus Torvalds	e81b693c01	Merge branch 'sched-fixes-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip * 'sched-fixes-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip: sched: Fix a memory leak in __sdt_free() sched: Move blk_schedule_flush_plug() out of __schedule() sched: Separate the scheduler entry for preemption	2011-09-07 13:01:34 -07:00
Eric B Munson	7f310a5d4e	perf_event: Fix broken calc_timer_values() We detected a serious issue with PERF_SAMPLE_READ and timing information when events were being multiplexing. Samples would have time_running > time_enabled. That was easy to reproduce with a libpfm4 example (ran 3 times to cause multiplexing on Core 2): $ syst_smpl -e uops_retired:freq=1 & $ syst_smpl -e uops_retired:freq=1 & $ syst_smpl -e uops_retired:freq=1 & IIP:0x0000000040062d ... PERIOD:2355332948 ENA=40144625315 RUN=60014875184 syst_smpl: WARNING: time_running > time_enabled 63277537998 uops_retired:freq=1 , scaled The bug was not present in kernel up to (and including) 3.0. It turns out the bug was introduced by the following commit: commit `c479429591` events: Move lockless timer calculation into helper function The parameters of the function got reversed yet the call sites were not updated to reflect the change. That lead to time_running and time_enabled being swapped. That had no effect when there was no multiplexing because in that case time_running = time_enabled but it would show up in any other scenario. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20110829124112.GA4828@quad Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-08-31 15:56:29 +02:00
Mark Rutland	5f12a76193	perf: provide PMU when initing events Currently, an event's 'pmu' field is set after pmu::event_init() is called. This means that pmu::event_init() must figure out which struct pmu the event was initialised from. This makes it difficult to consolidate common event initialisation code for similar PMUs, and very difficult to implement drivers for PMUs which can have multiple instances (e.g. a USB controller PMU, a GPU PMU, etc). This patch sets the 'pmu' field before initialising the event, allowing event init code to identify the struct pmu instance easily. In the event of failure to initialise an event, the event is destroyed via kfree() without calling perf_event::destroy(), so this shouldn't result in bad behaviour even if the destroy field was set before failure to initialise was noted. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1313062280-19123-1-git-send-email-mark.rutland@arm.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-08-31 10:49:59 +01:00

12429 commits