The speed and clock of the serial ports are retrieved from the device
tree in both the PowerPC legacy serial code and the Open Firmware serial
driver, so both need to handle the fact that the device tree is always
big-endian, while the CPU may not be.
Also fix other device tree references in the legacy serial code.
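Conceptually the fix follows the usual pattern for reading device tree
cells (a sketch with illustrative np/port variables, not the literal
diff):

    /* Device tree cells are big-endian; convert them rather than
     * dereferencing the property pointer directly. */
    const u32 *clk = of_get_property(np, "clock-frequency", NULL);
    if (clk)
        port->uartclk = be32_to_cpup(clk);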
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Fix the IRQ flag handling naming. In linux/irqflags.h under one configuration,
it maps:
local_irq_enable() -> raw_local_irq_enable()
local_irq_disable() -> raw_local_irq_disable()
local_irq_save() -> raw_local_irq_save()
...
and under the other configuration, it maps:
raw_local_irq_enable() -> local_irq_enable()
raw_local_irq_disable() -> local_irq_disable()
raw_local_irq_save() -> local_irq_save()
...
This is quite confusing. There should be one set of names expected of the
arch, and this should be wrapped to give another set of names that are expected
by users of this facility.
Change this to have the arch provide:
flags = arch_local_save_flags()
flags = arch_local_irq_save()
arch_local_irq_restore(flags)
arch_local_irq_disable()
arch_local_irq_enable()
arch_irqs_disabled_flags(flags)
arch_irqs_disabled()
arch_safe_halt()
Then linux/irqflags.h wraps these to provide:
raw_local_save_flags(flags)
raw_local_irq_save(flags)
raw_local_irq_restore(flags)
raw_local_irq_disable()
raw_local_irq_enable()
raw_irqs_disabled_flags(flags)
raw_irqs_disabled()
raw_safe_halt()
with type checking on the flags 'arguments', and then wraps those to provide:
local_save_flags(flags)
local_irq_save(flags)
local_irq_restore(flags)
local_irq_disable()
local_irq_enable()
irqs_disabled_flags(flags)
irqs_disabled()
safe_halt()
with tracing included if enabled.
The arch functions can now all be inline functions rather than some of them
having to be macros.
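The layering can be sketched as follows (simplified; the real macros in
linux/irqflags.h also handle the !CONFIG_TRACE_IRQFLAGS case):

    /* Middle layer: type-check the flags and call the arch. */
    #define raw_local_irq_save(flags)            \
        do {                                     \
            typecheck(unsigned long, flags);     \
            flags = arch_local_irq_save();       \
        } while (0)

    /* Top layer: add irq-off tracing when enabled. */
    #define local_irq_save(flags)                \
        do {                                     \
            raw_local_irq_save(flags);           \
            trace_hardirqs_off();                \
        } while (0)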
Signed-off-by: David Howells <dhowells@redhat.com> [X86, FRV, MN10300]
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com> [Tile]
Signed-off-by: Michal Simek <monstr@monstr.eu> [Microblaze]
Tested-by: Catalin Marinas <catalin.marinas@arm.com> [ARM]
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> [AVR]
Acked-by: Tony Luck <tony.luck@intel.com> [IA-64]
Acked-by: Hirokazu Takata <takata@linux-m32r.org> [M32R]
Acked-by: Greg Ungerer <gerg@uclinux.org> [M68K/M68KNOMMU]
Acked-by: Ralf Baechle <ralf@linux-mips.org> [MIPS]
Acked-by: Kyle McMartin <kyle@mcmartin.ca> [PA-RISC]
Acked-by: Paul Mackerras <paulus@samba.org> [PowerPC]
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> [S390]
Acked-by: Chen Liqin <liqin.chen@sunplusct.com> [Score]
Acked-by: Matt Fleming <matt@console-pimps.org> [SH]
Acked-by: David S. Miller <davem@davemloft.net> [Sparc]
Acked-by: Chris Zankel <chris@zankel.net> [Xtensa]
Reviewed-by: Richard Henderson <rth@twiddle.net> [Alpha]
Reviewed-by: Yoshinori Sato <ysato@users.sourceforge.jp> [H8300]
Cc: starvik@axis.com [CRIS]
Cc: jesper.nilsson@axis.com [CRIS]
Cc: linux-cris-kernel@axis.com
Since arch/powerpc is built with -Werror, the build was broken like
this:
cc1: warnings being treated as errors
arch/powerpc/kernel/module.c: In function 'module_finalize':
arch/powerpc/kernel/module.c:66: error: unused variable 'err'
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With all the recent module loading cleanups, we've minimized the code
that sits under module_mutex, fixing various deadlocks and making it
possible to do most of the module loading in parallel.
However, that whole conversion totally missed the rather obscure code
that adds a new module to the list for BUG() handling. That code was
doubly obscure because (a) the code itself lives in lib/bug.c (for
dubious reasons) and (b) it gets called from the architecture-specific
"module_finalize()" rather than from generic code.
Calling it from arch-specific code makes no sense what-so-ever to begin
with, and is now actively wrong since that code isn't protected by the
module loading lock any more.
So this commit moves the "module_bug_{finalize,cleanup}()" calls away
from the arch-specific code, and into the generic code - and in the
process protects it with the module_mutex so that the list operations
are now safe.
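In kernel/module.c terms the move looks roughly like this (a sketch,
not the literal diff; the surrounding load_module() context is
abbreviated):

    /* Generic loader: put the module on the BUG list while holding
     * module_mutex, instead of each arch's module_finalize() doing
     * it with no locking at all. */
    mutex_lock(&module_mutex);
    module_bug_finalize(info->hdr, info->sechdrs, mod);
    list_add_rcu(&mod->list, &modules);
    mutex_unlock(&module_mutex);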
Future fixups:
- move the module list handling code into kernel/module.c where it
belongs.
- get rid of 'module_bug_list' and just use the regular list of modules
(called 'modules' - imagine that) that we already create and maintain
for other reasons.
Reported-and-tested-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Adrian Bunk <bunk@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The logic to distinguish marked instruction events from ordinary events
on PPC970 and derivatives was flawed. The result is that instruction
sampling didn't get enabled in the PMU for some marked instruction
events, so they would never trigger. This fixes it by adding the
appropriate break statements in the switch statement.
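The shape of the bug, with purely illustrative event codes:

    switch (psel) {
    case 0x12:          /* a marked-instruction event */
        is_marked = 1;
        break;          /* previously missing, so control fell */
    case 0x13:          /* through and is_marked was clobbered */
        is_marked = 0;
        break;
    }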
Reported-by: David Binderman <dcb314@hotmail.com>
Cc: stable@kernel.org
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Make sigreturn zero regs->trap, make do_signal() do the same on all
paths. As it is, signal interrupting e.g. read() from fd 512 (==
ERESTARTSYS) with another signal getting unblocked when the first
handler finishes will lead to a restart one insn earlier than it ought
to. Same for multiple signals with in-kernel handlers interrupting
that sucker at the same time. Same for multiple signals of any kind
interrupting that sucker on 64bit...
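A sketch of the idea (illustrative, not the literal diff):

    /* Once a signal frame has been set up (or unwound by sigreturn),
     * clear the saved trap so a second trip through do_signal()
     * cannot mistake the regs for a just-interrupted syscall and
     * rewind the NIP again. */
    regs->trap = 0;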
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Replace pmu::{enable,disable,start,stop,unthrottle} with
pmu::{add,del,start,stop}, all of which take a flags argument.
The new interface extends the capability to stop a counter while
keeping it scheduled on the PMU. We replace the throttled state with
the generic stopped state.
This also allows us to efficiently stop/start counters over certain
code paths (like IRQ handlers).
It also allows scheduling a counter without it starting, allowing for
a generic frozen state (useful for rotating stopped counters).
The stopped state is implemented in two different ways, depending on
how the architecture implemented the throttled state:
1) We disable the counter:
a) the pmu has per-counter enable bits, we flip that
b) we program a NOP event, preserving the counter state
2) We store the counter state and ignore all read/overflow events
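Roughly, the interface becomes (a simplified sketch of the new
callbacks; PERF_EF_START in flags asks add() to start counting
immediately):

    struct pmu {
        /* Schedule the event on / off the PMU. */
        int  (*add)(struct perf_event *event, int flags);
        void (*del)(struct perf_event *event, int flags);

        /* Start/stop a counter that stays scheduled on the PMU. */
        void (*start)(struct perf_event *event, int flags);
        void (*stop)(struct perf_event *event, int flags);
    };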
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus <paulus@samba.org>
Cc: stephane eranian <eranian@googlemail.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Yanmin <yanmin_zhang@linux.intel.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: Michael Cree <mcree@orcon.net.nz>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Since the current perf_disable() usage is only an optimization,
remove it for now. This eases the removal of the __weak
hw_perf_enable() interface.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus <paulus@samba.org>
Cc: stephane eranian <eranian@googlemail.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Yanmin <yanmin_zhang@linux.intel.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: Michael Cree <mcree@orcon.net.nz>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Simple registration interface for struct pmu, this provides the
infrastructure for removing all the weak functions.
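The interface is deliberately minimal (sketch):

    int perf_pmu_register(struct pmu *pmu);
    void perf_pmu_unregister(struct pmu *pmu);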
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus <paulus@samba.org>
Cc: stephane eranian <eranian@googlemail.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Yanmin <yanmin_zhang@linux.intel.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: Michael Cree <mcree@orcon.net.nz>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Some platforms may want to override dma_set_mask() to take into
account some specific "features" such as the availability of
a direct-map window in addition to an iommu.
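A sketch of such a hook (hedged; modelled on the powerpc machdep
vector, with the fallback path abbreviated):

    int dma_set_mask(struct device *dev, u64 dma_mask)
    {
        if (ppc_md.dma_set_mask)        /* platform override */
            return ppc_md.dma_set_mask(dev, dma_mask);
        if (!dev->dma_mask || !dma_supported(dev, dma_mask))
            return -EIO;
        *dev->dma_mask = dma_mask;
        return 0;
    }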
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch removes all explicit tests for the TIF_32BIT flag.
Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Neither lfs nor stfs touch the fpscr, so remove the restore/save of it
around them.
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Since the cpu accounting code uses the hypervisor dispatch trace log
now when CONFIG_VIRT_CPU_ACCOUNTING = y, the previous commit disabled
access to it via files in the /sys/kernel/debug/powerpc/dtl/ directory
in that case. This restores those files.
To do this, we now have a hook that the cpu accounting code will call
as it processes each entry from the hypervisor dispatch trace log.
The code in dtl.c now uses that to fill up its ring buffer, rather
than having the hypervisor fill the ring buffer directly.
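The hook is a simple per-entry callback, along these lines (the
signature is a sketch inferred from the description above):

    /* Called by the time accounting code for each dispatch trace log
     * entry it consumes; dtl.c installs a consumer that copies the
     * entry into its per-cpu ring buffer for the debugfs files. */
    extern void (*dtl_consumer)(struct dtl_entry *entry, u64 index);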
This also fixes dtl_file_read() to handle overflow conditions a bit
better and adds a spinlock to ensure that race conditions (multiple
processes opening or reading the file concurrently) are handled
correctly.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Currently, when CONFIG_VIRT_CPU_ACCOUNTING is enabled, we use the
PURR register for measuring the user and system time used by
processes, as well as other related times such as hardirq and
softirq times. This turns out to be quite confusing for users
because it means that a program will often be measured as taking
less time when run on a multi-threaded processor (SMT2 or SMT4 mode)
than it does when run on a single-threaded processor (ST mode), even
though the program takes longer to finish. The discrepancy is
accounted for as stolen time, which is also confusing, particularly
when there are no other partitions running.
This changes the accounting to use the timebase instead, meaning that
the reported user and system times are the actual number of real-time
seconds that the program was executing on the processor thread,
regardless of which SMT mode the processor is in. Thus a program will
generally show greater user and system times when run on a
multi-threaded processor than on a single-threaded processor.
On pSeries systems on POWER5 or later processors, we measure the
stolen time (time when this partition wasn't running) using the
hypervisor dispatch trace log. We check for new entries in the
log on every entry from user mode and on every transition from
kernel process context to soft or hard IRQ context (i.e. when
account_system_vtime() gets called). So that we can correctly
distinguish time stolen from user time and time stolen from system
time, without having to check the log on every exit to user mode,
we store separate timestamps for exit to user mode and entry from
user mode.
On systems that have a SPURR (POWER6 and POWER7), we read the SPURR
in account_system_vtime() (as before), and then apportion the SPURR
ticks since the last time we read it between scaled user time and
scaled system time according to the relative proportions of user
time and system time over the same interval. This avoids having to
read the SPURR on every kernel entry and exit. On systems that have
PURR but not SPURR (i.e., POWER5), we do the same using the PURR
rather than the SPURR.
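The apportioning amounts to (a sketch with illustrative variable
names):

    /* Split the SPURR ticks accumulated since the last read in the
     * same ratio as the timebase-based user/system split over the
     * same interval. */
    u64 total = user_tb + sys_tb;   /* timebase ticks this interval */
    u64 user_scaled = total ? spurr_delta * user_tb / total : 0;
    u64 sys_scaled = spurr_delta - user_scaled;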
This disables the DTL user interface in /sys/kernel/debug/powerpc/dtl
for now since it conflicts with the use of the dispatch trace log
by the time accounting code.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This arranges for the lppaca structs for most cpus to be dynamically
allocated in the same manner as the paca structs. If we don't include
support for legacy iSeries, only the first lppaca is statically
allocated; the rest are dynamically allocated. If we include legacy
iSeries support, then we statically allocate the first 64 lppaca
structs, since the iSeries hypervisor requires that the lppaca
structs be present in the data section of the kernel image, but
legacy iSeries supports at most 64 cpus.
With CONFIG_NR_CPUS=1024, the kernel image size for a typical pSeries config
went from:
   text    data     bss      dec     hex filename
9524478 4734564 8469944 22728986 15ad11a ../test-1024/vmlinux
to:
   text    data     bss      dec     hex filename
9524482 3751508 8469944 21745934 14bd10e ../test-1024/vmlinux
a reduction of 983052 bytes overall.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Currently we have the lppaca structs as a simple array of NR_CPUS
entries, taking up space in the data section of the kernel image.
In future we would like to allocate them dynamically, so this
abstracts out the accesses to the array, making it easier to
change how we locate the lppaca for a given cpu in future.
Specifically, lppaca[cpu] changes to lppaca_of(cpu).
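The accessor is a one-liner for now (a sketch matching the description;
the definition is free to change underneath once the structs go
dynamic):

    #define lppaca_of(cpu)    (*paca[cpu].lppaca_ptr)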
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Simple cleanup by moving arch_sd_sibling_asym_packing from process.c to
smp.c to save an #ifdef CONFIG_SMP.
No functionality change.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The POWER architecture does not require stcx to check that it is operating
on the same address as the larx. This means it is possible for
an exception handler to execute a larx, get a reservation, decide
not to do the stcx and then return back with an active reservation. If the
interrupted code was in the middle of a larx/stcx sequence the stcx could
incorrectly succeed.
All recent POWER CPUs check the address before letting the stcx succeed
so we can create a CPU feature and nop it out. As Ben suggested, we can
only do this in our syscall path because there is a remote possibility
some kernel code gets interrupted by an exception that ends up operating
on the same cacheline.
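The mechanism, sketched as C inline asm for illustration (the real
change lives in the asm syscall exit path, wrapped in a CPU-feature
section so it can be patched into a nop):

    /* Clear any leftover reservation with a store-conditional to our
     * own stack slot; whether it succeeds or fails, the reservation
     * is gone. CPUs that check the stcx. address can nop this out. */
    unsigned long dummy;
    asm volatile("stdcx. %1,0,%2" : "=m" (dummy)
                 : "r" (0), "r" (&dummy) : "cr0");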
Thanks to Paul Mackerras and Derek Williams for the idea.
To test this I used a very simple null syscall (actually getppid) testcase
at http://ozlabs.org/~anton/junkcode/null_syscall.c
I tested against 2.6.35-git10 with the following changes against the
pseries_defconfig:
CONFIG_VIRT_CPU_ACCOUNTING=n
CONFIG_AUDIT=n
CONFIG_PPC_4K_PAGES=n
CONFIG_PPC_64K_PAGES=y
CONFIG_FORCE_MAX_ZONEORDER=9
CONFIG_PPC_SUBPAGE_PROT=n
CONFIG_FUNCTION_TRACER=n
CONFIG_FUNCTION_GRAPH_TRACER=n
CONFIG_IRQSOFF_TRACER=n
CONFIG_STACK_TRACER=n
to remove the overhead of virtual CPU accounting, syscall auditing and
the ftrace mcount tracers. 64kB pages were enabled to minimise TLB misses.
POWER6: +8.2%
POWER7: +7.0%
Another suggestion was to use a larx to something in the L1 instead of a stcx.
This was almost as fast as removing the larx on POWER6, but only 3.5% faster
on POWER7. We can use this to speed up the reservation clear in our
exception exit code.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
In f761622e59 we changed
early_setup_secondary so it's called using the proper kernel stack
rather than the emergency one.
Unfortunately, this stack pointer can't be used when translation is off
on PHYP as this stack pointer might be outside the RMO. This results in
the following on all non-zero cpus:
cpu 0x1: Vector: 300 (Data Access) at [c00000001639fd10]
pc: 000000000001c50c
lr: 000000000000821c
sp: c00000001639ff90
msr: 8000000000001000
dar: c00000001639ffa0
dsisr: 42000000
current = 0xc000000016393540
paca = 0xc000000006e00200
pid = 0, comm = swapper
The original patch was only tested on a bare metal system, so it never
caught this problem.
This changes __secondary_start so that we calculate the new stack
pointer but only start using it after we've called early_setup_secondary.
With this patch, the above problem goes away.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Commit 0fe1ac48 ("powerpc/perf_event: Fix oops due to
perf_event_do_pending call") moved the call to perf_event_do_pending
in timer_interrupt() down so that it was after the irq_enter() call.
Unfortunately this moved it after the code that checks whether it
is time for the next decrementer clock event. The result is that
the call to perf_event_do_pending() won't happen until the next
decrementer clock event is due. This was pointed out by Milton
Miller.
This fixes it by moving the check for whether it's time for the
next decrementer clock event down to the point where we're about
to call the event handler, after we've called perf_event_do_pending.
This has the side effect that on old pre-Core99 Powermacs where we
use the ppc_n_lost_interrupts mechanism to replay interrupts, a
replayed interrupt will incur a little more latency since it will
now do the code from the irq_enter down to the irq_exit, that it
used to skip. However, these machines are now old and rare enough
that this doesn't matter. To make it clear that ppc_n_lost_interrupts
is only used on Powermacs, and to speed up the code slightly on
non-Powermac ppc32 machines, the code that tests ppc_n_lost_interrupts
is now conditional on CONFIG_PMAC as well as CONFIG_PPC32.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: stable@kernel.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Call kexec purgatory code correctly. We were getting lucky before.
If you examine the powerpc 32bit kexec "purgatory" code you will
see it expects the following:
From kexec-tools: purgatory/arch/ppc/v2wrap_32.S
-> calling convention:
-> r3 = physical number of this cpu (all cpus)
-> r4 = address of this chunk (master only)
As such, we need to set r3 to the current core. r4 happens to be
unused by purgatory at the moment, but we go ahead and set it here as
well.
Signed-off-by: Matthew McClintock <msm@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
pci_device_to_OF_node() can return null, and list_for_each_entry will
never enter the loop when dev is NULL, so it looks like this test is
a typo.
Reported-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
As early setup calls down to slb_initialize(), we must have kstack
initialised before checking "should we add a bolted SLB entry for our kstack?"
Failing to do so means stack access requires an SLB miss exception to refill
an entry dynamically, if the stack isn't accessible via SLB(0) (kernel text
& static data). It's not always allowable to take such a miss, and
intermittent crashes will result.
Primary CPUs don't have this issue; an SLB entry is not bolted for their
stack anyway (as that lives within SLB(0)). This patch therefore only
affects the init of secondaries.
Signed-off-by: Matt Evans <matt@ozlabs.org>
Cc: stable <stable@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When looking at some issues with the virtual ethernet driver I noticed
that TCE allocation was following a very strange pattern:
address 00e9000 length 2048
address 0409000 length 2048 <-----
address 0429000 length 2048
address 0449000 length 2048
address 0469000 length 2048
address 0489000 length 2048
address 04a9000 length 2048
address 04c9000 length 2048
address 04e9000 length 2048
address 4009000 length 2048 <-----
address 4029000 length 2048
Huge unexplained gaps in what should be an empty TCE table. It turns out
it_blocksize, the amount we want to align the next allocation to, was
c0000000fe903b20. Completely bogus.
Initialise it to something reasonable in the VIO IOMMU code, and use kzalloc
everywhere to protect against this when we next add a non-compulsory
field to the iommu code and forget to initialise it.
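A sketch of the allocation change (illustrative; the exact call site
is the VIO iommu table setup):

    /* Zeroed allocation: a newly added field such as it_blocksize
     * starts out as 0 instead of heap garbage. */
    struct iommu_table *tbl;

    tbl = kzalloc(sizeof(struct iommu_table), GFP_KERNEL);
    if (!tbl)
        return NULL;
    tbl->it_blocksize = 16;    /* illustrative sane alignment */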
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
I'm sick of seeing ppc64_runlatch_off in our profiles, so inline it
into the callers. To avoid a mess of circular includes I didn't add
it as an inline function.
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The 'smt_enabled=X' boot option does not handle values of X > 2.
For POWER7 processors with SMT modes of 0, 1, 2, 3, and 4 this does
not work. This patch allows the smt_enabled option to be set to
any value limited to a max equal to the number of threads per
core.
Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
During CPU offline/online tests __cpu_up would flood the logs with
the following message:
Processor 0 found.
This provides no useful information to the user as there is no context
provided, and since the operation was a success (to this point) it is expected
that the CPU will come back online, providing all the feedback necessary.
Change the "Processor found" message to DBG() similar to other such messages in
the same function. Also, add an appropriate log level for the "Processor is
stuck" message.
Signed-off-by: Darren Hart <dvhltc@us.ibm.com>
Acked-by: Will Schmidt <will_schmidt@vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Nathan Fontenot <nfont@austin.ibm.com>
Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
Cc: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
start_secondary() is called shortly after _start and also via
cpu_idle()->cpu_die()->pseries_mach_cpu_die()
start_secondary() expects a preempt_count() of 0. pseries_mach_cpu_die() is
called via the cpu_idle() routine with preemption disabled, resulting in the
following repeating message during rapid cpu offline/online tests
with CONFIG_PREEMPT=y:
BUG: scheduling while atomic: swapper/0/0x00000002
Modules linked in: autofs4 binfmt_misc dm_mirror dm_region_hash dm_log [last unloaded: scsi_wait_scan]
Call Trace:
[c00000010e7079c0] [c0000000000133ec] .show_stack+0xd8/0x218 (unreliable)
[c00000010e707aa0] [c0000000006a47f0] .dump_stack+0x28/0x3c
[c00000010e707b20] [c00000000006e7a4] .__schedule_bug+0x7c/0x9c
[c00000010e707bb0] [c000000000699d9c] .schedule+0x104/0x800
[c00000010e707cd0] [c000000000015b24] .cpu_idle+0x1c4/0x1d8
[c00000010e707d70] [c0000000006aa1b4] .start_secondary+0x398/0x3d4
[c00000010e707e30] [c000000000008278] .start_secondary_resume+0x10/0x14
Move the cpu_die() call inside the existing preemption enabled block of
cpu_idle(). This is safe as the idle task is affined to a single CPU so the
debug_smp_processor_id() tests (from cpu_should_die()) won't trigger as we are
in a "migration disabled" region.
Signed-off-by: Darren Hart <dvhltc@us.ibm.com>
Acked-by: Will Schmidt <will_schmidt@vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Nathan Fontenot <nfont@austin.ibm.com>
Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
Cc: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
list_for_each_entry binds its first argument to a non-null value, and thus
any null test on the value of that argument is superfluous.
The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@@
iterator I;
expression x,E,E1,E2;
statement S,S1,S2;
@@
I(x,...) { <...
- if (x != NULL || ...)
S
...> }
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
During kdump we run the crash handlers first then stop all other CPUs.
We really want to stop all CPUs as close to the fail as possible and also
have a very controlled environment for running the crash handlers, so it
makes sense to reverse the order.
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Use the is_32bit_task() helper to test for a 32-bit binary.
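For reference, the helper boils down to this on powerpc (a sketch of
the asm/thread_info.h definition; on 32-bit kernels it is
constant-true, so callers need no #ifdef):

    #ifdef CONFIG_PPC64
    #define is_32bit_task()    (test_thread_flag(TIF_32BIT))
    #else
    #define is_32bit_task()    (1)
    #endif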
Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The interrupt stacks need to be indexed by the physical cpu since the
critical, debug and machine check handlers use the contents of SPRN_PIR to
index the critirq_ctx, dbgirq_ctx, and mcheckirq_ctx arrays.
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
There are two entries for .cpu_user_features in
arch/powerpc/kernel/cputable.c. Remove the one that doesn't belong.
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Clear the machine check syndrome register before enabling machine check
interrupts. The initial state of the TLB can lead to parity errors being
flagged early after a cold boot.
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Store the kernel and user contexts from the generic layer instead of
in each arch; this gathers some repetitive code.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Tested-by: Will Deacon <will.deacon@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Miller <davem@davemloft.net>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Borislav Petkov <bp@amd64.org>
- Most archs use one callchain buffer per cpu, except x86, which needs
to deal with NMIs. Provide a default perf_callchain_buffer()
implementation that x86 overrides.
- Centralize all the kernel/user regs handling and invoke new arch
handlers from there: perf_callchain_user() / perf_callchain_kernel()
That avoids all the user_mode() and current->mm checks, and so on
(see the sketch after this list).
- Invert some parameters in perf_callchain_*() helpers: entry to the
left, regs to the right, following the traditional (dst, src).
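The resulting arch-facing hooks look roughly like (sketch):

    void perf_callchain_kernel(struct perf_callchain_entry *entry,
                               struct pt_regs *regs);
    void perf_callchain_user(struct perf_callchain_entry *entry,
                             struct pt_regs *regs);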
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Tested-by: Will Deacon <will.deacon@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Miller <davem@davemloft.net>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Borislav Petkov <bp@amd64.org>
callchain_store() is the same on every arch; inline it in
perf_event.h and rename it to perf_callchain_store() to avoid
any collision.
This removes repetitive code.
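The shared helper, roughly as it lands in linux/perf_event.h:

    static inline void
    perf_callchain_store(struct perf_callchain_entry *entry, u64 ip)
    {
        if (entry->nr < PERF_MAX_STACK_DEPTH)
            entry->ip[entry->nr++] = ip;
    }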
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Tested-by: Will Deacon <will.deacon@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Miller <davem@davemloft.net>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Borislav Petkov <bp@amd64.org>
Make do_execve() take a const filename pointer so that kernel_execve() compiles
correctly on ARM:
arch/arm/kernel/sys_arm.c:88: warning: passing argument 1 of 'do_execve' discards qualifiers from pointer target type
This also requires the argv and envp arguments to be consted twice, once for
the pointer array and once for the strings the array points to. This is
because do_execve() passes a pointer to the filename (now const) to
copy_strings_kernel(). A simpler alternative would be to cast the filename
pointer in do_execve() when it's passed to copy_strings_kernel().
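The resulting prototype, roughly:

    int do_execve(const char *filename,
                  const char __user *const __user *argv,
                  const char __user *const __user *envp,
                  struct pt_regs *regs);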
do_execve() may not change any of the strings it is passed as part of the argv
or envp lists, as some of them are in .rodata, so marking these strings as
const should be fine.
Further, kernel_execve() and sys_execve() need to be changed to match.
This has been test built on x86_64, frv, arm and mips.
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mark arguments to certain system calls as being const where they should be but
aren't. The list includes:
(*) The filename arguments of various stat syscalls, execve(), various utimes
syscalls and some mount syscalls.
(*) The filename arguments of some syscall helpers relating to the above.
(*) The buffer argument of various write syscalls.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>