Before we split up the dcache_lru_lock, the unused dentry counter needs to
be made independent of the global dcache_lru_lock. Convert it to per-cpu
counters to do this.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Glauber Costa <glommer@openvz.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: Arve Hjønnevåg <arve@android.com>
Cc: Carlos Maiolino <cmaiolino@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: David Rientjes <rientjes@google.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: J. Bruce Fields <bfields@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Kent Overstreet <koverstreet@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The sysctl knob sysctl_vfs_cache_pressure is used to determine which
percentage of the shrinkable objects in our cache we should actively try
to shrink.
It works great in situations in which we have many objects (at least more
than 100), because the aproximation errors will be negligible. But if
this is not the case, specially when total_objects < 100, we may end up
concluding that we have no objects at all (total / 100 = 0, if total <
100).
This is certainly not the biggest killer in the world, but may matter in
very low kernel memory situations.
Signed-off-by: Glauber Costa <glommer@openvz.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: Arve Hjønnevåg <arve@android.com>
Cc: Carlos Maiolino <cmaiolino@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: David Rientjes <rientjes@google.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: J. Bruce Fields <bfields@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Kent Overstreet <koverstreet@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
This series reworks our current object cache shrinking infrastructure in
two main ways:
* Noticing that a lot of users copy and paste their own version of LRU
lists for objects, we put some effort in providing a generic version.
It is modeled after the filesystem users: dentries, inodes, and xfs
(for various tasks), but we expect that other users could benefit in
the near future with little or no modification. Let us know if you
have any issues.
* The underlying list_lru being proposed automatically and
transparently keeps the elements in per-node lists, and is able to
manipulate the node lists individually. Given this infrastructure, we
are able to modify the up-to-now hammer called shrink_slab to proceed
with node-reclaim instead of always searching memory from all over like
it has been doing.
Per-node lru lists are also expected to lead to less contention in the lru
locks on multi-node scans, since we are now no longer fighting for a
global lock. The locks usually disappear from the profilers with this
change.
Although we have no official benchmarks for this version - be our guest to
independently evaluate this - earlier versions of this series were
performance tested (details at
http://permalink.gmane.org/gmane.linux.kernel.mm/100537) yielding no
visible performance regressions while yielding a better qualitative
behavior in NUMA machines.
With this infrastructure in place, we can use the list_lru entry point to
provide memcg isolation and per-memcg targeted reclaim. Historically,
those two pieces of work have been posted together. This version presents
only the infrastructure work, deferring the memcg work for a later time,
so we can focus on getting this part tested. You can see more about the
history of such work at http://lwn.net/Articles/552769/
Dave Chinner (18):
dcache: convert dentry_stat.nr_unused to per-cpu counters
dentry: move to per-sb LRU locks
dcache: remove dentries from LRU before putting on dispose list
mm: new shrinker API
shrinker: convert superblock shrinkers to new API
list: add a new LRU list type
inode: convert inode lru list to generic lru list code.
dcache: convert to use new lru list infrastructure
list_lru: per-node list infrastructure
shrinker: add node awareness
fs: convert inode and dentry shrinking to be node aware
xfs: convert buftarg LRU to generic code
xfs: rework buffer dispose list tracking
xfs: convert dquot cache lru to list_lru
fs: convert fs shrinkers to new scan/count API
drivers: convert shrinkers to new count/scan API
shrinker: convert remaining shrinkers to count/scan API
shrinker: Kill old ->shrink API.
Glauber Costa (7):
fs: bump inode and dentry counters to long
super: fix calculation of shrinkable objects for small numbers
list_lru: per-node API
vmscan: per-node deferred work
i915: bail out earlier when shrinker cannot acquire mutex
hugepage: convert huge zero page shrinker to new shrinker API
list_lru: dynamically adjust node arrays
This patch:
There are situations in very large machines in which we can have a large
quantity of dirty inodes, unused dentries, etc. This is particularly true
when umounting a filesystem, where eventually since every live object will
eventually be discarded.
Dave Chinner reported a problem with this while experimenting with the
shrinker revamp patchset. So we believe it is time for a change. This
patch just moves int to longs. Machines where it matters should have a
big long anyway.
Signed-off-by: Glauber Costa <glommer@openvz.org>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: Arve Hjønnevåg <arve@android.com>
Cc: Carlos Maiolino <cmaiolino@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: J. Bruce Fields <bfields@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Kent Overstreet <koverstreet@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
For a long time no filesystem has been using vfs_follow_link, and as seen
by recent filesystem submissions any new use is accidental as well.
Remove vfs_follow_link, document the replacement in
Documentation/filesystems/porting and also rename __vfs_follow_link
to match its only caller better.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Olof Johansson reports that the tests against HZ_FIXED seem
non-functional. Fix this by using '0' as a sentinel for "not
specified" and test against that instead.
Reported-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch will add big endian architecture support to megaraid_sas
driver. The support added is for LSI MegaRAID all generation controllers-
(3Gb/s, 6Gb/s and 12 Gb/s controllers).
We have done basic sanity test @ppc64 arch and @x86_64. Additional
testing/observations are welcome.
[jejb: fix up rejections]
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: Sumit Saxena <sumit.saxena@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
The mn10300 kernel crashes just after starting userspace programs, if
CONFIG_PREEMPT is disabled:
Freeing unused kernel memory: 96K (90286000 - 9029e000)
MISALIGN: 97c33ff9: unsupported instruction f
MISALIGN: 97c33ff9: unsupported instruction f
MISALIGN: 97c33ff9: unsupported instruction f
:
This fixes the problem that was introduced by commit d17fc238ac
("MN10300: Enable IRQs more in system call exit work path").
Signed-off-by: Akira Takeuchi <takeuchi.akr@jp.panasonic.com>
Signed-off-by: Kiyoshi Owada <owada.kiyoshi@jp.panasonic.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The prototype for ahc_9005_subdevinfo_valid shows that the caller has the
arguments in the wrong order.
637 ahc_9005_subdevinfo_valid(uint16_t device, uint16_t vendor,
638 uint16_t subdevice, uint16_t subvendor)
Signed-off-by: Dave Jones <davej@fedoraproject.org>
Acked-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Changes the version of hpsa so we know something has changed. Please consider
this for inclusion.
Signed-off-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This patch does a bit of housekeeping for hpsa. Change lowercase alpha hex
digits to uppercase for consistency within the driver. Also moves the P822se
in the tables to keep controllers of each family grouped together.
Signed-off-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Add the marketing names for HP Smart Array Gen8 controllers. Also removes an
unused ID. Please consider this for inclusion.
Signed-off-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
commit 6d4028c644 (i2c: use dev_get_platdata()) did a bad conversion
of this one case:
drivers/i2c/busses/i2c-davinci.c: In function 'davinci_i2c_probe':
drivers/i2c/busses/i2c-davinci.c:665:2: warning: passing argument 1 of
'dev_get_platdata' from incompatible pointer type [enabled by default]
Reviewed-by: Jingoo Han <jg1.han@samsung.com>
Acked-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Olof Johansson <olof@lixom.net>
* acpi-pci-hotplug:
ACPI / hotplug / PCI: Avoid parent bus rescans on spurious device checks
ACPI / hotplug / PCI: Use _OST to notify firmware about notify status
ACPI / hotplug / PCI: Avoid doing too much for spurious notifies
ACPI / hotplug / PCI: Don't trim devices before scanning the namespace
This patch adds the PCI ID's for HP Smart Array Gen9 controllers. Please
consider this patch for inclusion.
Signed-off-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Enable the intel_pstate driver for Haswell CPUs. One missing Ivy Bridge
model (0x3E) is also included. Models referenced from
tools/power/x86/turbostat/turbostat.c:has_nehalem_turbo_ratio_limit
Signed-off-by: Nell Hardcastle <nell@spicious.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
For a long time no filesystem has been using vfs_follow_link, and as seen
by recent filesystem submissions any new use is accidental as well.
Remove vfs_follow_link, document the replacement in
Documentation/filesystems/porting and also rename __vfs_follow_link
to match its only caller better.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Generally minor changes. A bunch of bug fixes, particularly for
initialization and some refactoring. Most notable change if feeding the
entire flattened tree into the random pool at boot. May not be
significant, but shouldn't hurt either.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJSL12LAAoJEEFnBt12D9kB64gP/RBipnYbo3RPanHg+lE/J1V7
KSVFNGKWJHxTg47VVC1YJGIG21jqxAilpdS2MQL5FP7iyd+IzvtHpQiJgp+2G+pq
di06yrdyrYErxRgZgGQi8IpR538ZzOEVLCKJGdb09YelkRzPT5au7CC1MAsX3qco
yba7PHk0/Nc4hZE4aGbgR1DlRmn86ob7mM0KFE/LORaSN2BueMgWcwKhQXYNGyoh
assX4yNhAbUG6Bgw7paBLDGqHh8c5Ei5AppU8yPb+N094jgYHBJryUoDlzzUHD23
qqiEqHhUKT0TpgHNs8KH0WZFugcmjKvYEbzdzadBxqfXnJN4fKSEcdfF3iz4T14j
U6EZks89GoHwA523OghUZkKNOqlsUdWfdKz+8/grQqKisYwDcf3fCxEYk/4weDCQ
b6fFlOv6+AI3btjXp6F511ZKxyT4ZZzkHjp/ZSrhBygyamNZfax0ma0j+ZS9AZql
kPxQS0nOve6NKaP7vXxMmW5sGMnL19ER/Hm31wthGcWI43GVebUdklnzfGaEeSjs
pmP8oiCNemceqVpiPKxcOxiguf/eyIjP1SFXbguASygUmQeTDbbJ8n1FYznCitue
xJgWttKWsEf/aMR3eJtQ3aBmHR3rijAV4E28Wlq8XMkocwvpQm2zMocS2Z5BJ80S
hi1kQVy8+RxNX96tOSp1
=GSWl
-----END PGP SIGNATURE-----
Merge tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux
Pull device tree core updates from Grant Likely:
"Generally minor changes. A bunch of bug fixes, particularly for
initialization and some refactoring. Most notable change if feeding
the entire flattened tree into the random pool at boot. May not be
significant, but shouldn't hurt either"
Tim Bird questions whether the boot time cost of the random feeding may
be noticeable. And "add_device_randomness()" is definitely not some
speed deamon of a function.
* tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux:
of/platform: add error reporting to of_amba_device_create()
irq/of: Fix comment typo for irq_of_parse_and_map
of: Feed entire flattened device tree into the random pool
of/fdt: Clean up casting in unflattening path
of/fdt: Remove duplicate memory clearing on FDT unflattening
gpio: implement gpio-ranges binding document fix
of: call __of_parse_phandle_with_args from of_parse_phandle
of: introduce of_parse_phandle_with_fixed_args
of: move of_parse_phandle()
of: move documentation of of_parse_phandle_with_args
of: Fix missing memory initialization on FDT unflattening
of: consolidate definition of early_init_dt_alloc_memory_arch()
of: Make of_get_phy_mode() return int i.s.o. const int
include: dt-binding: input: create a DT header defining key codes.
of/platform: Staticize of_platform_device_create_pdata()
of: Specify initrd location using 64-bit
dt: Typo fix
OF: make of_property_for_each_{u32|string}() use parameters if OF is not enabled
Pull slave-dmaengine updates from Vinod Koul:
"This pull brings:
- Andy's DW driver updates
- Guennadi's sh driver updates
- Pl08x driver fixes from Tomasz & Alban
- Improvements to mmp_pdma by Daniel
- TI EDMA fixes by Joel
- New drivers:
- Hisilicon k3dma driver
- Renesas rcar dma driver
- New API for publishing slave driver capablities
- Various fixes across the subsystem by Andy, Jingoo, Sachin etc..."
* 'for-linus' of git://git.infradead.org/users/vkoul/slave-dma: (94 commits)
dma: edma: Remove limits on number of slots
dma: edma: Leave linked to Null slot instead of DUMMY slot
dma: edma: Find missed events and issue them
ARM: edma: Add function to manually trigger an EDMA channel
dma: edma: Write out and handle MAX_NR_SG at a given time
dma: edma: Setup parameters to DMA MAX_NR_SG at a time
dmaengine: pl330: use dma_set_max_seg_size to set the sg limit
dmaengine: dma_slave_caps: remove sg entries
dma: replace devm_request_and_ioremap by devm_ioremap_resource
dma: ste_dma40: Fix potential null pointer dereference
dma: ste_dma40: Remove duplicate const
dma: imx-dma: Remove redundant NULL check
dma: dmagengine: fix function names in comments
dma: add driver for R-Car HPB-DMAC
dma: k3dma: use devm_ioremap_resource() instead of devm_request_and_ioremap()
dma: imx-sdma: Staticize sdma_driver_data structures
pch_dma: Add MODULE_DEVICE_TABLE
dmaengine: PL08x: Add cyclic transfer support
dmaengine: PL08x: Fix reading the byte count in cctl
dmaengine: PL08x: Add support for different maximum transfer size
...
Core:
- Support Allocation Units 8MB-64MB in SD3.0, previous max was 4MB.
- The slot-gpio helper can now handle GPIO debouncing card-detect.
- Read supported voltages from DT "voltage-ranges" property.
Drivers:
- dw_mmc: Add support for ARC architecture, and support exynos5420.
- mmc_spi: Support CD/RO GPIOs.
- sh_mobile_sdhi: Add compatibility for more Renesas SoCs.
- sh_mmcif: Add DT support for DMA channels.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)
iQIcBAABAgAGBQJSLxu8AAoJEHNBYZ7TNxYMkV8P/RZMkP3L88wPmvdQ4IxVD/Bg
jphPsZTAlY8pLY7EcBVfbIGfBMMoVZ8CayuAbj9u5jFiMBNDUKRnKrky1m95F62n
xXiS5vUiKdWpByMpfKmrqvJr6eS8SWq5z/yug1c3s1Y9T363KgfypgJLvNgbMGiv
qbmjE+Zw65nMwHCk+4Rvq6s4aN6KosRZP0ABsn1foXt3kybSgemp5ShrlEyXohfS
E91EiYxHFC4fdmuWiZZvL1tyHFeV25omyZA90mpkioNItiwoyOM2rfjfEfNq+WBw
UrmdBesbGsF0Zi12CBa9LtzdRjYK8PugBWKW3mycS5++NX9KW6Ac/EpGqFeH9KgL
WZ2v4aQjkbnzQKUB2HcWAyXm88G9MkNvpLbIrmIPtson+q0UjgPYWe5BI3dy/Y1v
YS1JeseslVtSTKzGYsa1GJ7Nc1xYiILRz0RS4YGYXNjwvrl89i2UH7cglYDW36Xd
vxvRBaFpVsj1mfjjITEoG6nE0v5aYH6gSITY79XR+/kN871/99/oIUaWdpjcm9yv
SIYmK7ipcvxugkQ7BoMGbym/dvuUrZ+Vnf8dFlGPTJegZVsnfgrVAnRpvcVwW8+x
4Z79wUPSIETRqj2XX2I/Y0JnrXry+dLLVyeK1tELoeOKev73Ai2lcqPSz6J0tzJs
IErcz0hM1znL2RtgNwio
=+EYB
-----END PGP SIGNATURE-----
Merge tag 'mmc-updates-for-3.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc
Pull MMC updates from Chris Ball:
"MMC highlights for 3.12:
Core:
- Support Allocation Units 8MB-64MB in SD3.0, previous max was 4MB.
- The slot-gpio helper can now handle GPIO debouncing card-detect.
- Read supported voltages from DT "voltage-ranges" property.
Drivers:
- dw_mmc: Add support for ARC architecture, and support exynos5420.
- mmc_spi: Support CD/RO GPIOs.
- sh_mobile_sdhi: Add compatibility for more Renesas SoCs.
- sh_mmcif: Add DT support for DMA channels"
* tag 'mmc-updates-for-3.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc: (50 commits)
Revert "mmc: tmio-mmc: Remove .set_pwr() callback from platform data"
mmc: dw_mmc: Add support for ARC
mmc: sdhci-s3c: initialize host->quirks2 for using quirks2
mmc: sdhci-s3c: fix the wrong register value, when clock is disabled
mmc: esdhc: add support to get voltage from device-tree
mmc: sdhci: get voltage from sdhc host
mmc: core: parse voltage from device-tree
mmc: omap_hsmmc: use the generic config for omap2plus devices
mmc: omap_hsmmc: clear status flags before starting a new command
mmc: dw_mmc: exynos: Add a new compatible string for exynos5420
mmc: sh_mmcif: revision-specific CLK_CTRL2 handling
mmc: sh_mmcif: revision-specific Command Completion Signal handling
mmc: sh_mmcif: add support for Device Tree DMA bindings
mmc: sh_mmcif: move header include from header into .c
mmc: SDHI: add DT compatibility strings for further SoCs
mmc: dw_mmc-pci: enable bus-mastering mode
mmc: dw_mmc-pci: get resources from a proper BAR
mmc: tmio-mmc: Remove .set_pwr() callback from platform data
mmc: tmio-mmc: Remove .get_cd() callback from platform data
mmc: sh_mobile_sdhi: Remove .set_pwr() callback from platform data
...
device-mapper device. This dm-stats code required the reintroduction of
a div64_u64_rem() helper, but as a separate method that doesn't slow
down div64_u64() -- especially on 32-bit systems.
Allow the error target to replace request-based DM devices
(e.g. multipath) in addition to bio-based DM devices.
Various other small code fixes and improvements to thin-provisioning, DM
cache and the DM ioctl interface.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)
iQEcBAABAgAGBQJSLyNnAAoJEMUj8QotnQNaXVEIAKA1l43enaGiROBZEZXgAGUY
1JUsnHES4ujyn/jtT39jPTQf9AW/rS4FUCrZiXG2aaNHXo7+7cdVoBHAiWc7mXad
budBSqn47W7WDyFlQarKwsuYFcdLnqdnieRDMXQ1cN5dl4Rx61LclnsylQd4SSS0
lznXkfOTquetDSuEPOuUHJDZufdacw3PpxWbTKGJld40fd7YZfGWQoG0ek1OeqqL
fA30DTlYnkFyhheLCjFcDY6H55Rt7QpBWOUAa2XXLR6GLfk5iFK99autjWk2xTPT
nppRwQrw9VH+HdW0jGLU+LRs1Y3nxwT9OBLWt9wav87Smdg/7jQAjwde9eKbO2k=
=3ooH
-----END PGP SIGNATURE-----
Merge tag 'dm-3.12-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device-mapper updates from Mike Snitzer:
"Add the ability to collect I/O statistics on user-defined regions of a
device-mapper device. This dm-stats code required the reintroduction
of a div64_u64_rem() helper, but as a separate method that doesn't
slow down div64_u64() -- especially on 32-bit systems.
Allow the error target to replace request-based DM devices (e.g.
multipath) in addition to bio-based DM devices.
Various other small code fixes and improvements to thin-provisioning,
DM cache and the DM ioctl interface"
* tag 'dm-3.12-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm stripe: silence a couple sparse warnings
dm: add statistics support
dm thin: always return -ENOSPC if no_free_space is set
dm ioctl: cleanup error handling in table_load
dm ioctl: increase granularity of type_lock when loading table
dm ioctl: prevent rename to empty name or uuid
dm thin: set pool read-only if breaking_sharing fails block allocation
dm thin: prefix pool error messages with pool device name
dm: allow error target to replace bio-based and request-based targets
math64: New separate div64_u64_rem helper
dm space map: optimise sm_ll_dec and sm_ll_inc
dm btree: prefetch child nodes when walking tree for a dm_btree_del
dm btree: use pop_frame in dm_btree_del to cleanup code
dm cache: eliminate holes in cache structure
dm cache: fix stacking of geometry limits
dm thin: fix stacking of geometry limits
dm thin: add data block size limits to Documentation
dm cache: add data block size limits to code and Documentation
dm cache: document metadata device is exclussive to a cache
dm: stop using WQ_NON_REENTRANT
Headline item is multithreading for RAID5 so that more
IO/sec can be supported on fast (SSD) devices.
Also TILE-Gx SIMD suppor for RAID6 calculations and an
assortment of bug fixes.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQIVAwUAUi6dRTnsnt1WYoG5AQIqMBAAm/XUEqfyBNUiTPHmIU/OyReOlfsp8A2o
xtcmSzaCtIUz4btPszUrw3PShqnk+lXXX2AB0rp3PzfOgyNYXBRKbzOf3eGr2VEp
L/Cm0iSWHqQ7V7MoV5ZrqtvuyJV1a7FK3a3VaoKaUk424o4sZ7P67t/YZAnTCP/i
9wQoPeIOJ8YjZsaAQjzI3q7yRMRE8ytyBnF4NdgeMyr2p17w2e9pnmNfCTo4wnWs
Nu2wPr2QCPQXr/FoIhdIVEy3kVatqH8qXG8Fw+5n07HJYxGCvQZLDuoOVDYyFeoW
gnNq2MMgLZm/7Nzqd1bN+QQZuBCd5JL4VJ2G4vLfYrn3ZSdSysrVKQXFKYG3Gkua
1KP4Pv0hndAl4DtGbUk8CiZp6b+c5qeWvq+sO2NuhUGmumFMK2q4DJhITNexjmrs
Eg4opnR8JMLDkYD6o52Ziu5KQR/q1PKRLj80eoVuqB2QQM5+NPb4s3k2WN+53lQD
L9fH2alUxxSK+5R8ykk923QQ/XErMUwXaka+O/gGFAlYvaaW/GKTxFnKn/GIXAkc
tKW88zB+zA5EZEFec+K43z1UjtGxMWsryvDN55ON2iV+LIZBISm7krroBeR55cyO
+3tHlPsga0pO+9DdSm7hvZeWRrq5ZJTiZmL/e2FYygrC5tFAY0p+z49fK3e9Th13
C85G7fg3yDY=
=zLxh
-----END PGP SIGNATURE-----
Merge tag 'md/3.12' of git://neil.brown.name/md
Pull md update from Neil Brown:
"Headline item is multithreading for RAID5 so that more IO/sec can be
supported on fast (SSD) devices. Also TILE-Gx SIMD suppor for RAID6
calculations and an assortment of bug fixes"
* tag 'md/3.12' of git://neil.brown.name/md:
raid5: only wakeup necessary threads
md/raid5: flush out all pending requests before proceeding with reshape.
md/raid5: use seqcount to protect access to shape in make_request.
raid5: sysfs entry to control worker thread number
raid5: offload stripe handle to workqueue
raid5: fix stripe release order
raid5: make release_stripe lockless
md: avoid deadlock when dirty buffers during md_stop.
md: Don't test all of mddev->flags at once.
md: Fix apparent cut-and-paste error in super_90_validate
raid6/test: replace echo -e with printf
RAID: add tilegx SIMD implementation of raid6
md: fix safe_mode buglet.
md: don't call md_allow_write in get_bitmap_file.
Pull vfs pile 3 (of many) from Al Viro:
"Waiman's conversion of d_path() and bits related to it,
kern_path_mountpoint(), several cleanups and fixes (exportfs
one is -stable fodder, IMO).
There definitely will be more... ;-/"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
split read_seqretry_or_unlock(), convert d_walk() to resulting primitives
dcache: Translating dentry into pathname without taking rename_lock
autofs4 - fix device ioctl mount lookup
introduce kern_path_mountpoint()
rename user_path_umountat() to user_path_mountpoint_at()
take unlazy_walk() into umount_lookup_last()
Kill indirect include of file.h from eventfd.h, use fdget() in cgroup.c
prune_super(): sb->s_op is never NULL
exportfs: don't assume that ->iterate() won't feed us too long entries
afs: get rid of redundant ->d_name.len checks
Remove unneeded error handling on the result of a call to
platform_get_resource when the value is passed to devm_ioremap_resource.
Move the call to platform_get_resource adjacent to the call to
devm_ioremap_resource to make the connection between them more clear.
A simplified version of the semantic patch that makes this change is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@@
expression pdev,res,n,e,e1;
expression ret != 0;
identifier l;
@@
- res = platform_get_resource(pdev, IORESOURCE_MEM, n);
... when != res
- if (res == NULL) { ... \(goto l;\|return ret;\) }
... when != res
+ res = platform_get_resource(pdev, IORESOURCE_MEM, n);
e = devm_ioremap_resource(e1, res);
// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Remove unneeded error handling on the result of a call to
platform_get_resource_byname when the value is passed to devm_ioremap_resource.
A simplified version of the semantic patch that makes this change is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@@
expression pdev,res,e,e1;
expression ret != 0;
identifier l;
@@
res = platform_get_resource_byname(...);
- if (res == NULL) { ... \(goto l;\|return ret;\) }
e = devm_ioremap_resource(e1, res);
// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
When I moved the RCU walk termination into unlazy_walk(), I didn't copy
quite all of it: for the successful RCU termination we properly add the
necessary reference counts to our temporary copy of the root path, but
for the failure case we need to make sure that any temporary root path
information is cleared out (since it does _not_ have the proper
reference counts from the RCU lookup).
We could clean up this mess by just always dropping the temporary root
information, but Al points out that that would mean that a single lookup
through symlinks could see multiple different root entries if it races
with another thread doing chroot. Not that I think we should really
care (we had that before too, back before we had a copy of the root path
in the nameidata).
Al says he has a cunning plan. In the meantime, this is the minimal fix
for the problem, even if it's not all that pretty.
Reported-by: Mace Moneta <moneta.mace@gmail.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove unneeded error handling on the result of a call to
platform_get_resource when the value is passed to devm_ioremap_resource.
A simplified version of the semantic patch that makes this change is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@@
expression pdev,res,n,e,e1;
expression ret != 0;
identifier l;
@@
- res = platform_get_resource(pdev, IORESOURCE_MEM, n);
... when != res
- if (res == NULL) { ... \(goto l;\|return ret;\) }
... when != res
+ res = platform_get_resource(pdev, IORESOURCE_MEM, n);
e = devm_ioremap_resource(e1, res);
// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Remove unneeded error handling on the result of a call to
platform_get_resource when the value is passed to devm_ioremap_resource.
A simplified version of the semantic patch that makes this change is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@@
expression pdev,res,n,e,e1;
expression ret != 0;
identifier l;
@@
- res = platform_get_resource(pdev, IORESOURCE_MEM, n);
... when != res
- if (res == NULL) { ... \(goto l;\|return ret;\) }
... when != res
+ res = platform_get_resource(pdev, IORESOURCE_MEM, n);
e = devm_ioremap_resource(e1, res);
// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Wan Zongshun <mcuos.com@gmail.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
This patch adds the driver for the watchdog found in the Allwinner A10 and
A13 SoCs. It has DT-support and uses the new watchdog framework.
Signed-off-by: Carlo Caione <carlo.caione@gmail.com>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
This patch removes the global variables in the driver file and
group them into a structure.
Signed-off-by: Leela Krishna Amudala <l.krishna@samsung.com>
Reviewed-by: Tomasz Figa <t.figa@samsung.com>
Acked-by: Kukjin Kim <kgene.kim@samsung.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Let the inode verifier do it's work by returning an error when we
fail to find correct magic numbers in an inode buffer.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Saw this on generic/270 after a DQALLOC transaction overrun
shutdown:
XFS: Assertion failed: !(bip->bli_item.li_flags & XFS_LI_IN_AIL), file: fs/xfs/xfs_buf_item.c, line: 952
.....
xfs_buf_item_relse+0x4f/0xd0
xfs_buf_item_unlock+0x1b4/0x1e0
xfs_trans_free_items+0x7d/0xb0
xfs_trans_cancel+0x13c/0x1b0
xfs_symlink+0x37e/0xa60
....
When a transaction abort occured.
If we are aborting a transaction and trigger this code path, then
the item may be dirty. If the item is dirty, then it may be in the
AIL. Hence if we are aborting, we need to check if the item is in
the AIL and remove it before freeing it.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
We have quite a few places now where we do:
x = kmem_zalloc(large size)
if (!x)
x = kmem_zalloc_large(large size)
and do a similar dance when freeing the memory. kmem_free() already
does the correct freeing dance, and kmem_zalloc_large() is only ever
called in these constructs, so just factor it all into
kmem_zalloc_large() and kmem_free().
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Ever since increasing the number of supported ACLs from 25 to as
many as can fit in an xattr, there have been reports of order 4
memory allocations failing in the ACL code. Fix it in the same way
we've fixed all the xattr read/write code that has the same problem.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
When splitting the root of the da btree, we shuffled data between
buffers and the structures that track them. At one point, we copy
data and state from one buffer to another, including the ops
associated with the buffer. When we do this, we also need to copy
the buffer type associated with the buf log item so that the buffer
is logged correctly. If we don't do that, log recovery won't
recognise it and hence it won't recalculate the CRC on the buffer
after recovery. This leads to a directory block that can't be read
after recovery has run.
Found by inspection after finding the same problem with remote
symlink buffers.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
The logging of a remote symlink block does not set the buffer type
being logged, and hence on recovery the type of buffer is not
recognised and hence CRCs are not calculated after replay. This
results in log recoery throwing:
XFS (vdc): Unknown buffer type 0
errors, and subsequent reads of the symlink failing CRC
verification. Found via fsstress + godown.
Reported by: Michael L. Semon <mlsemon35@gmail.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
This is the recovery side of the btree block owner change operation
performed by swapext on CRC enabled filesystems. We detect that an
owner change is needed by the flag that has been placed on the inode
log format flag field. Because the inode recovery is being replayed
after the buffers that make up the BMBT in the given checkpoint, we
can walk all the buffers and directly modify them when we see the
flag set on an inode.
Because the inode can be relogged and hence present in multiple
chekpoints with the "change owner" flag set, we could do multiple
passes across the inode to do this change. While this isn't optimal,
we can't directly ignore the flag as there may be multiple
independent swap extent operations being replayed on the same inode
in different checkpoints so we can't ignore them.
Further, because the owner change operation uses ordered buffers, we
might have buffers that are newer on disk than the current
checkpoint and so already have the owner changed in them. Hence we
cannot just peek at a buffer in the tree and check that it has the
correct owner and assume that the change was completed.
So, for the moment just brute force the owner change every time we
see an inode with the flag set. Note that we have to be careful here
because the owner of the buffers may point to either the old owner
or the new owner. Currently the verifier can't verify the owner
directly, so there is no failure case here right now. If we verify
the owner exactly in future, then we'll have to take this into
account.
This was tested in terms of normal operation via xfstests - all of
the fsr tests now pass without failure. however, we really need to
modify xfs/227 to stress v3 inodes correctly to ensure we fully
cover this case for v5 filesystems.
In terms of recovery testing, I used a hacked version of xfs_fsr
that held the temp inode open for a few seconds before exiting so
that the filesystem could be shut down with an open owner change
recovery flags set on at least the temp inode. fsr leaves the temp
inode unlinked and in btree format, so this was necessary for the
owner change to be reliably replayed.
logprint confirmed the tmp inode in the log had the correct flag set:
INO: cnt:3 total:3 a:0x69e9e0 len:56 a:0x69ea20 len:176 a:0x69eae0 len:88
INODE: #regs:3 ino:0x44 flags:0x209 dsize:88
^^^^^
0x200 is set, indicating a data fork owner change needed to be
replayed on inode 0x44. A printk in the revoery code confirmed that
the inode change was recovered:
XFS (vdc): Mounting Filesystem
XFS (vdc): Starting recovery (logdev: internal)
recovering owner change ino 0x44
XFS (vdc): Version 5 superblock detected. This kernel L support enabled!
Use of these features in this kernel is at your own risk!
XFS (vdc): Ending recovery (logdev: internal)
The script used to test this was:
$ cat ./recovery-fsr.sh
#!/bin/bash
dev=/dev/vdc
mntpt=/mnt/scratch
testfile=$mntpt/testfile
umount $mntpt
mkfs.xfs -f -m crc=1 $dev
mount $dev $mntpt
chmod 777 $mntpt
for i in `seq 10000 -1 0`; do
xfs_io -f -d -c "pwrite $(($i * 4096)) 4096" $testfile > /dev/null 2>&1
done
xfs_bmap -vp $testfile |head -20
xfs_fsr -d -v $testfile &
sleep 10
/home/dave/src/xfstests-dev/src/godown -f $mntpt
wait
umount $mntpt
xfs_logprint -t $dev |tail -20
time mount $dev $mntpt
xfs_bmap -vp $testfile
umount $mntpt
$
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
The operation status is decoded in decode_op_hdr.
Stop the print_overflow message that is always hit without this patch:
nfs: decode_free_stateid: prematurely hit end of receive buffer. Remaining
buffer length is 0 words.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>