1
0
Fork 0
mirror of synced 2025-03-06 20:59:54 +01:00
Commit graph

40688 commits

Author SHA1 Message Date
David S. Miller
04ad37836c Included changes:
- change my email in MAINTAINERS and Doc files
 - create and export list of single hop neighs per interface
 - protect CRC in the BLA code by means of its own lock
 - minor fixes and code cleanups
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJWcRAcAAoJENpFlCjNi1MRxPQP/A8wAuHYUyMDu6ov5VRUR+24
 TYKgCVI1evuYlkOdgHkTQfTlOILgV2NHvORzKwQRNToyvtsd4VIeXZfFJ0UXB0Wo
 sMz9vrmmcndGkcYPWs+6IoVUbxMPZXjRGhme3Ig7nNTOQ4GvcSaQANbYWAMBfg61
 h6pxez67el9jzso2acHEgvOyaldOLDcbbdwt4vY7bPRgD9VXpE3nazBymaZXLLxs
 1ByXiCgJw32gXSWv8RQ/j13btjrbWCdmcEz9Ag1xO5i5u9BI1VJ8nmbLuNu63S0G
 Ftgvk5QBeMAxs3xDiTmtQD3bmiS87Jy+1d5rFb581/arM5SnYq6GZIrb81tdVTU6
 PMkMZ6/EUV2HSogq9PJ1ZDk/0oPYT5tqfLtJWcaAZplmWYt3ZUrQsPo4z5CtbBhU
 6Ar29G5slkgLslcBqn6YB00LxwOmj7elyVdPtL+wMCojtut+Ds4O+FzPdvd/145F
 hCtIe55b2ciBsfp1dDXP5P15HeEMjLiN2xWJKAPLhDCmGruJ6Pnly+ElzgV+zdKL
 7Qe1mOXwticMy3pH+ST7CP47tp5uyT0ak27eo+oOn4LI/ppM2qW1P9bNUZ6RrVwf
 dRRh0dzQruaLQauHK2Z6BWtA2Q36wY2anwmBONK34NH6VfRMB1GJMi3JF8gDgTds
 SWwU5FGWTn/cnCejtsbh
 =/Z+R
 -----END PGP SIGNATURE-----

Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge

Antonio Quartulli says:

====================
Included changes:
- change my email in MAINTAINERS and Doc files
- create and export list of single hop neighs per interface
- protect CRC in the BLA code by means of its own lock
- minor fixes and code cleanups
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-16 11:09:40 -05:00
Zhu Yanjun
566178f853 net: sctp: dynamically enable or disable pf state
As we all know, the value of pf_retrans >= max_retrans_path can
disable pf state. The variables of pf_retrans and max_retrans_path
can be changed by the userspace application.

Sometimes the user expects to disable pf state while the 2
variables are changed to enable pf state. So it is necessary to
introduce a new variable to disable pf state.

According to the suggestions from Vlad Yasevich, extra1 and extra2
are removed. The initialization of pf_enable is added.

Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: Zhu Yanjun <zyjzyj2000@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-16 10:56:50 -05:00
Simon Wunderlich
5a1dd8a477 batman-adv: lock crc access in bridge loop avoidance
We have found some networks in which nodes were constantly requesting
other nodes BLA claim tables to synchronize, just to ask for that again
once completed. The reason was that the crc checksum of the asked nodes
were out of sync due to missing locking and multiple writes to the same
crc checksum when adding/removing entries. Therefore the asked nodes
constantly reported the wrong crc, which caused repeating requests.

To avoid multiple functions changing a backbone gateways crc entry at
the same time, lock it using a spinlock.

Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Tested-by: Alfons Name <AlfonsName@web.de>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
2015-12-16 15:17:09 +08:00
Sven Eckelmann
c05a57f6fb batman-adv: Fix typo 'wether' -> 'whether'
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
2015-12-16 15:17:09 +08:00
Sven Eckelmann
01f6b5c76a batman-adv: Use chain pointer when purging fragments
The chain pointer was already created in batadv_frag_purge_orig to make the
checks more readable. Just use the chain pointer everywhere instead of
having the same dereference + array access in the most lines of this
function.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Acked-by: Martin Hundebøll <martin@hundeboll.net>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
2015-12-16 15:17:09 +08:00
Simon Wunderlich
ad7e2c466d batman-adv: unify flags access style in tt global add
This should slightly improve readability

Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
2015-12-16 15:17:08 +08:00
Simon Wunderlich
c169c59dd5 batman-adv: detect local excess vlans in TT request
If the local representation of the global TT table of one originator has
more VLAN entries than the respective TT update, there is some
inconsistency present. By detecting and reporting this inconsistency,
the global table gets updated and the excess VLAN will get removed in
the process.

Reported-by: Alessandro Bolletta <alessandro@mediaspot.net>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Acked-by: Antonio Quartulli <antonio@meshcoding.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
2015-12-16 15:17:08 +08:00
Hannes Frederic Sowa
9b29c6962b ipv6: automatically enable stable privacy mode if stable_secret set
Bjørn reported that while we switch all interfaces to privacy stable mode
when setting the secret, we don't set this mode for new interfaces. This
does not make sense, so change this behaviour.

Fixes: 622c81d57b ("ipv6: generation of stable privacy addresses for link-local and autoconf")
Reported-by: Bjørn Mork <bjorn@mork.no>
Cc: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 23:37:32 -05:00
Eric Dumazet
6857a02af5 sctp: use GFP_KERNEL in sctp_init()
modules init functions being called from process context, we better
use GFP_KERNEL allocations to increase our chances to get these
high-order pages we want for SCTP hash tables.

This mostly matters if SCTP module is loaded once memory got fragmented.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 23:27:45 -05:00
Lorenzo Colitti
c1e64e298b net: diag: Support destroying TCP sockets.
This implements SOCK_DESTROY for TCP sockets. It causes all
blocking calls on the socket to fail fast with ECONNABORTED and
causes a protocol close of the socket. It informs the other end
of the connection by sending a RST, i.e., initiating a TCP ABORT
as per RFC 793. ECONNABORTED was chosen for consistency with
FreeBSD.

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 23:26:52 -05:00
Lorenzo Colitti
6eb5d2e08f net: diag: Support SOCK_DESTROY for inet sockets.
This passes the SOCK_DESTROY operation to the underlying protocol
diag handler, or returns -EOPNOTSUPP if that handler does not
define a destroy operation.

Most of this patch is just renaming functions. This is not
strictly necessary, but it would be fairly counterintuitive to
have the code to destroy inet sockets be in a function whose name
starts with inet_diag_get.

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 23:26:51 -05:00
Lorenzo Colitti
64be0aed59 net: diag: Add the ability to destroy a socket.
This patch adds a SOCK_DESTROY operation, a destroy function
pointer to sock_diag_handler, and a diag_destroy function
pointer.  It does not include any implementation code.

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 23:26:51 -05:00
Lorenzo Colitti
b613f56ec9 net: diag: split inet_diag_dump_one_icsk into two
Currently, inet_diag_dump_one_icsk finds a socket and then dumps
its information to userspace. Split it into a part that finds the
socket and a part that dumps the information.

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 23:26:51 -05:00
Tom Herbert
7f00feaf10 ila: Add generic ILA translation facility
This patch implements an ILA tanslation table. This table can be
configured with identifier to locator mappings, and can be be queried
to resolve a mapping. Queries can be parameterized based on interface,
direction (incoming or outoing), and matching locator.  The table is
implemented using rhashtable and is configured via netlink (through
"ip ila .." in iproute).

The table may be used as alternative means to do do ILA tanslations
other than the lw tunnels

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 23:25:20 -05:00
Tom Herbert
fc9e50f5a5 netlink: add a start callback for starting a netlink dump
The start callback allows the caller to set up a context for the
dump callbacks. Presumably, the context can then be destroyed in
the done callback.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 23:25:20 -05:00
Tom Herbert
33f11d1614 ila: Create net/ipv6/ila directory
Create ila directory in preparation for supporting other hooks in the
kernel than LWT for doing ILA. This includes:
  - Moving ila.c to ila/ila_lwt.c
  - Splitting out some common functions into ila_common.c

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 23:25:20 -05:00
Tom Herbert
6ae23ad362 net: Add driver helper functions to determine checksum offloadability
Add skb_csum_offload_chk driver helper function to determine if a
device with limited checksum offload capabilities is able to offload the
checksum for a given packet.

This patch includes:
  - The skb_csum_offload_chk function. Returns true if checksum is
    offloadable, else false. Optionally, in the case that the checksum
    is not offloable, the function can call skb_checksum_help to resolve
    the checksum. skb_csum_offload_chk also returns whether the checksum
    refers to an encapsulated checksum.
  - Definition of skb_csum_offl_spec structure that caller uses to
    indicate rules about what it can offload (e.g. IPv4/v6, TCP/UDP only,
    whether encapsulated checksums can be offloaded, whether checksum with
    IPv6 extension headers can be offloaded).
  - Ancilary functions called skb_csum_offload_chk_help,
    skb_csum_off_chk_help_cmn, skb_csum_off_chk_help_cmn_v4_only.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 16:50:21 -05:00
Tom Herbert
9a49850d0a tcp: Fix conditions to determine checksum offload
In tcp_send_sendpage and tcp_sendmsg we check the route capabilities to
determine if checksum offload can be performed. This check currently
does not take the IP protocol into account for devices that advertise
only one of NETIF_F_IPV6_CSUM or NETIF_F_IP_CSUM. This patch adds a
function to check capabilities for checksum offload with a socket
called sk_check_csum_caps. This function checks for specific IPv4 or
IPv6 offload support based on the family of the socket.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 16:50:20 -05:00
Tom Herbert
c8cd0989bd net: Eliminate NETIF_F_GEN_CSUM and NETIF_F_V[46]_CSUM
These netif flags are unnecessary convolutions. It is more
straightforward to just use NETIF_F_HW_CSUM, NETIF_F_IP_CSUM,
and NETIF_F_IPV6_CSUM directly.

This patch also:
    - Cleans up can_checksum_protocol
    - Simplifies netdev_intersect_features

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 16:50:20 -05:00
Tom Herbert
a188222b6e net: Rename NETIF_F_ALL_CSUM to NETIF_F_CSUM_MASK
The name NETIF_F_ALL_CSUM is a misnomer. This does not correspond to the
set of features for offloading all checksums. This is a mask of the
checksum offload related features bits. It is incorrect to set both
NETIF_F_HW_CSUM and NETIF_F_IP_CSUM or NETIF_F_IPV6 at the same time for
features of a device.

This patch:
  - Changes instances of NETIF_F_ALL_CSUM to NETIF_F_CSUM_MASK (where
    NETIF_F_ALL_CSUM is being used as a mask).
  - Changes bonding, sfc/efx, ipvlan, macvlan, vlan, and team drivers to
    use NEITF_F_HW_CSUM in features list instead of NETIF_F_ALL_CSUM.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 16:50:08 -05:00
Tom Herbert
53692b1de4 sctp: Rename NETIF_F_SCTP_CSUM to NETIF_F_SCTP_CRC
The SCTP checksum is really a CRC and is very different from the
standards 1's complement checksum that serves as the checksum
for IP protocols. This offload interface is also very different.
Rename NETIF_F_SCTP_CSUM to NETIF_F_SCTP_CRC to highlight these
differences. The term CSUM should be reserved in the stack to refer
to the standard 1's complement IP checksum.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 16:49:58 -05:00
tadeusz.struk@intel.com
130ed5d105 net: fix uninitialized variable issue
msg_iocb needs to be initialized on the recv/recvfrom path.
Otherwise afalg will wrongly interpret it as an async call.

Cc: stable@vger.kernel.org
Reported-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 15:46:48 -05:00
David S. Miller
5233252fce bluetooth: Validate socket address length in sco_sock_bind().
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 15:39:08 -05:00
Eric Dumazet
225734de70 net_sched: make qdisc_tree_decrease_qlen() work for non mq
Stas Nichiporovich reported a regression in his HFSC qdisc setup
on a non multi queue device.

It turns out I mistakenly added a TCQ_F_NOPARENT flag on all qdisc
allocated in qdisc_create() for non multi queue devices, which was
rather buggy. I was clearly mislead by the TCQ_F_ONETXQUEUE that is
also set here for no good reason, since it only matters for the root
qdisc.

Fixes: 4eaf3b84f2 ("net_sched: fix qdisc_tree_decrease_qlen() races")
Reported-by: Stas Nichiporovich <stasn77@gmail.com>
Tested-by: Stas Nichiporovich <stasn77@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 13:41:52 -05:00
Ido Schimmel
6ff64f6f92 switchdev: Pass original device to port netdev driver
switchdev drivers need to know the netdev on which the switchdev op was
invoked. For example, the STP state of a VLAN interface configured on top
of a port can change while being member in a bridge. In this case, the
underlying driver should only change the STP state of that particular
VLAN and not of all the VLANs configured on the port.

However, current switchdev infrastructure only passes the port netdev down
to the driver. Solve that by passing the original device down to the
driver as part of the required switchdev object / attribute.

This doesn't entail any change in current switchdev drivers. It simply
enables those supporting stacked devices to know the originating device
and act accordingly.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 11:58:20 -05:00
Ido Schimmel
9d547833f0 switchdev: vlan: Use switchdev_port* in vlan_netdev_ops
We need to be able to propagate static FDB entries and certain bridge
port attributes (e.g. learning, flooding) down to the port netdev
driver when bridge port is a VLAN interface.

Achieve that by setting ndo_bridge* and ndo_fdb* in vlan_netdev_ops to
the corresponding switchdev_port* functions. This is consistent with
team and bond devices.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 11:58:20 -05:00
Simon Wunderlich
18165f6f65 batman-adv: rename equiv/equal or better to similar or better
Since the function applies a threshold and also slightly worse
values are accepted, ''equal or better'' does not represent the
intention of the function. ''Similar or better'' represents that better.

Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
2015-12-16 00:21:42 +08:00
Marek Lindner
4ff1e2a738 batman-adv: update last seen field of single hop originators
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
2015-12-16 00:21:42 +08:00
Marek Lindner
7587405ab9 batman-adv: export single hop neighbor list via debugfs
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
2015-12-16 00:21:42 +08:00
Marek Lindner
8248a4c7c8 batman-adv: add bat_hardif_neigh_init algo ops call
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
2015-12-16 00:21:41 +08:00
Marek Lindner
cef63419f7 batman-adv: add list of unique single hop neighbors per hard-interface
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
2015-12-16 00:21:41 +08:00
Florian Westphal
9c55d3b545 nfnetlink: add nfnl_dereference_protected helper
to avoid overly long line in followup patch.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-12-15 15:14:04 +01:00
Eyal Shapira
cf1e05c636 mac80211: handle width changes from opmode notification IE in beacon
An AP can send an operating channel width change in a beacon
opmode notification IE as long as there's a change in the nss as
well (See 802.11ac-2013 section 10.41).
So don't limit updating to nss only from an opmode notification IE.

Signed-off-by: Eyal Shapira <eyalx.shapira@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-15 13:16:47 +01:00
Johannes Berg
a87da0cbc4 mac80211: suppress unchanged "limiting TX power" messages
When the AP is advertising limited TX power, the message can be
printed over and over again. Suppress it when the power level
isn't changing.

This fixes https://bugzilla.kernel.org/show_bug.cgi?id=106011

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-15 13:14:04 +01:00
Johannes Berg
1ea2c86480 mac80211: reprogram in interface order
During reprogramming, mac80211 currently first adds all the channel
contexts, then binds them to the vifs and then goes to reconfigure
all the interfaces. Drivers might, perhaps implicitly, rely on the
operation order for certain things that typically happen within a
single function elsewhere in mac80211. To avoid problems with that,
reorder the code in mac80211's restart/reprogramming to work fully
within the interface loop so that the order of operations is like
in normal operation.

For iwlwifi, this fixes a firmware crash when reprogramming with an
AP/GO interface active.

Reported-by: David Spinadel <david.spinadel@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-15 13:13:59 +01:00
Johannes Berg
74430f9489 mac80211: run scan completed work on reconfig failure
When reconfiguration during resume fails while a scan is pending
for completion work, that work will never run, and the scan will
be stuck forever. Factor out the code to recover this and call it
also in ieee80211_handle_reconfig_failure().

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-15 13:12:50 +01:00
Ola Olsson
707554b4d1 nl80211: Fix potential memory leak in nl80211_connect
Free cached keys if the last early return path is taken.

Signed-off-by: Ola Olsson <ola.olsson@sonymobile.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-15 13:11:26 +01:00
Ola Olsson
e5dbe0701a nl80211: Fix potential memory leak in nl80211_set_wowlan
Compared to cfg80211_rdev_free_wowlan in core.h,
the error goto label lacks the freeing of nd_config.
Fix that.

Signed-off-by: Ola Olsson <ola.olsson@sonymobile.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-15 13:10:05 +01:00
Ola Olsson
09d118008f nl80211: fix a few memory leaks in reg.c
The first leak occurs when entering the default case
in the switch for the initiator in set_regdom.
The second leaks a platform_device struct if the
platform registration in regulatory_init succeeds but
the sub sequent regulatory hint fails due to no memory.

Signed-off-by: Ola Olsson <ola.olsson@sonymobile.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-15 13:08:02 +01:00
Vlad Yasevich
f654861569 skbuff: Fix offset error in skb_reorder_vlan_header
skb_reorder_vlan_header is called after the vlan header has
been pulled.  As a result the offset of the begining of
the mac header has been incrased by 4 bytes (VLAN_HLEN).
When moving the mac addresses, include this incrase in
the offset calcualation so that the mac addresses are
copied correctly.

Fixes: a6e18ff111 (vlan: Fix untag operations of stacked vlans with REORDER_HEADER off)
CC: Nicolas Dichtel <nicolas.dichtel@6wind.com>
CC: Patrick McHardy <kaber@trash.net>
Signed-off-by: Vladislav Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 00:30:41 -05:00
Eric Dumazet
5037e9ef94 net: fix IP early demux races
David Wilder reported crashes caused by dst reuse.

<quote David>
  I am seeing a crash on a distro V4.2.3 kernel caused by a double
  release of a dst_entry.  In ipv4_dst_destroy() the call to
  list_empty() finds a poisoned next pointer, indicating the dst_entry
  has already been removed from the list and freed. The crash occurs
  18 to 24 hours into a run of a network stress exerciser.
</quote>

Thanks to his detailed report and analysis, we were able to understand
the core issue.

IP early demux can associate a dst to skb, after a lookup in TCP/UDP
sockets.

When socket cache is not properly set, we want to store into
sk->sk_dst_cache the dst for future IP early demux lookups,
by acquiring a stable refcount on the dst.

Problem is this acquisition is simply using an atomic_inc(),
which works well, unless the dst was queued for destruction from
dst_release() noticing dst refcount went to zero, if DST_NOCACHE
was set on dst.

We need to make sure current refcount is not zero before incrementing
it, or risk double free as David reported.

This patch, being a stable candidate, adds two new helpers, and use
them only from IP early demux problematic paths.

It might be possible to merge in net-next skb_dst_force() and
skb_dst_force_safe(), but I prefer having the smallest patch for stable
kernels : Maybe some skb_dst_force() callers do not expect skb->dst
can suddenly be cleared.

Can probably be backported back to linux-3.6 kernels

Reported-by: David J. Wilder <dwilder@us.ibm.com>
Tested-by: David J. Wilder <dwilder@us.ibm.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-14 23:52:00 -05:00
David S. Miller
5148371a75 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Johan Hedberg says:

====================
pull request: bluetooth-next 2015-12-11

Here's another set of Bluetooth & 802.15.4 patches for the 4.5 kernel:

 - 6LoWPAN debugfs support
 - New 802.15.4 driver for ADF7242 MAC IEEE802154
 - Initial code for 6LoWPAN Generic Header Compression (GHC) support
 - Refactor Bluetooth LE scan & advertising behind dedicated workqueue
 - Cleanups to Bluetooth H:5 HCI driver
 - Support for Toshiba Broadcom based Bluetooth controllers
 - Use continuous scanning when establishing Bluetooth LE connections

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-14 16:23:10 -05:00
Eugene Crosser
979f66b32d iucv: call skb_linearize() when needed
When the linear buffer of the received sk_buff is shorter than
the header, use skb_linearize(). sk_buffs with short linear buffer
happen on the sending side under high traffic, and some kernel
configurations, when allocated buffer starts just before page
boundary, and IUCV transport has to send it as two separate QDIO
buffer elements, with fist element shorter than the header.

Signed-off-by: Eugene Crosser <Eugene.Crosser@ru.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-14 16:16:44 -05:00
Eugene Crosser
0506eb01f7 iucv: prevent information leak in iucv_message
Initialize storage for the future IUCV header that will be included
in the transmitted packet. Some of the header fields are unused with
HiperSockets transport, and will contain data left from some other
functions.

Signed-off-by: Eugene Crosser <Eugene.Crosser@ru.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-14 16:16:44 -05:00
Alexander Aring
5241c2d7c5 ipv6: addrconf: drop ieee802154 specific things
This patch removes ARPHRD_IEEE802154 from addrconf handling. In the
earlier days of 802.15.4 6LoWPAN, the interface type was ARPHRD_IEEE802154
which introduced several issues, because 802.15.4 interfaces used the
same type.

Since commit 965e613d29 ("ieee802154: 6lowpan: fix ARPHRD to
ARPHRD_6LOWPAN") we use ARPHRD_6LOWPAN for 6LoWPAN interfaces. This
patch will remove ARPHRD_IEEE802154 which is currently deadcode, because
ARPHRD_IEEE802154 doesn't reach the minimum 1280 MTU of IPv6.

Also we use 6LoWPAN EUI64 specific defines instead using link-layer
constanst from 802.15.4 link-layer header.

Cc: David S. Miller <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-14 16:10:34 -05:00
Hannes Frederic Sowa
79462ad02e net: add validation for the socket syscall protocol argument
郭永刚 reported that one could simply crash the kernel as root by
using a simple program:

	int socket_fd;
	struct sockaddr_in addr;
	addr.sin_port = 0;
	addr.sin_addr.s_addr = INADDR_ANY;
	addr.sin_family = 10;

	socket_fd = socket(10,3,0x40000000);
	connect(socket_fd , &addr,16);

AF_INET, AF_INET6 sockets actually only support 8-bit protocol
identifiers. inet_sock's skc_protocol field thus is sized accordingly,
thus larger protocol identifiers simply cut off the higher bits and
store a zero in the protocol fields.

This could lead to e.g. NULL function pointer because as a result of
the cut off inet_num is zero and we call down to inet_autobind, which
is NULL for raw sockets.

kernel: Call Trace:
kernel:  [<ffffffff816db90e>] ? inet_autobind+0x2e/0x70
kernel:  [<ffffffff816db9a4>] inet_dgram_connect+0x54/0x80
kernel:  [<ffffffff81645069>] SYSC_connect+0xd9/0x110
kernel:  [<ffffffff810ac51b>] ? ptrace_notify+0x5b/0x80
kernel:  [<ffffffff810236d8>] ? syscall_trace_enter_phase2+0x108/0x200
kernel:  [<ffffffff81645e0e>] SyS_connect+0xe/0x10
kernel:  [<ffffffff81779515>] tracesys_phase2+0x84/0x89

I found no particular commit which introduced this problem.

CVE: CVE-2015-8543
Cc: Cong Wang <cwang@twopensource.com>
Reported-by: 郭永刚 <guoyonggang@360.cn>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-14 16:09:30 -05:00
Tejun Heo
c38c4597e4 netfilter: implement xt_cgroup cgroup2 path match
This patch implements xt_cgroup path match which matches cgroup2
membership of the associated socket.  The match is recursive and
invertible.

For rationales on introducing another cgroup based match, please refer
to a preceding commit "sock, cgroup: add sock->sk_cgroup".

v3: Folded into xt_cgroup as a new revision interface as suggested by
    Pablo.

v2: Included linux/limits.h from xt_cgroup2.h for PATH_MAX.  Added
    explicit alignment to the priv field.  Both suggested by Jan.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Daniel Wagner <daniel.wagner@bmw-carit.de>
CC: Neil Horman <nhorman@tuxdriver.com>
Cc: Jan Engelhardt <jengelh@inai.de>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-12-14 20:34:55 +01:00
Tejun Heo
4ec8ff0edc netfilter: prepare xt_cgroup for multi revisions
xt_cgroup will grow cgroup2 path based match.  Postfix existing
symbols with _v0 and prepare for multi revision registration.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Daniel Wagner <daniel.wagner@bmw-carit.de>
CC: Neil Horman <nhorman@tuxdriver.com>
Cc: Jan Engelhardt <jengelh@inai.de>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-12-14 20:34:52 +01:00
Pablo Neira Ayuso
a4ec80082c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Resolve conflict between commit 264640fc2c ("ipv6: distinguish frag
queues by device for multicast and link-local packets") from the net
tree and commit 029f7f3b87 ("netfilter: ipv6: nf_defrag: avoid/free
clone operations") from the nf-next tree.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

Conflicts:
	net/ipv6/netfilter/nf_conntrack_reasm.c
2015-12-14 20:31:16 +01:00
David S. Miller
9e5be5bd43 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
netfilter fixes for net

The following patchset contains Netfilter fixes for you net tree,
specifically for nf_tables and nfnetlink_queue, they are:

1) Avoid a compilation warning in nfnetlink_queue that was introduced
   in the previous merge window with the simplification of the conntrack
   integration, from Arnd Bergmann.

2) nfnetlink_queue is leaking the pernet subsystem registration from
   a failure path, patch from Nikolay Borisov.

3) Pass down netns pointer to batch callback in nfnetlink, this is the
   largest patch and it is not a bugfix but it is a dependency to
   resolve a splat in the correct way.

4) Fix a splat due to incorrect socket memory accounting with nfnetlink
   skbuff clones.

5) Add missing conntrack dependencies to NFT_DUP_IPV4 and NFT_DUP_IPV6.

6) Traverse the nftables commit list in reverse order from the commit
   path, otherwise we crash when the user applies an incremental update
   via 'nft -f' that deletes an object that was just introduced in this
   batch, from Xin Long.

Regarding the compilation warning fix, many people have sent us (and
keep sending us) patches to address this, that's why I'm including this
batch even if this is not critical.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-14 11:09:01 -05:00