summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2015-01-16Revert "mac80211: Fix accounting of the tailroom-needed counter"Johannes Berg
commit 1e359a5de861a57aa04d92bb620f52a5c1d7f8b1 upstream. This reverts commit ca34e3b5c808385b175650605faa29e71e91991b. It turns out that the p54 and cw2100 drivers assume that there's tailroom even when they don't say they really need it. However, there's currently no way for them to explicitly say they do need it, so for now revert this. This fixes https://bugzilla.kernel.org/show_bug.cgi?id=90331. Fixes: ca34e3b5c808 ("mac80211: Fix accounting of the tailroom-needed counter") Reported-by: Christopher Chavez <chrischavez@gmx.us> Bisected-by: Larry Finger <Larry.Finger@lwfinger.net> Debugged-by: Christian Lamparter <chunkeey@googlemail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-16rpc: fix xdr_truncate_encode to handle buffer ending on page boundaryJ. Bruce Fields
commit 49a068f82a1d30eb585d7804b05948376be6cf9a upstream. A struct xdr_stream at a page boundary might point to the end of one page or the beginning of the next, but xdr_truncate_encode isn't prepared to handle the former. This can cause corruption of NFSv4 READDIR replies in the case that a readdir entry that would have exceeded the client's dircount/maxcount limit would have ended exactly on a 4k page boundary. You're more likely to hit this case on large directories. Other xdr_truncate_encode callers are probably also affected. Reported-by: Holger Hoffstätte <holger.hoffstaette@googlemail.com> Tested-by: Holger Hoffstätte <holger.hoffstaette@googlemail.com> Fixes: 3e19ce762b53 "rpc: xdr_truncate_encode" Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-16Bluetooth: Fix accepting connections when not using mgmtJohan Hedberg
commit 6a8fc95c87110a466ee81675b41170b963f82bdb upstream. When connectable mode is enabled (page scan on) through some non-mgmt method the HCI_CONNECTABLE flag will not be set. For backwards compatibility with user space versions not using mgmt we should not require HCI_CONNECTABLE to be set if HCI_MGMT is not set. Reported-by: Pali Rohár <pali.rohar@gmail.com> Tested-by: Pali Rohár <pali.rohar@gmail.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-16Bluetooth: Fix controller configuration with HCI_QUIRK_INVALID_BDADDRMarcel Holtmann
commit 8bfe8442ff20fdc2d965c197103d935a99bd3296 upstream. When controllers set the HCI_QUIRK_INVALID_BDADDR flag, it is required by userspace to program a valid public Bluetooth device address into the controller before it can be used. After successful address configuration, the internal state changes and the controller runs the complete initialization procedure. However one small difference is that this is no longer the HCI_SETUP stage. The HCI_SETUP stage is only valid during initial controller setup. In this case the stack runs the initialization as part of the HCI_CONFIG stage. The controller version information, default name and supported commands are only stored during HCI_SETUP. While these information are static, they are not read initially when HCI_QUIRK_INVALID_BDADDR is set. So when running in HCI_CONFIG state, these information need to be updated as well. This especially impacts Bluetooth 4.1 and later controllers using extended feature pages and second event mask page. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-16Bluetooth: Clear LE white list when resetting controllerMarcel Holtmann
commit a4d5504d5c39cc84f1f828e19967595597a8136e upstream. The internal representation of the LE white list needs to be cleared when receiving a successful HCI_Reset command. A reset of the controller is expected to start with an empty LE white list. When the LE white list is not cleared on controller reset, the passive background scanning might skip programming the remote devices. Only changes to the LE white list are programmed when passive background is started. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-16Bluetooth: Fix check for direct advertisingJohan Hedberg
commit 0b1db38ca26b322296cbd141f3080eccfe1cc3e1 upstream. These days we allow simultaneous LE scanning and advertising. Checking for whether advertising is enabled or not is therefore not a reliable way to determine whether directed advertising was used to trigger the connection creation. The appropriate place to check (instead of the hdev context) is the connection role that's stored in the hci_conn. This patch fixes such a check in le_conn_timeout() which could otherwise lead to incorrect HCI commands being sent. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-16Bluetooth: Fix LE connection timeout deadlockJohan Hedberg
commit 980ffc0a2cec2c37589cc97993e1ad17252f4f47 upstream. The le_conn_timeout() may call hci_le_conn_failed() which in turn may call hci_conn_del(). Trying to use the _sync variant for cancelling the conn timeout from hci_conn_del() could therefore result in a deadlock. This patch converts hci_conn_del() to use the non-sync variant so the deadlock is not possible. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-16Bluetooth: 6lowpan: fix skb_unshare behaviourAlexander Aring
commit b0c42cd7b210efc74aa4bfc3e39a2814dfaa9b89 upstream. This patch reverts commit: a7807d73 ("Bluetooth: 6lowpan: Avoid memory leak if memory allocation fails") which was wrong suggested by Alexander Aring. The function skb_unshare run also kfree_skb on failure. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-08mac80211: free management frame keys when removing stationJohannes Berg
commit 28a9bc68124c319b2b3dc861e80828a8865fd1ba upstream. When writing the code to allow per-station GTKs, I neglected to take into account the management frame keys (index 4 and 5) when freeing the station and only added code to free the first four data frame keys. Fix this by iterating the array of keys over the right length. Fixes: e31b82136d1a ("cfg80211/mac80211: allow per-station GTKs") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-08mac80211: fix multicast LED blinking and counterAndreas Müller
commit d025933e29872cb1fe19fc54d80e4dfa4ee5779c upstream. As multicast-frames can't be fragmented, "dot11MulticastReceivedFrameCount" stopped being incremented after the use-after-free fix. Furthermore, the RX-LED will be triggered by every multicast frame (which wouldn't happen before) which wouldn't allow the LED to rest at all. Fixes https://bugzilla.kernel.org/show_bug.cgi?id=89431 which also had the patch. Fixes: b8fff407a180 ("mac80211: fix use-after-free in defragmentation") Signed-off-by: Andreas Müller <goo@stapelspeicher.org> [rewrite commit message] Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-08mac80211: avoid using uninitialized stack dataJes Sorensen
commit 7e6225a1604d0c6aa4140289bf5761868ffc9c83 upstream. Avoid a case where we would access uninitialized stack data if the AP advertises HT support without 40MHz channel support. Fixes: f3000e1b43f1 ("mac80211: fix broken use of VHT/20Mhz with some APs") Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-08mac80211: copy chandef from AP vif to VLANsFelix Fietkau
commit 2967e031d4d737d9cc8252d878a17924d7b704f0 upstream. Instead of keeping track of all those special cases where VLAN interfaces have no bss_conf.chandef, just make sure they have the same as the AP interface they belong to. Among others, this fixes a crash getting a VLAN's channel from userspace since a NULL channel is returned as a good result (return value 0) for VLANs since the commit below. Fixes: c12bc4885f4b3 ("mac80211: return the vif's chandef in ieee80211_cfg_get_channel()") Signed-off-by: Felix Fietkau <nbd@openwrt.org> [rewrite commit log] Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-12-16netlink: use jhash as hashfn for rhashtableDaniel Borkmann
[ Upstream commit 7f19fc5e0b617593dcda0d9956adc78b559ef1f5 ] For netlink, we shouldn't be using arch_fast_hash() as a hashing discipline, but rather jhash() instead. Since netlink sockets can be opened by any user, a local attacker would be able to easily create collisions with the DPDK-derived arch_fast_hash(), which trades off performance for security by using crc32 CPU instructions on x86_64. While it might have a legimite use case in other places, it should be avoided in netlink context, though. As rhashtable's API is very flexible, we could later on still decide on other hashing disciplines, if legitimate. Reference: http://thread.gmane.org/gmane.linux.kernel/1844123 Fixes: e341694e3eb5 ("netlink: Convert netlink_lookup() to use RCU protected hash table") Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-12-16net: fix suspicious rcu_dereference_check in net/sched/sch_fq_codel.cValdis.Kletnieks@vt.edu
[ Upstream commit 69204cf7eb9c5a72067ce6922d4699378251d053 ] commit 46e5da40ae (net: qdisc: use rcu prefix and silence sparse warnings) triggers a spurious warning: net/sched/sch_fq_codel.c:97 suspicious rcu_dereference_check() usage! The code should be using the _bh variant of rcu_dereference. Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-12-16tcp: fix more NULL deref after prequeue changesEric Dumazet
[ Upstream commit 0f85feae6b710ced3abad5b2b47d31dfcb956b62 ] When I cooked commit c3658e8d0f1 ("tcp: fix possible NULL dereference in tcp_vX_send_reset()") I missed other spots we could deref a NULL skb_dst(skb) Again, if a socket is provided, we do not need skb_dst() to get a pointer to network namespace : sock_net(sk) is good enough. Reported-by: Dann Frazier <dann.frazier@canonical.com> Bisected-by: Dann Frazier <dann.frazier@canonical.com> Tested-by: Dann Frazier <dann.frazier@canonical.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: ca777eff51f7 ("tcp: remove dst refcount false sharing for prequeue mode") Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-12-16net: sctp: use MAX_HEADER for headroom reserve in output pathDaniel Borkmann
[ Upstream commit 9772b54c55266ce80c639a80aa68eeb908f8ecf5 ] To accomodate for enough headroom for tunnels, use MAX_HEADER instead of LL_MAX_HEADER. Robert reported that he has hit after roughly 40hrs of trinity an skb_under_panic() via SCTP output path (see reference). I couldn't reproduce it from here, but not using MAX_HEADER as elsewhere in other protocols might be one possible cause for this. In any case, it looks like accounting on chunks themself seems to look good as the skb already passed the SCTP output path and did not hit any skb_over_panic(). Given tunneling was enabled in his .config, the headroom would have been expanded by MAX_HEADER in this case. Reported-by: Robert Święcki <robert@swiecki.net> Reference: https://lkml.org/lkml/2014/12/1/507 Fixes: 594ccc14dfe4d ("[SCTP] Replace incorrect use of dev_alloc_skb with alloc_skb in sctp_packet_transmit().") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-12-16openvswitch: Fix flow mask validation.Pravin B Shelar
[ Upstream commit f2a01517f2a1040a0b156f171a7cefd748f2fd03 ] Following patch fixes typo in the flow validation. This prevented installation of ARP and IPv6 flows. Fixes: 19e7a3df72 ("openvswitch: Fix NDP flow mask validation") Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Reviewed-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-12-16gre: Set inner mac header in gro completeTom Herbert
[ Upstream commit 6fb2a756739aa507c1fd5b8126f0bfc2f070dc46 ] Set the inner mac header to point to the GRE payload when doing GRO. This is needed if we proceed to send the packet through GRE GSO which now uses the inner mac header instead of inner network header to determine the length of encapsulation headers. Fixes: 14051f0452a2 ("gre: Use inner mac length when computing tunnel length") Reported-by: Wolfgang Walter <linux@stwm.de> Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-29rtnetlink: release net refcnt on error in do_setlink()Nicolas Dichtel
rtnl_link_get_net() holds a reference on the 'struct net', we need to release it in case of error. CC: Eric W. Biederman <ebiederm@xmission.com> Fixes: b51642f6d77b ("net: Enable a userns root rtnl calls that are safe for unprivilged users") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: "Several small fixes here: 1) Don't crash in tg3 driver when the number of tx queues has been configured to be different from the number of rx queues. From Thadeu Lima de Souza Cascardo. 2) VLAN filter not disabled properly in promisc mode in ixgbe driver, from Vlad Yasevich. 3) Fix OOPS on dellink op in VTI tunnel driver, from Xin Long. 4) IPV6 GRE driver WCCP code checks skb->protocol for ETH_P_IP instead of ETH_P_IPV6, whoops. From Yuri Chislov. 5) Socket matching in ping driver is buggy when packet AF does not match socket's AF. Fix from Jane Zhou. 6) Fix checksum calculation errors in VXLAN due to where the udp_tunnel6_xmit_skb() helper gets it's saddr/daddr from. From Alexander Duyck. 7) Fix 5G detection problem in rtlwifi driver, from Larry Finger. 8) Fix NULL deref in tcp_v{4,6}_send_reset, from Eric Dumazet. 9) Various missing netlink attribute verifications in bridging code, from Thomas Graf. 10) tcp_recvmsg() unconditionally calls ipv4 ip_recv_error even for ipv6 sockets, whoops. Fix from Willem de Bruijn" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (29 commits) net-timestamp: make tcp_recvmsg call ipv6_recv_error for AF_INET6 socks bridge: Sanitize IFLA_EXT_MASK for AF_BRIDGE:RTM_GETLINK bridge: Add missing policy entry for IFLA_BRPORT_FAST_LEAVE net: Check for presence of IFLA_AF_SPEC net: Validate IFLA_BRIDGE_MODE attribute length bridge: Validate IFLA_BRIDGE_FLAGS attribute length stmmac: platform: fix default values of the filter bins setting net/mlx4_core: Limit count field to 24 bits in qp_alloc_res net: dsa: bcm_sf2: reset switch prior to initialization net: dsa: bcm_sf2: fix unmapping registers in case of errors tg3: fix ring init when there are more TX than RX channels tcp: fix possible NULL dereference in tcp_vX_send_reset() rtlwifi: Change order in device startup rtlwifi: rtl8821ae: Fix 5G detection problem Revert "netfilter: conntrack: fix race in __nf_conntrack_confirm against get_next_corpse" vxlan: Fix boolean flip in VXLAN_F_UDP_ZERO_CSUM6_[TX|RX] ip6_udp_tunnel: Fix checksum calculation net-timestamp: Fix a documentation typo net/ping: handle protocol mismatching scenario af_packet: fix sparse warning ...
2014-11-26net-timestamp: make tcp_recvmsg call ipv6_recv_error for AF_INET6 socksWillem de Bruijn
TCP timestamping introduced MSG_ERRQUEUE handling for TCP sockets. If the socket is of family AF_INET6, call ipv6_recv_error instead of ip_recv_error. This change is more complex than a single branch due to the loadable ipv6 module. It reuses a pre-existing indirect function call from ping. The ping code is safe to call, because it is part of the core ipv6 module and always present when AF_INET6 sockets are active. Fixes: 4ed2d765 (net-timestamp: TCP timestamping) Signed-off-by: Willem de Bruijn <willemb@google.com> ---- It may also be worthwhile to add WARN_ON_ONCE(sk->family == AF_INET6) to ip_recv_error. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-26bridge: Sanitize IFLA_EXT_MASK for AF_BRIDGE:RTM_GETLINKThomas Graf
Only search for IFLA_EXT_MASK if the message actually carries a ifinfomsg header and validate minimal length requirements for IFLA_EXT_MASK. Fixes: 6cbdceeb ("bridge: Dump vlan information from a bridge port") Cc: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-26bridge: Add missing policy entry for IFLA_BRPORT_FAST_LEAVEThomas Graf
Fixes: c2d3babf ("bridge: implement multicast fast leave") Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-26bridge: Validate IFLA_BRIDGE_FLAGS attribute lengthThomas Graf
Payload is currently accessed blindly and may exceed valid message boundaries. Fixes: 407af3299 ("bridge: Add netlink interface to configure vlans on bridge ports") Cc: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-25Merge branch 'for-3.18' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
Pull nfsd bugfixes from Bruce Fields: "These fix one mishandling of the case when security labels are configured out, and two races in the 4.1 backchannel code" * 'for-3.18' of git://linux-nfs.org/~bfields/linux: nfsd: Fix slot wake up race in the nfsv4.1 callback code SUNRPC: Fix locking around callback channel reply receive nfsd: correctly define v4.2 support attributes
2014-11-25tcp: fix possible NULL dereference in tcp_vX_send_reset()Eric Dumazet
After commit ca777eff51f7 ("tcp: remove dst refcount false sharing for prequeue mode") we have to relax check against skb dst in tcp_v[46]_send_reset() if prequeue dropped the dst. If a socket is provided, a full lookup was done to find this socket, so the dst test can be skipped. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=88191 Reported-by: Jaša Bartelj <jasa.bartelj@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Daniel Borkmann <dborkman@redhat.com> Fixes: ca777eff51f7 ("tcp: remove dst refcount false sharing for prequeue mode") Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-25Revert "netfilter: conntrack: fix race in __nf_conntrack_confirm against ↵Pablo Neira
get_next_corpse" This reverts commit 5195c14c8b27cc0b18220ddbf0e5ad3328a04187. If the conntrack clashes with an existing one, it is left out of the unconfirmed list, thus, crashing when dropping the packet and releasing the conntrack since golden rule is that conntracks are always placed in any of the existing lists for traceability reasons. Reported-by: Daniel Borkmann <dborkman@redhat.com> Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=88841 Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-25ip6_udp_tunnel: Fix checksum calculationAlexander Duyck
The UDP checksum calculation for VXLAN tunnels is currently using the socket addresses instead of the actual packet source and destination addresses. As a result the checksum calculated is incorrect in some cases. Also uh->check was being set twice, first it was set to 0, and then it is set again in udp6_set_csum. This change removes the redundant assignment to 0. Fixes: acbf74a7 ("vxlan: Refactor vxlan driver to make use of the common UDP tunnel functions.") Cc: Andy Zhou <azhou@nicira.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-24net/ping: handle protocol mismatching scenarioJane Zhou
ping_lookup() may return a wrong sock if sk_buff's and sock's protocols dont' match. For example, sk_buff's protocol is ETH_P_IPV6, but sock's sk_family is AF_INET, in that case, if sk->sk_bound_dev_if is zero, a wrong sock will be returned. the fix is to "continue" the searching, if no matching, return NULL. Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Cc: netdev@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Jane Zhou <a17711@motorola.com> Signed-off-by: Yiwei Zhao <gbjc64@motorola.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-24af_packet: fix sparse warningMichael S. Tsirkin
af_packet produces lots of these: net/packet/af_packet.c:384:39: warning: incorrect type in return expression (different modifiers) net/packet/af_packet.c:384:39: expected struct page [pure] * net/packet/af_packet.c:384:39: got struct page * this seems to be because sparse does not realize that _pure refers to function, not the returned pointer. Tweak code slightly to avoid the warning. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-24ipv6: gre: fix wrong skb->protocol in WCCPYuri Chislov
When using GRE redirection in WCCP, it sets the wrong skb->protocol, that is, ETH_P_IP instead of ETH_P_IPV6 for the encapuslated traffic. Fixes: c12b395a4664 ("gre: Support GRE over IPv6") Cc: Dmitry Kozlov <xeb@mail.ru> Signed-off-by: Yuri Chislov <yuri.chislov@gmail.com> Tested-by: Yuri Chislov <yuri.chislov@gmail.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-23ip_tunnel: the lack of vti_link_ops' dellink() cause kernel paniclucien
Now the vti_link_ops do not point the .dellink, for fb tunnel device (ip_vti0), the net_device will be removed as the default .dellink is unregister_netdevice_queue,but the tunnel still in the tunnel list, then if we add a new vti tunnel, in ip_tunnel_find(): hlist_for_each_entry_rcu(t, head, hash_node) { if (local == t->parms.iph.saddr && remote == t->parms.iph.daddr && link == t->parms.link && ==> type == t->dev->type && ip_tunnel_key_match(&t->parms, flags, key)) break; } the panic will happen, cause dev of ip_tunnel *t is null: [ 3835.072977] IP: [<ffffffffa04103fd>] ip_tunnel_find+0x9d/0xc0 [ip_tunnel] [ 3835.073008] PGD b2c21067 PUD b7277067 PMD 0 [ 3835.073008] Oops: 0000 [#1] SMP ..... [ 3835.073008] Stack: [ 3835.073008] ffff8800b72d77f0 ffffffffa0411924 ffff8800bb956000 ffff8800b72d78e0 [ 3835.073008] ffff8800b72d78a0 0000000000000000 ffffffffa040d100 ffff8800b72d7858 [ 3835.073008] ffffffffa040b2e3 0000000000000000 0000000000000000 0000000000000000 [ 3835.073008] Call Trace: [ 3835.073008] [<ffffffffa0411924>] ip_tunnel_newlink+0x64/0x160 [ip_tunnel] [ 3835.073008] [<ffffffffa040b2e3>] vti_newlink+0x43/0x70 [ip_vti] [ 3835.073008] [<ffffffff8150d4da>] rtnl_newlink+0x4fa/0x5f0 [ 3835.073008] [<ffffffff812f68bb>] ? nla_strlcpy+0x5b/0x70 [ 3835.073008] [<ffffffff81508fb0>] ? rtnl_link_ops_get+0x40/0x60 [ 3835.073008] [<ffffffff8150d11f>] ? rtnl_newlink+0x13f/0x5f0 [ 3835.073008] [<ffffffff81509cf4>] rtnetlink_rcv_msg+0xa4/0x270 [ 3835.073008] [<ffffffff8126adf5>] ? sock_has_perm+0x75/0x90 [ 3835.073008] [<ffffffff81509c50>] ? rtnetlink_rcv+0x30/0x30 [ 3835.073008] [<ffffffff81529e39>] netlink_rcv_skb+0xa9/0xc0 [ 3835.073008] [<ffffffff81509c48>] rtnetlink_rcv+0x28/0x30 .... modprobe ip_vti ip link del ip_vti0 type vti ip link add ip_vti0 type vti rmmod ip_vti do that one or more times, kernel will panic. fix it by assigning ip_tunnel_dellink to vti_link_ops' dellink, in which we skip the unregister of fb tunnel device. do the same on ip6_vti. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-23ipv6: Do not treat a GSO_TCPV4 request from UDP tunnel over IPv6 as invalidAlexander Duyck
This patch adds SKB_GSO_TCPV4 to the list of supported GSO types handled by the IPv6 GSO offloads. Without this change VXLAN tunnels running over IPv6 do not currently handle IPv4 TCP TSO requests correctly and end up handing the non-segmented frame off to the device. Below is the before and after for a simple netperf TCP_STREAM test between two endpoints tunneling IPv4 over a VXLAN tunnel running on IPv6 on top of a 1Gb/s network adapter. Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 87380 16384 16384 10.29 0.88 Before 87380 16384 16384 10.03 895.69 After Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Fix BUG when decrypting empty packets in mac80211, from Ronald Wahl. 2) nf_nat_range is not fully initialized and this is copied back to userspace, from Daniel Borkmann. 3) Fix read past end of b uffer in netfilter ipset, also from Dan Carpenter. 4) Signed integer overflow in ipv4 address mask creation helper inet_make_mask(), from Vincent BENAYOUN. 5) VXLAN, be2net, mlx4_en, and qlcnic need ->ndo_gso_check() methods to properly describe the device's capabilities, from Joe Stringer. 6) Fix memory leaks and checksum miscalculations in openvswitch, from Pravin B SHelar and Jesse Gross. 7) FIB rules passes back ambiguous error code for unreachable routes, making behavior confusing for userspace. Fix from Panu Matilainen. 8) ieee802154fake_probe() doesn't release resources properly on error, from Alexey Khoroshilov. 9) Fix skb_over_panic in add_grhead(), from Daniel Borkmann. 10) Fix access of stale slave pointers in bonding code, from Nikolay Aleksandrov. 11) Fix stack info leak in PPP pptp code, from Mathias Krause. 12) Cure locking bug in IPX stack, from Jiri Bohac. 13) Revert SKB fclone memory freeing optimization that is racey and can allow accesses to freed up memory, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (71 commits) tcp: Restore RFC5961-compliant behavior for SYN packets net: Revert "net: avoid one atomic operation in skb_clone()" virtio-net: validate features during probe cxgb4 : Fix DCB priority groups being returned in wrong order ipx: fix locking regression in ipx_sendmsg and ipx_recvmsg openvswitch: Don't validate IPv6 label masks. pptp: fix stack info leak in pptp_getname() brcmfmac: don't include linux/unaligned/access_ok.h cxgb4i : Don't block unload/cxgb4 unload when remote closes TCP connection ipv6: delete protocol and unregister rtnetlink when cleanup net/mlx4_en: Add VXLAN ndo calls to the PF net device ops too bonding: fix curr_active_slave/carrier with loadbalance arp monitoring mac80211: minstrel_ht: fix a crash in rate sorting vxlan: Inline vxlan_gso_check(). can: m_can: update to support CAN FD features can: m_can: fix incorrect error messages can: m_can: add missing delay after setting CCCR_INIT bit can: m_can: fix not set can_dlc for remote frame can: m_can: fix possible sleep in napi poll can: m_can: add missing message RAM initialization ...
2014-11-21tcp: Restore RFC5961-compliant behavior for SYN packetsCalvin Owens
Commit c3ae62af8e755 ("tcp: should drop incoming frames without ACK flag set") was created to mitigate a security vulnerability in which a local attacker is able to inject data into locally-opened sockets by using TCP protocol statistics in procfs to quickly find the correct sequence number. This broke the RFC5961 requirement to send a challenge ACK in response to spurious RST packets, which was subsequently fixed by commit 7b514a886ba50 ("tcp: accept RST without ACK flag"). Unfortunately, the RFC5961 requirement that spurious SYN packets be handled in a similar manner remains broken. RFC5961 section 4 states that: ... the handling of the SYN in the synchronized state SHOULD be performed as follows: 1) If the SYN bit is set, irrespective of the sequence number, TCP MUST send an ACK (also referred to as challenge ACK) to the remote peer: <SEQ=SND.NXT><ACK=RCV.NXT><CTL=ACK> After sending the acknowledgment, TCP MUST drop the unacceptable segment and stop processing further. By sending an ACK, the remote peer is challenged to confirm the loss of the previous connection and the request to start a new connection. A legitimate peer, after restart, would not have a TCB in the synchronized state. Thus, when the ACK arrives, the peer should send a RST segment back with the sequence number derived from the ACK field that caused the RST. This RST will confirm that the remote peer has indeed closed the previous connection. Upon receipt of a valid RST, the local TCP endpoint MUST terminate its connection. The local TCP endpoint should then rely on SYN retransmission from the remote end to re-establish the connection. This patch lets SYN packets through the discard added in c3ae62af8e755, so that spurious SYN packets are properly dealt with as per the RFC. The challenge ACK is sent unconditionally and is rate-limited, so the original vulnerability is not reintroduced by this patch. Signed-off-by: Calvin Owens <calvinowens@fb.com> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-21net: Revert "net: avoid one atomic operation in skb_clone()"Eric Dumazet
Not sure what I was thinking, but doing anything after releasing a refcount is suicidal or/and embarrassing. By the time we set skb->fclone to SKB_FCLONE_FREE, another cpu could have released last reference and freed whole skb. We potentially corrupt memory or trap if CONFIG_DEBUG_PAGEALLOC is set. Reported-by: Chris Mason <clm@fb.com> Fixes: ce1a4ea3f1258 ("net: avoid one atomic operation in skb_clone()") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains two bugfixes for your net tree, they are: 1) Validate netlink group from nfnetlink to avoid an out of bound array access. This should only happen with superuser priviledges though. Discovered by Andrey Ryabinin using trinity. 2) Don't push ethernet header before calling the netfilter output hook for multicast traffic, this breaks ebtables since it expects to see skb->data pointing to the network header, patch from Linus Luessing. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-21Merge tag 'master-2014-11-20' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== pull request: wireless 2014-11-20 Please full this little batch of fixes intended for the 3.18 stream! For the mac80211 patch, Johannes says: "Here's another last minute fix, for minstrel HT crashing depending on the value of some uninitialised stack." On top of that... Ben Greear fixes an ath9k regression in which a BSSID mask is miscalculated. Dmitry Torokhov corrects an error handling routing in brcmfmac which was checking an unsigned variable for a negative value. Johannes Berg avoids a build problem in brcmfmac for arches where linux/unaligned/access_ok.h and asm/unaligned.h conflict. Mathy Vanhoef addresses another brcmfmac issue so as to eliminate a use-after-free of the URB transfer buffer if a timeout occurs. Please let me know if there are problems! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-20ipx: fix locking regression in ipx_sendmsg and ipx_recvmsgJiri Bohac
This fixes an old regression introduced by commit b0d0d915 (ipx: remove the BKL). When a recvmsg syscall blocks waiting for new data, no data can be sent on the same socket with sendmsg because ipx_recvmsg() sleeps with the socket locked. This breaks mars-nwe (NetWare emulator): - the ncpserv process reads the request using recvmsg - ncpserv forks and spawns nwconn - ncpserv calls a (blocking) recvmsg and waits for new requests - nwconn deadlocks in sendmsg on the same socket Commit b0d0d915 has simply replaced BKL locking with lock_sock/release_sock. Unlike now, BKL got unlocked while sleeping, so a blocking recvmsg did not block a concurrent sendmsg. Only keep the socket locked while actually working with the socket data and release it prior to calling skb_recv_datagram(). Signed-off-by: Jiri Bohac <jbohac@suse.cz> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-20openvswitch: Don't validate IPv6 label masks.Joe Stringer
When userspace doesn't provide a mask, OVS datapath generates a fully unwildcarded mask for the flow by copying the flow and setting all bits in all fields. For IPv6 label, this creates a mask that matches on the upper 12 bits, causing the following error: openvswitch: netlink: Invalid IPv6 flow label value (value=ffffffff, max=fffff) This patch ignores the label validation check for masks, avoiding this error. Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-19ipv6: delete protocol and unregister rtnetlink when cleanupDuan Jiong
pim6_protocol was added when initiation, but it not deleted. Similarly, unregister RTNL_FAMILY_IP6MR rtnetlink. Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Reviewed-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-19Merge tag 'mac80211-for-john-2014-11-18' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg <johannes@sipsolutions.net> says: "Here's another last minute fix, for minstrel HT crashing depending on the value of some uninitialised stack." Signed-off-by: John W. Linville <linville@tuxdriver.com>
2014-11-19SUNRPC: Fix locking around callback channel reply receiveTrond Myklebust
Both xprt_lookup_rqst() and xprt_complete_rqst() require that you take the transport lock in order to avoid races with xprt_transmit(). Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org Reviewed-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-11-18mac80211: minstrel_ht: fix a crash in rate sortingFelix Fietkau
The commit 5935839ad73583781b8bbe8d91412f6826e218a4 "mac80211: improve minstrel_ht rate sorting by throughput & probability" introduced a crash on rate sorting that occurs when the rate added to the sorting array is faster than all the previous rates. Due to an off-by-one error, it reads the rate index from tp_list[-1], which contains uninitialized stack garbage, and then uses the resulting index for accessing the group rate stats, leading to a crash if the garbage value is big enough. Cc: Thomas Huehn <thomas@net.t-labs.tu-berlin.de> Reported-by: Jouni Malinen <j@w1.fi> Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-11-17bridge: fix netfilter/NF_BR_LOCAL_OUT for own, locally generated queriesLinus Lüssing
Ebtables on the OUTPUT chain (NF_BR_LOCAL_OUT) would not work as expected for both locally generated IGMP and MLD queries. The IP header specific filter options are off by 14 Bytes for netfilter (actual output on interfaces is fine). NF_HOOK() expects the skb->data to point to the IP header, not the ethernet one (while dev_queue_xmit() does not). Luckily there is an br_dev_queue_push_xmit() helper function already - let's just use that. Introduced by eb1d16414339a6e113d89e2cca2556005d7ce919 ("bridge: Add core IGMP snooping support") Ebtables example: $ ebtables -I OUTPUT -p IPv6 -o eth1 --logical-out br0 \ --log --log-level 6 --log-ip6 --log-prefix="~EBT: " -j DROP before (broken): ~EBT: IN= OUT=eth1 MAC source = 02:04:64:a4:39:c2 \ MAC dest = 33:33:00:00:00:01 proto = 0x86dd IPv6 \ SRC=64a4:39c2:86dd:6000:0000:0020:0001:fe80 IPv6 \ DST=0000:0000:0000:0004:64ff:fea4:39c2:ff02, \ IPv6 priority=0x3, Next Header=2 after (working): ~EBT: IN= OUT=eth1 MAC source = 02:04:64:a4:39:c2 \ MAC dest = 33:33:00:00:00:01 proto = 0x86dd IPv6 \ SRC=fe80:0000:0000:0000:0004:64ff:fea4:39c2 IPv6 \ DST=ff02:0000:0000:0000:0000:0000:0000:0001, \ IPv6 priority=0x0, Next Header=0 Signed-off-by: Linus Lüssing <linus.luessing@web.de> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-11-17netfilter: nfnetlink: fix insufficient validation in nfnetlink_bindPablo Neira Ayuso
Make sure the netlink group exists, otherwise you can trigger an out of bound array memory access from the netlink_bind() path. This splat can only be triggered only by superuser. [ 180.203600] UBSan: Undefined behaviour in ../net/netfilter/nfnetlink.c:467:28 [ 180.204249] index 9 is out of range for type 'int [9]' [ 180.204697] CPU: 0 PID: 1771 Comm: trinity-main Not tainted 3.18.0-rc4-mm1+ #122 [ 180.205365] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org +04/01/2014 [ 180.206498] 0000000000000018 0000000000000000 0000000000000009 ffff88007bdf7da8 [ 180.207220] ffffffff82b0ef5f 0000000000000092 ffffffff845ae2e0 ffff88007bdf7db8 [ 180.207887] ffffffff8199e489 ffff88007bdf7e18 ffffffff8199ea22 0000003900000000 [ 180.208639] Call Trace: [ 180.208857] dump_stack (lib/dump_stack.c:52) [ 180.209370] ubsan_epilogue (lib/ubsan.c:174) [ 180.209849] __ubsan_handle_out_of_bounds (lib/ubsan.c:400) [ 180.210512] nfnetlink_bind (net/netfilter/nfnetlink.c:467) [ 180.210986] netlink_bind (net/netlink/af_netlink.c:1483) [ 180.211495] SYSC_bind (net/socket.c:1541) Moreover, define the missing nf_tables and nf_acct multicast groups too. Reported-by: Andrey Ryabinin <a.ryabinin@samsung.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-11-16ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUsDaniel Borkmann
It has been reported that generating an MLD listener report on devices with large MTUs (e.g. 9000) and a high number of IPv6 addresses can trigger a skb_over_panic(): skbuff: skb_over_panic: text:ffffffff80612a5d len:3776 put:20 head:ffff88046d751000 data:ffff88046d751010 tail:0xed0 end:0xec0 dev:port1 ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:100! invalid opcode: 0000 [#1] SMP Modules linked in: ixgbe(O) CPU: 3 PID: 0 Comm: swapper/3 Tainted: G O 3.14.23+ #4 [...] Call Trace: <IRQ> [<ffffffff80578226>] ? skb_put+0x3a/0x3b [<ffffffff80612a5d>] ? add_grhead+0x45/0x8e [<ffffffff80612e3a>] ? add_grec+0x394/0x3d4 [<ffffffff80613222>] ? mld_ifc_timer_expire+0x195/0x20d [<ffffffff8061308d>] ? mld_dad_timer_expire+0x45/0x45 [<ffffffff80255b5d>] ? call_timer_fn.isra.29+0x12/0x68 [<ffffffff80255d16>] ? run_timer_softirq+0x163/0x182 [<ffffffff80250e6f>] ? __do_softirq+0xe0/0x21d [<ffffffff8025112b>] ? irq_exit+0x4e/0xd3 [<ffffffff802214bb>] ? smp_apic_timer_interrupt+0x3b/0x46 [<ffffffff8063f10a>] ? apic_timer_interrupt+0x6a/0x70 mld_newpack() skb allocations are usually requested with dev->mtu in size, since commit 72e09ad107e7 ("ipv6: avoid high order allocations") we have changed the limit in order to be less likely to fail. However, in MLD/IGMP code, we have some rather ugly AVAILABLE(skb) macros, which determine if we may end up doing an skb_put() for adding another record. To avoid possible fragmentation, we check the skb's tailroom as skb->dev->mtu - skb->len, which is a wrong assumption as the actual max allocation size can be much smaller. The IGMP case doesn't have this issue as commit 57e1ab6eaddc ("igmp: refine skb allocations") stores the allocation size in the cb[]. Set a reserved_tailroom to make it fit into the MTU and use skb_availroom() helper instead. This also allows to get rid of igmp_skb_size(). Reported-by: Wei Liu <lw1a2.jing@gmail.com> Fixes: 72e09ad107e7 ("ipv6: avoid high order allocations") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: David L Stevens <david.stevens@oracle.com> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-16Merge branch 'net_ovs' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/pshelar/openvswitch Pravin B Shelar says: ==================== Open vSwitch Following fixes are accumulated in ovs-repo. Three of them are related to protocol processing, one is related to memory leak in case of error and one is to fix race. Patch "Validate IPv6 flow key and mask values" has conflicts with net-next, Let me know if you want me to send the patch for net-next. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-16dcbnl : Disable software interrupts before taking dcb_lockAnish Bhatt
Solves possible lockup issues that can be seen from firmware DCB agents calling into the DCB app api. DCB firmware event queues can be tied in with NAPI so that dcb events are generated in softIRQ context. This can results in calls to dcb_*app() functions which try to take the dcb_lock. If the the event triggers while we also have the dcb_lock because lldpad or some other agent happened to be issuing a get/set command we could see a cpu lockup. This code was not originally written with firmware agents in mind, hence grabbing dcb_lock from softIRQ context was not considered. Signed-off-by: Anish Bhatt <anish@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net The following patchset contains Netfilter updates for your net tree, they are: 1) Fix missing initialization of the range structure (allocated in the stack) in nft_masq_{ipv4, ipv6}_eval, from Daniel Borkmann. 2) Make sure the data we receive from userspace contains the req_version structure, otherwise return an error incomplete on truncated input. From Dan Carpenter. 3) Fix handling og skb->sk which may cause incorrect handling of connections from a local process. Via Simon Horman, patch from Calvin Owens. 4) Fix wrong netns in nft_compat when setting target and match params structure. 5) Relax chain type validation in nft_compat that was recently included, this broke the matches that need to be run from the route chain type. Now iptables-test.py automated regression tests report success again and we avoid the only possible problematic case, which is the use of nat targets out of nat chain type. 6) Use match->table to validate the tablename, instead of the match->name. Again patch for nft_compat. 7) Restore the synchronous release of objects from the commit and abort path in nf_tables. This is causing two major problems: splats when using nft_compat, given that matches and targets may sleep and call_rcu is invoked from softirq context. Moreover Patrick reported possible event notification reordering when rules refer to anonymous sets. 8) Fix race condition in between packets that are being confirmed by conntrack and the ctnetlink flush operation. This happens since the removal of the central spinlock. Thanks to Jesper D. Brouer to looking into this. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>