summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2017-07-27sunrpc: use constant time memory comparison for macJason A. Donenfeld
commit 15a8b93fd5690de017ce665382ea45e5d61811a4 upstream. Otherwise, we enable a MAC forgery via timing attack. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Jeff Layton <jlayton@poochiereds.net> Cc: Trond Myklebust <trond.myklebust@primarydata.com> Cc: Anna Schumaker <anna.schumaker@netapp.com> Cc: linux-nfs@vger.kernel.org Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-27ipvs: SNAT packet replies only for NATed connectionsJulian Anastasov
commit 3c5ab3f395d66a9e4e937fcfdf6ebc63894f028b upstream. We do not check if packet from real server is for NAT connection before performing SNAT. This causes problems for setups that use DR/TUN and allow local clients to access the real server directly, for example: - local client in director creates IPVS-DR/TUN connection CIP->VIP and the request packets are routed to RIP. Talks are finished but IPVS connection is not expired yet. - second local client creates non-IPVS connection CIP->RIP with same reply tuple RIP->CIP and when replies are received on LOCAL_IN we wrongly assign them for the first client connection because RIP->CIP matches the reply direction. As result, IPVS SNATs replies for non-IPVS connections. The problem is more visible to local UDP clients but in rare cases it can happen also for TCP or remote clients when the real server sends the reply traffic via the director. So, better to be more precise for the reply traffic. As replies are not expected for DR/TUN connections, better to not touch them. Reported-by: Nick Moriarty <nick.moriarty@york.ac.uk> Tested-by: Nick Moriarty <nick.moriarty@york.ac.uk> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-27af_key: Fix sadb_x_ipsecrequest parsingHerbert Xu
commit 096f41d3a8fcbb8dde7f71379b1ca85fe213eded upstream. The parsing of sadb_x_ipsecrequest is broken in a number of ways. First of all we're not verifying sadb_x_ipsecrequest_len. This is needed when the structure carries addresses at the end. Worse we don't even look at the length when we parse those optional addresses. The migration code had similar parsing code that's better but it also has some deficiencies. The length is overcounted first of all as it includes the header itself. It also fails to check the length before dereferencing the sa_family field. This patch fixes those problems in parse_sockaddr_pair and then uses it in parse_ipsecrequest. Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-27Bluetooth: use constant time memory comparison for secret valuesJason A. Donenfeld
commit 329d82309824ff1082dc4a91a5bbed8c3bec1580 upstream. This file is filled with complex cryptography. Thus, the comparisons of MACs and secret keys and curve points and so forth should not add timing attacks, which could either result in a direct forgery, or, given the complexity, some other type of attack. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-27NFC: Add sockaddr length checks before accessing sa_family in bind handlersMateusz Jurczyk
commit f6a5885fc4d68e7f25ffb42b9d8d80aebb3bacbb upstream. Verify that the caller-provided sockaddr structure is large enough to contain the sa_family field, before accessing it in bind() handlers of the AF_NFC socket. Since the syscall doesn't enforce a minimum size of the corresponding memory region, very short sockaddrs (zero or one byte long) result in operating on uninitialized memory while referencing .sa_family. Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-27nfc: Fix the sockaddr length sanitization in llcp_sock_connectMateusz Jurczyk
commit 608c4adfcabab220142ee335a2a003ccd1c0b25b upstream. Fix the sockaddr length verification in the connect() handler of NFC/LLCP sockets, to compare against the size of the actual structure expected on input (sockaddr_nfc_llcp) instead of its shorter version (sockaddr_nfc). Both structures are defined in include/uapi/linux/nfc.h. The fields specific to the _llcp extended struct are as follows: 276 __u8 dsap; /* Destination SAP, if known */ 277 __u8 ssap; /* Source SAP to be bound to */ 278 char service_name[NFC_LLCP_MAX_SERVICE_NAME]; /* Service name URI */; 279 size_t service_name_len; If the caller doesn't provide a sufficiently long sockaddr buffer, these fields remain uninitialized (and they currently originate from the stack frame of the top-level sys_connect handler). They are then copied by llcp_sock_connect() into internal storage (nfc_llcp_sock structure), and could be subsequently read back through the user-mode getsockname() function (handled by llcp_sock_getname()). This would result in the disclosure of up to ~70 uninitialized bytes from the kernel stack to user-mode clients capable of creating AFC_NFC sockets. Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-27nfc: Ensure presence of required attributes in the activate_target handlerMateusz Jurczyk
commit a0323b979f81ad2deb2c8836eab506534891876a upstream. Check that the NFC_ATTR_TARGET_INDEX and NFC_ATTR_PROTOCOLS attributes (in addition to NFC_ATTR_DEVICE_INDEX) are provided by the netlink client prior to accessing them. This prevents potential unhandled NULL pointer dereference exceptions which can be triggered by malicious user-mode programs, if they omit one or both of these attributes. Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-27NFC: fix broken device allocationJohan Hovold
commit 20777bc57c346b6994f465e0d8261a7fbf213a09 upstream. Commit 7eda8b8e9677 ("NFC: Use IDR library to assing NFC devices IDs") moved device-id allocation and struct-device initialisation from nfc_allocate_device() to nfc_register_device(). This broke just about every nfc-device-registration error path, which continue to call nfc_free_device() that tries to put the device reference of the now uninitialised (but zeroed) struct device: kobject: '(null)' (ce316420): is not initialized, yet kobject_put() is being called. The late struct-device initialisation also meant that various work queues whose names are derived from the nfc device name were also misnamed: 421 root 0 SW< [(null)_nci_cmd_] 422 root 0 SW< [(null)_nci_rx_w] 423 root 0 SW< [(null)_nci_tx_w] Move the id-allocation and struct-device initialisation back to nfc_allocate_device() and fix up the single call site which did not use nfc_free_device() in its error path. Fixes: 7eda8b8e9677 ("NFC: Use IDR library to assing NFC devices IDs") Cc: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-21cfg80211: Check if NAN service ID is of expected sizeSrinivas Dasari
commit 0a27844ce86d039d74221dd56cd8c0349b146b63 upstream. nla policy checks for only maximum length of the attribute data when the attribute type is NLA_BINARY. If userspace sends less data than specified, cfg80211 may access illegal memory. When type is NLA_UNSPEC, nla policy check ensures that userspace sends minimum specified length number of bytes. Remove type assignment to NLA_BINARY from nla_policy of NL80211_NAN_FUNC_SERVICE_ID to make these NLA_UNSPEC and to make sure minimum NL80211_NAN_FUNC_SERVICE_ID_LEN bytes are received from userspace with NL80211_NAN_FUNC_SERVICE_ID. Fixes: a442b761b24 ("cfg80211: add add_nan_func / del_nan_func") Signed-off-by: Srinivas Dasari <dasaris@qti.qualcomm.com> Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-21cfg80211: Check if PMKID attribute is of expected sizeSrinivas Dasari
commit 9361df14d1cbf966409d5d6f48bb334384fbe138 upstream. nla policy checks for only maximum length of the attribute data when the attribute type is NLA_BINARY. If userspace sends less data than specified, the wireless drivers may access illegal memory. When type is NLA_UNSPEC, nla policy check ensures that userspace sends minimum specified length number of bytes. Remove type assignment to NLA_BINARY from nla_policy of NL80211_ATTR_PMKID to make this NLA_UNSPEC and to make sure minimum WLAN_PMKID_LEN bytes are received from userspace with NL80211_ATTR_PMKID. Fixes: 67fbb16be69d ("nl80211: PMKSA caching support") Signed-off-by: Srinivas Dasari <dasaris@qti.qualcomm.com> Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-21cfg80211: Validate frequencies nested in NL80211_ATTR_SCAN_FREQUENCIESSrinivas Dasari
commit d7f13f7450369281a5d0ea463cc69890a15923ae upstream. validate_scan_freqs() retrieves frequencies from attributes nested in the attribute NL80211_ATTR_SCAN_FREQUENCIES with nla_get_u32(), which reads 4 bytes from each attribute without validating the size of data received. Attributes nested in NL80211_ATTR_SCAN_FREQUENCIES don't have an nla policy. Validate size of each attribute before parsing to avoid potential buffer overread. Fixes: 2a519311926 ("cfg80211/nl80211: scanning (and mac80211 update to use it)") Signed-off-by: Srinivas Dasari <dasaris@qti.qualcomm.com> Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-21cfg80211: Define nla_policy for NL80211_ATTR_LOCAL_MESH_POWER_MODESrinivas Dasari
commit 8feb69c7bd89513be80eb19198d48f154b254021 upstream. Buffer overread may happen as nl80211_set_station() reads 4 bytes from the attribute NL80211_ATTR_LOCAL_MESH_POWER_MODE without validating the size of data received when userspace sends less than 4 bytes of data with NL80211_ATTR_LOCAL_MESH_POWER_MODE. Define nla_policy for NL80211_ATTR_LOCAL_MESH_POWER_MODE to avoid the buffer overread. Fixes: 3b1c5a5307f ("{cfg,nl}80211: mesh power mode primitives and userspace access") Signed-off-by: Srinivas Dasari <dasaris@qti.qualcomm.com> Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-21rds: tcp: use sock_create_lite() to create the accept socketSowmini Varadhan
commit 0933a578cd55b02dc80f219dc8f2efb17ec61c9a upstream. There are two problems with calling sock_create_kern() from rds_tcp_accept_one() 1. it sets up a new_sock->sk that is wasteful, because this ->sk is going to get replaced by inet_accept() in the subsequent ->accept() 2. The new_sock->sk is a leaked reference in sock_graft() which expects to find a null parent->sk Avoid these problems by calling sock_create_lite(). Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-21net: ipv6: Compare lwstate in detecting duplicate nexthopsDavid Ahern
commit f06b7549b79e29a672336d4e134524373fb7a232 upstream. Lennert reported a failure to add different mpls encaps in a multipath route: $ ip -6 route add 1234::/16 \ nexthop encap mpls 10 via fe80::1 dev ens3 \ nexthop encap mpls 20 via fe80::1 dev ens3 RTNETLINK answers: File exists The problem is that the duplicate nexthop detection does not compare lwtunnel configuration. Add it. Fixes: 19e42e451506 ("ipv6: support for fib route lwtunnel encap attributes") Signed-off-by: David Ahern <dsahern@gmail.com> Reported-by: João Taveira Araújo <joao.taveira@gmail.com> Reported-by: Lennert Buytenhek <buytenh@wantstofly.org> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Tested-by: Lennert Buytenhek <buytenh@wantstofly.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-21net: core: Fix slab-out-of-bounds in netdev_stats_to_stats64Alban Browaeys
commit 9af9959e142c274f4a30fefb71d97d2b028b337f upstream. commit 9256645af098 ("net/core: relax BUILD_BUG_ON in netdev_stats_to_stats64") made an attempt to read beyond the size of the source a possibility. Fix to only copy src size to dest. As dest might be bigger than src. ================================================================== BUG: KASAN: slab-out-of-bounds in netdev_stats_to_stats64+0xe/0x30 at addr ffff8801be248b20 Read of size 192 by task VBoxNetAdpCtl/6734 CPU: 1 PID: 6734 Comm: VBoxNetAdpCtl Tainted: G O 4.11.4prahal+intel+ #118 Hardware name: LENOVO 20CDCTO1WW/20CDCTO1WW, BIOS GQET52WW (1.32 ) 05/04/2017 Call Trace: dump_stack+0x63/0x86 kasan_object_err+0x1c/0x70 kasan_report+0x270/0x520 ? netdev_stats_to_stats64+0xe/0x30 ? sched_clock_cpu+0x1b/0x190 ? __module_address+0x3e/0x3b0 ? unwind_next_frame+0x1ea/0xb00 check_memory_region+0x13c/0x1a0 memcpy+0x23/0x50 netdev_stats_to_stats64+0xe/0x30 dev_get_stats+0x1b9/0x230 rtnl_fill_stats+0x44/0xc00 ? nla_put+0xc6/0x130 rtnl_fill_ifinfo+0xe9e/0x3700 ? rtnl_fill_vfinfo+0xde0/0xde0 ? sched_clock+0x9/0x10 ? sched_clock+0x9/0x10 ? sched_clock_local+0x120/0x130 ? __module_address+0x3e/0x3b0 ? unwind_next_frame+0x1ea/0xb00 ? sched_clock+0x9/0x10 ? sched_clock+0x9/0x10 ? sched_clock_cpu+0x1b/0x190 ? VBoxNetAdpLinuxIOCtlUnlocked+0x14b/0x280 [vboxnetadp] ? depot_save_stack+0x1d8/0x4a0 ? depot_save_stack+0x34f/0x4a0 ? depot_save_stack+0x34f/0x4a0 ? save_stack+0xb1/0xd0 ? save_stack_trace+0x16/0x20 ? save_stack+0x46/0xd0 ? kasan_slab_alloc+0x12/0x20 ? __kmalloc_node_track_caller+0x10d/0x350 ? __kmalloc_reserve.isra.36+0x2c/0xc0 ? __alloc_skb+0xd0/0x560 ? rtmsg_ifinfo_build_skb+0x61/0x120 ? rtmsg_ifinfo.part.25+0x16/0xb0 ? rtmsg_ifinfo+0x47/0x70 ? register_netdev+0x15/0x30 ? vboxNetAdpOsCreate+0xc0/0x1c0 [vboxnetadp] ? vboxNetAdpCreate+0x210/0x400 [vboxnetadp] ? VBoxNetAdpLinuxIOCtlUnlocked+0x14b/0x280 [vboxnetadp] ? do_vfs_ioctl+0x17f/0xff0 ? SyS_ioctl+0x74/0x80 ? do_syscall_64+0x182/0x390 ? __alloc_skb+0xd0/0x560 ? __alloc_skb+0xd0/0x560 ? save_stack_trace+0x16/0x20 ? init_object+0x64/0xa0 ? ___slab_alloc+0x1ae/0x5c0 ? ___slab_alloc+0x1ae/0x5c0 ? __alloc_skb+0xd0/0x560 ? sched_clock+0x9/0x10 ? kasan_unpoison_shadow+0x35/0x50 ? kasan_kmalloc+0xad/0xe0 ? __kmalloc_node_track_caller+0x246/0x350 ? __alloc_skb+0xd0/0x560 ? kasan_unpoison_shadow+0x35/0x50 ? memset+0x31/0x40 ? __alloc_skb+0x31f/0x560 ? napi_consume_skb+0x320/0x320 ? br_get_link_af_size_filtered+0xb7/0x120 [bridge] ? if_nlmsg_size+0x440/0x630 rtmsg_ifinfo_build_skb+0x83/0x120 rtmsg_ifinfo.part.25+0x16/0xb0 rtmsg_ifinfo+0x47/0x70 register_netdevice+0xa2b/0xe50 ? __kmalloc+0x171/0x2d0 ? netdev_change_features+0x80/0x80 register_netdev+0x15/0x30 vboxNetAdpOsCreate+0xc0/0x1c0 [vboxnetadp] vboxNetAdpCreate+0x210/0x400 [vboxnetadp] ? vboxNetAdpComposeMACAddress+0x1d0/0x1d0 [vboxnetadp] ? kasan_check_write+0x14/0x20 VBoxNetAdpLinuxIOCtlUnlocked+0x14b/0x280 [vboxnetadp] ? VBoxNetAdpLinuxOpen+0x20/0x20 [vboxnetadp] ? lock_acquire+0x11c/0x270 ? __audit_syscall_entry+0x2fb/0x660 do_vfs_ioctl+0x17f/0xff0 ? __audit_syscall_entry+0x2fb/0x660 ? ioctl_preallocate+0x1d0/0x1d0 ? __audit_syscall_entry+0x2fb/0x660 ? kmem_cache_free+0xb2/0x250 ? syscall_trace_enter+0x537/0xd00 ? exit_to_usermode_loop+0x100/0x100 SyS_ioctl+0x74/0x80 ? do_sys_open+0x350/0x350 ? do_vfs_ioctl+0xff0/0xff0 do_syscall_64+0x182/0x390 entry_SYSCALL64_slow_path+0x25/0x25 RIP: 0033:0x7f7e39a1ae07 RSP: 002b:00007ffc6f04c6d8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007ffc6f04c730 RCX: 00007f7e39a1ae07 RDX: 00007ffc6f04c730 RSI: 00000000c0207601 RDI: 0000000000000007 RBP: 00007ffc6f04c700 R08: 00007ffc6f04c780 R09: 0000000000000008 R10: 0000000000000541 R11: 0000000000000206 R12: 0000000000000007 R13: 00000000c0207601 R14: 00007ffc6f04c730 R15: 0000000000000012 Object at ffff8801be248008, in cache kmalloc-4096 size: 4096 Allocated: PID = 6734 save_stack_trace+0x16/0x20 save_stack+0x46/0xd0 kasan_kmalloc+0xad/0xe0 __kmalloc+0x171/0x2d0 alloc_netdev_mqs+0x8a7/0xbe0 vboxNetAdpOsCreate+0x65/0x1c0 [vboxnetadp] vboxNetAdpCreate+0x210/0x400 [vboxnetadp] VBoxNetAdpLinuxIOCtlUnlocked+0x14b/0x280 [vboxnetadp] do_vfs_ioctl+0x17f/0xff0 SyS_ioctl+0x74/0x80 do_syscall_64+0x182/0x390 return_from_SYSCALL_64+0x0/0x6a Freed: PID = 5600 save_stack_trace+0x16/0x20 save_stack+0x46/0xd0 kasan_slab_free+0x73/0xc0 kfree+0xe4/0x220 kvfree+0x25/0x30 single_release+0x74/0xb0 __fput+0x265/0x6b0 ____fput+0x9/0x10 task_work_run+0xd5/0x150 exit_to_usermode_loop+0xe2/0x100 do_syscall_64+0x26c/0x390 return_from_SYSCALL_64+0x0/0x6a Memory state around the buggy address: ffff8801be248a80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff8801be248b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff8801be248b80: 00 00 00 00 00 00 00 00 00 00 00 07 fc fc fc fc ^ ffff8801be248c00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff8801be248c80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ================================================================== Signed-off-by: Alban Browaeys <alban.browaeys@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-21ipv6: dad: don't remove dynamic addresses if link is downSabrina Dubroca
commit ec8add2a4c9df723c94a863b8fcd6d93c472deed upstream. Currently, when the link for $DEV is down, this command succeeds but the address is removed immediately by DAD (1): ip addr add 1111::12/64 dev $DEV valid_lft 3600 preferred_lft 1800 In the same situation, this will succeed and not remove the address (2): ip addr add 1111::12/64 dev $DEV ip addr change 1111::12/64 dev $DEV valid_lft 3600 preferred_lft 1800 The comment in addrconf_dad_begin() when !IF_READY makes it look like this is the intended behavior, but doesn't explain why: * If the device is not ready: * - keep it tentative if it is a permanent address. * - otherwise, kill it. We clearly cannot prevent userspace from doing (2), but we can make (1) work consistently with (2). addrconf_dad_stop() is only called in two cases: if DAD failed, or to skip DAD when the link is down. In that second case, the fix is to avoid deleting the address, like we already do for permanent addresses. Fixes: 3c21edbd1137 ("[IPV6]: Defer IPv6 device initialization until the link becomes ready.") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-21net: handle NAPI_GRO_FREE_STOLEN_HEAD case also in napi_frags_finish()Michal Kubeček
commit e44699d2c28067f69698ccb68dd3ddeacfebc434 upstream. Recently I started seeing warnings about pages with refcount -1. The problem was traced to packets being reused after their head was merged into a GRO packet by skb_gro_receive(). While bisecting the issue pointed to commit c21b48cc1bbf ("net: adjust skb->truesize in ___pskb_trim()") and I have never seen it on a kernel with it reverted, I believe the real problem appeared earlier when the option to merge head frag in GRO was implemented. Handling NAPI_GRO_FREE_STOLEN_HEAD state was only added to GRO_MERGED_FREE branch of napi_skb_finish() so that if the driver uses napi_gro_frags() and head is merged (which in my case happens after the skb_condense() call added by the commit mentioned above), the skb is reused including the head that has been merged. As a result, we release the page reference twice and eventually end up with negative page refcount. To fix the problem, handle NAPI_GRO_FREE_STOLEN_HEAD in napi_frags_finish() the same way it's done in napi_skb_finish(). Fixes: d7e8883cfcf4 ("net: make GRO aware of skb->head_frag") Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-21bridge: mdb: fix leak on complete_info ptr on fail pathEduardo Valentin
commit 1bfb159673957644951ab0a8d2aec44b93ddb1ae upstream. We currently get the following kmemleak report: unreferenced object 0xffff8800039d9820 (size 32): comm "softirq", pid 0, jiffies 4295212383 (age 792.416s) hex dump (first 32 bytes): 00 0c e0 03 00 88 ff ff ff 02 00 00 00 00 00 00 ................ 00 00 00 01 ff 11 00 02 86 dd 00 00 ff ff ff ff ................ backtrace: [<ffffffff8152b4aa>] kmemleak_alloc+0x4a/0xa0 [<ffffffff811d8ec8>] kmem_cache_alloc_trace+0xb8/0x1c0 [<ffffffffa0389683>] __br_mdb_notify+0x2a3/0x300 [bridge] [<ffffffffa038a0ce>] br_mdb_notify+0x6e/0x70 [bridge] [<ffffffffa0386479>] br_multicast_add_group+0x109/0x150 [bridge] [<ffffffffa0386518>] br_ip6_multicast_add_group+0x58/0x60 [bridge] [<ffffffffa0387fb5>] br_multicast_rcv+0x1d5/0xdb0 [bridge] [<ffffffffa037d7cf>] br_handle_frame_finish+0xcf/0x510 [bridge] [<ffffffffa03a236b>] br_nf_hook_thresh.part.27+0xb/0x10 [br_netfilter] [<ffffffffa03a3738>] br_nf_hook_thresh+0x48/0xb0 [br_netfilter] [<ffffffffa03a3fb9>] br_nf_pre_routing_finish_ipv6+0x109/0x1d0 [br_netfilter] [<ffffffffa03a4400>] br_nf_pre_routing_ipv6+0xd0/0x14c [br_netfilter] [<ffffffffa03a3c27>] br_nf_pre_routing+0x197/0x3d0 [br_netfilter] [<ffffffff814a2952>] nf_iterate+0x52/0x60 [<ffffffff814a29bc>] nf_hook_slow+0x5c/0xb0 [<ffffffffa037ddf4>] br_handle_frame+0x1a4/0x2c0 [bridge] This happens when switchdev_port_obj_add() fails. This patch frees complete_info object in the fail path. Reviewed-by: Vallish Vaidyeshwara <vallish@amazon.com> Signed-off-by: Eduardo Valentin <eduval@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-21net: prevent sign extension in dev_get_stats()Eric Dumazet
commit 6f64ec74515925cced6df4571638b5a099a49aae upstream. Similar to the fix provided by Dominik Heidler in commit 9b3dc0a17d73 ("l2tp: cast l2tp traffic counter to unsigned") we need to take care of 32bit kernels in dev_get_stats(). When using atomic_long_read(), we add a 'long' to u64 and might misinterpret high order bit, unless we cast to unsigned. Fixes: caf586e5f23ce ("net: add a core netdev->rx_dropped counter") Fixes: 015f0688f57ca ("net: net: add a core netdev->tx_dropped counter") Fixes: 6e7333d315a76 ("net: add rx_nohandler stat counter") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-21tcp: reset sk_rx_dst in tcp_disconnect()WANG Cong
commit d747a7a51b00984127a88113cdbbc26f91e9d815 upstream. We have to reset the sk->sk_rx_dst when we disconnect a TCP connection, because otherwise when we re-connect it this dst reference is simply overridden in tcp_finish_connect(). This fixes a dst leak which leads to a loopback dev refcnt leak. It is a long-standing bug, Kevin reported a very similar (if not same) bug before. Thanks to Andrei for providing such a reliable reproducer which greatly narrows down the problem. Fixes: 41063e9dd119 ("ipv4: Early TCP socket demux.") Reported-by: Andrei Vagin <avagin@gmail.com> Reported-by: Kevin Xu <kaiwen.xu@hulu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-21ipv6: avoid unregistering inet6_dev for loopbackWANG Cong
commit 60abc0be96e00ca71bac083215ac91ad2e575096 upstream. The per netns loopback_dev->ip6_ptr is unregistered and set to NULL when its mtu is set to smaller than IPV6_MIN_MTU, this leads to that we could set rt->rt6i_idev NULL after a rt6_uncached_list_flush_dev() and then crash after another call. In this case we should just bring its inet6_dev down, rather than unregistering it, at least prior to commit 176c39af29bc ("netns: fix addrconf_ifdown kernel panic") we always override the case for loopback. Thanks a lot to Andrey for finding a reliable reproducer. Fixes: 176c39af29bc ("netns: fix addrconf_ifdown kernel panic") Reported-by: Andrey Konovalov <andreyknvl@google.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Daniel Lezcano <dlezcano@fr.ibm.com> Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: David Ahern <dsahern@gmail.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-21net: sched: Fix one possible panic when no destroy callbackGao Feng
commit c1a4872ebfb83b1af7144f7b29ac8c4b344a12a8 upstream. When qdisc fail to init, qdisc_create would invoke the destroy callback to cleanup. But there is no check if the callback exists really. So it would cause the panic if there is no real destroy callback like the qdisc codel, fq, and so on. Take codel as an example following: When a malicious user constructs one invalid netlink msg, it would cause codel_init->codel_change->nla_parse_nested failed. Then kernel would invoke the destroy callback directly but qdisc codel doesn't define one. It causes one panic as a result. Now add one the check for destroy to avoid the possible panic. Fixes: 87b60cfacf9f ("net_sched: fix error recovery at qdisc creation") Signed-off-by: Gao Feng <gfree.wind@vip.163.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-21net_sched: fix error recovery at qdisc creationEric Dumazet
commit 87b60cfacf9f17cf71933c6e33b66e68160af71d upstream. Dmitry reported uses after free in qdisc code [1] The problem here is that ops->init() can return an error. qdisc_create_dflt() then call ops->destroy(), while qdisc_create() does _not_ call it. Four qdisc chose to call their own ops->destroy(), assuming their caller would not. This patch makes sure qdisc_create() calls ops->destroy() and fixes the four qdisc to avoid double free. [1] BUG: KASAN: use-after-free in mq_destroy+0x242/0x290 net/sched/sch_mq.c:33 at addr ffff8801d415d440 Read of size 8 by task syz-executor2/5030 CPU: 0 PID: 5030 Comm: syz-executor2 Not tainted 4.3.5-smp-DEV #119 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 0000000000000046 ffff8801b435b870 ffffffff81bbbed4 ffff8801db000400 ffff8801d415d440 ffff8801d415dc40 ffff8801c4988510 ffff8801b435b898 ffffffff816682b1 ffff8801b435b928 ffff8801d415d440 ffff8801c49880c0 Call Trace: [<ffffffff81bbbed4>] __dump_stack lib/dump_stack.c:15 [inline] [<ffffffff81bbbed4>] dump_stack+0x6c/0x98 lib/dump_stack.c:51 [<ffffffff816682b1>] kasan_object_err+0x21/0x70 mm/kasan/report.c:158 [<ffffffff81668524>] print_address_description mm/kasan/report.c:196 [inline] [<ffffffff81668524>] kasan_report_error+0x1b4/0x4b0 mm/kasan/report.c:285 [<ffffffff81668953>] kasan_report mm/kasan/report.c:305 [inline] [<ffffffff81668953>] __asan_report_load8_noabort+0x43/0x50 mm/kasan/report.c:326 [<ffffffff82527b02>] mq_destroy+0x242/0x290 net/sched/sch_mq.c:33 [<ffffffff82524bdd>] qdisc_destroy+0x12d/0x290 net/sched/sch_generic.c:953 [<ffffffff82524e30>] qdisc_create_dflt+0xf0/0x120 net/sched/sch_generic.c:848 [<ffffffff8252550d>] attach_default_qdiscs net/sched/sch_generic.c:1029 [inline] [<ffffffff8252550d>] dev_activate+0x6ad/0x880 net/sched/sch_generic.c:1064 [<ffffffff824b1db1>] __dev_open+0x221/0x320 net/core/dev.c:1403 [<ffffffff824b24ce>] __dev_change_flags+0x15e/0x3e0 net/core/dev.c:6858 [<ffffffff824b27de>] dev_change_flags+0x8e/0x140 net/core/dev.c:6926 [<ffffffff824f5bf6>] dev_ifsioc+0x446/0x890 net/core/dev_ioctl.c:260 [<ffffffff824f61fa>] dev_ioctl+0x1ba/0xb80 net/core/dev_ioctl.c:546 [<ffffffff82430509>] sock_do_ioctl+0x99/0xb0 net/socket.c:879 [<ffffffff82430d30>] sock_ioctl+0x2a0/0x390 net/socket.c:958 [<ffffffff816f3b68>] vfs_ioctl fs/ioctl.c:44 [inline] [<ffffffff816f3b68>] do_vfs_ioctl+0x8a8/0xe50 fs/ioctl.c:611 [<ffffffff816f41a4>] SYSC_ioctl fs/ioctl.c:626 [inline] [<ffffffff816f41a4>] SyS_ioctl+0x94/0xc0 fs/ioctl.c:617 [<ffffffff8123e357>] entry_SYSCALL_64_fastpath+0x12/0x17 Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05netfilter: use skb_to_full_sk in ip_route_me_harderFlorian Westphal
commit 29e09229d9f26129a39462fae0ddabc4d9533989 upstream. inet_sk(skb->sk) is illegal in case skb is attached to request socket. Fixes: ca6fb0651883 ("tcp: attach SYNACK messages to request sockets instead of listener") Reported by: Daniel J Blueman <daniel@quora.org> Signed-off-by: Florian Westphal <fw@strlen.de> Tested-by: Daniel J Blueman <daniel@quora.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05xfrm: Oops on error in pfkey_msg2xfrm_state()Dan Carpenter
commit 1e3d0c2c70cd3edb5deed186c5f5c75f2b84a633 upstream. There are some missing error codes here so we accidentally return NULL instead of an error pointer. It results in a NULL pointer dereference. Fixes: df71837d5024 ("[LSM-IPSec]: Security association restriction.") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05xfrm: NULL dereference on allocation failureDan Carpenter
commit e747f64336fc15e1c823344942923195b800aa1e upstream. The default error code in pfkey_msg2xfrm_state() is -ENOBUFS. We added a new call to security_xfrm_state_alloc() which sets "err" to zero so there several places where we can return ERR_PTR(0) if kmalloc() fails. The caller is expecting error pointers so it leads to a NULL dereference. Fixes: df71837d5024 ("[LSM-IPSec]: Security association restriction.") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05xfrm: fix stack access out of bounds with CONFIG_XFRM_SUB_POLICYSabrina Dubroca
commit 9b3eb54106cf6acd03f07cf0ab01c13676a226c2 upstream. When CONFIG_XFRM_SUB_POLICY=y, xfrm_dst stores a copy of the flowi for that dst. Unfortunately, the code that allocates and fills this copy doesn't care about what type of flowi (flowi, flowi4, flowi6) gets passed. In multiple code paths (from raw_sendmsg, from TCP when replying to a FIN, in vxlan, geneve, and gre), the flowi that gets passed to xfrm is actually an on-stack flowi4, so we end up reading stuff from the stack past the end of the flowi4 struct. Since xfrm_dst->origin isn't used anywhere following commit ca116922afa8 ("xfrm: Eliminate "fl" and "pol" args to xfrm_bundle_ok()."), just get rid of it. xfrm_dst->partner isn't used either, so get rid of that too. Fixes: 9d6ec938019c ("ipv4: Use flowi4 in public route lookup interfaces.") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05ip6_tunnel, ip6_gre: fix setting of DSCP on encapsulated packetsPeter Dawson
[ Upstream commit 0e9a709560dbcfbace8bf4019dc5298619235891 ] This fix addresses two problems in the way the DSCP field is formulated on the encapsulating header of IPv6 tunnels. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195661 1) The IPv6 tunneling code was manipulating the DSCP field of the encapsulating packet using the 32b flowlabel. Since the flowlabel is only the lower 20b it was incorrect to assume that the upper 12b containing the DSCP and ECN fields would remain intact when formulating the encapsulating header. This fix handles the 'inherit' and 'fixed-value' DSCP cases explicitly using the extant dsfield u8 variable. 2) The use of INET_ECN_encapsulate(0, dsfield) in ip6_tnl_xmit was incorrect and resulted in the DSCP value always being set to 0. Commit 90427ef5d2a4 ("ipv6: fix flow labels when the traffic class is non-0") caused the regression by masking out the flowlabel which exposed the incorrect handling of the DSCP portion of the flowlabel in ip6_tunnel and ip6_gre. Fixes: 90427ef5d2a4 ("ipv6: fix flow labels when the traffic class is non-0") Signed-off-by: Peter Dawson <peter.a.dawson@boeing.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05sctp: check af before verify address in sctp_addr_id2transportXin Long
[ Upstream commit 912964eacb111551db73429719eb5fadcab0ff8a ] Commit 6f29a1306131 ("sctp: sctp_addr_id2transport should verify the addr before looking up assoc") invoked sctp_verify_addr to verify the addr. But it didn't check af variable beforehand, once users pass an address with family = 0 through sockopt, sctp_get_af_specific will return NULL and NULL pointer dereference will be caused by af->sockaddr_len. This patch is to fix it by returning NULL if af variable is NULL. Fixes: 6f29a1306131 ("sctp: sctp_addr_id2transport should verify the addr before looking up assoc") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05tipc: allocate user memory with GFP_KERNEL flagParthasarathy Bhuvaragan
[ Upstream commit 57d5f64d83ab5b5a5118b1597386dd76eaf4340d ] Until now, we allocate memory always with GFP_ATOMIC flag. When the system is under memory pressure and a user tries to send, the send fails due to low memory. However, the user application can wait for free memory if we allocate it using GFP_KERNEL flag. In this commit, we use allocate memory with GFP_KERNEL for all user allocation. Reported-by: Rune Torgersen <runet@innovsys.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05mac80211: initialize SMPS field in HT capabilitiesFelix Fietkau
[ Upstream commit 43071d8fb3b7f589d72663c496a6880fb097533c ] ibss and mesh modes copy the ht capabilites from the band without overriding the SMPS state. Unfortunately the default value 0 for the SMPS field means static SMPS instead of disabled. This results in HT ibss and mesh setups using only single-stream rates, even though SMPS is not supposed to be active. Initialize SMPS to disabled for all bands on ieee80211_hw_register to ensure that the value is sane where it is not overriden with the real SMPS state. Reported-by: Elektra Wagenrad <onelektra@gmx.net> Signed-off-by: Felix Fietkau <nbd@nbd.name> [move VHT TODO comment to a better place] Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05mac80211: recalculate min channel width on VHT opmode changesJohannes Berg
[ Upstream commit d2941df8fbd9708035d66d889ada4d3d160170ce ] When an associated station changes its VHT operating mode this can/will affect the bandwidth it's using, and consequently we must recalculate the minimum bandwidth we need to use. Failure to do so can lead to one of two scenarios: 1) we use a too high bandwidth, this is benign 2) we use a too narrow bandwidth, causing rate control and actual PHY configuration to be out of sync, which can in turn cause problems/crashes Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05ip6_tunnel: must reload ipv6h in ip6ip6_tnl_xmit()Eric Dumazet
[ Upstream commit 21b995a9cb093fff33ec91d7cb3822b882a90a1e ] Since ip6_tnl_parse_tlv_enc_lim() can call pskb_may_pull(), we must reload any pointer that was related to skb->head (or skb->data), or risk use after free. Fixes: c12b395a4664 ("gre: Support GRE over IPv6") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Dmitry Kozlov <xeb@mail.ru> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05net: dsa: Check return value of phy_connect_direct()Florian Fainelli
[ Upstream commit 4078b76cac68e50ccf1f76a74e7d3d5788aec3fe ] We need to check the return value of phy_connect_direct() in dsa_slave_phy_connect() otherwise we may be continuing the initialization of a slave network device with a PHY that already attached somewhere else and which will soon be in error because the PHY device is in error. The conditions for such an error to occur are that we have a port of our switch that is not disabled, and has the same port number as a PHY address (say both 5) that can be probed using the DSA slave MII bus. We end-up having this slave network device find a PHY at the same address as our port number, and we try to attach to it. A slave network (e.g: port 0) has already attached to our PHY device, and we try to re-attach it with a different network device, but since we ignore the error we would end-up initializating incorrect device references by the time the slave network interface is opened. The code has been (re)organized several times, making it hard to provide an exact Fixes tag, this is a bugfix nonetheless. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05net: ethtool: Initialize buffer when querying device channel settingsEran Ben Elisha
[ Upstream commit 31a86d137219373c3222ca5f4f912e9a4d8065bb ] Ethtool channels respond struct was uninitialized when querying device channel boundaries settings. As a result, unreported fields by the driver hold garbage. This may cause sending unsupported params to driver. Fixes: 8bf368620486 ('ethtool: ensure channel counts are within bounds ...') Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> CC: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05l2tp: take a reference on sessions used in genetlink handlersGuillaume Nault
commit 2777e2ab5a9cf2b4524486c6db1517a6ded25261 upstream. Callers of l2tp_nl_session_find() need to hold a reference on the returned session since there's no guarantee that it isn't going to disappear from under them. Relying on the fact that no l2tp netlink message may be processed concurrently isn't enough: sessions can be deleted by other means (e.g. by closing the PPPOL2TP socket of a ppp pseudowire). l2tp_nl_cmd_session_delete() is a bit special: it runs a callback function that may require a previous call to session->ref(). In particular, for ppp pseudowires, the callback is l2tp_session_delete(), which then calls pppol2tp_session_close() and dereferences the PPPOL2TP socket. The socket might already be gone at the moment l2tp_session_delete() calls session->ref(), so we need to take a reference during the session lookup. So we need to pass the do_ref variable down to l2tp_session_get() and l2tp_session_get_by_ifname(). Since all callers have to be updated, l2tp_session_find_by_ifname() and l2tp_nl_session_find() are renamed to reflect their new behaviour. Fixes: 309795f4bec2 ("l2tp: Add netlink control API for L2TP") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05l2tp: hold session while sending creation notificationsGuillaume Nault
commit 5e6a9e5a3554a5b3db09cdc22253af1849c65dff upstream. l2tp_session_find() doesn't take any reference on the returned session. Therefore, the session may disappear while sending the notification. Use l2tp_session_get() instead and decrement session's refcount once the notification is sent. Fixes: 33f72e6f0c67 ("l2tp : multicast notification to the registered listeners") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05l2tp: fix duplicate session creationGuillaume Nault
commit dbdbc73b44782e22b3b4b6e8b51e7a3d245f3086 upstream. l2tp_session_create() relies on its caller for checking for duplicate sessions. This is racy since a session can be concurrently inserted after the caller's verification. Fix this by letting l2tp_session_create() verify sessions uniqueness upon insertion. Callers need to be adapted to check for l2tp_session_create()'s return code instead of calling l2tp_session_find(). pppol2tp_connect() is a bit special because it has to work on existing sessions (if they're not connected) or to create a new session if none is found. When acting on a preexisting session, a reference must be held or it could go away on us. So we have to use l2tp_session_get() instead of l2tp_session_find() and drop the reference before exiting. Fixes: d9e31d17ceba ("l2tp: Add L2TP ethernet pseudowire support") Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05l2tp: ensure session can't get removed during pppol2tp_session_ioctl()Guillaume Nault
commit 57377d63547861919ee634b845c7caa38de4a452 upstream. Holding a reference on session is required before calling pppol2tp_session_ioctl(). The session could get freed while processing the ioctl otherwise. Since pppol2tp_session_ioctl() uses the session's socket, we also need to take a reference on it in l2tp_session_get(). Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05l2tp: fix race in l2tp_recv_common()Guillaume Nault
commit 61b9a047729bb230978178bca6729689d0c50ca2 upstream. Taking a reference on sessions in l2tp_recv_common() is racy; this has to be done by the callers. To this end, a new function is required (l2tp_session_get()) to atomically lookup a session and take a reference on it. Callers then have to manually drop this reference. Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05netfilter: synproxy: fix conntrackd interactionEric Leblond
commit 87e94dbc210a720a34be5c1174faee5c84be963e upstream. This patch fixes the creation of connection tracking entry from netlink when synproxy is used. It was missing the addition of the synproxy extension. This was causing kernel crashes when a conntrack entry created by conntrackd was used after the switch of traffic from active node to the passive node. Signed-off-by: Eric Leblond <eric@regit.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05netfilter: xt_TCPMSS: add more sanity tests on tcph->doffEric Dumazet
commit 2638fd0f92d4397884fd991d8f4925cb3f081901 upstream. Denys provided an awesome KASAN report pointing to an use after free in xt_TCPMSS I have provided three patches to fix this issue, either in xt_TCPMSS or in xt_tcpudp.c. It seems xt_TCPMSS patch has the smallest possible impact. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Denys Fedoryshchenko <nuclearcat@nuclearcat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05rtnetlink: add IFLA_GROUP to ifla_policySerhey Popovych
[ Upstream commit db833d40ad3263b2ee3b59a1ba168bb3cfed8137 ] Network interface groups support added while ago, however there is no IFLA_GROUP attribute description in policy and netlink message size calculations until now. Add IFLA_GROUP attribute to the policy. Fixes: cbda10fa97d7 ("net_device: add support for network device groups") Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05ipv6: Do not leak throw route referencesSerhey Popovych
[ Upstream commit 07f615574f8ac499875b21c1142f26308234a92c ] While commit 73ba57bfae4a ("ipv6: fix backtracking for throw routes") does good job on error propagation to the fib_rules_lookup() in fib rules core framework that also corrects throw routes handling, it does not solve route reference leakage problem happened when we return -EAGAIN to the fib_rules_lookup() and leave routing table entry referenced in arg->result. If rule with matched throw route isn't last matched in the list we overwrite arg->result losing reference on throw route stored previously forever. We also partially revert commit ab997ad40839 ("ipv6: fix the incorrect return value of throw route") since we never return routing table entry with dst.error == -EAGAIN when CONFIG_IPV6_MULTIPLE_TABLES is on. Also there is no point to check for RTF_REJECT flag since it is always set throw route. Fixes: 73ba57bfae4a ("ipv6: fix backtracking for throw routes") Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05net: 8021q: Fix one possible panic caused by BUG_ON in free_netdevGao Feng
[ Upstream commit 9745e362add89432d2c951272a99b0a5fe4348a9 ] The register_vlan_device would invoke free_netdev directly, when register_vlan_dev failed. It would trigger the BUG_ON in free_netdev if the dev was already registered. In this case, the netdev would be freed in netdev_run_todo later. So add one condition check now. Only when dev is not registered, then free it directly. The following is the part coredump when netdev_upper_dev_link failed in register_vlan_dev. I removed the lines which are too long. [ 411.237457] ------------[ cut here ]------------ [ 411.237458] kernel BUG at net/core/dev.c:7998! [ 411.237484] invalid opcode: 0000 [#1] SMP [ 411.237705] [last unloaded: 8021q] [ 411.237718] CPU: 1 PID: 12845 Comm: vconfig Tainted: G E 4.12.0-rc5+ #6 [ 411.237737] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015 [ 411.237764] task: ffff9cbeb6685580 task.stack: ffffa7d2807d8000 [ 411.237782] RIP: 0010:free_netdev+0x116/0x120 [ 411.237794] RSP: 0018:ffffa7d2807dbdb0 EFLAGS: 00010297 [ 411.237808] RAX: 0000000000000002 RBX: ffff9cbeb6ba8fd8 RCX: 0000000000001878 [ 411.237826] RDX: 0000000000000001 RSI: 0000000000000282 RDI: 0000000000000000 [ 411.237844] RBP: ffffa7d2807dbdc8 R08: 0002986100029841 R09: 0002982100029801 [ 411.237861] R10: 0004000100029980 R11: 0004000100029980 R12: ffff9cbeb6ba9000 [ 411.238761] R13: ffff9cbeb6ba9060 R14: ffff9cbe60f1a000 R15: ffff9cbeb6ba9000 [ 411.239518] FS: 00007fb690d81700(0000) GS:ffff9cbebb640000(0000) knlGS:0000000000000000 [ 411.239949] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 411.240454] CR2: 00007f7115624000 CR3: 0000000077cdf000 CR4: 00000000003406e0 [ 411.240936] Call Trace: [ 411.241462] vlan_ioctl_handler+0x3f1/0x400 [8021q] [ 411.241910] sock_ioctl+0x18b/0x2c0 [ 411.242394] do_vfs_ioctl+0xa1/0x5d0 [ 411.242853] ? sock_alloc_file+0xa6/0x130 [ 411.243465] SyS_ioctl+0x79/0x90 [ 411.243900] entry_SYSCALL_64_fastpath+0x1e/0xa9 [ 411.244425] RIP: 0033:0x7fb69089a357 [ 411.244863] RSP: 002b:00007ffcd04e0fc8 EFLAGS: 00000202 ORIG_RAX: 0000000000000010 [ 411.245445] RAX: ffffffffffffffda RBX: 00007ffcd04e2884 RCX: 00007fb69089a357 [ 411.245903] RDX: 00007ffcd04e0fd0 RSI: 0000000000008983 RDI: 0000000000000003 [ 411.246527] RBP: 00007ffcd04e0fd0 R08: 0000000000000000 R09: 1999999999999999 [ 411.246976] R10: 000000000000053f R11: 0000000000000202 R12: 0000000000000004 [ 411.247414] R13: 00007ffcd04e1128 R14: 00007ffcd04e2888 R15: 0000000000000001 [ 411.249129] RIP: free_netdev+0x116/0x120 RSP: ffffa7d2807dbdb0 Signed-off-by: Gao Feng <gfree.wind@vip.163.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05decnet: always not take dst->__refcnt when inserting dst into hash tableWei Wang
[ Upstream commit 76371d2e3ad1f84426a30ebcd8c3b9b98f4c724f ] In the existing dn_route.c code, dn_route_output_slow() takes dst->__refcnt before calling dn_insert_route() while dn_route_input_slow() does not take dst->__refcnt before calling dn_insert_route(). This makes the whole routing code very buggy. In dn_dst_check_expire(), dnrt_free() is called when rt expires. This makes the routes inserted by dn_route_output_slow() not able to be freed as the refcnt is not released. In dn_dst_gc(), dnrt_drop() is called to release rt which could potentially cause the dst->__refcnt to be dropped to -1. In dn_run_flush(), dst_free() is called to release all the dst. Again, it makes the dst inserted by dn_route_output_slow() not able to be released and also, it does not wait on the rcu and could potentially cause crash in the path where other users still refer to this dst. This patch makes sure both input and output path do not take dst->__refcnt before calling dn_insert_route() and also makes sure dnrt_free()/dst_free() is called when removing dst from the hash table. The only difference between those 2 calls is that dnrt_free() waits on the rcu while dst_free() does not. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05sctp: return next obj by passing pos + 1 into sctp_transport_get_idxXin Long
[ Upstream commit 988c7322116970696211e902b468aefec95b6ec4 ] In sctp_for_each_transport, pos is used to save how many objs it has dumped. Now it gets the last obj by sctp_transport_get_idx, then gets the next obj by sctp_transport_get_next. The issue is that in the meanwhile if some objs in transport hashtable are removed and the objs nums are less than pos, sctp_transport_get_idx would return NULL and hti.walker.tbl is NULL as well. At this moment it should stop hti, instead of continue getting the next obj. Or it would cause a NULL pointer dereference in sctp_transport_get_next. This patch is to pass pos + 1 into sctp_transport_get_idx to get the next obj directly, even if pos > objs nums, it would return NULL and stop hti. Fixes: 626d16f50f39 ("sctp: export some apis or variables for sctp_diag and reuse some for proc") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05ipv6: fix calling in6_ifa_hold incorrectly for dad workXin Long
[ Upstream commit f8a894b218138888542a5058d0e902378fd0d4ec ] Now when starting the dad work in addrconf_mod_dad_work, if the dad work is idle and queued, it needs to hold ifa. The problem is there's one gap in [1], during which if the pending dad work is removed elsewhere. It will miss to hold ifa, but the dad word is still idea and queue. if (!delayed_work_pending(&ifp->dad_work)) in6_ifa_hold(ifp); <--------------[1] mod_delayed_work(addrconf_wq, &ifp->dad_work, delay); An use-after-free issue can be caused by this. Chen Wei found this issue when WARN_ON(!hlist_unhashed(&ifp->addr_lst)) in net6_ifa_finish_destroy was hit because of it. As Hannes' suggestion, this patch is to fix it by holding ifa first in addrconf_mod_dad_work, then calling mod_delayed_work and putting ifa if the dad_work is already in queue. Note that this patch did not choose to fix it with: if (!mod_delayed_work(delay)) in6_ifa_hold(ifp); As with it, when delay == 0, dad_work would be scheduled immediately, all addrconf_mod_dad_work(0) callings had to be moved under ifp->lock. Reported-by: Wei Chen <weichen@redhat.com> Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05igmp: add a missing spin_lock_init()WANG Cong
[ Upstream commit b4846fc3c8559649277e3e4e6b5cec5348a8d208 ] Andrey reported a lockdep warning on non-initialized spinlock: INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. CPU: 1 PID: 4099 Comm: a.out Not tainted 4.12.0-rc6+ #9 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:16 dump_stack+0x292/0x395 lib/dump_stack.c:52 register_lock_class+0x717/0x1aa0 kernel/locking/lockdep.c:755 ? 0xffffffffa0000000 __lock_acquire+0x269/0x3690 kernel/locking/lockdep.c:3255 lock_acquire+0x22d/0x560 kernel/locking/lockdep.c:3855 __raw_spin_lock_bh ./include/linux/spinlock_api_smp.h:135 _raw_spin_lock_bh+0x36/0x50 kernel/locking/spinlock.c:175 spin_lock_bh ./include/linux/spinlock.h:304 ip_mc_clear_src+0x27/0x1e0 net/ipv4/igmp.c:2076 igmpv3_clear_delrec+0xee/0x4f0 net/ipv4/igmp.c:1194 ip_mc_destroy_dev+0x4e/0x190 net/ipv4/igmp.c:1736 We miss a spin_lock_init() in igmpv3_add_delrec(), probably because previously we never use it on this code path. Since we already unlink it from the global mc_tomb list, it is probably safe not to acquire this spinlock here. It does not harm to have it although, to avoid conditional locking. Fixes: c38b7d327aaf ("igmp: acquire pmc lock for ip_mc_clear_src()") Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05igmp: acquire pmc lock for ip_mc_clear_src()WANG Cong
[ Upstream commit c38b7d327aafd1e3ad7ff53eefac990673b65667 ] Andrey reported a use-after-free in add_grec(): for (psf = *psf_list; psf; psf = psf_next) { ... psf_next = psf->sf_next; where the struct ip_sf_list's were already freed by: kfree+0xe8/0x2b0 mm/slub.c:3882 ip_mc_clear_src+0x69/0x1c0 net/ipv4/igmp.c:2078 ip_mc_dec_group+0x19a/0x470 net/ipv4/igmp.c:1618 ip_mc_drop_socket+0x145/0x230 net/ipv4/igmp.c:2609 inet_release+0x4e/0x1c0 net/ipv4/af_inet.c:411 sock_release+0x8d/0x1e0 net/socket.c:597 sock_close+0x16/0x20 net/socket.c:1072 This happens because we don't hold pmc->lock in ip_mc_clear_src() and a parallel mr_ifc_timer timer could jump in and access them. The RCU lock is there but it is merely for pmc itself, this spinlock could actually ensure we don't access them in parallel. Thanks to Eric and Long for discussion on this bug. Reported-by: Andrey Konovalov <andreyknvl@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Xin Long <lucien.xin@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>