summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2018-04-24rpc_pipefs: fix double-dput()Al Viro
commit 4a3877c4cedd95543f8726b0a98743ed8db0c0fb upstream. if we ever hit rpc_gssd_dummy_depopulate() dentry passed to it has refcount equal to 1. __rpc_rmpipe() drops it and dput() done after that hits an already freed dentry. Cc: stable@kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13ipv6: the entire IPv6 header chain must fit the first fragmentPaolo Abeni
[ Upstream commit 10b8a3de603df7b96004179b1b33b1708c76d144 ] While building ipv6 datagram we currently allow arbitrary large extheaders, even beyond pmtu size. The syzbot has found a way to exploit the above to trigger the following splat: kernel BUG at ./include/linux/skbuff.h:2073! invalid opcode: 0000 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 1 PID: 4230 Comm: syzkaller672661 Not tainted 4.16.0-rc2+ #326 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__skb_pull include/linux/skbuff.h:2073 [inline] RIP: 0010:__ip6_make_skb+0x1ac8/0x2190 net/ipv6/ip6_output.c:1636 RSP: 0018:ffff8801bc18f0f0 EFLAGS: 00010293 RAX: ffff8801b17400c0 RBX: 0000000000000738 RCX: ffffffff84f01828 RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8801b415ac18 RBP: ffff8801bc18f360 R08: ffff8801b4576844 R09: 0000000000000000 R10: ffff8801bc18f380 R11: ffffed00367aee4e R12: 00000000000000d6 R13: ffff8801b415a740 R14: dffffc0000000000 R15: ffff8801b45767c0 FS: 0000000001535880(0000) GS:ffff8801db300000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000002000b000 CR3: 00000001b4123001 CR4: 00000000001606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ip6_finish_skb include/net/ipv6.h:969 [inline] udp_v6_push_pending_frames+0x269/0x3b0 net/ipv6/udp.c:1073 udpv6_sendmsg+0x2a96/0x3400 net/ipv6/udp.c:1343 inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:764 sock_sendmsg_nosec net/socket.c:630 [inline] sock_sendmsg+0xca/0x110 net/socket.c:640 ___sys_sendmsg+0x320/0x8b0 net/socket.c:2046 __sys_sendmmsg+0x1ee/0x620 net/socket.c:2136 SYSC_sendmmsg net/socket.c:2167 [inline] SyS_sendmmsg+0x35/0x60 net/socket.c:2162 do_syscall_64+0x280/0x940 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x42/0xb7 RIP: 0033:0x4404c9 RSP: 002b:00007ffdce35f948 EFLAGS: 00000217 ORIG_RAX: 0000000000000133 RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 00000000004404c9 RDX: 0000000000000003 RSI: 0000000020001f00 RDI: 0000000000000003 RBP: 00000000006cb018 R08: 00000000004002c8 R09: 00000000004002c8 R10: 0000000020000080 R11: 0000000000000217 R12: 0000000000401df0 R13: 0000000000401e80 R14: 0000000000000000 R15: 0000000000000000 Code: ff e8 1d 5e b9 fc e9 15 e9 ff ff e8 13 5e b9 fc e9 44 e8 ff ff e8 29 5e b9 fc e9 c0 e6 ff ff e8 3f f3 80 fc 0f 0b e8 38 f3 80 fc <0f> 0b 49 8d 87 80 00 00 00 4d 8d 87 84 00 00 00 48 89 85 20 fe RIP: __skb_pull include/linux/skbuff.h:2073 [inline] RSP: ffff8801bc18f0f0 RIP: __ip6_make_skb+0x1ac8/0x2190 net/ipv6/ip6_output.c:1636 RSP: ffff8801bc18f0f0 As stated by RFC 7112 section 5: When a host fragments an IPv6 datagram, it MUST include the entire IPv6 Header Chain in the First Fragment. So this patch addresses the issue dropping datagrams with excessive extheader length. It also updates the error path to report to the calling socket nonnegative pmtu values. The issue apparently predates git history. v1 -> v2: cleanup error path, as per Eric's suggestion Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot+91e6f9932ff122fa4410@syzkaller.appspotmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13net/ipv6: Increment OUTxxx counters after netfilter hookJeff Barnhill
[ Upstream commit 71a1c915238c970cd9bdd5bf158b1279d6b6d55b ] At the end of ip6_forward(), IPSTATS_MIB_OUTFORWDATAGRAMS and IPSTATS_MIB_OUTOCTETS are incremented immediately before the NF_HOOK call for NFPROTO_IPV6 / NF_INET_FORWARD. As a result, these counters get incremented regardless of whether or not the netfilter hook allows the packet to continue being processed. This change increments the counters in ip6_forward_finish() so that it will not happen if the netfilter hook chooses to terminate the packet, which is similar to how IPv4 works. Signed-off-by: Jeff Barnhill <0xeffeff@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13net sched actions: fix dumping which requires several messages to user spaceCraig Dillabaugh
[ Upstream commit 734549eb550c0c720bc89e50501f1b1e98cdd841 ] Fixes a bug in the tcf_dump_walker function that can cause some actions to not be reported when dumping a large number of actions. This issue became more aggrevated when cookies feature was added. In particular this issue is manifest when large cookie values are assigned to the actions and when enough actions are created that the resulting table must be dumped in multiple batches. The number of actions returned in each batch is limited by the total number of actions and the memory buffer size. With small cookies the numeric limit is reached before the buffer size limit, which avoids the code path triggering this bug. When large cookies are used buffer fills before the numeric limit, and the erroneous code path is hit. For example after creating 32 csum actions with the cookie aaaabbbbccccdddd $ tc actions ls action csum total acts 26 action order 0: csum (tcp) action continue index 1 ref 1 bind 0 cookie aaaabbbbccccdddd ..... action order 25: csum (tcp) action continue index 26 ref 1 bind 0 cookie aaaabbbbccccdddd total acts 6 action order 0: csum (tcp) action continue index 28 ref 1 bind 0 cookie aaaabbbbccccdddd ...... action order 5: csum (tcp) action continue index 32 ref 1 bind 0 cookie aaaabbbbccccdddd Note that the action with index 27 is omitted from the report. Fixes: 4b3550ef530c ("[NET_SCHED]: Use nla_nest_start/nla_nest_end")" Signed-off-by: Craig Dillabaugh <cdillaba@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13vti6: better validate user provided tunnel namesEric Dumazet
[ Upstream commit 537b361fbcbcc3cd6fe2bb47069fd292b9256d16 ] Use valid_name() to make sure user does not provide illegal device name. Fixes: ed1efb2aefbb ("ipv6: Add support for IPsec virtual tunnel interfaces") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13ip6_tunnel: better validate user provided tunnel namesEric Dumazet
[ Upstream commit db7a65e3ab78e5b1c4b17c0870ebee35a4ee3257 ] Use valid_name() to make sure user does not provide illegal device name. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13ip6_gre: better validate user provided tunnel namesEric Dumazet
[ Upstream commit 5f42df013b8bc1b6511af7a04bf93b014884ae2a ] Use dev_valid_name() to make sure user does not provide illegal device name. syzbot caught the following bug : BUG: KASAN: stack-out-of-bounds in strlcpy include/linux/string.h:300 [inline] BUG: KASAN: stack-out-of-bounds in ip6gre_tunnel_locate+0x334/0x860 net/ipv6/ip6_gre.c:339 Write of size 20 at addr ffff8801afb9f7b8 by task syzkaller851048/4466 CPU: 1 PID: 4466 Comm: syzkaller851048 Not tainted 4.16.0+ #1 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x1b9/0x29f lib/dump_stack.c:53 print_address_description+0x6c/0x20b mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report.cold.7+0xac/0x2f5 mm/kasan/report.c:412 check_memory_region_inline mm/kasan/kasan.c:260 [inline] check_memory_region+0x13e/0x1b0 mm/kasan/kasan.c:267 memcpy+0x37/0x50 mm/kasan/kasan.c:303 strlcpy include/linux/string.h:300 [inline] ip6gre_tunnel_locate+0x334/0x860 net/ipv6/ip6_gre.c:339 ip6gre_tunnel_ioctl+0x69d/0x12e0 net/ipv6/ip6_gre.c:1195 dev_ifsioc+0x43e/0xb90 net/core/dev_ioctl.c:334 dev_ioctl+0x69a/0xcc0 net/core/dev_ioctl.c:525 sock_ioctl+0x47e/0x680 net/socket.c:1015 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:500 [inline] do_vfs_ioctl+0x1cf/0x1650 fs/ioctl.c:684 ksys_ioctl+0xa9/0xd0 fs/ioctl.c:701 SYSC_ioctl fs/ioctl.c:708 [inline] SyS_ioctl+0x24/0x30 fs/ioctl.c:706 do_syscall_64+0x29e/0x9d0 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x42/0xb7 Fixes: c12b395a4664 ("gre: Support GRE over IPv6") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13ipv6: sit: better validate user provided tunnel namesEric Dumazet
[ Upstream commit b95211e066fc3494b7c115060b2297b4ba21f025 ] Use dev_valid_name() to make sure user does not provide illegal device name. syzbot caught the following bug : BUG: KASAN: stack-out-of-bounds in strlcpy include/linux/string.h:300 [inline] BUG: KASAN: stack-out-of-bounds in ipip6_tunnel_locate+0x63b/0xaa0 net/ipv6/sit.c:254 Write of size 33 at addr ffff8801b64076d8 by task syzkaller932654/4453 CPU: 0 PID: 4453 Comm: syzkaller932654 Not tainted 4.16.0+ #1 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x1b9/0x29f lib/dump_stack.c:53 print_address_description+0x6c/0x20b mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report.cold.7+0xac/0x2f5 mm/kasan/report.c:412 check_memory_region_inline mm/kasan/kasan.c:260 [inline] check_memory_region+0x13e/0x1b0 mm/kasan/kasan.c:267 memcpy+0x37/0x50 mm/kasan/kasan.c:303 strlcpy include/linux/string.h:300 [inline] ipip6_tunnel_locate+0x63b/0xaa0 net/ipv6/sit.c:254 ipip6_tunnel_ioctl+0xe71/0x241b net/ipv6/sit.c:1221 dev_ifsioc+0x43e/0xb90 net/core/dev_ioctl.c:334 dev_ioctl+0x69a/0xcc0 net/core/dev_ioctl.c:525 sock_ioctl+0x47e/0x680 net/socket.c:1015 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:500 [inline] do_vfs_ioctl+0x1cf/0x1650 fs/ioctl.c:684 ksys_ioctl+0xa9/0xd0 fs/ioctl.c:701 SYSC_ioctl fs/ioctl.c:708 [inline] SyS_ioctl+0x24/0x30 fs/ioctl.c:706 do_syscall_64+0x29e/0x9d0 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x42/0xb7 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13ip_tunnel: better validate user provided tunnel namesEric Dumazet
[ Upstream commit 9cb726a212a82c88c98aa9f0037fd04777cd8fe5 ] Use dev_valid_name() to make sure user does not provide illegal device name. syzbot caught the following bug : BUG: KASAN: stack-out-of-bounds in strlcpy include/linux/string.h:300 [inline] BUG: KASAN: stack-out-of-bounds in __ip_tunnel_create+0xca/0x6b0 net/ipv4/ip_tunnel.c:257 Write of size 20 at addr ffff8801ac79f810 by task syzkaller268107/4482 CPU: 0 PID: 4482 Comm: syzkaller268107 Not tainted 4.16.0+ #1 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x1b9/0x29f lib/dump_stack.c:53 print_address_description+0x6c/0x20b mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report.cold.7+0xac/0x2f5 mm/kasan/report.c:412 check_memory_region_inline mm/kasan/kasan.c:260 [inline] check_memory_region+0x13e/0x1b0 mm/kasan/kasan.c:267 memcpy+0x37/0x50 mm/kasan/kasan.c:303 strlcpy include/linux/string.h:300 [inline] __ip_tunnel_create+0xca/0x6b0 net/ipv4/ip_tunnel.c:257 ip_tunnel_create net/ipv4/ip_tunnel.c:352 [inline] ip_tunnel_ioctl+0x818/0xd40 net/ipv4/ip_tunnel.c:861 ipip_tunnel_ioctl+0x1c5/0x420 net/ipv4/ipip.c:350 dev_ifsioc+0x43e/0xb90 net/core/dev_ioctl.c:334 dev_ioctl+0x69a/0xcc0 net/core/dev_ioctl.c:525 sock_ioctl+0x47e/0x680 net/socket.c:1015 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:500 [inline] do_vfs_ioctl+0x1cf/0x1650 fs/ioctl.c:684 ksys_ioctl+0xa9/0xd0 fs/ioctl.c:701 SYSC_ioctl fs/ioctl.c:708 [inline] SyS_ioctl+0x24/0x30 fs/ioctl.c:706 do_syscall_64+0x29e/0x9d0 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x42/0xb7 Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13net: fool proof dev_valid_name()Eric Dumazet
[ Upstream commit a9d48205d0aedda021fc3728972a9e9934c2b9de ] We want to use dev_valid_name() to validate tunnel names, so better use strnlen(name, IFNAMSIZ) than strlen(name) to make sure to not upset KASAN. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13vlan: also check phy_driver ts_info for vlan's real deviceHangbin Liu
[ Upstream commit ec1d8ccb07deaf30fd0508af6755364ac47dc08d ] Just like function ethtool_get_ts_info(), we should also consider the phy_driver ts_info call back. For example, driver dp83640. Fixes: 37dd9255b2f6 ("vlan: Pass ethtool get_ts_info queries to real device.") Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13sctp: sctp_sockaddr_af must check minimal addr length for AF_INET6Eric Dumazet
[ Upstream commit 81e98370293afcb58340ce8bd71af7b97f925c26 ] Check must happen before call to ipv6_addr_v4mapped() syzbot report was : BUG: KMSAN: uninit-value in sctp_sockaddr_af net/sctp/socket.c:359 [inline] BUG: KMSAN: uninit-value in sctp_do_bind+0x60f/0xdc0 net/sctp/socket.c:384 CPU: 0 PID: 3576 Comm: syzkaller968804 Not tainted 4.16.0+ #82 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x185/0x1d0 lib/dump_stack.c:53 kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067 __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:676 sctp_sockaddr_af net/sctp/socket.c:359 [inline] sctp_do_bind+0x60f/0xdc0 net/sctp/socket.c:384 sctp_bind+0x149/0x190 net/sctp/socket.c:332 inet6_bind+0x1fd/0x1820 net/ipv6/af_inet6.c:293 SYSC_bind+0x3f2/0x4b0 net/socket.c:1474 SyS_bind+0x54/0x80 net/socket.c:1460 do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 RIP: 0033:0x43fd49 RSP: 002b:00007ffe99df3d28 EFLAGS: 00000213 ORIG_RAX: 0000000000000031 RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 000000000043fd49 RDX: 0000000000000010 RSI: 0000000020000000 RDI: 0000000000000003 RBP: 00000000006ca018 R08: 00000000004002c8 R09: 00000000004002c8 R10: 00000000004002c8 R11: 0000000000000213 R12: 0000000000401670 R13: 0000000000401700 R14: 0000000000000000 R15: 0000000000000000 Local variable description: ----address@SYSC_bind Variable was created at: SYSC_bind+0x6f/0x4b0 net/socket.c:1461 SyS_bind+0x54/0x80 net/socket.c:1460 Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Vlad Yasevich <vyasevich@gmail.com> Cc: Neil Horman <nhorman@tuxdriver.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13sctp: do not leak kernel memory to user spaceEric Dumazet
[ Upstream commit 6780db244d6b1537d139dea0ec8aad10cf9e4adb ] syzbot produced a nice report [1] Issue here is that a recvmmsg() managed to leak 8 bytes of kernel memory to user space, because sin_zero (padding field) was not properly cleared. [1] BUG: KMSAN: uninit-value in copy_to_user include/linux/uaccess.h:184 [inline] BUG: KMSAN: uninit-value in move_addr_to_user+0x32e/0x530 net/socket.c:227 CPU: 1 PID: 3586 Comm: syzkaller481044 Not tainted 4.16.0+ #82 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x185/0x1d0 lib/dump_stack.c:53 kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067 kmsan_internal_check_memory+0x164/0x1d0 mm/kmsan/kmsan.c:1176 kmsan_copy_to_user+0x69/0x160 mm/kmsan/kmsan.c:1199 copy_to_user include/linux/uaccess.h:184 [inline] move_addr_to_user+0x32e/0x530 net/socket.c:227 ___sys_recvmsg+0x4e2/0x810 net/socket.c:2211 __sys_recvmmsg+0x54e/0xdb0 net/socket.c:2313 SYSC_recvmmsg+0x29b/0x3e0 net/socket.c:2394 SyS_recvmmsg+0x76/0xa0 net/socket.c:2378 do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 RIP: 0033:0x4401c9 RSP: 002b:00007ffc56f73098 EFLAGS: 00000217 ORIG_RAX: 000000000000012b RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 00000000004401c9 RDX: 0000000000000001 RSI: 0000000020003ac0 RDI: 0000000000000003 RBP: 00000000006ca018 R08: 0000000020003bc0 R09: 0000000000000010 R10: 0000000000000000 R11: 0000000000000217 R12: 0000000000401af0 R13: 0000000000401b80 R14: 0000000000000000 R15: 0000000000000000 Local variable description: ----addr@___sys_recvmsg Variable was created at: ___sys_recvmsg+0xd5/0x810 net/socket.c:2172 __sys_recvmmsg+0x54e/0xdb0 net/socket.c:2313 Bytes 8-15 of 16 are uninitialized ================================================================== Kernel panic - not syncing: panic_on_warn set ... CPU: 1 PID: 3586 Comm: syzkaller481044 Tainted: G B 4.16.0+ #82 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x185/0x1d0 lib/dump_stack.c:53 panic+0x39d/0x940 kernel/panic.c:183 kmsan_report+0x238/0x240 mm/kmsan/kmsan.c:1083 kmsan_internal_check_memory+0x164/0x1d0 mm/kmsan/kmsan.c:1176 kmsan_copy_to_user+0x69/0x160 mm/kmsan/kmsan.c:1199 copy_to_user include/linux/uaccess.h:184 [inline] move_addr_to_user+0x32e/0x530 net/socket.c:227 ___sys_recvmsg+0x4e2/0x810 net/socket.c:2211 __sys_recvmmsg+0x54e/0xdb0 net/socket.c:2313 SYSC_recvmmsg+0x29b/0x3e0 net/socket.c:2394 SyS_recvmmsg+0x76/0xa0 net/socket.c:2378 do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Vlad Yasevich <vyasevich@gmail.com> Cc: Neil Horman <nhorman@tuxdriver.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13net/sched: fix NULL dereference in the error path of tcf_bpf_init()Davide Caratti
[ Upstream commit 3239534a79ee6f20cffd974173a1e62e0730e8ac ] when tcf_bpf_init_from_ops() fails (e.g. because of program having invalid number of instructions), tcf_bpf_cfg_cleanup() calls bpf_prog_put(NULL) or bpf_prog_destroy(NULL). Unless CONFIG_BPF_SYSCALL is unset, this causes the following error: BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 PGD 800000007345a067 P4D 800000007345a067 PUD 340e1067 PMD 0 Oops: 0000 [#1] SMP PTI Modules linked in: act_bpf(E) ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 mbcache jbd2 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec_generic pcbc snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd glue_helper cryptd joydev snd_timer snd virtio_balloon pcspkr soundcore i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm virtio_blk drm virtio_net virtio_console i2c_core crc32c_intel serio_raw virtio_pci ata_piix libata virtio_ring floppy virtio dm_mirror dm_region_hash dm_log dm_mod [last unloaded: act_bpf] CPU: 3 PID: 5654 Comm: tc Tainted: G E 4.16.0.bpf_test+ #408 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 RIP: 0010:__bpf_prog_put+0xc/0xc0 RSP: 0018:ffff9594003ef728 EFLAGS: 00010202 RAX: 0000000000000000 RBX: ffff9594003ef758 RCX: 0000000000000024 RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000 RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000044 R10: 0000000000000220 R11: ffff8a7ab9f17131 R12: 0000000000000000 R13: ffff8a7ab7c3c8e0 R14: 0000000000000001 R15: ffff8a7ab88f1054 FS: 00007fcb2f17c740(0000) GS:ffff8a7abfd80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000020 CR3: 000000007c888006 CR4: 00000000001606e0 Call Trace: tcf_bpf_cfg_cleanup+0x2f/0x40 [act_bpf] tcf_bpf_cleanup+0x4c/0x70 [act_bpf] __tcf_idr_release+0x79/0x140 tcf_bpf_init+0x125/0x330 [act_bpf] tcf_action_init_1+0x2cc/0x430 ? get_page_from_freelist+0x3f0/0x11b0 tcf_action_init+0xd3/0x1b0 tc_ctl_action+0x18b/0x240 rtnetlink_rcv_msg+0x29c/0x310 ? _cond_resched+0x15/0x30 ? __kmalloc_node_track_caller+0x1b9/0x270 ? rtnl_calcit.isra.29+0x100/0x100 netlink_rcv_skb+0xd2/0x110 netlink_unicast+0x17c/0x230 netlink_sendmsg+0x2cd/0x3c0 sock_sendmsg+0x30/0x40 ___sys_sendmsg+0x27a/0x290 ? mem_cgroup_commit_charge+0x80/0x130 ? page_add_new_anon_rmap+0x73/0xc0 ? do_anonymous_page+0x2a2/0x560 ? __handle_mm_fault+0xc75/0xe20 __sys_sendmsg+0x58/0xa0 do_syscall_64+0x6e/0x1a0 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 RIP: 0033:0x7fcb2e58eba0 RSP: 002b:00007ffc93c496c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007ffc93c497f0 RCX: 00007fcb2e58eba0 RDX: 0000000000000000 RSI: 00007ffc93c49740 RDI: 0000000000000003 RBP: 000000005ac6a646 R08: 0000000000000002 R09: 0000000000000000 R10: 00007ffc93c49120 R11: 0000000000000246 R12: 0000000000000000 R13: 00007ffc93c49804 R14: 0000000000000001 R15: 000000000066afa0 Code: 5f 00 48 8b 43 20 48 c7 c7 70 2f 7c b8 c7 40 10 00 00 00 00 5b e9 a5 8b 61 00 0f 1f 44 00 00 0f 1f 44 00 00 41 54 55 48 89 fd 53 <48> 8b 47 20 f0 ff 08 74 05 5b 5d 41 5c c3 41 89 f4 0f 1f 44 00 RIP: __bpf_prog_put+0xc/0xc0 RSP: ffff9594003ef728 CR2: 0000000000000020 Fix it in tcf_bpf_cfg_cleanup(), ensuring that bpf_prog_{put,destroy}(f) is called only when f is not NULL. Fixes: bbc09e7842a5 ("net/sched: fix idr leak on the error path of tcf_bpf_init()") Reported-by: Lucas Bates <lucasb@mojatatu.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13netlink: make sure nladdr has correct size in netlink_connect()Alexander Potapenko
[ Upstream commit 7880287981b60a6808f39f297bb66936e8bdf57a ] KMSAN reports use of uninitialized memory in the case when |alen| is smaller than sizeof(struct sockaddr_nl), and therefore |nladdr| isn't fully copied from the userspace. Signed-off-by: Alexander Potapenko <glider@google.com> Fixes: 1da177e4c3f41524 ("Linux-2.6.12-rc2") Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13net/ipv6: Fix route leaking between VRFsDavid Ahern
[ Upstream commit b6cdbc85234b072340b8923e69f49ec293f905dc ] Donald reported that IPv6 route leaking between VRFs is not working. The root cause is the strict argument in the call to rt6_lookup when validating the nexthop spec. ip6_route_check_nh validates the gateway and device (if given) of a route spec. It in turn could call rt6_lookup (e.g., lookup in a given table did not succeed so it falls back to a full lookup) and if so sets the strict argument to 1. That means if the egress device is given, the route lookup needs to return a result with the same device. This strict requirement does not work with VRFs (IPv4 or IPv6) because the oif in the flow struct is overridden with the index of the VRF device to trigger a match on the l3mdev rule and force the lookup to its table. The right long term solution is to add an l3mdev index to the flow struct such that the oif is not overridden. That solution will not backport well, so this patch aims for a simpler solution to relax the strict argument if the route spec device is an l3mdev slave. As done in other places, use the FLOWI_FLAG_SKIP_NH_OIF to know that the RT6_LOOKUP_F_IFACE flag needs to be removed. Fixes: ca254490c8df ("net: Add VRF support to IPv6 stack") Reported-by: Donald Sharp <sharpd@cumulusnetworks.com> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13net: fix possible out-of-bound read in skb_network_protocol()Eric Dumazet
[ Upstream commit 1dfe82ebd7d8fd43dba9948fdfb31f145014baa0 ] skb mac header is not necessarily set at the time skb_network_protocol() is called. Use skb->data instead. BUG: KASAN: slab-out-of-bounds in skb_network_protocol+0x46b/0x4b0 net/core/dev.c:2739 Read of size 2 at addr ffff8801b3097a0b by task syz-executor5/14242 CPU: 1 PID: 14242 Comm: syz-executor5 Not tainted 4.16.0-rc6+ #280 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x194/0x24d lib/dump_stack.c:53 print_address_description+0x73/0x250 mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report+0x23c/0x360 mm/kasan/report.c:412 __asan_report_load_n_noabort+0xf/0x20 mm/kasan/report.c:443 skb_network_protocol+0x46b/0x4b0 net/core/dev.c:2739 harmonize_features net/core/dev.c:2924 [inline] netif_skb_features+0x509/0x9b0 net/core/dev.c:3011 validate_xmit_skb+0x81/0xb00 net/core/dev.c:3084 validate_xmit_skb_list+0xbf/0x120 net/core/dev.c:3142 packet_direct_xmit+0x117/0x790 net/packet/af_packet.c:256 packet_snd net/packet/af_packet.c:2944 [inline] packet_sendmsg+0x3aed/0x60b0 net/packet/af_packet.c:2969 sock_sendmsg_nosec net/socket.c:629 [inline] sock_sendmsg+0xca/0x110 net/socket.c:639 ___sys_sendmsg+0x767/0x8b0 net/socket.c:2047 __sys_sendmsg+0xe5/0x210 net/socket.c:2081 Fixes: 19acc327258a ("gso: Handle Trans-Ether-Bridging protocol in skb_network_protocol()") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Pravin B Shelar <pshelar@ovn.org> Reported-by: Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13arp: fix arp_filter on l3slave devicesMiguel Fadon Perlines
[ Upstream commit 58b35f27689b5eb514fc293c332966c226b1b6e4 ] arp_filter performs an ip_route_output search for arp source address and checks if output device is the same where the arp request was received, if it is not, the arp request is not answered. This route lookup is always done on main route table so l3slave devices never find the proper route and arp is not answered. Passing l3mdev_master_ifindex_rcu(dev) return value as oif fixes the lookup for l3slave devices while maintaining same behavior for non l3slave devices as this function returns 0 in that case. Fixes: 613d09b30f8b ("net: Use VRF device index for lookups on TX") Signed-off-by: Miguel Fadon Perlines <mfadon@teldat.com> Acked-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13rxrpc: check return value of skb_to_sgvec alwaysJason A. Donenfeld
commit 89a5ea99662505d2d61f2a3030a6896c2cb3cdb0 upstream. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> [natechancellor: backport to 4.4] Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13ipsec: check return value of skb_to_sgvec alwaysJason A. Donenfeld
commit 3f29770723fe498a5c5f57c3a31a996ebdde03e1 upstream. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net> [natechancellor: Adjusted context due to lack of fca11ebde3f0] Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13Bluetooth: Send HCI Set Event Mask Page 2 command only when neededMarcel Holtmann
[ Upstream commit 313f6888c8fbb1bc8b36c9012ce4e1de848df696 ] The Broadcom BCM20702 Bluetooth controller in ThinkPad-T530 devices report support for the Set Event Mask Page 2 command, but actually do return an error when trying to use it. < HCI Command: Read Local Supported Commands (0x04|0x0002) plen 0 > HCI Event: Command Complete (0x0e) plen 68 Read Local Supported Commands (0x04|0x0002) ncmd 1 Status: Success (0x00) Commands: 162 entries ... Set Event Mask Page 2 (Octet 22 - Bit 2) ... < HCI Command: Set Event Mask Page 2 (0x03|0x0063) plen 8 Mask: 0x0000000000000000 > HCI Event: Command Complete (0x0e) plen 4 Set Event Mask Page 2 (0x03|0x0063) ncmd 1 Status: Unknown HCI Command (0x01) Since these controllers do not support any feature that would require the event mask page 2 to be modified, it is safe to not send this command at all. The default value is all bits set to zero. T: Bus=01 Lev=02 Prnt=02 Port=03 Cnt=03 Dev#= 9 Spd=12 MxCh= 0 D: Ver= 2.00 Cls=ff(vend.) Sub=01 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=0a5c ProdID=21e6 Rev= 1.12 S: Manufacturer=Broadcom Corp S: Product=BCM20702A0 S: SerialNumber=F82FA8E8CFC0 C:* #Ifs= 4 Cfg#= 1 Atr=e0 MxPwr= 0mA I:* If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=01 Prot=01 Driver=btusb E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=1ms E: Ad=82(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms I: If#= 1 Alt= 1 #EPs= 2 Cls=ff(vend.) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms I: If#= 1 Alt= 2 #EPs= 2 Cls=ff(vend.) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms I: If#= 1 Alt= 3 #EPs= 2 Cls=ff(vend.) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms I: If#= 1 Alt= 4 #EPs= 2 Cls=ff(vend.) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms I: If#= 1 Alt= 5 #EPs= 2 Cls=ff(vend.) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms I:* If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=btusb E: Ad=84(I) Atr=02(Bulk) MxPS= 32 Ivl=0ms E: Ad=04(O) Atr=02(Bulk) MxPS= 32 Ivl=0ms I:* If#= 3 Alt= 0 #EPs= 0 Cls=fe(app. ) Sub=01 Prot=01 Driver=(none) Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Reported-by: Sedat Dilek <sedat.dilek@gmail.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Szymon Janc <szymon.janc@codecoup.pl> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13sctp: fix recursive locking warning in sctp_do_peeloffXin Long
[ Upstream commit 6dfe4b97e08ec3d1a593fdaca099f0ef0a3a19e6 ] Dmitry got the following recursive locking report while running syzkaller fuzzer, the Call Trace: __dump_stack lib/dump_stack.c:16 [inline] dump_stack+0x2ee/0x3ef lib/dump_stack.c:52 print_deadlock_bug kernel/locking/lockdep.c:1729 [inline] check_deadlock kernel/locking/lockdep.c:1773 [inline] validate_chain kernel/locking/lockdep.c:2251 [inline] __lock_acquire+0xef2/0x3430 kernel/locking/lockdep.c:3340 lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3755 lock_sock_nested+0xcb/0x120 net/core/sock.c:2536 lock_sock include/net/sock.h:1460 [inline] sctp_close+0xcd/0x9d0 net/sctp/socket.c:1497 inet_release+0xed/0x1c0 net/ipv4/af_inet.c:425 inet6_release+0x50/0x70 net/ipv6/af_inet6.c:432 sock_release+0x8d/0x1e0 net/socket.c:597 __sock_create+0x38b/0x870 net/socket.c:1226 sock_create+0x7f/0xa0 net/socket.c:1237 sctp_do_peeloff+0x1a2/0x440 net/sctp/socket.c:4879 sctp_getsockopt_peeloff net/sctp/socket.c:4914 [inline] sctp_getsockopt+0x111a/0x67e0 net/sctp/socket.c:6628 sock_common_getsockopt+0x95/0xd0 net/core/sock.c:2690 SYSC_getsockopt net/socket.c:1817 [inline] SyS_getsockopt+0x240/0x380 net/socket.c:1799 entry_SYSCALL_64_fastpath+0x1f/0xc2 This warning is caused by the lock held by sctp_getsockopt() is on one socket, while the other lock that sctp_close() is getting later is on the newly created (which failed) socket during peeloff operation. This patch is to avoid this warning by use lock_sock with subclass SINGLE_DEPTH_NESTING as Wang Cong and Marcelo's suggestion. Reported-by: Dmitry Vyukov <dvyukov@google.com> Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13skbuff: only inherit relevant tx_flagsWillem de Bruijn
[ Upstream commit fff88030b3ff930ca7a3d74acfee0472f33887ea ] When inheriting tx_flags from one skbuff to another, always apply a mask to avoid overwriting unrelated other bits in the field. The two SKBTX_SHARED_FRAG cases clears all other bits. In practice, tx_flags are zero at this point now. But this is fragile. Timestamp flags are set, for instance, if in tcp_gso_segment, after this clear in skb_segment. The SKBTX_ANY_TSTAMP mask in __skb_tstamp_tx ensures that new skbs do not accidentally inherit flags such as SKBTX_SHARED_FRAG. Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13sit: reload iphdr in ipip6_rcvHaishuang Yan
[ Upstream commit b699d0035836f6712917a41e7ae58d84359b8ff9 ] Since iptunnel_pull_header() can call pskb_may_pull(), we must reload any pointer that was related to skb->head. Fixes: a09a4c8dd1ec ("tunnels: Remove encapsulation offloads on decap") Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13skbuff: return -EMSGSIZE in skb_to_sgvec to prevent overflowJason A. Donenfeld
[ Upstream commit 48a1df65334b74bd7531f932cca5928932abf769 ] This is a defense-in-depth measure in response to bugs like 4d6fa57b4dab ("macsec: avoid heap overflow in skb_to_sgvec"). There's not only a potential overflow of sglist items, but also a stack overflow potential, so we fix this by limiting the amount of recursion this function is allowed to do. Not actually providing a bounded base case is a future disaster that we can easily avoid here. As a small matter of house keeping, we take this opportunity to move the documentation comment over the actual function the documentation is for. While this could be implemented by using an explicit stack of skbuffs, when implementing this, the function complexity increased considerably, and I don't think such complexity and bloat is actually worth it. So, instead I built this and tested it on x86, x86_64, ARM, ARM64, and MIPS, and measured the stack usage there. I also reverted the recent MIPS changes that give it a separate IRQ stack, so that I could experience some worst-case situations. I found that limiting it to 24 layers deep yielded a good stack usage with room for safety, as well as being much deeper than any driver actually ever creates. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "David S. Miller" <davem@davemloft.net> Cc: David Howells <dhowells@redhat.com> Cc: Sabrina Dubroca <sd@queasysnail.net> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13SUNRPC: ensure correct error is reported by xs_tcp_setup_socket()NeilBrown
[ Upstream commit 6ea44adce91526700535b3150f77f8639ae8c82d ] If you attempt a TCP mount from an host that is unreachable in a way that triggers an immediate error from kernel_connect(), that error does not propagate up, instead EAGAIN is reported. This results in call_connect_status receiving the wrong error. A case that it easy to demonstrate is to attempt to mount from an address that results in ENETUNREACH, but first deleting any default route. Without this patch, the mount.nfs process is persistently runnable and is hard to kill. With this patch it exits as it should. The problem is caused by the fact that xs_tcp_force_close() eventually calls xprt_wake_pending_tasks(xprt, -EAGAIN); which causes an error return of -EAGAIN. so when xs_tcp_setup_sock() calls xprt_wake_pending_tasks(xprt, status); the status is ignored. Fixes: 4efdd92c9211 ("SUNRPC: Remove TCP client connection reset hack") Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13rds; Reset rs->rs_bound_addr in rds_add_bound() failure pathSowmini Varadhan
[ Upstream commit 7ae0c649c47f1c5d2db8cee6dd75855970af1669 ] If the rds_sock is not added to the bind_hash_table, we must reset rs_bound_addr so that rds_remove_bound will not trip on this rds_sock. rds_add_bound() does a rds_sock_put() in this failure path, so failing to reset rs_bound_addr will result in a socket refcount bug, and will trigger a WARN_ON with the stack shown below when the application subsequently tries to close the PF_RDS socket. WARNING: CPU: 20 PID: 19499 at net/rds/af_rds.c:496 \ rds_sock_destruct+0x15/0x30 [rds] : __sk_destruct+0x21/0x190 rds_remove_bound.part.13+0xb6/0x140 [rds] rds_release+0x71/0x120 [rds] sock_release+0x1a/0x70 sock_close+0xe/0x20 __fput+0xd5/0x210 task_work_run+0x82/0xa0 do_exit+0x2ce/0xb30 ? syscall_trace_enter+0x1cc/0x2b0 do_group_exit+0x39/0xa0 SyS_exit_group+0x10/0x10 do_syscall_64+0x61/0x1a0 Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13l2tp: fix missing print session offset infoHangbin Liu
[ Upstream commit 820da5357572715c6235ba3b3daa2d5b43a1198f ] Report offset parameter in L2TP_CMD_SESSION_GET command if it has been configured by userspace Fixes: 309795f4bec ("l2tp: Add netlink control API for L2TP") Reported-by: Jianlin Shi <jishi@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13net: llc: add lock_sock in llc_ui_bind to avoid a race conditionlinzhang
[ Upstream commit 0908cf4dfef35fc6ac12329007052ebe93ff1081 ] There is a race condition in llc_ui_bind if two or more processes/threads try to bind a same socket. If more processes/threads bind a same socket success that will lead to two problems, one is this action is not what we expected, another is will lead to kernel in unstable status or oops(in my simple test case, cause llc2.ko can't unload). The current code is test SOCK_ZAPPED bit to avoid a process to bind a same socket twice but that is can't avoid more processes/threads try to bind a same socket at the same time. So, add lock_sock in llc_ui_bind like others, such as llc_ui_connect. Signed-off-by: Lin Zhang <xiaolou4617@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13net: move somaxconn init from sysctl codeRoman Kapl
[ Upstream commit 7c3f1875c66fbc19762760097cabc91849ea0bbb ] The default value for somaxconn is set in sysctl_core_net_init(), but this function is not called when kernel is configured without CONFIG_SYSCTL. This results in the kernel not being able to accept TCP connections, because the backlog has zero size. Usually, the user ends up with: "TCP: request_sock_TCP: Possible SYN flooding on port 7. Dropping request. Check SNMP counters." If SYN cookies are not enabled the connection is rejected. Before ef547f2ac16 (tcp: remove max_qlen_log), the effects were less severe, because the backlog was always at least eight slots long. Signed-off-by: Roman Kapl <roman.kapl@sysgo.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13tcp: better validation of received ack sequencesEric Dumazet
[ Upstream commit d0e1a1b5a833b625c93d3d49847609350ebd79db ] Paul Fiterau Brostean reported : <quote> Linux TCP stack we analyze exhibits behavior that seems odd to me. The scenario is as follows (all packets have empty payloads, no window scaling, rcv/snd window size should not be a factor): TEST HARNESS (CLIENT) LINUX SERVER 1. - LISTEN (server listen, then accepts) 2. - --> <SEQ=100><CTL=SYN> --> SYN-RECEIVED 3. - <-- <SEQ=300><ACK=101><CTL=SYN,ACK> <-- SYN-RECEIVED 4. - --> <SEQ=101><ACK=301><CTL=ACK> --> ESTABLISHED 5. - <-- <SEQ=301><ACK=101><CTL=FIN,ACK> <-- FIN WAIT-1 (server opts to close the data connection calling "close" on the connection socket) 6. - --> <SEQ=101><ACK=99999><CTL=FIN,ACK> --> CLOSING (client sends FIN,ACK with not yet sent acknowledgement number) 7. - <-- <SEQ=302><ACK=102><CTL=ACK> <-- CLOSING (ACK is 102 instead of 101, why?) ... (silence from CLIENT) 8. - <-- <SEQ=301><ACK=102><CTL=FIN,ACK> <-- CLOSING (retransmission, again ACK is 102) Now, note that packet 6 while having the expected sequence number, acknowledges something that wasn't sent by the server. So I would expect the packet to maybe prompt an ACK response from the server, and then be ignored. Yet it is not ignored and actually leads to an increase of the acknowledgement number in the server's retransmission of the FIN,ACK packet. The explanation I found is that the FIN in packet 6 was processed, despite the acknowledgement number being unacceptable. Further experiments indeed show that the server processes this FIN, transitioning to CLOSING, then on receiving an ACK for the FIN it had send in packet 5, the server (or better said connection) transitions from CLOSING to TIME_WAIT (as signaled by netstat). </quote> Indeed, tcp_rcv_state_process() calls tcp_ack() but does not exploit the @acceptable status but for TCP_SYN_RECV state. What we want here is to send a challenge ACK, if not in TCP_SYN_RECV state. TCP_FIN_WAIT1 state is not the only state we should fix. Add a FLAG_NO_CHALLENGE_ACK so that tcp_rcv_state_process() can choose to send a challenge ACK and discard the packet instead of wrongly change socket state. With help from Neal Cardwell. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Paul Fiterau Brostean <p.fiterau-brostean@science.ru.nl> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13netfilter: ctnetlink: fix incorrect nf_ct_put during hash resizeLiping Zhang
[ Upstream commit fefa92679dbe0c613e62b6c27235dcfbe9640ad1 ] If nf_conntrack_htable_size was adjusted by the user during the ct dump operation, we may invoke nf_ct_put twice for the same ct, i.e. the "last" ct. This will cause the ct will be freed but still linked in hash buckets. It's very easy to reproduce the problem by the following commands: # while : ; do echo $RANDOM > /proc/sys/net/netfilter/nf_conntrack_buckets done # while : ; do conntrack -L done # iperf -s 127.0.0.1 & # iperf -c 127.0.0.1 -P 60 -t 36000 After a while, the system will hang like this: NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [bash:20184] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [iperf:20382] ... So at last if we find cb->args[1] is equal to "last", this means hash resize happened, then we can set cb->args[1] to 0 to fix the above issue. Fixes: d205dc40798d ("[NETFILTER]: ctnetlink: fix deadlock in table dumping") Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13libceph: NULL deref on crush_decode() error pathDan Carpenter
[ Upstream commit 293dffaad8d500e1a5336eeb90d544cf40d4fbd8 ] If there is not enough space then ceph_decode_32_safe() does a goto bad. We need to return an error code in that situation. The current code returns ERR_PTR(0) which is NULL. The callers are not expecting that and it results in a NULL dereference. Fixes: f24e9980eb86 ("ceph: OSD client") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13net: ieee802154: fix net_device reference release too earlyLin Zhang
[ Upstream commit a611c58b3d42a92e6b23423e166dd17c0c7fffce ] This patch fixes the kernel oops when release net_device reference in advance. In function raw_sendmsg(i think the dgram_sendmsg has the same problem), there is a race condition between dev_put and dev_queue_xmit when the device is gong that maybe lead to dev_queue_ximt to see an illegal net_device pointer. My test kernel is 3.13.0-32 and because i am not have a real 802154 device, so i change lowpan_newlink function to this: /* find and hold real wpan device */ real_dev = dev_get_by_index(src_net, nla_get_u32(tb[IFLA_LINK])); if (!real_dev) return -ENODEV; // if (real_dev->type != ARPHRD_IEEE802154) { // dev_put(real_dev); // return -EINVAL; // } lowpan_dev_info(dev)->real_dev = real_dev; lowpan_dev_info(dev)->fragment_tag = 0; mutex_init(&lowpan_dev_info(dev)->dev_list_mtx); Also, in order to simulate preempt, i change the raw_sendmsg function to this: skb->dev = dev; skb->sk = sk; skb->protocol = htons(ETH_P_IEEE802154); dev_put(dev); //simulate preempt schedule_timeout_uninterruptible(30 * HZ); err = dev_queue_xmit(skb); if (err > 0) err = net_xmit_errno(err); and this is my userspace test code named test_send_data: int main(int argc, char **argv) { char buf[127]; int sockfd; sockfd = socket(AF_IEEE802154, SOCK_RAW, 0); if (sockfd < 0) { printf("create sockfd error: %s\n", strerror(errno)); return -1; } send(sockfd, buf, sizeof(buf), 0); return 0; } This is my test case: root@zhanglin-x-computer:~/develop/802154# uname -a Linux zhanglin-x-computer 3.13.0-32-generic #57-Ubuntu SMP Tue Jul 15 03:51:08 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux root@zhanglin-x-computer:~/develop/802154# ip link add link eth0 name lowpan0 type lowpan root@zhanglin-x-computer:~/develop/802154# //keep the lowpan0 device down root@zhanglin-x-computer:~/develop/802154# ./test_send_data & //wait a while root@zhanglin-x-computer:~/develop/802154# ip link del link dev lowpan0 //the device is gone //oops [381.303307] general protection fault: 0000 [#1]SMP [381.303407] Modules linked in: af_802154 6lowpan bnep rfcomm bluetooth nls_iso8859_1 snd_hda_codec_hdmi snd_hda_codec_realtek rts5139(C) snd_hda_intel snd_had_codec snd_hwdep snd_pcm snd_page_alloc snd_seq_midi snd_seq_midi_event snd_rawmidi snd_req intel_rapl snd_seq_device coretemp i915 kvm_intel kvm snd_timer snd crct10dif_pclmul crc32_pclmul ghash_clmulni_intel cypted drm_kms_helper drm i2c_algo_bit soundcore video mac_hid parport_pc ppdev ip parport hid_generic usbhid hid ahci r8169 mii libahdi [381.304286] CPU:1 PID: 2524 Commm: 1 Tainted: G C 0 3.13.0-32-generic [381.304409] Hardware name: Haier Haier DT Computer/Haier DT Codputer, BIOS FIBT19H02_X64 06/09/2014 [381.304546] tasks: ffff000096965fc0 ti: ffffB0013779c000 task.ti: ffffB8013779c000 [381.304659] RIP: 0010:[<ffffffff01621fe1>] [<ffffffff81621fe1>] __dev_queue_ximt+0x61/0x500 [381.304798] RSP: 0018:ffffB8013779dca0 EFLAGS: 00010202 [381.304880] RAX: 272b031d57565351 RBX: 0000000000000000 RCX: ffff8800968f1a00 [381.304987] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8800968f1a00 [381.305095] RBP: ffff8e013773dce0 R08: 0000000000000266 R09: 0000000000000004 [381.305202] R10: 0000000000000004 R11: 0000000000000005 R12: ffff88013902e000 [381.305310] R13: 000000000000007f R14: 000000000000007f R15: ffff8800968f1a00 [381.305418] FS: 00007fc57f50f740(0000) GS: ffff88013fc80000(0000) knlGS: 0000000000000000 [381.305540] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [381.305627] CR2: 00007fad0841c000 CR3: 00000001368dd000 CR4: 00000000001007e0 [361.905734] Stack: [381.305768] 00000000002052d0 000000003facb30a ffff88013779dcc0 ffff880137764000 [381.305898] ffff88013779de70 000000000000007f 000000000000007f ffff88013902e000 [381.306026] ffff88013779dcf0 ffffffff81622490 ffff88013779dd39 ffffffffa03af9f1 [381.306155] Call Trace: [381.306202] [<ffffffff81622490>] dev_queue_xmit+0x10/0x20 [381.306294] [<ffffffffa03af9f1>] raw_sendmsg+0x1b1/0x270 [af_802154] [381.306396] [<ffffffffa03af054>] ieee802154_sock_sendmsg+0x14/0x20 [af_802154] [381.306512] [<ffffffff816079eb>] sock_sendmsg+0x8b/0xc0 [381.306600] [<ffffffff811d52a5>] ? __d_alloc+0x25/0x180 [381.306687] [<ffffffff811a1f56>] ? kmem_cache_alloc_trace+0x1c6/0x1f0 [381.306791] [<ffffffff81607b91>] SYSC_sendto+0x121/0x1c0 [381.306878] [<ffffffff8109ddf4>] ? vtime_account_user+x54/0x60 [381.306975] [<ffffffff81020d45>] ? syscall_trace_enter+0x145/0x250 [381.307073] [<ffffffff816086ae>] SyS_sendto+0xe/0x10 [381.307156] [<ffffffff8172c87f>] tracesys+0xe1/0xe6 [381.307233] Code: c6 a1 a4 ff 41 8b 57 78 49 8b 47 20 85 d2 48 8b 80 78 07 00 00 75 21 49 8b 57 18 48 85 d2 74 18 48 85 c0 74 13 8b 92 ac 01 00 00 <3b> 50 10 73 08 8b 44 90 14 41 89 47 78 41 f6 84 24 d5 00 00 00 [381.307801] RIP [<ffffffff81621fe1>] _dev_queue_xmit+0x61/0x500 [381.307901] RSP <ffff88013779dca0> [381.347512] Kernel panic - not syncing: Fatal exception in interrupt [381.347747] drm_kms_helper: panic occurred, switching back to text console In my opinion, there is always exist a chance that the device is gong before call dev_queue_xmit. I think the latest kernel is have the same problem and that dev_put should be behind of the dev_queue_xmit. Signed-off-by: Lin Zhang <xiaolou4617@gmail.com> Acked-by: Stefan Schmidt <stefan@osg.samsung.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13xfrm: fix state migration copy replay sequence numbersAntony Antony
[ Upstream commit a486cd23661c9387fb076c3f6ae8b2aa9d20d54a ] During xfrm migration copy replay and preplay sequence numbers from the previous state. Here is a tcpdump output showing the problem. 10.0.10.46 is running vanilla kernel, is the IKE/IPsec responder. After the migration it sent wrong sequence number, reset to 1. The migration is from 10.0.0.52 to 10.0.0.53. IP 10.0.0.52.4500 > 10.0.10.46.4500: UDP-encap: ESP(spi=0x43ef462d,seq=0x7cf), length 136 IP 10.0.10.46.4500 > 10.0.0.52.4500: UDP-encap: ESP(spi=0xca1c282d,seq=0x7cf), length 136 IP 10.0.0.52.4500 > 10.0.10.46.4500: UDP-encap: ESP(spi=0x43ef462d,seq=0x7d0), length 136 IP 10.0.10.46.4500 > 10.0.0.52.4500: UDP-encap: ESP(spi=0xca1c282d,seq=0x7d0), length 136 IP 10.0.0.53.4500 > 10.0.10.46.4500: NONESP-encap: isakmp: child_sa inf2[I] IP 10.0.10.46.4500 > 10.0.0.53.4500: NONESP-encap: isakmp: child_sa inf2[R] IP 10.0.0.53.4500 > 10.0.10.46.4500: NONESP-encap: isakmp: child_sa inf2[I] IP 10.0.10.46.4500 > 10.0.0.53.4500: NONESP-encap: isakmp: child_sa inf2[R] IP 10.0.0.53.4500 > 10.0.10.46.4500: UDP-encap: ESP(spi=0x43ef462d,seq=0x7d1), length 136 NOTE: next sequence is wrong 0x1 IP 10.0.10.46.4500 > 10.0.0.53.4500: UDP-encap: ESP(spi=0xca1c282d,seq=0x1), length 136 IP 10.0.0.53.4500 > 10.0.10.46.4500: UDP-encap: ESP(spi=0x43ef462d,seq=0x7d2), length 136 IP 10.0.10.46.4500 > 10.0.0.53.4500: UDP-encap: ESP(spi=0xca1c282d,seq=0x2), length 136 Signed-off-by: Antony Antony <antony@phenome.org> Reviewed-by: Richard Guy Briggs <rgb@tricolour.ca> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13net: x25: fix one potential use-after-free issuelinzhang
[ Upstream commit 64df6d525fcff1630098db9238bfd2b3e092d5c1 ] The function x25_init is not properly unregister related resources on error handler.It is will result in kernel oops if x25_init init failed, so add properly unregister call on error handler. Also, i adjust the coding style and make x25_register_sysctl properly return failure. Signed-off-by: linzhang <xiaolou4617@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13arp: honour gratuitous ARP _replies_Ihar Hrachyshka
[ Upstream commit 23d268eb240954e6e78f7cfab04f2b1e79f84489 ] When arp_accept is 1, gratuitous ARPs are supposed to override matching entries irrespective of whether they arrive during locktime. This was implemented in commit 56022a8fdd87 ("ipv4: arp: update neighbour address when a gratuitous arp is received and arp_accept is set") There is a glitch in the patch though. RFC 2002, section 4.6, "ARP, Proxy ARP, and Gratuitous ARP", defines gratuitous ARPs so that they can be either of Request or Reply type. Those Reply gratuitous ARPs can be triggered with standard tooling, for example, arping -A option does just that. This patch fixes the glitch, making both Request and Reply flavours of gratuitous ARPs to behave identically. As per RFC, if gratuitous ARPs are of Reply type, their Target Hardware Address field should also be set to the link-layer address to which this cache entry should be updated. The field is present in ARP over Ethernet but not in IEEE 1394. In this patch, I don't consider any broadcasted ARP replies as gratuitous if the field is not present, to conform the standard. It's not clear whether there is such a thing for IEEE 1394 as a gratuitous ARP reply; until it's cleared up, we will ignore such broadcasts. Note that they will still update existing ARP cache entries, assuming they arrive out of locktime time interval. Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13neighbour: update neigh timestamps iff update is effectiveIhar Hrachyshka
[ Upstream commit 77d7123342dcf6442341b67816321d71da8b2b16 ] It's a common practice to send gratuitous ARPs after moving an IP address to another device to speed up healing of a service. To fulfill service availability constraints, the timing of network peers updating their caches to point to a new location of an IP address can be particularly important. Sometimes neigh_update calls won't touch neither lladdr nor state, for example if an update arrives in locktime interval. The neigh->updated value is tested by the protocol specific neigh code, which in turn will influence whether NEIGH_UPDATE_F_OVERRIDE gets set in the call to neigh_update() or not. As a result, we may effectively ignore the update request, bailing out of touching the neigh entry, except that we still bump its timestamps inside neigh_update. This may be a problem for updates arriving in quick succession. For example, consider the following scenario: A service is moved to another device with its IP address. The new device sends three gratuitous ARP requests into the network with ~1 seconds interval between them. Just before the first request arrives to one of network peer nodes, its neigh entry for the IP address transitions from STALE to DELAY. This transition, among other things, updates neigh->updated. Once the kernel receives the first gratuitous ARP, it ignores it because its arrival time is inside the locktime interval. The kernel still bumps neigh->updated. Then the second gratuitous ARP request arrives, and it's also ignored because it's still in the (new) locktime interval. Same happens for the third request. The node eventually heals itself (after delay_first_probe_time seconds since the initial transition to DELAY state), but it just wasted some time and require a new ARP request/reply round trip. This unfortunate behaviour both puts more load on the network, as well as reduces service availability. This patch changes neigh_update so that it bumps neigh->updated (as well as neigh->confirmed) only once we are sure that either lladdr or entry state will change). In the scenario described above, it means that the second gratuitous ARP request will actually update the entry lladdr. Ideally, we would update the neigh entry on the very first gratuitous ARP request. The locktime mechanism is designed to ignore ARP updates in a short timeframe after a previous ARP update was honoured by the kernel layer. This would require tracking timestamps for state transitions separately from timestamps when actual updates are received. This would probably involve changes in neighbour struct. Therefore, the patch doesn't tackle the issue of the first gratuitous APR ignored, leaving it for a follow-up. Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13ipv6: avoid dad-failures for addresses with NODADMahesh Bandewar
[ Upstream commit 66eb9f86e50547ec2a8ff7a75997066a74ef584b ] Every address gets added with TENTATIVE flag even for the addresses with IFA_F_NODAD flag and dad-work is scheduled for them. During this DAD process we realize it's an address with NODAD and complete the process without sending any probe. However the TENTATIVE flags stays on the address for sometime enough to cause misinterpretation when we receive a NS. While processing NS, if the address has TENTATIVE flag, we mark it DADFAILED and endup with an address that was originally configured as NODAD with DADFAILED. We can't avoid scheduling dad_work for addresses with NODAD but we can avoid adding TENTATIVE flag to avoid this racy situation. Signed-off-by: Mahesh Bandewar <maheshb@google.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13mac80211: bail out from prep_connection() if a reconfig is ongoingLuca Coelho
[ Upstream commit f8860ce836f2d502b07ef99559707fe55d90f5bc ] If ieee80211_hw_restart() is called during authentication, the authentication process will continue, causing the driver to be called in a wrong state. This ultimately causes an oops in the iwlwifi driver (at least). This fixes bugzilla 195299 partly. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195299 Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13af_key: Fix slab-out-of-bounds in pfkey_compile_policy.Steffen Klassert
[ Upstream commit d90c902449a7561f1b1d58ba5a0d11728ce8b0b2 ] The sadb_x_sec_len is stored in the unit 'byte divided by eight'. So we have to multiply this value by eight before we can do size checks. Otherwise we may get a slab-out-of-bounds when we memcpy the user sec_ctx. Fixes: df71837d502 ("[LSM-IPSec]: Security association restriction.") Reported-by: Andrey Konovalov <andreyknvl@google.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-08Revert "ip6_vti: adjust vti mtu according to mtu of lower device"Greg Kroah-Hartman
This reverts commit 2fe832c678189d6b19b5ff282e7e70df79c1406b which is commit 53c81e95df1793933f87748d36070a721f6cb287 upstream. Ben writes that there are a number of follow-on patches needed to fix this up, but they get complex to backport, and some custom fixes are needed, so let's just revert this and wait for a "real" set of patches to resolve this to be submitted if it is really needed. Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Cc: Petr Vorel <pvorel@suse.cz> Cc: Alexey Kodanev <alexey.kodanev@oracle.com> Cc: David S. Miller <davem@davemloft.net> Cc: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-08Bluetooth: Fix missing encryption refresh on Security RequestSzymon Janc
commit 64e759f58f128730b97a3c3a26d283c075ad7c86 upstream. If Security Request is received on connection that is already encrypted with sufficient security master should perform encryption key refresh procedure instead of just ignoring Slave Security Request (Core Spec 5.0 Vol 3 Part H 2.4.6). > ACL Data RX: Handle 3585 flags 0x02 dlen 6 SMP: Security Request (0x0b) len 1 Authentication requirement: Bonding, No MITM, SC, No Keypresses (0x09) < HCI Command: LE Start Encryption (0x08|0x0019) plen 28 Handle: 3585 Random number: 0x0000000000000000 Encrypted diversifier: 0x0000 Long term key: 44264272a5c426a9e868f034cf0e69f3 > HCI Event: Command Status (0x0f) plen 4 LE Start Encryption (0x08|0x0019) ncmd 1 Status: Success (0x00) > HCI Event: Encryption Key Refresh Complete (0x30) plen 3 Status: Success (0x00) Handle: 3585 Signed-off-by: Szymon Janc <szymon.janc@codecoup.pl> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-08netfilter: x_tables: add and use xt_check_proc_nameFlorian Westphal
commit b1d0a5d0cba4597c0394997b2d5fced3e3841b4e upstream. recent and hashlimit both create /proc files, but only check that name is 0 terminated. This can trigger WARN() from procfs when name is "" or "/". Add helper for this and then use it for both. Cc: Eric Dumazet <eric.dumazet@gmail.com> Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Reported-by: <syzbot+0502b00edac2a0680b61@syzkaller.appspotmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-08netfilter: bridge: ebt_among: add more missing match size checksFlorian Westphal
commit c8d70a700a5b486bfa8e5a7d33d805389f6e59f9 upstream. ebt_among is special, it has a dynamic match size and is exempt from the central size checks. commit c4585a2823edf ("bridge: ebt_among: add missing match size checks") added validation for pool size, but missed fact that the macros ebt_among_wh_src/dst can already return out-of-bound result because they do not check value of wh_src/dst_ofs (an offset) vs. the size of the match that userspace gave to us. v2: check that offset has correct alignment. Paolo Abeni points out that we should also check that src/dst wormhash arrays do not overlap, and src + length lines up with start of dst (or vice versa). v3: compact wormhash_sizes_valid() part NB: Fixes tag is intentionally wrong, this bug exists from day one when match was added for 2.6 kernel. Tag is there so stable maintainers will notice this one too. Tested with same rules from the earlier patch. Fixes: c4585a2823edf ("bridge: ebt_among: add missing match size checks") Reported-by: <syzbot+bdabab6f1983a03fc009@syzkaller.appspotmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-08xfrm: Refuse to insert 32 bit userspace socket policies on 64 bit systemsSteffen Klassert
commit 19d7df69fdb2636856dc8919de72fc1bf8f79598 upstream. We don't have a compat layer for xfrm, so userspace and kernel structures have different sizes in this case. This results in a broken configuration, so refuse to configure socket policies when trying to insert from 32 bit userspace as we do it already with policies inserted via netlink. Reported-and-tested-by: syzbot+e1a1577ca8bcb47b769a@syzkaller.appspotmail.com Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> [use is_compat_task() - gregkh] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-08net: xfrm: use preempt-safe this_cpu_read() in ipcomp_alloc_tfms()Greg Hackmann
commit 0dcd7876029b58770f769cbb7b484e88e4a305e5 upstream. f7c83bcbfaf5 ("net: xfrm: use __this_cpu_read per-cpu helper") added a __this_cpu_read() call inside ipcomp_alloc_tfms(). At the time, __this_cpu_read() required the caller to either not care about races or to handle preemption/interrupt issues. 3.15 tightened the rules around some per-cpu operations, and now __this_cpu_read() should never be used in a preemptible context. On 3.15 and later, we need to use this_cpu_read() instead. syzkaller reported this leading to the following kernel BUG while fuzzing sendmsg: BUG: using __this_cpu_read() in preemptible [00000000] code: repro/3101 caller is ipcomp_init_state+0x185/0x990 CPU: 3 PID: 3101 Comm: repro Not tainted 4.16.0-rc4-00123-g86f84779d8e9 #154 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 Call Trace: dump_stack+0xb9/0x115 check_preemption_disabled+0x1cb/0x1f0 ipcomp_init_state+0x185/0x990 ? __xfrm_init_state+0x876/0xc20 ? lock_downgrade+0x5e0/0x5e0 ipcomp4_init_state+0xaa/0x7c0 __xfrm_init_state+0x3eb/0xc20 xfrm_init_state+0x19/0x60 pfkey_add+0x20df/0x36f0 ? pfkey_broadcast+0x3dd/0x600 ? pfkey_sock_destruct+0x340/0x340 ? pfkey_seq_stop+0x80/0x80 ? __skb_clone+0x236/0x750 ? kmem_cache_alloc+0x1f6/0x260 ? pfkey_sock_destruct+0x340/0x340 ? pfkey_process+0x62a/0x6f0 pfkey_process+0x62a/0x6f0 ? pfkey_send_new_mapping+0x11c0/0x11c0 ? mutex_lock_io_nested+0x1390/0x1390 pfkey_sendmsg+0x383/0x750 ? dump_sp+0x430/0x430 sock_sendmsg+0xc0/0x100 ___sys_sendmsg+0x6c8/0x8b0 ? copy_msghdr_from_user+0x3b0/0x3b0 ? pagevec_lru_move_fn+0x144/0x1f0 ? find_held_lock+0x32/0x1c0 ? do_huge_pmd_anonymous_page+0xc43/0x11e0 ? lock_downgrade+0x5e0/0x5e0 ? get_kernel_page+0xb0/0xb0 ? _raw_spin_unlock+0x29/0x40 ? do_huge_pmd_anonymous_page+0x400/0x11e0 ? __handle_mm_fault+0x553/0x2460 ? __fget_light+0x163/0x1f0 ? __sys_sendmsg+0xc7/0x170 __sys_sendmsg+0xc7/0x170 ? SyS_shutdown+0x1a0/0x1a0 ? __do_page_fault+0x5a0/0xca0 ? lock_downgrade+0x5e0/0x5e0 SyS_sendmsg+0x27/0x40 ? __sys_sendmsg+0x170/0x170 do_syscall_64+0x19f/0x640 entry_SYSCALL_64_after_hwframe+0x42/0xb7 RIP: 0033:0x7f0ee73dfb79 RSP: 002b:00007ffe14fc15a8 EFLAGS: 00000207 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f0ee73dfb79 RDX: 0000000000000000 RSI: 00000000208befc8 RDI: 0000000000000004 RBP: 00007ffe14fc15b0 R08: 00007ffe14fc15c0 R09: 00007ffe14fc15c0 R10: 0000000000000000 R11: 0000000000000207 R12: 0000000000400440 R13: 00007ffe14fc16b0 R14: 0000000000000000 R15: 0000000000000000 Signed-off-by: Greg Hackmann <ghackmann@google.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-08xfrm_user: uncoditionally validate esn replay attribute structFlorian Westphal
commit d97ca5d714a5334aecadadf696875da40f1fbf3e upstream. The sanity test added in ecd7918745234 can be bypassed, validation only occurs if XFRM_STATE_ESN flag is set, but rest of code doesn't care and just checks if the attribute itself is present. So always validate. Alternative is to reject if we have the attribute without the flag but that would change abi. Reported-by: syzbot+0ab777c27d2bb7588f73@syzkaller.appspotmail.com Cc: Mathias Krause <minipli@googlemail.com> Fixes: ecd7918745234 ("xfrm_user: ensure user supplied esn replay window is valid") Fixes: d8647b79c3b7e ("xfrm: Add user interface for esn and big anti-replay windows") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-08netfilter: ctnetlink: Make some parameters integer to avoid enum mismatchMatthias Kaehlcke
commit a2b7cbdd2559aff06cebc28a7150f81c307a90d3 upstream. Not all parameters passed to ctnetlink_parse_tuple() and ctnetlink_exp_dump_tuple() match the enum type in the signatures of these functions. Since this is intended change the argument type of to be an unsigned integer value. Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> [natechancellor: ctnetlink_exp_dump_tuple is still inline] Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-08netfilter: nf_nat_h323: fix logical-not-parentheses warningNick Desaulniers
commit eee6ebbac18a189ef33d25ea9b8bcae176515e49 upstream. Clang produces the following warning: net/ipv4/netfilter/nf_nat_h323.c:553:6: error: logical not is only applied to the left hand side of this comparison [-Werror,-Wlogical-not-parentheses] if (!set_h225_addr(skb, protoff, data, dataoff, taddr, ^ add parentheses after the '!' to evaluate the comparison first add parentheses around left hand side expression to silence this warning There's not necessarily a bug here, but it's cleaner to return early, ex: if (x) return ... rather than: if (x == 0) ... else return Also added a return code check that seemed to be missing in one instance. Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>