summaryrefslogtreecommitdiff
path: root/net/l2tp/l2tp_core.c
AgeCommit message (Collapse)Author
2020-04-13l2tp: fix race between l2tp_session_delete() and l2tp_tunnel_closeall()Guillaume Nault
commit b228a94066406b6c456321d69643b0d7ce11cfa6 upstream. There are several ways to remove L2TP sessions: * deleting a session explicitly using the netlink interface (with L2TP_CMD_SESSION_DELETE), * deleting the session's parent tunnel (either by closing the tunnel's file descriptor or using the netlink interface), * closing the PPPOL2TP file descriptor of a PPP pseudo-wire. In some cases, when these methods are used concurrently on the same session, the session can be removed twice, leading to use-after-free bugs. This patch adds a 'dead' flag, used by l2tp_session_delete() and l2tp_tunnel_closeall() to prevent them from stepping on each other's toes. The session deletion path used when closing a PPPOL2TP file descriptor doesn't need to be adapted. It already has to ensure that a session remains valid for the lifetime of its PPPOL2TP file descriptor. So it takes an extra reference on the session in the ->session_close() callback (pppol2tp_session_close()), which is eventually dropped in the ->sk_destruct() callback of the PPPOL2TP socket (pppol2tp_session_destruct()). Still, __l2tp_session_unhash() and l2tp_session_queue_purge() can be called twice and even concurrently for a given session, but thanks to proper locking and re-initialisation of list fields, this is not an issue. Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-29l2tp: Fix possible NULL pointer dereferenceYueHaibing
[ Upstream commit 638a3a1e349ddf5b82f222ff5cb3b4f266e7c278 ] BUG: unable to handle kernel NULL pointer dereference at 0000000000000128 PGD 0 P4D 0 Oops: 0000 [#1 CPU: 0 PID: 5697 Comm: modprobe Tainted: G W 5.1.0-rc7+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014 RIP: 0010:__lock_acquire+0x53/0x10b0 Code: 8b 1c 25 40 5e 01 00 4c 8b 6d 10 45 85 e4 0f 84 bd 06 00 00 44 8b 1d 7c d2 09 02 49 89 fe 41 89 d2 45 85 db 0f 84 47 02 00 00 <48> 81 3f a0 05 70 83 b8 00 00 00 00 44 0f 44 c0 83 fe 01 0f 86 3a RSP: 0018:ffffc90001c07a28 EFLAGS: 00010002 RAX: 0000000000000000 RBX: ffff88822f038440 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000128 RBP: ffffc90001c07a88 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000001 R13: 0000000000000000 R14: 0000000000000128 R15: 0000000000000000 FS: 00007fead0811540(0000) GS:ffff888237a00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000128 CR3: 00000002310da000 CR4: 00000000000006f0 Call Trace: ? __lock_acquire+0x24e/0x10b0 lock_acquire+0xdf/0x230 ? flush_workqueue+0x71/0x530 flush_workqueue+0x97/0x530 ? flush_workqueue+0x71/0x530 l2tp_exit_net+0x170/0x2b0 [l2tp_core ? l2tp_exit_net+0x93/0x2b0 [l2tp_core ops_exit_list.isra.6+0x36/0x60 unregister_pernet_operations+0xb8/0x110 unregister_pernet_device+0x25/0x40 l2tp_init+0x55/0x1000 [l2tp_core ? 0xffffffffa018d000 do_one_initcall+0x6c/0x3cc ? do_init_module+0x22/0x1f1 ? rcu_read_lock_sched_held+0x97/0xb0 ? kmem_cache_alloc_trace+0x325/0x3b0 do_init_module+0x5b/0x1f1 load_module+0x1db1/0x2690 ? m_show+0x1d0/0x1d0 __do_sys_finit_module+0xc5/0xd0 __x64_sys_finit_module+0x15/0x20 do_syscall_64+0x6b/0x1d0 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7fead031a839 Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1f f6 2c 00 f7 d8 64 89 01 48 RSP: 002b:00007ffe8d9acca8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 RAX: ffffffffffffffda RBX: 0000560078398b80 RCX: 00007fead031a839 RDX: 0000000000000000 RSI: 000056007659dc2e RDI: 0000000000000003 RBP: 000056007659dc2e R08: 0000000000000000 R09: 0000560078398b80 R10: 0000000000000003 R11: 0000000000000246 R12: 0000000000000000 R13: 00005600783a04a0 R14: 0000000000040000 R15: 0000560078398b80 Modules linked in: l2tp_core(+) e1000 ip_tables ipv6 [last unloaded: l2tp_core CR2: 0000000000000128 ---[ end trace 8322b2b8bf83f8e1 If alloc_workqueue fails in l2tp_init, l2tp_net_ops is unregistered on failure path. Then l2tp_exit_net is called which will flush NULL workqueue, this patch add a NULL check to fix it. Fixes: 67e04c29ec0d ("l2tp: unregister l2tp_net_ops on failure path") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-02-06l2tp: fix reading optional fields of L2TPv3Jacob Wen
[ Upstream commit 4522a70db7aa5e77526a4079628578599821b193 ] Use pskb_may_pull() to make sure the optional fields are in skb linear parts, so we can safely read them later. It's easy to reproduce the issue with a net driver that supports paged skb data. Just create a L2TPv3 over IP tunnel and then generates some network traffic. Once reproduced, rx err in /sys/kernel/debug/l2tp/tunnels will increase. Changes in v4: 1. s/l2tp_v3_pull_opt/l2tp_v3_ensure_opt_in_linear/ 2. s/tunnel->version != L2TP_HDR_VER_2/tunnel->version == L2TP_HDR_VER_3/ 3. Add 'Fixes' in commit messages. Changes in v3: 1. To keep consistency, move the code out of l2tp_recv_common. 2. Use "net" instead of "net-next", since this is a bug fix. Changes in v2: 1. Only fix L2TPv3 to make code simple. To fix both L2TPv3 and L2TPv2, we'd better refactor l2tp_recv_common. It's complicated to do so. 2. Reloading pointers after pskb_may_pull Fixes: f7faffa3ff8e ("l2tp: Add L2TPv3 protocol support") Fixes: 0d76751fad77 ("l2tp: Add L2TPv3 IP encapsulation (no UDP) support") Fixes: a32e0eec7042 ("l2tp: introduce L2TPv3 IP encapsulation support for IPv6") Signed-off-by: Jacob Wen <jian.w.wen@oracle.com> Acked-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-06l2tp: remove l2specific_len dependency in l2tp_coreLorenzo Bianconi
commit 62e7b6a57c7b9bf3c6fd99418eeec05b08a85c38 upstream. Remove l2specific_len dependency while building l2tpv3 header or parsing the received frame since default L2-Specific Sublayer is always four bytes long and we don't need to rely on a user supplied value. Moreover in l2tp netlink code there are no sanity checks to enforce the relation between l2specific_len and l2specific_type, so sending a malformed netlink message is possible to set l2specific_type to L2TP_L2SPECTYPE_DEFAULT (or even L2TP_L2SPECTYPE_NONE) and set l2specific_len to a value greater than 4 leaking memory on the wire and sending corrupted frames. Reviewed-by: Guillaume Nault <g.nault@alphalink.fr> Tested-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-06l2tp: copy 4 more bytes to linear part if necessaryJacob Wen
[ Upstream commit 91c524708de6207f59dd3512518d8a1c7b434ee3 ] The size of L2TPv2 header with all optional fields is 14 bytes. l2tp_udp_recv_core only moves 10 bytes to the linear part of a skb. This may lead to l2tp_recv_common read data outside of a skb. This patch make sure that there is at least 14 bytes in the linear part of a skb to meet the maximum need of l2tp_udp_recv_core and l2tp_recv_common. The minimum size of both PPP HDLC-like frame and Ethernet frame is larger than 14 bytes, so we are safe to do so. Also remove L2TP_HDR_SIZE_NOSEQ, it is unused now. Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts") Suggested-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Jacob Wen <jian.w.wen@oracle.com> Acked-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-10l2tp: remove configurable payload offsetJames Chapman
[ Upstream commit 900631ee6a2651dc4fbaecb8ef9fa5f1e3378853 ] If L2TP_ATTR_OFFSET is set to a non-zero value in L2TPv3 tunnels, it results in L2TPv3 packets being transmitted which might not be compliant with the L2TPv3 RFC. This patch has l2tp ignore the offset setting and send all packets with no offset. In more detail: L2TPv2 supports a variable offset from the L2TPv2 header to the payload. The offset value is indicated by an optional field in the L2TP header. Our L2TP implementation already detects the presence of the optional offset and skips that many bytes when handling data received packets. All transmitted packets are always transmitted with no offset. L2TPv3 has no optional offset field in the L2TPv3 packet header. Instead, L2TPv3 defines optional fields in a "Layer-2 Specific Sublayer". At the time when the original L2TP code was written, there was talk at IETF of offset being implemented in a new Layer-2 Specific Sublayer. A L2TP_ATTR_OFFSET netlink attribute was added so that this offset could be configured and the intention was to allow it to be also used to set the tx offset for L2TPv2. However, no L2TPv3 offset was ever specified and the L2TP_ATTR_OFFSET parameter was forgotten about. Setting L2TP_ATTR_OFFSET results in L2TPv3 packets being transmitted with the specified number of bytes padding between L2TPv3 header and payload. This is not compliant with L2TPv3 RFC3931. This change removes the configurable offset altogether while retaining L2TP_ATTR_OFFSET for backwards compatibility. Any L2TP_ATTR_OFFSET value is ignored. Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-08-22l2tp: use sk_dst_check() to avoid race on sk->sk_dst_cacheWei Wang
[ Upstream commit 6d37fa49da1e8db8fb1995be22ac837ca41ac8a8 ] In l2tp code, if it is a L2TP_UDP_ENCAP tunnel, tunnel->sk points to a UDP socket. User could call sendmsg() on both this tunnel and the UDP socket itself concurrently. As l2tp_xmit_skb() holds socket lock and call __sk_dst_check() to refresh sk->sk_dst_cache, while udpv6_sendmsg() is lockless and call sk_dst_check() to refresh sk->sk_dst_cache, there could be a race and cause the dst cache to be freed multiple times. So we fix l2tp side code to always call sk_dst_check() to garantee xchg() is called when refreshing sk->sk_dst_cache to avoid race conditions. Syzkaller reported stack trace: BUG: KASAN: use-after-free in atomic_read include/asm-generic/atomic-instrumented.h:21 [inline] BUG: KASAN: use-after-free in atomic_fetch_add_unless include/linux/atomic.h:575 [inline] BUG: KASAN: use-after-free in atomic_add_unless include/linux/atomic.h:597 [inline] BUG: KASAN: use-after-free in dst_hold_safe include/net/dst.h:308 [inline] BUG: KASAN: use-after-free in ip6_hold_safe+0xe6/0x670 net/ipv6/route.c:1029 Read of size 4 at addr ffff8801aea9a880 by task syz-executor129/4829 CPU: 0 PID: 4829 Comm: syz-executor129 Not tainted 4.18.0-rc7-next-20180802+ #30 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113 print_address_description+0x6c/0x20b mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report.cold.7+0x242/0x30d mm/kasan/report.c:412 check_memory_region_inline mm/kasan/kasan.c:260 [inline] check_memory_region+0x13e/0x1b0 mm/kasan/kasan.c:267 kasan_check_read+0x11/0x20 mm/kasan/kasan.c:272 atomic_read include/asm-generic/atomic-instrumented.h:21 [inline] atomic_fetch_add_unless include/linux/atomic.h:575 [inline] atomic_add_unless include/linux/atomic.h:597 [inline] dst_hold_safe include/net/dst.h:308 [inline] ip6_hold_safe+0xe6/0x670 net/ipv6/route.c:1029 rt6_get_pcpu_route net/ipv6/route.c:1249 [inline] ip6_pol_route+0x354/0xd20 net/ipv6/route.c:1922 ip6_pol_route_output+0x54/0x70 net/ipv6/route.c:2098 fib6_rule_lookup+0x283/0x890 net/ipv6/fib6_rules.c:122 ip6_route_output_flags+0x2c5/0x350 net/ipv6/route.c:2126 ip6_dst_lookup_tail+0x1278/0x1da0 net/ipv6/ip6_output.c:978 ip6_dst_lookup_flow+0xc8/0x270 net/ipv6/ip6_output.c:1079 ip6_sk_dst_lookup_flow+0x5ed/0xc50 net/ipv6/ip6_output.c:1117 udpv6_sendmsg+0x2163/0x36b0 net/ipv6/udp.c:1354 inet_sendmsg+0x1a1/0x690 net/ipv4/af_inet.c:798 sock_sendmsg_nosec net/socket.c:622 [inline] sock_sendmsg+0xd5/0x120 net/socket.c:632 ___sys_sendmsg+0x51d/0x930 net/socket.c:2115 __sys_sendmmsg+0x240/0x6f0 net/socket.c:2210 __do_sys_sendmmsg net/socket.c:2239 [inline] __se_sys_sendmmsg net/socket.c:2236 [inline] __x64_sys_sendmmsg+0x9d/0x100 net/socket.c:2236 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x446a29 Code: e8 ac b8 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f4de5532db8 EFLAGS: 00000246 ORIG_RAX: 0000000000000133 RAX: ffffffffffffffda RBX: 00000000006dcc38 RCX: 0000000000446a29 RDX: 00000000000000b8 RSI: 0000000020001b00 RDI: 0000000000000003 RBP: 00000000006dcc30 R08: 00007f4de5533700 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006dcc3c R13: 00007ffe2b830fdf R14: 00007f4de55339c0 R15: 0000000000000001 Fixes: 71b1391a4128 ("l2tp: ensure sk->dst is still valid") Reported-by: syzbot+05f840f3b04f211bad55@syzkaller.appspotmail.com Signed-off-by: Wei Wang <weiwan@google.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Cc: Guillaume Nault <g.nault@alphalink.fr> Cc: David Ahern <dsahern@gmail.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-31l2tp: do not accept arbitrary socketsEric Dumazet
[ Upstream commit 17cfe79a65f98abe535261856c5aef14f306dff7 ] syzkaller found an issue caused by lack of sufficient checks in l2tp_tunnel_create() RAW sockets can not be considered as UDP ones for instance. In another patch, we shall replace all pr_err() by less intrusive pr_debug() so that syzkaller can find other bugs faster. Acked-by: Guillaume Nault <g.nault@alphalink.fr> Acked-by: James Chapman <jchapman@katalix.com> ================================================================== BUG: KASAN: slab-out-of-bounds in setup_udp_tunnel_sock+0x3ee/0x5f0 net/ipv4/udp_tunnel.c:69 dst_release: dst:00000000d53d0d0f refcnt:-1 Write of size 1 at addr ffff8801d013b798 by task syz-executor3/6242 CPU: 1 PID: 6242 Comm: syz-executor3 Not tainted 4.16.0-rc2+ #253 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x194/0x24d lib/dump_stack.c:53 print_address_description+0x73/0x250 mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report+0x23b/0x360 mm/kasan/report.c:412 __asan_report_store1_noabort+0x17/0x20 mm/kasan/report.c:435 setup_udp_tunnel_sock+0x3ee/0x5f0 net/ipv4/udp_tunnel.c:69 l2tp_tunnel_create+0x1354/0x17f0 net/l2tp/l2tp_core.c:1596 pppol2tp_connect+0x14b1/0x1dd0 net/l2tp/l2tp_ppp.c:707 SYSC_connect+0x213/0x4a0 net/socket.c:1640 SyS_connect+0x24/0x30 net/socket.c:1621 do_syscall_64+0x280/0x940 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x42/0xb7 Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-20l2tp: cleanup l2tp_tunnel_delete callsJiri Slaby
[ Upstream commit 4dc12ffeaeac939097a3f55c881d3dc3523dff0c ] l2tp_tunnel_delete does not return anything since commit 62b982eeb458 ("l2tp: fix race condition in l2tp_tunnel_delete"). But call sites of l2tp_tunnel_delete still do casts to void to avoid unused return value warnings. Kill these now useless casts. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Sabrina Dubroca <sd@queasysnail.net> Cc: Guillaume Nault <g.nault@alphalink.fr> Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Acked-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-12l2tp: fix race condition in l2tp_tunnel_deleteSabrina Dubroca
[ Upstream commit 62b982eeb4589b2e6d7c01a90590e3a4c2b2ca19 ] If we try to delete the same tunnel twice, the first delete operation does a lookup (l2tp_tunnel_get), finds the tunnel, calls l2tp_tunnel_delete, which queues it for deletion by l2tp_tunnel_del_work. The second delete operation also finds the tunnel and calls l2tp_tunnel_delete. If the workqueue has already fired and started running l2tp_tunnel_del_work, then l2tp_tunnel_delete will queue the same tunnel a second time, and try to free the socket again. Add a dead flag to prevent firing the workqueue twice. Then we can remove the check of queue_work's result that was meant to prevent that race but doesn't. Reproducer: ip l2tp add tunnel tunnel_id 3000 peer_tunnel_id 4000 local 192.168.0.2 remote 192.168.0.1 encap udp udp_sport 5000 udp_dport 6000 ip l2tp add session name l2tp1 tunnel_id 3000 session_id 1000 peer_session_id 2000 ip link set l2tp1 up ip l2tp del tunnel tunnel_id 3000 ip l2tp del tunnel tunnel_id 3000 Fixes: f8ccac0e4493 ("l2tp: put tunnel socket release on a workqueue") Reported-by: Jianlin Shi <jishi@redhat.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-12l2tp: Avoid schedule while atomic in exit_netRidge Kennedy
[ Upstream commit 12d656af4e3d2781b9b9f52538593e1717e7c979 ] While destroying a network namespace that contains a L2TP tunnel a "BUG: scheduling while atomic" can be observed. Enabling lockdep shows that this is happening because l2tp_exit_net() is calling l2tp_tunnel_closeall() (via l2tp_tunnel_delete()) from within an RCU critical section. l2tp_exit_net() takes rcu_read_lock_bh() << list_for_each_entry_rcu() >> l2tp_tunnel_delete() l2tp_tunnel_closeall() __l2tp_session_unhash() synchronize_rcu() << Illegal inside RCU critical section >> BUG: sleeping function called from invalid context in_atomic(): 1, irqs_disabled(): 0, pid: 86, name: kworker/u16:2 INFO: lockdep is turned off. CPU: 2 PID: 86 Comm: kworker/u16:2 Tainted: G W O 4.4.6-at1 #2 Hardware name: Xen HVM domU, BIOS 4.6.1-xs125300 05/09/2016 Workqueue: netns cleanup_net 0000000000000000 ffff880202417b90 ffffffff812b0013 ffff880202410ac0 ffffffff81870de8 ffff880202417bb8 ffffffff8107aee8 ffffffff81870de8 0000000000000c51 0000000000000000 ffff880202417be0 ffffffff8107b024 Call Trace: [<ffffffff812b0013>] dump_stack+0x85/0xc2 [<ffffffff8107aee8>] ___might_sleep+0x148/0x240 [<ffffffff8107b024>] __might_sleep+0x44/0x80 [<ffffffff810b21bd>] synchronize_sched+0x2d/0xe0 [<ffffffff8109be6d>] ? trace_hardirqs_on+0xd/0x10 [<ffffffff8105c7bb>] ? __local_bh_enable_ip+0x6b/0xc0 [<ffffffff816a1b00>] ? _raw_spin_unlock_bh+0x30/0x40 [<ffffffff81667482>] __l2tp_session_unhash+0x172/0x220 [<ffffffff81667397>] ? __l2tp_session_unhash+0x87/0x220 [<ffffffff8166888b>] l2tp_tunnel_closeall+0x9b/0x140 [<ffffffff81668c74>] l2tp_tunnel_delete+0x14/0x60 [<ffffffff81668dd0>] l2tp_exit_net+0x110/0x270 [<ffffffff81668d5c>] ? l2tp_exit_net+0x9c/0x270 [<ffffffff815001c3>] ops_exit_list.isra.6+0x33/0x60 [<ffffffff81501166>] cleanup_net+0x1b6/0x280 ... This bug can easily be reproduced with a few steps: $ sudo unshare -n bash # Create a shell in a new namespace # ip link set lo up # ip addr add 127.0.0.1 dev lo # ip l2tp add tunnel remote 127.0.0.1 local 127.0.0.1 tunnel_id 1 \ peer_tunnel_id 1 udp_sport 50000 udp_dport 50000 # ip l2tp add session name foo tunnel_id 1 session_id 1 \ peer_session_id 1 # ip link set foo up # exit # Exit the shell, in turn exiting the namespace $ dmesg ... [942121.089216] BUG: scheduling while atomic: kworker/u16:3/13872/0x00000200 ... To fix this, move the call to l2tp_tunnel_closeall() out of the RCU critical section, and instead call it from l2tp_tunnel_del_work(), which is running from the l2tp_wq workqueue. Fixes: 2b551c6e7d5b ("l2tp: close sessions before initiating tunnel delete") Signed-off-by: Ridge Kennedy <ridge.kennedy@alliedtelesis.co.nz> Acked-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05l2tp: take a reference on sessions used in genetlink handlersGuillaume Nault
commit 2777e2ab5a9cf2b4524486c6db1517a6ded25261 upstream. Callers of l2tp_nl_session_find() need to hold a reference on the returned session since there's no guarantee that it isn't going to disappear from under them. Relying on the fact that no l2tp netlink message may be processed concurrently isn't enough: sessions can be deleted by other means (e.g. by closing the PPPOL2TP socket of a ppp pseudowire). l2tp_nl_cmd_session_delete() is a bit special: it runs a callback function that may require a previous call to session->ref(). In particular, for ppp pseudowires, the callback is l2tp_session_delete(), which then calls pppol2tp_session_close() and dereferences the PPPOL2TP socket. The socket might already be gone at the moment l2tp_session_delete() calls session->ref(), so we need to take a reference during the session lookup. So we need to pass the do_ref variable down to l2tp_session_get() and l2tp_session_get_by_ifname(). Since all callers have to be updated, l2tp_session_find_by_ifname() and l2tp_nl_session_find() are renamed to reflect their new behaviour. Fixes: 309795f4bec2 ("l2tp: Add netlink control API for L2TP") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05l2tp: fix duplicate session creationGuillaume Nault
commit dbdbc73b44782e22b3b4b6e8b51e7a3d245f3086 upstream. l2tp_session_create() relies on its caller for checking for duplicate sessions. This is racy since a session can be concurrently inserted after the caller's verification. Fix this by letting l2tp_session_create() verify sessions uniqueness upon insertion. Callers need to be adapted to check for l2tp_session_create()'s return code instead of calling l2tp_session_find(). pppol2tp_connect() is a bit special because it has to work on existing sessions (if they're not connected) or to create a new session if none is found. When acting on a preexisting session, a reference must be held or it could go away on us. So we have to use l2tp_session_get() instead of l2tp_session_find() and drop the reference before exiting. Fixes: d9e31d17ceba ("l2tp: Add L2TP ethernet pseudowire support") Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05l2tp: fix race in l2tp_recv_common()Guillaume Nault
commit 61b9a047729bb230978178bca6729689d0c50ca2 upstream. Taking a reference on sessions in l2tp_recv_common() is racy; this has to be done by the callers. To this end, a new function is required (l2tp_session_get()) to atomically lookup a session and take a reference on it. Callers then have to manually drop this reference. Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-03l2tp: take reference on sessions being dumpedGuillaume Nault
[ Upstream commit e08293a4ccbcc993ded0fdc46f1e57926b833d63 ] Take a reference on the sessions returned by l2tp_session_find_nth() (and rename it l2tp_session_get_nth() to reflect this change), so that caller is assured that the session isn't going to disappear while processing it. For procfs and debugfs handlers, the session is held in the .start() callback and dropped in .show(). Given that pppol2tp_seq_session_show() dereferences the associated PPPoL2TP socket and that l2tp_dfs_seq_session_show() might call pppol2tp_show(), we also need to call the session's .ref() callback to prevent the socket from going away from under us. Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts") Fixes: 0ad6614048cf ("l2tp: Add debugfs files for dumping l2tp debug info") Fixes: 309795f4bec2 ("l2tp: Add netlink control API for L2TP") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-02l2tp: fix use-after-free during module unloadSabrina Dubroca
Tunnel deletion is delayed by both a workqueue (l2tp_tunnel_delete -> wq -> l2tp_tunnel_del_work) and RCU (sk_destruct -> RCU -> l2tp_tunnel_destruct). By the time l2tp_tunnel_destruct() runs to destroy the tunnel and finish destroying the socket, the private data reserved via the net_generic mechanism has already been freed, but l2tp_tunnel_destruct() actually uses this data. Make sure tunnel deletion for the netns has completed before returning from l2tp_exit_net() by first flushing the tunnel removal workqueue, and then waiting for RCU callbacks to complete. Fixes: 167eb17e0b17 ("l2tp: create tunnel sockets in the right namespace") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-08l2tp: fix configuration passed to setup_udp_tunnel_sock()Guillaume Nault
Unused fields of udp_cfg must be all zeros. Otherwise setup_udp_tunnel_sock() fills ->gro_receive and ->gro_complete callbacks with garbage, eventually resulting in panic when used by udp_gro_receive(). [ 72.694123] BUG: unable to handle kernel paging request at ffff880033f87d78 [ 72.695518] IP: [<ffff880033f87d78>] 0xffff880033f87d78 [ 72.696530] PGD 26e2067 PUD 26e3067 PMD 342ed063 PTE 8000000033f87163 [ 72.696530] Oops: 0011 [#1] SMP KASAN [ 72.696530] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel pptp gre pppox ppp_generic slhc crc32c_intel ghash_clmulni_intel jitterentropy_rng sha256_generic hmac drbg ansi_cprng aesni_intel evdev aes_x86_64 ablk_helper cryptd lrw gf128mul glue_helper serio_raw acpi_cpufreq button proc\ essor ext4 crc16 jbd2 mbcache virtio_blk virtio_net virtio_pci virtio_ring virtio [ 72.696530] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.7.0-rc1 #1 [ 72.696530] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014 [ 72.696530] task: ffff880035b59700 ti: ffff880035b70000 task.ti: ffff880035b70000 [ 72.696530] RIP: 0010:[<ffff880033f87d78>] [<ffff880033f87d78>] 0xffff880033f87d78 [ 72.696530] RSP: 0018:ffff880035f87bc0 EFLAGS: 00010246 [ 72.696530] RAX: ffffed000698f996 RBX: ffff88003326b840 RCX: ffffffff814cc823 [ 72.696530] RDX: ffff88003326b840 RSI: ffff880033e48038 RDI: ffff880034c7c780 [ 72.696530] RBP: ffff880035f87c18 R08: 000000000000a506 R09: 0000000000000000 [ 72.696530] R10: ffff880035f87b38 R11: ffff880034b9344d R12: 00000000ebfea715 [ 72.696530] R13: 0000000000000000 R14: ffff880034c7c780 R15: 0000000000000000 [ 72.696530] FS: 0000000000000000(0000) GS:ffff880035f80000(0000) knlGS:0000000000000000 [ 72.696530] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 72.696530] CR2: ffff880033f87d78 CR3: 0000000033c98000 CR4: 00000000000406a0 [ 72.696530] Stack: [ 72.696530] ffffffff814cc834 ffff880034b93468 0000001481416818 ffff88003326b874 [ 72.696530] ffff880034c7ccb0 ffff880033e48038 ffff88003326b840 ffff880034b93462 [ 72.696530] ffff88003326b88a ffff88003326b88c ffff880034b93468 ffff880035f87c70 [ 72.696530] Call Trace: [ 72.696530] <IRQ> [ 72.696530] [<ffffffff814cc834>] ? udp_gro_receive+0x1c6/0x1f9 [ 72.696530] [<ffffffff814ccb1c>] udp4_gro_receive+0x2b5/0x310 [ 72.696530] [<ffffffff814d989b>] inet_gro_receive+0x4a3/0x4cd [ 72.696530] [<ffffffff81431b32>] dev_gro_receive+0x584/0x7a3 [ 72.696530] [<ffffffff810adf7a>] ? __lock_is_held+0x29/0x64 [ 72.696530] [<ffffffff814321f7>] napi_gro_receive+0x124/0x21d [ 72.696530] [<ffffffffa000b145>] virtnet_receive+0x8df/0x8f6 [virtio_net] [ 72.696530] [<ffffffffa000b27e>] virtnet_poll+0x1d/0x8d [virtio_net] [ 72.696530] [<ffffffff81431350>] net_rx_action+0x15b/0x3b9 [ 72.696530] [<ffffffff815893d6>] __do_softirq+0x216/0x546 [ 72.696530] [<ffffffff81062392>] irq_exit+0x49/0xb6 [ 72.696530] [<ffffffff81588e9a>] do_IRQ+0xe2/0xfa [ 72.696530] [<ffffffff81587a49>] common_interrupt+0x89/0x89 [ 72.696530] <EOI> [ 72.696530] [<ffffffff810b05df>] ? trace_hardirqs_on_caller+0x229/0x270 [ 72.696530] [<ffffffff8102b3c7>] ? default_idle+0x1c/0x2d [ 72.696530] [<ffffffff8102b3c5>] ? default_idle+0x1a/0x2d [ 72.696530] [<ffffffff8102bb8c>] arch_cpu_idle+0xa/0xc [ 72.696530] [<ffffffff810a6c39>] default_idle_call+0x1a/0x1c [ 72.696530] [<ffffffff810a6d96>] cpu_startup_entry+0x15b/0x20f [ 72.696530] [<ffffffff81039a81>] start_secondary+0x12c/0x133 [ 72.696530] Code: ff ff ff ff ff ff ff ff ff ff 7f ff ff ff ff ff ff ff 7f 00 7e f8 33 00 88 ff ff 6d 61 58 81 ff ff ff ff 5e de 0a 81 ff ff ff ff <00> 5c e2 34 00 88 ff ff 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 72.696530] RIP [<ffff880033f87d78>] 0xffff880033f87d78 [ 72.696530] RSP <ffff880035f87bc0> [ 72.696530] CR2: ffff880033f87d78 [ 72.696530] ---[ end trace ad7758b9a1dccf99 ]--- [ 72.696530] Kernel panic - not syncing: Fatal exception in interrupt [ 72.696530] Kernel Offset: disabled [ 72.696530] ---[ end Kernel panic - not syncing: Fatal exception in interrupt v2: use empty initialiser instead of "{ NULL }" to avoid relying on first field's type. Fixes: 38fd2af24fcf ("udp: Add socket based GRO and config") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-01net: l2tp: fix reversed udp6 checksum flagsWang Shanker
This patch fixes a bug which causes the behavior of whether to ignore udp6 checksum of udp6 encapsulated l2tp tunnel contrary to what userspace program requests. When the flag `L2TP_ATTR_UDP_ZERO_CSUM6_RX` is set by userspace, it is expected that udp6 checksums of received packets of the l2tp tunnel to create should be ignored. In `l2tp_netlink.c`: `l2tp_nl_cmd_tunnel_create()`, `cfg.udp6_zero_rx_checksums` is set according to the flag, and then passed to `l2tp_core.c`: `l2tp_tunnel_create()` and then `l2tp_tunnel_sock_create()`. In `l2tp_tunnel_sock_create()`, `udp_conf.use_udp6_rx_checksums` is set the same to `cfg.udp6_zero_rx_checksums`. However, if we want the checksum to be ignored, `udp_conf.use_udp6_rx_checksums` should be set to `false`, i.e. be set to the contrary. Similarly, the same should be done to `udp_conf.use_udp6_tx_checksums`. Signed-off-by: Miao Wang <shankerwangmiao@gmail.com> Acked-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-28l2tp: protect tunnel->del_work by ref_countAlexander Couzens
There is a small chance that tunnel_free() is called before tunnel->del_work scheduled resulting in a zero pointer dereference. Signed-off-by: Alexander Couzens <lynxis@fe80.eu> Acked-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-11net: Modify sk_alloc to not reference count the netns of kernel sockets.Eric W. Biederman
Now that sk_alloc knows when a kernel socket is being allocated modify it to not reference count the network namespace of kernel sockets. Keep track of if a socket needs reference counting by adding a flag to struct sock called sk_net_refcnt. Update all of the callers of sock_create_kern to stop using sk_change_net and sk_release_kernel as those hacks are no longer needed, to avoid reference counting a kernel socket. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-11net: Add a struct net parameter to sock_create_kernEric W. Biederman
This is long overdue, and is part of cleaning up how we allocate kernel sockets that don't reference count struct net. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-06l2tp: unregister l2tp_net_ops on failure pathWANG Cong
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-19l2tp: Refactor l2tp core driver to make use of the common UDP tunnel functionsAndy Zhou
Simplify l2tp implementation using common UDP tunnel APIs. Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05l2tp: fix missing line continuationAndy Zhou
This syntax error was covered by L2TP_REFCNT_DEBUG not being set by default. Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-01l2tp: Enable checksum unnecessary conversions for l2tp/UDP socketsTom Herbert
Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-14l2tp: Call udp_sock_createTom Herbert
In l2tp driver call common function udp_sock_create to create the listener UDP port. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-04l2tp: call udp{6}_set_csumTom Herbert
Call common functions to set checksum for UDP tunnel. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-23l2tp: Add support for zero IPv6 checksumsTom Herbert
Added new L2TP configuration options to allow TX and RX of zero checksums in IPv6. Default is not to use them. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-23net: Split sk_no_check into sk_no_check_{rx,tx}Tom Herbert
Define separate fields in the sock structure for configuring disabling checksums in both TX and RX-- sk_no_check_tx and sk_no_check_rx. The SO_NO_CHECK socket option only affects sk_no_check_tx. Also, removed UDP_CSUM_* defines since they are no longer necessary. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-12net: rename local_df to ignore_dfWANG Cong
As suggested by several people, rename local_df to ignore_df, since it means "ignore df bit if it is set". Cc: Maciej Żenczykowski <maze@google.com> Cc: Florian Westphal <fw@strlen.de> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-08l2tp: Remove UDP checksum verificationTom Herbert
Validating the UDP checksum is now done in UDP before handing packets to the encapsulation layer. Note that this also eliminates the "feature" where L2TP can ignore a non-zero UDP checksum (doing this was contrary to RFC 1122). Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-15ipv4: add a sock pointer to ip_queue_xmit()Eric Dumazet
ip_queue_xmit() assumes the skb it has to transmit is attached to an inet socket. Commit 31c70d5956fc ("l2tp: keep original skb ownership") changed l2tp to not change skb ownership and thus broke this assumption. One fix is to add a new 'struct sock *sk' parameter to ip_queue_xmit(), so that we do not assume skb->sk points to the socket used by l2tp tunnel. Fixes: 31c70d5956fc ("l2tp: keep original skb ownership") Reported-by: Zhan Jianyu <nasa4836@gmail.com> Tested-by: Zhan Jianyu <nasa4836@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
Pull networking updates from David Miller: "Here is my initial pull request for the networking subsystem during this merge window: 1) Support for ESN in AH (RFC 4302) from Fan Du. 2) Add full kernel doc for ethtool command structures, from Ben Hutchings. 3) Add BCM7xxx PHY driver, from Florian Fainelli. 4) Export computed TCP rate information in netlink socket dumps, from Eric Dumazet. 5) Allow IPSEC SA to be dumped partially using a filter, from Nicolas Dichtel. 6) Convert many drivers to pci_enable_msix_range(), from Alexander Gordeev. 7) Record SKB timestamps more efficiently, from Eric Dumazet. 8) Switch to microsecond resolution for TCP round trip times, also from Eric Dumazet. 9) Clean up and fix 6lowpan fragmentation handling by making use of the existing inet_frag api for it's implementation. 10) Add TX grant mapping to xen-netback driver, from Zoltan Kiss. 11) Auto size SKB lengths when composing netlink messages based upon past message sizes used, from Eric Dumazet. 12) qdisc dumps can take a long time, add a cond_resched(), From Eric Dumazet. 13) Sanitize netpoll core and drivers wrt. SKB handling semantics. Get rid of never-used-in-tree netpoll RX handling. From Eric W Biederman. 14) Support inter-address-family and namespace changing in VTI tunnel driver(s). From Steffen Klassert. 15) Add Altera TSE driver, from Vince Bridgers. 16) Optimizing csum_replace2() so that it doesn't adjust the checksum by checksumming the entire header, from Eric Dumazet. 17) Expand BPF internal implementation for faster interpreting, more direct translations into JIT'd code, and much cleaner uses of BPF filtering in non-socket ocntexts. From Daniel Borkmann and Alexei Starovoitov" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1976 commits) netpoll: Use skb_irq_freeable to make zap_completion_queue safe. net: Add a test to see if a skb is freeable in irq context qlcnic: Fix build failure due to undefined reference to `vxlan_get_rx_port' net: ptp: move PTP classifier in its own file net: sxgbe: make "core_ops" static net: sxgbe: fix logical vs bitwise operation net: sxgbe: sxgbe_mdio_register() frees the bus Call efx_set_channels() before efx->type->dimension_resources() xen-netback: disable rogue vif in kthread context net/mlx4: Set proper build dependancy with vxlan be2net: fix build dependency on VxLAN mac802154: make csma/cca parameters per-wpan mac802154: allow only one WPAN to be up at any given time net: filter: minor: fix kdoc in __sk_run_filter netlink: don't compare the nul-termination in nla_strcmp can: c_can: Avoid led toggling for every packet. can: c_can: Simplify TX interrupt cleanup can: c_can: Store dlc private can: c_can: Reduce register access can: c_can: Make the code readable ...
2014-03-31Merge branch 'for-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds
Pull workqueue changes from Tejun Heo: "PREPARE_[DELAYED_]WORK() were used to change the work function of work items without fully reinitializing it; however, this makes workqueue consider the work item as a different one from before and allows the work item to start executing before the previous instance is finished which can lead to extremely subtle issues which are painful to debug. The interface has never been popular. This pull request contains patches to remove existing usages and kill the interface. As one of the changes was routed during the last devel cycle and another depended on a pending change in nvme, for-3.15 contains a couple merge commits. In addition, interfaces which were deprecated quite a while ago - __cancel_delayed_work() and WQ_NON_REENTRANT - are removed too" * 'for-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: remove deprecated WQ_NON_REENTRANT workqueue: Spelling s/instensive/intensive/ workqueue: remove PREPARE_[DELAYED_]WORK() staging/fwserial: don't use PREPARE_WORK afs: don't use PREPARE_WORK nvme: don't use PREPARE_WORK usb: don't use PREPARE_DELAYED_WORK floppy: don't use PREPARE_[DELAYED_]WORK ps3-vuart: don't use PREPARE_WORK wireless/rt2x00: don't use PREPARE_WORK in rt2800usb.c workqueue: Remove deprecated __cancel_delayed_work()
2014-03-29workqueue: remove deprecated WQ_NON_REENTRANTZhangZhen
Tejun Heo has made WQ_NON_REENTRANT useless in the dbf2576e37 ("workqueue: make all workqueues non-reentrant"). So remove its usages and definition. This patch doesn't introduce any behavior changes. tj: minor description updates. Signed-off-by: ZhangZhen <zhenzhang.zhang@huawei.com> Sigend-off-by: Tejun Heo <tj@kernel.org> Acked-by: James Chapman <jchapman@katalix.com> Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
2014-03-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/net/usb/r8152.c drivers/net/xen-netback/netback.c Both the r8152 and netback conflicts were simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10l2tp: fix unused variable warningEric Dumazet
net/l2tp/l2tp_core.c:1111:15: warning: unused variable 'sk' [-Wunused-variable] Fixes: 31c70d5956fc ("l2tp: keep original skb ownership") Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-07l2tp: keep original skb ownershipEric Dumazet
There is no reason to orphan skb in l2tp. This breaks things like per socket memory limits, TCP Small queues... Fix this before more people copy/paste it. This is very similar to commit 8f646c922d550 ("vxlan: keep original skb ownership") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06l2tp: fix manual sequencing (de)activation in L2TPv2Guillaume Nault
Commit e0d4435f "l2tp: Update PPP-over-L2TP driver to work over L2TPv3" broke the PPPOL2TP_SO_SENDSEQ setsockopt. The L2TP header length was previously computed by pppol2tp_l2t_header_len() before each call to l2tp_xmit_skb(). Now that header length is retrieved from the hdr_len session field, this field must be updated every time the L2TP header format is modified, or l2tp_xmit_skb() won't push the right amount of data for the L2TP header. This patch uses l2tp_session_set_header_len() to adjust hdr_len every time sequencing is (de)activated from userspace (either by the PPPOL2TP_SO_SENDSEQ setsockopt or the L2TP_ATTR_SEND_SEQ netlink attribute). Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13net: remove unnecessary return'sstephen hemminger
One of my pet coding style peeves is the practice of adding extra return; at the end of function. Kill several instances of this in network code. I suppose some coccinelle wizardy could do this automatically. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-13l2tp: make local functions staticstephen hemminger
Avoid needless export of local functions Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-09ipv6: make lookups simpler and fasterEric Dumazet
TCP listener refactoring, part 4 : To speed up inet lookups, we moved IPv4 addresses from inet to struct sock_common Now is time to do the same for IPv6, because it permits us to have fast lookups for all kind of sockets, including upcoming SYN_RECV. Getting IPv6 addresses in TCP lookups currently requires two extra cache lines, plus a dereference (and memory stall). inet6_sk(sk) does the dereference of inet_sk(__sk)->pinet6 This patch is way bigger than its IPv4 counter part, because for IPv4, we could add aliases (inet_daddr, inet_rcv_saddr), while on IPv6, it's not doable easily. inet6_sk(sk)->daddr becomes sk->sk_v6_daddr inet6_sk(sk)->rcv_saddr becomes sk->sk_v6_rcv_saddr And timewait socket also have tw->tw_v6_daddr & tw->tw_v6_rcv_saddr at the same offset. We get rid of INET6_TW_MATCH() as INET6_MATCH() is now the generic macro. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-08l2tp: Fix build warning with ipv6 disabled.David S. Miller
net/l2tp/l2tp_core.c: In function ‘l2tp_verify_udp_checksum’: net/l2tp/l2tp_core.c:499:22: warning: unused variable ‘tunnel’ [-Wunused-variable] Create a helper "l2tp_tunnel()" to facilitate this, and as a side effect get rid of a bunch of unnecessary void pointer casts. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-02l2tp: fix kernel panic when using IPv4-mapped IPv6 addressesFrançois Cachereul
IPv4 mapped addresses cause kernel panic. The patch juste check whether the IPv6 address is an IPv4 mapped address. If so, use IPv4 API instead of IPv6. [ 940.026915] general protection fault: 0000 [#1] [ 940.026915] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core pppox ppp_generic slhc loop psmouse [ 940.026915] CPU: 0 PID: 3184 Comm: memcheck-amd64- Not tainted 3.11.0+ #1 [ 940.026915] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 [ 940.026915] task: ffff880007130e20 ti: ffff88000737e000 task.ti: ffff88000737e000 [ 940.026915] RIP: 0010:[<ffffffff81333780>] [<ffffffff81333780>] ip6_xmit+0x276/0x326 [ 940.026915] RSP: 0018:ffff88000737fd28 EFLAGS: 00010286 [ 940.026915] RAX: c748521a75ceff48 RBX: ffff880000c30800 RCX: 0000000000000000 [ 940.026915] RDX: ffff88000075cc4e RSI: 0000000000000028 RDI: ffff8800060e5a40 [ 940.026915] RBP: ffff8800060e5a40 R08: 0000000000000000 R09: ffff88000075cc90 [ 940.026915] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88000737fda0 [ 940.026915] R13: 0000000000000000 R14: 0000000000002000 R15: ffff880005d3b580 [ 940.026915] FS: 00007f163dc5e800(0000) GS:ffffffff81623000(0000) knlGS:0000000000000000 [ 940.026915] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 940.026915] CR2: 00000004032dc940 CR3: 0000000005c25000 CR4: 00000000000006f0 [ 940.026915] Stack: [ 940.026915] ffff88000075cc4e ffffffff81694e90 ffff880000c30b38 0000000000000020 [ 940.026915] 11000000523c4bac ffff88000737fdb4 0000000000000000 ffff880000c30800 [ 940.026915] ffff880005d3b580 ffff880000c30b38 ffff8800060e5a40 0000000000000020 [ 940.026915] Call Trace: [ 940.026915] [<ffffffff81356cc3>] ? inet6_csk_xmit+0xa4/0xc4 [ 940.026915] [<ffffffffa0038535>] ? l2tp_xmit_skb+0x503/0x55a [l2tp_core] [ 940.026915] [<ffffffff812b8d3b>] ? pskb_expand_head+0x161/0x214 [ 940.026915] [<ffffffffa003e91d>] ? pppol2tp_xmit+0xf2/0x143 [l2tp_ppp] [ 940.026915] [<ffffffffa00292e0>] ? ppp_channel_push+0x36/0x8b [ppp_generic] [ 940.026915] [<ffffffffa00293fe>] ? ppp_write+0xaf/0xc5 [ppp_generic] [ 940.026915] [<ffffffff8110ead4>] ? vfs_write+0xa2/0x106 [ 940.026915] [<ffffffff8110edd6>] ? SyS_write+0x56/0x8a [ 940.026915] [<ffffffff81378ac0>] ? system_call_fastpath+0x16/0x1b [ 940.026915] Code: 00 49 8b 8f d8 00 00 00 66 83 7c 11 02 00 74 60 49 8b 47 58 48 83 e0 fe 48 8b 80 18 01 00 00 48 85 c0 74 13 48 8b 80 78 02 00 00 <48> ff 40 28 41 8b 57 68 48 01 50 30 48 8b 54 24 08 49 c7 c1 51 [ 940.026915] RIP [<ffffffff81333780>] ip6_xmit+0x276/0x326 [ 940.026915] RSP <ffff88000737fd28> [ 940.057945] ---[ end trace be8aba9a61c8b7f3 ]--- [ 940.058583] Kernel panic - not syncing: Fatal exception in interrupt Signed-off-by: François CACHEREUL <f.cachereul@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-02l2tp: make datapath resilient to packet loss when sequence numbers enabledJames Chapman
If L2TP data sequence numbers are enabled and reordering is not enabled, data reception stops if a packet is lost since the kernel waits for a sequence number that is never resent. (When reordering is enabled, data reception restarts when the reorder timeout expires.) If no reorder timeout is set, we should count the number of in-sequence packets after the out-of-sequence (OOS) condition is detected, and reset sequence number state after a number of such packets are received. For now, the number of in-sequence packets while in OOS state which cause the sequence number state to be reset is hard-coded to 5. This could be configurable later. Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-02l2tp: make datapath sequence number support RFC-compliantJames Chapman
The L2TP datapath is not currently RFC-compliant when sequence numbers are used in L2TP data packets. According to the L2TP RFC, any received sequence number NR greater than or equal to the next expected NR is acceptable, where the "greater than or equal to" test is determined by the NR wrap point. This differs for L2TPv2 and L2TPv3, so add state in the session context to hold the max NR value and the NR window size in order to do the acceptable sequence number value check. These might be configurable later, but for now we derive it from the tunnel L2TP version, which determines the sequence number field size. Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-02l2tp: do data sequence number handling in a separate funcJames Chapman
This change moves some code handling data sequence numbers into a separate function to avoid too much indentation. This is to prepare for some changes to data sequence number handling in subsequent patches. Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-22l2tp: calling the ref() instead of deref()Dan Carpenter
This is a cut and paste typo. We call ->ref() a second time instead of ->deref(). Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20l2tp: unhash l2tp sessions on delete, not on freeTom Parkin
If we postpone unhashing of l2tp sessions until the structure is freed, we risk: 1. further packets arriving and getting queued while the pseudowire is being closed down 2. the recv path hitting "scheduling while atomic" errors in the case that recv drops the last reference to a session and calls l2tp_session_free while in atomic context As such, l2tp sessions should be unhashed from l2tp_core data structures early in the teardown process prior to calling pseudowire close. For pseudowires like l2tp_ppp which have multiple shutdown codepaths, provide an unhash hook. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20l2tp: avoid deadlock in l2tp stats updateTom Parkin
l2tp's u64_stats writers were incorrectly synchronised, making it possible to deadlock a 64bit machine running a 32bit kernel simply by sending the l2tp code netlink commands while passing data through l2tp sessions. Previous discussion on netdev determined that alternative solutions such as spinlock writer synchronisation or per-cpu data would bring unjustified overhead, given that most users interested in high volume traffic will likely be running 64bit kernels on 64bit hardware. As such, this patch replaces l2tp's use of u64_stats with atomic_long_t, thereby avoiding the deadlock. Ref: http://marc.info/?l=linux-netdev&m=134029167910731&w=2 http://marc.info/?l=linux-netdev&m=134079868111131&w=2 Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>