summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2008-04-18bluetooth: hci_core: defer hci_unregister_sysfs()Dave Young
upstream commit: 147e2d59833e994cc99341806a88b9e59be41391 Alon Bar-Lev reports: Feb 16 23:41:33 alon1 usb 3-1: configuration #1 chosen from 1 choice Feb 16 23:41:33 alon1 BUG: unable to handle kernel NULL pointer dereference at virtual address 00000008 Feb 16 23:41:33 alon1 printing eip: c01b2db6 *pde = 00000000 Feb 16 23:41:33 alon1 Oops: 0000 [#1] PREEMPT Feb 16 23:41:33 alon1 Modules linked in: ppp_deflate zlib_deflate zlib_inflate bsd_comp ppp_async rfcomm l2cap hci_usb vmnet(P) vmmon(P) tun radeon drm autofs4 ipv6 aes_generic crypto_algapi ieee80211_crypt_ccmp nf_nat_irc nf_nat_ftp nf_conntrack_irc nf_conntrack_ftp ipt_MASQUERADE iptable_nat nf_nat ipt_REJECT xt_tcpudp ipt_LOG xt_limit xt_state nf_conntrack_ipv4 nf_conntrack iptable_filter ip_tables x_tables snd_pcm_oss snd_mixer_oss snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device bluetooth ppp_generic slhc ioatdma dca cfq_iosched cpufreq_powersave cpufreq_ondemand cpufreq_conservative acpi_cpufreq freq_table uinput fan af_packet nls_cp1255 nls_iso8859_1 nls_utf8 nls_base pcmcia snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm nsc_ircc snd_timer ipw2200 thinkpad_acpi irda snd ehci_hcd yenta_socket uhci_hcd psmouse ieee80211 soundcore intel_agp hwmon rsrc_nonstatic pcspkr e1000 crc_ccitt snd_page_alloc i2c_i801 ieee80211_crypt pcmcia_core agpgart thermal bat! tery nvram rtc sr_mod ac sg firmware_class button processor cdrom unix usbcore evdev ext3 jbd ext2 mbcache loop ata_piix libata sd_mod scsi_mod Feb 16 23:41:33 alon1 Feb 16 23:41:33 alon1 Pid: 4, comm: events/0 Tainted: P (2.6.24-gentoo-r2 #1) Feb 16 23:41:33 alon1 EIP: 0060:[<c01b2db6>] EFLAGS: 00010282 CPU: 0 Feb 16 23:41:33 alon1 EIP is at sysfs_get_dentry+0x26/0x80 Feb 16 23:41:33 alon1 EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: f48a2210 Feb 16 23:41:33 alon1 ESI: f72eb900 EDI: f4803ae0 EBP: f4803ae0 ESP: f7c49efc Feb 16 23:41:33 alon1 hcid[7004]: HCI dev 0 registered Feb 16 23:41:33 alon1 DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 Feb 16 23:41:33 alon1 Process events/0 (pid: 4, ti=f7c48000 task=f7c3efc0 task.ti=f7c48000) Feb 16 23:41:33 alon1 Stack: f7cb6140 f4822668 f7e71e10 c01b304d ffffffff ffffffff fffffffe c030ba9c Feb 16 23:41:33 alon1 f7cb6140 f4822668 f6da6720 f7cb6140 f4822668 f6da6720 c030ba8e c01ce20b Feb 16 23:41:33 alon1 f6e9dd00 c030ba8e f6da6720 f6e9dd00 f6e9dd00 00000000 f4822600 00000000 Feb 16 23:41:33 alon1 Call Trace: Feb 16 23:41:33 alon1 [<c01b304d>] sysfs_move_dir+0x3d/0x1f0 Feb 16 23:41:33 alon1 [<c01ce20b>] kobject_move+0x9b/0x120 Feb 16 23:41:33 alon1 [<c0241711>] device_move+0x51/0x110 Feb 16 23:41:33 alon1 [<f9aaed80>] del_conn+0x0/0x70 [bluetooth] Feb 16 23:41:33 alon1 [<f9aaed99>] del_conn+0x19/0x70 [bluetooth] Feb 16 23:41:33 alon1 [<c012c1a1>] run_workqueue+0x81/0x140 Feb 16 23:41:33 alon1 [<c02c0c88>] schedule+0x168/0x2e0 Feb 16 23:41:33 alon1 [<c012fc70>] autoremove_wake_function+0x0/0x50 Feb 16 23:41:33 alon1 [<c012c9cb>] worker_thread+0x9b/0xf0 Feb 16 23:41:33 alon1 [<c012fc70>] autoremove_wake_function+0x0/0x50 Feb 16 23:41:33 alon1 [<c012c930>] worker_thread+0x0/0xf0 Feb 16 23:41:33 alon1 [<c012f962>] kthread+0x42/0x70 Feb 16 23:41:33 alon1 [<c012f920>] kthread+0x0/0x70 Feb 16 23:41:33 alon1 [<c0104c2f>] kernel_thread_helper+0x7/0x18 Feb 16 23:41:33 alon1 ======================= Feb 16 23:41:33 alon1 Code: 26 00 00 00 00 57 89 c7 a1 50 1b 3a c0 56 53 8b 70 38 85 f6 74 08 8b 0e 85 c9 74 58 ff 06 8b 56 50 39 fa 74 47 89 fb eb 02 89 c3 <8b> 43 08 39 c2 75 f7 8b 46 08 83 c0 68 e8 98 e7 10 00 8b 43 10 Feb 16 23:41:33 alon1 EIP: [<c01b2db6>] sysfs_get_dentry+0x26/0x80 SS:ESP 0068:f7c49efc Feb 16 23:41:33 alon1 ---[ end trace aae864e9592acc1d ]--- Defer hci_unregister_sysfs because hci device could be destructed while hci conn devices still there. Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Tested-by: Stefan Seyfried <seife@suse.de> Acked-by: Alon Bar-Lev <alon.barlev@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Marcel Holtmann <marcel@holtmann.org> dsd@gentoo.org notes: This patch fixes http://bugs.gentoo.org/211179 Cc: Daniel Drake <dsd@gentoo.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18VLAN: Don't copy ALLMULTI/PROMISC flags from underlying devicePatrick McHardy
Upstream commit: 0ed21b321a13421e2dfeaa70a6c324e05e3e91e6 Changing these flags requires to use dev_set_allmulti/dev_set_promiscuity or dev_change_flags. Setting it directly causes two unwanted effects: - the next dev_change_flags call will notice a difference between dev->gflags and the actual flags, enable promisc/allmulti mode and incorrectly update dev->gflags - this keeps the underlying device in promisc/allmulti mode until the VLAN device is deleted [ Ported back to 2.6.24 VLAN code. -DaveM ] Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18TCP: Let skbs grow over a page on fast peersHerbert Xu
Upstream commit: 69d1506731168d6845a76a303b2c45f7c05f3f2c While testing the virtio-net driver on KVM with TSO I noticed that TSO performance with a 1500 MTU is significantly worse compared to the performance of non-TSO with a 16436 MTU. The packet dump shows that most of the packets sent are smaller than a page. Looking at the code this actually is quite obvious as it always stop extending the packet if it's the first packet yet to be sent and if it's larger than the MSS. Since each extension is bound by the page size, this means that (given a 1500 MTU) we're very unlikely to construct packets greater than a page, provided that the receiver and the path is fast enough so that packets can always be sent immediately. The fix is also quite obvious. The push calls inside the loop is just an optimisation so that we don't end up doing all the sending at the end of the loop. Therefore there is no specific reason why it has to do so at MSS boundaries. For TSO, the most natural extension of this optimisation is to do the pushing once the skb exceeds the TSO size goal. This is what the patch does and testing with KVM shows that the TSO performance with a 1500 MTU easily surpasses that of a 16436 MTU and indeed the packet sizes sent are generally larger than 16436. I don't see any obvious downsides for slower peers or connections, but it would be prudent to test this extensively to ensure that those cases don't regress. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18TCP: Fix shrinking windows with window scalingPatrick McHardy
Upstream commit: 607bfbf2d55dd1cfe5368b41c2a81a8c9ccf4723 When selecting a new window, tcp_select_window() tries not to shrink the offered window by using the maximum of the remaining offered window size and the newly calculated window size. The newly calculated window size is always a multiple of the window scaling factor, the remaining window size however might not be since it depends on rcv_wup/rcv_nxt. This means we're effectively shrinking the window when scaling it down. The dump below shows the problem (scaling factor 2^7): - Window size of 557 (71296) is advertised, up to 3111907257: IP 172.2.2.3.33000 > 172.2.2.2.33000: . ack 3111835961 win 557 <...> - New window size of 514 (65792) is advertised, up to 3111907217, 40 bytes below the last end: IP 172.2.2.3.33000 > 172.2.2.2.33000: . 3113575668:3113577116(1448) ack 3111841425 win 514 <...> The number 40 results from downscaling the remaining window: 3111907257 - 3111841425 = 65832 65832 / 2^7 = 514 65832 % 2^7 = 40 If the sender uses up the entire window before it is shrunk, this can have chaotic effects on the connection. When sending ACKs, tcp_acceptable_seq() will notice that the window has been shrunk since tcp_wnd_end() is before tp->snd_nxt, which makes it choose tcp_wnd_end() as sequence number. This will fail the receivers checks in tcp_sequence() however since it is before it's tp->rcv_wup, making it respond with a dupack. If both sides are in this condition, this leads to a constant flood of ACKs until the connection times out. Make sure the window is never shrunk by aligning the remaining window to the window scaling factor. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18NET: Fix multicast device ioctl checksPatrick McHardy
Upstream commit: 61ee6bd487b9cc160e533034eb338f2085dc7922 SIOCADDMULTI/SIOCDELMULTI check whether the driver has a set_multicast_list method to determine whether it supports multicast. Drivers implementing secondary unicast support use set_rx_mode however. Check for both dev->set_multicast_mode and dev->set_rx_mode to determine multicast capabilities. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18SCTP: Fix local_addr deletions during list traversals.Chidambar 'ilLogict' Zinnoury
Upstream commit: 22626216c46f2ec86287e75ea86dd9ac3df54265 Since the lists are circular, we need to explicitely tag the address to be deleted since we might end up freeing the list head instead. This fixes some interesting SCTP crashes. Signed-off-by: Chidambar 'ilLogict' Zinnoury <illogict@online.fr> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18sch_htb: fix "too many events" situationMartin Devera
Upstream commit: 8f3ea33a5078a09eba12bfe57424507809367756 HTB is event driven algorithm and part of its work is to apply scheduled events at proper times. It tried to defend itself from livelock by processing only limited number of events per dequeue. Because of faster computers some users already hit this hardcoded limit. This patch limits processing up to 2 jiffies (why not 1 jiffie ? because it might stop prematurely when only fraction of jiffie remains). Signed-off-by: Martin Devera <devik@cdi.cz> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18NET: Add preemption point in qdisc_runHerbert Xu
Upstream commit: 2ba2506ca7ca62c56edaa334b0fe61eb5eab6ab0 The qdisc_run loop is currently unbounded and runs entirely in a softirq. This is bad as it may create an unbounded softirq run. This patch fixes this by calling need_resched and breaking out if necessary. It also adds a break out if the jiffies value changes since that would indicate we've been transmitting for too long which starves other softirqs. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18netpoll: zap_completion_queue: adjust skb->users counterJarek Poplawski
Upstream commit: 8a455b087c9629b3ae3b521b4f1ed16672f978cc zap_completion_queue() retrieves skbs from completion_queue where they have zero skb->users counter. Before dev_kfree_skb_any() it should be non-zero yet, so it's increased now. Reported-and-tested-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18LLC: Restrict LLC sockets to rootPatrick McHardy
Upstream commit: 3480c63bdf008e9289aab94418f43b9592978fff LLC currently allows users to inject raw frames, including IP packets encapsulated in SNAP. While Linux doesn't handle IP over SNAP, other systems do. Restrict LLC sockets to root similar to packet sockets. [ Modified Patrick's patch to use CAP_NEW_RAW --DaveM ] Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18INET: inet_frag_evictor() must run with BH disabledDavid S. Miller
Part of upstream commit: e8e16b706e8406f1ab3bccab16932ebc513896d8 Based upon a lockdep trace from Dave Jones. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18AX25 ax25_out: check skb for NULL in ax25_kick()Jarek Poplawski
Upstream commit: f47b7257c7368698eabff6fd7b340071932af640 According to some OOPS reports ax25_kick tries to clone NULL skbs sometimes. It looks like a race with ax25_clear_queues(). Probably there is no need to add more than a simple check for this yet. Another report suggested there are probably also cases where ax25 ->paclen == 0 can happen in ax25_output(); this wasn't confirmed during testing but let's leave this debugging check for some time. Reported-and-tested-by: Jann Traschewski <jann@gmx.de> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-03-24BLUETOOTH: Fix bugs in previous conn add/del workqueue changes.Dave Young
Jens Axboe noticed that we were queueing &conn->work on both btaddconn and keventd_wq. Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-03-24NETFILTER: nfnetlink_log: fix computation of netlink skb sizeEric Leblond
Upstream commit 7000d38d: This patch is similar to nfnetlink_queue fixes. It fixes the computation of skb size by using NLMSG_SPACE instead of NLMSG_ALIGN. Signed-off-by: Eric Leblond <eric@inl.fr> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-03-24NETFILTER: nfnetlink_queue: fix computation of allocated size for netlink skbEric Leblond
Upstream commit cabaa9bf: Size of the netlink skb was wrongly computed because the formula was using NLMSG_ALIGN instead of NLMSG_SPACE. NLMSG_ALIGN does not add the room for netlink header as NLMSG_SPACE does. This was causing a failure of message building in some cases. On my test system, all messages for packets in range [8*k+41, 8*k+48] where k is an integer were invalid and the corresponding packets were dropped. Signed-off-by: Eric Leblond <eric@inl.fr> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-03-24NETFILTER: xt_time: fix failure to match on SundaysJan Engelhardt
Upstream commit 4f4c9430: xt_time_match() in net/netfilter/xt_time.c in kernel 2.6.24 never matches on Sundays. On my host I have a rule like iptables -A OUTPUT -m time --weekdays Sun -j REJECT and it never matches. The problem is in localtime_2(), which uses r->weekday = (4 + r->dse) % 7; to map the epoch day onto a weekday in {0,...,6}. In particular this gives 0 for Sundays. But 0 has to be wrong; a weekday of 0 can never match. xt_time_match() has if (!(info->weekdays_match & (1 << current_time.weekday))) return false; and when current_time.weekday = 0, the result of the & is always zero, even when info->weekdays_match = XT_TIME_ALL_WEEKDAYS = 0xFE. Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-03-24NETFILTER: fix ebtable targets returnPatrick McHardy
Upstream commit 1b04ab459: The function ebt_do_table doesn't take NF_DROP as a verdict from the targets. Signed-off-by: Joonwoo Park <joonwpark81@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-03-24NETFILTER: Fix incorrect use of skb_make_writablePatrick McHardy
Upstream commit eb1197bc0: http://bugzilla.kernel.org/show_bug.cgi?id=9920 The function skb_make_writable returns true or false. Signed-off-by: Joonwoo Park <joonwpark81@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-03-24NETFILTER: nfnetlink_queue: fix SKB_LINEAR_ASSERT when mangling packet dataPatrick McHardy
Upstream commit e2b58a67: As reported by Tomas Simonaitis <tomas.simonaitis@gmail.com>, inserting new data in skbs queued over {ip,ip6,nfnetlink}_queue triggers a SKB_LINEAR_ASSERT in skb_put(). Going back through the git history, it seems this bug is present since at least 2.6.12-rc2, probably even since the removal of skb_linearize() for netfilter. Linearize non-linear skbs through skb_copy_expand() when enlarging them. Tested by Thomas, fixes bugzilla #9933. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-03-24IPCOMP: Disable BH on output when using shared tfmHerbert Xu
Upstream commit: 21e43188f272c7fd9efc84b8244c0b1dfccaa105 Because we use shared tfm objects in order to conserve memory, (each tfm requires 128K of vmalloc memory), BH needs to be turned off on output as that can occur in process context. Previously this was done implicitly by the xfrm output code. That was lost when it became lockless. So we need to add the BH disabling to IPComp directly. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-03-24IPCONFIG: The kernel gets no IP from some DHCP serversStephen Hemminger
Upstream commit: dea75bdfa57f75a7a7ec2961ec28db506c18e5db From: Stephen Hemminger <shemminger@linux-foundation.org> Based upon a patch by Marcel Wappler: This patch fixes a DHCP issue of the kernel: some DHCP servers (i.e. in the Linksys WRT54Gv5) are very strict about the contents of the DHCPDISCOVER packet they receive from clients. Table 5 in RFC2131 page 36 requests the fields 'ciaddr' and 'siaddr' MUST be set to '0'. These DHCP servers ignore Linux kernel's DHCP discovery packets with these two fields set to '255.255.255.255' (in contrast to popular DHCP clients, such as 'dhclient' or 'udhcpc'). This leads to a not booting system. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-03-24IPV4: Remove IP_TOS setting privilege checks.David S. Miller
Upstream commit: e4f8b5d4edc1edb0709531bd1a342655d5e8b98e Various RFCs have all sorts of things to say about the CS field of the DSCP value. In particular they try to make the distinction between values that should be used by "user applications" and things like routing daemons. This seems to have influenced the CAP_NET_ADMIN check which exists for IP_TOS socket option settings, but in fact it has an off-by-one error so it wasn't allowing CS5 which is meant for "user applications" as well. Further adding to the inconsistency and brokenness here, IPV6 does not validate the DSCP values specified for the IPV6_TCLASS socket option. The real actual uses of these TOS values are system specific in the final analysis, and these RFC recommendations are just that, "a recommendation". In fact the standards very purposefully use "SHOULD" and "SHOULD NOT" when describing how these values can be used. In the final analysis the only clean way to provide consistency here is to remove the CAP_NET_ADMIN check. The alternatives just don't work out: 1) If we add the CAP_NET_ADMIN check to ipv6, this can break existing setups. 2) If we just fix the off-by-one error in the class comparison in IPV4, certain DSCP values can be used in IPV6 but not IPV4 by default. So people will just ask for a sysctl asking to override that. I checked several other freely available kernel trees and they do not make any privilege checks in this area like we do. For the BSD stacks, this goes back all the way to Stevens Volume 2 and beyond. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-03-24IPV6: dst_entry leak in ip4ip6_err.Denis V. Lunev
Upstream commit: 9937ded8e44de8865cba1509d24eea9d350cebf0 The result of the ip_route_output is not assigned to skb. This means that - it is leaked - possible OOPS below dereferrencing skb->dst - no ICMP message for this case Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-03-24IPV6: Fix IPsec datagram fragmentationHerbert Xu
Upstream commits: 28a89453b1e8de8d777ad96fa1eef27b5d1ce074 b5c15fc004ac83b7ad280acbe0fd4bbed7e2c8d4 This is a long-standing bug in the IPsec IPv6 code that breaks when we emit a IPsec tunnel-mode datagram packet. The problem is that the code the emits the packet assumes the IPv6 stack will fragment it later, but the IPv6 stack assumes that whoever is emitting the packet is going to pre-fragment the packet. In the long term we need to fix both sides, e.g., to get the datagram code to pre-fragment as well as to get the IPv6 stack to fragment locally generated tunnel-mode packet. For now this patch does the second part which should make it work for the IPsec host case. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-03-24NET: Fix race in dev_close(). (Bug 9750)Matti Linnanvuori
Upstream commit: d8b2a4d21e0b37b9669b202867bfef19f68f786a There is a race in Linux kernel file net/core/dev.c, function dev_close. The function calls function dev_deactivate, which calls function dev_watchdog_down that deletes the watchdog timer. However, after that, a driver can call netif_carrier_ok, which calls function __netdev_watchdog_up that can add the watchdog timer again. Function unregister_netdevice calls function dev_shutdown that traps the bug !timer_pending(&dev->watchdog_timer). Moving dev_deactivate after netif_running() has been cleared prevents function netif_carrier_on from calling __netdev_watchdog_up and adding the watchdog timer again. Signed-off-by: Matti Linnanvuori <mattilinnanvuori@yahoo.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-03-24NET: Messed multicast lists after dev_mc_sync/unsyncJorge Boncompte [DTI2]
Upstream commit: 12aa343add3eced38a44bdb612b35fdf634d918c Commit a0a400d79e3dd7843e7e81baa3ef2957bdc292d0 ("[NET]: dev_mcast: add multicast list synchronization helpers") from you introduced a new field "da_synced" to struct dev_addr_list that is not properly initialized to 0. So when any of the current users (8021q, macvlan, mac80211) calls dev_mc_sync/unsync they mess the address list for both devices. The attached patch fixed it for me and avoid future problems. Signed-off-by: Jorge Boncompte [DTI2] <jorge@dti2.net> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-02-25BLUETOOTH: Add conn add/del workqueues to avoid connection fail.Dave Young
Upstream commit: b6c0632105f7d7548f1d642ba830088478d4f2b0 The bluetooth hci_conn sysfs add/del executed in the default workqueue. If the del_conn is executed after the new add_conn with same target, add_conn will failed with warning of "same kobject name". Here add btaddconn & btdelconn workqueues, flush the btdelconn workqueue in the add_conn function to avoid the issue. Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-02-25INET: Prevent out-of-sync truesize on ip_fragment slow pathHerbert Xu
Upstream commit: 29ffe1a5c52dae13b6efead97aab9b058f38fce4 When ip_fragment has to hit the slow path the value of skb->truesize may go out of sync because we would have updated it without changing the packet length. This violates the constraints on truesize. This patch postpones the update of skb->truesize to prevent this. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-02-25INET_DIAG: Fix inet_diag_lock_handler error path.Arnaldo Carvalho de Melo
Upstream commit: 8cf8e5a67fb07f583aac94482ba51a7930dab493 Fixes: http://bugzilla.kernel.org/show_bug.cgi?id=9825 The inet_diag_lock_handler function uses ERR_PTR to encode errors but its callers were testing against NULL. This only happens when the only inet_diag modular user, DCCP, is not built into the kernel or available as a module. Also there was a problem with not dropping the mutex lock when a handler was not found, also fixed in this patch. This caused an OOPS and ss would then hang on subsequent calls, as &inet_diag_table_mutex was being left locked. Thanks to spike at ml.yaroslavl.ru for report it after trying 'ss -d' on a kernel that doesn't have DCCP available. This bug was introduced in cset d523a328fb0271e1a763e985a21f2488fd816e7e ("Fix inet_diag dead-lock regression"), after 2.6.24-rc3, so just 2.6.24 seems to be affected. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-02-25IPCOMP: Fetch nexthdr before ipch is destroyedHerbert Xu
Upstream commit: 2614fa59fa805cd488083c5602eb48533cdbc018 When I moved the nexthdr setting out of IPComp I accidently moved the reading of ipch->nexthdr after the decompression. Unfortunately this means that we'd be reading from a stale ipch pointer which doesn't work very well. This patch moves the reading up so that we get the correct nexthdr value. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-02-25IPCOMP: Fix reception of incompressible packetsHerbert Xu
Upstream commit: b1641064a3f4a58644bc2e8edf40c025c58473b4 I made a silly typo by entering IPPROTO_IP (== 0) instead of IPPROTO_IPIP (== 4). This broke the reception of incompressible packets. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-02-25IPV4: fib: fix route replacement, fib_info is sharedJulian Anastasov
Upstream commit: c18865f39276435abb9286f9a816cb5b66c99a00 fib_info can be shared by many route prefixes but we don't want duplicate alternative routes for a prefix+tos+priority. Last change was not correct to check fib_treeref because it accounts usage from other prefixes. Additionally, avoid replacement without error if new route is same, as Joonwoo Park suggests. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-02-25IPV4: fib_trie: apply fixes from fib_hashJulian Anastasov
Upstream commit: 936f6f8e1bc46834bbb3e3fa3ac13ab44f1e7ba6 Update fib_trie with some fib_hash fixes: - check for duplicate alternative routes for prefix+tos+priority when replacing route - properly insert by matching tos together with priority - fix alias walking to use list_for_each_entry_continue for insertion and deletion when fa_head is not NULL - copy state from fa to new_fa on replace (not a problem for now) - additionally, avoid replacement without error if new route is same, as Joonwoo Park suggests. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-02-25PKT_SCHED: ematch: oops from uninitialized variable (resend)Stephen Hemminger
Upstream commit: 268bcca1e7b0d244afd07ea89cda672e61b0fc4a Setting up a meta match causes a kernel OOPS because of uninitialized elements in tree. [ 37.322381] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 [ 37.322381] IP: [<ffffffff883fc717>] :em_meta:em_meta_destroy+0x17/0x80 [ 37.322381] Call Trace: [ 37.322381] [<ffffffff803ec83d>] tcf_em_tree_destroy+0x2d/0xa0 [ 37.322381] [<ffffffff803ecc8c>] tcf_em_tree_validate+0x2dc/0x4a0 [ 37.322381] [<ffffffff803f06d2>] nla_parse+0x92/0xe0 [ 37.322381] [<ffffffff883f9672>] :cls_basic:basic_change+0x202/0x3c0 [ 37.322381] [<ffffffff802a3917>] kmem_cache_alloc+0x67/0xa0 [ 37.322381] [<ffffffff803ea221>] tc_ctl_tfilter+0x3b1/0x580 [ 37.322381] [<ffffffff803dffd0>] rtnetlink_rcv_msg+0x0/0x260 [ 37.322381] [<ffffffff803ee944>] netlink_rcv_skb+0x74/0xa0 [ 37.322381] [<ffffffff803dffc8>] rtnetlink_rcv+0x18/0x20 [ 37.322381] [<ffffffff803ee6c3>] netlink_unicast+0x263/0x290 [ 37.322381] [<ffffffff803cf276>] __alloc_skb+0x96/0x160 [ 37.322381] [<ffffffff803ef014>] netlink_sendmsg+0x274/0x340 [ 37.322381] [<ffffffff803c7c3b>] sock_sendmsg+0x12b/0x140 [ 37.322381] [<ffffffff8024de90>] autoremove_wake_function+0x0/0x30 [ 37.322381] [<ffffffff8024de90>] autoremove_wake_function+0x0/0x30 [ 37.322381] [<ffffffff803c7c3b>] sock_sendmsg+0x12b/0x140 [ 37.322381] [<ffffffff80288611>] zone_statistics+0xb1/0xc0 [ 37.322381] [<ffffffff803c7e5e>] sys_sendmsg+0x20e/0x360 [ 37.322381] [<ffffffff803c7411>] sockfd_lookup_light+0x41/0x80 [ 37.322381] [<ffffffff8028d04b>] handle_mm_fault+0x3eb/0x7f0 [ 37.322381] [<ffffffff8020c2fb>] system_call_after_swapgs+0x7b/0x80 Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-02-25TC: oops in em_metaStephen Hemminger
Upstream commit: 04f217aca4d803fe72c2c54fe460d68f5233ce52 If userspace passes a unknown match index into em_meta, then em_meta_change will return an error and the data for the match will not be set. This then causes an null pointer dereference when the cleanup is done in the error path via tcf_em_tree_destroy. Since the tree structure comes kzalloc, it is initialized to NULL. Discovered when testing a new version of tc command against an accidental older kernel. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-02-25TCP: Fix a bug in strategy_allowed_congestion_controlShan Wei
Upstream commit: 16ca3f913001efdb6171a2781ef41c77474e3895 In strategy_allowed_congestion_control of the 2.6.24 kernel, when sysctl_string return 1 on success,it should call tcp_set_allowed_congestion_control to set the allowed congestion control.But, it don't. the sysctl_string return 1 on success, otherwise return negative, never return 0.The patch fix the problem. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-02-25NETFILTER: nf_conntrack_tcp: conntrack reopening fixJozsef Kadlecsik
[NETFILTER]: nf_conntrack_tcp: conntrack reopening fix [Upstream commits b2155e7f + d0c1fd7a] TCP connection tracking in netfilter did not handle TCP reopening properly: active close was taken into account for one side only and not for any side, which is fixed now. The patch includes more comments to explain the logic how the different cases are handled. The bug was discovered by Jeff Chua. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24Revert "mac80211: warn when receiving frames with unaligned data"Linus Torvalds
This reverts commit 81100eb80add328c4d2a377326f15aa0e7236398 for the release, to avoid the unnecessary warning noise that is only really relevant to wireless driver developers. The warning will probably go right back in after I cut the release, but at least we won't unnecessarily worry users. Acked-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-23[INET]: Fix truesize setting in ip_append_dataHerbert Xu
As it is ip_append_data only counts page fragments to the skb that allocated it. As such it means that the first skb gets hit with a 4K charge even though it might have only used a fraction of it while all subsequent skb's that use the same page gets away with no charge at all. This bug was exposed by the UDP accounting patch. [ The wmem_alloc bumping needs to be moved with the truesize, noticed by Takahiro Yasui. -DaveM ] Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-23[NETNS]: Re-export init_net via EXPORT_SYMBOL.Denis V. Lunev
init_net is used added as a parameter to a lot of old API calls, f.e. ip_dev_find. These calls were exported as EXPORT_SYMBOL. So, export init_net as EXPORT_SYMBOL to keep networking API consistent. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-23[IPV4]: Add missing skb->truesize increment in ip_append_page().David S. Miller
And as noted by Takahiro Yasui, we thus need to bump the sk->sk_wmem_alloc at this spot as well. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-23[BLUETOOTH]: Move children of connection device to NULL before connection down.Dave Young
The rfcomm tty device will possibly retain even when conn is down, and sysfs doesn't support zombie device moving, so this patch move the tty device before conn device is destroyed. For the bug refered please see : http://lkml.org/lkml/2007/12/28/87 Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-21[ICMP]: ICMP_MIB_OUTMSGS increment duplicatedWang Chen
Commit "96793b482540f3a26e2188eaf75cb56b7829d3e3" (Add ICMPMsgStats MIB (RFC 4293)) made a mistake. In that patch, David L added a icmp_out_count() in ip_push_pending_frames(), remove icmp_out_count() from icmp_reply(). But he forgot to remove icmp_out_count() from icmp_send() too. Since icmp_send and icmp_reply will call icmp_push_reply, which will call ip_push_pending_frames, a duplicated increment happened in icmp_send. This patch remove the icmp_out_count from icmp_send too. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-21[IPV6]: RFC 2011 compatibility brokenWang Chen
The snmp6 entry name was changed, and it broke compatibility to RFC 2011. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-21[IPV6]: ICMP6_MIB_OUTMSGS increment duplicatedWang Chen
icmpv6_send() calls ip6_push_pending_frames() indirectly. Both ip6_push_pending_frames() and icmpv6_send() increment counter ICMP6_MIB_OUTMSGS. This patch remove the increment from icmpv6_send. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-20[NET]: rtnl_link: fix use-after-freePatrick McHardy
When unregistering the rtnl_link_ops, all existing devices using the ops are destroyed. With nested devices this may lead to a use-after-free despite the use of for_each_netdev_safe() in case the upper device is next in the device list and is destroyed by the NETDEV_UNREGISTER notifier. The easy fix is to restart scanning the device list after removing a device. Alternatively we could add new devices to the front of the list to avoid having dependant devices follow the device they depend on. A third option would be to only restart scanning if dev->iflink of the next device matches dev->ifindex of the current one. For now this seems like the safest solution. With this patch, the veth rtnl_link_ops unregistration can use rtnl_link_unregister() directly since it now also handles destruction of multiple devices at once. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-20[AF_KEY]: Fix skb leak on pfkey_send_migrate() errorPatrick McHardy
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-20[IrDA]: af_irda memory leak fixesJesper Juhl
Here goes an IrDA patch against your latest net-2.6 tree. This patch fixes some af_irda memory leaks. It also checks for irias_new_obect() return value. Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Samuel Ortiz <samuel@sortiz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-20[NEIGH]: Revert 'Fix race between neigh_parms_release and neightbl_fill_parms'David S. Miller
Commit 9cd40029423701c376391da59d2c6469672b4bed (Fix race between neigh_parms_release and neightbl_fill_parms) introduced device reference counting regressions for several people, see: http://bugzilla.kernel.org/show_bug.cgi?id=9778 for example. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-20[NETFILTER]: bridge-netfilter: fix net_device refcnt leaksPatrick McHardy
When packets are flood-forwarded to multiple output devices, the bridge-netfilter code reuses skb->nf_bridge for each clone to store the bridge port. When queueing packets using NFQUEUE netfilter takes a reference to skb->nf_bridge->physoutdev, which is overwritten when the packet is forwarded to the second port. This causes refcount unterflows for the first device and refcount leaks for all others. Additionally this provides incorrect data to the iptables physdev match. Unshare skb->nf_bridge by copying it if it is shared before assigning the physoutdev device. Reported, tested and based on initial patch by Jan Christoph Nordholz <hesso@pool.math.tu-berlin.de>. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>