summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2006-05-12[NEIGH]: Fix IP-over-ATM and ARP interaction.Simon Kelley
The classical IP over ATM code maintains its own IPv4 <-> <ATM stuff> ARP table, using the standard neighbour-table code. The neigh_table_init function adds this neighbour table to a linked list of all neighbor tables which is used by the functions neigh_delete() neigh_add() and neightbl_set(), all called by the netlink code. Once the ATM neighbour table is added to the list, there are two tables with family == AF_INET there, and ARP entries sent via netlink go into the first table with matching family. This is indeterminate and often wrong. To see the bug, on a kernel with CLIP enabled, create a standard IPv4 ARP entry by pinging an unused address on a local subnet. Then attempt to complete that entry by doing ip neigh replace <ip address> lladdr <some mac address> nud reachable Looking at the ARP tables by using ip neigh show will reveal two ARP entries for the same address. One of these can be found in /proc/net/arp, and the other in /proc/net/atm/arp. This patch adds a new function, neigh_table_init_no_netlink() which does everything the neigh_table_init() does, except add the table to the netlink all-arp-tables chain. In addition neigh_table_init() has a check that all tables on the chain have a distinct address family. The init call in clip.c is changed to call neigh_table_init_no_netlink(). Since ATM ARP tables are rather more complicated than can currently be handled by the available rtattrs in the netlink protocol, no functionality is lost by this patch, and non-ATM ARP manipulation via netlink is rescued. A more complete solution would involve a rtattr for ATM ARP entries and some way for the netlink code to give neigh_add and friends more information than just address family with which to find the correct ARP table. [ I've changed the assertion checking in neigh_table_init() to not use BUG_ON() while holding neigh_tbl_lock. Instead we remember that we found an existing tbl with the same family, and after dropping the lock we'll give a diagnostic kernel log message and a stack dump. -DaveM ] Signed-off-by: Simon Kelley <simon@thekelleys.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-11[NET_SCHED]: HFSC: fix thinko in hfsc_adjust_levels()Patrick McHardy
When deleting the last child the level of a class should drop to zero. Noticed by Andreas Mueller <andreas@stapelspeicher.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-10[IPV6]: skb leakage in inet6_csk_xmitAlexey Kuznetsov
inet6_csk_xit does not free skb when routing fails. Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-10[BRIDGE]: Do sysfs registration inside rtnl.Stephen Hemminger
Now that netdevice sysfs registration is done as part of register_netdevice; bridge code no longer has to be tricky when adding it's kobjects to bridges. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-10[NET]: Do sysfs registration as part of register_netdevice.Stephen Hemminger
The last step of netdevice registration was being done by a delayed call, but because it was delayed, it was impossible to return any error code if the class_device registration failed. Side effects: * one state in registration process is unnecessary. * register_netdevice can sleep inside class_device registration/hotplug * code in netdev_run_todo only does unregistration so it is simpler. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-09[NET] linkwatch: Handle jiffies wrap-aroundHerbert Xu
The test used in the linkwatch does not handle wrap-arounds correctly. Since the intention of the code is to eliminate bursts of messages we can afford to delay things up to a second. Using that fact we can easily handle wrap-arounds by making sure that we don't delay things by more than one second. This is based on diagnosis and a patch by Stefan Rompf. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Stefan Rompf <stefan@loplof.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-09[IRDA]: Removing unused EXPORT_SYMBOLsAdrian Bunk
This patch removes the following unused EXPORT_SYMBOL's: - irias_find_attrib - irias_new_string_value - irias_new_octseq_value Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Samuel Ortiz <samuel.ortiz@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-09[NET]: Make netdev_chain a raw notifier.Alan Stern
From: Alan Stern <stern@rowland.harvard.edu> This chain does it's own locking via the RTNL semaphore, and can also run recursively so adding a new mutex here was causing deadlocks. Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-09[IPV4]: ip_options_fragment() has no effect on fragmentationWei Yongjun
Fix error point to options in ip_options_fragment(). optptr get a error pointer to the ipv4 header, correct is pointer to ipv4 options. Signed-off-by: Wei Yongjun <weiyj@soft.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-08Merge branch 'upstream-fixes' of ↵Stephen Hemminger
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
2006-05-06[IPV4]: Remove likely in ip_rcv_finish()Hua Zhong
This is another result from my likely profiling tool (dwalker@mvista.com just sent the patch of the profiling tool to linux-kernel mailing list, which is similar to what I use). On my system (not very busy, normal development machine within a VMWare workstation), I see a 6/5 miss/hit ratio for this "likely". Signed-off-by: Hua Zhong <hzhong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-06[NET]: Create netdev attribute_groups with class_device_addStephen Hemminger
Atomically create attributes when class device is added. This avoids the race between registering class_device (which generates hotplug event), and the creation of attribute groups. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-05[TCP]: Fix snd_cwnd adjustments in tcp_highspeed.cJohn Heffner
Xiaoliang (David) Wei wrote: > Hi gurus, > > I am reading the code of tcp_highspeed.c in the kernel and have a > question on the hstcp_cong_avoid function, specifically the following > AI part (line 136~143 in net/ipv4/tcp_highspeed.c ): > > /* Do additive increase */ > if (tp->snd_cwnd < tp->snd_cwnd_clamp) { > tp->snd_cwnd_cnt += ca->ai; > if (tp->snd_cwnd_cnt >= tp->snd_cwnd) { > tp->snd_cwnd++; > tp->snd_cwnd_cnt -= tp->snd_cwnd; > } > } > > In this part, when (tp->snd_cwnd_cnt == tp->snd_cwnd), > snd_cwnd_cnt will be -1... snd_cwnd_cnt is defined as u16, will this > small chance of getting -1 becomes a problem? > Shall we change it by reversing the order of the cwnd++ and cwnd_cnt -= > cwnd? Absolutely correct. Thanks. Signed-off-by: John Heffner <jheffner@psc.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-05[NETROM/ROSE]: Kill module init version kernel log messages.Ralf Baechle
There are out of date and don't tell the user anything useful. The similar messages which IPV4 and the core networking used to output were killed a long time ago. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-05[DCCP]: Fix sock_orphan dead lockHerbert Xu
Calling sock_orphan inside bh_lock_sock in dccp_close can lead to dead locks. For example, the inet_diag code holds sk_callback_lock without disabling BH. If an inbound packet arrives during that admittedly tiny window, it will cause a dead lock on bh_lock_sock. Another possible path would be through sock_wfree if the network device driver frees the tx skb in process context with BH enabled. We can fix this by moving sock_orphan out of bh_lock_sock. The tricky bit is to work out when we need to destroy the socket ourselves and when it has already been destroyed by someone else. By moving sock_orphan before the release_sock we can solve this problem. This is because as long as we own the socket lock its state cannot change. So we simply record the socket state before the release_sock and then check the state again after we regain the socket lock. If the socket state has transitioned to DCCP_CLOSED in the time being, we know that the socket has been destroyed. Otherwise the socket is still ours to keep. This problem was discoverd by Ingo Molnar using his lock validator. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-05[BRIDGE]: keep track of received multicast packetsStephen Hemminger
It makes sense to add this simple statistic to keep track of received multicast packets. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-05[SCTP]: Fix state table entries for chunks received in CLOSED state.Sridhar Samudrala
Discard an unexpected chunk in CLOSED state rather can calling BUG(). Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-05[SCTP]: Fix panic's when receiving fragmented SCTP control chunks.Sridhar Samudrala
Use pskb_pull() to handle incoming COOKIE_ECHO and HEARTBEAT chunks that are received as skb's with fragment list. Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-05[SCTP]: Prevent possible infinite recursion with multiple bundled DATA.Vladislav Yasevich
There is a rare situation that causes lksctp to go into infinite recursion and crash the system. The trigger is a packet that contains at least the first two DATA fragments of a message bundled together. The recursion is triggered when the user data buffer is smaller that the full data message. The problem is that we clone the skb for every fragment in the message. When reassembling the full message, we try to link skbs from the "first fragment" clone using the frag_list. However, since the frag_list is shared between two clones in this rare situation, we end up setting the frag_list pointer of the second fragment to point to itself. This causes sctp_skb_pull() to potentially recurse indefinitely. Proposed solution is to make a copy of the skb when attempting to link things using frag_list. Signed-off-by: Vladislav Yasevich <vladsilav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-05[SCTP]: Allow spillover of receive buffer to avoid deadlock.Neil Horman
This patch fixes a deadlock situation in the receive path by allowing temporary spillover of the receive buffer. - If the chunk we receive has a tsn that immediately follows the ctsn, accept it even if we run out of receive buffer space and renege data with higher TSNs. - Once we accept one chunk in a packet, accept all the remaining chunks even if we run out of receive buffer space. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Mark Butler <butlerm@middle.net> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-05[PATCH] softmac: make non-operational after being stoppedDaniel Drake
zd1211 with softmac and wpa_supplicant revealed an issue with softmac and the use of workqueues. Some of the work functions actually reschedule themselves, so this meant that there could still be pending work after flush_scheduled_work() had been called during ieee80211softmac_stop(). This patch introduces a "running" flag which is used to ensure that rescheduling does not happen in this situation. I also used this flag to ensure that softmac's hooks into ieee80211 are non-operational once the stop operation has been started. This simply makes softmac a little more robust, because I could crash it easily by receiving frames in the short timeframe after shutting down softmac and before turning off the ZD1211 radio. (ZD1211 is now fixed as well!) Signed-off-by: Daniel Drake <dsd@gentoo.org> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-05-05[PATCH] softmac: don't reassociate if user asked for deauthenticationDaniel Drake
When wpa_supplicant exits, it uses SIOCSIWMLME to request deauthentication. softmac then tries to reassociate without any user intervention, which isn't the desired behaviour of this signal. This change makes softmac only attempt reassociation if the remote network itself deauthenticated us. Signed-off-by: Daniel Drake <dsd@gentoo.org> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-05-03[DECNET]: Fix level1 router helloPatrick Caulfield
This patch fixes hello messages sent when a node is a level 1 router. Slightly contrary to the spec (maybe) VMS ignores hello messages that do not name level2 routers that it also knows about. So, here we simply name all the routers that the node knows about rather just other level1 routers. (I hope the patch is clearer than the description. sorry). Signed-off-by: Patrick Caulfield <patrick@tykepenguin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03[TCP]: Fix sock_orphan dead lockHerbert Xu
Calling sock_orphan inside bh_lock_sock in tcp_close can lead to dead locks. For example, the inet_diag code holds sk_callback_lock without disabling BH. If an inbound packet arrives during that admittedly tiny window, it will cause a dead lock on bh_lock_sock. Another possible path would be through sock_wfree if the network device driver frees the tx skb in process context with BH enabled. We can fix this by moving sock_orphan out of bh_lock_sock. The tricky bit is to work out when we need to destroy the socket ourselves and when it has already been destroyed by someone else. By moving sock_orphan before the release_sock we can solve this problem. This is because as long as we own the socket lock its state cannot change. So we simply record the socket state before the release_sock and then check the state again after we regain the socket lock. If the socket state has transitioned to TCP_CLOSE in the time being, we know that the socket has been destroyed. Otherwise the socket is still ours to keep. Note that I've also moved the increment on the orphan count forward. This may look like a problem as we're increasing it even if the socket is just about to be destroyed where it'll be decreased again. However, this simply enlarges a window that already exists. This also changes the orphan count test by one. Considering what the orphan count is meant to do this is no big deal. This problem was discoverd by Ingo Molnar using his lock validator. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03[ROSE]: Eleminate HZ from ROSE kernel interfacesRalf Baechle
Convert all ROSE sysctl time values from jiffies to ms as units. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03[NETROM]: Eleminate HZ from NET/ROM kernel interfacesRalf Baechle
Convert all NET/ROM sysctl time values from jiffies to ms as units. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03[AX.25]: Eleminate HZ from AX.25 kernel interfacesRalf Baechle
Convert all AX.25 sysctl time values from jiffies to ms as units. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03[ROSE]: Fix routing table locking in rose_remove_neigh.Ralf Baechle
The locking rule for rose_remove_neigh() are that the caller needs to hold rose_neigh_list_lock, so we better don't take it yet again in rose_neigh_list_lock. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03[AX.25]: Move AX.25 symbol exportsRalf Baechle
Move AX.25 symbol exports to next to their definitions where they're supposed to be these days. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03[AX25, ROSE]: Remove useless SET_MODULE_OWNER calls.Ralf Baechle
Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03[AX.25]: Spelling fixRalf Baechle
Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03[ROSE]: Remove useless prototype for rose_remove_neigh().Ralf Baechle
Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03[NETFILTER]: x_tables: don't use __copy_{from,to}_user on unchecked memory ↵Patrick McHardy
in compat layer Noticed by Linus Torvalds <torvalds@osdl.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03[NETFILTER]: H.323 helper: Change author's email addressJing Min Zhao
Signed-off-by: Jing Min Zhao <zhaojingmin@users.sourceforge.net> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03[NETFILTER]: NAT: silence unused variable warnings with CONFIG_XFRM=nPatrick McHardy
net/ipv4/netfilter/ip_nat_standalone.c: In function 'ip_nat_out': net/ipv4/netfilter/ip_nat_standalone.c:223: warning: unused variable 'ctinfo' net/ipv4/netfilter/ip_nat_standalone.c:222: warning: unused variable 'ct' Surprisingly no complaints so far .. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03[NETFILTER]: H.323 helper: fix use of uninitialized dataPatrick McHardy
When a Choice element contains an unsupported choice no error is returned and parsing continues normally, but the choice value is not set and contains data from the last parsed message. This may in turn lead to parsing of more stale data and following crashes. Fixes a crash triggered by testcase 0003243 from the PROTOS c07-h2250v4 testsuite following random other testcases: CPU: 0 EIP: 0060:[<c01a9554>] Not tainted VLI EFLAGS: 00210646 (2.6.17-rc2 #3) EIP is at memmove+0x19/0x22 eax: d7be0307 ebx: d7be0307 ecx: e841fcf9 edx: d7be0307 esi: bfffffff edi: bfffffff ebp: da5eb980 esp: c0347e2c ds: 007b es: 007b ss: 0068 Process events/0 (pid: 4, threadinfo=c0347000 task=dff86a90) Stack: <0>00000006 c0347ea6 d7be0301 e09a6b2c 00000006 da5eb980 d7be003e d7be0052 c0347f6c e09a6d9c 00000006 c0347ea6 00000006 00000000 d7b9a548 00000000 c0347f6c d7b9a548 00000004 e0a1a119 0000028f 00000006 c0347ea6 00000006 Call Trace: [<e09a6b2c>] mangle_contents+0x40/0xd8 [ip_nat] [<e09a6d9c>] ip_nat_mangle_tcp_packet+0xa1/0x191 [ip_nat] [<e0a1a119>] set_addr+0x60/0x14d [ip_nat_h323] [<e0ab6e66>] q931_help+0x2da/0x71a [ip_conntrack_h323] [<e0ab6e98>] q931_help+0x30c/0x71a [ip_conntrack_h323] [<e09af242>] ip_conntrack_help+0x22/0x2f [ip_conntrack] [<c022934a>] nf_iterate+0x2e/0x5f [<c025d357>] xfrm4_output_finish+0x0/0x39f [<c02294ce>] nf_hook_slow+0x42/0xb0 [<c025d357>] xfrm4_output_finish+0x0/0x39f [<c025d732>] xfrm4_output+0x3c/0x4e [<c025d357>] xfrm4_output_finish+0x0/0x39f [<c0230370>] ip_forward+0x1c2/0x1fa [<c022f417>] ip_rcv+0x388/0x3b5 [<c02188f9>] netif_receive_skb+0x2bc/0x2ec [<c0218994>] process_backlog+0x6b/0xd0 [<c021675a>] net_rx_action+0x4b/0xb7 [<c0115606>] __do_softirq+0x35/0x7d [<c0104294>] do_softirq+0x38/0x3f Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03[NETFILTER]: H.323 helper: fix endless loop caused by invalid TPKT lenPatrick McHardy
When the TPKT len included in the packet is below the lowest valid value of 4 an underflow occurs which results in an endless loop. Found by testcase 0000058 from the PROTOS c07-h2250v4 testsuite. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-02[NETFILTER] SCTP conntrack: fix infinite loopPatrick McHardy
fix infinite loop in the SCTP-netfilter code: check SCTP chunk size to guarantee progress of for_each_sctp_chunk(). (all other uses of for_each_sctp_chunk() are preceded by do_basic_checks(), so this fix should be complete.) Based on patch from Ingo Molnar <mingo@elte.hu> CVE-2006-1527 Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-01Merge branch 'audit.b10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current * 'audit.b10' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current: [PATCH] Audit Filter Performance [PATCH] Rework of IPC auditing [PATCH] More user space subject labels [PATCH] Reworked patch for labels on user space messages [PATCH] change lspp ipc auditing [PATCH] audit inode patch [PATCH] support for context based audit filtering, part 2 [PATCH] support for context based audit filtering [PATCH] no need to wank with task_lock() and pinning task down in audit_syscall_exit() [PATCH] drop task argument of audit_syscall_{entry,exit} [PATCH] drop gfp_mask in audit_log_exit() [PATCH] move call of audit_free() into do_exit() [PATCH] sockaddr patch [PATCH] deal with deadlocks in audit_free()
2006-05-01[NETFILTER] x_tables: fix compat related crash on non-x86Patrick McHardy
When iptables userspace adds an ipt_standard_target, it calculates the size of the entire entry as: sizeof(struct ipt_entry) + XT_ALIGN(sizeof(struct ipt_standard_target)) ipt_standard_target looks like this: struct xt_standard_target { struct xt_entry_target target; int verdict; }; xt_entry_target contains a pointer, so when compiled for 64 bit the structure gets an extra 4 byte of padding at the end. On 32 bit architectures where iptables aligns to 8 byte it will also have 4 byte padding at the end because it is only 36 bytes large. The compat_ipt_standard_fn in the kernel adjusts the offsets by sizeof(struct ipt_standard_target) - sizeof(struct compat_ipt_standard_target), which will always result in 4, even if the structure from userspace was already padded to a multiple of 8. On x86 this works out by accident because userspace only aligns to 4, on all other architectures this is broken and causes incorrect adjustments to the size and following offsets. Thanks to Linus for lots of debugging help and testing. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-01[PATCH] Reworked patch for labels on user space messagesSteve Grubb
The below patch should be applied after the inode and ipc sid patches. This patch is a reworking of Tim's patch that has been updated to match the inode and ipc patches since its similar. [updated: > Stephen Smalley also wanted to change a variable from isec to tsec in the > user sid patch. ] Signed-off-by: Steve Grubb <sgrubb@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-05-01[PATCH] sockaddr patchSteve Grubb
On Thursday 23 March 2006 09:08, John D. Ramsdell wrote: > I noticed that a socketcall(bind) and socketcall(connect) event contain a > record of type=SOCKADDR, but I cannot see one for a system call event > associated with socketcall(accept). Recording the sockaddr of an accepted > socket is important for cross platform information flow analys Thanks for pointing this out. The following patch should address this. Signed-off-by: Steve Grubb <sgrubb@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-04-29[IPV6]: Fix race in route selection.YOSHIFUJI Hideaki
We eliminated rt6_dflt_lock (to protect default router pointer) at 2.6.17-rc1, and introduced rt6_select() for general router selection. The function is called in the context of rt6_lock read-lock held, but this means, we have some race conditions when we do round-robin. Signed-off-by; YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-29[XFRM]: fix incorrect xfrm_policy_afinfo_lock useIngo Molnar
xfrm_policy_afinfo_lock can be taken in bh context, at: [<c013fe1a>] lockdep_acquire_read+0x54/0x6d [<c0f6e024>] _read_lock+0x15/0x22 [<c0e8fcdb>] xfrm_policy_get_afinfo+0x1a/0x3d [<c0e8fd10>] xfrm_decode_session+0x12/0x32 [<c0e66094>] ip_route_me_harder+0x1c9/0x25b [<c0e770d3>] ip_nat_local_fn+0x94/0xad [<c0e2bbc8>] nf_iterate+0x2e/0x7a [<c0e2bc50>] nf_hook_slow+0x3c/0x9e [<c0e3a342>] ip_push_pending_frames+0x2de/0x3a7 [<c0e53e19>] icmp_push_reply+0x136/0x141 [<c0e543fb>] icmp_reply+0x118/0x1a0 [<c0e54581>] icmp_echo+0x44/0x46 [<c0e53fad>] icmp_rcv+0x111/0x138 [<c0e36764>] ip_local_deliver+0x150/0x1f9 [<c0e36be2>] ip_rcv+0x3d5/0x413 [<c0df760f>] netif_receive_skb+0x337/0x356 [<c0df76c3>] process_backlog+0x95/0x110 [<c0df5fe2>] net_rx_action+0xa5/0x16d [<c012d8a7>] __do_softirq+0x6f/0xe6 [<c0105ec2>] do_softirq+0x52/0xb1 this means that all write-locking of xfrm_policy_afinfo_lock must be bh-safe. This patch fixes xfrm_policy_register_afinfo() and xfrm_policy_unregister_afinfo(). Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-29[XFRM]: fix incorrect xfrm_state_afinfo_lock useIngo Molnar
xfrm_state_afinfo_lock can be read-locked from bh context, so take it in a bh-safe manner in xfrm_state_register_afinfo() and xfrm_state_unregister_afinfo(). Found by the lock validator. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-29[TCP]: Fix unlikely usage in tcp_transmit_skb()Hua Zhong
The following unlikely should be replaced by likely because the condition happens every time unless there is a hard error to transmit a packet. Signed-off-by: Hua Zhong <hzhong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-29[XFRM]: fix softirq-unsafe xfrm typemap->lock useIngo Molnar
xfrm typemap->lock may be used in softirq context, so all write_lock() uses must be softirq-safe. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-29[IPSEC]: Fix IP ID selectionHerbert Xu
I was looking through the xfrm input/output code in order to abstract out the address family specific encapsulation/decapsulation code. During that process I found this bug in the IP ID selection code in xfrm4_output.c. At that point dst is still the xfrm_dst for the current SA which represents an internal flow as far as the IPsec tunnel is concerned. Since the IP ID is going to sit on the outside of the encapsulated packet, we obviously want the external flow which is just dst->child. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-29[IPV4]: inet_init() -> fs_initcallHeiko Carstens
Convert inet_init to an fs_initcall to make sure its called before any device driver's initcall. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-29[NETLINK]: cleanup unused macro in net/netlink/af_netlink.cSoyoung Park
1 line removal, of unused macro. ran 'egrep -r' from linux-2.6.16/ for Nprintk and didn't see it anywhere else but here, in #define... Signed-off-by: Soyoung Park <speattle@yahoo.com> Signed-off-by: David S. Miller <davem@davemloft.net>