summaryrefslogtreecommitdiff
path: root/net/netfilter
AgeCommit message (Collapse)Author
2011-10-17Merge branch 'nf' of git://1984.lsi.us.es/netDavid S. Miller
2011-10-12IPVS netns shutdown/startup dead-lockHans Schillstrom
ip_vs_mutext is used by both netns shutdown code and startup and both implicit uses sk_lock-AF_INET mutex. cleanup CPU-1 startup CPU-2 ip_vs_dst_event() ip_vs_genl_set_cmd() sk_lock-AF_INET __ip_vs_mutex sk_lock-AF_INET __ip_vs_mutex * DEAD LOCK * A new mutex placed in ip_vs netns struct called sync_mutex is added. Comments from Julian and Simon added. This patch has been running for more than 3 month now and it seems to work. Ver. 3 IP_VS_SO_GET_DAEMON in do_ip_vs_get_ctl protected by sync_mutex instead of __ip_vs_mutex as sugested by Julian. Signed-off-by: Hans Schillstrom <hans@schillstrom.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-10-05netfilter: Use proper rwlock init functionThomas Gleixner
Replace the open coded initialization with the init function. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-03netfilter: nf_conntrack: fix event flooding in GRE protocol trackerFlorian Westphal
GRE connections cause ctnetlink event flood because the ASSURED event is set for every packet received. Reported-by: Denys Fedoryshchenko <denys@visp.net.lb> Tested-by: Denys Fedoryshchenko <denys@visp.net.lb> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-08-30Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6
2011-08-30netfilter: nf_ct_tcp: wrong multiplication of TCPOLEN_TSTAMP_ALIGNED in ↵Jozsef Kadlecsik
tcp_sack skips fastpath The wrong multiplication of TCPOLEN_TSTAMP_ALIGNED by 4 skips the fast path for the timestamp-only option. Bug reported by Michael M. Builov (netfilter bugzilla #738). Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-08-30netfilter: nf_ct_tcp: fix incorrect handling of invalid TCP optionJozsef Kadlecsik
Michael M. Builov reported that in the tcp_options and tcp_sack functions of netfilter TCP conntrack the incorrect handling of invalid TCP option with too big opsize may lead to read access beyond tcp-packet or buffer allocated on stack (netfilter bugzilla #738). The fix is to stop parsing the options at detecting the broken option. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-08-30netfilter: nf_ct_pptp: fix DNATed PPTP connection address translationSanket Shah
When both the server and the client are NATed, the set-link-info control packet containing the peer's call-id field is not properly translated. I have verified that it was working in 2.6.16.13 kernel previously but due to rewrite, this scenario stopped working (Not knowing exact version when it stopped working). Signed-off-by: Sanket Shah <sanket.shah@elitecore.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-08-30netfilter: nf_queue: reject NF_STOLEN verdicts from userspaceFlorian Westphal
A userspace listener may send (bogus) NF_STOLEN verdict, which causes skb leak. This problem was previously fixed via 64507fdbc29c3a622180378210ecea8659b14e40 (netfilter: nf_queue: fix NF_STOLEN skb leak) but this had to be reverted because NF_STOLEN can also be returned by a netfilter hook when iterating the rules in nf_reinject. Reject userspace NF_STOLEN verdict, as suggested by Michal Miroslaw. This is complementary to commit fad54440438a7c231a6ae347738423cbabc936d9 (netfilter: avoid double free in nf_reinject). Cc: Julian Anastasov <ja@ssi.bg> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-08-07netfilter: avoid double free in nf_reinjectJulian Anastasov
NF_STOLEN means skb was already freed Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-29netfilter: xt_rateest: fix xt_rateest_mt_checkentry()Eric Dumazet
commit 4a5a5c73b7cfee (slightly better error reporting) added some useless code in xt_rateest_mt_checkentry(). Fix this so that different error codes can really be returned. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-07-28Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs-2.6
2011-07-26atomic: use <linux/atomic.h>Arun Sharma
This allows us to move duplicated code in <asm/atomic.h> (atomic_inc_not_zero() for now) to <linux/atomic.h> Signed-off-by: Arun Sharma <asharma@fb.com> Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-22Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: rcu: Fix wrong check in list_splice_init_rcu() net,rcu: Convert call_rcu(xt_rateest_free_rcu) to kfree_rcu() sysctl,rcu: Convert call_rcu(free_head) to kfree vmalloc,rcu: Convert call_rcu(rcu_free_vb) to kfree_rcu() vmalloc,rcu: Convert call_rcu(rcu_free_va) to kfree_rcu() ipc,rcu: Convert call_rcu(ipc_immediate_free) to kfree_rcu() ipc,rcu: Convert call_rcu(free_un) to kfree_rcu() security,rcu: Convert call_rcu(sel_netport_free) to kfree_rcu() security,rcu: Convert call_rcu(sel_netnode_free) to kfree_rcu() ia64,rcu: Convert call_rcu(sn_irq_info_free) to kfree_rcu() block,rcu: Convert call_rcu(disk_free_ptbl_rcu_cb) to kfree_rcu() scsi,rcu: Convert call_rcu(fc_rport_free_rcu) to kfree_rcu() audit_tree,rcu: Convert call_rcu(__put_tree) to kfree_rcu() security,rcu: Convert call_rcu(whitelist_item_free) to kfree_rcu() md,rcu: Convert call_rcu(free_conf) to kfree_rcu()
2011-07-22IPVS: Free resources on module removalSimon Horman
This resolves a panic on module removal. Reported-by: Dave Jones <davej@redhat.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2011-07-21Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6
2011-07-21netfilter: ipset: hash:net,iface fixed to handle overlapping nets behind ↵Jozsef Kadlecsik
different interfaces If overlapping networks with different interfaces was added to the set, the type did not handle it properly. Example ipset create test hash:net,iface ipset add test 192.168.0.0/16,eth0 ipset add test 192.168.0.0/24,eth1 Now, if a packet was sent from 192.168.0.0/24,eth0, the type returned a match. In the patch the algorithm is fixed in order to correctly handle overlapping networks. Limitation: the same network cannot be stored with more than 64 different interfaces in a single set. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-07-20net,rcu: Convert call_rcu(xt_rateest_free_rcu) to kfree_rcu()Paul E. McKenney
The RCU callback xt_rateest_free_rcu() just calls kfree(), so we can use kfree_rcu() instead of call_rcu(). This also allows us to dispense with an rcu_barrier() call, speeding up unloading of this module. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Patrick McHardy <kaber@trash.net> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2011-07-19netfilter: nfnetlink_queue: batch verdict supportFlorian Westphal
Introduces a new nfnetlink type that applies a given verdict to all queued packets with an id <= the id in the verdict message. If a mark is provided it is applied to all matched packets. This reduces the number of verdicts that have to be sent. Applications that make use of this feature need to maintain a timeout to send a batchverdict periodically to avoid starvation. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-07-19netfilter: nfnetlink_queue: assert monotonic packet idsEric Dumazet
Packet identifier is currently setup in nfqnl_build_packet_message(), using one atomic_inc_return(). Problem is that since several cpus might concurrently call nfqnl_enqueue_packet() for the same queue, we can deliver packets to consumer in non monotonic way (packet N+1 being delivered after packet N) This patch moves the packet id setup from nfqnl_build_packet_message() to nfqnl_enqueue_packet() to guarantee correct delivery order. This also removes one atomic operation. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Florian Westphal <fw@strlen.de> CC: Pablo Neira Ayuso <pablo@netfilter.org> CC: Eric Leblond <eric@regit.org> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-07-18netfilter: nfnetlink_queue: provide rcu enabled callbacksEric Dumazet
nenetlink_queue operations on SMP are not efficent if several queues are used, because of nfnl_mutex contention when applications give packet verdict. Use new call_rcu field in struct nfnl_callback to advertize a callback that is called under rcu_read_lock instead of nfnl_mutex. On my 2x4x2 machine, I was able to reach 2.000.000 pps going through user land returning NF_ACCEPT verdicts without losses, instead of less than 500.000 pps before patch. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Florian Westphal <fw@strlen.de> CC: Eric Leblond <eric@regit.org> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-07-18netfilter: nfnetlink: add RCU in nfnetlink_rcv_msg()Eric Dumazet
Goal of this patch is to permit nfnetlink providers not mandate nfnl_mutex being held while nfnetlink_rcv_msg() calls them. If struct nfnl_callback contains a non NULL call_rcu(), then nfnetlink_rcv_msg() will use it instead of call() field, holding rcu_read_lock instead of nfnl_mutex Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Florian Westphal <fw@strlen.de> CC: Eric Leblond <eric@regit.org> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-07-01netfilter: Reduce switch/case indentJoe Perches
Make the case labels the same indent as the switch. git diff -w shows miscellaneous 80 column wrapping, comment reflowing and a comment for a useless gcc warning for an otherwise unused default: case. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-30netfilter: add SELinux context support to AUDIT targetMr Dash Four
In this revision the conversion of secid to SELinux context and adding it to the audit log is moved from xt_AUDIT.c to audit.c with the aid of a separate helper function - audit_log_secctx - which does both the conversion and logging of SELinux context, thus also preventing internal secid number being leaked to userspace. If conversion is not successful an error is raised. With the introduction of this helper function the work done in xt_AUDIT.c is much more simplified. It also opens the possibility of this helper function being used by other modules (including auditd itself), if desired. With this addition, typical (raw auditd) output after applying the patch would be: type=NETFILTER_PKT msg=audit(1305852240.082:31012): action=0 hook=1 len=52 inif=? outif=eth0 saddr=10.1.1.7 daddr=10.1.2.1 ipid=16312 proto=6 sport=56150 dport=22 obj=system_u:object_r:ssh_client_packet_t:s0 type=NETFILTER_PKT msg=audit(1306772064.079:56): action=0 hook=3 len=48 inif=eth0 outif=? smac=00:05:5d:7c:27:0b dmac=00:02:b3:0a:7f:81 macproto=0x0800 saddr=10.1.2.1 daddr=10.1.1.7 ipid=462 proto=6 sport=22 dport=3561 obj=system_u:object_r:ssh_server_packet_t:s0 Acked-by: Eric Paris <eparis@redhat.com> Signed-off-by: Mr Dash Four <mr.dash.four@googlemail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-21ip: introduce ip_is_fragment helper inline functionPaul Gortmaker
There are enough instances of this: iph->frag_off & htons(IP_MF | IP_OFFSET) that a helper function is probably warranted. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-21Remove redundant linux/version.h includes from net/Jesper Juhl
It was suggested by "make versioncheck" that the follwing includes of linux/version.h are redundant: /home/jj/src/linux-2.6/net/caif/caif_dev.c: 14 linux/version.h not needed. /home/jj/src/linux-2.6/net/caif/chnl_net.c: 10 linux/version.h not needed. /home/jj/src/linux-2.6/net/ipv4/gre.c: 19 linux/version.h not needed. /home/jj/src/linux-2.6/net/netfilter/ipset/ip_set_core.c: 20 linux/version.h not needed. /home/jj/src/linux-2.6/net/netfilter/xt_set.c: 16 linux/version.h not needed. and it seems that it is right. Beyond manually inspecting the source files I also did a few build tests with various configs to confirm that including the header in those files is indeed not needed. Here's a patch to remove the pointless includes. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-20Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/iwlwifi/iwl-agn-rxon.c drivers/net/wireless/rtlwifi/pci.c net/netfilter/ipvs/ip_vs_core.c
2011-06-16netfilter: ipset: whitespace and coding fixes detected by checkpatch.plJozsef Kadlecsik
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16netfilter: ipset: hash:net,iface type introducedJozsef Kadlecsik
The hash:net,iface type makes possible to store network address and interface name pairs in a set. It's mostly suitable for egress and ingress filtering. Examples: # ipset create test hash:net,iface # ipset add test 192.168.0.0/16,eth0 # ipset add test 192.168.0.0/24,eth1 Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16netfilter: ipset: use the stored first cidr value instead of '1'Jozsef Kadlecsik
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16netfilter: ipset: fix return code for destroy when sets are in useJozsef Kadlecsik
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16netfilter: ipset: add xt_action_param to the variant level kadt functions, ↵Jozsef Kadlecsik
ipset API change With the change the sets can use any parameter available for the match and target extensions, like input/output interface. It's required for the hash:net,iface set type. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16netfilter: ipset: use unified from/to address masking and check the usageJozsef Kadlecsik
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16netfilter: ipset: take into account cidr value for the from address when ↵Jozsef Kadlecsik
creating the set When creating a set from a range expressed as a network like 10.1.1.172/29, the from address was taken as the IP address part and not masked with the netmask from the cidr. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16netfilter: ipset: support range for IPv4 at adding/deleting elements for ↵Jozsef Kadlecsik
hash:*net* types The range internally is converted to the network(s) equal to the range. Example: # ipset new test hash:net # ipset add test 10.2.0.0-10.2.1.12 # ipset list test Name: test Type: hash:net Header: family inet hashsize 1024 maxelem 65536 Size in memory: 16888 References: 0 Members: 10.2.1.12 10.2.1.0/29 10.2.0.0/24 10.2.1.8/30 Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16netfilter: ipset: set type support with multiple revisions addedJozsef Kadlecsik
A set type may have multiple revisions, for example when syntax is extended. Support continuous revision ranges in set types. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16netfilter: ipset: fix adding ranges to hash typesJozsef Kadlecsik
When ranges are added to hash types, the elements may trigger rehashing the set. However, the last successfully added element was not kept track so the adding started again with the first element after the rehashing. Bug reported by Mr Dash Four. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16netfilter: ipset: support listing setnames and headers tooJozsef Kadlecsik
Current listing makes possible to list sets with full content only. The patch adds support partial listings, i.e. listing just the existing setnames or listing set headers, without set members. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16netfilter: ipset: options and flags support added to the kernel APIJozsef Kadlecsik
The support makes possible to specify the timeout value for the SET target and a flag to reset the timeout for already existing entries. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16netfilter: ipset: whitespace fixes: some space before tab slipped inJozsef Kadlecsik
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16netfilter: ipset: timeout can be modified for already added elementsJozsef Kadlecsik
When an element to a set with timeout added, one can change the timeout by "readding" the element with the "-exist" flag. That means the timeout value is reset to the specified one (or to the default from the set specification if the "timeout n" option is not used). Example ipset add foo 1.2.3.4 timeout 10 ipset add foo 1.2.3.4 timeout 600 -exist Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16netfilter: fix looped (broad|multi)cast's MAC handlingNicolas Cavallari
By default, when broadcast or multicast packet are sent from a local application, they are sent to the interface then looped by the kernel to other local applications, going throught netfilter hooks in the process. These looped packet have their MAC header removed from the skb by the kernel looping code. This confuse various netfilter's netlink queue, netlink log and the legacy ip_queue, because they try to extract a hardware address from these packets, but extracts a part of the IP header instead. This patch prevent NFQUEUE, NFLOG and ip_QUEUE to include a MAC header if there is none in the packet. Signed-off-by: Nicolas Cavallari <cavallar@lri.fr> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16Merge branch 'master' of ↵Patrick McHardy
git://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs-next-2.6
2011-06-16Merge branch 'master' of /repos/git/net-next-2.6Patrick McHardy
2011-06-14IPVS: remove unused init and cleanup functions.Hans Schillstrom
After restructuring, there is some unused or empty functions left to be removed. Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2011-06-14IPVS: labels at pos 0Hans Schillstrom
Put goto labels at the beginig of row acording to coding style example. Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2011-06-13IPVS netns exit causes crash in conntrackHans Schillstrom
Quote from Patric Mc Hardy "This looks like nfnetlink.c excited and destroyed the nfnl socket, but ip_vs was still holding a reference to a conntrack. When the conntrack got destroyed it created a ctnetlink event, causing an oops in netlink_has_listeners when trying to use the destroyed nfnetlink socket." If nf_conntrack_netlink is loaded before ip_vs this is not a problem. This patch simply avoids calling ip_vs_conn_drop_conntrack() when netns is dying as suggested by Julian. Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2011-06-13IPVS: rename of netns init and cleanup functions.Hans Schillstrom
Make it more clear what the functions does, on request by Julian. Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: Hans Schillstrom <hans@schillstrom.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2011-06-13ipvs: support more FTP PASV responsesJulian Anastasov
Change the parsing of FTP commands and responses to support skip character. It allows to detect variations in the 227 PASV response. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2011-06-09rtnetlink: Compute and store minimum ifinfo dump sizeGreg Rose
The message size allocated for rtnl ifinfo dumps was limited to a single page. This is not enough for additional interface info available with devices that support SR-IOV and caused a bug in which VF info would not be displayed if more than approximately 40 VFs were created per interface. Implement a new function pointer for the rtnl_register service that will calculate the amount of data required for the ifinfo dump and allocate enough data to satisfy the request. Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>