summaryrefslogtreecommitdiff
path: root/net/netfilter
AgeCommit message (Collapse)Author
2011-11-30netfilter: xt_qtaguid: fix crash on ctrl delete commandJP Abgrall
Because for now the xt_qtaguid module allows procs to use tags without having /dev/xt_qtaguid open, there was a case where it would try to delete a resources from a list that was proc specific. But that resource was never added to that list which is only used when /dev/xt_qtaguid has been opened by the proc. Once our userspace is fully updated, we won't need those exceptions. Change-Id: Idd4bfea926627190c74645142916e10832eb2504 Signed-off-by: JP Abgrall <jpa@google.com>
2011-11-30netfilter: xt_qtaguid: Fix the stats info display orderAshish Sharma
Change-Id: I3bf165c31f35a6c7dc212f23df5eefaeb8129d0d Signed-off-by: Ashish Sharma <ashishsharma@google.com>
2011-11-30netfilter: xt_qtaguid: add missing tracking for no filp caseJP Abgrall
In cases where the skb would have an sk_socket but no file, that skb would not be counted at all. Assigning to uid 0 now. Adding extra counters to track skb counts. Change-Id: If049b4b525e1fbd5afc9c72b4a174c0a435f2ca7 Signed-off-by: JP Abgrall <jpa@google.com>
2011-11-30netfilter: xt_qtaguid: fix crash after using delete ctrl commandJP Abgrall
* Crash fix The delete command would delete a socket tag entry without removing it from the proc_qtu_data { ..., sock_tag_list, }. This in turn would cause an exiting process to crash while cleaning up its matching proc_qtu_data. * Added more aggressive tracking/cleanup of proc_qtu_data This should allow one process to cleanup qtu_tag_data{} left around from processes that didn't use resource tracking via /dev/xt_qtaguid. * Debug printing tweaks Better code inclusion/exclusion handling, and extra debug out of full state. Change-Id: I735965af2962ffcd7f3021cdc0068b3ab21245c2 Signed-off-by: JP Abgrall <jpa@google.com>
2011-11-30netfilter: xt_qtaguid: change WARN_ONCE into pr_warn_onceJP Abgrall
Make the warning less scary. Change-Id: I0276c5413e37ec991f24db57aeb90333fb1b5a65 Signed-off-by: JP Abgrall <jpa@google.com>
2011-11-30netfilter: xt_qtaguid: provide an iface_stat_all proc entryJP Abgrall
There is a /proc/net/xt_qtaguid/iface/<iface>/{rx_bytes,rx_packets,tx_bytes,...} but for better convenience and to avoid getting overly stale net/dev stats we now have /proc/net/xt_qtaguid/iface_stat_all which outputs lines of: iface_name active rx_bytes rx_packets tx_bytes tx_packets net_dev_rx_bytes net_dev_rx_packets net_dev_tx_bytes net_dev_tx_packets Change-Id: I12cc10d2d123b86b56d4eb489b1d77b2ce72ebcf Signed-off-by: JP Abgrall <jpa@google.com>
2011-11-30netfilter: xt_qtaguid: work around devices that reset their statsJP Abgrall
Most net devs will not reset their stats when just going down/up, unless a NETDEV_UNREGISTER was notified. But some devs will not send out a NETDEV_UNREGISTER but still reset their stats just before a NETDEV_UP. Now we just track the dev stats during NETDEV_DOWN... just in case. Then on NETDEV_UP we check the stats: if the device didn't do a NETDEV_UNREGISTER and a prior NETDEV_DOWN captured stats, then we treat it as an UNREGISTER and save the totals from the stashed values. Added extra netdev event debugging. Change-Id: Iec79e74bfd40269aa3e5892f161be71e09de6946 Signed-off-by: JP Abgrall <jpa@google.com>
2011-11-30netfilter: xt_qtaguid: warn only once for missing proc qtaguid dataJP Abgrall
When a process doesn't have /dev/xt_qtaguid open, only warn once instead of for every ctrl access. Change-Id: I98a462a8731254ddc3bf6d2fefeef9823659b1f0 Signed-off-by: JP Abgrall <jpa@google.com>
2011-11-30netfilter: xt_qtaguid: 1st pass at tracking tag based data resourcesJP Abgrall
* Added global resource tracking based on tags. - Can be put into passive mode via /sys/modules/xt_qtaguid/params/tag_tracking_passive - The number of socket tags per UID is now limited - Adding /dev/xt_qtaguid that each process should open before starting to tag sockets. A later change will make it a "must". - A process should not create new tags unless it has the dev open. A later change will make it a must. - On qtaguid_resources release, the process' matching socket tag info is deleted. * Support run-time debug mask via /sys/modules parameter "debug_mask". * split module into prettyprinting code, includes, main. * Removed ptrdiff_t usage which didn't work in all cases. Change-Id: I4a21d3bea55d23c1c3747253904e2a79f7d555d9 Signed-off-by: JP Abgrall <jpa@google.com>
2011-11-30netfilter: qtaguid: fix proc/.../stats uid filtered outputJP Abgrall
"cat /proc/net/xt_qtaguid/stats" for a non-priviledged UID would output multiple twice its own stats. The fix tweaks the way lines are counted. Non-root: idx iface acct_tag_hex uid_tag_int cnt_set ... 2 wlan0 0x0 10022 0 ... 3 wlan0 0x0 10022 1 ... 4 wlan0 0x3010000000000000 10022 0 ... 5 wlan0 0x3010000000000000 10022 1 ... Root: idx iface acct_tag_hex uid_tag_int cnt_set 2 wlan0 0x0 0 0 ... 3 wlan0 0x0 0 1 ... 4 wlan0 0x0 1000 0 ... ... 12 wlan0 0x0 10022 0 ... 13 wlan0 0x0 10022 1 ... ... 18 wlan0 0x3010000000000000 10022 0 ... 19 wlan0 0x3010000000000000 10022 1 ... Change-Id: I3cae1f4fee616bc897831350374656b0c718c45b Signed-off-by: JP Abgrall <jpa@google.com>
2011-11-30netfilter: xt_qtaguid: fix dev_stats for missing NETDEV_UNREGISTERJP Abgrall
Turns out that some devices don't call the notifier chains with NETDEV_UNREGISTER. So now we only track up/down as the points for tracking active/inactive transitions and saving the get_dev_stats(). Change-Id: I948755962b4c64150b4d04f294fb4889f151e42b Signed-off-by: JP Abgrall <jpa@google.com>
2011-11-30netfilter: xt_qtaguid: add some tagging/matching statsJP Abgrall
/proc/net/xt_qtaguid/ctrl will now show: active tagged sockets: lines of "sock=%p tag=0x%llx (uid=%u)" sockets_tagged, : the number of sockets successfully tagged. sockets_untagged: the number of sockets successfully untagged. counter_set_changes: ctrl counter set change requests. delete_cmds: ctrl delete commands completed. iface_events: number of NETDEV_* events handled. match_found_sk: sk found in skbuff without ct assist. match_found_sk_in_ct: the number of times the connection tracker found a socket for us. This happens when the skbuff didn't have info. match_found_sk_none: the number of times no sk could be determined successfully looked up. This indicates we don't know who the data actually belongs to. This could be unsolicited traffic. Change-Id: I3a65613bb24852e1eea768ab0320a6a7073ab9be Signed-off-by: JP Abgrall <jpa@google.com>
2011-11-30netfilter: xt_qtaguid: Fix sockfd_put() call within spinlockJP Abgrall
sockfd_put() risks sleeping. So when doing a delete ctrl command, defer the sockfd_put() and kfree() to outside of the spinlock. Change-Id: I5f8ab51d05888d885b2fbb035f61efa5b7abb88a Signed-off-by: JP Abgrall <jpa@google.com>
2011-11-30netfilter: xt_qtaguid: Fix socket refcounts when taggingJP Abgrall
* Don't hold the sockets after tagging. sockfd_lookup() does a get() on the associated file. There was no matching put() so a closed socket could never be freed. * Don't rely on struct member order for tag_node The structs that had a struct tag_node member would work with the *_tree_* routines only because tag_node was 1st. * Improve debug messages Provide info on who the caller is. Use unsigned int for uid. * Only process NETDEV_UP events. * Pacifier: disable netfilter matching. Leave .../stats header. Change-Id: Iccb8ae3cca9608210c417597287a2391010dff2c Signed-off-by: JP Abgrall <jpa@google.com>
2011-11-30netfilter: xt_qtaguid: recognize IPV6 interfaces. root is procfs privileged.JP Abgrall
* Allow tracking interfaces that only have an ipv6 address. Deal with ipv6 notifier chains that do NETDEV_UP without the rtnl_lock() * Allow root all access to procfs ctrl/stats. To disable all checks: echo 0 > /sys/module/xt_qtaguid/parameters/ctrl_write_gid echo 0 > /sys/module/xt_qtaguid/parameters/stats_readall_gid * Add CDEBUG define to enable pr_debug output specific to procfs ctrl/stats access. Change-Id: I9a469511d92fe42734daff6ea2326701312a161b Signed-off-by: JP Abgrall <jpa@google.com>
2011-11-30netfilter: xt_qtaguid: add counter sets and matching controlJP Abgrall
* Added support for sets of counters. By default set 0 is active. Userspace can control which set is active for a given UID by writing to .../ctrl s <set_num> <uid> Changing the active set is only permitted for processes in the AID_NET_BW_ACCT group. The active set tracking is reset when the uid tag is deleted with the .../ctrl command d 0 <uid> * New output format for the proc .../stats - Now has cnt_set in the list. """ idx iface acct_tag_hex uid_tag_int cnt_set rx_bytes rx_packets tx_bytes tx_packets rx_tcp_packets rx_tcp_bytes rx_udp_packets rx_udp_bytes rx_other_packets rx_other_bytes tx_tcp_packets tx_tcp_bytes tx_udp_packets tx_udp_bytes tx_other_packets tx_other_bytes ... 2 rmnet0 0x0 1000 0 27729 29 1477 27 27501 26 228 3 0 0 1249 24 228 3 0 0 2 rmnet0 0x0 1000 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 rmnet0 0x0 10005 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 rmnet0 0x0 10005 1 46407 57 8008 64 46407 57 0 0 0 0 8008 64 0 0 0 0 ... 6 rmnet0 0x7fff000100000000 10005 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6 rmnet0 0x7fff000100000000 10005 1 27493 24 1564 22 27493 24 0 0 0 0 1564 22 0 0 0 0 """ * Refactored for proc stats output code. * Silenced some of the per packet debug output. * Reworded some of the debug messages. * Replaced all the spin_lock_irqsave/irqrestore with *_bh(): netfilter handling is done in softirq. Change-Id: Ibe89f9d754579fd97335617186c614b43333cfd3 Signed-off-by: JP Abgrall <jpa@google.com>
2011-11-30netfilter: qtaguid: disable #define DEBUGJP Abgrall
This would cause log spam to the point of slowing down the system. Change-Id: I5655f0207935004b0198f43ad0d3c9ea25466e4e Signed-off-by: JP Abgrall <jpa@google.com>
2011-11-30netfilter: xt_qtaguid: add uid permission checks during ctrl/stats accessJP Abgrall
* uid handling - Limit UID impersonation to processes with a gid in AID_NET_BW_ACCT. This affects socket tagging, and data removal. - Limit stats lookup to own uid or the process gid is in AID_NET_BW_STATS. This affects stats lookup. * allow pacifying the module Setting passive to Y/y will make the module return immediately on external stimulus. No more stats and silent success on ctrl writes. Mainly used when one suspects this module of misbehaving. Change-Id: I83990862d52a9b0922aca103a0f61375cddeb7c4 Signed-off-by: JP Abgrall <jpa@google.com>
2011-11-30netfilter: qtaguid: add tag delete command, expand stats output.JP Abgrall
* Add a new ctrl command to delete stored data. d <acct_tag> [<uid>] The uid will default to the running process's. The accounting tag can be 0, in which case all counters and socket tags associated with the uid will be cleared. * Simplify the ctrl command handling at the expense of duplicate code. This should make it easier to maintain. * /proc/net/xt_qtaguid/stats now returns more stats idx iface acct_tag_hex uid_tag_int {rx,tx}_{bytes,packets} {rx,tx}_{tcp,udp,other}_{bytes,packets} the {rx,tx}_{bytes,packets} are the totals. * re-tagging will now allow changing the uid. Change-Id: I9594621543cefeab557caa3d68a22a3eb320466d Signed-off-by: JP Abgrall <jpa@google.com>
2011-11-30netfilter: quota2: add support to log quota limit reached.JP Abgrall
This uses the NETLINK NETLINK_NFLOG family to log a single message when the quota limit is reached. It uses the same packet type as ipt_ULOG, but - never copies skb data, - uses 112 as the event number (ULOG's +1) It doesn't log if the module param "event_num" is 0. Change-Id: I6f31736b568bb31a4ff0b9ac2ee58380e6b675ca Signed-off-by: JP Abgrall <jpa@google.com>
2011-11-30netfitler: fixup the quota2, and enable.JP Abgrall
The xt_quota2 came from http://sourceforge.net/projects/xtables-addons/develop It needed tweaking for it to compile within the kernel tree. Fixed kmalloc() and create_proc_entry() invocations within a non-interruptible context. Removed useless copying of current quota back to the iptable's struct matchinfo: - those are per CPU: they will change randomly based on which cpu gets to update the value. - they prevent matching a rule: e.g. -A chain -m quota2 --name q1 --quota 123 can't be followed by -D chain -m quota2 --name q1 --quota 123 as the 123 will be compared to the struct matchinfo's quota member. Change-Id: I021d3b743db3b22158cc49acb5c94d905b501492 Signed-off-by: JP Abgrall <jpa@google.com>
2011-11-30netfilter: adding the original quota2 from xtables-addonsJP Abgrall
The original xt_quota in the kernel is plain broken: - counts quota at a per CPU level (was written back when ubiquitous SMP was just a dream) - provides no way to count across IPV4/IPV6. This patch is the original unaltered code from: http://sourceforge.net/projects/xtables-addons at commit e84391ce665cef046967f796dd91026851d6bbf3 Change-Id: I19d49858840effee9ecf6cff03c23b45a97efdeb Signed-off-by: JP Abgrall <jpa@google.com>
2011-11-30netfitler: xt_qtaguid: add another missing spin_unlock.JP Abgrall
This time the symptom is caused by tagging the same socket twice without untagging it in between. This would cause it to not unlock, and return. Signed-off-by: JP Abgrall <jpa@google.com>
2011-11-30netfilter: qtaguid: fix bad-arg handling when tagging socketJP Abgrall
When processing args passed to the procfs ctrl, if the tag was invalid it would exit without releasing the spin_lock... Bye bye scheduling. Signed-off-by: JP Abgrall <jpa@google.com> Change-Id: Ic1480ae9d37bba687586094cf6d0274db9c5b28a
2011-11-30nf: qtaguid: make procfs entry for ctrl return correct data.JP Abgrall
(This is a direct cherry-pick from 2.6.39: I3b925802) Fixed procreader for /proc/net/xt_qtaguid/ctrl: it would just fill the output with the same entry. Simplify the **start handling. Signed-off-by: JP Abgrall <jpa@google.com> Change-Id: I3b92580228f2b57795bb2d0d6197fc95ab6be552
2011-11-30nf: qtaguid: workaround xt_socket_get_sk() returning bad SKs.JP Abgrall
(This is a direct cherry pick from 2.6.39: Id2a9912b) * xt_socket_get_sk() returns invalid sockets when the sk_state is TCP_TIME_WAIT. Added detection of time-wait. * Added more constrained usage: qtaguid insures that xt_socket_get*_sk() is not invoked for unexpected hooks or protocols (but I have not seen those active at the point where the returned sk is bad). Signed-off-by: JP Abgrall <jpa@google.com> Change-Id: Id2a9912bb451a3e59d012fc55bbbd40fbb90693f
2011-11-30netfilter: add xt_qtaguid matching moduleJP Abgrall
This module allows tracking stats at the socket level for given UIDs. It replaces xt_owner. If the --uid-owner is not specified, it will just count stats based on who the skb belongs to. This will even happen on incoming skbs as it looks into the skb via xt_socket magic to see who owns it. If an skb is lost, it will be assigned to uid=0. To control what sockets of what UIDs are tagged by what, one uses: echo t $sock_fd $accounting_tag $the_billed_uid \ > /proc/net/xt_qtaguid/ctrl So whenever an skb belongs to a sock_fd, it will be accounted against $the_billed_uid and matching stats will show up under the uid with the given $accounting_tag. Because the number of allocations for the stats structs is not that big: ~500 apps * 32 per app we'll just do it atomic. This avoids walking lists many times, and the fancy worker thread handling. Slabs will grow when needed later. It use netdevice and inetaddr notifications instead of hooks in the core dev code to track when a device comes and goes. This removes the need for exposed iface_stat.h. Put procfs dirs in /proc/net/xt_qtaguid/ ctrl stats iface_stat/<iface>/... The uid stats are obtainable in ./stats. Change-Id: I01af4fd91c8de651668d3decb76d9bdc1e343919 Signed-off-by: JP Abgrall <jpa@google.com>
2011-11-30nf: xt_socket: export the fancy sock finder codeJP Abgrall
The socket matching function has some nifty logic to get the struct sock from the skb or from the connection tracker. We export this so other xt_* can use it, similarly to ho how xt_socket uses nf_tproxy_get_sock. Change-Id: I11c58f59087e7f7ae09e4abd4b937cd3370fa2fd Signed-off-by: JP Abgrall <jpa@google.com>
2011-10-17Merge branch 'nf' of git://1984.lsi.us.es/netDavid S. Miller
2011-10-12IPVS netns shutdown/startup dead-lockHans Schillstrom
ip_vs_mutext is used by both netns shutdown code and startup and both implicit uses sk_lock-AF_INET mutex. cleanup CPU-1 startup CPU-2 ip_vs_dst_event() ip_vs_genl_set_cmd() sk_lock-AF_INET __ip_vs_mutex sk_lock-AF_INET __ip_vs_mutex * DEAD LOCK * A new mutex placed in ip_vs netns struct called sync_mutex is added. Comments from Julian and Simon added. This patch has been running for more than 3 month now and it seems to work. Ver. 3 IP_VS_SO_GET_DAEMON in do_ip_vs_get_ctl protected by sync_mutex instead of __ip_vs_mutex as sugested by Julian. Signed-off-by: Hans Schillstrom <hans@schillstrom.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-10-05netfilter: Use proper rwlock init functionThomas Gleixner
Replace the open coded initialization with the init function. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-03netfilter: nf_conntrack: fix event flooding in GRE protocol trackerFlorian Westphal
GRE connections cause ctnetlink event flood because the ASSURED event is set for every packet received. Reported-by: Denys Fedoryshchenko <denys@visp.net.lb> Tested-by: Denys Fedoryshchenko <denys@visp.net.lb> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-08-30Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6
2011-08-30netfilter: nf_ct_tcp: wrong multiplication of TCPOLEN_TSTAMP_ALIGNED in ↵Jozsef Kadlecsik
tcp_sack skips fastpath The wrong multiplication of TCPOLEN_TSTAMP_ALIGNED by 4 skips the fast path for the timestamp-only option. Bug reported by Michael M. Builov (netfilter bugzilla #738). Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-08-30netfilter: nf_ct_tcp: fix incorrect handling of invalid TCP optionJozsef Kadlecsik
Michael M. Builov reported that in the tcp_options and tcp_sack functions of netfilter TCP conntrack the incorrect handling of invalid TCP option with too big opsize may lead to read access beyond tcp-packet or buffer allocated on stack (netfilter bugzilla #738). The fix is to stop parsing the options at detecting the broken option. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-08-30netfilter: nf_ct_pptp: fix DNATed PPTP connection address translationSanket Shah
When both the server and the client are NATed, the set-link-info control packet containing the peer's call-id field is not properly translated. I have verified that it was working in 2.6.16.13 kernel previously but due to rewrite, this scenario stopped working (Not knowing exact version when it stopped working). Signed-off-by: Sanket Shah <sanket.shah@elitecore.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-08-30netfilter: nf_queue: reject NF_STOLEN verdicts from userspaceFlorian Westphal
A userspace listener may send (bogus) NF_STOLEN verdict, which causes skb leak. This problem was previously fixed via 64507fdbc29c3a622180378210ecea8659b14e40 (netfilter: nf_queue: fix NF_STOLEN skb leak) but this had to be reverted because NF_STOLEN can also be returned by a netfilter hook when iterating the rules in nf_reinject. Reject userspace NF_STOLEN verdict, as suggested by Michal Miroslaw. This is complementary to commit fad54440438a7c231a6ae347738423cbabc936d9 (netfilter: avoid double free in nf_reinject). Cc: Julian Anastasov <ja@ssi.bg> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-08-07netfilter: avoid double free in nf_reinjectJulian Anastasov
NF_STOLEN means skb was already freed Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-29netfilter: xt_rateest: fix xt_rateest_mt_checkentry()Eric Dumazet
commit 4a5a5c73b7cfee (slightly better error reporting) added some useless code in xt_rateest_mt_checkentry(). Fix this so that different error codes can really be returned. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-07-28Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs-2.6
2011-07-26atomic: use <linux/atomic.h>Arun Sharma
This allows us to move duplicated code in <asm/atomic.h> (atomic_inc_not_zero() for now) to <linux/atomic.h> Signed-off-by: Arun Sharma <asharma@fb.com> Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-22Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: rcu: Fix wrong check in list_splice_init_rcu() net,rcu: Convert call_rcu(xt_rateest_free_rcu) to kfree_rcu() sysctl,rcu: Convert call_rcu(free_head) to kfree vmalloc,rcu: Convert call_rcu(rcu_free_vb) to kfree_rcu() vmalloc,rcu: Convert call_rcu(rcu_free_va) to kfree_rcu() ipc,rcu: Convert call_rcu(ipc_immediate_free) to kfree_rcu() ipc,rcu: Convert call_rcu(free_un) to kfree_rcu() security,rcu: Convert call_rcu(sel_netport_free) to kfree_rcu() security,rcu: Convert call_rcu(sel_netnode_free) to kfree_rcu() ia64,rcu: Convert call_rcu(sn_irq_info_free) to kfree_rcu() block,rcu: Convert call_rcu(disk_free_ptbl_rcu_cb) to kfree_rcu() scsi,rcu: Convert call_rcu(fc_rport_free_rcu) to kfree_rcu() audit_tree,rcu: Convert call_rcu(__put_tree) to kfree_rcu() security,rcu: Convert call_rcu(whitelist_item_free) to kfree_rcu() md,rcu: Convert call_rcu(free_conf) to kfree_rcu()
2011-07-22IPVS: Free resources on module removalSimon Horman
This resolves a panic on module removal. Reported-by: Dave Jones <davej@redhat.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2011-07-21Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6
2011-07-21netfilter: ipset: hash:net,iface fixed to handle overlapping nets behind ↵Jozsef Kadlecsik
different interfaces If overlapping networks with different interfaces was added to the set, the type did not handle it properly. Example ipset create test hash:net,iface ipset add test 192.168.0.0/16,eth0 ipset add test 192.168.0.0/24,eth1 Now, if a packet was sent from 192.168.0.0/24,eth0, the type returned a match. In the patch the algorithm is fixed in order to correctly handle overlapping networks. Limitation: the same network cannot be stored with more than 64 different interfaces in a single set. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-07-20net,rcu: Convert call_rcu(xt_rateest_free_rcu) to kfree_rcu()Paul E. McKenney
The RCU callback xt_rateest_free_rcu() just calls kfree(), so we can use kfree_rcu() instead of call_rcu(). This also allows us to dispense with an rcu_barrier() call, speeding up unloading of this module. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Patrick McHardy <kaber@trash.net> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2011-07-19netfilter: nfnetlink_queue: batch verdict supportFlorian Westphal
Introduces a new nfnetlink type that applies a given verdict to all queued packets with an id <= the id in the verdict message. If a mark is provided it is applied to all matched packets. This reduces the number of verdicts that have to be sent. Applications that make use of this feature need to maintain a timeout to send a batchverdict periodically to avoid starvation. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-07-19netfilter: nfnetlink_queue: assert monotonic packet idsEric Dumazet
Packet identifier is currently setup in nfqnl_build_packet_message(), using one atomic_inc_return(). Problem is that since several cpus might concurrently call nfqnl_enqueue_packet() for the same queue, we can deliver packets to consumer in non monotonic way (packet N+1 being delivered after packet N) This patch moves the packet id setup from nfqnl_build_packet_message() to nfqnl_enqueue_packet() to guarantee correct delivery order. This also removes one atomic operation. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Florian Westphal <fw@strlen.de> CC: Pablo Neira Ayuso <pablo@netfilter.org> CC: Eric Leblond <eric@regit.org> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-07-18netfilter: nfnetlink_queue: provide rcu enabled callbacksEric Dumazet
nenetlink_queue operations on SMP are not efficent if several queues are used, because of nfnl_mutex contention when applications give packet verdict. Use new call_rcu field in struct nfnl_callback to advertize a callback that is called under rcu_read_lock instead of nfnl_mutex. On my 2x4x2 machine, I was able to reach 2.000.000 pps going through user land returning NF_ACCEPT verdicts without losses, instead of less than 500.000 pps before patch. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Florian Westphal <fw@strlen.de> CC: Eric Leblond <eric@regit.org> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-07-18netfilter: nfnetlink: add RCU in nfnetlink_rcv_msg()Eric Dumazet
Goal of this patch is to permit nfnetlink providers not mandate nfnl_mutex being held while nfnetlink_rcv_msg() calls them. If struct nfnl_callback contains a non NULL call_rcu(), then nfnetlink_rcv_msg() will use it instead of call() field, holding rcu_read_lock instead of nfnl_mutex Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Florian Westphal <fw@strlen.de> CC: Eric Leblond <eric@regit.org> Signed-off-by: Patrick McHardy <kaber@trash.net>