summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2008-06-16tcp: TCP connection times out if ICMP frag needed is delayedSridhar Samudrala
[ upstream commit: 7d227cd235c809c36c847d6a597956ad9e9d2bae ] We are seeing an issue with TCP in handling an ICMP frag needed message that is received after net.ipv4.tcp_retries1 retransmits. The default value of retries1 is 3. So if the path mtu changes and ICMP frag needed is lost for the first 3 retransmits or if it gets delayed until 3 retransmits are done, TCP doesn't update MSS correctly and continues to retransmit the orginal message until it timesout after tcp_retries2 retransmits. I am seeing this issue even with the latest 2.6.25.4 kernel. In tcp_retransmit_timer(), when retransmits counter exceeds tcp_retries1 value, the dst cache entry of the socket is reset. At this time, if we receive an ICMP frag needed message, the dst entry gets updated with the new MTU, but the TCP sockets dst_cache entry remains NULL. So the next time when we try to retransmit after the ICMP frag needed is received, tcp_retransmit_skb() gets called. Here the cur_mss value is calculated at the start of the routine with a NULL sk_dst_cache. Instead we should call tcp_current_mss after the rebuild_header that caches the dst entry with the updated mtu. Also the rebuild_header should be called before tcp_fragment so that skb is fragmented if the mss goes down. Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16vlan: Correctly handle device notifications for layered VLAN devicesPatrick McHardy
[ upstream commit: 81d85346b3fcd8b3167eac8b5fb415a210bd4345 ] Commit 30688a9 ([VLAN]: Handle vlan devices net namespace changing) changed the device notifier to special-case notifications for VLAN devices, effectively disabling state propagation to underlying VLAN devices. This is needed for layered VLANs though, so restore the original behaviour. Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16l2tp: avoid skb truesize bug if headroom is increasedJames Chapman
[ upstream commit: 090c48d3dd5ea90b37350334aaed9a93b0c1e0a1 ] A user reported seeing occasional bugs such as the following when using the L2TP driver. SKB BUG: Invalid truesize (272) len=72, sizeof(sk_buff)=208 When L2TP adds its header in the transmit path, it might need to increase the headroom of the skb. In some cases, the increased headroom trips a kernel bug when the skb is freed because the skb has grown beyond its truesize value. The fix is to increase the truesize by the amount of headroom added, after orphaning the skb. While here, fix a misleading comment. Thanks to Iouri Kharon <bc-info@styx.cabel.net> for the initial report and testing the fix. Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16netlink: Fix nla_parse_nested_compat() to call nla_parse() directlyThomas Graf
[ upstream commit: b9a2f2e450b0f770bb4347ae8d48eb2dea701e24 ] The purpose of nla_parse_nested_compat() is to parse attributes which contain a struct followed by a stream of nested attributes. So far, it called nla_parse_nested() to parse the stream of nested attributes which was wrong, as nla_parse_nested() expects a container attribute as data which holds the attribute stream. It needs to call nla_parse() directly while pointing at the next possible alignment point after the struct in the beginning of the attribute. With this patch, I can no longer reproduce the reported leftover warnings. Signed-off-by: Thomas Graf <tgraf@suug.ch> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16ipsec: Use the correct ip_local_out functionHerbert Xu
[ upstream commit: 1ac06e0306d0192a7a4d9ea1c9e06d355ce7e7d3 ] Because the IPsec output function xfrm_output_resume does its own dst_output call it should always call __ip_local_output instead of ip_local_output as the latter may invoke dst_output directly. Otherwise the return values from nf_hook and dst_output may clash as they both use the value 1 but for different purposes. When that clash occurs this can cause a packet to be used after it has been freed which usually leads to a crash. Because the offending value is only returned from dst_output with qdiscs such as HTB, this bug is normally not visible. Thanks to Marco Berizzi for his perseverance in tracking this down. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16net_sched: cls_api: fix return value for non-existant classifiersPatrick McHardy
[ upstream commit: f2df824948d559ea818e03486a8583e42ea6ab37 ] cls_api should return ENOENT when the requested classifier doesn't exist. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16net: Fix call to ->change_rx_flags(dev, IFF_MULTICAST) in dev_change_flags()David Woodhouse
[ upstream commit: 0e91796eb46e29edc791131c832a2232bcaed9dd ] Am I just being particularly dim today, or can the call to dev->change_rx_flags(dev, IFF_MULTICAST) in dev_change_flags() never happen? We've just set dev->flags = flags & IFF_MULTICAST, effectively. So the condition '(dev->flags ^ flags) & IFF_MULTICAST' is _never_ going to be true. Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16cassini: Only use chip checksum for ipv4 packets.David S. Miller
[ upstream commit: b1443e2f6501f06930a162ff1ff08382a98bf23e ] According to David Monro, at least with Natsemi Saturn chips the cassini driver has some trouble with ipv6 checksums. Until we have more information about what's going on here, only use the chip checksums for ipv4. This workaround was suggested and tested by David. Update version and release date. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16can: Fix copy_from_user() results interpretationSam Ravnborg
[ Upstream commit: 3f91bd420a955803421f2db17b2e04aacfbb2bb8 ] Both copy_to_ and _from_user return the number of bytes, that failed to reach their destination, not the 0/-EXXX values. Based on patch from Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Acked-by: Oliver Hartkopp <oliver.hartkopp@volkswagen.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16bluetooth: rfcomm_dev_state_change deadlock fixDave Young
commit 537d59af73d894750cff14f90fe2b6d77fbab15b in mainline There's logic in __rfcomm_dlc_close: rfcomm_dlc_lock(d); d->state = BT_CLOSED; d->state_changed(d, err); rfcomm_dlc_unlock(d); In rfcomm_dev_state_change, it's possible that rfcomm_dev_put try to take the dlc lock, then we will deadlock. Here fixed it by unlock dlc before rfcomm_dev_get in rfcomm_dev_state_change. why not unlock just before rfcomm_dev_put? it's because there's another problem. rfcomm_dev_get/rfcomm_dev_del will take rfcomm_dev_lock, but in rfcomm_dev_add the lock order is : rfcomm_dev_lock --> dlc lock so I unlock dlc before the taken of rfcomm_dev_lock. Actually it's a regression caused by commit 1905f6c736cb618e07eca0c96e60e3c024023428 ("bluetooth : __rfcomm_dlc_close lock fix"), the dlc state_change could be two callbacks : rfcomm_sk_state_change and rfcomm_dev_state_change. I missed the rfcomm_sk_state_change that time. Thanks Arjan van de Ven <arjan@linux.intel.com> for the effort in commit 4c8411f8c115def968820a4df6658ccfd55d7f1a ("bluetooth: fix locking bug in the rfcomm socket cleanup handling") but he missed the rfcomm_dev_state_change lock issue. Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-06-16bluetooth: fix locking bug in the rfcomm socket cleanup handlingArjan van de Ven
[ Upstream commit: 7dccf1f4e1696c79bff064c3770867cc53cbc71c ] in net/bluetooth/rfcomm/sock.c, rfcomm_sk_state_change() does the following operation: if (parent && sock_flag(sk, SOCK_ZAPPED)) { /* We have to drop DLC lock here, otherwise * rfcomm_sock_destruct() will dead lock. */ rfcomm_dlc_unlock(d); rfcomm_sock_kill(sk); rfcomm_dlc_lock(d); } } which is fine, since rfcomm_sock_kill() will call sk_free() which will call rfcomm_sock_destruct() which takes the rfcomm_dlc_lock()... so far so good. HOWEVER, this assumes that the rfcomm_sk_state_change() function always gets called with the rfcomm_dlc_lock() taken. This is the case for all but one case, and in that case where we don't have the lock, we do a double unlock followed by an attempt to take the lock, which due to underflow isn't going anywhere fast. This patch fixes this by moving the stragling case inside the lock, like the other usages of the same call are doing in this code. This was found with the help of the www.kerneloops.org project, where this deadlock was observed 51 times at this point in time: http://www.kerneloops.org/search.php?search=rfcomm_sock_destruct Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16ax25: Fix NULL pointer dereference and lockup.Jarek Poplawski
[ Upstream commit: 7dccf1f4e1696c79bff064c3770867cc53cbc71c ] There is only one function in AX25 calling skb_append(), and it really looks suspicious: appends skb after previously enqueued one, but in the meantime this previous skb could be removed from the queue. This patch Fixes it the simple way, so this is not fully compatible with the current method, but testing hasn't shown any problems. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16af_key: Fix selector family initialization.Kazunori MIYAZAWA
[ upstream commit: 4da5105687e0993a3bbdcffd89b2b94d9377faab ] This propagates the xfrm_user fix made in commit bcf0dda8d2408fe1c1040cdec5a98e5fcad2ac72 ("[XFRM]: xfrm_user: fix selector family initialization") Based upon a bug report from, and tested by, Alan Swanson. Signed-off-by: Kazunori MIYAZAWA <kazunori@miyazawa.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16sunhv: Fix locking in non-paged I/O case.David S. Miller
[ upstream commit: 3651751fff44ede58f65cbb1e39242139ead251b ] This causes the lock to be taken twice, thus resulting in a deadlock. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16fbdev: export symbol fb_mode_optionGeoff Levand
upstream commit: 659179b28f15ab1b1db5f8767090f5e728f115a1 Frame buffer and mode setting drivers can be built as modules, so fb_mode_option needs to be exported to support these. Prevents this error: ERROR: "fb_mode_option" [drivers/ps3/ps3av_mod.ko] undefined! Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com> Acked-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16ecryptfs: fix missed mutex_unlockCyrill Gorcunov
upstream commit: 71fd5179e8d1d4d503b517e0c5374f7c49540bfc Cc: Michael Halcrow <mhalcrow@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16ecryptfs: clean up (un)lock_parentMiklos Szeredi
upstream commit: 8dc4e37362a5dc910d704d52ac6542bfd49ddc2f dget(dentry->d_parent) --> dget_parent(dentry) unlock_parent() is racy and unnecessary. Replace single caller with unlock_dir(). There are several other suspect uses of ->d_parent in ecryptfs... Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16eCryptfs: protect crypt_stat->flags in ecryptfs_open()Michael Halcrow
upstream commit: 2f9b12a31fcb738ea8c9eb0d4ddf906c6f1d696c Make sure crypt_stat->flags is protected with a lock in ecryptfs_open(). Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16ecryptfs: add missing lock around notify_changeMiklos Szeredi
upstream commit: 9c3580aa52195699065bc2d7242b1c7e3e6903fa Callers of notify_change() need to hold i_mutex. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16sound: emu10k1 - fix system hang with Audigy2 ZS Notebook PCMCIA cardJaroslav Franek
upstream commit: 868e15dbd2940f9453b4399117686f408dc77299 When the Linux kernel is compiled with CONFIG_DEBUG_SHIRQ=y, the Soundblaster Audigy2 ZS Notebook PCMCIA card causes the system hang during boot (udev stage) or when the card is hot-plug. The CONFIG_DEBUG_SHIRQ flag is by default 'y' with all Fedora kernels since 2.6.23. The problem was reported as https://bugzilla.redhat.com/show_bug.cgi?id=326411 The issue was hunted down to the snd_emu10k1_create() routine: /* pseudo-code */ snd_emu10k1_create(...) { ... request_irq(... IRQF_SHARED ...) { register the irq handler #ifdef CONFIG_DEBUG_SHIRQ call the irq handler: snd_emu10k1_interrupt() { poll I/O port // <---- !! system hangs ... } #endif } ... snd_emu10k1_cardbus_init(...) { initialize I/O ports } ... } The early access to I/O port in the interrupt handler causes the freeze. Obviously it is necessary to init the I/O ports before accessing them. This patch moves the registration of the irq handler after the initialization of the I/O ports. Signed-off-by: Jaroslav Franek <jarin.franek@post.cz> Acked-by: James Courtier-Dutton <James@superbug.co.uk> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16ALSA: hda - Fix resume of auto-config mode with Realtek codecsTakashi Iwai
upstream commit: 07bc76dfa19b10017b518dd9aa1b2719e8c863de The auto-config mode of Realtek ALC codecs has a bug since 2.6.25 that it cannot resume properly. The problem was the wrong assignment of init_hook that overrides the whole initialization. Relevant bug reports: http://bugzilla.kernel.org/show_bug.cgi?id=10662 https://bugzilla.novell.com/show_bug.cgi?id=385473 Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16double-free of inode on alloc_file() failure exit in create_write_pipe()Al Viro
upstream commit: ed1524371716466e9c762808b02601d0d0276a92 Duh... Fortunately, the bug is quite recent (post-2.6.25) and, embarrassingly, mine ;-/ http://bugzilla.kernel.org/show_bug.cgi?id=10878 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16ssb: Fix context assertion in ssb_pcicore_dev_irqvecs_enableMichael Buesch
upstream commit: a3bafeedfff2ac5fa0a316bea4570e27900b6fcc This fixes a context assertion in ssb that makes b44 print out warnings on resume. This fixes the following kernel oops: http://www.kerneloops.org/oops.php?number=12732 http://www.kerneloops.org/oops.php?number=11410 Signed-off-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16Add 'rd' alias to new brd ramdisk driverNick Piggin
upstream commit: efedf51c866130945b5db755cb58670e60205d83 Alias brd to rd in the hope of helping legacy users. Suggested by Jan. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16ipwireless: Fix blocked sendingDavid Sterba
upstream commit: eb4e545d4ac82d9018487edb4419b33b9930c857 Packet sending is driven by two flags, tx_ready and tx_queued. It was possible, that there were queued data for sending and hardware was flagged as blocked but in fact it was not. The tx_queued was indicator but should be really a counter else first fragmented packet resets tx_queued flag, but there may be pending packets which do not get sent. New semantics: tx_ready - set, if hw is ready to send packet, no packet is being transferred right now set the flag right at the place where data are copied into hw memory and not earlier without checking if it was succesful tx_queued - count of enqueued packets, including fragments Tested-by: Michal Rokos <michal.rokos@gmail.com> Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-16b43: Fix controller restart crashMichael Buesch
upstream commit: 3bf0a32e22fedc0b46443699db2d61ac2a883ac4 This fixes a kernel crash on rmmod, in the case where the controller was restarted before doing the rmmod. Signed-off-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-09Linux 2.6.25.6v2.6.25.6Chris Wright
2008-06-09x86: fix bad pmd ffff810000207xxx(9090909090909090)Hugh Dickins
upstream commit: 2884f110d5409714f3a04eeb6d2ecd77da66b242 OGAWA Hirofumi and Fede have reported rare pmd_ERROR messages: mm/memory.c:127: bad pmd ffff810000207xxx(9090909090909090). Initialization's cleanup_highmap was leaving alignment filler behind in the pmd for MODULES_VADDR: when vmalloc's guard page would occupy a new page table, it's not allocated, and then module unload's vfree hits the bad 9090 pmd entry left over. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Hugh notes: It's actually not a serious problem, but it does look as if it's a serious problem, so we should stamp it out. Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-09cpufreq: fix null object access on Transmeta CPUCHIKAMA masaki
upstream commit: 879000f94442860e72c934f9e568989bc7fb8ec4 If cpu specific cpufreq driver(i.e. longrun) has "setpolicy" function, governor object isn't set into cpufreq_policy object at "__cpufreq_set_policy" function in driver/cpufreq/cpufreq.c . This causes a null object access at "store_scaling_setspeed" and "show_scaling_setspeed" function in driver/cpufreq/cpufreq.c when reading or writing through /sys interface (ex. cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed) Addresses: http://bugzilla.kernel.org/show_bug.cgi?id=10654 https://bugzilla.redhat.com/show_bug.cgi?id=443354 Signed-off-by: CHIKAMA Masaki <masaki.chikama@gmail.com> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: Chuck Ebbert <cebbert@redhat.com> Acked-by: Dominik Brodowski <linux@dominikbrodowski.net> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-09capabilities: remain source compatible with 32-bit raw legacy capability ↵Andrew G. Morgan
support. upstream commit: ca05a99a54db1db5bca72eccb5866d2a86f8517f Source code out there hard-codes a notion of what the _LINUX_CAPABILITY_VERSION #define means in terms of the semantics of the raw capability system calls capget() and capset(). Its unfortunate, but true. Since the confusing header file has been in a released kernel, there is software that is erroneously using 64-bit capabilities with the semantics of 32-bit compatibilities. These recently compiled programs may suffer corruption of their memory when sys_getcap() overwrites more memory than they are coded to expect, and the raising of added capabilities when using sys_capset(). As such, this patch does a number of things to clean up the situation for all. It 1. forces the _LINUX_CAPABILITY_VERSION define to always retain its legacy value. 2. adopts a new #define strategy for the kernel's internal implementation of the preferred magic. 3. deprecates v2 capability magic in favor of a new (v3) magic number. The functionality of v3 is entirely equivalent to v2, the only difference being that the v2 magic causes the kernel to log a "deprecated" warning so the admin can find applications that may be using v2 inappropriately. [User space code continues to be encouraged to use the libcap API which protects the application from details like this. libcap-2.10 is the first to support v3 capabilities.] Fixes issue reported in https://bugzilla.redhat.com/show_bug.cgi?id=447518. Thanks to Bojan Smojver for the report. [akpm@linux-foundation.org: s/depreciate/deprecate/g] [akpm@linux-foundation.org: be robust about put_user size] [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Andrew G. Morgan <morgan@kernel.org> Cc: Serge E. Hallyn <serue@us.ibm.com> Cc: Bojan Smojver <bojan@rexursive.com> Cc: stable@kernel.org Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-09md: fix prexor vs sync_request raceDan Williams
upstream commit: e0a115e5aa554b93150a8dc1c3fe15467708abb2 During the initial array synchronization process there is a window between when a prexor operation is scheduled to a specific stripe and when it completes for a sync_request to be scheduled to the same stripe. When this happens the prexor completes and the stripe is unconditionally marked "insync", effectively canceling the sync_request for the stripe. Prior to 2.6.23 this was not a problem because the prexor operation was done under sh->lock. The effect in older kernels being that the prexor would still erroneously mark the stripe "insync", but sync_request would be held off and re-mark the stripe as "!in_sync". Change the write completion logic to not mark the stripe "in_sync" if a prexor was performed. The effect of the change is to sometimes not set STRIPE_INSYNC. The worst this can do is cause the resync to stall waiting for STRIPE_INSYNC to be set. If this were happening, then STRIPE_SYNCING would be set and handle_issuing_new_read_requests would cause all available blocks to eventually be read, at which point prexor would never be used on that stripe any more and STRIPE_INSYNC would eventually be set. echo repair > /sys/block/mdN/md/sync_action will correct arrays that may have lost this race. Cc: <stable@kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [chrisw: backport to 2.6.25.5] Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-09md: fix uninitialized use of mddev->recovery_waitDan Williams
upstream commit: a6d8113a986c66aeb379a26b6e0062488b3e59e1 If an array was created with --assume-clean we will oops when trying to set ->resync_max. Fix this by initializing ->recovery_wait in mddev_find. Cc: <stable@kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-09md: do not compute parity unless it is on a failed driveDan Williams
upstream commit: c337869d95011495fa181536786e74aa2d7ff031 If a block is computed (rather than read) then a check/repair operation may be lead to believe that the data on disk is correct, when infact it isn't. So only compute blocks for failed devices. This issue has been around since at least 2.6.12, but has become harder to hit in recent kernels since most reads bypass the cache. echo repair > /sys/block/mdN/md/sync_action will set the parity blocks to the correct state. Cc: <stable@kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-09eCryptfs: remove unnecessary page decrypt callMichael Halcrow
upstream commit: d3e49afbb66109613c3474f2273f5830ac2dcb09 The page decrypt calls in ecryptfs_write() are both pointless and buggy. Pointless because ecryptfs_get_locked_page() has already brought the page up to date, and buggy because prior mmap writes will just be blown away by the decrypt call. This patch also removes the declaration of a now-nonexistent function ecryptfs_write_zeros(). Thanks to Eric Sandeen and David Kleikamp for helping to track this down. Eric said: fsx w/ mmap dies quickly ( < 100 ops) without this, and survives nicely (to millions of ops+) with it in place. Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com> Cc: Eric Sandeen <sandeen@redhat.com> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [chrisw: backport to 2.6.25.5] Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-09brk: make sys_brk() honor COMPAT_BRK when computing lower boundJiri Kosina
upstream commit: a5b4592cf77b973c29e7c9695873a26052b58951 Fix a regression introduced by commit 4cc6028d4040f95cdb590a87db478b42b8be0508 Author: Jiri Kosina <jkosina@suse.cz> Date: Wed Feb 6 22:39:44 2008 +0100 brk: check the lower bound properly The check in sys_brk() on minimum value the brk might have must take CONFIG_COMPAT_BRK setting into account. When this option is turned on (i.e. we support ancient legacy binaries, e.g. libc5-linked stuff), the lower bound on brk value is mm->end_code, otherwise the brk start is allowed to be arbitrarily shifted. Signed-off-by: Jiri Kosina <jkosina@suse.cz> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-09pagemap: fix bug in add_to_pagemap, require aligned-length reads of ↵Thomas Tuttle
/proc/pid/pagemap upstream commit: aae8679b0ebcaa92f99c1c3cb0cd651594a43915 Fix a bug in add_to_pagemap. Previously, since pm->out was a char *, put_user was only copying 1 byte of every PFN, resulting in the top 7 bytes of each PFN not being copied. By requiring that reads be a multiple of 8 bytes, I can make pm->out and pm->end u64*s instead of char*s, which makes put_user work properly, and also simplifies the logic in add_to_pagemap a bit. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Thomas Tuttle <ttuttle@google.com> Cc: Matt Mackall <mpm@selenic.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-09proc: calculate the correct /proc/<pid> link countVegard Nossum
upstream commit: aed5417593ad125283f35513573282139a8664b5 This patch: commit e9720acd728a46cb40daa52c99a979f7c4ff195c Author: Pavel Emelyanov <xemul@openvz.org> Date: Fri Mar 7 11:08:40 2008 -0800 [NET]: Make /proc/net a symlink on /proc/self/net (v3) introduced a /proc/self/net directory without bumping the corresponding link count for /proc/self. This patch replaces the static link count initializations with a call that counts the number of directory entries in the given pid_entry table whenever it is instantiated, and thus relieves the burden of manually keeping the two in sync. [akpm@linux-foundation.org: cleanup] Acked-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-09Smack: fuse mount hang fixCasey Schaufler
upstream commit: e97dcb0eadbb821eccd549d4987b653cf61e2374 The d_instantiate hook for Smack can hang on the root inode of a filesystem if the file system code has not really done all the set-up. Fuse is known to encounter this problem. This change detects an attempt to instantiate a root inode and addresses it early in the processing, before any attempt is made to do something that might hang. Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> Tested-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-09atl1: fix 4G memory corruption bugAlexey Dobriyan
upstream commit: aefdbf1a3b832a580a50cf3d1dcbb717be7cbdbe When using 4+ GB RAM and SWIOTLB is active, the driver corrupts memory by writing an skb after the relevant DMA page has been unmapped. Although this doesn't happen when *not* using bounce buffers, clearing the pointer to the DMA page after unmapping it fixes the problem. http://marc.info/?t=120861317000005&r=2&w=2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Jay Cliburn <jacliburn@bellsouth.net> Signed-off-by: Jeff Garzik <jgarzik@redhat.com> [jacliburn@bellsouth.net: backport to 2.6.25.4] Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-09netfilter: nf_conntrack_ipv6: fix inconsistent lock state in ↵Patrick McHardy
nf_ct_frag6_gather() upstream commit: b9c698964614f71b9c8afeca163a945b4c2e2d20 [ 63.531438] ================================= [ 63.531520] [ INFO: inconsistent lock state ] [ 63.531520] 2.6.26-rc4 #7 [ 63.531520] --------------------------------- [ 63.531520] inconsistent {softirq-on-W} -> {in-softirq-W} usage. [ 63.531520] tcpsic6/3864 [HC0[0]:SC1[1]:HE1:SE0] takes: [ 63.531520] (&q->lock#2){-+..}, at: [<c07175b0>] ipv6_frag_rcv+0xd0/0xbd0 [ 63.531520] {softirq-on-W} state was registered at: [ 63.531520] [<c0143bba>] __lock_acquire+0x3aa/0x1080 [ 63.531520] [<c0144906>] lock_acquire+0x76/0xa0 [ 63.531520] [<c07a8f0b>] _spin_lock+0x2b/0x40 [ 63.531520] [<c0727636>] nf_ct_frag6_gather+0x3f6/0x910 ... According to this and another similar lockdep report inet_fragment locks are taken from nf_ct_frag6_gather() with softirqs enabled, but these locks are mainly used in softirq context, so disabling BHs is necessary. Reported-and-tested-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-09netfilter: xt_connlimit: fix accouning when receive RST packet in ↵Patrick McHardy
ESTABLISHED state upstream commit: d2ee3f2c4b1db1320c1efb4dcaceeaf6c7e6c2d3 In xt_connlimit match module, the counter of an IP is decreased when the TCP packet is go through the chain with ip_conntrack state TW. Well, it's very natural that the server and client close the socket with FIN packet. But when the client/server close the socket with RST packet(using so_linger), the counter for this connection still exsit. The following patch can fix it which is based on linux-2.6.25.4 Signed-off-by: Dong Wei <dwei.zh@gmail.com> Acked-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-09netfilter: nf_conntrack_expect: fix error path unwind in ↵Patrick McHardy
nf_conntrack_expect_init() upstream commit: 12293bf91126ad253a25e2840b307fdc7c2754c3 Signed-off-by: Alexey Dobriyan <adobriyan@parallels.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-09x86, fpu: fix CONFIG_PREEMPT=y corruption of application's FPU stackSuresh Siddha
upstream commit: 870568b39064cab2dd971fe57969916036982862 Jürgen Mell reported an FPU state corruption bug under CONFIG_PREEMPT, and bisected it to commit v2.6.19-1363-gacc2076, "i386: add sleazy FPU optimization". Add tsk_used_math() checks to prevent calling math_state_restore() which can sleep in the case of !tsk_used_math(). This prevents making a blocking call in __switch_to(). Apparently "fpu_counter > 5" check is not enough, as in some signal handling and fork/exec scenarios, fpu_counter > 5 and !tsk_used_math() is possible. It's a side effect though. This is the failing scenario: process 'A' in save_i387_ia32() just after clear_used_math() Got an interrupt and pre-empted out. At the next context switch to process 'A' again, kernel tries to restore the math state proactively and sees a fpu_counter > 0 and !tsk_used_math() This results in init_fpu() during the __switch_to()'s math_state_restore() And resulting in fpu corruption which will be saved/restored (save_i387_fxsave and restore_i387_fxsave) during the remaining part of the signal handling after the context switch. Bisected-by: Jürgen Mell <j.mell@t-online.de> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Tested-by: Jürgen Mell <j.mell@t-online.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-09CPUFREQ: Make acpi-cpufreq more robust against BIOS freq changes behind our ↵Venkatesh Pallipadi
back. upstream commit: e56a727b023d40d1adf660168883f30f2e6abe0a We checked the hardware freq with OS cached freq value in get_cur_freqon_cpu(). Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Dave Jones <davej@redhat.com> Cc: Thomas Renninger <trenn@suse.de> Cc: "Anthony L. Awtrey" <tony@awtrey.com> [chrisw: backport to 2.6.25.4] Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-09HID: split Numlock emulation quirk from HID_QUIRK_APPLE_HAS_FN.Diego 'Flameeyes' Petteno
upstream commit: 6e7045990f35ef9250804b3fd85e855b8c2aaeb6. [jkosina@suse.cz: Needed to fix apple aluminium keyboard regression] Since 2.6.25 the HID_QUIRK_APPLE_HAS_FN quirk is enabled even for non-laptop Apple keyboards of the Aluminium series. The USB version of these don't need Numlock emulation, like the laptop (and Aluminium Wireless) do, as they have a proper keypad. This patch splits the Numlock emulation for Apple keyboards in a different quirk flag, so that it can be enabled for all the keyboards but the Aluminium USB ones. If the Numlock emulation is enabled for Aluminium USB keyboards, the JKL and UIO keys become the numeric pad, and the rest of the keyboard is disabled, included the key used to disable Numlock. Additionally, these keyboard should not have a Numlock at all, as the Numlock key is instead replaced by the 'Clear' key as usual for Apple USB keyboards. Signed-off-by: Diego 'Flameeyes' Petteno <flameeyes@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-09netfilter: xt_iprange: module aliases for xt_iprangePhil Oester
upstream commit: 01b7a314291b2ef56ad718ee1374a1bac4768b29 Using iptables 1.3.8 with kernel 2.6.25, rules which include '-m iprange' don't automatically pull in xt_iprange module. Below patch adds module aliases to fix that. Patch against latest -git, but seems like a good candidate for -stable also. Signed-off-by: Phil Oester <kernel@linuxace.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-09PS3: gelic: fix memory leakMasakazu Mokuno
upstream commit: 6fc7431dc0775f21ad7a7a39c2ad0290291a56ea This fixes the bug that the I/O buffer is not freed at the driver removal. Signed-off-by: Masakazu Mokuno <mokuno@sm.sony.co.jp> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-09Revert "PCI: remove default PCI expansion ROM memory allocation"Linus Torvalds
upstream commit: 8d539108560ec121d59eee05160236488266221c This reverts commit 9f8daccaa05c14e5643bdd4faf5aed9cc8e6f11e, which was reported to break X startup (xf86-video-ati-6.8.0). See http://bugs.freedesktop.org/show_bug.cgi?id=15523 for details. Reported-by: Laurence Withers <l@lwithers.me.uk> Cc: Gary Hade <garyhade@us.ibm.com> Cc: Greg KH <greg@kroah.com> Cc: Jan Beulich <jbeulich@novell.com> Cc: "Jun'ichi Nomura" <j-nomura@ce.jp.nec.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [cebbert@redhat.com: backport, remove first hunk to make port easier] [chrisw@sous-sol.org: add back first hunk] Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-09x86: prevent PGE flush from interruption/preemptionIngo Molnar
upstream commit: b1979a5fda7869a790f4fd83fb06c78498d26ba1 CR4 manipulation is not protected against interrupts and preemption, but KVM uses smp_function_call to manipulate the X86_CR4_VMXE bit either from the CPU hotplug code or from the kvm_init call. We need to protect the CR4 manipulation from both interrupts and preemption. Original bug report: http://lkml.org/lkml/2008/5/7/48 Bugzilla entry: http://bugzilla.kernel.org/show_bug.cgi?id=10642 This is not a regression from 2.6.25, it's a long standing and hard to trigger bug. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-06-09XFS: Fix memory corruption with small buffer readsChristoph Hellwig
upstream commit: 6ab455eeaff6893cd06da33843e840d888cdc04a When we have multiple buffers in a single page for a blocksize == pagesize filesystem we might overwrite the page contents if two callers hit it shortly after each other. To prevent that we need to keep the page locked until I/O is completed and the page marked uptodate. Thanks to Eric Sandeen for triaging this bug and finding a reproducible testcase and Dave Chinner for additional advice. This should fix kernel.org bz #10421. Tested-by: Eric Sandeen <sandeen@sandeen.net> SGI-PV: 981813 SGI-Modid: xfs-linux-melb:xfs-kern:31173a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>