linux-toradex.git - Linux kernel for Apalis, Colibri and Verdin modules

Age	Commit message	Author
2007-02-05	[PATCH] jmicron: 40/80pin primary detection	ethanhsiao@jmicron.com
	jmicron module detects all JMB36x as JMB361 and PATA0 has wrong pin status of XICBLID. Cc: Jeff Garzik <jeff@garzik.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> cebbert@redhat.com: I folded in the warning fix (a51545ab25) because otherwise it makes the tester think the patch caused the warning that was already there. Cc: Dave Jones <davej@redhat.com> Signed-off-by: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-02-05	[PATCH] bonding: ARP monitoring broken on x86_64	Andy Gospodarek
	While working with the latest bonding code I noticed a nasty problem that will prevent arp monitoring from always functioning correctly on x86_64 systems. Comparing ints to longs and expecting reliable results on x86_64 is a bad idea. With this patch, arp monitoring works correctly again. Signed-off-by: Andy Gospodarek <andy@greyhouse.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Stephen Hemminger <shemminger@osdl.org> Cc: Jeff Garzik <jeff@garzik.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-02-05	[PATCH] libata: use kmap_atomic(KM_IRQ0) in SCSI simulator	Jeff Garzik
	We are inside spin_lock_irqsave(). quoth akpm's debug facility: [ 231.948000] SCSI device sda: 195371568 512-byte hdwr sectors (100030 MB) [ 232.232000] ata1.00: configured for UDMA/33 [ 232.404000] WARNING (1) at arch/i386/mm/highmem.c:47 kmap_atomic() [ 232.404000] [<c01162e6>] kmap_atomic+0xa9/0x1ab [ 232.404000] [<c0242c81>] ata_scsi_rbuf_get+0x1c/0x30 [ 232.404000] [<c0242caf>] ata_scsi_rbuf_fill+0x1a/0x87 [ 232.404000] [<c0243ab2>] ata_scsiop_mode_sense+0x0/0x309 [ 232.404000] [<c01729d5>] end_bio_bh_io_sync+0x0/0x37 [ 232.404000] [<c02311c6>] scsi_done+0x0/0x16 [ 232.404000] [<c02311c6>] scsi_done+0x0/0x16 [ 232.404000] [<c0242dcc>] ata_scsi_simulate+0xb0/0x13f [...] Signed-off-by: Jeff Garzik <jeff@garzik.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-02-05	[PATCH] md: fix potential memalloc deadlock in md	NeilBrown
	If a GFP_KERNEL allocation is attempted in md while the mddev_lock is held, it is possible for a deadlock to eventuate. This happens if the array was marked 'clean', and the memalloc triggers a write-out to the md device. For the writeout to succeed, the array must be marked 'dirty', and that requires getting the mddev_lock. So, before attempting a GFP_KERNEL allocation while holding the lock, make sure the array is marked 'dirty' (unless it is currently read-only). Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-02-05	[PATCH] md: fix a few problems with the interface (sysfs and ioctl) to md.	NeilBrown
	While developing more functionality in mdadm I found some bugs in md... - When we remove a device from an inactive array (write 'remove' to the 'state' sysfs file - see 'state_store') we should not update the superblock information - as we may not have read and processed it all properly yet. - initialise all raid_disk entries to '-1' else the 'slot' sysfs file will claim '0' for all devices in an array before the array is started. - allow '\n' to be present at the end of words written to sysfs files - when we use SET_ARRAY_INFO to set the md metadata version, set the flag to say that there is persistent metadata. - allow GET_BITMAP_FILE to be called on an array that hasn't been started yet. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-02-05	[PATCH] md: make 'repair' actually work for raid1.	NeilBrown
	When 'repair' finds a block that is different on the various parts of the mirror, it is meant to write a chosen good version to the others. However it currently writes out the original data to each. The memcpy to make all the data the same is missing. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-02-05	[PATCH] md: assorted md and raid1 one-liners	NeilBrown
	Fix a few bugs that meant that: - superblocks weren't always written at exactly the right time (this could show up if the array was not written to - writing to the array causes lots of superblock updates and so hides these errors). - restarting device recovery after a clean shutdown (version-1 metadata only) didn't work as intended (or at all). 1/ Ensure superblock is updated when a new device is added. 2/ Remove an inappropriate test on MD_RECOVERY_SYNC in md_do_sync. The body of this if takes one of two branches depending on whether MD_RECOVERY_SYNC is set, so testing it in the clause of the if is wrong. 3/ Flag superblock for updating after a resync/recovery finishes. 4/ If we find the need to restart a recovery in the middle (version-1 metadata only) make sure a full recovery (not just as guided by bitmaps) does get done. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-02-05	[PATCH] sis190: failure to set the MAC address from EEPROM	Francois Romieu
	Fix from http://bugzilla.kernel.org/show_bug.cgi?id=7747 Signed-off-by: Andrew Morton <akpm@osdl.org> Cc: <sleepy@mike-neko.net> Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Signed-off-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-02-05	[PATCH] remove __devinit markings from rtc_sysfs_add_device()	Mike Frysinger
	rtc_sysfs_add_device is needed even after dev initialization, so drop __devinit. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Acked-by: Alessandro Zummo <a.zummo@towertech.it> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-02-05	[PATCH] Revert "[PATCH] Fix up mmap_kmem"	Linus Torvalds
	This reverts commit 99a10a60ba9bedcf5d70ef81414d3e03816afa3f. As per Hugh Dickins: "Nadia Derbey has reported that mmap of /dev/kmem no longer works with the kernel virtual address as offset, and Franck has confirmed that his patch came from a misunderstanding of what an offset means to /dev/kmem - whereas his patch description seems to say that he was correcting the offset on a few platforms, there was no such problem to correct, and his patch was in fact changing its API on all platforms." Suggested-by: Hugh Dickins <hugh@veritas.com> Cc: Franck Bui-Huu <fbuihuu@gmail.com> Cc: Nadia Derbey <Nadia.Derbey@bull.net> Cc: Andi Kleen <ak@suse.de> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-02-05	[PATCH] ACPI: fix cpufreq regression	Ingo Molnar
	recently cpufreq support on my laptop (Lenovo T60) broke completely: when it's plugged into AC it would never go higher than 1 GHz - neither 1.3 GHz nor 1.83 GHz is possible - no matter which governor (userspace, speed or ondemand) is used. after some cpufreq debugging i tracked the regression back to the following (totally correct) bug-fix commit: commit 0916bd3ebb7cefdd0f432e8491abe24f4b5a101e Author: Dave Jones <davej@redhat.com> Date: Wed Nov 22 20:42:01 2006 -0500 [PATCH] Correct bound checking from the value returned from _PPC method. this bugfix, which makes other laptops work, made a previously hidden (BIOS) bug visible on my laptop. The bug is the following: if the _PPC (Performance Present Capabilities) optional ACPI object is queried /after/ bootup then the BIOS reports an incorrect value of '2'. My laptop (Lenovo T60) has the following performance states supported: 0: 1833000 1: 1333000 2: 1000000 Per ACPI specification, a _PPC value of '0' means that all 3 performance states are usable. A _PPC value of '1' means states 1 .. 2 are usable, a value of '2' means only state '2' (slowest) is usable. now, the _PPC object is optional, and it also comes with notification. Furthermore, when a CPU object is initialized, the _PPC object is initialized as well. So the following evaluation of the _PPC object is superfluous: [<c028ba5f>] acpi_processor_get_platform_limit+0xa1/0xaf [<c028c040>] acpi_processor_register_performance+0x3b9/0x3ef [<c0111a85>] acpi_cpufreq_cpu_init+0xb7/0x596 [<c03dab74>] cpufreq_add_dev+0x160/0x4a8 [<c02bed90>] sysdev_driver_register+0x5a/0xa0 [<c03d9c4c>] cpufreq_register_driver+0xb4/0x176 [<c068ac08>] acpi_cpufreq_init+0xe5/0xeb [<c010056e>] init+0x14f/0x3dd and this is the point where my laptop's BIOS returns the incorrect value of '2'. Note that it has not sent any notification event, so the value is probably not really intentional (possibly spurious), and Windows likely doesnt query it after bootup either. 
Maybe the value is kept at '2' normally, and is only set to the real value when a true asynchronous event (such as AC plug event, battery switch, etc.) occurs. So i /think/ this is a grey area of the ACPI spec: per the letter of the spec the _PPC value only changes when notified, so there's no reason to query it after the system has booted up. So in my opinion the best (and most compatible) strategy would be to do the change below, and to not evaluate the _PPC object in the acpi_processor_get_performance_info() call, but only evaluate it if _PPC is present during CPU object init, or if it's notified during an asynchronous event. This change is more permissive than the previous logic, so it definitely shouldnt break any existing system. This also happens to fix my laptop, which is merrily chugging along at 1.83 GHz now. Yay! Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Dave Jones <davej@redhat.com> Acked-by: Len Brown <len.brown@intel.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
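The _PPC semantics described in this commit can be sketched as below. This is an illustrative Python model rather than kernel code: `usable_states` and `T60_STATES_KHZ` are hypothetical names, and the table simply repeats the three Lenovo T60 performance states quoted in the message.

```python
# Illustrative model only; the kernel implements this logic in C.
# Performance states of the Lenovo T60 from the commit message, in kHz,
# indexed by state number (state 0 is the fastest).
T60_STATES_KHZ = [1833000, 1333000, 1000000]


def usable_states(states_khz, ppc):
    """Return the performance states still allowed for a given _PPC value.

    A _PPC value of n means states n .. last are usable, so 0 allows
    every state and the highest value leaves only the slowest one.
    """
    if not 0 <= ppc < len(states_khz):
        raise ValueError("_PPC value outside the performance state table")
    return states_khz[ppc:]


if __name__ == "__main__":
    for ppc in range(len(T60_STATES_KHZ)):
        print(ppc, usable_states(T60_STATES_KHZ, ppc))
```

With the value of 2 that the BIOS reports after bootup, only the 1.0 GHz state is left, which matches the symptom described in the message.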
2007-02-05	[PATCH] IB/iser: return error code when PDUs may not be sent	Erez Zilber
	iSER limits the number of outstanding PDUs to send. When this threshold is reached, it should return an error code (-ENOBUFS) instead of setting the suspend_tx bit (which should be used only by libiscsi). Without this fix, during logout, open-iscsi over iSER tries to logout forever. Signed-off-by: Erez Zilber <erezz@voltaire.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-02-05	[PATCH] PCI: prevent down_read when pci_devices is empty	Ard van Breemen
	The pci_find_subsys gets called very early by obsolete ide setup parameters. This is a bogus call since pci is not initialized yet, so the list is empty. But in the mean time, interrupts get enabled by down_read. This can result in a kernel panic when the irq controller gets initialized. This patch checks if the device list is empty before taking the semaphore, and hence will not enable irq's. Furthermore it will inform that it is called while pci_devices is empty as a reminder that the ide code needs to be fixed. The pci_get_subsys can get called in the same manner, and as such is patched in the same manner. [akpm@osdl.org: cleanups] Signed-off-by: Ard van Breemen <ard@telegraafnet.nl> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> [chrisw: fold in 6a4c24ec5212 to avoid printk spamming] Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-02-05	[PATCH] ieee1394: sbp2: fix probing of some DVD-ROM/RWs	Stefan Richter
	Since commit 98e238cd42be6c0852da519303cf0182690f8d9f in Linux 2.6.19, "ieee1394: sbp2: don't prefer MODE SENSE 10", some FireWire DVD-ROMs and DVD-RWs were mistaken as CD-ROM because sr_mod now sent MODE SENSE 6. The MMC command set includes only MODE SENSE 10. http://bugzilla.kernel.org/show_bug.cgi?id=7800 This fix lets sbp2 switch scsi_device.use_10_for_rw on for MMC LUs. This should rather be done in the command set driver sr_mod, not in the sbp2 transport driver, and an according patch will follow for a next Linux release. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-02-05	[PATCH] IB/mthca: Fix off-by-one in FMR handling on memfree	Michael S. Tsirkin
	mthca_table_find() will return the wrong address when the table entry being searched for is exactly at the beginning of a sglist entry (other than the first), because it uses >= when it should use >. Example: assume we have 2 entries in scatterlist, 4K each, offset is 4K. The current code will return first entry + 4K when we really want the second entry. In particular this means mapping an FMR on a memfree HCA may end up writing the page table into the wrong place, leading to memory corruption and also causing the HCA to use an incorrect address translation table. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
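The off-by-one in this commit can be illustrated with a small model. The sketch below is Python, not the driver's C code: `find_entry` is a hypothetical helper that walks a list of entry lengths the way the message describes, and the `strict` flag switches between the fixed comparison (`>`) and the old one (`>=`).

```python
def find_entry(lengths, offset, strict=True):
    """Locate the scatterlist entry that contains a byte offset.

    Returns (entry index, offset inside that entry), or None if the
    offset lies beyond the last entry. strict=True models the fixed
    comparison (length > offset); strict=False models the old
    comparison (length >= offset).
    """
    for index, length in enumerate(lengths):
        contains = length > offset if strict else length >= offset
        if contains:
            return index, offset
        offset -= length
    return None


if __name__ == "__main__":
    entries = [4096, 4096]  # two 4K entries, as in the commit message
    print("old:  ", find_entry(entries, 4096, strict=False))
    print("fixed:", find_entry(entries, 4096, strict=True))
```

For the example in the message (two 4K entries, offset 4K) the old comparison resolves to the first entry plus 4K, while the fixed one resolves to the start of the second entry.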
2007-02-05	[PATCH] md: pass down BIO_RW_SYNC in raid{1,10}	Lars Ellenberg
	md raidX make_request functions strip off the BIO_RW_SYNC flag, thus introducing additional latency. Fixing this in raid1 and raid10 seems to be straightforward enough. For our particular usage case in DRBD, passing this flag improved some initialization time from ~5 minutes to ~5 seconds. Acked-by: NeilBrown <neilb@suse.de> Signed-off-by: Lars Ellenberg <lars@linbit.com> Acked-by: Jens Axboe <jens.axboe@oracle.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-02-05	[PATCH] Fix HWRNG built-in initcalls priority	Michael Buesch
	This changes all HWRNG driver initcalls to module_init(). We must probe the RNGs after the major kernel subsystems are already up and running (like PCI). This fixes Bug 7730. http://bugzilla.kernel.org/show_bug.cgi?id=7730 Signed-off-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-02-05	[PATCH] i2c/m41t00: Do not forget to write year	Philippe De Muyter
	m41t00.c forgets to set the year field in set_rtc_time; fix that. Signed-off-by: Philippe De Muyter <phdm@macqel.be> Acked-by: Mark A. Greer <mgreer@mvista.com> Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-02-05	[PATCH] i2c-mv64xxx: Fix random oops at boot	Maxime Bizon
	I have a Marvell board which has the same i2c hw block as mv64xxx, so I'm trying to use i2c-mv64xxx driver. But I get the following random oops at boot: Unable to handle kernel NULL pointer dereference at virtual address 00000002 Backtrace: [<c0397e4c>] (mv64xxx_i2c_intr+0x0/0x2b8) from [<c02879c4>] (__do_irq+0x4c/0x8c) [<c0287978>] (__do_irq+0x0/0x8c) from [<c0287c0c>] (do_level_IRQ+0x68/0xc0) r8 = C0501E08 r7 = 00000005 r6 = C0501E08 r5 = 00000005 r4 = C048BB78 [<c0287ba4>] (do_level_IRQ+0x0/0xc0) from [<c02885f8>] (asm_do_IRQ+0x50/0x134) r6 = C0449C78 r5 = F1020000 r4 = FFFFFFFF [<c02885a8>] (asm_do_IRQ+0x0/0x134) from [<c02869c4>] (__irq_svc+0x24/0x100) r8 = C1CAC400 r7 = 00000005 r6 = 00000002 r5 = F1020000 r4 = FFFFFFFF [<c0287efc>] (setup_irq+0x0/0x124) from [<c02880d0>] (request_irq+0xb0/0xd0) r7 = C041B2AC r6 = C0397E4C r5 = 00000000 r4 = 00000005 [<c0288020>] (request_irq+0x0/0xd0) from [<c03985f4>] (mv64xxx_i2c_probe+0x148/0x244) [<c03984ac>] (mv64xxx_i2c_probe+0x0/0x244) from [<c038bedc>] (platform_drv_probe+0x20/0x24) The oops is caused by a spurious interrupt that occurs when request_irq is called. mv64xxx_i2c_fsm() tries to read drv_data->msg, which is NULL. I noticed that hardware init is done after requesting irq. Thus any pending irq from previous hardware usage may cause this. The following patch fixes it: Signed-off-by: Maxime Bizon <mbizon@freebox.fr> Acked-by: Mark A. Greer <mgreer@mvista.com> Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-01-10	[PATCH] connector: some fixes for ia64 unaligned access errors	Erik Jacobson
	On ia64, the various functions that make up cn_proc.c cause kernel unaligned access errors. If you are using these, for example, to get notification about all tasks forking and exiting, you get multiple unaligned access errors per process. Use put_unaligned() in the appropriate places to fix this. Signed-off-by: Erik Jacobson <erikj@sgi.com> Cc: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Cc: Tony Luck <tony.luck@intel.com> Cc: <stable@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-01-10	[PATCH] asix: Fix typo for AX88772 PHY Selection	David Hollis
	The attached patch fixes a PHY selection problem that prevents AX88772 based devices (Linksys USB200Mv2, etc) from working. The interface comes up and everything seems fine except the device doesn't send/receive any packets. The one-liner attached fixes this issue and makes the devices usable again. Signed-off-by: David Hollis <dhollis@davehollis.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-01-10	[PATCH] V4L: cx88: Fix leadtek_eeprom tagging	Jean Delvare
	reference to .init.text: from .text between 'cx88_card_setup' (at offset 0x68c) and 'cx88_risc_field' Caused by leadtek_eeprom() being declared __devinit and called from a non-devinit context. Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-01-10	[PATCH] dvb-core: fix bug in CRC-32 checking on 64-bit systems	Ang Way Chuang
	CRC-32 checking during ULE decapsulation always failed on x86_64 systems due to the size of a variable used to store CRC. This bug was discovered on Fedora Core 6 with kernel-2.6.18-1.2849. The i386 counterpart has no such problem. This patch has been tested on 64-bit system as well as 32-bit system. Signed-off-by: Ang Way Chuang <wcang@nrg.cs.usm.my> Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-01-10	[PATCH] zd1211rw: Call ieee80211_rx in tasklet	Ulrich Kunitz
	The driver called ieee80211_rx in hardware interrupt context. This has been against the intention of the ieee80211_rx function. It caused a bug in the crypto routines used by WPA. This patch calls ieee80211_rx in a tasklet. Signed-off-by: Ulrich Kunitz <kune@deine-taler.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-01-10	[PATCH] cciss: fix XFER_READ/XFER_WRITE in do_cciss_request	Mike Miller
	This patch fixes a stupid bug. Sometime during the 2tb enhancement I ended up replacing the macros XFER_READ and XFER_WRITE with h->cciss_read and h->cciss_write respectively. It seemed to work somehow at least on x86_64 and ia64. I don't know how. But people started complaining about command timeouts on older controllers like the 64xx series and only on ia32. This resolves the issue reproduced in our lab. Please consider this for inclusion. Signed-off-by: Mike Miller <mike.miller@hp.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-01-10	[PATCH] fix aoe without scatter-gather [Bug 7662]	Ed L Cashin
	Fix a bug that only appears when AoE goes over a network card that does not support scatter-gather. The headers in the linear part of the skb appeared to be larger than they really were, resulting in data that was offset by 24 bytes. This patch eliminates the offset data on cards that don't support scatter-gather or have had scatter-gather turned off. There remains an unrelated issue that I'll address in a separate email. Fixes bugzilla #7662 Signed-off-by: "Ed L. Cashin" <ecashin@coraid.com> Cc: <stable@kernel.org> Cc: Greg KH <greg@kroah.com> Cc: <boddingt@optusnet.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-01-10	[PATCH] smc911x: fix netpoll compilation failure	Vitaly Wool
	Fix the compilation failure for smc911x.c when NET_POLL_CONTROLLER is set. Signed-off-by: Vitaly Wool <vitalywool@gmail.com> Cc: Jeff Garzik <jeff@garzik.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-01-10	[PATCH] sched: fix bad missed wakeups in the i386, x86_64, ia64, ACPI and APM idle code	Ingo Molnar
	Fernando Lopez-Lezcano reported frequent scheduling latencies and audio xruns starting at the 2.6.18-rt kernel, and those problems persisted all until current -rt kernels. The latencies were serious and unjustified by system load, often in the milliseconds range. After a patient and heroic multi-month effort of Fernando, where he tested dozens of kernels, tried various configs, boot options, test-patches of mine and provided latency traces of those incidents, the following 'smoking gun' trace was captured by him: _------=> CPU# / _-----=> irqs-off | / _----=> need-resched || / _---=> hardirq/softirq ||| / _--=> preempt-depth |||| / ||||| delay cmd pid ||||| time | caller \ / ||||| \ | / IRQ_19-1479 1D..1 0us : __trace_start_sched_wakeup (try_to_wake_up) IRQ_19-1479 1D..1 0us : __trace_start_sched_wakeup <<...>-5856> (37 0) IRQ_19-1479 1D..1 0us : __trace_start_sched_wakeup (c01262ba 0 0) IRQ_19-1479 1D..1 0us : resched_task (try_to_wake_up) IRQ_19-1479 1D..1 0us : __spin_unlock_irqrestore (try_to_wake_up) ... <idle>-0 1...1 11us!: default_idle (cpu_idle) ... <idle>-0 0Dn.1 602us : smp_apic_timer_interrupt (c0103baf 1 0) ... <...>-5856 0D..2 618us : __switch_to (__schedule) <...>-5856 0D..2 618us : __schedule <<idle>-0> (20 162) <...>-5856 0D..2 619us : __spin_unlock_irq (__schedule) <...>-5856 0...1 619us : trace_stop_sched_switched (__schedule) <...>-5856 0D..1 619us : trace_stop_sched_switched <<...>-5856> (37 0) what is visible in this trace is that CPU#1 ran try_to_wake_up() for PID:5856, it placed PID:5856 on CPU#0's runqueue and ran resched_task() for CPU#0. But it decided to not send an IPI to that CPU - due to TS_POLLING. But CPU#0 never woke up after its NEED_RESCHED bit was set, and only rescheduled to PID:5856 upon the next lapic timer IRQ. The result was a 600+ usecs latency and a missed wakeup! 
the bug turned out to be an idle-wakeup bug introduced into the mainline kernel this summer via an optimization in the x86_64 tree: commit 495ab9c045e1b0e5c82951b762257fe1c9d81564 Author: Andi Kleen <ak@suse.de> Date: Mon Jun 26 13:59:11 2006 +0200 [PATCH] i386/x86-64/ia64: Move polling flag into thread_info_status During some profiling I noticed that default_idle causes a lot of memory traffic. I think that is caused by the atomic operations to clear/set the polling flag in thread_info. There is actually no reason to make this atomic - only the idle thread does it to itself, other CPUs only read it. So I moved it into ti->status. the problem is this type of change: if (!hlt_counter && boot_cpu_data.hlt_works_ok) { - clear_thread_flag(TIF_POLLING_NRFLAG); + current_thread_info()->status &= ~TS_POLLING; smp_mb__after_clear_bit(); while (!need_resched()) { local_irq_disable(); this changes clear_thread_flag() to an explicit clearing of TS_POLLING. clear_thread_flag() is defined as: clear_bit(flag, &ti->flags); and clear_bit() is a LOCK-ed atomic instruction on all x86 platforms: static inline void clear_bit(int nr, volatile unsigned long * addr) { __asm__ __volatile__( LOCK_PREFIX "btrl %1,%0" hence smp_mb__after_clear_bit() is defined as a simple compile barrier: #define smp_mb__after_clear_bit() barrier() but the explicit TS_POLLING clearing introduced by the patch: + current_thread_info()->status &= ~TS_POLLING; is not an atomic op! So the clearing of the TS_POLLING bit is freely reorderable with the reading of the NEED_RESCHED bit - and both now reside in different memory addresses. CPU idle wakeup very much depends on ordered memory ops, the clearing of the TS_POLLING flag must always be done before we test need_resched() and hit the idle instruction(s). [Symmetrically, the wakeup code needs to set NEED_RESCHED before it tests the TS_POLLING flag, so memory ordering is paramount.] 
Fernando's dual-core Athlon64 system has a sufficiently advanced memory ordering model so that it triggered this scenario very often. ( And it also turned out that the reason why these latencies never triggered on my testsystems is that i routinely use idle=poll, which was the only idle variant not affected by this bug. ) The fix is to change the smp_mb__after_clear_bit() to an smp_mb(), to act as an absolute barrier between the TS_POLLING write and the NEED_RESCHED read. This affects almost all idling methods (default, ACPI, APM), on all 3 x86 architectures: i386, x86_64, ia64. Signed-off-by: Ingo Molnar <mingo@elte.hu> Tested-by: Fernando Lopez-Lezcano <nando@ccrma.Stanford.EDU> [chrisw: backport to 2.6.19.1] Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-01-10	[PATCH] i2c: fix broken ds1337 initialization	Dirk Eibach
	On a custom board with ds1337 RTC I found that upgrade from 2.6.15 to 2.6.18 broke RTC support. The main problem is the changes to ds1337_init_client(). When a ds1337 recognizes a problem (e.g. power or clock failure), bit 7 in the status register is set. This has to be reset by writing 0 to the status register. But since only 16 bytes are written to the chip and the first byte is interpreted as an address, the status register (which is the 16th) is never written. The other problem is that initializing all registers to zero is not valid for the day, date and month registers. Funnily enough, this is checked by ds1337_detect(), which depends on these values not being zero. So once treated by ds1337_init_client(), the ds1337 is not detected anymore, whereas the failure bit in the status register is still set. Broken by commit f9e8957937ebf60d22732a5ca9130f48a7603f60 (2.6.16-rc1, 2006-01-06). This fix is in Linus' tree since 2.6.20-rc1 (commit 763d9c046a2e511ec090a8986d3f85edf7448e7e). Signed-off-by: Dirk Stieler <stieler@gdsys.de> Signed-off-by: Dirk Eibach <eibach@gdsys.de> Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
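The first problem in this commit, a 16-byte write that never reaches the 16th register, can be modelled in a few lines. The sketch below is Python and purely illustrative: `block_write` is a hypothetical stand-in for an I2C block write whose first byte is the start address, and the register file is assumed to be 16 registers with the status register last, as the message states.

```python
STATUS_REG = 0x0F      # 16th register (registers 0x00 .. 0x0F)
FAILURE_BIT = 0x80     # bit 7 of the status register


def block_write(registers, buf):
    """Model a block write: buf[0] is the start address, the rest is data."""
    start = buf[0]
    for offset, value in enumerate(buf[1:]):
        registers[start + offset] = value


if __name__ == "__main__":
    regs = [0x00] * 16
    regs[STATUS_REG] = FAILURE_BIT

    # 16 zero bytes: one address byte plus 15 data bytes, so only
    # registers 0x00 .. 0x0E are written and the status register is missed.
    block_write(regs, bytes(16))
    print(hex(regs[STATUS_REG]))

    # Addressing the status register directly does clear the bit.
    block_write(regs, bytes([STATUS_REG, 0x00]))
    print(hex(regs[STATUS_REG]))
```

The sketch covers only the addressing mistake. As the message notes, zeroing every register is not valid for the day, date and month registers in any case.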
2007-01-10	[PATCH] SCSI: add missing cdb clearing in scsi_execute()	Tejun Heo
	Clear-garbage-after-CDB patch missed scsi_execute() and it causes some ODDs (HL-DT-ST DVD-RAM GSA-H30N) to choke during SCSI scan. Note that this patch is only for -stable. There is another more reliable fix for this problem proposed for devel tree. http://thread.gmane.org/gmane.linux.ide/14605/focus=14605 Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Douglas Gilbert <dougg@torque.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-01-10	[PATCH] IB/srp: Fix FMR mapping for 32-bit kernels and addresses above 4G	Roland Dreier
	struct srp_device.fmr_page_mask was unsigned long, which means that the top part of addresses above 4G was being chopped off on 32-bit architectures. Of course nothing good happens when data from SRP targets is DMAed to the wrong place. Fix this by changing fmr_page_mask to u64, to match the addresses actually used by IB devices. Thanks to Brian Cain <Brian.Cain@ge.com> and David McMillen <davem@systemfabricworks.com> for help diagnosing the bug and testing the fix. Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
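The truncation described in this commit can be reproduced with plain integer arithmetic. The sketch below is an illustrative Python model, not the driver code: `page_mask` and `page_base` are hypothetical helpers, the 4096-byte page size is an assumption, and the 32-bit and 64-bit widths stand in for `unsigned long` on a 32-bit kernel and for `u64`.

```python
PAGE_SIZE = 4096                 # assumed page size for the example
U32_MAX = 0xFFFFFFFF             # width of unsigned long on a 32-bit kernel
U64_MAX = 0xFFFFFFFFFFFFFFFF     # width of u64


def page_mask(type_max):
    """Build a page mask as it would be stored in a type of the given width."""
    return ~(PAGE_SIZE - 1) & type_max


def page_base(addr, mask):
    """Apply the mask to a 64-bit address, as the FMR mapping code would."""
    return addr & mask


if __name__ == "__main__":
    addr = 0x123456789           # an address above 4G
    print(hex(page_base(addr, page_mask(U32_MAX))))
    print(hex(page_base(addr, page_mask(U64_MAX))))
```

With the 32-bit mask the bits above 4G are lost along with the page offset, so the result points at a different page. With the 64-bit mask only the page offset is cleared.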
2007-01-10	[PATCH] ieee1394: ohci1394: add PPC_PMAC platform code to driver probe	Stefan Richter
	Fixes http://bugzilla.kernel.org/show_bug.cgi?id=7431 iBook G3 threw a machine check exception and put the display backlight to full brightness after ohci1394 was unloaded and reloaded. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> [dsd@gentoo.org: also added missing if condition, commit 63cca59e89892497e95e1e9c7156d3345fb7e2e8] Signed-off-by: Daniel Drake <dsd@gentoo.org> Acked-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-01-10	[PATCH] libata: handle 0xff status properly	Tejun Heo
	libata waits for !BSY even when the status register reports 0xff. This causes long boot delays when D8 isn't pulled down properly. This patch does the following: * don't wait if status register is 0xff in all wait functions * make ata_busy_sleep() return 0 on success and -errno on failure. -ENODEV is returned on 0xff status and -EBUSY on other failures. * make ata_bus_softreset() succeed on 0xff status. 0xff status is not reset failure. It indicates no device. This removes unnecessary retries on such ports. Note that the code change assumes unoccupied port reporting 0xff status does not produce valid device signature. Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Joe Jin <lkmaillist@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
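The return convention this commit gives ata_busy_sleep() can be sketched as a small decision function. The code below is an illustrative Python model, not libata code: `busy_sleep_result` is a hypothetical name, and the list of polled status values replaces the real timed polling loop. The BSY bit is bit 7 (0x80) of the ATA status register.

```python
import errno

ATA_BUSY = 0x80  # BSY bit of the ATA status register


def busy_sleep_result(polled_statuses):
    """Model the result of waiting for !BSY over a series of status reads.

    Returns 0 once BSY clears, -ENODEV as soon as the register reads
    0xff (no device), and -EBUSY if BSY never clears before the reads
    run out (standing in for a timeout).
    """
    for status in polled_statuses:
        # 0xff also has BSY set, so it must be checked first.
        if status == 0xFF:
            return -errno.ENODEV
        if not status & ATA_BUSY:
            return 0
    return -errno.EBUSY


if __name__ == "__main__":
    print(busy_sleep_result([0x80, 0x80, 0x50]))  # device becomes ready
    print(busy_sleep_result([0xFF]))              # empty port
    print(busy_sleep_result([0x80, 0x80]))        # still busy at timeout
```

Checking for 0xff before testing BSY is the point of the change: an empty port is reported right away instead of being polled until the timeout.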
2007-01-10	[PATCH] Revert "[PATCH] zd1211rw: Removed unneeded packed attributes"	John W. Linville
	This reverts commit 4e1bbd846d00a245dcf78b6b331d8a9afed8e6d7. Quoth Daniel Drake <dsd@gentoo.org>: "A user reported that commit 4e1bbd846d00a245dcf78b6b331d8a9afed8e6d7 (Remove unneeded packed attributes) breaks the zd1211rw driver on ARM." Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-01-10	[PATCH] V4L: Fix broken TUNER_LG_NTSC_TAPE radio support	Hans Verkuil
	The TUNER_LG_NTSC_TAPE is identical in all respects to the TUNER_PHILIPS_FM1236_MK3. So use the params struct for the Philips tuner. Also add this LG_NTSC_TAPE tuner to the switches where radio specific parameters are set so it behaves like a TUNER_PHILIPS_FM1236_MK3. This change fixes the radio support for this tuner (the wrong bandswitch byte was used). Thanks to Andy Walls <cwalls@radix.net> for finding this bug. Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org> Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-01-10	[PATCH] DVB: lgdt330x: fix signal / lock status detection bug	Michael Krufky
	In some cases when using VSB, the AGC status register has been known to falsely report "no signal" when in fact there is a carrier lock. The datasheet labels these status flags as QAM only, yet the lgdt330x module is using these flags for both QAM and VSB. This patch allows for the carrier recovery lock status register to be tested, even if the agc signal status register falsely reports no signal. Thanks to jcrews from #linuxtv in irc, for initially reporting this bug. Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-01-10	[PATCH] bonding: incorrect bonding state reported via ioctl	Andy Gospodarek
	This is a small fix-up to finish out the work done by Jay Vosburgh to add carrier-state support for bonding devices. The output in /proc/net/bonding/bondX was correct, but when collecting the same info via an ioctl it could still be incorrect. Signed-off-by: Andy Gospodarek <andy@greyhouse.net> Cc: Jeff Garzik <jeff@garzik.org> Cc: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-01-10	[PATCH] read_zero_pagealigned() locking fix	Hugh Dickins
	Ramiro Voicu hits the BUG_ON(!pte_none(*pte)) in zeromap_pte_range: kernel bugzilla 7645. Right: read_zero_pagealigned uses down_read of mmap_sem, but another thread's racing read of /dev/zero, or a normal fault, can easily set that pte again, in between zap_page_range and zeromap_page_range getting there. It's been wrong ever since 2.4.3. The simple fix is to use down_write instead, but that would serialize reads of /dev/zero more than at present: perhaps some app would be badly affected. So instead let zeromap_page_range return the error instead of BUG_ON, and read_zero_pagealigned break to the slower clear_user loop in that case - there's no need to optimize for it. Use -EEXIST for when a pte is found: BUG_ON in mmap_zero (the other user of zeromap_page_range), though it really isn't interesting there. And since mmap_zero wants -EAGAIN for out-of-memory, the zeromaps better return that than -ENOMEM. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: Ramiro Voicu: <Ramiro.Voicu@cern.ch> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-01-10	[PATCH] dm-crypt: Select CRYPTO_CBC	Herbert Xu
	As CBC is the default chaining method for cryptoloop, we should select it from cryptoloop to ease the transition. Spotted by Rene Herman. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2006-12-11	[PATCH] forcedeth: Disable INTx when enabling MSI in forcedeth	Daniel Barkalow
	At least some nforce cards continue to send legacy interrupts when MSI is enabled, and these interrupts are treated as unhandled by the kernel. This patch disables legacy interrupts explicitly when enabling MSI mode. The correct fix is to change the MSI infrastructure to disable legacy interrupts when enabling MSI, but this is potentially risky if the device isn't PCI-2.3 or is quirky, so the correct fix is going into mainline, while patches like this one go into -stable. Legend has it that it is most correct to disable legacy interrupts before enabling MSI, but the mainline patch does it in the other order, and this patch is "obviously" the same as mainline. Signed-off-by: Daniel Barkalow <barkalow@iabervon.org> Cc: Jeff Garzik <jeff@garzik.org> Cc: Greg KH <gregkh@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2006-12-11	[PATCH] drm-sis linkage fix	Andrew Morton
	Fix http://bugzilla.kernel.org/show_bug.cgi?id=7606 WARNING: "drm_sman_set_manager" [drivers/char/drm/sis.ko] undefined! Cc: <daniel-silveira@gee.inatel.br> Cc: Dave Airlie <airlied@linux.ie> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2006-12-11	[PATCH] USB: Fix oops in PhidgetServo	Sean Young
	The PhidgetServo causes an Oops when any of its sysfs attributes are read or written to, making the driver useless. Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2006-12-11	[PATCH] TOKENRING: Remote memory corruptor in ibmtr.c	David Miller
	ip_summed changes last summer had missed that one. As the result, we have ip_summed interpreted as CHECKSUM_PARTIAL now. IOW, ->csum is interpreted as offset of checksum in the packet. net/core/* will both read and modify the value as that offset, with obvious reasons. At the very least it's a remote memory corruptor. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2006-12-11	[PATCH] IB/ucm: Fix deadlock in cleanup	Michael S Tsirkin
	ib_ucm_cleanup_events() holds file_mutex while calling ib_destroy_cm_id(). This can deadlock since ib_destroy_cm_id() flushes event handlers, and ib_ucm_event_handler() needs file_mutex, too. Therefore, drop the file_mutex during the call to ib_destroy_cm_id(). Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
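	The lock ordering problem described above can be modelled without threads. This is a rough sketch under stated assumptions, not the IB/ucm code: toy_mutex is a hypothetical non-blocking lock, and a failed toy_trylock() stands for "a real handler would block here forever". The function names only loosely mirror those in the commit message.

```c
#include <stdbool.h>

/* Hypothetical single-threaded lock model: trylock fails if already held. */
struct toy_mutex { bool held; };

static bool toy_trylock(struct toy_mutex *m)
{
	if (m->held)
		return false;
	m->held = true;
	return true;
}

static void toy_unlock(struct toy_mutex *m)
{
	m->held = false;
}

static struct toy_mutex file_mutex;

/* The event handler needs file_mutex; false means it would have blocked. */
static bool event_handler(void)
{
	if (!toy_trylock(&file_mutex))
		return false;
	toy_unlock(&file_mutex);
	return true;
}

/* Destroying the id flushes handlers, so it finishes only if they can run. */
static bool destroy_id(void)
{
	return event_handler();
}

/* Buggy shape: the lock is held across the flushing call. */
static bool cleanup_holding_lock(void)
{
	bool ok;

	toy_trylock(&file_mutex);
	ok = destroy_id();
	toy_unlock(&file_mutex);
	return ok;
}

/* Fixed shape: drop the lock for the duration of the flushing call. */
static bool cleanup_dropping_lock(void)
{
	bool ok;

	toy_trylock(&file_mutex);
	toy_unlock(&file_mutex);	/* released before flushing */
	ok = destroy_id();
	toy_trylock(&file_mutex);	/* retaken for the rest of cleanup */
	toy_unlock(&file_mutex);
	return ok;
}
```

	In this model cleanup_holding_lock() reports that the handler could not run, while cleanup_dropping_lock() lets it complete.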
2006-12-11	[PATCH] SUNHME: Fix for sunhme failures on x86	Jurij Smakov
	The following patch fixes the failure of sunhme drivers on x86 hosts due to missing pci_enable_device() and pci_set_master() calls, lost during code refactoring. It has been filed as bugzilla bug #7502 [0] and Debian bug #397460 [1]. [0] http://bugzilla.kernel.org/show_bug.cgi?id=7502 [1] http://bugs.debian.org/397460 Signed-off-by: Jurij Smakov <jurij@wooyd.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2006-12-11	[PATCH] cryptoloop: Select CRYPTO_CBC	Herbert Xu
	As CBC is the default chaining method for cryptoloop, we should select it from cryptoloop to ease the transition. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2006-11-29	[PATCH] r8169: Fix iteration variable sign	Francois Romieu
	This changes the type of variable "i" in rtl8169_init_one() from "unsigned int" to "int". "i" is checked for < 0 later, which can never happen for "unsigned". This results in broken error handling. Signed-off-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
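	The effect of the wrong type is easy to reproduce outside the driver. This is a minimal user-space sketch, not the r8169 code: lookup() and the two wrappers are hypothetical, and it assumes failure is signalled by a negative index.

```c
#include <stdbool.h>

/* Hypothetical lookup: returns the index of key, or -1 if it is absent. */
static int lookup(const int *table, int len, int key)
{
	for (int i = 0; i < len; i++)
		if (table[i] == key)
			return i;
	return -1;
}

/* Buggy: -1 converts to a large unsigned value, so "< 0" is never true. */
static bool lookup_failed_unsigned(const int *table, int len, int key)
{
	unsigned int i = lookup(table, len, key);

	return i < 0;
}

/* Fixed: a signed variable keeps the negative error value. */
static bool lookup_failed_signed(const int *table, int len, int key)
{
	int i = lookup(table, len, key);

	return i < 0;
}
```

	With the unsigned variant the error branch is dead code, which is the "broken error handling" the commit refers to.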
2006-11-28	Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev	Linus Torvalds
	* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev: [PATCH] libata: Fixup ata_sas_queuecmd to handle __ata_scsi_queuecmd failure [PATCH] ahci: AHCI mode SATA patch for Intel ICH9 [PATCH] libata: don't schedule EH on wcache on/off if old EH
2006-11-28	[PATCH] Fix Intel/Sharp command set erase suspend bug	Joakim Tjernlund
	When we sleep and wait for a suspended operation to be resumed, go back and check until it's ready -- don't just continue after the first time we're woken. This can cause file system corruption. Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se> Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
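	The difference between continuing after the first wakeup and rechecking the condition can be shown with a small simulation. This is a sketch only, not the flash driver code: struct chip, its states and sleep_until_woken() are hypothetical, and each call to sleep_until_woken() stands for one wakeup that may or may not coincide with the chip becoming ready.

```c
enum chip_state { CHIP_READY, CHIP_ERASE_SUSPENDED };

struct chip {
	enum chip_state state;
	int wakeups_until_ready;	/* simulation only: wakeups before resume */
};

/* Stand-in for sleeping on a wait queue; each call models one wakeup. */
static void sleep_until_woken(struct chip *c)
{
	if (c->wakeups_until_ready > 0 && --c->wakeups_until_ready == 0)
		c->state = CHIP_READY;
}

/* Buggy: carries on after the first wakeup. Returns 1 if the chip is ready. */
static int wait_once(struct chip *c)
{
	if (c->state != CHIP_READY)
		sleep_until_woken(c);
	return c->state == CHIP_READY;
}

/*
 * Fixed: goes back and rechecks the state after every wakeup.
 * The simulation assumes wakeups_until_ready > 0 when not ready.
 */
static int wait_until_ready(struct chip *c)
{
	while (c->state != CHIP_READY)
		sleep_until_woken(c);
	return c->state == CHIP_READY;
}
```

	With three wakeups needed, wait_once() returns while the chip is still suspended, whereas wait_until_ready() returns only once the state has changed.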
2006-11-28	[PATCH] libata: Fixup ata_sas_queuecmd to handle __ata_scsi_queuecmd failure	Brian King
	Fixes ata_sas_queuecmd to properly handle a failure from __ata_scsi_queuecmd. Signed-off-by: Brian King <brking@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>