summaryrefslogtreecommitdiff
path: root/init
AgeCommit message (Collapse)Author
2007-02-05[PATCH] start_kernel: test if irq's got enabled early, barf, and disable ↵Ard van Breemen
them again The calls made by parse_parms to other initialization code might enable interrupts again way too early. Having interrupts on this early can make systems PANIC when they initialize the IRQ controllers (which happens later in the code). This patch detects that irq's are enabled again, barfs about it and disables them again as a safety net. [akpm@osdl.org: cleanups] Signed-off-by: Ard van Breemen <ard@telegraafnet.nl> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2006-11-08[PATCH] sysctl: Undeprecate sys_sysctlEric W. Biederman
The basic issue is that despite have been deprecated and warned about as a very bad thing in the man pages since its inception there are a few real users of sys_sysctl. It was my assumption that because sysctl had been deprecated for all of 2.6 there would be no user space users by this point, so I initially gave sys_sysctl a very short deprecation period. Now that I know there are a few real users the only sane way to proceed with deprecation is to push the time limit out to a year or two work and work with distributions that have big testing pools like fedora core to find these last remaining users. Which means that the sys_sysctl interface needs to be maintained in the meantime. Since I have provided a technical measure that allows us to add new sysctl entries without reserving more binary numbers I believe that is enough to fix the sys_sysctl binary interface maintenance problems, because there is no longer a need to change the binary interface at all. Since the sys_sysctl implementation needs to stay around for a while and the worst of the maintenance issues that caused us to occasionally break the ABI have been addressed I don't see any advantage in continuing with the removal of sys_sysctl. So instead of merely increasing the deprecation period this patch removes the deprecation of sys_sysctl and modifies the kernel to compile the code in by default. With committing to maintain sys_sysctl we get all of the advantages of a fast interface for anything that needs it. Currently sys_sysctl is about 5x faster than /proc/sys, for the same string data. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Alan Cox <alan@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-21[PATCH] x86-64: Speed up dwarf2 unwinderJan Beulich
This changes the dwarf2 unwinder to do a binary search for CIEs instead of a linear work. The linker is unfortunately not able to build a proper lookup table at link time, instead it creates one at runtime as soon as the bootmem allocator is usable (so you'll continue using the linear lookup for the first [hopefully] few calls). The code should be ready to utilize a build-time created table once a fixed linker becomes available. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de>
2006-10-20[PATCH] uml: use DEFCONFIG_LIST to avoid reading host's configPaolo 'Blaisorblade' Giarrusso
This should make sure that, for UML, host's configuration files are not considered, which avoids various pains to the user. Our dependency are such that the obtained Kconfig will be valid and will lead to successful compilation - however they cannot prevent an user from disabling any boot device, and if an option is not set in the read .config (say /boot/config-XXX), with make menuconfig ARCH=um, it is not set. This always disables UBD and all console I/O channels, which leads to non-working UML kernels, so this bothers users - especially now, since it will happen on almost every machine (/boot/config-`uname -r` exists almost on every machine). It can be workarounded with make defconfig ARCH=um, but it is non-obvious and can be avoided, so please _do_ merge this patch. Given the existence of options, it could be interesting to implement (additionally) "option required" - with it, Kconfig will refuse reading a .config file (from wherever it comes) if the given option is not set. With this, one could mark with it the option characteristic of the given architecture (it was an old proposal of Roman Zippel, when I pointed out our problem): config UML option required default y However this should be further discussed: *) for x86, it must support constructs like: ==arch/i386/Kconfig== config 64BIT option required default n where Kconfig must require that CONFIG_64BIT is disabled or not present in the read .config. *) do we want to do such checks only for the starting defconfig or also for .config? Which leads to: *) I may want to port a x86_64 .config to x86 and viceversa, or even among more different archs. Should that be allowed, and in which measure (the user may force skipping the check for a .config or it is only given a warning by default)? Cc: Roman Zippel <zippel@linux-m68k.org> Cc: <kbuild-devel@lists.sourceforge.net> Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Jeff Dike <jdike@addtoit.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04Remove all inclusions of <linux/config.h>Dave Jones
kbuild explicitly includes this at build time. Signed-off-by: Dave Jones <davej@redhat.com>
2006-10-03[PATCH] md: remove MAX_MD_DEVS which is an arbitrary limitNeilBrown
Once upon a time we needed to fixed limit to the number of md devices, probably because we preallocated some array. This need no longer exists, but we still have an arbitrary limit. So remove MAX_MD_DEVS and allow as many devices as we can fit into the 'minor' part of a device number. Also remove some useless noise at init time (which reports MAX_MD_DEVS) and remove MD_THREAD_NAME_MAX which hasn't been used for a while. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] replace cad_pid by a struct pidCedric Le Goater
There are a few places in the kernel where the init task is signaled. The ctrl+alt+del sequence is one them. It kills a task, usually init, using a cached pid (cad_pid). This patch replaces the pid_t by a struct pid to avoid pid wrap around problem. The struct pid is initialized at boot time in init() and can be modified through systctl with /proc/sys/kernel/cad_pid [ I haven't found any distro using it ? ] It also introduces a small helper routine kill_cad_pid() which is used where it seemed ok to use cad_pid instead of pid 1. [akpm@osdl.org: cleanups, build fix] Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] introduce kernel_execveArnd Bergmann
The use of execve() in the kernel is dubious, since it relies on the __KERNEL_SYSCALLS__ mechanism that stores the result in a global errno variable. As a first step of getting rid of this, change all users to a global kernel_execve function that returns a proper error code. This function is a terrible hack, and a later patch removes it again after the kernel syscalls are gone. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Andi Kleen <ak@muc.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ian Molton <spyro@f2s.com> Cc: Mikael Starvik <starvik@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Hirokazu Takata <takata.hirokazu@renesas.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp> Cc: Richard Curnow <rc@rc0.org.uk> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp> Cc: Chris Zankel <chris@zankel.net> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] IPC namespace coreKirill Korotaev
This patch set allows to unshare IPCs and have a private set of IPC objects (sem, shm, msg) inside namespace. Basically, it is another building block of containers functionality. This patch implements core IPC namespace changes: - ipc_namespace structure - new config option CONFIG_IPC_NS - adds CLONE_NEWIPC flag - unshare support [clg@fr.ibm.com: small fix for unshare of ipc namespace] [akpm@osdl.org: build fix] Signed-off-by: Pavel Emelianov <xemul@openvz.org> Signed-off-by: Kirill Korotaev <dev@openvz.org> Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] namespaces: utsname: implement utsname namespacesSerge E. Hallyn
This patch defines the uts namespace and some manipulators. Adds the uts namespace to task_struct, and initializes a system-wide init namespace. It leaves a #define for system_utsname so sysctl will compile. This define will be removed in a separate patch. [akpm@osdl.org: build fix, cleanup] Signed-off-by: Serge Hallyn <serue@us.ibm.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Andrey Savochkin <saw@sw.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01[PATCH] csa: Extended system accounting over taskstatsJay Lan
Add extended system accounting handling over taskstats interface. A CONFIG_TASK_XACCT flag is created to enable the extended accounting code. Signed-off-by: Jay Lan <jlan@sgi.com> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Jes Sorensen <jes@sgi.com> Cc: Chris Sturtivant <csturtiv@sgi.com> Cc: Tony Ernst <tee@sgi.com> Cc: Guillaume Thouvenin <guillaume.thouvenin@bull.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01[PATCH] fix EMBEDDED + SYSCTL menuRandy Dunlap
SYSCTL should still depend on EMBEDDED. This unbreaks the EMBEDDED menu (from the recent SYSCTL_SYCALL menu option patch). Fix typos in new SYSCTL_SYSCALL menu. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01[PATCH] allow /proc/config.gz to be built as a moduleRoss Biro
The driver for /proc/config.gz consumes rather a lot of memory and it is in fact possible to build it as a module. In some ways this is a bit risky, because the .config which is used for compiling kernel/configs.c isn't necessarily the same as the .config which was used to build vmlinux. But OTOH the potential memory savings are decent, and it'd be fairly dumb to build your configs.o with a different .config. Signed-off-by: Andrew Morton <akpm@google.com> Cc: "Randy.Dunlap" <rdunlap@xenotime.net> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-30[PATCH] BLOCK: Make it possible to disable the block layer [try #6]David Howells
Make it possible to disable the block layer. Not all embedded devices require it, some can make do with just JFFS2, NFS, ramfs, etc - none of which require the block layer to be present. This patch does the following: (*) Introduces CONFIG_BLOCK to disable the block layer, buffering and blockdev support. (*) Adds dependencies on CONFIG_BLOCK to any configuration item that controls an item that uses the block layer. This includes: (*) Block I/O tracing. (*) Disk partition code. (*) All filesystems that are block based, eg: Ext3, ReiserFS, ISOFS. (*) The SCSI layer. As far as I can tell, even SCSI chardevs use the block layer to do scheduling. Some drivers that use SCSI facilities - such as USB storage - end up disabled indirectly from this. (*) Various block-based device drivers, such as IDE and the old CDROM drivers. (*) MTD blockdev handling and FTL. (*) JFFS - which uses set_bdev_super(), something it could avoid doing by taking a leaf out of JFFS2's book. (*) Makes most of the contents of linux/blkdev.h, linux/buffer_head.h and linux/elevator.h contingent on CONFIG_BLOCK being set. sector_div() is, however, still used in places, and so is still available. (*) Also made contingent are the contents of linux/mpage.h, linux/genhd.h and parts of linux/fs.h. (*) Makes a number of files in fs/ contingent on CONFIG_BLOCK. (*) Makes mm/bounce.c (bounce buffering) contingent on CONFIG_BLOCK. (*) set_page_dirty() doesn't call __set_page_dirty_buffers() if CONFIG_BLOCK is not enabled. (*) fs/no-block.c is created to hold out-of-line stubs and things that are required when CONFIG_BLOCK is not set: (*) Default blockdev file operations (to give error ENODEV on opening). (*) Makes some /proc changes: (*) /proc/devices does not list any blockdevs. (*) /proc/diskstats and /proc/partitions are contingent on CONFIG_BLOCK. (*) Makes some compat ioctl handling contingent on CONFIG_BLOCK. (*) If CONFIG_BLOCK is not defined, makes sys_quotactl() return -ENODEV if given command other than Q_SYNC or if a special device is specified. (*) In init/do_mounts.c, no reference is made to the blockdev routines if CONFIG_BLOCK is not defined. This does not prohibit NFS roots or JFFS2. (*) The bdflush, ioprio_set and ioprio_get syscalls can now be absent (return error ENOSYS by way of cond_syscall if so). (*) The seclvl_bd_claim() and seclvl_bd_release() security calls do nothing if CONFIG_BLOCK is not set, since they can't then happen. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-09-27[PATCH] sysctl: Allow /proc/sys without sys_sysctlEric W. Biederman
Since sys_sysctl is deprecated start allow it to be compiled out. This should catch any remaining user space code that cares, and paves the way for further sysctl cleanups. [akpm@osdl.org: If sys_sysctl() is not compiled-in, emit a warning] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] kdump: introduce "reset_devices" command line optionVivek Goyal
Resetting the devices during driver initialization can be a costly operation in terms of time (especially scsi devices). This option can be used by drivers to know that user forcibly wants the devices to be reset during initialization. This option can be useful while kernel is booting in unreliable environment. For ex. during kdump boot where devices are in unknown random state and BIOS execution has been skipped. Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6Linus Torvalds
* 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: (225 commits) [PATCH] Don't set calgary iommu as default y [PATCH] i386/x86-64: New Intel feature flags [PATCH] x86: Add a cumulative thermal throttle event counter. [PATCH] i386: Make the jiffies compares use the 64bit safe macros. [PATCH] x86: Refactor thermal throttle processing [PATCH] Add 64bit jiffies compares (for use with get_jiffies_64) [PATCH] Fix unwinder warning in traps.c [PATCH] x86: Allow disabling early pci scans with pci=noearly or disallowing conf1 [PATCH] x86: Move direct PCI scanning functions out of line [PATCH] i386/x86-64: Make all early PCI scans dependent on CONFIG_PCI [PATCH] Don't leak NT bit into next task [PATCH] i386/x86-64: Work around gcc bug with noreturn functions in unwinder [PATCH] Fix some broken white space in ia32_signal.c [PATCH] Initialize argument registers for 32bit signal handlers. [PATCH] Remove all traces of signal number conversion [PATCH] Don't synchronize time reading on single core AMD systems [PATCH] Remove outdated comment in x86-64 mmconfig code [PATCH] Use string instructions for Core2 copy/clear [PATCH] x86: - restore i8259A eoi status on resume [PATCH] i386: Split multi-line printk in oops output. ...
2006-09-26[PATCH] Move unwind_init earlierAndi Kleen
Needed for use of the unwinder in lockdep, because lockdep runs really early too. Cc: jbeulich@novell.com Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26[PATCH] Allow early_param and identical __setup to existRusty Russell
We currently assume that boot parameters which are handled by early_param() will not overlap boot parameters handled by __setup: if they do, behaviour is dependent on link order, usually meaning __setup will not get called. ACPI wants to use early_param("pci"), and pci uses __setup("pci="), so we modify the core to let them coexist: "pci=noacpi" will now get passed to both. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-25Driver Core: add ability for drivers to do a threaded probeGreg Kroah-Hartman
This adds the infrastructure for drivers to do a threaded probe, and waits at init time for all currently outstanding probes to complete. A new kernel thread will be created when the probe() function for the driver is called, if the multithread_probe bit is set in the driver saying it can support this kind of operation. I have tested this with USB and PCI, and it works, and shaves off a lot of time in the boot process, but there are issues with finding root boot disks, and some USB drivers assume that this can never happen, so it is currently not enabled for any bus type. Individual drivers can enable this right now if they wish, and bus authors can selectivly turn it on as well, once they determine that their subsystem will work properly with it. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-09-16[PATCH] Kconfig: move CONFIG_EMBEDDED options to submenuChuck Ebbert
Fix two problems with the CONFIG_EMBEDDED submenu: (1) The menu was split in two by the rt_mutex patch, which moved half the items into the "General setup" menu. (2) CONFIG_SYSCTL and CONFIG_UID16 were added to the main menu instead of the submenu. Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-14[PATCH] per-task-delay-accounting: delay accounting usage of taskstats interfaceShailabh Nagar
Usage of taskstats interface by delay accounting. Signed-off-by: Shailabh Nagar <nagar@us.ibm.com> Signed-off-by: Balbir Singh <balbir@in.ibm.com> Cc: Jes Sorensen <jes@sgi.com> Cc: Peter Chubb <peterc@gelato.unsw.edu.au> Cc: Erich Focht <efocht@ess.nec.de> Cc: Levent Serinol <lserinol@gmail.com> Cc: Jay Lan <jlan@engr.sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-14[PATCH] per-task-delay-accounting: taskstats interfaceShailabh Nagar
Create a "taskstats" interface based on generic netlink (NETLINK_GENERIC family), for getting statistics of tasks and thread groups during their lifetime and when they exit. The interface is intended for use by multiple accounting packages though it is being created in the context of delay accounting. This patch creates the interface without populating the fields of the data that is sent to the user in response to a command or upon the exit of a task. Each accounting package interested in using taskstats has to provide an additional patch to add its stats to the common structure. [akpm@osdl.org: cleanups, Kconfig fix] Signed-off-by: Shailabh Nagar <nagar@us.ibm.com> Signed-off-by: Balbir Singh <balbir@in.ibm.com> Cc: Jes Sorensen <jes@sgi.com> Cc: Peter Chubb <peterc@gelato.unsw.edu.au> Cc: Erich Focht <efocht@ess.nec.de> Cc: Levent Serinol <lserinol@gmail.com> Cc: Jay Lan <jlan@engr.sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-14[PATCH] per-task-delay-accounting: setupShailabh Nagar
Initialization code related to collection of per-task "delay" statistics which measure how long it had to wait for cpu, sync block io, swapping etc. The collection of statistics and the interface are in other patches. This patch sets up the data structures and allows the statistics collection to be disabled through a kernel boot parameter. Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com> Signed-off-by: Balbir Singh <balbir@in.ibm.com> Cc: Jes Sorensen <jes@sgi.com> Cc: Peter Chubb <peterc@gelato.unsw.edu.au> Cc: Erich Focht <efocht@ess.nec.de> Cc: Levent Serinol <lserinol@gmail.com> Cc: Jay Lan <jlan@engr.sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuildLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild: kbuild: introduce utsrelease.h kbuild: explicit turn off gcc stack-protector
2006-07-03[PATCH] lockdep: annotate genirqIngo Molnar
Teach special (recursive) locking code to the lock validator. Has no effect on non-lockdep kernels. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03[PATCH] lockdep: coreIngo Molnar
Do 'make oldconfig' and accept all the defaults for new config options - reboot into the kernel and if everything goes well it should boot up fine and you should have /proc/lockdep and /proc/lockdep_stats files. Typically if the lock validator finds some problem it will print out voluminous debug output that begins with "BUG: ..." and which syslog output can be used by kernel developers to figure out the precise locking scenario. What does the lock validator do? It "observes" and maps all locking rules as they occur dynamically (as triggered by the kernel's natural use of spinlocks, rwlocks, mutexes and rwsems). Whenever the lock validator subsystem detects a new locking scenario, it validates this new rule against the existing set of rules. If this new rule is consistent with the existing set of rules then the new rule is added transparently and the kernel continues as normal. If the new rule could create a deadlock scenario then this condition is printed out. When determining validity of locking, all possible "deadlock scenarios" are considered: assuming arbitrary number of CPUs, arbitrary irq context and task context constellations, running arbitrary combinations of all the existing locking scenarios. In a typical system this means millions of separate scenarios. This is why we call it a "locking correctness" validator - for all rules that are observed the lock validator proves it with mathematical certainty that a deadlock could not occur (assuming that the lock validator implementation itself is correct and its internal data structures are not corrupted by some other kernel subsystem). [see more details and conditionals of this statement in include/linux/lockdep.h and Documentation/lockdep-design.txt] Furthermore, this "all possible scenarios" property of the validator also enables the finding of complex, highly unlikely multi-CPU multi-context races via single single-context rules, increasing the likelyhood of finding bugs drastically. In practical terms: the lock validator already found a bug in the upstream kernel that could only occur on systems with 3 or more CPUs, and which needed 3 very unlikely code sequences to occur at once on the 3 CPUs. That bug was found and reported on a single-CPU system (!). So in essence a race will be found "piecemail-wise", triggering all the necessary components for the race, without having to reproduce the race scenario itself! In its short existence the lock validator found and reported many bugs before they actually caused a real deadlock. To further increase the efficiency of the validator, the mapping is not per "lock instance", but per "lock-class". For example, all struct inode objects in the kernel have inode->inotify_mutex. If there are 10,000 inodes cached, then there are 10,000 lock objects. But ->inotify_mutex is a single "lock type", and all locking activities that occur against ->inotify_mutex are "unified" into this single lock-class. The advantage of the lock-class approach is that all historical ->inotify_mutex uses are mapped into a single (and as narrow as possible) set of locking rules - regardless of how many different tasks or inode structures it took to build this set of rules. The set of rules persist during the lifetime of the kernel. To see the rough magnitude of checking that the lock validator does, here's a portion of /proc/lockdep_stats, fresh after bootup: lock-classes: 694 [max: 2048] direct dependencies: 1598 [max: 8192] indirect dependencies: 17896 all direct dependencies: 16206 dependency chains: 1910 [max: 8192] in-hardirq chains: 17 in-softirq chains: 105 in-process chains: 1065 stack-trace entries: 38761 [max: 131072] combined max dependencies: 2033928 hardirq-safe locks: 24 hardirq-unsafe locks: 176 softirq-safe locks: 53 softirq-unsafe locks: 137 irq-safe locks: 59 irq-unsafe locks: 176 The lock validator has observed 1598 actual single-thread locking patterns, and has validated all possible 2033928 distinct locking scenarios. More details about the design of the lock validator can be found in Documentation/lockdep-design.txt, which can also found at: http://redhat.com/~mingo/lockdep-patches/lockdep-design.txt [bunk@stusta.de: cleanups] Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03[PATCH] lockdep: better lock debuggingIngo Molnar
Generic lock debugging: - generalized lock debugging framework. For example, a bug in one lock subsystem turns off debugging in all lock subsystems. - got rid of the caller address passing (__IP__/__IP_DECL__/etc.) from the mutex/rtmutex debugging code: it caused way too much prototype hackery, and lockdep will give the same information anyway. - ability to do silent tests - check lock freeing in vfree too. - more finegrained debugging options, to allow distributions to turn off more expensive debugging features. There's no separate 'held mutexes' list anymore - but there's a 'held locks' stack within lockdep, which unifies deadlock detection across all lock classes. (this is independent of the lockdep validation stuff - lockdep first checks whether we are holding a lock already) Here are the current debugging options: CONFIG_DEBUG_MUTEXES=y CONFIG_DEBUG_LOCK_ALLOC=y which do: config DEBUG_MUTEXES bool "Mutex debugging, basic checks" config DEBUG_LOCK_ALLOC bool "Detect incorrect freeing of live mutexes" Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03[PATCH] lockdep: console_init after local_irq_enable()Heiko Carstens
s390's console_init must enable interrupts, but early_boot_irqs_on() gets called later. To avoid problems move console_init() after local_irq_enable(). Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03[PATCH] time initialisation fixjohn stultz
We're not reay to take a timer interrupt until timekeeping_init() has run. But time_init() will start the time interrupt and if it is called with local interrupts enabled we'll immediately take an interrupt and die. Fix that by running timekeeping_init() prior to time_init(). We don't know _why_ local interrupts got enabled on Jesse Brandeburg's machine. That's a separate as-yet-unsolved problem. THe patch adds a little bit of debugging to detect that. This whole requirement that local interrupts be held off during early boot keeps on biting us. Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Jesse Brandeburg <jesse.brandeburg@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03kbuild: introduce utsrelease.hSam Ravnborg
include/linux/version.h contained both actual KERNEL version and UTS_RELEASE that contains a subset from git SHA1 for when kernel was compiled as part of a git repository. This had the unfortunate side-effect that all files including version.h would be recompiled when some git changes was made due to changes SHA1. Split it out so we keep independent parts in separate files. Also update checkversion.pl script to no longer check for UTS_RELEASE. Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2006-06-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivialLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: Remove obsolete #include <linux/config.h> remove obsolete swsusp_encrypt arch/arm26/Kconfig typos Documentation/IPMI typos Kconfig: Typos in net/sched/Kconfig v9fs: do not include linux/version.h Documentation/DocBook/mtdnand.tmpl: typo fixes typo fixes: specfic -> specific typo fixes in Documentation/networking/pktgen.txt typo fixes: occuring -> occurring typo fixes: infomation -> information typo fixes: disadvantadge -> disadvantage typo fixes: aquire -> acquire typo fixes: mecanism -> mechanism typo fixes: bandwith -> bandwidth fix a typo in the RTC_CLASS help text smb is no longer maintained Manually merged trivial conflict in arch/um/kernel/vmlinux.lds.S
2006-06-30[PATCH] UML: fix the INIT_ENV_ARG_LIMIT dependenciesAdrian Bunk
Fix the INIT_ENV_ARG_LIMIT dependencies to what seems to have been intended. Spotted by Jean-Luc Leger. Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Jeff Dike <jdike@addtoit.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] add smp_setup_processor_id()Andrew Morton
Presently, smp_processor_id() isn't necessarily set up until setup_arch(). But it's used in boot_cpu_init() and printk() and perhaps in other places, prior to setup_arch() being called. So provide a new smp_setup_processor_id() which is called before anything else, wire it up for Voyager (which boots on a CPU other than #0, and broke). Cc: James Bottomley <James.Bottomley@steeleye.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] Light weight event countersChristoph Lameter
The remaining counters in page_state after the zoned VM counter patches have been applied are all just for show in /proc/vmstat. They have no essential function for the VM. We use a simple increment of per cpu variables. In order to avoid the most severe races we disable preempt. Preempt does not prevent the race between an increment and an interrupt handler incrementing the same statistics counter. However, that race is exceedingly rare, we may only loose one increment or so and there is no requirement (at least not in kernel) that the vm event counters have to be accurate. In the non preempt case this results in a simple increment for each counter. For many architectures this will be reduced by the compiler to a single instruction. This single instruction is atomic for i386 and x86_64. And therefore even the rare race condition in an interrupt is avoided for both architectures in most cases. The patchset also adds an off switch for embedded systems that allows a building of linux kernels without these counters. The implementation of these counters is through inline code that hopefully results in only a single instruction increment instruction being emitted (i386, x86_64) or in the increment being hidden though instruction concurrency (EPIC architectures such as ia64 can get that done). Benefits: - VM event counter operations usually reduce to a single inline instruction on i386 and x86_64. - No interrupt disable, only preempt disable for the preempt case. Preempt disable can also be avoided by moving the counter into a spinlock. - Handling is similar to zoned VM counters. - Simple and easily extendable. - Can be omitted to reduce memory use for embedded use. References: RFC http://marc.theaimsgroup.com/?l=linux-kernel&m=113512330605497&w=2 RFC http://marc.theaimsgroup.com/?l=linux-kernel&m=114988082814934&w=2 local_t http://marc.theaimsgroup.com/?l=linux-kernel&m=114991748606690&w=2 V2 http://marc.theaimsgroup.com/?t=115014808400007&r=1&w=2 V3 http://marc.theaimsgroup.com/?l=linux-kernel&m=115024767022346&w=2 V4 http://marc.theaimsgroup.com/?l=linux-kernel&m=115047968808926&w=2 Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30Remove obsolete #include <linux/config.h>Jörn Engel
Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/devfs-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/devfs-2.6: (22 commits) [PATCH] devfs: Remove it from the feature_removal.txt file [PATCH] devfs: Last little devfs cleanups throughout the kernel tree. [PATCH] devfs: Rename TTY_DRIVER_NO_DEVFS to TTY_DRIVER_DYNAMIC_DEV [PATCH] devfs: Remove the tty_driver devfs_name field as it's no longer needed [PATCH] devfs: Remove the line_driver devfs_name field as it's no longer needed [PATCH] devfs: Remove the videodevice devfs_name field as it's no longer needed [PATCH] devfs: Remove the gendisk devfs_name field as it's no longer needed [PATCH] devfs: Remove the miscdevice devfs_name field as it's no longer needed [PATCH] devfs: Remove the devfs_fs_kernel.h file from the tree [PATCH] devfs: Remove devfs_remove() function from the kernel tree [PATCH] devfs: Remove devfs_mk_cdev() function from the kernel tree [PATCH] devfs: Remove devfs_mk_bdev() function from the kernel tree [PATCH] devfs: Remove devfs_mk_symlink() function from the kernel tree [PATCH] devfs: Remove devfs_mk_dir() function from the kernel tree [PATCH] devfs: Remove devfs_*_tape() functions from the kernel tree [PATCH] devfs: Remove devfs support from the sound subsystem [PATCH] devfs: Remove devfs support from the ide subsystem. [PATCH] devfs: Remove devfs support from the serial subsystem [PATCH] devfs: Remove devfs from the init code [PATCH] devfs: Remove devfs from the partition code ...
2006-06-27[PATCH] pi-futex: rt mutex coreIngo Molnar
Core functions for the rt-mutex subsystem. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] fs/buffer.c: cleanupsAdrian Bunk
- add a proper prototype for the following global function: - buffer_init() - make the following needlessly global function static: - end_buffer_async_write() Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26[PATCH] devfs: Remove the devfs_fs_kernel.h file from the treeGreg Kroah-Hartman
Also fixes up all files that #include it. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-26[PATCH] devfs: Remove devfs from the init codeGreg Kroah-Hartman
This patch removes the devfs code from the init/ directory. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuildLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild: (40 commits) kbuild: trivial fixes in Makefile kbuild: adding symbols in Kconfig and defconfig to TAGS kbuild: replace abort() with exit(1) kbuild: support for %.symtypes files kbuild: fix silentoldconfig recursion kbuild: add option for stripping modules while installing them kbuild: kill some false positives from modpost kbuild: export-symbol usage report generator kbuild: fix make -rR breakage kbuild: append -dirty for updated but uncommited changes kbuild: append git revision for all untagged commits kbuild: fix module.symvers parsing in modpost kbuild: ignore make's built-in rules & variables kbuild: bugfix with initramfs kbuild: modpost build fix kbuild: check license compatibility when building modules kbuild: export-type enhancement to modpost.c kbuild: add dependency on kernel.release to the package targets kbuild: `make kernelrelease' speedup kconfig: KCONFIG_OVERWRITECONFIG ...
2006-06-26Merge branch 'x86-64'Linus Torvalds
* x86-64: (83 commits) [PATCH] x86_64: x86_64 stack usage debugging [PATCH] x86_64: (resend) x86_64 stack overflow debugging [PATCH] x86_64: msi_apic.c build fix [PATCH] x86_64: i386/x86-64 Add nmi watchdog support for new Intel CPUs [PATCH] x86_64: Avoid broadcasting NMI IPIs [PATCH] x86_64: fix apic error on bootup [PATCH] x86_64: enlarge window for stack growth [PATCH] x86_64: Minor string functions optimizations [PATCH] x86_64: Move export symbols to their C functions [PATCH] x86_64: Standardize i386/x86_64 handling of NMI_VECTOR [PATCH] x86_64: Fix modular pc speaker [PATCH] x86_64: remove sys32_ni_syscall() [PATCH] x86_64: Do not use -ffunction-sections for modules [PATCH] x86_64: Add cpu_relax to apic_wait_icr_idle [PATCH] x86_64: adjust kstack_depth_to_print default [PATCH] i386/x86-64: adjust /proc/interrupts column headings [PATCH] x86_64: Fix race in cpu_local_* on preemptible kernels [PATCH] x86_64: Fix fast check in safe_smp_processor_id [PATCH] x86_64: x86_64 setup.c - printing cmp related boottime information [PATCH] i386/x86-64/ia64: Move polling flag into thread_info_status ... Manual resolve of trivial conflict in arch/i386/kernel/Makefile
2006-06-26[PATCH] x86_64: Move VM86 config into arch/i386/KconfigAndi Kleen
Architecture specific configs like this have no business at all in init/Kconfig. This prevents it from being set on x86-64 Pointed out by H.Peter Anvin Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26[PATCH] x86_64: reliable stack trace supportJan Beulich
These are the generic bits needed to enable reliable stack traces based on Dwarf2-like (.eh_frame) unwind information. Subsequent patches will enable x86-64 and i386 to make use of this. Thanks to Andi Kleen and Ingo Molnar, who pointed out several possibilities for improvement. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26[PATCH] initramfs overwrite fixH. Peter Anvin
This patch ensures that initramfs overwrites work correctly, even when dealing with device nodes of different types. Furthermore, when replacing a file which already exists, we must make very certain that we truncate the existing file. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Michael Neuling <mikey@neuling.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26[PATCH] Time: Use clocksource infrastructure for update_wall_timejohn stultz
Modify the update_wall_time function so it increments time using the clocksource abstraction instead of jiffies. Since the only clocksource driver currently provided is the jiffies clocksource, this should result in no functional change. Additionally, a timekeeping_init and timekeeping_resume function has been added to initialize and maintain some of the new timekeping state. [hirofumi@mail.parknet.co.jp: fixlet] Signed-off-by: John Stultz <johnstul@us.ibm.com> Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25[PATCH] Make sysctl obligatory except under CONFIG_EMBEDDEDH. Peter Anvin
Make makes sysctl non-optional unless EMBEDDED is set. There are a number of interfaces exposed via sysctl, enough that it has to be considered core kernel functionality at this point. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-20Merge branch 'audit.b21' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current * 'audit.b21' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current: (25 commits) [PATCH] make set_loginuid obey audit_enabled [PATCH] log more info for directory entry change events [PATCH] fix AUDIT_FILTER_PREPEND handling [PATCH] validate rule fields' types [PATCH] audit: path-based rules [PATCH] Audit of POSIX Message Queue Syscalls v.2 [PATCH] fix se_sen audit filter [PATCH] deprecate AUDIT_POSSBILE [PATCH] inline more audit helpers [PATCH] proc_loginuid_write() uses simple_strtoul() on non-terminated array [PATCH] update of IPC audit record cleanup [PATCH] minor audit updates [PATCH] fix audit_krule_to_{rule,data} return values [PATCH] add filtering by ppid [PATCH] log ppid [PATCH] collect sid of those who send signals to auditd [PATCH] execve argument logging [PATCH] fix deadlocks in AUDIT_LIST/AUDIT_LIST_RULES [PATCH] audit_panic() is audit-internal [PATCH] inotify (5/5): update kernel documentation ... Manual fixup of conflict in unclude/linux/inotify.h
2006-06-20[PATCH] audit: path-based rulesAmy Griffis
In this implementation, audit registers inotify watches on the parent directories of paths specified in audit rules. When audit's inotify event handler is called, it updates any affected rules based on the filesystem event. If the parent directory is renamed, removed, or its filesystem is unmounted, audit removes all rules referencing that inotify watch. To keep things simple, this implementation limits location-based auditing to the directory entries in an existing directory. Given a path-based rule for /foo/bar/passwd, the following table applies: passwd modified -- audit event logged passwd replaced -- audit event logged, rules list updated bar renamed -- rule removed foo renamed -- untracked, meaning that the rule now applies to the new location Audit users typically want to have many rules referencing filesystem objects, which can significantly impact filtering performance. This patch also adds an inode-number-based rule hash to mitigate this situation. The patch is relative to the audit git tree: http://kernel.org/git/?p=linux/kernel/git/viro/audit-current.git;a=summary and uses the inotify kernel API: http://lkml.org/lkml/2006/6/1/145 Signed-off-by: Amy Griffis <amy.griffis@hp.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>