summaryrefslogtreecommitdiff
path: root/arch/x86/mm/init_32.c
AgeCommit message (Collapse)Author
2008-05-13x86: fix app crashes after SMP resumeHugh Dickins
After resume on a 2cpu laptop, kernel builds collapse with a sed hang, sh or make segfault (often on 20295564), real-time signal to cc1 etc. Several hurdles to jump, but a manually-assisted bisect led to -rc1's d2bcbad5f3ad38a1c09861bca7e252dde7bb8259 x86: do not zap_low_mappings in __smp_prepare_cpus. Though the low mappings were removed at bootup, they were left behind (with Global flags helping to keep them in TLB) after resume or cpu online, causing the crashes seen. Reinstate zap_low_mappings (with local __flush_tlb_all) for each cpu_up on x86_32. This used to be serialized by smp_commenced_mask: that's now gone, but a low_mappings flag will do. No need for native_smp_cpus_done to repeat the zap: let mem_init zap BSP's low mappings just like on UP. (In passing, fix error code from native_cpu_up: do_boot_cpu returns a variety of diagnostic values, Dprintk what it says but convert to -EIO. And save_pg_dir separately before zap_low_mappings: doesn't matter now, but zapping twice in succession wiped out resume's swsusp_pg_dir.) That worked well on the duo and one quad, but wouldn't boot 3rd or 4th cpu on P4 Xeon, oopsing just after unlock_ipi_call_lock. The TLB flush IPI now being sent reveals a long-standing bug: the booting cpu has its APIC readied in smp_callin at the top of start_secondary, but isn't put into the cpu_online_map until just before that unlock_ipi_call_lock. So native_smp_call_function_mask to online cpus would send_IPI_allbutself, including the cpu just coming up, though it has been excluded from the count to wait for: by the time it handles the IPI, the call data on native_smp_call_function_mask's stack may well have been overwritten. So fall back to send_IPI_mask while cpu_online_map does not match cpu_callout_map: perhaps there's a better APICological fix to be made at the start_secondary end, but I wouldn't know that. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-28hotplug-memory: make online_page() commonJeremy Fitzhardinge
All architectures use an effectively identical definition of online_page(), so just make it common code. x86-64, ia64, powerpc and sh are actually identical; x86-32 is slightly different. x86-32's differences arise because it puts its hotplug pages in the highmem zone. We can handle this in the generic code by inspecting the page to see if its in highmem, and update the totalhigh_pages count appropriately. This leaves init_32.c:free_new_highpage with a single caller, so I folded it into add_one_highpage_init. I also removed an incorrect comment referring to the NUMA case; any NUMA details have already been dealt with by the time online_page() is called. [akpm@linux-foundation.org: fix indenting] Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: Dave Hansen <dave@linux.vnet.ibm.com> Reviewed-by: KAMEZAWA Hiroyuki <kamez.hiroyu@jp.fujitsu.com> Tested-by: KAMEZAWA Hiroyuki <kamez.hiroyu@jp.fujitsu.com> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Christoph Lameter <clameter@sgi.com> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-26x86: remove NexGen supportDmitri Vorobiev
It is claimed that NexGen CPUs were never shipped: http://lkml.org/lkml/2008/4/20/179 Also, the kernel support for these chips has been broken for a long time, the code intended to support NexGen thereby being essentially dead. As an outcome of the discussion that can be found using the URL above, this patch removes the NexGen support altogether. The changes in this patch survived a defconfig build for i386, a couple of successful randconfig builds, as well as a runtime test, which consisted in booting a 32-bit x86 box up to the shell prompt. Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-25Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86-pat * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86-pat: generic: add ioremap_wc() interface wrapper /dev/mem: make promisc the default pat: cleanups x86: PAT use reserve free memtype in mmap of /dev/mem x86: PAT phys_mem_access_prot_allowed for dev/mem mmap x86: PAT avoid aliasing in /dev/mem read/write devmem: add range_is_allowed() check to mmap of /dev/mem x86: introduce /dev/mem restrictions with a config option
2008-04-24x86: unify KERNEL_PGD_PTRSJeremy Fitzhardinge
Make KERNEL_PGD_PTRS common, as previously it was only being defined for 32-bit. There are a couple of follow-on changes from this: - KERNEL_PGD_PTRS was being defined in terms of USER_PGD_PTRS. The definition of USER_PGD_PTRS doesn't really make much sense on x86-64, since it can have two different user address-space configurations. I renamed USER_PGD_PTRS to KERNEL_PGD_BOUNDARY, which is meaningful for all of 32/32, 32/64 and 64/64 process configurations. - USER_PTRS_PER_PGD was also defined and was being used for similar purposes. Converting its users to KERNEL_PGD_BOUNDARY left it completely unused, and so I removed it. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Andi Kleen <ak@suse.de> Cc: Zach Amsden <zach@vmware.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-24x86: rename paravirt_alloc_pt etc after the pagetable structureJeremy Fitzhardinge
Rename (alloc|release)_(pt|pd) to pte/pmd to explicitly match the name of the appropriate pagetable level structure. [ x86.git merge work by Mark McLoughlin <markmc@redhat.com> ] Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-24x86: introduce /dev/mem restrictions with a config optionArjan van de Ven
This patch introduces a restriction on /dev/mem: Only non-memory can be read or written unless the newly introduced config option is set. The X server needs access to /dev/mem for the PCI space, but it doesn't need access to memory; both the file permissions and SELinux permissions of /dev/mem just make X effectively super-super powerful. With the exception of the BIOS area, there's just no valid app that uses /dev/mem on actual memory. Other popular users of /dev/mem are rootkits and the like. (note: mmap access of memory via /dev/mem was already not allowed since a really long time) People who want to use /dev/mem for kernel debugging can enable the config option. The restrictions of this patch have been in the Fedora and RHEL kernels for at least 4 years without any problems. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-19x86: remove pointless commentsWANG Cong
Remove old comments that include the old arch/i386 directory. Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-17x86: don't use large pages to map the first 2/4MB of memoryAndi Kleen
Intel recommends to not use large pages for the first 1MB of the physical memory because there are fixed size MTRRs there which cause splitups in the TLBs. On AMD doing so is also a good idea. The implementation is a little different between 32bit and 64bit. On 32bit I just taught the initial page table set up about this because it was very simple to do. This also has the advantage that the risk of a prefetch ever seeing the page even if it only exists for a short time is minimized. On 64bit that is not quite possible, so use set_memory_4k() a little later (in check_bugs) instead. Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: andreas.herrmann3@amd.com Cc: mingo@elte.hu Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17x86: replace the now useless max_pfn_mapped defineThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17x86: implement true end_pfn_mapped for 32bitAndi Kleen
Even on 32bit 2MB pages can map more memory than is in the true max_low_pfn if end_pfn is not highmem and not aligned to 2MB. Add a end_pfn_map similar to x86-64 that accounts for this fact. This is important for code that really needs to know about all mapping aliases. Signed-off-by: Andi Kleen <ak@suse.de> Cc: andreas.herrmann3@amd.com Cc: mingo@elte.hu Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-17x86: enhance DEBUG_RODATA support for hotplug and kprobesMathieu Desnoyers
Standardize DEBUG_RODATA, removing special cases for hotplug and kprobes. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Andi Kleen <andi@firstfloor.org> Cc: pageexec@freemail.hu Cc: akpm@linux-foundation.org CC: Andi Kleen <andi@firstfloor.org> CC: pageexec@freemail.hu CC: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-14x86: include proper prototypes for rodata_testHarvey Harrison
extern should not appear in C files. Also, the definitions do not match the prototype currently, not sure what way you want to go with this, I've switched the prototype to return int, but I can see going to the void return as well. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-09x86: introduce page pool in cpaThomas Gleixner
DEBUG_PAGEALLOC was not possible on 64-bit due to its early-bootup hardcoded reliance on PSE pages, and the unrobustness of the runtime splitup of large pages. The splitup ended in recursive calls to alloc_pages() when a page for a pte split was requested. Avoid the recursion with a preallocated page pool, which is used to split up large mappings and gets refilled in the return path of kernel_map_pages after the split has been done. The size of the page pool is adjusted to the available memory. This part just implements the page pool and the initialization w/o using it yet. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-09x86: construct 32-bit boot time page tables in native format.Ian Campbell
Specifically the boot time page tables in a CONFIG_X86_PAE=y enabled kernel are in PAE format. early_ioremap is updated to use the standard page table accessors. Clear any mappings beyond max_low_pfn from the boot page tables in native_pagetable_setup_start because the initial mappings can extend beyond the range of physical memory and into the vmalloc area. Derived from patches by Eric Biederman and H. Peter Anvin. [ jeremy@goop.org: PAE swapper_pg_dir needs to be page-sized fix ] Signed-off-by: Ian Campbell <ijc@hellion.org.uk> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Mika Penttilä <mika.penttila@kolumbus.fi> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-04x86: use _ASM_EXTABLE macro in arch/x86/mm/init_32.cH. Peter Anvin
Use the _ASM_EXTABLE macro from <asm/asm.h>, instead of open-coding __ex_table entires in arch/x86/mm/init_32.c. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-01suspend: cleanup reference to swsusp_pg_dir[]Rafael J. Wysocki
swsusp_pg_dir[] is used for suspend, but not for hibernation. clean-up the ifdefs which worked by accident, while implying the opposite. Delete the __nosavedata, which also implied the opposite. Some day we may optimize CONFIG_ACPI_SLEEP to build minimal kernels for just hibernate or just suspend but not both, but today isn't that day. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com>
2008-01-30x86: don't special-case pmd allocations as muchJeremy Fitzhardinge
In x86 PAE mode, stop treating pmds as a special case. Previously they were always allocated and freed with the pgd. The modifies the code to be the same as 64-bit mode, where they are allocated on demand. This is a step on the way to unifying 32/64-bit pagetable allocation as much as possible. There is a complicating wart, however. When you install a new reference to a pmd in the pgd, the processor isn't guaranteed to see it unless you reload cr3. Since reloading cr3 also has the side-effect of flushing the tlb, this is an expense that we want to avoid whereever possible. This patch simply avoids reloading cr3 unless the update is to the current pagetable. Later patches will optimise this further. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Andi Kleen <ak@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: William Irwin <wli@holomorphy.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: arch/x86/mm/init_32.c printk fixesIngo Molnar
printk fixes. NOP in terms of functionality, but strings got a bit larger due to the KERN_ markers that were added. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: arch/x86/mm/init_32.c cleanupIngo Molnar
Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: cpa: fix the self-testIngo Molnar
Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: init memory debuggingIngo Molnar
debug incorrect/late access to init memory, by permanently unmapping the init memory ranges. Depends on CONFIG_DEBUG_PAGEALLOC=y. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: add testcases for RODATA and NX protections/attributesArjan van de Ven
Latest update; I now have 4 NX tests, but 2 fail so they're #if 0'd. I also cleaned up the NX test code quite a bit, and got rid of the ugly exception table sorting stuff. From: Arjan van de Ven <arjan@linux.intel.com> This patch adds testcases for the CONFIG_DEBUG_RODATA configuration option as well as the NX CPU feature/mappings. Both testcases can move to tests/ once that patch gets merged into mainline. (I'm half considering moving the rodata test into mm/init.c but I'll wait with that until init.c is unified) As part of this I had to fix a not-quite-right alignment in the vmlinux.lds.h for the RODATA sections, which lead to 1 page less being marked read only. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: make sure initmem is writableArjan van de Ven
When we free initmem, various rodata and CPA checks may have left memory read only.. this patch ensures that the memory is writable before we free it. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: cpa: move flush to cpaThomas Gleixner
The set_memory_* and set_pages_* family of API's currently requires the callers to do a global tlb flush after the function call; forgetting this is a very nasty deathtrap. This patch moves the global tlb flush into each of the callers Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: move page_is_ram() functionThomas Gleixner
Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: convert CPA users to the new set_page_ APIArjan van de Ven
This patch converts various users of change_page_attr() to the new, more intent driven set_page_*/set_memory_* API set. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30x86: remove set_kernel_exec()Andi Kleen
The SMP trampoline always runs in real mode, so making it executable in the page tables doesn't make much sense because it executes before page tables are set up. That was the only user of set_kernel_exec(). Remove set_kernel_exec(). Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: shrink __PAGE_KERNEL/__PAGE_KERNEL_EXEC on non PAE kernelsAndi Kleen
No need to make it 64bit there. Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86 32-bit boot: rename bt_ioremap() to early_ioremap()Huang, Ying
This patch renames bt_ioremap to early_ioremap, which is used in x86_64. This makes it easier to merge i386 and x86_64 usage. [ mingo@elte.hu: fix ] Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30i386 boot: replace boot_ioremap with enhanced bt_ioremap - enhance bt_ioremapHuang, Ying
This patch makes it possible for bt_ioremap() to be used before paging_init(), via providing an early implementation of set_fixmap() that can be used before paging_init(). This way boot_ioremap() can be replaced by bt_ioremap(). Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: return the page table level in lookup_address()Ingo Molnar
based on this patch from Andi Kleen: | Subject: CPA: Return the page table level in lookup_address() | From: Andi Kleen <ak@suse.de> | | Needed for the next change. | | And change all the callers. and ported it to x86.git. Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: clean up pte_execAndi Kleen
- Rename it to pte_exec() from pte_exec_kernel(). There is nothing kernel specific in there. - Move it into the common file because _PAGE_NX is 0 on !PAE and then pte_exec() will be always evaluate to true. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30c_p_a(): do a simple self test at bootAndi Kleen
When CONFIG_DEBUG_RODATA is enabled undo the ro mapping and redo it again. This gives some simple testing for change_page_attr(). Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: demacro asm-x86/pgalloc_32.hJeremy Fitzhardinge
Convert macros into inline functions, for better type-checking. This patch required a little bit of fiddling with headers in order to make __(pte|pmd)_free_tlb inline rather than macros. asm-generic/tlb.h includes asm/pgalloc.h, though it doesn't directly use any pgalloc definitions. I removed this include to avoid an include cycle, but it may cause secondary compile failures by things depending on the indirect inclusion; arch/x86/mm/hugetlbpage.c was one such place; there may be others. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: add mm parameter to paravirt_alloc_pdJeremy Fitzhardinge
Add mm to paravirt_alloc_pd, partly to make it consistent with paravirt_alloc_pt, and because later changes will make use of it. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: unify pgtable accessors which useJeremy Fitzhardinge
Make users of supported_pte_mask common. This has the side-effect of introducing the variable for 32-bit non-PAE, but I think its a pretty small cost to simplify the code. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86 boot: use E820 memory map on EFI 32 platformHuang, Ying
Because the EFI memory map are converted to e820 memory map in bootloader, the EFI memory map handling code is removed to clean up. Signed-off-by: Huang Ying <ying.huang@intel.com> Cc: Andi Kleen <ak@suse.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: clean up mm/init_32.cJeremy Fitzhardinge
Some code reformatting in init_32.c. No functional change. Signed-off-by: Jeremy Fitzhardinge <Jeremy.Fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-15x86: fix boot crash on HIGHMEM4G && SPARSEMEMIngo Molnar
Denys Fedoryshchenko reported a bootup crash when he upgraded his system from 3GB to 4GB RAM: http://lkml.org/lkml/2008/1/7/9 the bug is due to HIGHMEM4G && SPARSEMEM kernels making pfn_to_page() to return an invalid pointer when the pfn is in a memory hole. The 256 MB PCI aperture at the end of RAM was not mapped by sparsemem, and hence the pfn was not valid. But set_highmem_pages_init() iterated this range without checking the pfn's validity first. this bug was probably present in the sparsemem code ever since sparsemem has been introduced in v2.6.13. It was masked due to HIGHMEM64G using larger memory regions in sparsemem_32.h: #ifdef CONFIG_X86_PAE #define SECTION_SIZE_BITS 30 #define MAX_PHYSADDR_BITS 36 #define MAX_PHYSMEM_BITS 36 #else #define SECTION_SIZE_BITS 26 #define MAX_PHYSADDR_BITS 32 #define MAX_PHYSMEM_BITS 32 #endif which creates 1GB sparsemem regions instead of 64MB sparsemem regions. So in practice we only ever created true sparsemem holes on x86 with HIGHMEM4G - but that was rarely used by distros. ( btw., we could probably save 2MB of mem_map[]s on X86_PAE if we reduced the sparsemem region size to 256 MB. ) Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17Merge ssh://master.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-x86Linus Torvalds
* ssh://master.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-x86: (114 commits) x86: delete vsyscall files during make clean kbuild: fix typo SRCARCH in find_sources x86: fix kernel rebuild due to vsyscall fallout .gitignore update for x86 arch x86: unify include/asm/debugreg_32/64.h x86: unify include/asm/unwind_32/64.h x86: unify include/asm/types_32/64.h x86: unify include/asm/tlb_32/64.h x86: unify include/asm/siginfo_32/64.h x86: unify include/asm/bug_32/64.h x86: unify include/asm/mman_32/64.h x86: unify include/asm/agp_32/64.h x86: unify include/asm/kdebug_32/64.h x86: unify include/asm/ioctls_32/64.h x86: unify include/asm/floppy_32/64.h x86: apply missing DMA/OOM prevention to floppy_32.h x86: unify include/asm/cache_32/64.h x86: unify include/asm/cache_32/64.h x86: unify include/asm/dmi_32/64.h x86: unify include/asm/delay_32/64.h ...
2007-10-17x86: fix CONFIG_PAGEALLOC related boot hangs/OOMsIngo Molnar
if CONFIG_PAGEALLOC is enabled then X86_FEATURE_PSE is disabled and all the kernel physical RAM pagetables are set up as 4K pages. This is needed so that CONFIG_PAGEALLOC can do finegrained mapping and unmapping of pages. as a side-effect though, the total size of memory allocated as kernel pagetables increases significantly. All these pagetables are allocated via alloc_bootmem_low_pages(), straight out of the lowmem DMA pool. If the system has enough RAM and a large kernel image then almost all of the 16 MB lowmem DMA pool is allocated to the image and to pagetables - leaving no space for __GFP_DMA allocations. this results in drivers failing and the bootup hanging: swapper invoked oom-killer: gfp_mask=0x80d1, order=0, oomkilladj=0 [<4015059f>] out_of_memory+0x17f/0x1c0 [<40151f3c>] __alloc_pages+0x37c/0x3a0 [<40168cd7>] slob_new_page+0x37/0x50 [<40168dff>] slob_alloc+0x10f/0x190 [<40169010>] __kmalloc_node+0x80/0x90 [<405a17e3>] scsi_host_alloc+0x33/0x2c0 [<405a1a82>] scsi_register+0x12/0x60 [<40d5889e>] aha1542_detect+0x9e/0x940 [<405c5ba5>] ultrastor_detect+0x265/0x5f0 [<401352f5>] getnstimeofday+0x35/0xf0 [<40d58751>] init_this_scsi_driver+0x41/0xf0 [<40d0b856>] kernel_init+0x136/0x310 [<40d58710>] init_this_scsi_driver+0x0/0xf0 [<40d0b720>] kernel_init+0x0/0x310 [<40105547>] kernel_thread_helper+0x7/0x10 ======================= the fix is to first allocate from above the DMA pool, and if that fails (for example due to it being a machine with less than 16 MB of RAM), allocate from the DMA pool as a fallback. With this fix applied i was able to boot a PAGEALLOC=y kernel that would hang before. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17Merge branch 'xen-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen * 'xen-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen: xfs: eagerly remove vmap mappings to avoid upsetting Xen xen: add some debug output for failed multicalls xen: fix incorrect vcpu_register_vcpu_info hypercall argument xen: ask the hypervisor how much space it needs reserved xen: lock pte pages while pinning/unpinning xen: deal with stale cr3 values when unpinning pagetables xen: add batch completion callbacks xen: yield to IPI target if necessary Clean up duplicate includes in arch/i386/xen/ remove dead code in pgtable_cache_init paravirt: clean up lazy mode handling paravirt: refactor struct paravirt_ops into smaller pv_*_ops
2007-10-16remove dead code in pgtable_cache_initJeremy Fitzhardinge
The conversion from using a slab cache to quicklist left some residual dead code. I note that in the conversion it now always allocates a whole page for the pgd, rather than the 32 bytes needed for a PAE pgd. Was this intended? Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Christoph Lameter <clameter@sgi.com> Cc: Andi Kleen <ak@suse.de> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de>
2007-10-16fix memory hot remove not configured case.KAMEZAWA Hiroyuki
Now, arch dependent code around CONFIG_MEMORY_HOTREMOVE is a mess. This patch cleans up them. This is against 2.6.23-rc6-mm1. - fix compile failure on ia64/ CONFIG_MEMORY_HOTPLUG && !CONFIG_MEMORY_HOTREMOVE case. - For !CONFIG_MEMORY_HOTREMOVE, add generic no-op remove_memory(), which returns -EINVAL. - removed remove_pages() only used in powerpc. - removed no-op remove_memory() in i386, sh, sparc64, x86_64. - only powerpc returns -ENOSYS at memory hot remove(no-op). changes it to return -EINVAL. Note: Currently, only ia64 supports CONFIG_MEMORY_HOTREMOVE. I welcome other archs if there are requirements and testers. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-11i386: move mmThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>