Age | Commit message (Collapse) | Author |
|
This patch reverts NUMA affine page table allocation added by commit
1411e0ec31 (x86-64, numa: Put pgtable to local node memory).
The commit made an undocumented change where the kernel linear mapping
strictly follows intersection of e820 memory map and NUMA
configuration. If the physical memory configuration has holes or NUMA
nodes are not properly aligned, this leads to using unnecessarily
smaller mapping size which leads to increased TLB pressure. For
details,
http://thread.gmane.org/gmane.linux.kernel/1104672
Patches to fix the problem have been proposed but the underlying code
needs more cleanup and the approach itself seems a bit heavy handed
and it has been determined to revert the feature for now and come back
to it in the next developement cycle.
http://thread.gmane.org/gmane.linux.kernel/1105959
As init_memory_mapping_high() callsites have been consolidated since
the commit, reverting is done manually. Also, the RED-PEN comment in
arch/x86/mm/init.c is not restored as the problem no longer exists
with memblock based top-down early memory allocation.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
|
|
There's no reason for these to live in setup_arch(). Move them inside
initmem_init().
- v2: x86-32 initmem_init() weren't updated breaking 32bit builds.
Fixed. Found by Ankita.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ankita Garg <ankita@in.ibm.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Shaohui Zheng <shaohui.zheng@intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: H. Peter Anvin <hpa@linux.intel.com>
|
|
initmem_init() extensively accesses and modifies global data
structures and the parameters aren't even followed depending on which
path is being used. Drop @start/last_pfn and let it deal with
@max_pfn directly. This is in preparation for further NUMA init
cleanups.
- v2: x86-32 initmem_init() weren't updated breaking 32bit builds.
Fixed. Found by Yinghai.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Shaohui Zheng <shaohui.zheng@intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: H. Peter Anvin <hpa@linux.intel.com>
|
|
Introduce init_memory_mapping_high(), and use it with 64bit.
It will go with every memory segment above 4g to create page table to the
memory range itself.
before this patch all page tables was on one node.
with this patch, one RED-PEN is killed
debug out for 8 sockets system after patch
[ 0.000000] initial memory mapped : 0 - 20000000
[ 0.000000] init_memory_mapping: [0x00000000000000-0x0000007f74ffff]
[ 0.000000] 0000000000 - 007f600000 page 2M
[ 0.000000] 007f600000 - 007f750000 page 4k
[ 0.000000] kernel direct mapping tables up to 7f750000 @ [0x7f74c000-0x7f74ffff]
[ 0.000000] RAMDISK: 7bc84000 - 7f745000
....
[ 0.000000] Adding active range (0, 0x10, 0x95) 0 entries of 3200 used
[ 0.000000] Adding active range (0, 0x100, 0x7f750) 1 entries of 3200 used
[ 0.000000] Adding active range (0, 0x100000, 0x1080000) 2 entries of 3200 used
[ 0.000000] Adding active range (1, 0x1080000, 0x2080000) 3 entries of 3200 used
[ 0.000000] Adding active range (2, 0x2080000, 0x3080000) 4 entries of 3200 used
[ 0.000000] Adding active range (3, 0x3080000, 0x4080000) 5 entries of 3200 used
[ 0.000000] Adding active range (4, 0x4080000, 0x5080000) 6 entries of 3200 used
[ 0.000000] Adding active range (5, 0x5080000, 0x6080000) 7 entries of 3200 used
[ 0.000000] Adding active range (6, 0x6080000, 0x7080000) 8 entries of 3200 used
[ 0.000000] Adding active range (7, 0x7080000, 0x8080000) 9 entries of 3200 used
[ 0.000000] init_memory_mapping: [0x00000100000000-0x0000107fffffff]
[ 0.000000] 0100000000 - 1080000000 page 2M
[ 0.000000] kernel direct mapping tables up to 1080000000 @ [0x107ffbd000-0x107fffffff]
[ 0.000000] memblock_x86_reserve_range: [0x107ffc2000-0x107fffffff] PGTABLE
[ 0.000000] init_memory_mapping: [0x00001080000000-0x0000207fffffff]
[ 0.000000] 1080000000 - 2080000000 page 2M
[ 0.000000] kernel direct mapping tables up to 2080000000 @ [0x207ff7d000-0x207fffffff]
[ 0.000000] memblock_x86_reserve_range: [0x207ffc0000-0x207fffffff] PGTABLE
[ 0.000000] init_memory_mapping: [0x00002080000000-0x0000307fffffff]
[ 0.000000] 2080000000 - 3080000000 page 2M
[ 0.000000] kernel direct mapping tables up to 3080000000 @ [0x307ff3d000-0x307fffffff]
[ 0.000000] memblock_x86_reserve_range: [0x307ffc0000-0x307fffffff] PGTABLE
[ 0.000000] init_memory_mapping: [0x00003080000000-0x0000407fffffff]
[ 0.000000] 3080000000 - 4080000000 page 2M
[ 0.000000] kernel direct mapping tables up to 4080000000 @ [0x407fefd000-0x407fffffff]
[ 0.000000] memblock_x86_reserve_range: [0x407ffc0000-0x407fffffff] PGTABLE
[ 0.000000] init_memory_mapping: [0x00004080000000-0x0000507fffffff]
[ 0.000000] 4080000000 - 5080000000 page 2M
[ 0.000000] kernel direct mapping tables up to 5080000000 @ [0x507febd000-0x507fffffff]
[ 0.000000] memblock_x86_reserve_range: [0x507ffc0000-0x507fffffff] PGTABLE
[ 0.000000] init_memory_mapping: [0x00005080000000-0x0000607fffffff]
[ 0.000000] 5080000000 - 6080000000 page 2M
[ 0.000000] kernel direct mapping tables up to 6080000000 @ [0x607fe7d000-0x607fffffff]
[ 0.000000] memblock_x86_reserve_range: [0x607ffc0000-0x607fffffff] PGTABLE
[ 0.000000] init_memory_mapping: [0x00006080000000-0x0000707fffffff]
[ 0.000000] 6080000000 - 7080000000 page 2M
[ 0.000000] kernel direct mapping tables up to 7080000000 @ [0x707fe3d000-0x707fffffff]
[ 0.000000] memblock_x86_reserve_range: [0x707ffc0000-0x707fffffff] PGTABLE
[ 0.000000] init_memory_mapping: [0x00007080000000-0x0000807fffffff]
[ 0.000000] 7080000000 - 8080000000 page 2M
[ 0.000000] kernel direct mapping tables up to 8080000000 @ [0x807fdfc000-0x807fffffff]
[ 0.000000] memblock_x86_reserve_range: [0x807ffbf000-0x807fffffff] PGTABLE
[ 0.000000] Initmem setup node 0 [0000000000000000-000000107fffffff]
[ 0.000000] NODE_DATA [0x0000107ffbd000-0x0000107ffc1fff]
[ 0.000000] Initmem setup node 1 [0000001080000000-000000207fffffff]
[ 0.000000] NODE_DATA [0x0000207ffbb000-0x0000207ffbffff]
[ 0.000000] Initmem setup node 2 [0000002080000000-000000307fffffff]
[ 0.000000] NODE_DATA [0x0000307ffbb000-0x0000307ffbffff]
[ 0.000000] Initmem setup node 3 [0000003080000000-000000407fffffff]
[ 0.000000] NODE_DATA [0x0000407ffbb000-0x0000407ffbffff]
[ 0.000000] Initmem setup node 4 [0000004080000000-000000507fffffff]
[ 0.000000] NODE_DATA [0x0000507ffbb000-0x0000507ffbffff]
[ 0.000000] Initmem setup node 5 [0000005080000000-000000607fffffff]
[ 0.000000] NODE_DATA [0x0000607ffbb000-0x0000607ffbffff]
[ 0.000000] Initmem setup node 6 [0000006080000000-000000707fffffff]
[ 0.000000] NODE_DATA [0x0000707ffbb000-0x0000707ffbffff]
[ 0.000000] Initmem setup node 7 [0000007080000000-000000807fffffff]
[ 0.000000] NODE_DATA [0x0000807ffba000-0x0000807ffbefff]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4D1933D1.9020609@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
|
|
Move it into head file. to prepare use it in other files.
[ hpa: added missing <linux/types.h> and changed type to phys_addr_t. ]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4D1933BA.8000508@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
|
|
On 32-bit non-PAE system, cast to 'phys_addr_t' truncates value
before subtraction. Subtracting before cast produce same result
but remove following warnings from sparse:
arch/x86/include/asm/pgtable_types.h:255:38: warning: cast truncates bits from constant value (100000000 becomes 0)
arch/x86/include/asm/pgtable_types.h:270:38: warning: cast truncates bits from constant value (100000000 becomes 0)
arch/x86/include/asm/pgtable.h:127:32: warning: cast truncates bits from constant value (100000000 becomes 0)
arch/x86/include/asm/pgtable.h:132:32: warning: cast truncates bits from constant value (100000000 becomes 0)
arch/x86/include/asm/pgtable.h:344:31: warning: cast truncates bits from constant value (100000000 becomes 0)
64-bit or PAE machines will not be affected by this change.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
LKML-Reference: <1285770588-14065-1-git-send-email-namhyung@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
|
|
The generic resource based page_is_ram() works better with memory
hotplug/hotremove. So switch the x86 e820map based code to it.
CC: Andi Kleen <andi@firstfloor.org>
CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
CC: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
LKML-Reference: <20100122033004.470767217@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
|
To eventually interleave emulated nodes over physical nodes, we
need to know the physical topology of the machine without actually
registering it. This does the k8 node setup in two parts:
detection and registration. NUMA emulation can then used the
physical topology detected to setup the address ranges of emulated
nodes accordingly. If emulation isn't used, the k8 nodes are
registered as normal.
Two formals are added to the x86 NUMA setup functions: `acpi' and
`k8'. These represent whether ACPI or K8 NUMA has been detected;
both cannot be true at the same time. This specifies to the NUMA
emulation code whether an underlying physical NUMA topology exists
and which interface to use.
This patch deals solely with separating the k8 setup path into
Northbridge detection and registration steps and leaves the ACPI
changes for a subsequent patch. The `acpi' formal is added here,
however, to avoid touching all the header files again in the next
patch.
This approach also ensures emulated nodes will not span physical
nodes so the true memory latency is not misrepresented.
k8_get_nodes() may now be used to export the k8 physical topology
of the machine for NUMA emulation.
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Ankita Garg <ankita@in.ibm.com>
Cc: Len Brown <len.brown@intel.com>
LKML-Reference: <alpine.DEB.1.00.0909251518400.14754@chino.kir.corp.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Impact: unification of declarations, cleanup
Unification of declarations:
moved init_memory_mapping, initmem_init and free_initmem from
page_XX_types.h to page_types.h
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <1239693869.3033.31.camel@ht.satnam>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
gcc 3.2.2 reports:
In file included from /usr/src/all/linux-next/arch/x86/include/asm/page.h:8,
from /usr/src/all/linux-next/arch/x86/include/asm/processor.h:18,
from /usr/src/all/linux-next/arch/x86/include/asm/atomic_32.h:6,
from /usr/src/all/linux-next/arch/x86/include/asm/atomic.h:2,
from include/linux/crypto.h:20,
from arch/x86/kernel/asm-offsets_32.c:7,
from arch/x86/kernel/asm-offsets.c:2:
/usr/src/all/linux-next/arch/x86/include/asm/page_types.h:54: warning: parameter has incomplete type
/usr/src/all/linux-next/arch/x86/include/asm/page_types.h:56: warning: parameter has incomplete type
In file included from /usr/src/all/linux-next/arch/x86/include/asm/page.h:8,
from /usr/src/all/linux-next/arch/x86/include/asm/processor.h:18,
from include/linux/prefetch.h:14,
from include/linux/list.h:6,
from include/linux/module.h:9,
from init/main.c:13:
/usr/src/all/linux-next/arch/x86/include/asm/page_types.h:54: warning: parameter has incomplete type
/usr/src/all/linux-next/arch/x86/include/asm/page_types.h:56: warning: parameter has incomplete type
This is a bogus warning, but moving the pat-related functions
into asm/pat.h and including asm/pgtable_types.h should fix it.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
PAGETABLE_LEVELS and the PTE masks should be in pgtable*.h
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen into x86/headers
Conflicts:
arch/x86/include/asm/page.h
arch/x86/include/asm/pgtable.h
arch/x86/mach-voyager/voyager_smp.c
arch/x86/mm/fault.c
|
|
pgtable*.h is intended for definitions relating to actual pagetables
and their entries, so move all the definitions for
(pte|pmd|pud|pgd)(val)?_t to the appropriate pgtable*.h headers.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
|
|
|
|
Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
|
|
Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
|
|
Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
|