Casper H.S. Dik said:
You could run just a 64 bit kernel (single 64 bit program) and then
you will have 4GB of user virtual memory.
Sure, that's a fine solution, but it requires a 64bit CPU
(contrary to all the "we don't need 64bit yet" naysayers).
My point was just that the usual "we only need 64bit when
everybody has more than 4GB of RAM" argument you often hear
is bogus when you look at the details.
It's also possible on some architectures to use separate page tables
or segments for the kernel and for user space, but at least on x86,
without ASNs that let the TLB hold entries for multiple page tables,
it's extremely slow (basically every interrupt and syscall turns into
a full heavyweight context switch). Also the kernel cannot directly
access user space memory anymore, so it has to implement its own page
table walks and TLBs in software for this. These tend to be a lot
slower than what the CPU's native facilities offer.
There is another issue - even when you look at physical memory
rather than virtual, the real boundary is more like 3.2-3.5GB.
The reason is that a machine without a real IOMMU (like a PC or many
other chipsets) needs to put some memory holes below the 4GB boundary
for the AGP aperture and for 32bit-only PCI IO mappings. As a result,
a PC with even less than 4GB of real memory can have physical memory
addresses beyond 0xffffffff. The firmware/chipset has to map the
memory "around" the memory holes, otherwise you would lose memory.
Usually they can only do that at DIMM granularity. There are machines
that, when you have two 2GB DIMMs, put the first one at 0 and the
second one at 4GB (with 2GB-4GB taken by PCI mappings and the memory
hole). The highest physical address seen is then 6GB, even though you
only have 4GB of memory. The same applies to a 4x1GB setup; there the
highest physical address is typically at 5GB. In some cases it even
happens with less than 4GB of memory, e.g. with 3.5GB when the holes
are bigger than 0.5GB.
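To make the arithmetic concrete, here is a minimal sketch in Python of
the DIMM-granularity remapping described above. It is only a toy model:
the names place_dimms and top_of_memory are made up for illustration,
and it assumes a single hole ending at 4GB and firmware that moves a
whole DIMM above 4GB as soon as it would overlap the hole. Real
chipsets differ in the details.

```python
GB = 1 << 30

def place_dimms(dimm_sizes, hole_start, hole_end=4 * GB):
    """Toy model: place DIMMs in order, relocating whole DIMMs past the hole.

    Returns a list of (start, end) physical ranges, with end exclusive.
    A DIMM that would overlap [hole_start, hole_end) is moved to hole_end
    (or to the next free address above it).
    """
    ranges = []
    next_free = 0
    for size in dimm_sizes:
        start = next_free
        # Remapping only works per DIMM, so a DIMM that touches the hole
        # is moved as a whole above the 4GB boundary.
        if start < hole_end and start + size > hole_start:
            start = hole_end
        ranges.append((start, start + size))
        next_free = start + size
    return ranges

def top_of_memory(ranges):
    """Exclusive end of the highest populated physical range."""
    return max(end for _, end in ranges)

# Two 2GB DIMMs, hole at 2GB-4GB
print(top_of_memory(place_dimms([2 * GB, 2 * GB], hole_start=2 * GB)) / GB)  # 6.0
# Four 1GB DIMMs, hole at 3GB-4GB
print(top_of_memory(place_dimms([GB] * 4, hole_start=3 * GB)) / GB)          # 5.0
# 3.5GB total (2GB + 1GB + 0.5GB), 1GB hole at 3GB-4GB
print(top_of_memory(place_dimms([2 * GB, GB, GB // 2], hole_start=3 * GB)) / GB)  # 4.5
```

In all three cases the top of memory lands above 4GB, although none of
the machines has more than 4GB of RAM installed.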
There are some PCI cards with extremely big IO mappings (e.g. a lot of
the high speed HPC interconnects that do direct memory-to-memory
networking) where the threshold is even lower. While these extreme
cards usually support 64bit BARs and could put their mappings beyond
the 4GB boundary, the PC BIOS cannot do that for compatibility reasons,
and it's difficult to do later in a driver because placing IO mappings
is deeply chipset specific (that's more of an x86 specific issue
admittedly, but then this thread is cross posted to x86 specific
groups).
While a 32bit x86 OS can access that high memory using PAE, it already
becomes inconvenient and slow (it requires unnecessary TLB flushes and
waits in the kernel to manage the limited mapping space), and a 64bit
OS is better at this.
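As a rough illustration of when plain 32bit physical addressing stops
being enough, the following Python sketch counts the address bits
needed for a given top of memory. The function names are hypothetical,
and the 36 bit figure assumes the classic PAE physical address width;
later CPUs may support more.

```python
GB = 1 << 30
PAE_BITS = 36  # assumption: classic PAE physical address width

def address_bits_needed(top_of_memory):
    """Bits needed to address every byte below top_of_memory (exclusive)."""
    return max(top_of_memory - 1, 0).bit_length()

def needs_pae(top_of_memory):
    """True if the highest physical address does not fit in 32 bits."""
    return address_bits_needed(top_of_memory) > 32

for top in (4 * GB, 4 * GB + GB // 2, 6 * GB):
    bits = address_bits_needed(top)
    print(top / GB, bits, needs_pae(top), bits <= PAE_BITS)
```

So a machine whose memory tops out at exactly 4GB still fits in 32
bits, while the 4.5GB and 6GB layouts need 33 bits and therefore PAE
(or a 64bit OS) to reach the relocated memory.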
-Andi