You need a refresher course in x86 architecture. Ever heard of Byte Enables?
The Byte Enables are pins on Pentium-class processors that "determine which
bytes must be written to external memory, or which bytes were requested by the
CPU for the current cycle" (Intel datasheet for Pentium processors). This is
why Pentium-class processors do not have address lines A0 thru A2; the
processor always addresses memory 64 bits (8 bytes) at a time. The Byte Enables
(BE0# thru BE7#) tell the memory system which of these bytes the CPU wanted.

When a cacheable read is performed from memory, whether a single byte, word, or
doubleword is being read, a full cacheline (four doublewords, at least on the
486) of data is read from memory. This is because the overhead of reading three
more doublewords is small in comparison to the cost of fetching the first
doubleword.
The principle of locality also comes into play here. Chances are that your
code/data is contiguous, so why not read now what you will probably be needing
in the near future?

The OS doesn't care about the data bus width; otherwise you wouldn't be able to
run 95, 98, ME on 386s, 486s, Pentiums, Pentium IIs, etc. Here's a question for
you: since the original Pentium had a 64-bit data bus, is it a 64-bit
processor? Since the original 80486 has eight 80-bit registers in its FPU, is
the 486 an 80-bit CPU?
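
To make the byte-enable point concrete, here is a rough Python sketch of how a
byte address and access size map onto a 64-bit bus. The function names
(byte_enables, line_base) are made up for illustration, the mask is a logical
one (bit i set means byte lane i is wanted; the real BE# pins are active low),
and accesses that straddle an 8-byte boundary are simply rejected rather than
split into two bus cycles as real hardware would do.

```python
def byte_enables(addr, size):
    """Return (bus_address, mask) for a `size`-byte access at byte address `addr`.

    Bit i of `mask` set means byte lane i of the 64-bit bus is wanted.
    Assumption: the access fits inside one aligned 8-byte quantity.
    """
    offset = addr & 0x7  # low three bits: these never appear as A0-A2
    if offset + size > 8:
        raise ValueError("access crosses a 64-bit boundary; needs two bus cycles")
    mask = ((1 << size) - 1) << offset
    return addr & ~0x7, mask


def line_base(addr, line_bytes=16):
    """Start address of the cache line holding `addr` (16 bytes = four doublewords)."""
    return addr & ~(line_bytes - 1)


if __name__ == "__main__":
    bus_addr, mask = byte_enables(0x1004, 4)  # doubleword in the upper half
    print(hex(bus_addr), format(mask, "08b"))
    print(hex(line_base(0x1234)))
```

A one-byte read and a four-byte read inside the same aligned 8 bytes produce
the same bus address; only the mask differs.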

OK, you're running a 32-bit program and you have a processor that can access
memory 256 bits at a time. What does it do with the 224 bits that it's not
using, other than store them in the cache? Will pulling the next 8
instructions or next 8 words of memory into cache every time you do a read
improve performance significantly over pulling the next 4?
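
One crude way to think about that last question is to count line fills. The
sketch below is a toy model, not a benchmark: it assumes an initially empty
cache that never evicts anything, ignores bus timing entirely, and uses a
made-up helper name (line_fills). It only shows that a wider fill helps when
the accesses are contiguous and does nothing for widely strided ones.

```python
def line_fills(addresses, line_bytes):
    """Count distinct cache lines touched by a sequence of byte addresses.

    Assumption: an empty cache with no evictions, so every new line costs
    exactly one fill and every later hit on that line is free.
    """
    return len({addr // line_bytes for addr in addresses})


if __name__ == "__main__":
    sequential = range(0, 1024, 4)      # 256 contiguous doubleword reads
    strided = range(0, 256 * 64, 64)    # 256 reads, 64 bytes apart

    for name, pattern in (("sequential", sequential), ("strided", strided)):
        print(name, line_fills(pattern, 16), line_fills(pattern, 32))
```

In this model doubling the line size halves the fills for the sequential
pattern and leaves the strided pattern unchanged, which is the locality
argument in numbers.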
-Bill (remove "botizer" to reply via email)