Why do processors continue to increase in bit width, especially GPUs,
which are already 128-bit?
Is it just pure marketing? Why else does the march of progress seem to imply
larger bit widths?
To answer that question you first need to pin down just how you are
defining bit width. For example, people often ask why PCs are only
just now getting 64-bit chips when gaming consoles used 64-bit chips
WAY back with the Nintendo 64 (and perhaps earlier). The answer is
actually quite simple: the definition of 64-bit was VERY different for
those two systems.
There are a few ways that you can define how wide your chip is:
1. Width of the integer registers
2. Number of (memory address) bits in a standard pointer
3. Width of your data bus
4. Width of your vector registers
5. ???
The first two options tend to be how PC chips are defined these days.
AMD and Intel's new "64-bit" chips mainly differ from their "32-bit"
counterparts in that they can use 64-bit pointers for addressing large
quantities of memory (32-bit pointers limit you to a 4GB virtual
memory space). They also can handle a full 64-bit integer in a single
register, allowing them to easily deal with integer values larger than
4 billion (32-bit chips can do this as well, it's just slower and
takes more than one register). Of those two reasons, memory
addressability was the real driving force. 4GB of memory just isn't
that much these days, and once you figure on the OS taking 1 or 2GB of
it you are left with a rather tight address space. Fragment that
address space and it gets really hard to work with large amounts of
data.
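
To make the pointer-size point concrete, here's a minimal C sketch
(just an illustration, assuming the typical data models where pointers
are 4 bytes on a 32-bit target and 8 bytes on a 64-bit one):

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        /* On a typical 32-bit target a pointer is 4 bytes, which is
           exactly where the 4GB virtual address space ceiling comes
           from; on a 64-bit target it is 8 bytes. */
        printf("pointer size: %zu bytes\n", sizeof(void *));

        /* A 64-bit integer type exists either way, but a 32-bit CPU
           has to split it across two registers, while a 64-bit CPU
           handles it in a single register. */
        uint64_t big = 5000000000ULL;  /* too large for any 32-bit integer */
        printf("uint64_t: %zu bytes, value: %llu\n",
               sizeof(big), (unsigned long long)big);
        return 0;
    }

Build the same file as 32-bit and as 64-bit (e.g. gcc -m32 vs.
gcc -m64) and you'll see the pointer size change while the uint64_t
stays 8 bytes either way.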
Now, the width of the data bus is a somewhat older definition of the
bitness of a CPU; it was quite common back in the 8 and 16-bit days.
It is also the definition that made the Nintendo 64 a "64-bit" gaming
console. By this definition the Intel Pentium was the first "64-bit"
x86 CPU. This really isn't too relevant anymore, especially with the
move to more serialized data buses (fast and narrow, versus the
wide-but-slow parallel bus design), though you do sometimes see it
with video cards, particularly as a way to differentiate the low-end
cards from their more expensive brethren.
Of course, that isn't the only way bitness gets defined for a video
card; sometimes the number refers to the width of the vector
registers, our 4th option above. Note that these two bitness
definitions are independent of each other. For example, one card could
have a 64-bit data bus and 128-bit wide registers while another card
could have a 256-bit wide data bus but the same 128-bit wide
registers.
Video cards tend to execute the same instruction on MANY pieces of
data in succession. As such they are much better served by a
vector-processor design than standard (scalar) CPUs, which tend to
execute an instruction on only one or two pieces of data at a time.
Here a 128-bit vector register can, for example, let you pack 4 chunks
of 32-bit data into one register and run a single instruction on all 4
chunks. Note that this sort of thing DOES exist in the PC world as
well, in the form of SSE2. SSE2 has 128-bit vector registers that can
handle either 4 32-bit chunks of data or 2 64-bit chunks. CPUs aren't
always as efficient at this sort of thing as a GPU, but that's mainly
because a GPU is a pretty application-specific design while a CPU is
much more of a general-purpose design.
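
If you want to see what that packing looks like in practice, here's a
minimal sketch using the standard SSE2 intrinsics from <emmintrin.h>
(compile with something like gcc -msse2; the values are made up purely
for illustration). It loads four 32-bit integers into each 128-bit
register and adds all four lanes with one instruction:

    #include <stdio.h>
    #include <stdint.h>
    #include <emmintrin.h>   /* SSE2 intrinsics */

    int main(void)
    {
        /* Pack four 32-bit integers into each 128-bit XMM register.
           _mm_set_epi32 takes its arguments from the high lane down. */
        __m128i a = _mm_set_epi32(40, 30, 20, 10);
        __m128i b = _mm_set_epi32( 4,  3,  2,  1);

        /* One instruction (paddd) adds all four 32-bit lanes at once. */
        __m128i sum = _mm_add_epi32(a, b);

        int32_t out[4];
        _mm_storeu_si128((__m128i *)out, sum);

        printf("%d %d %d %d\n", out[0], out[1], out[2], out[3]);
        /* prints: 11 22 33 44 */
        return 0;
    }

The same 128-bit register could instead be treated as 2 chunks of
64-bit data by using _mm_add_epi64 in place of _mm_add_epi32.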
Anyway, all that being said, it doesn't exactly answer your question
of "why do we move to higher levels of bitness". The real answer is
simply that it's one way of improving the performance of a computer
chip. It's not the only way, but sometimes it's the right choice to
make.