Suggestions for Core 2 Duo systems that use PCI - not PCI express?

  • Thread starter: muzician21

muzician21

Right now I'm running a 2.4 gig P4 on a Soyo Dragon mobo. I could
upgrade to a socket 478 3.4gig processor and get about a 30% bump in
speed which wouldn't be bad, but it's my understanding going to a Core
2 Duo chip I could see a much bigger increase.

I don't need to be on the cutting edge, what Core 2 chips should I look
for that would net me about 3x the speed of that 2.4gig P4? I'm
favoring Intel unless you feel there's a really compelling reason to
go with someone else.

I still want to run XP - all my software works with it and I'd like to
stay with PCI slots, not PCI express so I can swap over hardware I've
already got. The more PCI slots the better - like 5 or more. Does such
an animal exist - i.e. Core 2 duo system with lots of PCI slots?

Thanks for all input
 
muzician21 said:
I don't need to be on the cutting edge, what Core 2 chips should I look
for that would net me about 3x the speed of that 2.4gig P4?

None.

While you would see improvement from any C2D over 2.4GHz, you won't likely
see anywhere near 3X the speed on anything.

I still want to run XP - all my software works with it and I'd like to
stay with PCI slots, not PCI express so I can swap over hardware I've
already got. The more PCI slots the better - like 5 or more. Does such
an animal exist - i.e. Core 2 duo system with lots of PCI slots?

If you stay with PCI, then you will choke your I/O to graphics, HDs, and
other peripherals that use the PCI bus. There's no sense in staying with
PCI if you want performance.
 
JR Weiss said:
None.

While you would see improvement from any C2D over 2.4GHz, you won't likely
see anywhere near 3X the speed on anything.

Why not? Except under very specific workloads, the P4's pipeline length
all but crippled the processor's responsiveness for day to day usage.

Hyperthreading partially addressed this, although it caused its own set
of slowdowns.

A single Core 2 core is roughly 1.5x-2x faster than a similarly clocked
P4 CPU, so one of the higher end Core 2 Duo processors should easily offer
3x-4x processing improvements over a P4.

In fairness, we're rarely CPU bound at all these days, so when it comes
to desktop performance comparing CPUs isn't always the best way to
start.
If you stay with PCI, then you will choke your I/O to graphics, HDs, and
other peripherals that use the PCI bus. There's no sense in staying with
PCI if you want performance.

Depending on what sort of devices are connected, the PCI bus'
limitations may not matter. Sound cards, fax boards, even
SCSI-connected scanners and similar won't get near the PCI bus'
bandwidth limitations. Higher performance devices will make use of a
faster bus, but for most users their video card and possibly an
additional drive controller are about all that fit into that ballpark.

(okay okay, Ethernet too in theory, but in practice how many users have
hardware that can sustain more than PCI's practical transfer speed over
Ethernet?)
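
To put rough numbers on the bandwidth point above, here is a minimal sketch. It assumes conventional desktop PCI (32 bits wide, 33 MHz clock) and compares its theoretical peak with raw Ethernet line rates. The function names are made up for illustration, and real sustained throughput is lower on both sides because of bus arbitration and protocol overhead.

```python
# Back-of-the-envelope figures: theoretical peaks, not measured throughput.

PCI_BUS_WIDTH_BITS = 32   # conventional desktop PCI slot
PCI_CLOCK_HZ = 33.33e6    # 33 MHz PCI clock


def pci_peak_mb_per_s(width_bits=PCI_BUS_WIDTH_BITS, clock_hz=PCI_CLOCK_HZ):
    """Peak rate in MB/s, assuming one bus-width transfer per clock."""
    return width_bits / 8 * clock_hz / 1e6


def ethernet_peak_mb_per_s(link_mbit_per_s):
    """Raw line rate of an Ethernet link in MB/s (ignores protocol overhead)."""
    return link_mbit_per_s / 8


pci = pci_peak_mb_per_s()
for name, mbit in [("Fast Ethernet", 100), ("Gigabit Ethernet", 1000)]:
    rate = ethernet_peak_mb_per_s(mbit)
    print(f"{name}: {rate:.1f} MB/s, {rate / pci:.0%} of the PCI peak ({pci:.0f} MB/s)")
```

Keep in mind that the PCI figure is shared by every card on the bus, so a gigabit NIC running flat out would leave very little for anything else, while a sound card or a 100 Mbit NIC barely registers.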
 
I suggest that you try the "power search" function for Intel motherboards at
www.newegg.com, even if you can't use them (outside US or Canada) or don't
care to use them.

I find three Socket 775 boards with 5 PCI slots:

http://www.newegg.com/Product/Produ...CodeValue=705:9908&PropertyCodeValue=735:7583

(link may wrap)

I don't know of a simple performance comparison. The highest clock
frequencies of a Core2 CPU may not be much higher than a P4. Multicore
processors can give improved performance, but that may require software
written to exploit multiple CPUs.

I hope that you're aware that you'll need to replace your RAM and probably
the power supply as well as the motherboard and CPU. Fortunately, DDR2
memory is pretty cheap at the moment.

I'm not sure that I agree with another poster about PCI-E being significant.
It's the way to go for high-end graphics cards for gaming, but it may not
offer practical advantages over PCI for desktop users for other purposes.
There appears to be a lag in PCI-E card development, even if you were
prepared to replace all of your cards. For example: I have an Asus PCI-E
sound card. It is really a PCI card with some bridge circuitry, so that it
works in a PCI-E X1 slot. It has no better performance than the PCI version,
and it's a bit more awkward to use, as it requires a separate power
connection.
 
None.

While you would see improvement from any C2D over 2.4GHz, you won't likely
see anywhere near 3X the speed on anything.


Hmm. Looking at a chart like this

http://www.cpubenchmark.net/common_cpus.html

gives the impression there are CPUs that are many times faster.

What I'm mostly looking at is rendering times for processing video
such as through VirtualDub and for creating DVDs. You feel I won't
see "anywhere near" 3x the speed? If that's correct maybe just maxing
out the board with a faster socket 478 CPU isn't such a bad idea.
 
muzician21 said:
Hmm. Looking at a chart like this

http://www.cpubenchmark.net/common_cpus.html

gives the impression there are CPUs that are many times faster.

What I'm mostly looking at is rendering times for processing video
such as through VirtualDub and for creating DVDs. You feel I won't
see "anywhere near" 3x the speed? If that's correct maybe just maxing
out the board with a faster socket 478 CPU isn't such a bad idea.

A magazine article, or a web site now, will tend to use
benchmarks that emphasize processor performance this way.

(clock_speed * instructions_per_clock) * number_of_cores

What they do is test multithreaded software. Multithreading works
best in multimedia applications, because a number of problems there
(processing large data sets) benefit from a divide and conquer
algorithm.

For example, Photoshop could split a picture in two pieces, and
a processor core could work on each half of the picture.

But the truth is, activities on a computer consist of a mix
of single threaded ones and multithreaded ones. So a typical
user doesn't see the huge speedup the above equation might
suggest. For single threaded computing, you'd see an improvement
proportional to just a single core. The Core2 "instructions_per_clock" is
how some of the speedup occurs.

(clock_speed * instructions_per_clock)

So if you wanted a 3x speedup at all times, I'd have to pick a
processor that offers that improvement at all times. To do that,
I'd use a single threaded benchmark. If your target was 3x performance
increase only while you were rendering or shrinking a movie, then a
multithreaded benchmark would tell you that.
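
As a rough illustration of the two formulas above, the sketch below plugs in made-up numbers. The instructions_per_clock values are assumptions (P4 normalised to 1.0, Core 2 taken as 1.75, the midpoint of the 1.5x-2x per-core estimate given earlier in the thread), not measurements.

```python
def relative_throughput(clock_ghz, instructions_per_clock, number_of_cores=1):
    """(clock_speed * instructions_per_clock) * number_of_cores, in arbitrary units."""
    return clock_ghz * instructions_per_clock * number_of_cores


# Assumed IPC figures, for illustration only.
P4_IPC = 1.0
CORE2_IPC = 1.75

p4 = relative_throughput(2.4, P4_IPC)                              # 2.4 GHz P4, one core
core2_single = relative_throughput(3.0, CORE2_IPC)                 # one core of a 3.0 GHz Core 2 Duo
core2_multi = relative_throughput(3.0, CORE2_IPC, number_of_cores=2)

single_ratio = core2_single / p4
multi_ratio = core2_multi / p4
print(f"single threaded: {single_ratio:.2f}x")
print(f"fully multithreaded: {multi_ratio:.2f}x")
```

With these assumed inputs the single threaded case lands a bit above 2x and only the fully multithreaded case clears 3x, which is why the choice of benchmark matters so much.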

I can pick a "Pentium 4 2.4GHz C Northwood" on hwbot.org, and then
look at the collected benchmarks. The "C" means FSB800 (front side
bus speed), which would be about as good as it gets for a S478
processor. A much earlier processor, say one for socket 423,
might be FSB400, making it harder to get data in and out of the
processor.

http://www.hwbot.org/ResultBrowseByProcessor.do?cpuModelId=1425

SuperPI 1M ( 1 million digits) 80 seconds at 2.4Ghz
SuperPI 32M (32 million digits) 58 minutes 59 seconds at 2.4GHz

Now, compare to an E8400 Core2 Duo 3GHz processor.

http://www.hwbot.org/listResults.do?cpuModelId=1512&applicationId=3

SuperPI 1M ( 1 million digits) 15-16 seconds at 3.0Ghz

http://www.hwbot.org/listResults.do?cpuModelId=1512&applicationId=7

SuperPI 32M (32 million digits) 14:10 to 15:59 at 3.0GHz

The scaleup there implies a factor of 5, in the 1 million digit
benchmark. But the thing is, SuperPI uses about 8MB of data in
main memory, and the E8400 has 6MB of shared L2 cache. I don't know
what the locality of reference is like in SuperPI, but I would be
a bit suspicious that the benchmark is overestimating the speedup.
A lot of the SuperPI data might end up stored in L2, giving
an unfair advantage and a less than honest performance ratio.

So I can try the 32 million digit benchmark. This still seems a
little on the high side.

If we compare 58:59 to 15:59, that is a factor of 3539/959 = 3.69
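
The arithmetic above can be checked with a few lines of Python. The times are the hwbot figures quoted in this post; the helper name is just for illustration.

```python
def to_seconds(mmss):
    """Convert a 'minutes:seconds' string such as '58:59' to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# SuperPI 1M, in seconds
p4_1m = 80
e8400_1m_fast, e8400_1m_slow = 15, 16

# SuperPI 32M
p4_32m = to_seconds("58:59")
e8400_32m_fast = to_seconds("14:10")
e8400_32m_slow = to_seconds("15:59")

print(f"1M:  {p4_1m / e8400_1m_slow:.2f}x to {p4_1m / e8400_1m_fast:.2f}x")
print(f"32M: {p4_32m / e8400_32m_slow:.2f}x to {p4_32m / e8400_32m_fast:.2f}x")
```

The 3.69 figure uses the slowest E8400 result, so it is the conservative end of the 32M range.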

Your P4 consisted of a single core, and it could have had Hyperthreading,
which makes a second, virtual core. The virtual core, on a good day,
contributes only an extra 10% to performance, as it runs when the
other core is "blocked". Now, you can buy quad core processors,
and if the software you use can actually use all four cores, then
you should see a good improvement.

The Q9650 is two E8400s inside the same CPU package. It is a quad for $324.
The Q9550 is comparable, and is 2.83GHz for $270.

http://www.newegg.com/Product/Product.aspx?Item=N82E16819115130

   core core     core core      Q9550, Q9650
    |    |        |    |        Block Diagram
   -+----+-      -+----+-       Two silicon die, joined inside.
  | 6MB L2 |    | 6MB L2 |
   ----+---      ---+----
       |            |
       +-----+------+
             |
        LGA775 FSB (used for memory access and I/O)

Nehalem (Core i7) is the most recent generation, and the motherboard
and RAM for it, may add to the upgrade costs. This is an example of
one of those. Socket is LGA1366 instead of LGA775 for the other one.
The extra pins support a direct memory interface.

Intel Core i7 920 Nehalem 2.66GHz 4*256KB L2 8MB L3 Cache LGA 1366 130W Quad $289
http://www.newegg.com/Product/Product.aspx?Item=N82E16819115202

   core       core       core       core        Core i7 is a single die
    |          |          |          |
 256KB L2   256KB L2   256KB L2   256KB L2
    |          |          |          |
   -+----------+----------+----------+-
  |               8MB L3               |<-----> triple channel memory
   -----------------+------------------         interface on processor
                    |                           (like AMD does it)
             LGA1366 FSB (used for I/O)

Using the HWBOT again... 14.5 seconds for SuperPI 1M (when the
entire data set could fit in L3). That is 14.5 seconds at 2.66GHz.

http://www.hwbot.org/listResults.do?cpuModelId=1741&applicationId=3

The SuperPI 32M is 12:45 at 2.66GHz, and ratio to P4 2.4Ghz is
58:59/12:45 = 3539/765 = 4.6x single threaded.

http://www.hwbot.org/listResults.do?cpuModelId=1741&applicationId=7

An E8400 is $165, and a motherboard with DDR2 memory makes for
a more reasonably priced alternative. It really depends on
what your budget is. The pricing is such, that buying low
end Intel platforms may not make much long term sense.
(You'd only be looking at upgrading again.)

As far as I know, all the current benchmarks on the Tom's Hardware charts
are multithreaded, intended to let the extra cores show their stuff.
It is too bad they don't try to be more balanced, and throw
in a less impressive speedup from a single threaded benchmark.
I've used SuperPI above, as an example of a single threaded one.

Paul
 
Hmm. Looking at a chart like this
http://www.cpubenchmark.net/common_cpus.html
gives the impression there are CPUs that are many times faster.
What I'm mostly looking at is rendering times for processing video such as
through VirtualDub and for creating DVDs. You feel I won't see "anywhere
near" 3x the speed? If that's correct maybe just maxing out the board
with a faster socket 478 CPU isn't such a bad idea.

A single benchmark cannot tell the entire story. Your computer is a system,
not just a CPU. Your use of the system includes the hardware, the software,
and the wetware (your input and control). Besides, the CPU benchmark alone
cannot tell a true story of actual performance. Also, even if you find a
reasonable overall benchmark for comparison, what is the setup time for a
typical job, vs the run time of the rendering app?

I have a Q9450 (2.66 GHz quad-core) and an E6850 (3.0 GHz dual-core) system.
Both are similarly configured -- Motherboard, FSB, RAM, HD, gfx. The chart
you cite shows the respective CPU Mark score of 3895 and 1814, or a
performance factor (PF) of 2.15x the "speed" for the Q9450. In the only
real-world, CPU-intensive, no-manual-intervention, fully multithreaded app I
run (Folding@Home SMP), the real PF is more like 1.4-1.5x in actual frame
times for similar Work Units. OTOH, the PF for Internet browsing is 1.0x --
there is nothing in the Q9450 that makes Internet browsing faster.

Where is a benchmark that uses your rendering app, or similar, as the
testbed? What does it show for a new system and a system similar to yours?
How much will those results change when you factor in your desire to retain
old HD, gfx, and other peripherals? Will the MoBo you choose on the basis
of PCI slot capability perform the same as a same-generation MoBo that is
optimized for current peripherals?

If your rendering apps are NOT fully multithreaded (i.e., cannot take full
advantage of 2 or 4 cores), do not scale linearly with added cores (VERY
common) and/or entail a significant amount of HD read/write, the CPU part of
the performance will be less significant. If you use the same HDs in your
new system, your HD R/W performance will not increase at all.
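
One common way to picture the "does not scale linearly" point is Amdahl's law, which is not something cited in this post but is the standard textbook model for it. The sketch below is only that model; the parallel fractions are hypothetical, not measurements of any rendering app.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: speedup when only part of the work can use extra cores."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Hypothetical parallel fractions, for illustration only.
for fraction in (0.0, 0.5, 0.9, 1.0):
    dual = amdahl_speedup(fraction, 2)
    quad = amdahl_speedup(fraction, 4)
    print(f"{fraction:.0%} parallel: {dual:.2f}x on 2 cores, {quad:.2f}x on 4 cores")
```

Under this model an app that is half serial gets 1.6x from a quad, nowhere near 4x, and time spent waiting on the HD behaves like extra serial work.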

If you don't want to go "cutting edge" and restrict yourself to a MoBo with
multiple USABLE PCI slots (looks like 1 of the 5 would be unusable on either
MoBo cited by Bob, once a gfx card is installed), likely the best you can do
for a reasonable cost is a Q9650 ($325 for CPU alone). Assuming you find a
MoBo that supports its full FSB and RAM specs, the CPU Mark scores for the
new and old CPUs are 4414 and 339, or a theoretical PF of 13x.

Given the real-world example above, you could expect 2/3 of that for the CPU
portion of a fully multithreaded app, or about 8.6x. Both the MoBos cited
by Bob are restricted to DDR2 800 RAM, so your memory bandwidth will be
restricted relative to the benchmark system right off the bat. IF the RAM
bandwidth scales linearly from 1066 to 800 and IF RAM bandwidth has a
similar weighting in overall performance, now you're down to a 6.5x PF.
Then, if your rendering app only can take advantage of 2 cores instead of 4,
you're down to a 3.2x PF. With both HD and gfx performance at par (no
increase), they will significantly reduce the overall PF.

FWIW, if you go for a more mainstream CPU like the Q6600 instead of the
Q9650, its CPU Mark score of 2851 would indicate a PF of 4.2 or 2.1 using
the above methodology (before HD and gfx input), instead of 6.5 or 3.2 for
the Q9650. None of this addresses setup time (your manual intervention to
get the rendering work ready to run), which also has an assumed PF of 1.0,
and could therefore be another significant factor in overall performance.
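
For anyone who wants to replay the estimate with other scores, here is a minimal sketch of the same chain of scalings. The 2/3 real-world factor, the 800/1066 RAM ratio, and the cores-used ratio are the assumptions stated above, and the function name is made up. Results differ slightly from the hand-rounded figures because nothing is rounded between steps.

```python
def estimate_pf(new_score, old_score, real_world=2 / 3,
                ram_scale=800 / 1066, cores_used=4, cores_total=4):
    """Scale a raw CPU Mark ratio down by the rough factors used in this post."""
    pf = new_score / old_score          # theoretical PF from the benchmark chart
    pf *= real_world                    # observed vs. benchmark, from the Folding@Home example
    pf *= ram_scale                     # DDR2 800 instead of 1066, assumed linear
    pf *= cores_used / cores_total      # app may only keep some of the cores busy
    return pf


OLD_SCORE = 339  # CPU Mark of the existing P4, as quoted above
for name, score in [("Q9650", 4414), ("Q6600", 2851)]:
    four = estimate_pf(score, OLD_SCORE)
    two = estimate_pf(score, OLD_SCORE, cores_used=2)
    print(f"{name}: {four:.1f}x using 4 cores, {two:.1f}x using 2 cores")
```

HD, gfx, and setup time are still left out here, exactly as in the text, so these remain upper bounds.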

While I admit my methodology is far from rigorous, it does give a reasonable
feel for how unreliable CPU benchmarks alone are for assessing performance
potential of a system.
 
....
What I'm mostly looking at is rendering times for processing video
such as through VirtualDub and for creating DVDs. You feel I
won't see "anywhere near" 3x the speed? If that's correct maybe
just maxing out the board with a faster socket 478 CPU isn't such
a bad idea.

No! Multiple core CPUs are the bomb. You will see a huge improvement
in performance when it counts. If I were you, I would go to a
VirtualDub USENET group or maybe a web forum. Ask users, they know
what hardware works best. The bigger the group, the more likely you
will get replies from techies (and some of them will know what they
are talking about).

By the way. I am real-life testing an SSD drive (OCZ Vertex) right
now. The numbers look very good (at least to me) for compressing and
decompressing archives. It feels very fast too. Now my Raptor is my
slow (haha) secondary hard drive.

On a 2 core 3 GHz CPU...

WinRAR 3.7... 1,270 KB/s

7zip (multithreading, 2 core CPU)
compressing, resulting...
speed... 3223
rating... 7500
decompressing, resulting...
speed... 27582
rating... 3117

Good luck and have fun.
 
John Doe said:
No! Multiple core CPUs are the bomb. You will see a huge improvement
in performance when it counts. If I were you, I would go to a
VirtualDub USENET group or maybe a web forum. Ask users, they know
what hardware works best. The bigger the group, the more likely you
will get replies from techies (and some of them will know what they
are talking about).

I just took a look at the author's web documentation. From what I can
garner, VirtualDub is not multithreaded -- it was NOT as of v1.67 -- and

"My policy on optimizations has been to try to make as many CPU- and
OS-specific optimizations dynamically dispatched. The minimum requirements
for VirtualDub 1.5.2 are an 80486 CPU and Windows 95, but the program
detects and uses features specific to Windows 98 and NT/2000/XP, and to CPUs
that support MMX, SSE, and SSE2."

Further:

"VirtualDub, although not really designed for multi-CPU systems, is
moderately multithreaded. On an average render it will use about 4-5 threads
for UI, reading from disk, processing, and writing to disk. Trying to keep
these threads busy is a challenge and to do so VirtualDub's rendering engine
is pipelined -- all of the stages attempt to work in parallel with queues
between them. The idea is that you add enough buffering between the
different threads that they are all working on different places in the
output and the stages only block on the single bottleneck within the system,
which is usually either disk (I/O) bandwidth or CPU power."
(http://www.virtualdub.org/blog/pivot/entry.php?id=31)
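
The pipelined design described in that quote can be sketched in a few lines. This is only an illustration of stages connected by bounded queues, not VirtualDub's actual code: the stage functions, the buffer size, and the "frame * 2" stand-in for processing are all made up.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the frame stream


def reader(frames, out_q):
    """Stage 1: 'read from disk' and hand frames to the next stage."""
    for frame in frames:
        out_q.put(frame)          # blocks when the buffer is full
    out_q.put(SENTINEL)


def processor(in_q, out_q):
    """Stage 2: stand-in for filtering and compression."""
    while True:
        frame = in_q.get()        # blocks when the buffer is empty
        if frame is SENTINEL:
            out_q.put(SENTINEL)
            break
        out_q.put(frame * 2)


def writer(in_q, sink):
    """Stage 3: 'write to disk' by appending to a list."""
    while True:
        frame = in_q.get()
        if frame is SENTINEL:
            break
        sink.append(frame)


def run_pipeline(frames, buffer_size=4):
    read_q = queue.Queue(maxsize=buffer_size)
    write_q = queue.Queue(maxsize=buffer_size)
    sink = []
    threads = [
        threading.Thread(target=reader, args=(frames, read_q)),
        threading.Thread(target=processor, args=(read_q, write_q)),
        threading.Thread(target=writer, args=(write_q, sink)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sink


print(run_pipeline(range(10)))
```

Because each queue is bounded, the fast stages end up waiting on the slowest one, which is the "block on the single bottleneck" behaviour the author describes. Note that there is only one processing thread here, so extra cores would not speed up that stage.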


So, while it will definitely benefit from the pipelining of a C2D or C2Q, I
don't think it will directly benefit from 2 or 4 cores, except to the extent
background apps are offloaded to other cores. Even more interesting is his
discussion of bottlenecks:

"If you are doing a high-bandwidth operation with Huffyuv or
uncompressed video and only a light amount of processing, you are almost
certainly going to be disk bound. This means that your CPU utilization is
actually going to drop well below 100% because reading and writing from the
disk is the bottleneck. At this point you largely don't care about MMX/SSE2
optimized code or other kinds of CPU-based performance optimizations,
because the only thing they'll do is make your CPU run cooler. Conversely,
if you're using a ton of video filters and your render is running at 1 fps,
it probably isn't so bad to have your video files stored over the network,
because the network will keep up just fine."


With a new MoBo and CPU, but old HD with slower RAM and I/O bus, you are
indeed likely to become "disk bound" as he describes. In that case, you
will quickly reach the HD I/O limit, and the extra CPU capability will be
wasted.
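
A crude way to see the disk-bound case is to treat the render rate as the minimum of what the disk and the CPU can each sustain. This is a simplification with hypothetical numbers (it lumps reads and writes together), not a measurement of any real drive or CPU.

```python
def render_fps(disk_mb_per_s, frame_mb, cpu_fps):
    """Return (frames per second, limiting resource) for a simple two-stage model."""
    disk_fps = disk_mb_per_s / frame_mb   # frames the disk can move per second
    if disk_fps < cpu_fps:
        return disk_fps, "disk"
    return cpu_fps, "CPU"


# Hypothetical numbers: same old HD (50 MB/s, 1 MB per frame), old vs. new CPU.
old = render_fps(disk_mb_per_s=50, frame_mb=1.0, cpu_fps=30)
new = render_fps(disk_mb_per_s=50, frame_mb=1.0, cpu_fps=120)
print(f"old system: {old[0]:.0f} fps, {old[1]} bound")
print(f"new system: {new[0]:.0f} fps, {new[1]} bound")
print(f"overall speedup: {new[0] / old[0]:.2f}x from a 4x faster CPU")
```

In this made-up case the CPU got 4x faster but the render only gained about 1.7x, because the old HD became the limit.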

So, a C2D is a good idea. A FAST dual-core with lots of cache (E8500) will
likely be better than a slower quad (Q6600) FOR THIS APP. The Phenom II
X940 may also work, but you're less likely to find an AM2+ MoBo with 5 PCI
slots. A fast RAID 0 HD array (e.g., a pair of VelociRaptors) will also
help a lot.
 
I still want to run XP - all my software works with it and I'd like to
stay with PCI slots, not PCI express so I can swap over hardware I've
already got. The more PCI slots the better - like 5 or more. Does such
an animal exist - i.e. Core 2 duo system with lots of PCI slots?

What the hell do you want to do that for? Either buy a motherboard with
onboard graphics or get a cheap PCI-e card. Everything else plugs into
PCI anyway.
 
JR Weiss said:
So, a C2D is a good idea. A FAST dual-core with lots of cache
(E8500) will likely be better than a slower quad (Q6600) FOR THIS
APP. The Phenom II X940 may also work, but you're less likely to
find an AM2+ MoBo with 5 PCI slots. A fast RAID 0 HD array (e.g.,
a pair of VelociRaptors) will also help a lot.

No! Solid State Disk SSD drive.
Like I mentioned in my reply.
 
muzician21 said:
Right now I'm running a 2.4 gig P4 on a Soyo Dragon mobo. I could
upgrade to a socket 478 3.4gig processor and get about a 30% bump in
speed which wouldn't be bad, but it's my understanding going to a Core
2 Duo chip I could see a much bigger increase.

I don't need to be on the cutting edge, what Core 2 chips should I look
for that would net me about 3x the speed of that 2.4gig P4? I'm
favoring Intel unless you feel there's a really compelling reason to
go with someone else.

I still want to run XP - all my software works with it and I'd like to
stay with PCI slots, not PCI express so I can swap over hardware I've
already got. The more PCI slots the better - like 5 or more. Does such
an animal exist - i.e. Core 2 duo system with lots of PCI slots?

Thanks for all input

Using the Newegg "advanced search" in the motherboard section, I can find
a couple boards with five PCI slots. There is still one PCI Express x16
for a good video card. The SuperMicro board has room for four sticks of
RAM. I wouldn't expect either board to be for overclocking, but maybe
that isn't important to you.

http://www.newegg.com/Product/Produ...0200280+1070509908+1073507583&Subcategory=280

Paul
 
John Doe said:
No! Solid State Disk SSD drive.
Like I mentioned in my reply.

....except that he's trying to save money by reusing components...

If he's willing to jump to an SSD (or a pair), he's likely willing to jump
to a new PCIe setup.

Also, for an app with lots of HD Read/Writes, an SSD may not be the best
solution. While a VERY expensive single-level cell (SLC) SSD may be faster
than a fast HD, it is doubtful that a multi-level cell (MLC) SSD will be as
fast where writes are predominant.
 
....
Also, for an app with lots of HD Read/Writes, an SSD may not be
the best solution. While a VERY expensive single-level cell (SLC) SSD
may be faster than a fast HD, it is doubtful that a multi-level cell
(MLC) SSD will be as fast where writes are predominant.

Are you reading the messages you are replying to?

Do you own or even use an SSD drive?

My OCZ Vertex MLC SSD drive appears to be faster in every way and
blazing faster in some ways, just like the hardware testing site
article (AnandTech I guess) shows. I am putting together a casual
list of real-world observations and maybe my impressions from a few
benchmarks.
 
John Doe said:
My OCZ Vertex MLC SSD drive appears to be faster in every way and
blazing faster in some ways, just like the hardware testing site
article (AnandTech I guess) shows. I am putting together a casual
list of real-world observations and maybe my impressions from a few
benchmarks.

Are you comparing it to a RAID 0 Velociraptor setup, or some other previous
setup you used?

While current reviews show very good speed for this particular new SSD (and
only the 120 GB version was tested in the review I saw, not the 250),
BenchmarkReviews.com cautions that the benchmark figures are not definitive
because they are not designed to test the SSD architecture. Still, if you
believe it is faster than your previous array, that is a good thing.

This comes at a current cost of about 4x per GB compared to a VelociRaptor.
Again, not likely an option for someone who is looking to reuse components
to save $$, and even less likely for someone who needs higher capacity.
 
I said:
My OCZ Vertex MLC SSD drive appears to be faster in every way...

I take that back. But I will find out... Not just to provide
information, but to tell whether the reviews on Newegg and the review
on AnandTech are worth anything. I have two drives, and can configure
Windows every which way. There must be a decent method for testing the
thing.
 
a core 2 chip
a pci express slot

... by no means would make you cutting edge. One can build a super
el-cheapo, bottom of the barrel system with the above.

--g
 
Isn't the cheapest Intel CPU that's called a "Core 2" about
$120? That seems a bit higher than bottom of barrel when
you can get a dual core Athlon x2 for $40.

Pricewatch has an Intel core 2 for $35. It really depends on his
definition, there is an Intel celeron for $5.

--g
 
geoff said:
Pricewatch has an Intel core 2 for $35. It really depends on his
definition, there is an Intel celeron for $5.

AFAIK, the latest Celerons are P4 architecture, not Core2.
 