Tony Hill
No it is not clear that they are valid. Suppose my theory is right.
The compiler is using extra MPU sockets as remote bandwidth to do
prefetch. Then this only works if you were stupid enough to buy a
system with more sockets and MPUs than you need. Why would you ever do
that?
Simple answer to your question: if it gives you a 40% increase in
floating-point performance for your high-cost application.
Seriously, if these results are real, I don't think your possible
explanation for the performance invalidates them. It does, however,
place a whole new caveat on the benchmark. Is a 4P Opteron server
that gets 3500 CFP and costs you $20,000 worth the price vs. a 1P
Power5+ system that gets 3000 CFP and costs $10,000?
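To put rough numbers on that question, here is a minimal sketch of the price/performance arithmetic. The scores and prices are just the hypothetical figures from the question above, not measured results, and the `dollars_per_cfp` helper is my own name for illustration.

```python
# Hypothetical figures from the question above; NOT measured results.
systems = {
    "4P Opteron": {"cfp": 3500, "price_usd": 20_000},
    "1P Power5+": {"cfp": 3000, "price_usd": 10_000},
}


def dollars_per_cfp(cfp, price_usd):
    """Cost in dollars for each point of CFP score."""
    return price_usd / cfp


for name, spec in systems.items():
    print(f"{name}: ${dollars_per_cfp(**spec):.2f} per CFP point")

# Relative speed of the faster box versus relative price.
speedup = systems["4P Opteron"]["cfp"] / systems["1P Power5+"]["cfp"]
price_ratio = systems["4P Opteron"]["price_usd"] / systems["1P Power5+"]["price_usd"]
print(f"speedup: {speedup:.2f}x for {price_ratio:.1f}x the price")
```

With these made-up numbers the Opteron box works out to about $5.71 per CFP point against about $3.33 for the Power5+, so you would be paying twice the price for roughly 17% more score. Whether that is worth it depends on how much the extra throughput is worth to your application.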
In any case, we definitely need more info to make any sort of final
judgment on this.