fast disks needed


steve

Hi,
I need to get some very fast disks for my web server. I have been
looking at setups with dual SCSI disks. Not knowing much about this
subject (yet), can you help me figure out the performance boost going
from IDE to SCSI? For example, one hosting company is providing
"73GB SCSI hard drives @ 10K RPM". How do these compare to 7200 RPM
IDEs?

Thanks in advance.
 
Previously steve said:
Hi,
I need to get some very fast disks for my web server. I have been
looking at setups with dual SCSI disks. Not knowing much about this
subject (yet), can you help me figure out the performance boost going
from IDE to SCSI? For example, one hosting company is providing
"73GB SCSI hard drives @ 10K RPM". How do these compare to 7200 RPM
IDEs?

Unless you expect to write a lot to these disks, you should get more
memory first. Main memory is several orders of magnitude faster than
disks. 12GB or more of main memory is possible today (of
course you have to use a real OS and not a toy as platform).

As to your question, you might get a factor of three or so in
access time, i.e. it will not really help if the slower
solution was already far too slow.

Arno
 
steve said:
Hi,
I need to get some very fast disks for my web server. I have been
looking at setups with dual SCSI disks. Not knowing much about this
subject (yet), can you help me figure out the performance boost going
from IDE to SCSI? For example, one hosting company is providing
"73GB SCSI hard drives @ 10K RPM". How do these compare to 7200 RPM
IDEs?

Are you sure that the disks are the bottleneck?

How fast is your internet connection and what are you doing with your site
that is disk-intensive?
 
Are you sure that the disks are the bottleneck?

How fast is your internet connection and what are you doing with your site
that is disk-intensive?

IME small servers are more likely to be IO-limited than memory
limited, and memory may be easier to add than adding disks or
reorganizing your partitions on the new spindles.

IMO adding hardware before you've learned to use the performance
monitoring tools your OS provides is futile. In NT/w2k/XP it's
perfmon.exe.
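On Linux, a minimal sketch of the same check (the tool names and the sample vmstat line below are my assumptions, not from the thread): the "b" (processes blocked on IO) and "wa" (iowait) columns of vmstat are the quickest way to see whether the disks are actually the bottleneck.

```shell
#!/bin/sh
# Sketch only: deciding from a vmstat sample whether the box is IO-bound.
# The sample line is hypothetical; on a live server run "vmstat 1" and
# read the real numbers.  Columns on a 2.6 kernel are:
#   r b swpd free buff cache si so bi bo in cs us sy id wa
sample="1 3 0 2968 2332 431408 0 0 480 52 310 640 9 4 40 47"

# Field 16 is "wa": percent of CPU time spent waiting on disk IO.
wa=$(echo "$sample" | awk '{print $16}')
if [ "$wa" -gt 20 ]; then
    echo "IO-bound: ${wa}% iowait"
else
    echo "not obviously IO-bound: ${wa}% iowait"
fi
```

If the sysstat package is installed, `iostat -x 1` breaks the same picture down per disk.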
 
How many hits per second are you getting? How are they configuring two drives
(RAID?)
Unless you expect to write a lot to these disks, you should get more
memory first. Main memory is several orders of magnitude faster than
disks. 12GB or more of main memory is possible today (of
course you have to use a real OS and not a toy as platform).
Nonsense from the resident Linux Troll.
As to your question, you might get a factor of three or so in
access time, i.e. it will not really help if the slower
solution was already far too slow.
Access time is 2X faster, but with command queuing 10K SCSI will be 3-4X
faster than 7200 IDE.
 
Al Dykes said:
 >> Hi,
 >> I need to get some very fast disks for my web server. I have been
 >> looking at setups with dual SCSI disks. Not knowing much about this
 >> subject (yet), can you help me figure out the performance boost going
 >> from IDE to SCSI? For example, one hosting company is providing
 >> "73GB SCSI hard drives @ 10K RPM". How do these compare to 7200 RPM
 >> IDEs?
 >> Thanks in advance.
 >>

IME small servers are more likely to be IO-limited than memory
limited, and memory may be easier to add than adding disks or
reorganizing your partitions on the new spindles.

IMO adding hardware before you've learned to use the performance
monitoring tools your OS provides is futile. In NT/w2k/XP it's
perfmon.exe.
--

a d y k e s @ p a n i x . c o m

Don't blame me. I voted for Gore.

This query is for remote hosting of a website, so bandwidth is not a
problem. It is running Linux. This may not be the exact newsgroup
for this query, but I have not gotten satisfactory answers from hosting
forums, so I figured the folks on this newsgroup could help me with
those perf. numbers.

In any case, the application is forums, so a lot of reads and fewer
writes. Also, in a remote environment, RAM is expensive. With very
large databases, there is a lag while the index is loaded into RAM
from disk, so I think SCSI is the answer. If I can get a threefold
improvement in disk access (per Arno), I would be very happy.

Can anyone else give me some idea of the SCSI disk performance
difference?
 
Al Dykes said:
 >> Hi,
 >> I need to get some very fast disks for my web server. I have been
 >> looking at setups with dual SCSI disks. Not knowing much about this
 >> subject (yet), can you help me figure out the performance boost going
 >> from IDE to SCSI? For example, one hosting company is providing
 >> "73GB SCSI hard drives @ 10K RPM". How do these compare to 7200 RPM
 >> IDEs?
 >> Thanks in advance.
 >>

IME small servers are more likely to be IO-limited than memory
limited, and memory may be easier to add than adding disks or
reorganizing your partitions on the new spindles.

IMO adding hardware before you've learned to use the performance
monitoring tools your OS provides is futile. In NT/w2k/XP it's
perfmon.exe.

Al, I am moving stuff to a new server, so no legacy problems.

The perf tool on *nix (top) shows that the CPU is running cool. Linux
grabs as much memory as it can, and I don't think that is the problem.
The IO, however, maxes out with large queries. So that's why I'm
looking for a better disk IO solution.
 
Al, I am moving stuff to a new server, so no legacy problems.

The perf tool on *nix (top) shows that the CPU is running cool. Linux
grabs as much memory as it can, and I don't think that is the problem.
The IO, however, maxes out with large queries. So that's why I'm
looking for a better disk IO solution.

As I said, most slow servers are disk-bound. I agree.

Measuring memory is tricky because, as you say, the server will grab
all it can, but there is a point of diminishing returns and your
system may have reached it. It's nice to have "too much". Finding
and eliminating bottlenecks is an art.

Go SCSI.
 
steve said:
This query is for remote hosting of a website, so bandwidth is not a
problem. It is running Linux. This may not be the exact newsgroup
for this query, but I have not gotten satisfactory answers from hosting
forums, so I figured the folks on this newsgroup could help me with
those perf. numbers.

In any case, the application is forums, so a lot of reads and fewer
writes.

.... so more RAM could probably help, as perhaps could some restructuring
of the software to avoid unnecessary IO.

Also, in a remote environment, RAM is expensive. With very
large databases, there is a lag while the index is loaded into RAM
from disk, so I think SCSI is the answer. If I can get a threefold
improvement in disk access (per Arno), I would be very happy.

If you're that close to saturation, tripled disk performance will
probably be only a short-term fix; it would probably be better to
segment the task or otherwise restructure.
 
steve said:
This query is for remote hosting of a website, so bandwidth is not a
problem. It is running Linux. This may not be the exact newsgroup
for this query, but I have not gotten satisfactory answers from hosting
forums, so I figured the folks on this newsgroup could help me with
those perf. numbers.

In any case, the application is forums, so a lot of reads and fewer
writes. Also, in a remote environment, RAM is expensive. With very
large databases, there is a lag while the index is loaded into RAM
from disk, so I think SCSI is the answer. If I can get a threefold
improvement in disk access (per Arno), I would be very happy.

Can anyone else give me some idea of the SCSI disk performance
difference?
I have never seen drive benchmarking done on Linux. Results from Windows
Server should be comparable.

There are several 10K SATA vs SCSI comparisons. The best is from
xbitlabs.com: http://www.xbitlabs.com/articles/storage/display/wd740gd.html

Look at the database and web-server results with queue depths of 4/16/64.
SCSI wins by good margins, except for some write heavy loads.

Nobody compares 7200 ATA against 10K SCSI. If you need to know, go to
storagereview.com and compare those drives in the database. Or ask in
comp.periphs.scsi, experienced UNIX people hang out there.
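For a rough in-place number before committing to new hardware, a sequential-throughput sanity check can be sketched with dd. This is my own sketch, not something from the benchmarks above: a scratch file stands in so the commands are self-contained; on the real box you would read the raw device (e.g. /dev/sda), and note that without dropping the page cache the read-back mostly measures cached reads, so treat it as an upper bound. It also says nothing about the seek and command-queuing behaviour that matters most for a database load.

```shell
#!/bin/sh
# Sketch: crude sequential read check with dd.  Streaming throughput
# only, not access time; cached reads make the result optimistic.
scratch=/tmp/dd-scratch.$$
dd if=/dev/zero of="$scratch" bs=1M count=32 2>/dev/null
sync
# Read it back; dd reports bytes and throughput on its last line.
report=$(dd if="$scratch" of=/dev/null bs=1M 2>&1 | tail -n 1)
echo "$report"
rm -f "$scratch"
```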
 
Previously Al Dykes said:
Al Dykes said:
steve wrote:
[...]
Al, I am moving stuff to a new server, so no legacy problems.

The perf tool on *nix (top) shows that the CPU is running cool. Linux
grabs as much memory as it can, and I don't think that is the problem.
The IO, however, maxes out with large queries. So that's why I'm
looking for a better disk IO solution.
As I said, most slow servers are disk-bound. I agree.

O.K., definitely a disk-bottleneck.
Measuring memory is tricky because, as you say, the server will grab
all it can, but there is a point of diminishing returns and your
system may have reached it. It's nice to have "too much".
Finding and eliminating bottlenecks is an art.

Linux always grabs all memory and uses it for the buffer-cache. The
real usage numbers are displayed by "free":

wagner@gate:~$ free
             total       used       free     shared    buffers     cached
Mem:        507556     504588       2968          0       2332     431408
-/+ buffers/cache:      70848     436708
Swap:            0          0          0

The -/+ line tells the story with the buffer-cache usage factored out.
The memory for the buffer-cache is dynamically adjusted and
basically uses whatever is unused (although sometimes that does not
work quite correctly).

Also in "top" you have VIRT, RES and SHR numbers for memory per
process. VIRT is the complete size, RES is the physical RAM in use
without SHR. It gets more complicated, since SHR is the RAM that is
shared with other processes (libraries, code from parents, etc.) and
is counted only once. In addition, VIRT shows all allocated memory.
Unused memory does not reside in RAM or on swap.

Example:

top - 13:47:53 up 239 days, 12:51,  1 user,  load average: 1.12, 0.85, 0.54
Tasks:  60 total,   1 running,  59 sleeping,   0 stopped,   0 zombie
Cpu(s):  9.7% us,  7.6% sy, 21.9% ni, 58.9% id,  0.5% wa,  0.3% hi,  1.1% si
Mem:    507556k total,   504164k used,     3392k free,     3704k buffers
Swap:        0k total,        0k used,        0k free,   430188k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM   TIME+  COMMAND
 8229 wagner    15   0  2056  940 1844 R  1.9  0.2  0:00.02 top
    1 root      16   0  1500  460 1344 S  0.0  0.1  1:24.40 init

'top' is 2056kB big (process area), uses 940kB of RAM just for itself,
and uses 1844kB together with other processes. The SHR is not part
of the VIRT number, since the SHRed memory does not belong to a specific
process. Obviously top only uses its RESident memory, since
swap usage is zero.

Also you can add/remove fields in top with 'f'. While it is possible to
get a summary from 'top', using 'free' is a bit easier ;-)
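As a throwaway sketch of my own (the here-doc just replays the output shown above; on a live box you would pipe free straight into the same awk), the -/+ line can be pulled out like this:

```shell
#!/bin/sh
# Sketch: extract the "real" usage from free's -/+ line.  The here-doc
# replays captured output; live usage would be:  free | awk ...
free_output='             total       used       free     shared    buffers     cached
Mem:        507556     504588       2968          0       2332     431408
-/+ buffers/cache:      70848     436708
Swap:            0          0          0'

# On the -/+ line, field 3 is used-minus-cache, field 4 is free-plus-cache.
really_used=$(echo "$free_output" | awk '/buffers\/cache/ {print $3}')
really_free=$(echo "$free_output" | awk '/buffers\/cache/ {print $4}')
echo "really used: ${really_used} kB, really free: ${really_free} kB"
```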

I agree. It will be faster. How much is impossible to tell without
realistic benchmarks for your application. Also get the maximum
amount of memory you can. You might also try a newer kernel. Quite
a few things have happened to the buffer-cache. 2.6.11 is pretty
stable. I had problems with 2.6.8 ... 2.6.10.
2.6.7 was also pretty O.K.

Arno
 