I want to build a 1.5TB storage array for MythTV

  • Thread starter: Yeechang Lee

Yeechang Lee

Recently ran into the account of a guy who built his own 1.2TB
RAID50-based storage array for $1600 (see
<URL:http://www.finnie.org/terabyte/>). I really like the idea and
have been thinking about following suit.

Like Finnie, I want to be able to store huge amounts of DivX/Xvid
files online. In addition to the storage array, I also plan to build a
separate MythTV (<URL:http://www.mythtv.org>) box, which among other
things will let me play them at will. My 200GB Series 1 TiVo's been
serving me well for more than four years, but I really like the idea
of being able to seamlessly integrate my AVI collection with TV
recordings, and from what I gather MythTV has finally matured enough
to be a realistic TiVo alternative.

I have been 100% Linux at home for almost a decade and am quite
comfortable with most of the technical aspects of the project.

I'm planning on making the following changes to Finnie's build
configuration:

* Instead of 200GB ATA, use 250GB SATA drives for a total of
1.5TB. Outpost.com offers a Western Digital 250GB SATA drive for
$170 (<URL:http://shop1.outpost.com/product/3868597>). I just missed
the chance to get a $30 rebate off each drive, but I'm sure
Fatwallet will alert me to a similar opportunity sooner or later.
* Accordingly, get a HighPoint SATA RAID card instead of the specified
RocketRAID 454 ATA RAID card. I think the RocketRAID 1640
(<URL:http://www.newegg.com/app/SearchProductResult.asp?Submit=Go&DEPA=0>)
is the way to go.
* Instead of ext3, use XFS as the file system.
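
For reference, the capacity figures above can be checked with a little arithmetic. This is only a sketch: it assumes the array is eight drives arranged as two four-drive RAID 5 sets striped together with RAID 0, which is consistent with the 1.2TB and 1.5TB totals but is not spelled out in the post. The function name is made up for illustration.

```python
def raid50_usable_gb(drive_gb, drives_per_set, sets):
    """Usable capacity of a RAID 50 array in GB.

    Each RAID 5 set gives up one drive's worth of space to parity;
    striping the sets with RAID 0 adds their capacities together.
    """
    if drives_per_set < 3:
        raise ValueError("RAID 5 needs at least three drives per set")
    return (drives_per_set - 1) * drive_gb * sets


# Assumed layout: eight drives as two four-drive RAID 5 sets.
print(raid50_usable_gb(200, 4, 2))  # 200GB drives
print(raid50_usable_gb(250, 4, 2))  # 250GB drives
```

With that layout, two of the eight drives' worth of space goes to parity, so only six drives count toward usable capacity.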

My questions:
* If I connect the storage array to my Linksys WRT54G router, will
100Mbps Ethernet be fast enough to pump the AVI files to the MythTV
box without dropping frames?
* Conversely, will 100Mbps Ethernet be sufficient to let me use the
storage array as the primary storage medium for MythTV's recordings?
What about HDTV encodings (using the pcHDTV Linux-only card)? Or do
I have to upgrade to a Gigabit Ethernet router? Or would the encoder
card and MythTV software have to run on the storage array itself in
order to achieve acceptable performance? (Actually, I'm not opposed
to doing so, if one box can simultaneously handle both storage and
MythTV tasks.)
* Anything else that I'm missing or should keep in mind?
 
Yeechang said:
Recently ran into the account of a guy who built his own 1.2TB
RAID50-based storage array for $1600 (see
<URL:http://www.finnie.org/terabyte/>). I really like the idea and
have been thinking about following suit.

Like Finnie, I want to be able to store huge amounts of DivX/Xvid
files online. In addition to the storage array, I also plan to build a
separate MythTV (<URL:http://www.mythtv.org>) box, which among other
things will let me play them at will. My 200GB Series 1 TiVo's been
serving me well for more than four years, but I really like the idea
of being able to seamlessly integrate my AVI collection with TV
recordings, and from what I gather MythTV has finally matured enough
to be a realistic TiVo alternative.

I have been 100% Linux at home for almost a decade and am quite
comfortable with most of the technical aspects of the project.

I'm planning on making the following changes to Finnie's build
configuration:

* Instead of 200GB ATA, use 250GB SATA drives for a total of
1.5TB. Outpost.com offers a Western Digital 250GB SATA drive for
$170 (<URL:http://shop1.outpost.com/product/3868597>). I just missed
the chance to get a $30 rebate off each drive, but I'm sure
Fatwallet will alert me to a similar opportunity sooner or later.
* Accordingly, get a HighPoint SATA RAID card instead of the specified
RocketRAID 454 ATA RAID card. I think the RocketRAID 1640
(<URL:http://www.newegg.com/app/SearchProductResult.asp?Submit=Go&DEPA=0>)
is the way to go.
* Instead of ext3, use XFS as the file system.

My questions:
* If I connect the storage array to my Linksys WRT54G router, will
100Mbps Ethernet be fast enough to pump the AVI files to the MythTV
box without dropping frames?
* Conversely, will 100Mbps Ethernet be sufficient to let me use the
storage array as the primary storage medium for MythTV's recordings?
What about HDTV encodings (using the pcHDTV Linux-only card)? Or do
I have to upgrade to a Gigabit Ethernet router? Or would the encoder
card and MythTV software have to run on the storage array itself in
order to achieve acceptable performance? (Actually, I'm not opposed
to doing so, if one box can simultaneously handle both storage and
MythTV tasks.)
* Anything else that I'm missing or should keep in mind?

I don't know much about MythTV, but I did some testing with Gigabit
Ethernet. The results are here:

http://somacon.com/docs/gigabitnas.html

Highlights:------------------------
TEST: Repeated disk-to-disk 150 MB file transfer via FTP.
Connected via Gigabit Switch to Athlon XP 2500+.

P3 @ 500 MHz, tg3 driver = 27 MB/sec

P3 @ 850 MHz, tg3 driver = 33 MB/sec

P3 @ 850 MHz, smc95x2 driver = 36 MB/sec

P3 @ 1000 MHz, smc95x2 driver = 36 MB/sec

---------------------------------------

With Gigabit Ethernet, maximum transfer speed is affected by the CPU
speed. The CPU burden is likely introduced by the TCP stack and
network driver. By comparison, current hardware will easily top out a
100Mbit network, with a rate of 11 MB/sec.
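
To put those numbers in perspective, here is a rough sketch that expresses each measured rate as a fraction of the raw line rate. It assumes decimal megabytes and ignores Ethernet, IP and TCP framing overhead, so the percentages are approximate. The function names are illustrative only.

```python
def line_rate_mb_per_s(link_mbps):
    """Raw line rate in decimal megabytes/sec, ignoring protocol overhead."""
    return link_mbps / 8


def utilization(measured_mb_per_s, link_mbps):
    """Fraction of the raw line rate that a measured transfer achieved."""
    return measured_mb_per_s / line_rate_mb_per_s(link_mbps)


# Measured FTP rates from the highlights above (MB/sec).
gigabit_results = {
    "P3 500 MHz, tg3": 27,
    "P3 850 MHz, tg3": 33,
    "P3 850 MHz, smc95x2": 36,
    "P3 1000 MHz, smc95x2": 36,
}

for label, rate in gigabit_results.items():
    print(f"{label}: {utilization(rate, 1000):.0%} of gigabit line rate")

print(f"100Mbit at 11 MB/sec: {utilization(11, 100):.0%} of line rate")
```

The gigabit transfers use well under a third of the link, while the 100Mbit case runs close to the wire limit.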
 
Yeechang said:
Recently ran into the account of a guy who built his own 1.2TB
RAID50-based storage array for $1600 (see
<URL:http://www.finnie.org/terabyte/>). I really like the idea and
have been thinking about following suit.

Like Finnie, I want to be able to store huge amounts of DivX/Xvid
files online. In addition to the storage array, I also plan to build a
separate MythTV (<URL:http://www.mythtv.org>) box, which among other
things will let me play them at will. My 200GB Series 1 TiVo's been
serving me well for more than four years, but I really like the idea
of being able to seamlessly integrate my AVI collection with TV
recordings, and from what I gather MythTV has finally matured enough
to be a realistic TiVo alternative.

I have been 100% Linux at home for almost a decade and am quite
comfortable with most of the technical aspects of the project.

I'm planning on making the following changes to Finnie's build
configuration:

* Instead of 200GB ATA, use 250GB SATA drives for a total of
1.5TB. Outpost.com offers a Western Digital 250GB SATA drive for
$170 (<URL:http://shop1.outpost.com/product/3868597>). I just missed
the chance to get a $30 rebate off each drive, but I'm sure
Fatwallet will alert me to a similar opportunity sooner or later.

You might want to check the CompUSA site--they were having a deal for 200
gig drives for about 90 bucks counting a mail-in rebate, but I don't know
if it's still on.
* Accordingly, get a HighPoint SATA RAID card instead of the specified
RocketRAID 454 ATA RAID card. I think the RocketRAID 1640

Personally for a RAID that size I'd go for a 3Ware or LSI Logic. No
particular reason, just that I'm used to terabytes being mainframe
territory and I get nervous with consumer RAID controllers trying to handle
that much data.
(<URL:http://www.newegg.com/app/SearchProductResult.asp?Submit=Go&DEPA=0>)
is the way to go.
* Instead of ext3, use XFS as the file system.

My questions:
* If I connect the storage array to my Linksys WRT54G router, will
100Mbps Ethernet be fast enough to pump the AVI files to the MythTV
box without dropping frames?
Yes.

* Conversely, will 100Mbps Ethernet be sufficient to let me use the
storage array as the primary storage medium for MythTV's recordings?
Yes.

What about HDTV encodings (using the pcHDTV Linux-only card)?
Yes.

Or do
I have to upgrade to a Gigabit Ethernet router?
No.

Or would the encoder
card and MythTV software have to run on the storage array itself in
order to achieve acceptable performance? (Actually, I'm not opposed
to doing so, if one box can simultaneously handle both storage and
MythTV tasks.)
* Anything else that I'm missing or should keep in mind?

HDTV is carried on two adjacent 6 MHz channels, for an aggregate of 12
analog MHz. I don't recall the actual bit rate but it's not very high.
There's no reason that any other kind of TV would need more than that
unless you're storing files in an uncompressed or lossless-compressed form.
 
Yeechang Lee wrote:

* Instead of 200GB ATA, use 250GB SATA drives for a total of
1.5TB. Outpost.com offers a Western Digital 250GB SATA drive for
$170 (<URL:http://shop1.outpost.com/product/3868597>). I just missed
the chance to get a $30 rebate off each drive, but I'm sure
Fatwallet will alert me to a similar opportunity sooner or later.

Generally rebates are limited to one per SKU per household. Be sure to
check the details.

* Anything else that I'm missing or should keep in mind?

I'm doing a cheap-o scaled down version of what you describe. As soon
as my drives come in I'll let you know. I have a feeling that the
latency will be increased when seeking through a recording, but other
than that might be OK.


-WD
 
Will said:
I have a feeling that the latency will be increased when seeking
through a recording, but other than that might be OK.

That's something I hadn't even considered. On my TiVo seeks are
lightning-fast; I can't believe any network-based system will be
quite so quick. (On the other hand, even over a wireless link and
Samba I've found seeks using mplayer to be more or less acceptable, so
a storage array on much newer equipment might not be so bad at all.) I
wonder if MythTV has the ability to locally cache recordings?
 
Yeechang said:
I wonder if MythTV has the ability to locally cache recordings?

Not that I'm aware of. I haven't looked into it too much, but I think
it would be nice to record to and watch from the local disk, with the
recordings offloaded to the NAS for storage after a certain amount of
time.

I know there is a built-in option to transcode files after an amount of
time, but I wonder if it would be possible to have that same function
just do a simple transfer without re-encoding.
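
A plain transfer like that could be sketched outside MythTV as a small script run from cron. This is not a MythTV feature: the function name and the example paths are hypothetical, and the script only moves files. It does not tell MythTV where the recordings went, so that part would still have to be worked out.

```python
import os
import shutil
import time


def offload_old_files(local_dir, nas_dir, max_age_days, now=None):
    """Move regular files older than max_age_days from local_dir to nas_dir.

    Files are moved as-is (no re-encoding). Returns the moved file names.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    moved = []
    for name in sorted(os.listdir(local_dir)):
        src = os.path.join(local_dir, name)
        # Age is judged by last-modification time.
        if os.path.isfile(src) and os.path.getmtime(src) < cutoff:
            shutil.move(src, os.path.join(nas_dir, name))
            moved.append(name)
    return moved


# Example call with made-up paths:
# offload_old_files("/var/video/local", "/mnt/nas/video", max_age_days=7)
```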

I'm sure once I get my NAS set up, I'll be looking deeper into it!


-WD
 
* If I connect the storage array to my Linksys WRT54G router, will
100Mbps Ethernet be fast enough to pump the AVI files to the MythTV
box without dropping frames?
* Conversely, will 100Mbps Ethernet be sufficient to let me use the
storage array as the primary storage medium for MythTV's recordings?
What about HDTV encodings (using the pcHDTV Linux-only card)? Or do
I have to upgrade to a Gigabit Ethernet router? Or would the encoder
card and MythTV software have to run on the storage array itself in
order to achieve acceptable performance? (Actually, I'm not opposed
to doing so, if one box can simultaneously handle both storage and
MythTV tasks.)

Seems to me your architecture has just one primary client system (the
MythTV box). So why not stuff an extra gigabit NIC in the MythTV box
and the storage box and connect them with a cross-over cable? A quick
peek suggests that gigabit NICs will cost from about $30...

(Keep the regular network for everything else, of course).

When/if you want to expand the gigabit network, just add a switch...

Malc.
 
J. Clarke wrote in alt.video.ptv.tivo:
HDTV is carried on two adjacent 6 MHz channels, for an aggregate of 12
analog MHz.

No, HDTV (or any form of ATSC digital TV) in the US uses just one 6MHz
channel.
I don't recall the actual bit rate but it's not very high.

19.3Mbps at most. That's 2.5MB/sec. It's easily carried in real time
over 100Mbps Ethernet.
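
A quick way to sanity-check that claim, as a sketch only: it takes the 19.3Mbps figure from this post and assumes, arbitrarily, that about 80% of the nominal link speed is usable after protocol overhead and other traffic. The function names and the 80% figure are assumptions.

```python
ATSC_MBPS = 19.3  # maximum ATSC bitrate quoted in the post


def megabytes_per_second(mbps):
    """Convert megabits/sec to decimal megabytes/sec."""
    return mbps / 8


def concurrent_streams(link_mbps, stream_mbps, usable_fraction=0.8):
    """Whole streams that fit if only usable_fraction of the link is available."""
    return int(link_mbps * usable_fraction // stream_mbps)


print(f"{megabytes_per_second(ATSC_MBPS):.2f} MB/sec per stream")
print(concurrent_streams(100, ATSC_MBPS), "HD streams on 100Mbps Ethernet")
```

Under those assumptions a single HD stream uses roughly a fifth of the link, and several fit at once.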
 
Shailesh Humbad said:
Yeechang Lee wrote:

I don't know much about MythTV, but I did some testing with Gigabit
Ethernet. The results are here:

http://somacon.com/docs/gigabitnas.html

Highlights:------------------------
TEST: Repeated disk-to-disk 150 MB file transfer via FTP.
Connected via Gigabit Switch to Athlon XP 2500+.

P3 @ 500 MHz, tg3 driver = 27 MB/sec

P3 @ 850 MHz, tg3 driver = 33 MB/sec

P3 @ 850 MHz, smc95x2 driver = 36 MB/sec

P3 @ 1000 MHz, smc95x2 driver = 36 MB/sec

---------------------------------------

With Gigabit Ethernet, maximum transfer speed is affected by the CPU
speed. The CPU burden is likely introduced by the TCP stack and
network driver. By comparison, current hardware will easily top out a
100Mbit network, with a rate of 11 MB/sec.

Depending on your workload, it is affected by so many things: CPU,
memory, backplane, drivers, even the file system and how NFS (in
our tests) is configured.

Using dual 2.4GHz Xeons to drive two 3ware cards, each with an
eight-disk array in a 6-1-1 RAID 5 config, and with the two arrays
mirrored as RAID 1 (~2.8TB), on the 2.6.7 Linux kernel with XFS, I am
up to ~55MB/sec in our prelim testing. We got slightly better read
speed with JFS, but XFS wrote faster.

We have done things ranging from changing NFS parameters and
tweaking the Intel Ethernet drivers:
http://support.intel.com/support/network/sb/CS-009209.htm
to changing memory buffers. I am doing some research now
on tweaking flush buffers.

You can get rather obsessed with this. :)
 
Jeff said:
J. Clarke wrote in alt.video.ptv.tivo:

No, HDTV (or any form of ATSC digital TV) in the US uses just one 6MHz
channel.

You're right--I don't know where I got the two adjacent channels--maybe that
was something proposed early on.
 
Malcolm said:
Seems to me your architecture has just one primary client system (the
MythTV box). So why not stuff an extra gigabit NIC in the MythTV box
and the storage box and connect them with a cross-over cable? A quick
peek suggests that gigabit NICs will cost from about $30...

Why bother?
 
J. Clarke said:
HDTV is carried on two adjacent 6 MHz channels, for an aggregate of 12
analog MHz. I don't recall the actual bit rate but it's not very high.

It's one channel, not two. http://www.atsc.org/document_map/video.htm
-Joe

ATSC Standard A/53: Digital Television Standard
5. SYSTEM OVERVIEW

The Digital Television Standard describes a system designed to transmit
high quality video and audio and ancillary data over a single 6 MHz
channel. The system can deliver reliably about 19 Mbps of throughput in
a 6 MHz terrestrial broadcasting channel (8VSB encoding) and about 38 Mbps
of throughput in a 6 MHz cable television channel.
 
J. Clarke wrote in alt.video.ptv.tivo:

No, HDTV (or any form of ATSC digital TV) in the US uses just one 6MHz
channel.


19.3Mbps at most. That's 2.5MB/sec. It's easily carried in real time
over 100Mbps Ethernet.

And if my math skills haven't failed me... around 8.5 or
8.75 GB/hr. (Call it 9 for planning purposes.)
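
The math can be spelled out as follows. This sketch assumes a constant 19.3Mbps stream and decimal units (1GB = 1000MB), and uses the 1.5TB array size from the start of the thread. The function names are illustrative.

```python
def gb_per_hour(mbps):
    """Decimal gigabytes written per hour at a constant bitrate in megabits/sec."""
    return mbps / 8 * 3600 / 1000


def hours_of_storage(capacity_gb, mbps):
    """Hours of recording that fit in capacity_gb at the given bitrate."""
    return capacity_gb / gb_per_hour(mbps)


print(f"{gb_per_hour(19.3):.2f} GB/hr at 19.3Mbps")
print(f"{hours_of_storage(1500, 19.3):.0f} hours on a 1.5TB array")
```

That works out to roughly 8.7 GB per hour, so rounding up to 9 for planning leaves a little margin.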
 
Malcolm Weir wrote:

Why bother?

'cos that way you don't *need* to know what else is going on.

The recommendation was, and remains, one that will address a stated
concern. It is, likely, overkill, but at $30 per card, who cares?

Now, if you want to claim that you *know* the amount of traffic on the
existing 100BaseT network, and therefore you *know* that there is
sufficient excess bandwidth for the new application, be my guest...
but I'd be curious *how* you know that!

Malc.
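
One way to find out how busy the existing network is would be to sample the kernel's interface byte counters on a Linux box. The sketch below reads /proc/net/dev twice and reports the receive and transmit rates. The interface name "eth0" and the five-second interval are assumptions, and this only measures traffic seen by the one machine it runs on.

```python
import time


def parse_net_dev(text):
    """Return {interface: (rx_bytes, tx_bytes)} from /proc/net/dev contents."""
    counters = {}
    for line in text.splitlines()[2:]:  # the first two lines are headers
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        # Field 0 is received bytes, field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def mbps_between(before, after, seconds):
    """(rx, tx) rates in megabits/sec between two (rx_bytes, tx_bytes) samples."""
    rx = (after[0] - before[0]) * 8 / seconds / 1_000_000
    tx = (after[1] - before[1]) * 8 / seconds / 1_000_000
    return rx, tx


def sample(interface="eth0", seconds=5):
    """Measure traffic on one interface of a Linux host over a short interval."""
    with open("/proc/net/dev") as f:
        first = parse_net_dev(f.read())[interface]
    time.sleep(seconds)
    with open("/proc/net/dev") as f:
        second = parse_net_dev(f.read())[interface]
    return mbps_between(first, second, seconds)


# Example: rx_mbps, tx_mbps = sample("eth0", 5)
```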
 
Malcolm said:
'cos that way you don't *need* to know what else is going on.

The recommendation was, and remains, one that will address a stated
concern. It is, likely, overkill, but at $30 per card, who cares?

Now, if you want to claim that you *know* the amount of traffic on the
existing 100BaseT network, and therefore you *know* that there is
sufficient excess bandwidth for the new application, be my guest...
but I'd be curious *how* you know that!

So how much traffic does a network that somebody has in his house usually
have?
 
Shailesh Humbad said:
I don't know much about MythTV, but I did some testing with Gigabit
Ethernet. The results are here:

http://somacon.com/docs/gigabitnas.html
With Gigabit Ethernet, maximum transfer speed is affected by the CPU speed.

Not exactly. Consider that maximum full-duplex throughput of a gigabit ethernet
card is 200 mega_bytes_ per second. Put this on a standard 33MHz PCI
bus (132 megabytes/second peak), and you can see that the speed of the
CPU doesn't matter from the standpoint of getting maximum throughput,
especially considering you may be using some of that available PCI
bandwidth for disk I/O.
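
The bus comparison can be sketched numerically. By raw line rate, gigabit Ethernet carries 125 megabytes/second in each direction, and a 32-bit PCI bus clocked at 33MHz peaks at 132 megabytes/second shared among all devices. The helper names are made up, and real-world throughput on both will be lower than these theoretical peaks.

```python
def pci_peak_mb_per_s(bus_width_bits=32, clock_mhz=33):
    """Theoretical peak of a parallel PCI bus: bytes per clock times clock rate."""
    return bus_width_bits / 8 * clock_mhz


def gigabit_mb_per_s(full_duplex=True):
    """Raw gigabit Ethernet line rate in decimal megabytes/sec."""
    one_way = 1000 / 8
    return one_way * 2 if full_duplex else one_way


pci = pci_peak_mb_per_s()
print(f"PCI 32-bit/33MHz peak: {pci:.0f} MB/sec (shared by NIC and disk controller)")
print(f"Gigabit one-way: {gigabit_mb_per_s(False):.0f} MB/sec")
print(f"Gigabit full duplex: {gigabit_mb_per_s(True):.0f} MB/sec")
print("Bus is the ceiling for full duplex:", gigabit_mb_per_s(True) > pci)
```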

FTP is hardly a useful benchmark for determining the peak ethernet
performance anyway. Try with something lower overhead like rcp.
 
Scott said:
FTP is hardly a useful benchmark for determining the peak ethernet
performance anyway. Try with something lower overhead like rcp.

Huh? Once you've gotten past the dialog that starts the transfer,
ftp has no level 3 overhead: it sends just data bytes until EOF.

-Joe
 
J. Clarke said:
You might want to check the CompUSA site--they were having a deal for 200
gig drives for about 90 bucks counting a mail-in rebate, but I don't know
if it's still on.

The last time I saw one of these rebates, it was limited to one per
household unfortunately.
Personally for a RAID that size I'd go for a 3Ware or LSI Logic. No
particular reason, just that I'm used to terabytes being mainframe
territory and I get nervous with consumer RAID controllers trying to handle
that much data.

I think, from the link he mentioned, that he intends to use the Linux
md driver (software RAID). Most of those cheap RAID cards are not
hardware RAID at all. They advertise it falsely and give you a Windows
driver that does software RAID.

Nothing is worse than cheap RAID. If you're not using something solid,
even if it does really do hardware RAID, your best bet is software
RAID. The Linux software RAID is better-tested and more widely used
(by people who will notice if something went wrong and get it fixed or
at least complain) than any of the cheapo RAID or fakeraid cards.

Bigger issues than the size of the RAID group are things like, what
happens if a drive fails? This can be very hard to test without
special HD firmware made for this purpose. I've seen cheap hardware
RAIDs where yanking the drive out seems to fail it fine, but when a
drive fails for real, the errors get passed through to the OS.

Or how does the RAID behave when it runs into bad blocks on a "good"
drive while resyncing onto a hot spare or a replaced disk? I've seen
even hugely expensive "enterprise" solutions fail miserably here.

Or how does the RAID behave when it sees a double-disk failure? Even
some high-end solutions give you no way to recover. Obviously if both
drives have really failed, you're screwed. But what if a power
connection gets knocked loose?
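
For anyone wondering why a true double-disk failure is fatal to RAID 5, here is a toy illustration of XOR parity. It is a simplified sketch with one stripe of three data blocks and one parity block, not how the md driver or any controller is actually implemented.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks plus their parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# One block lost: XOR of the survivors and the parity restores it.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])

# Two blocks lost: the survivors only yield the XOR of the two missing
# blocks, which is not enough to recover either one.
leftover = xor_blocks([data[0], parity])
print(leftover == xor_blocks([data[1], data[2]]))
```

A drive that merely lost power still holds its blocks, which is why that case can be recoverable if the RAID implementation lets you bring the disk back.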
 
([email protected]) wrote in alt.video.ptv.tivo:
Or how does the RAID behave when it sees a double-disk failure? Even
some high-end solutions give you no way to recover. Obviously if both
drives have really failed, you're screwed. But what if a power
connection gets knocked loose?

I had this exact situation with Linux software RAID-5 (2.6 kernel), and
a simple remove and re-add of the drives caused the re-build to start, and
I lost no data.

I'm trying to figure out how to get a spare onto the array to avoid even
this sort of problem, but although the software can do it, I don't have any
more room in the case.
 
The last time I saw one of these rebates, it was limited to one per
household unfortunately.

You may be right; hopefully I will be able to find a bona fide price
cut soon. As mentioned, even $170 (the pre-rebate price) isn't bad.
I think, from the link he mentioned, that he intends to use the Linux
md driver (software RAID). Most of those cheap RAID cards are not
hardware RAID at all.

I definitely get the sense through research that modern Linux software
RAID is superior to the consumer-grade RAID cards. That said, if I
follow the approach outlined at <URL:http://www.finnie.org/terabyte/>,
I will be using both the RocketRAID card's RAID 5 *and* software RAID
0. But if Finnie is mistaken and just software RAID is sufficient and
more reliable, I'd go that way.
Or how does the RAID behave when it sees a double-disk failure? Even
some high-end solutions give you no way to recover. Obviously if
both drives have really failed, you're screwed. But what if a power
connection gets knocked loose?

Reliability is important, of course, but my desire has limits. Since
the array would be used to hold video files, of course I want some
degree of redundancy (as a RAID 50 arrangement apparently
provides). That said, video files aren't *so* important as to
necessitate in my mind extraordinary redundancy; that's why I'm
willing to take the risk of two disks going bad at once, and why I'm
not even going to bother trying to back up a 1.5TB array.
 