Is HT faster than InfiniBand?

Thread starter: Student

Which is faster: HyperTransport or InfiniBand?

http://139.95.253.214/SRVS/CGI-BIN/...00000000220593058,K=9241,Sxi=1,Case=obj(4650)

How much faster is HyperTransport™ than other technologies like PCI,
PCI-X or Infiniband?

Traditional PCI transfers data at 133 MB/sec, PCI-X at 1 GB/sec, and
InfiniBand at about 4 GB/sec in the 12-channel implementation or
1.25 GB/sec in the more popular 4-channel form. HyperTransport transfers
data at 6.4 GB/sec. It is about 50 times faster than PCI, 6 times faster
than PCI-X, and 5 times faster than 4-channel InfiniBand. It is
important to remember that InfiniBand is not an alternative to
HyperTransport technology. Each HyperTransport I/O bus consists of two
point-to-point unidirectional links. Each link can be from two bits to
32 bits wide. Standard bus widths of 2, 4, 8, 16, and 32 bits are
supported. Asymmetric HyperTransport I/O buses are permitted in
situations requiring different upstream and downstream bandwidths.
Commands, addresses, and data (CAD) all share the same bits. So a
simple, low-cost HyperTransport I/O implementation using two CAD bits in
each direction is designed to provide a raw bandwidth of up to 400
megabytes per second in each direction (at the highest possible
signaling rate of 1.6 Gbit/sec per bit). The two directions combined
give roughly six times the peak bandwidth of PCI 32/33. A larger
implementation using 16 CAD bits in each direction is designed to
provide bandwidth of up to 3.2 gigabytes per second in each direction,
for a combined 6.4 GB/sec - 48 times the peak bandwidth of 32-bit PCI
running at 33 MHz.
 

Your HT info is a little out of date: clock speeds on motherboards have
been at 1 GHz since fall 2004, giving a peak bandwidth of 4 GB/s in each
direction on a 16/16 link, minus any packetization overhead of course.
The next jump, to 1.4 GHz, is in the works, but I don't know where they
go when they reach the original design target of 1.6 GHz.
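
To make the arithmetic concrete, here is a quick sketch in Python of how
these peak figures fall out of link width and clock (the widths and
clocks are the ones quoted above; treat the constants as assumptions,
not vendor-verified specs):

# Peak bandwidth of one unidirectional HyperTransport link in GB/s.
# HT is double-data-rate: two transfers per clock per bit lane.
def ht_peak_bw_gbs(width_bits, clock_ghz):
    transfers_per_sec = clock_ghz * 2        # DDR: 2 transfers/clock
    bits_per_sec = width_bits * transfers_per_sec
    return bits_per_sec / 8                  # bits -> bytes

print(ht_peak_bw_gbs(2, 0.8))    # 0.4 GB/s: 2-bit link at 1.6 Gbit/s per bit
print(ht_peak_bw_gbs(16, 0.8))   # 3.2 GB/s per direction, 6.4 combined
print(ht_peak_bw_gbs(16, 1.0))   # 4.0 GB/s at the 1 GHz clock above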
 
Which is faster: HyperTransport or InfiniBand?

It doesn't matter - they are not targeted at the same "problem". I must
say InfiniBand's proponents have not helped here by announcing it as a
"do-all" "solution" for on-board as well as off-board links... so bloody
confusing. The way I see it, InfiniBand, despite claims, is an off-board
wired or backplane transport which possibly has better error
detection/recovery.

On that last point, I keep reading that HyperTransport suffers from a
lack of error detection/recovery, but CRC checking and packet retries
are clearly in the specs, so I don't know what the full story is there
yet. To put things in perspective, HyperTransport links on currently
available motherboards run at approximately the same speed as current
PCI Express, which on a x16 link has a peak bandwidth of 4.1 GB/s
(B = byte); a 16/16 HT link has a peak bandwidth of 4 GB/s in each
direction.
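
As a cross-check on that "about the same speed" claim, a quick sketch of
the PCI Express side (the 2.5 GT/s lane rate and 8b/10b coding are
first-generation PCIe figures; the rest is back-of-envelope):

# PCIe gen-1 x16 peak data bandwidth in GB/s.
lanes = 16
raw_gtps = 2.5                 # GT/s per lane, PCIe 1.x
payload_fraction = 8 / 10      # 8b/10b line-coding overhead
print(lanes * raw_gtps * payload_fraction / 8)   # 4.0 GB/s per direction

That lands at 4.0 GB/s per direction, essentially the same as a 16/16 HT
link at 1 GHz.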
 
George said:
It doesn't matter - they are not targeted at the same "problem". [snip]
On that last point, I keep reading that HyperTransport suffers from a
lack of error detection/recovery, but CRC checking and packet retries
are clearly in the specs. [snip]

It may be in the spec, but the retry is recent. So what do you do in a
PC when you get an error and don't find out about it for 512 bytes? Do
you retry the last 512 bytes' worth of transactions? If you check into
it, you will find that the actual result is a crash of some sort.

I don't know anyone who has been pushing IB as a "do all" solution.
Clearly it is not a FSB. And IB does for sure have better recovery and
detection. HT is trying, but it has an installed-base problem with the
networking extensions.
 
Del said:
So what do you do in a PC when you get an error and don't find out about
it for 512 bytes? Do you retry the last 512 bytes' worth of
transactions? [snip]

So the packet length is 512 bit-times and the CRC comes embedded 64
bit-times into the next packet. I guess the only solution would be to
hold a packet-sized buffer, which would kill the latency advantage. Is
that unusual? Does PCI Express use a much smaller packet size, thus
giving it a faster retry cycle? Then again, HT has separate channels for
the up/down links, so you don't have to turn around a bidirectional
channel.
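
For illustration, a minimal sketch (Python, with invented names) of the
kind of link-level replay buffer being discussed: the sender holds every
packet until the far end acknowledges it, so a CRC failure detected a
packet later can still be repaired by replaying from the bad sequence
number. Buffer depth, not the CRC itself, is what bounds the latency
cost.

from collections import deque
from zlib import crc32

class ReplaySender:
    def __init__(self):
        self.seq = 0
        self.unacked = deque()       # sent but not yet acknowledged

    def send(self, payload):
        pkt = (self.seq, payload, crc32(payload))
        self.unacked.append(pkt)     # keep a copy for possible replay
        self.seq += 1
        return pkt                   # goes onto the wire

    def ack(self, seq):
        # Far end has verified CRCs up to and including `seq`.
        while self.unacked and self.unacked[0][0] <= seq:
            self.unacked.popleft()

    def retry_from(self, seq):
        # Far end reported a bad CRC at `seq`: replay from there on.
        return [p for p in self.unacked if p[0] >= seq]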
I don't know anyone who has been pushing IB as a "do all" solution. [snip]

The initial hype on InfiniBand was very waffly IMO - it clearly wanted
to pose as a do-all on-board/off-board link, not necessarily a FSB. From
my POV they were not clear enough about what it was targeting, i.e.
backplanes and wires.
 
Del said:
If you check into it, you will find that the actual result is a crash of
some sort. [snip]

So how does that work in practice? One gathers from Stone and
Partridge's work on Ethernet checksum vs. CRC errors that undetected
errors are probably much more common than anyone would have cared to
think. Does anybody know about HT-type traffic? If someone bothers to do
a study, will we find out that computers have turned into random number
generators?

As it is, the Stone and Partridge work doesn't seem to have created much
more than some interesting exchanges on comp.arch. Does anybody care
anymore? I'm sure that IBM does, but can it afford to?

RM
 
Robert said:
If someone bothers to do a study, will we find out that computers have
turned into random number generators? [snip]

Turned into? PCs have always been random number generators. Why else did
we need the Ctrl/Alt/Del key combo? :-)

I guess it's true to say that, as PCs have migrated up to "important"
tasks, the need for confidence in data integrity has increased.
 
Robert said:
Does anybody know about HT-type traffic? If someone bothers to do a
study, will we find out that computers have turned into random number
generators? [snip]

I don't know about their work, and I know little about Ethernet. I do
know from experience in the lab that a 32-bit CRC, properly chosen, with
retry can cope with quite high error rates without any problem for the
system. And I would expect that the systems in question would not have
tolerated very many undetected errors, because the disks for the virtual
memory, and the coherence traffic if any, were carried over the network
in question along with all the other I/O traffic.
 
Del said:
I do know from experience in the lab that a 32-bit CRC, properly chosen,
with retry can cope with quite high error rates without any problem for
the system. [snip]

I think you did participate in the discussion of this subject on
comp.arch:

Stone, J., Partridge, C.: "When The CRC and TCP Checksum Disagree",
Proceedings of the ACM conference on Applications, Technologies,
Architectures, and Protocols for Computer Communication (SIGCOMM'00),
Stockholm, Sweden, August/September 2000, pp. 309-319

Abstract

"Traces of Internet packets from the past two years show that between 1
packet in 1,100 and 1 packet in 32,000 fails the TCP checksum, even on
links where link-level CRCs should catch all but 1 in 4 billion errors.
For certain situations, the rate of checksum failures can be even
higher: in one hour-long test we observed a checksum failure of 1
packet in 400. We investigate why so many errors are observed, when
link-level CRCs should catch nearly all of them.We have collected
nearly 500,000 packets which failed the TCP or UDP or IP checksum. This
dataset shows the Internet has a wide variety of error sources which
can not be detected by link-level checks. We describe analysis tools
that have identified nearly 100 different error patterns. Categorizing
packet errors, we can infer likely causes which explain roughly half
the observed errors. The causes span the entire spectrum of a network
stack, from memory errors to bugs in TCP.After an analysis we conclude
that the checksum will fail to detect errors for roughly 1 in 16
million to 10 billion packets. From our analysis of the cause of
errors, we propose simple changes to several protocols which will
decrease the rate of undetected error. Even so, the highly non-random
distribution of errors strongly suggests some applications should
employ application-level checksums or equivalents."

It may not be a good model for the possibility of other link-level
errors, but it does make you wonder.

In overclocking tests, I've found that PCs will tolerate significant
memory errors without giving any immediate indication of a problem.
Short of a crash, I don't know how you'd know anything was wrong without
application-level checking.
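
For scale, the headline figures in that abstract are just the sizes of
the check fields (back-of-envelope, assuming errors look random to the
check in question):

# Fraction of random error patterns that slip past each check.
crc32_escape = 2 ** -32    # ~1 in 4.3 billion pass a 32-bit link CRC
tcp_escape = 2 ** -16      # ~1 in 65,536 pass the 16-bit TCP checksum
print(1 / crc32_escape)    # 4294967296.0
print(1 / tcp_escape)      # 65536.0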


RM
 
Robert Myers said:
Short of a crash, I don't know how you'd know anything was wrong without
application-level checking. [snip]

If it'll (successfully) run Memtest86 overnight, and it'll pass the
Prime95 torture tests, then it's working right (IME). If it won't, then
it might run WinXP and applications anyway, but it'll do strange things
from time to time ...
 
Robert Myers said:
Stone, J., Partridge, C.: "When The CRC and TCP Checksum Disagree"
(SIGCOMM'00). [snip]

It may not be a good model for the possibility of other link-level
errors, but it does make you wonder.

I know of two NICs which are reputed to have a bug in their checksum
offloading. What amazes me is that the only software where I've seen a
hiccup from this is Eudora, which reports error 10053 or 10054 when you
try to send a longish e-mail message. Turning off "Checksum Offload"
fixes the problem.
In overclocking tests, I've found that PCs will tolerate significant
memory errors without giving any immediate indication of a problem.
[snip]

I suspect that many overclockers are running slightly over the ragged
edge... and that "stable" really means a very low error rate.
 
Robert Myers said:
It may not be a good model for the possibility of other link-level
errors, but it does make you wonder.

In overclocking tests, I've found that PCs will tolerate significant
memory errors without giving any immediate indication of a problem.
[snip]

OK, thanks for the reminder. As I recall, claiming that the checksums
were missing the errors was a mild distortion. The errors were
transpositions of data blocks being fetched to the adapter, so the data
was bad when it got there. Is that not the case?

As for PCs not being affected by memory errors, how many do you estimate
it took to crash the system? The lab system I was referring to was
seeing many errors per second.

del
 
Del said:
The errors were transpositions of data blocks being fetched to the
adapter, so the data was bad when it got there. Is that not the case?
[snip]

That was one of the explanations. I wasn't convinced there was any one
single explanation that dominated.

I concluded that if you really have to know that your data are reliable,
you should probably do your own end-to-end error checking.

As for PCs not being affected by memory errors, how many do you estimate
it took to crash the system? [snip]

Oh, a few errors per hour will generally let a system run, IIRC. The
speed gained by running on the ragged edge like that, versus backing off
to a safe setting, is so small that it isn't worth running on the ragged
edge.

RM
 
Robert Myers said:
Oh, a few errors per hour will generally let a system run, IIRC. [snip]
The lab system was in the range of 10**5 errors/sec and still ran
perfectly with 32-bit CRC and retry. A bad cable can really prove your
recovery mechanism.

So it sounds like the protocol or the software or something for the
Ethernet systems in question was broken.

del
 
Del Cecchi said:
The lab system was in the range of 10**5 errors/sec and still ran
perfectly with 32-bit CRC and retry.

Yes, this is barely possible on a 100 Mbit/s system. A 64-byte packet
has a 60% chance of arriving error-free. Unfortunately, a 1500-byte
packet has only a 0.0006% chance, assuming a random error distribution.
So acks get through, but incoming data will be bad.
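
Checking those figures (a quick sketch; assumes the 10**5 errors/sec are
independent single-bit errors on a 100 Mbit/s link):

# Probability a packet of n bytes arrives clean at a given bit error rate.
ber = 1e5 / 100e6                        # = 1e-3
for nbytes in (64, 1500):
    p_clean = (1 - ber) ** (nbytes * 8)  # every bit must survive
    print(f"{nbytes}-byte packet: {p_clean:.4%} error-free")
# 64-byte packet: ~59.9% error-free; 1500-byte packet: ~0.0006%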
A bad cable can really prove your recovery mechanism.

Yep! Beware of newbies with crimpers! RJ45s are hard to do, and not just
because the correct pattern is counter-intuitive. All the intuitive
patterns split a pair, which often gives some connectivity but poor
performance. There are 40,320 ways of wiring the 8-conductor cable
straight through. All but 1,152 split at least one pair necessary for
10baseT or 100baseTX.
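
Those counts brute-force nicely (a sketch; the conductor pairs (0,1),
(2,3), (4,5), (6,7) are an arbitrary labeling, and pins 1-2 and 3-6 are
the pairs 10baseT/100baseTX actually use):

from itertools import permutations

pairs = [{0, 1}, {2, 3}, {4, 5}, {6, 7}]   # the 4 twisted pairs

def same_pair(a, b):
    return {a, b} in pairs

# A wiring is "good" iff pins 1-2 and pins 3-6 each carry both
# conductors of one twisted pair (indices 0,1 and 2,5 below).
good = sum(same_pair(p[0], p[1]) and same_pair(p[2], p[5])
           for p in permutations(range(8)))
print(good)    # 1152 of the 40,320 permutations keep both pairs intact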
So it sounds like the protocol or the software or something for the
Ethernet systems in question was broken.

I would hope any system with anything near 0.1% error rates was using
ECC, not just CRC.

-- Robert
 
Robert Redelmeier said:
Yes, this is barely possible on a 100 Mbit/s system. [snip]

Yep! Beware of newbies with crimpers! RJ45s are hard to do, and not just
because the correct pattern is counter-intuitive. [snip]

Switches with just a Web-based interface, which allow you to collect
error rates and mirror ports, are cheap now. All the Cat5 that I put in
is now running 1 Gb/s full duplex, with maybe 5-6 errors/week/port due,
I believe, to speed ramping at PC power-on.

While there is undoubtedly bad cable around, much of it was done by
"professionals" or taken off the shelf... or even just caused by
physical abuse or misrouting in the wall or ceiling/floor cavity. I also
tend to think much of the "bad cable" is due to legacy "telephone"
mentality, equipment practices and personnel. To me the punch-down block
is a scary and dangerous place. :-)
 
George Macdonald said:
Switches with just a Web-based interface, which allow you to collect
error rates and mirror ports, are cheap now.

Any particular brands/models you'd recommend?

All the Cat5 that I put in is now running 1 Gb/s full duplex, with maybe
5-6 errors/week/port due, I believe, to speed ramping at PC power-on.

Could be. Also could be interference from noisemakers like motor starts.
But it sounds like good cable.

I also tend to think much of the "bad cable" is due to legacy
"telephone" mentality, equipment practices and personnel. To me the
punch-down block is a scary and dangerous place. :-)

Yes, a lot of that. But crimpers still aren't easy even after you know
T-568A from T-568B.

Hey, jacks have punchdowns too! And if you _really_ like retro, Siemon
makes a Cat5e-rated 66 block :)

-- Robert
 
Robert Redelmeier said:
I would hope any system with anything near 0.1% error rates was using
ECC, not just CRC. [snip]
This was a parallel source-synchronous link (RIO) running at a GB/sec,
and the error rate was packet errors. I didn't have any way to collect
statistics on bit errors. CRC with retry is the moral equivalent of ECC.
 
Del Cecchi said:
This was a parallel source-synchronous link (RIO) running at a GB/sec,
and the error rate was packet errors. I didn't have any way to collect
statistics on bit errors.

Probably the same 1e5/s -- that link was running around 1e10 bit/s.
Unless the errors are non-random, or the packets extremely large, the
chances of having two or more errors in one packet are very small.

CRC with retry is the moral equivalent of ECC.

Well, that's an odd sense of morality :)

CRC with retry has low "clean" overhead, but throws away lots of
(presumably good) bits. ECC has much higher overhead but seldom throws
anything away. There is an error-rate breakpoint below which CRC/retry
is best, and above which ECC is better.
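
That breakpoint is easy to see in a toy model (all parameters below are
illustrative assumptions, not measured figures):

# CRC+retry pays a small fixed overhead but resends whole packets on any
# error; ECC pays a larger fixed overhead but almost never resends.
def crc_retry_goodput(ber, bits=4096, crc_bits=32):
    p_clean = (1 - ber) ** bits
    return (1 - crc_bits / bits) * p_clean   # retries discard whole packets

def ecc_goodput(ber, bits=4096, ecc_bits=512):
    return 1 - ecc_bits / bits               # assume ECC corrects all it sees

for ber in (1e-9, 1e-6, 1e-5, 1e-4):
    print(f"BER {ber:.0e}: CRC/retry {crc_retry_goodput(ber):.3f}, "
          f"ECC {ecc_goodput(ber):.3f}")
# CRC/retry wins at low BER; with these numbers ECC pulls ahead near 3e-5.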

-- Robert
 