WLBS/NLB ("Load Balancing") and a Single NIC


Hairy One Kenobi

Hi,
Seems to be a lot of similar questions on Google/Deja, but not much in
the way of answers..

Basic question - I have two machines, identical config aside from Priority
and Dedicated IP.

1. Single NIC
2. Dedicated IP as primary NIC IP
3. Cluster IP as multi-homed IP on same card
4. WLBS/NLB installed with appropriate addresses. Multicast active. Affinity
"None".
5. When connecting, I always hit the first box. No exceptions.
6. If the first box is shut down, I always hit the second.
7. If the first box is restarted, I generally hit the second, but
occasionally end up with a hung session (testing with IIS using two static
pages with machine name)

AFAICT this is "by the book" configuration, so why doesn't it work? WLBS
reports both nodes in and out at the right times, but the service just
doesn't balance.

I /know/ this works, so what am I doing wrong in this case..?

(Incidentally, to make life interesting, this particular setup is a VMware
duplicate of the "real" problem system. 8.5 hours straight - so far - and
it's lucky that the monitor can't fit through my office window ;o)

--

Hairy One Kenobi

Disclaimer: the opinions expressed in this opinion do not necessarily
reflect the opinions of the highly-opinionated person expressing the opinion
in the first place. So there!
 
Hairy One Kenobi said:
Hi,
Seems to be a lot of similar questions on Google/Deja, but not much in
the way of answers..

Basic question - I have two machines, identical config aside from Priority
and Dedicated IP.

1. Single NIC
2. Dedicated IP as primary NIC IP
3. Cluster IP as multi-homed IP on same card
4. WLBS/NLB installed with appropriate addresses. Multicast active. Affinity
"None".
5. When connecting, I always hit the first box. No exceptions.
6. If the first box is shut down, I always hit the second.
7. If the first box is restarted, I generally hit the second, but
occasionally end up with a hung session (testing with IIS using two static
pages with machine name)

How are you testing?

Are you opening IE and hitting F5 or even Ctrl-F5?

IE will use HTTP 1.1 and maintain a TCP connection to the host and a given
session will continue to use the same source socket.

Set your home page on the client to your cluster IP. Open multiple IE
sessions and on each one, after you start the session hit Ctrl-F5 (F5 or
refresh won't do it) to force a page load from the cluster IP.

You should see a distribution across both nodes.
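The keep-alive point can be checked without a cluster. Below is a minimal sketch, assuming a stand-in loopback server rather than IIS or NLB: the helper names start_port_reporter and source_ports are made up for illustration. The server simply reports the source port it sees, so you can compare one reused connection against a fresh connection per request.

```python
import socket
import threading


def start_port_reporter():
    """Start a loopback TCP server that answers each message with the
    client's source port, as seen from the server side."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen()

    def handle(conn, peer):
        with conn:
            while conn.recv(1024):  # empty bytes means the client closed
                conn.sendall(str(peer[1]).encode())

    def serve():
        while True:
            try:
                conn, peer = server.accept()
            except OSError:  # listening socket was closed
                return
            threading.Thread(target=handle, args=(conn, peer), daemon=True).start()

    threading.Thread(target=serve, daemon=True).start()
    return server


def source_ports(server_addr, requests, reuse_connection):
    """Send `requests` messages and return the source port the server saw
    for each one."""
    seen = []
    conn = None
    for _ in range(requests):
        if conn is None or not reuse_connection:
            if conn is not None:
                conn.close()
            conn = socket.create_connection(server_addr)
        conn.sendall(b"ping")
        seen.append(int(conn.recv(1024)))
    if conn is not None:
        conn.close()
    return seen
```

Calling source_ports(server.getsockname(), 5, reuse_connection=True) should report the same port five times, while reuse_connection=False should report a changing port, which is the case where NLB gets a chance to pick a different node.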

BTW, buy a second NIC for each PC. Trying to do NLB with a single NIC is
just a pain in the ass and not worth it, particularly since you can buy a
decent "server class" NIC for around $100 these days.
 
Doug Frisk said:
How are you testing?

Are you opening IE and hitting F5 or even Ctrl-F5?

IE will use HTTP 1.1 and maintain a TCP connection to the host and a given
session will continue to use the same source socket.

Set your home page on the client to your cluster IP. Open multiple IE
sessions and on each one, after you start the session hit Ctrl-F5 (F5 or
refresh won't do it) to force a page load from the cluster IP.

Ctrl-F5 for a forced refresh. Hadn't thought about the HTTP/1.1 implication,
so I've just repeated the test using an HTTP GET utility.. same results as
before.

You should see a distribution across both nodes.

Yep. Agreed :o\

BTW, buy a second NIC for each PC. Trying to do NLB with a single NIC is
just a pain in the ass and not worth it, particularly since you can buy a
decent "server class" NIC for around $100 these days.

Well, unfortunately I don't have that option - it's a customer system and
I'm not /supposed/ to be supporting their NLB solution. Sigh.

In theory, it /should/ work out-of-the-box...

Thanks,

H1K
 
Hairy One Kenobi said:
Ctrl-F5 for a forced refresh. Hadn't thought about the HTTP/1.1 implication,
so I've just repeated the test using an HTTP GET utility.. same results as
before.

Are you certain that the utility you're using is incrementing the source
socket with each GET?

In "no affinity" mode, NLB uses the source IP and source port to determine
which node in the array will reply.
 
Doug Frisk said:
Are you certain that the utility you're using is incrementing the source
socket with each GET?

In "no affinity" mode, NLB uses the source IP and source port to determine
which node in the array will reply.

Not quite sure about "incrementing the socket". The utility is generating an
HTTP 1.0 GET and closing the connection. It's shown as TIME_WAIT in netstat,
indicating that each request is being shunted to the same box.

Surely the behaviour you describe is when affinity is activated
(specifically, set to "Single")?

H1K
 
Hairy One Kenobi said:
Not quite sure about "incrementing the socket". The utility is generating an
HTTP 1.0 GET and closing the connection. It's shown as TIME_WAIT in netstat,
indicating that each request is being shunted to the same box.

Surely the behaviour you describe is when affinity is activated
(specifically, set to "Single")?

Nope. When the affinity is set to Class C, NLB hashes against the first 3
octets of the source IP. If it's set to Single, NLB hashes against all 4
octets of the source IP. When affinity is set to None, NLB hashes against
the 4 octets of the source IP plus the 2 bytes (octets) of the source TCP
or UDP port.
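To make the three modes concrete, here is a rough sketch of which bytes feed the hash in each case. The function names and the mode strings are made up for illustration, and SHA-256 modulo the node count is only a stand-in: NLB's actual hash function is not described in this thread.

```python
import hashlib
import ipaddress


def affinity_key(src_ip, src_port, affinity):
    """Return the bytes that get hashed for a given affinity mode."""
    octets = ipaddress.IPv4Address(src_ip).packed  # 4 bytes, network order
    if affinity == "class_c":
        return octets[:3]                          # first 3 octets of source IP
    if affinity == "single":
        return octets                              # all 4 octets of source IP
    if affinity == "none":
        return octets + src_port.to_bytes(2, "big")  # source IP + 2-byte port
    raise ValueError(f"unknown affinity mode: {affinity!r}")


def owning_node(src_ip, src_port, affinity, node_count):
    """Map a client to a node index. The hash here is an assumed
    placeholder, not NLB's real algorithm."""
    digest = hashlib.sha256(affinity_key(src_ip, src_port, affinity)).digest()
    return int.from_bytes(digest[:4], "big") % node_count
```

With "single", changing only the source port never changes the answer; with "none", the port is part of the key, so separate connections from one client can land on different nodes.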
 
Doug Frisk said:

Nope. When the affinity is set to Class C, NLB hashes against the first 3
octets of the source IP. If it's set to Single, NLB hashes against all 4
octets of the source IP. When affinity is set to None, NLB hashes against
the 4 octets of the source IP plus the 2 bytes (octets) of the source TCP
or UDP port.

Ah. So setting no affinity gets you.. affinity?!? Strange.. so there's a
good chance that the unbalanced load balancing is actually working?

Hmm.

H1K
 
Hairy One Kenobi said:
Ah. So setting no affinity gets you.. affinity?!? Strange.. so there's a
good chance that the unbalanced load balancing is actually working?

Exactly.

Because there's no communication between the array members when a packet
arrives at an NLB host, each member needs a deterministic method to decide
if they're responsible for that inbound packet.

So for a given configuration using affinity set to "none", a packet from a
given IP/source port will always be serviced by the same host within the
array.

But for a given source IP, different sessions using different source ports
may (statistically will) hit different nodes.
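A small simulation of that idea, as a sketch only: every node sees the same packet, runs the same calculation on its own, and exactly one of them decides the packet is theirs. The Node class and tally helper are hypothetical, and the SHA-256 bucket calculation is an assumed stand-in for whatever NLB really computes.

```python
import hashlib
import ipaddress
from collections import Counter


class Node:
    """One cluster member. It knows only its own index and the cluster size."""

    def __init__(self, index, node_count):
        self.index = index
        self.node_count = node_count

    def accepts(self, src_ip, src_port):
        """Decide locally, with no messages to other nodes, whether this
        node owns the connection (affinity "none": IP plus port)."""
        key = ipaddress.IPv4Address(src_ip).packed + src_port.to_bytes(2, "big")
        bucket = int.from_bytes(hashlib.sha256(key).digest()[:4], "big")
        return bucket % self.node_count == self.index


def tally(nodes, src_ip, ports):
    """Count how many connections from one source IP each node accepts."""
    counts = Counter()
    for port in ports:
        owners = [n.index for n in nodes if n.accepts(src_ip, port)]
        if len(owners) != 1:  # every packet must have exactly one owner
            raise RuntimeError(f"port {port} has owners {owners}")
        counts[owners[0]] += 1
    return counts
```

Running tally([Node(0, 2), Node(1, 2)], "192.168.1.50", range(49152, 50152)) gives the same split every time it is run, and both nodes get a share. A single fixed IP and port pair, by contrast, always belongs to the same node.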
 
Doug Frisk said:

Exactly.

Because there's no communication between the array members when a packet
arrives at an NLB host, each member needs a deterministic method to decide
if they're responsible for that inbound packet.

So for a given configuration using affinity set to "none", a packet from a
given IP/source port will always be serviced by the same host within the
array.

But for a given source IP, different sessions using different source ports
may (statistically will) hit different nodes.

Thanks, Doug. I'll retest.

Now to glue all that hair back on ;o)

H1K
 