Network Load Balancing (NLB) on Server 2003 with Cisco Switch

I have two servers (HP ProLiant BL20p-G2) set up as follows:

ServerA NIC1 - 10.10.0.58
ServerA HeartBeat - 192.168.251.13

ServerB NIC1- 10.10.0.59
ServerB HeartBeat - 192.168.251.14

Cluster IP - 10.10.0.60

The HeartBeat NICs are connected via a crossover cable

The Main NICs are connected to a Cisco CAT4000 switch

NLB is set up to use "Multicast" and "IGMP Multicast"

Everything is running Windows Server 2003

All NICs are gigabit

I have two problems

When I set up NLB, the clustered address 10.10.0.60 can
only be reached from devices on the same VLAN. Hosts on
other VLANs or other subnets cannot hit it. This was
resolved by our network people adding a static ARP entry on
the default gateway for the clustered IP and MAC (they were
not at all happy about having to do this!!!). Regardless,
once this was done it appeared to work from all networks
and VLANs. The static ARP workaround came from MS KB
articles Q193602 and Q197862.
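For reference, the static ARP amounts to mapping the cluster
IP to the cluster's multicast MAC (the MAC shown here is the
one reported for this cluster in the NLB display output
further down). On an IOS router the entry would look
something like the following; the CAT4000 supervisor may be
running CatOS, where the syntax differs:

    ! Cisco IOS, global configuration mode (illustrative only)
    ! maps the NLB virtual IP to the IGMP-multicast cluster MAC from "nlb display"
    arp 10.10.0.60 0100.5e7f.003c ARPA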

We did this and initially all appeared to be fine, until we
started watching the systems with our monitoring tools.
About once an hour the monitoring system reports
"IP_TTL_EXPIRED_TRANSIT" on the clustered IP address
(10.10.0.60), and the dedicated IPs on each server
occasionally post "Host Unreachable" messages. I want to
bring these servers into production but cannot while they
occasionally do this.
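A continuous ping plus a trace to the cluster IP from a host
on another subnet should show both the drops and the hop
where the TTL expires; for example:

    rem run from a client on a different VLAN/subnet
    ping -t 10.10.0.60
    tracert -d 10.10.0.60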

All drivers from HP/Compaq are fully up to date. I even
opened a ticket with HP/Compaq to see if they had any known
issues with their NIC and teaming software on Server 2003
running NLB.

Is my setup OK?

Are occasional dropped pings normal with NLB?

Are there any known hotfixes relating to NLB on Server 2003?

Should I be using Unicast, Multicast or "IGMP Multicast"?

This is driving me nuts - I have never had problems with
NLB on Win2k!

Thank you

Darren
 
I'd use Unicast, personally, unless you have a specific need for Multicast
or IGMP Multicast.

-Y
 
I did try using Unicast; however, it just kills all network
connectivity on the servers. I also tried changing to
Unicast with the heartbeat network totally disabled. As soon
as the servers converge using Unicast, both drop off the
network (and the only way to connect is locally via KVM).

Why do you recommend Unicast over Multicast?
 
Hi,

First thing - the NLB heartbeat traffic doesn't go across your crossover
cable. NLB uses the load-balanced connection to send heartbeat traffic, so
the crossover connection between your two servers isn't providing any
benefit right now.

Second - NIC teaming and NLB don't always work together very well. I suggest
that you remove the teaming software and get NLB up and working correctly
without the teaming NICs.

http://www.microsoft.com/technet/tr...echnol/windowsserver2003/support/NLB-TRBL.asp
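
Once teaming is removed, a quick way to confirm that both hosts still
converge is the cluster control utility (installed as nlb.exe on Server
2003; the older wlbs.exe name still works). For example, run this on each
node:

    rem shows the cluster state and which host IDs have converged
    nlb.exe query

It should report something like "Host 1 converged as DEFAULT with the
following host(s) as part of the cluster: 1, 2".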

--

Thanks,
Marc Reynolds
Microsoft Technical Support

This posting is provided "AS IS" with no warranties, and confers no rights.
 
So can I just disable the heartbeat network, or should I
plug the heartbeat NICs into the same VLAN as the dedicated
NICs?

I removed the static ARP from the switch and I have not had
a missed ping or a TTL-expired-in-transit since then (of
course, now I cannot hit the server from any VLAN or subnet
other than its own).

Do vendors normally have additional support/tweaks/drivers
for their teaming software? These are ProLiant servers, so
one would assume they work fine with MS.
 
Unicast is the preferred method because you don't have to manually add the
ARP entries on the routers. As the articles you mentioned in your first
post stated, it is only the multicast MAC address that the routers
disregard and do not cache.

The downside of Unicast mode is that in a single-NIC scenario the 2 (or
however many) NLB nodes can't reach each other, even via the dedicated IP
address. That's why we recommend 2-NIC nodes with Unicast mode. The default
gateway should be configured ONLY on the dedicated/management NIC. As Marc
stated in the second post, this is not the NLB heartbeat; the NLB heartbeat
is only communicated over the adapters that have NLB bound to them. The
management NIC is used for node-to-node non-NLB communication and for
management purposes.

If you run NLB display and post the output on this thread, I will look at
your configuration and provide suggestions.
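On Server 2003 the utility is nlb.exe (wlbs.exe is still there under the
old name), so something like this from a command prompt will capture the
whole thing:

    rem dumps the NLB configuration, recent event messages and IP configuration
    nlb.exe display > nlb-display.txt
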
One BIG question:

Are you using Layer 2 or Layer 3 switches? What are the NLB NICs plugged
directly into?

NOTE: NLB and Layer 3 switches are NOT compatible and will not work without
a hub between the NIC and the switch, and I don't know of any Gigabit hub
out there.

Thank you,

Alan Wood[MSFT]

This posting is provided "AS IS" with no warranties, and confers no rights.
 
Alan,

We are using Layer 3 switches and I have no options to
change this. Does this mean that I just cannot ever run
NLB?

Here are the results from "NLB display":



WLBS Cluster Control Utility V2.4 (c) 1997-2003 Microsoft
Corporation.

Cluster 10.10.0.60



=== Configuration: ===



Current time = 12/22/2003 9:36:51 AM
ParametersVersion = 4
VirtualNICName =
AliveMsgPeriod = 1000
AliveMsgTolerance = 5
NumActions = 100
NumPackets = 200
NumAliveMsgs = 66
ClusterNetworkAddress = 01-00-5e-7f-00-3c
ClusterName = dalwebcl1.abc.com
ClusterIPAddress = 10.10.0.60
ClusterNetworkMask = 255.255.254.0
DedicatedIPAddress = 10.10.0.58
DedicatedNetworkMask = 255.255.254.0
HostPriority = 1
ClusterModeOnStart = STARTED
PersistedStates = NONE
DescriptorsPerAlloc = 512
MaxDescriptorAllocs = 512
TCPConnectionTimeout = 60
IPSecConnectionTimeout = 86400
FilterICMP = DISABLED
ScaleSingleClient = 0
NBTSupportEnable = 1
MulticastSupportEnable = 1
MulticastARPEnable = 1
MaskSourceMAC = 1
IGMPSupport = ENABLED
IPtoMcastIP = ENABLED
McastIPAddress = 239.255.0.60
NetmonAliveMsgs = 0
EffectiveVersion = V2.1
IPChangeDelay = 60000
IPToMACEnable = 1
ConnectionCleanupDelay = 300000
RemoteControlEnabled = 0
RemoteControlUDPPort = 2504
RemoteControlCode = 0x0
RemoteMaintenanceEnabled = 0x0
CurrentVersion = V2.4
InstallDate = 0x3FBA8A26
VerifyDate = 0x0
NumberOfRules = 1
BDATeaming = DISABLED
TeamID =
Master = DISABLED
ReverseHash = DISABLED
IdentityHeartbeatPeriod = 10000
IdentityHeartbeatEnabled = ENABLED
PortRules
  Virtual IP addr   Start   End     Prot   Mode       Pri   Load    Affinity
  ALL               0       65535   Both   Multiple         Equal   Single



=== Event messages: ===



#1164 ID: 0x4007001D Type: 4 Category: 0 Time: 12/18/2003
1:52:19 PM
NLB Cluster 10.10.0.60 : Host 1 converged as DEFAULT host
with host(s) 1,2 as part of the cluster.

000C0000 005A0004 00000000 4007001D 00000000 00000000
00000000 00000000
00000000 00000000 000614F8 00000000 00000000

#1162 ID: 0x4007003F Type: 4 Category: 0 Time: 12/18/2003
1:52:14 PM
NLB Cluster 10.10.0.60 : Initiating convergence on host
1. Reason: Host 2 is joining the cluster.

000C0000 005A0004 00000000 4007003F 00000000 00000000
00000000 00000000
00000000 00000000 00060A71 00000000 00000000

#1157 ID: 0x4007001D Type: 4 Category: 0 Time: 12/18/2003
1:51:02 PM
NLB Cluster 10.10.0.60 : Host 1 converged as DEFAULT host
with host(s) 1 as part of the cluster.

000C0000 005A0004 00000000 4007001D 00000000 00000000
00000000 00000000
00000000 00000000 000614F8 00000000 00000000

#1155 ID: 0x40070045 Type: 4 Category: 0 Time: 12/18/2003
1:50:57 PM
NLB Cluster 10.10.0.60 : Initiating convergence on host
1. Reason: Host 2 is leaving the cluster.

000C0000 005A0004 00000000 40070045 00000000 00000000
00000000 00000000
00000000 00000000 00060BBF 00000000 00000000

#1140 ID: 0x4007001C Type: 4 Category: 0 Time: 12/18/2003
1:43:34 PM
NLB Cluster 10.10.0.60 : Host 1 converged with host(s) 1,2
as part of the cluster.

000C0000 005A0004 00000000 4007001C 00000000 00000000
00000000 00000000
00000000 00000000 00061516 00000000 00000000

#1137 ID: 0x40070005 Type: 4 Category: 0 Time: 12/18/2003
1:43:28 PM
NLB Cluster 10.10.0.60 : Cluster mode started with host ID
1.

000C0000 005A0004 00000000 40070005 00000000 00000000
00000000 00000000
00000000 00000000 00040222 00000000 00000000

#1134 ID: 0x4007003F Type: 4 Category: 0 Time: 12/18/2003
1:43:28 PM
NLB Cluster 10.10.0.60 : Initiating convergence on host
1. Reason: Host 1 is joining the cluster.

000C0000 005A0004 00000000 4007003F 00000000 00000000
00000000 00000000
00000000 00000000 0006081E 00000000 00000000

#1104 ID: 0x4007001D Type: 4 Category: 0 Time: 12/18/2003
1:37:34 PM
NLB Cluster 10.10.0.60 : Host 1 converged as DEFAULT host
with host(s) 1,2 as part of the cluster.

000C0000 005A0004 00000000 4007001D 00000000 00000000
00000000 00000000
00000000 00000000 000614F8 00000000 00000000

#1102 ID: 0x4007003F Type: 4 Category: 0 Time: 12/18/2003
1:37:29 PM
NLB Cluster 10.10.0.60 : Initiating convergence on host
1. Reason: Host 2 is joining the cluster.

000C0000 005A0004 00000000 4007003F 00000000 00000000
00000000 00000000
00000000 00000000 00060A71 00000000 00000000

#1100 ID: 0x4007001D Type: 4 Category: 0 Time: 12/18/2003
1:36:21 PM
NLB Cluster 10.10.0.60 : Host 1 converged as DEFAULT host
with host(s) 1 as part of the cluster.

000C0000 005A0004 00000000 4007001D 00000000 00000000
00000000 00000000
00000000 00000000 000614F8 00000000 00000000



=== IP configuration: ===





Windows IP Configuration



Host Name . . . . . . . . . . . . : serverA

Primary Dns Suffix . . . . . . . : abc.com

Node Type . . . . . . . . . . . . : Unknown

IP Routing Enabled. . . . . . . . : No

WINS Proxy Enabled. . . . . . . . : No

DNS Suffix Search List. . . . . . : abc.com



Ethernet adapter HeartBeat:



Connection-specific DNS Suffix . :

Description . . . . . . . . . . . : HP NC7781 Gigabit
Server Adapter #3

Physical Address. . . . . . . . . : 00-0B-CD-FE-D3-21

DHCP Enabled. . . . . . . . . . . : No

IP Address. . . . . . . . . . . . : 192.168.251.13

Subnet Mask . . . . . . . . . . . : 255.255.255.252

Default Gateway . . . . . . . . . :



Ethernet adapter Team:



Connection-specific DNS Suffix . :

Description . . . . . . . . . . . : HP Network Team #1

Physical Address. . . . . . . . . : 00-0B-CD-FE-CA-D1

DHCP Enabled. . . . . . . . . . . : No

IP Address. . . . . . . . . . . . : 10.10.0.60

Subnet Mask . . . . . . . . . . . : 255.255.254.0

IP Address. . . . . . . . . . . . : 10.10.0.58

Subnet Mask . . . . . . . . . . . : 255.255.254.0

Default Gateway . . . . . . . . . : 10.10.1.1

DNS Servers . . . . . . . . . . . : 10.10.1.10

10.10.1.11



=== Current state: ===



Host 1 has entered a converging state 3 time(s) since
joining the cluster

and the last convergence completed at approximately:
12/18/2003 1:52:14 PM

Host 1 converged as DEFAULT with the following host(s) as
part of the cluster:

1, 2
 
I have a similar environment that I can't find documentation on how to set up.

Two servers in a DMZ, both with dual NICs in each.
Server 1 - NIC1 - Public IP
Server 1 - NIC2 - 192.168.1.1 Private IP with Cross-Over into Server 2

Server 2 - NIC1 - Public IP
Server 2 - NIC2 - 192.168.1.2 Private IP with Cross-Over into Server 1

Question: which of the above NICs do I check off "Network Load Balancing" for in network properties?
Question: when starting NLB Manager, which IP of the other server do I try to connect to in order to add a cluster member?
Question: do I need to run NLB Manager on the other server as well after the first one is configured?
Question: is the configuration saved after a reboot?

Thanks!
 
First off, you don't need to connect the two servers together with a
crossover cable. NLB sends its heartbeat packets across the load-balanced
interface.
Question: which of the above NICs do I check off "Network Load Balancing"
for in network properties?
The public interface.
Question: when starting NLB Manager, which IP of the other server do I try
to connect to in order to add a cluster member?
The public interface.
Question: do I need to run NLB Manager on the other server as well after
the first one is configured?
You don't need to, but you can.
Question: is the configuration saved after a reboot?
Yes.

If you are only going to use a single interface on each node, you need to
add an IP address as the "dedicated" IP on each node. The dedicated IP is
bound to the load balanced interface, but is not load balanced.
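
To make the single-interface case concrete: the adapter's normal address
stays as the dedicated IP, and the cluster (virtual) IP just becomes an
additional address on that same adapter. NLB Manager will normally add it
to TCP/IP for you, but as an illustration only (using the cluster IP from
earlier in this thread and an assumed connection name), it can also be
added by hand:

    rem the adapter's existing primary address remains the dedicated IP
    rem add the cluster (virtual) IP as a second address on the same adapter
    netsh interface ip add address "Local Area Connection" 10.10.0.60 255.255.254.0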


--

Thanks,
Marc Reynolds
Microsoft Technical Support

This posting is provided "AS IS" with no warranties, and confers no rights.


VMC said:
I have a similar environment, that I can't find documentation on how to setup:

Two servers in a DMZ, both with dual NICs in each.
Server 1 - NIC1 - Public IP
Server 1 - NIC2 - 192.168.1.1 Private IP with Cross-Over into Server 2

Server 2 - NIC1 - Public IP
Server 2 - NIC2 - 192.168.1.2 Private IP with Cross-Over into Server 1

Question: which of the above NICs do I check off "Network Load Balancing" for in network properties?
Question: when starting NLB Manager, which IP of the other server do I try
to connect to in order to add a cluster member?
 
Thanks for the reply. I thought the best practice was to have two NICs total for an NLB environment: one for network access, one for management. We are ultimately trying to load balance IIS servers. In this scenario, do you recommend we configure for Multicast or Unicast?
 
This thread is interesting to me; we are currently trying to implement
NLB across 4 new servers which will eventually allow Terminal Services
access. The problem I have is that I created an NLB cluster on TS1 and
then added TS2, TS3 and TS4. At each stage the network seemed unstable
for some time after adding the next server. Finally, when TS4 went into
the NLB cluster, TS2 and TS3 stopped responding to pings from any of the
servers in the cluster, but I can still connect to the interface from
another PC, or ping the TS machine from another machine on the local
network connected to the same switch.

We have a 2-server IIS cluster running that I set up last year; this
larger cluster all seems very erratic.

--
Eagle
 
Simple solution: a Server Load Balancer

You guys are making this way too hard. It's simple. Set up NLB on your servers using Unicast. You will then have a virtual IP that cannot be routed correctly by any switch, especially Cisco and 3Com, as they look for a physical NIC attached to the MAC address, which is not the case here. So, the easy solution:

Any device you wish to NLB, put it behind a server load balancer. It takes care of all your problems and sorts out all the issues with switch flooding, MAC address mapping, etc., etc.

A couple of companies sell server load balancers: F5's BIG-IP is a great product (http://www.f5.com). Cisco, Barracuda, Zeus and Juniper all have server load balancing products, and they help in more ways than you initially intend.

From my personal experience, I say use F5: it is very easy to configure, set up and manage, and it works absolutely wonderfully!! It took a network we had with 6 NLB and 2 clustered servers that were causing switch flooding and seriously degrading network performance, made them faster than they originally were, and sped up our web apps. Amazing. Great product, and not too pricey. It's available on eBay as well, as many people outgrow their smaller models and move up to bigger ones.

Yours Truly,
Brainz
 