An Overview of Windows Server 2003 Clusters
by Tom Hall
Senior Instructor for SQLSoft+
MCSA, MCSE, MCT
Table of Contents
- The Cluster Service
- Network Load Balancing Clusters (NLB)
The Cluster Service
Overview of Clusters
Two types of clusters are supported by Windows Server 2003 to provide high availability of applications and services:
- The Cluster service, sometimes referred to as "shared device clustering."
- Network Load Balancing clusters.
Shared device clusters pass access to data from one server to another in the event of a failure. Network Load Balancing clusters provide a performance-based distribution of IP connections.
The Cluster service is used for data that changes frequently. Examples are Exchange and SQL Server databases.
Network Load Balancing clusters are typically used where the data is static or where no data is stored on behalf of the client. Examples are Web farms and Exchange front-end servers.
The Cluster Service
As many as eight (8) nodes or clustered servers can be in a cluster. All of the nodes in a cluster communicate with heartbeats once per second. The recommended implementation is to use at least one server as a failover server that does not provide services unless another server fails over to it. A short period of time exists when access to the data is not available. That period is the "Failover," which typically lasts from 30 seconds to 6 minutes (it can be longer), depending on the resource type that is being clustered, the configuration, and the hardware used in the implementation.
Network Load Balancing Clusters
Network Load Balancing (NLB) Clusters can include up to 32 servers. These servers communicate with heartbeats once per second. If five heartbeats are missed from a server, that server is considered to be a failed server. Five missing heartbeats trigger a process called convergence. During convergence the cluster is dynamically reconfigured to exclude the server from which the five heartbeats were missed.
When heartbeats from a new or repaired server coming online are detected by an NLB cluster, the cluster enters convergence again and the new server becomes a member of the cluster and begins accepting inbound connections on behalf of the cluster. Convergence requires less than 10 seconds from detection of a required change to completion. During convergence, functioning hosts in a cluster continue to process inbound requests without interruption in service.
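The heartbeat and convergence behavior described above can be sketched as a small model. This is an illustration only, not the NLB implementation: the class name ClusterView and its fields are hypothetical, and the constants simply restate the defaults given in the text (one heartbeat per second, five missed heartbeats).

```python
from dataclasses import dataclass, field

HEARTBEAT_PERIOD_S = 1   # heartbeats are sent once per second (from the text)
MISSED_LIMIT = 5         # five missed heartbeats trigger convergence (from the text)


@dataclass
class ClusterView:
    """One host's view of cluster membership (hypothetical model)."""
    members: set = field(default_factory=set)
    missed: dict = field(default_factory=dict)
    convergences: int = 0

    def tick(self, heard_from):
        """Process one heartbeat period.

        heard_from is the set of host IDs whose heartbeat arrived in this
        period. Returns True if membership changed (a convergence).
        """
        changed = False
        # A heartbeat from an unknown host means a new or repaired server.
        for host in heard_from - self.members:
            self.members.add(host)
            changed = True
        for host in list(self.members):
            if host in heard_from:
                self.missed[host] = 0
            else:
                self.missed[host] = self.missed.get(host, 0) + 1
                if self.missed[host] >= MISSED_LIMIT:
                    # Exclude the silent host from the cluster.
                    self.members.discard(host)
                    del self.missed[host]
                    changed = True
        if changed:
            self.convergences += 1
        return changed
```

In this model a host that stays silent for four periods is still a member; the fifth missed heartbeat removes it, and a later heartbeat from the same host ID adds it back through another convergence.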
What Can Be Clustered
Applications to be clustered must use an IP-based protocol. The application must allow you to specify where its data is stored, so that the data can be located on a drive that is accessible to the cluster nodes. The application running on the client must either automatically try to reconnect or offer the client a method for reconnecting after a failover. Applications written to the cluster API are called Cluster Aware applications. These applications communicate with the cluster service and allow the cleanest failover to another server. Applications that are not cluster aware can also be clustered, but with more limited performance.
Examples of applications and services that can be clustered are:
- SQL Server 2000 and later
- Exchange Server 2000 or Exchange Server 2003
- File and Print Services
- Stand alone DFS roots
- DHCP
- WINS
Check with the vendor of other applications under consideration before clustering third-party applications.
This is one of many links for getting started with additional information on clustering: Network Load Balancing: Frequently Asked Questions for Windows 2000 and Windows Server 2003.
Clustering Methods
Shared Nothing
Shared Nothing is a clustering term indicating that only one server in a cluster has access to a set of data until a failover occurs. Shared Nothing is the most common implementation of clustered applications.
Shared Everything
Shared Everything is a clustering term which means that access to a set of data can be provided to multiple servers without a failover occurring. Access to the data is controlled by a Distributed Lock Manager (DLM) provided by the application. The DLM prevents more than one server at a time from accessing the data, because simultaneous access could corrupt it. The DLM can cause significant overhead for applications that require frequent access to data by multiple servers. This is not a common implementation.
Cluster Service Terminology
- Node – is a server that has become a member of a cluster. A node is a server that provides access to data which is stored on a shared disk. Access to the data can be passed from one node to another through a process called "Failover."
- Heartbeat – a UDP broadcast used to communicate between nodes in a cluster. This is a broadcast heartbeat on port 3343 that is used to check that servers are online. Heartbeats are sent out once per second. Five missing heartbeats is the default trigger point for a failover to occur.
- Private Network – a network that is designated for cluster Node-to-Node communication. Since this is the network that handles broadcast heartbeat traffic from all of the nodes, it should be placed on a separate segment. Private networks only handle heartbeat and node to node traffic. Other traffic between the nodes includes configuration change traffic.
- Public Network – a network that is designated for client to cluster communication. Network adapter cards configured as "Public Network" will only handle traffic from a client to a cluster. It will not handle broadcast heartbeat traffic or node to node configuration traffic.
- Mixed Network – a network that is designated for client to cluster communication, but which will allow heartbeat and node to node communication in the event that the Private network fails. One network card should be configured as "Mixed Network" to eliminate another single point of failure.
- Network Name Resource – a virtual server name that clients can connect to. Network Name resources are paired with an IP Address resource for name to IP address resolution.
- IP Address Resource – An IP address resource is paired with a network name resource and dynamically registered in DNS and WINS so that clients can connect to the application or service provided by a virtual server.
- Groups – a group contains a number of resources. Common examples would be a Network Name, an IP Address, and a disk. A group and all of the resources within it fail over to another server as a whole.
The client connects to a Network Name resource (a virtual server). A cluster node then provides access to the data stored on a shared disk. The Network Name, the IP address, and the disk are all resources within a single group. Only one node at a time can own a group, and all of the resources within a group fail over and fail back as a unit. No single resource within a group can fail over without the other resources in the group also failing over.
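The rule that a group and its resources move between nodes as a unit can be modeled in a few lines. This is a sketch under stated assumptions, not the Cluster service's logic: the Group class, the node names, and the "first surviving node" choice are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Group:
    """A cluster group: its resources always share a single owner node."""
    name: str
    resources: list   # e.g. a Network Name, an IP Address, and a disk
    owner: str        # only one node at a time can own the group

    def fail_over(self, nodes, failed_node):
        """Move the whole group to a surviving node if its owner failed.

        Ownership is tracked per group, not per resource, so no resource
        can move on its own. Picking the first survivor is an assumption
        made for this sketch.
        """
        if self.owner != failed_node:
            return self.owner   # the group is unaffected
        survivors = [n for n in nodes if n != failed_node]
        if not survivors:
            raise RuntimeError("no surviving node can own the group")
        self.owner = survivors[0]
        return self.owner
```

For example, a group owned by a node named NODE1 in a two-node cluster ends up owned by NODE2 after NODE1 fails, with the same resource list.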

Cluster Service Installation Requirements
Operating Systems of Each Node
Windows Server 2003 Enterprise Edition supports the cluster service. Standard Edition does not support the cluster service.
Network Adapter Cards
One network adapter should be configured as Private for heartbeat and node-to-node traffic. A second network adapter card should be configured as Mixed so that if the Private card fails the Mixed network adapter can take over. Additional network cards can be configured as Public.
Disk Requirements
The data to be stored in the cluster's shared location must be on a separate bus from the operating system and on a bus that is accessible to all of the nodes in the cluster. The shared drive must use the same drive letter on all of the nodes in the cluster. The shared drive must be formatted as NTFS.
RAID
One of the primary points of clustering is to provide "high availability" for applications and data. Most data that is clustered is therefore stored on RAID arrays. Only hardware RAID is supported.
IP Addresses
During the installation of the cluster service several IP addresses will be required. One will be used for each node, one IP address is required for the cluster, one IP address will be required for the private network, and at least one IP address will be required for each network name resource or virtual server configured.
Network Name Resource Naming
Cluster and network name resources use the NetBIOS naming convention of 15 characters maximum. This is done so that they can be registered in both WINS and DNS for name to IP address resolution.
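A quick pre-installation check of planned names against the 15-character limit might look like the following. It is a minimal sketch: the function name is hypothetical, and it checks only the length rule stated above, not any other NetBIOS character restrictions.

```python
NETBIOS_MAX_LEN = 15  # maximum name length stated in the text


def fits_netbios_limit(name: str) -> bool:
    """Return True if the name is non-empty and at most 15 characters."""
    return 0 < len(name) <= NETBIOS_MAX_LEN
```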
Required Accounts
To install the cluster service, you must be an administrator of the machine where the cluster service is being installed. A service account must also be created in Active Directory prior to installing the cluster service. This account will be used to start the cluster service.
Applications and Services to Be Clustered
Each application will have specific requirements and procedures to install it into a cluster. Applications are normally installed after the cluster service is installed; they must be installed on each node required to support the application. WINS and DHCP can be migrated to the cluster service if they were installed before the cluster service was installed. DFS roots must be either migrated into the cluster or removed from the node. Check with the vendor of applications and services before you attempt to cluster them.
For more information, see Technical Overview of Windows Server 2003 Clustering.
Cluster Administration
Both graphical and command line tools exist for administration.
The graphical user interface tool called Cluster Administrator is launched by running Cluadmin.exe from the Run command or Command prompt.
After the cluster service is installed, this tool can be used to manage clusters locally or remotely. It allows the configuration of the cluster service and can be used to create groups and resources, perform manual failovers, and run tests.
Figure: A screen shot of cluster administrator:

The command line version of cluster administrator is Cluster.exe. This tool allows command line execution or scripting of almost anything that can be done with the graphical version of the tool.
Network Load Balancing Clusters (NLB)
Overview
In the past, one method of distributing IP connections was DNS round robin. DNS round robin has two drawbacks. First, it is not performance-based. Second, if one of the servers fails, connections to that server will continue to fail until either the server is repaired and brought back online or the record is manually removed from DNS.
The purpose of Network Load Balancing is to provide a performance-based distribution of IP connections to a group of servers running the same application. This does not imply that NLB is aware of the application that the client is connecting to. An additional application must be used to check that the application or service is running. If the application or service fails, a WLBS stop command can be issued locally or remotely to the server that has had the failure. Stopping the WLBS service triggers a process called convergence, and the failed server dynamically exits the cluster.
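Because NLB is not aware of the application, the "additional application" mentioned above has to decide when a host should leave the cluster. The sketch below shows one possible shape for that decision. The function names and the TCP connection probe are assumptions made for illustration; the only command used is WLBS stop, which the sketch returns rather than executes.

```python
import socket


def tcp_probe(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def action_for_host(host: str, port: int, probe=tcp_probe):
    """Return the command to run on an unhealthy host, or None if healthy.

    The probe is injectable so that a real monitor can substitute an
    application-level check (for example, requesting a test page).
    """
    if probe(host, port):
        return None
    # The application is not answering: take this host out of the cluster.
    return ["wlbs", "stop"]
```

A monitor could call action_for_host on a schedule and run the returned command on the affected host, which triggers convergence as described.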
The distribution of connections to an NLB cluster can be automatically controlled or manually assigned. A manual assignment might be preferred when one server has faster hardware than other servers in the NLB cluster. Up to 32 hosts can be in an NLB cluster. Another option is to configure a priority based or N-1 failover NLB cluster. This means that only one server in the cluster will respond to inbound requests. The server that will respond has the lowest host ID or "highest priority" to handle the traffic. Other servers in an N-1 failover cluster act as hot spares and will take over from lowest host ID to highest host ID in the event of a failed server.
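The priority-based (N-1 failover) behavior reduces to "the lowest surviving host ID handles the traffic." A minimal sketch, with a hypothetical function name:

```python
def active_host(host_ids, failed=()):
    """Return the host ID that handles all traffic in an N-1 failover cluster.

    The host with the lowest ID has the highest priority. If it fails, the
    next higher ID takes over. Returns None when no host is left.
    """
    alive = sorted(set(host_ids) - set(failed))
    return alive[0] if alive else None
```

With host IDs 1, 2, and 3, host 1 handles all requests; if host 1 fails, host 2 takes over, then host 3.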
NLB cluster hosts communicate with heartbeats once per second. The heartbeats are broadcast on the cluster's subnet. By default, five missed heartbeats trigger a process called convergence, which reconfigures the cluster. Convergence reforms the cluster by either removing a host that is no longer responding or by adding a new or repaired host that comes back online. Convergence takes less than 10 seconds from discovery of missing heartbeats or of heartbeats from a server coming online. Both the heartbeat period of one second and the trigger point of five missed heartbeats can be modified in the registry.
Requirements
Software Requirements
NLB clusters are supported on either Windows Server 2003 Standard or Enterprise Edition. The service is built into the operating system. By default "Network Load Balancing" shows up on the properties of the network adapter cards. If it is not displayed there, click on install and install the service. No other software is required.
Protocol Requirements
Applications must be able to use TCP or UDP protocols. Load balancing is only done on TCP or UDP or both over IP. The IP addresses on which load balancing is done are all manually assigned. DHCP assigned addresses are not supported. The address records must be manually entered into DNS. There is no support for dynamic DNS registrations.
Hardware Requirements
Two network adapters are preferred. The recommended implementation is that one card is used for cluster heartbeats and cluster traffic. Note that heartbeat traffic and cluster traffic are carried on the same network adapter card. No separate network exists for heartbeat traffic as there is on a shared device cluster. The second card is used for host-to-host or staging server-to-host communication. Using a second card allows communication with the server without reducing available bandwidth on the card used for client-to-cluster communication. You may have, for instance, a staging server used to update the content that clients access on the host.
Network Configuration Requirements
All hosts in an NLB cluster must be on the same broadcast subnet or on a VLAN. IP address to MAC address (the cluster's MAC address in this case) resolution is done with the ARP protocol which is a broadcast protocol, therefore all hosts in an NLB cluster must be on the same broadcast subnet to be able to respond.
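When planning addresses, the same-subnet requirement can be checked before any host is configured. The sketch below uses Python's standard ipaddress module; the function name and the sample addresses in the usage note are hypothetical.

```python
import ipaddress


def same_subnet(addresses, mask):
    """Return True if every address falls in one subnet under the given mask."""
    networks = {
        ipaddress.ip_interface(f"{addr}/{mask}").network for addr in addresses
    }
    return len(networks) == 1
```

For example, 192.168.1.10 and 192.168.1.11 with mask 255.255.255.0 share a subnet, while 192.168.1.10 and 192.168.2.11 do not.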
Compatible Applications
Compatible applications must use either TCP or UDP over IP as those are the only protocols on which load balancing occurs.
Scalability
Up to 32 hosts can be configured to converge into a single NLB cluster. An older method of distributing IP connections was DNS round robin. DNS round robin was manually configured and manually repaired if a server failed. However, DNS round robin can now be used to alternately connect clients to multiple NLB clusters. Connection failures will not occur unless all of the hosts in one of the clusters fail. As long as there is at least one host in each NLB cluster, client connections will not fail. So, NLB clusters are scalable in increments of 32 servers per cluster up to any size.
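The scaling arithmetic and the round-robin rotation across clusters can be illustrated as follows. This is a sketch only: the function names and the cluster addresses in the usage note are hypothetical, and the rotation merely imitates the order in which DNS round robin would hand out cluster addresses.

```python
from itertools import cycle

MAX_HOSTS_PER_CLUSTER = 32  # NLB limit stated in the text


def clusters_needed(total_hosts: int) -> int:
    """Number of NLB clusters required to hold total_hosts servers."""
    return -(-total_hosts // MAX_HOSTS_PER_CLUSTER)  # ceiling division


def round_robin(cluster_ips):
    """Yield cluster IP addresses in rotation, one per client lookup."""
    return cycle(cluster_ips)
```

A farm of 100 servers needs four clusters. With two cluster addresses, successive client lookups receive the first address, then the second, then the first again.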
Hardware Configuration
A best practice for each host in an NLB cluster is to have at least two network adapter cards: one enabled for NLB, and a second used for host-to-host or staging server-to-host communication. The NLB-enabled adapter should be configured in "unicast" mode. Unicast mode means that the adapter will only be used for client-to-cluster communication and will only support the cluster's IP address and MAC address. The NLB service will load balance on all IP addresses other than the host's dedicated address.
Creating a Network Load Balancing Cluster
Manually Creating an NLB Cluster
Creating an NLB cluster can be done manually by configuring each server with the same cluster name, cluster IP address, a unique host ID, and the same type and number of port rules. These settings are found on the properties of the network adapter cards on each server. After the parameters are configured, the host will begin sending out heartbeats once per second and will either form a cluster, if it is the first server configured, or join an existing cluster with the same parameters.
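Manual configuration is easy to get wrong on one server, so it can help to compare the planned settings before applying them. The sketch below checks only the rules stated above (same cluster name and IP address, unique host IDs, identical port rules). The class and function names are hypothetical, and nothing here reads settings from a real server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortRule:
    start: int
    end: int
    protocol: str   # "TCP", "UDP", or "Both"
    mode: str       # filtering mode, e.g. "Multiple" or "Single"


@dataclass
class HostConfig:
    cluster_name: str
    cluster_ip: str
    host_id: int
    port_rules: tuple   # tuple of PortRule, in rule order


def config_problems(hosts):
    """Return a list of reasons the hosts would not converge into one cluster."""
    if not hosts:
        return []
    problems = []
    ids = [h.host_id for h in hosts]
    if len(ids) != len(set(ids)):
        problems.append("host IDs are not unique")
    first = hosts[0]
    for h in hosts[1:]:
        if (h.cluster_name, h.cluster_ip) != (first.cluster_name, first.cluster_ip):
            problems.append(f"host {h.host_id}: cluster name or IP address differs")
        if h.port_rules != first.port_rules:
            problems.append(f"host {h.host_id}: port rules differ")
    return problems
```

An empty result means the planned settings are consistent under these rules; it does not prove that the hosts will converge.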
By Using Network Load Balancing Manager
The second method of creating a network load balanced cluster is to use the Network Load Balancing Manager. This tool is found in Administrative Tools on every Windows Server 2003 server. The NLB Manager tool is a graphical tool that can be used to create clusters and to manage existing clusters. It is a wizard-driven tool that simplifies the creation of NLB clusters. Using this tool, the parameters of the cluster are selected and then pushed to the servers in the cluster, so each server is configured correctly.
NLB Installation and Configuration
Configuration
As previously mentioned, NLB clusters can be created with either of two methods, either manually or by using the Network Load Balancing Manager. In this overview, we will initially focus on the manual configuration and interfaces. Manual configuration of an NLB cluster is done on the properties of the network adapter card. If Network Load Balancing is not present on the properties of the network adapter card, then install the network load balancing service on the network adapter.

Three sets of information exist that must be configured on the properties of network load balancing: Cluster Parameters, Host Parameters and Port Rules. A tab exists for each of these property sets.
Figure: The Cluster Parameters tab

All hosts in the cluster must use the same settings on the Cluster Parameters tab with regard to:
- Primary IP Address (Cluster Address) – This is the address that will be manually registered in DNS that clients will use to connect to the cluster.
- Subnet Mask – Used with the IP address to connect to the cluster.
- Full Internet Name (www.domainname.ext) – The name clients use to connect to a Web site for example.
- Network Address – This is a software generated MAC address based on the IP address of the cluster and whether the cluster is running in unicast or multicast mode.
- Unicast, Multicast, or IGMP Multicast:
- Unicast – means that this network adapter will only be bound to the cluster's IP address and MAC address. The host IP address will not be used on this adapter.
- Multicast – means that this network adapter will be bound to both the cluster IP address and MAC address AND the host IP address and the burned in MAC address on the card.
- IGMP – used for streaming applications.
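To illustrate how a Network Address can be generated from the cluster IP address and the operation mode, the sketch below follows the convention commonly documented for NLB: 02-BF followed by the four IP octets in unicast mode, 03-BF followed by the four octets in multicast mode, and 01-00-5E-7F followed by the last two octets in IGMP multicast mode. Treat these prefixes as an assumption and verify them against the Network Address field shown on the Cluster Parameters tab. The function name is hypothetical.

```python
import ipaddress


def nlb_cluster_mac(cluster_ip: str, mode: str = "unicast") -> str:
    """Derive a cluster MAC address from the cluster IP (assumed convention)."""
    o = ipaddress.IPv4Address(cluster_ip).packed  # the four octets w, x, y, z
    if mode == "unicast":
        octets = [0x02, 0xBF, *o]
    elif mode == "multicast":
        octets = [0x03, 0xBF, *o]
    elif mode == "igmp":
        octets = [0x01, 0x00, 0x5E, 0x7F, o[2], o[3]]
    else:
        raise ValueError(f"unknown mode: {mode}")
    return "-".join(f"{b:02X}" for b in octets)
```

Under this convention a cluster address of 192.168.1.100 in unicast mode gives 02-BF-C0-A8-01-64.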
The remote control setting affects whether or not this host can be controlled by remote administration commands, but it does not affect whether or not the host will converge into a cluster.
Figure: The Host Parameters tab

All hosts in the cluster must have unique settings on the Host Parameters tab with regard to:
- Priority (unique host identifier) – Each host requires a unique priority ID.
- This host ID is used to control individual hosts remotely.
- It is also used to modify the MAC address that this host sometimes uses in response to clients making a connection.
- It is used to determine which host (the host with the lowest ID) is to accept inbound connections in an N-1 failover configuration (more on this on the Port Rules tab)
- Dedicated IP address – This is the IP address and subnet mask that is used for normal host to host or client to host communication.
- Initial Host State – This is normally set to "Started," which means that the NLB service starts when the host operating system starts. "Retain suspended state after computer restarts" controls whether or not the suspended state is retained after a restart. A remote "suspend" command (which tells the host not to respond to remote NLB administration commands) can be issued to the host as an administrative control option.
The Dedicated IP address and Primary IP address must be configured in TCP/IP Properties (Dedicated on first page, Cluster or Primary on advanced page). We will come back to this after the port rules have been defined.
Figure: The Port Rules tab and Add/Edit Port Rule GUI

Port Rules
Port rules define which ports over TCP and/or UDP will be load balanced in this cluster. The number of rules, port ranges, protocol types, and filtering modes must match on each host or they will not converge into the cluster.
- Port Range – controls which port or range of ports will be controlled by this rule. Port rules can cover a single port or a range of ports. The default rule covers all ports.
- Protocols:
- TCP, UDP, or Both – These are the only protocols on which load balancing occurs. The selections here will be dependent on the application that is being load balanced.
- Filtering Mode
- Multiple host – Means that all of the hosts in the cluster will respond to inbound requests as determined by the NLB algorithm which is installed on each host with the network load balancing service.
- Affinity – This setting controls how subsequent connections from a particular client (based on the client IP address) are handled.
- None – This selection means that any host can handle subsequent connections from any client. This is used when no information is stored on the hosts on behalf of clients. A Web site would use None or no affinity because a client could get text, background, and graphics for a Web site from any server in the cluster. It would not matter which server responded to the inbound request. Another example would be an application that stores information in a client side cookie, no information on this particular host is stored on behalf of the client.
- Single – This selection is used when information is stored on a cluster host for a client. Subsequent connections from the same client (based on the client's IP address) will always return to the same cluster host so that the information stored for the client will be available.
- Class C – This selection is used if information is stored on behalf of a client, but there is concern that the client may be represented by multiple IP addresses, as can happen when the client is behind a proxy array. With this setting, connections from the client's IP address and from any other address in the same Class C range (the same first three octets) return to the same cluster host.
- Load Weight
- Equal – Means to let the NLB algorithm make an equal overall distribution of connections. This does not mean that each host will take only one connection and that the very next inbound connection will go to a different host. Overall it will create an even distribution of connections.
- Manual – If some hosts have a different hardware configuration, then the load on the hosts can be manually set.
- Single Host
- Handling Priority – This setting is used if "Single host" is selected. The host with the lowest host ID (the highest priority) will handle all of the inbound cluster requests. Other hosts in the cluster act as "hot spares." If the host with the lowest host ID fails, then the host with the next higher host ID takes over all of the cluster connections.
- Disabled – Selecting this option would disable this port rule for this host. If, for instance, the port rule controlled port 443 and you wanted another server to handle the overhead of encryption and decryption for SSL connections, this rule could be disabled for this host.
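The three affinity settings differ in which part of the client's address decides the destination host. The sketch below makes that concrete. It is not the NLB algorithm: the CRC-based mapping is a stand-in chosen only because it is deterministic, the function names are hypothetical, and including the client port in the None case is an assumption of this sketch.

```python
import ipaddress
import zlib


def affinity_key(client_ip: str, client_port: int, affinity: str):
    """Return the part of the client address that decides the target host."""
    if affinity == "None":
        # Any host may serve any connection, so each connection is keyed separately.
        return (client_ip, client_port)
    if affinity == "Single":
        # All connections from one client IP go to the same host.
        return (client_ip,)
    if affinity == "Class C":
        # All clients sharing the first three octets go to the same host.
        net = ipaddress.ip_network(f"{client_ip}/24", strict=False)
        return (str(net.network_address),)
    raise ValueError(f"unknown affinity: {affinity}")


def pick_host(key, host_ids):
    """Map a key to one host deterministically (stand-in for the NLB algorithm)."""
    digest = zlib.crc32("|".join(map(str, key)).encode())
    hosts = sorted(host_ids)
    return hosts[digest % len(hosts)]
```

With Single affinity, two connections from the same client IP on different source ports map to the same host. With Class C affinity, two proxy addresses such as 203.0.113.7 and 203.0.113.200 share one key and therefore one host.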
TCP/IP Properties
Now that the NLB parameters have been set, TCP/IP Properties must also be configured to support the NLB settings for the host and cluster IP addresses. This is done on the properties of the network adapter card on the host. Select "Internet Protocol (TCP/IP)," then "Properties."
Figure: Network Adapter Card Properties and TCP/IP Properties pages

The Dedicated IP address from the Host tab in NLB properties is entered on the General tab in TCP/IP properties. These two addresses must match.
Figure: TCP/IP Properties Advanced Page shown with "Add" to enter an additional IP address

The Cluster IP Address, from the Cluster Parameters tab in NLB Properties, is entered on the Advanced Page in TCP/IP properties.
After entering the Cluster and Dedicated IP addresses in TCP/IP properties, close TCP/IP properties. Open a command prompt, type WLBS Query, and press Enter. The output should show that this host ID has converged (past tense) into the cluster along with any other host IDs that are configured to be in the same cluster.
The very same parameters you used to configure an NLB cluster manually as above can be set using the Network Load Balancing Manager. Initially the cluster parameters are set using a wizard. After the cluster parameters and the first host to be in the cluster have been defined, other hosts can be added to the cluster. A screen shot of the pages for creating port rules from the Network Load Balancing Manager is shown below. The interface is the same as configuring NLB manually. However, the parameters only have to be configured one time, and then they can be pushed to additional servers. So the NLB Manager dramatically simplifies the creation and management of NLB clusters.
Figure: A screen shot of the pages for creating port rules from the Network Load Balancing Manager

Administration
Network Load Balancing cluster administration consists of starting, stopping, and checking the status of the cluster. You can manage an NLB cluster in two ways: from the command line with the Windows Load Balancing Service command (WLBS.exe), or with the graphical tool called Network Load Balancing Manager.
Command Line Administration
Remote WLBS commands can only be used if remote administration is allowed on the Cluster Parameters tab of NLB properties on the host. Below are some examples of commands that can be issued from either the command line or in scripts. These commands can be issued locally on one of the hosts in the cluster or remotely. The commands can be sent to a single host or to all of the hosts in a cluster.
The examples below are commands issued from a local host.
- WLBS query – this command will return the list of all the active hosts in the cluster.
- WLBS stop – this command will stop the WLBS service on the local host which will trigger convergence and the local host will exit the cluster.
- WLBS start – this command will start the WLBS service on the local host. When the WLBS service starts, the host will begin sending heartbeats, which will be recognized by other hosts in the cluster. The discovered heartbeats will trigger the convergence process and the host will join the cluster.
- WLBS drainstop – this command will cause the local host not to accept any more inbound requests, but it will not terminate existing client connections. This command allows a graceful exit of a host. When client connections no longer exist, a WLBS stop command can be issued to the host.
- WLBS suspend – this command will cause the host not to respond to any WLBS commands other than WLBS resume. If, for instance, four particular hosts were issued a suspend command, we could issue a WLBS stop command to the entire cluster without affecting these particular hosts.
- WLBS resume – this command causes hosts that have been issued a WLBS suspend command to begin responding to WLBS commands again.
- WLBS drain port 80 – this command will cause a drainstop only on the port rule that contains port 80. This could be a rule with the single port of 80, or it could be the rule which contains a range of ports that included port 80.
- WLBS disable port 80 – this command will disable the port rule that contains port 80. This could be a rule with the single port of 80, or it could be the rule which contains a range of ports that included port 80. Either way that port rule will be disabled on this host.
- WLBS enable port 80 – this command will enable the port rule that contains port 80. This could be a rule with the single port of 80, or it could be the rule which contains a range of ports that included port 80. Either way that port rule will be enabled on this host.
Many other commands are available for specific hosts by specifying the host IP address or the Host ID. These commands can also be issued to an entire cluster by using the name of the cluster (e.g. WLBS stop cluster1).
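Since these commands can be scripted, a small helper can keep the command lines consistent. The sketch below only builds the argument list and does not run anything, so it can be tried on any machine. The function name is hypothetical, and it covers only the simple actions listed above plus an optional cluster name as in the WLBS stop cluster1 example.

```python
SIMPLE_ACTIONS = {"query", "start", "stop", "drainstop", "suspend", "resume"}


def build_wlbs_command(action: str, target: str = None):
    """Build a WLBS command line as a list of arguments.

    action is one of the simple actions from the list above. target is an
    optional cluster name, as in "WLBS stop cluster1".
    """
    if action not in SIMPLE_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    cmd = ["wlbs", action]
    if target:
        cmd.append(target)
    return cmd
```

On a cluster host, the returned list could be passed to subprocess.run to execute the command; that step is left out so the sketch has no side effects.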
Network Load Balancing Manager Administration
The WLBS.exe commands can be issued as above from the command line, or they can be issued from the Network Load Balancing Manager graphical tool. In this tool, you can select the name of the cluster to issue commands to the entire cluster, select a host to issue commands to only that host, or select a port rule to issue a command that controls that port rule. A screen shot of commands available to issue to a single host is shown below.

The intention of the preceding pages is to provide an overview of the two different types of clusters, how they are configured, and why they are used.

