Cluster Design Considerations

This section contains the following information:

Cluster Deployment Sizing Guidance

Publisher Guidelines

Subscriber Guidelines

Providing Sufficient Bandwidth Between Publisher and Subscribers

Round-Trip Time Considerations for Geographically Distributed Clusters

Implementing Policy Manager Zones for Geographical Regions

This section contains recommendations on how to optimize the Publisher and Subscriber constraints when deploying a Policy Manager cluster.

Cluster Deployment Sizing Guidance

Cluster deployment sizing should not be based on raw performance numbers.

In ClearPass 6.11.0 and later, the maximum single cluster size is 32 servers. This includes the publisher, standby publisher, subscribers, dedicated insight server, and standby insight servers.

When designing large clusters, we recommend contacting your Aruba account team to discuss options. For details on selecting ClearPass servers, refer to the Policy Manager Ordering and Sizing Guide, available for download from the HPE Networking Support Portal.

To determine the optimum sizing for a Policy Manager cluster:

1. Determine how many endpoints need to be authenticated.

a. The number of authenticating endpoints can be determined by taking the number of users times the number of devices per user.

b. To this total, add the other endpoints that perform only MAC (Media Access Control) authentication, such as printers and other non-authenticating endpoints.

2. Take into account the following factors:

a. Number and type of authentications and authorizations:

MAC authentication/authorizations vs. PAP (Password Authentication Protocol) vs. EAP-MSCHAPv2 (EAP Microsoft Challenge Handshake Authentication Protocol version 2) vs. PEAP-MSCHAPv2 (PEAP is Protected Extensible Authentication Protocol) vs. PEAP-GTC (Generic Token Card) vs. EAP-TLS (EAP Transport Layer Security, see RFC 5216)

Active Directory vs. local database vs. external SQL datastore

No posture assessment vs. in-band posture assessment in the PEAP tunnel vs. HTTPS-based posture assessment done by OnGuard.

b. RADIUS (Remote Authentication Dial-In User Service) accounting load.

c. Operational tasks taking place during authentications, such as configuration activities, administrative tasks, replication load, periodic report generation, and so on.

d. Disk space consumed.

Note that Policy Manager writes copious amounts of data for each transaction (this data is displayed in the Access Tracker).

3. Then pick the number of Policy Manager hardware appliances you would need, with redundancy ranging from (N+1) to full redundancy, depending on the needs of the customer.
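The arithmetic in steps 1 and 3 can be sketched as follows. This is a minimal illustration, not a sizing tool: the function names and the example figures are hypothetical, the base appliance count N is an input you determine from the factors in step 2, and reading "full redundancy" as 2N is an assumption.

```python
from __future__ import annotations


def total_endpoints(users: int, devices_per_user: float, mac_only_endpoints: int) -> int:
    """Step 1: authenticating endpoints plus MAC-authentication-only endpoints."""
    authenticating = round(users * devices_per_user)
    return authenticating + mac_only_endpoints


def appliances_with_redundancy(base_count: int) -> tuple[int, int]:
    """Step 3: return (N+1, full redundancy) counts for a base count N.

    Full redundancy is assumed here to mean doubling the base count (2N).
    """
    if base_count < 1:
        raise ValueError("base_count must be at least 1")
    return base_count + 1, base_count * 2


if __name__ == "__main__":
    # Hypothetical site: 5000 users, 2.5 devices each, 1200 printers and similar.
    print(total_endpoints(users=5000, devices_per_user=2.5, mac_only_endpoints=1200))
    print(appliances_with_redundancy(3))
```

The endpoint total is only the starting point. The authentication mix, accounting load, operational tasks, and disk consumption listed in step 2 still determine how many appliances that total requires.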

Publisher Guidelines

Setting Up a Standby Publisher

Policy Manager allows you to designate one of the Subscriber servers in a cluster as the Standby Publisher, so that this Subscriber is automatically promoted to active Publisher status if the Publisher goes out of service. This keeps any service degradation to a minimum. For details, see Deploying the Standby Publisher.

Publisher Sizing

The Publisher server must be sized appropriately because it handles database write operations from all Subscribers simultaneously. The Publisher must also be capable of handling the total number of endpoints within the cluster and of processing remote work directed to it when guest-account creation and onboarding are occurring.

Publisher Deployment Guidance

In a world-wide large-scale deployment, not all Subscriber servers are equally busy. To determine the maximum request rate that must be handled by the Publisher server, examine the cluster's traffic pattern for busy hours and estimate the traffic load for each Subscriber server, adjusting for time zone differences.
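One way to do this estimate is to express each Subscriber's expected load as 24 hourly values in local time, shift each profile to UTC, and sum them. The sketch below assumes whole-hour UTC offsets. The site names, offsets, and rates in the usage note are hypothetical.

```python
from __future__ import annotations


def office_hours_profile(busy_rate: float, idle_rate: float) -> list[float]:
    """Hypothetical local-time profile: busy from 09:00 to 16:59, idle otherwise."""
    return [busy_rate if 9 <= hour <= 16 else idle_rate for hour in range(24)]


def publisher_peak_load(subscribers: dict[str, tuple[int, list[float]]]) -> tuple[int, float]:
    """Return (utc_hour, total_rate) for the busiest hour seen by the Publisher.

    Each value is (utc_offset_hours, local_profile), where local_profile holds
    the estimated request rate toward the Publisher for local hours 0..23.
    """
    totals = [0.0] * 24
    for utc_offset, local_profile in subscribers.values():
        if len(local_profile) != 24:
            raise ValueError("each profile needs 24 hourly values")
        for local_hour, rate in enumerate(local_profile):
            # Convert the Subscriber's local hour to UTC before summing.
            totals[(local_hour - utc_offset) % 24] += rate
    peak_hour = max(range(24), key=lambda hour: totals[hour])
    return peak_hour, totals[peak_hour]
```

For example, with three Subscribers at UTC+0, UTC-5, and UTC+9 that each send 100 requests per second during local office hours and 5 otherwise, the busiest hour is where the first two overlap, and the Publisher must handle about 205 requests per second rather than 300.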

In a large-scale deployment, isolate the Publisher server to allow it to handle the maximum amount of traffic possible.

To keep the load on the Publisher as low as possible in a large-scale deployment (aside from API requests from Subscribers and the outbound replication traffic to Subscribers), the Publisher should not receive any authentication requests or Guest/Onboard requests directly.

If the worker traffic sent from the Subscriber servers is expected to fully saturate the capacity of the Publisher server, Insight should not be enabled on the Publisher server. If the Publisher server has spare capacity, it can be used to support the Policy Manager Insight database. However, monitor the Publisher server's capacity and performance carefully.

Subscriber Guidelines

Guidelines for a Subscriber deployment are as follows:

Use the Nearest Subscriber. Guests and Onboard clients should be directed to the nearest Subscriber server. From the client’s point of view, the internal API call to the Publisher is handled transparently. The best response times for static resources are obtained if the server is nearby.

Use Subscribers as Workers. Subscribers should be used as workers that process the following:

Authentication requests (for example, RADIUS, TACACS+, Web-Auth)

Online Certificate Status Protocol (OCSP) requests

Static content delivery (for example, images, CSS, JavaScript)

Avoid Sending Worker Traffic to the Publisher. Best practice is to avoid sending "worker traffic" to the Publisher, as the Publisher services API requests from Subscribers, handles the resulting database writes, and generates replication changes to send back to the Subscribers.

Verify OCSP Checks for EAP-TLS if Onboard Is Being Used. If Onboard is used, ensure that the EAP-TLS authentication method in Policy Manager is configured to perform localhost OCSP (Online Certificate Status Protocol) checks.

When a configuration backup file from an earlier ClearPass version that includes a virtual IP (VIP) address is restored on a 6.11 installation, the restoration is successful on the standalone ClearPass server and includes the VIP address. However, when a subscriber is rejoined to the cluster the VIP address is not reflected in the subscriber's configuration because it is not enabled by default. After the subscriber has rejoined the cluster, go to Administration > Server Manager > Server Configuration > Virtual IP Settings. Select the subscriber in the Secondary Node drop-down list, select the Enabled check box, and then Save.

Providing Sufficient Bandwidth Between Publisher and Subscribers

In a large-scale deployment, reduced bandwidth or high latency on the link (greater than 200 ms) delivers a lower-quality user experience for all users of that Subscriber, even though static content is delivered locally almost instantaneously.

For reliable operation of each Subscriber, ensure that there is sufficient bandwidth available for communications with the Publisher. For basic authentication operations, there is no specific requirement for high bandwidth. However, the number of round-trips to complete an EAP (Extensible Authentication Protocol) authentication could cause delay for the end user.

Traffic Flows Between Publisher and Subscriber

The traffic flows between the Publisher and Subscriber servers include:

Basic monitoring of the cluster. Monitoring operations generate a small amount of traffic.

Time synchronization for clustering. Generates standard Network Time Protocol (NTP) traffic.

Policy Manager configuration changes. This is not a significant bandwidth consumer.

Zone Cache. The amount of traffic depends on the authentication load and other details of the deployment. Cached information is metadata and is not large. This data is replicated only within the Policy Manager zone.

Guest/Onboard dynamic content proxy requests. This is essentially a web page and averages approximately 100 KB.

Guest/Onboard configuration changes. Only the changes to the database configuration are sent, and this information is typically small in size (approximately 10 KB).
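The two approximate sizes above allow a rough estimate of the Guest/Onboard share of the Publisher link. The sketch below is a back-of-the-envelope calculation only: the request rates are hypothetical inputs, decimal units are assumed (1 KB = 1000 bytes), and the other traffic flows in the list are not included.

```python
GUEST_PAGE_KB = 100.0     # approximate size of a proxied Guest/Onboard dynamic page
CONFIG_CHANGE_KB = 10.0   # approximate size of a Guest/Onboard configuration change


def guest_onboard_mbps(page_requests_per_sec: float, config_changes_per_sec: float) -> float:
    """Estimate Guest/Onboard traffic between a Subscriber and the Publisher in Mbps."""
    kilobytes_per_sec = (
        page_requests_per_sec * GUEST_PAGE_KB
        + config_changes_per_sec * CONFIG_CHANGE_KB
    )
    # KB/s -> kilobits/s (x8) -> Mbps (/1000), using decimal units.
    return kilobytes_per_sec * 8 / 1000


if __name__ == "__main__":
    # Hypothetical load: 10 proxied page requests and 2 configuration changes per second.
    print(f"{guest_onboard_mbps(10, 2):.2f} Mbps")
```

At the hypothetical rates shown, the estimate comes to roughly 8 Mbps, which illustrates why sustained Guest/Onboard activity, rather than basic authentication, tends to drive the bandwidth needed on this link.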

Round-Trip Time Considerations for Geographically Distributed Clusters

It's important to take the delay between a Policy Manager server and a NAD (Network Access Device) or NAS (Network Access Server), such as a controller or switch, into consideration when building geographically distributed clusters.

In a large geographically dispersed cluster, the worst case round-trip time (RTT) between a NAS/NAD and all potential servers in the cluster that might handle authentication is a design consideration.

Aruba recommends that the round-trip time between the NAD/NAS and a Policy Manager server should not exceed 600 ms.

The acceptable delay between cluster servers is less than 100 ms (RTT less than 200 ms).

The link bandwidth should be greater than 10 Mbps (megabits per second).

It's possible to configure a NAD/NAS to point at multiple RADIUS servers, either for load balancing or failover.

For example, a NAD/NAS in Paris could point to a Policy Manager server in London as a backup RADIUS server. That's not a problem as long as the round-trip time guidelines are adhered to.
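The guidelines in this section can be turned into a simple check against measured values. The thresholds below come from the text above. The function names and the sample measurements are hypothetical, and how you measure RTT and bandwidth is left to your own tooling.

```python
from typing import List

MAX_NAD_RTT_MS = 600.0      # NAD/NAS to Policy Manager server: should not exceed this
MAX_CLUSTER_RTT_MS = 200.0  # between cluster servers: RTT should be below this
MIN_LINK_MBPS = 10.0        # link bandwidth should be above this


def nad_rtt_ok(rtt_ms: float) -> bool:
    """True if the NAD/NAS-to-server round-trip time is within the guideline."""
    return rtt_ms <= MAX_NAD_RTT_MS


def cluster_link_problems(rtt_ms: float, bandwidth_mbps: float) -> List[str]:
    """Return a list of guideline violations for a link between two cluster servers."""
    problems = []
    if rtt_ms >= MAX_CLUSTER_RTT_MS:
        problems.append(f"RTT {rtt_ms} ms is not below {MAX_CLUSTER_RTT_MS} ms")
    if bandwidth_mbps <= MIN_LINK_MBPS:
        problems.append(f"bandwidth {bandwidth_mbps} Mbps is not above {MIN_LINK_MBPS} Mbps")
    return problems


if __name__ == "__main__":
    # Hypothetical measurement: a Paris NAD using a London server as backup RADIUS.
    print(nad_rtt_ok(12.0))
    print(cluster_link_problems(rtt_ms=250.0, bandwidth_mbps=5.0))
```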

Implementing Policy Manager Zones for Geographical Regions

Policy Manager zones exist to control the replication of information between servers in a cluster. Included in this control is the replication of the Zone Cache, which contains the endpoints' run-time state information.

The Zone Cache is replicated across all servers in a zone, not across all servers in the cluster. If zoning has not been configured, traffic flows between the Publisher and Subscriber servers, as well as between all the Subscriber servers in the cluster.

Run-Time Information

The run-time state information includes:

Roles and postures of the connected entities

Connection status of all endpoints running OnGuard

Machine authentication state

Session information used for Change of Authorization (CoA)

Information about which endpoints are on which NAS/NAD

Policy Manager uses run-time state information to make policy decisions across multiple transactions.

When a Cluster Spans WAN Boundaries and Geographic Zones

In a deployment where a cluster spans WAN (Wide Area Network) boundaries and multiple geographic zones, it's not necessary to share run-time state information across all the servers in the cluster.

For example, endpoints present in one geographical area are not likely to authenticate or be present in another area. It's therefore more efficient from a network usage and processing perspective to restrict the sharing of such run-time state information to a specific geographical area.

Certain cached information is replicated only on the servers within a Policy Manager zone. In a large-scale deployment with multiple geographical areas, multiple zones should be used to reduce the amount of data that needs to be replicated over a wide-area network.
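To see the effect of zoning, it can help to count which pairs of servers share run-time state under a given zone assignment. The sketch below is only a counting aid under that simplification, not a model of the actual replication mechanism. The server and zone names are hypothetical.

```python
from __future__ import annotations

from collections import defaultdict
from itertools import combinations


def zone_cache_pairs(zone_of: dict[str, str]) -> set[tuple[str, str]]:
    """Return the server pairs that share Zone Cache data (same zone only)."""
    members: dict[str, list[str]] = defaultdict(list)
    for server, zone in zone_of.items():
        members[zone].append(server)
    pairs: set[tuple[str, str]] = set()
    for servers in members.values():
        pairs.update(combinations(sorted(servers), 2))
    return pairs


if __name__ == "__main__":
    servers = ["emea-1", "emea-2", "emea-3", "amer-1", "amer-2", "amer-3"]
    one_zone = {name: "default" for name in servers}
    two_zones = {name: name.split("-")[0] for name in servers}
    print(len(zone_cache_pairs(one_zone)), len(zone_cache_pairs(two_zones)))
```

With six servers in a single zone, every pair shares state (15 pairs). Splitting them into two regional zones of three leaves 6 pairs, none of which cross the wide-area link.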

Zones and the Persistent Agent

A persistent agent attempts to establish communications with a Policy Manager server in the same zone; if that is not possible, it contacts a server in another zone.

Zone configurations allow for fairly deterministic control of where the persistent agent will send its health information. At minimum, the agent health information should go to a server in the same zone as the authentication request.

From a design perspective, for large geographically dispersed deployments, the design goal should be for agent health information and authentication requests to be sent to the same cluster server. Targeting authentication requests to a specific server is easily accomplished with NAS configuration.
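The preference order described above (same zone first, another zone only as a fallback) can be illustrated with a short sketch. This is not the agent's actual selection algorithm, which this section does not describe. The function name, the alphabetical tie-break, and the server names are hypothetical.

```python
from typing import Dict, Optional, Set


def pick_server(agent_zone: str, server_zones: Dict[str, str], reachable: Set[str]) -> Optional[str]:
    """Prefer a reachable server in the agent's zone, else any reachable server."""
    same_zone = sorted(
        name for name, zone in server_zones.items()
        if zone == agent_zone and name in reachable
    )
    if same_zone:
        return same_zone[0]  # tie-break by name, purely for determinism
    other_zone = sorted(
        name for name, zone in server_zones.items()
        if zone != agent_zone and name in reachable
    )
    return other_zone[0] if other_zone else None


if __name__ == "__main__":
    zones = {"cppm-par-1": "EMEA", "cppm-lon-1": "EMEA", "cppm-nyc-1": "AMER"}
    print(pick_server("EMEA", zones, {"cppm-lon-1", "cppm-nyc-1"}))
    print(pick_server("EMEA", zones, {"cppm-nyc-1"}))
```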

Creating Geographical Zones in Policy Manager

You can configure zones in Policy Manager to match with the geographical areas in your deployment. You can define multiple zones per cluster. Each zone has a number of Policy Manager servers that share their runtime state.

To create geographical zones in Policy Manager:

1. Navigate to the Administration > Server Manager > Server Configuration page.

Figure 1  Manage Policy Manager Zones Link

2. Click the Manage Policy Manager Zones link. The Policy Manager Zones dialog opens.

3. Select Click to add.... A blank field appears in the dialog.

Figure 2  Adding a Policy Manager Zone

4. Enter the name of the new Policy Manager zone.

5. To create additional Policy Manager zones, repeat Steps 3 and 4.

6. When finished, click Save. You see the message, "Policy Manager Zones modified successfully."

Policy Manager Zone Deployment Guidance

Guidance for deploying Policy Manager zones is as follows:

1. In a large-scale deployment, create one Policy Manager zone for each major geographical area of the deployment.

2. To handle RADIUS authentication traffic in each region, configure the region’s networking devices with the Policy Manager servers in the same zone.

3. If additional authentication servers are required for backup, you can specify one or more Policy Manager servers located in a different zone, but Aruba recommends that you deploy remote servers that have the best connection, that is, the lowest latency, highest bandwidth, and highest reliability.

4. There may be cases in which the RADIUS server settings on the network infrastructure are configured to use remote Policy Manager servers that are outside of its primary geographic area.

In this scenario, the replication of the runtime states might be relevant. Consider this behavior during the design and deployment of a distributed cluster of Policy Manager servers.
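When choosing backup servers from another zone, the "best connection" criteria in item 3 can be applied as a simple ranking. The sketch below sorts by latency first, then bandwidth, then reliability. That priority order is an assumption, as are the class name, the availability score, and the sample values.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CandidateServer:
    name: str
    zone: str
    rtt_ms: float          # measured round-trip time from the region
    bandwidth_mbps: float  # available link bandwidth
    availability: float    # hypothetical reliability score between 0 and 1


def rank_backups(local_zone: str, candidates: List[CandidateServer]) -> List[CandidateServer]:
    """Rank servers outside local_zone: lowest RTT, then highest bandwidth and availability."""
    remote = [server for server in candidates if server.zone != local_zone]
    return sorted(remote, key=lambda s: (s.rtt_ms, -s.bandwidth_mbps, -s.availability))


if __name__ == "__main__":
    pool = [
        CandidateServer("cppm-par-1", "EMEA", 2.0, 1000.0, 0.999),
        CandidateServer("cppm-nyc-1", "AMER", 80.0, 200.0, 0.999),
        CandidateServer("cppm-sin-1", "APAC", 170.0, 100.0, 0.995),
    ]
    print([server.name for server in rank_backups("EMEA", pool)])
```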