AOS 10

The HPE Aruba Networking Wireless Operating System 10 or AOS-10 suite of technical documentation covers a range of design concepts and configuration examples that will help network architects and administrators design and configure the most optimal Wi-Fi networks for their needs.

1 - Components

A description of the different elements of the network infrastructure involved in running an AOS-10 powered network. These components include hardware, software, and management of Aruba devices through HPE’s sophisticated cloud platforms and services.

1.1 - Overview

An introduction to AOS 10 components covering deployment environments, hardware, software, licensing, and personas.

HPE GreenLake

The HPE GreenLake edge-to-cloud platform is a secure, cloud-based platform that allows you to view and control your hybrid cloud estate. The platform unifies and simplifies IT operations by providing an intuitive, self-service dashboard where you can deploy and run cloud services used to provision and manage networking, compute, and storage infrastructure and perform day to day operations.

The HPE GreenLake platform provides the following:

  • Workspaces – Create, manage, and monitor workspaces that contain devices, applications, and services.

  • User Management – Manage users, roles, and access permissions to workspaces.

  • Applications – Deploy and access applications used to configure, manage, and monitor your infrastructure and operations.

  • Subscriptions – Manage device and service subscriptions for each workspace.

  • Devices – Manage device inventory and subscription assignments.

Aruba Activate and Aruba Central were previously independent cloud services; both are now fully integrated into the HPE GreenLake platform. To support an AOS 10 deployment, each organization requires one workspace with a Central application deployed for its region. For large organizations, a workspace can support multiple Central applications, each within a different region if required.

HPE GreenLake Platform

APs and Gateways are added to a workspace automatically or manually. When new APs and Gateways are purchased and a workspace for your organization already exists, they are automatically added to the workspace’s device inventory, similar to how devices were added to Activate in the past. Each AP and Gateway is assigned to a Central application, and a subscription is assigned automatically or manually.

When a new or provisioned AP or Gateway boots, it automatically contacts device.arubanetworks.com, which redirects the device to the HPE GreenLake platform. Based on its application and subscription assignment, the device is then redirected to the Central application to which it is assigned. Devices with no Central application or subscription assignment will not be managed by Central.
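The onboarding flow above reduces to a simple assignment check. The following minimal Python sketch illustrates that logic only; the Device fields and function name are hypothetical and are not an HPE GreenLake API.

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Device:
        serial: str
        central_app: Optional[str]   # e.g. the Central instance assigned in the workspace
        subscription: Optional[str]  # e.g. "Foundation" or "Advanced"

    def onboarding_target(device: Device) -> str:
        """Where a booting device lands after contacting device.arubanetworks.com,
        which forwards it to the HPE GreenLake platform."""
        if device.central_app and device.subscription:
            return f"managed by {device.central_app}"
        return "unmanaged (no Central application or subscription assigned)"

    print(onboarding_target(Device("CN1234XYZ", "Central-US-West", "Advanced")))
    print(onboarding_target(Device("CN5678ABC", None, None)))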

Aruba Central

HPE Aruba Networking Central is an application under the HPE GreenLake platform that simplifies the deployment, management, and optimization of WLAN, LAN, VPN, and SD-WAN. The Central application is deployed within a workspace in the HPE GreenLake platform, and each AOS 10 deployment requires one instance of the Central application. Workspaces for larger organizations may include multiple Central applications if required, where each Central application instance supports devices deployed within a specific geographical region such as North America or EMEA.

Aruba Central eliminates the time-consuming manual process of moving information from one management platform to another or trying to correlate troubleshooting information across multiple views. The use of integrated AI-based ML, IoT device profiling for security, and Unified Infrastructure management accelerates the edge-to-cloud transformation for today’s Intelligent Edge.

Aruba Central is a cloud-native, microservices-based platform that provides the scalability and resilience needed for critical environments. Compared to an on-premises solution, Central is more adaptive, predictable, and horizontally scalable with built-in redundancy. Central also provides seamless access to Aruba ClearPass Device Insight, Aruba User Experience Insight (UXI), and Aruba Meridian, delivering AI/ML and location-based services for network visibility and insight.

Aruba Central has the following key features:

  • Cloud-native enterprise campus WLAN software

  • AI Insights for WLAN, switching, and SD-WAN

  • Advanced IPS/IDS threat defense management

  • Mobile application-based network installation

  • Unified management for access and WAN edge

  • Live chat and an AI-based search engine

  • Cloud, on-premises and as-a-Service (aaS) options

ArubaOS 10

ArubaOS 10 (AOS 10) is the distributed network operating system working with Aruba Central that controls Aruba Access Points (APs) and Gateways. With its flexible architecture, network teams can deliver reliable and secure wired and wireless connectivity for small offices, mid-sized branches, campuses, and remote workers. Working in tandem with cloud-native Aruba Central, AOS 10 provides the management and control to deliver greater scalability, enhanced security, AI-powered optimizations for faster problem resolution and unified management of all APs and gateways.

AOS 10 differs from AOS 8 in many ways. Because the AOS 10 operating system is unified, the same firmware version can be implemented for all AP and Gateway deployment types. IT organizations no longer have to manage and maintain different AOS 8 versions and device modes to support campus, SD-Branch, and Microbranch deployments. APs and Gateways running AOS 10 can support multiple personas.

The following is a summary of key architectural differences in AOS 10 from previous AOS releases:

  • The management / control plane for APs and Gateways resides within the cloud platform. APs no longer rely on Controllers for management, configuration, and operation.

  • Gateways for WLAN deployments are completely optional. AOS 10 APs can locally bridge user traffic or tunnel user traffic to a resilient Gateway cluster based on business and scaling needs.

  • The AOS version and AP mode no longer determine the forwarding architecture. Customers can select the forwarding mode for each WLAN profile.

  • Merges SD-Branch and Microbranch functionality into a single release.

  • APs and Gateways may implement different AOS 10 versions (multi version support).

AOS 10 is designed to support networks of all sizes and can easily scale to accommodate growing network requirements. It streamlines operations, device-, user-, and application-level policy enforcement, and AI-powered troubleshooting and optimization. As part of Aruba’s Edge Services Platform (ESP) architecture, Aruba Central along with AOS 10 delivers cloud-native management and control services across the wired network, Wireless Local Area Network (WLAN), and WAN through a single console. AOS 10 also offers a fully cloud-managed SD-WAN solution, allowing organizations to adopt SD-WAN capabilities coupled with identity-based and role-based traffic segmentation, enforced with a built-in firewall and supported by IDS or IPS and other security functions.

AOS 10 Architecture

Supported Devices

AOS 10 is supported on specific models of APs and Gateways for new deployments or migrations. As the list of supported AP and Gateway models for each AOS 10 release will evolve over time, the current list of supported APs and Gateways can be referenced at the HPE Aruba Networking Documentation Center.

All AP models ship with either AOS 8 or AOS 10 that supports connectivity to the cloud. All Gateway models currently ship with a version of AOS 8 that can communicate with the cloud platform, with the exception of the 9114 and 9106 models, which ship with AOS 10. A new Gateway that is deployed using one touch provisioning (OTP), zero touch provisioning (ZTP), or full setup, and that is assigned to a Central instance and licensed, will be automatically upgraded to an SD-WAN image by the cloud platform. The Gateway can then be automatically upgraded to AOS 10 using the version compliance feature in Central.

Deployments with supported APs and Gateways running AOS 8 can also be migrated to AOS 10. The exact migration procedure that you follow to upgrade your APs and Gateways to AOS 10 will vary by deployment and is covered in the AOS 10 Adoption guide.

Network Roles

APs and Gateways running AOS 10 can adopt network roles based on the Central configuration group they are assigned. The network role that is assigned to a configuration group determines the configuration and monitoring options that are exposed for devices within each configuration group. For example, APs assigned to a configuration group with a Campus / Branch network role will have different configuration and monitoring options exposed than APs assigned to a configuration group with a Microbranch network role.

A configuration group can contain APs only, Gateways only, or both, depending on the assigned network roles. Not all network roles are compatible; incompatible roles require dedicated configuration groups. For example, AP configuration groups with a Microbranch network role do not support Gateways, and Gateway configuration groups with a VPN Concentrator network role do not support APs.

AP Network Role    Gateway Network Role    Can be Mixed
Campus / Branch    Mobility                Yes
Campus / Branch    Branch                  Yes
Campus / Branch    VPN Concentrator        No
Microbranch        Mobility                No
Microbranch        Branch                  No
Microbranch        VPN Concentrator        No
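The compatibility rules in the table can be expressed as a small lookup, which can be handy when scripting group planning. The following Python sketch simply encodes the table above; the function name is illustrative.

    # Whether APs with a given network role can share a configuration group
    # with Gateways of a given network role (encoding of the table above).
    CAN_MIX = {
        ("Campus / Branch", "Mobility"): True,
        ("Campus / Branch", "Branch"): True,
        ("Campus / Branch", "VPN Concentrator"): False,
        ("Microbranch", "Mobility"): False,
        ("Microbranch", "Branch"): False,
        ("Microbranch", "VPN Concentrator"): False,
    }

    def can_share_group(ap_role: str, gateway_role: str) -> bool:
        return CAN_MIX.get((ap_role, gateway_role), False)

    assert can_share_group("Campus / Branch", "Mobility")
    assert not can_share_group("Microbranch", "Branch")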

Each configuration group can support one network role for APs and Gateways. The number of configuration groups you deploy will vary based on preference for how your configuration is organized, the type of AP and Gateway network roles that are required and the complexity of your deployment. As a general rule of thumb, a configuration group is required for each group of devices that share the same network role and common configuration.

When a configuration group is created for APs or Gateways, a network role must be selected. The network role cannot be changed once the configuration group has been created and saved. If an AP or Gateway network role needs to be changed, the device must be moved to a new configuration group with the new network role. An example of a network role assignment for APs and Gateways is depicted below.

Existing configuration groups can also be edited, and a new device type assigned. The device type and network role that can be added will be dependent on the AP or Gateway network role configured in the group. For example, you may edit an existing mobility Gateway configuration group and add APs with a campus / branch network role or vice versa.

Access Points

The configuration of AOS 10 APs is determined by the configuration group they are assigned. The network role assigned to a configuration group is selected based on how the AP will be used and influences the configuration and monitoring options that are exposed. For example, configuration groups for Microbranch APs include additional configuration and monitoring options that are not applicable or exposed in configuration groups supporting campus / branch APs.

The following network roles can be assigned to a configuration group supporting AOS 10 APs:

  • Campus / Branch – Used to configure groups of APs deployed in a campus or branches that are connected via uplink ports to a local area network. APs can bridge or tunnel user traffic and may also be deployed as Mesh Portals or Mesh Points.

  • Microbranch – Used to configure individual APs deployed in remote small offices or home offices that securely connect to a private network over the public internet. Microbranch APs support a number of different forwarding modes and routing, and may also connect to multiple Internet services.

Configuration groups with a Campus / Branch network role may include Gateways with a Mobility or Branch network role. Configuration groups with a Microbranch network role cannot include Gateways but may include switches.

AP Network Roles

Gateways

The configuration of AOS 10 Gateways is determined by the configuration group they are assigned. The network role assigned to a configuration group is selected based on how the Gateways will be used and influences the configuration and monitoring options that are exposed. For example, configuration groups for branch Gateways and VPN concentrators include additional VPN, WAN and routing configuration which are not applicable or exposed for configuration groups supporting mobility Gateways.

The following network roles can be assigned to a configuration group supporting AOS 10 Gateways:

  • Mobility – Used to configure Gateways that terminate tunneled user traffic from APs and/or UBT switches in large offices and campuses.

  • Branch – Used to configure Gateways deployed in remote branch offices. Branch Gateways support mobility functions and can terminate tunneled user traffic from APs and/or UBT switches. Branch Gateways also offer advanced routing, WAN connectivity and WAN path optimization and can be deployed at the edge of a branch network in place of a traditional WAN router or firewall.

  • VPN Concentrator – Used to configure Gateways that terminate secure orchestrated overlay tunnels established from branch Gateways and/or microbranch APs.

Configuration groups with a Mobility or Branch network role may include APs with a Campus / Branch network role and optionally switches but not APs with a Microbranch network role. Configuration groups with a VPN Concentrator role cannot include APs but may optionally include switches.

Gateway Group Network Roles

2 - Design Fundamentals and Concepts

Design fundamentals and concepts of AOS 10 covers design options for an AOS 10 installation as well as guidance, in depth information, and best practices for operating AOS 10 networks.

2.1 - Access Point Deployments

Options for an AP only deployment and discussion of the scaling constraints and design options.

Access Points are the underpinning of the Campus wireless architecture. To provide maximum flexibility, an ArubaOS 10 AP can support WLANs configured to either bridge or tunnel user traffic. A special WLAN is also supported that can offer both forwarding types. One important feature of the AOS 10 architecture is that APs are no longer dependent on Gateways. Customers are free to deploy APs with or without Gateways depending on their traffic forwarding needs, feature requirements, and size of the network.

An AOS 10 AP-only deployment consists only of APs, and no Gateways. The APs are strategically deployed in one or more buildings to establish RF coverage areas. Since no Gateways are used, the APs publish WLANs that are configured to bridge the user traffic directly onto the wired network at the access layer.

AP Only Topology

An AP-only deployment can be considered for any environment where you must provide Wi-Fi access to client devices but do not require tunneling or other advanced features offered by Gateways. This includes:

  • Small offices and branches
  • Regional branches or headquarters
  • Warehouses
  • Campuses

There are some environments where AP-only deployments might not be suitable. For example, large hospitals and medical centers where high scale and seamless roaming are key requirements for specific medical applications and real-time communications.

Roaming Domains

A roaming domain is the population of APs in a common RF coverage area that share VLAN IDs and broadcast domains (IP networks). The coverage area may be contained within a single physical location such as a building or floor, or, if scaling and the LAN architecture permit, may be extended between physical locations such as co-located buildings in a campus environment.

As the WLANs bridge the user traffic directly into the access layer, VLANs must be created and extended between the APs. A typical AP-only deployment utilizes a single AP management VLAN and two or more wireless user VLANs. There is no hard requirement for using a single AP management VLAN, and a customer may implement multiple AP management VLANs, if required. The only requirement is that the AP management VLANs are dedicated for management and not shared with client devices.

The number of wireless user VLANs varies based on deployment size, broadcast domain tolerance, and customer segmentation requirements. These VLANs are extended between the APs within a given RF coverage area. This is needed so that the wireless client devices can seamlessly roam between the APs and maintain their VLAN membership and IP addressing.

Seamless Roaming in AP Only Deployments

The AP management and wireless user VLANs are terminated on a multilayer switch that resides in the core or aggregation layers. This may vary based on the customer environment and LAN architecture. The multilayer switch is the default Gateway for the AP-management and wireless user VLANs, and includes IP helper addresses to facilitate DHCP addressing.

Roaming Domain Scaling

Each roaming domain can scale to support a maximum number of APs and client devices. As the AP management and user VLANs are extended between the APs in a roaming domain, any broadcast/multicast frames forwarded over the AP or wireless user VLANs will be received and processed by all the APs in the roaming domain. Broadcast/multicast frames are normal and are used by both APs and clients for various functions. As a general rule of thumb, the more APs and clients you deploy in a roaming domain, the more broadcast/multicast frames will be transmitted and processed by all connected hosts in each VLAN.

The frequency of broadcast/multicast frames flooded over the AP and wireless user VLANs is the main limiting factor for scaling, as CPUs on APs can only process so many of these frames before other software services are impacted. Aruba has validated that a single roaming domain can support a maximum of:

  • 500 APs
  • 5,000 clients

These are the maximum limits that have been tested and verified by Aruba with APs connected to a single management VLAN and wireless clients connected to a single wireless user VLAN. Broadcast/multicast traffic was also generated to ensure correct operation in heavier broadcast/multicast environments. These limits also apply when multiple AP management and user VLANs are deployed. The aggregate number of APs and clients should not exceed the verified limits across all the VLANs. For example, if a customer configures four user VLANs, we recommend that you do not exceed a total of 5,000 clients across all the user VLANs.
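Because the limits apply to the aggregate across all VLANs, a quick way to sanity-check a planned roaming domain is to sum the AP and client counts per VLAN and compare against the verified maximums. The following Python sketch illustrates the check; the per-VLAN numbers are examples only.

    MAX_APS_PER_DOMAIN = 500
    MAX_CLIENTS_PER_DOMAIN = 5000

    def within_limits(aps_per_mgmt_vlan, clients_per_user_vlan):
        """Both arguments are lists of counts, one entry per VLAN in the roaming domain."""
        return (sum(aps_per_mgmt_vlan) <= MAX_APS_PER_DOMAIN
                and sum(clients_per_user_vlan) <= MAX_CLIENTS_PER_DOMAIN)

    # Four user VLANs of 1,250 clients each stay within the 5,000-client aggregate.
    print(within_limits([450], [1250, 1250, 1250, 1250]))   # True
    # A fifth VLAN of the same size would exceed the aggregate limit.
    print(within_limits([450], [1250] * 5))                 # False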

Multiple Roaming Domains

An AP-only deployment can include as many roaming domains of up to 500 APs / 5,000 clients as needed as long as the AP management and wireless user VLANs for each roaming domain are separated at layer 3 (that is, reside on separate IP networks / broadcast domains).

For example, a 10-building campus design can include up to 500 APs per building, with each building supporting a maximum of 5,000 clients. The campus in this example would include 5,000 APs supporting a maximum of 50,000 clients across the campus. The AP management and wireless user VLANs in each building will be assigned unique IP subnets resulting in unique broadcast domains for each building. Larger buildings requiring higher scaling may implement multiple roaming domains as needed.
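The arithmetic behind this example is straightforward; the short snippet below simply restates it with the per-building figures used above.

    # Each building is its own roaming domain with unique IP subnets,
    # so the per-domain limits apply per building.
    buildings = 10
    aps_per_building = 500
    clients_per_building = 5000

    print(f"Campus total: {buildings * aps_per_building} APs, "
          f"{buildings * clients_per_building} clients")
    # Campus total: 5000 APs, 50000 clients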

Multiple Roaming Domains in AP Only Deployments

Consider deploying multiple roaming domains for the following scenarios:

  • Scaling—You have a coverage area such as a large building or co-located buildings that must support more than 500 APs and/or 5,000 clients. Additional roaming domains will be incorporated into the design to accommodate the additional APs and/or wireless client devices.
  • LAN Architecture—The AP-management and wireless-user VLANs cannot be extended between access layer switches either between floors within a building or between buildings. If the LAN switches cannot be reconfigured to extend the needed VLANs, separate roaming domains will be required.

Overlapping RF Coverage

If the RF coverage areas between buildings or floors do not overlap, there is no expectation of network connectivity as you move between areas. But if the RF coverage areas overlap, you may expect continuous network connectivity as you move between buildings or floors. However, the IP network membership for client devices moving between two roaming domains will not be seamless. The client devices must obtain new IP addressing after the roam.

While the roam itself can be a fast roam, since the wireless-user VLANs in each roaming domain map to different broadcast domains, the client device must re-DHCP to obtain a new host address and default Gateway before it can continue to communicate over the intermediate IP network. The user VLAN ID can be consistent between roaming domains for simplified operations and management, but for scaling, the IP networks assigned to each roaming domain must be unique.

Deployments requiring multiple roaming domains with overlapping RF coverage therefore require careful planning and consideration:

  • User Experience—Do you expect uninterrupted network connectivity when moving between roaming domains? For example, between buildings or floors?

  • RF Design—Can the design accommodate or implement RF boundaries to minimize the hard roaming points between adjacent roaming domains to provide the best user experience?

  • Client Devices—Do you have any specialized or custom client devices deployed? Test to validate that they can tolerate and support hard roaming. Modern Apple, Android, and Microsoft operating systems will issue a DHCP discover and re-ARP after each roam.

  • Applications—What applications do you have deployed and can they tolerate hosts changing IP addresses? While some applications such as Teams and Zoom can automatically recover after host readdressing, others cannot.

A decision would need to be made on whether users, client devices, and applications can tolerate hard roaming in the environment before considering co-located roaming domains. If not, then Gateways and tunneling can be considered.

Types of Roaming

A client device can experience two types of roams in an AOS 10 deployment—hard roams and seamless roams. This section provides additional details for each roaming type and clarifies when seamless and hard roaming occur in non-tunneled environments (that is, AP-only networks).

Hard Roaming

In a multiple roaming domain environment, client devices obtain new IP addressing and a default Gateway when transitioning between APs in separate roaming domains. Client devices may retain the same VLAN ID assignment depending on the LAN environment. While the user VLAN IDs may be common or unique between roaming domains, IP subnets or broadcast domains are always unique.

When a client device transitions between roaming domains, the following actions take place before the client device can continue to communicate over the intermediate IP network:

  • The client device issues a DHCP discover to obtain new addressing. A full DHCP exchange occurs and new host addressing and options are assigned. As the DHCP discover is a broadcast frame, it also populates the MAC address tables on all the layer-2 switches where the new user VLAN extends.

  • The client device sends an ARP for the default Gateway. This permits the client device to communicate with hosts on other IP networks.

  • All running applications on the client device re-establish their sessions. This may occur automatically or require user intervention.

Seamless Roaming

In a single roaming domain, client devices experience a seamless roam since all the APs in the RF coverage share user VLANs and broadcast domains. Client devices can maintain their user VLAN ID, IP addressing, and default Gateway after each roam. The roaming time will vary based on the WLAN type, its configuration, and the fast-roaming capabilities of the client.

After a successful roam, the client device’s MAC address or port-bindings are updated on all the layer-2 switches where the user VLAN extends. There are two ways in which this happens:

  • The client device sends a broadcast frame such as an ARP or DHCP-discover, which is flooded over the user VLAN.

  • The new AP sends a Gratuitous ARP, if proxy ARP is enabled, which is flooded over the user VLAN.

FAQ

Do ArubaOS 10 APs support layer-3 mobility?

No. We do not have plans to support this. Gateways provide a scalable and resilient option if layer-3 mobility is required.

Does Aruba Central have limits on how many instances of 500 APs or 5,000 clients can be deployed? Are there any Aruba Central group or site limits?

There are no Aruba Central group or site restrictions.

Must the AP-management and wireless-user VLANs be dedicated, or can they be shared with other hosts?

Our best practice recommendation is to use dedicated VLANs for AP management and wireless users, especially in larger environments where a large amount of broadcast or multicast traffic is expected.

Is there anything else I can do to prevent unwanted broadcast or multicast frames from impacting the APs?

In addition to implementing dedicated management and user VLANs, it is strongly recommended that you remove any unwanted VLANs from the AP switchports on the access layer switches. These switchports should be explicitly configured to untag the AP-management VLAN and tag only the user VLANs. No other VLANs should be extended to the APs. This ensures that broadcast or multicast frames from other VLANs are not received by the APs.

Can APs in a roaming domain be connected to different management VLANs?

The APs may be distributed between multiple VLANs as long as you do not exceed 500 APs in the roaming domain. However, since DTLS tunnels are established between APs for session state synchronization and clean-up, the APs must be able to reach each other over IP.

Can I deploy a single wireless-user VLAN supporting up to 5,000 clients in a roaming domain?

A single user VLAN is supported, however most real-world deployments will include two or more user VLANs that naturally establish smaller broadcast domains.

Can I exceed 500 APs or 5,000 clients if I deploy additional VLANs?

No. These recommended limits apply per roaming domain regardless of the number of AP management and user VLANs.

When do I need to consider Gateways for my customers’ deployment?

Please refer to the Gateway Use Cases section.

Can APs with foundation and advanced licenses be mixed in a roaming domain? What will happen if they are mixed?

While you may mix APs with different licenses, it is not recommended. Certain features such as Live Upgrade, HPE Aruba Networking AirGroup (custom services), Air Slice, UCC, and Multizone require an advanced license. Mixing APs with different license tiers in a building or floor will result in feature discrepancies and an operationally challenging environment.

What happens if the total number of APs or clients in a roaming domain exceeds 500 APs or 5,000 clients?

These are soft limits which are not enforced and have been provided as a recommended best practice. If either limit is exceeded, AP and client performance may degrade, especially in high traffic environments.

Can I deploy APs and tunneled WLANs across a WAN?

Not all WANs are created equal.

  1. Public Internet including VPNs: Not supported

  2. Private WANs (MPLS, TDM, etc.): These types of deployments are not tested and are therefore not supported without additional validation. Please contact your HPE Aruba Networking sales team to discuss these requirements further.

  3. Metropolitan Ethernet Services or Ethernet extensions between sites: These are supported since they would not be much different from dark fiber implementations. However, we do require the service to support standard 1,518 byte or larger Ethernet frames. Gigabit Ethernet speeds or higher are also recommended.

Are APs and bridged WLANs supported with EVPN-VXLAN?

APs using bridge forwarding are now supported on the NetConductor solution, with the limitation of no role-to-role group-based policies. Support for 3rd party EVPN-VXLAN environments will vary depending on the vendor’s ability to support rapid MAC address moves. Please reference the validated solutions guide (VSG) for more information.

2.2 - Gateway Deployments

Use cases, personas, and roaming considerations for Gateway deployments.

Gateways are high-performance appliances that have evolved to support a wide range of use cases and can act as (1) the wireless control plane for greater security and scalability or (2) an SD-Branch device with intelligent routing and tunnel orchestration software. Gateways are not a refresh of wireless controllers; they are expressly designed to be both cloud and IoT ready.

Use Cases

While Gateways are optional, they offer certain features and capabilities that are not available in AP-only deployments. There are deployment scenarios when Gateways should be considered to provide a better end-user experience, simplify operations, or take advantage of advanced features. There are also scenarios where Gateways are mandatory and required.

The following are some common features and use cases for Gateway deployments:

  • LAN Architecture — The LAN architecture does not permit management and user VLANs to be extended between the APs, and seamless roaming is required.

  • Roaming Domain Scaling — Gateways and tunneled WLANs are required to establish roaming domains that exceed 500 APs and 5,000 clients.

  • Layer-3 Mobility — Gateways and tunneled WLANs are required to centralize wireless-user VLANs and permit client devices to seamlessly roam between APs, across layer 3 network boundaries.

  • RADIUS Proxy - Avoids configuring large numbers of APs as RADIUS clients on the RADIUS server. When Gateways are deployed, RADIUS messages can be proxied through the Gateway cluster.

  • Security & Policy - For policy and compliance, user traffic needs to be segmented and/or terminated in different zones within the network where user VLANs are deemed insufficient. Newer Gateways could also help in enhancing the security further by enabling IDS/IPS. The IDS/IPS engine performs deep packet inspection to monitor network traffic for malware and suspicious activity. When either of the two is detected, the IDS function alerts network administrators, while the Intrusion Prevention System (IPS) takes immediate action to block threats.

  • Traffic Optimization - For high broadcast or multicast environments, Gateways offer more granular controls that can be enabled per VLAN to prevent unwanted broadcast or multicast frames and/or datagrams from reaching the APs.

  • Data Plane Termination - Gateways are required to terminate tunnels from Aruba devices. This includes APs, Gateways and Switches.

  • Solutions - Gateways are required for Aruba SD-Branch, Microbranch and VIA deployments.

  • MultiZone - Two or more clusters of Gateways are required to deploy MultiZone when separate tunneled WLANs are terminated on different Gateway clusters within the network.

  • Datacenter Redundancy - If layer 3 mobility and failover between datacenters is required.

  • Dynamic Segmentation - Dynamic Segmentation unifies role‑based access and policy enforcement across wired, wireless, and WAN networks with centralized policy definition and dedicated enforcement points, ensuring that users and devices can only communicate with destinations consistent with their role. Gateways play an essential role in policy enforcement – keeping traffic secure and segregated.

Personas

An AOS 10 Gateway can operate in one of three personas: Mobility, Branch, or VPN Concentrator. The persona is set when creating a new group in Aruba Central, and the group type dictates which configuration options are exposed in the group settings. For example, if the group type is set to Mobility, only WLAN-related configuration options are available in that group, whereas if the type is set to Branch, SD-Branch-specific Branch Gateway options are available in addition to the WLAN configuration options.

The Mobility persona is used for WLAN deployments whereas the Branch and VPN Concentrator personas are used for SD-Branch deployments.

Gateway Personas

Mobility

The Mobility persona configures a Gateway to support wireless (WLAN) and wired (LAN) functionalities in a campus network. When a Mobility Gateway is used in a WLAN deployment, all APs will form Internet Protocol security (IPsec) and Generic Routing Encapsulation (GRE) tunnels to the Gateways when a tunneled or a mixed mode WLAN is created.

Gateways in this mode do not provide any WAN capabilities.

Branch

The Branch persona sets a Gateway to operate as an SD-Branch Gateway, supporting the optimization and control of WAN, LAN, WLAN, and cloud security services. The Branch Gateway provides features such as routing, firewall, security, Uniform Resource Locator (URL) filtering, and compression. With support for multiple WAN connection types, the Branch Gateway routes traffic over the most efficient link based on availability, application, user-role, and link health. This allows organizations to take advantage of high-speed, lower-cost broadband links to supplement or replace traditional WAN links such as MPLS.

In addition to providing Branch functionalities, Branch Gateways also support all of the WLAN functionalities of a Mobility Gateway.

VPN Concentrator

The VPN Concentrator persona sets a Gateway to act as a headend Gateway, or Virtual Private Network Concentrator (VPNC) for all branch offices. Branch Gateways establish IPsec tunnels to one or more headend Gateways over the Internet or other untrusted networks. High Availability options support either multiple headend Gateways deployed at a single site or headend Gateways deployed in pairs at multiple sites for the highest availability. The most widely deployed topology is the dual hub-and-spoke where branches are multi-homed to a primary and backup data center. Any of the headend Gateways can perform the function of VPNC at the hub site. These devices offer high-performance and support a large number of tunnels to aggregate data traffic from hundreds to thousands of branches.

VPNCs can act as headend Gateways for either other Branch Gateways or Microbranch APs.

Role Matrix

Some Gateways do not support all available personas and this restriction should be taken into account when choosing a Gateway model.

Platform Mobility VPNC Branch
7000 Series
7005 Yes No Yes
7008 Yes No Yes
7010 Yes Yes Yes
7024 Yes Yes Yes
7030 Yes Yes Yes
7200 Series
7205 Yes Yes Yes
7210 Yes Yes Yes
7220 Yes Yes Yes
7240XM Yes Yes Yes
7280 Yes Yes Yes
9000 Series
9004 Yes Yes Yes
9004-LTE No Yes Yes
9012 Yes Yes Yes
9100 Series
9106 Yes Yes Yes
9114 Yes Yes Yes
9200 Series
9240 Yes Yes Yes

Roaming With Gateways

An AOS 10 deployment with Gateways supports the ability to configure WLAN profiles to tunnel the user traffic to a cluster of Gateways where the user VLANs reside. Client devices are statically or dynamically assigned to a user VLAN that is extended between all the Gateway nodes in the cluster. The user VLANs either terminate on the core switching layer or a dedicated aggregation switching layer that is also the default Gateway for the Gateway management and user VLANs.

For more details on Gateway clustering, refer to the Clusters topic.

With a centralized forwarding architecture, client devices can seamlessly roam between APs that are tunneling user traffic to a common Gateway cluster. The client devices can maintain their VLAN membership, IP addressing, and default Gateway since the user VLANs and broadcast domains are common between the cluster members. With the clustering architecture, the client’s MAC address is also anchored to a single cluster member irrespective of the AP that the client device is attached to. The client MAC address will only move in the event of a cluster node upgrade or outage.

Hard roaming is required in AP-Gateway deployments if a client device transitions between APs that tunnel the user traffic to separate Gateway clusters. While the user VLAN IDs may be common between clusters, the IP subnets or broadcast domains must be unique per cluster. Any client device that moves between Gateway clusters must obtain a new IP address and default Gateway after the roam.

AP-Gateway Roaming

Gateway Scaling with AOS 10

Scaling numbers related to clients, AOS 10 devices, tunnels and cluster sizes for various Gateway models can be accessed in the Capacity Planning section of the Validated Solution Guide.

Gateway Cluster Scaling Calculator

This calculator is used to determine the number of gateways required for AOS 10 tunneled WLAN and user based tunneling (UBT) deployments.

The calculator can be accessed in the Capacity Planning section of the Validated Solution Guide.
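As a rough illustration of the kind of sizing the calculator performs, the sketch below estimates a gateway count from tunneled client and device totals plus one node of redundancy. The capacity figures and the n+1 redundancy model are placeholder assumptions; always use the published capacity figures and the calculator for real designs.

    import math

    def gateways_required(clients, devices, client_capacity, device_capacity,
                          redundancy=1):
        # Enough nodes to carry all tunneled clients and devices at the chosen
        # per-gateway capacities, plus spare node(s) for failover.
        needed = max(math.ceil(clients / client_capacity),
                     math.ceil(devices / device_capacity))
        return needed + redundancy

    # Example: 12,000 tunneled clients and 900 APs/UBT switches with placeholder
    # per-gateway capacities of 8,000 clients and 1,000 devices.
    print(gateways_required(12000, 900, 8000, 1000))  # 3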

2.3 - Gateway Serviceability

Learn about HPE Aruba Networking serviceability features for gateways which ensure continuous device management. Topics include automatic rollback mechanisms, auto re-provisioning capabilities, and disaster recovery procedures for maintaining uninterrupted connectivity between gateways and the Central platform.

Continuous reachability between gateways and the HPE Aruba Networking Central management platform is essential, as Central facilitates operations such as configuration, monitoring, debugging, and general management.

Upon initial boot, HPE Aruba Networking AOS-10 gateways in factory default state establish a WebSocket connection with Central through provisioning data obtained via zero-touch provisioning or manual setup.

Subsequent connectivity disruptions might arise from user-initiated configuration errors, network anomalies, or device malfunctions. To mitigate these disruptions, robust device-side recovery mechanisms are supported.

Automatic recovery

Auto rollback

Configuration errors, including but not limited to the following, can disrupt communication between the gateway and HPE Aruba Networking Central:

  • Incorrect uplink VLAN settings
  • Uplink port misconfiguration
  • Bandwidth contract policy restrictions
  • Access control list conflicts

These errors can stem from simple typographical mistakes, and once connectivity is lost, corrective action through Central is no longer possible.

Example output of the show switches command from an HPE Aruba Networking gateway showing a configuration state of rollback.

To address this, AOS-10 gateways implement automatic rollback to the last known good configuration upon detection of connectivity loss due to configuration service updates.

Furthermore, the gateway will communicate the rollback event to the Central configuration service, enabling user visibility and diagnostic capabilities.
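Conceptually, the rollback behavior works like a commit-confirm cycle: apply the pushed configuration, verify reachability to Central within a window, and revert if the check fails. The Python sketch below illustrates this idea only; the function names and timeout are assumptions and do not reflect the actual AOS-10 implementation.

    import time

    ROLLBACK_TIMEOUT_S = 300  # illustrative verification window

    def apply_with_rollback(new_config, last_good_config, apply, central_reachable):
        apply(new_config)
        deadline = time.time() + ROLLBACK_TIMEOUT_S
        while time.time() < deadline:
            if central_reachable():
                return "committed"      # new config becomes the last known good
            time.sleep(10)
        apply(last_good_config)         # connectivity lost: roll back
        return "rolled back"            # event reported to the Central configuration service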

Auto re-provisioning

HPE Aruba Networking Central provides an automated solution for connectivity loss detection and restoration, minimizing user intervention to the correction of erroneous configurations.

In scenarios where the initial connection to Central fails due to inaccurate provisioning parameters (e.g., incorrect management URL), or where automatic rollback is unsuccessful (e.g., expired certificates), gateways support self-reprovisioning from the Activate service. Upon user modification of provisioning data, the gateway will initiate a reset and reconnect to Central using the provided information.

2.4 - Clusters

Gateway clustering with AOS 10.

A cluster is a group of HPE Aruba Networking Gateways operating as a single entity to provide high availability and service continuity for tunneled clients in a network. Gateway clusters provide redundancy for HPE Aruba Networking APs with mixed or tunneled WLANs, HPE Aruba Networking switches configured for user-based tunneling (UBT), and tunneled clients in the event of maintenance or failure.

Clustering provides the following features and benefits:

  • Stateful Client Failover – When a Gateway is taken down for maintenance or fails, APs, UBT switches and clients continue to receive service from another Gateway in the cluster without any disruption to applications.

  • Load Balancing – Device and client sessions are automatically distributed and shared between the Gateways in the cluster. This distributes the workload between the cluster nodes, minimizes the impact of maintenance and failure events and provides a better connection experience for clients.

  • Seamless Roaming – When a client roams between APs, the clients remain anchored to the same Gateway in the cluster to provide a seamless roaming experience. Clients maintain their VLAN membership and IP addressing as they roam.

  • Ease of Deployment – A Gateway cluster is automatically formed when assigned to a group or site in Central without any manual configuration.

  • Live Upgrades – Allows customers to perform in-service cluster upgrades of Gateways while the network remains fully operational. The Live Upgrade feature allows upgrades to be completely automated. This is a key feature for customers with mission-critical networks that must remain operational 24/7.

Reference diagram of a typical cluster in AOS 10.

2.4.1 - Types of Clusters

Gateway clustering with AOS 10.

Types of Clusters

A resilient cluster consists of two or more Gateways that service clients and devices. A cluster that consists of Gateways of the same model is referred to as a homogeneous cluster, while a cluster that consists of Gateways of different models is referred to as a heterogeneous cluster. As a best practice, HPE Aruba Networking recommends deploying homogeneous clusters whenever possible.

Homogeneous Clusters

A homogeneous cluster is a cluster built with Gateways of the same model. The primary benefit of a homogeneous cluster is that each node provides equal client, device, and forwarding capacity along with common port configurations. This makes homogeneous clusters much easier to plan, design, and configure than heterogeneous clusters.

Example cluster consisting of gateways of same series and model.

The maximum number of nodes you can deploy in a homogeneous cluster will vary by series. The 7000 or 9000 series Gateways can support a maximum of four nodes, the 7200 series can support a maximum of twelve nodes, and the 9100 or 9200 series Gateways can support a maximum of six nodes.

Gateway Series Maximum Gateways per Cluster
7000 4
7200 12
9000 4
9100 6
9200 6

Heterogeneous Clusters

A heterogeneous cluster is a cluster built with Gateways of different models. Heterogeneous cluster support is primarily provided to help customers migrate existing clusters using older Gateways to newer models. For example, migrating an existing cluster of 7005 series Gateways to 9004 series Gateways or 7200 series Gateways to 9200 series Gateways.

Example cluster consisting of gateways of differing series and models.

The primary benefit of a heterogeneous cluster is that multiple Gateway models can co-exist within a cluster during a migration; however, this comes with some considerations:

  1. The maximum cluster size will be limited by the lowest common denominator Gateway series. For example, a heterogeneous cluster of 7200 series and 9200 series Gateways will be limited to a maximum of six nodes.

  2. Base and failover capacities are extremely difficult to calculate. Active and standby client and device sessions will be unevenly distributed between the available nodes based on the capacity of each node. Careful planning must be performed to ensure that the loss of a high-capacity node does not impact clients or devices.

  3. Forwarding performance, scaling and uplink capacities will vary between the nodes.

  4. Configuration in Central may require device level overrides to accommodate uplink port differences between Gateway models.

While heterogeneous clusters are supported, they are not recommended for long-term production use. Heterogeneous clusters should only be implemented when migrating Gateways in existing clusters to a new model. If a heterogeneous cluster must be implemented, the cluster should be limited to two models of Gateways. While more than two Gateway models can be supported, troubleshooting and debugging will be more complicated if technical issues occur.

Gateway Series     Maximum Gateways per Cluster
7000 and 9000      4
7000 and 7200      4
9000 and 7200      4
7200 and 9100      6
7200 and 9200      6
9100 and 9200      6
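The lowest common denominator rule can be restated as taking the minimum of the per-series limits for the series present in the cluster, as in the following sketch (series limits taken from the homogeneous cluster table above).

    # Maximum cluster size per gateway series (from the homogeneous cluster table).
    MAX_CLUSTER_SIZE = {"7000": 4, "9000": 4, "7200": 12, "9100": 6, "9200": 6}

    def heterogeneous_cluster_limit(series_in_cluster):
        # The mixed cluster is limited by the lowest per-series maximum present.
        return min(MAX_CLUSTER_SIZE[s] for s in set(series_in_cluster))

    print(heterogeneous_cluster_limit(["7200", "9200"]))  # 6
    print(heterogeneous_cluster_limit(["7000", "7200"]))  # 4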

2.4.2 - Cluster Roles

Gateway clustering with AOS-10.

Gateways in a cluster are assigned various roles to distribute client and device sessions between the available nodes. For each cluster, one gateway is elected the cluster leader, which is responsible for device session assignment, bucket map computation, and node list distribution. In addition to the cluster leader role, a gateway may assume one or more of the following roles:

  • Device Designated Gateway (DDG) or Standby Device Designated Gateway (S-DDG)

  • Switch Designated Gateway (SDG) or Standby Switch Designated Gateway (S-SDG)

  • User Designated Gateway (UDG) or Standby User Designated Gateway (S-UDG)

  • VLAN Designated Gateway (VDG) or Standby VLAN Designated Gateway (S-VDG)

The roles that are assigned to gateways within a cluster will be dependent on the number of cluster nodes, persona of the gateways, and the types of devices that are tunneling client traffic to the cluster. The UDG/S-UDG roles are assigned to gateways for tunneled clients, DDG/S-DDG roles are assigned to gateways for APs, and SDG/S-SDG roles are assigned to gateways for UBT switches. VDG/S-VDG roles are assigned to Branch Gateways configured for Default Gateway mode that terminate user VLANs.

A cluster can consist of a single gateway or multiple gateways. A single gateway is still considered a cluster as the cluster name must be selected for profiles configured for mixed and tunnel forwarding. When a cluster consists of a single gateway, no standby sessions are assigned as there are no gateways available to assume the standby roles. Standalone gateways will assume the cluster leader and designated role for client and device sessions. When a cluster consists of two or more gateways, designated and standby roles are distributed between the available cluster nodes.

Bucket maps

The cluster leader is responsible for computing a bucket map for the cluster which is published to both APs and UBT switches by their assigned DDGs. Unlike AOS-8 where a bucket map was published per ESSID, in AOS-10 one bucket map is published per cluster. APs and UBT switches tunneling to multiple clusters will have a published bucket map for each cluster.

Bucket maps are used by APs and UBT switches to determine the UDG and S-UDG session assignments for each tunneled client. Each tunneled client is assigned a UDG to anchor north / south traffic. To determine the active and standby UDG role assignments, the last 3 bytes of each client’s MAC address are XORed to derive a decimal value (0-255), which is used as an index into the bucket map table to determine the UDG and S-UDG assignments. Each AP and switch that is tunneling to a cluster is provided with the same bucket map. If multizone is deployed, each AP and UBT switch receives separate bucket maps for each cluster.

The following illustration provides an example bucket map published by a two-node homogeneous cluster. Each gateway in the UDG list is assigned a numerical value (0 and 1 in this case) that have an equal number of active and standby assignments. Each client MAC address is hashed to provide a numerical index value (0-255) that determines each client’s active and standby UDG assignment. In this example, the hashed index value 32 will assign node 0 as the UDG and node 1 as the S-UDG while the index value 15 will assign node 1 as the UDG and node 0 as the S-UDG.
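The index derivation can be illustrated with a short sketch that XORs the last three octets of the client MAC address and uses the result to look up the UDG/S-UDG pair. This follows the description above; the exact hashing implementation on APs and UBT switches is not documented here, so treat the code as illustrative.

    def bucket_index(mac: str) -> int:
        # XOR the last three octets of the MAC address to get a value 0-255.
        octets = [int(x, 16) for x in mac.split(":")]
        return octets[-3] ^ octets[-2] ^ octets[-1]

    def udg_assignment(mac: str, bucket_map: list) -> tuple:
        """bucket_map is a 256-entry list of (udg_node, s_udg_node) pairs."""
        return bucket_map[bucket_index(mac)]

    # Toy two-node map matching the illustration: even indexes (such as 32) map to
    # node 0 as UDG / node 1 as S-UDG, odd indexes (such as 15) map to the reverse.
    two_node_map = [(0, 1) if i % 2 == 0 else (1, 0) for i in range(256)]
    print(udg_assignment("aa:bb:cc:11:22:33", two_node_map))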

Bucket map output from a gateway cluster.

Roles and tunnels

Each AP and UBT switch that is tunneling clients to a cluster will establish tunnels to each gateway node within the cluster:

  • Campus AP – Establishes IPsec and GRE tunnels to each cluster node; this operation is orchestrated by Central.

  • EdgeConnect Microbranch AP - Establishes IPsec tunnels to each VPN Concentrator in a cluster; this operation is orchestrated by Central. When using centralized layer 2 (CL2) forwarding, GRE tunnels are encapsulated in the IPsec tunnels.

  • UBT Switches – Establish GRE tunnels to each cluster node based on switch configuration.

The role of each gateway within a cluster determines which cluster node is responsible for exchanging signaling messages to APs and UBT switches in addition to the forwarding of broadcast (BC), multicast (MC), and unicast traffic destined to tunneled clients.

Device                             Tunnel Type     Traffic Type                                                                                     Gateway Role
Campus AP                          IPsec           Device Signaling & BC/MC to Clients                                                              DDG
Campus AP                          GRE             Unicast to / from Clients & BC/MC from Clients                                                   UDG
EdgeConnect Microbranch AP (CL2)   IPsec           Device Signaling & BC/MC to Clients                                                              DDG
EdgeConnect Microbranch AP (CL2)   GRE in IPsec    Unicast to / from Clients & BC/MC from Clients                                                   UDG
UBT Switch                         GRE             Device Signaling & BC/MC to Clients (UBT 1.0)                                                    SDG
UBT Switch                         GRE             Unicast to / from Clients; BC/MC from Clients (UBT 1.0); BC/MC to and from Clients (UBT 2.0)     UDG

Device designated gateway

Each AP is assigned a Device Designated Gateway (DDG) which is responsible for publishing the bucket map to the AP. The bucket map is used for UDG/S-UDG assignments for each tunneled client. One bucket map is published per cluster.

For each AP, the cluster leader selects a DDG and S-DDG as part of the initial orchestration and messaging. The assignments are performed in a round-robin fashion based on each cluster node’s device capacity and load. The resulting distribution will be even for homogeneous clusters and uneven for heterogeneous clusters as gateways will have uneven device capacities. Higher capacity nodes will have more DDG/S-DDG assignments than lower capacity nodes.

Gateways with a DDG role are responsible for the following functions:

  1. Bucket map distribution

  2. Forwarding of north / south broadcast and multicast traffic destined to wireless clients

  3. Forwarding IGMP/MLD group membership reports for IP multicast

The S-DDG assumes the role of publishing the bucket map and other forwarding functions if the DDG is taken down for maintenance or fails. New DDG/S-DDG role assignments are event driven as nodes are added and removed from the cluster. There is no periodic load-balancing. If a failover occurs, the S-DDG assumes the DDG role and a new bucket map is published. Impacted devices from failover are assigned a new S-DDG node.

A cluster can accommodate multiple node failures and assign DDG and S-DDG roles until the cluster’s maximum device capacity has been reached. Once a cluster’s device capacity has been reached and additional nodes are lost, impacted APs will become orphaned as there is no remaining device capacity available in the cluster to accommodate new DDG role assignments.

DDG and S-DDG assignments are performed by the cluster leader and done in a round-robin fashion.

A depiction of the DDG and S-DDG assignments for a four-node heterogeneous cluster.

Switch designated gateway

Each UBT switch is assigned a Switch Designated Gateway (SDG) which, like the DDG role, is responsible for publishing the bucket map to the switches. Unlike APs, where the cluster leader dynamically determines each AP’s DDG and S-DDG role assignment, a UBT switch’s initial SDG assignment is determined by the explicit configuration of the primary and backup gateways as part of the UBT configuration:

  • AOS-S – The gateway’s IP address specified as the controller-ip or backup-controller-ip

  • AOS-CX – The gateway’s IP address specified as the primary-controller-ip or backup-controller-ip

The switch’s initial SDG assignment is based on the controller-ip or primary-controller-ip defined as part of the switch configuration. The switch’s S-SDG assignment is automatic and is distributed between the cluster members based on capacity and load.

When a UBT switch first initializes, an attempt will be made to establish a PAPI session to the primary gateway IP address specified in the configuration. If the primary gateway IP does not respond, the secondary gateway IP is used. Once a connection is established, an S-SDG role is assigned by the gateway cluster leader.

Gateways with an SDG role are responsible for the following functions:

  1. Bucket map distribution

  2. Forwarding of broadcast and multicast traffic destined to UBT version 1.0 clients

  3. Forwarding IGMP/MLD group membership reports for IP multicast (UBT version 1.0)

The S-SDG assumes the role of publishing the bucket map and other forwarding functions if the SDG is taken down for maintenance or fails. If a failover occurs, the S-SDG assumes the SDG role and a new bucket map is published. Impacted devices from failover are assigned a new S-SDG node.

The initial SDG assignments are based on the switch configuration while the S-SDG assignments are performed by the gateway cluster leader in a round-robin manner.

A depiction of the SDG and S-SDG assignments for a four-node heterogeneous cluster.

As the AOS-S / AOS-CX switch configuration influences the SDG role assignments, HPE Aruba Networking recommends assigning different primary and backup IP addresses to groups of switches to provide an even distribution of SDG roles between the available cluster nodes. The distribution must be performed manually by the switch admin when defining the golden configuration for each group of access layer switches.

An equal distribution of SDG roles between the available cluster nodes is especially important for UBT version 1.0 deployments as each cluster node with an SDG role for a group of UBT switches is responsible for replication and forwarding of broadcast and multicast traffic destined to UBT clients. Distributing the SDG role ensures that broadcast and multicast traffic replication and forwarding is distributed between all the available cluster nodes.

An example distribution of primary IP addresses for a four-node cluster is provided in the table below:

Switch Group Primary IP
1 GW-A
2 GW-B
3 GW-C
4 GW-D

When failover between clusters is required, both primary and secondary gateway IP addresses are configured on each group of UBT switches, where the primary IP points to a cluster node residing in the primary cluster and the secondary IP points to a cluster node residing in the backup cluster. As with a single cluster deployment, the SDG roles should be evenly distributed between the available cluster nodes in each cluster. This ensures even SDG role distribution regardless of the cluster that is servicing the UBT switches.

An example distribution of primary and secondary IP addresses for failover between a primary and secondary cluster for four-node clusters is provided in the table below:

Switch Group Primary IP Secondary IP
1 GW-DC1-A GW-DC2-A
2 GW-DC1-B GW-DC2-B
3 GW-DC1-C GW-DC2-C
4 GW-DC1-D GW-DC2-D
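A simple round-robin assignment of switch groups to cluster nodes achieves the even SDG distribution described above. The sketch below illustrates the planning step only; the gateway names match the example tables, and the output is the mapping an administrator would translate into each switch group's golden configuration.

    def distribute_sdg(switch_groups, primary_cluster, secondary_cluster=None):
        # Assign each switch group a primary (and optional secondary) gateway
        # in round-robin order so SDG roles spread evenly across the nodes.
        plan = {}
        for i, group in enumerate(switch_groups):
            primary = primary_cluster[i % len(primary_cluster)]
            secondary = (secondary_cluster[i % len(secondary_cluster)]
                         if secondary_cluster else None)
            plan[group] = (primary, secondary)
        return plan

    print(distribute_sdg(["Group-1", "Group-2", "Group-3", "Group-4"],
                         ["GW-DC1-A", "GW-DC1-B", "GW-DC1-C", "GW-DC1-D"],
                         ["GW-DC2-A", "GW-DC2-B", "GW-DC2-C", "GW-DC2-D"]))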

User designated gateway

Each tunneled client is assigned a User Designated Gateway (UDG) to anchor north / south traffic. Each client’s unique MAC address is assigned a UDG and S-UDG via the bucket map that is published by the cluster leader for each cluster.

The bucket indexes used for UDG and S-UDG assignments are allocated in a round-robin fashion based on each cluster node’s client capacity. For homogeneous clusters, each gateway in the cluster will be allocated equal buckets while for heterogeneous clusters higher capacity nodes will be allocated more buckets than lower capacity nodes. Client MAC address hashing is utilized to ensure good session distribution but also ensures that each client is anchored to the same gateway while roaming.
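The capacity-weighted allocation can be illustrated by dividing the 256 bucket indexes among nodes in proportion to their client capacities, as in the following sketch. The capacity values are illustrative placeholders, not published figures.

    def allocate_buckets(client_capacities, total_buckets=256):
        # Split the bucket indexes in proportion to each node's client capacity.
        total = sum(client_capacities.values())
        shares = {node: int(total_buckets * cap / total)
                  for node, cap in client_capacities.items()}
        # Hand any rounding remainder to the highest-capacity node.
        remainder = total_buckets - sum(shares.values())
        shares[max(client_capacities, key=client_capacities.get)] += remainder
        return shares

    # Homogeneous two-node cluster: equal split.
    print(allocate_buckets({"GW-A": 8000, "GW-B": 8000}))   # {'GW-A': 128, 'GW-B': 128}
    # Heterogeneous cluster: the larger node receives more buckets.
    print(allocate_buckets({"GW-A": 8000, "GW-B": 2000}))   # {'GW-A': 205, 'GW-B': 51}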

Gateways with a UDG role are responsible for the following functions:

  • Forwarding broadcast and multicast traffic received from clients.

  • Forwarding of IP multicast traffic destined to UBT 2.0 clients.

  • Forwarding of unicast traffic (bi-directional).

The S-UDG assumes the role of forwarding functions if the UDG is removed from the cluster through maintenance or failure. A new bucket map is published by the cluster leader when nodes are added or removed from the cluster and is event driven. With AOS 10 there is no periodic load-balancing. If a failover occurs, the S-UDG assumes the UDG role and a new bucket map is published. Impacted clients from failover are assigned a new S-UDG node.

A cluster can accommodate multiple node failures and assign UDG and S-UDG roles until the cluster’s maximum client capacity has been reached. Once a cluster’s client capacity has been reached and additional nodes are lost, impacted clients will become orphaned as there is no remaining client capacity available in the cluster to accommodate new UDG role assignments.

UDG/S-UDG role assignments are determined using the published bucket map for the cluster by hashing each client’s MAC address to determine an index value (0-255).

In this example the hashing results in Client 1 being assigned GW-A for UDG and GW-B for S-UDG while Client 2 is assigned GW-C for UDG and GW-D for S-UDG.
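The following simplified Python sketch illustrates the bucket-map lookup concept only; the actual hashing algorithm used by AOS 10 is not published, so the CRC-based hash, the gateway names, and the round-robin bucket fill below are assumptions made purely for demonstration.

```python
# Simplified illustration of a bucket-map lookup. The real AOS 10 hashing
# algorithm is not published; the hash and bucket fill here are assumptions.
import zlib

NODES = ["GW-A", "GW-B", "GW-C", "GW-D"]

# 256-entry bucket map published by the cluster leader: index -> (UDG, S-UDG).
BUCKET_MAP = [(NODES[i % 4], NODES[(i + 1) % 4]) for i in range(256)]

def lookup(client_mac):
    """Hash a client MAC to a bucket index (0-255) and return its UDG / S-UDG pair."""
    index = zlib.crc32(client_mac.replace(":", "").lower().encode()) % 256
    return BUCKET_MAP[index]

udg, s_udg = lookup("aa:bb:cc:00:11:22")
print(f"UDG={udg}, S-UDG={s_udg}")
```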

Branch high availability

When high availability (HA) is required for branch office deployments, a pair of Branch gateways is deployed to terminate the WAN uplinks and VLANs within the branches and provide resiliency. Each gateway is configured with an IP interface on the management and user VLANs, and Virtual Router Redundancy Protocol (VRRP) is automatically orchestrated to provide first-hop router redundancy and failover for clients and devices. Dynamic Host Configuration Protocol (DHCP) services may also be enabled to provide host addressing, which will also operate in HA mode.

With the convergence of clustering and branch HA, role assignments are further optimized to prevent client traffic from taking multiple hops within the cluster. Branch HA is enabled on pairs of gateways using auto-site clustering and requires the default gateway mode to be enabled within the Central configuration group. A peer connection is established between the gateways at each site where a preferred leader is configured by the admin or is automatically elected.

The cluster leader performs the following roles within the cluster during normal operation:

  • VLAN designated gateway (VDG) and VRRP active role for the management and user VLANs

  • DDG role for each AP

  • SDG role for each UBT switch

  • UDG role for each tunneled client

The leader is responsible for routing and forwarding of all branch management and client traffic during normal operation. The forwarding of WAN traffic is distributed between the gateways and may traverse the virtual peer link. The assignment of all the active roles to the preferred gateway ensures that all client traffic is anchored to the preferred gateway during normal operation, preventing unnecessary east-west traffic. The VDG and VRRP state for the management and user VLANs is synchronized and pinned to the active gateway. The secondary gateway operates in a standby mode and assumes all the standby roles. The only client traffic that is forwarded by a standby gateway is WAN traffic for any WAN uplinks it terminates.

If the active gateway is taken down for maintenance or fails, the standby gateway will take over all the active roles within the cluster along with all routing and forwarding functions. As multiple layers of convergence are required, failover is not seamless and will temporarily impact user traffic.

The DDG, SDG, UDG and VDG role assignments for a branch HA cluster.

2.4.3 - Automatic and Manual Modes

Clusters of gateways can be defined manually or can be formed automatically, either by group or by site.

Cluster Modes

AOS 10 supports automatic and manual clustering modes to support Gateways that are deployed for wireless access, User Based Tunneling (UBT) or VPN Concentrators (VPNCs). A cluster can be automatically or manually established between Gateways that are assigned to the same configuration group. A cluster cannot be formed between Gateways that are assigned to separate configuration groups.

When the clustering mode for a configuration group is set to auto group or auto site, a cluster will be automatically established between the Gateways within the group with no additional configuration required. A unique cluster name is automatically generated by Central, and the cluster configuration and establishment is automatically orchestrated by Central. When the clustering mode is set to manual, the admin must select the cluster members and specify a cluster name.

Additional cluster configuration options are available for both automatic and manual clustering modes based on the Mobility, Branch or VPN Concentrator role assigned to the Gateway configuration group. These additional options are available when the Manual Cluster configuration option is enabled within the configuration group. Different options are available for Mobility, Branch and VPN Concentrator roles.

The cluster mode is defined per configuration group and each configuration group may support Gateways using both automatic and manual clustering modes. The following cluster combinations are supported per group:

  • One auto group cluster and one or more manual clusters

  • One or more auto site clusters and one or more manual clusters

  • Multiple manual clusters

The only limitation is that a configuration group cannot support multiple auto group clusters or an auto group and auto site cluster.

Auto Group Clustering

Auto group clustering mode is the default clustering mode for Mobility and VPN Concentrator Gateway configuration groups. Gateways within the configuration group with shared configuration will automatically form a cluster amongst themselves.

Gateways in configuration groups with auto group clustering enabled are assigned a unique cluster name using the auto_group_XXX format where XXX is the unique numerical ID of the configuration group. This applies to configuration groups with a single Gateway or multiple Gateways. Only one auto group cluster is permitted for each configuration group. Campus deployments with multiple clusters will implement one configuration group for each cluster. This is demonstrated in the following graphic where three configuration groups with auto group clustering are used to configure Gateways in two data centers and a DMZ:

Auto Group Clustering Mode

When auto group clusters are present in Central, they can be assigned to WLAN and wired-port profiles configured for tunnel or mixed forwarding modes. The APs can reside in the same configuration group as the Gateways or a separate configuration group. The auto group cluster you assign to each profile determines where client traffic is tunneled. You can assign one auto group cluster as a Primary Gateway Cluster and one auto group cluster as a Secondary Gateway Cluster. If present, you may assign other cluster types as a Secondary Gateway Cluster. Once the profile configuration has been saved, Central will automatically orchestrate the IPsec and GRE tunnels from the APs to the Gateway cluster nodes selected for each profile.

The following graphic demonstrates the auto group cluster options that are presented for a WLAN profile when the Tunnel forwarding mode is selected:

Auto Group cluster profile assignment

Auto Site Clustering

Auto site clustering mode is the default clustering mode for Branch Gateway configuration groups. Auto site clusters simplify operation and configuration for branch office deployments by allowing APs to automatically tunnel to Gateways in their site. Only Gateways that reside in the same configuration group and site will automatically form a cluster amongst themselves.

Gateways with auto site clustering enabled are assigned a unique cluster name using the auto_site_XX_YYY format where XX is the unique numerical ID of the site and YYY is the unique numerical ID of the configuration group. A unique cluster name is generated for sites with standalone Gateways or multiple Gateways. Only one auto site cluster is permitted per site for each configuration group.

Branch office deployments will often include Branch Gateways of different models deployed in standalone or HA configurations depending on the size and needs of each branch site. One configuration group with auto site clustering is created for each Gateway model and variation. This is demonstrated below where two configuration groups are used for 9004 series Gateways deployed standalone and in HA pairs. Each standalone Gateway and HA pair is assigned to its respective site and is automatically assigned a unique cluster name:

Auto Site clustering mode

When auto site clusters are present in Central, they can be assigned to WLAN and wired-port profiles configured for tunnel or mixed forwarding modes. The APs may reside in the same configuration group as the Gateways or a separate configuration group. If separate configuration groups are deployed, one AP configuration group will be required for each Gateway configuration group.

Unlike auto group clusters, where profiles are configured to tunnel traffic to a specific cluster, auto site clustering allows the admin to select an auto site configuration group. The dropdown for the Primary Gateway Cluster lists each Gateway configuration group with auto site clustering enabled. Once the profile configuration has been saved, Central will automatically orchestrate the IPsec and GRE tunnels from the APs to the Gateway cluster nodes in their site.

The following graphic demonstrates the auto site cluster options that are presented for a WLAN profile when the Tunnel forwarding mode is selected. In this example four configuration groups configured for auto site clustering for 9004 and 9012 series Gateways in standalone and HA pairs are presented:

Auto Site cluster profile assignment

A site may also include a second auto site cluster if additional failover is required. As only one auto site cluster can be established between Gateways in the same configuration group and site, a second configuration group is required for the additional auto site cluster to be established. The Gateways in the second auto site cluster are assigned to the same site as the Gateways in the primary auto site cluster. The second auto site configuration group can then be assigned as a Secondary Gateway Cluster within the profile. This is demonstrated below where a primary and secondary auto site cluster is assigned:

Auto Site cluster failover

Manual Clustering

Manual clustering mode is optional for Branch Gateway, Mobility and VPN Concentrator Gateway configuration groups. When automatic clustering is disabled, clusters can be manually created and named by the admin. When automatic clustering mode in a configuration group is disabled, existing auto group or auto site clusters are not removed. Existing automatic clusters can either be retained as-is or they can be removed and re-created manually.

Each manual cluster requires a unique cluster name and one or more Gateways in the group to be assigned. Each configuration group can support multiple manual mode clusters if required. Gateways within a configuration group can only be assigned to one automatic or one manual cluster at a time. Gateways can only form a manual cluster with other Gateways in the same configuration group.

Manual mode clusters are useful for situations where user defined cluster names are required, members need to be deterministically assigned, or multiple clusters need to be formed between Gateways within the same configuration group. This is demonstrated below where two configuration groups are used to configure and manage Mobility Gateways in two data centers. As VLANs and other configuration are shared, manual mode clustering is used to establish two clusters in each configuration group. This simplifies configuration and operation as two configuration groups can be used instead of the four configuration groups that auto group clustering mode would require.

Manual clustering mode

When manual clusters are present in Central, they can be assigned to WLAN and wired-port profiles configured for tunnel or mixed forwarding modes. The APs can reside in the same configuration group as the Gateways or a separate configuration group. The clusters you assign to each profile determine where client traffic is tunneled. You can assign one manual cluster as a Primary Gateway Cluster and one manual cluster as a Secondary Gateway Cluster. Once the profile configuration has been saved, Central will automatically orchestrate the IPsec and GRE tunnels from the APs to the Gateway cluster nodes selected for each profile.

The following graphic demonstrates the manual cluster options that are presented for a WLAN profile when the Tunnel forwarding mode is selected:

Manual cluster profile assignment

2.4.4 - Formation Process

Gateway clustering with AOS 10.

Cluster Formation

Cluster formation between Gateways is determined by the cluster configuration within each configuration group. When an automatic cluster mode is enabled, Central orchestrates the cluster name and configuration for each cluster node:

  • Auto group – A cluster is orchestrated between active Gateways within the same configuration group.

  • Auto site – A cluster is orchestrated between active Gateways within the same configuration group and site.

When manual cluster mode is enabled, the admin defines the cluster name and cluster members. The admin configuration initiates the cluster formation between the active Gateways.

Handshake Process

The first step of cluster formation involves a handshake process where messages are exchanged between all potential cluster members over the management VLAN between the Gateways' system IP addresses. The handshake process uses PAPI hello messages that are exchanged between nodes to verify reachability between all cluster members. Information relevant to clustering is exchanged through these hello messages, including platform type, MAC address, system IP address, and software version. After all members have exchanged hello messages, they establish IKEv2 IPsec tunnels with each other in a fully meshed configuration.

What follows is a depiction of cluster members engaging in the hello message exchange process as part of the handshake prior to cluster formation:

Handshake Process / Hello Messages

Cluster Leader Election

For each cluster one Gateway will be selected as the cluster leader. Depending on the persona of the Gateways, the cluster leader has multiple responsibilities including:

  • Active and standby VLAN designated Gateway (VDG) assignment

  • Active and standby device designated Gateway (DDG) assignment

  • Active and standby user designated Gateway (UDG) assignment

  • Standby switch designated Gateway (S-SDG) assignment

The cluster election takes place after the initial handshake as a parallel thread to VLAN probing and the heartbeat process.

WLAN Gateways

The cluster leader is elected as the result of the hello message exchange which includes each platform’s information, priority, and MAC address. The leader election process considers the following (in order):

  1. Largest Platform

  2. Configured Priority

  3. Highest MAC Address

For homogeneous clusters, the Gateway with the highest configured priority or MAC address will be elected as the cluster leader. For heterogeneous clusters, the largest Gateway with the highest configured priority or MAC address will be elected as the cluster leader. The MAC address serves as the tiebreaker when equal-capacity nodes with the same priority are evaluated.
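The election order can be illustrated with a short Python sketch; the platform capacities, priorities, and MAC addresses below are invented values used only to show how the three criteria are evaluated in sequence.

```python
# Illustrative sketch of the leader election order: largest platform, then
# configured priority, then highest MAC address. All values are invented.

GATEWAYS = [
    {"name": "DC-GW1", "platform_capacity": 32768, "priority": 128, "mac": "00:0b:86:aa:00:01"},
    {"name": "DC-GW2", "platform_capacity": 32768, "priority": 128, "mac": "00:0b:86:aa:00:09"},
    {"name": "DC-GW3", "platform_capacity": 32768, "priority": 128, "mac": "00:0b:86:aa:00:05"},
]

def elect_leader(gateways):
    """Largest platform wins; priority and MAC address act as successive tiebreakers."""
    return max(gateways, key=lambda gw: (gw["platform_capacity"], gw["priority"], gw["mac"]))

print(elect_leader(GATEWAYS)["name"])  # DC-GW2 wins on the MAC address tiebreak
```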

The following graphic depicts a cluster leader election for a four-node 7240XM homogeneous cluster. In this example DC-GW2 has the highest MAC address and is elected as the cluster leader. All other nodes become members:

WLAN cluster leader election

Branch HA Gateways

When branch HA is configured on two branch Gateways, the leader can be either automatically elected or manually selected by the admin. When a preferred leader is manually selected, no automatic election occurs, and the selected node becomes the leader.

When no preferred leader is configured, the leader election process considers the following (in order):

  1. Number of Active WAN Uplinks (Uplink Tracking)

  2. Largest Platform

  3. Highest MAC Address

Most branch Gateway deployments will implement a pair of Gateways of the same series and model, forming a homogeneous cluster. When uplink tracking is disabled, the branch Gateway with the highest MAC address will be elected as the cluster leader. The MAC address serves as the tiebreaker between equal-capacity nodes.

When uplink tracking is enabled, the number of active WAN uplinks are evaluated and the Gateway with the highest number of active WAN uplinks will be elected as the cluster leader. Inactive, virtual, and backup WAN uplinks are not considered.

VLAN Probes

Gateways in a configuration group share the same VLAN configuration and port assignments. The management and user VLANs are common between the Gateways in a cluster and must therefore be extended between the Gateways by the respective core / aggregation layer switches. A missing or isolated VLAN on one or more Gateways can result in blackholed clients.

VLAN probes are used by Gateways in a cluster to detect isolated or missing VLANs on each cluster node. Each cluster node transmits unicast EtherType 0x88b5 frames out each VLAN destined to the other cluster nodes. For a cluster consisting of four nodes, each node may transmit a VLAN probe per VLAN to three peers. To prevent unnecessary or duplicate probes, each Gateway keeps track of probe requests and responses to each cluster peer for each VLAN. If a Gateway responds to a probe for a given VLAN from a peer, the Gateway marks the VLAN as successful and will skip transmitting a probe to that peer for that VLAN.

VLANs that are present on each node receive a response and are marked as successful, while VLANs that do not receive a response are marked as failed and displayed as failed in Central. Prior to AOS 10.6, Gateways probe all configured VLANs, including VLAN 1. As there is no configuration option to exclude specific VLANs, VLAN 1 will often show as failed in Central.

In AOS 10.6 and above, VLAN probing has been enhanced to be more intelligent: only VLANs with assigned clients are probed. While the Gateway's management VLAN is always probed as it is required for cluster establishment, only user VLANs with active tunneled clients will be probed. VLANs with no tunneled clients are no longer automatically probed, preventing unused VLANs from being displayed as failed in Central. Only user VLANs that have not been extended will be displayed as failed.

VLANs that have failed probes are listed in the cluster details view in Central. This is demonstrated below where VLANs 100 and 101 have not been extended to one Gateway node in a cluster and are both listed as failed for that node. Note that in this example the Gateways are running AOS 10.5; as such, VLAN 1 is also listed as failed for each node:

Cluster polling failed VLANs

Heartbeats

Cluster nodes exchange PAPI heartbeat messages to cluster peers at regular intervals in parallel to the leader election and VLAN probing messages. These heartbeat messages are bidirectional and serve as the primary detection mechanism for cluster node failures. A round trip delay (RTD) is computed for every request and response. Heartbeats are integral to the process the cluster leader uses to determine the role of each cluster node and detect node failures.

Failure detection and failover time is determined by the cluster heartbeat threshold configured for the cluster. A failure is detected when no response is received within the configured heartbeat threshold, which can be set between 500 ms and 2000 ms. The default value of 900 ms is recommended for a single uplink, while a detection time of 2000 ms is recommended for a port-channel.
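A conceptual sketch of threshold-based failure detection is shown below; it assumes a simple table of last-response timestamps per peer and is not the actual Gateway implementation.

```python
# Conceptual sketch of threshold-based peer failure detection.
# Assumes a table of last heartbeat-response times per peer.
import time

HEARTBEAT_THRESHOLD_MS = 900  # default; 2000 ms recommended for port-channel uplinks

last_response_ms = {}  # peer name -> timestamp (ms) of last heartbeat response

def record_response(peer):
    """Record the time a heartbeat response was received from a peer."""
    last_response_ms[peer] = time.monotonic() * 1000

def failed_peers():
    """Return peers with no heartbeat response within the configured threshold."""
    now = time.monotonic() * 1000
    return [p for p, t in last_response_ms.items() if now - t > HEARTBEAT_THRESHOLD_MS]

record_response("GW-B")
print(failed_peers())  # [] while GW-B keeps responding within the threshold
```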

Connectivity and Verification

The Gateway Cluster dashboard displays a list of Gateway clusters provisioned and managed by Central. This can be accessed in Central by selecting Devices > Gateways > Clusters then selecting a specific cluster name. This view can be accessed with a global context filter or by selecting a specific configuration group or site.

The Summary view for a cluster provides important cluster information such as the cluster leader, version, capacity, and the number of node failures that can be tolerated. The graphic below provides an example summary for a two-node 7220 cluster. Note that the summary view provides color-coded client capacity over time for each node, which is useful for determining client distribution during normal and peak times. In this example each node's client capacity has remained below 40% for the past 3 hours:

Cluster summary and capacity

The Gateways view provides a list of cluster nodes, operational status, per node capacity, model, and role information. The following graphic demonstrates the status view for the above production cluster. The view shows that each cluster node is UP and SJQAOS10-GW11 has been elected as the cluster leader. Note that the number of current active and standby client sessions for each node is also provided. Clients are distributed between the available nodes based on the published bucket map for the cluster:

Cluster gateway status

The Gateways view also provides additional heartbeat and VLAN probe information for each peer. You can view the peer details for each member of the cluster using the dropdown. This is demonstrated below where the peer details for SJQAOS10-GW11 are shown. In this example the peer Gateway has a member role and is connected. Note that all VLANs (including 1) have been correctly extended between the Gateways, therefore no VLANs have failed probes:

Cluster peer status

2.4.5 - Features

Gateway clustering with AOS 10.

Cluster Features

Seamless Roaming

The advantage of introducing the concept of the UDG is that it significantly enhances the experience for client roaming within a cluster. Once a client associates to an AP, the AP hashes the client's MAC address and assigns it a UDG using the bucket map published for the cluster. Each client's traffic is always anchored to its UDG, which remains the same regardless of which AP the client roams to. As each AP maintains GRE tunnels to each cluster node, any AP the client roams to will automatically forward the traffic to the UDG upon association and authentication.

A visual representation of the roaming process within a cluster is displayed below. In this example, GW-B is the assigned UDG for the client:

Seamless client roaming

Stateful Failover

Stateful failover is a critical aspect of cluster operations that safeguards clients from any impacts associated with a Gateway failure event. When multiple Gateways are present in a cluster, each client's state is fully synchronized between the UDG and the S-UDG, meaning that information such as the station table, user table, layer 2 user state, and layer 3 user state will all be shared between both Gateways.

In addition, high value sessions such as FTP and DPI-qualified sessions are also synced to the S-UDG. Synchronizing client state and high value session information enables the S-UDG to assume the role as the client’s new UDG if the client’s current UDG fails. This permits stateful failover with no client de-authentication when clients move from their UDG to their S-UDG.

Event Driven Load Balancing

Client and device distribution is greatly simplified in AOS 10. One major change is that load balancing is no longer periodically performed during run-time and is now event driven as Gateways are added or removed from the cluster. Client distribution between cluster nodes is performed using the published bucket map for the cluster while device distribution is performed by the cluster leader based on each Gateways device capacity.

The goal of load balancing during a node addition or removal is to avoid disruption to clients and devices. When a Gateway in a cluster is taken down for maintenance or fails, impacted UDG, DDG and S-DDG sessions seamlessly transition to their standby nodes with little or no impact to traffic:

  • The cluster leader recomputes a new bucket map which is published to all devices. The bucket map is not immediately republished to provide sufficient time to activate standby client entries. The new bucket map includes the new S-UDG assignments for the clients.

  • The cluster leader reassigns the S-DDG/S-SDG sessions which are immediately published.

If the cluster leader is taken down for maintenance or fails, a new cluster leader is elected, and a role change notification is sent to all devices. The new cluster leader is responsible for recomputing and distributing the new bucket map for the cluster and performing DDG/SDG reassignments.

When a Gateway is added to a cluster, the cluster leader recomputes UDG and S-UDG assignments in two passes to avoid disruption to clients. The new bucket map from the first pass is published after 120 seconds, while the bucket map from the second pass is published after 165 seconds.

DDG assignments are also recomputed when Gateways are added to a cluster. If the cluster is operating with a single node, S-DDG assignments are made for all devices that don't have an S-DDG assignment. The cluster leader also performs load-balancing and re-assigns DDG and S-DDG sessions based on each Gateway's capacity.

Live Upgrades

In AOS 10 Gateways are configured, managed, and upgraded independently from APs. AOS 10 APs and Gateways can run different AOS 10 software versions and can both be independently upgraded with zero network downtime as maintenance windows allow.

The live upgrade feature for Gateways allows cluster nodes to be upgraded with minimal or no impact to clients. When a live upgrade is initiated, the new firmware version is downloaded to all the Gateways in the cluster to the specified partition. Once the new firmware version has been downloaded and validated, Gateways are upgraded then sequentially rebooted to ensure all tunneled sessions are synchronized as UDGs, DDGs and SDGs are rebooted.

When a live upgrade is initiated for a cluster, the upgrade status of each node is displayed. Each node will first download the specified firmware image from the cloud and will upgrade the target partition. Once upgraded, the nodes are sequentially rebooted to minimize the impact to clients and devices:

Example of Live Upgrade

Live upgrades can be performed on-demand or be scheduled. Upgrades can be scheduled for any time within one week of the current date and time. A time zone, date, and start time in hours and minutes must be specified. Scheduled live upgrades can be cancelled any time prior to the scheduled event. Here's an example of a live upgrade being scheduled for an individual cluster where new firmware will be downloaded and installed on the Gateways' primary partitions. The time zone is set to UTC and the date and time are specified.

Live Upgrade scheduling

2.4.6 - Dynamic Authorization in a Cluster

Gateway clustering with AOS 10.

Change of Authorization

Change of Authorization (CoA) is a feature which extends the capabilities of the Remote Authentication Dial-In User Service (RADIUS) protocol and is defined in RFC 5176. CoA request messages are usually sent by a RADIUS server to a Network Access Server (NAS) device for dynamic modification of authorization attributes for an existing session. If the NAS device is able to successfully implement the requested authorization changes for the client, it will respond to the RADIUS server with a CoA acknowledgement also referred to as a CoA-ACK. Conversely, if the change is unsuccessful, the NAS will respond with a CoA negative acknowledgement or CoA-NAK.

For tunneled clients, CoA requests are sent to the target client's user designated Gateway (UDG). The UDG will then return an acknowledgement to the RADIUS server upon the successful implementation of the changes or a NAK if the implementation was unsuccessful. However, a client's UDG may change during normal cluster operations due to reasons such as maintenance or failures. These scenarios can cause CoA requests to be dropped as the intended client would no longer be associated with the Gateway that received the CoA request. HPE Aruba Networking has implemented cluster redundancy features to prevent this scenario.

Cluster CoA Support

The primary protocol used to provide CoA support for clusters in AOS 10 is Virtual Router Redundancy Protocol (VRRP). In every cluster there are the same number of VRRP instances as there are nodes and each Gateway serves as the conductor of an instance. For example, a cluster with four Gateways would have four instances of VRRP and four virtual IP addresses (VIPs). The VRRP conductor receives messages intended for the VIP of its instance while the remaining Gateways in the cluster are backups for all other instances where they are not acting as the conductor. This configuration ensures that each cluster is protected by a fault-tolerant and fully redundant design.

AOS 10 reserves VRRP instance IDs in the 220-255 range. When the conductor of each instance sends RADIUS requests to the RADIUS server, it injects the VIP of its instance into the message as the NAS-IP by default. This ensures that CoA requests from the RADIUS server will always be forwarded correctly regardless of which Gateway is the acting conductor for each instance. For example, the RADIUS server sends CoA requests to the current conductor of a VRRP instance and not to an individual Gateway. From the perspective of the RADIUS server, it is sending the request to the current holder of the VIP address of the instance. Here's a depiction of a sample architecture that will be used for the duration of the CoA section:

Example CoA implementation

This sample network consists of a four-node cluster with four instances of VRRP. The assigned VRRP ID range falls between 220 and 255, therefore the four instances in this cluster are assigned the VRRP IDs of 220, 221, 222, and 223. The priorities for the Gateways in each instance are dynamically assigned where the conductor of the instance is assigned a priority of 255, the first backup is assigned a priority of 235, the second backup is assigned a priority of 215 and the third backup is assigned a priority of 195.

VRRP Instance   Virtual IP   GW-A Priority   GW-B Priority   GW-C Priority   GW-D Priority
220             VIP 1        255             235             215             195
221             VIP 2        195             255             235             215
222             VIP 3        215             195             255             235
223             VIP 4        235             215             195             255

GW-A is the conductor of instance 220 with a priority of 255, GW-B is the first backup with a priority of 235, GW-C is the second backup with a priority of 215 and GW-D is the third backup with a priority of 195. Similarly, GW-B is the conductor for instance 221 due to having the highest priority of 255, GW-C is the first backup with a priority of 235, GW-D is the second backup with a priority of 215 and GW-A is the third backup with a priority of 195. Instances 222 and 223 follow the same pattern as instances 220 and 221.
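The rotating priority pattern shown in the table can be generated programmatically; the following sketch is purely illustrative and assumes a four-node cluster with instance IDs starting at 220 and the priority ladder of 255 / 235 / 215 / 195 described above.

```python
# Illustrative sketch reproducing the rotating VRRP priority pattern for a
# four-node cluster, with instance IDs starting at 220.

GATEWAYS = ["GW-A", "GW-B", "GW-C", "GW-D"]
PRIORITY_LADDER = [255, 235, 215, 195]

def vrrp_priority_map(gateways, base_instance_id=220):
    """Return {instance_id: {gateway: priority}}; each gateway conducts one instance."""
    table = {}
    for i in range(len(gateways)):
        instance_id = base_instance_id + i
        table[instance_id] = {
            gateways[(i + offset) % len(gateways)]: PRIORITY_LADDER[offset]
            for offset in range(len(gateways))
        }
    return table

for instance, priorities in vrrp_priority_map(GATEWAYS).items():
    print(instance, priorities)
```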

CoA with Gateway Failure

The failure of a cluster node can adversely impact CoA operations if the network doesn’t have the appropriate level of fault tolerance. If a user’s anchor Gateway fails, the RADIUS server will push the CoA request to their UDG with the assumption that it will enforce the change and respond with an ACK. However, if a redundancy mechanism such as VRRP hasn’t been implemented then the request will go unanswered and will not result in a successful change. In such a scenario, the users associated with the failed node will failover to their standby UDG as usual. However, the UDG will never receive the change request from the RADIUS server since the server is not aware of the cluster operations. VRRP instances must be implemented for each node to prevent such an occurrence and maintain CoA operations in the cluster.

In the figure below, GW-A is the conductor of instance 220 with GW-B serving as the first backup, GW-C serving as the second backup and GW-D serving as the third backup. A client associated to GW-A has been fully authenticated using 802.1X with GW-D acting as the client's standby UDG. When communicating with ClearPass, GW-A automatically inserts the VIP for instance 220 as the NAS-IP. From the perspective of ClearPass, it is sending CoA requests to the current conductor of VRRP instance 220.

Client authentication against ClearPass

If GW-A fails, the client session will failover to GW-D. The client’s session moves over to GW-D as it’s the standby UDG. GW-D then assumes the role of UDG for the client. Since GW-B has a higher priority than GW-C or GW-D, it will assume the role of conductor and take ownership of the VIP.

GW-A failure

Any CoA requests sent by ClearPass for client 1 will be addressed to the VIP for instance 220. From the perspective of ClearPass, the VIP of instance 220 is the correct address for any CoA request intended for the client in the example. As GW-A has failed, GW-B is now the conductor of VRRP instance 220 and owns the VIP. When ClearPass sends a CoA request for the client, GW-B will receive it and then forward it to all nodes in the cluster. In this case GW-B forwards the request to GW-C and GW-D.

CoA message forwarded to GW-B

After the change in the CoA request has been successfully implemented, GW-D will send a CoA acknowledgement message back to ClearPass.

CoA ACK from GW-D

2.4.7 - Failover

Gateway clustering with AOS 10.

Cluster Failover

Cluster failover is a new feature in AOS 10 which permits APs servicing mixed or tunneled profiles to failover between datacenters in the event that all the cluster nodes in the primary datacenter fail or become unreachable. Cluster failover is enabled by selecting a secondary Gateway cluster when defining a new mixed or tunnel profile. Unlike failover within a cluster which is non-impacting to clients and applications, failover between clusters is not hitless.

When a secondary cluster is selected in a profile, APs servicing the profile will tunnel the client traffic to the primary cluster during normal operation. IPsec and GRE tunnels are established from the APs to cluster nodes in both the primary and secondary cluster. Failover to the secondary cluster is initiated once all the tunnels to the cluster nodes in the primary cluster go down and at least one cluster node in the secondary cluster is reachable. A primary and secondary cluster selection within a WLAN profile is depicted below.

Configuring for primary and secondary cluster.

Primary cluster failure detection typically occurs within 60 seconds. When a primary cluster failure is detected, the profiles are disabled for a further 60 seconds to bounce the tunneled clients and permit broadcast domain changes when moving between datacenters. Once re-enabled, the tunneled clients obtain new IP addressing and are able to resume communications across the network through the secondary cluster. AP and client sessions are distributed between the secondary cluster nodes in the same way as the primary cluster. Each AP is assigned a DDG & S-DDG session based on each node's capacity and load while each client is assigned a UDG & S-UDG session based on bucket map assignment.

Failover between clusters can be enabled with or without preemption. When preemption is enabled, APs can automatically fail-back to the primary cluster when one or more nodes in the primary cluster become available. When preemption is triggered, the APs include a default 5-minute hold-timer to prevent flapping. The primary cluster must be up and operational for 5 minutes (non-configurable) before fail-back to the primary cluster can occur. As with failover from the primary to secondary cluster, the profiles are disabled for 60 seconds to accommodate broadcast domain changes.

When considering deploying cluster failover, careful planning is required to ensure that the Gateways in the secondary cluster have adequate client and device capacity to accommodate a failover. Capacity of the secondary cluster should be equal or greater than the capacity in the primary cluster.

In addition to capacity planning, VLAN assignments must also be considered. While the IP networks can be unique within each datacenter, any static or dynamically assigned VLANs must be present in both datacenters and configured in both clusters. This will ensure that tunneled clients are assigned the same static or dynamically assigned VLAN during a failover. If VLAN pools are implemented, the hashing algorithm will ensure that the tunneled clients are assigned the same VLAN in each cluster.

Cluster failover can be implemented and leveraged in different ways. Your profiles can all be configured to prefer a cluster in the primary datacenter and only failover to a cluster residing in the secondary datacenter during a primary datacenter outage. In this model all the traffic workload is anchored to the primary datacenter during normal operation. A primary-secondary datacenter failover model is depicted below.

Datacenter workload failover

Alternatively, your WLAN profiles in different configuration groups can be configured to distribute the primary and secondary cluster assignments between the datacenters. For example, half the APs in a campus can be configured to prefer the primary datacenter and failover to the secondary datacenter while the other half of the APs in the campus can be configured to prefer the secondary datacenter and failover to the primary datacenter. With this model the traffic workload would be evenly distributed between both datacenters. This is sometimes referred to as salt-and-peppering as depicted below.

Datacenter workload distribution

2.4.8 - Planning

Gateway clustering with AOS 10.

Planning a Gateway Cluster

Each cluster can support a specific number of tunneled clients and tunneling devices. The Gateway series, model, and number of cluster nodes determines each cluster’s capacity. When planning a cluster, the primary consideration is the number of Gateways that are required to meet the base client, device, and tunnel capacity needs in addition to how many Gateways are required for redundancy.

Total cluster capacity factors in the base and redundancy requirements.

Cluster Capacity

A cluster’s capacity is the maximum number of tunneled clients and tunneling devices each cluster can serve. This includes each AP and UBT switch/stack that establishes tunnels to a cluster and each wired or wireless client device that is tunneled to the cluster.

For each Gateway series, HPE Aruba Networking publishes the maximum number of clients and devices supported per Gateway and per cluster. The maximum number of cluster nodes that can be deployed per Gateway series is also provided. This information and other considerations such as uplink types and uplink bandwidth are used to select a Gateway model and the number of cluster nodes that are required to meet the base capacity needs.

Once your base capacity needs are met, you can then determine the number of additional nodes that are needed to provide redundant capacity to accommodate maintenance events and failures. The additional nodes added for redundancy are not dormant during normal operation and will carry user traffic. Additional nodes can be added as needed up to the maximum supported cluster size for the platform.

7000 / 9000 Series - Gateway Scaling

Scaling                 7005      7008      7010      7024      7030      9004      9012
Max Clients / Gateway   1,024     1,024     2,048     2,048     4,096     2,048     2,048
Max Clients / Cluster   4,096     4,096     8,192     8,192     16,384    8,192     8,192
Max Devices / Gateway   64        64        128       128       256       128       512
Max Devices / Cluster   256       256       512       512       1,024     512       1,024
Max Tunnels / Gateway   5,120     5,120     5,120     5,120     10,240    5,120     5,120
Max Cluster Size        4 Nodes   4 Nodes   4 Nodes   4 Nodes   4 Nodes   4 Nodes   4 Nodes

7200 Series – Gateway Scaling

Scaling                 7205       7210       7220       7240XM     7280
Max Clients / Gateway   8,192      16,384     24,576     32,768     32,768
Max Clients / Cluster   98,304     98,304     98,304     98,304     98,304
Max Devices / Gateway   1,024      2,048      4,096      8,192      8,192
Max Devices / Cluster   2,048      4,096      8,192      16,384     16,384
Max Tunnels / Gateway   12,288     24,576     49,152     98,304     98,304
Max Cluster Size        12 Nodes   12 Nodes   12 Nodes   12 Nodes   12 Nodes

9100 / 9200 Series – Gateway Scaling

Scaling                 9114      9240 Base   9240 Silver   9240 Gold
Max Clients / Gateway   10,000    32,000      48,000        64,000
Max Clients / Cluster   60,000    128,000     192,000       256,000
Max Devices / Gateway   4,000     4,000       8,000         16,000
Max Devices / Cluster   8,000     8,000       16,000        32,000
Max Tunnels / Gateway   40,000    40,000      80,000        160,000
Max Cluster Size        6 Nodes   6 Nodes     6 Nodes       6 Nodes

Maximum cluster capacity

Each cluster can support a maximum number of clients and devices that cannot be exceeded. The number of cluster nodes required to reach a cluster’s maximum client or device capacity will vary by Gateway series and model. In some cases the maximum number of clients and devices for a cluster can only be reached by ignoring any high availability requirements and running with no redundancy.

Gateway series   Gateway model   Max cluster client capacity
7000             All             4 Nodes
9000             All             4 Nodes
7200             7205            12 Nodes
7200             7210            6 Nodes
7200             7220            4 Nodes
7200             7240XM / 7280   3 Nodes
9100             All             6 Nodes
9200             All             4 Nodes

Gateway series   Gateway model   Max cluster device capacity
7000             All             4 Nodes
9000             All             4 Nodes
7200             All             2 Nodes
9100             All             2 Nodes
9200             All             2 Nodes

When a cluster’s client or device maximum capacity has been reached, the addition of more cluster nodes will not provide any additional client or device capacity. A cluster cannot support more clients or devices than the stated maximum for the Gateway series or model. Once the maximum client or device capacity has been reached for a cluster, each additional node will add forwarding and uplink capacity for client traffic in addition to client and device capacity for failover.

What Consumes Capacity

Each tunneled client and tunneling device consumes resources within a cluster. Each Gateway model can support a specific number of clients and devices that directly correlates to the available processing, memory resources and forwarding capacity for each platform. HPE Aruba Networking tests and validates each platform at scale to determine these limits.

With AOS 10 the Gateway scaling capacity has changed from what was set with AOS 8. These new capacities should be considered when evaluating a Gateway model or series for deployment with AOS 10. As AP management and control is no longer provided by Gateways, the number of supported devices and tunnels has been increased.

Client Capacity

Each tunneled client device (unique MAC) consumes one client resource within a cluster and counts against the cluster’s published client capacity. For each Gateway series and model, HPE Aruba Networking provides the maximum number of clients that can be supported per Gateway and per homogeneous cluster. Each Gateway model and cluster cannot support more clients than the stated maximum.

When determining client capacity needs for a cluster, consider all tunneled clients that are connected to Campus APs, Microbranch APs, and UBT switches. Each tunneled client consumes one client resource within the cluster. Clients that need to be considered include:

  • WLAN clients connected to Campus APs.

  • WLAN clients connected to Microbranch APs implementing Centralized Layer 2 (CL2) forwarding.

  • Wired clients connected to tunneled downlink ports on APs.

  • Wired clients connected to UBT ports.

Note: Only tunneled clients that terminate in a cluster need to be considered. WLAN and wired clients connected to Campus APs, Microbranch APs or UBT ports that are bridged by the devices are excluded. WLAN and wired clients connected to Microbranch APs implementing Distributed Layer 3 (DL3) forwarding may also be excluded.

Each AP and active UBT port establish GRE tunnels to each cluster node. The bucket map published by the cluster leader determines each tunneled client’s UDG and S-UDG assignment. A client’s UDG assignment determines which GRE tunnel the AP or UBT switch uses to forward the client’s traffic. If the client’s UDG fails, the client’s traffic is transitioned to the GRE tunnel associated with the client’s assigned S-UDG.

The number of tunneled clients does not influence the number of GRE tunnels that APs or UBT switches establish to the cluster nodes. Each AP and active UBT port will establish one GRE tunnel to each cluster node regardless of the number of tunneled client devices the WLAN or UBT port is servicing. The number of WLAN and wired port profiles also does not influence the number of GRE tunnels. The GRE tunnels are shared by all the profiles that terminate within a cluster.

The figure below depicts the client resource consumption for a 4 node 7240XM cluster supporting 60K tunneled clients. A four node 7240XM cluster can support a maximum of 98K clients and each node can support a maximum of 32K clients. In this example each client is assigned a UDG and S-UDG using the published bucket map for the cluster that is distributed between the four cluster nodes. Each cluster node in this example is allocated 15K UDG sessions and 15K S-UDG sessions during normal operation.

An example showing how a four node cluster of 7240XM Gateways supporting 60K tunneled clients will consume the available client capacity on each node within the cluster.

Device Capacity

Each tunneling device consumes one device resource within a cluster and counts against the cluster’s published device capacity. For each Gateway series and model, HPE Aruba Networking provides the maximum number of devices that can be supported per Gateway and per homogeneous cluster. Each Gateway model and cluster cannot support more devices than the stated maximum.

When determining device capacity for a cluster, you need to consider all devices that are tunneling client traffic to the cluster. Each device that is tunneling client traffic to a cluster consumes a device resource within the cluster. Devices that need to be considered include:

  • Campus APs

  • Microbranch APs

  • UBT Switches

Each AP and UBT switch that is tunneling client traffic to a cluster establishes IPsec tunnels to each cluster node for signaling, messaging and bucket map distribution. The cluster leader determines each AP's DDG and S-DDG assignment, which are load balanced based on each cluster node's capacity and load. For UBT switches, the admin configuration determines each UBT switch's SDG assignment while the cluster leader determines the S-SDG assignment. UBT switches implement a PAPI control-channel to the SDG node for signaling, messaging and bucket map distribution.

The figure below depicts the device resource consumption for a 4 node 7240XM cluster supporting 8K APs. A four node 7240XM cluster can support a maximum of 16K devices and each node can support a maximum of 8K devices. In this example each AP is assigned a DDG and S-DDG by the cluster leader that are distributed between the four cluster nodes. Each cluster node in this example is allocated 2K DDG sessions and 2K S-DDG sessions during normal operation.

An example showing how a four node cluster of 7240XM Gateways supporting 8K APs will consume the available device capacity on each node within the cluster.

Tunnel Capacity

APs and UBT switches establish IPsec and/or GRE tunnels to each cluster node. APs will only establish tunnels to a cluster when a WLAN or wired-port profile is configured for mixed or tunnel forwarding, and a cluster is selected as the primary or secondary cluster. UBT switches will only tunnel to the cluster that is configured as the primary or secondary IP as part of the switch configuration.

The following types of tunnels will be established:

  • Campus APs – IPsec and GRE tunnels

  • Microbranch APs (CL2) – IPsec and GRE tunnels. GRE tunnels are encapsulated in IPsec.

  • UBT Switches – GRE Tunnels

The tunnels from Campus APs and Microbranch APs are orchestrated by Central while the GRE tunnels from UBT switches are initiated based on admin configuration. Each tunnel from an AP or UBT switch consumes tunnel resources on each Gateway within a cluster. Unlike client and device capacity that is evaluated per cluster, tunnel capacity is evaluated per Gateway.

The number of tunnels that a device can establish to each Gateway in a cluster will vary by device. During normal operation, APs will establish 2 x IPsec tunnels (SPI-in and SPI-out) per Gateway for DDG sessions and 1 x GRE tunnel per Gateway for UDG sessions. The number of IPsec tunnels will periodically increase to 4 x IPsec tunnels per Gateway during re-keying (5 total). Microbranch APs configured for CL2 forwarding consume the same number of tunnels as Campus APs. The main difference being that each GRE tunnel is encapsulated in the IPsec tunnel.

Tunnel consumption for a Campus AP is depicted in the figure below. In this example the AP has established 2 x IPsec tunnels and 1 x GRE tunnel to each Gateway in the cluster. The 2 additional IPsec tunnels that are periodically established to each Gateway for re-keying are also shown. Worst case, each AP will establish a total of 5 tunnels to each Gateway in the cluster during re-keying.

An AP will potentially have five individual tunnels operational to each cluster node; each AP will reserve tunnel capacity accordingly.

For WLAN only deployments, the need for calculating the tunnel consumption per Gateway is not required as the maximum number of devices supported per Gateway already factors in the worst-case maximum of 5 tunnels per AP. As the maximum number of devices per Gateway is a hard limit, there will never be more tunnels established by APs than a Gateway can support.

The number of GRE tunnels established to each cluster node per UBT switch or stack is variable based on UBT version and number of UBT ports. For both UBT versions, 1 x GRE tunnel is established per UBT port to each Gateway in the cluster which are used for UDG sessions. The total number of UBT ports will therefore influence the total number of GRE tunnels that are established to each cluster node.

When UBT version 1.0 is deployed, two additional GRE tunnels are established from each UBT switch or stack to their SDG/S-SDG cluster nodes. These additional GRE tunnels are used to forward broadcast and multicast traffic destined to clients similar to how DDG tunnels are used on APs. Each UBT switch or stack configured for UBT version 1.0 will therefore consume two additional GRE tunnels per cluster.

Tunnel consumption for a UBT switch with two active UBT ports is depicted in the figure below. In this example the UBT switch is configured for UBT version 1.0 and has established 1 x GRE tunnel to each of its SDG and S-SDG Gateways for broadcast / multicast traffic destined to clients. Additionally, each active UBT port has established 1 x GRE tunnel to each Gateway for UDG sessions. If all 48 ports were active in this example, a total of 49 x GRE tunnels would be established per Gateway. Note that the number of clients per UBT port does not influence GRE tunnel count but would count against the cluster's client capacity.

Example of how a switch will build GRE tunnels to a gateway when configured with support for UBT.

As the tunnel consumption for UBT deployments is variable, it is therefore important to understand the UBT version that will be implemented, the total number of UBT switches or stacks, and the total number of UBT ports. For UBT version 1.0, each switch / stack will consume 2 x GRE tunnels per cluster and each UBT port will consume 1 x GRE tunnel per Gateway in the cluster for UDG sessions. For UBT version 2.0, each UBT port will consume 1 x GRE tunnel per Gateway in the cluster for UDG sessions.

For mixed WLAN and UBT switch deployments, the number of tunnels that can be consumed by both the APs and UBT switches may potentially exceed the Gateways' tunnel capacity. As such it is important to calculate the total number of tunnels needed to support your deployment as each Gateway in the cluster will be terminating tunnels from both APs and UBT switches.
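The per-Gateway tunnel math described above can be approximated with a short planning sketch. The function below is an estimate only: it assumes the worst-case five tunnels per AP and, for UBT version 1.0, up to one additional broadcast / multicast GRE tunnel per switch or stack landing on any single node; the input values are hypothetical.

```python
# Rough planning estimate of worst-case tunnels terminating on a single cluster
# node for a mixed AP + UBT deployment. All input values are hypothetical.

def tunnels_per_gateway(num_aps, num_ubt_switches, num_ubt_ports, ubt_version="2.0"):
    """Estimate worst-case tunnels terminating on a single Gateway in the cluster."""
    ap_tunnels = num_aps * 5             # 4 x IPsec during re-keying + 1 x GRE per AP
    ubt_tunnels = num_ubt_ports          # 1 x GRE per active UBT port per Gateway
    if ubt_version == "1.0":
        ubt_tunnels += num_ubt_switches  # SDG/S-SDG broadcast/multicast tunnels (worst case)
    return ap_tunnels + ubt_tunnels

print(tunnels_per_gateway(num_aps=500, num_ubt_switches=50, num_ubt_ports=1200, ubt_version="1.0"))
```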

Determining Capacity

To successfully determine a cluster’s base capacity requirements, a good understanding of the environment is needed. Each Gateway model is designed to support a specific number of clients, devices and tunnels, and can forward a specific amount of encrypted and unencrypted traffic. The number of cluster nodes you deploy in a cluster will determine the total number of clients and devices that can be supported during normal operation and during maintenance or failure events.

Base Capacity

A successful cluster design starts by gathering requirements which will influence the Gateway model and number of cluster nodes you deploy. Once the base capacity has been determined, additional nodes can then be added to the base cluster as redundant capacity.

To determine a cluster's base capacity requirements, the following information needs to be gathered:

  • Total Tunneled Clients – The total number of client devices that will be tunneled to the cluster. This includes wireless clients, clients connected to wired AP ports and wired clients connected to UBT ports. Each unique client MAC address counts as one client.

  • Total Tunneling Devices – The total number of devices that are establishing tunnels to the cluster. This will include Campus APs, Microbranch APs and UBT switches. Each AP, UBT switch / stack counts as one device.

  • Total UBT Ports – If UBT is deployed, the total number of UBT ports across all switches and stacks must be known.

  • UBT Version – The UBT version determines if additional GRE tunnels are established to the cluster from each UBT switch or stack for broadcast / multicast traffic destined to clients. This can be significant if the total number of UBT switches or stacks are high.

  • Traffic Forwarding – The minimum aggregate amount of user traffic that will need to be forwarded by the cluster. This will help with Gateway model selection.

  • Uplink Ports – The types of Ethernet ports needed to connect each Gateway to their respective switching layer and the number of uplink ports that need to be implemented.

Determining the number of clients and devices that need to be supported by a cluster is a straightforward process. Each tunneled client (wired and wireless) will consume one client resource within the cluster. Each AP and UBT switch or stack that is tunneling client traffic to a cluster will consume one device resource within that cluster. A Gateway model and number of nodes can then be selected to meet the client and device capacity needs. The primary goal is to deploy the minimum number of cluster nodes required to meet your base client and device capacity needs.

When evaluating client and device capacities to select a Gateway, the best practice is to use 80% of published Gateway and cluster scaling numbers to ensure that your base cluster design will include 20% additional capacity to accommodate future expansion. Designing a cluster at 100% scale is not recommended as there will be no additional capacity to support additional clients or devices after the initial deployments.

The general philosophy used to select a Gateway model and determine the minimum number of nodes required to meet the base capacity needs starts with referencing the tables below. These tables provide the maximum number of clients and devices supported per Gateway and per cluster and can aid by narrowing the choice of Gateways to a specific series or model.

For example, if your base cluster needs to support 50,000 clients and 5,000 APs, the 7000 and 9000 series Gateways can be quickly eliminated, as can the 7205 and 7210 models. The remaining Gateway options are reduced to the 7220, 7240XM, 7280 and 9240 base models.

Using 80% scaling numbers, the minimum number of nodes required to meet the client and device capacity requirements for each Gateway model can be calculated and evaluated. For each model the maximum clients and devices supported per platform are captured and 80% numbers determined. The number of nodes required to meet the client and device requirements for each platform can then be determined. The minimum number of nodes required to meet your client and device capacity will likely differ. For example, a specific model of Gateway may require 2 nodes to meet client capacity needs and 1 node to meet device capacity needs.

This is demonstrated below where the 80% client and device capacities for each Gateway model are captured and listed per node. Each value is multiplied to determine how many nodes are required to meet the 50,000 client and 5,000 AP requirement. Using the 7220 as an example, a minimum of 3 nodes is required to meet the client capacity requirements (19,660 x 3 = 58,980) while a minimum of 2 nodes is required to meet the device capacity requirements (3,277 x 2 = 6,554).

Other Gateway models require a minimum of 1 or 2 nodes to meet the above client and device capacity requirements. As such the 7220 can be excluded from consideration as 3 nodes are required to meet the capacity needs vs. 2 nodes for other models.

Model       Client Cap (80%)   Min Nodes   Cluster Clients   Device Cap (80%)   Min Nodes   Cluster Devices
-----       ----------------   ---------   ---------------   ----------------   ---------   ---------------
7220        19,660             3           58,980            3,277              2           6,554
7240XM      26,214             2           52,428            6,554              1           6,554
7280        26,214             2           52,428            6,554              1           6,554
9240 Base   25,600             2           51,200            3,200              2           6,400
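
The node-count arithmetic above can be reproduced with a short calculation. The following is a minimal sketch, assuming the 80% per-node values from the table above; always derive these values from the current published scaling numbers for the platforms and software release you are evaluating.

import math

# 80% per-node client and device capacities (illustrative values from the table above).
models = {
    "7220":      {"clients": 19_660, "devices": 3_277},
    "7240XM":    {"clients": 26_214, "devices": 6_554},
    "7280":      {"clients": 26_214, "devices": 6_554},
    "9240 Base": {"clients": 25_600, "devices": 3_200},
}

required_clients = 50_000   # tunneled wireless and wired clients
required_devices = 5_000    # APs plus UBT switches / stacks tunneling to the cluster

for name, cap in models.items():
    nodes_for_clients = math.ceil(required_clients / cap["clients"])
    nodes_for_devices = math.ceil(required_devices / cap["devices"])
    base_nodes = max(nodes_for_clients, nodes_for_devices)
    print(f"{name:10s} clients: {nodes_for_clients} nodes, devices: {nodes_for_devices} nodes "
          f"-> base cluster: {base_nodes} nodes")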

The next step is to evaluate the number of uplink ports and port types needed to connect the Gateways to their respective core / aggregation layer switches. As a best practice, each Gateway should connect to a redundant switching layer using a minimum of two ports in a LACP configuration. Each Gateway model is available with different Ethernet port configurations supporting different speeds. Gateway models are available with copper, SFP, SFP+, SFP28 and QSFP+ interfaces; the port configurations for each model are listed in the datasheets.

In the above example, the 7240XM, 7280 and 9240 base models all support a minimum of four SFP+ ports, and any of them can be selected if 10Gbps uplinks are required. If higher speed uplinks such as 25Gbps or 40Gbps are needed, the 7240XM can be excluded.

In parallel, the forwarding performance of each Gateway model needs to be considered. The maximum amount of traffic that each Gateway model can forward is provided in the published datasheets. Each Gateway model can forward a specific amount of user traffic and the number of nodes in the cluster determines the aggregate throughput of the cluster. For example, a 9240 base Gateway can provide up to 20Gbps of forwarding capacity. A 2-node 9240 base cluster will offer an aggregate forwarding capacity of 40Gbps (2 x 20Gbps).

If more aggregate forwarding capacity is required, a different Gateway model and uplink type might be selected. For example, a 7280 series Gateway connected using QSFP+ interfaces can provide up to 80Gbps of forwarding capacity per Gateway, so a 2-node 7280 cluster offers an aggregate forwarding capacity of 160Gbps (2 x 80Gbps).

In the above example, both the 9240 base and 7280 series Gateways meet the base capacity requirements with a 2-node cluster. The ultimate decision as to which Gateway model to use will likely come down to uplink port preference based on the port types that are available on the switching layer and aggregate forwarding capacity requirements. Additional nodes can be added to the base cluster design if more uplink and aggregate forwarding capacity is required.

The above example captured the methodology used to select a Gateway model and determine the minimum cluster size for a wireless LAN only deployment and did not evaluate tunnel capacity. As a Gateway cannot support more APs than its maximum device capacity, a Gateway’s tunnel capacity cannot be exceeded in a wireless LAN only deployment.

When UBT is deployed, the number of clients and devices will influence your base cluster client and device capacity requirements while the UBT version and total number of UBT ports will influence tunnel capacity requirements. As the total number of UBT switches or stacks and UBT ports are variable, additional validation will be required to ensure that tunnel capacity on a selected Gateway model is not exceeded:

  • UBT version 1.0 – Each UBT switch or stack will consume 2 x GRE tunnels to the cluster for broadcast / multicast traffic destined to clients. Additionally, each UBT port will consume 1 x GRE tunnel to each Gateway in the cluster.

  • UBT version 2.0 – Each UBT port will consume 1 x GRE tunnel to each Gateway in the cluster.

Expanding on the previous example, let’s assume the base cluster needs to support 50,000 clients, 4,500 APs, 512 UBT switches / stacks and 12,288 UBT ports, and that UBT version 2.0 will be implemented. The total number of clients and devices remains roughly the same (4,500 APs plus 512 UBT switches / stacks is approximately 5,000 devices), but we have now introduced additional GRE tunnels to support the UBT ports.

We have already determined that a 2-node cluster using 7240XM, 7280 or 9240 base series Gateways can meet the base client and device capacity needs. The next step is to calculate tunnel consumption. Each AP will establish up to 5 tunnels to each Gateway and each UBT port will consume 1 tunnel to each Gateway. With simple multiplication and addition, we can determine the total number of tunnels that are required:

  • AP Tunnels / Gateway: 5 x 4500 = 22,500

  • UBT Port Tunnels / Gateway: 12,288

For this example, a total of 34,788 tunnels per Gateway is required. We can determine the maximum tunnel capacity for each Gateway model and calculate the 80% tunnel scaling number. The number of required tunnels is then subtracted to determine the remaining number of tunnels for each model.

This is demonstrated in the table below that shows that our tunnel capacity requirements can be met by both the 7240XM and 7280 series Gateways but not the 9240 base series Gateway. The 9240 base Gateway would not be a good choice for this mixed wireless LAN / UBT deployment unless a separate cluster is deployed.

Model       Tunnel Capacity (80%)   Required   Remaining
-----       ---------------------   --------   ---------
7240XM      76,800                  34,788     42,012
7280        76,800                  34,788     42,012
9240 Base   32,000                  34,788     -2,788

If UBT version 1.0 was deployed in the above example, two additional GRE tunnels would be consumed per UBT switch or stack to the cluster. In this example 1,024 additional GRE tunnels would be established from the 512 UBT switches to different Gateways within the cluster based on the SDG / S-SDG assignments. To calculate the additional per-Gateway tunnel consumption for UBT version 1.0, the total number of additional tunnels is divided by the number of base cluster nodes. For a 2-node base cluster, 512 additional tunnels would be consumed per Gateway.
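
The tunnel arithmetic for this mixed wireless LAN / UBT example can be worked through the same way. The following is a minimal sketch; the 80% tunnel capacities are illustrative and should be taken from the published scaling numbers, and the UBT 1.0 broadcast / multicast tunnels are assumed to be evenly distributed across the base cluster nodes for planning purposes.

import math

aps           = 4_500    # tunneling APs
ubt_switches  = 512      # UBT switches / stacks (UBT 1.0 only)
ubt_ports     = 12_288   # total UBT ports
cluster_nodes = 2        # base cluster size
ubt_version   = "2.0"    # "1.0" or "2.0"

# Per-Gateway tunnel consumption: each AP establishes up to 5 tunnels to each Gateway
# (worst case during re-keying) and each UBT port consumes 1 GRE tunnel to each Gateway.
tunnels_per_gateway = (aps * 5) + ubt_ports

# UBT 1.0 adds 2 broadcast / multicast GRE tunnels per UBT switch or stack to the cluster,
# spread across the nodes by the SDG / S-SDG assignments.
if ubt_version == "1.0":
    tunnels_per_gateway += math.ceil((ubt_switches * 2) / cluster_nodes)

# Compare against 80% of the published per-Gateway tunnel capacity (illustrative values).
tunnel_capacity_80 = {"7240XM": 76_800, "7280": 76_800, "9240 Base": 32_000}
for model, capacity in tunnel_capacity_80.items():
    print(f"{model:10s} required: {tunnels_per_gateway}  capacity: {capacity}  "
          f"remaining: {capacity - tunnels_per_gateway}")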

Redundant Capacity

Once a base cluster design has been determined, additional nodes can be added to provide redundant capacity. Each additional node added to a base cluster provides additional forwarding capacity, uplink capacity, and redundant client and device capacity to accommodate maintenance and failure events. It’s important to note that the additional nodes added to your base cluster are not dormant; they will support client and device sessions and forward traffic during normal operation.

The number of additional nodes that you add to your base cluster for redundant capacity will be influenced by your tolerance for how many cluster nodes can be lost before client or device capacity is impacted. Your cluster design may include as many redundant nodes as the maximum cluster size for the Gateway series supports.

Minimum redundancy is provided by adding one redundant node to the base cluster. This is referred to as N+1 redundancy, where the cluster can sustain the loss of a single node without impacting clients or devices. An N+1 redundancy model is typically employed for base clusters consisting of a single node but may also be used to provide redundancy for base clusters with multiple nodes. The following is an example of an N+1 redundancy model where one additional node is added to each base cluster:

N+1 redundancy in a cluster is achieved by adding a gateway to a cluster, allowing for single node failure without interruptions.

The maximum number of redundant nodes that you add to your base cluster will typically be less than or equal to the number of nodes in the base cluster. The only limitation is the maximum number of cluster nodes the Gateway series can support.

When the number of redundant nodes equals the number of base cluster nodes, maximum redundancy is provided. This is referred to as 2N redundancy (also known as N+N redundancy) where the cluster can sustain the loss of half its nodes without impacting clients or devices. 2N redundancy is typically employed in mission critical environments where continuous operation is required. The cluster nodes may reside within the same datacenter or be distributed between datacenters when bandwidth and latency permits. The 2N redundancy model is depicted below where three redundant nodes are added to a three-node base cluster design:

2N Redundancy

Most cluster designs will not include more redundant nodes than the base cluster unless additional forwarding, uplink or firewall capacity is required. Your cluster design may include a single node for redundancy for N+1 redundancy, twice as many nodes for 2N redundancy or something in between.
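
For a given base cluster size, the total cluster size for each redundancy model is a simple calculation. The following is a minimal sketch; the maximum cluster size used here is an assumed input and should be taken from the published scaling numbers for your Gateway series.

base_nodes       = 3     # nodes required to meet base client / device capacity
max_cluster_size = 12    # assumed maximum cluster size for the Gateway series

n_plus_1_nodes = base_nodes + 1    # survives the loss of one node
two_n_nodes    = base_nodes * 2    # survives the loss of half the nodes

for label, total in (("N+1", n_plus_1_nodes), ("2N", two_n_nodes)):
    status = "ok" if total <= max_cluster_size else "exceeds maximum cluster size"
    print(f"{label}: {total} nodes ({status})")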

MultiZone

One main architectural change in AOS 10 is that WLAN and wired-port profiles in an AP configuration group can terminate on different clusters. This capability is referred to as MultiZone and is supported by Campus APs using profiles configured for mixed or tunnel forwarding and Microbranch APs with profiles configured for Centralized Layer 2 (CL2) forwarding.

MultiZone has various applications within an enterprise network. The most common use is segmentation where different classes of traffic are tunneled to different points within the network. For example, trusted traffic from an employee WLAN is tunneled to a cluster located in the datacenter while untrusted traffic from a guest/visitor WLAN is tunneled to a cluster located in a DMZ behind a firewall. Other common uses include departmental access and multi-tenancy.

When planning for capacity for a MultiZone deployment, the following considerations need to be made:

  • Each AP will consume a device resource on each cluster it is tunneling client traffic to.

  • Each AP will establish IPsec and GRE tunnels to each cluster node for each cluster it is tunneling client traffic to.

  • Each tunneled client will consume a client resource on the cluster it is tunneled to.

  • Each AP can tunnel to a maximum of twelve Gateways across all clusters.

MultiZone is enabled when WLAN or wired-port profiles configured for mixed or tunnel forwarding are provisioned that terminate on separate clusters within the Central instance. When enabled, APs will establish IPsec and GRE tunnels to each node in each cluster. As with a single cluster implementation, the APs will establish 3 tunnels to each cluster node during normal operation and 5 tunnels during re-keying.

DDG and S-DDG sessions are allocated in each cluster by that cluster’s leader, which also publishes the bucket map for its respective cluster. Each tunneled client is allocated a UDG and S-UDG session in their respective cluster based on the bucket map for that cluster.

Tunnel consumption for a MultiZone AP deployment is depicted below. In this example an AP is configured with three WLAN profiles where two WLAN profiles terminate on an employee cluster while one WLAN profile terminates on a guest cluster. The APs establish IPsec and GRE tunnels to each cluster and are assigned DDG sessions in each cluster and receive a bucket map for each cluster. Clients connected to WLAN A or WLAN B are assigned UDG sessions in the employee cluster while clients connected to WLAN C are assigned UDG sessions in the guest cluster.

Multizone capacity

Capacity planning for a MultiZone deployment follows the methodology described in previous sections where the base capacity for each cluster is designed to support the maximum number of tunneling devices and tunneled clients that terminate in each cluster. Additional nodes are then added for redundant capacity.

As mixed and tunneled WLAN and wired-port profiles can be distributed between multiple configuration groups in Central, a good understanding of the total number of APs that are assigned to profiles terminating in each cluster is required. Device capacity and tunnel consumption may be equal across clusters if profiles are common between all APs and configuration groups or unequal if different profiles are assigned to APs in each configuration group.

For example, if WLAN A, WLAN B and WLAN C in this illustration are assigned to 1,000 APs in configuration group A and WLAN A and WLAN B are assigned to 1,000 APs in configuration group B, 2,000 device resources would be consumed in the employee cluster while 1,000 device resources would be consumed in the guest cluster. Tunnel consumption would be 10,000 on the Gateways in the employee cluster and 5,000 on the Gateways in the guest cluster.

An understanding of the maximum number of tunneled clients per cluster across all WLANs is also required and this will typically vary between clusters. For example, the employee cluster may be designed to support a maximum of 10,000 employee devices while the guest cluster may be designed to support a maximum of 2,000 guest or visitor devices. In this case WLAN A and WLAN B would consume 10,000 client resources on the employee cluster while WLAN C would consume 2,000 client resources on the guest cluster.
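
The MultiZone bookkeeping described above can be expressed as a short calculation. The following is a minimal sketch using the example values from this section; the cluster sizes, the worst-case figure of 5 tunnels per AP per Gateway, and the per-cluster client maximums are assumptions for illustration only.

# Configuration groups: AP count and the clusters their assigned profiles terminate on.
groups = {
    "group A": (1_000, {"employee", "guest"}),
    "group B": (1_000, {"employee"}),
}
cluster_nodes = {"employee": 4, "guest": 2}            # assumed cluster sizes
max_clients   = {"employee": 10_000, "guest": 2_000}   # design maximums per cluster

# One device resource is consumed per AP in each cluster it tunnels to.
devices = {cluster: 0 for cluster in cluster_nodes}
for ap_count, clusters in groups.values():
    for cluster in clusters:
        devices[cluster] += ap_count

for cluster in cluster_nodes:
    tunnels_per_gateway = devices[cluster] * 5         # worst case: 5 tunnels per AP per Gateway
    print(f"{cluster}: {devices[cluster]} device resources, "
          f"{tunnels_per_gateway} tunnels per Gateway, "
          f"up to {max_clients[cluster]} client resources")

# Each AP can tunnel to a maximum of twelve Gateways across all clusters.
for group, (ap_count, clusters) in groups.items():
    total_gateways = sum(cluster_nodes[c] for c in clusters)
    assert total_gateways <= 12, f"{group} exceeds the twelve Gateway per AP limit"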

2.5 - Forwarding Modes of Operation

Bridged, tunneled, or mixed, there are multiple options available for forwarding traffic when deploying AOS 10 access points; discover the forwarding modes, implications on roaming, and best practices for implementing these modes.

In AOS 10, client traffic can be bridged locally by the APs or be tunneled to a Gateway cluster. How the client traffic is forwarded by the APs is determined by the traffic forwarding mode configured in each WLAN and downlink wired port profile:

  • Bridge – APs will bridge client traffic out their uplink interfaces on the desired VLAN.

  • Tunnel – APs will tunnel client traffic to a Gateway cluster on the desired VLAN.

  • Mixed – The AP will bridge or tunnel client traffic based on VLAN assignment.

Traffic Forwarding Modes

The traffic forwarding mode configured in each profile determines if the APs or the Gateways are the authenticators and where the VLAN and user-role assignment decisions are made:

  • Bridge Forwarding – APs are the authenticators and determine the static or dynamic VLAN and user role assignment for each client.

  • Tunnel or Mixed Forwarding – Gateways are the authenticators and determine the static or dynamic VLAN and user role assignments for each client.

For mixed forwarding, the assigned VLAN ID determines if the client’s traffic is bridged locally by the AP or tunneled to a Gateway cluster. Client traffic is bridged if the assigned VLAN is not present within the assigned Gateway cluster and is tunneled if the VLAN is present within the assigned Gateway cluster.

The traffic forwarding modes are extremely flexible, permitting wireless client traffic to be bridged or tunneled as needed. Selecting the Bridge or Tunnel forwarding mode configures the profile exclusively for that forwarding type; a profile configured for bridge forwarding cannot tunnel user traffic and vice versa. Mixed forwarding permits both forwarding types but requires dedicated VLANs to be implemented for bridged and tunneled clients.

Bridge Forwarding

When bridge traffic forwarding is configured in a WLAN or downlink wired port profile, the client traffic is forwarded directly out of the AP’s uplink port(s) onto the access switching layer with an appropriate 802.1Q VLAN tag. To support bridge forwarding, the AP management and bridged user VLANs are extended from the access switching layer to the AP’s uplink ports. Each bridged client is assigned a VLAN that is 802.1Q tagged out the AP’s uplink port. As a recommended security best practice, no bridged clients should be assigned to the AP management VLAN.

An example of a bridge forwarding deployment is depicted below where the AP management VLAN (not shown) and bridged user VLANs 76 and 79 are extended from the access switching layer to a hospitality AP that services wired and wireless clients. In this example the WLAN client is assigned VLAN 76 while the wired client is assigned VLAN 79. The core / aggregation switch has IP interfaces and IP helper addresses defined for each VLAN and is the default gateway for each VLAN.

Bridge Forwarding Mode

Seamless Roaming

To provide the best possible experience for bridged clients and their applications, the AP management and user VLANs are extended between APs that establish common RF coverage areas within a building or floor. The AP management and bridged user VLANs are shared between the APs and are allocated a specific IP network based on the number of hosts each VLAN needs to support.

Clients roaming between APs sharing VLANs are able to maintain their VLAN assignment, IP addressing and default gateway after a roam. This is often referred to as a seamless roam as it is the least disruptive to applications. The roam can be a fast roam or slow roam depending on the WLAN profile configuration and capabilities of the client.

A seamless roam is depicted below, where bridged user VLAN 76 and its associated IP network (10.200.76.0/24) have been extended between all the APs within a building. In this example the client is able to maintain its VLAN 76 membership, IP addressing and default gateway after each roam.

Bridge Forwarding & Seamless Roaming

Bridge Forwarding Scaling

The total number of APs and bridged clients that can be supported across shared management and user VLANs will also influence your VLAN and IP network design. Broadcast / multicast frames and packets are normal in IP networks and are used by both APs and clients to function. The higher the number of active hosts connected to a given VLAN, the higher the quantity and frequency of broadcast / multicast frames and packets that are flooded over the VLAN. As broadcast / multicast frames are flooded, they must be received and processed by all active hosts in the VLAN.

When bridge forwarding is deployed, HPE Aruba Networking has validated support for a maximum of 500 APs and 5,000 bridged clients across all shared management and user VLANs. The total number of APs in a shared management VLAN cannot exceed 500 and the total number of clients across all bridged user VLANs cannot exceed 5,000.

When scaling beyond 500 APs and 5,000 clients is required for a deployment within a building or campus, two design options are available:

  1. A cluster of Gateways can be deployed with centralized user VLANs offering higher scaling and seamless mobility.

  2. Multiple instances of 500 APs and 5,000 clients can be strategically deployed where the AP management and user VLANs for each instance are layer 3 separated (i.e., implement separate broadcast domains).

If Gateways are not an option, multiple instances of APs can be deployed with careful planning and design, where the AP management and user VLANs for each instance connect to separate IP networks to provide scaling. Each instance of APs and clients is limited to a floor, building, or co-located buildings as needed. There is no limit to how many instances of 500 APs and 5,000 clients you can deploy as long as each instance of APs and clients is layer 3 separated from other instances. The VLAN IDs used by each instance of APs and clients can be common to simplify configuration and operations, but the IP networks for each instance must be unique.
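
As a rough planning aid, the per-instance limits can be checked with a calculation like the one below. This is a minimal sketch assuming the validated figures of 500 APs and 5,000 clients per layer 3 separated instance; the campus totals used in the example are hypothetical.

import math

MAX_APS_PER_INSTANCE     = 500
MAX_CLIENTS_PER_INSTANCE = 5_000

def instances_required(total_aps: int, total_clients: int) -> int:
    """Minimum number of layer 3 separated AP / client instances needed."""
    return max(math.ceil(total_aps / MAX_APS_PER_INSTANCE),
               math.ceil(total_clients / MAX_CLIENTS_PER_INSTANCE))

# Hypothetical campus with 1,200 bridge forwarding APs and 9,000 bridged clients:
# at least 3 instances are required, driven by the AP count.
print(instances_required(1_200, 9_000))   # -> 3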

The compromise for this design is that roaming between separate instances of APs, such as between buildings, requires a hard roam as bridged clients will need new IP addressing and a default gateway assigned (see Hard Roaming). As such, user expectations, application behavior, and client behavior need to be well understood before considering a hard roaming design. If hard roaming is not acceptable, a cluster of Gateways must be deployed.

A good understanding of the LAN architecture is also helpful when scaling as larger LANs will typically include natural layer 3 boundaries such as aggregation switching layers within buildings that prevent AP management and user VLANs from being extended. These layer 3 boundaries provide natural segmentation boundaries between each instance of APs and clients.

An example of a campus design implementing bridge forwarding is depicted below. In this example each building implements a dedicated layer 3 aggregation switching layer that connects to a routed core. Each building is completely layer 3 separated from neighboring buildings preventing AP management and user VLANs from being extended. Each building in this example implements a specific number of APs that use the same VLAN IDs but with unique IP networks. Each building can potentially scale to support a maximum of 500 APs or 5,000 clients (whichever limit is reached first).

Bridge Forwarding Scaling

Hard Roaming

When scaling without Gateways is required or the LAN architecture precludes VLANs and IP networks from being extended between APs across buildings or floors, a bridged forwarding implementation is still possible but with compromises to application and user experience.

There are some situations where it is not possible to extend VLANs and their IP networks between APs in larger deployments such as within a building or between buildings. The local area network (LAN) design may include intentional layer 3 boundaries within the distribution switching layer that prevent VLANs and their associated IP networks from being extended between access layer switches servicing buildings or floors. Access layer switches configured for routed access will also prevent VLANs and IP networks from being extended between wiring closets within a building or floor.

When a client device roams between APs separated by a layer 3 device, a hard roam is performed as the client’s broadcast domain membership changes. While the APs in each building or floor may implement the same management and user VLAN IDs, the associated IP networks will be unique for each. Clients roaming between APs separated across a layer 3 device will require new IP addressing and a default gateway to be assigned after the roam. While modern clients are able to obtain new IP addressing to accommodate the IP network change, the transition between IP networks will impact active applications as the source IP addresses of the clients will change after a hard roam.

A hard roam between APs deployed in separate buildings within a campus separated by layer 3 aggregation switching layers is depicted below. In this example, a common bridged user VLAN ID 76 is deployed in both buildings but has a different IP network assigned in each building:

  • VLAN 76 / Building A – 10.200.76.0/24

  • VLAN 76 / Building B – 10.201.76.0/24

Bridged clients roaming between APs within each building will have a seamless roaming experience while bridged clients roaming between buildings have a hard roaming experience. The roam can be a fast roam or a slow roam depending on the configuration of the WLAN profile and client capabilities.

Bridge Forwarding & Hard Roaming

Depending on your LAN architecture and environment, a hard roam may be required for clients moving between buildings within a campus, between floors within a multi-story building or between co-located buildings. This will be dependent on where the layer 3 boundaries reside within the LAN for each environment. For most deployments these boundaries will reside between buildings.

Before considering a hard roaming design, the following needs to be investigated and considered:

  • User Experience – Do the users expect uninterrupted network connectivity when moving between buildings or floors?

  • RF Design – Can the AP placement and cell design accommodate / implement RF boundaries to minimize the hard roaming points across layer 3 boundaries and provide the best possible user and application experience?

  • Client Devices – Do you have any specialized or custom client devices deployed? These will need to be tested and validated to ensure they can tolerate and support hard roaming. Modern Apple, Android and Microsoft operating systems will initiate the DHCP DORA process after each roam.

  • Applications – What applications are you using, and can they tolerate hosts changing IP addresses? While some applications such as Outlook, Teams and Zoom can automatically recover after host re-addressing, others cannot.

Ultimately you will need to decide if your users, client devices, and applications can tolerate hard roaming before considering and implementing a hard roaming design. If a hard roaming design cannot be tolerated and seamless roaming is required, a design using Gateways and tunnel forwarding should be considered.

MAC Address Learning

When bridge forwarding is enabled, client traffic is forwarded out the AP’s uplink ports on the assigned VLAN to the access switching layer. Each bridged client’s MAC address will be learned by all the layer 2 devices participating in the VLAN using normal layer 2 learning. Each bridged client’s MAC address is initially learned from DHCP and ARP broadcast messages transmitted by the client during association, authentication and roaming. Each switch participating in the VLAN will either learn a bridged client’s MAC address from a switchport that is connected to the AP where the client is attached or from its uplink / downlink port connecting to a peer switch.

An example of MAC learning for a bridge forwarding deployment is depicted below. All the layer 2 switches participating in VLAN 76 in this example will learn client 1’s MAC address upon client 1 transmitting a broadcast frame or packet after a successful association and authentication:

  • SW-2 – Will directly learn Client 1’s MAC address on port 1/1/1 that connects to the AP (where Client 1 is attached).

  • SW-1 – Will learn client 1’s MAC address on port 1/1/20 that connects to SW-2 (layer 2 path to Client 1).

  • SW-3 – Will learn Client 1’s MAC address on port 1/1/52 that connects to SW-1 (layer 2 path to Client 1).

MAC address learning in a bridged network.

When a bridged client roams between APs, a MAC move will occur. Upon a successful roam, a frame or packet from the roaming client will trigger the upstream switches to update their layer 2 forwarding tables to reflect the new layer 2 path to the roamed client.

A MAC address move resulting from a roaming bridged client is depicted below. In this example client 1 has roamed from an AP connected to SW-2 to an AP connected to SW-3. Upon client 1 transmitting a broadcast frame or packet after the roam, all the switches participating in VLAN 76 will update their MAC address forwarding tables to reflect the new layer 2 path to client 1:

  • SW-3 – Will re-learn client 1’s MAC address on port 1/1/2 that connects to the AP (where client 1 is attached).

  • SW-1 – Will re-learn client 1’s MAC address on port 1/1/2 that connects to SW-3 (new layer 2 path to client 1).

  • SW-2 – Will re-learn client 1’s MAC address on port 1/1/52 (new layer 2 path to client 1).

MAC Address Move

Tunnel Forwarding

When tunnel traffic forwarding is configured in a WLAN or downlink wired port profile, the client traffic is encapsulated in GRE by the APs and is tunneled to the primary Gateway cluster. Client traffic forwarded within the GRE tunnels is tagged with the client’s assigned VLAN. As covered in the clustering section, the role of each Gateway within the primary cluster determines which Gateway is responsible for transmitting and receiving traffic for each tunneled client.

With tunnel traffic forwarding, the user VLANs are centralized and reside within each cluster. Each tunneled profile terminates within a primary cluster and can optionally failover to a secondary cluster. For each primary cluster, the Gateway management and user VLANs are extended from each Gateway in the cluster to their respective core / aggregation switching layer. As a best practice, all the VLANs are 802.1Q tagged. Each Gateway within a cluster shares the same management VLAN, user VLANs and associated IP networks. Each tunneled client is either statically or dynamically assigned to a centralized user VLAN within its primary cluster.

An example of a tunnel forwarding deployment is depicted below where the tunneled user VLANs 73 and 75 are extended from the Gateway to the core / aggregation switching layer. In this example the WLAN client is assigned VLAN 73 while the wired client is assigned VLAN 75. The core / aggregation switch has IP interfaces and IP helper addresses defined for each VLAN and is the default gateway for each VLAN.

Tunnel Forwarding Mode

The above example uses a single Gateway to simplify the datapath of each tunneled client. When multiple Gateways are deployed within a cluster, each AP establishes IPsec and GRE tunnels to each cluster node. The role of each Gateway within the cluster determines which Gateway is responsible for anchoring each client’s traffic and which Gateway is responsible for forwarding broadcast / multicast traffic destined to clients attached to each AP. The Gateway role effectively determines which GRE tunnel the AP selects when forwarding traffic from a client and which tunnel is selected by the Gateway for unicast, broadcast and multicast return traffic.

The figure below expands on the previous example by adding an additional Gateway to the cluster and includes each Gateway’s role assignment. In the below diagram:

  • Client 1 is dynamically assigned VLAN 73, and GW-A is assigned the UDG role.

  • Client 2 is dynamically assigned VLAN 73, and GW-B is assigned the UDG role.

  • GW-A is assigned the DDG role for the AP.

Tunnel Forwarding by Role

In the above example GW-A is assigned the UDG role for client 1 and is responsible for receiving all traffic transmitted by client 1 and transmitting all unicast traffic destined to client 1. GW-B is assigned the UDG role for client 2 and is responsible for receiving all traffic transmitted by client 2 and transmitting all unicast traffic destined to client 2. This traffic is encapsulated and forwarded in the respective GRE tunnels that terminate on GW-A or GW-B based on the UDG role assignment for each client.

GW-A is also assigned the DDG role for the AP and is responsible for transmitting all broadcast and multicast traffic that is flooded on VLAN 73. This traffic is encapsulated and forwarded in the IPsec tunnel that terminates on GW-A.

Roaming

Each tunneled WLAN terminates in a primary cluster where the user VLANs are centralized. Each tunneled client is either statically or dynamically assigned a VLAN which is present on all the Gateways within the primary cluster. For each client, one Gateway in the cluster is assigned the UDG role, which determines the Gateway that the client’s traffic is anchored to. As the bucket map is published per cluster, each client will maintain its UDG assignment as it roams. Each client’s traffic is always anchored to the same Gateway within a cluster regardless of which AP the client roams to.
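
The stability of the UDG assignment across roams can be illustrated conceptually. The following is not the actual AOS 10 implementation; it simply assumes a fixed-size bucket map published per cluster and indexed by a hash of the client MAC address to show why the anchoring Gateway does not change when a client moves between APs that tunnel to the same cluster.

import hashlib

def udg_for_client(client_mac: str, bucket_map: list) -> str:
    """Conceptual illustration only: the client MAC hashes to a bucket, and the bucket map
    (published once per cluster) maps that bucket to a Gateway."""
    digest = hashlib.sha256(client_mac.lower().encode()).hexdigest()
    return bucket_map[int(digest, 16) % len(bucket_map)]

# An illustrative 8-entry bucket map for a 4-node cluster (real bucket maps are larger).
bucket_map = ["GW-A", "GW-B", "GW-C", "GW-D", "GW-A", "GW-B", "GW-C", "GW-D"]

# The result depends only on the client MAC and the cluster's bucket map, not on which AP
# the client is associated to, so the UDG assignment is stable across roams within a cluster.
print(udg_for_client("aa:bb:cc:dd:ee:01", bucket_map))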

When a client roams between APs that tunnel to the same primary cluster, the client is able to maintain its VLAN assignment, IP addressing and default gateway after each roam providing a seamless roaming experience. Clients can perform a slow roam or fast roam depending on the WLAN profile configuration and capabilities of the client. Seamless roaming can be achieved between APs in the same Central configuration group (same profile) or between APs in separate configuration groups (duplicated WLAN profiles). The only requirement for a seamless roam is that the primary cluster must be the same between the APs.

A seamless roam is depicted below where user VLAN 73 and its associated IP network (10.200.73.0/24) is centralized within a primary cluster consisting of four Gateways. In this example APs in each building and floor connect to AP management VLANs implementing separate IP networks. Using the published bucket map for the cluster, GW-B is assigned the UDG role for the client, which is maintained after each roam. Regardless of which AP the client is connected to, the client is able to maintain its VLAN membership, IP address and default gateway.

Tunnel Forwarding & Seamless Roaming

For some larger campus deployments, tunneled WLANs might be distributed between primary clusters located in separate datacenters to distribute traffic workloads. The IP networks for each user VLAN are unique in each cluster. Using configuration groups in Central, APs in each building are strategically distributed between the primary clusters to evenly distribute the user traffic between each datacenter. For availability and failover, the alternate cluster is assigned as the secondary cluster.

As with bridged forwarding, clients roaming between APs serviced by a separate cluster will perform a hard roam as new IP addressing will be required. While the VLANs will be common, the IP networks for each user VLAN in each datacenter will be unique. Clients roaming between APs tunneling to separate primary clusters will require a new IP address and default gateway after each roam.

MAC Address Learning

When tunnel forwarding is enabled, the APs tunnel the client’s traffic to the Gateway in the cluster that is assigned the UDG role. Each tunneled client’s MAC address will be learned by the core / aggregation switch along with all the active Gateways within the cluster:

  • Core / Aggregation Switch – Will learn each client’s MAC address from the physical or logical aggregated port that connects to the UDG Gateway for each client.

  • Gateways – Will learn each client’s MAC address either from the GRE tunnel (UDG role) or from the physical or logical aggregated uplink port connecting to the core / aggregation switch.

Each tunneled client’s MAC address is anchored to the Gateway assuming the UDG role for that client. The layer 2 path for each tunneled client will remain bound to the physical or logical port of its assigned UDG Gateway regardless of which AP the client roams to. No MAC address move will occur between the Gateways and the core / aggregation switching layer after a roam. A client’s MAC address will only move between Gateways as a result of a UDG -> S-UDG transition.

An example of MAC learning for a tunnel forwarding deployment is depicted below. In this example, client 1 is dynamically assigned VLAN 73 and GW-A is assigned the UDG role. During normal operation, SW-GW-AGG learns client 1’s MAC address on port 1/1/1 that connects to GW-A. GW-B learns client 1’s MAC address on port 0/0/0 that connects to SW-GW-AGG.

MAC Address Learning

Mixed Forwarding

When mixed traffic forwarding is configured in a WLAN or downlink wired port profile, the client traffic is either forwarded directly out of the AP’s uplink port(s) onto the access switching layer with an appropriate 802.1Q VLAN tag or encapsulated in GRE by the AP and tunneled to the primary Gateway cluster:

  • Bridged – The AP will bridge the traffic when a client device is assigned a VLAN ID or Name that is not present in the primary or secondary Gateway cluster.

  • Tunneled – The AP will tunnel the traffic when a client device is assigned a VLAN ID or Name that is configured in the primary or secondary Gateway cluster.

When a profile configured for mixed forwarding is created and a primary cluster is selected, the VLANs present in the primary and secondary cluster are learned by the APs and are tagged in the GRE tunnels. The APs use this knowledge to determine when to bridge or tunnel clients when a VLAN is assigned.
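
The bridge versus tunnel decision for mixed forwarding reduces to a simple membership test. The following is a minimal sketch; cluster_vlans stands in for the VLAN list an AP learns from its primary and secondary clusters, and the values shown are hypothetical.

def mixed_forwarding_action(assigned_vlan: int, cluster_vlans: set) -> str:
    """Illustrative only: a client whose assigned VLAN exists in the primary or secondary
    cluster is tunneled; otherwise the client is bridged locally by the AP."""
    return "tunnel" if assigned_vlan in cluster_vlans else "bridge"

cluster_vlans = {73, 74, 75}   # hypothetical VLANs learned from the clusters

print(mixed_forwarding_action(73, cluster_vlans))   # -> tunnel
print(mixed_forwarding_action(76, cluster_vlans))   # -> bridge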

For branch deployments using Branch Gateways, the AP management and bridged user VLANs are typically extended from the Branch Gateways to the APs and are common to both. The Branch Gateways provide DHCP services and routing for each VLAN within the branch. When no layer 3 separation exists between the APs and the Gateways in a branch deployment, profiles implementing mixed traffic forwarding will always tunnel the clients. If bridge traffic forwarding is required and layer 3 separation between the APs and the Branch Gateways is not possible, separate profiles implementing bridge traffic forwarding must be implemented.

To support mixed forwarding, the APs management and bridged user VLANs are extended from the access switching layer to the APs uplink port(s). It is recommended that dedicated VLAN IDs be used for bridged and tunneled clients and the VLANs must not overlap. As a recommended best practice, only the AP management and bridged user VLANs should be extended to your APs.

An example of a typical mixed WLAN deployment is depicted below where dedicated VLANs are implemented for bridged and tunneled clients. In this example the untagged AP management VLAN (not shown) and 802.1Q tagged bridged user VLAN 76 is extended from the access switching layer to an AP. VLAN 73 is centralized within a cluster and is 802.1Q tagged from the Gateway to the core / aggregation switching layer. Client 1 is dynamically assigned VLAN 73 and is tunneled to the primary cluster while client 2 is dynamically assigned VLAN 76 and is locally bridged by the APs.

Mixed Forwarding Mode

Best Practices & Considerations

AOS 10 APs can support any combination of WLAN and downlink wired port profiles implementing bridge, tunnel, or mixed forwarding modes. When profiles of different forwarding types are serviced by APs, the following considerations and best practices should be followed:

  1. Implement dedicated VLAN IDs for bridged and tunneled clients. An AP can only bridge or tunnel clients for a given VLAN ID and cannot do both simultaneously.

  2. Prune all tunneled VLANs from the APs uplink ports at the access switching layer. A tunneled VLAN must not be extended to the uplink ports on the APs. As a recommended best practice, only the AP management and bridged user VLANs should be extended to the APs.

  3. Avoid using VLAN 1 whenever possible. VLAN 1 is the default management VLAN for APs and is also present on the Gateways. Assigning clients to VLAN 1 may have unintentional consequences such as bridging clients onto the APs native VLAN or blackholing tunneled clients.

  4. If profiles using bridge traffic forwarding are implemented, it is recommended that you change the APs default management VLAN ID to match the native VLAN ID configured on your access layer switches.

  5. If the APs default management VLAN 1 is retained, avoid assigning tunneled clients to a VLAN in the primary cluster that indirectly maps to the APs untagged management VLAN. For example, if your APs are managed on untagged VLAN 70 which is terminated on a Branch Gateway, you must not assign tunneled clients to VLAN 70.

  6. If implementing mixed forwarding with Branch Gateways, bridged user VLANs must be layer 3 separated from the Gateways. If no layer 3 separation is implemented, all the clients will be tunneled as all the VLANs will be present within the primary cluster. If layer 3 separation cannot be implemented, a dedicated profile using bridged forwarding must be implemented.

2.6 - Access Point Port Usage

AP Ports and how to configure them.

AP ports can be used in different ways depending on the AP model and deployment type. Using wired port profiles, AP ports can be configured with an uplink or downlink persona. The persona of an AP’s Ethernet port determines how the port is used, where the port connects, and what type of traffic is carried.

  • Uplink Ports – Are used to connect APs to the access switching layer. Uplink ports support the APs management VLAN, carry AP management traffic, establish tunnels to Gateways and forward bridged client traffic.

  • Downlink Ports – Are used to connect wired client devices to the APs or APs operating as Mesh Points to downstream switches. Similar to WLAN profiles, downlink ports can bridge or tunnel client traffic, support authentication and apply policies via user roles.

HPE Aruba Networking Central and the default configuration of the APs include default profiles that configure specific ports as uplinks or downlinks depending on the number of physical Ethernet ports that are installed on the AP and the intended use of the AP. With a few exceptions, uplink ports are used to connect APs to an access switching layer and by default, all models of HPE Aruba Networking APs will implement Ethernet 0/0 as an uplink port.

AP models equipped with dual Ethernet ports may implement both Ethernet 0/0 and Ethernet 0/1 as uplink ports, permitting both ports to be connected to the access switching layer in an active / active or active / standby configuration. Hospitality, remote, and PoE-providing AP variants (H, R, or P) implement Ethernet 0/0 as an uplink port with all other ports configured as downlink ports.

An example of uplink and downlink port usage for various AP types is depicted below. In this example all APs connect to the access switching layer using their Ethernet 0/0 ports which have a default or user defined uplink wired port profile assigned. All APs will obtain a management IP address on their configured management VLAN, communicate with Central and forward client traffic using their uplink port.

Wired client devices connect to downlink ports, which vary by platform. Each wired client’s traffic is either locally bridged or tunneled by the AP depending on the traffic forwarding configuration within the assigned downlink wired port profile. Wired client devices can optionally be MAC or 802.1X authenticated by a RADIUS server or the Cloud Auth service.

Uplink and Downlink Ports

Uplink ports are used to connect APs to the access switching layer. Depending on the AP model, an AP can be connected using a single uplink port or dual uplink ports operating in an active / active or active / standby configuration. Both APs and Central include a default uplink wired port profile named default_wired_port_profile that is assigned to AP uplink ports by default. The default port assignment will vary based on AP series and model.

AP Family     AP Model                                                   Default Assignment
---------     --------                                                   ------------------
300 Series    AP-303, AP-303H, AP-303P, AP-304, AP-305                   Ethernet 0/0
310 Series    AP-314, AP-315, AP-318                                     Ethernet 0/0
320 Series    AP-324, AP-325                                             Ethernet 0/0 & Ethernet 0/1
330 Series    AP-334, AP-335                                             Ethernet 0/0 & Ethernet 0/1
340 Series    AP-344, AP-345                                             Ethernet 0/0 & Ethernet 0/1
360 Series    AP-365, AP-367                                             Ethernet 0/0
370 Series    AP-374, AP-375, AP-375EX, AP-375ATEX, AP-377, AP-377EX     Ethernet 0/0 & Ethernet 0/1
380 Series    AP-387                                                     Ethernet 0/0
500 Series    AP-503H, AP-504, AP-505, AP-505H                           Ethernet 0/0
503 Series    AP-503, AP-503R                                            Ethernet 0/0
510 Series    AP-514, AP-515, AP-518                                     Ethernet 0/0 & Ethernet 0/1
530 Series    AP-534, AP-535                                             Ethernet 0/0 & Ethernet 0/1
550 Series    AP-555                                                     Ethernet 0/0 & Ethernet 0/1
560 Series    AP-565, AP-565EX, AP-567, AP-567EX                         Ethernet 0/0
570 Series    AP-574, AP-575, AP-575EX, AP-577, AP-577EX                 Ethernet 0/0 & Ethernet 0/1
580 Series    AP-584, AP-585, AP-585EX, AP-587, AP-587EX                 Ethernet 0/0 & Ethernet 0/1
605R Series   AP-605R                                                    Ethernet 0/0
610 Series    AP-615                                                     Ethernet 0/0
630 Series    AP-634, AP-635                                             Ethernet 0/0 & Ethernet 0/1
650 Series    AP-654, AP-655                                             Ethernet 0/0 & Ethernet 0/1

The default uplink wired port profile default_wired_port_profile is present on all HPE Aruba Networking APs in a factory defaulted state as well as in each configuration group in Central. This default assignment permits both un-provisioned and provisioned APs to be connected to the access switching layer using a single uplink or dual uplinks without any additional configuration being required. When connected using dual uplink ports, a high-availability bonded link is automatically created by the APs that operates in either active / active configuration if LACP is detected or active / standby if LACP is absent.

APs using the default uplink wired port profile implement untagged VLAN 1 for management by default and require a dynamic host configuration protocol (DHCP) server to service the VLAN for host addressing. To successfully discover and communicate with Central, the DHCP server must provide a valid IPv4 address, subnet mask, default gateway and one or more domain name servers. Internally, a switched virtual interface (SVI) with a DHCP client is bound to VLAN 1.

The default configuration of the uplink wired port profile will:

  • Configure the port as a trunk

  • Configure VLAN 1 as the native VLAN

  • Permit all VLANs (1-4094)

  • Enable port-bonding

With the default uplink wired port profile, APs can support both bridged and/or tunneled clients with no modification being required. The AP’s native VLAN is set to 1 and all other VLANs are permitted on the uplink ports. All AP management traffic will be forwarded on VLAN 1 untagged while bridged client traffic will be forwarded out the assigned VLAN with a corresponding VLAN tag.

The default wired port profile:

wired-port-profile default_wired_port_profile
 switchport-mode trunk
 allowed-vlan all
 native-vlan 1
 port-bonding
 no shutdown
 access-rule-name default_wired_port_profile
 speed auto
 duplex full
 no poe
 type employee
 captive-portal disable
 no dot1x

Default wired port profile assignments:

Port Profile Assignments
------------------------
Port  Profile Name
----  ------------
0     default_wired_port_profile
1     default_wired_port_profile
2     wired-SetMeUp
3     wired-SetMeUp
4     wired-SetMeUp
USB   wired-SetMeUp

AP deployments exclusively using tunnel forwarding only require an untagged management VLAN to be configured on the access switching layer for operation. The switchports that each AP connects to will be configured for access mode with the desired AP management VLAN ID assigned. Both the AP and the access layer switches will forward all Ethernet frames untagged. The AP will implement VLAN 1 while the peer access layer switch will implement the configured access VLAN. This is identical to how Campus APs operated in AOS 6 and AOS 8.

An example of an AP implementing the default uplink wired port profile connected to an access switchport is depicted below. In this example the AP is connected to port 1/1/1 on an access layer switch that is configured with the access VLAN 70. The AP in this example implements VLAN 1 for management which indirectly maps to VLAN 70 on the access layer switch.

AP Connected to an Access Switchport

When WLAN and/or wired port profiles are configured with bridged or mixed forwarding, the AP management and one or more dedicated bridged user VLANs will be extended from the access switching layer to the APs. The switchports that each AP connects to will be configured for trunk mode with a native VLAN and 802.1Q tagged bridged VLANs assigned. As a recommended best practice, only the untagged management VLAN and the 802.1Q tagged bridged user VLANs should be extended to the APs. AP management traffic is forwarded untagged while bridged user traffic is forwarded 802.1Q tagged.

An example of an AP implementing the default uplink wired port profile connected to a trunk switchport is depicted below. In this example the AP is connected to port 1/1/1 on an access layer switch that is configured with the native VLAN 70 and allowed VLANs 70,76-79. The AP in this example implements VLAN 1 for management which indirectly maps to native VLAN 70 on the access layer switch. All bridged clients are assigned to VLAN IDs 76–79 which are 802.1Q tagged between the AP and the peer access layer switch.

AP Connected to a Trunk Switchport

Management VLAN

By default, access points implement native VLAN 1 for management, which is untagged out the AP’s uplink ports. APs will use untagged VLAN 1 for IP addressing and communication with Central without any further configuration being required in Central. APs require a DHCP server to provide an IP address, default gateway, and one or more name server IP addresses, as well as Internet access, in order to communicate with Central.

The default management VLAN for an AP can be seen by issuing the show ip interface brief command in the console. The br0 label indicates that the default VLAN 1 is being used by the AP for management; when a different VLAN is assigned as the management VLAN, the VLAN ID is appended to the br0 interface.

AP Default Management VLAN

AOS 10.5 introduces the option to change the management VLAN ID configuration of APs to a new value for deployments that require such a configuration. APs can be easily re-configured to use a new untagged VLAN for management that matches the management VLAN ID configured in the access switching layer or may implement an 802.1Q tagged management VLAN if required.

Changing the AP’s management VLAN to a different value requires a new uplink wired port profile to be configured and assigned to the AP’s Ethernet 0/0 and optionally Ethernet 0/1 uplink ports. A new uplink wired port profile is recommended to preserve the configuration in the default uplink port profile. This permits the default profile to be reassigned to the AP’s uplink ports in the event of a misconfiguration.

The new uplink wired port profile includes the Use AP Management VLAN as Native VLAN option that must be enabled for the management VLAN to be modified. The new profile can be configured for access or trunk depending on if bridged user VLANs are required. When trunk mode is configured, the Native VLAN and Allowed VLANs must be configured and the AP’s management VLAN must be included in the Allowed VLAN list. The configuration is similar to how a trunk port is configured on a typical Ethernet switch.

The topic Configuring Wired Port Profiles on APs covers the configuration of wired port profiles for access points running AOS 10.

Example configuration of an access wired port profile applied to an AP’s configuration:

wired-port-profile uplink_profile_access
 switchport-mode access
 allowed-vlan all
 native-vlan ap-ip-vlan
 port-bonding
 no shutdown
 access-rule-name uplink_profile_access
 speed auto
 duplex auto
 no poe
 type employee
 captive-portal disable
 no dot1x
!
enet0-port-profile uplink_profile_access
enet1-port-profile uplink_profile_access

Example configuration of a trunk wired port profile applied to an AP’s configuration:

wired-port-profile uplink_profile_trunk
 switchport-mode trunk
 allowed-vlan all
 native-vlan ap-ip-vlan
 port-bonding
 no shutdown
 access-rule-name uplink_profile_trunk
 speed auto
 duplex auto
 no poe
 type employee
 captive-portal disable
 no dot1x
!
enet0-port-profile uplink_profile_trunk
enet1-port-profile uplink_profile_trunk

Once the uplink profile has been saved and applied, the APs management VLAN ID can then be changed under System > VLAN configuration. When the AP Management VLAN is changed, the Customize VLANs of Uplink Ports option will automatically change to Native VLAN Only. The APs will continue to use the default VLAN 1 for management until a new management VLAN ID is specified and saved.

AP Management VLAN

Demonstrating the change of an AP using a modified management VLAN can be accomplished by issuing the show ip interface brief command to the AP. With the management VLAN changed to VLAN 71, the output will now show the br0.71 interface with an IPv4 address and network mask assigned. The br0.71 label indicates that VLAN 71 is now being used by the AP for management. The management VLAN in this example is untagged from the AP as VLAN 71 is configured as the Native VLAN in the uplink wired port profile.

AP New Management VLAN

VLAN enforcement

The uplink wired port profile is used to configure the operation of the uplink ports which includes the VLAN configuration. By default, the APs uplink ports are assigned the default_wired_port_profile which configures the native VLAN as 1 and accepts traffic from all VLANs.

While the ability to configure and apply a new uplink port profile with a more restrictive VLAN configuration has been supported for some time, this option is typically not needed as the recommendation is to prune the VLANs at the access switching layer. As a recommended best practice, only the AP management and bridged user VLANs should be extended to the APs.

Some customers may not wish to prune VLANs at the access layer and instead extend all VLANs to the AP. By default, APs will automatically discover VLANs based on traffic that is received on their uplink ports. If all VLANs are extended to the APs, the APs will automatically learn VLAN IDs and MAC addresses as flooded frames and packets are received on their uplink ports. If tunneled VLANs are also extended to the APs, MAC flapping may occur as MAC addresses can be learned on two traffic paths.

If VLANs cannot be pruned at the access switching layer, VLAN enforcement can be enabled on the APs to restrict which VLANs that the APs accept. VLAN enforcement requires a new trunk uplink port profile to be configured and applied to the APs that includes the Native VLAN and a restrictive Allowed VLAN list. The Allowed VLAN list must only include the APs management VLAN and bridged user VLANs. All other VLANs must be excluded.

An example of an uplink wired port profile configured to only accept traffic from a specific range of VLANs is depicted below. In this example the APs management VLAN is 71 and the bridged user VLANs are 76-79. The Allowed VLAN list in this example includes the VLANs 71, 76-79:

wired-port-profile uplink_profile_trunk
 switchport-mode trunk
 allowed-vlan 71,76-79
 native-vlan ap-ip-vlan
 port-bonding
 no shutdown
 access-rule-name uplink_profile_trunk
 speed auto
 duplex auto
 no poe
 type employee
 captive-portal disable
 no dot1x
!
enet0-port-profile uplink_profile_trunk
enet1-port-profile uplink_profile_trunk

Once the new uplink wired port profile has been configured and applied to the uplink ports, VLAN enforcement can be enabled under System > VLAN within the configuration group. VLAN enforcement is enabled by setting the Customize VLANs of Uplink Ports option to All VLAN Settings. Once saved, the APs will only accept traffic from VLANs you configured in the Allowed VLAN list within the uplink wired port profile.

VLAN enforcement configuration within a configuration group is depicted below. The APs in this example have also been re-configured to use VLAN 71 for management which is configured as the Native VLAN in the above wired port profile. The management VLAN has been included to highlight that the management VLAN must be included in the Allowed VLAN list within the modified uplink profile.

AP Management VLAN

HPE Aruba Networking APs equipped with a second Ethernet port can optionally be dual connected to an access switching layer. If LACP is implemented, traffic can also be load-balanced between the uplink ports. Each AP’s uplink ports can be strategically distributed between switchports in separate I/O modules within a chassis or between members of a stack. APs may also be connected to separate chassis or stacks placed in separate wiring closets if VLANs and broadcast domains are common to both uplink ports. Dual uplinks allow APs to maintain network connectivity to the access switching layer in the event of an I/O module, stack member or wiring closet failure.

APs can be connected using dual uplinks operating in an active / active or active / standby configuration without any additional configuration being required in Central. The default uplink wired port profile permits port-bonding by default and will place the APs Ethernet 0/0 and Ethernet 0/1 ports into either an active / active or active / standby state:

  • Active / Active – If LACP PDUs from the same LACP group are received on both the AP’s Ethernet 0/0 and Ethernet 0/1 ports.

  • Active / Standby – If no LACP PDUs are received on the AP’s Ethernet 0/0 and Ethernet 0/1 ports.

Active / standby

With an active / standby dual-uplink deployment, both the Ethernet 0/0 and Ethernet 0/1 ports are connected to the access switching layer. During normal operation the APs Ethernet 0/0 uplink port is used for AP management and traffic forwarding while the APs Ethernet 0/1 uplink port is in a standby state and will not transmit or receive management or user traffic. The APs Ethernet 0/1 port will only become active if the link on the Ethernet 0/0 uplink port is lost.

Active-Standby Failover

The primary LAN requirement to support APs using an active / standby uplink configuration is that the VLANs and associated IP networks (broadcast domains) must be common to both AP uplink ports. APs implementing active / standby uplinks do not support layer 3 failover and cannot be connected to switchports implementing separate VLAN IDs or broadcast domains. The switchport configuration and broadcast domains for both uplink ports must be identical for failover to work. If the link to the Ethernet 0/0 interface is lost, the APs will transition their management IP interface, orchestrated tunnels, and bridged client traffic to their Ethernet 0/1 link. From the access switching layer perspective, the APs management IP address, MAC address and all bridged clients MAC addresses will move.

For most active / standby deployments, each AP will be connected to a common access layer switch or stack where the AP’s uplink ports are distributed between I/O modules within a chassis or members of a stack. This permits the APs to continue operation in the event that an I/O module or stack member fails.

An example of a typical active / standby deployment using a stack of CX switches is depicted below. In this example the APs Ethernet 0/0 and Ethernet 0/1 ports implement the default uplink wired port profile where each uplink port connects to a separate stack member within a VSF stack:

  • Ethernet 0/0 – The active uplink port is connected to switchport 1/1/10 (first stack member)

  • Ethernet 0/1 – The standby uplink port is connected to switchport 2/1/10 (second stack member)

Within the VSF stack, both switchports are configured as trunks with the same Native VLAN and Allowed VLANs configured. The AP in this example will implement untagged VLAN 71 for management and 802.1Q tagged VLANs 76-79 to service bridged clients.

Illustration of a switch stack and AP setup for active-standby failover within a single closet.

interface 1/1/10
   no shutdown
   description [BLD10-FL1-AP-1-0/0]
   no routing
   vlan trunk native vlan 71
   vlan trunk allowed 71,76-79
...
interface 2/1/10
   no shutdown
   description [BLD10-FL1-AP-1-0/1]
   no routing
   vlan trunk native vlan 71
   vlan trunk allowed 71,76-79

If additional redundancy is required, APs implementing active / standby uplinks can be connected to separate switches or stacks located in the same wiring closet or in separate wiring closets. This provides additional protection in the event of a power failure. Both deployments are supported as long as the same VLAN IDs and broadcast domains are present on both uplink ports. Connecting APs to switchports using different VLAN IDs or broadcast domains is not supported.

An example of a typical active / standby deployment using separate stacks of CX switches is depicted below. In this example the AP's Ethernet 0/0 and Ethernet 0/1 ports implement the default uplink wired port profile where each uplink port connects to a stack member in a separate VSF stack:

  • Ethernet 0/0 – The active uplink port is connected to switchport 1/1/10 (first VSF stack)

  • Ethernet 0/1 – The standby uplink port is connected to switchport 1/1/10 (second VSF stack)

Within each VSF stack, the switchports are configured as trunks with the same Native VLAN and Allowed VLANs configured. The AP in this example will implement untagged VLAN 71 for management and 802.1Q tagged VLANs 76-79 to service bridged clients. VLANs 71,76-79 in this example are extended between both VSF stacks.

Illustration of multiple switches or switch stacks and AP setup for active-standby failover across switches or closets.

BLD10-FL1-IDF-A

interface 1/1/10
   no shutdown
   description [BLD10-FL1-AP-1-0/0]
   no routing
   vlan trunk native vlan 71
   vlan trunk allowed 71,76-79

BLD10-FL1-IDF-B

interface 1/1/10
   no shutdown
   description [BLD10-FL1-AP-1-0/1]
   no routing
   vlan trunk native vlan 71
   vlan trunk allowed 71,76-79

Active / active

With an active / active dual-uplink deployment, both the Ethernet 0/0 and Ethernet 0/1 ports are connected to a common access layer switch or stack using Link Aggregation Control Protocol (LACP). During normal operation, both the Ethernet 0/0 and Ethernet 0/1 ports are active and, using hashing algorithms, carry both management and user traffic. If either link or path fails, management and user traffic will automatically failover to the remaining active link.

Active-active load sharing

Active / active configuration requires that both AP uplink ports be connected to peer switchports that are in the same LACP link aggregation group. The LACP bond will not establish if the uplink ports are connected to switchports configured in separate LACP groups. Note that HPE Aruba Networking switches will detect this mismatch condition and place one of the switchports into a LACP blocking state. Additionally, for the LACP bond to become active, all AP uplinks and peer switchports in the LACP bond must negotiate at the same speed. If one of the links in the bond negotiates at a slower speed than the other link, the LACP bond will not establish.

An active / active uplink deployment using LACP requires each AP to be connected to a common access layer switch. This can be a chassis, a stack, or a logical switch implementing a virtualization technology that permits LACP links to be distributed between two physical switches. The AP's uplink ports are distributed between I/O modules within a chassis, members of a stack, or the logical switches.

An example of a typical active / active deployment using a stack of CX switches is depicted below. In this example the AP's Ethernet 0/0 and Ethernet 0/1 ports implement the default uplink wired port profile where each uplink port connects to a separate stack member within a VSF stack:

  • Ethernet 0/0 – Is connected to switchport 1/1/10 in LACP LAG group 110 (first stack member)

  • Ethernet 0/1 – Is connected to switchport 2/1/10 in LACP LAG group 110 (second stack member)

Illustration of an AP using an active / active connection to a switch stack.

BLD10-FL1-IDF-A

interface 1/1/10
   no shutdown
   description [BLD10-FL1-AP-1-0/0]
   lag 110
...
interface 2/1/10
   no shutdown
   description [BLD10-FL1-AP-1-0/1]
   lag 110
...
interface lag 110
   no shutdown
   description BLD10-FL1-AP1
   no routing
   vlan trunk native vlan 71
   vlan trunk allowed 71,76-79
   lacp mode active

During normal operation, traffic transmitted by the AP to the access switching layer is hashed and distributed across both of the AP's Ethernet 0/0 and Ethernet 0/1 ports. This includes AP management, tunneled user traffic, and bridged user traffic. The fields that APs use to hash egress traffic depend on the traffic type and the number of headers that are available:

  • Layer 2 Frames – APs will hash egress traffic across both uplinks based on source MAC / destination MAC.

  • Layer 3 Packets – APs will hash egress traffic across both uplinks based on source MAC / destination MAC and source IP / destination IP.

For tunneled user traffic to a primary cluster consisting of two or more cluster nodes, multiple layers of traffic distribution will occur. The IPsec and GRE tunnels will be distributed between the AP's uplink ports based on layer 2 and layer 3 headers while tunneled clients will be distributed between GRE tunnels based on each tunneled client's bucketmap assignment:

  • GRE Tunnels – APs will hash GRE tunnels based on source MAC / destination MAC and source IP / destination IP.

  • Tunneled Clients – Traffic for each tunneled client is anchored to a specific cluster node based on bucketmap assignment.

PoE redundancy

When utilizing dual uplinks, APs may receive power from the Ethernet 0/0 and/or Ethernet 0/1 uplink ports. Depending on the AP series and model, APs may either simultaneously source power from both uplink ports using sharing or source power from either port using failover. With the exception of the 510 series, which can only source power from Ethernet 0/0, APs support either sharing or failover.

PoE standards and failover options for dual Ethernet equipped AP models:

AP Series PoE Standards PoE Redundancy
320 Series 802.3af, 802.3at Failover
330 Series 802.3af, 802.3at Failover
340 Series 802.3af, 802.3at Failover
510 Series 802.3af, 802.3at, 802.3bt No
530 Series 802.3at, 802.3bt Sharing
550 Series 802.3at, 802.3bt Sharing
570 Series 802.3at, 802.3bt Sharing
630 Series 802.3at, 802.3bt Failover
650 Series 802.3af, 802.3at, 802.3bt Sharing

The AP-530, AP-550, and AP-570 series APs will balance the power draw across both uplink ports and will generally draw 40% / 60% of the power on each port, best case. The AP-650 series will draw power from Ethernet 0/0 first and then from Ethernet 0/1 once Ethernet 0/0 is at its maximum. The maximum power budget on the AP-650 series is the sum of both ports, whereas on the AP-530, AP-550, and AP-570 series it is the power available on the lower of the two ports divided by 0.6.
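For example, assuming both uplink ports negotiate 802.3at and roughly 25.5 W is available to the AP on each port (an illustrative assumption; actual budgets depend on the PSE and cabling), an AP-650 series AP would have a combined budget of approximately 25.5 W + 25.5 W = 51 W, while an AP-530, AP-550, or AP-570 series AP would have a budget of approximately 25.5 W / 0.6 = 42.5 W.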

Downlink ports are used to connect wired client devices to APs but may also be used to connect APs operating as Mesh Points to clients or downstream switches when Mesh bridging is deployed. The number of ports that can be implemented as downlinks will vary based on the number of physical Ethernet ports available on the AP and the number of Ethernet ports that are employed as uplinks.

When downlinks are implemented to connect wired client devices, user traffic can be bridged or tunneled based on the traffic forwarding mode configured in the profile. Client devices can also be optionally MAC, 802.1X or Captive Portal authenticated with static or dynamic VLAN and user role assignments.

The default downlink wired port profile wired-SetMeUp is present on all HPE Aruba Networking APs in a factory-default state but is not visible in Central. The default downlink profile is assigned to non-uplink ports by default on Hospitality APs.

wired-port-profile wired-SetMeUp
  no shutdown
  switchport-mode access
  allowed-vlan all
  native-vlan guest
  access-rule-name wired-SetMeUp
  speed auto
  duplex auto
  type guest
  captive-portal disable
  inactivity-timeout 1000

Port Profile Assignments
------------------------
Port  Profile Name
----  ------------
0     default_wired_port_profile
1     default_wired_port_profile
2     wired-SetMeUp
3     wired-SetMeUp
4     wired-SetMeUp
USB   wired-SetMeUp

Bridged

Downlink ports configured for bridge forwarding can be used to connect wired client devices to APs or to connect Mesh Points to downstream access layer switches when Mesh bridging is deployed. The downlink wired port profile can be configured for access supporting a single untagged access VLAN or as a trunk supporting a single Native VLAN and one or more 802.1Q tagged VLANs.

When the downlink profile is configured for bridge forwarding, the AP bridges traffic received on a downlink port to an uplink port on the assigned VLAN. The VLAN assignment and uplink port profile configuration determines if the bridged traffic is forwarded out the uplink port untagged or tagged.

When configuring a downlink port profile with bridge forwarding, the VLANs that are configured must be present on the AP's uplink ports. If the default uplink port profile is implemented, all VLANs are allowed by default. If a user defined uplink port profile is implemented, the bridged VLANs must be included in the Allowed VLAN list. The VLANs must also be extended to the APs from the access switching layer.

An example of a downlink bridged port profile configured for access is depicted below. In this example an IP camera is connected to the AP's Ethernet 0/1 downlink port and is assigned to access VLAN 79. VLAN 79 is extended between the access switching layer and the AP's Ethernet 0/0 uplink port and is 802.1Q tagged on that link.

Access bridged downlink port
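A minimal downlink wired port profile sketch for this scenario is shown below, following the same syntax as the wired-SetMeUp profile shown earlier. The profile name IPCameraPort, the access-rule-name, and the type value are illustrative assumptions, as is the enet1-port-profile binding used to attach the profile to the Ethernet 0/1 port; the exact configuration generated by Central will vary with the options selected in the downlink port profile.

wired-port-profile IPCameraPort
  no shutdown
  switchport-mode access
  allowed-vlan 79
  native-vlan 79
  access-rule-name IPCameraPort
  speed auto
  duplex auto
  type employee
...
enet1-port-profile IPCameraPort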

An example of a downlink bridged port profile configured for trunk is depicted below. In this example an IP phone is connected to the AP's Ethernet 0/1 downlink port where untagged VLAN 76 is used for data and 802.1Q tagged VLAN 77 is used for voice. Both VLANs are extended between the access switching layer and the AP's Ethernet 0/0 uplink port and are 802.1Q tagged on that link.

Trunk bridged downlink port
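A corresponding sketch for the trunk case is shown below, again following the wired-SetMeUp syntax. The profile name VoicePort, the access-rule-name, and the type value are illustrative assumptions; the untagged data VLAN 76 and tagged voice VLAN 77 match the example above.

wired-port-profile VoicePort
  no shutdown
  switchport-mode trunk
  native-vlan 76
  allowed-vlan 76-77
  access-rule-name VoicePort
  speed auto
  duplex auto
  type employee
...
enet1-port-profile VoicePort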

An example of a downlink bridged port profile configured for trunk used for Mesh bridging is depicted below. In this example a user defined uplink port profile with native VLAN 71 and allowed VLANs 71,76-79 has been assigned to the Mesh Portal's Ethernet 0/0 uplink port that connects to the access switching layer. A user defined downlink port profile has been assigned to the Mesh Point's Ethernet 0/0 port with the same native VLAN 71 and allowed VLANs 71,76-79. VLANs 71,76-79 are effectively extended from the access switching layer over the mesh link to the remote access layer switch.

Mesh trunk bridged downlink port

Tunneled

Downlink ports configured for tunnel forwarding can be used to connect wired client devices to APs. A downlink wired port profile can only be configured for access supporting a single untagged VLAN. Tunneled trunk ports configured with multiple VLANs are not currently supported.

When the downlink profile is configured for tunnel forwarding, the AP tunnels traffic received on a downlink port to the selected primary cluster. As with tunneled WLAN clients, each tunneled wired client is assigned a UDG and S-UDG session within the primary cluster via the published bucketmap. If datacenter redundancy is required, failover between a primary and secondary cluster is also supported.

Each tunneled downlink port profile can be configured to tunnel traffic to a specified primary cluster. APs supporting multiple downlink ports can implement port profiles that all tunnel to the same primary cluster or may implement port profiles tunneling to separate primary clusters (MultiZone).

An example of downlink tunneled port profiles applied to hospitality APs is depicted below. In this example two downlink port profiles with tunnel forwarding have been assigned to the AP's downlink ports to support in-room services and guest devices:

  • Ethernet 0/1 – A downlink port profile is assigned to support a SmartTV which is MAC authenticated and assigned to VLAN 74.

  • Ethernet 0/2 – Ethernet 0/4 – A downlink port profile is assigned to support hotel guest devices which are Captive Portal authenticated and assigned to VLAN 75.

Access Tunnel Downlink Ports

2.7 - User VLANs

Details about Virtual LANs and how they can be used within WLAN profiles.

Each client is either statically or dynamically assigned a VLAN upon connecting to a WLAN or downlink port on an AP. The VLAN assignment decision is either made by the AP or the Gateway depending on the traffic forwarding mode configured in the profile:

  • Bridge Forwarding – The VLAN assignment decision is made by the APs

  • Mixed or Tunnel Forwarding – The VLAN assignment decision is made by the Gateways

For mixed forwarding, the static or dynamically assigned VLAN ID determines if the client’s traffic is locally bridged by the AP or tunneled to the primary cluster. If the assigned VLAN is present within the cluster, the client’s traffic is tunneled. If the assigned VLAN is not present within the cluster, the client’s traffic is locally bridged by the AP.

When planning an AOS 10 deployment that implements a combination of bridged, tunneled, and mixed forwarding profiles, it’s extremely important to dedicate VLAN IDs for bridged and tunneled clients. The VLAN IDs must be unique for each forwarding mode and must not overlap. An AP can only bridge or tunnel client traffic on a given VLAN ID; it cannot do both simultaneously. Mixing bridged and tunneled client traffic on the same VLAN ID is not recommended or supported.

Static VLANs

Each profile requires a static VLAN to be configured which is assigned to client devices if no dynamic VLAN assignment is derived. The VLAN you assign can be considered a catchall VLAN if no other VLAN is derived. For WLAN profiles the static VLAN can be an individual VLAN ID or VLAN Name that you specify or select depending on the configured forwarding mode. If named VLANs are implemented, additional configuration is required to map the VLAN names to their respective VLAN IDs. The mapping configuration is either performed within the WLAN profile (bridge forwarding) or the Gateway configuration group (tunnel forwarding).

As a recommended best practice, the static VLANs you assign within a profile should not be the management VLAN of the AP or the Gateway. The assigned VLAN should be dedicated to bridged or tunneled clients. VLAN 1 should also be avoided for bridged and mixed forwarding as VLAN 1 is used as the AP’s default management VLAN and is also present within each primary cluster. VLAN 1 should only be used if the AP’s management VLAN has been changed to a different value or VLAN 1 is legitimately used to serve client traffic within the primary cluster.

An example of VLAN ID and VLAN Name assignments for a WLAN profile is depicted in the below figure. In this example WLAN clients with no dynamic VLAN assignment will be assigned to VLAN 75.

WLAN Profile Example 1

WLAN Profile Example 2

For downlink wired port profiles, the static VLAN configuration options will vary depending on the configured traffic forwarding mode. For bridge forwarding, the downlink ports can be configured as access supporting a single VLAN or trunk supporting multiple VLANs. The configuration is identical to how access or trunk ports are configured on an Ethernet switch:

  • Access – A single access VLAN ID is required which determines the VLAN ID the AP uses to forward the bridged client’s traffic out its uplink port.

  • Trunk – A Native VLAN ID and a list of Allowed VLANs must be configured. This will determine which 802.1Q tagged VLANs are accepted by the downlink port and which VLAN is used to forward untagged traffic received from a wired client.

An example of a bridged downlink wired port profile configured as access is depicted in the below figure. In this example the APs will assign bridged wired clients with no dynamic VLAN assignment to VLAN 75. The AP will forward wired client traffic out its uplink port onto the access switching layer with 802.1Q tag 75.

Bridged Access Wired Port Profile Example

An example of a bridged downlink wired port profile configured as a trunk is depicted in the below figure. This example represents a downlink port profile that is configured to support a VoIP telephone that implements an untagged data VLAN (72) and an 802.1Q tagged voice VLAN (73). Untagged traffic from the VoIP phone is placed into the native VLAN 72 while 802.1Q tagged voice traffic is accepted. As only VLANs 72-73 are accepted by the downlink port, all other tagged VLAN IDs received on the downlink port will be discarded by the AP.

Bridged Trunk Wired Port Profile Example

Wired port profiles configured for tunnel forwarding support a single VLAN and must be configured for access mode. The access VLAN can be a VLAN ID or Named VLAN that resides within the primary cluster. Downlink wired port profiles configured for tunnel forwarding will only accept untagged traffic from wired clients; all traffic received with an 802.1Q tag will be discarded by the AP.

An example of a valid tunneled downlink wired port profile is depicted in the below figure. In this example the APs will assign tunneled wired clients with no dynamic VLAN assignment to VLAN 73 that resides within the primary cluster.

Tunnel Access Wired Port Profile Example

Default VLAN

WLAN profiles configured for mixed traffic forwarding or with dynamic VLAN assignment enabled require a default VLAN to be assigned. As with a static VLAN, the default VLAN is assigned to WLAN clients if no dynamic VLAN assignment is derived from RADIUS or a VLAN assignment rule. The default VLAN can be an individual VLAN ID or VLAN Name that you specify or select depending on the configured traffic forwarding mode.

The default VLAN that you define within each WLAN profile can be considered a catchall VLAN if no dynamic VLAN assignments are made. As a recommended security best practice, the default VLAN should be different from the AP’s or Gateway’s management VLAN. If the APs use the default management VLAN, it is recommended that you change the profile’s default VLAN to a different value (below figure). This will ensure that no bridged clients are accidentally assigned to the AP’s management VLAN.

Default VLAN

VLAN Pools

A VLAN pool allows bridged or tunneled WLAN clients to be distributed across multiple VLANs. When VLAN pooling is implemented, each client is assigned to a VLAN within the pool based on a MAC address hashing algorithm. The implemented algorithm is consistent on both the APs and Gateways and ensures each client will maintain its VLAN assignment each time it connects or roams.

For bridge traffic forwarding, a VLAN pool can consist of a range of contiguous VLAN IDs, a list of non-contiguous VLAN IDs, or a list of selected VLAN Names. VLAN Names require the respective VLAN Name to ID mappings to be configured within the WLAN profile before selection. A VLAN pool using a contiguous range of VLAN IDs and a list of VLAN names is depicted in the below figures:

Bridged WLAN Profile VLAN Pool 1

Bridged WLAN Profile VLAN Pool 2

For tunnel traffic forwarding, a VLAN pool is selected by configuring a Named VLAN within the Gateway configuration group that includes a range of VLAN IDs. The Named VLAN is selected and assigned to the WLAN profile after the primary cluster is selected. A VLAN pool assigned to a WLAN profile configured for tunnel forwarding is depicted in the below figure. The Named VLAN and VLAN ID mappings were configured and managed within the Gateway configuration group:

Tunneled WLAN Profile VLAN Pool
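For reference, the Named VLAN to VLAN ID mapping shown above is conceptually equivalent to the following ArubaOS-style gateway commands. This is only a hedged sketch: the pool name EmployeePool and the VLAN range are illustrative, the exact syntax may vary by release, and in AOS 10 the mapping is normally created through the Gateway configuration group UI rather than entered by hand.

vlan-name EmployeePool pool assignment hash
vlan EmployeePool 76-79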

RADIUS Assigned

Clients connected to WLANs or downlink ports requiring MAC or 802.1X authentication can be dynamically assigned a VLAN from a RADIUS server such as ClearPass or the Cloud Auth service. A RADIUS server can directly assign a VLAN by providing a VLAN ID or VLAN Name to an AP or Gateway. A RADIUS server may also indirectly assign a VLAN to a client by providing a user role name which includes the VLAN assignment.

APs and Gateways can directly assign a VLAN provided by a RADIUS server or Cloud Auth service that is configured to provide HPE Aruba Networking vendor-specific attribute value pairs (AVPs) or standard IETF AVPs in RADIUS Access-Accept or change of authorization (CoA) messages. A VLAN can be dynamically assigned from a RADIUS server that is configured to return one of the following AVPs (see the example after this list):

  • Aruba-User-VLAN – HPE Aruba Networking AVP that provides a numerical VLAN ID (1-4094).

  • Aruba-Named-User-VLAN – HPE Aruba Networking AVP that provides a VLAN Name that maps to a VLAN ID (1-4094) configured in the WLAN profile (bridged) or Gateway configuration group (tunneled).

  • Tunnel-Private-Group-ID – IETF AVP that provides a numerical VLAN ID (1-4094).
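For example, a RADIUS policy could return one of the following sets of AVPs in an Access-Accept to place a client into VLAN 75 (the VLAN value is illustrative). The first option uses the HPE Aruba Networking VSA, while the second uses the standard IETF tunnel attributes, which are commonly sent as a triplet:

Aruba-User-VLAN = 75

Tunnel-Type = VLAN
Tunnel-Medium-Type = IEEE-802
Tunnel-Private-Group-ID = "75"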

A VLAN can also be dynamically assigned based on user role assignment. Each user role may optionally include a VLAN ID assignment that is configured within the user role as part of the WLAN profile configuration. For tunneled or mixed WLANs, the configured VLAN ID is automatically copied to the user role orchestrated on the Gateways. A RADIUS server or Cloud Auth service dynamically assigns the user role to the clients, which determines the bridged or tunneled VLAN assignment. A user role can be dynamically assigned by a RADIUS server by returning the following HPE Aruba Networking vendor-specific AVP:

  • Aruba-User-Role – HPE Aruba Networking AVP that provides the user role name.

To simplify operations and troubleshooting, only one dynamic VLAN assignment method should be implemented at a given time. A VLAN ID or VLAN Name should either be directly assigned or indirectly assigned via a user role. If both VLAN assignment options are provided to an AP or a Gateway, the directly assigned VLAN ID or VLAN Name will take precedence over the VLAN derived from the user role.

VLAN Assignment Rules

A VLAN assignment may also be determined by creating VLAN assignment rules as part of the profile configuration. VLAN assignment rules are optional and permit dynamic VLAN assignments based on admin defined rules that include an attribute, operator, string value and resulting VLAN assignment.

VLAN assignment rules are often implemented during migrations to HPE Aruba Networking by permitting VLAN assignments to be made by an existing RADIUS server and policies that implement IETF or third-party vendor-specific AVPs. Assignment rules permit customers to easily migrate to AOS 10 without having to modify their existing RADIUS policies.

An example of how VLAN assignment rules can be leveraged during a migration is depicted in the below figure. In this example the RADIUS server policies implement the IETF Filter-Id AVP that returns string values such as employee-acl, iot-acl, and guest-acl. Each VLAN assignment rule matches a partial string value and assigns an appropriate VLAN ID. For example, if Filter-Id returns the string value employee-acl, VLAN 75 is assigned. If no assignment rule is matched, the clients are assigned to VLAN 76.

VLAN Assignment Rules Example

VLAN assignment rules are supported for bridged, tunneled, and mixed forwarding profiles. When VLAN assignment rules are configured for tunneled or mixed profiles, the assignment rules are automatically orchestrated on the Gateways. Each VLAN assignment rule will be added as a server defined rule (SDR) under the server group for the tunneled or mixed profile.

Assignment Priorities

Depending on the traffic forwarding mode configured within a profile, the AP or the Gateway will make the VLAN assignment decision. For bridge profiles the AP makes the VLAN assignment decision while for tunneled and mixed profiles the Gateway makes the VLAN assignment decision. By default, if no VLAN is dynamically derived, the static VLAN or default VLAN configured in the profile will be assigned.

When multiple VLAN outcomes are possible for a client, an assignment priority is followed by the APs and Gateways. When multiple VLAN outcomes are presented, the AP or Gateway will assign the VLAN based on the assignment option with the highest priority. As a general rule, the VLAN ID or VLAN Name assigned by a RADIUS server or Cloud Auth service will have the highest priority.

The below table provides the AP VLAN assignment priority for bridged profiles. When multiple VLAN assignment outcomes are presented for a bridged client, the AP will select the assignment option with the highest priority:

Priority Assignment Notes
1 (Lowest) Static or default within the WLAN profile VLAN ID, VLAN Range or VLAN Name
2 VLAN derived from user role Default, Aruba-User-Role (RADIUS) or Role Assignment Rule
3 VLAN assignment rule User defined derivation rule
4 (Highest) RADIUS In order of priority:
  1. Tunnel-Private-Group-ID (Lowest)
  2. Aruba-Named-User-VLAN
  3. Aruba-User-VLAN (Highest)

The below table provides the Gateway VLAN assignment priority for tunneled WLAN profiles. When multiple VLAN assignment outcomes are presented, the Gateway will select the assignment option with the highest priority:

Priority Assignment Notes
1 (Lowest) Static or default within the WLAN profile VLAN ID or VLAN Name
2 VLAN from initial user role
3 VLAN from UDR user role
4 VLAN from UDR rule
5 VLAN from DHCP option 77 UDR user role Wired Clients
6 VLAN from DHCP option 77 UDR rule Wired Clients
7 VLAN from MAC-based Authentication default user role
8 VLAN from SDR user role during MAC-based Authentication
9 VLAN from SDR rule during MAC-based Authentication
10 VLAN from Aruba VSA user role during MAC-based Authentication Aruba-User-Role
11 VLAN from Aruba VSA during MAC-based Authentication Aruba-Named-User-VLAN, Aruba-User-VLAN
12 VLAN from IETF tunnel attributes during MAC-based Authentication Tunnel-Private-Group-ID
13 VLAN from 802.1X default user role
14 VLAN from SDR user role during 802.1X
15 VLAN from SDR rule during 802.1X
16 VLAN from Aruba VSA user role during 802.1X Aruba-User-Role
17 VLAN from Aruba VSA during 802.1X Aruba-Named-User-VLAN, Aruba-User-VLAN
18 VLAN from IETF tunnel attributes during 802.1X Tunnel-Private-Group-ID
19 VLAN from DHCP options user role VLAN inherited by the user role assigned from DHCP options
20 (Highest) VLAN from DHCP options

VLAN Best Practices & Considerations

AOS 10 APs can support any combination of profiles implementing bridge, tunnel, or mixed forwarding modes. When profiles of different forwarding types are implemented, the following considerations should be followed:

  • Avoid using VLAN 1 whenever possible. VLAN 1 is the default management VLAN for APs and is also present on the Gateways. VLAN 1 should only be implemented if the default management VLAN is changed on the APs from 1 to a different value.

  • If the default AP management VLAN 1 is retained, avoid assigning tunneled clients to a VLAN on the Gateway that indirectly maps to the APs untagged management VLAN. For example, if APs are managed on untagged VLAN 70 on the access layer switch and this VLAN is extended to a Branch Gateway, don’t assign tunneled clients to VLAN 70.

  • Implement dedicated VLAN IDs and broadcast domains for bridged and tunneled clients. An AP can either bridge or tunnel clients on a given VLAN ID and cannot do both simultaneously.

  • Prune all tunneled VLANs from the AP uplink ports at the access switching layer. A tunneled VLAN must not be extended to the uplink ports on the APs. As a best practice, only the AP management and bridged user VLANs should be extended to the APs (see the example after this list).

  • If implementing mixed forwarding with Branch Gateways, bridged user VLANs must be layer 3 separated from the Gateways. If no layer 3 separation is implemented, all the clients will be tunneled as all the VLANs will be present within the cluster. If layer 3 separation cannot be implemented, a dedicated WLAN using bridged forwarding must be implemented.

  • Each user VLAN can support a maximum of one IPv4 subnet and one IPv6 prefix. Multiple IPv4 subnets or IPv6 prefixes (i.e., multinetting) are not supported.
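Reusing the switchport configuration style from the earlier examples, an AP uplink port following these best practices carries only the AP management VLAN and the bridged user VLANs, with every tunneled VLAN pruned. The VLAN IDs below are illustrative and match the earlier examples:

interface 1/1/10
   no shutdown
   description [BLD10-FL1-AP-1-0/0]
   no routing
   vlan trunk native vlan 71
   vlan trunk allowed 71,76-79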

2.8 - Roles

Fundamental concepts of roles (often referred to as user roles), how they are implemented and used in HPE Aruba Networking Wireless Operating System 10 (AOS-10) deployments.

2.8.1 - Fundamentals

An introduction to roles, their uses and how they are assigned on APs, gateways and access layer switches configured for User-Based Tunneling (UBT).

Roles are policy and configuration containers that are assigned to client devices connected to HPE Aruba Networking access points (APs), gateways, and access layer switches. Usage of roles is mandatory for APs and gateways but optional for access layer switches except when User-Based Tunneling (UBT) is deployed.

Roles are a differentiating foundational architectural element supported by HPE Aruba Networking infrastructure devices. They can be used to implement dynamic segmentation and policy enforcement between different sets of client devices and may optionally include other attributes for assignment. Initially introduced for use on wireless controllers and controllerless APs (AOS-8), roles are now supported by all current infrastructure devices including APs, gateways, and switches.

HPE Aruba Networking devices that support roles.

Role uses

Roles are used to apply network access policies and other attributes to client devices or user identities. The policy language and supported attributes are specific to the network infrastructure device type and vary between APs, gateways, and switches. The available policy options and attributes are limited by the capabilities and supported features of each device type.

In AOS-10, roles contain policy language used to determine host, network, and application permissions. They may optionally include other configuration attributes such as VLAN assignment, captive portal configuration or bandwidth contracts. Global client roles applied to gateways also include group policy identifiers (GPIDs) used by gateways and switches for role-to-role policy enforcement.

AOS-10 role attributes.

On switches, roles are used to dynamically apply configuration to access ports when port-access security is enabled. When a wired client device or user successfully authenticates, the RADIUS authentication server or Central NAC service can return a role name that determines the port’s operation mode, forwarding behavior, switchport mode, and access or trunk VLAN assignments. If UBT is enabled, the assigned role will also determine the cluster (zone) where traffic is tunneled to, and the role assigned on the gateways.

Switch role attributes.

Role assignment

Roles can be assigned to client devices or user identities on APs, gateways, or access layer switches at the point each client device connects to the network. When traffic is tunneled from an AP or UBT access layer switch to gateways, a role is assigned on the tunneling device where the client is attached in addition to the gateways.

APs

Roles are assigned to each wired and wireless client device (unique MAC) that connects to an AP regardless of the forwarding mode configured in the profile. This includes:

  • Wired devices connected to downlink ports.

  • Wireless devices connected to WLANs.

Each client device is assigned a default role or a user defined role from a RADIUS authentication server, Central NAC service, or role assignment rule. If no role is dynamically assigned or the assigned role does not exist, a default role is assigned. As wireless clients are nomadic, the assigned role will follow each client as they roam between APs within a roaming domain, with the assigned role being cached and automatically distributed by services within Central to neighboring APs.

Default and user defined roles assigned to AOS-10 APs.

Gateways

When a wired or wireless client on an AP or a wired client connected to a UBT switch is tunneled to a gateway cluster, two roles are assigned:

  • A role is assigned at the AP or UBT switch where the wired or wireless client device is attached.

  • A role is assigned on the gateway cluster where the client's traffic is tunneled.

Within a cluster, each tunneled client device (unique MAC address) is assigned an active and standby User Designated Gateway (UDG) via the published bucket map for the cluster (see Cluster Roles). Each client’s assigned UDG gateway is the anchor point for all traffic and is persistent. The only time a tunneled client’s UDG gateway assignment is changed is if a gateway is added or removed from a cluster, a failover to a secondary cluster occurs, or the wireless client roams to an AP that is tunneling to a different cluster.

Default and user defined roles assigned to AOS-10 gateways.

A role may also be assigned to wired client devices that are serviced by a switchport on a gateway. When a port or VLAN is untrusted, each wired device can be optionally authenticated where a user defined role can be dynamically provided by a RADIUS server or Central NAC service. For non-authenticated ports or VLANs, a user defined role may be statically assigned.

Access layer switches

When port-access security is configured on an access layer switch, a role can be dynamically assigned to wired devices from a RADIUS authentication server or Central NAC service. The attributes in each role determine the configuration that is applied to the switchport and whether user-based tunneling (UBT) is activated for forwarding.

When a wired UBT client is tunneled to a gateway cluster, two roles are assigned:

  • A role is assigned on the access layer switch where the wired client device is attached.

  • A role is assigned on the user designated gateway (UDG) for each UBT client.

For UBT to function, a user defined role is assigned on the access layer switch that includes attributes specifying the cluster (zone) the UBT client’s traffic is tunneled to and the user defined role that is assigned on each UDG gateway. For flexibility, the role mapping configured for each role permits either the same role name or different role names to be assigned on the access layer switches and gateways. Additionally, CX access layer switches implement zones, allowing UBT clients’ traffic to be terminated on different clusters within the network.

Role assignments on access layer switches, gateways, and mappings.

Role types

AOS-10 APs and gateways support default roles, user defined roles, and global client roles. Default roles are applied to wired or wireless client devices when no user defined role is assigned while user defined roles and global client roles are assigned by either an authentication server or role derivation rule.

Default roles

Default roles are automatically created for each downlink port profile and WLAN profile that are configured within an AP configuration group. Each default role has the same name as its parent profile and is assigned to client devices when no user defined role is assigned.

Default roles

Default roles are either created within an AP configuration group or both AP and gateway configuration groups depending on the forwarding mode of the profile:

  • Bridge forwarding – The default role is created in the AP configuration group only.

  • Mixed / tunnel forwarding – The default role is created in both the AP and gateway configuration groups. When both a primary and secondary cluster are assigned, they are created in both the primary and secondary gateway configuration groups.

Default roles are mandatory and must exist on the AP for each profile. They can be used to apply security policies to client devices as well as assign other attributes such as VLANs, captive portal configuration, or bandwidth contracts. They may be used exclusively when no dynamic role assignment is required or be employed as a fall-through/catchall role when no dynamic user defined role is assigned.

While a default role can be dynamically assigned to client devices or user identities connected to other profiles, this is not recommended as default roles are deleted when their parent profile is deleted. If a role needs to be assigned to multiple profiles, a user defined role should be used. A default role should only be used within the context of the parent profile.

User defined roles

User defined roles are configured and named by the administrator. They can be independently configured per AP or gateway configuration group or be orchestrated by Central to the necessary configuration groups by a profile creation workflow. They are assigned to client devices or users either by a RADIUS authentication server, Central NAC service, or role derivation rule. A default user role is assigned to client devices when no user defined role is dynamically assigned or if a dynamically assigned role does not exist on the AP or gateway.

User defined roles

When user defined roles are added or modified using a profile creation workflow, the roles and associated policies are either created in the AP configuration group or both the AP and gateway configuration groups depending on the forwarding mode of the profile:

  • Bridge forwarding – User defined roles are created in the AP configuration group only.

  • Tunnel forwarding – User defined roles are created in the respective gateway configuration groups. When both primary and secondary clusters are assigned, they are created in both the primary and secondary gateway configuration groups.

  • Mixed forwarding - User defined roles are created in both the AP and gateway configuration groups.

If no user defined roles are configured using the profile creation workflow, they must be manually created in the respective AP and gateway configuration groups by the admin. Only roles added or modified using a profile creation workflow are automatically orchestrated between AP and gateway configuration groups. When a profile creation workflow is used, policies, attributes and derivation rules are also orchestrated between AP and gateway configuration groups. The orchestrated roles can be used across profiles as needed.

For most AOS-10 deployments, user defined roles will either be created in their respective AP or gateway configuration groups as the profiles on the APs will implement either a bridge or tunnel forwarding mode. User defined roles will only need to be created in both AP and gateway configuration groups if the AP is simultaneously bridging and tunneling user traffic and the same user defined role is assigned to client devices or user identities for both forwarding modes. For example, an employee role is assigned to tunneled wireless clients in addition to bridged wired clients connected to wall-plate APs. In this scenario the employee role would be assigned to both the AP and the respective gateway configuration groups.

Global client roles

Global client roles are not supported on APs or AOS-S switches. Unlike user defined roles, which are configured and managed per configuration group, global client roles are centrally configured and managed in Central and then propagated to CX switches, branch gateways, and mobility gateways.

When propagated to branch or mobility gateways, each global client role will be listed in the roles table in each applicable gateway configuration group and is identified with a ‘Yes’ flag in the global column. Each global client role must have a unique name and cannot overlap with existing default or user defined roles.

Gateway configuration group roles table with global client roles

A global client role can be assigned to tunneled client devices terminating on a gateway cluster in addition to wired client devices that are connected to an untrusted port or VLAN on a gateway. They can be used the same way as user defined roles and can include IP-based policies and attributes.

Unlike default and user defined roles, global client roles do not contain any IP-based network access permissions by default, and these must be assigned post propagation by the admin. If used in an unmodified state, client devices will be unable to obtain IP addressing or communicate over the intermediate IP network. For each propagated role, the admin must assign one or more session access control lists (SACLs) that allow basic network services such as Dynamic Host Configuration Protocol (DHCP) and Domain Name System (DNS) in addition to the necessary destination host and network permissions.

Global client roles may also be used to apply role-to-role group-based policy enforcement with a NetConductor solution in addition to role-to-role enforcement across gateways as detailed in the VSG.

2.8.2 - Management and Configuration

An overview of how roles can be configured and managed in AP and gateway configuration groups within Central.

Role management and configuration in Central is separated into two management functions. The first management function involves role creation or removal which can be performed in different areas within the Central UI depending on the role type:

  • Default roles – Are supported on APs and gateways. They are added to or removed from AP and gateway configuration groups with their parent profile. Default roles cannot be manually created or removed.

  • User defined roles – Are supported on APs and gateways. They are added or removed using either the profile creation workflow or are manually added or removed directly within each AP or gateway configuration group.

  • Global client roles – Are added or removed globally within a Central instance then propagated to gateways and switches.

As roles are policy and configuration containers, the second management function involves adding, removing, or modifying network access policies and attributes for each role. For default and user defined roles, the forwarding mode selected for a profile will influence where role management can be performed:

  • Bridge forwarding – Network access policies and attributes can be configured and managed using the profile creation workflow or by directly modifying each role within an AP configuration group.

  • Mixed or tunnel forwarding – Network access policies and attributes are configured and managed directly per AP and gateway configuration group. This recent change permits different network access policies and attributes to be assigned to a role on APs and gateways.

Role to role permission management and group policy identifier configuration for global client roles is performed globally within each Central instance. For global client roles that are propagated to mobility gateways, additional network access policies and attributes are configured and managed directly within each gateway configuration group.

Profile creation workflow

The profile creation workflow provides a convenient way to configure default and user defined roles as part of an intuitive workflow. Roles can be added and removed without requiring the admin to exit the profile workflow. The access slider in the workflow determines the level of role configuration that is exposed:

  • Unrestricted – No role configuration is exposed within the workflow.

  • Network Based – Network access permissions and attributes can be configured and modified for the default role only.

  • Role Based – Full role configuration is exposed.

For bridge forwarding profiles, roles can be added, removed, and configured using the workflow. When Role Based access is selected, adding, editing, or removing user defined roles is possible.

Bridge profile role configuration within the workflow.

For mixed and tunnel forwarding profiles, roles can be added and removed using the profile creation workflow, but policies cannot be configured. User defined roles added or removed using the workflow are added or removed from their respective AP and gateway configuration groups. Note that network access policies and attributes are no longer configurable using the profile creation workflow for mixed and tunnel forwarding profiles and must be manually configured in the respective AP and gateway configuration groups. A warning is displayed in the configuration workflow advising of this requirement.

Mixed / tunnel profile role configuration within the profile creation workflow

Configuration groups

User defined roles can be added, removed, and configured directly per AP and gateway configuration group using the Central UI. The admin can configure network access permissions and attributes for existing roles or add, delete, and configure user defined roles. The UI also offers a convenient way to pre-configure user defined roles, network access permissions and attributes prior to creating profiles.

For AP configuration groups, default and user defined roles can be configured and managed under Security > Roles. User defined roles can be added, removed, or configured, but default roles can only be configured and not removed. Default roles can only be removed by removing the parent profile.

Each role is configured by selecting a role from the list which presents the network access policies and attributes that are configured for the selected role. An example of role management within an AP configuration group is depicted below.

AP group role configuration and management.

For gateway configuration groups, default and user defined roles can be configured and managed under Security > Roles. The role table lists all the roles configured in the gateway configuration group which includes predefined roles, default roles, user defined roles, and global client roles. Global client roles are identified with a Global “Yes” flag.

Each role is configured by selecting a role in the table which displays an additional table that presents the network access policies and attributes that are assigned to the selected role.

Gateway group role configuration and management.

2.8.3 - Bridge Forwarding

A discussion on how roles are implemented, assigned, and enforced on APs when using bridge forwarding profiles.

Please refer to the Forwarding Modes of Operation for a detailed overview of bridge forwarding.

Supported role types

For bridge forwarding, the AP makes the role assignment decision. Bridged clients can be assigned a default role or user defined role but not a global role. A bridged client is assigned either a default role or a user defined role depending on whether a user defined role is dynamically assigned from an authentication server or role derivation rule.

Role derivation and assignment

For bridge forwarding, the APs operate as authenticators and make the role assignment decision. When a client device attaches to an AP or a device/user identity is authenticated, a default or user defined role is assigned:

  • Default role – Is assigned when no user defined role is dynamically assigned, or the dynamically assigned role is not present on the AP.

  • User defined role – Is dynamically assigned from a RADIUS authentication server, Central NAC service or role assignment rule.

A user defined role may also be assigned post-authentication using a DHCP role assignment rule. DHCP role assignment rules are evaluated post authentication as a DHCP message exchange must occur. A default or user defined role may also be changed post-authentication by an authentication server that sends a change of authorization (CoA) message.

Default role

A default role is created for every bridge profile with a default role assignment rule that cannot be modified. The default role assignment for a profile can be viewed in the profile creation workflow when Role Based access is selected. An example of a default role assignment for a profile named BridgeProfile is depicted below.

Bridge profile default role assignment rule.

A default role is assigned to client devices or user identities when no role is dynamically assigned from a RADIUS authentication server, Central NAC service or role assignment rule. They are also assigned if a dynamically assigned role is not present in the AP configuration.

Assignment rules

User defined roles can be dynamically assigned to client sessions by creating role assignment rules within the profile creation workflow. They are optional and permit dynamic user defined role assignment based on admin defined rules that include an attribute, operator, string value, and the resulting role assignment. They operate like security access control lists (ACLs) where rules are evaluated in order (top down). The first assignment rule that is matched is applied. Assignment rules may also be re-ordered at any time.

Role assignment rules are often implemented during migrations to HPE Aruba Networking by allowing role assignments to be made using the attribute value pairs (AVP) from existing RADIUS server policies that implement IETF or vendor specific attributes (VSA).

As an example, a third-party RADIUS server is configured with policies that return the IETF Filter-Id AVP with unique string values that can be used by the APs to assign a user defined role. Each rule in the UDR includes a match condition and a user defined role assignment.

Role assignment rule using the Filter-Id AVP to determine the role to assign.

Assignment rules can also be used for dynamic role assignment for non-authenticated sessions. For example, assignment rules can be created to dynamically assign user defined roles based on MAC OUI or DHCP options. This can be useful if dynamic VLAN assignments or unique network access policies need to be applied to sets of headless devices that do not support 802.1X or for profiles that do not have 802.1X or MAC authentication enabled.

DHCP option-based rules are evaluated post authentication and are only applicable once a VLAN assignment has been made as the assignment rules operate by matching option fields exchanged in DHCP discover and request messages. DHCP option-based rules are not applicable for profiles with Captive Portal enabled and should not be used to assign user defined roles that result in a VLAN assignment change.

RADIUS assigned

Clients connected to WLANs or downlink ports requiring MAC or 802.1X authentication can be directly assigned a user defined role from a RADIUS authentication server or Central NAC service that return the HPE Aruba Networking Aruba-User-Role vendor-specific AVP. Policies on the RADIUS authentication server or Central NAC service can be configured to directly return a user defined role name based on the authenticating device/user identity, user identity store attributes such as department, or other contextual conditions such as date or time, location, or posture.

APs performing MAC or 802.1X authentication will accept the Aruba-User-Role AVP from a RADIUS Server or Central NAC with no additional configuration being required. If the user defined role is present on the AP and no role assignment rule is matched, the role name provided by the Aruba-User-Role AVP is assigned.
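For example, a RADIUS policy or ClearPass enforcement profile could return the following AVP in the Access-Accept; the role name employee is illustrative and must match a user defined role that exists in the AP configuration group:

Aruba-User-Role = "employee"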

A role assignment rule can be configured to use a specific role based on the received role name if required. For example, if the Aruba-User-Role is returned with the value Employees, a role assignment rule can be configured to match the received role name and apply a different role. This can be a useful tool for migrations and troubleshooting.

Assignment order

When multiple role assignment outcomes are possible for a client device or user identity, an assignment priority is followed by the AP. As a rule, a user defined role that is derived from a role assignment rule will take precedence over a user defined role received from the Aruba-User-Role AVP. If no user defined role is derived or the derived role does not exist on the AP, a default role is assigned.

Bridge forwarding role assignment order

Priority Assignment Notes
1 (Highest) Role Assignment Rule Evaluated in order
2 Aruba VSA Aruba-User-Role
3 (Lowest) Default role If no user defined role is derived

User defined roles can also be dynamically assigned post authentication, which is not captured in the above assignment flow. A user defined role change can occur as the result of a DHCP assignment rule during attachment or a change of authorization (CoA) message received from a RADIUS authentication server or the Central NAC service. User defined roles assigned from a DHCP assignment rule or CoA will take precedence over a previously assigned default or user defined role post authentication.

For example, if an 802.1X client device is assigned a user role using the Aruba-User-Role AVP and a DHCP assignment rule is matched that assigns a different role, the role derived from the DHCP assignment rule will take precedence.

Policy enforcement

When bridge forwarding is selected in a profile, the APs operate as the sole policy enforcement point. The APs inspect all user traffic and can make forwarding and drop decisions based on each client device’s role assignment and the network access policies that are configured in each role.

Each AP has a deep packet inspection (DPI) capable firewall that can permit or deny traffic flows based on available information contained within IP headers. When application visibility or unified communications (UCC) is enabled, the APs can also identify applications and real-time application flows by leveraging deep packet inspection (DPI), application layer gateways (ALGs) and advanced heuristics.

Each AP is fully capable of inspecting traffic received from attached client devices and making a forward or drop decision based on the network access rules that are configured within each assigned role. All north / south and east / west traffic flows are inspected and can be acted on by the firewall. Client devices can either be assigned a default role or be dynamically assigned a user defined role. When dynamic role assignment is used, individual clients connected to a WLAN or downlink port can be assigned separate roles each with the necessary network access policies assigned.

Bridge forwarding policy enforcement.

Scaling considerations

When configuring user defined roles within an AP configuration group, scaling must be considered as each AP can only support a specific number of default and user defined roles which is dependent on the version of AOS-10 in use.

AP maximum supported roles

AOS-10 version Max roles
10.5 and below 32
10.6 and above 128

Each wired-port profile and WLAN profile includes a default role that counts against the maximum number of roles supported by the APs. This also includes the two default wired-port profiles that are present on each AP and cannot be removed.

To determine the number of user defined roles that can be configured in an AP group, you must subtract the total number of wired-port and WLAN profiles that are present on the AP from the maximum number of roles that are supported. For example, an AP running 10.6 that has a total of 6 wired-port + WLAN profiles configured in the group can support a total of 122 user defined roles (128 – 6 = 122).

2.8.4 - Tunnel Forwarding

A discussion on how roles are implemented, assigned, and enforced on APs and gateways when using mixed or tunnel forwarding profiles.

Please refer to the Forwarding Modes of Operation for a detailed overview of tunnel forwarding.

Supported role types

When mixed or tunnel forwarding mode is enabled in a profile, the gateway determines the role assignment. That role assignment is done at both the AP and the gateway:

  • AP – A default or user defined role

  • Gateway – A default, user defined, or global role.

Split role assignment

Tunneled clients can be assigned the same or different roles on the AP and gateway. A default role is assigned on both the AP and gateway if no role is dynamically assigned from an authentication server, Central NAC service, server derivation rule (SDR), or user derivation rule (UDR). Additionally, an AP will assign a default role to a tunneled client if a dynamically assigned role is not present on the AP. As global client roles are not supported by APs, an AP can only assign a default or user defined role to a tunneled client.

The following combinations of role assignments are supported for tunnel forwarding:

  • Default role – Assigned on both APs and gateways if no dynamic role assignment is made.

  • User defined role – Assigned on both APs and gateways if a dynamic role assignment is made and the role is present in both the AP and gateway configuration groups.

  • Separate roles – A default role is assigned on the APs and a user defined or global role is assigned on the gateways.

Separate roles can only be assigned on the AP and gateway when a dynamically assigned role is not present on the AP. When a role is dynamically assigned to a tunneled client device or user identity that is not present on the AP, the gateway will assign the dynamically assigned role while the AP will assign the default role. For most deployments, the default role on the AP will only contain the default network access policy (allow all) while the user defined or global role on the gateway will contain more restrictive network access rules and attributes.

Role derivation and assignment

For mixed and tunnel forwarding, the gateway operates as the authenticator and makes the role assignment decision. When a client device attaches to an AP or a device/user identity is authenticated, a role is assigned on both the AP and the gateway.

Default role

A default role is created for every mixed or tunnel mode profile. The default role assignment for a profile can be viewed in the profile creation workflow when Role Based access is selected. The default assignment rule cannot currently be modified.

Tunnel profile default role assignment rule.

A default role is assigned to client devices or user identities when no role is dynamically assigned from a RADIUS authentication server, Central NAC service, or derivation rule. A default role is also assigned if a dynamically assigned role is not present on the AP or gateway.

Assignment rules

User defined and global client roles can be dynamically assigned to client devices or user identities by creating role assignment rules. Gateways support two types of role assignment rules:

  • Server derivation rules (SDR) – Can assign roles based on rules that match IETF or vendor-specific RADIUS attributes and values that are returned from a RADIUS server or Central NAC service.

  • User derivation rules (UDR) – Can assign roles based on rules that match MAC OUIs or DHCP options.

SDR and UDR assignment rules are optional and permit dynamic user defined role or global role assignment based on admin defined rules that include an attribute, operator, string value and the resulting role assignment. They operate like security access control lists (ACLs) where rules are evaluated in order (top down). The first assignment rule that is matched is applied. Assignment rules may also be re-ordered at any time.

Server derivation rules

SDR rules can be configured either within a profile creation workflow or directly within each gateway configuration group. They can be implemented for profiles that use MAC or 802.1X authentication.

SDR rules configured using a profile creation workflow are automatically orchestrated on the respective primary/secondary gateway cluster configuration groups. Each mixed or tunnel mode profile includes a corresponding authentication server group in the primary/secondary cluster gateway configuration groups. SDR rules configured in the workflow are automatically added as server rules in the respective tunnel profile authentication server groups.

When both a primary and secondary gateway cluster are assigned to the profile, the server derivation rules should be managed directly in the mixed or tunnel mode profile. This ensures that the derivation rules are the same for each cluster, because the authentication server group configurations in both locations are modified automatically. If SDR rules are defined directly within each gateway configuration group, additional care must be taken to ensure the rules are the same in both authentication server groups; otherwise, unpredictable role assignments will occur.

Tunnel profile SDR rule example.
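
As a reference for what the orchestrated configuration looks like on the gateways, the following is a minimal sketch of an SDR added to a profile’s authentication server group. The server group name, matched attribute, and role names are placeholders, and these rules are normally created in the profile workflow or the gateway group UI rather than entered as CLI:

aaa server-group "CorpWLAN-srv-group"
  set role condition Filter-Id equals "staff" set-value "employee-role"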

User derivation rules

UDR rules are configured per gateway configuration group and can be used to dynamically assign user defined or global client roles to tunneled client devices based on MAC address or DHCP signatures. Each UDR ruleset can contain multiple rules that are evaluated in order (top-down). The first rule that is matched is applied.

UDR rules are configured per gateway configuration group by selecting Security > Advanced > Local User Derivation Rules. Each ruleset has a unique name and can contain multiple rules in order of priority. Existing rules can be re-ordered at any time by selecting a rule and moving it above or below another rule.

In the following UDR example, a ruleset named tunnelprofile with two DHCP option rules has been created. The first rule matches the option 55 signature for MacBook Pros running Sonoma, while the second rule matches the option 55 signature for an HP Windows 11 notebook.
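
A minimal CLI sketch of an equivalent ruleset is shown below. The DHCP option 55 hex strings and role names are placeholders rather than real device signatures, and in Central these rules are typically created through the gateway group UI (Security > Advanced > Local User Derivation Rules):

aaa derivation-rules user tunnelprofile
  set role condition dhcp-option equals "370103060F0C" set-value "macos-role"
  set role condition dhcp-option equals "37010306210C" set-value "win11-role"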

A ruleset must be assigned to an orchestrated AAA profile by selecting Security > Role Assignment (AAA Profiles). Each mixed or tunnel mode forwarding profile will have a corresponding AAA profile orchestrated on the applicable gateway configuration groups. Only one UDR ruleset can be applied per orchestrated AAA profile.

UDR rule-set assignment to an AAA profile.

DHCP option-based rules are evaluated post authentication and apply only after a VLAN assignment has been made, because they operate by matching option fields transmitted by client devices in DHCP discover and request messages. DHCP option-based rules should not be used to assign user defined roles or global client roles that result in a VLAN assignment change, and they are not applicable to profiles with Captive Portal enabled.

RADIUS assigned

Clients connected to mixed or tunnel mode forwarding WLANs or downlink ports requiring MAC or 802.1X authentication can be directly assigned a user defined or global role from a RADIUS authentication server or Central NAC service configured to return the Aruba-User-Role AVP.

APs forward RADIUS access requests to their assigned designated device gateway (DDG), which proxies them to the configured external RADIUS server or the Central NAC service. The gateways accept the Aruba-User-Role AVP from a RADIUS server or Central NAC with no additional configuration required in the profile. If the user defined role is present on the gateway, the role name supplied by the Aruba-User-Role AVP is assigned.

A role assignment rule can be configured to change the received role name if required. For example, if the Aruba-User-Role is returned with the value Employees, a role assignment rule can be configured to match the received role name and apply a different role name such as employee-role. This can be a useful tool for migrations and troubleshooting.
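
As a sketch of the remapping described above, an SDR in the profile’s authentication server group can match the returned role name and substitute another, assuming Aruba-User-Role is selected as the rule attribute. The server group name is a placeholder:

aaa server-group "CorpWLAN-srv-group"
  set role condition Aruba-User-Role equals "Employees" set-value "employee-role"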

Assignment order

When multiple role assignment outcomes are possible for a client device or user identity, the gateway follows an assignment priority. As a rule, a user defined role received in the Aruba-User-Role AVP or derived from an SDR takes precedence over a user defined role assigned from a UDR. If no user defined role is derived, or the derived role does not exist on the AP or gateway, a default role is assigned.

Mixed/tunnel forwarding role assignment order

Priority Assignment Notes
1 (Highest) Aruba VSA Aruba-User-Role
2 Server derivation rule (SDR) Evaluated in order
3 User derivation rule (UDR) Evaluated in order
4 (Lowest) Default role If no user defined role is derived

User defined roles can also be dynamically assigned post authentication, which is not captured in the above assignment order. A user defined role change can occur as the result of a DHCP UDR assignment rule matched during attachment or a change of authorization (CoA) message received from a RADIUS authentication server or the Central NAC service. User defined roles assigned from a DHCP UDR assignment rule or CoA take precedence over a previously assigned default or user defined role post authentication.

For example, if an 802.1X client device is assigned a user role using the Aruba-User-Role AVP and a DHCP UDR assignment rule is matched that assigns a different role, the role derived from the DHCP assignment rule will take precedence.

Policy enforcement

When tunnel forwarding is enabled in a profile, the APs and gateways can both operate as policy enforcement points. Both can inspect user traffic and make forwarding and drop decisions based on the network access policies defined within each assigned role.

The network access policies included in the roles assigned at the AP and gateway determine which device inspects the traffic and makes the drop or forwarding decision. For most tunneled deployments, the client device or user identity is assigned a default role on the AP and a user defined role on the gateway. The default role on the AP includes a default allow-all rule that permits all traffic to be forwarded, while the user defined role on the gateway includes more restrictive network access policies and provides enforcement.

Tunnel forwarding policy enforcement.
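
The split described above can be pictured with the following gateway-side sketch: the AP keeps its default allow-all role while the gateway role carries the restrictive rules. The role and ACL names are illustrative, and gateway roles are normally managed under Security > Roles in the gateway configuration group:

ip access-list session corp-restricted-acl
  user any svc-dhcp permit
  user any svc-dns permit
  user any tcp 443 permit
  user any any deny
user-role employee-role
  access-list session corp-restricted-acl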

For mixed forwarding, the enforcement point depends on the forwarding mode utilized for each client device or user identity.

Ultimately, the network access policies assigned to the default and user defined roles determine whether the AP, the gateway, or both perform packet inspection and enforcement. As a general recommendation, use the AP as the enforcement point for bridged forwarding and the gateway as the enforcement point for tunnel forwarding.

While the roles on the AP and gateway can each contain separate network access policies, this should be avoided because it results in a more complex policy deployment model, with firewall functions distributed between the two devices. If network access rules must be implemented on both the AP and the gateway for tunneled traffic, apply the less restrictive policies at the AP and the more restrictive or complex policies at the gateway.

2.8.5 - User-Based Tunneling

A discussion on how roles are implemented, assigned, and enforced on access layer switches and gateways for User-Based Tunneling (UBT) deployments.

This section provides an overview of how user defined roles are implemented on AOS-10 gateways and AOS-CX access layer switches for User-Based Tunneling (UBT) deployments. This section covers the role types that are supported on gateways and switches and how the roles are configured and managed. This section also provides details for how roles are assigned and where network access permissions are enforced.

Role types

UBT deployments implement user defined roles on the access layer switches and gateways, which are independently configured and managed by the administrator. The access layer switches may optionally implement downloadable user roles (DUR) from a ClearPass Policy Manager (CPPM) server, where user defined roles are dynamically downloaded and installed on the access layer switches upon successful authentication and authorization.

User defined roles

User defined roles are configured and named by the administrator and must be configured per gateway and switch configuration group. They may also be configured directly on access layer switches that are not managed by Central.

User defined roles are assigned to UBT client devices or user identities either by a RADIUS authentication server or the Central NAC service. As the access layer switch is the authenticator, the user defined role cannot be assigned by gateways using role derivation rules.

The user defined role assigned on the access layer switch and the role assigned on the gateway can have the same name or different names. Each user defined role on the access layer switch that is used for UBT includes specific attributes that determine the cluster UBT traffic is tunneled to and the gateway role that is assigned.

As gateways and switches are often managed and configured by separate IT teams, the gateway role mapping allows for discrepancies between role names. For example, a VoIP phone can be assigned a user defined role named ip_phone on the access layer switch and a role named voip-role on the gateway. The same role name may also be assigned on both.

UBT switch roles and gateway mappings

Role configuration and management

User defined roles must be configured and managed separately per gateway and switch configuration group. For access layer switches, UBT configuration, user defined roles and gateway mappings can be applied either using configuration templates or the MultiEdit configuration editor. For gateways, user defined roles, attributes, and network access permissions are configured per gateway configuration group using the Central UI.

Access layer switches

User defined roles can be added, removed, and configured directly per switch configuration group using either templates or the MultiEdit configuration editor. Template groups allow a configuration to be applied to all CX switches within a configuration group, or different configurations to be applied to groups of switches based on model and version. The UBT zone configuration, user defined roles, and gateway role mappings are defined in each respective template.

The MultiEdit configuration editor allows configuration to be applied to multiple CX switches simultaneously, or to individual switches, based on the selection made in the Central UI. UBT zone configuration, user defined roles, and gateway role mappings can be added, removed, or modified by selecting one or more CX access layer switches, editing the configuration, and adding the necessary CLI commands, all within a single workflow in the Central UI. Syntax checking is provided within the workflow.

The example switch group role configuration below, applied using MultiEdit, includes three user defined roles named contractor, employee, and ip_phone, each with a common UBT zone assignment but a unique gateway role mapping.
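
A minimal AOS-CX sketch of that configuration is shown below. The zone name, controller IP addresses, and gateway role names are placeholders, and the exact syntax can vary by AOS-CX release:

ubt zone campus vrf default
    primary-controller ip 10.2.120.10
    backup-controller ip 10.2.130.10
    enable
port-access role contractor
    gateway-zone zone campus gateway-role contractor-role
port-access role employee
    gateway-zone zone campus gateway-role employee-role
port-access role ip_phone
    gateway-zone zone campus gateway-role voip-role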

Gateways

User defined roles can be added, removed, and configured directly per gateway configuration group using the Central UI. The admin can configure network access permissions and attributes for existing roles or add, delete, and configure user defined roles.

For gateway configuration groups, default and user defined roles can be configured and managed under Security > Roles. The role table lists all roles configured in the gateway configuration group, including predefined roles, default roles, user defined roles, and global client roles.

Each role is configured by selecting it in the table, which displays an additional table presenting the network access policies and attributes assigned to the selected role. An example of role management within a gateway configuration group is depicted below. In this example, a role named contractor-role is selected and its network access policies are displayed.

Gateway user defined role configuration and management

Each user defined role on the gateway that is used for UBT must include a VLAN assignment, which is defined as an attribute of the role. Each role can be assigned a VLAN ID or VLAN name defined within the configuration group, using a dropdown selection under the More option for the role. The VLAN ID or VLAN name must be configured and present within the configuration group.

An example of VLAN assignment for a user defined role named contractor-role is depicted below. In this example, UBT clients are assigned to VLAN ID 82 within the cluster.

Gateway user defined role VLAN assignment.
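
Expressed in gateway CLI form, the same assignment reduces to a role with a VLAN attribute. This is only a sketch of what the Central UI builds, with the network access policies omitted:

user-role contractor-role
  vlan 82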

Role derivation and assignment

When User-Based Tunneling (UBT) is deployed, a user defined role configured on the access layer switch initiates the user based tunneling session to a cluster of gateways. For a typical deployment, the UBT ports are configured with MAC and/or 802.1X port-access security where each wired device (unique MAC) is authenticated against a RADIUS server or Central NAC service. Upon successful authentication, the RADIUS authentication server or Central NAC service returns the Aruba-User-Role AVP that determines the user defined role assignment.

Each user defined role used for UBT includes additional attributes that specifies a UBT zone and gateway role:

  • UBT zone – References configuration within the access layer switch that determines the primary and optionally secondary cluster that traffic is tunneled to. Each role supports one zone assignment.

  • Gateway role – Determines the role that is assigned on the gateway.

The user defined role assigned to the UBT client device or user identity must include both the UBT zone and gateway role attributes as they determine the primary or secondary cluster the traffic is tunneled to in addition to the role assigned within the cluster. The assigned role on the cluster determines the network access policies that are applied in addition to the VLAN assignment within the cluster.

When a wired client device or user identity is authenticated by the access layer switch and a user defined role with UBT attributes is assigned, the traffic is tunneled to the respective primary or secondary cluster. Each UBT client is anchored to a user designated gateway (UDG) node within the cluster based on the published bucket map. The configured gateway role determines the user defined role that is assigned to the UBT session on the UDG, in addition to the VLAN assignment. Each UBT session can be assigned the same role name on the access layer switch and UDG, or separate role names if required.

Policy enforcement

For UBT, the access layer switches and gateways can both operate as policy enforcement points, however the traffic inspection capabilities of both devices are quite different. The access layer switches do not implement a stateful packet inspection firewall and only support stateless access control lists (ACLs) which can be applied to ingress or egress traffic. Gateways implement a deep packet inspection (DPI) firewall that is stateful and application aware. Traffic is inspected on ingress.

For most UBT deployments, the network access policies will be defined within the user defined roles on the gateways and all north / south and east / west traffic flows can be inspected and enforced by the gateways. Gateway enforcement also allows for the same user defined roles, network access policies and attributes to be applied to both wireless and UBT clients, but the different client types should be assigned separate VLANs.

UBT policy enforcement

3 - Migrating to AOS 10

This guide for AOS-10 migration provides information and instructions on moving an HPE Aruba Networking mobility infrastructure from AOS 8 or Instant AOS 8 to AOS 10.

Migrating existing hardware to HPE Aruba Networking Wireless Operating System AOS 10 (AOS 10) requires some consideration and planning before firmware can be loaded. While Mobility Controllers (now known as Gateways in AOS 10) that are currently running HPE Aruba Networking Wireless Operating System AOS 8 (AOS 8) have a straightforward path to AOS 10, access points (APs) can be deployed in multiple ways:

  • An AP operating in controller managed mode, running AOS 8 with a Mobility Conductor and/or Mobility Controller
  • HPE Aruba Networking Instant Operating System AOS 8 (Instant AOS 8) managed by HPE Aruba Networking Central (Central)
  • IAP 8 with local management, i.e., the virtual controller
  • IAP 8 managed by HPE Aruba Networking AirWave

How the APs are currently deployed, the software type and version, and the management and operation tool used will determine the required path for successfully migrating to AOS 10. In the case of an AOS 8 deployment with APs deployed in controller managed mode and directly controlled by Mobility Controllers, migrating to AOS 10 also introduces a major architectural change, as Central is now required to operate and manage the APs and Gateways. With this change, the migration to AOS 10 requires the entirety of the AOS 8 configuration to be recreated in Central.

Considerations

Familiarity with AOS 10 is a requirement for success with the adoption process; if necessary, review the related AOS 10 sections of this documentation for further information.

Prerequisites

Make sure to validate that all of the applicable prerequisites are met before attempting to migrate any hardware to AOS 10.

General

  • The device hardware (AP or Gateway) is supported on AOS 10.

  • Devices are present within the inventory of HPE GreenLake Edge-to-Cloud Platform (GLCP) and have a valid application and subscription applied; see Setting up Your Aruba Central Instance for assistance.

    • The pre-validate option for conversion of controller managed APs requires pre-provisioning the AP into an AOS 10 Access Point Group in Central.
  • Devices can resolve all applicable FQDNs and reach those targets on required ports; refer to the guide “Opening Firewall Ports for Device Communication” for assistance.

  • Access Points must be running in IPv4 or dual-stack mode; native IPv6 on APs is not supported.

  • Using AirWave 8.2.15.1 or later to perform a firmware upgrade to AOS 10 is not possible.

Instant APs

  • Instant APs must be running Instant AOS 8.6.0.18, 8.7.1.0, or later releases for a successful firmware upgrade.

  • A cluster of Instant APs must not contain any AP models not supported by AOS 10; attempting to upgrade a mixed model cluster that contains models unsupported in AOS 10 will result in an upgrade failure. Make sure to remove all unsupported models from the cluster prior to attempting the upgrade.

  • Instant APs must not have the uplink native VLAN configured, as the VLAN configuration is not retained or migrated through the software upgrade.

Controller based APs

  • To convert controller managed APs, the associated gateway should be running AOS 8.10.0.12, 8.12.0.1, or a later release.

    • If using a release of AOS 8.10 prior to 8.10.0.5, transferring an image to the controller using SCP will result in incorrect permissions applied to the image; upload the image to the controller using a method other than SCP.

    • If using a release of AOS 8.10 prior to 8.10.0.11 and CPSec is enabled, the pre-validate operation will fail due to the packet routing used for CPSec; bypassing the validation process will allow the upgrade to complete at the risk of incomplete setup of the APs.

  • If an HTTP proxy is in place between the AP and access to Central, configuration of the HTTP proxy must be completed prior to upgrade. The proxy settings can be configured through the Mobility Conductor or Controller.

Controllers

  • Mobility Controllers/Gateways running AOS 6 or AOS 8 may require manually upgrading the software to AOS 10.

    • Controllers shipped from the factory prior to 2017 and running a version prior to AOS 6.5.1.4 will not contact Activate and therefore will not automatically be upgraded to a version that permits provisioning by Central.

    • Controllers running AOS 6.5.1.4 or later and that have never had a software upgrade (i.e., are essentially new in box from the factory) are capable of contacting Activate and upgrading to a cloud enabled version of AOS.

    • Any controller that has previously undergone a firmware upgrade must be manually upgraded to AOS 10. After the software is loaded, the controller must have its configuration erased (write erase) and be reloaded to boot AOS 10 before the gateway can be provisioned in Central; a CLI sketch follows this list.

    • Controllers already running an AOS 8 based SD-Branch version of software will perform standard provisioning behavior when reloaded after erasing the configuration and can then be managed in Central after initial provisioning.

  • Mobility Controllers/Gateways to be upgraded must be removed from any customer assigned Activate folder with existing provisioning rules before attempting the upgrade or connecting to Central.
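
The following is a minimal sketch of the manual upgrade sequence referenced above, assuming the AOS 10 image has already been obtained. The FTP host, username, and image filename are placeholders, and the target partition may differ on your controller:

(host) [mynode] #copy ftp: <ftp-host> <username> <aos10-image-file> system: partition 0
(host) [mynode] #write erase
(host) [mynode] #reload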

3.1 - Preparing Central for AOS 10 Migration

Initial configuration requirements in Central to prepare for migrating devices to AOS 10.

Migrating to AOS 10, regardless of the starting point, requires some initial configuration within Central for the best outcome during migration.

Prepare an AOS 10 Group in Central

  1. Create a new group on the Aruba Central account. Go to Global > Organization > Network Structure > Groups and use the + (plus sign) button to add a new group. A new group must be created; do not attempt to reuse a cloned or existing group with IAP 8.X configuration for AOS 10.

  2. Choose the hardware the Group will administer. A single group can contain a single type or a mixture of hardware.
    Group dedicated to APs:


    Group dedicated to Gateways:

  3. Choose ArubaOS 10 as the architecture to use for the new group and select the appropriate Network Role for the APs:


    If creating a group including or dedicated to Gateways, the role for the gateways must be indicated:

  4. Select the newly created group and choose the desired hardware type. In the left menu, use the Firmware option to set the firmware compliance to an AOS 10 firmware version. Click Set Compliance in the upper right.

  5. Enable the Set firmware compliance toggle and choose a version of AOS 10 from the dropdown to ensure that devices added to the group will be upgraded appropriately.

  6. Prepare the configuration for this new AOS 10 group by configuring System settings such as country code, time zone, NTP, WLAN configuration, etc. For AP groups, refer to Configuring ArubaOS 10 APs in Aruba Central. Gateway configuration information can be found at Gateway Deployment.

Mixed Model Aruba Instant Cluster

Before migrating a cluster of Aruba Instant APs to AOS 10, any models unsupported in AOS 10 must be removed or moved to a separate management network to split the cluster. Attempting to upgrade firmware to AOS 10 when unsupported models are in the cluster will result in a download error, and none of the APs will be upgraded.

3.2 - Access Points

Migration of APs to AOS 10.

3.2.1 - Configuration Retained or Migrated

Upgrading an AP to AOS 10 can retain some information from the previous configuration. Know what is retained, in what situation, to avoid unexpected operations.

WIP: The following information was current as of AOS 10.4 and is being updated to reflect the expected behavior with AOS 10 versions 10.5 and 10.6.

When an AP is loaded with AOS 10 and checks in with Central for the first time, some of the previous configuration is retained on the AP, and a subset of that configuration can be migrated into the configuration for the AP stored in Central. Some of this retained configuration will allow the AP to operate properly on the network. Some settings can cause failures or interruptions.

Potential migration impactful configuration

This is the list of configuration settings, either at the AP or AP Group/Virtual Controller level, that can impact connectivity of the AP to the local network and/or Central when the AOS 8 AP is upgraded to AOS 10.

  • DHCP

    • No changes, default configuration.
  • Static IP Address

    • Campus: Migrated into the device-level AP configuration in Central.

    • Microbranch: Not supported.

  • Native VLAN

    • Native VLAN configuration is not retained on the AP and not migrated to Central. The AP will always assume VID 1 for the native VLAN on the uplink.
  • Management VLAN

    • Management VLAN configuration is not retained on the AP and not migrated to Central. The AP will always assume the management interface is VID 1.
  • LACP

    • Non-default LACP configuration is retained on the AP and migrated into the device level AP configuration.
  • AP1X

    • AP1X configuration is retained on the AP but not migrated into the configuration on Central.

    • A configuration mismatch can cause the AP to flap when synchronizing with Central; the resulting loss of connectivity can trigger auto-restore and revert the configuration.

  • HTTP Proxy

    • HTTP proxy configuration is retained on the AP but not migrated into the configuration on Central.

    • A configuration mismatch can cause the AP to flap when synchronizing with Central; the resulting loss of connectivity can trigger auto-restore and revert the configuration.

  • PPPoE

    • Nothing is retained or migrated into Central.
  • Mesh

    • The existing mesh profile will be retained on the AP for the purpose of allowing reconnection to a mesh network.

Configuration required to maintain or restore connectivity

In some situations the upgrade will cause the AP to operate in a partially functional state, with the local configuration continuing to provide connectivity but the pushed Central configuration causing connectivity interruptions. Automatic restoration of the last configuration state should return the AP to a functional state until the configuration is synchronized once again from Central, resulting in the AP state appearing to flap.

  • Static IP Address

    • Microbranch: only DHCP is currently supported for IP assignment. Design the deployment around this requirement.
  • Native and Management VLAN

    • Configure the AP and/or network so that an untagged VLAN is provided to the AP for management.

    • Deployments that require bridging of VLANs through the AP must postpone upgrading to AOS 10 or account for the potential VID mismatch between AP and switched networks.

  • LACP

    • The default LACP state on the AP in AOS 10 is enabled as “Passive”. Best practice is to use this option and configure “Active” only on the switch ports; a switch-side sketch follows this list.
  • AP1X

    • Before upgrading the AP and placing the device in a Central group, configure the AP settings within the target Central AP group with the appropriate AP1X settings to maintain operation.
  • HTTP Proxy

    • Before upgrading the AP and placing the device in a Central group, configure the AP settings within the target Central AP group with the appropriate HTTP Proxy settings to maintain operation.
  • PPPoE

    • The AP must be connected to a network that provides a DHCP IP assignment and can be configured with PPPoE settings for the end installation location.
  • Mesh

    • Before upgrading the AP and placing the device in a Central group, configure the AP settings within the target Central AP group with the appropriate mesh settings to maintain operation.
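
The following AOS-CX sketch shows the switch-side LACP configuration referenced in the LACP item above. The interface numbers and VLAN handling are illustrative and should be adapted to the deployment:

interface lag 1
    no shutdown
    no routing
    vlan trunk native 1
    vlan trunk allowed all
    lacp mode active
interface 1/1/5
    no shutdown
    lag 1
interface 1/1/6
    no shutdown
    lag 1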

3.2.2 - Migrating Aruba Central Managed Instant APs to AOS 10

The easiest opportunity for upgrading to AOS 10: IAPs managed by Central are just a few clicks away from completion.

Upgrade process

  1. Go to Global > Organization > Groups. On the Network Structure tab, find the group in which the Instant AOS 8 APs are present.

  2. Select the Virtual Conductor and/or APs to be migrated by clicking the entry in the list, then click the Move Devices button to choose a target AOS 10-enabled group as the destination.


  3. After this action, the following AP boot process starts:

    • Aruba Central upgrades APs to AOS 10.

    • APs boot up in AOS 10 mode and reconnect with Aruba Central using AOS 10 firmware.

    • Aruba Central performs a configuration audit and pushes the AOS 10 config from the destination group.

    • Aruba Central cleans up the old, swarm-related state.

Revert to Aruba Instant 8

If needed, convert the APs back to Aruba Instant 8:

  1. Make sure a firmware compliance policy matching the current IAP version is configured on the original group.

  2. Move the APs back to the original group.

  3. After the APs are moved, Aruba Central pushes the specified image to the APs. The APs reboot and resume operation as Aruba Instant 8 APs.

3.2.3 - Migrating Locally Managed Instant APs to AOS 10

Locally managed IAPs are little more than a Central subscription away from becoming managed AOS 10 APs.

One of the first requirements for migrating to AOS 10 is to prepare the AP for management by adding it to GLCP and Central. Once that has been done, even a locally managed IAP is just another Central managed IAP.

Upgrade process

  1. Go to Global > Organization > Groups. On the Network Structure tab, expand the Unprovisioned devices group.

  2. Select the Cluster and/or APs to be migrated by clicking the entry in the list, then click the Move Devices button to choose a target AOS 10-enabled group as the destination.


  3. After this action, the following AP boot process starts:

    • Aruba Central upgrades the APs to AOS 10.

    • APs boot up in AOS 10 mode and reconnect with Aruba Central using AOS 10 firmware.

    • Aruba Central performs a configuration audit and pushes the AOS 10 config from the destination group.

    • Aruba Central cleans up the old, swarm-related state.

Alternate methods

  1. IAPs can be directly upgraded to an AOS 10 image using one of the traditional upgrade methods through the WebUI or CLI. Add the APs to Aruba Central. After they reboot, the APs will operate as AOS 10 APs managed by Aruba Central. This method can be used if the current running version on the IAPs is 8.5 or earlier.

  2. With assistance from TAC, using Activate:

    • The AOS 10 image can be pushed to the APs.
    • The APs are added to Aruba Central.

Revert to Aruba Instant 8 and local control

If needed, convert the APs back to Aruba Instant 8 and use local control again:

  1. Create a new AP group for Aruba Instant 8 and set a firmware compliance policy to the desired version of software.

  2. Move the APs to the newly created ArubaOS 8 group.

  3. After the APs are moved, Aruba Central pushes the specified image to the APs. The APs reboot and resume operation as Aruba Instant 8 APs.

  4. After the APs are converted successfully back to AOS 8, remove the Central application assignment using the HPE GreenLake console to remove the APs from Central management.

  5. Reboot the IAP cluster to immediately revert to local management.

3.2.4 - Migrating AirWave Managed Instant APs to AOS 10

AOS 10 and Central make management even easier than the legacy on-premises management of Instant APs by AirWave.

Upgrade process

  1. Remove AirWave discovery options or configuration:

    • If the APs get their AirWave information from DNS discovery or DHCP options, remove the relevant pieces from the DNS zone or DHCP scope configuration.

    • If the APs get their AirWave information from Activate, remove the provisioning rule from Activate.

    • If the APs are manually configured to report to AirWave, no additional actions are required.

  2. Skip this step if the APs are in Managed mode with AirWave. For Monitor Only mode, make sure that AirWave has been configured to allow firmware upgrades. In AirWave, go to AMP Setup > General > Firmware Upgrade/Reboot Options and set the ‘Allow firmware upgrades in monitor-only mode’ option to ‘Yes’.

  3. Upload all the necessary AOS 10 images to support the APs to be migrated by following the directions in the AirWave User Guide subject “Uploading Files and Firmware”. Be sure to provide images for each class of Instant AP.

  4. In AirWave, go to Groups > List and click the group containing the APs to be migrated.

  5. Assign the desired AOS 10 firmware version to be used for the upgrade by choosing the version from the “Aruba Instant Virtual Controller” dropdown in the Group’s Firmware menu. Select the Enforce Group Firmware Version to have AirWave immediately begin an upgrade.

  6. Click the Save and Upgrade Devices button.

  7. The firmware upgrade process follows these steps:

    • AirWave upgrades the APs to AOS 10.

    • The APs boot to AOS 10, receive the provisioning rule to connect to Aruba Central from Activate, and then connect to Aruba Central, where the APs are placed in the assigned group.

    • Aruba Central performs a config audit and pushes the group configuration.

    • Aruba Central cleans up the old swarm-related state.

  8. Migrated Aruba Instant APs must be removed from AirWave manually.

Alternate method

The APs also can be upgraded locally:

  1. Remove configuration pointing the APs at AirWave:

    • Server configuration: Remove all the AirWave configuration from DNS, DHCP, and/or Activate servers.

    • Local configuration: Remove the AirWave configuration from the System tab on the IAP.

  2. Remove the APs from AirWave by deleting them. The Instant APs revert to local management after AirWave has ceased management.

  3. Follow the steps in the “Locally Managed Instant APs” section.

Revert to Aruba Instant 8 and AirWave management

If needed, convert the APs back to Aruba Instant 8 and use AirWave management again:

  1. Create a new AP group for ArubaOS 8 and set a firmware compliance policy to the desired version of software.

  2. Move the APs to the newly created ArubaOS 8 group.

  3. After the APs are moved, Aruba Central pushes the specified image to the APs. The APs reboot and resume operation as Aruba Instant 8 APs.

  4. After the APs are successfully converted back to AOS 8, remove the Central subscription using the HPE GreenLake console to remove the APs from Central management.

  5. Reboot the IAP cluster to revert to local management.

  6. Enroll the IAP cluster to AirWave using the original method implemented.

3.2.5 - Migrating Controller Managed APs to AOS 10

Controller managed APs can be migrated directly to AOS 10 and immediately start being managed by Central.

Convert controller managed APs by using the existing infrastructure or by using a separate controller set up for the purpose. If the existing infrastructure is not running a correct software version to support the conversion, the APs can be re-provisioned statically to a separate, temporary controller that has been loaded with one of the required software versions.

When using a cluster and planning which APs are to be converted, consider which controller the APs are connected to. The AP conversion process can be started only from the currently active controller for the AP. If a large cluster is in production, consider using a separate, non-clustered controller for AP staging and upgrading.

The conversion from AOS 8 to AOS 10 uses the “ap convert” command.

ap convert
  active {all-aps|specific-aps}
  add {ap-group|ap-name}
  cancel
  clear-all
  delete {ap-group|ap-name}
  pre-validate {all-aps|specific-aps}
Parameter Description
active {all-aps|specific-aps} Convert active Campus AP or Remote AP to Instant APs managed by Aruba Central.
add {ap-group|ap-name} Add AP group or AP name to the list for AP conversion.
cancel Cancel conversion. Any APs currently downloading the new image will continue.
clear-all Remove all AP groups and AP names from the list for conversion.
delete {ap-group|ap-name} Delete AP group or AP name from list for conversion.
pre-validate {all-aps|specific-aps} Pre-validate the Campus AP or Remote AP to Aruba Activate or Central connection.

Upgrade process

Using the CLI on the controller:

  1. Add APs to the conversion list by specifying the AP Group using the command:

    (host) [mynode] #ap convert add ap-group <group-name>

    or add individual APs into the conversion list using the command:

    (host) [mynode] #ap convert add ap-name <ap-name>

  2. Pre-validate the APs added to the conversion list:

    (host) [mynode] #ap convert pre-validate specific-aps

    This command checks connectivity to Central before attempting to push the firmware conversion. It prompts the APs to check NTP, Activate, and Central URL reachability, and verifies that the APs are licensed on Central with a group assigned. This can be useful in narrowing down issues that might occur during the conversion process.

  3. Verify the conversion pre-validation status:

    (host) [mynode] #show ap convert-status

    If the pre-validation is successful, the ‘Upgrade Status’ should read ‘Pre Validate Success’, and the assigned ‘Aruba Central Group’ is displayed.

  4. The pre-validation job continues until canceled. After all APs have been tested, issue the command to cancel the job and move forward:

    (host) [mynode] #ap convert cancel

  5. Download the AOS 10 firmware files for the AP models to be upgraded. Download image files from the HPE Networking Support Portal or with the assistance of an HPE Aruba Networking account team.

  6. Firmware images can be specified individually by AP type, or a TAR file containing all the images can be created to simplify file handling.

  7. Firmware images are delivered to the APs by a file server or directly from the controller (a CLI copy sketch follows these options):

    • If using the ‘server’ option, use an ftp, tftp, http, https, or scp server and copy the AOS 10 image files to the server.

    • If using the ‘local-flash’ option, the AOS 10 firmware image file(s) must be copied to the controller’s flash using the ‘copy’ command on the CLI or under Controller > Diagnostics > Technical Support > Copy Files on the GUI.

    • In addition to the local options, the firmware can be pulled from the web using the Central software repository at common.cloud.hpe.com/ccssvc/ccs-system-firmware-registry/IAP
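
    For the ‘local-flash’ option above, the following is a minimal sketch of copying an AOS 10 image to the controller flash; the FTP host, username, and filename are placeholders:

    (host) [mynode] #copy ftp: <ftp-host> <username> <aos10-image-file> flash: <aos10-image-file>
    (host) [mynode] #dir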

  8. Execute the command to trigger the conversion using ftp, tftp, http, https, or scp as one of the available options for the server.

    Examples:

    (host) [mynode] #ap convert active specific-aps server scp username <username> <hostname> <image filename>

    or

    (host) [mynode] #ap convert active specific-aps server http common.cloud.hpe.com path ccssvc/ccs-system-firmware-registry/IAP <image filename>

    or

    (host) [mynode] #ap convert active specific-aps local-flash <image filename>

    A third option, ‘activate’, is available but not currently recommended for use.

  9. Verify the conversion status using:

    (host) [mynode] #show ap convert-status

  10. The firmware upgrade process follows these steps:

    • The controller upgrades the APs to AOS 10.

    • The APs boot to AOS 10, receive the provisioning rule to connect to Central from Activate, and connect to Central, which places the APs in the assigned group.

    • Central performs a config audit and pushes the group configuration.

    • Central cleans up any old configuration on the APs except for the uplink parameters.

Troubleshooting

AP groups or individual APs can be removed from the conversion list using the relevant command below:

(host) [mynode] #ap convert delete ap-group <ap-group>

(host) [mynode] #ap convert delete ap-name <ap-name>

To clear all APs from the conversion list, execute the following command:

(host) [mynode] #ap convert clear-all

To abort the conversion of APs or stop the pre-validation tasks:

(host) [mynode] #ap convert cancel

Revert to AOS 8 firmware version

If needed, convert the APs back to AOS 8:

  1. Make sure that the target controller is licensed and has the capacity to add the APs.

  2. Make sure no conversion tasks are pending on the controller from previous conversions; otherwise APs will try to convert back to AOS 10.

    (host) [mynode] #show ap convert-status

    If the current status is ‘Active,’ the running job must be canceled:

    (host) [mynode] #ap convert cancel

  3. Open the AP’s console using serial, SSH, or Central Remote Console.

  4. Issue the command to the AP for conversion back to controller-based:

    convert-aos-ap cap <controller-address>

    • The AP will download the AOS 8 image from the controller and reboot.

    • The AP will sync its configuration based on provisioning status.

  5. After the APs are successfully converted back to AOS 8, remove the Central subscription using the HPE GreenLake console to remove the APs from Central management.

3.3 - Upgrading Mobility Controllers to Gateways

Mobility Controllers running AOS 8 must be upgraded to AOS 10 and added to HPE Aruba Networking Central in order to continue supporting a tunneled data path for access points running AOS 10.

In AOS 8, the Mobility Controller provides many services that have been moved to Central in AOS 10. To reflect the change in functionality and to better describe the role of the appliance, the name has been changed to Gateway. The two names often are used interchangeably; however, for AOS 10, the correct terminology is Gateway.

Zero Touch Provisioning

Zero Touch Provisioning (ZTP) is a fast, convenient way to onboard a new or existing Gateway into Central without requiring configuration from the installer. Successful ZTP requires the Gateway to be connected to a switchport configured with an untagged VLAN that provides DHCP addressing and Internet access. Any port on a Gateway except GE 0/0/1 can be used for ZTP.

Depending on how the Gateway is deployed or connected to the LAN, ZTP can be performed over a WAN port, uplink port, or a dedicated staging port. A dedicated staging port can be used to onboard a Gateway in situations where the management VLAN will be 802.1Q tagged on the Gateway uplink, the Gateway is connected using LACP trunks, or DHCP services are not available. After a Gateway is provisioned and configured by Central, the Gateway can be configured to use a desired uplink port(s).
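
As an example of the switchport requirement described above, the following AOS-CX sketch configures an untagged access port for the Gateway staging or uplink connection. The interface and VLAN are placeholders, and the VLAN must provide DHCP addressing and Internet access:

vlan 20
interface 1/1/24
    no shutdown
    no routing
    vlan access 20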

Provision the Gateway

  1. To monitor the ZTP process, connect to the serial console port on the Gateway and power on the Gateway. After booting, the initial provisioning screen is presented:

  2. Connect a ZTP-capable port on the Gateway to a switchport configured with an untagged (access) VLAN that provides DHCP and Internet access. All ports on the Gateway support ZTP except GE 0/0/1.

  3. After receiving a DHCP response, the Gateway resolves the Activate FQDN and communicates with Activate for provisioning:

    • If the Gateway is new and has not been previously provisioned, Activate will push a Central-enabled firmware upgrade and reboot the Gateway.

    • Activate provisions the Gateway with the FQDN for the assigned Central instance.

  4. After booting to a Central enabled firmware and being provisioned with the FQDN, the Gateway can communicate with Central.

  5. The firmware version defined in firmware compliance for the Group is enforced and an upgrade is pushed if necessary. After the upgrade is complete, the Gateway reboots.

  6. The Gateway initializes using the specified AOS 10 version and contacts Central for configuration based on the assigned device configuration in Central.

  7. After the configuration is applied successfully, the Gateway is up and operational in Central using the staging port or the configured uplink port(s).

Static Activate

Static Activate is a one touch provisioning (OTP) option used to provision a Gateway that requires static addressing or PPPoE authentication. The OTP process requires the installer to use a serial console port or web browser to supply the minimum information to the Gateway to permit initial communication with Activate and Central. The use of a web browser requires a computer to be connected to the Gateway on the GE 0/0/1 Ethernet port, which provides a DHCP address for local access.

The available configuration options vary by release when using OTP. A new Gateway shipped from the factory currently is loaded with a version of AOS 8 that permits provisioning over PPPoE WAN links or an untagged VLAN but does not support provisioning a new Gateway over an 802.1Q tagged VLAN or an LACP trunk. Gateways already upgraded to AOS 10 support provisioning using 802.1Q tagged VLANs and/or LACP trunks.

Serial Console

  1. Connect to the serial console port on the Gateway and power on the Gateway. After booting, the initial provisioning screen displays:

  2. Type “static-activate,” then press ENTER to start the process. Choose the options appropriate for the required uplink type (“static” or “pppoe”), then provide the required information. The example shows a statically configured IP address:

  3. After initial provisioning is complete, the Gateway resolves the Activate FQDN and communicates with Activate for further provisioning:

    • If the Gateway is new and has not been provisioned, Activate pushes a Central-enabled firmware upgrade and reboots the Gateway.

    • Activate provisions the Gateway with the FQDN for the assigned Central instance.

  4. After booting to a Central-enabled firmware and provisioning with the FQDN, the Gateway can communicate with Central.

  5. The firmware version defined in firmware compliance for the Group is enforced. An upgrade is pushed if necessary. After the upgrade is complete, the Gateway reboots.

  6. The Gateway initializes using the specified AOS 10 version, then contacts Central for configuration based on assigned Central device configuration.

  7. After configuration, the Gateway is up and operational in Central.

Web-UI

  1. Connect a computer to the GE 0/0/1 Ethernet port on the Gateway. An IP address will be offered by DHCP in the 172.16.0.0/24 network. Open a web browser and navigate to https://172.16.0.254, proceeding past the warning for the invalid SSL certificate:

  2. Select By connecting to activate/central, then click Next:

  3. Select Static IP Address or PPPoE as the connection method. Enter the required information. The example below provisions a Gateway to use the GE 0/0/0 port and a static IP address; a Gateway running AOS 10 has additional options for a trunk port and port-channel:

  4. Verify that the information is correct then click Deploy and Reboot.

  5. After the initial provisioning is complete, the Gateway resolves the Activate FQDN and communicates with Activate for further provisioning:

    • If the Gateway is new and has not been previously provisioned, Activate pushes a Central-enabled firmware upgrade and reboots the Gateway.

    • Activate provisions the Gateway with the FQDN for the assigned Central instance.

  6. After booting to a Central-enabled firmware and provisioning with the FQDN, the Gateway can communicate with Central.

  7. The firmware version defined in firmware compliance for the Group is enforced and an upgrade pushed if necessary. After the upgrade is complete, the Gateway reboots.

  8. The Gateway initializes using the specified AOS 10 version and contacts Central for configuration based on the assigned Central device configuration.

  9. After the configuration is applied successfully, the Gateway is up and operational in Central.

3.4 - FAQ - Migrating to AOS 10

Some frequently asked questions about the migration to AOS 10.
Is my AP supported on AOS 10?

Find your model in the list: Supported Devices for AOS 10

Can I convert my existing Central group to AOS 10?

No. The configuration constructs behind the scenes do not allow for a direct conversion from a group supporting IAP to a group supporting AOS 10.

Can I duplicate my AOS 10 group?

You can clone the existing group as long as no orchestrated configuration, i.e., tunneled or mixed WLANs, has been set up.

Can my AP be downgraded back to AOS 8 or IAP 8 after converting to AOS 10?

Yes. Steps are mentioned at the end of each section above.

What happens if the AP is configured with an uplink VLAN prior to upgrade?

In Aruba Instant 8, the AP automatically uses the uplink native VLAN as the management VLAN, but AOS 10 does not. AOS 10 continues to expect management to occur on VLAN 1 and will tag the management traffic on the uplink. Make sure to clear the uplink native VLAN configuration before installing AOS 10 to prevent the AP from losing communication.

4 - Services

Many of the services that provide normal operations for AOS-10 are running within HPE Aruba Networking Central and their operation is not necessarily apparent. This section describes those services and how they work.

4.1 - AirGroup

AirGroup provides advanced functionality for multicast DNS and SSDP based network devices.

In today’s interconnected and mobile-centric world, seamless communication and interaction between devices are essential. HPE Aruba Networking’s AirGroup service emerges as a powerful solution designed to bridge the gap between diverse devices and the services they offer, enhancing the user experience within enterprise networks. At its core, it leverages zero-configuration networking to facilitate the discovery and utilization of multicast DNS (mDNS) and Simple Service Discovery Protocol (SSDP) services. These services encompass a wide range of functionalities, including Apple® AirPrint, AirPlay, Google Cast streaming, and Amazon Fire TV integration, all of which are integral to the modern digital workplace.

AirGroup simplifies the management of diverse devices, each with its own set of services, enabling users to seamlessly access these services from their mobile devices, laptops, and more, all within an enterprise network environment. Whether it’s sharing a presentation via AirPlay, streaming content through Google Cast, or enjoying Amazon Fire TV services, AirGroup empowers users to be more productive and efficient.

This service is not limited to wireless connections; it seamlessly integrates wired and wireless devices, making it a comprehensive solution for modern network environments. Whether you’re in a bustling corporate office, a dynamic educational institution, or any enterprise setting, AirGroup enhances the functionality of your network by enabling devices to communicate effortlessly.

Key Features

  • Service Discovery: Aruba AirGroup simplifies the process of discovering and accessing services and resources available on the network across layer 2 domains. It enables devices to automatically detect and connect to services such as printers, file servers, media devices, and other resources without the need for complex configurations.

  • Device Isolation and Security: AirGroup ensures that devices can only discover and communicate with other devices within the same security and policy domain. This isolation prevents unauthorized access to sensitive information and enhances network security and privacy.

  • User Role and VLAN Based Access Control: Administrators can implement user role and VLAN based access control policies using AirGroup. This allows them to define specific rules and permissions for different user groups in different VLANs, ensuring that users have appropriate access to services and resources based on their roles and VLANs.

  • Enhanced User Experience: By streamlining service discovery and enabling seamless communication between devices, Aruba AirGroup improves the overall user experience. Users can effortlessly access shared resources and collaborate effectively, boosting productivity and satisfaction.

  • Easy Configuration and Management: AirGroup is easy to configure and manage through Aruba Central management platform. Administrators can use a user-friendly interface to set up and monitor service discovery and communication settings efficiently.

4.1.1 - Architecture of the AirGroup Technology

Deep dive into AirGroup architecture for AOS 10 operations

In transitioning to AOS 10, the AirGroup service has undergone a significant architectural overhaul to meet the dynamic needs of modern enterprise networks. In AOS 8, its centralized model struggled to cope with the growing number of mDNS/SSDP devices. Recognizing the need for change, Aruba reengineered AirGroup, shifting away from the single-central-system approach. In this new design, the AirGroup server cache is distributed to every AP in the network. This shift empowers AirGroup to efficiently handle the increasing device population and their evolving behaviors while achieving exceptional performance and scalability.

In this revamped architecture, AirGroup operates as a distributed model, dividing functionality between APs and the AirGroup Service in Aruba Central. This innovative approach ensures AirGroup remains efficient and adaptable to meet the evolving demands of modern enterprise networks.

The New AOS 10 Architecture

  • Diverse Service Advertisement Frequencies - Various mDNS/SSDP devices relay service advertisement frames at intervals ranging from 5 seconds to 2 minutes or more. To ensure that a single or a subset of servers with aggressive advertisement tactics do not monopolize the system, the new AirGroup architecture can manage service advertisement frequency effectively.

  • Variable Client Query Frequencies - Applications like YouTube and Netflix constantly scan for new servers through mDNS/SSDP query frames, especially during video streaming. As queries generally outnumber service advertisements (20% advertisements vs. 80% queries), prompt query response times are essential without causing delays in processing service advertisements.

  • Proliferation of Unsupported Service Advertisements - The Bonjour protocol permits new applications to define and advertise services. As the usage of Wi-Fi BYOD devices continues to grow, the quantity of services being advertised and queried naturally fluctuates. Within this fresh design, each AP is equipped to intelligently filter and drop unsupported service advertisement and query packets originating from devices. This approach ensures that the responsiveness of AirGroup services remains undisturbed, even in the face of numerous unsupported services being advertised and queried by devices.

  • Ease of Serviceability - The new AirGroup architecture prioritizes ease of serviceability, including the configuration experience and the provision of MRT Dashboards, APIs, alerts, and visibility into the state of APs.

  • Horizontally Scalable Architecture - Given the inherent nature of cloud-based deployments, the AirGroup service scales horizontally simply by adding pods.

AirGroup Operations in AOS 10

AirGroup on each AP in AOS 10 serves as a protocol-aware proxy cache for service discovery protocols such as mDNS and SSDP. It intercepts and decodes these protocol packets from the L2 header, storing essential information in a cache.

In the AirGroup architecture of AOS 10, two components collaborate to support mDNS/SSDP functionality:

  • AirGroup Service in Aruba Central

  • AP mDNS module in every AP

Devices capable of mDNS/SSDP periodically broadcast their capabilities in the network, referred to as AirGroup servers. Devices searching for these services or capabilities are known as AirGroup users, leading to two distinct packet flows within the AirGroup application:

  • Query packet flow

  • Advertisement packet flow

Message flow of AirGroup within AOS 10

When an AP boots up, its mDNS process receives the AirGroup configuration, which originates from Aruba Central. This configuration encompasses AirGroup’s enable/disable status, service enable/disable status, disallow-role/vlan per service, and allowed role/vlan per service.

Each AP maintains two types of AirGroup server caches:

  • Discover Cache: This cache stores AirGroup servers directly connected to the AP. It facilitates sending delta discover cache updates to AirGroup service in Central and ensures cache coherency during AP-Central connection downtime.

  • Central Cache: The AirGroup service in Central processes cache updates from each AP, applies policies, and sends cache sync messages to the AP and its neighboring APs. The mDNS process on the AP uses these updates to construct the Central cache database. This cache contains only the synchronized cache from the AirGroup service. All AirGroup client queries are answered from this cache after applying configuration policies and per-server global policies.

The mDNS process in the AP primarily handles three processes:

  • Advertisement process

  • Query process

  • Cache synchronization with AirGroup service in Central

The Advertisement process manages new AirGroup server advertisements or updates (cache add/update), server disconnects (cache delete), and the dropping of suppressed services. Each AP’s mDNS process actively listens on a RAW socket, capturing mDNS/SSDP server advertisement packets. These packets are evaluated against the configured policies, including AirGroup service enabling/disabling, disallowed/allowed client roles, and disallowed/allowed VLANs. After policy assessment, the server entry is either updated or added to the AP Discover Cache table.

For new servers, all cache records from the packet are sent to the AirGroup Service in Central. For updates, only the delta updates are relayed. If advertised services are unsupported or disallowed in specific VLANs or client roles, the AP takes direct action by dropping the packet. Simultaneously, information about the suppressed service is forwarded to the AirGroup service in Central to maintain visibility.
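The following minimal Python sketch illustrates the general shape of this advertisement-processing logic under simplified, assumed policies. The function and variable names are illustrative only, not the actual AP implementation: advertisements are checked against allowed services and VLANs, the Discover cache is updated, only deltas are relayed to Central, and suppressed services are dropped but reported.

# Illustrative sketch only; names and data structures are hypothetical.
discover_cache = {}                                   # server MAC -> advertised service set
ALLOWED_SERVICES = {"_airplay._tcp", "_ipp._tcp"}     # assumed enabled services
ALLOWED_VLANS = {100, 200}                            # assumed allowed VLANs

def process_advertisement(server_mac, vlan, services, notify_central, report_suppressed):
    """Apply policies, update the Discover cache, and relay only deltas to Central."""
    allowed = {s for s in services if s in ALLOWED_SERVICES}
    suppressed = set(services) - allowed
    if suppressed:
        report_suppressed(server_mac, suppressed)     # dropped locally, reported for visibility
    if vlan not in ALLOWED_VLANS or not allowed:
        return                                        # nothing to cache or relay
    is_new = server_mac not in discover_cache
    known = discover_cache.setdefault(server_mac, set())
    delta = allowed - known
    known.update(allowed)
    if is_new:
        notify_central(server_mac, known)             # full record set for a new server
    elif delta:
        notify_central(server_mac, delta)             # only delta updates afterwards

process_advertisement("aa:bb:cc:dd:ee:ff", 100, {"_airplay._tcp", "_raop._tcp"},
                      notify_central=lambda mac, recs: print("to Central:", mac, recs),
                      report_suppressed=lambda mac, recs: print("suppressed:", mac, recs))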

The subsequent diagrams illustrate specific workflows for cache entry creation and updates, cache entry deletion, and the handling of unsupported or disallowed service packets.

Advertisement processing on AP – cache addition and update

Advertisement processing on AP – cache deletion

Advertisement processing on AP – updates of dropped packets for disallowed or suppressed services

Query Process

During the query process of APs, when a device seeks mDNS/SSDP services offered by another device, the mDNS process on the AP accesses its Central cache. This cache is built and continuously updated with AirGroup server/service records synchronized from the AirGroup service in Central.

For each incoming query, the mDNS process applies a set of policies, including service configurations (enabling/disabling), disallowed/allowed VLANs, and/or disallowed/allowed client roles. After this filtering process, the mDNS process consults the Central cache for records corresponding to the requested service IDs. If cached records are found, it assembles response packets with these records as the payload and subsequently dispatches them as unicast packets to the querying client.

In cases where all records cannot fit within a single packet, they are transmitted in successive packets. It’s important to note that there is a predefined hard limit, currently set at 150, for the maximum number of records that can be sent in response to any query. Future iterations may allow for configurability in this regard.
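A minimal sketch of this query handling is shown below. The 150-record hard limit is the only figure taken from the text above; the per-packet capacity and all names are assumptions for illustration.

# Illustrative sketch only; the 150-record limit comes from the text above.
MAX_RECORDS_PER_QUERY = 150
RECORDS_PER_PACKET = 25        # assumed per-packet capacity, for illustration only

def answer_query(central_cache, service_ids, client_vlan, allowed_vlans):
    """Return the response packets (lists of records) for one client query."""
    if client_vlan not in allowed_vlans:
        return []                                     # policy check failed: no response
    records = [r for sid in service_ids for r in central_cache.get(sid, [])]
    records = records[:MAX_RECORDS_PER_QUERY]         # enforce the hard limit
    return [records[i:i + RECORDS_PER_PACKET]         # split across successive packets
            for i in range(0, len(records), RECORDS_PER_PACKET)]

cache = {"_airplay._tcp": [f"appletv-{n}.local" for n in range(40)]}
packets = answer_query(cache, ["_airplay._tcp"], client_vlan=100, allowed_vlans={100, 200})
print(len(packets), "packets,", sum(len(p) for p in packets), "records")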

Query processing on AP

Cache Synchronization

The AirGroup service in Central plays a pivotal role in cache synchronization:

  • It processes the Discover cache updates received from each AP. Following the application of relevant policies, the AirGroup service dispatches cache synchronization messages to both the respective AP and all neighboring APs.

  • The mDNS process on the APs then processes these cache synchronization updates to construct the Central Cache database. This database exclusively comprises cache entries that have been synchronized from the AirGroup service in Central. All client queries draw from this cache, with all configured policies and per-server global policies applied.

  • In scenarios involving configuration changes or during roaming events, the AirGroup Service sends synchronization updates to all neighboring APs, ensuring the Central cache remains current and up to date.

To maintain cache coherency and consistency:

  • Each AP calculates the crc64 checksum for all Service Identifier counter IDs in both the Discover and Central cache databases. This checksum, along with the Discover cache checksum and Central cache checksum, is included in the periodic checksum messages transmitted to the AirGroup service in Central.

  • The checksum is routinely updated with each central cache synchronization message received, reinforcing the cache’s integrity. This process is especially valuable in scenarios involving connectivity disruptions and aids in recovering from cache losses during connection down/up events.

Ultimately, this design approach solidifies the Central Cache as the definitive and authoritative source of information, ensuring a robust and reliable service discovery environment.
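A small sketch of the checksum idea follows, assuming a simple per-service-identifier counter layout. The real implementation uses a crc64 checksum; zlib.crc32 stands in here purely to keep the example short, and the structures are illustrative.

# Illustrative sketch of checksum-based cache coherency (hypothetical structure).
import zlib

def cache_checksum(cache):
    """Checksum over all service-identifier counters, independent of entry order."""
    payload = "".join(sorted(f"{sid}:{count}" for sid, count in cache.items()))
    return zlib.crc32(payload.encode())

ap_discover_cache = {"_airplay._tcp": 3, "_ipp._tcp": 1}
central_view      = {"_airplay._tcp": 3, "_ipp._tcp": 1}

# The AP periodically reports its checksums; Central compares them with its own
# view and triggers a resynchronization if they diverge (e.g. after an outage).
if cache_checksum(ap_discover_cache) != cache_checksum(central_view):
    print("checksum mismatch: resynchronize caches")
else:
    print("caches are coherent")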

AP cache synchronization with AirGroup service in Central

4.1.2 - Configuration of AirGroup for an AOS 10 Environment

Configuration elements for the AirGroup service including user and server policies, wired and wireless devices, custom services and license requirements.

AirGroup configuration policies are a pivotal component in managing and controlling service discovery within the network. These policies provide administrators with the flexibility to define how AirGroup functions and ensure that it aligns with the specific requirements and security standards of the organization.

Here are some key aspects of AirGroup configuration policies:

Enabling AirGroup Services

AirGroup services can be managed at both the Central Global level and the AP group level. Group-level configuration takes precedence over the Global-level settings.

Administrators have the flexibility to selectively enable or disable specific services. This capability empowers organizations to customize their network environment, accommodating essential services while effectively managing and mitigating potential security risks or unnecessary services.

For instance, suppose you have 7 predefined AirGroup services enabled globally. However, in a specific AP group, you only wish to enable AirPrint, AirPlay, and GoogleCast while disabling the other four services. In this scenario, you can accomplish this by disabling the remaining four services at the AP group level, allowing for precise control over service availability within that specific group.

The following two screen captures demonstrate how to enable the AirGroup service at the Global level and how to subsequently disable the DLNA media service at the AP group level, effectively superseding the Global level configuration.

Enable AirGroup service at Global level

Disable DLNA Media service at ‘AOS 10 AP’ group

User Role and VLAN-Based Policies

AirGroup configuration policies offer granular control by allowing the application of policies based on user roles and VLAN assignments. This precise control mechanism ensures that specific services are exclusively accessible to authorized users or devices within designated network segments. This approach not only bolsters network security but also facilitates the isolation of services as necessary. Both role and VLAN-based policies provide the option to either “allow service” or “deny service,” granting administrators flexibility in defining access rules.

AirPlay policy restricted to Employee user role on VLAN 100 and 200

Wired and Wireless Servers

In a wireless network, a wireless AirGroup server is automatically associated with the AP it connects to, becoming visible and accessible to clients within the one-hop RF neighborhood of the server’s AP, provided the AirGroup policies allow it.

However, for wired AirGroup servers, automatic positioning in relation to AP locations does not occur. To enable wired AirGroup servers to be shared with wireless clients in AOS 10, global server policies must be configured.

Within a server policy, specific to a server’s MAC address, administrators can stipulate which user roles are permitted or prohibited. Additionally, administrators need to define a list of APs to which the wired AirGroup server will be visible. As a result, all clients connected to those APs will gain visibility and access to the server. This configuration ensures seamless accessibility to wired AirGroup servers for wireless clients within the network. In the current release, a maximum of 50 APs can be included in the visibility list.

Global server policies also serve another crucial purpose – ensuring the visibility of wireless AirGroup servers when a specific server is located beyond the one-hop RF neighborhood. This situation may arise when there’s a need to allow wireless clients to access a server that is not within the typical range of nearby APs the wireless clients connect to.

For instance, consider a library where there’s only one AirPlay printer available, but some APs are situated beyond the one-hop RF neighborhood of the printer. Consequently, clients connected to these remote APs cannot access the printer. In such scenarios, the solution is to establish a server policy for the printer at the Global level and include all relevant APs in the visibility list within the server policy.

By doing so, the configuration ensures that clients connected to any AP in the list, whether within the immediate RF neighborhood or beyond, can seamlessly access the printer. This flexibility in defining server visibility allows organizations to meet their specific connectivity requirements and provide a consistent user experience.

Global server policy
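As a rough illustration of the information a global server policy carries, the sketch below builds a hypothetical policy record keyed by the server’s MAC address and enforces the 50-AP visibility-list limit mentioned above. The field names are assumptions, not the Central data model.

# Illustrative sketch of a global AirGroup server policy (names are hypothetical).
MAX_VISIBILITY_APS = 50   # current per-policy limit on the AP visibility list

def make_server_policy(server_mac, allowed_roles, visible_aps):
    """Build a policy entry for a wired (or out-of-neighborhood) AirGroup server."""
    if len(visible_aps) > MAX_VISIBILITY_APS:
        raise ValueError(f"visibility list limited to {MAX_VISIBILITY_APS} APs")
    return {
        "server_mac": server_mac,
        "allowed_roles": set(allowed_roles),
        "visible_aps": set(visible_aps),   # clients on these APs can see the server
    }

policy = make_server_policy(
    "00:11:22:33:44:55",
    allowed_roles=["employee"],
    visible_aps=["AP-lib-01", "AP-lib-02", "AP-lib-03"],
)
print(policy)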

Leader AP for each wired AirGroup server

In AOS 10, the concept of a Leader AP is crucial for managing wired AirGroup servers. For a wired AirGroup server to be recognized and learned by the APs, the VLANs of the wired servers must be trunked to the switch ports connected to the APs. This ensures that all APs on the same VLAN can detect these wired servers. To avoid the inefficiency of having every AP on the same VLAN send redundant server updates to Central—which would generate excessive duplicate information and waste AP resources and WAN link bandwidth—AOS 10 introduces the Leader AP role for each wired AirGroup server on the same VLAN. Central selects a Leader AP, and only this Leader AP is responsible for sending any further updates about the server after it has been learned.

Each wired AirGroup server has its own Leader AP, and any AP can act as the Leader AP for up to 10 wired servers within the same VLAN. This distributes the Leader AP responsibilities and load across the APs on the VLAN. As described earlier, every AP maintains two cache tables for AirGroup servers: the Discover Cache, which stores all directly connected servers, and the Central Cache, which contains server entries distributed by Central; these entries are used by the AP to service mDNS/SSDP queries. The Leader AP for a wired AirGroup server caches this specific wired server in its Discover Cache table and sends updates for this server to Central. Central then distributes the server information to other APs in the RF neighborhood.
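The sketch below shows one plausible way to picture Leader AP assignment, spreading wired servers across the APs on a VLAN while respecting the 10-server-per-AP limit. Central’s actual selection algorithm is not documented here, so treat this as illustration only.

# Hypothetical Leader AP selection for wired AirGroup servers on one VLAN.
MAX_WIRED_SERVERS_PER_LEADER = 10

def assign_leaders(wired_servers, aps_on_vlan):
    """Pick one Leader AP per wired server, spreading load across APs on the VLAN."""
    load = {ap: 0 for ap in aps_on_vlan}
    leaders = {}
    for server in wired_servers:
        ap = min(load, key=load.get)                  # least-loaded AP first
        if load[ap] >= MAX_WIRED_SERVERS_PER_LEADER:
            raise RuntimeError("no AP has leader capacity left on this VLAN")
        leaders[server] = ap
        load[ap] += 1
    return leaders

print(assign_leaders(["printer-1", "appletv-1"], ["AP-1", "AP-2"]))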

Wired AirGroup server migration considerations from AOS 8 to AOS 10

In AOS 10, AirGroup operates solely on each AP and not on the gateways. To ensure that all wired AirGroup servers are recognized, the VLANs associated with these servers must be trunked to the switch ports connected to the APs. Therefore, when migrating from an AOS 8 AirGroup network, which is based on Mobility Conductor and Mobility Controller, to AOS 10, it is necessary to remove the wired AirGroup server VLANs from the switch ports connected to the gateways and add them to the switch ports connected to the APs. This allows the MDNS/SSDP packets from the wired servers to be detected by the APs, enabling them to learn these servers and make them visible to clients connected to neighboring APs.

Predefined Services

With an AP foundation license, 7 predefined services are available, including AirPlay, AirPrint, Googlecast, Amazon_TV, DIAL, DLNA Print, and DLNA Media. For these 7 predefined services, administrators have the option to disable or suppress specific service IDs that may pose a security risk. This proactive measure prevents these potentially risky services from being discovered or accessed within the network, bolstering security and reducing the attack surface.

Edit service ID of AirPlay

Disable service ID _raop._tcp of AirPlay

Custom Services

Aruba AirGroup encompasses 7 predefined services, including AirPlay, AirPrint, Googlecast, Amazon TV, DIAL, DLNA media and DLNA print. However, the custom service feature extends the flexibility of AirGroup by enabling customers to configure additional AirGroup services beyond the 7 predefined ones. This empowers organizations to tailor their service discovery environment to suit their specific needs and applications.

With custom service policies, customers can define and manage unique AirGroup services that are not part of the standard predefined set. This customization allows organizations to integrate specialized services, applications, or devices into their network while still benefiting from AirGroup’s service discovery and access control capabilities.

For example, a company may have proprietary in-house applications or devices that need to be discoverable and accessible by authorized users within their network. By utilizing the custom service feature, administrators can set up policies that govern the visibility and accessibility of these custom services based on user roles and VLAN assignments, while maintaining the security and control provided by AirGroup.

In essence, custom service policies within AirGroup empower organizations to expand and adapt their service discovery ecosystem beyond the predefined services, enhancing the network’s versatility and accommodating their specific requirements.

Custom services can be configured exclusively at the Global level, as shown in the following screen capture, which illustrates the manual addition of a custom service. Typically, a single AirGroup service may encompass multiple service IDs, and manually configuring these IDs can be a laborious and error-prone process. To streamline this procedure, the “List” window in the AirGroup section at the Global level offers a comprehensive list of over 140 suppressed services, covering nearly all mDNS/SSDP services available in the market.

Users can conveniently search for and highlight the specific service they wish to add. As a result, the service IDs associated with the selected service are automatically incorporated. When creating a custom service, users need only provide the service name and configure the user role/VLAN policies. The following screen capture serves as an illustrative example of how to add a custom service via the Suppress Service list within the “List” window. This feature simplifies the process and enhances the accuracy of custom service configuration within Aruba AirGroup.

Add a custom service via suppressed services list at Global level

Licensing Requirements

Access points have two options for licensing in Central: the AP Foundation license and the AP Advanced license.

In earlier versions of Central, the AP Foundation license only allowed the use of the seven predefined AirGroup services: AirPlay, AirPrint, Google Cast, Amazon TV, DIAL, DLNA Print, and DLNA Media. When originally deployed, the AP Advanced license was required for custom services but this is no longer the case. Now, the AP Foundation license supports both the seven predefined AirGroup services and custom services.

Monitoring

Aruba AirGroup offers comprehensive monitoring capabilities, enabling administrators to track various aspects of service discovery. This includes monitoring server availability for specific user roles or VLANs, as well as monitoring server and service entries, which provide information about associated VLANs, user roles, and usernames, among other details.


4.1.3 - Personal Device Visibility and Sharing

Description of the workflow, configuration of Personal wireless AirGroup servers and conversion process to public servers.

Aruba’s AirGroup personal device visibility and sharing feature, once activated in Central, leverages the capabilities of Aruba’s network infrastructure. This allows clients to share various wireless devices, including printers, smart TVs, IoT devices, and more. The streamlined sharing process enhances the client experience, simplifying wireless device discovery and access without the need for intricate setups or additional software. Clients can initiate sharing through the Aruba Cloud Guest Portal, adding further convenience to the process.

Personal devices are exclusively shared with wireless clients authenticated through the UPN (User Principal Name) format. In the current phase, only MPSK AES SSID device owners can share their devices, and the Aruba CloudAuth server serves as the supported authentication server for the MPSK SSID. Sharing a wireless personal device is possible with either MPSK AES or 802.1X authenticated clients, facilitated through the “Manage my devices” portal link hosted by Cloud Guest at the MPSK Wi-Fi password portal. However, this is contingent upon the availability of wireless sharing clients’ user entries in the identity repository utilized by Cloud Auth. For example, if an 802.1X client is authenticated by another RADIUS server, such as HPE Aruba Networking ClearPass, and the same client’s user entry is available in the identity repository used by the Cloud Auth server, then the wireless personal device owner can share with this client.

This feature introduces the concept of “Personal Servers or Devices” and “Public Servers or Devices”:

  • Personal Servers or Devices: Wireless devices associated with a username are classified as “Personal Devices” by default, with the option to manually change the classification to public when the Personal AirGroup feature is enabled.

  • Public Servers or Devices: Devices without a username, or associated with a username in the public server list, are automatically classified as “Public Devices” when the Personal AirGroup feature is enabled at the Global level. When the Personal AirGroup feature is disabled, all AirGroup servers are considered public servers.

Here’s how personal device visibility and sharing typically work in Aruba AirGroup:

  • Device Discovery and Announcement: AirGroup-enabled wireless devices use mDNS or SSDP to announce their presence on the network, providing information about the device and the services it offers.

  • User Identification and Access Control: AirGroup distinguishes wireless personal devices owned by individual users using the UPN format username. Personal devices are automatically accessible by the device owner with the same username or through sharing client lists configured in the Cloud Guest portal.

  • User-Centric Experience: With personal device visibility and sharing, wireless users can easily locate and interact with their own or other clients’ devices, as well as discover shared devices within their authorized scope. This simplifies tasks like printing, streaming, or accessing resources without the need to configure device-specific settings.

  • Security and Privacy: AirGroup ensures secure device sharing and respects user privacy. Administrators can define granular service policies, preventing unauthorized access. User authentication ensures that only sharing clients can share and access their devices.

  • Cross-VLAN Sharing: In segmented VLAN environments, AirGroup facilitates device sharing across different VLANs. This feature is useful when users in different departments or areas need to share resources while maintaining network segregation.

  • User Control and Management: Administrators can centrally manage sharing policies, configuring rules, permissions, and visibility settings based on organizational requirements using user roles, VLANs, and service IDs.

Personal device visibility and sharing in Aruba AirGroup contribute to a collaborative and efficient networking environment, empowering users to interact with both their personal devices and shared resources within the organization.

Workflow

The process of sharing personal devices is compatible with MPSK servers and MPSK/dot1x clients which are authenticated via CloudAuth in Central. Here’s a breakdown of the workflow:

  • The AirGroup server undergoes MPSK authentication with the CloudAuth server in steps 1 to 4. The server’s username is transmitted to the AP through the username Vendor-Specific Attribute (VSA) at step 3.

  • Subsequently, after receiving mDNS advertisement packets at step 5 from the AirGroup server connected directly to the AP, the AP establishes a Discover cache entry for the server at step 6. The Discover cache update is then forwarded to Central at step 7.

  • If the personal device visibility and sharing feature is active and the server’s email address is not in the list of public server usernames, the AirGroup service in Central fetches the sharing policy for this specific server from the server sharing policy database at step 8.

  • Any device owner can share their AirGroup server via the “Manage my devices” portal hosted by Cloud Guest. The portal page link is conveniently available at the bottom of the MPSK Wi-Fi password portal page, and access instructions are detailed in the following accompanying screen captures.

  • At step 9, a Central cache entry is generated for this server, contingent upon its compliance with the AirGroup policy.

  • The Central cache updates are disseminated to neighboring APs, specifically those within a one-hop distance from the AP to which the AirGroup server is connected.

  • Consequently, all Access Points within the RF neighborhood establish a Central cache for this specific server. This cache becomes instrumental in handling future mDNS queries.

Workflow for personal device visibility and sharing

It’s crucial to note that the sharing radius of the AirGroup server’s visibility is confined to a one-hop RF neighborhood. Effective interaction between the client and the AirGroup server is only achievable when both are within the proximity of a single-hop RF neighborhood.

Configuration

  • Enable personal device visibility and sharing at Global level.

  • Open the MPSK management window under Security -> Authentication & Policy at the Global level.

  • Copy the MPSK password portal page URL and distribute it to the personal device owners.

  • The personal device owners log into the MPSK password portal and click the “Manage my devices” button, which directs them to the personal device sharing portal page hosted by Cloud Guest.

MPSK clients Wi-Fi password portal and “Manage my device” page link

  • Within the personal device sharing configuration portal, AirGroup server owners can share their devices with other clients or remove sharing access, allowing each device to be shared with a maximum of 8 clients.

Personal device sharing portal

Converting Personal Wireless AirGroup Servers into Public Servers

When a wireless AirGroup server is associated with an AP and authenticated with a username, its initial device visibility type is always set to “Personal.” This is illustrated in the example of the server logged in as conf-room1@abc.com in the following screen capture. However, if there is a requirement to make this wireless AirGroup server a public server accessible to the broader RF neighborhood, you can follow these steps:

  • In the list window of AirGroup servers at the Global level, locate the server entry that you want to share.

  • Highlight the specific server entry.

  • Click the “+” sign.

  • This action will add the server’s username to the list of public server usernames, as shown in the example in the following screen capture.

  • As a result, the server’s visibility status will change from “Personal” to “Public.” It will now be visible to clients within the same RF neighborhood instead of only being visible to the same user.

By following these steps, you can effectively convert a wireless personal AirGroup server into a public server, expanding its accessibility to clients in the RF neighborhood.

Configuration of converting a wireless personal device into a public server

List of usernames associated with public server

4.1.4 - Survivability

Description of AirGroup’s survivability mechanisms to handle network outages.

Survivability in Aruba AirGroup is a crucial aspect that ensures uninterrupted service discovery even when APs lose their connection to Aruba Central, the centralized management platform. This feature is vital for maintaining the functionality and accessibility of AirGroup services in scenarios where network connectivity to the central management platform is disrupted. Here’s an overview of how survivability is managed in Aruba AirGroup:

  • Local Central Cache: APs in an Aruba AirGroup deployment maintain a local Central cache of service discovery information. This cache includes essential data about AirGroup services, servers, and associated policies. In the event of a network interruption to Aruba Central, this local Central cache allows APs to continue serving AirGroup services based on the last synchronized information.

  • Service Continuity: The local Central cache enables APs to continue responding to service discovery requests from clients, even when they are unable to communicate with Aruba Central. This ensures that AirGroup services remain accessible to users within the AP’s coverage area, minimizing disruptions.

  • Cache Synchronization: When the network connection to Aruba Central is restored, APs send Discover cache updates and synchronize their Central cache with the latest information from the AirGroup service in Central. This process helps maintain consistency across the network and updates any changes or policies made during the network outage.

  • Delta Updates: Aruba AirGroup employs delta update mechanisms to transmit only the changes or updates to the local Central cache, rather than sending the entire cache. This efficient data transfer minimizes bandwidth usage during cache synchronization.

  • Network Resilience: To further enhance survivability, organizations may implement network redundancy and failover mechanisms to reduce the risk of network outages affecting AirGroup operations.

In summary, Aruba AirGroup’s survivability mechanisms ensure that service discovery operations continue seamlessly even in the absence of a connection to Aruba Central. By maintaining a local Central cache and employing efficient synchronization methods, AirGroup enables uninterrupted access to essential services for users, enhancing the reliability and resilience of the network. These features are critical for organizations that prioritize service availability and seamless user experiences.

4.2 - Roaming and the Key Management Service

Deep dive into how roaming is accomplished in AOS 10 and the Key Management Service (KMS) that helps to enable the process.

The Key Management Service (KMS) is a novel addition to HPE Aruba Networking Wireless Operating System Software 10, designed with the specific purpose of facilitating seamless wireless user roaming and enhancing network performance. Its primary function is to distribute critical information, including the Pairwise Master Key (PMK) or 802.11r R1 key, among neighboring APs. This exchange enables fast roaming, ensuring a smooth and uninterrupted user experience in the wireless network.

In addition to key sharing, KMS serves as a conduit for disseminating crucial user-related data. This includes details such as VLAN assignments, user role information, and, when machine authentication is in use, the authentication state of the user’s device. These data elements collectively form a station record for each user, which plays a pivotal role in the roaming process.

The core responsibility of KMS is to efficiently communicate these station records to neighboring APs, thereby enabling them to provide uninterrupted service as users move between APs. The list of neighboring APs is sourced from the AirMatch service, which plays a complementary role in optimizing wireless network performance.

Both KMS and AirMatch services operate within the broader framework of HPE Aruba Networking Central and work collaboratively to facilitate the key-sharing process.

Workflows

Initial state

In this workflow, we delve into the key stages of how KMS manages and disseminates vital data, such as Pairwise Master Keys (PMKs), 802.11r R1 keys, VLAN assignments, user roles, and authentication states, to create a seamless and secure wireless user experience.

Key Management Service workflow

  1. A wireless user initiates association with an Access Point (AP1) and undergoes 802.1X authentication, resulting in the acquisition of either the Pairwise Master Key (PMK) or the derivation of the R0 key from the master session key, depending on whether or not the 802.11r protocol is enabled.

  2. Subsequently, AP1 transmits the user’s station record to KMS located within HPE Aruba Networking Central. This comprehensive station record contains user-specific details, including the PMK or R0 key, VLAN ID, user role, and machine authentication state (if machine authentication is enabled).

  3. Upon receipt of the user’s station record, KMS stores this information in its cache and simultaneously retrieves the list of neighboring APs associated with AP1 through the AirMatch service.

  4. Leveraging the list of neighboring APs for AP1, KMS accesses the cached user station record, including the PMK or R0 key. If the network employs the 802.11r fast roaming protocol, KMS proceeds to generate R1 keys for each of the neighboring APs. However, if the Opportunistic Key Caching (OKC) roaming protocol is utilized, the R1 key generation step is omitted.

  5. To ensure seamless roaming for the user, KMS disseminates the user’s station record to all neighboring APs connected to AP1. Consequently, when the user later transitions to AP2 or AP3, a full authentication process is not required. AP2 or AP3 already possess the user’s PMK or R1 key, allowing for streamlined four-way key exchange between the user and the respective AP, simplifying and expediting the roaming process.
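To make steps 2 through 5 concrete, here is a simplified sketch of how a station record might be fanned out to neighboring APs. The HMAC-based derivation below is only a stand-in for the actual 802.11r key hierarchy, and all names and structures are illustrative.

# Simplified, hypothetical sketch of KMS station-record distribution.
import hmac, hashlib

def derive_r1_key(r0_key: bytes, ap_id: str, client_mac: str) -> bytes:
    # Stand-in derivation; the real 802.11r KDF differs.
    return hmac.new(r0_key, f"FT-R1|{ap_id}|{client_mac}".encode(),
                    hashlib.sha256).digest()

def distribute_station_record(record, neighbor_aps, use_80211r, push_to_ap):
    """Send the station record (with per-AP R1 keys when 802.11r is in use)."""
    for ap in neighbor_aps:
        entry = dict(record)
        if use_80211r:
            entry["r1_key"] = derive_r1_key(record["r0_key"], ap, record["mac"])
        push_to_ap(ap, entry)

record = {"mac": "aa:bb:cc:dd:ee:ff", "vlan": 100, "role": "employee",
          "r0_key": b"\x00" * 32}
distribute_station_record(record, ["AP2", "AP3"], use_80211r=True,
                          push_to_ap=lambda ap, e: print(ap, e["r1_key"].hex()[:16]))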

Bridged user roaming

AOS 10 introduces two distinct user types: bridged users and tunneled users. Bridged users encompass all individuals connected to a bridge-mode SSID. In this configuration, user traffic remains localized within the AP’s network and is not routed through a gateway. For bridged users, the associated VLANs are established on the uplink switches of APs and are permitted by the uplink ports of these APs.

Illustrated below is an example of a bridged user engaging in fast roaming by leveraging the capabilities of KMS.

Bridged user roaming workflow with KMS

  1. Following the initial association with AP1 and the completion of the first-time full authentication, the wireless user eventually transitions to neighboring AP2 during the course of their wireless session.

  2. AP2 promptly updates KMS with the user’s new location, ensuring seamless handoff within the network.

  3. KMS, driven by the user’s movement to AP2, retrieves the list of neighboring APs specific to AP2 from the AirMatch service.

  4. Building upon this list of neighboring APs for AP2, KMS references the cached user station record, which includes PMK or R0 key, and generates R1 keys for each neighboring AP. This process is contingent on the utilization of the 802.11r fast roaming protocol, while the R1 key generation step is omitted if OKC roaming protocol is in use.

  5. KMS commences the distribution of the user station record solely to those neighboring APs of AP2 that do not possess a cache of the user station record. This process avoids redundancy by excluding neighbors common to both AP1 and AP2.

  6. AP2 initiates the synchronization of user sessions by transmitting a broadcast user session sync request message across the user VLAN. This synchronization action pertains to the top 120 user datapath sessions.

  7. The user, now associated with AP2, engages in a four-way key exchange with AP2 as part of the seamless roaming process.

  8. AP2 effectively communicates with AP1, instructing it to clear all entries related to the user, such as datapath entries. Subsequently, the user resumes data transmission through the new access point, AP2, ensuring a smooth and uninterrupted wireless experience.

Tunneled user roaming

In the realm of AOS 10, the implementation of a gateway cluster is highly recommended when network scalability becomes a primary concern. As networks grow to encompass a substantial number of APs, typically exceeding 500, or serve a significant client base that surpasses 5000 users, the introduction of a gateway cluster becomes essential. This architectural choice offers a multitude of advantages, including support for large numbers of APs and clients, centralized management of user VLANs, the establishment of unified firewall policies spanning both wireless and wired users, RADIUS proxy capabilities, and more.

With the presence of gateways, wireless users adopt a tunneled user configuration, where all of their network traffic is efficiently tunneled through the gateway cluster. This configuration eliminates the need for individual APs to manage user VLANs, centralizing this function at the gateway level. One notable advantage is that APs no longer need to belong to the same layer 2 domain for smooth client roaming. Consequently, when a tunneled user roams between different APs, their user session synchronization relies on seamless communication with their designated User Designated Gateway (UDG).

Illustrated below is a tunneled user executing fast roaming facilitated by KMS. This approach ensures network scalability while maintaining seamless and uninterrupted user experiences.

Tunneled user roaming workflow with KMS

  1. Following a wireless user’s initial association with AP1 and the completion of full authentication, the user may eventually roam to a neighboring AP2.

  2. AP2 promptly updates KMS with the user’s new location.

  3. KMS, in turn, retrieves the list of neighbor APs associated with AP2 from the AirMatch service.

  4. Leveraging this list, KMS fetches the user station record, encompassing the PMK or R0 key from its cache, and proceeds to generate the R1 keys for each neighboring AP present in the list if 802.11r fast roaming protocol is used for roaming.

  5. KMS initiates the distribution of the user record to the neighboring APs of AP2 that lack the cached user station record. KMS refrains from repeating the station record distribution process for any APs that happen to be neighbors to both AP1 and AP2.

  6. AP2 broadcasts a user session synchronization request message over the user VLAN.

  7. The User Designated Gateway (UDG) forwards this session synchronization message to the user’s original AP, AP1.

  8. AP2 proceeds to synchronize the top 120 user datapath sessions with AP1.

  9. A start accounting notice is dispatched by AP2 to the UDG.

  10. When the UDG gets the start accounting packet, it changes the bridge or user entry to send traffic to the AP2 tunnel. If the user is the first one from that VLAN on AP2, the multicast group gets updated with the client’s VLAN information.

  11. The user embarks on a four-way key exchange with AP2.

  12. AP2 then notifies AP1 to perform cleanup, which includes purging all entries related to the user, such as datapath entries. Following this, the user begins forwarding traffic through AP2.

Non-fast-roaming users

In older versions of AOS 10, user cache synchronization, which included user key information, was exclusively reserved for fast-roaming users like 802.11r users, OKC users, or MPSK users. However, a pressing need arose for cache synchronization among non-fast-roaming users, such as Captive Portal users and MAC authentication users. This need stems from the desire to prevent reauthentication when these users transition from one access point to another. To address this requirement, cache synchronization between neighboring APs was introduced and has been supported from AOS 10.4 onwards.

Cache classification

To optimize cache distribution, cache entries are classified into three distinct types:

  1. Partial Roam Cache: This cache structure exclusively contains essential information necessary during roaming. For non-fast-roaming users, the partial roam cache is synchronized with neighboring APs.

  2. Full Roam Cache: In addition to the data found in the partial roam cache, the full roam cache includes supplementary station-related state information that may not be immediately required during roaming. The full roam cache entry is consistently available in KMS and on the AP to which the client is currently associated.

  3. Key Cache: This specific cache structure is exclusively employed by fast-roaming users. It houses station keys essential for fast roaming, including PMK (Pairwise Master Key), PMKR0, PMKR1 (per-BSSID), and MPSK, alongside comprehensive full roam cache information.
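The following dataclass sketch captures the relationship between the three cache types; the field names are assumptions drawn from the descriptions above, not the exact on-AP structures.

# Illustrative dataclasses only; field names are assumptions.
from dataclasses import dataclass, field

@dataclass
class PartialRoamCache:            # synced to neighboring APs for non-fast-roaming users
    user_role: str
    vlan: int
    username: str
    essid: str
    sequence_number: int

@dataclass
class FullRoamCache(PartialRoamCache):    # held only by KMS and the associated AP
    class_id: str = ""
    multi_session_id: str = ""
    idle_timeout_s: int = 300
    session_timeout_s: int = 0

@dataclass
class KeyCache(FullRoamCache):            # fast-roaming users only
    pmk: bytes = b""
    pmk_r0: bytes = b""
    pmk_r1_per_bssid: dict = field(default_factory=dict)
    mpsk: bytes = b""

entry = KeyCache("employee", 100, "alice@example.com", "corp-wifi", 1, pmk=b"\x01" * 32)
print(entry.username, entry.vlan, len(entry.pmk))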

Workflows

Initial state

The diagram below provides an overview of the process for creating and synchronizing cache entries among neighboring APs for non-fast-roaming users.

Cache entry creation and synchronization for non-fast-roaming users

  1. The user establishes a connection with the AP and successfully completes the authentication process.

  2. In this step, the AP generates a full roam cache entry. Within this full cache entry, the partial roam cache information includes user-specific details such as user role, user VLAN, username, ESSID, and sequence number. In addition to the partial roam cache, the full cache incorporates various user state attributes like Class ID, multi-session ID, idle/session timeout, and more.

  3. The AP transmits the full roam cache information of the user to KMS.

  4. KMS retrieves the list of neighboring APs associated with this particular AP.

  5. KMS proceeds to distribute the partial cache information of the user to all the neighboring APs linked to the same AP. This ensures that neighboring APs possess the essential cache data for seamless user roaming and authentication.

Roaming

The roaming workflow for non-fast-roaming users closely resembles that of fast-roaming users, with a notable distinction: the complete roam cache is exclusively retained by the AP and KMS, while only a partial roam cache is distributed to neighboring APs.

Illustrated below are the primary steps in the roaming process for non-fast-roaming clients.

Non-fast-roaming user roaming workflow with KMS

  1. The user initiates a roam from AP1 to AP2.

  2. AP2 transmits a roaming notification to KMS.

  3. KMS retrieves the list of neighboring APs for AP2 from the AirMatch service.

  4. KMS dispatches the partial roam cache for this user to the neighboring APs of AP2, excluding those that overlap with AP1. For instance, in this scenario, AP3 is a common neighbor of both AP1 and AP2. Since AP3 already received the partial roam cache when the user initially connected to AP1, KMS only sends the partial roam cache to AP4 at this stage.

  5. AP2 sends a broadcast session synchronization request within the user’s VLAN to AP1 in an underlay scenario, to AP1 via AP2’s UDG in an overlay scenario, or within the default VLAN of the SSID if the cache is unavailable on AP2.

  6. AP1 responds to the session synchronization request by sharing the top 120 user sessions.

  7. AP2 forwards a user move request to AP1.

  8. AP1 acknowledges the move request.

  9. KMS dispatches the user’s complete roam cache to the AP2 to which the user has roamed.

  10. AP2 initiates an accounting start message to AP1 in an underlay case or to the AP2’s UDG in an overlay case.

  11. AP1 undertakes user entry cleanup, deletes the user’s full roam cache, and installs the partial roam cache. In an overlay scenario, AP2’s UDG updates the bridge or user entry to direct traffic toward the AP2 tunnel. If the user is the first client in that VLAN on AP2, the multicast group is updated with the client’s VLAN information.

Configuration

To configure fast roaming in AOS 10, follow these steps:

  1. Navigate to the WLANs section and select the specific SSID you want to configure.

  2. Access the Security tab on the AP configuration page.

Fast roaming configuration

By default, 802.11r fast roaming is enabled, while OKC is disabled.

For optimal 802.11r configuration, it is highly recommended to set up the Mobility Domain Identification (MDID). MDID represents a cluster of APs that create a continuous radio frequency space, allowing 802.11r R1 keys for devices to be shared and enabling fast roaming.

Additionally, it is recommended to enable 802.11k. This standard facilitates swift AP discovery for devices searching for available roaming targets by creating an optimized channel list. As the signal strength from the current AP weakens, the device scans for target APs based on this list.

When 802.11k is enabled, 802.11v is automatically activated in the background. 802.11v facilitates BSS (Basic Service Set) transition messages between APs and wireless devices. These messages exchange information to help guide the device to a better AP during the 802.11r fast roaming process.

Verification

Command Line Interface

AP CLI command for checking the PMK or R1 key caching of wireless users of the AP:

show ap pmkcache

APIs

  • Retrieving the neighbor APs list for an AP:

    URL: https://<central-url>/airmatch/ap_nbr_graph/v1/Ap/NeighborList/<AP Serial Number>

  • Retrieving the client record:

    https://<app-url>/keymgmt/v1/keycache/{client_mac}

  • Retrieving the encryption key hash:

    https://<app-url>/keymgmt/v1/keyhash

  • Retrieving the client key synced AP list:

    https://<app-url>/keymgmt/v1/syncedaplist/{client_mac}

  • Retrieving the stats per AP:

    https://<app-url>/keymgmt/v1/Stats/ap/{AP_Serial}

  • Checking on the health of KMS:

    https://<app-url>/keymgmt/health
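For example, the KMS health and client-record endpoints above can be queried with a short script. The bearer-token header and timeout values below are assumptions about a typical Central API setup, and the placeholders must be replaced with your own application URL and access token.

# Minimal sketch of calling the KMS APIs listed above with Python requests.
import requests

CENTRAL = "https://<app-url>"          # replace with your Central application URL
TOKEN = "<api-access-token>"           # replace with a valid API access token
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

def get_client_record(client_mac: str) -> dict:
    """Retrieve the KMS key-cache record for a client."""
    resp = requests.get(f"{CENTRAL}/keymgmt/v1/keycache/{client_mac}",
                        headers=HEADERS, timeout=10)
    resp.raise_for_status()
    return resp.json()

def kms_health() -> dict:
    """Check the health of the Key Management Service."""
    resp = requests.get(f"{CENTRAL}/keymgmt/health", headers=HEADERS, timeout=10)
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    print(kms_health())
    print(get_client_record("aa:bb:cc:dd:ee:ff"))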

Survivability

Client roaming

In scenarios where connectivity to HPE Aruba Networking Central is lost during a roaming event, the station records and roam cache information of existing users have typically been synchronized among neighboring APs. Consequently, the fast roaming experience for these users remains unaffected.

It is, however, possible that during a network outage, the station records or cache information for new users cannot be synchronized among neighboring APs. In this scenario:

  • For new users who roam during this period, their user devices will undergo full authentication during the roaming event.

  • Despite the full authentication process, these users will continue to enjoy uninterrupted service.

In summary, while connectivity issues with HPE Aruba Networking Central may necessitate full authentication for new users, it does not disrupt their ongoing communications on the network.

Cloud Fallback

In light of the earlier sections detailing user roaming workflows, it is important to highlight that there are two specific steps in which the new AP might not receive a response from the previous AP due to a timeout in the network:

  • Datapath session synchronization: In this phase, the new AP attempts to synchronize datapath sessions with the previous AP.

  • User state cleanup in the previous AP: During this step, the new AP requests the previous AP to clean up user-related information.

To address potential timeouts in these situations, KMS employs the Cloud Fallback mechanism. When a session synchronization or user state cleanup request times out, the new AP communicates with KMS to report the lack of response from the previous AP. KMS then searches the client-AP association table. If a client entry is found, KMS facilitates the communication between both APs, enabling them to coordinate the above-mentioned steps effectively.
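Conceptually, the fallback reduces to a try-then-escalate pattern, sketched below with hypothetical callables standing in for AP-to-AP and AP-to-KMS messaging.

# Illustrative sketch of the Cloud Fallback behavior (all names hypothetical).
def sync_sessions_with_fallback(new_ap, client_mac, ask_previous_ap, ask_kms):
    """Try peer-to-peer session sync first; fall back to KMS on timeout."""
    try:
        return ask_previous_ap(client_mac, timeout_s=2)
    except TimeoutError:
        # Report the unresponsive previous AP to KMS. KMS looks up the
        # client-AP association table and brokers the exchange instead.
        return ask_kms(new_ap, client_mac)

def flaky_previous_ap(client_mac, timeout_s):
    raise TimeoutError("no response from previous AP")

print(sync_sessions_with_fallback(
    "AP2", "aa:bb:cc:dd:ee:ff",
    ask_previous_ap=flaky_previous_ap,
    ask_kms=lambda ap, mac: {"sessions": [], "brokered_by": "KMS"},
))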

4.3 - IoT Operations

Fundamentals for the IoT offerings in areas of BLE, Zigbee and USB based IoT devices.

Aruba Central supports transporting of IoT data over enterprise WLAN. APs receive data from the IoT devices and send the metadata for these devices to Aruba Central and the IoT data to external servers through IoT Connectors. The IoT Connector aggregates the device data, performs edge-compute, and runs business logic on the raw data before sending the metadata and IoT data. The metadata for all IoT devices is displayed in IoT Operations dashboard in Aruba Central. Partner-developed applications, running on the IoT Connector, can be used to send the IoT data to external servers.

While enabling new capabilities to address real business needs, this proliferation of IoT devices at the edge creates a new set of challenges for IT. IoT devices use a variety of different physical layer connectivity types and communication protocols. Vendor-specific IoT gateways are often required to manage those devices and collect their data. IoT gateways obscure IoT devices on the network, making it difficult—if not impossible—to understand at a granular level what is connected to the network and where device data is going. Security is always front-and-center when it comes to IoT because many IoT devices are fundamentally untrustworthy, and the lack of visibility creates greater risk. IoT Operations within Aruba Central provides a solution to all of these problems.

Aruba’s IoT ecosystem mainly relies on its partner integrations. Aruba provides a transport medium, in the form of its APs and an IoT Connector at the edge, for the data sensed by IoT devices from different vendors, and sends that data securely and efficiently to the partners’ backend.

Additionally, Aruba offers BLE based tags and beacons for Meridian based location services. Tags are mainly used for asset tracking and beacons are used for indoor wayfinding and identifying device location. To learn more about Meridian, check out the Meridian Platform documentation.

Solution Components

The IoT Operations consists of the following three solution components:

IoT Dashboard

The IoT dashboard provides a unified view of all your IoT Connectors, Access Points sending IoT data to these connectors, Apps that are currently installed and a comprehensive list of all the IoT devices/sensors that are being heard by your Access Points.

It gives a detailed overview of how your IoT network is performing. The IoT dashboard provides a view of non-Wi-Fi IoT devices that would otherwise be obscured by vendor or device-specific hardware. IT can monitor these devices from the first moment they connect to APs anywhere in the environment and see exactly which AP each device is connected to. Once an appropriate App is installed on the IoT connector, previously unknown devices of that type can be automatically and accurately classified, so network administrators know exactly what the IoT devices are, and where the IoT devices are, with confidence. To learn more about monitoring IoT operations, refer to Monitoring HPE Aruba Networking IoT Operations.

Representation of IoT Ops Home Page

IoT App Store

The IoT app store takes the complexity out of deploying new IoT use cases within the organization. Simply visit the IoT app store—located within Central—and use the store’s intuitive interface to browse ArubaEdge IoT applications certified to integrate seamlessly with our networks. Unlike directory-style marketplaces that simply provide pointers to compatible applications, the IoT app store provides certified applications for immediate download and activation with just a few clicks of the mouse. To install a Partner-Developed App refer to Installing a Partner-Developed App

Using the IoT app store also simplifies the complex and often confusing task of IoT device-application configuration. After the application is installed on the IoT connector, the AP can be easily configured to securely transport the device’s telemetry data to the appropriate destination, whether that’s an on-premises server or the cloud. From BLE location tags, beacons, and sensors to Zigbee door locks, IoT deployment is simple—so you no longer need to rely on third-party integrators for custom development. To see which IoT Apps are supported today, refer to the IoT Operations App Matrix.

IoT Operations App Store

IoT Connector

Intelligent Edge applications which require edge computing of IoT data have historically been some of the most difficult for IT to implement and manage. The challenges are particularly acute when it comes to processing IoT data. In some cases, IoT Connector at the edge is needed to parse/decode IoT telemetry data from Central-managed APs and make the data available to IoT applications, whether hosted on premises or in the cloud (e.g., Microsoft Azure IoT Hub). In other instances, the AP itself can be used as a connector to securely transport the data.

The IoT Connector aggregates the device data, performs edge-compute, and runs business logic on the raw data before sending the dashboard metadata and IoT data. The IoT connector puts Intelligent Edge applications within reach—allowing IT to accommodate whatever technology transition comes next with the speed and ease of deploying a virtual machine or AP in the existing infrastructure. With the Aruba IoT connector component, it’s easy to provision multiple ArubaEdge IoT applications within the environment—it only takes a few clicks. The IoT connector virtual appliance is added as a new data collector within Central and then installed on the VM instance or AP. The administrator can then enable new IoT connectors through the IoT Operations guided user interface and see connectors in use—all within Central.

Each customer deployment can use a different IoT Connector size based on the scale of the deployment. IoT Operations supports the following scale specifications:

Parameter        Mini VM   Small VM   Medium VM   DC-2000
APs              50        250        1000        1000
BLE Devices      2000      5000       20000       20000
Zigbee Devices   200       500        2000        2000

The IoT Connector options have the following resource specifications:

Parameter      Mini VM   Small VM   Medium VM   DC-2000
CPU (Cores)    4         8          24          24
Memory (GB)    4         16         64          64
Storage (GB)   256       256        480         512
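As a convenience, the scale table above can be read as a simple selection rule. The helper below is illustrative only and should be checked against the limits published for your Central release.

# Pick the smallest IoT Connector size that covers a deployment (illustrative).
SIZES = [  # (name, max APs, max BLE devices, max Zigbee devices)
    ("Mini VM",   50,    2000,  200),
    ("Small VM",  250,   5000,  500),
    ("Medium VM", 1000, 20000, 2000),
]

def pick_connector_size(aps, ble_devices, zigbee_devices):
    for name, max_aps, max_ble, max_zigbee in SIZES:
        if aps <= max_aps and ble_devices <= max_ble and zigbee_devices <= max_zigbee:
            return name
    return "DC-2000 or multiple connectors"

print(pick_connector_size(aps=120, ble_devices=3000, zigbee_devices=100))  # Small VM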

Deployment Models

IoT Connector is an integral part of the IoT Operations solution, providing connectivity and edge processing for IoT use cases. These Connectors are used to parse or decode IoT telemetry data from Central-managed APs and make the data available to partner IoT applications that are either hosted on premises or in the cloud. There are two deployment types available: virtual machine (VM)-based using VMware ESXi, or Aruba AP-based. For downloading and deploying an IoT Connector, refer to Downloading IoT connector.

VM based IoT Connector

A VM-based IoT Connector leverages Aruba’s Data Collector architecture and is provisioned within Central. Configuration of the IoT Connector is provided within IoT Operations, using the guided user interface.

IoT Operations Architectural Diagram

AP as IoT Connector

For customers who find it difficult to deploy and manage a separate machine outside of their wireless deployment, Aruba provides the option of using existing AOS 10 APs as IoT Connectors. In this model, compared to the previous one, the function of the IoT Connector is collapsed into the AP.

However, the scale and capacity of the IoT Connector are lower when it runs on an AP. In this model, only classifier apps such as iBeacon, Eddystone, and Blyott can be used, as the installation of heavy containerized apps like Dormakaba is not supported. Support for container-based apps inside APs is expected in future releases of Aruba Central. For creating an AP-based IoT Connector, refer to Creating AP-based IoT Connector.

IoT Operations Architectural Diagram with AP acting as IoT Connector

Types of IoT Solutions

BLE based

BLE (Bluetooth Low Energy) based IoT solutions are the most common of all the IoT solution types. This is mainly because BLE as a technology is widespread, easily available, relatively simple to implement, and easy to connect. BLE, introduced with Bluetooth version 4.0 over a decade ago, has found its way into a wide variety of applications and solutions. Today, most of the BLE-based IoT devices that Aruba supports are based on BLE 5.0.

How it works with IoT Operations

Most BLE devices broadcast beacons at pre-defined regular intervals; these beacons carry raw data along with their payloads. Once a radio profile is configured and enabled on Aruba APs, the APs start listening to these beacons and transport them to the IoT Connector. Within the IoT Connector, apps are installed to classify these devices, and various filters can be applied so that only data relevant to the partner backend is forwarded. For simple classifier apps such as Aruba Devices, iBeacon, Eddystone, Blyott, or Minew, there is no need for a container-based workflow in the backend, which makes such apps very easy and quick to build and deploy. More complex solutions, such as the BLE-based door lock solution that utilizes southbound API options, necessitate the use of a container.
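For a feel of what a classifier app does with these frames, the sketch below recognizes the common iBeacon layout inside the manufacturer-specific data of a BLE advertisement. Real classifiers parse the full advertisement structure and cover many more device types, so this is a simplified illustration only.

# Simplified iBeacon recognition from BLE manufacturer-specific data (illustrative).
import struct

def classify_ble_payload(mfg_data: bytes):
    """Return iBeacon fields if the payload matches the iBeacon layout, else None."""
    # Apple company ID 0x004C (little-endian), subtype 0x02, length 0x15
    if len(mfg_data) >= 25 and mfg_data[:4] == b"\x4c\x00\x02\x15":
        uuid = mfg_data[4:20].hex()
        major, minor = struct.unpack(">HH", mfg_data[20:24])
        tx_power = struct.unpack("b", mfg_data[24:25])[0]
        return {"type": "iBeacon", "uuid": uuid, "major": major,
                "minor": minor, "tx_power_dbm": tx_power}
    return None

frame = b"\x4c\x00\x02\x15" + bytes(range(16)) + b"\x00\x01\x00\x02\xc5"
print(classify_ble_payload(frame))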

Use Cases

High value asset tracking, location tracking, indoor navigation and wayfinding are the current most common use cases for this type of BLE-based solution in IoT Operations in AOS 10.

Zigbee based

The AP’s built-in IoT radio, which supports 802.15.4 use cases such as Zigbee, provides gateway services that relay Zigbee-based sensor data to a management server. As of today, Aruba mainly supports two smart door lock vendors for Zigbee-based solutions.

This allows an administrator to avoid deploying a network of ZigBee routers and gateways to provide connectivity to each door lock. A single network can handle both Wi-Fi and ZigBee devices. An AP from Aruba provides ZigBee gateway functionality that offers a global standard to connect many types of ZigBee networks to the Internet or with service providers.

ZigBee devices are of three kinds:

ZigBee Coordinator (ZC)—The ZC is the most capable device. It forms the root of the network tree and may bridge to other networks. There is only one ZC in each ZigBee network.

ZigBee Router (ZR)—A ZR runs an application function and may act as an intermediate router that transmits data from other devices.

ZigBee End Device (ZED)—A ZED contains enough functionality to communicate with the parent node (either a ZC or ZR). A ZED cannot relay data from other devices. This relationship allows the ZED to be asleep for a significant amount of time thereby using less battery.

An AP acts as a ZC and forms the ZigBee network. It selects the channel, PAN ID, security policy, and stack profile for a network. A ZC is the only device type that can start a ZigBee network, and each ZigBee network has only one ZC. After the ZC has started a network, it may allow new devices to join the network. It may also route data packets and communicate with other devices in the network. The Aruba solution does not utilize a ZR.

How it works with IoT Operations

Compared to BLE, Zigbee-based solutions differ in that devices need to connect directly to a coordinator. Configuration is generally more time and labor intensive because deployment is one-by-one: each AP must be put into permit-joining mode and each lock connected individually.

Zigbee use cases almost always require container-based IoT apps because edge processing is needed to transform the data in the payload. The two Zigbee-based door lock vendors supported today with IoT Operations require the transport details to be configured while installing the apps, as opposed to configuring a separate transport stream.

Use Cases

Smart Zigbee-based door locks make up most of the currently supported Zigbee use cases. These are mainly seen in the hospitality industry and in enterprises with smart buildings and facilities. These solutions provide ease of use with simple NFC-based key cards or even mobile phone entry, along with appropriate security and very detailed analytics. Features such as remote locking and unlocking of doors, key blocking, counts of how many times a door was locked or unlocked, and whether the latch was enabled are some of the basic smart features offered by these solutions.

USB based

All Aruba APs have a dedicated USB-A slot where supported external devices can be plugged in and powered. One major benefit is that this opens up use cases that are not natively available from the IoT chipset in the APs. Essentially, the AP can support proprietary protocols other than Wi-Fi, BLE, or Zigbee by making use of the USB slot.

Note that the USB slot is strictly governed by ACLs: unless a supported vendor's dongle is plugged in, the slot will not function or allow connectivity.

How it works with IoT Operations

The vendors supported today fall into two categories: Ethernet-over-USB and Serial-over-USB. Hanshow and SoluM fall under the Ethernet-over-USB category, while EnOcean falls under the Serial-over-USB category.

Apart from installing the vendor app itself, a transport stream or the 'AOS8' app needs to be configured to specify the endpoint details.

Use Cases

Hanshow and SoluM make electronic shelf labels (ESLs), which are widely used in retail and warehouses. They replace traditional price tags that had to be managed manually, which was cumbersome and required significant time, effort, and cost. With ESLs, all of this can be managed digitally through a central management server.

EnOcean's USB dongle is used in conjunction with a variety of sensors. The EnOcean Alliance is a federation of nearly 400 product vendors manufacturing more than 5,000 different lighting, temperature, humidity, air quality, security, safety, and power monitoring sensors and actuators.

The table below shows a summary of the available transport services and the corresponding supported server connection types and device class filters:

| IoT Transport Service | IoT Radio Connectivity | IoT Server Connectivity | Device Class Filter |
|---|---|---|---|
| BLE Telemetry | Aruba IoT radio | Telemetry-WebSocket, Telemetry-HTTPS | All BLE device classes |
| BLE Data | Aruba IoT radio | Telemetry-WebSocket, Azure-IoT-Hub | All BLE device classes |
| Serial Data | USB-to-Serial | Telemetry-WebSocket, Azure-IoT-Hub | serial-data |
| Zigbee Data | Aruba IoT radio Gen 2 | Telemetry-WebSocket | zsd |

SD-Radio

SD-Radio, or SDR, is a feature that allows Aruba's IoT partners to load their proprietary firmware onto the built-in IoT radio of the APs and then communicate with their backend server. SDR can be enabled on both internal and external radios. Support for this feature on external radios enables the use of older AP models that do not have a built-in IoT radio.

When a radio is software defined, it can accept new firmware from an IoT app supported in Aruba Central. Once the radio switches to SDR, the app can communicate with the radio and run its own logic and protocols, which are transparent to Aruba.

Firmware images are stored on Openchanel's file server. Central pushes the URL and API key to the Connector, and the Connector pushes them to the AP. The AP downloads the image and then starts upgrading. If the file server requires an SSL certificate for downloading the image, the certificate should be embedded in the AP image in advance. If current APs do not have such a certificate, Central needs to support customer upload of certificates to APs.
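A minimal sketch of the download step described above, assuming a hypothetical firmware URL, bearer-token API key, and CA bundle path; the actual transfer mechanism and authentication scheme used between the Connector, AP, and file server are not specified here.

```python
import requests

def download_firmware(url: str, api_key: str, ca_bundle: str, dest: str) -> None:
    """Fetch an SDR firmware image over HTTPS and write it to disk.

    url, api_key, ca_bundle, and the Authorization header format are
    illustrative placeholders, not the actual Aruba/partner interface.
    """
    resp = requests.get(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
        verify=ca_bundle,   # validate the file server's SSL certificate
        stream=True,
        timeout=30,
    )
    resp.raise_for_status()
    with open(dest, "wb") as f:
        for chunk in resp.iter_content(chunk_size=64 * 1024):
            f.write(chunk)
```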

Licensing

IoT Operations is available to Aruba Central customers using AOS 10 based APs, with Foundation and/or Advanced AP licenses. Separate licenses are not required for IoT Operations.

IoT Operations utilizes an IoT Connector to receive IoT data from APs and sends IoT device metadata to Aruba Central and IoT data to partner applications. The APs that are assigned to an IoT Connector utilize their IoT radios to act as IoT gateways for myriad IoT devices in the physical environment.

Aruba uses the license tier of the APs assigned to your IoT Connector to determine the user experience. Currently, that user experience is differentiated in the IoT Operations Application Store: you will have access to either all of the apps or a subset of the apps in the store. Regardless of license tier, the supported scale and base functionality of IoT Operations are the same. In the future, Aruba may add new capabilities to IoT Operations which may extend across apps or even be offered independently of the apps themselves. The user experience is currently determined in IoT Operations as follows:

  • When all APs assigned to an IoT Connector have an Advanced AP license, you have access to all apps in the IoT Operations Application Store.

  • When at least one AP assigned to an IoT Connector has a Foundation AP license, you have access to a subset of apps in the IoT Operations Application Store. The apps that are available are shown in full color, while the apps that are unavailable are shaded grey.

Filters can be used in the IoT Operations App Store user interface to further refine your app search. For more information on HPE Aruba Networking Central licenses, refer to About HPE Aruba Networking Central Licensing.
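A minimal sketch of the licensing rule described above, assuming a simple per-AP license tier list; the function name and return values are illustrative, not a Central API.

```python
def app_store_access(ap_license_tiers: list[str]) -> str:
    """Return the app store experience for an IoT Connector's assigned APs.

    All APs on the Advanced tier -> access to all apps; one or more APs on
    the Foundation tier -> access to a subset of apps (others shown greyed out).
    """
    if ap_license_tiers and all(tier == "advanced" for tier in ap_license_tiers):
        return "all apps"
    return "subset of apps"

# Example: one Foundation AP limits the connector to the subset experience.
print(app_store_access(["advanced", "foundation", "advanced"]))  # subset of apps
```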

Key Considerations and Setup

This section describes some of the key considerations and the brief steps involved in a successful implementation of IoT Operations:

  • APs need to run an ArubaOS 10 code version.

  • Configure IoT Radio Profiles and/or Zigbee Service Profiles

    This configuration is required to enable the AP's IoT radio to listen to nearby BLE or Zigbee sensors. This piece of configuration is done outside of the IoT Operations home page, under the AP config > IoT section.

  • Deploy an IoT Connector

    The IoT Connector can be deployed and managed under Organization > Platform Integrations > Data Collectors. From here, an OVA file can be downloaded, the collector can be deployed, and it can eventually be registered to your Central account. Once everything is in place, you can start configuring the IoT Connector under the IoT Operations home page.

  • Assign APs to IoT Connector

    APs need to be assigned to a connector in order to transport the IoT data they sense to the connector. Multiple APs can be assigned to one connector; conversely, one AP can only be assigned to one connector. This is done under Applications > Connectors > Gear Icon.

  • Install Apps

    Once inside your IoT Connector context, navigate to Installed Applications > Manage. This presents a list of all the available apps in the IoT Operations app store. To install any of them, simply open the app card and click Install. Most of the apps are classifier apps that do not require any additional configuration; some apps might require additional transport-related configuration.

  • Create a Transport Stream

    For the apps that are just classifier apps and do not require any additional configuration, a separate transport stream needs to be configured to send the IoT data to an endpoint. This can be done either by using the 'AOS8' app or by creating a transport stream under Connector > Transports, as illustrated in the sketch below.
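The following minimal Python sketch illustrates the idea behind a transport stream: classified IoT telemetry is forwarded to a partner endpoint, here over plain HTTPS. The endpoint URL, token, and JSON schema are hypothetical placeholders, not the actual IoT Connector transport format.

```python
import requests

def forward_telemetry(endpoint: str, token: str, records: list[dict]) -> None:
    """Push a batch of classified IoT telemetry records to a partner endpoint.

    endpoint, token, and the payload schema are illustrative only.
    """
    payload = {"reporter": "iot-connector-01", "records": records}
    resp = requests.post(
        endpoint,
        json=payload,
        headers={"Authorization": f"Bearer {token}"},
        timeout=10,
    )
    resp.raise_for_status()

# Example usage (placeholder endpoint, token, and record fields):
# forward_telemetry("https://partner.example.com/iot/telemetry", "example-token",
#                   [{"deviceClass": "ibeacon", "mac": "aa:bb:cc:dd:ee:ff", "rssi": -58}])
```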

4.4 - Tunnel Orchestrator

The workings and survivability of the tunnel orchestrator service in HPE Aruba Networking Central.

The Overlay Tunnel Orchestrator (OTO) service architecture defines the working model of Tunnel Orchestrator service between Aruba Gateways and APs.

Aruba supports an automated Tunnel Orchestrator for LAN Tunnels service for APs and Gateways deployed in campus WLANs. Based on the location of the devices, the tunnel orchestrator service establishes either GRE tunnels (at the branch site) or IPsec tunnels between Gateways and APs provisioned in an Aruba Central account; this is described in greater detail in the following sections. The tunnel orchestrator service, along with the AP Tunnel Agent and Gateway Tunnel Agent, creates and maintains the tunnels between APs and Gateways.

The Tunnel Orchestrator for LAN Tunnels service can be enabled either globally or on individual device groups. By default, the Tunnel Orchestrator for LAN Tunnels service is enabled for Gateways and AP devices provisioned in an Aruba Central account. The tunnel orchestrator automatically builds a tunnel mode network based on the tunnel endpoint preference that you configure in the WLAN SSID. The tunnel orchestrator selects the Gateway-AP pairs and decides the number of tunnels between the Gateway cluster and APs based on the virtual AP configuration.

Working

The Tunnel Orchestrator service leverages the Tunnel Orchestrator service of the Aruba SD-WAN solution released in 2018. In ArubaOS 8, the tunnels between gateways and APs were built through the legacy IPsec process: both end devices go through IKE phase 1 and phase 2 to authenticate each other, negotiate the authentication method and timers, and generate the encryption keys used for data traffic.

In ArubaOS 10, each gateway and AP pair does not directly go through IKE phase 1 and phase 2 to establish an IPsec tunnel between the gateway and the AP. Instead, the whole tunnel setup function is moved to the Tunnel Orchestrator service in Aruba Central which is responsible for generating the session keys and SPIs for the gateway and AP pair. These session keys are used to control traffic encryption between the AP and the gateway.

The different parts of the legacy IKE phase 1 process, such as authentication and encryption negotiation, timer negotiation, SPI pair generation, and encryption key generation, are skipped in the Tunnel Orchestrator service because all the gateways and APs that are registered and subscribed in Aruba Central are treated as trusted entities. Secondly, the encryption policy and timers used by Aruba gateways and APs are hardcoded, so no negotiation is required. In ArubaOS 10, the tunnel orchestrator completely takes over the job of the IKE process and generates the keys and SPIs used for the IPsec tunnels. The Tunnel Orchestrator service not only simplifies the configuration model and device software, but also increases the performance and scalability of the whole network.

AP Tunnel Agent and Gateway Tunnel Agent

The AP Tunnel Agent (ATA) and Gateway Tunnel Agent are the tunnel management modules in APs and Gateways respectively. They are responsible for handling all GRE and IPsec tunnel configurations and maintaining the status in APs and Gateways. ATA and Gateway Tunnel Agent provide the following functions:

  • Register the information of APs and Gateways with tunnel orchestrator service.

  • Receive Gateway cluster and tunnel information and distribute to other processes.

  • Create and maintain IPsec and GRE tunnel and survivability status.

Use Cases of Tunnel Orchestrator

In ArubaOS 10, there are two scenarios where the tunnel orchestrator orchestrates tunnels for a pair of end points:

APs and Gateways in a Campus Network

When a tunnel or mixed mode SSID is configured, the tunnel orchestrator orchestrates an IPsec tunnel and a GRE tunnel between each AP in the AP group and each gateway member in the gateway cluster. For example, with one AP and two gateways in a cluster, as shown in the following diagram, the tunnel orchestrator orchestrates one IPsec tunnel and one GRE tunnel between the AP and the first gateway, and one IPsec tunnel and one GRE tunnel between the AP and the second gateway. The IPsec tunnel is used for control traffic between the AP and the gateway, such as bucketmap and nodelist updates. The GRE tunnel is used for user data traffic from all the configured SSIDs on the AP. The GRE tunnel is not encrypted by the IPsec tunnel, to avoid double encryption and performance degradation; the security of user traffic is guaranteed by the encryption method used on the SSID.

Micro Branch APs and Gateways for Remote Offices and Teleworkers

In a Micro Branch deployment, the GRE tunnel is encrypted by the IPsec tunnel. Since the data traffic between the AP and gateway goes through the WAN network, extra encryption becomes necessary.

GRE and IPsec Tunnels in Different Modes

Tunnel Orchestrator Workflow for Creation of Tunnel or Mixed WLAN SSIDs

By now, we know that the tunnels are created between gateways and APs as soon as a tunneled SSID or a mixed SSID is created. At the time of SSID creation, the user needs to select the gateway cluster to which the APs will form a tunnel. Eventually, all the wireless client traffic will be tunneled to the gateway cluster.

The process is automated to the extent that when a new AP or a gateway is added to the existing groups, Tunnel Orchestrator service will automatically build all the relevant new tunnels between all the devices.

If we were to look at the step-by-step workflow of the entire orchestration process when a new tunnel or mixed mode SSID is to be configured, it would include the following:

  1. Tunnel SSID or mixed mode SSID is configured, and the service configuration module notifies the Tunnel Orchestrator.

  2. The Tunnel Orchestrator queries the device configuration module about all the gateway members in the cluster on which the SSID terminates.

  3. The Tunnel Orchestrator queries the group management module about all the APs in the AP group.

  4. The Tunnel Orchestrator generates SPI and encryption keys of the IPsec tunnels and the GRE tunnels for each pair of AP and gateway member in the gateway cluster.

  5. All the tunnel details are pushed to the gateways and APs.

Tunnel Orchestrator Workflow
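The central idea in step 4 of the workflow above is that the orchestrator, rather than IKE, produces the SPIs and session keys for every AP-gateway pair. The following minimal Python sketch captures that idea under the assumption of random 32-bit SPIs and 256-bit keys; the real key material, record format, and push mechanism are not documented here.

```python
import secrets
from itertools import product

def orchestrate_tunnel_keys(aps: list[str], cluster_gateways: list[str]) -> dict:
    """Generate illustrative SPIs and session keys for each AP-gateway pair.

    The orchestrator would push each record to both endpoints; here the
    records are simply returned keyed by the (AP, gateway) pair.
    """
    tunnels = {}
    for ap, gw in product(aps, cluster_gateways):
        tunnels[(ap, gw)] = {
            "spi_in": secrets.randbits(32),          # SPI for traffic toward the AP
            "spi_out": secrets.randbits(32),         # SPI for traffic toward the gateway
            "session_key": secrets.token_bytes(32),  # shared encryption key
        }
    return tunnels

# One AP and a two-gateway cluster yields two orchestrated IPsec tunnels.
plan = orchestrate_tunnel_keys(["ap-1"], ["gw-1", "gw-2"])
print(len(plan))  # 2
```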

Key Considerations

To allow APs and Gateways to automatically establish tunnel modes, ensure that the following configuration tasks are completed:

  • Aruba Gateways are onboarded to a group in Aruba Central.

  • Aruba APs are provisioned in Aruba Central.

  • Aruba Gateways and APs are upgraded to ArubaOS 10.0.0.0 or a later software version.

  • A WLAN SSID with the tunnel forwarding mode is configured on the APs. When you create a new SSID, you must select the primary cluster or Gateway to which you want to tunnel the SSID's traffic. Optionally, you can select a backup cluster to be used when the primary cluster goes down completely. The APs establish tunnels with the Gateways in a Gateway cluster.

  • If the overlay IPsec tunnels initiated by APs to a VPN Concentrator use NAT traversal, ensure that UDP port 4500 is enabled.

IPsec SA Rekeying

An IPsec SA is created with a finite lifetime to ensure security; the AP-Gateway IPsec SA lifetime is 36 hours. Before the SA expires, the Tunnel Orchestrator performs rekeying for all the IPsec SAs of AP-Gateway pairs, and the rekeying process ensures that there is no data loss during rekeying.

The following workflow explains the Tunnel Orchestrator IPsec SA rekeying process:

  1. Twelve hours before the IPsec lifetime expires, the Tunnel Orchestrator starts IPsec SA rekeying and orchestrates new keys for AP-Gateway pairs.

  2. New keys are pushed to the AP-Gateway pairs.

  3. The gateway examines temporary ingress tunnels in learning mode.

  4. The AP sends a probe encrypted with the new key.

  5. Parallelly, all the control traffic is still exchanged between the AP and the gateway through the old IPsec tunnel.

  6. After the gateway successfully decrypts the probe message with the new key, the gateway removes the temporary learning mode tunnel and installs new ingress and egress tunnels with the new key.

  7. The gateway sends a probe response to the AP.

  8. The AP removes the temporary learning mode tunnel and installs new ingress and egress tunnels with the new keys.

  9. The old tunnels remain active for 10 seconds for transient traffic.

  10. All the control traffic is switched to the new tunnels.

  11. Both the AP and the gateway age out the old tunnels.

Cloud Survivability for Tunnel Orchestrator

Survivability is a feature that preserves IPsec connectivity between Aruba devices whose IPsec tunnels are orchestrated by the cloud and have a finite key expiration time, for cases where cloud connectivity fails for any reason while the devices still have connectivity between them. With survivability, the devices can re-establish IPsec tunnels between themselves using legacy IKE/IPsec tunnel establishment, based on the tunnel configuration they have already received from the Tunnel Orchestrator.

During the rekey phase, if cloud connectivity is lost, the devices will seamlessly switch over to legacy IPsec tunnels. Survivability can be triggered in the following situations:

  • Either side of the tunnel has no connectivity to Tunnel Orchestrator

  • Tunnel Orchestrator pushes new keys to APs and gateways, but they are not received

  • Tunnel Orchestrator does not push new key to APs or gateways

  • The devices are not able to bring up tunnels using received keys

Crypto maps are created for all the tunnel configuration received, irrespective of the initiator or responder role, but on the initiator side the crypto maps are in a disabled state. After nine retries to bring the rekey tunnel up, the crypto map on the initiator side is enabled so that the survivability tunnel can be triggered.

Once the survivability tunnel is up, the same process will be started again during rekey.
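A minimal sketch of the fallback logic described above, with stubbed callables standing in for bringing up the orchestrated rekey tunnel, enabling the initiator-side crypto map, and starting legacy IKE/IPsec. The retry count of nine comes from the text; everything else is illustrative.

```python
def establish_with_survivability(bring_up_rekey_tunnel, enable_crypto_map,
                                 start_legacy_ike, max_retries: int = 9) -> str:
    """Try the orchestrated rekey tunnel; fall back to legacy IKE/IPsec.

    The three callables are placeholders for device behavior: attempting the
    orchestrated tunnel, enabling the initiator-side crypto map, and starting
    legacy IKE/IPsec negotiation with the already-received tunnel config.
    """
    for _ in range(max_retries):
        if bring_up_rekey_tunnel():
            return "orchestrated tunnel up"
    enable_crypto_map()   # crypto map was created earlier but left disabled on the initiator
    start_legacy_ike()
    return "survivability tunnel up"
```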

4.5 - Live Upgrade

Live Upgrade provides for uninterrupted services when upgrading AOS-10 access points and gateways.

Live Upgrade Services

The Live Upgrade and Firmware Management services in HPE Aruba Networking Central provide seamless upgrade of APs and gateway devices. The Live Upgrade service provides the following functions:

  • Upgrades APs, gateway clusters, or both
  • Allows the selection of the desired device firmware from a list of available versions
  • Allows to schedule upgrades up to one month in the future
  • Provides visibility of the upgrade progress
  • Allows upgrade of multiple groups in parallel
  • Allows termination of upgrades in the middle of the upgrade process

The Firmware Management service provides the user interface and other existing functions like firmware compliance and scheduled upgrades. The service also interfaces with the devices to initiate upgrades and receive upgrade status. The Live Upgrade service provides APIs to the Firmware Management service to initiate or abort live upgrades and decides the logic to orchestrate the upgrade process and the time of upgrade for the devices.

AP Live Upgrade

The Live Upgrade service enables the upgrade of APs without disrupting the connections of existing clients and reduces the upgrade duration by running parallel upgrades. The Live Upgrade service interfaces with the AirMatch service to partition all the APs that require an upgrade into multiple smaller sets of APs that can be upgraded in parallel; each batch of APs then reloads sequentially. AirMatch partitions APs based on RF neighborhood data so that neighboring APs are available for clients to roam to while their associated AP is undergoing an upgrade: all the APs in one AP partition are on the same channel and are not in the same RF neighborhood.

To reduce WAN bandwidth consumption and upgrade duration, a set of APs, known as seed APs, is selected. Only these seed APs download the images directly from the Aruba Activate server; the rest of the APs download the image from their designated seed APs. Seed APs are selected based on the following considerations (a selection sketch follows the list):

  • A non-seed AP is L2 connected to its designated seed AP.

  • A non-seed AP is of the same platform model as its designated seed AP.

  • Not more than 20 non-seed APs are assigned to a given seed AP.

  • A seed AP is randomly selected from the APs that are L2 connected and have the same model type.
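The following minimal Python sketch illustrates the selection criteria above: APs are grouped by L2 domain (approximated here by subnet) and model, one seed is chosen at random per group of at most 21 APs, and the remaining APs in the group are assigned to that seed. The field names and the chunking detail are illustrative, not the actual Live Upgrade algorithm.

```python
import random
from collections import defaultdict

def select_seed_aps(aps: list[dict]) -> dict[str, list[str]]:
    """Map each seed AP name to the list of non-seed APs it serves.

    Each AP is a dict like {"name": "ap-1", "subnet": "10.1.1.0/24", "model": "AP-535"}.
    Non-seed APs share an L2 domain and model with their seed, and no seed
    serves more than 20 non-seed APs.
    """
    groups = defaultdict(list)
    for ap in aps:
        groups[(ap["subnet"], ap["model"])].append(ap["name"])

    assignments = {}
    for members in groups.values():
        random.shuffle(members)                # random seed choice within the group
        for i in range(0, len(members), 21):   # 1 seed + up to 20 non-seeds per chunk
            chunk = members[i:i + 21]
            assignments[chunk[0]] = chunk[1:]
    return assignments
```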

The following image shows the background process when an AP undergoes the Live Upgrade workflow.

AP Live Upgrade Process

The AP Live Upgrade workflow is as follows:

  1. When the scheduled time arrives, the Firmware Management (FM) service sends the AP group details to the Live Upgrade service. The upgrade time can either be the current time or a scheduled time within a month.

  2. The Live Upgrade service stores the AP list in the AP group and sets all the APs in the list, in the upgrade INIT state.

  3. The Live Upgrade service retrieves the subnet information of the APs from the monitoring module in Aruba Central.

  4. The Live Upgrade service selects seed APs based on seed AP selection criteria, and assigns a designated seed AP for each non-seed AP.

  5. The Live Upgrade service retrieves the AP partition information from the AirMatch service.

  6. The Live Upgrade service initiates an image upgrade of all the seed APs and sends the request to the FM service.

  7. The FM service sends an image upgrade request to all the seed APs.

  8. All the seed APs download the image from the Activate server.

  9. After downloading the image, all the seed APs send upgrade responses and states to the FM service.

  10. The FM service forwards the responses and states of all the seed APs to the Live Upgrade service.

  11. The Live upgrade service initiates an image upgrade of all the non-seed APs and sends the request to the FM service.

  12. The FM service sends an image upgrade request to all the non-seed APs.

  13. All the non-seed APs download the image from their designated seed APs.

  14. After downloading the image, all the non-seed APs send upgrade responses and states to the FM service.

  15. The FM service forwards the responses and states of all the non-seed APs to the Live Upgrade service.

  16. The Live Upgrade service initiates the reboot for the first AP partition and sends the request to the FM service.

  17. The FM service forwards the AP reboot request to all the APs in the AP partition, and all the rebooted APs send reboot responses to the FM service after rebooting.

  18. The FM service forwards the reboot responses and states to the Live Upgrade service.

  19. The remaining AP partitions are rebooted in sequence.

Rebooting Access Points

After all the APs finish downloading the image, the Live Upgrade service starts the AP rebooting process. All the APs in one AP partition reboot at the same time. After the Live Upgrade service receives updates on the successful reboot of all the APs in the partition, it initiates the reboot process for the next AP partition. This process is repeated until all the AP partitions have rebooted and come up with the new image.

The following is an example workflow of an AP reboot process:

  1. The Live Upgrade service sends the AP reboot request for a specific AP partition to the FM service.

  2. The request is sent to every AP in the partition.

  3. After receiving the reboot request, each AP disables all its radios.

  4. All the clients associated with the AP are unable to ping the AP and are forced to roam to the neighboring APs.

  5. All the neighboring APs associated with the clients perform a session sync with their previously associated AP.

  6. The AP starts rebooting a few seconds later.

Reboot AP Process
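A minimal sketch of the sequential partition reboot described above, with stubbed callables for issuing the reboot request and checking that every AP in a partition has come back up; the five-second poll interval is an arbitrary illustration.

```python
import time

def reboot_partitions(partitions, send_reboot, partition_is_up, poll_seconds: int = 5):
    """Reboot AirMatch-computed AP partitions one at a time.

    send_reboot(ap) and partition_is_up(partition) are placeholders for the
    FM service reboot request and the reboot-status reports it receives.
    """
    for partition in partitions:
        for ap in partition:
            send_reboot(ap)              # all APs in one partition reboot together
        while not partition_is_up(partition):
            time.sleep(poll_seconds)     # wait before moving to the next partition
```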

The HPE Aruba Networking Central product documentation provides the steps required to perform or schedule the Live Upgrade service for APs.

Upgrading a Gateway Group

As gateways in a gateway group are much smaller in number, seeding and partition concepts are not necessary. To reduce upgrade duration, all the gateways in a cluster are instructed to download the image together. However, they are reloaded sequentially to avoid disruption. As only one gateway is reloaded at any given time, remaining cluster members together need free capacity for clients of only one gateway.

The Cluster UDG (User Designated Gateway) concept ensures that every user on a gateway is assigned a standby UDG. The user state and datapath sessions are synced to the standby UDG. When a gateway reloads, the cluster failure detection logic detects the failure within a second and moves users to the standby UDG. With multi-version support added, when a gateway reloads with a new firmware version, it rejoins the cluster and can sync user state and sessions with the cluster members running an older version. Users are not disrupted during the gateway upgrade process.

The gateway Live Upgrade workflow is as follows:

  1. The FM service applies all the configured compliance rules, which include the gateway group names, the firmware version to which the gateway group needs to upgrade, and the time at which the group needs to upgrade. The upgrade time can either be the current time or a scheduled time within a month.

  2. When the scheduled time to upgrade arrives, the FM service sends the gateway group details to the Live Upgrade service.

  3. The Live Upgrade service stores the gateway list in the gateway group and sets all the gateways of the list in the upgrade INIT state.

  4. The Live Upgrade service initiates an image upgrade of all the gateways in the group and sends the request to the FM service.

  5. The FM service sends the image upgrade request to all the gateways.

  6. All the gateways download the image from the Activate server.

  7. After downloading the image, all the gateways send upgrade responses and states to the FM service.

  8. The FM service forwards the responses and states of all the gateways to the Live Upgrade service.

  9. The Live Upgrade service initiates the reboot request to one of the gateways and sends the request to the FM service.

  10. The FM service sends the reboot request to the gateway.

  11. The gateway reboots.

  12. The gateway sends reboot responses and states to the FM service.

  13. The FM service forwards the reboot responses and states to the Live Upgrade service.

  14. The rest of the gateways in the group are cycled through the reboot process in sequence.

The following image explains the Gateway Live Upgrade workflow:

GW Live Upgrade Process

The HPE Aruba Networking Central product documentation provides the steps required to perform or schedule the Live Upgrade service for gateways or gateway clusters.

Upgrading a Device Image

You can upgrade a device or multiple devices under the Firmware tab in Aruba Central. Any configured device group can be upgraded together or individually.

There are two upgrade types:

  1. Live Upgrade

  2. Standard

The default upgrade type is Standard. In this upgrade type, all the devices under the selected group download the images directly from the Aruba Activate server and reboot simultaneously. There is no consideration for reducing disruption to the network.

To perform Live Upgrade, set the upgrade type to Live Upgrade.

Firmware compliance configuration parameters:

  • Groups: You can choose All Groups, a single group, or a combination of groups

  • Firmware version: This is the target firmware version for the selected devices. There are three choices:

    • Choose a specific build number from the drop-down list.

    • Enter a custom build number, for example, 10.0.0.1_74637, and click Check Validity to validate the build number.

    • Set to None. When you set the option to None, the compliance status is set to Not Set. When you want to upgrade a device to a specific firmware version from your FTP server instead of Aruba Central, ensure that the Compliance Status is Not Set for the group to which the device belongs. Otherwise, Aruba Central automatically upgrades the group to the firmware version configured in Aruba Central and overrides the firmware version downloaded from your FTP server.

  • Upgrade type

    • Standard (default type)

    • Live upgrade

  • Upgrade time

    • Now

    • Later date: Firmware upgrade can be scheduled at a future time from the time of upgrade configuration up to one month later.

Refer to the HPE Aruba Networking Central product documentation for the specific steps required to perform or schedule the Live Upgrade service for different devices or device groups.

4.6 - RAPIDS

WIDS/WIPS services for AOS 10.

Rogue Access Point Intrusion Detection System (RAPIDS) automatically detects and locates unauthorized access points (APs), regardless of your deployment persona, through a patented combination of wireless and wired network scans. RAPIDS uses existing, authorized APs to scan the RF environment for any unauthorized devices in range. RAPIDS also scans your wired network to determine if the wirelessly detected rogues are physically connected. Customers can deploy this solution with “hybrid” APs serving as both APs and sensors or as an overlay architecture where Aruba APs act as dedicated sensors called air monitors (AMs). RAPIDS uses data from both the dedicated sensors and deployed APs to provide the most complete view of your wireless environment. The solution improves network security, manages compliance requirements, and reduces the cost of manual security efforts.

  • Rogue device detection is a core component of wireless security.

  • The RAPIDS rules engine and containment options allow for the creation of a detailed definition of what constitutes a rogue device, and can quickly act on a rogue AP for investigation, restrictive action, or both.

  • Once rogue devices are discovered, RAPIDS can alert a security team of the possible threat and provides essential information needed to locate and manage the threat.

  • The RAPIDS feature set is included with Foundation subscriptions.

RAPIDS Flow

RAPIDS provides an effective defense against rogues and other forms of wireless intrusion. To accomplish these objectives, RAPIDS will:

  • Perform multiple types of wireless scans.

  • Correlate the results of the various scans to consolidate all available information about identified devices.

  • Classify the discovered devices based on rules that are customized to an organization’s security needs.

  • Generate automated alerts and reports for IT containing key known information about unauthorized devices, including the physical location and switch port whenever possible.

  • Deploy containment mechanisms to neutralize potential threats.

Key Features & Advantages of using RAPIDS

| Feature | Benefit |
|---|---|
| Wireless scanning that leverages existing Access Points and AM sensors | Time and cost savings. Eliminates the need to perform walk-arounds or to purchase additional RF sensors or dedicated servers. |
| Default or custom rules-based threat classification | Time and resource savings. Allows staff to focus on the most important risk mitigation tasks. Comprehensive device classification that is tailored to the organization means less time spent investigating false positives. |
| Automated alerts | Faster response times. Alerts staff the instant a rogue is detected, reducing reaction time and further improving security. |
| Rogue AP location and switch/port information | Faster threat mitigation. Greatly simplifies the task of securing rogue devices and removing potential threats. |
| Reporting | Reduced regulatory expense. Comprehensive rogue and audit reports help companies comply with various industry standards and regulatory requirements. |
| IDS event management | Single point of control. Provides you with a full picture of network security. Improves security by aggregating data for pattern detection. |
| Manual and automated containment | Continuous security. Improves security by enabling immediate action even when network staff is not present. |

RAPIDS Use Cases

Regulatory compliance is a key motivator that drives many organizations to implement stringent security processes for their enterprise wireless networks. The most common regulations are Payment Card Industry (PCI) Data Security Standard, Health Insurance Portability and Accountability Act (HIPAA) and Sarbanes-Oxley (SOX).

RAPIDS reporting is helpful for compliance audits:

  • PCI DSS requires that all organizations accepting credit or debit cards for purchases protect their networks from attacks via rogue or unauthorized wireless APs and clients. This applies even if the merchant has not deployed a wireless network for its own use.

  • RAPIDS helps retailers and other covered organizations comply with these requirements. RAPIDS also enables companies to set up automated, prioritized alerts that can be emailed to a specified distribution list when rogues are detected.

  • Hospitals use RAPIDS to protect patient data as well as their systems. They need to know if rogues exist on their network alongside the critical medical devices used for patient care.

WIDS vs RAPIDS

Wireless Intrusion Detection Service (WIDS) provides additional behavioral information and security for a wireless network by constantly scanning the RF environment for pre-defined wireless signatures. Intrusion detection is built into AOS and uses signature matching logic, as opposed to RAPIDS' use of rule matching.

  • AOS can trigger alerts, known as WIDS events, based on the configured threat detection level: high, medium, or low.

  • WIDS events can be categorized into two buckets:

    • Infrastructure Detection Events

    • Client Detection Events

RAPIDS consumes WIDS events and presents the event information in a clear and intelligible manner, with logging and rogue location information. Security events rarely happen in isolation; an attack will usually generate multiple WIDS events, so RAPIDS merges the reporting of multiple related attacks into a single event to reduce the amount of noise.

  • RAPIDS in Central aggregates WIDS events and provides a method to view which events are getting raised in the environment.

    • Each event has a specific victim MAC address; events are aggregated for each of those victim MACs.

      • Multiple APs reporting the same event.

      • Several attacks against the same MAC.

  • Visibility in the UI, NBAPI, API streaming

  • Device classification is a combination of cloud processing and edge processing.

    • Aruba access points can discover Rogue access points independently, without intervention from Aruba Central (continuous monitoring).

    • Aruba Central classification takes precedence.

RAPIDS Classifications

RAPIDS ranks classifications in the following hierarchy.

  • Interfering

  • Suspected Rogue

  • Rogue

  • Neighbor (Known Interfering)

  • Manually Contained (DoS)

  • Valid

In the lifecycle of a monitored AP, classifications can only be promoted (i.e., go higher in the list, or left to right in the diagram below) and can never be demoted (i.e., go back down to a lower value).

If a neighbor AP reaches the "Valid" classification (shown in orange), this is considered a final state: the AP stops applying its own classification algorithms to that neighbor, and this is where the classification remains (unless the entry ages out or the user manually classifies it as something else).

This same behavior also applies to the custom rules. For example, if a neighbor AP is already classified as Rogue then even if it matches a rule, it will never be demoted to a Suspected Rogue.

RAPIDS Classification Hierarchy


Configuring Rules

After RAPIDS is enabled in the UI, a set of three default classification rules takes effect.

For existing RAPIDS customers, these are the same rules that were applied in previous releases. A maximum of 32 single rules can be configured. All criteria within a single rule use an "AND" operand, which means a rule is applied only if all the criteria in that rule evaluate as a match.

Configuring custom rules

Creating a custom rule

Add one or more conditions to your rule

Classification Criteria

  • Signal - The user can specify a minimum signal strength from -120 to 0 dB.
  • Detecting AP Count - The number of detecting APs that can "see" the monitored AP (2 to 255).
  • WLAN classification - Valid, Interfering, Unsecure, DOS, Unknown, Known Interfering, Suspected Unsecure.
  • SSID Includes - Pattern for matching against the SSID value of a monitored AP.
  • SSID Excludes - Pattern for matching against the SSID value of a monitored AP.
  • Known valid SSIDs - Match against all known valid SSIDs configured on the customer's account. Uses regular expression matching.
  • Plugged into wired network - Requires a managed PVOS/CX switch; a neighbor AP is determined to be plugged into the wired network when the BSSID matches the first 40 bits of a known wired MAC address as reported by the switch.
  • Time on network - Minimum number of minutes since the monitored AP was first seen on the network.
  • Site - List of site IDs for which this rule applies. If not populated, the rule applies to all sites.
  • Band - The radio band of the monitored AP: 80211B (2.4 GHz), A (5 GHz), G (2.4 GHz), AG (not used), 6 GHz.
  • Valid client MAC match - Match any monitored BSSID against the current valid station cache list. This must be an exact match.
  • Encryption - OPEN, WEP, WPA, WPA2, WPA3.

Rule ordering matters; rules are evaluated from top to bottom in the custom rule list.

Whenever a match is found, that rule is executed and further rule evaluation stops.

Because of this, it’s important to order your rules from lower classifications to higher classifications.

Manual classification will be respected; if a neighbor AP has already been manually classified by the user then no rules will be evaluated for that AP.

If the classification rule selects a non-final state classification (i.e., Interfering or Suspected Rogue), then AP rogue detection algorithms will continue to be applied at the edge, and they could determine that the AP is in fact a rogue and promote the classification to Rogue.
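The following minimal Python sketch ties the evaluation behavior together: rules are evaluated in order, all criteria in a rule are ANDed, the first matching rule wins, manually classified APs are skipped, and a rule's classification is only applied if it is a promotion in the hierarchy. The data shapes and function name are illustrative, not the RAPIDS implementation.

```python
# Promotion order from the hierarchy above (left = lowest, right = highest).
HIERARCHY = ["Interfering", "Suspected Rogue", "Rogue",
             "Neighbor (Known Interfering)", "Manually Contained (DoS)", "Valid"]

def classify(ap: dict, rules: list[dict]) -> str:
    """Return the classification for a monitored AP.

    ap example:   {"classification": "Interfering", "manual": False,
                   "ssid": "guest", "signal": -60}
    rule example: {"criteria": {"ssid": "guest"}, "classification": "Rogue"}
    """
    current = ap["classification"]
    if ap.get("manual"):
        return current                      # manual classification is respected
    for rule in rules:                      # ordered list, first match wins
        if all(ap.get(k) == v for k, v in rule["criteria"].items()):
            proposed = rule["classification"]
            if HIERARCHY.index(proposed) > HIERARCHY.index(current):
                return proposed             # promotions only, never demotions
            return current
    return current
```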

Rogues Panel

The rogues panel provides a lot of detailed information about your wireless environment. Here is an example of what information is provided.

Rogues Panel

4.7 - AirMatch

AirMatch is HPE Aruba Networking’s next-generation automatic RF planning service.

Running within HPE Aruba Networking Central, AirMatch has the duty of computing an optimal radio frequency (RF) network resource allocation. AirMatch runs on a 24 hour cycle, first collecting RF network statistics and then developing an optimized RF network plan, which specifies channel, bandwidth, and EIRP settings for each radio, that is deployed once every cycle. As a best practice, the RF plan change should be deployed at the time of lowest network utilization so that radio channel changes have a minimal impact on user experience. In addition to the planning done every 24 hours, AirMatch also reacts to dynamic changes in the RF environment such as channel quality, radar, and high noise events. AirMatch results in a stable network experience with greatly minimized channel and EIRP variations. AirMatch is defined by the following key attributes:

  • A centralized RF optimization service
  • Newly defined information collection and configuration deployment paths
  • Models the network into partitions and then solves the different partitions as a whole
  • Results in optimal channel selection, bandwidth size, radio operating band (for Flex Dual Radio APs), and EIRP plan for the network

If the link between the access points and Central goes down, then features which require the coordination of Central, such as scheduled updates for RF optimization, will be lost. The current RF solution will continue to function and reactive changes resulting from high noise events and radar will still occur.

AirMatch Workflow

The AirMatch workflow occurs using the following steps:

  • APs send RF statistics to Central
  • The AirMatch service in Central calculates the optimal RF solution
    • AirMatch divides the network into separate partitions
    • AirMatch then calculates the optimal channel plan for each partition
    • AirMatch evaluates if the new channel plan for this partition is a sufficient improvement or not
    • If sufficiently improved, AirMatch pushes the solution to the access points at the scheduled time
  • Provides neighboring APs list to the Key Management Service
  • Provides AP partition information to the Live Upgrade Service

AirMatch Configuration

AirMatch was developed to operate with no user input, but instead based on readings taken from the RF network, and as such offers very little in terms of configuration. Constraining the parameters used to help fine tune the behavior is possible, but AirMatch should function correctly without any additional or specific configuration in most cases. Please consult the AOS 10 configuration guide to find this information.

Wireless Coverage Tuning

By default, wireless coverage tuning is set to Balanced. This can be adjusted by configuring a channel plan improvement quality threshold that ranges from 0% (aggressive) to 16% (conservative). The default Balanced setting represents an 8% quality improvement threshold.

To determine the channel plan improvement index, the average radio conflict metric is computed. For each radio of an AP, the channels that overlap with neighbors are identified, and path loss is used to calculate a weighted conflict value: the closer the AP with the overlapping channels, the lower the path loss and, consequently, the higher the conflict. After AirMatch computes a new channel plan, its conflict value is compared with that of the current operating network and an improvement percentage is calculated. If the improvement percentage is higher than or equal to the configured quality threshold (8% by default), the new channel plan is deployed to the APs at the scheduled time.
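A minimal sketch of the comparison described above, assuming a simple weighted-conflict function in which lower path loss between co-channel neighbors produces higher conflict; the weighting, the 120 dB reference, and the data shapes are illustrative, not the actual AirMatch metric.

```python
def plan_conflict(co_channel_pairs: list[dict]) -> float:
    """Sum a path-loss-weighted conflict over all co-channel neighbor pairs.

    Each pair is e.g. {"path_loss_db": 72.0}; lower path loss (closer APs on
    overlapping channels) contributes more conflict.
    """
    return sum(max(0.0, 120.0 - pair["path_loss_db"]) for pair in co_channel_pairs)

def should_deploy(current_pairs, proposed_pairs, threshold_pct: float = 8.0) -> bool:
    """Deploy the new plan only if it improves conflict by at least the threshold."""
    current = plan_conflict(current_pairs)
    proposed = plan_conflict(proposed_pairs)
    if current == 0:
        return False                                  # nothing to improve
    improvement_pct = (current - proposed) / current * 100.0
    return improvement_pct >= threshold_pct

# Example: conflict drops from 100 to 90, a 10% improvement, so the plan deploys.
print(should_deploy([{"path_loss_db": 20.0}], [{"path_loss_db": 30.0}]))  # True
```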

Channel Quality Aware AirMatch

Channel quality, which is represented as a percentage, is a weighted metric derived from key parameters that can affect the communication quality of a wireless channel, including noise, non-Wi-Fi (interferer) utilization and duty-cycles, and certain types of retries. Note that channel quality is not directly related to Wi-Fi channel utilization, as a higher quality channel may or may not be highly used.

4.8 - ClientMatch

ClientMatch is HPE Aruba Networking’s advanced service for maintaining peak connectivity for wireless clients.

The ClientMatch service continuously gathers RF performance metrics from mobile devices and uses this information to intelligently improve the client’s experience. Proactive and deterministic, ClientMatch dynamically optimizes Wi-Fi client performance, even while users roam and RF conditions change. If a mobile device moves out of range of an AP or RF interference impedes performance, ClientMatch steers the device to a better AP to maximize Wi-Fi performance.

ClientMatch is aware of each client's capabilities as well as the Wi-Fi environment, which puts it in the best position to maximize the user experience of each device, since it knows which radio each station is most likely to have the best experience on. In doing so, ClientMatch also improves the experience of the entire system, as slow clients on an access point affect the experience of other users.

ClientMatch does this by maintaining a list of the radios each station can see: effectively a database that records which access point radios have been able to see the client device, and at what signal level. This information is then used, together with a set of rulesets, to enhance the user's experience.

The main difference between AOS8 and AOS10 regarding ClientMatch is that the orchestration of this feature is now handled by Central in the cloud instead of on a Mobility Conductor.

Move Types

There are two types of moves: deauth moves and BSS Transition Message moves, also known as 802.11v message moves.

Deauth moves function by sending a de-authentication frame to a connected station and then not letting this station associate anywhere but to the desired radio for a duration of time after the frame is sent.

802.11v BSS Transition Messages are Action Frames sent by the AP to the station, suggesting that it should move to the indicated BSSID. Keep in mind that these frames are not mandatorily obeyed by the station and are sometimes ignored or rejected. If that happens five or more times, ClientMatch will trigger a deauth move instead.

Band Steer

When ClientMatch sees an authentication being attempted by a station on the 2.4 GHz radio and the station is known to be 5 GHz or even 6 GHz capable, ClientMatch will not let the client device connect to the 2.4 GHz band, effectively forcing it up to the more efficient bands.

ClientMatch will only attempt to move clients on 2.4 GHz with a signal worse than -45 dBm onto a target radio on 5 or 6 GHz with an RSSI better than -65 dBm.

6 GHz band steering can be disabled using the REST API.
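A minimal sketch of the band steer decision using the thresholds quoted above; the function name, inputs, and simple boolean return are illustrative, not the ClientMatch implementation.

```python
def should_band_steer(capable_5_or_6ghz: bool,
                      rssi_24ghz_dbm: float,
                      best_target_rssi_dbm: float) -> bool:
    """Decide whether to steer a 2.4 GHz client to a 5/6 GHz radio.

    Steer only when the client is 5/6 GHz capable, its 2.4 GHz signal is
    worse than -45 dBm, and the best 5/6 GHz target is better than -65 dBm.
    """
    return capable_5_or_6ghz and rssi_24ghz_dbm < -45 and best_target_rssi_dbm > -65

print(should_band_steer(True, -60, -58))   # True: weak on 2.4 GHz, strong 5/6 GHz target
print(should_band_steer(True, -40, -58))   # False: 2.4 GHz signal is not below -45 dBm
```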

Sticky steer

This feature comes into play when, for example, a station associates to an access point while within the prime coverage area, then moves away to the edge of that radio's coverage but does not roam voluntarily. ClientMatch is aware that another access point would be a better option for the station and is able to tell the client to move to the better candidate.

Load Balancing steer

This feature is most frequently used in high-density environments such as auditoriums. Load balancing aims to balance the number of clients on a per-radio basis, making sure that not all clients are connected to the same radio and that the load is instead split across the network.

This move type only uses 802.11v moves and does not attempt to do deauth moves as the clients do not fully understand the load component involved in the computation and might see a degradation in signal strength as a negative outcome.

This specific move type can be disabled using the REST API.

MU-MIMO steer

The goal is to move stations that are MU-MIMO capable onto the same radios so that the AP can leverage MU-MIMO, since two stations of appropriate types are necessary for an access point to perform MU-MIMO.

Un-steerable clients

Some stations refuse to cooperate with ClientMatch and will be put into an un-steerable station list for 48 hours if there have been three unsuccessful deauth steer attempts, or for 24 hours if the client device ignores more than five consecutive 802.11v moves.

This list of un-steerable clients can be viewed using the REST API. Client devices known not to support being steered by ClientMatch can be added permanently to the un-steerable list using the REST API.
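A minimal sketch of the un-steerable bookkeeping described above, using the 3-failure/48-hour and 5-ignore/24-hour values from the text; the class structure and method names are illustrative.

```python
import time

class UnsteerableList:
    """Track stations that should not be steered, with per-reason hold timers."""

    HOLD = {"deauth_failures": 48 * 3600, "ignored_11v": 24 * 3600}

    def __init__(self):
        self._until = {}   # MAC address -> epoch time when the entry expires

    def record(self, mac: str, deauth_failures: int = 0, ignored_11v: int = 0):
        # Three failed deauth steers -> 48 h hold; more than five ignored
        # 802.11v transition requests -> 24 h hold.
        if deauth_failures >= 3:
            self._until[mac] = time.time() + self.HOLD["deauth_failures"]
        elif ignored_11v > 5:
            self._until[mac] = time.time() + self.HOLD["ignored_11v"]

    def is_steerable(self, mac: str) -> bool:
        return time.time() >= self._until.get(mac, 0)
```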

Disabling ClientMatch

Should ClientMatch be suspected of causing issues in the network, or if there is a desire or requirement to allow the devices to choose the connected access point without attempts for steering, then ClientMatch can be disabled using the REST API.

ClientMatch Monitoring

In the Network Operations app, set the filter to Global.

  1. Under Alerts & Events, click Events. The Events page is displayed.

  2. Click on the CLICK HERE FOR ADVANCED FILTERING toggle to bring down the filtering options.

  3. Click the ClientMatch Steer option.

  4. Click the Filter button on the right.

Event viewer showing ClientMatch events in Central.

5 - Survivability

AOS-10 is dependent upon centralized services for normal operations. Any interruption of that communication can have an impact on the operations of AOS-10, so survivability of the network is a concern. The topic of survivability describes the impact of network outages on AOS-10 services, devices managed by Central, and client devices connecting to the network.

The purpose of this article is to capture the impact to clients, AOS-10 devices, and services if an outage prevents the AOS-10 devices from communicating with HPE Aruba Networking Central.

Client Devices

The following table captures the impact to client devices, roaming experience and traffic flows during an outage.

Feature or Service Impact
Application Visibility As enforcement is provided by access points (APs) and gateways, the classification and enforcement applications and categories will continue with no loss of functionality. Reporting of application visibility metrics to Central will be lost during the outage.
Authentication
  • Local / Centralized AAA: No impact if the AAA server is still reachable from the APs and/or gateways.
  • Cloud Authentication: As Cloud Authentication services reside in Central, new clients will be unable to join the network.
ClientMatch As ClientMatch services reside in Central, no client steers including sticky clients, band steering and load-balancing will occur. Clients will still be able to attach and roam, but in a less optimized environment.
Roaming Fast roaming is dependent on the Key Management Service (KMS) in Central to pre-distribute keys and user records. Some impact to fast roaming may occur:
  • Existing Clients: Will be able to fast roam to any neighboring APs that have previously received keys and user records. Roaming to non-neighboring APs will require a slow roam as keys and user records will not have been distributed.
  • New Clients: Will perform slow roaming only as keys and user records will no-longer be distributed.
UCC Reporting of call metrics to Central will be lost during the outage. Wi-Fi calling will also be unavailable if the service provider's evolved packet data gateway (ePDG) is unreachable.
WebCC As WebCC is dependent on BrightCloud, there may be some impact to traffic flows for unclassified applications when a cache miss occurs:
  • Gateways: If the gateways are the enforcement point, unclassified applications will be dropped if the Drop Packets during WebCC Miss option is enabled. Previously classified applications will be forwarded uninterrupted.
  • APs: If the APs are the enforcement point, unclassified applications will be dropped if a deny any rule is assigned to the user role. Previously classified applications will be forwarded uninterrupted.

Managed Devices

The following table captures the impact to AP and gateway management and data-plane during an outage:

Feature or Service Impact
AirGroup As the AirGroup service resides in Central, no new discovery information will be propagated to APs. Existing cached information will be maintained but not updated.
AirMatch As the AirMatch service resides in Central, APs will continue to function with their existing channel, channel bandwidth, and EIRP settings and will respond to high noise and radar events. The APs will not receive newly calculated channel plan assignments from the AirMatch service until they are able to reconnect to Central.
Cloud Connect See Tunnel Orchestration.
Clustering Gateway Clustering is dependent on the Group / Site configuration in addition to Tunnel Orchestration:
  • New gateways: Cluster will not be established.
  • Operational / Connected gateways: Existing clusters will continue to function with no interruption. However, you will not be able to add or subtract nodes from the cluster.
DRT As downloadable regulatory table (DRT) upgrades are dependent on Central, no DRT update will occur. APs will continue to function uninterrupted using their existing regulatory information.
IDPS All devices will continue to function uninterrupted using their existing signatures.
Licensing Devices will continue to function uninterrupted.
Mesh Mesh operation is dependent on configuration from Central:
  • New Mesh APs: Will lose ability to provision and configure new Mesh APs.
  • Existing Mesh APs: No impact to operation. AP and mesh metrics to Central will be lost during the outage.
MultiZone See Tunnel Orchestration.
One Touch Provisioning (OTP) As OTP is dependent on Activate and Central, no new APs or gateways can be provisioned.
Route Orchestration (ORO) As the Route Orchestration service resides in Central:
  • New Devices: No routes will be orchestrated.
  • Operational / Connected Devices: AOS-10 devices will continue to function, but no new routing updates will be received.
Tunnel Orchestration (OTO) As Tunnel Orchestration services reside in Central:
  • New Devices / Not Previously Connected: No tunnels can be orchestrated.
  • Operational / Connected Devices: Once tunnels have been orchestrated, AOS-10 devices will fall back to legacy IPsec re-key methodology for tunnel maintenance.
  • Operational / Rebooted Devices: AOS-10 devices will cache previous tunnel destinations and will utilize legacy IPsec methods for tunnel creation and maintenance.
Security Policies All devices will continue to function uninterrupted using their existing security policies.
Zero Touch Provisioning (ZTP) As ZTP is dependent on Activate and Central, no new APs or gateways can be provisioned.

Services

The following table captures the impact to services either consumed in Central or coordinated between Central and devices during an outage:

Feature or Service Impact
Configuration Management As configuration is dependent on Central, you will lose the ability to apply new configurations or make configuration changes to impacted devices. APs and gateway will continue to function using their existing configurations.
Events No events will be triggered for impacted devices during the outage.
Firewall Logging Reporting of firewall flows to Central will be lost during the outage.
IoT Operations As the IoT connector services reside in Central:
  • Lose IoT visibility in Aruba Central IoT Ops dashboard
  • Unable to add/remove AP to connector mapping
  • Unable to take actions in the app store, such as installing, uninstalling, or updating apps.
  • Unable to add more new connectors.
  • Previously established IoT integrations should function. (E.g., payloads will still continue to flow from AP to IoT Connector to partner endpoint.)
Location Services As location services, Visual RF and API services reside in Central:
  • AP and client locations for impacted sites will not be accessible on Visual RF dashboard in Central.
  • Location data for impacted devices and clients will not be available via the API.
Monitoring All monitoring information will be lost for impacted devices. This includes APs, gateways and Clients.
Presence Analytics Presence Analytics data in the Dashboard for impacted sites will not be updated.
RAPIDS
WIDS/WIPS
As the RAPIDS services reside in Central:
  • APs will retain existing WIDS/WIPS state (e.g. rogue, interferer, containment BSSID, etc.) until it ages out.
  • No coordination scanning between APs
  • No rogue triggers for wired / wireless containment.
  • No reporting or alerts.
  • All existing WIDS/WIPS state is lost if the AP is rebooted during the outage and it will not be able to perform any WIDS/WIPS activities.