
“For Your Reference” Version

Designing ISE for Scale & High Availability
Craig Hyps (chyps@cisco.com)
Senior Technical Marketing Engineer
BRKSEC-3699
Session Abstract
Cisco Identity Services Engine (ISE) delivers context-based access control for every
endpoint that connects to your network. This session will show you how to design ISE to
deliver scalable and highly available access control services for wired, wireless, and VPN
from a single campus to a global deployment.
Focus is on design guidance for distributed ISE architectures including high availability for all
ISE nodes and their services as well as strategies for survivability and fallback during
service outages. Methodologies for increasing scalability and redundancy will be covered
such as load distribution with and without load balancers, optimal profiling design, and the
use of Anycast.
Attendees of this session will gain knowledge on how to best deploy ISE to ensure peak
operational performance, stability, and to support large volumes of authentication activity.
Various deployment architectures will be discussed including ISE platform selection, sizing,
and network placement.
Cisco ISE & TrustSec Sessions: Building Blocks
• BRKSEC-3699: Designing ISE for Scale & High Availability (Mon 1:00pm, Wed 3:30pm)
• CCSSEC-2002: Cisco IT – Identity Services Engine (ISE) Deployment and Best Practices (Thurs 12:30pm)
• BRKSEC-2045: Mobile Devices and BYOD Security - Deployment and Best Practices (Tues 3:30pm)
• BRKSEC-3697: Advanced ISE Services, Tips and Tricks (Tues 1:00pm, Wed 1:00pm)
• BRKSEC-2695: Building an Enterprise Access Control Architecture using ISE and TrustSec (Mon 10:00am, Thur 8:00am)
• BRKSEC-2203: Deploying TrustSec Security Group Tagging (Wed 3:30pm)
• BRKCRS-2891: Enterprise Network Segmentation with Cisco TrustSec (Mon 8:00am)
• BRKSEC-3690: Advanced Security Group Tags: The Detailed Walk Through (Thur 10:00am)
• BRKSEC-2026: Network as a Sensor and Enforcer (Mon 1:00pm)
Important: Hidden Slide Alert

Look for this “For Your Reference” symbol in your PDFs.
There is a tremendous amount of hidden content for you to use later!

“Hear me now, believe me later”
*400 +/- slides in Reference PDF
Agenda

• Sizing Deployments and Nodes
• Scaling ISE Services
  • RADIUS, Auth Policy, AD, Guest, Web Services
  • Profiling and Database Replication
  • MnT (Optimize Logging and Noise Suppression)
• High Availability
  • Admin, MnT, pxGrid, IPN Node Failover
  • Certificate Services Redundancy
  • PSN Redundancy and Load Balancing
  • NAD Fallback and Recovery
• Summary
You take the blue pill – the story ends, you walk out of this room and believe whatever you want to believe.
This is your last chance. After this, there is no turning back.
Remember, all I'm offering is the truth, nothing more.
You take the red pill – you stay in this room, and I show you how deep the rabbit hole goes.
- The Matrix, 1999
Nodes and Personas
Secure Access Policy Server: Identity Services Engine
Policy Server Designed for Secure Access

[Diagram: ACS, Profiler, Guest Server, NAC Manager, NAC Server → Identity Services Engine]

• Centralized Policy
• RADIUS Server
• Secure Group Access
• Posture Assessment
• Guest Access Services
• Device Profiling
• Monitoring
• Troubleshooting
• Reporting
• Device Registration
• Supplicant and Cert Provisioning
• Mobile Device Management
• Context Sharing
Cisco Identity Services Engine (ISE)
All-in-One Enterprise Policy Control

• Security policy attributes: Who, What, Where, When, How
• Identity + Context → Cisco® ISE → Business-Relevant Policies
• Access methods: Wired, Wireless, VPN
• Endpoints: virtual machine client, IP device, guest, employee, and remote user

Replaces AAA and RADIUS, NAC, guest management, and device identity servers
Node Types
PSN, PAN, MnT, and pxGrid personas can run in a single host.

• Policy Service Node (PSN)
  – Makes policy decisions
  – RADIUS server & provides endpoint/user services
• Policy Administration Node (PAN)
  – Interface to configure policies and manage ISE deployment
  – Writeable access to the database
• Monitoring & Troubleshooting Node (MnT)
  – Interface to reporting and logging
  – Destination for syslog from other ISE nodes and NADs
• pxGrid Controller (PXG)
  – Facilitates sharing of information between network elements
• Inline Posture Node (IPN)
  – Enforces posture policy for legacy or 3rd-party NADs
ISE Policy Architecture

Policy Administration Node (PAN)
Writeable Access to the Database
• Interface to configure and view policies
• Responsible for policy sync across all PSNs and secondary PAN
• Provides:
  • Licensing
  • Admin authentication & authorization
  • Admin audit
• Each ISE deployment must have at least one PAN
• Only 1x Primary and 1x Secondary (Backup) PAN possible

[Diagram: PAN (Administration) connected to an AD/LDAP external ID store]
Policy Service Node (PSN)
RADIUS Server for the Network Access Devices
• Per policy decision, responsible for:
  • Network access (such as AAA RADIUS services)
  • Posture
  • Guest access (web portals)
  • Profiling
  • Client Provisioning
  • BYOD / MDM services
• Directly communicates to external identity store for user authentication
• Provides GUI for sponsors, agent download, guest access, device registration, and device on-boarding

[Diagram: NAD to PSN (RADIUS/Profiling); PSN to external ID store (AD/LDAP/RADIUS); PSN services: WebAuth, Posture/MDM, Client Provisioning]
Policy Synchronization
• Changes made via Primary PAN DB are automatically synced to Secondary PAN and all PSNs.
• Changes shown on the slide:
  • Policy change (Admin user)
  • Guest account creation
  • Device Profile update

[Diagram: Policy Sync from PAN (Primary) to PAN (Secondary) and to each PSN]
Network Access Device (NAD)
Also Known as the ‘RADIUS Client’
• Major Secure Access component that enforces network policies.
• NAD sends requests to the PSN and implements the resulting authorization decisions for resources.
• Common enforcement mechanisms:
  • VLAN Assignment
  • dACLs
  • Security Group Access (SGA)
• Basic NAD types:
  • Cisco Catalyst Switches
  • Cisco Wireless LAN Controllers
  • Cisco ASA “VPN Concentrator”
pxGrid Controller (PXG)
Context Data Sharing
• Enabled as pxGrid persona
  • Max 2 nodes
• Control plane to register Publisher/Subscriber topics
• Authorizes and sets up pxGrid client communications
• pxGrid clients subscribe to published topics of interest
• In ISE 1.3, ISE is the only controller and publisher
  • MnT publishes Session Directory
Inline Posture Node (IPN)
Special Case: NADs without CoA Support

[Diagram: VPN NAD to IPN (RADIUS); IPN to PSN (RADIUS)]

• Inline Enforcement:
  • Only needed in POSTURE use cases for NADs without RADIUS Change of Authorization and Sessionized URL Redirect support
  • Acts as a RADIUS Proxy in Bridged or Routed Gateway mode
• Inline Enforcement cannot be combined with other ISE roles
• Only supported on hardware appliances (33x5/3415); no VMware support
Monitoring and Troubleshooting Node (MnT)
Logging and Reporting
• MnT node receives logging from PAN, PSN, IPN, NAD, and ASA
• Each ISE deployment must have at least one MnT
• Max 1x Primary and 1x Secondary (Backup) MnT possible
• Syslog from access devices is correlated with user/device session
• Syslog from firewall is correlated with guest access session
• Syslog from other ISE nodes is sent to monitoring node for reporting
ISE Platforms
• Single ISE node (Appliance or VM) can run PAN, MnT, PSN, and pxGrid roles simultaneously
• For scaling beyond 10,000 endpoints, roles will need to be dedicated and distributed across multiple nodes
• Platforms shown: Cisco ISE hardware (33x5, 34x5) and virtual appliance on an ESX/ESXi host
Monitoring and Troubleshooting Node
Dashboard
[Screenshot: dashboard, PAN and MnT]

Monitoring and Troubleshooting Tools
• Validate NAD configuration
• Capture traffic destined for ISE
• Download debugs and support package
• Provide API for 3rd party applications:
  • Session Management
  • Troubleshooting
  • Change of Authorization
  • CRUD

ISE Reporting
Putting It All Together…
• Network Access Device (NAD): Access-layer devices; enforcement point for all policy
• Monitoring and Troubleshooting (MnT): Logging and reporting data
• Policy Service Node (PSN): The “Work-Horse”: RADIUS, Profiling, WebAuth, Posture, Sponsor Portal, Client Provisioning
• Policy Administration Node (PAN): All management UI activities & synchronizing all ISE nodes

Flows shown in the diagram:
• RADIUS from NAD to PSN
• RADIUS response from PSN to NAD
• RADIUS Accounting
• PSN queries external database directly
• Policy Sync
• syslog to MnT
Deployment Models and Sizing

Node Types
PSN, PAN, MnT, and pxGrid personas can run in a single host.

• Policy Service Node (PSN)
  – Makes policy decisions
  – RADIUS server & provides endpoint/user services
• Policy Administration Node (PAN)
  – Interface to configure policies and manage ISE deployment
  – Replication hub for all database config changes
• Monitoring & Troubleshooting Node (MnT)
  – Interface to reporting and logging
  – Destination for syslog from other ISE nodes and NADs
• pxGrid Controller (PXG)
  – Facilitates sharing of information between network elements
• Inline Posture Node (IPN)
  – Enforces posture policy for legacy or 3rd-party NADs
Standalone Deployment
All Personas on a Single Node: PAN, PSN, MnT
• Maximum endpoints – Platform dependent
  – 2,000 for 33x5
  – 5,000 for 3415
  – 10,000 for 3495
• Personas on the ISE node:
  – PAN: Policy Administration Node
  – MnT: Monitoring and Troubleshooting Node
  – PSN: Policy Service Node
  – PXG: pxGrid Node
ISE Design and Deployment Terms
• Persona Deployment
  – Standalone = All personas (Admin/MnT/Policy Service) located on same node
  – Distributed = Separation of one or more personas on different nodes
• Topological Deployment
  – Centralized = All nodes located in the same LAN/campus network
  – Distributed = One or more nodes located in different LANs/campus networks separated by a WAN
Basic 2-Node ISE Deployment (Redundant)
• Maximum endpoints – 10,000 (platform dependent, same as standalone)
• Redundant sizing – 10,000 (platform dependent, same as standalone)
• ISE Node 1: Primary PAN (Admin), Primary MnT (Monitoring), PSN, PXG
• ISE Node 2: Secondary PAN (Admin), Secondary MnT (Monitoring), PSN, PXG

Basic 2-Node ISE Deployment (Redundant)
Maximum Endpoints = 10,000 (Platform dependent)
• All services run on both ISE nodes
• Set one for Primary Admin / Primary MnT
• Set other for Secondary Monitoring / Secondary Admin
• Max endpoints is platform dependent:
  • 33x5 = Max 2k endpoints
  • 3415 = Max 5k endpoints
  • 3495 = Max 10k endpoints

[Diagram: Campus A with both ISE nodes, AD/LDAP (external ID/attribute store), HA Inline Posture Nodes for a non-CoA ASA VPN, WLC, 802.1X switch and AP; Branch A and Branch B each with an 802.1X switch and AP]
Distributed Persona Deployment
Admin + MnT on Same Appliance; Policy Service on Dedicated Appliance
• 2 x Admin+Monitor
• Max 5 PSNs
• Max endpoints – Platform dependent
  – 5,000 for 3355 or 3415 as PAN+MnT
  – 10,000 for 3395 or 3495 as PAN+MnT

Basic Distributed Deployment
Maximum Endpoints = 10,000 / Maximum 5 PSNs
• Dedicated Management Appliances
  • Primary Admin / Primary MnT
  • Secondary MnT / Secondary Admin
• Dedicated Policy Service Nodes: up to 5 PSNs
• No more than 10,000 endpoints supported
  • 3355/3415 as Admin/MnT = Max 5k endpoints
  • 3395/3495 as Admin/MnT = Max 10k endpoints

[Diagram: Data Center A with Admin (P)/MnT (P), a Policy Services cluster, AD/LDAP, HA Inline Posture Nodes for a non-CoA ASA VPN, WLC, 802.1X switch and AP; DC B with Admin (S)/MnT (S), distributed Policy Services, AD/LDAP, WLC, 802.1X switch and AP; Branch A and Branch B each with an 802.1X switch and AP]
Distributed Persona Deployment
Dedicated Appliance for Each Persona: Administration, Monitoring, Policy Service
• 2 x Admin
• 2 x Monitoring
• Max 40 PSNs
• Max endpoints (Platform dependent)
  – 100k using 3395 as PAN and MnT
  – 250k using 3495 as PAN and MnT

Fully Distributed Deployment
Maximum Endpoints = 250,000 / Maximum 40 PSNs
• Redundant, dedicated Administration and Monitoring nodes split across data centers (P=Primary / S=Secondary)
• Policy Service cluster for Wired/Wireless services at main campus
• Distributed Policy Service clusters for DR sites or larger campuses with higher-bandwidth, lower-latency interconnects
• Centralized PSN clusters for remote Wired/Wireless branch devices
• VPN/Wireless (non-CoA) at main campus via HA Inline Posture nodes

[Diagram: Data Center A with Admin (P), Monitor (P), a Policy Services cluster, AD/LDAP, HA Inline Posture Nodes for a non-CoA ASA VPN, WLC, 802.1X switch and AP; DC B with Admin (S), Monitor (S), distributed Policy Services, AD/LDAP, WLC, 802.1X switch and AP; Branch A and Branch B each with an 802.1X switch and AP]
Multi-Interface Routing

[Diagram: Data Center A with Admin (P), Monitor (P), a Policy Services cluster, AD/LDAP, DNS, NTP, SMTP, WLC, 802.1X switch and AP; DC B with Admin (S), Monitor (S), distributed Policy Services, AD/LDAP, DNS, NTP, SMTP, WLC, 802.1X switch and AP; Branch A and Branch B each with an 802.1X switch and AP]
Sizing Guidance for ISE Nodes
Determining Minimum Appliance Quantity and Platform Type

• All personas (PAN+MnT+PSN) running on single or redundant nodes
  – Max nodes by type: 2 Admin+MnT+PSN nodes
  – Max endpoints for entire deployment: 2k with ISE-33x5; 5k with SNS-3415; 10k with SNS-3495
• Administration and Monitoring co-located on single or redundant nodes; dedicated Policy Service nodes
  – Max nodes by type: 2 Admin+MnT nodes; 5 Policy Service nodes
  – Max endpoints for entire deployment: 5k with ISE-3355 or SNS-3415 for PAN+MnT; 10k with ISE-3395 or SNS-3495 for PAN+MnT
• Dedicated Administration node(s), dedicated Monitoring node(s), dedicated Policy Service nodes
  – Max nodes by type: 2 Admin nodes; 2 MnT nodes; 40 Policy Service nodes
  – Max endpoints for entire deployment: 100k with ISE-3395 for PAN and MnT; 250k with SNS-3495 for PAN and MnT
Scaling by Deployment, Platform, and Persona
Max Concurrent Endpoint Counts by Deployment Model and Platform

Deployment Model                          Platform               Max # Endpoints per Deployment   Max # Dedicated PSNs
Standalone (all personas on same node)    33xx                   2,000                            0
(2 nodes redundant)                       3415                   5,000                            0
                                          3495                   10,000                           0
Admin + MnT on same node; Dedicated PSN   3355 as Admin+MnT      5,000                            5
(Minimum 4 nodes redundant)               3395 as Admin+MnT      10,000                           5
                                          3415 as Admin+MnT      5,000                            5
                                          3495 as Admin+MnT      10,000                           5
Dedicated Admin and MnT nodes             3395 as Admin and MnT  100,000                          40
(Minimum 6 nodes redundant)               3495 as Admin and MnT  250,000                          40

Scaling per PSN                           Platform               Max # Endpoints per PSN
Dedicated Policy nodes                    ISE-3315               3,000
(Max Endpoints Gated by Total             ISE-3355               6,000
Deployment Size)                          ISE-3395               10,000
                                          SNS-3415               5,000
                                          SNS-3495               20,000
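
The deployment limits on this slide can be expressed as a simple lookup. The following is a minimal Python sketch, not a Cisco tool: the name `smallest_deployment` and the row layout are hypothetical, and the numbers are copied from the slide.

```python
# Limits copied from the slide; names and layout are illustrative only.
# Each row: (deployment model, PAN/MnT platform, max endpoints, max dedicated PSNs)
DEPLOYMENT_LIMITS = [
    ("standalone", "33xx", 2_000, 0),
    ("standalone", "3415", 5_000, 0),
    ("standalone", "3495", 10_000, 0),
    ("admin+mnt on same node", "3355", 5_000, 5),
    ("admin+mnt on same node", "3415", 5_000, 5),
    ("admin+mnt on same node", "3395", 10_000, 5),
    ("admin+mnt on same node", "3495", 10_000, 5),
    ("dedicated admin and mnt", "3395", 100_000, 40),
    ("dedicated admin and mnt", "3495", 250_000, 40),
]


def smallest_deployment(endpoints):
    """Return the first row (simplest model first) whose limit covers the request."""
    for row in DEPLOYMENT_LIMITS:
        if endpoints <= row[2]:
            return row
    raise ValueError("request exceeds the 250,000-endpoint maximum on the slide")


if __name__ == "__main__":
    print(smallest_deployment(8_000))
    print(smallest_deployment(50_000))
```

The lookup only reflects endpoint counts; per-PSN capacity, redundancy, and latency still have to be checked separately.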
Policy Service Node Sizing
Physical and Virtual Appliance Guidance
• Max Endpoints Per Appliance for Dedicated PSN

Form Factor   Platform Size   Appliance             Maximum Endpoints
Physical      Small           ISE-3315 / ACS-1121   3,000
Physical      Medium          ISE-3355              6,000
Physical      Large           ISE-3395              10,000
Physical      Small (New)     SNS-3415              5,000
Physical      Large (New)     SNS-3495              20,000
Virtual       S/M/L           VM                    3,000-20,000*

* General VM appliance sizing guidance:
1) Select physical appliance that meets required persona and scaling requirements
2) Configure VM to match or exceed the ISE physical appliance specifications

• Inline Posture Specifications
  – Max endpoints per any appliance (gated by policy service): 3,000-10,000
  – Max throughput per any appliance: 936 Mbps
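
As a rough worked example of the per-PSN limits, the minimum number of dedicated PSNs is the endpoint count divided by the per-appliance maximum, rounded up. This Python sketch is an illustration under assumptions: `min_psn_count` and the `spare_nodes` parameter are hypothetical names, and adding spare nodes for redundancy is a design choice rather than a rule stated on the slide.

```python
import math

# Per-PSN maximums copied from the slide.
PSN_MAX_ENDPOINTS = {
    "ISE-3315": 3_000,
    "ISE-3355": 6_000,
    "ISE-3395": 10_000,
    "SNS-3415": 5_000,
    "SNS-3495": 20_000,
}


def min_psn_count(endpoints, platform, spare_nodes=0):
    """Ceiling of endpoints / per-PSN capacity, plus optional spare nodes."""
    per_psn = PSN_MAX_ENDPOINTS[platform]
    return math.ceil(endpoints / per_psn) + spare_nodes


if __name__ == "__main__":
    print(min_psn_count(50_000, "SNS-3495"))                 # capacity only
    print(min_psn_count(50_000, "SNS-3415", spare_nodes=1))  # with one spare
```

The result is still gated by the total deployment size and the maximum PSN count for the chosen deployment model.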
New Appliance Hardware Specifications
Basis for VMware Appliance Sizing and Redundancy
• ISE 34x5 Appliance Specifications
http://www.cisco.com/c/en/us/td/docs/security/ise/1-2/installation_guide/ise_ig/ise_ovr.html#wp1122898

Platform           SNS-3415 (SNS-Small)                   SNS-3495 (SNS-Large)
Processor          1 x QuadCore Intel Xeon CPU E5-2609    2 x QuadCore Intel Xeon CPU E5-2609
                   @ 2.40 GHz (4 total cores)             @ 2.40 GHz (8 total cores)
Memory             16 GB                                  32 GB
Hard disk          1 x 600-GB 10k SAS HDD                 2 x 600-GB 10k SAS HDDs
                   (600 GB total disk space)              (600 GB total disk space)
RAID               No                                     Yes (slide shows both “RAID 0+1” and “RAID 1”)
Ethernet NICs      4 x Integrated Gigabit NICs            4 x Integrated Gigabit NICs
Redundant Power?   No                                     Yes

Notes:
• SNS appliances have a unique UDI from manufacturing. If you wish to use a UCS appliance other than these SKUs, you must deploy it as a VM.
• Disk: 200GB minimum, but MnT sizing is typically much higher for logging retention.
• VMs can be configured with 1-4 NICs. Recommend allowing for 2 or more NICs.
Sizing Production VMs to Physical Appliances
Summary

Appliance used for sizing comparison   CPU # Cores   CPU Clock Rate* (GHz)   Memory (GB)   Physical Disk (GB)**
SNS Large (ISE-3495)                   8             2.4                     32            600
SNS Small (ISE-3415)                   4             2.4                     16            600
ISE Large (ISE-3395)                   8             2.0                     4             600
ISE Medium (ISE-3355)                  4             2.0                     4             600
ISE Small (ACS-1121/ISE-3315)          4             2.66                    4             500

* Minimum VM processor clock rate = 2.0GHz per core (same as OVA)
** Actual disk requirement is dependent on persona(s) deployed and other factors. See slide on Disk Sizing.
Configuring CPUs in VMware
• ESXi 4.1 Example
• ESXi 5.x Example

Configure CPU based on cores. If HT is enabled, logical CPUs are effectively doubled, but the number of cores is the same.
Setting CPU and Memory Allocations in VMware
Guest VM Resource Reservations and Limits
• CPU Example
  – Set Reservation to minimum VM appliance specs to ensure required CPU resources are available and not shared with other VMs.
  – Optionally set CPU allocation limit >= Min ISE VM specs to prevent over-allocation when actual CPU assigned exceeds ISE VM requirements.
• Memory Example
  – Similar settings apply to Max Allocation and Min Reservations for Memory.
VMware OVA Templates (New in ISE 1.3)
• OVA Templates map to Small and Large hardware appliances
  • EVAL (Evaluation / Lab testing)
  • SNS-3415 (Small)
  • SNS-3495 (Large)
• Simplifies VM deployment
• Ensures proper VMware settings. Presets:
  • vCPU cores
  • Memory With Reservations
  • Disk Storage
  • Network Interfaces

ISE-1.3.x.x-Eval-100-endpoint.ova:
• 2 CPU cores @ 2.3 GHz each
• 4 GB RAM
• 200 GB disk
• 4 NICs

ISE-1.3.x.x-Virtual-SNS-3415.ova:
• 4 CPU cores @ 2 GHz each
• 16 GB RAM
• 600 GB disk
• 4 NICs

ISE-1.3.x.x-Virtual-SNS-3495.ova:
• 8 CPU cores @ 2 GHz each
• 32 GB RAM
• 600 GB disk
• 4 NICs
ISE Virtual OS and NIC Support
• ISE 1.3
  • VMware ESX 4.x
  • VMware ESX 5.x
• ISE 1.4
  • VMware ESX 5.x only

Notes for VMware Virtual Appliance installs using ISO image (OVA recommended):
• Choose Redhat Linux 6 (64-bit)
• Manually enter resource reservations
• Choose either E1000 or VMXNET3 (default)

• ESX Adapter Ordering Based on NIC Selection

ADE-OS   ISE   E1000   VMXNET3
eth0     GE0   1       4
eth1     GE1   2       1
eth2     GE2   3       2
eth3     GE3   4       3

Note: NIC ordering is the same if fewer than 4 VM NICs are configured. OVAs configure with 4 NICs, so E1000 NICs are configured to avoid confusion.
ISE VM Disk Storage Requirements
Minimum Disk Sizes by Persona

Persona               Disk (GB)
Standalone            200+*
Administration Only   200-300**
Monitoring Only       200+*
Policy Service Only   200
Admin + MnT           200+*
Admin + MnT + PSN     200+*

* Upper range sets the number of days of MnT log retention
  • Min recommended disk for MnT = 600GB
  • Max hardware appliance disk size = 600GB
  • Max virtual appliance disk size = 2TB
** Variations depend on where backups are saved or upgrade files staged (local or repository), debug, local logging, and data retention requirements.
MnT Node Log Storage Requirements
Days Retention Based on # Endpoints and Disk Size

                  Total Disk Space Allocated to MnT Node
Total Endpoints   200 GB   400 GB   600 GB   1024 GB   2048 GB
10,000            126      252      378      645       1,289
20,000            63       126      189      323       645
30,000            42       84       126      215       430
40,000            32       63       95       162       323
50,000            26       51       76       129       258
100,000           13       26       38       65        129
150,000           9        17       26       43        86
200,000           7        13       19       33        65
250,000           6        11       16       26        52
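
The retention figures appear to scale linearly with disk size and inversely with endpoint count. Assuming that relationship holds (it is inferred from the table, not stated on the slide), values between the listed rows can be estimated by scaling the 10,000-endpoint / 200 GB cell. The function name `estimate_retention_days` is hypothetical, and results can differ from the slide by a day or so because of the slide's own rounding.

```python
def estimate_retention_days(endpoints, disk_gb):
    """Estimate MnT log retention by scaling the table's first cell.

    Assumption: days are proportional to disk size and inversely
    proportional to the number of endpoints.
    """
    base_days, base_endpoints, base_disk_gb = 126, 10_000, 200
    return base_days * (disk_gb / base_disk_gb) * (base_endpoints / endpoints)


if __name__ == "__main__":
    # A deployment size that is not listed in the table.
    print(round(estimate_retention_days(75_000, 600)))
```

Treat the output as a planning estimate only; actual retention depends on logging volume.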
ISE VM Provisioning & Disk IO Guidance
• VMotion officially supported in ISE 1.2
• Thin Provisioning officially supported in ISE 1.3 (recommend Thick Provisioning for MnT)
• Hyper-Threading not required, but can increase TPS
• IO Performance Requirements:
  – Read 300+ MB/sec
  – Write 50+ MB/sec
• Recommended disk/controller:
  – 10k RPM+ disk drives
  – Caching RAID Controller
  – RAID mirroring (slower writes using RAID 5*)
• Starting in ISE 1.3: no more storage media and file system restrictions. For example, VMFS is not required and NFS is allowed provided storage is supported by VMware and meets ISE IO performance requirements.
• Customers with VMware expertise may choose to disable resource reservations and over-subscribe, but do so at their own risk.

*RAID performance levels: http://www.datarecovery.net/articles/raid-level-comparison.html


ISE Disk IO Performance Testing
Sample Tests With and Without RAID Controller Caching
• Caching Disabled
  • Average Write ~ 25 MB/s
• Caching Enabled
  • Average Write ~ 300 MB/s
  • > 10x increase

ISE Disk IO Performance Testing
Sample Tests Using Different RAID Config and Provisioning Options
• 2x Write performance increase using Eager vs Lazy Zero
  • Note: IO performance equalizes once disk blocks written
• 5x Write performance increase using RAID 10 vs RAID 5

#   RAID Config                   Read       Write      Write Perf ↑ over 1   Write Perf ↑ over 2   Write Perf ↑ over 3
1   RAID 5: 4-Disk Lazy Zero      697 MB/s   9 MB/s     NA                    NA                    NA
2   RAID 5: 4-Disk Eager Zero     713 MB/s   16 MB/s    78% (~2x)             NA                    NA
3   RAID 10: 4-Disk Eager Zero    636 MB/s   78 MB/s    767% (~10x)           388% (~5x)            NA
4   RAID 10: 8-Disk Eager Zero    731 MB/s   167 MB/s   1756% (~20x)          944% (~10x)           114% (~2x)

• Read performance is roughly the same
• Write performance is impacted by RAID config
VM Appliance Resource Validation Before Install
Validate VM readiness BEFORE install & deploy.

VM Appliance Resource Validation During Install
• ISE 1.3 install will not even proceed without (EVAL settings):
  • 4GB RAM
  • 2 CPU Cores
  • 100GB Disk
VM Appliance Resource Validation After Install
• ISE continues to test I/O read/write performance on intervals

ise-psn2/admin# show tech | begin "disk IO perf"


Measuring disk IO performance
*****************************************
Average I/O bandwidth writing to disk device: 194 MB/second
Average I/O bandwidth reading from disk device: over 1024 MB/second
I/O bandwidth performance within supported guidelines
Disk I/O bandwidth filesystem test, writing 300 MB to /opt:
314572800 bytes (315 MB) copied, 1.47342 s, 213 MB/s
Disk I/O bandwidth filesystem read test, reading 300 MB from /opt:
314572800 bytes (315 MB) copied, 0.0504592 s, 6.2 GB/s

Alarm generated if 24-hr average is below requirements
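
For scripted checks, the two "Average I/O bandwidth" lines of the `show tech` output can be parsed and compared with the IO requirements stated earlier (read 300+ MB/sec, write 50+ MB/sec). This is a minimal Python sketch under assumptions: the function names are hypothetical, the regular expression is based only on the sample output shown on this slide, and it evaluates a single sample rather than the 24-hour average that ISE alarms on.

```python
import re

MIN_WRITE_MBPS = 50    # "Write 50+ MB/sec" from the IO guidance slide
MIN_READ_MBPS = 300    # "Read 300+ MB/sec" from the IO guidance slide

# Matches lines such as:
#   Average I/O bandwidth writing to disk device: 194 MB/second
#   Average I/O bandwidth reading from disk device: over 1024 MB/second
PATTERN = re.compile(
    r"Average I/O bandwidth (writing to|reading from) disk device:"
    r"\s*(?:over\s+)?(\d+) MB/second"
)


def parse_disk_io(text):
    """Return {'write': MB/s, 'read': MB/s} for the lines that are present."""
    result = {}
    for direction, value in PATTERN.findall(text):
        key = "write" if direction.startswith("writing") else "read"
        result[key] = int(value)
    return result


def meets_requirements(io):
    """True if both sampled averages meet the documented minimums."""
    return io["write"] >= MIN_WRITE_MBPS and io["read"] >= MIN_READ_MBPS


SAMPLE = """Average I/O bandwidth writing to disk device: 194 MB/second
Average I/O bandwidth reading from disk device: over 1024 MB/second"""

if __name__ == "__main__":
    io = parse_disk_io(SAMPLE)
    print(io, meets_requirements(io))
```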
General ISE VM Configuration Guidelines
• Oversubscription of CPU, Memory, or Disk storage NOT recommended – All
VMs should have 1:1 mapping between virtual hardware and physical hardware.
• CPU: Map 1 VM vCPU core to 1 physical CPU core.
• Total CPU allocation should be based on physical CPU cores, not logical cores, even if HT is enabled.
• Memory: Sum of VM vRAM may not exceed total physical memory on the
physical server.
• Additional 1 GB+ of physical RAM must be provisioned for VMware ESXi itself (this is to
cover ESXi overhead to run VMs) *See Notes Page for details.
• Disk: Map 1 GB of VM vDisk to 1 GB of physical storage.
• Additional disk space may be required for VMware operations including snapshots.
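
The 1:1 mapping guidelines above can be checked per ESXi host before deploying ISE VMs. The following Python sketch is illustrative only: the `VM` class, `check_host`, and its parameter names are hypothetical, and the 1 GB ESXi overhead is the minimum figure quoted above.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    vcpus: int
    ram_gb: int
    disk_gb: int


def check_host(vms, physical_cores, physical_ram_gb, physical_disk_gb,
               esxi_overhead_gb=1):
    """Return a list of oversubscription problems (empty list means none)."""
    problems = []
    # CPU: count physical cores only, never hyper-threaded logical cores.
    if sum(vm.vcpus for vm in vms) > physical_cores:
        problems.append("CPU oversubscribed")
    # Memory: sum of vRAM plus ESXi overhead must fit in physical RAM.
    if sum(vm.ram_gb for vm in vms) + esxi_overhead_gb > physical_ram_gb:
        problems.append("memory oversubscribed")
    # Disk: 1 GB of vDisk maps to 1 GB of physical storage.
    if sum(vm.disk_gb for vm in vms) > physical_disk_gb:
        problems.append("disk oversubscribed")
    return problems


if __name__ == "__main__":
    large = [VM("ise-pan", 8, 32, 600), VM("ise-mnt", 8, 32, 600)]
    # 64 GB of vRAM on a 64 GB host leaves no room for ESXi itself.
    print(check_host(large, physical_cores=16, physical_ram_gb=64,
                     physical_disk_gb=1200))
```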

Large Deployments – Bandwidth and Latency
• Bandwidth most critical between:
• PSNs and Primary PAN (DB Replication)
• PSNs and MnT (Audit Logging)
• Latency most critical between PSNs and Primary PAN.

• Max round-trip (RT) latency between any two nodes in ISE 1.2-1.4: 200ms
• RADIUS generally requires much less bandwidth and is more tolerant of higher latencies. Actual requirements are based on many factors including # endpoints, auth rate, and protocols.

[Diagram: PAN/MnT pairs and PSN clusters in multiple locations; WLC and switch send RADIUS to PSNs]
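
A quick way to apply the 200 ms limit is to compare measured round-trip times between ISE nodes against it. This Python sketch assumes the measurements already exist; `links_over_limit` and the node names are hypothetical.

```python
MAX_RTT_MS = 200  # max round-trip latency between any two ISE 1.2-1.4 nodes


def links_over_limit(rtt_ms):
    """rtt_ms maps (node_a, node_b) to a measured round-trip time in ms.

    Returns the node pairs that exceed the limit, sorted for stable output.
    """
    return sorted(pair for pair, rtt in rtt_ms.items() if rtt > MAX_RTT_MS)


if __name__ == "__main__":
    measured = {
        ("pan-dc1", "psn-emea"): 120,
        ("pan-dc1", "psn-apac"): 260,
    }
    print(links_over_limit(measured))
```

Any pair returned is a candidate for a separate ISE instance or for centralizing the PSNs where latency is lower.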
What if Distributed PSNs > 200ms RTT Latency?

Option #1: Deploy Separate ISE Instances
(Per-Instance Latency < 200ms)
[Diagram: separate ISE deployments, each with its own PAN, MnT, and PSNs serving local WLCs and switches over RADIUS; links within each deployment are < 200 ms, links between deployments are > 200 ms]

Option #2: Centralize PSNs Where Latency < 200ms
[Diagram: remote switches and WLCs send RADIUS across > 200 ms links to PSNs located where inter-node latency is < 200 ms]
Deploy Local Standalone ISE Nodes as “Standby”
Local standalone nodes can be deployed to remote locations to serve as local backups in case of WAN failure, but they will not be synced to the centralized deployment.

Access Devices Fallback to Local PSNs on WAN Failure
• Access devices point to local ISE nodes as tertiary RADIUS servers.
• Backup nodes are only used if the WAN fails.
• Standalone ISE nodes can still log to centralized MnT nodes.
  – Use TCP Syslog to buffer logs

More on NAD fallback and recovery strategies under the High Availability section.
ISE 1.2 Bandwidth Calculator (Single-Site)
Please contact your Certified ATP Partner/SE to request a WAN bandwidth analysis for your ISE design / deployment.

ISE 1.2 Bandwidth Calculator (Multi-Site)
Note: Bandwidth required for RADIUS traffic is not included. Calculator is focused on inter-ISE node bandwidth requirements.
Please contact your Certified ATP Partner/Cisco SE to request a WAN bandwidth analysis for your ISE design and deployment.

ISE 1.2 Bandwidth Calculator (Multi-Site) (continued)
ISE 1.2 Bandwidth Calculator Assumptions
• ISE Auth Suppression enabled
• Profiling Whitelist Filter enabled
• One node group per location
• Max round-trip latency between any two ISE 1.2/1.3/1.4 nodes is currently set at 200ms
• For Single-Site calculation, primary PAN and MnT nodes are deployed in primary DC to which bandwidth is calculated; for Multi-Site calculation, primary PAN is deployed in primary DC.
• Mobile endpoints authenticate/reauthenticate as frequently as 10/hr and refresh IP 1/hr
• Non-Mobile endpoints authenticate/reauthenticate no more than once per Reauth Interval and refresh IP address no more than once per DHCP renewal (1/2 Lease Period)
• Bandwidth required for NAD or Guest Activity logging is not included. These logging activities are highly variable and should be treated separately based on deployment requirements.
• Bandwidth required for general RADIUS auth and accounting traffic is not included. RADIUS traffic is generally less significant but actual requirement is highly contingent on multiple factors including total active endpoints, reauth intervals, and the authentication protocols used.
• Deployments where all ISE nodes are deployed in one location are not considered by this calculator. All nodes deployed in the same location are assumed to be connected by high-speed LAN links (Gigabit Ethernet or higher).
Scaling ISE Services
Scaling ISE Services Agenda

• AAA and Auth Policy Tuning


• Active Directory Integration
• Guest and Web Authentication
• Profiling and Database Replication
• MnT (Optimize Logging and Noise Suppression)
Scaling RADIUS, Web, and Profiling Services w/ LB
• Policy Service nodes can be configured in a cluster behind a load balancer (LB).
• Access Devices send RADIUS AAA requests to LB virtual IP.

• Load balancing is covered under the High Availability section.

[Diagram: Network Access Devices send RADIUS to the Virtual IP on the Load Balancers, which front the PSNs (RADIUS Servers)]
Auth Policy Optimization
Leverage Policy Sets to Organize and Scale Policy Processing

• Each Policy Set has a Policy Set Condition and its own Authentication and Authorization policies
• Enable Policy Sets under Administration > System > Settings > Policy Sets
Search Speed Test
• Find the object where…
• Total stars = 10
• Total green stars = 4
• Total red stars = 2
• Outer shape = Red Circle
Auth Policy Optimization
Avoid Unnecessary External Store Lookups
• Policy Logic:
  o First Match, Top Down
  o Skip Rule on first negative condition match
• More specific rules generally at top
• Try to place more “popular” rules before less used rules.

Example of a Poor Rule: Employee_MDM
• All lookups to External Policy and ID Stores performed first, then local profile match!
Auth Policy Optimization
Rule Sequence and Condition Order is Important!

Example #1: Employee
1. Endpoint ID Group
2. Authenticated using AD?
3. Auth method/protocol
4. AD Group Lookup

Example #2: Employee_CWA
1. Location (Network Device Group)
2. Web Authenticated?
3. Authenticated via LDAP Store?
4. LDAP Attribute Comparison
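
The effect of condition order can be illustrated with a small model of "first match, top down, skip rule on first negative condition". This Python sketch is not ISE's policy engine; `evaluate`, the condition functions, and the session fields are hypothetical, and the counter simply stands in for a round trip to an external identity store.

```python
lookups = {"external": 0}  # counts simulated external ID store queries


def is_workstation(session):
    """Cheap local check, e.g. an endpoint identity group match."""
    return session["endpoint_group"] == "Workstation"


def ad_group_is_employee(session):
    """Stand-in for an expensive AD group lookup."""
    lookups["external"] += 1
    return session["ad_group"] == "Employees"


def evaluate(rules, session):
    """Return (matched rule name or None, number of conditions checked)."""
    checks = 0
    for name, conditions in rules:        # rules are processed top down
        for condition in conditions:      # conditions in listed order
            checks += 1
            if not condition(session):
                break                     # first negative condition: skip rule
        else:
            return name, checks           # all conditions held: first match wins
    return None, checks


good_order = [("Employee", [is_workstation, ad_group_is_employee])]
poor_order = [("Employee", [ad_group_is_employee, is_workstation])]

if __name__ == "__main__":
    printer = {"endpoint_group": "Printer", "ad_group": "Employees"}
    print(evaluate(good_order, printer), lookups)  # no external lookup needed
    print(evaluate(poor_order, printer), lookups)  # external lookup done first
```

With the local condition first, the rule is skipped before any external lookup happens, which is the point of both examples above.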
Enable EAP Session Resume / Fast Reconnect
Major performance boost, but not a complete auth, so avoid excessive timeout values

Cache TLS (TLS Handshake Only/Skip Cert)

Cache TLS session

Skip inner method


Scaling AD Integration

Scaling AD Integration w/ Sites & Services
How do I ensure Local PSN is connecting to Local AD controller?
• Without Sites & Services properly configured: the PSNs in Site ‘X’ and Site ‘Y’ each ask “Which AD server should I connect to?”
• With Sites & Services properly configured: the PSN in Site ‘X’ says “I will connect with local AD server X!” and the PSN in Site ‘Y’ says “I will connect with local AD server Y”
AD Sites and Services
Links AD Domain Controllers to Client IP Networks

DNS and DC Locator Service work together to return list of “closest” Domain Controllers based on client Site (IP address)
Multi-Forest Active Directory Support (New in ISE 1.3)
Scales AD Integration through Multiple Join Points and Optimized Lookups
• Join up to 50 Forests or Domains without mutual trusts
• No need for 2-way trust relationship between domains
• Advanced algorithms for dealing with identical usernames
• SID-Based Group Mapping
• PAP via MS-RPC
• Support for disjointed DNS namespace

[Diagram: ISE joined to domain-1.com, domain-2.com, domain-n.com]

AD Authentication Flow
AuthC Policy to AD → Scope (Optional) → AD Instance → Domain List (Optional) → Identity Rewrite (Optional) → Target AD
Authentication Domains (Whitelisting)
• “Whitelist” only the domains of interest: those used for authentication!
• In this example, the join point can see many trusted domains but we only care about r1.dom
• Enable r1.dom and disable the rest
Authentication Domains – Unusable Domains For Your
Reference

• Domains that are unusable, e.g. 1-way trusts, are hidden automatically
• There’s an option to reveal these and see the reason
Run the AD Diagnostic Tool
Check AD Joins at Install & Periodically to Verify Potential AD Connectivity Issues

• The DNS SRV errors can actually mean something else
• The response was too big…and retried with TCP, etc.
• A sniffer can confirm
• AD Sites or DNS configuration changes are required to optimize this
Validating DNS from ISE node CLI

• Checking SRV records for Global Catalogs (GC)


psn/admin# nslookup _ldap._tcp.gc._msdcs.myADdomainName querytype SRV

• Checking SRV records for Domain Controllers (DC)


psn/admin# nslookup _ldap._tcp.dc._msdcs.myADdomainName querytype SRV

• More details on Microsoft AD DNS queries:


https://technet.microsoft.com/en-us/library/cc959323.aspx
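
The two lookups above can be generated for any domain. The following is a minimal Python sketch, not an ISE tool: the function names are hypothetical, the site-specific record names follow the `_sites` naming convention documented by Microsoft for AD DNS, and the GC records are assumed to live under the forest root domain.

```python
def ad_srv_names(domain, site=None):
    """Build the DNS SRV record names used to locate AD GCs and DCs.

    `domain` is the AD DNS domain (for GC records, assumed to be the
    forest root). `site` is an optional AD site name.
    """
    names = {
        "gc": f"_ldap._tcp.gc._msdcs.{domain}",
        "dc": f"_ldap._tcp.dc._msdcs.{domain}",
    }
    if site:
        # Site-specific records let a client find controllers in its own site.
        names["gc_site"] = f"_ldap._tcp.{site}._sites.gc._msdcs.{domain}"
        names["dc_site"] = f"_ldap._tcp.{site}._sites.dc._msdcs.{domain}"
    return names


def ise_nslookup_commands(domain, site=None):
    """Format each record name as the ISE CLI command shown on the slide."""
    return [f"nslookup {name} querytype SRV"
            for name in ad_srv_names(domain, site).values()]


if __name__ == "__main__":
    for cmd in ise_nslookup_commands("myADdomainName", site="SiteX"):
        print(cmd)
```

Running the site-specific queries from each PSN is one way to confirm that Sites and Services returns local controllers for that PSN's subnet.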
AD Integration Best Practices
Cisco Live Online: www.ciscolive.com/online
BRKSEC-2132 - What's New in ISE Active Directory Connector by Chris Murray

• DNS servers in ISE nodes must have all relevant AD records (A, PTR, SRV)
• Ensure NTP configured for all ISE nodes and AD servers
• Configure AD Sites and Services (with ISE machine accounts configured for relevant Sites)
• Configure Authentication Domains (Whitelist domains used) (ISE 1.3)
• Use UPN/fully qualified usernames when possible to expedite user lookups
• Use AD indexed attributes* when possible to expedite attribute lookups
• Run Diagnostics from ISE Admin interface to check for issues.

* Microsoft AD Indexed Attributes:


http://msdn.microsoft.com/en-us/library/ms675095%28v=vs.85%29.aspx
http://technet.microsoft.com/en-gb/library/aa995762%28v=exchg.65%29.aspx
Scaling Guest and Web Authentication Services
Scaling Global Sponsor / MyDevices
Global Load Balancers / “Smart DNS” Example

DNS SERVER: DOMAIN = COMPANY.COM
SPONSOR     10.1.0.100, 10.2.0.100, 10.3.0.100
MYDEVICES   10.1.0.100, 10.2.0.100, 10.3.0.100
ISE-PSN-1   10.1.1.1
ISE-PSN-2   10.1.1.2
ISE-PSN-3   10.1.1.3
ISE-PSN-4   10.2.1.4
ISE-PSN-5   10.2.1.5
ISE-PSN-6   10.2.1.6
ISE-PSN-7   10.3.1.7
ISE-PSN-8   10.3.1.8
ISE-PSN-9   10.3.1.9

[Figure: PAN and MnT pairs with a Global LB; three groups of three PSNs, each behind a Local LB with VIPs 10.1.0.100, 10.2.0.100, and 10.3.0.100.]

Use Global Load Balancing / intelligent DNS to direct traffic to closest VIP. Local web load balancing distributes each request to a single PSN.

Load Balancing simplifies and scales ISE Web Portal Services
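
The “closest VIP” decision can be sketched in a few lines of Python. This is an illustration only, not how any particular global load balancer works: the mapping of client address space to sites (10.1.0.0/16, 10.2.0.0/16, 10.3.0.0/16) and the function name are assumptions; only the three VIP addresses come from the example.

```python
import ipaddress

# Hypothetical mapping of client address space to each site's Local LB VIP.
SITE_VIPS = [
    (ipaddress.ip_network("10.1.0.0/16"), "10.1.0.100"),
    (ipaddress.ip_network("10.2.0.0/16"), "10.2.0.100"),
    (ipaddress.ip_network("10.3.0.0/16"), "10.3.0.100"),
]


def resolve_portal(client_ip, healthy_vips=None):
    """Return portal VIPs ordered closest-first, skipping unhealthy VIPs.

    `healthy_vips` is the set of VIPs currently passing health checks;
    None means all VIPs are considered healthy.
    """
    all_vips = [vip for _, vip in SITE_VIPS]
    healthy = set(all_vips if healthy_vips is None else healthy_vips)
    addr = ipaddress.ip_address(client_ip)
    local = [vip for net, vip in SITE_VIPS if addr in net and vip in healthy]
    remote = [vip for vip in all_vips if vip in healthy and vip not in local]
    return local + remote


if __name__ == "__main__":
    print(resolve_portal("10.2.55.7"))
    # Site 2 VIP is down: the sponsor is sent to another site's VIP.
    print(resolve_portal("10.2.55.7", healthy_vips=["10.1.0.100", "10.3.0.100"]))
```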
Scaling Global Sponsor / MyDevices
Anycast Example

DNS SERVER: DOMAIN = COMPANY.COM
SPONSOR     10.1.0.100
MYDEVICES   10.1.0.101
ISE-PSN-1   10.1.1.1
ISE-PSN-2   10.1.1.2
ISE-PSN-3   10.1.1.3
ISE-PSN-4   10.2.1.4
ISE-PSN-5   10.2.1.5
ISE-PSN-6   10.2.1.6
ISE-PSN-7   10.3.1.7
ISE-PSN-8   10.3.1.8
ISE-PSN-9   10.3.1.9

[Figure: PAN and MnT pairs, DNS servers, and three groups of three PSNs; each group is reached through the same address, 10.1.0.100.]

Use Global Load Balancer or Anycast (example shown) to direct traffic to closest VIP. Web load balancing distributes each request to a single PSN.

Load Balancing also helps to scale Web Portal Services



Scaling Guest Authentications Using 802.1X
“Activated Guest” allows guest accounts to be used without ISE web auth portal

• Guests auth with 802.1X using EAP methods like PEAP-MSCHAPv2 / EAP-GTC
• 802.1X auth performance generally much higher than web auth

Note: AUP and Password Change cannot be enforced since guest bypasses portal flow.

• ISE 1.2 Guest Role


• ISE 1.3+ Guest Type
Scaling Web Authentication (ISE 1.2)
Login Once via Web Portal Then Get Registered Access (Portal Bypass)

“I keep having to re-enter my credentials every time I disconnect from the network”
“Can we login only once and not be prompted each time we connect to the network?”
Yes You Can!
CWA + Device Registration WebAuth (ISE 1.2)
Live Log Output and Sample Authorization Policy
1 CWA unknown endpoint
2 Students redirected to DRW
3 Registered device gains access

[Figure: live log and authorization policy screenshots annotated with steps 1-3. Live log entries, newest first: step 3, CoA for DRW Success, step 2, CoA for CWA Success, step 1.]
CWA + DRW (1.2)
User Experience
+ Fully customizable portal per user group
+ Auto-population of MAC address in ID group
+ Selectable ID Group assignment per DRW policy
+ Portal customization allows device registration success and redirect to predefined URL in one step; user automatically redirected to predefined URL.
- No association of user to endpoint after initial authentication
- No automatic purge of registered device without ERS API.

[Figure: CWA → Registration Success. Optional AUP for CWA and/or DRW.]


CWA + NSP Device Registration (ISE 1.2)
NSP Device Registration Only Flow

• Administration > System > Settings > Client Provisioning
• If no matching policy, then continue with regular flow.
• Each Authorization Profile can reference a different customized web portal per group

1 CWA unknown endpoint
2 Staff users redirected to NSP
3 Registered device gains access
CWA + NSP Device Registration
Live Log Output
1 CWA unknown endpoint
2 Staff users redirected to NSP
3 Registered device gains access

[Figure: live log screenshot annotated with steps 1-3. Entries, newest first: step 3, CoA for NSP Success, step 2, CoA for CWA Success, step 1.]
CWA + NSP Device Registration (ISE 1.2)
User Experience
+ Custom portal per group
+ Option to limit devices per user
+ Auto-population of MAC address
+ Maps User to Endpoint and supports self-service
- NSP customization limited to portal themes (1.2 only)
- All devices mapped to one group (RegisteredDevices)
- User must manually navigate to original web page
- No automatic purge of registered device (ERS API)

[Figure: CWA (Optional AUP) → Device Registration → Registration Success]
Scaling Web Authentication (ISE 1.3)
“Remember Me” Guest Flows

• Device/user logs in to hotspot or credentialed portal
• MAC address automatically registered into GuestEndpoint group
• Authz policy for GuestEndpoint ID Group grants access until device purged

Note: For ISE 1.2, can “chain” CWA+DRW or NSP to auto-register web auth users, but no auto-purge
Automated Device Registration and Purge New in ISE 1.3

• Web Authenticated users can be auto-registered and endpoints auto-purged.
• Allows re-auth to be reduced to one day, multiple days, weeks, etc.
• Improves Web Scaling and User Experience
Endpoint Purging

Matching Conditions
Purge by:
• # Days After Creation
• # Days Inactive
• Specified Date
Endpoint Purging Examples

Matching Conditions
Purge by:
• # Days After Creation
• # Days Inactive
• Specified Date

On Demand Purge
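
The three purge criteria can be expressed as a simple predicate. The sketch below is a hypothetical model, not ISE's purge engine: the function and parameter names are invented, and whether ISE compares with “greater than” or “greater than or equal” is an assumption.

```python
from datetime import date


def should_purge(created, last_active, today,
                 days_after_creation=None, days_inactive=None, purge_on=None):
    """Return True if an endpoint matches any configured purge criterion.

    Each criterion is optional; None means the criterion is not configured.
    """
    if days_after_creation is not None:
        if (today - created).days >= days_after_creation:
            return True
    if days_inactive is not None:
        if (today - last_active).days >= days_inactive:
            return True
    if purge_on is not None and today >= purge_on:
        return True
    return False


if __name__ == "__main__":
    today = date(2015, 6, 30)
    # Guest endpoint created 31 days ago, last seen yesterday.
    print(should_purge(date(2015, 5, 30), date(2015, 6, 29), today,
                       days_after_creation=30))
    # Same endpoint under an inactivity-only rule is kept.
    print(should_purge(date(2015, 5, 30), date(2015, 6, 29), today,
                       days_inactive=7))
```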
Scaling Posture & MDM
Posture Lease
Once Compliant, user may leave/reconnect multiple times before re-posture

MDM Scalability and Survivability
What Happens When the MDM Server is Unreachable?

• Scalability ≈ 30 Calls per second per PSN.


• Cloud-Based deployment typically built for scale and redundancy
• For cloud-based solutions, Internet bandwidth and latency must be considered.
• Premise-Based deployment may leverage load balancing

• Authorization permissions can be set based on MDM connectivity status:


• MDM:MDMServerReachable Equals UnReachable
• MDM:MDMServerReachable Equals Reachable

• All attributes retrieved & reachability determined by single API call on each new session.

• Separate Heartbeat timer added to current 1.2.x and 1.3.0


• CSCul39011 MDM client is not rejecting queries when MDM server is not responding
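
A rough sketch of how the two points above interact: the per-PSN call rate bounds how many new sessions can be checked against the MDM, and the reachability attribute selects a fallback. The result names and the linear scaling across PSNs are assumptions for illustration; only the attribute name, its two values, and the figure of about 30 calls per second come from the slide.

```python
MDM_CALLS_PER_SEC_PER_PSN = 30  # approximate figure quoted above


def max_new_sessions_per_sec(psn_count, calls_per_psn=MDM_CALLS_PER_SEC_PER_PSN):
    """Upper bound on MDM-checked new sessions per second.

    Assumes one API call per new session and that capacity adds up
    linearly across PSNs (an assumption, not a measured result).
    """
    return psn_count * calls_per_psn


def authorization_result(session_attrs):
    """Pick a (hypothetical) authorization outcome from MDM reachability."""
    reachable = session_attrs.get("MDM:MDMServerReachable")
    if reachable == "Reachable":
        return "EVALUATE_MDM_RULES"      # continue with compliance checks
    if reachable == "UnReachable":
        return "MDM_FALLBACK_ACCESS"     # e.g. limited access until MDM returns
    return "NO_MDM_DATA"


if __name__ == "__main__":
    print(max_new_sessions_per_sec(4))
    print(authorization_result({"MDM:MDMServerReachable": "UnReachable"}))
```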
Scaling Profiling and Database Replication
Endpoint Attribute Filter and Whitelist Attributes
Reduces Data Collection and Replication to Subset of Profile-Specific Attributes

• Endpoint Attribute Filter – aka “Whitelist filter” (ISE 1.1.2 and above)
• Disabled by default. If enabled, only these attributes are collected or replicated.
Administration > System > Settings > Profiling

• Whitelist Filter limits profile attribute collection to those required to support default (Cisco-provided) profiles and critical RADIUS operations.
• Filter must be disabled to collect and/or replicate other attributes.
• Attributes used in custom conditions are automatically added to whitelist.
Significant Attributes
When Does Database Replication Occur?

• Configuration Database (Replication on Significant Attribute change)
  • Managed by Primary PAN and replicated to all Secondary Nodes
  • Stores ISE policy config, endpoint records, internal users, guest database, user certificates, etc.
  • Profiling is primary contributor to database replication

• Local Persistence/Cache Database (Replication on Whitelist Attribute change)
  • Locally stores endpoint profile updates.
  • Last PSN to learn new whitelisted attributes becomes endpoint owner (tracked by “EndPoint Profiler Server”)
  • If another PSN receives newer attributes, it requests attribute sync from prior owner and takes ownership, then notifies other PSNs.
  • Only update PAN for Significant Attribute changes
  • PAN replicates all attributes on significant attribute change.

Significant attributes: MAC ADDRESS, ENDPOINT POLICY, STATIC ASSIGNMENT, STATIC GROUP ASSIGNMENT, ENDPOINT IP, POLICY VERSION, MATCHED VALUE (CF), NMAP SUBNET SCAN ID, PORTAL USER, DEVICE REGISTRATION STATUS

CSCur44879 - Remove IP address as Significant Attribute
Significant Attributes vs. Whitelist Attributes

Significant Attributes
• Change triggers global replication
MACADDRESS, ENDPOINTIP, MATCHEDVALUE, ENDPOINTPOLICY, ENDPOINTPOLICYVERSION, STATICASSIGNMENT, STATICGROUPASSIGNMENT, NMAPSUBNETSCANID, PORTALUSER, DEVICEREGISTRATIONSTATUS

Whitelist Attributes (attributes that impact profile)
• Change triggers PSN-PSN replication and global ownership change
161-udp, AAA-Server, AC_User_Agent, AUPAccepted, BYODRegistration, CacheUpdateTime, Calling-Station-ID, cdpCacheAddress, cdpCacheCapabilities, cdpCacheDeviceId, cdpCachePlatform, cdpCacheVersion, Certificate Expiration Date, Certificate Issue Date, Certificate Issuer Name, Certificate Serial Number, ciaddr, CreateTime, Description, DestinationIPAddress, Device Identifier, Device Name, DeviceRegistrationStatus, dhcp-class-identifier, dhcp-requested-address, EndPointPolicy, EndPointPolicyID, EndPointProfilerServer, EndPointSource, FirstCollection, FQDN, Framed-IP-Address, host-name, hrDeviceDescr, IdentityGroup, IdentityGroupID, IdentityStoreGUID, IdentityStoreName, ifIndex, ip, L4_DST_PORT, LastNmapScanTime, lldpCacheCapabilities, lldpCapabilitiesMapSupported, lldpSystemDescription, MACAddress, MatchedPolicy, MatchedPolicyID, MDMCompliant, MDMCompliantFailureReason, MDMDiskEncrypted, MDMEnrolled, MDMImei, MDMJailBroken, MDMManufacturer, MDMModel, MDMOSVersion, MDMPhoneNumber, MDMPinLockSet, MDMProvider, MDMSerialNumber, MDMServerReachable, MDMUpdateTime, NADAddress, NAS-IP-Address, NAS-Port-Id, NAS-Port-Type, NmapScanCount, NmapSubnetScanID, operating-system, OS Version, OUI, PhoneID, PhoneIDType, PolicyVersion, PortalUser, PostureApplicable, PreviousDeviceRegistrationStatus, Product, RegistrationTimeStamp, StaticAssignment, StaticGroupAssignment, sysDescr, TimeToProfile, Total Certainty Factor, UpdateTime, User-Agent

Other Attributes
• Dropped if whitelist filter enabled; otherwise, only locally saved by PSN
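
The three attribute classes map to three replication scopes. The following Python sketch classifies a set of changed attributes; it is a simplified model with hypothetical function and scope names, and `WHITELIST_SAMPLE` is only a small subset of the whitelist attributes named on the slide.

```python
# Significant attributes as named on the slide.
SIGNIFICANT = {
    "MACADDRESS", "ENDPOINTIP", "MATCHEDVALUE", "ENDPOINTPOLICY",
    "ENDPOINTPOLICYVERSION", "STATICASSIGNMENT", "STATICGROUPASSIGNMENT",
    "NMAPSUBNETSCANID", "PORTALUSER", "DEVICEREGISTRATIONSTATUS",
}

# Illustrative subset only; the full whitelist is much longer.
WHITELIST_SAMPLE = {
    "dhcp-class-identifier", "User-Agent", "cdpCachePlatform",
    "host-name", "Calling-Station-ID", "OUI",
}


def replication_scope(changed_attrs, whitelist_filter_enabled):
    """Return the widest replication scope triggered by the changed attributes."""
    changed = set(changed_attrs)
    if changed & SIGNIFICANT:
        return "global"        # PAN replicates to all nodes
    if changed & WHITELIST_SAMPLE:
        return "node-group"    # PSN-to-PSN sync and ownership change
    if whitelist_filter_enabled:
        return "dropped"       # other attributes are not collected
    return "local-only"        # saved only on the receiving PSN


if __name__ == "__main__":
    print(replication_scope({"ENDPOINTPOLICY", "User-Agent"}, True))
    print(replication_scope({"User-Agent"}, True))
    print(replication_scope({"custom-attr"}, True))
    print(replication_scope({"custom-attr"}, False))
```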
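Replication using JGroups

[Figure: PSN1 (receiving RADIUS) and PSN2 (receiving DHCP) exchange Node Group/Cluster Messaging with each other and use Global Replication through the PAN. The numbers 1-9 in the figure correspond to the steps below.]
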

1. First profile attributes (RADIUS) received for an endpoint by PSN1 and saved in local DB.
2. New endpoint so PSN1 declares ownership to local node group over local cluster channel.
3. PSN1 syncs all attributes for endpoint to PAN; PAN creates endpoint in DB.
4. PAN replicates all attributes for the endpoint to all other nodes via Global Replication channel.
5. New profile attributes (DHCP) for same endpoint received by PSN2 in same node group.
6. PSN2 communicates with PSN1 over local cluster channel to determine if change to white list attribute. In this case,
yes, so PSN2 requests all attributes for endpoint from PSN1.
7. PSN2 declares ownership change to local node group.
8. PSN2—did significant attribute change? Yes, since profile updated. PSN2 syncs all attributes to PAN.
9. PAN saves to central DB and replicates all attributes to all other secondary nodes in deployment over global channel.
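
The nine steps can be traced with a small simulation. This is a toy model under stated assumptions, not ISE code: the class, the event strings, and the short attribute sets are hypothetical, and the resulting profile is passed in as an `ENDPOINTPOLICY` attribute rather than computed.

```python
WHITELIST = {"Calling-Station-ID", "dhcp-class-identifier", "User-Agent"}
SIGNIFICANT = {"ENDPOINTPOLICY", "ENDPOINTIP"}


class NodeGroupSim:
    """Tracks endpoint ownership and replication events for one node group."""

    def __init__(self):
        self.owner = {}    # MAC -> owning PSN
        self.attrs = {}    # MAC -> merged attribute dict
        self.events = []   # human-readable trace

    def receive(self, psn, mac, new_attrs):
        known = self.attrs.get(mac)
        if known is None:
            # Steps 1-4: new endpoint, declare ownership, sync via PAN.
            self.attrs[mac] = dict(new_attrs)
            self.owner[mac] = psn
            self.events += [f"{psn}: declare ownership of {mac}",
                            f"{psn}: sync all attributes to PAN",
                            "PAN: replicate to all nodes"]
            return
        changed = {k for k, v in new_attrs.items() if known.get(k) != v}
        if not changed & (WHITELIST | SIGNIFICANT):
            return  # nothing relevant changed, so no sync traffic
        if self.owner[mac] != psn:
            # Steps 6-7: fetch from prior owner, then take ownership.
            self.events.append(f"{psn}: fetch attributes from {self.owner[mac]}")
            self.owner[mac] = psn
            self.events.append(f"{psn}: declare ownership of {mac}")
        known.update(new_attrs)
        if changed & SIGNIFICANT:
            # Steps 8-9: significant change goes to the PAN and out globally.
            self.events += [f"{psn}: sync all attributes to PAN",
                            "PAN: replicate to all nodes"]


if __name__ == "__main__":
    sim = NodeGroupSim()
    mac = "00:11:22:33:44:55"
    sim.receive("PSN1", mac, {"Calling-Station-ID": mac})
    sim.receive("PSN2", mac, {"dhcp-class-identifier": "MSFT 5.0",
                              "ENDPOINTPOLICY": "Windows-Workstation"})
    print("\n".join(sim.events))
```

Sending the same data to PSN2 a second time produces no new events, which is the behavior that keeps replication low when a given endpoint's profile data stays on one PSN.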
JGroups Overview

• Software for reliable group(cluster) communication


• Keeps track of member joins, leaves or crashes and notifies group members
• Membership “View” shows who is currently part of the group
• Members can send messages to all the members or to specific members and
receive messages
• Sending messages to all members is called multicast and specific members is
called unicast
• Supports different transports:
• UDP multicast
• UDP unicast
• TCP mesh
• TCP hub-and-spoke (gossip router); also referred to as tunneled mode
Replication/Global Cluster

• Replication Cluster is a group with all nodes in the ISE deployment, i.e. PANs,
MnTs, and PSNs
• Mainly used for the replication of configuration and runtime data from Primary
PAN to all other nodes
• Also used by Profiler for fetching attributes from current owner and updating
endpoint ownership changes; for example, when node is not a node group
member or loses connection to its local node group.
• Uses TCP Hub and Spoke (Gossip Router) transport with Primary PAN as the
hub over port TCP/12001
• All nodes should have connectivity to TCP/12001 on both Primary and
Secondary PAN
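
The connectivity requirement in the last bullet can be spot-checked from any host on the same network path as a node. A minimal Python sketch using only the standard library follows; the hostnames are placeholders, and a successful TCP connect only shows the port is reachable, not that JGroups itself is healthy.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_pan_reachability(pans, port=12001):
    """Check the JGroups tunneled port on each PAN; returns {host: bool}."""
    return {pan: tcp_port_open(pan, port) for pan in pans}


if __name__ == "__main__":
    # Placeholder hostnames for the Primary and Secondary PAN.
    for host, ok in check_pan_reachability(
            ["ise-pan-1.example.com", "ise-pan-2.example.com"]).items():
        print(f"{host} tcp/12001: {'open' if ok else 'unreachable'}")
```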
ISE 1.2 Node Group/Local Cluster

• Node Group is a group of PSNs


• Mainly used for terminating posture pending sessions when a PSN in local node group fails
• Also used by Profiler for fetching attributes from current owner and updating endpoint
ownership changes if the owner PSN is also part of the same node group
• Node Groups use the following JGroup transports:
o UDP/45588 as multicast port (for example, to declare ownership change)
o UDP/45590 as unicast port (for example, to exchange attributes with previous endpoint owner)
o TCP/7802 for faster detection of failed nodes.
TCP connection is persistent but no data transferred in a ring configuration—PSN1 to PSN2, PSN2 to PSN3, etc.

• Max 10 nodes per group


• All members of same node group should be L2 adjacent although L3 possible w/ TTL=2
• Highly recommended so that Profiler communications stay localized to nodes in a node
group rather than affecting ALL the nodes in the deployment.
ISE 1.3 Node Group/Local Cluster

• Node Groups use the following JGroup transports, all SSLv3 over TCP:
o TCP/7800 – TCPPING for JGroup Member Discovery
o TCP/7800 – JGroup Responses/Status, Ownership Changes, Endpoint Profile Attribute Retrieval.
o TCP/7802 – Fast node failure detection.
• TCP connection is persistent but no data transferred in a ring configuration: PSN1 to PSN2, PSN2 to PSN3, etc.
• When failure detected, JGroup Controller for Node Group communicates to MnT over HTTPS (TCP/443) to relay node statuses (Up/Down). Also uses HTTPS (TCP/443) to retrieve Posture Pending sessions from MnT for failed node group member.
• Node Groups highly recommended so that Profiler communications stay localized to nodes in a node group rather than affecting ALL the nodes in the deployment.
ISE Inter-Node Communications
Database Operations

[Figure: MnT (P), MnT (S), Admin (P), Admin (S), PSN1, PSN2, and PSN3. Legend: TCP/443 HTTPS (SOAP).]
Inter-Node Communications
JGroup Connections – Global Cluster
Legend: TCP/12001 JGroups Tunneled

• All Secondary nodes* establish connection to Primary PAN (JGroup Controller) over tunneled connection (TCP/12001) for config/database sync.
• Secondary Admin also listens on TCP/12001 but no connection established unless primary fails/secondary promoted
• All Secondary nodes participate in the Global JGroup cluster.

*Secondary node = All nodes except Primary Admin node; includes PSNs, MnT and Secondary Admin nodes

[Figure: MnT (P), MnT (S), Admin (P) as GLOBAL JGROUP CONTROLLER, Admin (S), PSN1, PSN2, and PSN3.]
Inter-Node Communications
Local JGroups and Node Groups
Legend: TCP/7800 JGroup Peer Communication; TCP/7802 JGroup Failure Detection; TCP/12001 JGroups Tunneled

• Node Groups can be used to define local JGroup* clusters where members exchange heartbeat and sync profile data over IP multicast.
• PSN claims endpoint ownership only if change in whitelist attribute; triggers inter-PSN sync of attributes. Whitelist check always occurs regardless of global attribute filter setting.
• Replication to PAN occurs if significant attribute changes, then sync all attributes via PAN; if whitelist filter enabled, only whitelist attributes synced to all nodes.

[Figure: MnT (P)/(S) and Admin (P)/(S) with the GLOBAL JGROUP CONTROLLER; PSN1, PSN2, and PSN3 in NODE GROUP A (JGROUP A) with a LOCAL JGROUP CONTROLLER. Animation callouts: “PSN1 is current endpoint owner – no database replication even if whitelist attribute changes” and “PSN2 gets more current update for same endpoint and takes ownership – fetches all attributes from PSN1”. Other labels: DHCP, Profile Update, Whitelist Change, Fetch Attributes, Change Ownership, t=0, t=1.]

*JGroups: Java toolkit for reliable multicast communications between group/cluster members.
ISE 1.3 Inter-Node Communications
Consolidated View for Database Operations

• TCP/443 HTTPS (SOAP): Node Up/Down Notification, Session Retrieval
• TCP/7800 (SSL): JGroup Member Discovery, JGroup Responses/Status, Ownership Change, Endpoint Attribute Retrieval
• TCP/7802 (SSL): Failure Detection
• TCP/12001: JGroups Tunneled (Global)
• TCP/1528: Oracle DB (Secure JDBC)

[Figure: MnT (P), MnT (S), Admin (P), Admin (S), and PSN1, PSN2, PSN3 in NODE GROUP A (JGROUP A).]
Inter-Node Communications
Local JGroups and Node Groups
Legend: TCP/7800 JGroup Peer Communication; TCP/7802 JGroup Failure Detection; TCP/12001 JGroups Tunneled

• General classification data for given endpoint should stay local to node group = whitelist attributes
• Only certain critical data needs to be shared across entire deployment = significant attributes
• Node groups continue to provide original function of session recovery for failed PSN.
• Profiling sync leverages JGroup channel
• Each LB cluster should be a node group, but LB is NOT required for node groups.
• Node group members should have GE LAN connectivity (L2 or L3)
• ISE 1.3 no longer uses UDP multicast for JGroup; it uses SSL only. (ISE 1.2 uses multicast with TTL=2; max 1 hop)
• Reduces sync updates even if different PSNs receive data – expect few whitelist changes and even fewer critical attribute changes.

[Figure: Load Balancer in front of PSN1, PSN2, and PSN3 in NODE GROUP A (JGROUP A), connected by L2 or L3 LAN switching. Callout: LB is NOT a requirement for Node Group.]
Inter-Node Communications
Local JGroups and Node Groups
Legend: TCP/7800 JGroup Peer Communication; TCP/7802 JGroup Failure Detection; TCP/12001 JGroups Tunneled

• Profiling sync leverages JGroup channels
• All replication outside node group must traverse PAN, including Ownership Change!
• If Local JGroup fails, then nodes fall back to Global JGroup communication channel.

[Figure: MnT and PAN pairs; PSN1, PSN2, PSN3 in NODE GROUP A (JGROUP A) and PSN4, PSN5, PSN6 in NODE GROUP B (JGROUP B), connected over L2 or L3.]
Configuring Node Groups – ISE 1.2 Example
Recommended for ALL local PSNs!
• Administration > System > Deployment

1) Create node group
2) Assign name and available multicast address
3) Add individual PSNs to node group

• Node group members may be L2 adjacent (in same VLAN/subnet) or L3-connected
• Members exchange heartbeats using multicast.
• L3-connected members must have IP Multicast configured on connecting switches.
ISE Node Communications

[Figure: port diagram showing flows among Admin/Sponsor, Endpoint, NADs, PSN, PAN, MnT, PXG (pxGrid), IPN, PIP (Query Attributes), Logging, Email/SMS Gateways, MDM Partner, and Cloud Services (Cisco.com/Perfigo.com, Profiler Feed Service, MDM & App Stores, Push Notification). The port labels recovered from the figure are listed below with duplicates merged; flow direction and endpoints are not recoverable from this extraction.]

DNS: tcp-udp/53
NTP: udp/123
Repository: FTP, SFTP, NFS, HTTP, HTTPS
File Copy: FTP, SCP, SFTP, TFTP
SSH: tcp/22
GUI: tcp/80,443
HTTPS: tcp/443
HTTPS: tcp/8443
RADIUS Auth: udp/1645,1812
RADIUS Acct: udp/1646,1813
RADIUS CoA: udp/1700,3799
CoA (REST API): udp/1700
CoA (Admin/Guest Limit): udp/1700
TS CoA: udp/1700
pxGrid: tcp/5222
SMTP: tcp/25 (PPAN: email expiry notify)
Syslog: udp/20514, tcp/1468
Secure Syslog: tcp/6514
NetFlow: udp/9996
NetFlow for TS: udp/9996
Oracle DB (Secure JDBC): tcp/1528
JGroups: tcp/12001 (MnT to PAN)
JGroups: tcp/12001 (PSN to PAN)
LDAP: tcp-udp/389, tcp/3268
SMB: tcp/445
KDC: tcp-udp/88
KPASS: tcp/464
OCSP: tcp/80
OCSP: tcp/2560
CRL: tcp/80, tcp/443, tcp/389
SCEP: tcp/80, tcp/443
SNMP: udp/161
SNMP Traps: udp/162
DHCP: udp/67, udp/68
SPAN: tcp/80,8080
MDM API: tcp/XXX
Guest: tcp/8443
WebAuth: tcp/443,8443
Discovery: tcp/8443, tcp/8905
Posture Updates: tcp/443
Agent Install: tcp/8443
Profiler Feed: tcp/8443
Sponsor (PSN): tcp/8443
Admin->Sponsor: tcp/9002
NAC Agent: tcp/8905; udp/8905
PRA/KA: tcp/8905
CMCS: tcp/443
REST API (MnT): tcp/443
ERS API: tcp/9060
APNS: tcp/2195
MDS Enroll: tcp/7001
MDS Check-in: tcp/7002

Inter-Node Communications
Admin(P) - Admin(S): tcp/443, tcp/12001 (JGroups)
Monitor(P) - Monitor(S): tcp/443, udp/20514 (Syslog)
Policy - Policy: tcp/7800, tcp/7802 (Node Groups/JGroups)
Inline(P) - Inline(S): udp/694 (Heartbeat)
AD Connector Ports

Protocol               Port   Authenticated                                          Notes
DNS (TCP/UDP)          53     No                                                     May use DNSSEC
MSRPC (TCP)            445    Yes
Kerberos (TCP/UDP)     88     Yes (Kerberos)                                         MS AD/KDC
Kpasswd                464    No
LDAP (TCP/UDP)         389    Yes. Encrypted & Authenticated with SASL, not LDAP/S   Just like native MS Domain Member.
Global Catalog (TCP)   3268   Yes. Encrypted & Authenticated with SASL, not LDAP/S   Just like native MS Domain Member.
NTP                    123    No
IPC                    80     Yes, using creds from RBAC system.                     ISE REST Library.
ISE Profiling Best Practices
Whenever Possible…

Do NOT send profile data to multiple PSNs!
DO send profile data to single and same PSN or Node Group!
DO use Device Sensor!
DO enable the Profiler Attribute Filter!
ISE Profiling Best Practices
Whenever Possible…
• Use Device Sensor on Cisco switches & Wireless Controllers to optimize data collection.
• Ensure profile data for a given endpoint is sent to a single PSN (or maximum of 2)
• Sending same profile data to multiple PSNs increases inter-PSN traffic and contention for endpoint ownership.
• For redundancy, consider Load Balancing and Anycast to support a single IP target for RADIUS or Profiling using…
• DHCP IP Helpers
• SNMP Traps
• DHCP/HTTP with ERSPAN (Requires validation)

• Ensure profile data for a given endpoint is sent to the same PSN
• Same issue as above, but not always possible across different probes
• Use node groups and ensure profile data for a given endpoint is sent to same node
group.
• Node Groups reduce inter-PSN communications and need to replicate endpoint changes outside of node group.
• Avoid probes that collect the same endpoint attributes
• Example: Device Sensor + SNMP Query/IP Helper
• Enable Profiler Attribute Filter
ISE Profiling Best Practices
General Guidelines for Probes

Do NOT enable all probes by default!
Avoid SPAN, SNMP Traps, and NetFlow probes!
ISE Profiling Best Practices
General Guidelines for Probes
• HTTP Probe:
• Use URL Redirects instead of SPAN to centralize collection and reduce traffic load related to SPAN/RSPAN.
• Avoid SPAN. If used, look for key traffic chokepoints such as Internet edge or WLC connection; use intelligent
SPAN/tap options or VACL Capture to limit amount of data sent to ISE. Also difficult to provide HA for SPAN.

• DHCP Probe:
• Use IP Helpers when possible—be aware that L3 device serving DHCP will not relay DHCP for same!
• Avoid DHCP SPAN. If used, make sure probe captures traffic to central DHCP Server. HA challenges.

• SNMP Probe:
• Be careful of high SNMP traffic due to triggered RADIUS Accounting updates as a result of high re-auth (low
session/re-auth timers) or frequent interim accounting updates.
• For polled SNMP queries, avoid short polling intervals. Be sure to set optimal PSN for polling in ISE NAD config.
• SNMP Traps primarily useful for non-RADIUS deployments like NAC Appliance—Avoid SNMP Traps w/RADIUS auth.

• NetFlow Probe:
Use only for specific use cases in centralized deployments—Potential for high load on network devices and ISE.
Profiling Case Study

“I applied the latest patch across all nodes (Admins, Monitors, PSNs). The VM portal page is now showing the cpu at about 5% and the network usage from ~50MB down to under a 1MB.”

ISE 1.1.1 Patch 2 initially helped, but… Never applied other best practice recommendations. DB eventually filled and purge issues resulted in DBs falling out of sync / disconnects.

Problem:
• Running ISE 1.1.1
• High node CPU and BW to Primary PAN
• Short-term Fix = Disable Profiling

Interim Solution:
• Added 2nd core and CPU dropped 33%
• Applied 1.1.1 Patch 2 and CPU dropped 85+% and BW 98+%

Solution:
• Increase VM to specs
• LB profile data to single IP
• Enable whitelist filter
• Upgrade to 1.2.1/1.3

[Figure: 802.1X devices w/CoA (C6500, C3750, WLC) send CoA and profile data through an ACE VIP to ISE Policy Node Groups (x2) (N+1). Callouts: “Single core”; “Two cores allocated to ISE VMs”; “Allocate Eight cores to ISE VMs”; “All profile data (traps, IP helper,…) sent to every PSN!”; “Send profile data (traps, IP helper,…) to VIP address. Enable Whitelist Filter”. Profiling Probes on Gig0 of each node group: DHCP, RADIUS, DNS, SNMPQUERY, SNMPTRAP, HTTP, DHCPSPAN.]
Profiling Redundancy – Duplicating Profile Data
Different DHCP Addresses
- Provides Redundancy but Leads to Contention for Ownership = Replication
• Common config is to duplicate IP helper data at each NAD to two different PSNs or PSN LB Clusters
• Different PSNs receive data

[Figure: a user's DHCP Request on int Vlan10 is relayed to the load balancers in two data centers.
DC #1: PSN-CLUSTER1 (10.1.98.8) with PSN1 (10.1.99.5), PSN2 (10.1.99.6), PSN3 (10.1.99.7)
DC #2: PSN-CLUSTER2 (10.2.100.2) with PSN1 (10.2.101.5), PSN2 (10.2.101.6), PSN3 (10.2.101.7)]

interface Vlan10
 ip helper-address <real_DHCP_Server>
 ip helper-address 10.1.98.8
 ip helper-address 10.2.100.2
Scaling Profiling and Replication
Single DHCP VIP Address using Anycast
- Limit Profile Data to a Single PSN and Node Group
• Different PSNs or Load Balancer VIPs host same target IP for DHCP profile data
• Routing metrics determine which PSN or LB VIP receives DHCP from NAD

[Figure: a user's DHCP Request on int Vlan10 is relayed to whichever data center is closest by routing metric.
DC #1: PSN-CLUSTER1 (10.1.98.8) with PSN1 (10.1.99.5), PSN2 (10.1.99.6), PSN3 (10.1.99.7)
DC #2: PSN-CLUSTER2 (10.1.98.8) with PSN1 (10.2.101.5), PSN2 (10.2.101.6), PSN3 (10.2.101.7)]

interface Vlan10
 ip helper-address <real_DHCP_Server>
 ip helper-address 10.1.98.8
Profiler Tuning for Polled SNMP Query Probe
• Set specific PSNs to periodically poll access devices for SNMP data.
• Choose PSN closest to access device.

[Figure: a switch with RADIUS and SNMP Polling (Auto) connections to PSN1 (Amer) and PSN2 (Asia).]
Profiler Tuning for Polled SNMP Query Probe
Polled Mode = “Catch All”
• Disable/uncheck SNMP Settings: Disables all SNMP polling options [CSCur95329]
• Polling Interval
  1.2 Default: 3600 sec (1 hour)
  1.3 Default: 28,800 sec (8 hours) *Recommend minimum for all releases
• Setting of “0”: Disables periodic poll but allows triggered & NMAP queries [CSCur95329]
• Triggered query auto-suppressed for 24 hrs per endpoint
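
The effect of the polling interval on query volume is simple arithmetic. The sketch below compares the two defaults; the NAD count of 500 is a made-up example, and the function counts periodic polls only (triggered and NMAP queries are excluded).

```python
SECONDS_PER_DAY = 24 * 60 * 60


def periodic_polls_per_day(nad_count, interval_sec):
    """Number of periodic SNMP polls per day across all NADs.

    An interval of 0 disables periodic polling, as described above.
    """
    if interval_sec == 0:
        return 0
    return nad_count * (SECONDS_PER_DAY // interval_sec)


if __name__ == "__main__":
    nads = 500  # hypothetical deployment size
    print("1.2 default (3600 s): ", periodic_polls_per_day(nads, 3600))
    print("1.3 default (28800 s):", periodic_polls_per_day(nads, 28800))
    print("interval 0:           ", periodic_polls_per_day(nads, 0))
```

Moving from the 1.2 default to the 1.3 default cuts periodic polls by a factor of eight for the same number of devices.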
Profiling Bandwidth For Your
Reference
Factors Impacting Bandwidth Consumption for Profiling (Not Logging/Replication)
• Profiling traffic will be probe specific and dependent on many factors including:
• RADIUS Probe itself does not consume additional bandwidth unless tied to Device Sensor.
• RADIUS traffic generated by Device Sensor will depend on switch DS config, i.e. all events or only changes, functions enabled, and filters set.
• SNMP Query is based on configured polling interval (NAD based) and NAD sizes (for example, bigger switches with more active
ports/connections will result in higher SNMP bandwidth).
• SNMP Query (Port based) can also be triggered by SNMP Traps or RADIUS Accounting State, but current code should limit to one query per
24hrs.
• SNMP Traps will depend on # endpoints and connection events. Note that SNMP trap processing only supported for Wired.
• DHCP-related profile traffic will be dependent on lease timers and connection and reauth rates. Reauth rates can be triggers by idle and session
timers or CoA where session terminates/port bounces and triggers DHCP). Traffic is multiplied by the number of PSN targets configured which is
why I advocate limiting targets to no more than two or possibly one using Anycast.
• DHCP SPAN option will likely consume more bandwidth, especially if not filtered on DHCP only, as it collects all DHCP including bidirectional
traffic flows. Also, since no simple methods for SPAN HA, may need to send multiple SPANs to different PSNs (not pretty and another reason
why I don’t generally recommend SPAN option).
• HTTP via redirects does not consume additional bandwidth
• HTTP via SPAN may consume a lot of bandwidth and will depend on SPAN config, where placed, traffic volume, and whether capture is filtered
for only HTTP. Note, we will not parse HTTPS SPAN traffic. Like DHCP SPAN, multiple targets required for redundancy.
• NMAP is triggered, but only 3 attempts on newly discovered Unknowns or policy triggered. Additional endpoint SNMP queries will be endpoint
specific. For most part, it should be fairly quiet. There is manual nmap scan option, but this should be used with care to avoid excessive ISE or
network load. As manual process, requires deliberate admin trigger.
• DNS is triggered based on new IP discovery, but for most part should be quiet.
• Netflow can add a large amount of traffic and is highly dependent on the Netflow config on the source and the traffic volume. As with SPAN,
volume is multiplied by the # of PSN Netflow targets unless you leverage something like Anycast for redundancy.
Scaling MnT (Optimize Logging and Noise Suppression)
When the Levee Breaks…

PSN

“If it keeps on rainin', levee's goin' to break,
When The Levee Breaks, logs have no place to stay.”

MnT

*Remix of Led Zeppelin IV, ‘When The Levee Breaks’


The Fall Out From the Mobile Explosion and IoT
• Explosion in number and type of endpoints on the network.
• High auth rates from mobile devices, many personal (unmanaged).
– Short-lived connections: Continuous sleep/hibernation to conserve battery power, roaming, …
• Misbehaving supplicants: Unmanaged endpoints from numerous mobile vendors may be misconfigured, missing root CA certificates, or running less-than-optimal OS versions.
• Misconfigured NADs. Common issue is setting timeouts too low.
• Excessive RADIUS health probes from NADs and Load Balancers.
• Increased logging from Authentication, Profiling, NADs, Guest Activity, …
• System not originally built to scale to new loads.
• End user behavior when above issues occur.
• Bugs in client, NAD, or ISE.
Repeats Every 30 Seconds For Your
Reference
Client/Supplicant NAD ISE

SSID

Step 1: Due to Reauthentication, or Coming back to Campus… New Connection Request

Step 2: Certificate sent to Supplicant

Client Rejects Cert

30 seconds

Step 1: New Connection Request

Step 2: Certificate sent to Supplicant

Client Rejects Cert

30 seconds
First EAP Timeout 120sec

30 Seconds Later
No Response Received From Client

What might this do to MnT logging??


Clients Misbehave!
• Example education customer:
• ONLY 6,000 Endpoints (all BYOD style)
• 10M Auths / 9M Failures in 24 hours!
• 42 Different Failure Scenarios – all related to
clients dropping TLS (both PEAP & EAP-TLS).

• Supplicant List:
• Kyocera, Asustek, Murata, Huawei, Motorola, HTC, Samsung, ZTE, RIM, SonyEric, ChiMeiCo,
Apple, Intel, Cybertan, Liteon, Nokia, HonHaiPr, Palm, Pantech, LgElectr, TaiyoYud, Barnes&N

• 5411 No response received during 120 seconds on last EAP message sent to the client
• This error has been seen at a number of Escalation customers
• Typically the result of a misconfigured or misbehaving supplicant not completing the EAP process.
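To put the education-customer numbers in perspective, the short calculation below derives average rates from the three figures quoted above (6,000 endpoints, 10M auths, 9M failures in 24 hours). It is only back-of-the-envelope arithmetic; the function name `auth_load` is made up for this illustration, and real traffic would be burstier than a daily average suggests.

```python
# Figures quoted on the slide; everything else is derived arithmetic.
ENDPOINTS = 6_000
AUTHS_PER_DAY = 10_000_000
FAILURES_PER_DAY = 9_000_000
SECONDS_PER_DAY = 24 * 60 * 60


def auth_load(endpoints: int, auths: int, failures: int) -> dict:
    """Return average rates implied by one day of authentication totals."""
    return {
        "auths_per_sec": auths / SECONDS_PER_DAY,
        "auths_per_endpoint_per_day": auths / endpoints,
        "failure_ratio": failures / auths,
    }


load = auth_load(ENDPOINTS, AUTHS_PER_DAY, FAILURES_PER_DAY)
print(f"{load['auths_per_sec']:.1f} auths/sec on average")
print(f"{load['auths_per_endpoint_per_day']:.0f} auths per endpoint per day")
print(f"{load['failure_ratio']:.0%} of auths fail")
```

An endpoint that retries a failing auth every 30 seconds would account for 2,880 attempts per day on its own, so a modest share of misbehaving supplicants is enough to reach totals like these.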
Challenge: How to reduce the flood of log messages while increasing PSN and MnT capacity and tolerance
PSN

MnT
Getting More Information With Less Data
Scaling to Meet Current and Next Generation Logging Demands
Rate Limiting at Source
• Switch / WLC: Reauth period; Quiet-period 5 min; Held-period / Exclusion 5 min (applies to reauth phones, unknown users, roaming supplicants, misbehaving supplicants)
• LB: Heartbeat frequency (LB health probes)
Filtering at Receiving Chain
• PSN: Detect and reject misbehaving clients (reject bad supplicant); Log Filter (filter health probes from logging)
• MNT: Count and discard repeated events; count and discard untrusted events; count and discard repeats and unknown NAD events
Tune NAD Configuration
Rate Limiting at Wireless Source

Wireless (WLC)
• RADIUS Server Timeout: Increase from default of 2 to 5 sec
• RADIUS Aggressive-Failover: Disable aggressive failover
• RADIUS Interim Accounting: v7.6: Disable; v8.0: Enable with interval of 0. (Update auto-sent on DHCP lease or Device Sensor)
• Idle Timer: Increase to 1 hour (3600 sec)
• Session Timeout: Increase to 2+ hours (7200+ sec)
• Client Exclusion: Enable and set exclusion timeout to 180+ sec
• Roaming: Enable CCKM / SKC / 802.11r (when feasible)
• Bugfixes: Upgrade WLC software to address critical defects

Diagram: WLC throttling reauth phones, unknown users, roaming supplicants, and misbehaving supplicants (Reauth period; Quiet-period 5 min; Held-period / Exclusion 5 min; Client Exclusion)

Prevent Large-Scale Wireless RADIUS Network Melt Downs


http://www.cisco.com/c/en/us/support/docs/wireless-mobility/wireless-lan-wlan/118703-technote-wlc-00.html
Public Doc on Recommended Wireless Settings
• Prevent Large-Scale Wireless RADIUS Network Melt Downs
http://www.cisco.com/c/en/us/support/docs/wireless-mobility/wireless-lan-wlan/118703-technote-wlc-00.html

WLC – RADIUS Server Settings

Note: Diagrams show default values

• RADIUS Server Timeout
– WLC default wait for a response from the RADIUS server is 2 sec; max = 30 sec.
– Recommend increasing to a larger value, for example 5 sec.

• RADIUS Aggressive-Failover
– (Cisco Controller) >config radius aggressive-failover disable
– If this is set to 'enable' (default), the WLC will fail over to the next server after 5 retransmissions for a given client.
– Recommend disable so that a single misbehaving client cannot trigger failover and disrupt other client sessions. With it disabled, failover occurs only after 3 consecutive tries for 3 different users (i.e. the RADIUS server is unresponsive for multiple users).

• RADIUS Interim Accounting (WLAN > Security > AAA Servers)
– v7.x: Recommend disable (default setting). If required, increase default from 600 sec to 900 sec (15 minutes).
– v8.x: Recommend enable with Interval set to 0.
RADIUS Accounting Update Behavior in WLC v7.x
Interim Update
• WLC 7.6:
• Recommended setting: Disabled
• Behavior: Only send update on IP
address change
• Ensures we get critical IP updates
(Framed-IP-Address) and Device
Sensor updates.
• Device Sensor updates not
impacted
RADIUS Accounting Update Behavior in WLC v8.x
Interim Update
• WLC 7.6:
• Recommended setting: Disabled
• Behavior: Only send update on IP
address change
• Device Sensor updates not
impacted
• WLC 8.0:
• Recommended setting: Enabled
with Interval set to 0
• Behavior: Only send update on IP
address change
• Device Sensor updates not
impacted
• Settings mapped correctly on
upgrade
WLC – Authentication Settings

Reduce the # Auths and ReAuths

• Increase Idle Timer to 1 hour (3600 sec)
• Increase reauth/session timers to 2+ hrs (7200+ sec)

Note: Diagrams show default values


WLC – Client Exclusion
Blacklist Misconfigured or Malicious Clients

• Excessive 802.1X Authentication Failures: Clients are excluded on the fourth 802.1X authentication attempt, after three consecutive failures.
• Excessive Web Authentication Failures: Clients are excluded on the fourth web authentication attempt, after three consecutive failures.
• Client excluded for Time Value specified in WLAN settings. Recommend increase to 1-5 min (60-300 sec). 3 min is a good start.
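The exclusion rule above (three consecutive failures, excluded on the fourth attempt, held for the configured time) can be modeled in a few lines to see how much RADIUS traffic it removes. This is a toy model only, not controller code: the class name `ExclusionTracker`, the 180-second timeout, and the 30-second retry cadence are assumptions chosen to match values mentioned elsewhere in this deck.

```python
class ExclusionTracker:
    """Toy model of per-client exclusion after consecutive auth failures."""

    def __init__(self, max_failures: int = 3, exclusion_sec: int = 180):
        self.max_failures = max_failures
        self.exclusion_sec = exclusion_sec
        self.failures = {}        # MAC -> consecutive failure count
        self.excluded_until = {}  # MAC -> time at which exclusion ends

    def attempt(self, mac: str, now: float, would_pass: bool) -> str:
        """Return 'excluded', 'passed', or 'failed' for one auth attempt."""
        if now < self.excluded_until.get(mac, 0.0):
            return "excluded"  # dropped locally, never reaches RADIUS
        if self.failures.get(mac, 0) >= self.max_failures:
            # Fourth attempt after three consecutive failures: exclude.
            self.failures[mac] = 0
            self.excluded_until[mac] = now + self.exclusion_sec
            return "excluded"
        if would_pass:
            self.failures[mac] = 0
            return "passed"
        self.failures[mac] = self.failures.get(mac, 0) + 1
        return "failed"


# A broken supplicant retrying every 30 seconds for one hour.
tracker = ExclusionTracker()
results = [tracker.attempt("aa:bb:cc:dd:ee:ff", t, False) for t in range(0, 3600, 30)]
print(results.count("failed"), "of", len(results), "attempts reached RADIUS")
```

In this model only the attempts returned as "failed" generate RADIUS requests and MnT log entries; raising `exclusion_sec` shrinks that share further.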

Note: Diagrams show default values


Wireless Roaming
Key Caching to Avoid Reauth when Roaming

• 802.11r (aka Fast Transition)
– Enable where supported and feasible; for example, large Apple deployments.
– Apple support added in iOS 6
• CCKM - Cisco Centralized Key Management
– Clients must support CCKM; CCXv4 feature
• SKC (Sticky PMKID Caching)
– Requires WLC 7.2
– Works only with WPA2 WLANs
– Recommended if clients do not support CCKM or OKC (Opportunistic PMKID Caching)
> config wlan security wpa wpa2 cache sticky enable wlan_id
Which WLC Software Should I Deploy?
• 7.6.130.0 (7.6 MR3) – Currently the most mature and reliable release for ISE.
• 8.0.110.0 (8.0 MR1) – Less mature but includes new feature support + some
additional fixes. Be aware of CSCur20154 HA SSO pair memory leak
• Key Defects Fixed in AireOS 7.6
CDETS Title
CSCuh03648 WLC sends different Framed-IP-Address in accounting updates
CSCui38627 BYOD Dual SSID flow broken: WLC sends session ID not created on that ISE
CSCuh20269 WLC sends accupdates too frequently, indicates user roams to itself
CSCue94442 WLC starts three authentications simultaneously for the same endpoint
CSCue37405 Rate limit radius request when Radius server is overloaded
CSCug36414 McAllen: PreAuth DNS based ACL enhancements - EDCS: 1241322
CSCun62368 Radius NAC Client auth issues for 7.6
CSCuo39416 1131/1242 not forwading CWA redirects on 7.6
CSCug14713 WLC sends acct-update twice in the same millisecond
CSCue49527 WLC should delete the session ID from PMK cache when client is removed
CSCud12582 Processing AAA Error 'Out of Memory'
TAC Recommended AireOS 7.6 and 8.0 – Q2 CY15
https://supportforums.cisco.com/document/12481821/tac-recommended-aireos-76-and-80-2q-cy15

• In order to provide our customers with the most reliable Wireless LAN Controller software available,
Cisco Wireless TAC is now offering TAC Recommended AireOS builds for 7.6 and 8.0. These
"escalation" builds have several important bugfixes (beyond what is now available in CCO code) and
have been operating in production at customer sites for several weeks. See the release notes for
bugfix details.
• At present, the TAC Recommended AireOS builds are:
• For AireOS 7.6 customers, 7.6.130.26 Release Notes
• For AireOS 8.0 customers, 8.0.110.11. (Note that this build has many bugfixes beyond what the CCO
8.0.115.0 release has) Release Notes
• The TAC Recommended AireOS builds may be updated every week or two.
• The migration plan, from the TAC Recommended AireOS builds to CCO code, will be to the 8.0 MR2
release, planned for later this year. (Cisco does not plan to release another 7.6 maintenance build to
CCO.) 8.0 MR2 is in beta now (see https://supportforums.cisco.com/document/12492986/80mr2-beta-availability),
but does not yet have all of the applicable fixes.
• Cisco does not at present plan to post these builds to CCO. To request AireOS 7.6.130.26 and/or
8.0.110.11, open a Cisco TAC case on your Wireless LAN Controller contract.
Wireless Controllers Under Extreme Load (WLC 8.1)
• 5508 and WISM2
– 8 queues per server (max 17 servers configurable)
• 8510/7510
– 16 queues per server (max 17 servers configurable)
• For all platforms, each queue = 0-255 unique IDs. So with 8 queues, total 256*8 = 2048 requests/server.
• Example using 5508/WISM2:
– We will have a unique source port per queue.
– Total 8 unique source ports.

Queue     Server 1     Server 2
Queue 1   src port 1   src port 1
Queue 2   src port 2   src port 2
Queue 3   src port 3   src port 3
Queue 4   src port 4   src port 4
Queue 5   src port 5   src port 5
Queue 6   src port 6   src port 6
Queue 7   src port 7   src port 7
Queue 8   src port 8   src port 8

• Queue is selected based on MAC address hashing.
• Before 8.1, separate queues were added for Auth and Accounting, but all servers share the same two queues. (CSCud12582, CSCul96254)
• Related defects: CSCus51456, CSCur33085, CSCue37368, CSCuj88508
Q: 5508/WISM2 will have 8 queues per server. Are the 8 queues divided into 4 auth and 4 accounting?
A: They are not shared; there are 8 separate queues for accounting on 5508/WISM2.
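The queue arithmetic above can be sketched as follows. The per-platform queue counts and the 256 IDs per queue come from the slide; the 4096 figure for 8510/7510 is simply the same multiplication applied to 16 queues. The slide does not say which hash the WLC applies to the MAC address, so `pick_queue` uses CRC32 purely as a stand-in, and both function names are hypothetical.

```python
import zlib

IDS_PER_QUEUE = 256  # each queue carries IDs 0-255

# Queues per RADIUS server, as listed on the slide.
QUEUES_PER_SERVER = {"5508": 8, "WISM2": 8, "8510": 16, "7510": 16}


def max_outstanding(platform: str) -> int:
    """Maximum in-flight RADIUS requests per server for a platform."""
    return IDS_PER_QUEUE * QUEUES_PER_SERVER[platform]


def pick_queue(mac: str, platform: str) -> int:
    """Map a client MAC to a queue index.

    ASSUMPTION: CRC32 stands in for the WLC's real (undocumented here) hash.
    """
    normalized = mac.lower().replace(":", "").replace("-", "").replace(".", "")
    return zlib.crc32(normalized.encode("ascii")) % QUEUES_PER_SERVER[platform]


for platform in ("5508", "8510"):
    print(platform, max_outstanding(platform), "requests/server")
print("queue for client:", pick_queue("00:11:22:AA:BB:CC", "5508"))
```

Because the queue is chosen per MAC, all requests for one client share one queue (and one source port), which keeps a single noisy client from consuming IDs in every queue.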
Wireless Best Practices

Anchor Configurations
• RADIUS Accounting with Anchor Controllers
• Guest Anchors: Disable RADIUS Accounting on Guest Anchor WLAN (Enable on
Foreign Only)
• Campus Anchors: In campus roaming scenario where all controllers need to be
“primary” for same SSID, cannot disable RADIUS Accounting.
• Open SSIDs will always issue new session ID with RADIUS accounting update with new
ID, so disconnects original connection and user is re-authenticated.
• CSCul83594 Sev6 - Session-id is not synchronized across mobility if the network is open
• CSCue50944 Sev6 - CWA Mobility Roam Fails to Foreign with MAC Filtering BYOD
Wireless Best Practices
Roaming Considerations

• Session IDs can change when roaming between controllers (L2 or L3 roaming); going between APs on the same controller should be fine.
• Secure SSIDs (802.1X): L2/L3 roaming between controllers should be handled without reauth; all roams are basically symmetric with a tunnel back to the foreign controller.
• Open SSIDs (MAB, WebAuth):
– Avoid multiple controllers with open SSIDs; otherwise, clients will get a new session ID (reauth) regardless of whether it is an L2 or L3 roam.
– Reauth occurs any time the IP changes. For an open SSID, it will always issue a new session ID.
• Options:
– Stateful Controller Switchover
– Deploy higher-capacity controllers instead of many smaller ones.
– 802.11r will work with 7.6 or 8.0 and can be applied to the entire WLAN; it was simply not tested under 7.6, so a warning is provided.
Tune NAD Configuration
Rate Limiting at Wired Source

Wired (IOS / IOS-XE)
• RADIUS Interim Accounting: Use newinfo parameter with long interval (for example, 24-48 hrs), if available. Otherwise, set 15 mins.
• 802.1X Timeouts
– held-period: Increase to 300+ sec
– quiet-period: Increase to 300+ sec
– ratelimit-period: Increase to 300+ sec
• Inactivity Timer: Disable or increase to 1+ hours (3600+ sec)
• Session Timeout: Disable or increase to 2+ hours (7200+ sec)
• Reauth Timer: Disable or increase to 2+ hours (7200+ sec)
• Bugfixes: Upgrade software to address critical defects.

Diagram: Switch throttling reauth phones, unknown users, roaming supplicants, and misbehaving supplicants (Reauth period; Quiet-period 5 min; Held-period / Exclusion 5 min)
Wired – RADIUS Interim Accounting
All IOS and IOS-XE Platforms
• Command:
switch(config)# aaa accounting update [newinfo] [ periodic number [ jitter maximum max-value ] ]

• Recommendation:
switch(config)# aaa accounting update [newinfo periodic 14400 | periodic 15]

• Reference:
• When the aaa accounting update command is activated, the Cisco IOS software issues interim
accounting records for all users on the system. If the keyword newinfo is used, interim accounting
records will be sent to the accounting server every time there is new accounting information to report.
• When used with the keyword periodic, interim accounting records are sent periodically as defined by
the argument number (in minutes). The interim accounting record contains all of the accounting
information recorded for that user up to the time the interim accounting record is sent.
• Jitter is used to provide an interval of time between records so that the AAA server does not get
overwhelmed by a constant stream of records. If certain applications require that periodic records be
sent at exact intervals, you should disable jitter by setting it to 0.

Caution: Using the aaa accounting update periodic command can cause heavy congestion when
many users are logged in to the network
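Since the `periodic` argument is expressed in minutes, it helps to convert candidate values into hours and into the resulting update volume before committing them to config. The helper below is a rough sketch: the 10,000-session population is an assumed figure for illustration, the function names are hypothetical, and it ignores jitter and any extra records that `newinfo` would trigger.

```python
MINUTES_PER_DAY = 24 * 60


def interval_hours(periodic_minutes: int) -> float:
    """Convert an 'aaa accounting update periodic' value to hours."""
    return periodic_minutes / 60


def updates_per_day(sessions: int, periodic_minutes: int) -> float:
    """Approximate interim accounting records per day across all sessions."""
    return sessions * MINUTES_PER_DAY / periodic_minutes


SESSIONS = 10_000  # ASSUMPTION: illustrative number of active sessions

# 15 and 14400 appear in the recommendation above; 1440 and 2880 are
# the minute equivalents of the 24-48 hr range mentioned earlier.
for minutes in (15, 1440, 2880, 14400):
    print(f"periodic {minutes:>5}: every {interval_hours(minutes):g} h, "
          f"~{updates_per_day(SESSIONS, minutes):,.0f} updates/day")
```

The comparison makes the caution above concrete: a short periodic interval multiplies accounting load by the number of sessions, while a long interval combined with `newinfo` sends records mostly when something has changed.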
Wired - 802.1X Timeout Settings
All IOS and IOS-XE Platforms

• switch(config-if)# dot1x timeout held-period 300 | quiet-period 300 | ratelimit-period 300

held-period seconds: Supplicant waits X seconds before resending credentials after a failed attempt. Default: 60.
quiet-period seconds: Switch waits X seconds following failed authentication before trying to re-authenticate client. Default: 120. Cisco 7600 default: 60.
ratelimit-period seconds: Switch ignores EAPOL-Start packets from clients that authenticated successfully for X seconds. Default: rate limiting is disabled.

Throttles misconfigured/misbehaving clients:

Quiet-Period = 300 sec = Wait 5 minutes after failed 802.1X auth.

Ratelimit-Period = 300 sec = Ignore additional auth requests for 5 min. after successful 802.1X auth.
Wired - 802.1X Timeout Settings
Command Details

• Wired - All IOS and IOS XE platforms
• switch(config-if)# dot1x timeout held-period 300 | quiet-period 300 | ratelimit-period 300

held-period seconds: Configures the time, in seconds, for which a supplicant will stay in the HELD state (that is, the length of time it will wait before trying to send the credentials again after a failed attempt).
• The range is from 1 to 65535. The default is 60.
quiet-period seconds: Configures the time, in seconds, that the authenticator (server) remains quiet (in the HELD state) following a failed authentication exchange before trying to reauthenticate the client.
• For all platforms except the Cisco 7600 series Switch, the range is from 1 to 65535. The default is 120.
• For the Cisco 7600 series Switch, the range is from 0 to 65535. The default is 60.
ratelimit-period seconds: Throttles the EAP-START packets that are sent from misbehaving client PCs (for example, PCs that send EAP-START packets that result in the wasting of switch processing power).
• The authenticator ignores EAPOL-Start packets from clients that have successfully authenticated for the rate-limit period duration.
• The range is from 1 to 65535. By default, rate limiting is disabled.
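To see why raising quiet-period matters, the sketch below estimates how many failed authentications one permanently failing client can generate per day. It is a deliberately simplified model: it assumes the client retries on a fixed timer (30 seconds, as in the earlier "Repeats Every 30 Seconds" example) and that the switch starts no new attempt until the quiet period has expired. The function name is made up for this illustration.

```python
SECONDS_PER_DAY = 86_400


def failed_auths_per_day(client_retry_sec: int, quiet_period_sec: int) -> int:
    """Upper bound on daily failed auths for one client that always fails.

    ASSUMPTION: attempts are spaced by whichever is longer, the client's
    own retry timer or the switch quiet-period.
    """
    spacing = max(client_retry_sec, quiet_period_sec)
    return SECONDS_PER_DAY // spacing


for quiet in (60, 120, 300):
    print(f"quiet-period {quiet:>3}s -> "
          f"{failed_auths_per_day(30, quiet)} failed auths/day/client")
```

Under these assumptions, moving from the documented default of 120 seconds to the recommended 300 seconds cuts the per-client failure volume by more than half, before any ISE-side suppression is applied.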
Wired – Authentication Settings
Reduce the # Auths and ReAuths

• Disable or increase Inactivity Timer to 1+ hours; disable or increase Reauth to 2+ hours
• Enable inactivity timer with caution for non-user / MAB endpoints.

switch(config-if)# authentication ?
• periodic Enable or Disable Reauthentication for this port
• timer Set authentication timer values

switch(config-if)# authentication timer ?
• inactivity Interval in seconds after which if there is no activity from the client then it will be unauthorized (default OFF)
• reauthenticate Time in seconds after which an automatic re-authentication should be initiated (default 1 hour)

• On the Server Side (ISE), Idle and Session / Reauth timers are configured in the Authorization Profile
RADIUS Test Probes
Reduce Frequency of RADIUS Server Health Checks

• Wired NAD: RADIUS test probe interval set with idle-time parameter in radius-server config; Default is 60 minutes
– No action required
• Wireless NAD: If configured, WLC only sends “active” probe when server marked as dead.
– No action required
• Load Balancers: Set health probe intervals and retry values short enough to ensure prompt failover to another server in cluster occurs prior to NAD RADIUS timeout (typically 20-60 sec.) but long enough to avoid excessive test probes.

Diagram: Heartbeat frequency (LB health probes) alongside switch quiet period and WLC client exclusion
NAD RADIUS Test Probes
IOS Switch Test Probes

• By default, IOS Switches and WLC validate health through active authentications.
• Optional: IOS can send separate RADIUS test probes via idle-time setting.
• Recommendation: Keep default interval = 60 minutes
• Older command syntax:
radius-server host 10.1.98.8 auth-port 1812 acct-port 1813 test username radtest ignore-acct-port idle-time 120 key cisco123

• Newer command syntax:
radius server psn-cluster1
 address ipv4 10.1.98.8 auth-port 1812 acct-port 1813
 automate-tester username radtest ignore-acct-port idle-time 120
 key cisco123
Load Balancer RADIUS Test Probes

ACE Example
• Probe frequency and retry settings:
– Time interval between probes: interval seconds # Default: 15
– Retry count for failed probes: faildetect retry_count # Default: 3
• Sample ACE RADIUS probe configuration:
probe radius PSN-PROBE
 port 1812
 interval 20
 faildetect 2
 passdetect interval 90
 credentials radprobe cisco123 secret cisco123
• Recommended setting: Start with defaults and validate behavior in specific environment.

F5 Example
• Probe frequency and retry settings:
– Time interval between probes: Interval seconds # Default: 10
– Timeout before failure = 3*(interval)+1: Timeout seconds # Default: 31
• Sample F5 RADIUS probe configuration:
Name                PSN-Probe
Type                RADIUS
Interval            10
Timeout             31
Manual Resume       No
Check Until Up      Yes
User Name           f5-probe
Password            cisco123
Secret              cisco123
Alias Address       * All Addresses
Alias Service Port  1812
Debug               No
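The probe timing rules above lend themselves to a quick sanity check against the NAD RADIUS timeout. The F5 formula (3 × interval + 1) is taken from the slide; the ACE estimate (interval × retry count) is an assumed approximation that ignores probe response timeouts, and all function names are hypothetical.

```python
def f5_timeout(interval_sec: int) -> int:
    """F5 monitor timeout per the slide: 3 * interval + 1."""
    return 3 * interval_sec + 1


def ace_detection_time(interval_sec: int, retry_count: int) -> int:
    """Rough ACE failure-detection time.

    ASSUMPTION: retry_count failed probes spaced one interval apart.
    """
    return interval_sec * retry_count


def detects_before_nad_timeout(detection_sec: int, nad_timeout_sec: int) -> bool:
    """True if the LB marks the PSN down before the NAD gives up."""
    return detection_sec < nad_timeout_sec


print("F5 default timeout:", f5_timeout(10), "sec")
print("ACE sample config detection:", ace_detection_time(20, 2), "sec")
for nad_timeout in (20, 60):  # the 'typically 20-60 sec' range above
    ok = detects_before_nad_timeout(f5_timeout(10), nad_timeout)
    print(f"NAD timeout {nad_timeout}s -> F5 defaults fast enough: {ok}")
```

The check illustrates the trade-off in the recommendation: default probe settings fit comfortably under a 60-second NAD timeout, but a NAD timeout near 20 seconds would call for shorter probe intervals at the cost of more probe traffic.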
PSN Noise Suppression and Smarter Logging
Filter Noise and Provide Better Feedback on Authentication Issues

• PSN Collection Filters
• PSN Misconfigured Client Dynamic Detection and Suppression
• PSN Accounting Flood Suppression
• Detect Slow Authentications
• Enhanced Handling for EAP sessions dropped by supplicant or Network Access Server (NAS)
• Failure Reason Message and Classification
• Identify RADIUS Request From Session Started on Another PSN
• Improved Treatment for Empty NAK List

Diagram (PSN): Detect and reject misbehaving clients; Log Filter; filter health probes from logging; reject bad supplicant
PSN - Collection Filters
Static Client Suppression
Administration > System > Logging > Collection Filters
• PSN static filter based on
single attribute:
• User Name
• Policy Set Name
• NAS-IP-Address
• Device-IP-Address
• MAC (Calling-Station-ID)

• Filter Messages Based on Auth Result:
– All (Passed/Fail)
– All Failed
– All Passed

• Select Messages to Disable Suppression for failed auth @PSN and successful auth @MnT
PSN Filtering and Noise Suppression
Misconfigured Client Dynamic Detection and Suppression
Administration > System > Settings > Protocols > RADIUS
• Flag misbehaving supplicants when they fail auth more than once per interval
– Send Alarm with failure stats every interval.
– Stop sending logs for repeat auth failures for same endpoint during rejection interval.
– Successful auth clears flag
• Reject matching requests during interval
– Match these attributes: Supplicant (Calling-Station-ID), NAS (NAS-IP-Address), Failure reason
– Excludes CoA messages / bad credentials
– Next request after interval is fully processed.

CSCuj03131 Lower "Request Rejection Interval" minimum to 5 minutes (from 30 minutes)
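The detection and rejection behavior described above can be approximated with a small state machine. This is an illustrative sketch, not ISE's implementation: the class name `FailureSuppressor`, the 300-second detection interval, and the 3600-second rejection interval are assumptions, and the model simplifies by letting the caller state up front whether a request would pass (pass `failure_reason=None`).

```python
class FailureSuppressor:
    """Toy model of misconfigured-client detection and request rejection."""

    def __init__(self, detection_sec: int = 300, rejection_sec: int = 3600):
        self.detection_sec = detection_sec  # ASSUMED value
        self.rejection_sec = rejection_sec  # ASSUMED value
        self.last_failure = {}  # (mac, nas_ip, reason) -> time of last failure
        self.reject_until = {}  # (mac, nas_ip, reason) -> end of rejection

    def handle(self, mac, nas_ip, now, failure_reason=None) -> str:
        """Return 'passed', 'failed' (fully processed), or 'rejected'."""
        if failure_reason is None:
            # Successful auth clears every flag held for this endpoint.
            for table in (self.last_failure, self.reject_until):
                for key in [k for k in table if k[0] == mac]:
                    del table[key]
            return "passed"
        key = (mac, nas_ip, failure_reason)
        if now < self.reject_until.get(key, 0.0):
            return "rejected"  # dropped early, no repeat failure log sent
        previous = self.last_failure.get(key)
        self.last_failure[key] = now
        if previous is not None and now - previous <= self.detection_sec:
            # Second failure inside the detection interval: flag and reject.
            self.reject_until[key] = now + self.rejection_sec
        return "failed"


suppressor = FailureSuppressor()
outcomes = [
    suppressor.handle("aa:bb:cc:dd:ee:ff", "10.1.1.1", t, "client dropped TLS")
    for t in range(0, 3600, 30)
]
print(outcomes.count("failed"), "processed,", outcomes.count("rejected"), "rejected")
```

In this model a client that fails every 30 seconds is fully processed only twice in the hour; every other request is rejected cheaply, which is the effect the PSN setting aims for.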


Enhanced EAP Session Handling
Improved Treatment for Empty NAK List
• Best Effort for Supplicants that Improperly Reply with Empty NAK List: PSN suggests the most secure or preferred EAP protocol configured (per Allowed Protocols list).
– Some supplicants may reply with NAK and not suggest alternative protocol (empty NAK list).
– ISE will now suggest other supported protocols rather than fail auth.

• Set Preferred EAP Protocol on ISE to most common method used by network.
– This sets the list of proposed EAP methods sent to supplicant during auth negotiation.
– Value is disabled by default.

Policy > Policy Elements > Results > Authentication > Allowed Protocols
MnT Log Suppression and Smarter Logging
Drop and Count Duplicates / Provide Better Monitoring Tools
• Drop duplicates and increment counter in Live Log for “matching” passed authentications
• Display repeat counter to Live Sessions entries.
• Update session, but do not log RADIUS Accounting Interim Updates
• Log RADIUS Drops and EAP timeouts to separate table for reporting purposes and display as counters on Live Log Dashboard along with Misconfigured Supplicants and NADs
• Alarm enhancements
• Revised guidance to limit syslog at the source.
• MnT storage allocation and data retention limits
• More aggressive purging
• Support larger VM disks to increase logging capacity and retention.

Diagram (MNT): Count and discard repeated events; count and discard untrusted events; count and discard repeats and unknown NAD events
MnT Noise Suppression
Suppress Successful Auths and Accounting
Administration > System > Settings > Protocols > RADIUS
• Do not save repeated successful auth events to DB (events will not display in Live Auth log).
• Stop sending Accounting logs for same session during interval. Allow 2 updates, then suppress if more updates arrive in interval, up to 24 hrs.
• Detect and log NAS retransmission timeouts for auth steps that exceed threshold. (Step latency is visible in Detailed Live Logs)

CSCur42723
• Original Range: 1 – 30 seconds
• New Range: 1 sec – 1 day
MnT Duplicate Passed Auth Suppression
Drop and Count Duplicates
• Unique session entries determined by hash created based on these attributes:
– Called Station Id
– User Name
– Posture Status
– CTS Security Group
– Authentication Method
– Authentication Protocol
– NAS IP Address
– NAS Port Id
– Selected Authorization Profile

5eaf59f1e6cd6aa6113ca1463c779c3f (MD5 hash)

• “Discard duplicate” logic not applicable to failed auths as these are not cached in session
• RADIUS Accounting (Interim) updates are dropped from storage, but do update session
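The drop-and-count idea can be illustrated with a short sketch that hashes the nine attributes listed above and counts repeats instead of storing them. The attribute list and the use of MD5 come from the slide; how ISE actually serializes the attributes before hashing is not stated, so the pipe-joined string, the attribute key spellings, and the class name `PassedAuthDeduplicator` are assumptions.

```python
import hashlib

# Attributes named on the slide; key spellings here are illustrative.
SESSION_KEY_ATTRIBUTES = (
    "Called-Station-ID", "User-Name", "Posture-Status", "CTS-Security-Group",
    "Authentication-Method", "Authentication-Protocol", "NAS-IP-Address",
    "NAS-Port-Id", "Selected-Authorization-Profile",
)


def session_hash(event: dict) -> str:
    """MD5 over the key attributes (ASSUMED pipe-joined serialization)."""
    joined = "|".join(str(event.get(name, "")) for name in SESSION_KEY_ATTRIBUTES)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


class PassedAuthDeduplicator:
    """Log the first passed auth per hash; count later duplicates."""

    def __init__(self):
        self.repeat_count = {}  # hash -> number of suppressed repeats

    def should_log(self, event: dict) -> bool:
        digest = session_hash(event)
        if digest in self.repeat_count:
            self.repeat_count[digest] += 1  # drop and count
            return False
        self.repeat_count[digest] = 0
        return True


event = {
    "Called-Station-ID": "00-11-22-33-44-55:corp",
    "User-Name": "jdoe",
    "Authentication-Method": "dot1x",
    "Authentication-Protocol": "PEAP (EAP-MSCHAPv2)",
    "NAS-IP-Address": "10.1.1.1",
    "NAS-Port-Id": "Gi1/0/1",
    "Selected-Authorization-Profile": "Employee",
}
dedup = PassedAuthDeduplicator()
print([dedup.should_log(event) for _ in range(3)])
```

Any change to one of the hashed attributes, such as a different authorization profile, produces a new hash and therefore a new logged entry, which matches the "no change in identity content, network device, and authorization" wording of the Repeat Counter.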
Live Authentications and Sessions

Blue entry = Most current Live Sessions entry with repeated successful auth counter
Authentication Suppression
Enable/Disable
• Global Suppression Settings: Administration > System > Settings > Protocols >
RADIUS
Failed Auth Suppression Successful Auth Suppression

Caution: Do not disable suppression in deployments with very high auth rates.
It is highly recommended to keep Auth Suppression enabled to reduce MnT logging
• Selective Suppression using Collection Filters: Administration > System > Logging > Collection Filters
– Configure specific traffic to bypass Successful Auth Suppression
– Useful for troubleshooting authentication for a specific endpoint or group of endpoints, especially in high auth environments where global suppression is always required.
Per-Endpoint Time-Constrained Suppression New in ISE 1.3

Right Click on Endpoint ID


Session States

• Sessions can have one of 6 states as shown in the Live Sessions drop-down.
– NAD START --> Authenticating
– NAD SUCCESS --> Authorized
– NAD FAIL / ACCT STOP / AUTH FAIL --> Terminated
– POSTURED --> Postured
– AUTH PASS --> Authenticated
– ACCT START / UPDATE --> Started

• The first two happen only for wired switchports with epm logging enabled and when MnT nodes are configured to receive these logs via syslog from the NAD
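The event-to-state mapping above is essentially a lookup table, sketched here for reference. The event and state names are copied from the slide; the dictionary and function names are hypothetical, and the sketch does not attempt to model ordering or precedence between events.

```python
# Event received by MnT -> session state shown in Live Sessions.
SESSION_STATE_BY_EVENT = {
    "NAD START": "Authenticating",
    "NAD SUCCESS": "Authorized",
    "NAD FAIL": "Terminated",
    "ACCT STOP": "Terminated",
    "AUTH FAIL": "Terminated",
    "POSTURED": "Postured",
    "AUTH PASS": "Authenticated",
    "ACCT START": "Started",
    "ACCT UPDATE": "Started",
}

# States that require NAD syslog (epm logging) to reach MnT.
SYSLOG_ONLY_STATES = {"Authenticating", "Authorized"}


def session_state(event: str) -> str:
    """Return the Live Sessions state for a given event name."""
    return SESSION_STATE_BY_EVENT[event.strip().upper()]


print(session_state("auth pass"))
print(sorted(set(SESSION_STATE_BY_EVENT.values())))
```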
Clearing Stale ISE Sessions

• Automatic Purge: A purge job runs approximately every 5 minutes to clear sessions that meet any of the following criteria:
1. Endpoint disconnected (Ex: failed authentication) in the last 15 minutes (grace time allotted in case of authentication retries)
2. Endpoint authenticated in last hour but no accounting start or update received
3. Endpoint idle: no activity (authentication / accounting / posturing / profiling updates) in the last 5 days
• Note: Session is cleared from MnT but does not generate CoA, to prevent negative impact to connected endpoints. In other words, the MnT session is no longer visible and the endpoint may still have network access, but it no longer consumes a license.

• Manual Purge via REST API: HTTP DELETE API can manually delete inactive sessions.
– An example web utility that supports HTTP DELETE operation is cURL. It is a free 3rd-party command line tool for transferring data with HTTP/HTTPS:
http://www.cisco.com/en/US/docs/security/ise/1.2/api_ref_guide/ise_api_ref_ch2.html#wp1072950
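The three automatic purge criteria above can be expressed as a single eligibility check. This is a sketch under stated assumptions rather than a description of the actual purge job: it reads criterion 1 as "disconnected and the 15-minute grace time has elapsed" and criterion 2 as "authenticated at least an hour ago with no accounting seen", and the `Session` fields and function name are invented for the example.

```python
from dataclasses import dataclass
from typing import Optional

GRACE_SEC = 15 * 60            # grace time after disconnect / failed auth
NO_ACCOUNTING_SEC = 60 * 60    # authenticated but no accounting received
IDLE_SEC = 5 * 24 * 60 * 60    # no activity of any kind


@dataclass
class Session:
    last_activity: float                    # any auth/acct/posture/profiling
    authenticated_at: Optional[float] = None
    disconnected_at: Optional[float] = None
    accounting_seen: bool = False


def purge_reason(session: Session, now: float) -> Optional[str]:
    """Return why a session is purge-eligible, or None to keep it."""
    if (session.disconnected_at is not None
            and now - session.disconnected_at >= GRACE_SEC):
        return "disconnected"
    if (session.authenticated_at is not None and not session.accounting_seen
            and now - session.authenticated_at >= NO_ACCOUNTING_SEC):
        return "no-accounting"
    if now - session.last_activity >= IDLE_SEC:
        return "idle"
    return None


now = 1_000_000.0
stale = Session(last_activity=now - 7200, authenticated_at=now - 7200)
print(purge_reason(stale, now))
```

A job running every 5 minutes would call a check like this for each session and remove those with a non-None reason, without sending CoA to the NAD.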
Live Authentications Log
Dashboard Counters (drill down on counters to see details)

• Misconfigured Supplicants: Supplicants failing to connect repeatedly in the last 24 hours
• Misconfigured Network Devices: Network devices with aggressive accounting updates in the last 24 hours
• RADIUS Drops: RADIUS requests dropped in the last 24 hours
• Client Stopped Responding: Supplicants stopped responding during conversations in the last 24 hours
• Repeat Counter: Authentication requests repeated in the last 24 hours with no change in identity content, network device, and authorization.
Counters – Misconfigured Supplicants
Endpoints That Continuously Fail Authentication
Counters – Misconfigured NAS
Access Devices That Send Excessive or Invalid RADIUS Accounting
Counters – RADIUS Drops
Duplicate Session Attempts, Undefined NAD, Secret Mismatch, Non-Conforming, Etc.
Counters – Clients Stopped Responding
Supplicants That Fail to Complete EAP Authentication
Counters – Repeat Count
Endpoint Tally of Successful Re-authentications
Repeat Counter
Successful Authentication Suppression

• Global Repeat Counter displayed in Live Authentications Log dashboard
• Session Repeat Counter displayed in Live Authentication and Sessions Log
• Can reset counters for all sessions or individual session
• Be sure to enable display under “Add or Remove Columns”


ISE 1.2 Alarms

• Alarms now displayed as dashlet on ISE Home Page
• Following alarms are added or enhanced in ISE 1.2
– Misconfigured supplicant
– Misconfigured NAS
– Detect Slow Authentications
– RADIUS Request Dropped with more accurate failure reasons
– Excessive Accounting Messages
– Mixing RADIUS Request between ISE PSNs due to NAD/LB behavior.

Do not forget about the new Search function in 1.2!
Minimize Syslog Load on MNT
Disable NAD Logging and Filter Guest Activity Logging
• Rate Limiting at Source: Disable NAD Logging unless required for troubleshooting
• Guest Activity: Log only if required. Filter and send only relevant logs (filter syslog at source).
• Guest Activity: If cannot filter at source, use smart syslog relay (Syslog Forwarder; filter at relay).

Diagram: switch / wlc -> LB -> PSN -> MNT, with a Syslog Forwarder filtering at the relay
NAD Logging
Disable by Default

• Recommended enable for troubleshooting purposes only.
• If logging configured, the correct commands should include:

epm logging
logging origin-id # where origin-id = IP address A.B.C.D
logging source-interface <interface-id> # where interface-id IP address = A.B.C.D
logging host <MNT1> transport udp port 20514
logging host <MNT2> transport udp port 20514 # Optional for redundancy, but not required for troubleshooting purposes
Guest Activity Logging

• Enable with purpose—only send logs of interest that will apply to guest
sessions.
• ISE only parses log messages that include IP address of active guest account
ASA Example:
• Create Service Policy to inspect
HTTP traffic for guest subnet
• Filter messages ID # 304001:
accessed URLs
• Log Filtering:
• If NAD supports, configure filters to limit logs only to those needed/usable by MnT.
• If unable to filter at NAD, use Syslog Relay to filter and forward desired messages.
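Where filtering has to happen at a syslog relay, the relay logic can be as simple as the sketch below: forward only message ID 304001 (accessed URLs) and only when the message mentions the IP of an active guest. The message ID comes from the ASA example above; the sample log lines, the loose severity match, and the function name `relay_filter` are assumptions, and a production relay would normally use its own native filter rules rather than custom code.

```python
import re
from typing import Iterable, Iterator, Set

# ASA message ID 304001 (accessed URLs); severity digit matched loosely.
ASA_URL_MSG = re.compile(r"%ASA-\d-304001\b")
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def relay_filter(lines: Iterable[str], guest_ips: Set[str]) -> Iterator[str]:
    """Yield only URL-access messages that mention an active guest IP."""
    for line in lines:
        if not ASA_URL_MSG.search(line):
            continue  # not a message ID of interest to MnT
        if guest_ips.intersection(IPV4.findall(line)):
            yield line


# Hypothetical sample messages; real ASA formatting may differ.
sample = [
    "%ASA-5-304001: 10.1.50.23 Accessed URL 192.0.2.10:/index.html",
    "%ASA-5-304001: 10.9.9.9 Accessed URL 192.0.2.10:/index.html",
    "%ASA-6-302013: Built outbound TCP connection for 10.1.50.23",
]
forwarded = list(relay_filter(sample, {"10.1.50.23"}))
print(len(forwarded), "of", len(sample), "messages forwarded to MnT")
```

Matching whole IP tokens rather than substrings avoids forwarding a message for 10.1.50.230 when only 10.1.50.23 is an active guest.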

Diagram: Firewall -> Syslog Forwarder -> MnT
Panel titles: No Log Suppression; With Log Suppression; Distributed Logging
Brief Cisco IT Case Study
Current Cisco IT ISE Production Deployment Metrics
• Infrastructure (Production)
• Guest Services: ISE 1.2, P13; 9 VM servers in one dedicated deployment
• Production: ISE 1.2, P13; 29 VM servers in one global deployment
• Pre-Production: ISE 1.3, P1; 24 VM servers in one global deployment (migration ongoing)
• Services
• Guest services (ION) (440 sites, potential 136K users & 14K guests per week)
• 802.1X Wire Monitor Mode (192 devices, 83 sites)
• 802.1X Wireless Auth Mode (400 wlan sites, 90K+ end-users, All IT owned WLCs except couple sites)
• 802.1X Wireless Auth CVO* (~15K CVO sites, ~15K global users – 60% completion)
• Wireless Policy Enforcement (2 Extranet Partner sites in BGL; Pilot mode)

• Total ~600K+ Profiled Endpoints in database; Max 60K+ Concurrent Endpoints Globally
* CVO is Cisco Virtual Office, or small office/home office
Correct as of 08 March 2015
Executive Summary
• Significant progress has been made in stabilizing ISE 1.2
• Replication is now working across the deployment with RADIUS probing and SNMP polling enabled
• Next steps:
• Apply ISE SNMP fixes and enable SNMP polling – reduce traffic from CVO sites*
• Cisco IT to continue updating network devices and endpoints to reduce “traffic”
• Resume production rollout (CVOs and wired devices)
• Post mortem to review lessons learned and “product enhancements”

Item | Owner | Impact
Configure ACE for accounting “stickiness” | Cisco IT | High – reduced accounting traffic from 6M to 3M txns per day
Implement eng fix to enable accounting suppression | BU | High – further reduction in accounting traffic
Remove “IP” as a significant attribute (design change) | BU | High – removed traffic from “noisy” endpoints
Implement WLC OS updates to fix duplicate accounting issue | Cisco IT | High – reduce traffic from wireless network accounting txns
Implement eng fix for SNMP polling | BU | High – reduce # of SNMP traffic to enable CVO
Impact of Config Changes and Engineering Fixes
Reduction of Transaction load on ISE IT Deployment
Cisco IT and the Identity Services Engine
A multiyear deployment journey
• White paper: http://www.cisco.com/c/en/us/solutions/collateral/enterprise/cisco-on-cisco/wp-en-02092015-identity-services-engine.html
• Attend CCSSEC-2002
Cisco IT – Identity
Services Engine (ISE)
Deployment and
Best Practices
• Friday (6/11/15)
12:30 – 1:30 pm
• Presented by Bassem
Khalifé, Cisco IT
High Availability
High Availability Agenda
• ISE Node Redundancy
• Administration Nodes
• Monitoring Nodes
• pxGrid Nodes
• Inline Posture Nodes

• HA for Certificate Services


• Policy Service Node Redundancy
• Load Balancing
• Non-LB Options

• NAD Fallback and Recovery


Administration HA and Synchronization
PAN Steady State Operation
• Maximum two PAN nodes per deployment
• Active / Standby
• Changes made to Primary Administration DB are automatically synced to all nodes.

[Diagram: Admin User connects to the Primary Admin Node (PAN), which sends Policy Sync to the Secondary Admin Node (PAN), the Policy Service Nodes (PSN), and the Primary and Secondary Monitoring Nodes (MnT); Logging flows to the MnT nodes.]
Administration HA and Synchronization
Primary PAN Outage and Recovery
• Prior to ISE 1.4, upon Primary PAN failure, admin user must connect to Secondary PAN and manually
promote Secondary to Primary; new Primary syncs all new changes.
• PSNs buffer endpoint updates if Primary PAN unavailable; buffered updates sent once PAN available.

[Diagram: Primary Admin Node (PAN) is down; the Secondary Admin Node is promoted (Secondary -> Primary) and takes over Policy Sync to the Policy Service Nodes (PSN); Logging continues to the Primary and Secondary Monitoring nodes (MnT).]
• Promoting Secondary Admin may take 10-15 minutes before the process is complete.
• New Guest Users or Registered Endpoints cannot be added or connect to the network when the Primary Administration node is unavailable!
Policy Service Survivability When Admin Down/Unreachable
Which User Services Are Available if Primary Admin Node Is Unavailable?
Service | Use case | Works (Y / N)
RADIUS Auth | Generally all RADIUS auth should continue provided access to ID stores | Y
Guest | All existing guests can be authenticated, but new guests, self-registered guests, or guest flows relying on device registration will fail. | N
Profiler | Previously profiled endpoints can be authenticated with existing profile. New endpoints or updates to existing profile attributes received by owner should apply, but not profile data received by PSN in foreign node group. | Y
Device Registration | Device Registration fails if unable to update endpoint record in central database | N
BYOD | BYOD/NSP relies on device registration. Additionally, any provisioned certificate cannot be saved to database. | N
MDM | MDM fails on update of endpoint record | N
CA/Cert Services | Related to BYOD/NSP use case, certificates can be issued but will not be saved and thus fail. OCSP will function, but database used is the last replicated version | N
pxGrid | Clients that are already authorized for a topic and connected to controller will continue to operate, but new registrations and connections will fail. | N
Automatic PAN Switchover (New in ISE 1.4)
• Primary PAN (PAN-1) down or network link down.
• If Health Monitor is unable to reach PAN-1 but can reach PAN-2, then it triggers failover.
• Secondary PAN (PAN-2) is promoted by Health Monitor.
• PAN-2 becomes Primary and takes over PSN replication.
• Don’t forget: after switchover, admin must connect to PAN-2 for ISE management!
[Diagram: DC-1 hosts PAN-1 (Primary), MNT-1 (Primary), PSNs, and a PAN Health Monitor; DC-2 hosts PAN-2 (Secondary), MNT-2 (Secondary), PSNs, and a PAN Health Monitor; the two data centers connect over the WAN.]

Note: Switchover is NOT immediate. Total time based on polling intervals and promotion time.
Expect ~ 30 minutes.
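The health monitor decision described on this slide can be sketched as a small function. This is an illustration only, not ISE code: the names `failed_polls`, `threshold`, and `poll_interval_s` are hypothetical stand-ins for the polling settings the slide refers to.

```python
from enum import Enum


class Action(Enum):
    NONE = "no action"
    PROMOTE_SECONDARY = "request promotion of secondary PAN"
    ALARM_ONLY = "raise alarm; promotion request cannot be made"


def decide(failed_polls: int, threshold: int, secondary_reachable: bool) -> Action:
    """Decide what a health monitor does after polling the primary PAN.

    failed_polls: consecutive failed health checks against the primary PAN.
    threshold: failed checks required before the primary is declared down.
    secondary_reachable: whether the monitor can reach the secondary PAN.
    """
    if failed_polls < threshold:
        return Action.NONE
    if secondary_reachable:
        return Action.PROMOTE_SECONDARY
    return Action.ALARM_ONLY


def detection_seconds(poll_interval_s: int, threshold: int) -> int:
    """Time spent detecting the failure, before promotion time is added."""
    return poll_interval_s * threshold
```

Total switchover time is the detection time plus the promotion time, which is why the switchover is not immediate.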
ISE Admin Failover
“Automated Promotion/Switchover”

• Primary PAN and secondary PAN can be in different subnets/locations


• Secondary nodes close to the respective PANs act as their health monitors
• Health Monitors:
• Maximum 2; Could be same node (recommend 2 if available)
• Requires distributed deployment.
• Can be any node—other than Admin node (or same node where Admin persona present)
• Recommend node(s) close to PAN to be monitored to differentiate between local versus
broader network outage, but should not be on SAME server if virtual appliance.

• Monitor Process:
• Secondary node monitoring the health of the Primary PAN node is the Active monitor
• On Failure detection, Health Monitor for Primary PAN node initiates switchover by
sending request to the Secondary PAN to become new primary PAN
PAN Failover Scenario
Scenario 1
• Primary PAN (PAN-1) down
• Secondary PAN (PAN-2) takes over
[Diagram: DC-1 (PAN-1 Primary, marked X; MNT-1 Primary; PSNs; PAN Health Monitor) and DC-2 (PAN-2 Secondary; MNT-2 Secondary; PSNs; PAN Health Monitor) connected over the WAN, with direct failover detection between the two PANs.]
PAN Failover Scenario
Scenario 2
• Connection between Primary PAN and Secondary PAN is down.
• Connection between PAN and Health Monitor is up.
• Direct failover detection between PANs will cause false switchover and data out of sync.
• Using an external monitor can avoid false switchover.
[Diagram: DC-1 (PAN-1 Primary; MNT-1 Primary; PSNs; PAN Health Monitor) and DC-2 (PAN-2 Secondary; MNT-2 Secondary; PSNs; PAN Health Monitor) connected over the WAN; the direct failover detection path between the PANs is marked X.]
PAN Failover Scenario
Scenario 3
• Connectivity between the data centers is down
• Complete network split
• Cannot be handled by PAN Failover
• Local WAN survivability required
[Diagram: DC-1 (PAN-1 Primary; MNT-1 Primary; PSNs; PAN Health Monitor) and DC-2 (PAN-2 Secondary; MNT-2 Secondary; PSNs; PAN Health Monitor); the WAN link between them, including direct failover detection, is marked X.]
PAN Failover
Configuration

Configuration using GUI only under Administration > System > Deployment > PAN Failover
Alarms in PAN Auto-Failover

Critical Alarms
• Health check node finds primary PAN down
• Health check node makes a promotion call to secondary PAN
• Health check node is not able to make promotion request to secondary PAN
• Secondary PAN rejects the promotion request made by the health check node

Warning Alarms
Invalid auto-failover monitoring
• Mostly because health check node is out of sync
• PAN Auto-failover is disabled but primary PAN is receiving health check probes
• Primary PAN receives health probes from invalid health check node
• Secondary PAN info with the health check node is not correct
• Node receiving the health probe says it is not the correct primary PAN node
No health-check probes received
• Primary PAN does not receive the health check probes though it is configured
Promotion of secondary PAN is called by the health check node
PAN Auto-Failover Alarm Details For Your
Reference
Drill down on specific alarm to get Detailed Alarm information in a new page
MnT Distributed Log Collection For Your
Reference
• ISE supports distributed log collection across all nodes to optimize local data collection,
aggregation, and centralized correlation and storage.
• Each ISE node collects logs locally from itself; Policy Service nodes running Profiler Services may
also collect log (profile) data from NADs.
• Each node transports its Audit Logging data to each Monitoring node as Syslog; these logs are not
buffered unless TCP or Secure Syslog is used.
• NADs may also send Syslog directly to Monitoring node on UDP/20514 for activity logging,
diagnostics, and troubleshooting.
[Diagram: NADs, Policy Service Nodes, Monitoring Nodes, and External Log Servers. Labels shown: profiling data from NADs (HTTP SPAN, DHCP SPAN/Helper/Proxy, NetFlow, SNMP Traps, RADIUS); Profiler Syslog (UDP/30514); Syslog (UDP/20514, Not Buffered); Alarm-triggered Syslog; External Log Targets: Syslog (UDP/20514).]
HA for Monitoring and Troubleshooting
Steady State Operation
• Maximum two MnT nodes per deployment
• Active / Active

• MnT nodes concurrently receive logging from PAN, PSN, IPN*, NAD, and ASA
• PAN retrieves log/report data from Primary MnT node when available
[Diagram: PAN, PSN, and IPN send Syslog (20514) to the Primary and Secondary Monitoring Nodes (MnT); the PAN retrieves MnT data for the Admin User.]
• Syslog from access devices is correlated with the user/device session.
• Syslog from ISE nodes is sent for session tracking and reporting.
• Syslog from firewall (or other user logging device) is correlated with the guest session for activity logging.
*Inline Posture Node supports logging to a single target only.
HA for Monitoring and Troubleshooting
Primary MnT Outage and Recovery

• Upon MnT node failure, PAN, PSN, NAD, and ASA continue to send logs to remaining MnT node;
IPN must be reconfigured to send logs to active MnT (only supports one log target).
• PAN auto-detects failure (down for > 5 min) and retrieves log/report data from Secondary MnT node.

[Diagram: Primary Monitoring Node is down; PAN, PSN, and IPN send Syslog (20514) to the Secondary Monitoring Node, and the PAN retrieves MnT data from it for the Admin User.]
• Syslog from access devices is correlated with the user/device session.
• Syslog from PSN nodes is sent for auth session tracking, troubleshooting, and reporting.
• Syslog from firewall and other loggers is correlated with the guest session for activity logging.
• PSN logs are not locally buffered when MnT is down unless TCP/Secure syslog is used.
• Log DB is not synced between MnT nodes.
• Upon return to service, recovered MnT node will not include data logged during outage.
• Backup/Restore required to re-sync MnT database.
*Inline Posture Node supports logging to a single target only.
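The PAN's choice of reporting source during a Primary MnT outage can be sketched as below. The five-minute threshold comes from the slide; the function name and the timestamp-based interface are assumptions made for the example, not ISE internals.

```python
FAILOVER_AFTER_SECONDS = 5 * 60  # "down for > 5 min" per the slide


def select_report_source(primary_down_since, now):
    """Pick which MnT node the PAN reads log/report data from.

    primary_down_since: time (in seconds) when the Primary MnT was first
        seen down, or None if it is currently up.
    now: current time in the same units.
    """
    if primary_down_since is None:
        return "primary"
    if now - primary_down_since > FAILOVER_AFTER_SECONDS:
        return "secondary"
    # Outage is shorter than the detection window, so keep using the Primary.
    return "primary"
```

Because the log database is not synced between MnT nodes, reports read from the Secondary reflect only what that node itself received.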
Log Buffering
TCP and Secure Syslog Targets

• Default UDP-based
audit logging does not
buffer data when MnT
is unavailable.
• TCP and Secure
Syslog options can be
used to buffer logs
locally
• Note: Overall log
performance will
decrease if these
acknowledged options are used.
HA for pxGrid
Steady State
• Maximum two pxGrid nodes per deployment
• Active / Standby
• pxGrid clients can be configured with up to 2 servers.
• Clients connect to single active controller
PAN Publisher Topics:
• Controller Admin
• TrustSec/SGA
• Endpoint Profile
MnT Publisher Topics:
• Session Directory
• Identity Group
• ANC (EPS)
[Diagram: Primary and Secondary PAN and MnT nodes act as pxGrid clients (publishers) to the Active and Standby pxGrid Controllers (PXG); a pxGrid client (subscriber) connects to the active controller. Ports shown: TCP/5222 and TCP/12001.]
HA for pxGrid
Failover and Recovery
• If active pxGrid Controller fails, clients automatically attempt connection to standby controller.
PAN Publisher Topics: Controller Admin, TrustSec/SGA, Endpoint Profile
MnT Publisher Topics: Session Directory, Identity Group, ANC (EPS)
[Diagram: Primary and Secondary PAN and MnT nodes act as pxGrid clients (publishers) to the Active and Standby pxGrid Controllers (PXG); the pxGrid client (subscriber) reconnects to the standby controller. Ports shown: TCP/5222 and TCP/12001.]
pxGrid HA For Your
Reference
Design Considerations

• Download Identity certs from the Primary and Secondary MnT nodes to pxGrid
clients and import both into the keystore.
• Specify the hostname of both pxGrid nodes in the pxGrid API.

Example:
./register.sh -keystoreFilename isekeyfile.jks -keystorePassword cisco123
-truststoreFilename rootfile.jks -truststorePassword cisco123
-hostname 10.0.1.33 10.0.2.79
• The pxGrid clients will register to both pxGrid nodes.
• If the primary pxGrid node goes down, the pxGrid client will
continue communication with the secondary pxGrid node.
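The client-side behavior of trying the listed pxGrid nodes in order can be sketched at the TCP level. This is not the pxGrid API: the function names are hypothetical, and the sketch only shows the ordering logic, using TCP/5222 as shown on the pxGrid HA slides.

```python
import socket

PXGRID_PORT = 5222  # TCP port shown on the pxGrid HA slides


def tcp_connect(host, port=PXGRID_PORT, timeout=3.0):
    """Open a plain TCP connection; a real client would also negotiate TLS."""
    return socket.create_connection((host, port), timeout=timeout)


def connect_first_available(hosts, connect=tcp_connect):
    """Try each pxGrid node in order and return (host, connection).

    `connect` is injectable so the ordering logic can be exercised
    without a live controller.
    """
    errors = {}
    for host in hosts:
        try:
            return host, connect(host)
        except OSError as exc:
            errors[host] = exc  # remember why this node was skipped
    raise ConnectionError(f"no pxGrid controller reachable: {errors}")
```

With the two addresses from the example above, a call such as `connect_first_available(["10.0.1.33", "10.0.2.79"])` would fall through to the second node only if the first refuses or times out.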
HA for Inline Posture Node
VPN Example
• Maximum two IPNs per instance; multiple instances supported
• Active / Standby
• ASA HA: A/S or VPN Cluster
• VPN Client HA: VPN to single ASA HA IP or VPN Cluster IP
• New: ASA 9.2.1 supports native CoA and URL Redirection for ISE Posture Services, so Inline Posture Node is no longer required for remote access ASA VPN.
VLANs
• VLAN 11: (ASA VPN; Inline node untrusted)
• VLAN 12: (Inline node trusted)
• VLAN 13: (Inline Heartbeat Link)
• VLAN 14: (ASA Inside)
• VLAN 15: (Internal Network)
[Diagram: VPN User → Internet (ISP A / ISP B) → Internet Routers → External Switches → redundant ASAs (outside, inside, and vpn interfaces; FO State Link; ASA Redundant Links) → ACTIVE and STANDBY ISE Inline Posture Nodes (eth1 on VLAN 11, eth0 on VLAN 12, eth2 HB Link on VLAN 13; Inline Service IP on each side) → L3 Switches (Trunk: VLANs 11-15) → Internal Network and PSN.]
Inline Posture Node For Your
Reference
Considerations
• HA link is used to exchange heartbeat messages to check the status of mutual peer
• Appliance eth2 and eth3 ports used for HA link
• Multiple HA links can be configured; as long as heartbeat messages are received over at least one HA
link, the peer is considered healthy.
• HA link is a dedicated, highly reliable Layer 2 connection between failover pairs; can be a LAN cable or
dedicated VLAN connection. (Ethernet ports auto-detect MDI/MDI-X, so crossover cable optional.)
• Inline Posture Node HA supports link detection to allow failover to occur if active Inline Posture Node
detects loss of network connectivity while Standby does not; prevent traffic black hole due to other
network failures.
• In case of failure, Standby Inline Posture Node assumes “ownership” of service IP and sends gratuitous
ARPs out each interface to notify gateways of change.
• HA failover is stateless, so all active sessions need to be re-authorized upon FO. Standby Inline
Posture Node will auto-fetch session state/policy as needed.
HA for Internal Certificate Authority
• Primary PAN is Root CA for ISE deployment
• May be Subordinate to external Root CA or Standalone Root.
• All PSNs are Subordinate CAs to PAN
• PSNs are SCEP Registration Authorities (RAs)
• Each PSN can issue certs even if Root (Primary PAN) fails
• Each PSN runs OCSP responder. OCSP DB replicated so can point to any PSN, or LB PSN cluster for OCSP HA
• Promotion of Standby PAN:
• No effect on sub-CA operation.
• To maintain same CA root after promotion of Secondary, be sure to export/import the Public/Private keys from Primary PAN from CLI:
# application configure ise
[Diagram: Enterprise Root (optional) above the Primary PAN (ISE Root CA) and Standby PAN (Backup Root); below them, four PSNs, each a Subordinate CA, SCEP RA, and OCSP Responder.]
Export CA Certs from Primary PAN
• Export the CA Certs to a Repository
• Will be an Encrypted GPG Bundle
• Four Key Pairs: Root CA, Sub CA, RA, OCSP

cisco-lab-ise/admin# application configure ise

Selection ISE configuration option
<SNIP>
[7]Export Internal CA Store
[8]Import Internal CA Store
</SNIP>
[12]Exit
7
Export Repository Name: NAS
Enter encryption-key for export: ##########
Export on progress...............
The following 4 CA key pairs were exported to repository 'NAS' at 'ise_ca_key_pairs_of_cisco-lab-ise':
        Subject:CN=Certificate Services Root CA - cisco-lab-ise
        Issuer:CN=Certificate Services Root CA - cisco-lab-ise
        Serial#:0x6012831a-16794f11-b1248b9b-c7e199ef

        Subject:CN=Certificate Services Endpoint Sub CA - cisco-lab-ise
        Issuer:CN=Certificate Services Root CA - cisco-lab-ise
        Serial#:0x3e4d9644-934843af-b5167e76-cc0256e0

        Subject:CN=Certificate Services Endpoint RA - cisco-lab-ise
        Issuer:CN=Certificate Services Endpoint Sub CA - cisco-lab-ise
        Serial#:0x13511480-9650401a-8461d9d7-5b8dbe17

        Subject:CN=Certificate Services OCSP Responder - cisco-lab-ise
        Issuer:CN=Certificate Services Root CA - cisco-lab-ise
        Serial#:0x10d18efb-92614084-895097f2-9885313b

ISE CA keys export completed successfully


Import CA Certs from Primary to Secondary PAN
• After an upgrade, immediately Export/Import CA certs.
• If want original PPAN to stay Primary after upgrade, promote Secondary after CA certs imported.
• Or… Promote Secondary before upgrade, upgrade ISE, and then export/import CA certs
• Provides CA redundancy if PPAN fails and Secondary promoted.

cisco-lab-ise/admin# application configure ise

Selection ISE configuration option
<SNIP>
[7]Export Internal CA Store
[8]Import Internal CA Store
</SNIP>
[12]Exit
8
Import Repository Name: NAS
Enter CA keys file name to import: ise_ca_key_pairs_of_cisco-lab-ise
Enter encryption-key: ########
Import on progress...............

The following 4 CA key pairs were imported:
        Subject:CN=Certificate Services Root CA - cisco-lab-ise
        Issuer:CN=Certificate Services Root CA - cisco-lab-ise
        Serial#:0x6012831a-16794f11-b1248b9b-c7e199ef

        Subject:CN=Certificate Services Endpoint Sub CA - cisco-lab-ise
        Issuer:CN=Certificate Services Root CA - cisco-lab-ise
        Serial#:0x3e4d9644-934843af-b5167e76-cc0256e0

        Subject:CN=Certificate Services Endpoint RA - cisco-lab-ise
        Issuer:CN=Certificate Services Endpoint Sub CA - cisco-lab-ise
        Serial#:0x13511480-9650401a-8461d9d7-5b8dbe17

        Subject:CN=Certificate Services OCSP Responder - cisco-lab-ise
        Issuer:CN=Certificate Services Root CA - cisco-lab-ise
        Serial#:0x10d18efb-92614084-895097f2-9885313b

Stopping ISE Certificate Authority Service...
Starting ISE Certificate Authority Service...
ISE CA keys import completed successfully
Certificate Recovery for ISE Nodes
Backup all System (Server) Certificates and Key Pairs

• System Certificates for all nodes can be centrally exported with private key pairs
from Primary PAN in case needed for Disaster Recovery.
OCSP Responder HA
• Each PSN runs OCSP responder.
• OCSP DB replicated so can point to any PSN, or LB PSN cluster for OCSP HA.

ASA Remote Access VPN Example:


match certificate OCSP_MAP override ocsp trustpoint ISE_Root 1 url http://ise-ocsp.company.com:2560/ocsp/
Load Balancing OCSP
Sample Flow
• Each PSN is an OCSP Responder
• Database replication ensures each PSN contains same info for ISE-issued certificates.
[Diagram: Access Device (ASA) → DNS Server (DNS Lookup = ocsp.company.com, DNS Response = 10.1.98.8) → Load Balancer (VIP: 10.1.98.8) → ISE-PSN-1 (10.1.99.5), ISE-PSN-2 (10.1.99.6), ISE-PSN-3 (10.1.99.7); response returned from ise-psn-3 @ 10.1.99.7.]
1. Authenticator resolves ocsp.company.com to VIP @ 10.1.98.8
2. OCSP request sent to http://ocsp.company.com:2560/ocsp @ 10.1.98.8
3. Load balancer forwards request to PSN-3 (OCSP Responder) @ 10.1.99.7
4. Authenticator receives OCSP response from PSN-3
SCEP Load Balancing for BYOD/NSP (ISE 1.2)
If Multiple SCEP CA Servers Defined…

• Multiple SCEP Profiles supported—Requests load balanced based on load factor.


• Load Factor = Average Response Time x Total Requests x Outstanding Requests
• Average Response Time = Average of last two 20 requests
• SCEP CA declared down if no response after three consecutive requests.
• CA with the next lowest load used; Periodic polling to failed server until online.
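The selection rule above can be sketched as follows. The load factor formula and the three-strike rule come from the slide; the class and function names are hypothetical, and the 20-sample response-time window is an assumption about what the slide's averaging window means.

```python
from collections import deque
from dataclasses import dataclass, field

WINDOW = 20     # assumed number of recent requests used for the average
DOWN_AFTER = 3  # consecutive unanswered requests before a CA is marked down


@dataclass
class ScepCa:
    name: str
    total_requests: int = 0
    outstanding: int = 0
    consecutive_failures: int = 0
    response_times: deque = field(default_factory=lambda: deque(maxlen=WINDOW))

    @property
    def is_down(self) -> bool:
        return self.consecutive_failures >= DOWN_AFTER

    def load_factor(self) -> float:
        """Average Response Time x Total Requests x Outstanding Requests."""
        if not self.response_times:
            return 0.0
        avg = sum(self.response_times) / len(self.response_times)
        return avg * self.total_requests * self.outstanding

    def record(self, response_time=None):
        """Record one request; response_time=None means no response."""
        self.total_requests += 1
        if response_time is None:
            self.consecutive_failures += 1
        else:
            self.consecutive_failures = 0
            self.response_times.append(response_time)


def pick_ca(cas):
    """Return the reachable CA with the lowest load factor."""
    candidates = [ca for ca in cas if not ca.is_down]
    if not candidates:
        raise RuntimeError("all SCEP CAs are marked down")
    return min(candidates, key=lambda ca: ca.load_factor())
```

The periodic polling that brings a failed server back online is left out of the sketch; a successful `record()` call is what clears the failure count here.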
SCEP Load Balancing (ISE 1.3+)
If Multiple SCEP CA Servers Defined…

• SCEP Profile defined in Certificate Template


—only one can be selected.
• ISE 1.3 supports multiple CA URLs
in each profile
• Requests load balanced across CAs
PSN Load Balancing
Load Balancing RADIUS, Web, and Profiling Services
• Policy Service nodes can be configured in a cluster behind a load balancer (LB).
• Access Devices send RADIUS AAA requests to LB virtual IP.

• N+1 node redundancy assumed to support total endpoints during:
–Unexpected server outage
–Scheduled maintenance
–Scaling buffer
• HA for LB assumed
[Diagram: Network Access Devices → Load Balancer Virtual IP → ISE PSNs (RADIUS Servers).]
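As a rough illustration of the N+1 assumption, the cluster size can be computed from the endpoint total and a per-PSN capacity. The function name and the capacity figures used below are hypothetical; real per-PSN limits depend on platform and ISE version.

```python
import math


def psn_count_n_plus_1(total_endpoints: int, endpoints_per_psn: int) -> int:
    """Return the number of PSNs needed behind the LB with one spare node.

    N is the smallest number of PSNs that can carry all endpoints; the
    extra node covers an unexpected outage, scheduled maintenance, or
    growth.
    """
    if total_endpoints < 0 or endpoints_per_psn <= 0:
        raise ValueError("endpoint counts must be positive")
    n = max(1, math.ceil(total_endpoints / endpoints_per_psn))
    return n + 1
```

For example, 60,000 endpoints at an assumed 20,000 endpoints per PSN gives N = 3, so four PSNs would sit behind the virtual IP.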
Configure Node Groups for LB Cluster
All PSNs in LB Cluster in Same Node Group
• Administration > System > Deployment
1) Create node group
2) Assign name (and multicast address if ISE 1.2)
3) Add individual PSNs to node group

• Node group members can be L2 or L3
• Multicast is no longer a requirement in ISE 1.3
High-Level Load Balancing Diagram
[Diagram: End User/Device → Network Access Device (NAS IP: 10.1.50.2) → Load Balancer (VIP: 10.1.98.8 on VLAN 98, 10.1.98.0/24; LB: 10.1.99.1 on VLAN 99, 10.1.99.0/24) → ISE-PSN-1 (10.1.99.5), ISE-PSN-2 (10.1.99.6), ISE-PSN-3 (10.1.99.7). Also shown: ISE-PAN-1, ISE-PAN-2, ISE-MNT-1, ISE-MNT-2, External Logger, DNS, NTP, SMTP, MDM, AD/LDAP.]
Traffic Flow—Fully Inline: Physical Separation
Physical Network Separation Using Separate LB Interfaces
Fully Inline Traffic Flow recommended (physical or logical)
• Load Balancer is directly inline between PSNs and rest of network.
• All traffic flows through Load Balancer including RADIUS, PAN/MnT, Profiling, Web Services, Management, Feed Services, MDM, AD, LDAP…
[Diagram: End User/Device → Network Access Device (NAS IP: 10.1.50.2) → Network Switch (10.1.98.1) → Load Balancer (10.1.98.2 on VLAN 98, External; 10.1.99.1 on VLAN 99, Internal) → ISE-PSN-1 (10.1.99.5), ISE-PSN-2 (10.1.99.6), ISE-PSN-3 (10.1.99.7). Also shown: ISE-PAN, ISE-MNT, External Logger, DNS, NTP, SMTP, MDM, AD, LDAP.]
Traffic Flow—Fully Inline: VLAN Separation
Logical Network Separation Using Single LB Interface and VLAN Trunking
• LB is directly inline between ISE PSNs and rest of network.
• All traffic flows through LB including RADIUS, PAN/MnT, Profiling, Web Services, Management, Feed Services, MDM, AD, LDAP…
[Diagram: End User/Device → Network Access Device (NAS IP: 10.1.50.2) → Network Switch (10.1.98.1) → Load Balancer (VIP: 10.1.98.8; 10.1.98.2 on VLAN 98, External; 10.1.99.1 on VLAN 99, Internal) → ISE-PSN-1 (10.1.99.5), ISE-PSN-2 (10.1.99.6), ISE-PSN-3 (10.1.99.7). Also shown: ISE-PAN, ISE-MNT, External Logger, DNS, NTP, SMTP, MDM, AD, LDAP.]
Partially Inline: Layer 2/Same VLAN (One PSN Interface)
Direct PSN Connections to LB and Rest of Network
• All inbound LB traffic such as RADIUS, Profiling, and directed Web Services sent to LB VIP.
• Other inbound non-LB traffic bypasses LB including redirected Web Services, PAN/MnT, Management, Feed Services, MDM, AD, LDAP…
• All outbound traffic from PSNs sent to LB as DFGW.
• LB must be configured to allow Asymmetric traffic
Generally NOT RECOMMENDED due to traffic flow complexity: must fully understand path of each flow to ensure proper handling by routing, LB, and end stations.
[Diagram: End User/Device → Network Access Device (NAS IP: 10.1.50.2) → L3 Switch (10.1.98.1) on VLAN 98, shared with the Load Balancer (10.1.98.2; VIP: 10.1.98.8) and ISE-PSN-1 (10.1.98.5), ISE-PSN-2 (10.1.98.6), ISE-PSN-3 (10.1.98.7). Also shown: ISE-PAN, ISE-MNT, External Logger, DNS, NTP, SMTP, MDM, AD, LDAP.]
Partially Inline: Layer 3/Different VLANs (One PSN Interface)
Direct PSN Connections to LB and Rest of Network
• All inbound LB traffic such as RADIUS, Profiling, and directed Web Services sent to LB VIP
• Other inbound non-LB traffic bypasses LB including redirected Web Services, PAN/MnT, Management, Feed Services, MDM, AD, LDAP…
• All outbound traffic from PSNs sent to LB as DFGW.
• LB must be configured to allow Asymmetric traffic
Generally NOT RECOMMENDED due to traffic flow complexity: must fully understand path of each flow to ensure proper handling by routing, LB, and end stations.
[Diagram: End User/Device → Network Access Device (NAS IP: 10.1.50.2) → L3 Switch (10.1.98.1 on VLAN 98, External; 10.1.99.1 on VLAN 99, Internal); Load Balancer (VIP: 10.1.98.8; 10.1.98.2 and 10.1.99.2); ISE-PSN-1 (10.1.99.5), ISE-PSN-2 (10.1.99.6), ISE-PSN-3 (10.1.99.7). Also shown: ISE-PAN, ISE-MNT, External Logger, DNS, NTP, SMTP, MDM, AD, LDAP.]
Partially Inline: Multiple PSN Interfaces
Separate PSN Connections to LB and Rest of Network
• All LB traffic sent to LB VIP including RADIUS, Profiling (except SPAN data), and directed Web Services
• All traffic initiated by PSNs sent to LB as global default gateway
• Redirected Web Services traffic bypasses LB
• For ISE 1.2, recommend SNAT redirected HTTPS traffic at L3 switch
• ISE 1.3+ supports symmetric traffic responses (set default gateway per interface)
[Diagram: End User/Device → Network Access Device (NAS IP: 10.1.50.2) → L3 Switch (10.1.98.1 on VLAN 98, External; 10.1.91.1 on VLAN 91, Web Portals); Load Balancer (VIP: 10.1.98.8; 10.1.98.2; 10.1.99.2 on VLAN 99, Internal); ISE-PSN-1 (10.1.99.5 / 10.1.91.5), ISE-PSN-2 (10.1.99.6 / 10.1.91.6), ISE-PSN-3 (10.1.99.7 / 10.1.91.7). Also shown: ISE-PAN, ISE-MNT, External Logger, DNS, NTP, SMTP, MDM, AD, LDAP.]
Fully Inline – Multiple PSN Interfaces
Network Separation Using Separate LB Interfaces
• All traffic sent to LB including RADIUS, Profiling (except SPAN data), and directed Web Services
• All traffic initiated by PSNs sent to LB as global default gateway
• LB sends Web Services traffic on separate PSN interface.
• For ISE 1.2 (and optionally 1.3), SNAT Web Services at LB
• ISE 1.3+ supports symmetric traffic responses (set default gateway per interface)
[Diagram: End User/Device → Network Access Device (NAS IP: 10.1.50.2) → L3 Switch (10.1.98.1) → Load Balancer (VIP: 10.1.98.8; 10.1.98.2 on VLAN 98, External; 10.1.99.1 on VLAN 99, Internal; 10.1.91.1 on VLAN 91, Web Portals) → ISE-PSN-1 (10.1.99.5 / 10.1.91.5), ISE-PSN-2 (10.1.99.6 / 10.1.91.6), ISE-PSN-3 (10.1.99.7 / 10.1.91.7). Also shown: ISE-PAN, ISE-MNT, External Logger, DNS, NTP, SMTP, MDM, AD, LDAP.]
PSN Load Balancing
Sample Topology and Flow
• DNS request sent to resolve psn-cluster FQDN: DNS Lookup = psn-cluster.company.com, DNS Response = 10.1.98.8
• Request for service at single host ‘psn-cluster’: request to psn-cluster.company.com is sent to Virtual IP Address (VIP) 10.1.98.8
• Response received from real server ise-psn-3 @ 10.1.99.7 (ise-psn-3.company.com)
[Diagram: User → Access Device → Load Balancer (VIP: 10.1.98.8, PSN-CLUSTER) on VLAN 98 (10.1.98.0/24) → ISE-PSN-1 (10.1.99.5), ISE-PSN-2 (10.1.99.6), ISE-PSN-3 (10.1.99.7) on VLAN 99 (10.1.99.0/24); DNS Server shown.]
Load Balancing Policy Services
• RADIUS AAA Services
Packets sent to LB virtual IP are load-balanced to real PSN based on configured algorithm. Sticky algorithm determines
method to ensure same Policy Service node services same endpoint.

• Web URL-Redirected Services: Posture (CPP) / Central WebAuth (CWA) / Native


Supplicant Provisioning (NSP) / Hotspot / Device Registration WebAuth (DRW), Partner MDM.
No LB Required! PSN that terminates RADIUS returns URL Redirect with its own certificate CN name substituted for ‘ip’
variable in URL.
Exception cases: Want to obfuscate node names/IPs, use different cert, LB inspection, DMZ interfaces.
Note: Since ISE requires HTTPS for web access, offload does not provide actual SSL perf increase.

• Web Direct HTTP/S Services: Local WebAuth (LWA) / Sponsor / MyDevices Portal, OCSP
Single web portal domain name should resolve to LB virtual IP for http/s load balancing.

• Profiling Services: DHCP Helper / SNMP Traps / Netflow / RADIUS


LB VIP is the target for one-way Profile Data (no response required). VIP can be same or different than one used by RADIUS
LB; Real server interface can be same or different than one used by RADIUS
Load Balancing
RADIUS
Load Balancing RADIUS
Sample Flow
[Diagram: User → Access Device (radius-server host 10.1.98.8) → Load Balancer (VIP: 10.1.98.8, PSN-CLUSTER) on VLAN 98 (10.1.98.0/24) → ISE-PSN-1 (10.1.99.5), ISE-PSN-2 (10.1.99.6), ISE-PSN-3 (10.1.99.7) on VLAN 99 (10.1.99.0/24). AUTH and RADIUS ACCTG requests go to 10.1.98.8; responses come from 10.1.99.7.]
1. NAD has single RADIUS Server defined (10.1.98.8)
2. RADIUS Auth requests sent to VIP 10.1.98.8
3. Requests for same endpoint load balanced to same PSN via sticky based
on RADIUS Calling-Station-ID and Framed-IP-Address
4. RADIUS Response received from real server ise-psn-3 @ 10.1.99.7
5. RADIUS Accounting sent to/from same PSN based on sticky
Load Balancer General RADIUS Guidelines
RADIUS Servers and Clients – Where Defined
• Access Device: Load Balancer VIP is RADIUS Server
radius-server host 10.1.98.8 auth-port 1812 acct-port 1813 test username radtest ignore-acct-port key cisco123
• RADIUS Clients are defined under ISE Admin Node > Network Devices
• F5 LTM Load Balancer: PSNs are RADIUS Servers for Health Probes
    Name                PSN-Probe
    Type                RADIUS
    Interval            15
    Timeout             46
    User Name           radprobe
    Password            cisco123
    Alias Service Port  1812
[Diagram: User → Access Device (NAS IP: 10.1.50.2) → F5 LTM Load Balancer (VIP: 10.1.98.8; 10.1.99.1) → ISE-PSN-1, ISE-PSN-2, ISE-PSN-3; ISE-PAN-1 and ISE-MNT-1 shown above.]
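The probe values on this slide (Interval 15, Timeout 46) are consistent with the common load balancer guideline of setting the timeout to three times the interval plus one second. Whether the slide's values were derived that way is an assumption; the helper below is a hypothetical sketch of that arithmetic only.

```python
def monitor_timeout(interval_s: int, missed_probes: int = 3) -> int:
    """Return a health monitor timeout from its probe interval.

    A server is marked down only after `missed_probes` consecutive
    probes go unanswered, plus one second of slack.
    """
    if interval_s <= 0 or missed_probes <= 0:
        raise ValueError("interval and probe count must be positive")
    return missed_probes * interval_s + 1
```

With a 15-second interval this yields 46 seconds, so a single dropped probe does not take a PSN out of the pool.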
Add LB as NAD for RADIUS Health Monitoring For Your
Reference
Administration > Network Resources > Network Devices

• Configure Self IP address of LB Internal


interface connected to PSN RADIUS
interfaces. 10.1.99.1

• Enable Authentication and set RADIUS


shared secret.
PSN

ISE-PSN-1

10.1.99.1
PSN

ISE-PSN-2
F5 LTM
Load Balancer
PSN

ISE-PSN-3
Load Balancer Persistence (Stickiness) Guidelines
Persistence Attributes
• Common RADIUS Sticky Attributes
o Client Address
  - Calling-Station-ID
  - Framed-IP-Address
o NAD Address
  - NAS-IP-Address
  - Source IP Address
o Session ID
  - RADIUS Session ID
  - Cisco Audit Session ID

• Best Practice Recommendations (depends on LB support and design)
1. Calling-Station-ID for persistence across NADs and sessions
2. Source IP or NAS-IP-Address for persistence for all endpoints connected to same NAD
3. Audit Session ID for persistence across re-authentications
[Diagram: User device (MAC Address=00:C0:FF:1A:2B:3C, IP Address=10.1.10.101, Username=jdoe@company.com) → Network Access Device (10.1.50.2; Session: 00aa…99ff) → Load Balancer (VIP: 10.1.98.8) → ISE-PSN-1, ISE-PSN-2, ISE-PSN-3.]
Load Balancer Stickiness Guidelines
Persistence Attributes
• ACE Example: RADIUS Sticky on IP and Calling-Station-ID (client MAC address)
sticky radius framed-ip calling-station-id RADIUS-STICKY
  serverfarm ise-psn

• F5 iRule Example: RADIUS Sticky on Calling-Station-ID (client MAC address)
ltm rule RADIUS_iRule {
  when CLIENT_ACCEPTED {
    persist uie [RADIUS::avp 31]
  }
}
Be sure to monitor load balancer resources when performing advanced parsing.
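To make the persistence idea concrete, the sketch below reads attribute 31 (Calling-Station-ID) from a raw RADIUS packet and pins each value to one real server. It is not load balancer code: the class and helper names are hypothetical, the CRC-based server choice is an arbitrary stand-in for the LB's configured algorithm, and the packet layout follows the standard RADIUS header (code, identifier, length, 16-byte authenticator) followed by type-length-value attributes.

```python
import struct
import zlib

CALLING_STATION_ID = 31  # RADIUS attribute type used in the iRule example


def get_avp(packet: bytes, wanted_type: int):
    """Return the value of the first matching attribute, or None."""
    if len(packet) < 20:
        return None
    (length,) = struct.unpack("!H", packet[2:4])
    end = min(length, len(packet))
    pos = 20  # code(1) + identifier(1) + length(2) + authenticator(16)
    while pos + 2 <= end:
        attr_type, attr_len = packet[pos], packet[pos + 1]
        if attr_len < 2 or pos + attr_len > end:
            return None  # malformed or truncated attribute
        if attr_type == wanted_type:
            return packet[pos + 2 : pos + attr_len]
        pos += attr_len
    return None


class StickyTable:
    """Pin each Calling-Station-ID to one real server."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.table = {}

    def pick(self, packet: bytes) -> str:
        key = get_avp(packet, CALLING_STATION_ID) or b"no-calling-station-id"
        key = key.upper()  # treat MAC addresses case-insensitively
        if key not in self.table:
            index = zlib.crc32(key) % len(self.servers)
            self.table[key] = self.servers[index]
        return self.table[key]


def build_packet(attrs):
    """Build a minimal Access-Request for testing (zeroed authenticator)."""
    body = b"".join(bytes([t, len(v) + 2]) + v for t, v in attrs)
    return struct.pack("!BBH", 1, 1, 20 + len(body)) + bytes(16) + body
```

Keying on the client MAC means authentication and accounting packets for the same endpoint reach the same PSN even when they arrive from different NADs.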
Ensure NAD Populates RADIUS Attributes For Your
Reference
Cisco WLC Example
• WLC sets Calling-Station-ID to MAC Address for RADIUS NAC-enabled WLANs
• General recommendation is to set Acct Call Station ID to System MAC Address
• Auth Call Station ID Type may not be present in earlier software versions
LB Fragmentation and Reassembly
Be aware of load balancers that do not reassemble RADIUS fragments!
Also watch for fragmented packets that are too small. LBs have min allowed frag size and will drop !!!

• Example: EAP-TLS with large certificates
• Need to address path fragmentation or persist on source IP
[Diagram: An IP RADIUS packet with a big certificate is split into IP Fragment #1 (Calling-Station-ID + Certificate Part 1) and IP Fragment #2 (Certificate Part 2). Frag1 is load balanced on Calling-ID to one PSN; Frag2 is load balanced on Source IP (no Calling ID in RADIUS packet) and is shown going to a second PSN.]
• ACE reassembles RADIUS packet.
• F5 LTM reassembles packets by default except for FastL4 Protocol
• Must be manually enabled under the FastL4 Protocol Profile
• Citrix NetScaler fragmentation defect: resolved in NetScaler 10.5 Build 50.10
• Issue ID 429415 addresses fragmentation and the reassembly of large/jumbo frames
NAT Restrictions for RADIUS Load Balancing
Why Source NAT (SNAT) Fails for NADs

• With SNAT, LB appears as the Network Access Device (NAD) to PSN.
• CoA sent to wrong IP address
• SNAT results in less visibility as all requests appear sourced from LB, which makes troubleshooting more difficult.
• NAS IP Address is correct, but not currently used for CoA
• User Story 8601: CoA support for NAT'ed load balanced environments
SNAT of NAD Traffic: Live Log Example
Auth Succeeds/CoA Fails: CoA Sent to Load Balancer and Dropped
Allow NAT for PSN CoA Requests
Simplifying Switch CoA Configuration

• Match traffic from PSNs to UDP/1700 (RADIUS CoA) and translate to PSN cluster VIP.

[Diagram: CoA leaves ISE-PSN-1 (10.1.99.5) with SRC=10.1.99.5; the Load Balancer translates it so it reaches the Access Switch with SRC=10.1.98.8. PSNs shown: ISE-PSN-1 10.1.99.5, ISE-PSN-2 10.1.99.6, ISE-PSN-3 10.1.99.7, ISE-PSN-X 10.1.99.x.]

• Access switch config:
• Before:
aaa server radius dynamic-author
 client 10.1.99.5 server-key cisco123
 client 10.1.99.6 server-key cisco123
 client 10.1.99.7 server-key cisco123
 client 10.1.99.8 server-key cisco123
 client 10.1.99.9 server-key cisco123
 client 10.1.99.10 server-key cisco123
 <…one entry per PSN…>
• After:
aaa server radius dynamic-author
 client 10.1.98.8 server-key cisco123
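The before/after difference can be reproduced with a short Python sketch that renders the `aaa server radius dynamic-author` stanza from a list of CoA client addresses. The helper name `dynamic_author_config` is hypothetical, and the addresses and key are simply the sample values from the slide.

```python
def dynamic_author_config(clients, server_key):
    """Render the switch CoA client list for the given source addresses."""
    lines = ["aaa server radius dynamic-author"]
    lines += [f" client {ip} server-key {server_key}" for ip in clients]
    return "\n".join(lines)


# Before: one CoA client entry per PSN real IP (10.1.99.5 through 10.1.99.10)
psn_ips = [f"10.1.99.{host}" for host in range(5, 11)]
before = dynamic_author_config(psn_ips, "cisco123")

# After: PSN-initiated CoA is source NATted to the VIP, so one entry suffices
after = dynamic_author_config(["10.1.98.8"], "cisco123")
```

The saving grows with the number of PSNs behind the load balancer, and adding a PSN no longer requires touching every access switch.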
Allow NAT for PSN CoA Requests
Simplifying WLC CoA Configuration

• Before: One RADIUS Server entry required per PSN that may send CoA from behind load balancer
• After: One RADIUS Server entry required per load balancer VIP.
NAT Guidelines for ISE RADIUS Load Balancing
To NAT or Not To NAT? That is the Question!

[Diagram: User and Access Device (NAS IP: 10.1.50.2) reach the Load Balancer (VIP: 10.1.98.8 on VLAN 98, 10.1.98.0/24; LB: 10.1.99.1 on VLAN 99, 10.1.99.0/24), which fronts ISE-PSN-1 (10.1.99.5), ISE-PSN-2 (10.1.99.6), and ISE-PSN-3 (10.1.99.7). ISE-PAN-1 and ISE-MNT-1 are also shown.]

RADIUS AUTH, NAD to LB: NAS-IP = 10.1.50.2, SRC-IP = 10.1.50.2, DST-IP = 10.1.98.8
RADIUS AUTH, LB to PSN (No NAT): NAS-IP = 10.1.50.2, SRC-IP = 10.1.50.2, DST-IP = 10.1.99.7
SNAT for NAD is BAD! If the NAD is source NATted, Source = 10.1.99.1. Remove Source NAT.

RADIUS COA, PSN to LB: SRC-IP = 10.1.99.7, DST-IP = 10.1.50.2
RADIUS COA, LB to NAD: SRC-IP = 10.1.98.8, DST-IP = 10.1.50.2
SNAT for CoA is Okay!
Load Balancing
ISE Web Services
Load Balancing with URL-Redirection
URL Redirect Web Services: Hotspot/DRW, CWA, BYOD, Posture, MDM

[Diagram: User and Access Device send the RADIUS request to psn-cluster.company.com (Load Balancer VIP: 10.1.98.8, PSN-CLUSTER) in front of ISE-PSN-1 (10.1.99.5), ISE-PSN-2 (10.1.99.6), and ISE-PSN-3 (10.1.99.7). The RADIUS response comes from ise-psn-3.company.com; DNS Lookup = ise-psn-3.company.com, DNS Response = 10.1.99.7; the browser requests https://ise-psn-3.company.com:8443/... and the HTTPS response comes from ise-psn-3.company.com.]

1. RADIUS Authentication requests sent to VIP 10.1.98.8
2. Requests for same endpoint load balanced to same PSN via RADIUS sticky.
3. RADIUS Authorization received from ise-psn-3 @ 10.1.99.7 with URL Redirect to https://ise-psn-3.company.com:8443/...
4. Client browser redirected and resolves FQDN in URL to real server address.
5. User sends web request directly to same PSN that serviced RADIUS request.

ISE Certificate Subject CN = ise-psn-3.company.com
Load Balancing Non-Redirected Web Services
Direct Web Services: Sponsor, My Devices, LWA, OCSP

[Diagram: Sponsor browses to https://sponsor.company.com. DNS Lookup = sponsor.company.com, DNS Response = 10.1.98.8 (Load Balancer VIP, PSN-CLUSTER) in front of ISE-PSN-1 (10.1.99.5), ISE-PSN-2 (10.1.99.6), and ISE-PSN-3 (10.1.99.7). The HTTPS response comes from ise-psn-3 @ 10.1.99.7.]

1. Browser resolves sponsor.company.com to VIP @ 10.1.98.8
2. Web request sent to https://sponsor.company.com @ 10.1.98.8
3. ACE load balances request to PSN based on IP or HTTP sticky
4. HTTPS response received from ise-psn-3 @ 10.1.99.7
ISE Certificate without SAN
Certificate Warning - Name Mismatch

[Diagram: Sponsor browses to http://sponsor.company.com. DNS Lookup = sponsor.company.com, DNS Response = 10.1.98.8 (Load Balancer in front of ISE-PSN-1 10.1.99.5, ISE-PSN-2 10.1.99.6, ISE-PSN-3 10.1.99.7). The browser ends up at https://sponsor.company.com:8443/sponsorportal.]

ISE Certificate
Subject = ise-psn-3.company.com

Name Mismatch!
Requested URL = sponsor.company.com
Certificate Subject = ise-psn-3.company.com
ISE Certificate with SAN
No Certificate Warning

[Diagram: Sponsor browses to http://sponsor.company.com. DNS Lookup = sponsor.company.com, DNS Response = 10.1.98.8 (Load Balancer in front of ISE-PSN-1 10.1.99.5, ISE-PSN-2 10.1.99.6, ISE-PSN-3 10.1.99.7). The browser ends up at https://sponsor.company.com:8443/sponsorportal.]

ISE Certificate
Subject = ise-psn.company.com
SAN =
ise-psn-1.company.com
ise-psn-2.company.com
ise-psn-3.company.com
sponsor.company.com

Certificate OK!
Requested URL = sponsor.company.com
Certificate SAN = sponsor.company.com
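The two certificate slides above come down to a name comparison, sketched below in Python. This is a simplified model, not a full implementation of certificate name validation: it assumes a client that checks only SAN DNS names when a SAN is present and falls back to the Subject CN otherwise, and that a wildcard covers exactly one left-most label. The function names `hostname_matches` and `cert_accepts` are hypothetical.

```python
def hostname_matches(pattern: str, hostname: str) -> bool:
    """Compare one certificate name against the requested hostname."""
    pattern, hostname = pattern.lower(), hostname.lower()
    if pattern.startswith("*."):
        # Wildcard covers exactly one left-most label (simplified rule)
        head, _, rest = hostname.partition(".")
        return bool(head) and rest == pattern[2:]
    return pattern == hostname


def cert_accepts(requested: str, subject_cn: str, san_dns_names) -> bool:
    """True if the requested name is covered by the certificate."""
    # Assumption: when a SAN exists only the SAN is checked, which is
    # why the CN should also be listed in the SAN.
    names = list(san_dns_names) if san_dns_names else [subject_cn]
    return any(hostname_matches(name, requested) for name in names)


san = [
    "ise-psn-1.company.com",
    "ise-psn-2.company.com",
    "ise-psn-3.company.com",
    "sponsor.company.com",
]
```

Under these assumptions, `cert_accepts("sponsor.company.com", "ise-psn-3.company.com", [])` reproduces the name mismatch, and passing the `san` list reproduces the "Certificate OK" case.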
Load Balancing Preparation
Configure DNS and Certificates
• Configure DNS entry for PSN cluster(s) and assign VIP IP address.
Example: psn-cluster.company.com
DNS SERVER: DOMAIN = COMPANY.COM
PSN-CLUSTER IN A 10.1.98.8
SPONSOR IN A 10.1.98.8
MYDEVICES IN A 10.1.98.8
ISE-PSN-1 IN A 10.1.99.5
ISE-PSN-2 IN A 10.1.99.6
ISE-PSN-3 IN A 10.1.99.7

• Configure ISE PSN server certs with Subject Alternative Name configured for other FQDNs to be used by LB VIP or optionally use wildcards (available in ISE 1.2).
Example certificate with multiple FQDN values in SAN:
ise-psn-1.company.com
psn-cluster.company.com
sponsor.company.com
guest.company.com
ISE Certificate with SAN – “Universal Certs”

[Screenshot: certificate request form for ise-psn.]

• CN must also exist in SAN (example: ise-psn.company.com)
• Universal Cert options:
  • UCC / Multi-SAN
  • Wildcard SAN
• Other FQDNs or wildcard as “DNS Names” (examples: mydevices.company.com, sponsor.company.com)
• IP Address is also option
General Best Practices for Universal Certificates
ISE 1.2 Example
• Use a common FQDN for Subject CN. Examples: ise.company.com, aaa.company.com
• If Subject CN contains FQDN, add same FQDN to SAN
• Multi-Domain/UCC* Certificate: Update SAN with all FQDNs serviced by PSN
  OR
  Wildcard Certificate: Update SAN with wildcard domain using syntax *.company.local
• If required for static IP hosting, add IP addresses as both DNS and IP entries (increases device compatibility)
*UCC = Unified Communications Certificate
ISE Certificates For Your
Reference
General Best Practices

• Make sure all certificate CN names can be resolved by DNS


• Use lower case for appliance hostname, DNS name, certificate CN
• ISE cert CSR: Use format “CN=<FQDN>” for subject name
• Ensure time is synced – Use NTP with TZ-UTC for all nodes
• Signed by Trusted CA – required for each node
• For external users/guests, certs should be signed by 3rd-party CA
• Install entire certificate chains as individual certs into ISE trust store
• For Web admin, node communications, web portals, PEAP negotiation, select HTTPS
option for server certificate—currently limited to one cert
• For EAP-TLS, enable “Trust for client authentication” for trusted certs
• Use PEM, not DER encoding for import/export operations.
Load Balancer NAT Guidelines for Web Traffic
URL-Redirected Traffic with Single PSN Interface

• No NAT Required
• Allow web portal traffic direct to PSN without NAT

[Diagram: User on 10.1.10.0/24; Load Balancer between 10.1.98.0/24 and 10.1.99.0/24; ISE-PSN-1 (.5), ISE-PSN-2 (.6), ISE-PSN-3 (.7), and ISE-PSN-X (.x) on 10.1.99.0/24. Legend: RADIUS, Guest Portals.]

RADIUS session load-balanced to PSN @ 10.1.99.6
URL Redirect automatically includes FQDN/Interface IP of same PSN @ 10.1.99.6:
https://ise-psn-2.company.com:8443/guestportal/Login...
Browser traffic redirected to IP for ise-psn-2.company.com:
https://10.1.99.6:8443/guestportal/Login...
SNAT on L3 Switch for Dedicated Web Interfaces (ISE 1.2)
URL-Redirected Traffic with Dedicated PSN Interface for Web Portals (Single LB interface)

• Source NAT portal traffic to simplify routing
• Maintains Path Isolation

[Diagram: User on 10.1.10.0/24 behind an L3 Switch; Load Balancer between 10.1.98.0/24 and 10.1.99.0/24; ISE-PSN-1 (.5), ISE-PSN-2 (.6), ISE-PSN-3 (.7), and ISE-PSN-X (.x) with RADIUS interfaces on 10.1.99.0/24 and Guest Portal interfaces on 10.1.91.0/24.]

RADIUS session load-balanced to PSN @ 10.1.99.6
URL Redirect automatically includes FQDN/Interface IP of Web Portal interface for same PSN @ 10.1.91.6: https://ise-psn-2-guest.company.com:8443/guestportal/Login...
Source NAT web traffic from user networks destined to PSN web interfaces @ 10.1.91.x; translate to 10.1.91.x (or any address block that can be statically added to PSN route table).
Ensures all Web requests received by PSN web interface are returned out same interface.
SNAT on LB for Dedicated Web Interfaces (ISE 1.2)
Direct Access and URL-Redirected Traffic with Dedicated PSN Web Interfaces

RADIUS session load-balanced to PSN @ 10.1.99.6

[Diagram: User A (10.1.10.0/24), User B (10.1.11.0/24), and User C (10.1.12.0/24) behind an L3 Switch; Load Balancer between 10.1.98.0/24 and 10.1.99.0/24; ISE-PSN-1 (.5), ISE-PSN-2 (.6), ISE-PSN-3 (.7), and ISE-PSN-X (.x) on 10.1.99.0/24 with dedicated web interfaces on 10.1.91.0/24.]

Direct-Access Portals:
Enable SNAT on Virtual Servers for ISE Sponsor, My Devices, and LWA portals.

URL-Redirected Web Portals/Services:
Enable SNAT on Load Balancer for IP Forwarding traffic to Virtual Servers.
Multi-Interface Routing (ISE 1.2)
For Your Reference
All User Web Traffic Sent Out GE1; Everything Else GE0

[Diagram: PSN with GE0 on 10.1.99.x (gateway .1) and GE1 on 10.1.41.x (gateway .1). Other endpoints shown: Admin Nodes, Monitor Nodes, Other Policy Nodes, Partner MDM, DNS, NTP, SMTP, SNMP, AD/LDAP, Wired NAD, Wireless NAD, VPN NAD, User Access Subnets A and C, User Quarantine Subnets B and D.]

PSN Routing Table
ip default-gateway 10.1.99.1
ip route <user_subnet_A> 10.1.41.1
ip route <user_subnet_B> 10.1.41.1
ip route <user_subnet_C> 10.1.41.1
ip route <user_subnet_D> 10.1.41.1
ip route <user_subnet_E> 10.1.41.1
ip route <user_subnet_F> 10.1.41.1
ip route <user_subnet_G> 10.1.41.1
ip route <user_subnet_H> 10.1.41.1
ip route <user_subnet_I> 10.1.41.1
ip route <user_subnet_J> 10.1.41.1
ip route <user_subnet_K> 10.1.41.1
ip route <user_subnet_L> 10.1.41.1
ip route <user_subnet_M> 10.1.41.1
ip route <user_subnet_N> 10.1.41.1
...
(One entry for each client subnet; no auto-summary)

All NMAP Scans and Endpoint SNMP queries also sent out GE1.
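Because this design needs one static route per client subnet, the route list is a natural thing to generate. Below is a minimal Python sketch under stated assumptions: the helper name `psn_static_routes` is hypothetical, the subnets are sample user networks, and the output uses the `ip route <network> <mask> gateway <ip>` form that appears in the ISE 1.3 CLI example later in this deck rather than the abbreviated form in the routing table above.

```python
import ipaddress


def psn_static_routes(user_subnets, web_gateway):
    """Return one ISE CLI static route per user subnet via the web interface gateway."""
    routes = []
    for subnet in user_subnets:
        net = ipaddress.ip_network(subnet)
        routes.append(
            f"ip route {net.network_address} {net.netmask} gateway {web_gateway}"
        )
    return routes


# Sample client subnets (illustrative); 10.1.41.1 is the GE1 gateway on the slide
routes = psn_static_routes(
    ["10.1.10.0/24", "10.1.11.0/24", "10.1.12.0/24"], "10.1.41.1"
)
```

Any subnet missing from the list falls back to the default gateway on GE0, so the generated list should be checked against the full set of client networks.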
Multi-Interface Routing (ISE 1.2)
For Your Reference
Default GE1 (inc. All User Web Traffic); Mgmt Targets=GE0

[Diagram: PSN with GE0 on 10.1.99.x (gateway .1) and GE1 on 10.1.41.x (gateway .1). Other endpoints shown: Admin Nodes, Monitor Nodes, Other Policy Nodes, Partner MDM, DNS, NTP, SMTP, SNMP, AD/LDAP, Wired NAD, Wireless NAD, VPN NAD, User Access Subnets A and C, User Quarantine Subnets B and D.]

PSN Routing Table
ip default-gateway 10.1.41.1
ip route <admin_nodes> 10.1.99.1
ip route <mnt_nodes> 10.1.99.1
ip route <other_psn_nodes> 10.1.99.1
ip route <partner_mdm> 10.1.99.1
ip route <dns_servers> 10.1.99.1
ip route <ntp_servers> 10.1.99.1
ip route <smtp_server> 10.1.99.1
ip route <snmp_server> 10.1.99.1
ip route <ad_ldap_servers> 10.1.99.1
ip route <nad_radius_subnet_a> 10.1.99.1
ip route <nad_radius_subnet_b> 10.1.99.1
ip route <nad_radius_subnet_c> 10.1.99.1
ip route <nad_radius_subnet_d> 10.1.99.1
ip route <nad_radius_subnet_n> 10.1.99.1
...
(One entry for each unique access device subnet for RADIUS; no auto-summary)

All NMAP Scans and Endpoint SNMP queries also sent out GE1.
Multi-Interface Routing (ISE 1.2)
For Your Reference
Use Standard DGW; NAT HTTPS Traffic to Web Interface

[Diagram: PSN with GE0 on 10.1.99.x (gateway .1) and GE1 on 10.1.41.x (gateway .1). Other endpoints shown: Admin Nodes, Monitor Nodes, Other Policy Nodes, Partner MDM, DNS, NTP, SMTP, SNMP, AD/LDAP, Wired NAD, Wireless NAD, VPN NAD, User Access Subnets A and C, User Quarantine Subnets B and D.]

PSN Routing Table
ip default-gateway 10.1.41.1

NAT traffic destined to PSN nodes on network 10.1.41.x
Impacts only URL-redirected traffic or client-initiated HTTPS (Sponsor, MDP, LWA)

PSN has local connection to 10.1.41.x, so no need for static route.

NMAP traffic will still be sent out GE0.
Dedicated Web Interfaces under ISE 1.3
Direct Access and URL-Redirected Traffic with Dedicated PSN Web Interfaces

RADIUS session load-balanced to PSN @ 10.1.99.6

[Diagram: User A (10.1.10.0/24), User B (10.1.11.0/24), and User C (10.1.12.0/24) behind an L3 Switch; Load Balancer between 10.1.98.0/24 and 10.1.99.0/24; ISE-PSN-1 (.5), ISE-PSN-2 (.6), ISE-PSN-3 (.7), and ISE-PSN-X (.x) on 10.1.99.0/24 with dedicated web interfaces on 10.1.91.0/24.]

Response to traffic received on an interface sent out same interface if default route exists for interface: No SNAT required!

Default route 0.0.0.0/0 10.1.99.1 eth0
Default route 0.0.0.0/0 10.1.91.1 eth1
Dedicated Web Interfaces under ISE 1.3
Symmetric Traffic Flows
• Configure default routes for each interface to support symmetric return traffic
ise13-psn-x/admin# config t
Enter configuration commands, one per line. End with CNTL/Z.
ise13-psn-x/admin(config)# ip route 0.0.0.0 0.0.0.0 gateway 10.1.91.1

• Validate new default route
ise13-psn-x/admin# sh ip route

Destination    Gateway     Iface
-----------    -------     -----
10.1.91.0/24   0.0.0.0     eth1
10.1.99.0/24   0.0.0.0     eth0
default        10.1.91.1   eth1
default        10.1.99.1   eth0

The ip default-gateway setting determines default route for traffic initiated by ISE node.
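The symmetric-return behavior described for ISE 1.3 can be modeled with a small Python sketch. It is an illustration of the rule as stated on the slides, not of ISE internals: replies leave through the interface the request arrived on when that interface has its own default route, while node-initiated traffic follows the `ip default-gateway` setting. The constant and function names are hypothetical, and the value 10.1.99.1 for `ip default-gateway` is assumed from the eth0 default route shown.

```python
import ipaddress

# Addresses taken from the slide; the data layout is an illustrative assumption
CONNECTED = {
    "eth0": ipaddress.ip_network("10.1.99.0/24"),
    "eth1": ipaddress.ip_network("10.1.91.0/24"),
}
INTERFACE_DEFAULT_ROUTE = {"eth0": "10.1.99.1", "eth1": "10.1.91.1"}
IP_DEFAULT_GATEWAY = "10.1.99.1"  # assumed ip default-gateway value


def connected_interface(ip: str):
    """Return the interface whose connected subnet contains ip, if any."""
    addr = ipaddress.ip_address(ip)
    for iface, net in CONNECTED.items():
        if addr in net:
            return iface
    return None


def initiated_path(dest_ip: str):
    """(interface, next hop) for traffic initiated by the ISE node."""
    local = connected_interface(dest_ip)
    if local is not None:
        return local, dest_ip
    return connected_interface(IP_DEFAULT_GATEWAY), IP_DEFAULT_GATEWAY


def reply_path(received_on: str, client_ip: str):
    """(interface, next hop) for a reply to traffic received on received_on."""
    if connected_interface(client_ip) == received_on:
        return received_on, client_ip
    gateway = INTERFACE_DEFAULT_ROUTE.get(received_on)
    if gateway is not None:
        return received_on, gateway  # symmetric return, no SNAT needed
    return initiated_path(client_ip)
```

A guest on 10.1.10.0/24 who reaches the portal on eth1 is answered via 10.1.91.1 on eth1, whereas a connection the PSN opens to that same host would leave via eth0.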
SSL Certificates for Internal Server Names
After November 1, 2015 Certificates for Internal Names Will No Longer Be
Trusted
In November 2011, the CA/Browser Forum (CA/B) adopted Baseline Requirements for the
Issuance and Management of Publicly-Trusted Certificates that took effect on July 1, 2012.
These requirements state:
CAs should notify applicants prior to issuance that use of certificates with a Subject
Alternative Name (SAN) extension or a Subject Common Name field containing a reserved
IP address or internal server name has been deprecated by the CA/B
CAs should not issue a certificate with an expiration date later than November 1, 2015 with a
SAN or Subject Common Name field containing a reserved IP address or internal server
name
Source: Digicert – https://www.digicert.com/internal-names.htm
Use Publicly-Signed Certs for Guest Portals!

• In 1.3, HTTPS cert for Admin can be different from web portals
• Guest portals can use a different, public certificate
• Admin and internal employee portals (or EAP) can still use certs signed by private CA.

Redirection based on first service-enabled interface; if eth0, return host FQDN; else return interface IP.

Public Portal Certificate Group: Certs assigned to this group signed by 3rd-party CA
CWA Example
DNS and Port Settings–Single Interface Enabled for Guest Portal

• CWA Guest Portal access for ISE-PSN1 configured for eth1

• IP Address for eth1 on ISE-PSN1 is 10.1.91.5


ISE Node   IP Address   Interface
ISE-PSN1   10.1.99.5    # eth0
ISE-PSN1   10.1.91.5    # eth1
ISE-PSN1   10.1.92.5    # eth2
ISE-PSN1   10.1.93.5    # eth3

• Resulting URL Redirect = https://10.1.91.5:8443/... ???
I have a feeling this is going to end badly!
CWA Example with FQDNs in SAN
URL Redirection Uses First Guest-Enabled Interface (eth1)
1. RADIUS Authentication requests sent to ise-psn1 @ 10.1.99.5.
2. RADIUS Authorization received from ise-psn1 @ 10.1.99.5 with URL Redirect to https://10.1.91.5:8443/...
3. User sends web request directly to ise-psn1 @ 10.1.91.5.
4. User receives cert name mismatch warning.

[Diagram: User, Access Device, and Switch in front of ISE-PSN1 with interfaces Admin/RADIUS eth0: 10.1.99.5, Guest eth1: 10.1.91.5, MyDevices eth2: 10.1.92.5, Sponsor eth3: 10.1.93.5. Flows shown: RADIUS request to ise-psn1 @ 10.1.99.5; RADIUS authorization with URL redirect = https://10.1.91.5:8443/...; browser request to https://10.1.91.5:8443/...; HTTPS response from 10.1.91.5.]

ISE Certificate
Subject = ise-psn1.company.com
SAN = ise-psn1.company.com, sponsor.company.com, mydevices.company.com

Name Mismatch!
Requested URL = 10.1.91.5
Certificate SAN = ise-psn1.company.com, sponsor.company.com, mydevices.company.com
Interface Aliases Available in ISE 1.2
Specify alternate hostname/FQDN for URL redirection

• Aliases assigned to interfaces using ip host global config command in ADE-OS:


(config)# ip host <interface_ip_address> <hostname|FQDN> <hostname|FQDN>

• Up to two values can be specified: hostname and/or FQDN. If a hostname is specified, the globally configured ip domain-name is appended for use in URL redirection. The FQDN can have a different domain than the global domain.
• GigabitEthernet1 (GE1) Example:
ise-psn1/admin(config)# ip host 10.1.91.5 ise-psn1-guest ise-psn1-guest.company.com

• Host entry for Gigabit Ethernet 0 (eth0) cannot be modified


• Use show run to view entries; Use no ip host <ip_address> to remove entry.
• Change in interface IP address or alias requires application server restart.
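Combining the redirection rule stated earlier (first service-enabled interface; host FQDN for eth0, interface IP otherwise) with the alias behavior above gives the following Python sketch. It is a hedged model of the described behavior rather than ISE code; `redirect_host`, `redirect_url`, the alias dictionary, and the `company.com` default domain are all illustrative assumptions.

```python
def redirect_host(interface, interface_ip, node_fqdn,
                  ip_host_aliases=None, domain_name="company.com"):
    """Return the host portion used in the URL redirect for a portal interface."""
    aliases = ip_host_aliases or {}
    alias = aliases.get(interface_ip)
    if alias is not None:
        # A bare hostname gets the globally configured domain appended;
        # an FQDN alias is used as-is and may be in a different domain.
        return alias if "." in alias else f"{alias}.{domain_name}"
    if interface == "eth0":
        return node_fqdn
    return interface_ip


def redirect_url(host, port=8443):
    """Build the redirect URL prefix; the path is left elided as on the slides."""
    return f"https://{host}:{port}/..."


no_alias = redirect_url(redirect_host("eth1", "10.1.91.5", "ise-psn1.company.com"))
with_alias = redirect_url(
    redirect_host("eth1", "10.1.91.5", "ise-psn1.company.com",
                  {"10.1.91.5": "ise-psn1-guest"})
)
```

Without an alias the redirect carries the raw interface IP, which cannot match a certificate that lists only FQDNs; with the alias the redirect carries a name that can be placed in the SAN.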
Interface Alias Example
DNS and Port Settings – Single Interface Enabled for Guest

• Interface eth1 enabled for Guest Portal
• (config)# ip host 10.1.91.5 ise-psn1-guest.company.com
• URL redirect = https://ise-psn1-guest.company.com:8443/... (FQDN with Publicly-Signed Cert)
• Guest DNS resolves FQDN to correct IP address

DNS SERVER: DOMAIN = COMPANY.COM
ISE-PSN1-GUEST IN A 10.1.91.5 # eth1
ISE-PSN2-GUEST IN A 10.1.91.6 # eth1
ISE-PSN3-GUEST IN A 10.1.91.7 # eth1

DNS SERVER: DOMAIN = COMPANY.LOCAL
ISE-PSN1 IN A 10.1.99.5 # eth0
ISE-PSN1-MDP IN A 10.1.92.5 # eth2
ISE-PSN1-SPONSOR IN A 10.1.93.5 # eth3
ISE-PSN2 IN A 10.1.99.6 # eth0
ISE-PSN2-MDP IN A 10.1.92.6 # eth2
ISE-PSN2-SPONSOR IN A 10.1.93.6 # eth3
ISE-PSN3 IN A 10.1.99.7 # eth0
ISE-PSN3-MDP IN A 10.1.92.7 # eth2
ISE-PSN3-SPONSOR IN A 10.1.93.7 # eth3
CWA Example using Interface Alias
URL Redirection Uses First Guest-Enabled Interface (eth1)
1. RADIUS Authentication requests sent to ise-psn1 @ 10.1.99.5.
2. RADIUS Authorization received from ise-psn1 @ 10.1.99.5 with URL Redirect to https://ise-psn1-guest.company.com:8443/...
3. DNS resolves alias FQDN ise-psn1-guest to 10.1.91.5 and sends web request to ise-psn1-guest @ 10.1.91.5.
4. No cert warning received since SAN contains interface alias FQDN.

[Diagram: User, Access Device, and Switch in front of ISE-PSN1 with interfaces Admin/RADIUS eth0: 10.1.99.5 and All Web Portals on eth1: 10.1.91.5, eth2: 10.1.92.5, eth3: 10.1.93.5. Flows shown: RADIUS request to ise-psn1 @ 10.1.99.5; RADIUS authorization with URL redirect = https://ise-psn1-guest.company.com:8443/...; browser request to https://ise-psn1-guest.company.com:8443/...; HTTPS response from 10.1.91.5.]

ISE Certificate
Subject = ise-psn1.company.com
SAN = ise-psn1-guest.company.com

Certificate OK!
Requested URL = ise-psn1-guest.company.com
Certificate SAN = ise-psn1-guest.company.com

Could also use wildcard SAN or UCC cert
Load Balancing
ISE Profiling Services
Load Balancing Profiling Services
Sample Flow

[Diagram: User and Access Device; DHCP Server at 10.1.1.10; Load Balancer VIP: 10.1.98.8 (PSN-CLUSTER) in front of ISE-PSN-1 (10.1.99.5), ISE-PSN-2 (10.1.99.6), and ISE-PSN-3 (10.1.99.7). Flows shown: DHCP Request to Helper IP 10.1.1.10; DHCP Request to Helper IP 10.1.98.8; DHCP Response returned from DHCP Server.]

1. Client OS sends DHCP Request
2. Next hop router with IP Helper configured forwards DHCP request to real DHCP server and to secondary entry = LB VIP
3. Real DHCP server responds and provides client a valid IP address
4. DHCP request to VIP is load balanced to PSN @ 10.1.99.7 based on source IP sticky (L3 gateway) or DHCP field parsed from request.
Load Balancing Simplifies Device Configuration
L3 Switch Example for DHCP Relay

• Before
!
interface Vlan10
 description EMPLOYEE
 ip address 10.1.10.1 255.255.255.0
 ip helper-address 10.1.100.100 <--- Real DHCP Server
 ip helper-address 10.1.99.5 <--- ISE-PSN-1
 ip helper-address 10.1.99.6 <--- ISE-PSN-2
!
• After
!
interface Vlan10
 description EMPLOYEE
 ip address 10.1.10.1 255.255.255.0
 ip helper-address 10.1.100.100 <--- Real DHCP Server
 ip helper-address 10.1.98.8 <--- LB VIP
!
Settings apply to each L3 interface servicing DHCP endpoints.
Load Balancing Simplifies Device Configuration
Switch Example for SNMP Traps For Your
Reference

• Before !
snmp-server trap-source GigabitEthernet1/0/24
snmp-server enable traps snmp linkdown linkup
snmp-server enable traps mac-notification change move
snmp-server host 10.1.99.5 version 2c public mac-notification snmp
snmp-server host 10.1.99.6 version 2c public mac-notification snmp
snmp-server host 10.1.99.7 version 2c public mac-notification snmp
!

• After !
snmp-server trap-source GigabitEthernet1/0/24
snmp-server enable traps snmp linkdown linkup
snmp-server enable traps mac-notification change move
snmp-server host 10.1.98.8 version 2c public mac-notification snmp
!
Profiling Services using Load Balancers For Your
Reference
Which PSN Services Process Profile Data?

• Profiling Probes
The following profile data can be load balanced to PSN VIP but may not be processed by same PSN that
terminated RADIUS:
• DHCP IP Helper to DHCP probe
• NetFlow export to NetFlow Probe
• SNMP Traps
Option to leverage Anycast to reduce log targets and facilitate HA.

• SNMP Query Probe (triggered)


PSNs configured to send SNMP Queries will send query to NAD that sent RADIUS or SNMP Trap which
triggered query. Therefore, SNMP Query data processed by same PSN that terminated RADIUS request for
endpoint.

• SNMP Query Probe (polled)


Not impacted by load balancing, although possible that PSN performing polled query is not same PSN that
terminates RADIUS for newly discovered endpoints. PSN will sync new endpoint data with Admin. Since poll
typically conducted at longer intervals, this should not impact more real-time profiling of endpoints.
Profiling Services using Load Balancers (Cont.) For Your
Reference
Which PSN Services Process Profile Data?

• DNS Probe
Submitted by same PSN which obtains IP data for endpoint. Typically the same PSN that processes RADIUS,
DHCP, or SNMP Query Probe data.

• NMAP Probe
Submitted by same PSN which obtains data which matches profile rule condition.

• HTTP (via URL redirect)


URL redirect will point to PSN that terminates RADIUS auth so HTTP data will be parsed by same PSN.

• DHCP SPAN or HTTP SPAN


Since mirror port is associated to a specific interface on real PSN, cannot provide HA for SPAN data unless
configure multiple SPAN destinations to separate PSNs. No guarantee that same PSN that collects SPAN data
terminates RADIUS session.
Load Balancing Sticky Guidelines
Ensure DHCP and RADIUS for a Given Endpoint Use Same PSN

[Diagram: User (MAC: 11:22:33:44:55:66) and NAD; F5 LTM VIP: 10.1.98.8 in front of ISE-PSN-1 (10.1.99.5), ISE-PSN-2 (10.1.99.6), and ISE-PSN-3 (10.1.99.7). Persistence Cache: 11:22:33:44:55:66 -> PSN-3. Flows shown: RADIUS request to VIP; RADIUS response from PSN-3; DHCP Request; IP Helper sends DHCP to VIP.]

1. RADIUS Authentication request sent to VIP @ 10.1.98.8.
2. Request is Load Balanced to PSN-3, and entry added to Persistence Cache
3. DHCP Request is sent to VIP @ 10.1.98.8
4. Load Balancer uses the same “Sticky” as RADIUS based on client MAC address
5. DHCP is received by same PSN, thus optimizing endpoint replication
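The five steps above rely on RADIUS and DHCP producing the same persistence key for one endpoint. A minimal Python sketch of that idea follows, assuming the key is the client MAC in a normalized form: the Calling-Station-ID may arrive with different separators, and the DHCP client hardware address (chaddr) sits at byte offset 28 of the BOOTP message. The function names and the shared dictionary are hypothetical.

```python
import re


def normalize_mac(value: str) -> str:
    """Reduce any common MAC notation to 12 lowercase hex digits."""
    digits = re.sub(r"[^0-9A-Fa-f]", "", value)
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {value!r}")
    return digits.lower()


def dhcp_client_mac(bootp: bytes) -> str:
    """Extract chaddr from a BOOTP/DHCP message.

    Layout: op(1) htype(1) hlen(1) hops(1) xid(4) secs(2) flags(2)
    ciaddr(4) yiaddr(4) siaddr(4) giaddr(4) chaddr(16) ...
    """
    hlen = bootp[2]
    return bootp[28:28 + hlen].hex()


persistence_cache = {}


def pick_psn(mac_key: str, candidate: str) -> str:
    """Return the cached PSN for this MAC, or remember the candidate."""
    return persistence_cache.setdefault(mac_key, candidate)


# RADIUS arrives first and is load balanced to PSN-3
radius_key = normalize_mac("11-22-33-44-55-66")
radius_psn = pick_psn(radius_key, "ise-psn-3")

# The relayed DHCP request for the same client follows
dhcp_packet = bytes([1, 1, 6, 0]) + bytes(24) + bytes.fromhex("112233445566") + bytes(10)
dhcp_psn = pick_psn(dhcp_client_mac(dhcp_packet), "ise-psn-1")
```

Because both lookups normalize to the same key, the DHCP request follows the RADIUS session to PSN-3 even though a different candidate was offered.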
Live Log Output for Load Balanced Sessions For Your
Reference
Synthetic Transactions

• Batch of test authentications generated from Catalyst switch:


# test aaa group radius radtest cisco123 new-code count 100

All RADIUS sent to


LB VIP @ 10.1.98.8

Requests evenly
distributed across
real servers:
ise-psn-1
ise-psn-2
ise-psn-3
Live Log Output for Load Balanced Sessions For Your
Reference
Real Transactions

• All RADIUS sent to LB VIP @ 10.1.98.10


1. All phone auth is load balanced from VIP to ise-psn-3 @ 10.1.99.7
2. All PC auth is load balanced to ise-psn-1 @ 10.1.99.5; URL Redirect traffic sent to same PSN.
3. CoA is sent from same PSN that is handling the auth session.
4. dACL downloads are sent from switch itself without a Calling-Station-Id or Framed-IP-Address. Request can be load balanced to any PSN. Not required to pull dACL from same PSN as auth.

[Screenshot: Live Log with callouts 1 to 4.]
ISE and Load Balancers For Your
Reference
Failure Scenarios

• The VIP is the RADIUS Server, so if the entire VIP is down, then the NAD should fail over
to the secondary Data Center VIP (listed as the secondary RADIUS server on the NAD).
• Probes on the load balancers should ensure that RADIUS is responding as well as
HTTPS, at a minimum.
• Validate that RADIUS responds, not just that UDP/1812 & UDP/1813 are open
• Validate that HTTPS responds, not just that TCP/8443 is open

• Upon detection of failed node using probes (or node taken out of service), new requests
will be serviced by remaining nodes. Minimum N+1 redundancy recommended for node
groups.
• Configure LB cluster as a node group.
• If node group member fails, then another node-group member will issue CoA-reauth for Posture
Pending sessions, forcing the sessions to begin again and not be hung.
• Note: Node groups do not require load balancers, but nodes still must meet IP multicast
requirements.
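The probe and redundancy guidance above can be expressed as two small checks. This Python sketch assumes a probe result per node for RADIUS and HTTPS and a separately estimated number of PSNs needed at peak load; the function names and the data layout are hypothetical and do not reflect any load balancer's monitoring API.

```python
def in_service(probe_results: dict) -> set:
    """Nodes that answered both the RADIUS and the HTTPS probe."""
    return {
        node
        for node, result in probe_results.items()
        if result["radius"] and result["https"]
    }


def meets_n_plus_1(node_count: int, nodes_needed_at_peak: int) -> bool:
    """True if the group can lose one node and still carry peak load."""
    return node_count >= nodes_needed_at_peak + 1


probes = {
    "ise-psn-1": {"radius": True, "https": True},
    "ise-psn-2": {"radius": True, "https": False},  # port open but portal not answering
    "ise-psn-3": {"radius": True, "https": True},
}
healthy = in_service(probes)
```

In this sample, ise-psn-2 is withheld from new requests, and a three-node group sized for two nodes of peak load still satisfies N+1.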
ISE and Load Balancers For Your
Reference
General Guidelines

• Do not use Source NAT(SNAT) from access layer for RADIUS; SNAT Optional for HTTP/S:
• ISE uses Layer 3 address to identify NAD, not NAS-IP-Address in RADIUS packet, so CoA fails.
• Each PSN must be reachable by the PAN / MNT directly without NAT.
• Each PSN must be reachable directly from client network for URL redirects (*Note sticky exception)
• Perform sticky (aka: persistence) based on Calling-Station-ID.
• Some load balancers support RADIUS Session ID; Others may be limited to Source IP (NAD IP).
• Optional “sticky buddies” (secondary attributes that persist different traffic to same PSN)
• *Framed-IP-Address if URL redirects must be sent through LB and not bypass LB.
• DHCP Requested IP Address to ensure DHCP Profile data hits same PSN that terminated RADIUS.
• VIP for PSNs gets listed as the RADIUS server on each NAD for all RADIUS AAA.
• Each PSN gets listed individually in the NAD CoA list by real IP address (not VIP).
• If source NAT PSN-initiated CoA traffic, then can list single VIP in NAD CoA list.
• Load Balancers get listed as NADs in ISE so their test authentications may be answered.
For Your
Reference

Sample ACE
Configuration
Sample ACE Configuration… For Your
Reference
Health Probes and Real Servers

• Define TCP or HTTP/S probe to verify web services active on configured HTTPS ports. (Simple example; HTTP/S probe recommended)
probe tcp 8443-PROBE
  port 8443
  interval 30
  passdetect interval 90
  connection term forced
  open 1

• Define RADIUS probe to verify AAA services
probe radius PSN-PROBE
  port 1812
  interval 10
  passdetect interval 90
  credentials radprobe cisco123 secret cisco123
  nas ip address 10.1.99.2

Sample ping probe:
probe icmp ping
  interval 15
  passdetect interval 60

• Define real servers = Policy Service node real IP address for RADIUS / Web
rserver host ise-psn-1
  ip address 10.1.99.5
  inservice
rserver host ise-psn-2
  ip address 10.1.99.6
  inservice
rserver host ise-psn-3
  ip address 10.1.99.7
  inservice
Sample ACE Configuration… For Your
Reference
Server Farms and Sticky

• Define server farm for RADIUS services using RADIUS health probe
serverfarm host ise-psn
  probe PSN-PROBE
  rserver ise-psn-1
    inservice
  rserver ise-psn-2
    inservice
  rserver ise-psn-3
    inservice

• Optionally define separate server farm for client HTTPS requests sent to LB VIP address (different health probe)
serverfarm host ise-psn-web
  probe 8443-PROBE
  rserver ise-psn-1
    inservice
  rserver ise-psn-2
    inservice
  rserver ise-psn-3
    inservice

• RADIUS sticky based on Framed-IP-Address and Calling-Station-ID to ensure Auth and Acctng for same client stay with same PSN
sticky radius framed-ip calling-station-id RADIUS-STICKY
  serverfarm ise-psn

• Simple Source IP sticky used for client web requests sent directly to LB VIP.
sticky ip-netmask 255.255.255.255 address source SRC-IP-STICKY
  timeout 5
  serverfarm ise-psn-web
Sample ACE Configuration… For Your
Reference
Class Maps to Define Matching VIP Traffic

• Profiling Services: Load balance DHCP (IP Helper) requests forwarded from client gateways (UDP/67).
class-map match-all DHCP-CLASS
  2 match virtual-address 10.1.98.10 udp eq 67

• Note: To LB SNMP traps or NetFlow, add UDP/162 or UDP/9996, respectively to list of ports. Example:
class-map match-any PROFILER-CLASS
  2 match virtual-address 10.1.98.10 udp eq 67
  3 match virtual-address 10.1.98.10 udp eq 162
  4 match virtual-address 10.1.98.10 udp eq 9996

• Configure VIP for HTTP/S ports used for direct access to PSN web services (Guest LWA, Sponsor, and MyDevices portals). HTTP (tcp/80) allows redirect to secure HTTPS port. Specific ports will depend on ISE configuration for web portal ports.
class-map match-any HTTPS-CLASS
  2 match virtual-address 10.1.98.10 tcp eq http
  3 match virtual-address 10.1.98.10 tcp eq https
  4 match virtual-address 10.1.98.10 tcp eq 8443
  5 match virtual-address 10.1.98.10 tcp eq 8444

• Configure VIP for RADIUS Auth/Acctng
class-map match-all RAD-L4-CLASS
  2 match virtual-address 10.1.98.10 udp range 1812 1813
Sample ACE Configuration… For Your
Reference
Policy Maps

• Map LB policies for RADIUS, DHCP (Profiling), and Web services to specific sticky server farms.
policy-map type loadbalance radius first-match RAD-L7-POLICY
  class class-default
    sticky-serverfarm RADIUS-STICKY
policy-map type loadbalance generic first-match WEB-L4-POLICY
  class class-default
    sticky-serverfarm SRC-IP-STICKY
policy-map type loadbalance generic first-match DHCP-L4-POLICY
  class class-default
    sticky-serverfarm SRC-IP-STICKY

• Define general service policy to be applied to ACE client-side interface. Maps VIPs to individual LB policies which point to sticky server farms. The loadbalance vip icmp-reply line allows VIP to be pinged by clients.
policy-map multi-match RAD-L4-POLICY
  class RAD-L4-CLASS
    loadbalance vip inservice
    loadbalance policy RAD-L7-POLICY
    loadbalance vip icmp-reply
  class HTTPS-CLASS
    loadbalance vip inservice
    loadbalance policy WEB-L4-POLICY
    loadbalance vip icmp-reply
  class DHCP-CLASS
    loadbalance vip inservice
    loadbalance policy DHCP-L4-POLICY
    loadbalance vip icmp-reply
  class class-default
Sample ACE Configuration… For Your
Reference
Interfaces and Service Policies

• Optional ACL to define traffic permitted to/from each interface
access-list ALL line 1 extended permit ip any any

• Client-facing interface: includes general service policy for LB services
interface vlan 98
  description ACE
  ip address 10.1.98.2 255.255.255.0
  access-group input ALL
  service-policy input RAD-L4-POLICY
  no shutdown

• Server-facing interface
interface vlan 99
  description CLUSTER
  ip address 10.1.99.1 255.255.255.0
  alias 10.1.99.2 255.255.255.0
  mac-sticky enable
  no icmp-guard
  access-group input ALL
  no shutdown

• Default route pointing to upstream L3 switch.
ip route 10.1.0.0 255.255.0.0 10.1.98.1
Sample ACE Configuration… For Your
Reference
Allow NAT of PSN CoA Requests

• Match traffic from PSNs to UDP/1700 (RADIUS CoA) and translate to PSN cluster VIP.
access-list NAT-COA line 5 extended permit udp 10.1.99.0 255.255.255.248 any eq 1700
class-map match-any NAT-CLASS
  2 match access-list NAT-COA
policy-map multi-match NAT-POLICY
  class NAT-CLASS
    nat dynamic 1 vlan 98
interface vlan 98
  nat-pool 1 10.1.98.10 10.1.98.10 netmask 255.255.255.255 pat
interface vlan 99
  service-policy input NAT-POLICY

[Diagram: CoA leaves ISE-PSN-1 (10.1.99.5) with SRC=10.1.99.5; the ACE LB (10.1.98.10) translates it to SRC=10.1.98.10. PSNs shown: ISE-PSN-1 10.1.99.5, ISE-PSN-2 10.1.99.6, ISE-PSN-3 10.1.99.7, ISE-PSN-X 10.1.99.x.]

Before
aaa server radius dynamic-author
 client 10.1.99.5 server-key cisco123
 client 10.1.99.6 server-key cisco123
 client 10.1.99.7 server-key cisco123
 client 10.1.99.8 server-key cisco123
 client 10.1.99.9 server-key cisco123
 client 10.1.99.10 server-key cisco123
 <…one entry per PSN…>
After
aaa server radius dynamic-author
 client 10.1.98.10 server-key cisco123
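When adapting the NAT-COA access list, it is worth checking that its source range actually covers every PSN that can send CoA. The Python sketch below does this with the standard `ipaddress` module, using the sample mask from the ACL and the sample PSN addresses from the "Before" list; the variable names are illustrative.

```python
import ipaddress

# Source match from the sample ACL: 10.1.99.0 255.255.255.248
nat_coa_source = ipaddress.ip_network("10.1.99.0/255.255.255.248")

# PSN real IPs listed in the sample switch config (10.1.99.5 to 10.1.99.10)
psn_addresses = [ipaddress.ip_address(f"10.1.99.{host}") for host in range(5, 11)]

covered = [str(ip) for ip in psn_addresses if ip in nat_coa_source]
missed = [str(ip) for ip in psn_addresses if ip not in nat_coa_source]
```

With the sample values, the 255.255.255.248 mask spans 10.1.99.0 through 10.1.99.7, so it covers the three PSNs drawn in the diagram; a deployment with PSNs at .8 and above would need a wider source match for their CoA to be translated.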
Sample ACE Configuration… For Your
Reference
Allow Ping for ISE PSNs / UDP Connection Timer

• Allow ISE nodes to ping default gateway. Otherwise, install fails!
class-map type management match-any remote_access
  2 match protocol icmp any
policy-map type management first-match remote_mgmt_allow_policy
  class remote_access
    permit
interface vlan 99
  service-policy input remote_mgmt_allow_policy

[Diagram: ISE-PSN-1 (10.1.99.5), ISE-PSN-2 (10.1.99.6), and ISE-PSN-3 (10.1.99.7) send echo to the ACE LB at 10.1.99.1 (ACE INT VLAN 99) and receive echo-reply.]

• Optionally set UDP connection timeout to match switch RADIUS timeout; assign to RADIUS LB connections.
parameter-map type connection UDP_CONN
  set timeout inactivity 30
policy-map multi-match RAD-L4-POLICY
  class RAD-L4-CLASS
    connection advanced-options UDP_CONN
Sample F5 LTM Config

Cisco and F5 Deployment Guide: ISE Load Balancing using BIG-IP:
http://www.cisco.com/c/dam/en/us/td/docs/security/ise/how_to/HowTo-95-Cisco_and_F5_Deployment_Guide-ISE_Load_Balancing_Using_BIG-IP_DF.pdf

ISE How-To and Design Guides:
http://www.cisco.com/c/en/us/support/security/identity-services-engine/products-implementation-design-guides-list.html
Forwarding Non-LB Traffic

High-Level Load Balancing Diagram
[Diagram: End User/Device -> Network Access Device (NAS IP: 10.1.50.2) -> F5 LTM (VIP: 10.1.98.8 on VLAN 98, 10.1.98.0/24; LB: 10.1.99.1 on VLAN 99, 10.1.99.0/24) -> ISE-PSN-1 (10.1.99.5), ISE-PSN-2 (10.1.99.6), ISE-PSN-3 (10.1.99.7). Also shown: ISE-PAN-1, ISE-PAN-2, ISE-MNT-1, ISE-MNT-2, DNS/NTP/SMTP, External Logger, MDM, AD/LDAP.]

Non-LB Traffic that Requires IP Forwarding
Inter-node/Management/Repository/ID Stores/Feeds/Profiling/Redirected Web/RADIUS CoA

• PAN/MnT node communications


• All management traffic to/from the PSN real IP addresses such as HTTPS, SSH, SNMP,
NTP, DNS, SMTP, and Syslog.
• Repository and file management access initiated from PSN including FTP, SCP, SFTP,
TFTP, NFS, HTTP, and HTTPS.
• All external AAA-related traffic to/from the PSN real IP addresses such as AD, LDAP,
RSA, external RADIUS servers (token or foreign proxy), and external CA communications
(CRL downloads, OCSP checks, SCEP proxy).
• All service-related traffic to/from the PSN real IP addresses such as Posture and Profiler
Feed Services, partner MDM integration, pxGrid, and REST/ERS API communications.
• Client traffic to/from PSN real IP addresses resulting from Profiler (NMAP, SNMP queries)
and URL-Redirection such as CWA, DRW/Hotspot, MDM, Posture, and Client
Provisioning.
• RADIUS CoA from PSNs to network access devices.
Virtual Server to Forward General Inbound IP Traffic
General Properties

• Applies to connections initiated from outside (external) network
• Type = Forwarding (IP)
• Source = All traffic (0.0.0.0/0) or limit to specific network.
• Destination = PSN Network Addresses
• Service Port = 0 (All Ports)
• Availability = Unknown (No service validation via health monitors)

Virtual Server to Forward General Inbound IP Traffic
Configuration (Advanced)

• Protocol = All Protocols
• Protocol Profile = fastL4
• Optionally limit to specific ingress VLAN(s).
• No SNAT
Virtual Server to Forward General Outbound IP Traffic
General Properties

• Applies to connections initiated from PSN (internal) network
• Type = Forwarding (IP)
• Source = PSN Network Addresses
• Destination = All traffic (0.0.0.0/0.0.0.0) or limit to specific network.
• Service Port = 0 (All Ports)
• Availability = Unknown (No service validation via health monitors)

Virtual Server to Forward General Outbound IP Traffic
Configuration (Advanced)

• Protocol = All Protocols
• Protocol Profile = fastL4
• Optionally limit to specific ingress VLAN(s).
• No SNAT
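
The split between the inbound and outbound forwarding virtual servers can be pictured as a simple match on source and destination address. The sketch below is only an illustration of that matching logic, assuming the PSN network is 10.1.99.0/24 as in the high-level diagram; the function name and return labels are hypothetical and are not BIG-IP settings.

```python
import ipaddress

# Assumption: PSN server network taken from the high-level diagram (VLAN 99).
PSN_NETWORK = ipaddress.ip_network("10.1.99.0/24")


def forwarding_virtual_server(src: str, dst: str) -> str:
    """Return which forwarding (IP) virtual server would match a new flow.

    "outbound": connection initiated from the PSN network.
    "inbound":  connection initiated from outside toward a PSN real IP.
    "none":     neither rule applies (for example, traffic to a VIP, which is
                handled by the load-balanced virtual servers instead).
    """
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    src_is_psn = src_ip in PSN_NETWORK
    dst_is_psn = dst_ip in PSN_NETWORK
    if src_is_psn and not dst_is_psn:
        return "outbound"
    if dst_is_psn and not src_is_psn:
        return "inbound"
    return "none"


if __name__ == "__main__":
    print(forwarding_virtual_server("10.1.99.5", "10.1.50.2"))    # PSN to NAD
    print(forwarding_virtual_server("10.1.60.105", "10.1.99.6"))  # client to PSN
```

Traffic addressed to the RADIUS VIP (10.1.98.8 in the diagram) falls through to "none" here because it is expected to match a Standard virtual server rather than a Forwarding (IP) one.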
Example Inbound / Outbound IP Forwarding Servers

Load Balancing RADIUS

F5 LTM Configuration Components for RADIUS LB
• RADIUS Auth / RADIUS Acct Virtual Server: UDP Profile, RADIUS Profile, Persistence Profile (with persistence iRule), Pool List (Health Monitor, Member Nodes)
• RADIUS CoA Virtual Server: SNAT Pool
RADIUS Health Monitors
Load Balancer Probes Determine RADIUS Server Health Status

• BIG-IP LTM RADIUS monitor has two key timer settings:
  o Interval = probe frequency (default = 10 sec)
  o Timeout = total time before monitor fails (default = 31 seconds)
  Timeout = (3 * Interval) + 1
  (Four health checks are attempted before declaring a node failure)
• Timers: Set low enough to ensure efficient failover but long enough to avoid excessive probing (AAA load); start with defaults, then tune to network.
• User Account: If a valid user account is to be used for the monitor, be sure to configure the user in ISE or external ID store with limited/no network access privileges.

Sample LTM RADIUS Health Monitor Config:

ltm monitor radius /Common/radius_1812 {
    debug no
    defaults-from /Common/radius
    destination *:1812
    interval 10
    password P@$$w0rd
    secret P@$$w0rd
    time-until-up 0
    timeout 31
    username f5-probe
}
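
The timer relationship above is simple arithmetic, and a small helper makes it easy to check candidate values before changing a monitor. This is a minimal sketch of the slide's formula only; the function name is hypothetical and nothing here talks to a BIG-IP.

```python
def monitor_timeout(interval_s: int, multiplier: int = 3) -> int:
    """Return the monitor timeout from the slide's rule: (3 * Interval) + 1."""
    if interval_s <= 0:
        raise ValueError("interval must be a positive number of seconds")
    return multiplier * interval_s + 1


if __name__ == "__main__":
    for interval in (5, 10, 15):
        print(f"interval={interval}s timeout={monitor_timeout(interval)}s")
```

With the default interval of 10 seconds this gives the default timeout of 31 seconds shown in the sample config.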
Configure RADIUS Health Monitor
Local Traffic > Monitors
• Same monitor can be leveraged for RADIUS Auth, Accounting, and Profiling to reduce probe load for multiple services.
• Be sure the BIG-IP LTM is configured as a NAD in ISE.

Optional: Configure UDP Profile for RADIUS
Local Traffic > Profiles > Protocol > UDP

• Start with default Idle Timeout
• Using a custom profile allows for tuning later if needed without impacting other services based on same parent UDP profile
• Disable Datagram LB

Optional: Configure RADIUS Profile
Local Traffic > Profiles > Services > RADIUS

• Start with default settings
• Using a custom profile allows for tuning later if needed without impacting other services based on same parent radiusLB profile
Configure iRule for RADIUS Persistence
Local Traffic > iRules > iRule List

• Recommend iRule based on client MAC address
• RADIUS Attribute/Value Pair = 31 = Calling-Station-Id
• Recommend copy and paste working iRule into text area.

F5 iRule Editor
https://devcentral.f5.com/d/tag/irules%20editor
• Manage iRules and config files
• Syntax checker
• Generate HTTP traffic
• Quick links to tech resources
Configuring RADIUS Persistence
RADIUS Profile Example

• RADIUS Sticky on Calling-Station-ID (client MAC address)
• Simple option but does not support advanced logging and other enhanced parsing options like iRule
• Profile must be applied to Standard Virtual Server based on UDP Protocol

ltm profile radius /Common/radiusLB {
    app-service none
    clients none
    persist-avp 31
    subscriber-aware disabled
    subscriber-id-type 3gpp-imsi
}
iRule for RADIUS Persistence Based on Client MAC
Persistence based on Calling-Station-Id (MAC Address) with fallback to NAS-IP-Address

• iRule assigned to Persistence Profile
• Persistence Profile assigned to Virtual Server under Resources section
• Optional debug logging: enable for troubleshooting only to reduce processing load
• Configurable persistence timeout based on media type
  o Wireless Default = 1 hour
  o Wired Default = 8 hours

when CLIENT_DATA {
    # 0: No Debug Logging 1: Debug Logging
    set debug 0

    # Persist timeout (seconds)
    set nas_port_type [RADIUS::avp 61 "integer"]
    if {$nas_port_type equals "19"} {
        set persist_ttl 3600
        if {$debug} {set access_media "Wireless"}
    } else {
        set persist_ttl 28800
        if {$debug} {set access_media "Wired"}
    }

    if {[RADIUS::avp 31] ne "" } {
        set mac [RADIUS::avp 31 "string"]
        # Normalize MAC address to upper case
        set mac_up [string toupper $mac]
        persist uie $mac_up $persist_ttl
        if {$debug} {
            set target [persist lookup uie $mac_up]
            log local0.alert "Username=[RADIUS::avp 1] MAC=$mac Normal MAC=$mac_up MEDIA=$access_media TARGET=$target"
        }
    } else {
        set nas_ip [RADIUS::avp 4 ip4]
        persist uie $nas_ip $persist_ttl
        if {$debug} {
            set target [persist lookup uie $nas_ip]
            log local0.alert "No MAC Address found - Using NAS IP as persist id. Username=[RADIUS::avp 1] NAS IP=$nas_ip MEDIA=$access_media TARGET=$target"
        }
    }
}
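
The decision the iRule makes (which value to persist on, and for how long) can be modeled outside the load balancer to sanity-check expected behavior. The following is a rough Python model of that logic, not F5 code; the function and constant names are hypothetical, and only the values 19, 3600, and 28800 come from the iRule above.

```python
from typing import Optional, Tuple

WIRELESS_NAS_PORT_TYPE = 19   # NAS-Port-Type value the iRule treats as wireless
WIRELESS_TTL_S = 3600         # 1 hour
WIRED_TTL_S = 28800           # 8 hours


def persistence_entry(calling_station_id: Optional[str],
                      nas_ip: str,
                      nas_port_type: Optional[int]) -> Tuple[str, int]:
    """Return (persistence key, timeout in seconds) the way the iRule picks them.

    The key is the upper-cased Calling-Station-Id when present, otherwise the
    NAS-IP-Address. The timeout depends only on NAS-Port-Type.
    """
    ttl = WIRELESS_TTL_S if nas_port_type == WIRELESS_NAS_PORT_TYPE else WIRED_TTL_S
    if calling_station_id:
        return calling_station_id.upper(), ttl
    return nas_ip, ttl


if __name__ == "__main__":
    print(persistence_entry("00-50-56-a0-0b-3a", "10.1.50.2", 19))
    print(persistence_entry(None, "10.1.50.2", 15))
```

Upper-casing the MAC matters because the same endpoint may be reported in different letter case by different sources, and a case mismatch would create two persistence records for one endpoint.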
Configure Persistence Profile for RADIUS
Local Traffic > Profiles > Persistence

• Enable Match Across Services
• If different Virtual Server IP addresses used for RADIUS Auth and Accounting, then enable Match Across Virtual Servers (not recommended)
• Specify RADIUS Persistence iRule
• iRule persistence timer overrides profile setting.
Configure Server Pool for RADIUS Auth
Local Traffic > Pools > Pool List

• Health Monitor = RADIUS Monitor
• SNAT = No
• Action on Service Down = Reselect
  o Ensures existing connections are moved to an alternate server.

Configure Member Nodes in RADIUS Auth Pool
Local Traffic > Pools > Pool List > Members

• Load Balancing Method options:
  o Least Connections (node)
  o Least Connections (member)
• Server Port: 1812 or 1645

Configure Server Pool for RADIUS Accounting
Local Traffic > Pools > Pool List

• Health Monitor = RADIUS Monitor (same monitor used for RADIUS Auth)
• SNAT = No
• Action on Service Down = Reselect
  o Ensures existing connections are moved to an alternate server.

Configure Member Nodes in RADIUS Accounting Pool
Local Traffic > Pools > Pool List > Members

• Load Balancing Method options:
  o Least Connections (node)
  o Least Connections (member)
  o Fastest (application)
• Server Port: 1813 or 1646
Configure Virtual Server for RADIUS Auth (Properties)
Local Traffic > Virtual Servers > Virtual Server List
• Type = Standard
• Source = 0.0.0.0/0 (all hosts) or specific network address.
• Destination = RADIUS Virtual IP
• Service Port = 1812 or 1645

Configure Virtual Server for RADIUS Auth (Advanced)
Local Traffic > Virtual Servers

• Protocol = UDP
• Protocol Profile = udp or custom UDP profile
• RADIUS Profile = radiusLB or custom RADIUS profile
• Optional: Limit traffic to specific VLAN(s)
• SNAT = None
Configure Virtual Server RADIUS Auth (Resources)
Local Traffic > Virtual Servers > Virtual Server List > Resources

• Default Pool = RADIUS Auth Pool
• Default Persistence Profile = RADIUS persistence profile
• Fallback Persistence Profile:
  o RADIUS iRule setting overrides value set here.
  o If not configured in iRule, set optional value here. Example: radius_source_addr
  o Recommend creating a new persistence profile based on Source Address Affinity to allow custom timers and match settings.

Configure Virtual Server for RADIUS Accounting
Local Traffic > Virtual Servers > Virtual Server List
• Same settings as RADIUS Auth Virtual Server but different service port and pool
Configure SNAT Pool List for RADIUS CoA
Local Traffic > Address Translation > SNAT Pool List

• CoA traffic is initiated by PSN to NADs on UDP/1700
• Define SNAT Pool List with RADIUS Server Virtual IP as a pool member

Configure Virtual Server to SNAT RADIUS CoA (Properties)
Local Traffic > Virtual Servers > Virtual Server List

• CoA traffic is initiated by PSN to NADs on UDP/1700
• Type = Standard
• Source = PSN Network
• Destination = 0.0.0.0 / 0.0.0.0 (all hosts) or specific network for all NADs
• Service Port = 1700

Configure Virtual Server to SNAT RADIUS CoA (Advanced)
Local Traffic > Virtual Servers

• Protocol = UDP
• Optional: Limit traffic to specific VLAN(s)
• Source Address Translation = SNAT
• SNAT Pool = CoA SNAT Pool List
• Resources = None
Load Balancing ISE Profiling

F5 LTM Configuration Components for Profiling LB
• Virtual Server: UDP Profile, Persistence Profile (with persistence iRule), Pool List (Member Nodes)
Configure UDP Profile for Profiling
Local Traffic > Profiles > Protocol > UDP

• Set Idle Timeout to Immediate. Profiling traffic from DHCP and SNMP Traps is a one-way flow to PSNs; no response is sent to these packets.
• Be sure to create a new UDP profile to ensure these settings are applied only to Profiling.
• Using a custom profile allows for tuning later if needed without impacting other services based on same parent UDP profile
• Disable Datagram LB
iRule for DHCP Persistence Based on Client MAC (1 of 2)
Persistence based on DHCP Option 61 – Client Identifier (MAC Address)

• iRule assigned to Persistence Profile
• Persistence Profile assigned to Virtual Server under Resources section
• Optional debug logging: enable for troubleshooting only to reduce processing load
• Configurable persistence timeout

when CLIENT_ACCEPTED priority 100 {

    # Rule Name and Version shown in the log
    set static::RULE_NAME "Simple DHCP Parser v0.3"
    set static::RULE_ID "dhcp_parser"

    # 0: No Debug Logging 1: Debug Logging
    set debug 1

    # Persist timeout (seconds)
    set persist_ttl 7200
iRule for DHCP Persistence Based on Client MAC (2 of 2)
Note: Example is an excerpt only, not a complete iRule.

    # extract value field in hexadecimal format
    binary scan $dhcp_option_payload x[expr $i + 2]a[expr { $length * 2 }] value_hex
    set value ""
    switch $option {
        61 { # Client Identifier
            binary scan $value_hex a2a* ht id
            switch $ht {
                01 {
                    binary scan $id a2a2a2a2a2a2 m(a) m(b) m(c) m(d) m(e) m(f)
                    set value "$m(a)-$m(b)-$m(c)-$m(d)-$m(e)-$m(f)"
                    set option61 "$value"
                    set mac_up [string toupper $option61] ;# Normalize MAC
                } default {
                    set value "$id"
    ...
    persist uie $mac_up $persist_ttl
    if {$debug} {
        set target [persist lookup uie $mac_up]
        log local0.debug "$log_prefix_d ***** iRule: $static::RULE_NAME competed ***** MAC=$option61 Normal MAC=$mac_up TARGET=$target"
iRule for DHCP Persistence – Sample Debug Output

Sat Sep 27 13:40:08 EDT 2014 debug f5 tmm[9443]
Rule /Common/dhcp_mac_sticky <CLIENT_ACCEPTED>: [dhcp_parser](10.1.10.1)(debug)
***** iRule: Simple DHCP Parser v0.3 competed *****
MAC=00-50-56-a0-0b-3a Normal MAC=00-50-56-A0-0B-3A TARGET=

Sat Sep 27 13:40:08 EDT 2014 debug f5 tmm[9443]
Rule /Common/dhcp_mac_sticky <CLIENT_ACCEPTED>: [dhcp_parser](10.1.10.1)(debug)
BOOTP: 0.0.0.0 00:50:56:a0:0b:3a

Sat Sep 27 13:40:08 EDT 2014 debug f5 tmm[9443]
Rule /Common/dhcp_mac_sticky <CLIENT_ACCEPTED>: [dhcp_parser](10.1.10.1)(debug)
***** iRule: Simple DHCP Parser v0.3 executed *****

Sat Sep 27 13:39:45 EDT 2014 debug f5 tmm[9443]
Rule /Common/dhcp_mac_sticky <CLIENT_ACCEPTED>: [dhcp_parser](10.1.40.1)(debug)
***** iRule: Simple DHCP Parser v0.3 competed *****
MAC=f0-25-b7-08-33-9d Normal MAC=F0-25-B7-08-33-9D TARGET=
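
Because the iRule shown is only an excerpt, it may help to see the same idea end to end: walk the DHCP options, find Option 61 (Client Identifier), and normalize an Ethernet MAC into the upper-case, dash-separated form seen in the debug output. The sketch below is a Python illustration of that parsing under stated assumptions, not a translation of the full iRule; the function name is hypothetical, and the input is assumed to be the raw options field that follows the DHCP magic cookie.

```python
from typing import Optional

OPT_PAD = 0
OPT_END = 255
OPT_CLIENT_ID = 61
HTYPE_ETHERNET = 0x01


def client_mac_from_options(options: bytes) -> Optional[str]:
    """Return the Option 61 Ethernet MAC as 'AA-BB-CC-DD-EE-FF', or None."""
    i = 0
    while i < len(options):
        code = options[i]
        if code == OPT_END:
            break
        if code == OPT_PAD:          # Pad has no length byte
            i += 1
            continue
        if i + 1 >= len(options):    # truncated option header
            break
        length = options[i + 1]
        value = options[i + 2:i + 2 + length]
        if (code == OPT_CLIENT_ID and len(value) == 7
                and value[0] == HTYPE_ETHERNET):
            return "-".join(f"{octet:02X}" for octet in value[1:])
        i += 2 + length
    return None


if __name__ == "__main__":
    sample = bytes([53, 1, 3,                      # DHCP Message Type = Request
                    61, 7, 0x01, 0x00, 0x50, 0x56, 0xA0, 0x0B, 0x3A,
                    255])
    print(client_mac_from_options(sample))
```

The returned string is what would be used as the persistence key, so that DHCP for an endpoint lands on the same PSN as RADIUS keyed on Calling-Station-Id in the same format.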
Optional: Configure iRule for DHCP Profiling Persistence
Local Traffic > iRules > iRule List

• Alternative to basic Source Address-based persistence
• Sample iRule based on client MAC address parsed from DHCP Request packets
• Allows DHCP for given endpoint to persist to same PSN serving RADIUS for same endpoint
• Recommend copy and paste working iRule into text area.

Optional: Configure Persistence Profile for Profiling
Local Traffic > Profiles > Persistence

• Enable Match Across Services
• If different Virtual Server IP addresses used for DHCP Profiling and RADIUS, then enable Match Across Virtual Servers. (Recommend use same IP address)
• Specify DHCP Persistence iRule
• iRule persistence timer overrides profile setting.
Configure Server Pool for DHCP Profiling
Local Traffic > Pools > Pool List

• Health Monitor = RADIUS Monitor
  o If PSN not configured for User Services (RADIUS auth), then can use default gateway_icmp monitor.
• Action on Service Down = Reselect
  o Ensures existing connections are moved to an alternate server.

Configure Member Nodes in DHCP Profiling Pool
Local Traffic > Pools > Members

• Load Balancing Method = Round Robin
• Server Port = 67 (DHCP Server)

Configure Server Pool for SNMP Trap Profiling
Local Traffic > Pools
• Same settings as DHCP Profiling Pool except members configured for UDP Port 162.
Configure Virtual Server for DHCP Profiling (Properties)
Local Traffic > Virtual Servers > Virtual Server List

• Type = Standard
• Source = 0.0.0.0/0 (all hosts) or specific network address.
• Destination = Can be same as RADIUS Virtual IP or unique IP. Be sure to configure DHCP Relays/IP Helpers to point to this IP address
• Service Port = 67

Configure Virtual Server for DHCP Profiling (Advanced)
Local Traffic > Virtual Servers

• Protocol = UDP
• Protocol Profile = udp or custom UDP profile
• Optional: Limit traffic to specific VLAN(s)

Configure Virtual Server for DHCP Profiling (Resources)
Local Traffic > Virtual Servers > Resources
• Default Pool = DHCP Profiling Pool
• Default Persistence Profile = Persistence Profile based on Source Address Affinity, OR DHCP persistence profile
• Fallback Persistence Profile:
  o DHCP iRule setting overrides value set here.
  o If not configured in iRule, set optional value here. Example: profiling_source_addr
• If persistence profile is based on Source Address Affinity (source_addr), recommend creating a new profile to allow custom timers and “Match Across” settings.

Configure Virtual Server for SNMP Trap Profiling
Local Traffic > Virtual Servers

• Same settings as DHCP Profiling Virtual Server but different service port and pool. Additionally, Default Persistence Profile should be based on Source Address Affinity (NAD IP address).

Load Balancing ISE Web Services

F5 LTM Configuration Components for HTTP/S LB
• Virtual Server: TCP Profile, Persistence Profile, Pool List (Health Monitor, Member Nodes)
Configure HTTPS Health Monitor
Local Traffic > Monitors

• Configure Send and Receive Strings appropriate to ISE version
• Set UserName and Password to any value (does not have to be valid user account)
• Alias Service Port = Portal Port configured in ISE

HTTPS Health Monitor Examples
Local Traffic > Monitors

• ISE 1.2 Example
  o Send String: GET /sponsorportal/
  o Receive String: HTTP/1.1 200 OK

• ISE 1.3 Example
  o Send String: GET /sponsorportal/PortalSetup.action?portal=Sponsor%20Portal%20%28default%29
  o Receive String: HTTP/1.1 200 OK
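
The ISE 1.3 send string embeds the portal name in percent-encoded form, which is easy to get wrong when typing it by hand. As a small sketch, the encoding can be produced with the Python standard library; the helper names are hypothetical, and the portal name "Sponsor Portal (default)" is simply the decoded form of the string in the slide.

```python
from urllib.parse import quote


def sponsor_monitor_send_string(portal_name: str) -> str:
    """Build the ISE 1.3 style monitor send string for a given portal name."""
    # safe="" forces every reserved character, including spaces and
    # parentheses, to be percent-encoded.
    return ("GET /sponsorportal/PortalSetup.action?portal="
            + quote(portal_name, safe=""))


def response_matches(response: bytes,
                     receive_string: str = "HTTP/1.1 200 OK") -> bool:
    """Mimic a receive-string check: True if the string appears in the reply."""
    return receive_string.encode("ascii") in response


if __name__ == "__main__":
    print(sponsor_monitor_send_string("Sponsor Portal (default)"))
```

If the sponsor portal has been renamed in ISE, the same helper can be used to derive the matching send string for the monitor.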
Optional: Configure TCP Profile for HTTPS
Local Traffic > Profiles > Protocol > TCP

• Start with default Idle Timeout
• Using a custom profile allows for tuning later if needed without impacting other services based on same parent TCP profile

Configure Persistence Profile for HTTPS
Local Traffic > Profiles > Persistence

• Enable Match Across Services
• If different Virtual Server IP addresses used for Web Services, then enable Match Across Virtual Servers. Generally recommend use same VIP address for all portals
• Timeout = Persistence timer. Value of 1200 seconds = 20 minutes (default Sponsor Portal idle timeout setting in ISE)
Configure Server Pool for Web Services
Local Traffic > Pools > Pool List

• Health Monitor = HTTPS Monitor
• Action on Service Down = None

Configure Member Nodes in Web Services Pool
Local Traffic > Pools > Pool List > Members

• Load Balancing Method options:
  o Least Connections (node)
  o Least Connections (member)
  o Fastest (application)
• Server Port = 0 (all ports)
Configure Virtual Server for Web Portals (Properties)
Local Traffic > Virtual Servers > Virtual Server List

• Type = Standard
• Source = 0.0.0.0/0 (all hosts) or specific network address.
• Destination = Web Portal Virtual IP
• Service Port = Web Portal Port configured in ISE (default 8443)

Configure Virtual Server for HTTPS Portals (Advanced)
Local Traffic > Virtual Servers

• Protocol = TCP
• Protocol Profile = tcp or custom TCP profile
• Optional: Limit traffic to specific VLAN(s)
• Source Address Translation (SNAT)
  o Single PSN interface: None
  o Dedicated PSN interface (ISE 1.2): Auto Map
  o Dedicated PSN interface (ISE 1.3): None or Auto Map

Configure Virtual Server HTTPS Portals (Resources)
Local Traffic > Virtual Servers > Virtual Server List > Resources

• Default Pool = Web Portals Pool
• Default Persistence Profile = HTTPS persistence profile
• Fallback Persistence Profile: Not required
Configure Virtual Server for Web Portals on TCP/443
Local Traffic > Virtual Servers > Virtual Server List

• Virtual Server used to forward web traffic sent to portal FQDN on default HTTPS port 443
• PSNs will automatically redirect traffic sent to the FQDN to the specific portal port / URL.
• Service Port = 443 (HTTPS). Default HTTPS port used in initial portal request by end user.
• All other Virtual Server settings are the same as the port-specific Virtual Server (Example: ise_https8443_portals)

Configure Virtual Server for Web Portals on TCP/80
Local Traffic > Virtual Servers > Virtual Server List

• Virtual Server used to forward web traffic sent to portal FQDN on default HTTP port 80
• PSNs will automatically redirect traffic sent to the FQDN to the specific portal port / URL.
• Service Port = 80 (HTTP). Default HTTP port used in initial portal request by end user.
• All other Virtual Server settings are the same as the port-specific Virtual Server (Example: ise_https8443_portals)
Configure Virtual Server for Web Portals on TCP/80
Optional HTTP -> HTTPS Redirect by F5 LTM

To configure F5 LTM to perform automatic HTTP to HTTPS redirect instead of PSNs:
• Configure new http profile under Profiles > Services > HTTP using default settings
• Configure new http class under Profiles > Protocol > HTTP Class. Under Actions, set redirect URL.
• Under Virtual Server for HTTP (TCP/80):
  o Specify HTTP Profile under Advanced Configuration
  o Specify new HTTP Class under Resources > HTTP Class Profiles.

Virtual Server List

Server Pool List

PSN HA Without Load Balancers
How can my company get HA and scalability without load balancers?
Load Balancing Web Requests Using DNS
Client-Based Load Balancing/Distribution Based on DNS Response
• Examples: Cisco Global Site Selector (GSS) / F5 BIG-IP GTM / Microsoft’s DNS Round-Robin feature
• Useful for web services that use static URLs including LWA, Sponsor, My Devices, OCSP.

DNS SOA for company.local:
sponsor IN A 10.1.99.5
sponsor IN A 10.1.99.6
sponsor IN A 10.2.100.7
sponsor IN A 10.2.100.8

[Diagram: PSNs at 10.1.99.5, 10.1.99.6, 10.2.100.7, and 10.2.100.8. Two clients ask "What is IP address for sponsor.company.local?": client 10.1.60.105 is answered with 10.1.99.5, and client 10.2.5.221 is answered with 10.2.100.8.]
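
Plain DNS round-robin, one of the options listed above, simply rotates the order of the A records between responses so that successive clients tend to pick different PSNs. The toy model below illustrates only that rotation, using the four sponsor records from the slide; the class name is hypothetical, and real products such as GSS or GTM may instead choose answers by health or client location.

```python
from collections import deque
from typing import Deque, List


class RoundRobinRecordSet:
    """Rotate a set of A records the way simple DNS round-robin does."""

    def __init__(self, addresses: List[str]) -> None:
        if not addresses:
            raise ValueError("at least one address is required")
        self._addresses: Deque[str] = deque(addresses)

    def answer(self) -> List[str]:
        """Return the current record order, then rotate for the next query."""
        current = list(self._addresses)
        self._addresses.rotate(-1)   # first record moves to the end
        return current


if __name__ == "__main__":
    sponsor = RoundRobinRecordSet(
        ["10.1.99.5", "10.1.99.6", "10.2.100.7", "10.2.100.8"])
    for _ in range(5):
        # Most clients connect to the first address in the answer.
        print(sponsor.answer()[0])
```

Note that this gives distribution, not health awareness: a failed PSN keeps receiving its share of clients until its record is removed.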


Using Anycast for ISE Redundancy
Profiling Example

• Provided dedicated interface or LB VIPs used, Anycast may be used for Profiling, Web Portals (Sponsor, Guest LWA, and MDP) and RADIUS AAA!
• NADs are configured with single Anycast IP address. Ex: 10.10.10.10

[Diagram: User, ISE-PSN-1, ISE-PSN-2]
ISE Configuration for Anycast
On each PSN that will participate in Anycast…
1. Configure PSN probes to profile DHCP (IP Helper), SNMP Traps, or NetFlow on dedicated interface
2. From CLI, configure dedicated interface with same IP address on each PSN node.

ISE-PSN-1 Example:
ise-psn-1/admin# config t
ise-psn-1/admin (config)# int GigabitEthernet1
ise-psn-1/admin (config-GigabitEthernet)# ip address 10.10.10.10 255.255.255.0

ISE-PSN-2 Example:
ise-psn-2/admin# config t
ise-psn-2/admin (config)# int GigabitEthernet1
ise-psn-2/admin (config-GigabitEthernet)# ip address 10.10.10.10 255.255.255.0
Routing Configuration for Anycast
Sample Configuration

Both switches advertise same network used for profiling but different metrics.

• Access Switch 1

interface gigabitEthernet 1/0/23
 no switchport
 ip address 10.10.10.50 255.255.255.0
!
router eigrp 100
 no auto-summary
 redistribute connected route-map CONNECTED-2-EIGRP
!
route-map CONNECTED-2-EIGRP permit 10
 match ip address prefix-list 5
 set metric 1000 100 255 1 1500
 set metric-type internal
!
route-map CONNECTED-2-EIGRP permit 20
ip prefix-list 5 seq 5 permit 10.10.10.0/24

• Access Switch 2

interface gigabitEthernet 1/0/23
 no switchport
 ip address 10.10.10.51 255.255.255.0
!
router eigrp 100
 no auto-summary
 redistribute connected route-map CONNECTED-2-EIGRP
!
route-map CONNECTED-2-EIGRP permit 10
 match ip address prefix-list 5
 ! less preferred
 set metric 500 50 255 1 1500
 set metric-type external
!
route-map CONNECTED-2-EIGRP permit 20
ip prefix-list 5 seq 5 permit 10.10.10.0/24
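
To see why the second switch's seed metric is less preferred, the two `set metric` values can be run through the classic EIGRP composite metric. The sketch below assumes default K values (K1 = K3 = 1, all others 0), bandwidth expressed in kbps, and delay in tens of microseconds; it ignores any cost added along the path and the effect of the metric type, so treat the numbers as a rough comparison of the seed values only.

```python
def eigrp_classic_metric(bandwidth_kbps: int, delay_tens_usec: int) -> int:
    """Classic EIGRP composite metric with default K values.

    metric = 256 * (10^7 / bandwidth_kbps + delay), using integer division
    for the bandwidth term.
    """
    if bandwidth_kbps <= 0:
        raise ValueError("bandwidth must be positive")
    return 256 * (10_000_000 // bandwidth_kbps + delay_tens_usec)


if __name__ == "__main__":
    switch1 = eigrp_classic_metric(1000, 100)   # set metric 1000 100 255 1 1500
    switch2 = eigrp_classic_metric(500, 50)     # set metric 500 50 255 1 1500
    print("Access Switch 1:", switch1)
    print("Access Switch 2:", switch2)
    print("Preferred:", "Switch 1" if switch1 < switch2 else "Switch 2")
```

Under these assumptions the lower bandwidth on Access Switch 2 outweighs its lower delay, so the route via Access Switch 1 carries the lower (preferred) metric.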
NAD-Based RADIUS Server Redundancy (IOS)
Multiple RADIUS Servers Defined in Access Device

• Configure Access Devices with multiple RADIUS Servers.
• Fallback to secondary servers if primary fails

[Diagram: User -> access device -> RADIUS Auth to PSN1 (10.1.2.3), PSN2 (10.4.5.6), PSN3 (10.7.8.9)]

radius-server host 10.1.2.3 auth-port 1812 acct-port 1813
radius-server host 10.4.5.6 auth-port 1812 acct-port 1813
radius-server host 10.7.8.9 auth-port 1812 acct-port 1813
NAD-Based Redundancy to Different LB Clusters
RADIUS Example – Different RADIUS VIP Addresses

• Configure access devices with each PSN LB cluster VIP as a RADIUS Server.
• Fallback to secondary DC if primary DC fails

[Diagram: User -> Network Access Device -> RADIUS Auth to DC #1: LB-1 (10.1.98.8) with PSN1 (10.1.99.5), PSN2 (10.1.99.6), PSN3 (10.1.99.7); DC #2: LB-2 (10.2.100.2) with PSN1 (10.2.101.5), PSN2 (10.2.101.6), PSN3 (10.2.101.7)]

radius-server host 10.1.98.8 auth-port 1812 acct-port 1813
radius-server host 10.2.100.2 auth-port 1812 acct-port 1813
NAD-Based Redundancy to Different LB Clusters
RADIUS Example – Single RADIUS VIP Address using Anycast

• Configure access devices with each PSN LB cluster VIP as a RADIUS Server.
• Fallback to secondary DC if primary DC fails

[Diagram: User -> Network Access Device -> RADIUS Auth to DC #1: LB-1 (10.1.98.8) with PSN1 (10.1.99.5), PSN2 (10.1.99.6), PSN3 (10.1.99.7); DC #2: LB-2 (10.1.98.8) with PSN1 (10.2.101.5), PSN2 (10.2.101.6), PSN3 (10.2.101.7)]

radius-server host 10.1.98.8 auth-port 1812 acct-port 1813

NAD-Based Redundancy to Different LB Clusters
Profiling Example – Different DHCP VIP Addresses

• Configure access devices with each PSN or cluster VIP as an IP Helper.
• Both Data Centers receive copy of DHCP Profiling data

[Diagram: User -> Network Access Device -> DHCP Relay to DC #1: LB1 (10.1.98.11) with PSN1 (10.1.99.5), PSN2 (10.1.99.6), PSN3 (10.1.99.7); DC #2: LB2 (10.2.100.3) with PSN1 (10.2.101.5), PSN2 (10.2.101.6), PSN3 (10.2.101.7)]

interface VLAN 10
 ip address A.B.C.D 255.255.255.0
 ip helper-address X.X.X.X # Real
 ip helper-address 10.1.98.11 # LB1
 ip helper-address 10.2.100.3 # LB2
NAD-Based Redundancy to Different LB Clusters
Profiling Example – Single DHCP VIP Address using Anycast

• Configure access devices with a single IP Helper, shared across PSNs or VIPs.
• Fallback to secondary DC if routing to primary DC fails

[Diagram: User -> Network Access Device -> DHCP Relay to DC #1: LB1 (10.1.98.11) with PSN1 (10.1.99.5), PSN2 (10.1.99.6), PSN3 (10.1.99.7); DC #2: LB2 (10.1.98.11) with PSN1 (10.2.101.5), PSN2 (10.2.101.6), PSN3 (10.2.101.7)]

interface VLAN 10
 ip address A.B.C.D 255.255.255.0
 ip helper-address X.X.X.X # Real
 ip helper-address 10.1.98.11 # Anycast
IOS-Based RADIUS Server Load Balancing
Switch Dynamically Distributes Requests to Multiple RADIUS Servers

• RADIUS LB feature distributes batches of AAA transactions to servers within a group.
• Each batch assigned to server with least number of outstanding transactions.
• NAD controls the load distribution of AAA requests to all PSNs in RADIUS group without dedicated LB.

[Diagram: User 1 and User 2 -> access device -> RADIUS to PSN1 (10.1.2.3), PSN2 (10.4.5.6), PSN3 (10.7.8.9)]

radius-server host 10.1.2.3 auth-port 1812 acct-port 1813
radius-server host 10.4.5.6 auth-port 1812 acct-port 1813
radius-server host 10.7.8.9 auth-port 1812 acct-port 1813
radius-server load-balance method least-outstanding batch-size 5
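
The batch behavior described above can be approximated with a short simulation: pick the server with the fewest outstanding transactions, send it a whole batch, then pick again. This is a simplified model for intuition, not the IOS implementation; the class and method names are hypothetical, and ties are assumed to resolve in configured server order.

```python
from typing import Dict, List, Optional


class LeastOutstandingBalancer:
    """Assign AAA transactions in batches to the least-loaded server."""

    def __init__(self, servers: List[str], batch_size: int = 5) -> None:
        if not servers or batch_size < 1:
            raise ValueError("need at least one server and batch_size >= 1")
        self.servers = list(servers)
        self.batch_size = batch_size
        self.outstanding: Dict[str, int] = {s: 0 for s in self.servers}
        self._current: Optional[str] = None
        self._left_in_batch = 0

    def assign(self) -> str:
        """Return the server for the next transaction."""
        if self._left_in_batch == 0:
            # New batch: choose the server with the fewest outstanding
            # transactions (first in list wins a tie).
            self._current = min(self.servers, key=lambda s: self.outstanding[s])
            self._left_in_batch = self.batch_size
        self._left_in_batch -= 1
        assert self._current is not None
        self.outstanding[self._current] += 1
        return self._current

    def complete(self, server: str) -> None:
        """Record that a transaction on this server has been answered."""
        if self.outstanding[server] > 0:
            self.outstanding[server] -= 1


if __name__ == "__main__":
    lb = LeastOutstandingBalancer(["10.1.2.3", "10.4.5.6", "10.7.8.9"], 5)
    print([lb.assign() for _ in range(15)])
```

A larger batch size reduces how often the switch re-evaluates server load but makes the distribution coarser, which is one reason the per-NAD spread is described as reasonable rather than perfectly even.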
IOS-Based RADIUS Server Load Balancing
Sample Live Log
• Use test aaa group command from IOS CLI to test RADIUS auth requests
• Reasonable load distribution across all PSNs
• Example shows 3 PSNs in RADIUS group

cat3750x# test aaa group radius radtest cisco123 new users 4 count 50
AAA/SG/TEST: Sending 50 Access-Requests @ 10/sec, 0 Accounting-Requests @ 10/sec
NAD-Based RADIUS Redundancy (WLC)
Wireless LAN Controller

• Multiple RADIUS Auth & Accounting Server Definitions
• RADIUS Fallback options: none, passive, or active
  o Off = Continue exhaustively through list; never preempt to preferred server (entry with lowest index)
  o Passive = Quarantine failed RADIUS server for interval then return to active list w/o validation; always preempt.
  o Active = Mark failed server dead then actively probe status per interval w/username until succeed before return to list; always preempt.

http://www.cisco.com/en/US/products/ps6366/products_configuration_example09186a008098987e.shtml
HA/LB Summary Table
Comparison of Various HA/LB Methods

Local Load Balancers
  Where Configured? Centrally using LB near PSN cluster
  Primary Use Cases: RADIUS, HTTP/S, Profiling
  Pros: Large scaling, fast failover, better load distribution, in/out servicing, single IP
  Cons: Higher cost and complexity

DNS/Global LB
  Where Configured? Centrally using DNS
  Primary Use Cases: LWA / Sponsor / MDP Portals
  Pros: Large scaling, better load distribution, in/out servicing, single URL
  Cons: Somewhat higher cost and complexity

Anycast
  Where Configured? Centrally using routing
  Primary Use Cases: Web Portals, Profiling
  Pros: Lower cost, supports simple route-based distribution, in/out service, single IP
  Cons: Higher complexity

NAD RADIUS Server List
  Where Configured? Distributed in local NAD config
  Primary Use Cases: RADIUS
  Pros: Low cost and complexity, deterministic distribution
  Cons: Management of distributed lists, poor load distribution

IOS RADIUS LB
  Where Configured? Distributed in local NAD config
  Primary Use Cases: RADIUS
  Pros: Low cost and complexity, better per-NAD load distribution
  Cons: Management of distributed lists
NAD Fallback and Recovery

Common Questions
Q: How does NAD detect failed RADIUS servers?
A: Test Probes and Test User accounts
Q: What is the default behavior when ALL RADIUS servers down?
A: Unless using ‘authentication open’ no access is granted for unauthorized ports.
Q: Which fallback methods are available?
A: Critical Authentication VLAN for Data and Voice; Critical ACLs; EEM controls
Q: What is the impact of using VLAN-based fallback methods?
A: Users may still be blocked by port ACLs or may not get IP if VLAN changes
Q: Which recovery methods are available?
A: Reinitialize ports when RADIUS server available
NAD Fallback and Recovery
Dead RADIUS Server Detection and Recovery
Example using radius-server host

In this example, servers are marked “dead” if there is no response in 60 seconds (1 transmit + 3 retransmits w/15 second timeout). After 2 minutes, the RADIUS test probe will retry the server and mark it “alive” if it responds; otherwise it rechecks every 2 minutes (deadtime). Some releases may require idle-time to be set lower than dead-time. (CSCtr61120)

interface X
 authentication event fail action next-method
 ! Move new hosts to specified critical data VLAN
 authentication event server dead action reinitialize vlan 11
 ! Authorize new phones to voice VLAN
 authentication event server dead action authorize voice
 ! Reauthenticate endpoints on port once server “alive”
 authentication event server alive action reinitialize
 ! Deny access to violating host but do not disable port
 authentication violation restrict
!
! Conditions to mark server as “dead” (Ex: 60 sec.)
radius-server dead-criteria time 15 tries 3
! Minutes before retrying server marked as “dead”
radius-server deadtime 2
!
! Throttle requests for critical ports once server “alive”
authentication critical recovery delay 1000
! Send EAPOL-Success when auth critical port
dot1x critical eapol
!
! Permit access if no dACL returned with successful auth
epm access-control open
!
! RADIUS server definition including periodic test to detect server dead/alive:
!   username ‘radtest’: Locally defined test user to auth
!   idle-time: default = 60 = “Send test probe 1 per hour”
!   ignore-acct-port: Test auth-port only
radius-server host 10.1.98.8 auth-port 1812 acct-port 1813 test username radtest ignore-acct-port key cisco123
! Fallback RADIUS server if primary server fails
radius-server host 10.2.101.3 auth-port 1812 acct-port 1813 test username radtest ignore-acct-port key cisco123
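
The timing described for this example (60 seconds to declare a server dead, then a recheck every 2 minutes) follows directly from the configured values. The short sketch below only reproduces that arithmetic as the slide states it; the function names are hypothetical, and actual behavior can vary by IOS release.

```python
from typing import List


def seconds_until_dead(timeout_s: int, retransmits: int) -> int:
    """One initial transmit plus the retransmits, each waiting one timeout."""
    return timeout_s * (1 + retransmits)


def recheck_times_min(deadtime_min: int, probes: int) -> List[int]:
    """Minutes after the server was marked dead at which it is retried."""
    return [deadtime_min * n for n in range(1, probes + 1)]


if __name__ == "__main__":
    # radius-server dead-criteria time 15 tries 3
    print("Seconds until dead:", seconds_until_dead(15, 3))
    # radius-server deadtime 2
    print("Recheck at minutes:", recheck_times_min(2, 3))
```

Lowering the timeout or retry count shortens how long endpoints wait before critical-VLAN fallback applies, at the cost of marking servers dead more readily during brief congestion.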
NAD Fallback and Recovery
‘aaa radius group’ Example

• Similar configuration as previous example but using aaa radius group and radius server host commands
• radius server host defines individual RADIUS servers with separate lines for config parameters
• aaa radius group defines RADIUS group with individual server entries listed

interface X
 authentication event fail action next-method
 authentication event server dead action reinitialize vlan 11
 authentication event server dead action authorize voice
 authentication event server alive action reinitialize
 authentication violation restrict
!
authentication critical recovery delay 1000
dot1x critical eapol
epm access-control open
radius-server dead-criteria time 15 tries 3
radius-server deadtime 2
!
aaa group server radius psn-clusters
 server name psn-cluster1
 server name psn-cluster2
!
radius server psn-cluster1
 address ipv4 10.1.98.8 auth-port 1812 acct-port 1813
 automate-tester username radtest ignore-acct-port key cisco123
!
radius server psn-cluster2
 address ipv4 10.2.101.3 auth-port 1812 acct-port 1813
 automate-tester username radtest ignore-acct-port key cisco123
NAD Fallback and Recovery Sequence

[Sequence diagram: Endpoint, connected by a Layer 2 point-to-point link to the Access Switch, which reaches the Policy Service Node (PSN, ise-psn.cts.local) over a Layer 3 link]

Detection
• Endpoint is on Access VLAN 10 (or its authorized VLAN).
• The switch sends an Auth Request to the PSN and retries after each 15-second auth timeout; when the last retry times out, the server is declared dead.
  radius-server dead-criteria time 15 tries 3

Dead
• SERVER DEAD: the switch authorizes Critical VLAN 11; traffic is permitted on the critical VLAN per the port ACL.
• The switch waits for the deadtime (2 minutes), then sends a test request; with no response, it repeats the test request after each deadtime.
  radius-server deadtime 2
  authentication event server dead action reinitialize vlan 11

Recovery
• SERVER ALIVE: a deadtime test request receives a reply.
• The switch reinitializes the port and sets the access VLAN per the recovery interval; traffic is permitted per RADIUS authorization.
• Test requests then continue at the 60-minute idle-time.
  authentication event server alive action reinitialize
  radius-server host ... test username radtest idle-time 60 key cisco123
  authentication critical recovery delay 1000
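The dead and recovery phases of this sequence can be sketched as a small Python simulation. This is an illustration only: `recovery_timeline` and its event strings are hypothetical, and the model assumes one test request per deadtime interval, as drawn in the sequence.

```python
def recovery_timeline(probe_replies, deadtime_min=2):
    """Return (minute, event) tuples starting at the moment the server is marked dead.

    probe_replies holds one boolean per deadtime test request:
    True means the PSN answered that probe.
    """
    events = [(0, "SERVER DEAD: authorize critical VLAN")]
    for n, replied in enumerate(probe_replies, start=1):
        minute = n * deadtime_min
        if replied:
            events.append((minute, "SERVER ALIVE: reinitialize port"))
            break  # later probes fall back to the idle-time interval
        events.append((minute, "no response: stay in critical VLAN"))
    return events


for minute, event in recovery_timeline([False, False, True]):
    print(minute, event)
```

In this run the first two probes go unanswered and the third succeeds, so the port would be reinitialized six minutes after the server was marked dead.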
RADIUS Test User Account
Which User Account Should Be Used?
• Does the NAD treat Auth Fail and Auth Success the same when detecting server health?
  IOS treats them the same; the ACE RADIUS probe treats Auth Fail as server down.

• IOS example: If the goal is to validate the backend ID store, an Auth Fail response may not reveal an external ID store failure.
  Solution: Drop authentication requests when the external ID store is down.
  • Identity Source Sequence > Advanced Settings
  • Authentication Policy > ID source custom processing based on authentication results

• ACE example: If authentication fails, the PSN is declared down.
  Solution: Create a valid user account so that ACE test probes return Access-Accept.
  • Could this present a potential security risk?
RADIUS Test User Account
Access-Accept or Access-Reject?

• If a valid user account is used, how do you prevent unauthorized access using the probe account?
  If Auth Fail is treated as probe failure, a valid account is needed in the ISE database or an external store.
  • Match authentications from probes to a specific source/NDG, Service Type, or User Name.
  • Allow AuthN to succeed, but return an AuthZ result that denies access.

Example result: Access-Accept with dACL = deny ip any any
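The "allow AuthN, deny via AuthZ" idea can be sketched as a simple decision function. This is not ISE policy syntax or an ISE API; `authorize`, the result dictionary, and the default probe user name are illustrative assumptions (the name `radtest` is borrowed from the earlier switch examples).

```python
DENY_ALL_DACL = "deny ip any any"


def authorize(username, authenticated, probe_user="radtest"):
    """Model the RADIUS result for a request.

    The probe account authenticates successfully (so the NAD or load
    balancer sees Access-Accept) but is handed a dACL that blocks all traffic.
    """
    if not authenticated:
        return {"result": "Access-Reject", "dacl": None}
    if username == probe_user:
        return {"result": "Access-Accept", "dacl": DENY_ALL_DACL}
    return {"result": "Access-Accept", "dacl": None}


print(authorize("radtest", True))
print(authorize("employee1", True))
```

A real policy would more likely match on the source device or NDG as well as the user name, as the bullets above suggest.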
Inaccessible Authentication Bypass (IAB)
Also Known As “Critical Auth VLAN” for Data

[Diagram: WAN or PSN down; the port moves from the access VLAN to the critical VLAN]

• Switch detects PSN unavailable by one of two methods
  • Periodic probe
  • Failure to respond to AAA request
• Enables port in critical VLAN
• Existing sessions retain authorization status
• Recovery action can re-initialize port when AAA returns

Critical VLAN can be anything:
• Same as default access VLAN
• Same as guest/auth-fail VLAN
• New VLAN

authentication event server dead action authorize vlan 100
authentication event server alive action reinitialize
Critical Auth for Data VLAN

Sample Configuration
radius-server host 10.1.10.50 test username KeepAliveUser key cisco
radius-server dead-criteria time 15 tries 3
radius-server deadtime 1

interface GigabitEthernet1/13
switchport access vlan 2
switchport mode access
switchport voice vlan 200
authentication event fail action next-method
authentication event server dead action authorize vlan 100
authentication event server alive action reinitialize
authentication order dot1x mab
dot1x pae authenticator
authentication port-control auto
dot1x timeout tx-period 10
dot1x max-req 2
mab
spanning-tree portfast
Critical Auth for Data
[Diagram: PSN unreachable; data VLAN enabled]

interface fastEthernet 3/48
 dot1x pae authenticator
 authentication port-control auto
 authentication event server dead action authorize vlan x

Critical Auth for Voice VLAN (CVV)
[Diagram: PSN unreachable; voice VLAN enabled]

interface fastEthernet 3/48
 dot1x pae authenticator
 authentication port-control auto
 authentication event server dead action authorize voice

Critical Auth for Data and Voice
[Diagram: PSN unreachable; voice VLAN and data VLAN enabled]

interface fastEthernet 3/48
 dot1x pae authenticator
 authentication port-control auto
 authentication event server dead action authorize vlan x
 authentication event server dead action authorize voice

# show authentication sessions interface fa3/48
...
Critical Authorization is in effect for domain(s) DATA and VOICE
Multiple Hosts and Critical Auth
Critical Auth for Data and Voice

• Multi-MDA:
  Router(config-if)# authentication event server dead action authorize vlan 10
  Router(config-if)# authentication event server dead action authorize voice
  Behavior: Existing data sessions stay authorized in current VLAN; new sessions are authorized to VLAN 10

• Multi-Auth:
  Router(config-if)# authentication event server dead action reinitialize vlan 10
  Router(config-if)# authentication event server dead action authorize voice
  Behavior: All existing data sessions are re-authorized to VLAN 10; new sessions are authorized to VLAN 10

• Catalyst Switch Support:
  Series    Multi-Auth w/VLAN             Critical Auth for Voice
  2k/3k     12.2(55)SE                    15.0(1)SE
  4k        15.0(2)SG / IOS XE 3.2.0SG    15.0(2)SG / IOS XE 3.2.0SG
  6k        12.2(33)SXJ                   12.2(33)SXJ1
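The difference between the "authorize" and "reinitialize" actions for data sessions can be modeled in a few lines of Python. This is a sketch of the behavior statements on this slide only; `apply_critical_auth` and the host names are hypothetical.

```python
def apply_critical_auth(existing, new_hosts, action, critical_vlan):
    """Return host -> data VLAN after the RADIUS servers are declared dead.

    existing:  dict of already-authorized hosts and their current VLAN
    new_hosts: hosts that connect while the servers are dead
    action:    "authorize" (existing sessions keep their VLAN) or
               "reinitialize" (existing sessions move to the critical VLAN)
    """
    if action not in ("authorize", "reinitialize"):
        raise ValueError("action must be 'authorize' or 'reinitialize'")
    result = {}
    for host, vlan in existing.items():
        result[host] = critical_vlan if action == "reinitialize" else vlan
    for host in new_hosts:
        result[host] = critical_vlan  # new sessions always land in the critical VLAN
    return result


print(apply_critical_auth({"pc1": 20}, ["pc2"], "authorize", 10))
print(apply_critical_auth({"pc1": 20}, ["pc2"], "reinitialize", 10))
```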
Default Port ACL Issues with No dACL Authorization
Limited Access If ISE Policy Fails to Return dACL!

• User authentication succeeds, but the authorization profile does not include a dACL to permit access, so endpoint access is still restricted by the existing port ACL!

[Flow: EAP Request → RADIUS Access-Request → Auth Success on PSN; RADIUS Access-Accept (Authorization Profile = Employee, Access Type = Access-Accept, NO dACL) → EAP Success. Result: only DHCP/DNS/PING/TFTP allowed.]

interface GigabitEthernet1/0/2
 switchport access vlan 10
 switchport voice vlan 13
 ip access-group ACL-DEFAULT in

ip access-list extended ACL-DEFAULT
 permit udp any eq bootpc any eq bootps
 permit udp any any eq domain
 permit icmp any any
 permit udp any any eq tftp
Protecting Against “No dACL” Authorization
EPM Access Control
• If authentication is successful and no dACL is returned, a permit ip host any entry is created for the host. This entry is created only if no ACLs are downloaded from ISE.
• Supported releases: 2k/3k: 12.2(55)SE; 4k: 12.2(54)G; 6k: 15.2(1)SY

[Flow: EAP Request → RADIUS Access-Request → Auth Success on PSN; RADIUS Access-Accept (Authorization Profile = Employee, Access Type = Access-Accept, NO dACL) → EAP Success. With EPM access control open, "permit ip any any" is inserted at the top of the port ACL. Result: ALL traffic allowed.]

epm access-control open

interface GigabitEthernet1/0/2
 switchport access vlan 10
 switchport voice vlan 13
 ip access-group ACL-DEFAULT in

ip access-list extended ACL-DEFAULT
 permit udp any eq bootpc any eq bootps
 permit udp any any eq domain
 permit icmp any any
 permit udp any any eq tftp
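The effect of EPM open access on the port ACL can be sketched as follows. This is a simplified model: `effective_acl` is hypothetical, and the assumption that dACL entries are evaluated ahead of the static port ACL is a simplification that is not taken from the slides.

```python
ACL_DEFAULT = [
    "permit udp any eq bootpc any eq bootps",
    "permit udp any any eq domain",
    "permit icmp any any",
    "permit udp any any eq tftp",
]


def effective_acl(port_acl, dacl, epm_open, host_ip):
    """Return the ordered ACL entries that apply to one authorized host."""
    if dacl:
        # Assumption: downloaded entries sit ahead of the static port ACL.
        return list(dacl) + list(port_acl)
    if epm_open:
        # No dACL returned: a per-host permit entry is created.
        return ["permit ip host %s any" % host_ip] + list(port_acl)
    # No dACL and no EPM open access: only the default port ACL applies.
    return list(port_acl)


print(effective_acl(ACL_DEFAULT, None, True, "10.1.10.25")[0])
print(len(effective_acl(ACL_DEFAULT, None, False, "10.1.10.25")))
```

The host address 10.1.10.25 is an arbitrary example value.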
Default Port ACL Issues with Critical VLAN
Limited Access Even After Authorization to New VLAN!
• Data VLAN is reassigned to the critical auth VLAN, but new (or reinitialized) connections are still restricted by the existing port ACL!

[Diagram: WAN or PSN down; port Gi1/0/2 with access, critical, and voice VLANs. Default ACL: only DHCP/DNS/PING/TFTP allowed.]

interface GigabitEthernet1/0/2
 switchport access vlan 10
 switchport voice vlan 13
 ip access-group ACL-DEFAULT in
 authentication event server dead action reinitialize vlan 11
 authentication event server dead action authorize voice
 authentication event server alive action reinitialize

ip access-list extended ACL-DEFAULT
 permit udp any eq bootpc any eq bootps
 permit udp any any eq domain
 permit icmp any any
 permit udp any any eq tftp
Critical VLAN w/o Explicit Default Port ACL
Low Impact versus Closed Mode

• One solution to the dACL + Critical Auth VLAN issue is to simply remove the port ACL!
• No static port ACL is required for dACLs in current 2k/3k/4k. (2k/3k: 12.2(55)SE; 4k: 12.2(54)G; 6k: 15.2(1)SY)

• Low Impact Mode Use Case:
  • Initial access permits all traffic
  • Pro: Immediately allows access to critical services for all endpoints, including PXE and WoL devices
  • Con: Temporary window which allows any unauthenticated endpoint to get full access

• Closed Mode Use Case:
  • No initial access, but default authorization can assign a default access policy (typically CWA)
  • Pro: No access until port authorized
  • Con: Some endpoints may fail due to timing requirements, such as PXE or WoL
Using Embedded Event Manager with Critical VLAN
Modify or Remove/Add Static Port ACLs Based on PSN Availability
• EEM available on 3k/4k/6k
• Allows scripted actions to occur based on various conditions and triggers
• Single RADIUS server (LB VIP) example shown. Multi-server option: %RADIUS-3-ALLDEADSERVER

event manager applet default-acl-fallback
 event syslog pattern "%RADIUS-4-RADIUS_DEAD" maxrun 5
 action 1.0 cli command "enable"
 action 1.1 cli command "conf t" pattern "CNTL/Z."
 action 2.0 cli command "ip access-list extended ACL-DEFAULT"
 action 3.0 cli command "1 permit ip any any"
 action 4.0 cli command "end"

event manager applet default-acl-recovery
 event syslog pattern "%RADIUS-4-RADIUS_ALIVE" maxrun 5
 action 1.0 cli command "enable"
 action 1.1 cli command "conf t" pattern "CNTL/Z."
 action 2.0 cli command "ip access-list extended ACL-DEFAULT"
 action 3.0 cli command "no 1 permit ip any any"
 action 4.0 cli command "end"
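Because the fallback and recovery applets differ only in their name, syslog pattern, and one ACL command, they could be generated from a template. The Python sketch below is one hedged way to do that; `eem_applet` is a hypothetical helper, and the generated text simply mirrors the applet lines shown on this slide.

```python
def eem_applet(name, syslog_pattern, acl_name, acl_entry):
    """Return the EEM applet lines that apply one ACL edit when a syslog pattern is seen."""
    return [
        "event manager applet %s" % name,
        ' event syslog pattern "%s" maxrun 5' % syslog_pattern,
        ' action 1.0 cli command "enable"',
        ' action 1.1 cli command "conf t" pattern "CNTL/Z."',
        ' action 2.0 cli command "ip access-list extended %s"' % acl_name,
        ' action 3.0 cli command "%s"' % acl_entry,
        ' action 4.0 cli command "end"',
    ]


fallback = eem_applet("default-acl-fallback", "%RADIUS-4-RADIUS_DEAD",
                      "ACL-DEFAULT", "1 permit ip any any")
recovery = eem_applet("default-acl-recovery", "%RADIUS-4-RADIUS_ALIVE",
                      "ACL-DEFAULT", "no 1 permit ip any any")
print("\n".join(fallback + recovery))
```

For the multi-server case, the same helper could be called with the %RADIUS-3-ALLDEADSERVER pattern noted above.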
EEM Example
Remove and Add Port ACL on RADIUS Server Status Syslogs
• Port ACLs block new user connections during Critical Auth

[Diagram: WAN or PSN down; port Gi1/0/5 moves from the access VLAN to the critical VLAN. With ACL-DEFAULT applied, only DHCP/DNS/PING/TFTP is allowed; once EEM removes it, all user traffic is allowed.]

• EEM detects syslog message %RADIUS-3-ALLDEADSERVER: Group radius: No active radius servers found and removes ACL-DEFAULT.
• EEM detects syslog message %RADIUS-6-SERVERALIVE: Group radius: Radius server 10.1.98.8:1812,1813 is responding again (previously dead) and adds ACL-DEFAULT.

event manager applet remove-default-acl
 event syslog pattern "%RADIUS-4-RADIUS_DEAD" maxrun 5
 action 1.0 cli command "enable"
 action 1.1 cli command "conf t" pattern "CNTL/Z."
 action 2.0 cli command "interface range gigabitEthernet 1/0/1 - 24"
 action 3.0 cli command "no ip access-group ACL-DEFAULT in"
 action 4.0 cli command "end"

event manager applet add-default-acl
 event syslog pattern "%RADIUS-4-RADIUS_ALIVE" maxrun 5
 action 1.0 cli command "enable"
 action 1.1 cli command "conf t" pattern "CNTL/Z."
 action 2.0 cli command "interface range gigabitEthernet 1/0/1 - 24"
 action 3.0 cli command "ip access-group ACL-DEFAULT in"
 action 4.0 cli command "end"
EEM Example 2
Modify Port ACL Based on Route Tracking

cat6500(config)# track 1 ip route 10.1.98.0 255.255.255.0 reachability

cat6500(config)# event manager applet default-acl-fallback


cat6500(config-applet)# event track 1 state down maxrun 5
cat6500(config-applet)# action 1.0 cli command "enable"
cat6500(config-applet)# action 1.1 cli command "conf t" pattern "CNTL/Z."
cat6500(config-applet)# action 2.0 cli command "ip access-list extended ACL-DEFAULT"
cat6500(config-applet)# action 3.0 cli command "1 permit ip any any"
cat6500(config-applet)# action 4.0 cli command "end"

cat6500(config)# event manager applet default-acl-recovery


cat6500(config-applet)# event track 1 state up maxrun 5
cat6500(config-applet)# action 1.0 cli command "enable"
cat6500(config-applet)# action 1.1 cli command "conf t" pattern "CNTL/Z."
cat6500(config-applet)# action 2.0 cli command "ip access-list extended ACL-DEFAULT"
cat6500(config-applet)# action 3.0 cli command "no 1 permit ip any any"
cat6500(config-applet)# action 4.0 cli command "end"
Using Embedded Event Manager with Critical VLAN
Modify or Remove/Add Static Port ACLs Based on PSN Availability
• Allows scripted actions to occur based on various conditions and triggers
• EEM available on Catalyst 3k/4k/6k switches
• EEM Policy Builder: www.progrizon.com/support/pb/pb.php

track 1 ip route 10.1.98.0 255.255.255.0 reachability

event manager applet default-acl-fallback
 event track 1 state down maxrun 5
 action 1.0 cli command "enable"
 action 1.1 cli command "conf t" pattern "CNTL/Z."
 action 2.0 cli command "ip access-list extended ACL-DEFAULT"
 action 3.0 cli command "1 permit ip any any"
 action 4.0 cli command "end"

event manager applet default-acl-recovery
 event track 1 state up maxrun 5
 action 1.0 cli command "enable"
 action 1.1 cli command "conf t" pattern "CNTL/Z."
 action 2.0 cli command "ip access-list extended ACL-DEFAULT"
 action 3.0 cli command "no 1 permit ip any any"
 action 4.0 cli command "end"
Critical ACL using Service Policy Templates
Apply ACL, VLAN, or SGT on RADIUS Server Failure!

• Critical Auth ACL applied on Server Down
• Default ACL: Only DHCP/DNS/PING/TFTP allowed!
• Critical ACL: Deny PCI networks; Permit Everything Else!
• Supported releases: 2k/3k/4k: 15.2(1)E; 3k IOS-XE: 3.3.0SE; 4k IOS-XE: 3.5.0E; 6k: 15.2(1)SY

[Diagram: WAN or PSN down; port Gi1/0/2 with access, critical, and voice VLANs]

interface GigabitEthernet1/0/2
 switchport access vlan 10
 switchport voice vlan 13
 ip access-group ACL-DEFAULT in
 access-session port-control auto
 mab
 dot1x pae authenticator
 service-policy type control subscriber ACCESS-POLICY

ip access-list extended ACL-DEFAULT
 permit udp any eq bootpc any eq bootps
 permit udp any any eq domain
 permit icmp any any
 permit udp any any eq tftp

policy-map type control subscriber ACCESS-POLICY
 event authentication-failure match-first
  10 class AAA_SVR_DOWN_UNAUTHD do-until-failure
   10 activate service-template CRITICAL_AUTH_VLAN
   20 activate service-template DEFAULT_CRITICAL_VOICE_TEMPLATE
   30 activate service-template CRITICAL-ACCESS

service-template CRITICAL-ACCESS
 access-group ACL-CRITICAL
!
service-template CRITICAL_AUTH_VLAN
 vlan 10
service-template DEFAULT_CRITICAL_VOICE_TEMPLATE

ip access-list extended ACL-CRITICAL
 remark Deny access to PCI zone scopes
 deny tcp any 172.16.8.0 255.255.240.0
 deny udp any 172.16.8.0 255.255.240.0
 deny ip any 192.168.0.0 255.255.0.0
 permit ip any any
Critical MAB
Local Authentication during Server Failure!

[Diagram: WAN link to the PSN is down; endpoints 000c.293c.8dca and 000c.293c.331e connect to the access switch]

username 000c293c8dca password 0 000c293c8dca
username 000c293c8dca aaa attribute list mab-local
!
aaa local authentication default authorization mab-local
aaa authorization credential-download mab-local local
!
aaa attribute list mab-local
 attribute type tunnel-medium-type all-802
 attribute type tunnel-private-group-id "150"
 attribute type tunnel-type vlan
 attribute type inacl "CRITICAL-V4"
!
policy-map type control subscriber ACCESS-POL
 ...
 event authentication-failure match-first
  10 class AAA_SVR_DOWN_UNAUTHD_HOST do-until-failure
   10 terminate mab
   20 terminate dot1x
   30 authenticate using mab aaa authc-list mab-local authz-list mab-local
 ...

• Additional level of check to authorize hosts during a critical condition.
• EEM scripts could be used for dynamic update of whitelist MAC addresses.
• Sessions re-initialize once the server connectivity resumes.
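Keeping the local MAB whitelist current means producing a pair of username lines per MAC address. The Python sketch below shows one way such lines could be generated offline; `mab_username` and `local_mab_lines` are hypothetical helpers, and the output follows the two username commands shown on this slide.

```python
import re


def mab_username(mac):
    """Normalize a MAC such as 000c.293c.8dca to the 12-hex-digit form used as the username."""
    digits = re.sub(r"[.:-]", "", mac).lower()
    if not re.fullmatch(r"[0-9a-f]{12}", digits):
        raise ValueError("not a valid MAC address: %r" % mac)
    return digits


def local_mab_lines(macs, attr_list="mab-local"):
    """Return the local username lines for each whitelisted MAC address."""
    lines = []
    for mac in macs:
        user = mab_username(mac)
        lines.append("username %s password 0 %s" % (user, user))
        lines.append("username %s aaa attribute list %s" % (user, attr_list))
    return lines


print("\n".join(local_mab_lines(["000c.293c.8dca"])))
```

How the resulting lines are pushed to the switch (EEM, a management tool, or manual paste) is left open here.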
High Availability
Additional Considerations
• HA design should support any single device failure (PSN, switch, LB, ESX host, etc.)

• NAD communications to central Policy Service node
  • Will access devices list multiple backup RADIUS servers (PSNs) in sequence?
  • Will a load balancer be used whereby access devices point to a single RADIUS server?
  • What is the fallback policy, if any, when access to the Policy Service node is unavailable?

• NAD communications to local Policy Service node
  • A PSN can service requests without connectivity to other nodes, but some functions, like creation of new guests or profiled endpoints, require access to the Primary Admin node.
  • Note: Promotion of the Secondary Administration node is currently a manual process.

• Profiling
  • Is distributed profiling required to support specific collection methods?
  • Some profiling techniques like SPAN assume a specific PSN connection. If that particular PSN is not available, you will need to move SPAN to another PSN or else consider duplicating profiling data to more than one node.
  • SNMP Poll Probe is not automatically reassigned; the failed PSN must be deregistered.

• LWA
  • Access devices like the WLC typically support the entry of only a single URL for LWA to an external server. DNS LB or Anycast may be an option. If not using LB, the WLC URL may need to be changed if the target PSN is down.
Key Performance Metrics (KPM)
KPM in a Nutshell

What is KPM?
• KPM stands for Key Performance Metrics. These are the metrics collected from the MNT nodes about the endpoints and their artifacts.

Benefits of KPM:
• There are two flavors, captured in two separate spreadsheets.
• Endpoints Onboarding data: Measures key performance metrics about endpoints, such as Total, Active, Successful, Failures, and Endpoints on-boarded/day.
• Endpoints Transactional Load data: Number of RADIUS requests per PSN per hour, the ratio of RADIUS requests to active endpoints, how much of this data was persisted in the MNT table and how much was suppressed (to determine the suppression ratio), the average and maximum load on the PSN during that hour, the latency, and the average TPS.
Key Performance Metrics (KPM): New in ISE 1.4
# application configure ise (Option 12 and 13)
• Generate performance metrics:
  • Endpoints Onboarding
  • Endpoints Transactional Load
• Saves to local disk
• Can copy to repository for viewing
• Reports are suffixed with date parameter
  • If run in same day, will overwrite
• Can be resource intensive on CPU/Memory, so advised to run during non-peak hours
KPM Attributes

• KPM OnBoarding Results:
  • Total Endpoints: Total number of endpoints in the deployment
  • Successful Endpoints: How many of them were on-boarded successfully
  • Failed Endpoints: How many failed to on-board
  • New EP/day: New endpoints seen in the deployment for a given day
  • Total Onboarded/day: Total endpoints on-boarded for a given day

• KPM Trx Load:
  • Timestamp: Date/Time. This is an hourly window, extrapolated from the syslogs sent by the PSNs
  • PSN name: Hostname of the PSN sending syslogs to the MNT collector
  • Total Endpoints: Total number of endpoints in the deployment
  • Active Endpoints: Active number of endpoints in the deployment for that hour
  • Radius Requests: Number of RADIUS requests sent by the PSNs for that hour
  • RR_AEP_ratio: Ratio of RADIUS requests to the number of active endpoints on an hourly basis. This gives the number of RADIUS requests an active endpoint makes on average
  • Logged_to_MNT/hr: Number of RADIUS requests persisted in the DB
  • Noise/hr: Number of RADIUS requests suppressed; only the counter increases but the data is not persisted in the DB
  • Supression_hr %: % of suppression
  • Avg_Load (avg): Average load of the PSNs during that hourly window
  • Max Load (avg): Max load of the PSNs during that hourly window
  • Latency_per_request: Latency per RADIUS request (average)
  • Avg TPS: Average number of transactions per second on that PSN
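The derived KPM values can be illustrated with simple arithmetic. The Python sketch below is not the ISE implementation: `kpm_trx_row` is hypothetical, and the formulas for suppression percentage (noise divided by logged plus noise) and average TPS (hourly requests divided by 3600) are assumptions inferred from the attribute descriptions.

```python
def kpm_trx_row(radius_requests, active_endpoints, logged_to_mnt, noise):
    """Compute derived KPM-style values for one PSN over one hourly window."""
    return {
        # Average number of RADIUS requests per active endpoint in the hour
        "RR_AEP_ratio": radius_requests / active_endpoints,
        # Assumed formula: share of requests that were suppressed, not persisted
        "Suppression_pct": 100.0 * noise / (logged_to_mnt + noise),
        # Assumed formula: hourly request count spread over 3600 seconds
        "Avg_TPS": radius_requests / 3600.0,
    }


row = kpm_trx_row(radius_requests=7200, active_endpoints=1200,
                  logged_to_mnt=1800, noise=5400)
print(row)
```

The input counts are made-up sample numbers, not measurements from a real deployment.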
Sample KPM Stats Output
• KPM_TRX_LOAD_<DATE>.xls
• KPM_ONBOARDING_RESULTS_<DATE>.xls
Exiting Large Scale / HA Design Matrix…
Okay to Unplug
ISE Scalability and High Availability
Summary Review
• Appliance selection and persona allocation impacts deployment size.
• VM appliances need to be configured per physical appliance sizing specs.
• Profiling scalability tied to DB replication—deploy node groups and optimize PSN
collection.
• Leverage ISE 1.2 noise suppression to increase auth capacity and reduce storage reqs.
• ISE 1.3 further enhances scalability with multi-AD and auto-device registration & purge.
• Admin, MnT, pxGrid, and IPN HA based on a Primary to Secondary node failover.
• Load balancers can offer higher scaling and redundancy for PSN clusters.
• Non-LB options include “smart” DNS, AnyCast, multiple RADIUS server definitions in the
access devices, and IOS RADIUS LB.
• Special consideration must be given to NAD fallback and recovery options when no
RADIUS servers are available including Critical Auth VLANs for data and voice.
• IBNS 2.0 and EEM offer advanced local intelligence in failover scenarios.
Solution Validation

Design guides available at:
http://www.cisco.com/go/trustsec

Cisco and F5 Deployment Guide: ISE Load Balancing using BIG-IP:
http://www.cisco.com/c/dam/en/us/td/docs/security/ise/how_to/HowTo-95-Cisco_and_F5_Deployment_Guide-ISE_Load_Balancing_Using_BIG-IP_DF.pdf

ISE How-To and Design Guides:
http://www.cisco.com/c/en/us/support/security/identity-services-engine/products-implementation-design-guides-list.html
Recommended Reading
• http://amzn.com/1587143259
Questions ?
Thank you