Network Based Application Recognition



Although the majority of applications can be identified using Layer 3 or Layer 4 criteria (such as discrete IP addresses or well-known TCP/UDP ports), there are applications that cannot be identified by such criteria alone. This might be due to legacy limitations but more likely is due to deliberate design. For example, peer-to-peer media-sharing applications deliberately negotiate ports dynamically with the objective of penetrating firewalls.
When Layer 3 or Layer 4 parameters are insufficient to positively identify an application, NBAR might be a viable alternative solution. NBAR is the most sophisticated classifier in the IOS tool suite. NBAR can recognize packets on a complex combination of fields and attributes; however, you need to recognize that NBAR is merely a classifier, nothing more. NBAR can identify flows by performing deep-packet inspection, but it is the job of the policy map to determine what needs to be done with these flows when identified (that is, whether they should be marked, policed, dropped, and so on).
The NBAR deep-packet classification engine examines the data payload of stateless protocols and identifies application-layer protocols by matching them against a Protocol Description Language Module (PDLM), which is essentially an application signature. More than 80 PDLMs are embedded in Cisco IOS; furthermore, because PDLMs are modular, they can be downloaded from http://www.cisco.com/pcgi-bin/tablebuild.pl/pdlm and added to a system without requiring an IOS upgrade.
NBAR is dependent on Cisco Express Forwarding (CEF) and performs deep-packet classification only on the first packet of a flow; the remaining packets belonging to the flow are then CEF-switched.
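Because of this dependency, CEF must be enabled globally before NBAR classification can be used:

Router(config)# ip cef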
The NBAR classifier is triggered by the match protocol command within a class map definition and is more CPU-intensive than classifiers that match traffic by DSCPs or access control lists (ACLs).
NBAR can classify packets based on Layer 4 through Layer 7 protocols that dynamically assign TCP/UDP ports. By looking beyond the TCP/UDP port numbers of a packet (known as subport classification), NBAR examines the packet payload and classifies packets based on payload content, such as transaction identifiers, message types, or other similar data. For example, HTTP traffic can be classified by Uniform Resource Locator (URL) or Multipurpose Internet Mail Extensions (MIME) type using regular expressions within the CLI. NBAR uses the UNIX filename specification as the basis for the URL specification format, which it converts into a regular expression.
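For example, the following class map matches HTTP flows by URL and by MIME type (a minimal sketch; the class name and match patterns are purely illustrative):

Router(config)# class-map match-any HTTP-AUDIO
Router(config-cmap)# match protocol http url "*.mp3"
Router(config-cmap)# match protocol http mime "audio/*"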
Example 1 demonstrates classifying traffic by L2, L3, L4, and L7 parameters.
Example 1: Classifying Traffic by Layer 2, 3, 4, and 7 Parameters

Router(config)# class-map match-all L2-CLASSIFIER
Router(config-cmap)# match cos 3
Router(config-cmap)# !
Router(config-cmap)# class-map match-all L3-CLASSIFIER
Router(config-cmap)# match access-group name STANDARD-ACL
Router(config-cmap)# !
Router(config-cmap)# class-map match-all L4-CLASSIFIER
Router(config-cmap)# match access-group name EXTENDED-ACL
Router(config-cmap)# !
Router(config-cmap)# class-map match-any L7-CLASSIFIER
Router(config-cmap)# match protocol exchange
Router(config-cmap)# match protocol citrix
Router(config-cmap)# !
Router(config-cmap)#
Router(config-cmap)# ip access-list standard STANDARD-ACL
Router(config-std-nacl)# permit 10.200.200.0 0.0.0.255
Router(config-std-nacl)#
Router(config-std-nacl)# ip access-list extended EXTENDED-ACL
Router(config-ext-nacl)# permit tcp any any eq ftp
Router(config-ext-nacl)# permit tcp any any eq ftp-data

In this example, the class maps classify traffic as follows:
  • class-map match-all L2-CLASSIFIER: Traffic is classified by matching on (Layer 2) 802.1p class of service (CoS) values (discussed in more detail in the next section).
  • class-map match-all L3-CLASSIFIER: Traffic is classified, through a standard ACL, by (Layer 3) source IP address.
  • class-map match-all L4-CLASSIFIER: Traffic is classified, through an extended ACL, by (Layer 4) TCP ports identifying FTP traffic.
  • class-map match-any L7-CLASSIFIER: Traffic is classified, with the match-any operator, by NBAR PDLMs that identify Exchange or Citrix traffic types.

Class Maps



The primary classification tool within MQC is the class map. Each class map contains one or more match statements, which specify the criteria that must be met for traffic identification.
Because class maps can contain multiple match statements, a logical operator for combining the discrete match statements must also be defined when the class map is created. Two options exist:
  • match-all (a logical AND operator): All match statements must be true at the same time for the class map condition to be true; match-all is the default operator. It is important not to use mutually exclusive match criteria with a match-all operator, because such a combination can never yield a positive match (see the sketch after this list).
  • match-any (a logical OR operator): The class map condition is true if any one of the match statements is true.
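For example, the following match-all class map can never match any traffic, because a single packet cannot carry two different DSCP values at the same time (the class name is illustrative):

Router(config)# class-map match-all IMPOSSIBLE-CLASS
Router(config-cmap)# match dscp ef
Router(config-cmap)# match dscp af41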
Match statements can specify a wide range of criteria for traffic identification, including the following:
  • Layer 1 parameters: Physical interface, subinterface, PVC, or port
  • Layer 2 parameters: MAC address, 802.1Q/p class of service (CoS) bits, Multiprotocol Label Switching (MPLS) Experimental (EXP) bits
  • Layer 3 parameters: Differentiated Services Code Points (DSCP), source/destination IP address
  • Layer 4 parameters: TCP or UDP ports
  • Layer 7 parameters: Application signatures and URLs in packet headers or payload through Network Based Application Recognition (NBAR)
Figure 1 illustrates the Layer 2 to Layer 7 packet classification criteria; however, due to space limitations, the diagram is not to scale, nor are all fields indicated.

Figure 1: Layer 2 to Layer 7 packet classification criteria

Classification Tools



Classification tools serve to identify traffic flows so that specific QoS policies can be applied to specific flows, such as TelePresence media and control flows. Often the terms classification and marking are used interchangeably (yet incorrectly so); therefore, you need to understand the distinction between classification and marking operations:
  • Classification refers to the inspection of one or more fields in a packet (the term packet is used loosely here to include all Layer 2 to Layer 7 fields, not just Layer 3 fields) to identify the type of traffic that the packet is carrying. When identified, the traffic is directed to the applicable policy-enforcement mechanism for that traffic type, where it receives predefined treatment (either preferential or deferential). Such treatment can include marking/remarking, queuing, policing, shaping, or any combination of these (and other) actions.
  • Marking, on the other hand, refers to changing a field within the packet to preserve the classification decision that was reached. When a packet has been marked, a trust boundary is established, upon which other QoS tools later depend. Marking is only necessary at the trust boundaries of the network and (as with all other QoS policy actions) cannot be performed without classification. By marking traffic at the trust boundary edge, subsequent nodes do not have to perform the same in-depth classification and analyses to determine how to treat the packet.
MQC performs classification based on the logic defined within the class map structure. Such logic can include matching criteria at the data link, network, or transport layers (Layers 2 to 4) or even at the application layer (Layer 7).
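For example, a marking policy applied at a trust boundary might look as follows (a minimal sketch: the class map TELEPRESENCE-MEDIA is assumed to be defined elsewhere, and CS4 reflects Cisco's recommended marking for TelePresence media):

Router(config)# policy-map MARK-TELEPRESENCE
Router(config-pmap)# class TELEPRESENCE-MEDIA
Router(config-pmap-c)# set dscp cs4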

Modular QoS Command-Line Interface



As the Cisco QoS tools evolved, they became increasingly platform idiosyncratic. Commands that worked on one platform wouldn’t quite work on another, and there were always platform-specific requirements and constraints to keep in mind. These idiosyncrasies made deploying QoS a laborious and often frustrating exercise, especially when deploying networkwide QoS policies, such as those required by TelePresence. In an attempt to make QoS more consistent across platforms, Cisco introduced the MQC, which is a consistent, cross-platform command syntax for QoS.
Any QoS policy requires at least three elements:
  1. Identification of what traffic the policy is to be applied to
  2. What actions should be applied to the identified traffic
  3. Where (that is, which interface) should these policies be applied, and in which direction
To correspond to these required elements, MQC has three main parts:
  1. One or more class maps that identify what traffic the policies are to be applied to
  2. A policy map that details the QoS actions to be applied to each class of identified traffic
  3. A service-policy statement that attaches the policy map to specific interfaces and specifies the direction (input or output) in which the policy is to be applied
As you see in the examples throughout this chapter, although the syntax of MQC might seem simple enough, it can express nearly every type of QoS policy and makes these policies, for the most part, portable across platforms.
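The following minimal sketch shows all three parts working together (the class map, policy map, interface, and bandwidth percentage are purely illustrative):

Router(config)# class-map match-all VOICE
Router(config-cmap)# match dscp ef
Router(config-cmap)# !
Router(config-cmap)# policy-map WAN-EDGE
Router(config-pmap)# class VOICE
Router(config-pmap-c)# priority percent 33
Router(config-pmap-c)# !
Router(config-pmap-c)# interface Serial0/1
Router(config-if)# service-policy output WAN-EDGE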

Generic Online Diagnostics



Cisco Generic Online Diagnostics (GOLD) defines a common framework for diagnostic operations on Cisco IOS Software-based products. GOLD has the objective of checking the health of all hardware components and verifying the proper operation of the system data plane and control plane at boot time, as well as at run time.
GOLD supports the following:
  • Bootup tests (including online insertion)
  • Health-monitoring tests (background, nondisruptive)
  • On-demand tests (disruptive and nondisruptive)
  • User-scheduled tests (disruptive and nondisruptive)
  • CLI access to data through the management interface
GOLD, in conjunction with several of the technologies previously discussed, can reduce device failure detection time.
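On platforms that support GOLD, health-monitoring and on-demand tests can be invoked as follows (a minimal sketch; command availability varies by platform, and the module and test numbers are illustrative):

Router(config)# diagnostic monitor module 5 test 2
Router# diagnostic start module 5 test all
Router# show diagnostic result module 5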

Embedded Event Manager

The Cisco IOS Embedded Event Manager (EEM) offers the capability to monitor device hardware, software, and operational events and take informational, corrective, or any desired action—including sending an email alert—when the monitored events occur or when a threshold is reached.
EEM can notify a network management server and an administrator (via email) when an event of interest occurs. Events that can be monitored include the following:
  • Application-specific events
  • CLI events
  • Counter- and interface-counter events
  • Object-tracking events
  • Online insertion and removal events
  • Resource events
  • GOLD events
  • Redundancy events
  • SNMP events
  • Syslog events
  • System manager and system monitor events
  • IOS Watchdog events
  • Timer events
Capturing the state of network devices during such situations can be helpful in taking immediate recovery actions and gathering information to perform root-cause analysis, reducing fault detection and diagnosis time. Notification times are reduced by having the device send email alerts to network administrators. Furthermore, availability is also improved if automatic recovery actions are performed without the need to fully reboot the device.
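As an illustration, the following EEM applet logs a message and sends an email alert when a matching syslog event occurs (a minimal sketch; the interface, mail server, and addresses are purely illustrative):

Router(config)# event manager applet LINK-DOWN-ALERT
Router(config-applet)# event syslog pattern "Interface GigabitEthernet1/1, changed state to down"
Router(config-applet)# action 1.0 syslog msg "EEM detected Gi1/1 down"
Router(config-applet)# action 2.0 mail server "10.1.1.10" to "noc@example.com" from "router@example.com" subject "Gi1/1 down" body "EEM alert: interface down"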

In Service Software Upgrade

The Cisco In Service Software Upgrade (ISSU) provides a mechanism to perform software upgrades and downgrades without taking a switch out of service. ISSU leverages the capabilities of NSF and SSO to allow the switch to forward traffic during a supervisor IOS upgrade (or downgrade). With ISSU, the network does not reroute, and no active links are taken out of service. ISSU thereby expedites software upgrade operations.
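On platforms with redundant supervisors, an ISSU is typically driven by a four-step command sequence similar to the following (a sketch only; exact arguments vary by platform, and the slot numbers and image name here are illustrative):

Router# issu loadversion 5 disk0:new-image.bin 6 slavedisk0:new-image.bin
Router# issu runversion
Router# issu acceptversion
Router# issu commitversion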

Online Insertion and Removal

Online Insertion and Removal (OIR) allows line cards to be added to a device without affecting the system. Additionally, with OIR, line cards can be exchanged without losing the configuration. OIR thus expedites hardware repair and replacement operations.

Operational Availability Technologies



As has been shown, the predominant way that availability of a network can be improved is to improve its MTBF by using devices that have redundant components and by engineering the network to be as redundant as possible, leveraging many of the technologies discussed in the previous sections.
However, glancing back at the general availability formula (shown in the following), another approach to improving availability is to reduce MTTR; reducing MTTR is primarily a factor of operational resiliency.
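Availability = MTBF / (MTBF + MTTR)

For a given MTBF, any reduction in MTTR therefore translates directly into higher availability.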
MTTR operations can be significantly improved in conjunction with redundant device and network design. Specifically, the capability to make changes, upgrade software, and replace or upgrade hardware in a production network is extensively improved by the implementation of device and network redundancy. The capability to upgrade individual devices without taking them out of service is based on having internal component redundancy complemented with the system software capabilities. Similarly, by having dual active paths through redundant network devices designed to converge in subsecond timeframes, you can schedule an outage event on one element of the network, allow it to be upgraded, and then bring it back into service with minimal or no disruption to the network as a whole.
You can also improve MTTR by reducing the time required to perform any of the following operations:
  • Failure detection
  • Notification
  • Fault diagnosis
  • Dispatch and arrival
  • Fault repair
Some technologies that can help automate and streamline these operations include the following:
  • Generic Online Diagnostics (GOLD)
  • Embedded Event Manager (EEM)
  • In Service Software Upgrade (ISSU)
  • Online Insertion and Removal (OIR)

IP Event Dampening



Routing protocols provide network convergence functionality in IP networks, including TelePresence campus and branch networks. However, these protocols are impeded by links that “flap” or change state repeatedly. Although not a protocol in itself, IP Event Dampening complements the functioning of routing protocols to improve availability by minimizing the impact of flapping on routing protocol convergence.
Whenever the line protocol of an interface changes state, or flaps, routing protocols are notified of the status of the routes affected by the change in state. Every interface state change requires all affected devices in the network to recalculate best paths, install or remove routes from the routing tables, and then advertise valid routes to peer routers. An unstable interface that flaps excessively can cause other devices in the network to consume substantial amounts of system processing resources and cause routing protocols to lose synchronization with the state of the flapping interface.
The IP Event Dampening feature introduces a configurable exponential decay mechanism to suppress the effects of excessive interface flapping events on routing protocols and routing tables in the network. This feature allows the network administrator to configure a router to automatically identify and selectively dampen a local interface that is flapping. Dampening an interface removes the interface from the network until the interface stops flapping and becomes stable.
Configuring the IP Event Dampening feature improves convergence times throughout the network by isolating failures so that disturbances are not propagated, which reduces the utilization of system processing resources by other devices in the network and improves overall network stability.
IP Event Dampening uses a series of administratively defined thresholds to identify flapping interfaces, to assign penalties, to suppress state changes (if necessary), and to make stabilized interfaces available to the network. These thresholds are as follows:
  • Suppress threshold: The value of the accumulated penalty that triggers the router to dampen a flapping interface. The flapping interface is identified by the router and assigned a penalty for each up and down state change, but the interface is not automatically dampened. The router tracks the penalties that a flapping interface accumulates. When the accumulated penalty reaches the default or preconfigured suppress threshold, the interface is placed in a dampened state. The default suppress threshold value is 2000.
  • Half-life period: Determines how fast the accumulated penalty can decay exponentially. When an interface is placed in a dampened state, the router monitors the interface for additional up and down state changes. If the interface continues to accumulate penalties and the interface remains in the suppress threshold range, the interface remains dampened. If the interface stabilizes and stops flapping, the penalty is reduced by half after each half-life period expires. The accumulated penalty reduces until the penalty drops to the reuse threshold. The default half-life period timer is five seconds.
  • Reuse threshold: When the accumulated penalty decays below the reuse threshold, the interface is unsuppressed and made available to the other devices on the network. The default reuse threshold is 1000 penalties.
  • Maximum suppress time: Represents the maximum amount of time an interface can remain dampened when a penalty is assigned to an interface. The default maximum suppress time is 20 seconds.
IP Event Dampening is configured on a per-interface basis (where default values are used for each threshold) as follows:
Router(config)# interface FastEthernet0/0
Router(config-if)# dampening
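The thresholds can also be tuned explicitly with the dampening half-life reuse suppress max-suppress-time syntax (the following values are illustrative):

Router(config)# interface FastEthernet0/0
Router(config-if)# dampening 10 1000 2000 30

With these values, the interface is dampened when its accumulated penalty reaches 2000 and is released when the penalty, halving every 10 seconds, decays below 1000; in no case does the interface remain dampened longer than 30 seconds.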
IP Event Dampening can be complemented with the use of route summarization, on a per-routing protocol basis, to further compartmentalize the effects of flapping interfaces and associated routes.