One of the most critical responsibilities for network engineers is to ensure that the network is routing packets as intended. IP routing protocols are complex, whether BGP, the routing protocol of the Internet, or Interior Gateway Protocols (IGPs) such as OSPF, IS-IS, and EIGRP. To manage them, network engineers must understand how they behave.

Every type of organization—from small to large enterprises, from network operators to service providers—depends on routing protocols for data delivery between branches, head offices, data centers, or co-location sites. A routing issue that lasts for a few seconds can interrupt stock trading, banking transactions, e-commerce, video streaming, or a VoIP call. This means operators must make designing, operating, and managing their IP networks a top priority.

What’s wrong with routing?

Why is managing routing behavior so difficult? Change management, hardware failures, configuration errors, and human error often cause routing issues, but several other factors inherent to IP routing add to the management challenge.

Dynamic nature of routing
The advent of any-to-any networks, where application traffic can flow in any direction, led to an increase in the number of meshed networks and the use of dynamic IP routing. When traffic can take any path from source to destination, with routing decisions made on a hop-by-hop basis, network engineers lose visibility into the actual path the traffic takes. This makes it difficult to achieve objectives such as load balancing and end-to-end Quality of Service (QoS). Add the transitory routing errors that cause frequent path changes, and service assurance becomes harder still.

No single source of network routing topology
In IP networks, routers make the decisions on how to forward packets. This means that, when there’s a service delivery problem, network engineers must first find all the hops along the path for the service, and then query each router to find the end-to-end path and troubleshoot the issue. The network does not provide a single repository from which engineers can see the entire network’s routing plan.
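To illustrate why the end-to-end path has to be pieced together router by router, here is a minimal sketch in Python. The router names, prefixes, and the `FIBS` structure are hypothetical and stand in for whatever each router would report when queried; the point is only that no single table contains the whole path.

```python
import ipaddress

# Hypothetical per-router forwarding tables: prefix -> next-hop router.
# A next hop of None means the destination is directly connected.
FIBS = {
    "R1": {"10.0.0.0/8": "R2", "0.0.0.0/0": "R4"},
    "R2": {"10.1.0.0/16": "R3", "0.0.0.0/0": "R1"},
    "R3": {"10.1.2.0/24": None, "0.0.0.0/0": "R2"},
}

def lookup(fib, dest):
    """Longest-prefix match of dest against a single router's table."""
    addr = ipaddress.ip_address(dest)
    best = None  # (prefix length, next hop) of the best match so far
    for prefix, next_hop in fib.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, next_hop)
    if best is None:
        raise LookupError(f"no route to {dest}")
    return best[1]

def trace_path(fibs, start, dest, max_hops=32):
    """Follow next hops one router at a time until the destination is local."""
    path = [start]
    router = start
    for _ in range(max_hops):
        next_hop = lookup(fibs[router], dest)
        if next_hop is None:
            return path
        if next_hop in path:
            raise RuntimeError(f"forwarding loop via {next_hop}")
        path.append(next_hop)
        router = next_hop
    raise RuntimeError("hop limit exceeded")

print(trace_path(FIBS, "R1", "10.1.2.5"))
```

Each step depends on the answer from the previous router, which is why this kind of manual tracing scales poorly as the number of hops and services grows.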

Small change, major impact
Even small changes in the network topology or performance can result in major impacts on service delivery. For example, if a link along a path fails and a backup link takes over, routers across the network must learn of the change, recompute their routes, and update their forwarding tables wherever the best path has changed. When this happens in a network with hundreds of routers, the time it takes for the protocol to converge can cause lost or delayed data packets and contribute to network performance degradation.
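The effect of a single link failure on path selection can be sketched with a shortest-path computation, which is the kind of calculation link-state protocols such as OSPF and IS-IS perform. The four-router topology and the link costs below are made up for illustration, and the sketch ignores timers, flooding, and everything else that contributes to real convergence time.

```python
import heapq

def shortest_path(links, src, dst):
    """Dijkstra over an undirected graph given as {(a, b): cost}."""
    adjacency = {}
    for (a, b), cost in links.items():
        adjacency.setdefault(a, []).append((b, cost))
        adjacency.setdefault(b, []).append((a, cost))
    queue = [(0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, link_cost in adjacency.get(node, []):
            if neighbor not in visited:
                heapq.heappush(queue, (cost + link_cost, neighbor, path + [neighbor]))
    return None  # destination unreachable

# Hypothetical topology: A-B-D is the primary path, A-C-D the backup.
links = {("A", "B"): 10, ("B", "D"): 10, ("A", "C"): 10, ("C", "D"): 30}
before = shortest_path(links, "A", "D")

# Simulate the failure of the B-D link and recompute.
after_failure = {link: cost for link, cost in links.items() if link != ("B", "D")}
after = shortest_path(after_failure, "A", "D")

print("before:", before)
print("after: ", after)
```

In this toy case the path cost doubles after the failure. In a real network, every router has to arrive at a consistent view like this before forwarding is stable again, and packets can be lost or delayed in the meantime.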

Lack of visibility into routing behavior
SNMP-based tools show the health and performance of network hardware, and NetFlow and other traffic flow analyzers show the volume and composition of traffic. But these traditional network management and monitoring solutions—along with ping, traceroute, and other CLI commands—do not provide real-time, global visibility into dynamically changing routing behavior and the paths critical services take across the network. While there are tools that capture network topology snapshots, this data can become stale quickly in dynamic networks. And for some network applications, real-time information is critical. After all, milliseconds can equate to millions of dollars.

Real-time route analytics

Route analytics technology provides visibility into the control plane. One approach is to query the routers periodically, capture their configuration data, and construct the network topology from it. However, as mentioned above, periodic discovery may not be adequate when real-time visibility is needed, as is the case with some Software-Defined Networking (SDN) automation applications.

Real-time route analytics technology records the live IGP and BGP protocol messages exchanged between routers to build and maintain an up-to-date network topology model of all active routing paths. This real-time telemetry can be stored and used both to create a live network topology map showing all routing paths and to support troubleshooting, historical analysis, and planning.
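As a rough sketch of the idea, a topology model can be kept current by applying each routing update as it arrives and retaining the updates for later replay. The `LinkEvent` record below is a hypothetical simplification and is not the format of real OSPF, IS-IS, or BGP messages; it also assumes events are recorded in timestamp order.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class LinkEvent:
    """Hypothetical, simplified stand-in for a recorded routing update."""
    timestamp: float       # seconds since the start of recording
    router: str            # router that advertised the adjacency
    neighbor: str          # neighbor on the other end
    cost: Optional[int]    # None means the adjacency was withdrawn

class TopologyModel:
    def __init__(self):
        self.links = {}    # (router, neighbor) -> cost, the live view
        self.history = []  # every event seen, in arrival order

    def apply(self, event):
        """Update the live view and keep the event for later replay."""
        self.history.append(event)
        key = (event.router, event.neighbor)
        if event.cost is None:
            self.links.pop(key, None)
        else:
            self.links[key] = event.cost

    def snapshot_at(self, timestamp):
        """Rebuild the topology as it stood at a past point in time."""
        replay = TopologyModel()
        for event in self.history:
            if event.timestamp <= timestamp:
                replay.apply(event)
        return replay.links

model = TopologyModel()
for event in [
    LinkEvent(0.0, "R1", "R2", 10),
    LinkEvent(1.0, "R2", "R3", 5),
    LinkEvent(2.5, "R1", "R2", None),  # R1-R2 adjacency goes down
]:
    model.apply(event)

print("live:       ", model.links)
print("at t = 2.0: ", model.snapshot_at(2.0))
```

Keeping the event history alongside the live view is what makes both uses possible: the live view serves the real-time map, and the replay serves historical analysis.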

[Figure: Real-time route analytics diagram]

Real-time route analytics technology demystifies complex routing behavior and can help network teams successfully tackle a number of management challenges, including those listed above. For example, among many other things, they can determine:

  • Whether traffic is taking a less desirable path because the routing protocol selected the lowest-cost path
  • Whether all expected customer VPN prefixes are being advertised
  • The reason for unplanned deviations in routes for specific services
  • The convergence time after network events
  • The cause of intermittent routing issues that would otherwise go undetected and unresolved
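Two of the checks in this list are simple to express once the routing telemetry is available. The sketch below is illustrative only: the prefixes and timestamps are made up, and the convergence estimate uses an assumed heuristic (the burst of updates ends at the first gap longer than a quiet period) rather than any standard definition.

```python
def missing_prefixes(expected, advertised):
    """Return the expected prefixes that are not currently advertised."""
    return sorted(set(expected) - set(advertised))

def convergence_time(update_timestamps, quiet_period=1.0):
    """Estimate convergence as the length of the first burst of updates.

    Assumed heuristic: the burst ends at the first gap between consecutive
    updates that is longer than quiet_period (in seconds).
    """
    if not update_timestamps:
        return 0.0
    times = sorted(update_timestamps)
    burst_end = times[0]
    for t in times[1:]:
        if t - burst_end > quiet_period:
            break
        burst_end = t
    return burst_end - times[0]

# Hypothetical customer VPN prefixes and what is currently advertised.
expected = ["10.1.0.0/24", "10.2.0.0/24", "10.3.0.0/24"]
advertised = ["10.1.0.0/24", "10.3.0.0/24"]
print("missing:", missing_prefixes(expected, advertised))

# Hypothetical update timestamps (seconds) recorded after a link failure.
updates = [100.0, 100.2, 100.5, 105.0]
print("convergence estimate (s):", convergence_time(updates))
```

The update at 105.0 falls outside the quiet period, so it is treated as a separate event and does not count toward the estimate.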

Real-time routing telemetry and analytics are widely used today by network engineering, planning, and operations teams to assure service delivery, optimize networks for performance and resiliency, and mitigate the risk from changes. As the industry embraces SDN and Network Functions Virtualization (NFV) automation to create adaptive, self-healing, and self-optimizing networks, the same real-time telemetry and analytics will provide the intelligence to power resource and service orchestrators.

This content was originally published on the Packet Design blog and has been updated since the acquisition by Blue Planet.