Edge and WAN networks are morphing rapidly. They are becoming increasingly application-aware and offer dynamic route selection with higher application quality.
Enterprises that use services such as intranet, voice, video, private/public cloud access, and other business-critical cloud services expect few or no problems when sending data from one office (site) to another. Software-defined wide area networks (SD-WANs) promise all this and more.
However, SD-WAN is not without its own problems as it replaces older WANs, which are essentially IP/MPLS-based networks. The difficulties arise because:
- SD-WAN allows multiple types of connectivity to be used, including the public internet.
- OTT (over-the-top) services are bandwidth-unpredictable in nature.
- Video makes up a growing share of the traffic transferred between sites.
Let’s elaborate upon this. The internet is still considered by some enterprises and service providers to be unsuitable for carrying real-time applications, even when most of us use it every day as part of the consumer-connected world.
Most service providers rely on existing, relatively reliable, and more secure transport networks such as IP/MPLS. While the internet offers high bandwidth and low-cost connectivity, IP/MPLS continues to offer real-time SLA quality for site-to-site connectivity.
The SD-WAN layer will, therefore, have to ride both underlays (MPLS and internet) for a long time. MPLS will not be phased out quickly; some estimates show a co-existence of over five years. This translates into a fairly long period for a hybrid solution based on the MPLS underlay and the SD-WAN overlay.
In this scenario, correlating the SD-WAN overlay with the supporting underlay infrastructure can provide vital intelligence toward assuring an end-to-end SD-WAN-based service. In a survey of more than 100 service providers (carried out by Light Reading/Heavy Reading in Nov. 2018 for Infovista), over 80% of the respondents reported that such correlation is important to them, while almost 30% considered it critical.
Correlating underlay with overlay
This is just the scenario for a service assurance system to come into play. First, it is important to visualise which transport will carry the SD-WAN traffic for an application from one site to another. Then, the manner in which the underlay impacts the SD-WAN overlay performance should be analysed.
Since the SD-WAN overlay acts as a tunnel for applications to travel between sites, and must assure that all tunnel traffic is carried with the correct SLA quality, the key performance indicators (KPIs) related to the SD-WAN tunnel as well as the underlying MPLS network path need to be monitored and correlated. This requirement stands irrespective of the kind of SD-WAN edge device (i.e., physical or virtualised CPEs).
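One simple way to picture this correlation is to compare a tunnel KPI with the same KPI on each candidate underlay path over the same time window. The Python sketch below is only an illustration under stated assumptions: the names `pearson` and `strongest_underlay` are hypothetical, the latency series are made-up sample values, and a real assurance system would likely use richer methods than a plain Pearson coefficient.

```python
from __future__ import annotations

from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation of two equally long, time-aligned KPI series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no correlation signal
    return cov / sqrt(var_x * var_y)


def strongest_underlay(tunnel: list[float],
                       underlays: dict[str, list[float]]) -> str:
    """Name of the underlay whose KPI series tracks the tunnel most closely."""
    return max(underlays, key=lambda name: pearson(tunnel, underlays[name]))


# Hypothetical latency samples (ms), one per measurement interval.
tunnel_latency = [20.0, 22.0, 35.0, 50.0, 48.0]
underlay_latency = {
    "mpls": [10.0, 11.0, 24.0, 40.0, 37.0],
    "internet": [30.0, 29.0, 31.0, 30.0, 28.0],
}
suspect = strongest_underlay(tunnel_latency, underlay_latency)
```

With these sample values, the tunnel latency rises and falls with the MPLS path, so `suspect` points to the MPLS underlay as the place to start troubleshooting.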
For service providers, poor availability, low throughput, low bandwidth, high latency, high jitter, and high response time over the hybrid (MPLS and SD-WAN) network are just some of the problems that can undermine SD-WAN’s promised agility. To ensure seamless site-to-site connectivity, path analysis becomes extremely important. This includes measuring available bandwidth, utilisation, performance, and availability between sites for each path across different underlays.
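Path analysis of this kind can be sketched as a per-path record of measurements checked against target values. The Python example below is a minimal sketch, not a reference implementation: the `PathKpi` fields, the threshold values, and the function name `violations` are all assumptions chosen for illustration, and real targets would come from the SLA in question.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PathKpi:
    """One measurement of a site-to-site path over a given underlay."""
    src_site: str
    dst_site: str
    underlay: str            # e.g. "mpls" or "internet"
    latency_ms: float
    jitter_ms: float
    utilisation_pct: float
    availability_pct: float


# Hypothetical upper limits; real values depend on the agreed SLA.
MAX_LIMITS = {"latency_ms": 150.0, "jitter_ms": 30.0, "utilisation_pct": 80.0}
MIN_AVAILABILITY_PCT = 99.9


def violations(kpi: PathKpi) -> list[str]:
    """Return the names of the KPIs that miss their target on this path."""
    failed = [name for name, limit in MAX_LIMITS.items()
              if getattr(kpi, name) > limit]
    if kpi.availability_pct < MIN_AVAILABILITY_PCT:
        failed.append("availability_pct")
    return failed


paths = [
    PathKpi("site-a", "site-b", "mpls", 40.0, 5.0, 60.0, 99.95),
    PathKpi("site-a", "site-b", "internet", 180.0, 45.0, 85.0, 99.5),
]
report = {p.underlay: violations(p) for p in paths}
```

In this made-up sample, the MPLS path meets every target while the internet path misses all four, which is the kind of per-underlay view that path analysis is meant to provide.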
Impact of the multi-vendor aspect of SD-WAN on its performance
Added to this complexity is the fact that the service provider needs to deal with multiple vendors of SD-WAN and MPLS. It is now recognised that multi-vendor SD-WAN deployments give the service provider a degree of flexibility, as no single vendor can cover every aspect of SD-WAN functions. Edge CPEs (physical as well as virtualised), SD-WAN overlays, SD-WAN orchestrators, and security functions are delivered by multiple vendors.
In the above-mentioned survey, 56% of service providers planned to offer a “single-pane-of-glass” reporting ability across WAN types and multiple vendors.
Using a single service assurance solution across multiple SD-WAN vendors can reduce opex as the SD-WAN services scale up.
Importance of SLA management
Service assurance is important not just to assure a high-quality network, but also to honour the terms of an SLA with the service provider’s enterprise customer.
It is clear that service providers need to extend their existing service level agreement (SLA) monitoring capabilities: more than 40% think that SLA monitoring is needed for each site, and over 35% believe that SLA monitoring is required for each SD-WAN service.
If SD-WAN is to deliver on its promise, the service assurance system not only needs to track the entire network end-to-end, but must also move toward assuring performance at the edge. There are many ways to acquire performance information from edge devices: through the devices themselves, through SD-WAN controllers or, most preferably, through SD-WAN service orchestrators. Speed is vital in such data acquisition; the faster, the better. Streaming telemetry could act as a key data source, delivering real-time (sub-minute or even sub-second) reporting on network performance and the SLAs.
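To make the idea of streamed, near-real-time SLA reporting concrete, the following Python sketch keeps a sliding time window of latency samples and flags a breach as each new sample arrives. It is a simplified illustration under assumptions: the class name `SlidingWindowMonitor`, the 30-second window, and the 100 ms target are hypothetical, and the samples are fed in by hand rather than received from a real telemetry stream.

```python
from __future__ import annotations

from collections import deque


class SlidingWindowMonitor:
    """Tracks the mean latency over the most recent `window_s` seconds."""

    def __init__(self, window_s: float, max_mean_latency_ms: float) -> None:
        if window_s <= 0:
            raise ValueError("window_s must be positive")
        self.window_s = window_s
        self.max_mean_latency_ms = max_mean_latency_ms
        self._samples: deque[tuple[float, float]] = deque()
        self._total = 0.0

    def add(self, timestamp_s: float, latency_ms: float) -> bool:
        """Record a sample; return True if the windowed mean breaches the target."""
        self._samples.append((timestamp_s, latency_ms))
        self._total += latency_ms
        # Evict samples that have fallen out of the time window.
        while self._samples[0][0] <= timestamp_s - self.window_s:
            _, old_latency = self._samples.popleft()
            self._total -= old_latency
        return self.mean() > self.max_mean_latency_ms

    def mean(self) -> float:
        return self._total / len(self._samples)


# Hypothetical stream of (timestamp in seconds, latency in ms) samples.
monitor = SlidingWindowMonitor(window_s=30.0, max_mean_latency_ms=100.0)
stream = [(0.0, 50.0), (10.0, 60.0), (20.0, 250.0), (45.0, 80.0), (60.0, 70.0)]
breaches = [monitor.add(ts, latency) for ts, latency in stream]
```

Because the check runs on every incoming sample, a breach is visible as soon as the spike arrives and clears once the spike ages out of the window, rather than waiting for a periodic polling cycle.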
At one time, service assurance, as a 24 × 7 management system, was introduced only after a new technology was well deployed, and only after new services were tested, launched, and commercially established. This has now changed. Service assurance is introduced much earlier in the technology deployment cycle; it is seen as a business-critical system for managing performance as soon as SD-WAN is deployed in an environment where multiple vendors provide the underlay and the service provider uses multiple SD-WAN vendors.
In this highly complex multi-vendor environment, understanding and fixing issues that arise from the relationship between overlay and underlay is critical to the success of SD-WAN and to delivering on its promise.