Fuze recently received a support ticket from a long-standing client who had deployed SD-WAN on their own, only to discover their voice and video quality had dramatically worsened. What was going on? Wasn’t SD-WAN the obvious and easy replacement for expensive MPLS, without compromising on quality? So why wasn’t it working?
For the sake of this post, let’s call this client Tango. They are a customer with offices up and down the West Coast of the United States, with primary voice and video services served out of Fuze’s data center in San Jose. As it turned out, Tango’s SD-WAN deployment included a managed, cloud-based firewall that handled internet egress for all of its offices. That firewall was hosted in Ohio. So for all of Tango’s West Coast offices, Fuze traffic was now tunneled over SD-WAN to Ohio for internet egress, and then sent all the way back to San Jose, where primary services lived. Yikes! To make matters worse, the IT department’s home VPN client backhauled all traffic to Tango’s headquarters in Los Angeles, so WFH users had yet another tunnel backhaul before their traffic was carried to Ohio!
Does this sound familiar? Hopefully not! But perhaps some aspects resonate. Tango is not a common case, and their issues have since been resolved, but they represent a lot of what can go wrong when SD-WAN is deployed without full consideration of its effect on UCaaS or other latency-sensitive applications.
This post explores two key design considerations that account for the bulk of the benefit you can realize in a UCaaS & SD-WAN deployment. UCaaS and SD-WAN vendors are numerous and diverse, and much of this is vendor-agnostic and widely applicable, but we will refer to Fuze’s offerings in some of the examples below.
Design Question #1 - Where’s the Edge?
The primary UCaaS design question is where and how traffic will egress from the SD-WAN fabric on its way to the UCaaS provider. For most SD-WAN vendors, the SD-WAN “fabric” or “transport” is a mesh of IPsec tunnels, but these tunnels only exist between locations that have SD-WAN appliances. How does traffic get to destinations that do not have an appliance? There are four possibilities:
- Local Internet Breakout
- SD-WAN Providers as Network Operators
- Hosting an Appliance “Close to” Fuze
- Hosting an Appliance and Circuits Directly at Fuze
Local Internet Breakout
Traffic destined for common public clouds or websites may often just break out locally via an internet egress. This approach can be entirely suitable when the underlying internet connectivity is robust, but it cannot take advantage of many SD-WAN features. Administrators should understand that local internet breakout is effectively the same as traditional over-the-top (“OTT”) internet.
Some enterprises may find this insufficient for a variety of reasons. But for smaller branches and SMBs, this is a cost-effective, scalable approach to routing a trusted application over local internet connections.
Network Operators and POPs
Regarding network underlays and POPs, there is a short list of SD-WAN vendors that actually operate a network and transport traffic to common public cloud destinations and UCaaS providers. To give one example, Fuze currently partners with Aryaka, an SD-WAN vendor that operates 30 POPs globally and its own private backbone. Aryaka customers connect redundantly to the geographically closest POP, and then Aryaka provides dedicated, private transport all the way to Fuze. Aryaka and Fuze accomplish this private transport path by maintaining dedicated peerings in all mutual regions across the globe.
For these providers that operate network transport or cloud gateways for egress, the important question is, “Where are they backhauling or egressing traffic to the UCaaS provider? Do they peer directly with the provider? Are they in the same facility?” As a basic rule of thumb, if the SD-WAN vendor is backhauling UCaaS traffic to some POP or gateway for egress, they must at minimum be doing so in the same metro area as the destination, but ideally they should be peering directly with said provider. As of July 2020, Fuze has confirmed mutual gateway egress points with Aryaka and Cato Networks.
This was one of the core issues with the Tango deployment mentioned above. The Ohio firewall egress point was simply too far from the San Jose services to provide any value or performance.
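To put rough numbers on the Tango detour, here is a minimal Python sketch that estimates a lower bound on round-trip time from great-circle distance alone. The coordinates are approximate, Columbus is only a stand-in because we know nothing more specific than "Ohio", and the figure of 200 km per millisecond for light in fiber is a common rule of thumb. Real paths add routing detours, queuing, and encryption time, so treat the result as a floor, not a measurement.

```python
import math

# Approximate (latitude, longitude) pairs in degrees. Assumption: the source
# only says "Ohio", so Columbus is used here purely as a stand-in.
LOS_ANGELES = (34.05, -118.24)
SAN_JOSE = (37.34, -121.89)
COLUMBUS_OH = (39.96, -83.00)

EARTH_RADIUS_KM = 6371.0
FIBER_KM_PER_MS = 200.0  # rule of thumb: light in fiber covers ~200 km per ms


def great_circle_km(a, b):
    """Haversine distance between two (lat, lon) points, in kilometers."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def min_rtt_ms(*hops):
    """Lower bound on round-trip time along a sequence of (lat, lon) hops."""
    one_way_km = sum(great_circle_km(a, b) for a, b in zip(hops, hops[1:]))
    return 2 * one_way_km / FIBER_KM_PER_MS


direct = min_rtt_ms(LOS_ANGELES, SAN_JOSE)
backhauled = min_rtt_ms(LOS_ANGELES, COLUMBUS_OH, SAN_JOSE)
print(f"direct: {direct:.1f} ms, via Ohio: {backhauled:.1f} ms")
```

Even this best-case estimate puts the Ohio detour at several tens of milliseconds of round-trip delay, against a few milliseconds for a direct Los Angeles to San Jose path, before any tunnel or firewall processing is counted.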
Hosting an Appliance “Close to” UCaaS Destination via AWS (or Public Cloud)
Clients can host appliances in AWS or other third-party clouds, on the reasoning that an appliance hosted in AWS us-east-1 is going to be “close to” the major cloud and UCaaS providers that operate in those same facilities. The egress Elastic IP is likely serviced over the facility’s respective IX, or stays within a Tier 1 ISP edge router. This approach is valid, but clients should understand that the network underlay is then only viable via the AWS Elastic IP, and UCaaS providers like Fuze would not help in the hosting or management of these appliances. Traffic is also technically exposed to the internet, despite being a “short hop”. Hosting in common public clouds with this “close to” mentality is a very DIY approach, and is only recommended for administrators with specific backgrounds, skills, and goals.
Direct Private Hosting
Fuze is one of the very few, if not the only, UCaaS providers to allow direct hosting of customer appliances and circuits within its own co-location, ensuring that true enterprise clients can connect to Fuze as if it were their own data center. Fuze Private Hosting, like any hosting, has additional costs associated with it, so this solution is probably less suitable for cost-sensitive and SMB deployments.
Design Question #2 - Using SD-WAN Packet Loss Mitigation (“PLM”) Features
Any network engineer will tell you that if you take an IP packet flow between point A and point B, and simply add IPsec encapsulation, that process will either degrade performance or do nothing; tunneling by itself does not improve performance, and can only hurt.
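One concrete cost of tunneling is per-packet overhead. The Python sketch below estimates how much a small voice packet grows inside an IPsec tunnel. It assumes ESP in tunnel mode over IPv4 with AES-GCM (8-byte IV, 16-byte ICV), no NAT traversal, and a G.711 packet carrying 20 ms of audio. Your vendor's cipher suite and encapsulation may differ, so treat the numbers as illustrative.

```python
# Assumed header sizes in bytes: ESP tunnel mode over IPv4 with AES-GCM.
OUTER_IPV4_HEADER = 20
ESP_HEADER = 8      # SPI (4) + sequence number (4)
AES_GCM_IV = 8
ESP_TRAILER = 2     # pad length (1) + next header (1)
AES_GCM_ICV = 16

# Assumed inner packet: G.711 with 20 ms of audio per packet.
RTP_PAYLOAD = 160
INNER_HEADERS = 12 + 8 + 20  # RTP + UDP + IPv4


def esp_tunnel_size(inner_packet_bytes):
    """On-the-wire size of an inner IP packet after ESP tunnel encapsulation."""
    # ESP pads so that payload + padding + trailer ends on a 4-byte boundary.
    pad = -(inner_packet_bytes + ESP_TRAILER) % 4
    return (OUTER_IPV4_HEADER + ESP_HEADER + AES_GCM_IV
            + inner_packet_bytes + pad + ESP_TRAILER + AES_GCM_ICV)


inner = RTP_PAYLOAD + INNER_HEADERS
outer = esp_tunnel_size(inner)
overhead = (outer - inner) / inner
print(f"inner: {inner} bytes, tunneled: {outer} bytes, overhead: {overhead:.0%}")
```

Under these assumptions a 200-byte voice packet becomes 256 bytes on the wire, so the tunnel costs extra bandwidth on every packet and gives nothing back in reliability on its own.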
How does SD-WAN then justify its tunneling approach as a solution for reliable transport? It uses a variety of Packet Loss Mitigation features. These techniques are not uniform between vendors; there is no IETF RFC standard. But generally speaking, common approaches include Packet Duplication, Connection Buffering, Tunnel Bonding, and TCP-like connections that allow for retransmission, even for UDP streams. Let’s look at an example below of Packet Duplication:
The above diagram shows a common deployment strategy, where redundant IPsec tunnels have their respective IKE gateways pinned to different ISPs or underlying connections, guaranteeing that tunnels will utilize diverse network paths. By then adding Packet Duplication, we dramatically increase the likelihood that at least one packet will arrive. Done correctly, this approach can deliver reliable, robust network transport.
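The arithmetic behind Packet Duplication is easy to sketch in Python. If a copy of every packet is sent down each tunnel and losses on the paths are independent, a packet is lost only when every copy is lost. The loss rates below are hypothetical, and the independence assumption is optimistic wherever the two ISPs share a last mile or an upstream, which is exactly why pinning tunnels to diverse paths matters.

```python
def residual_loss(path_loss_rates):
    """Probability that every copy of a packet is lost.

    Assumes one copy is sent per path and that losses are independent.
    """
    result = 1.0
    for p in path_loss_rates:
        if not 0.0 <= p <= 1.0:
            raise ValueError("loss rates must be between 0 and 1")
        result *= p
    return result


# Hypothetical loss rates for two ISP paths: 2% and 3%.
single = residual_loss([0.02])
duplicated = residual_loss([0.02, 0.03])
print(f"single path: {single:.2%}, duplicated: {duplicated:.2%}")
```

With these example figures, loss drops from 2% on the better single path to 0.06% with duplication across both, at the price of sending the matched traffic twice.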
However, and this is the most critical part, features such as Packet Loss Mitigation are applied selectively by nearly all SD-WAN vendors. Typically, “app-maps” or network ACLs must be defined to match traffic and select what is duplicated and what is not.
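To make the idea of an app-map concrete, here is a minimal Python sketch of first-match traffic classification. Everything in it is hypothetical: the rule names, the 203.0.113.0/24 documentation prefix (not a real Fuze range), and the port ranges are placeholders, and real SD-WAN products express these policies in their own configuration syntax.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class AppMapRule:
    name: str
    dest_prefix: ipaddress.IPv4Network
    protocol: str       # "udp" or "tcp"
    port_range: range
    duplicate: bool     # whether matched traffic gets Packet Duplication


# Hypothetical rules: the prefix and ports are placeholders, not vendor values.
RULES = [
    AppMapRule("ucaas-media", ipaddress.ip_network("203.0.113.0/24"),
               "udp", range(10000, 20001), True),
    AppMapRule("ucaas-signaling", ipaddress.ip_network("203.0.113.0/24"),
               "tcp", range(443, 444), False),
]


def should_duplicate(dest_ip, protocol, dest_port, rules=RULES):
    """Return the duplicate flag of the first matching rule, else False."""
    addr = ipaddress.ip_address(dest_ip)
    for rule in rules:
        if (addr in rule.dest_prefix
                and protocol == rule.protocol
                and dest_port in rule.port_range):
            return rule.duplicate
    return False  # unmatched traffic is tunneled without mitigation


print(should_duplicate("203.0.113.10", "udp", 12000))  # matches ucaas-media
print(should_duplicate("198.51.100.7", "udp", 12000))  # no rule matches
```

The detail to notice is the default: traffic that matches no rule gets no mitigation, so a missing or mistyped rule silently leaves voice and video unprotected.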
As a rule of thumb, Fuze recommends that any UCaaS traffic that is transported over SD-WAN tunneling must use some form of viable Packet Loss Mitigation. As mentioned in the beginning of this section, simple tunneling by itself can only degrade performance. For the SD-WAN vendors with whom Fuze has technology alliances, the recommended PLM features can be found in our Best Practices & Deployment Guides.
Special Considerations - Home Users
One of the primary pitfalls with an SD-WAN deployment is the handling of home users, who often use corporate VPN clients. As a general rule, Fuze recommends that IT administrators do not backhaul this traffic to an HQ, data center, or hosted gateway. Often this just moves the egress point further away, and from a PLM perspective, packet duplication and similar features are generally not available on VPN clients (they are performed by an appliance, which most home users do not have).
There may be some exceptions. For example, Fuze partners with Cato Networks, which manages its own VPN client, along with network transport, gateways, and a fully managed SASE solution. In specialized circumstances or deployments, where there is a specific plan for the home user base, it may be appropriate to tunnel all endpoint traffic, including UCaaS. But as a default, Fuze recommends a split tunnel or local breakout for home users to reach their UCaaS solution.
If you’ve gotten this far, thanks for reading! If you take anything away, it should be these 2 points:
- Where is your traffic egress point? It should be close to your UCaaS provider.
- Use PLM features for any UCaaS traffic over tunnel transport!
Rolls off the tongue, right? Keeping these points in mind will give you a solid foundation to begin your UCaaS & SD-WAN journey. As always, Fuze customers should feel free to engage their Fuze Sales Engineer for a deeper dive on how we tackle each of these questions.