Failover for WAN Smoothing - MAX HD2 with SpeedFusion Cloud

I’m trying to configure WAN smoothing using 2 highly contended conventional WAN connections, plus 2 cell connections. I’m running 8.1.2 build 5114.

The (very) contended WANs are the best available but are not reliable and can have high latency. The high latency then pushes traffic over the cell connections and costs a lot of $ as we often have 3 or 4 HD video conferences running simultaneously.

I’ve tried various configurations based on forum research and trial and error.

Maximum level on same WAN. I had hoped that when set to off it would recognize that one half of the bond had failed and pull a lower priority cell connection in place. I haven’t been able to find any observable difference in performance.

This configuration consumes lots of cellular data when WAN latency is high

This configuration requires both WANs to hard fail and wait about 15 seconds for the health check to fail (I know that I can reduce the health check trigger time) before a cellular connection is used.
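As a rough illustration of where a delay of about 15 seconds can come from, the sketch below multiplies a health check interval by the number of consecutive failures needed to mark a WAN as down. The values 5 seconds and 3 retries are assumptions chosen to match the delay mentioned above, not settings read from this device, and the function name is hypothetical.

```python
def detection_time_s(interval_s: float, retries: int) -> float:
    """Approximate time before a WAN is declared down.

    Simplified model: the link is marked down after `retries`
    consecutive failed health checks, spaced `interval_s` apart.
    Probe timeouts and scheduling jitter are ignored.
    """
    if interval_s <= 0 or retries < 1:
        raise ValueError("interval must be positive and retries at least 1")
    return interval_s * retries


# Assumed values: 5 s interval, 3 retries (roughly the 15 s seen here).
print(detection_time_s(5, 3))
# A shorter interval with the same retry count fails over sooner.
print(detection_time_s(2, 3))
```

Under this model, shortening the interval or lowering the retry count reduces the wait, at the cost of more false failovers on a link whose latency bounces around.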

Please let me know how to improve the set-up.

Thanks
David

Max level on same WAN just decides whether redundant packets are sent on the same link or not.
If the connections are truly FIOS and Spectrum, you should only see high latency if you are trying to send more than your circuit is provisioned for, at which point the provider rate-limits you by dropping packets.
WAN smoothing would make this worse, as it increases the amount of data transmitted.
Are the WAN links being overloaded? Maybe set some bandwidth rules up?
What do you see on your SpeedFusion graphs? Are you dropping packets?
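To make the point about WAN smoothing adding load concrete, here is a minimal sketch of the arithmetic. It assumes each packet is sent a fixed number of times (`copies`); the 10 Mbps payload and 15 Mbps uplink are illustrative numbers only, not measurements from this thread, and the function names are hypothetical.

```python
def smoothed_load_mbps(payload_mbps: float, copies: int) -> float:
    """Total traffic on the wire when every packet is sent `copies` times."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return payload_mbps * copies


def is_overloaded(load_mbps: float, capacity_mbps: float) -> bool:
    """True when the offered load exceeds what the link can carry."""
    return load_mbps > capacity_mbps


payload = 10.0  # assumed real-time traffic, Mbps
uplink = 15.0   # assumed usable uplink capacity, Mbps

for copies in (1, 2):
    load = smoothed_load_mbps(payload, copies)
    print(copies, load, is_overloaded(load, uplink))
```

With these assumed numbers, the unduplicated traffic fits in the uplink but the duplicated traffic does not, which is the situation where redundancy starts causing the drops it is meant to hide.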

Thanks @Jonathan_Pitts for the quick response.

Both WANs are configured in line with specs below. Would you suggest reducing these?

We’re typically using <25Mbps up and down at peak, so well under the total theoretically available. The location is sort of in the middle of nowhere and, as I said, very contended.

FIOS is a gigabit connection (tests at ~500Mbps around 3am but drops to ~100Mbps during the day).

Spectrum is the latency “problem child”. Latency bounces around and rises to 70+ms during the daytime. It is specified at 400Mbps down and 20Mbps up (full bandwidth in 3am tests, but can be down to 5Mbps during the day).

There is some packet loss as shown on the attached current snapshot on a “quiet” Friday morning.

Any other suggestions very welcome.

Thanks again
David

Try changing Packet Fragmentation to use the DF flag.
Re-test and see if it’s any better.

I see multiple tunnels on here, but not mentioned in the problem.
What are the tunnel results on high FEC with WAN smoothing off?
I’m really concerned about so many packets out of order; it seems odd.
What type of data is being sent? Are you using the built-in speed test to generate the data?
Do you have the upload/download set to your circuit speed on each WAN?

@Jonathan_Pitts Thanks again for the rapid response.

The team has wound down for the weekend, so I’ve set the df flag and will see on Monday.

I don’t use FEC on video conference traffic as it’s not recommended. Should I try that?

The traffic over “WAN smoothing” is primarily video conferencing and VoIP. There’ll be some email in there as well, as I’ve been unable to segregate Teams from general Microsoft traffic.

The WANs are set as shown.


Thanks again.

Regardless of traffic type, sometimes you have to explore other options to overcome issues with troublesome WANs :) . We offer VoIP primarily and have found that FEC high, with a primary and a secondary selected for WAN type, works best for us.
Video conferencing is just more data on a real-time application, so I see no reason why it wouldn’t work.
Another thought would be to limit the bandwidth of the tunnel, so that the video conference hopefully auto-tweaks the compression and/or the codec selection based on the amount of bandwidth available.

How many conferences are happening at once, is it Zoom or something else?

Jonathan

I’ll try FEC as well next week and report back.

Traffic is a mix of Zoom, Microsoft Teams, and Webex. Rarely less than 1 conference, typically 2 or 3. Rarely 5 or 6.

Do you have any ideas on failovers so that one of the cells will kick in when either FIOS or Spectrum goes down?

Best
David

Use the priority on the SpeedFusion profile:

1 FIOS
2 Spectrum
3 Cell 1
4 Cell 2

or if you prefer
1 FIOS
2 Cell 1
3 Spectrum
4 Cell 2
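The effect of a priority list like the ones above can be sketched with a simplified model: use only the healthy links in the best (lowest-numbered) priority group, and fall through to the next group when that one has no healthy link. This is an illustration of the idea, not Peplink's actual implementation; `PRIORITIES` and `active_links` are hypothetical names.

```python
from typing import Dict, List

# First ordering from the list above (lower number = preferred).
PRIORITIES: Dict[str, int] = {
    "FIOS": 1,
    "Spectrum": 2,
    "Cell 1": 3,
    "Cell 2": 4,
}


def active_links(priorities: Dict[str, int], healthy: Dict[str, bool]) -> List[str]:
    """Return the healthy links in the best priority group that has any.

    Simplified model: links sharing a priority number form one group and
    are used together; lower groups are used only when every link in the
    higher groups is down.
    """
    up = [name for name in priorities if healthy.get(name, False)]
    if not up:
        return []
    best = min(priorities[name] for name in up)
    return sorted(name for name in up if priorities[name] == best)


# FIOS has failed its health check, everything else is up.
state = {"FIOS": False, "Spectrum": True, "Cell 1": True, "Cell 2": True}
print(active_links(PRIORITIES, state))
```

In this model, giving two links the same priority number makes them active together, while a cell link with a higher number stays idle until everything above it is down.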

I don’t know how your tunnel 1 is set up, but in general that’s your primary tunnel,
so it can also depend on which direction the tunnel traffic flows.
Is it coming in on FIOS or Spectrum, or going out on all of them to the remote site?
Which site has the static IP, or is it specified on each site? Which is primary?
What are you connecting to: a FusionHub? Another HD2?

What do your WAN quality graphs look like from IC2 for various time periods?
When in use, and when idle?

So, I think that I’ve made good progress.

I tried FEC but this was not well received by my colleagues on Teams. It didn’t seem to impact Webex or Zoom but Teams “stuttered”.

I then did some simplification and went from 4 simultaneous connections to SpeedFusion Cloud (2 x wired WANs and 2 x cell) to the 2 wired WANs, with the cells on hot standby as shown below, so that they kick in as a bonded connection to SpeedFusion Cloud when both wired WANs are out.

I discovered that with “Maximum Level on Same Link” set to Normal (as above), when one of the wired WANs wasn’t performing, the other WAN compensated for it. You’ll see 2 “glitches” on WAN2 below that were compensated for on WAN1, and my colleague on Teams had no idea that there had been an issue. Wonderful … thank you Peplink!!!
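One way to see why a glitch on one WAN can go unnoticed: if a copy of each packet travels over each wired WAN, a packet is lost only when both copies are lost. The sketch below works that out under the assumption that losses on the two links are independent; the loss rates are made-up illustrative values, not figures from the graphs, and `combined_loss` is a hypothetical name.

```python
def combined_loss(loss_a: float, loss_b: float) -> float:
    """Probability that both copies of a packet are lost.

    Assumes one copy per link and statistically independent losses,
    which real links sharing upstream infrastructure may not satisfy.
    """
    for p in (loss_a, loss_b):
        if not 0.0 <= p <= 1.0:
            raise ValueError("loss rates must be between 0 and 1")
    return loss_a * loss_b


# Illustrative values: 5% loss on one WAN during a glitch, 2% on the other.
print(f"{combined_loss(0.05, 0.02):.4%}")
```

The benefit shrinks when both links degrade at the same time, since the independence assumption no longer holds.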

This configuration also handles peaks in traffic way better than my previous set up of 4 bonded connections with negligible packet loss.

I’ve simulated simultaneous failure of both wired WANs and the hot standby cell WANs kick in seamlessly.

@Jonathan_Pitts thank you so much for your suggestions that pointed me to the current set up.

David
