Failover / WAN smoothing not working

This was set up as a test to show how good this new tech is. However, with this failure I’m not sure the customer is going to want it, or that I would recommend it to anyone else.

Network setup: two sites, A and B, and a hub at the colo.
VPN is set up from A to colo to B.
Comcast is the main internet at both A and B.
Cell service is the backup internet at both A and B.

Here are the issues. Comcast is the main internet for site A.
Comcast is dropping huge amounts of packets. The default settings do not fail over unless I set the failover settings to:
Timeout 1
Health check 5
Health check retries 1
Recovery retries 20
However, even with them set like this it’s still flopping back and forth between Comcast and cell, which causes garbled voice.
So I tried to set up WAN smoothing and it does not work: I get garbled voice one way and a dropped packet once in a while.
However, if I set site A to use cell only for the VPN, all works fine.

For WAN smoothing, I tried setting it up for just A and B, which did not work. I then tried setting up WAN smoothing for A, B and colo, which still didn’t work.

Any config of WAN smoothing resulted in dropped packets through the VPN and garbled voice one way.

I tried failover with both DNS lookup and ping.
When set to DNS, no failover happened, because only 3 or so packets would drop and then 10 good packets would follow.
When set to ping with default settings, same issue.
When ping was set to the settings above it would fail over, but since Comcast would then have 10 to 20 good pings it would start flopping between cell and Comcast.
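
The flapping described above can be illustrated with a toy model. This is only a sketch and not Peplink's actual health-check logic: it assumes a link is marked down after `fail_retries` consecutive failed probes and marked up again after `recovery_retries` consecutive good ones. The function names and the loss pattern are made up for illustration.

```python
def simulate_health_check(probes, fail_retries, recovery_retries):
    """Return the link state (True = up) after each probe.

    probes: iterable of booleans, True meaning the probe got a reply.
    Assumed model: the link goes down after `fail_retries` consecutive
    failures and comes back after `recovery_retries` consecutive successes.
    """
    up = True
    fails = 0
    successes = 0
    states = []
    for ok in probes:
        if ok:
            successes += 1
            fails = 0
        else:
            fails += 1
            successes = 0
        if up and fails >= fail_retries:
            up = False
        elif not up and successes >= recovery_retries:
            up = True
        states.append(up)
    return states


def count_transitions(states, initial=True):
    """Count how many times the link changed state."""
    changes = 0
    previous = initial
    for state in states:
        if state != previous:
            changes += 1
        previous = state
    return changes


# Made-up loss pattern: 3 lost probes followed by 15 good ones, repeated 4 times.
pattern = ([False] * 3 + [True] * 15) * 4

sensitive = simulate_health_check(pattern, fail_retries=1, recovery_retries=10)
tolerant = simulate_health_check(pattern, fail_retries=5, recovery_retries=10)
sticky = simulate_health_check(pattern, fail_retries=1, recovery_retries=20)

for name, states in [("sensitive", sensitive), ("tolerant", tolerant), ("sticky", sticky)]:
    print(name, count_transitions(states))
```

With this invented pattern, the sensitive setting changes state on every burst, the tolerant setting never fails over at all, and the sticky setting fails over once and stays on the backup. In this model a short burst of loss followed by a run of good probes either goes unnoticed or causes flapping, depending on the thresholds.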

Hello @Jason.Smith,

WAN smoothing is applied to traffic being sent from the device, across the WAN connections. Do you know what the latency levels are like on each of the connections?

Which devices are being used at Site A and Site B and are they using Firmware 7.1.0? Is the “HUB” at Colo a FusionHub?

Thanks,

Steve

Hi Jason. Just a “side comment”: See my message to @seb82:

Comcast has become quite a problem, and they deny all responsibility. It appears to me that you have been diligent and thoughtful in how you are dealing with this, and I have no further suggestions in this regard, but I would welcome more information if/when you are able to resolve this.

Importantly, this message is not intended to detract from what @Steve.Taylor has told you.

Peplink health checks won’t work well if the ISPs randomly throw your packets in the “bit bucket.”

Rick

WAN smoothing can’t work miracles. For VoIP you need a good internet connection. In my experience running VoIP over cellular is not great, and your Comcast is going up and down like a ping pong ball. Peplink can’t make two bad connections into a single good one.

I was told WAN smoothing sends packets through both interfaces! So if one interface dropped a packet, it would be made up on the other interface. Also, if I set it to only cell there are no issues…
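
To put a rough number on that expectation: if every packet is duplicated across both links, and losses on the two links are independent (an assumption real links may not satisfy), a packet is lost only when both copies are lost. The loss rates below are invented for illustration and are not measurements from this setup.

```python
def combined_loss(loss_rates):
    """Probability that every copy of a packet is lost, assuming independent links."""
    result = 1.0
    for rate in loss_rates:
        result *= rate
    return result


comcast_loss = 0.10  # hypothetical loss rate on the wired link
cell_loss = 0.02     # hypothetical loss rate on the cellular link

print(f"single link: {comcast_loss:.4f}")
print(f"duplicated:  {combined_loss([comcast_loss, cell_loss]):.4f}")
```

So with duplication actually running over both links, loss should be lower than on either link alone, which is why garbled voice with smoothing enabled points to a configuration problem.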

This is true. So something in the configuration is wrong.

You need to grab diagnostic files from all three devices, enable Remote Assistance on them and get Peplink Engineering involved for review.

I did do this; however, I have yet to be contacted. I did get an email yesterday saying they would like to make changes, and I replied asking if we could do it today, but I never got anything back about it.

Okay, so as I was playing around with the settings on this device, I noticed that the cell was always in standby mode, and the real-time traffic view showed no activity on the cell.

So I put both WAN and cell in Priority 1 in the WAN connections.
I then put both WAN and cell as Priority 1 in the SpeedFusion VPN settings.
I now have activity on the cell and WAN at the same time.
Seems both need to be set to 1 in order for WAN smoothing to work.

Seems if either one isn’t set to 1 and you enable WAN smoothing, you start getting dropped packets and bad VoIP quality.

I still need to test voip quality but I’m guessing it’s all going to work now.

I never did hear back from support; it has been 5 days now.

Awesome! Glad you worked it out.

Let me explain the WAN priority settings. There are two places to set these, as you have discovered: on the dashboard and in the SpeedFusion profile itself.

The dashboard priority settings are the definitive statement of which WANs are available and in which order they should be actively used for outbound traffic. That outbound traffic might be load balanced across all the available links, or those links could be used in a SpeedFusion VPN profile.

The dashboard, then, lets you quickly see which WANs are healthy and decide which combination should be used in what priority.

The SpeedFusion priority settings dictate which currently available, active, healthy WANs should be used in which order for VPN traffic alone.

By mixing priorities on the dashboard and in the SpeedFusion profile you are able to have granular control of your available WANs and their usage for different traffic types.

Picture a scenario where you have a fixed line internet connection, a satellite link and a Cellular WAN, where you want all traffic to use the fixed line so long as it is healthy.
If the fixed line is down you want the SpeedFusion VPN to use the low latency cellular link and all other general internet traffic to use Satellite.

In the dashboard you would put the fixed line in P1 and the satellite and cellular connections in P2. In SpeedFusion you would have the fixed line in P1 and the cellular in P2; you would not use the satellite link…

You would then have an outbound policy for general internet access set to use a priority based policy of fixed line first then Satellite (so no cellular).

When the fixed line fails, the Priority 2 WAN links in the dashboard will become active. Since SpeedFusion has the cellular WAN in P2 it will rebuild the tunnel using the cellular WAN instead of the fixed line so new VPN sessions will now pass over cellular. All other internet traffic will failover to the satellite WAN.

If you want the SpeedFusion VPN to hot failover to the cellular link (that is, perform an uninterrupted failover), you would put the cellular WAN in P1 on the dashboard. Because of the outbound policy rule, the cellular WAN would not be used for general internet traffic, but since the WAN is active and healthy in the dashboard, SpeedFusion would build a tunnel over the cellular link in preparation for when failover occurs. However, since the cellular WAN is P2 in the SpeedFusion profile, it would not be actively used for VPN traffic.

When the fixed line fails, because we already have a SpeedFusion tunnel built on the cellular WAN, VPN traffic will be redirected at a packet level across to the cellular link and all active sessions would stay up.

As you can see, with traffic flow control in these three places (dashboard, outbound policy & SF profile) we can build some very clever rules for how WANs are used depending on their availability.
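
The interaction between the three places can be sketched in a few lines of Python. This is a simplified model of the behaviour described in this post, not Peplink's implementation: the function names are hypothetical, and it assumes that only the best dashboard priority group with a healthy WAN is active at any time.

```python
def active_wans(dashboard_priority, healthy):
    """Healthy WANs in the best (lowest-numbered) dashboard priority group."""
    candidates = [w for w in dashboard_priority if healthy.get(w, False)]
    if not candidates:
        return set()
    best = min(dashboard_priority[w] for w in candidates)
    return {w for w in candidates if dashboard_priority[w] == best}


def speedfusion_wans(active, sf_priority):
    """Active WANs that carry VPN traffic: the best SpeedFusion group among them."""
    usable = [w for w in active if w in sf_priority]
    if not usable:
        return set()
    best = min(sf_priority[w] for w in usable)
    return {w for w in usable if sf_priority[w] == best}


def internet_wan(active, policy_order):
    """First WAN in the outbound policy's priority order that is active."""
    for wan in policy_order:
        if wan in active:
            return wan
    return None


# The fixed line / satellite / cellular scenario from the post.
dashboard = {"fixed": 1, "satellite": 2, "cellular": 2}
sf_profile = {"fixed": 1, "cellular": 2}      # satellite not used for VPN
outbound_policy = ["fixed", "satellite"]      # no cellular for general internet

for fixed_ok in (True, False):
    healthy = {"fixed": fixed_ok, "satellite": True, "cellular": True}
    active = active_wans(dashboard, healthy)
    print("fixed healthy:", fixed_ok)
    print("  VPN uses:     ", sorted(speedfusion_wans(active, sf_profile)))
    print("  internet uses:", internet_wan(active, outbound_policy))

# Hot failover variant: cellular moved to P1 on the dashboard.
dashboard_hot = {"fixed": 1, "cellular": 1, "satellite": 2}
all_up = {"fixed": True, "satellite": True, "cellular": True}
print(sorted(active_wans(dashboard_hot, all_up)))  # tunnels can be built on both
print(sorted(speedfusion_wans(active_wans(dashboard_hot, all_up), sf_profile)))
```

In the hot failover variant both the fixed line and cellular are active on the dashboard, so a tunnel can already exist on cellular, while the SpeedFusion priorities keep VPN traffic on the fixed line until it fails.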

So this does bring up a new issue, if I understand this right.

The whole point of using this tech is to use cheap cell service as a backup to Comcast, so that it protects RDP and VoIP from site to site.

Before, we were using Comcast and a point-to-point fiber with an ASA to do failover. Comcast was the failover.

Config A
If I wanted to do just failover then I would put Comcast on 1 and cell on 2 in both. But this would cause issues when Comcast is flapping so this is no good.

Config B
Comcast and cell set up on Priority 1 on both interface and SpeedFusion, with WAN smoothing disabled.
If Comcast is flapping and I don’t have WAN smoothing enabled, it will start flopping between Comcast and cell. Will there be issues with VoIP quality, since the tunnel is built on both interfaces? I’m thinking yes, since Comcast is dropping packets and they are lost.
This would only be good if Comcast went down 100%; the only interruption would then be the single failover.

Config C
WAN smoothing: the only way to guarantee quality?
Comcast and cell set up on Priority 1 on both interface and SpeedFusion, with WAN smoothing enabled.
If Comcast is flapping then no packets will drop and there will be no voice issues. However, this uses up a lot of cell data.

Yes, you’re right that those are the broad-stroke configurations available, but they can be fine-tuned to be more efficient in response to your specific connectivity issues.

One of the big questions is how much packet loss is there on the Comcast? Is it regular and high frequency or irregular and low frequency? Can you share a screenshot of the speedfusion graph here with 5-10mins of results on it so we can see?
How much VoIP traffic is there? How many handsets / concurrent calls are you supporting?

It might make sense to turn WAN smoothing on for just the VoIP traffic, assuming that VoIP usage is low enough that you can afford for it to be replicated onto cellular continuously. Then you could adjust how sensitive SpeedFusion is to the Comcast link flapping by changing the suspension time after packet loss, so it doesn’t try to use the link immediately after a packet loss event…
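
Whether VoIP usage is "low enough" can be estimated roughly. The sketch below assumes G.711 with 20 ms packetization (160 bytes of audio plus 40 bytes of IP/UDP/RTP headers, 50 packets per second per direction) and that both directions of every call are replicated over cellular. It ignores SpeedFusion tunnel overhead and silence suppression, and the call counts and hours are placeholders, not figures from this site.

```python
def voip_bandwidth_bps(payload_bytes=160, header_bytes=40, packets_per_second=50):
    """IP-layer bandwidth of one direction of one call, in bits per second."""
    return (payload_bytes + header_bytes) * 8 * packets_per_second


def monthly_cell_usage_gb(concurrent_calls, busy_hours_per_day, days=22):
    """Approximate cellular data used per month if all VoIP is replicated onto it.

    Counts both directions of each call; GB here means 10**9 bytes.
    """
    bits_per_second = voip_bandwidth_bps() * 2 * concurrent_calls
    seconds = busy_hours_per_day * 3600 * days
    return bits_per_second * seconds / 8 / 1e9


# Placeholder figures: 4 concurrent calls, 2 hours of talk time per working day.
print(voip_bandwidth_bps())                      # bits per second, one direction
print(round(monthly_cell_usage_gb(4, 2), 3))     # GB per month
```

Plugging in the real number of handsets and concurrent calls gives a first idea of whether continuous replication fits the cellular plan.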

Are users complaining about RDP session drops, or is it the VoIP they are actually noticing? You could add a receive buffer to the RDP traffic, raising its latency slightly (anything less than 300ms latency for RDP is hardly noticeable) but giving SpeedFusion the time needed to re-transmit any packets that were lost over the Comcast link.

In fact, if it was me, I’d look to fix the VoIP traffic first (as that’s most noticeable to the end user) using wan smoothing for VoIP traffic only, then see how well RDP performs and adjust the receive buffer and suspension time after packet loss to improve that as stage 2.

Ultimately, though, that Comcast link needs looking at. Maybe they can find the cause of the packet loss if prodded hard? If not, an additional cheap wired link from another provider might be the way forward, with WAN smoothing turned on across both links, so you only use cellular if that new link fails.