Connection Failover Issue w/ SpeedFusion and Cell


#1

I have two Max HD2 units with a SpeedFusion bonded connection over low-latency 900 MHz radios and high-latency cellular connections. Because I’m streaming video, I have the cellular connection set to priority 2 and the radios set to priority 1 (otherwise the video is disrupted by packets that arrive very late). This works.

One unit is mobile, though, so when the radios reach maximum range the signal levels and data rates drop to the point that the video is too degraded to be useful. The problem is that the connection still passes the health check, so it doesn’t fail over to the cellular connection. This can create a region where the bonded connection works very poorly.

I looked at using WAN smoothing, cut-off latency, and suspension time after packet loss. None of these seemed to fail over from a priority 1 connection to a priority 2 connection while the priority 1 connection was still alive, even marginally.

Some things that helped: the radios (not Peplink) have a setting that refuses a client connection when it is below a set signal level, but that only helps after they disconnect. Still, it helps prevent the radios from cycling between connecting and disconnecting.

I modified the connections’ health check settings and the SpeedFusion link failure detection to detect the failure as soon as possible.

These have helped, but we still have the same underlying issue: a poor-quality connection at priority 1 will mess everything up until it either improves or fails completely.

Any suggestions for how to improve this?

Thanks


#2

It sounds like you are on the right track in diagnosing this. I assume the latency between the 900 MHz radios is very low when the signal is good, so I recommend experimenting with the latency cut-off setting.


#3

Thanks, but unfortunately this won’t work. The normal variability of the radio link’s latency is much greater than the increase in latency from poor signal quality, and I worry that in some environments this cutoff would kill a good radio connection.

I’m not sure what else could be done… maybe other health check options. That can be a problem too, because when the health check fails the WAN interface is disabled, so you can’t query the radios for their status. Still, if we could programmatically monitor the radios’ health (with my own code) and then programmatically disable the interface or this SpeedFusion connection, that could work. Is there a way to make these changes to these units?
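To make the idea concrete, here is a minimal sketch in Python of the decision logic only. It assumes the radio reports a signal level in dBm. The thresholds, `read_rssi`, and `set_wan_enabled` are hypothetical placeholders, since nothing in this thread establishes how to query the radios or how to toggle the HD2 interface. Using two thresholds plus a run of consecutive samples is meant to stop the link from flapping at the edge of range.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LinkGate:
    """Decides whether the radio WAN should carry traffic, with hysteresis."""
    disable_below_dbm: float   # assumed value, tune for your radios
    enable_above_dbm: float    # should be higher than disable_below_dbm
    samples_required: int = 3  # consecutive readings needed before switching
    enabled: bool = True
    _streak: int = 0           # readings in a row that argue for a switch

    def update(self, rssi_dbm: float) -> bool:
        """Feed one signal reading and return the resulting enabled state."""
        if self.enabled:
            wants_change = rssi_dbm < self.disable_below_dbm
        else:
            wants_change = rssi_dbm > self.enable_above_dbm
        self._streak = self._streak + 1 if wants_change else 0
        if self._streak >= self.samples_required:
            self.enabled = not self.enabled
            self._streak = 0
        return self.enabled


def run_once(gate: LinkGate,
             read_rssi: Callable[[], float],
             set_wan_enabled: Callable[[bool], None]) -> None:
    """Take one reading and call the (hypothetical) toggle only on a change."""
    before = gate.enabled
    after = gate.update(read_rssi())
    if after != before:
        set_wan_enabled(after)
```

Calling `run_once` on a timer would only touch the unit when the state actually changes. Whether the radios can still be queried while the interface is disabled is a separate problem that this sketch does not solve.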

Thanks


#4

Hi,

Do you mean that when the radio connection is poor, its latency is still better than the cellular connection’s? And is the radio connection connected to a WAN port of the HD2?


#5

Yes and yes. With a good-quality connection the radio link’s latency is typically about 5 ms. I’ve seen this periodically jump to 100+ ms, which may be due to changing transmission paths or perhaps changing coding schemes as the signal level changes. When the link’s signal is nearly lost the latency is perhaps 15 to 20 ms. The cellular connection is normally 125 to 200 ms.

If I set the latency cutoff to 20 ms on the radio link then I’m liable to lose the connection at times when it still has good signal strength.
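A small Python sketch of why a fixed cutoff misfires here. The sample lists are made up for illustration and only loosely follow the ranges quoted above; they are not measurements, and `fraction_cut` is a hypothetical helper, not anything the HD2 exposes.

```python
CUTOFF_MS = 20.0  # the cutoff value discussed above


def fraction_cut(latencies_ms, cutoff_ms=CUTOFF_MS):
    """Return the fraction of samples that exceed the latency cutoff."""
    over = [x for x in latencies_ms if x > cutoff_ms]
    return len(over) / len(latencies_ms)


# Illustrative samples only (assumed, not measured):
good_signal = [5, 5, 6, 5, 110, 5, 4, 120, 5, 5]        # ~5 ms with 100+ ms spikes
weak_signal = [15, 17, 20, 16, 18, 15, 19, 20, 16, 17]  # 15 to 20 ms near loss of signal

print(fraction_cut(good_signal))  # 0.2 -> the good link trips the cutoff on spikes
print(fraction_cut(weak_signal))  # 0.0 -> the degraded link never trips it
```

With these numbers the cutoff penalizes the healthy link and never catches the degraded one, which is the opposite of what is needed.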

The radios are connected to the WAN1 port of the HD2.

What I didn’t ask clearly in my question above was: is there some API that I can use to make these changes to the HD2 programmatically? How about via SNMP or some other mechanism?

Regards,


#6

Hi,

I think WAN Smoothing is a solution for your environment. Ensure you have sufficient bandwidth and data quota on the cellular connection.

  1. Please ensure you set both WAN1 and Cellular to priority 1 under WAN Priority (Dashboard) and under SpeedFusion WAN Connection Priority (Advanced > SpeedFusion > Select SpeedFusion profile > WAN Connection Priority).

  2. Enable WAN Smoothing (Advanced > SpeedFusion > Select SpeedFusion profile > PepVPN Profile > WAN Smoothing = Normal).