Health Check Mechanisms Against Link Failure


#1

The objective of this article is to explain how the different health check methods work, and how to configure them.
Health checks verify an Internet link’s logical connectivity. Without them, problems such as ISP routing issues may go undetected, and failover may not perform smoothly.

Health Check Methods:

PING: The router will send an ICMP ping packet to the specified IP address (or host name) to test WAN connectivity.

DNS Lookup: The router will perform a DNS lookup to the specified DNS server.

HTTP: The router will perform an HTTP request to the specified URLs. Optionally, a string to match in the response can be specified.

SmartCheck: Available on Cellular/USB WAN. SmartCheck initiates when outbound traffic receives no response for 10 seconds. When SmartCheck initiates, it will run an ICMP health check.
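
To illustrate why the optional string match on the HTTP method can be useful, here is a minimal Python sketch of how the result of an HTTP check might be judged. The function name `http_check_passed` and the rule that any 2xx status counts as a success are assumptions for illustration only; the router's actual criteria are not described in this article.

```python
from typing import Optional


def http_check_passed(status_code: int, body: str, match: Optional[str] = None) -> bool:
    """Decide whether an HTTP health check response counts as a success.

    Assumption: any 2xx status means the URL was reached. This is an
    illustrative rule, not the router's documented behaviour.
    """
    if not 200 <= status_code < 300:
        return False
    # With no string configured, a successful response is enough.
    if match is None:
        return True
    # With a string configured, the response body must contain it.
    return match in body
```

With a match string set, a page served by something other than the intended site (for example a login or error page that still returns status 200) would fail the check, whereas a status-only check would pass.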

Configuring Health Checks

Health Check Settings are located at: Network > Interfaces > WAN > [WAN Connection Name].

Health Check Parameters:

Timeout: During any health check, the router will send a health check packet. The router will wait the specified number of seconds for a response before the health check is considered as failed.

Health Check Interval: This number specifies the period between each health check.

Health Check Retries: This number specifies the number of health check attempts the router will make. Upon reaching this number, the link will be considered as failed.

Recovery Retries: This specifies the number of successful health checks a failed link needs before the link is considered as up again.
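
The way Health Check Retries and Recovery Retries interact can be sketched as a small state tracker. This is a rough Python illustration, not the router's implementation: the class name `LinkHealth` is hypothetical, and it assumes the failures and successes must be consecutive, which the descriptions above do not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class LinkHealth:
    """Tracks whether a WAN link is considered up, based on health check results."""

    retries: int = 3           # failed checks in a row before the link is marked down
    recovery_retries: int = 3  # successful checks in a row before it is marked up again
    up: bool = True
    _fail_streak: int = 0
    _ok_streak: int = 0

    def record(self, check_ok: bool) -> bool:
        """Record one health check result and return the current link state.

        A check counts as failed when no response arrived within the Timeout.
        Assumption: streaks reset whenever the opposite result is seen.
        """
        if check_ok:
            self._ok_streak += 1
            self._fail_streak = 0
            if not self.up and self._ok_streak >= self.recovery_retries:
                self.up = True
        else:
            self._fail_streak += 1
            self._ok_streak = 0
            if self.up and self._fail_streak >= self.retries:
                self.up = False
        return self.up
```

One `record()` call would be made per Health Check Interval. Under these assumptions, with Retries set to 3 the link is marked down on the third failed check in a row, and with Recovery Retries set to 3 it is marked up again on the third successful check in a row.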


#2

Alan,

Is anything in the works for a speed-based failover option? For example, on the BR1, when the speed drops below a certain benchmark (say 600 kbps), it would fail over to the second SIM.


#3

The problem with this is that we would need to be constantly doing speed tests in the background and this would consume a huge amount of data that the customer would have to pay for.


#4

What about just using the Throughput rating, combined with past patterns of usage? The SURF could have a heuristic that decides when to switch, based on how many connections it has, or what the speed usually is to those IP addresses, and even how long it has been since a certain speed threshold has been passed.

Then the connection could be switched to the alternate, and the SURF Throughput monitored as another heuristic to decide whether that improved the speed, or if the connection should be switched back.

Even so, it should be up to the customer to decide whether to use a little more data for this feature, if it’s important to have a fast connection. Even the backup connection is costly. The amount of extra data used for this feature could be tallied and displayed in the control panel.


#5

Hi Alan,
We are using several Max BR1 units with Telekom as SIM 1 and Vodafone as SIM 2. Firmware is 6.3.1 build 2023.
Cellular settings are: both SIMs, no preference, stay connected. Health check: PING, hosts 1 and 2 Google, Timeout 3, Interval 10, Retries 3 and Recovery 3.

When the device starts, it first connects to Telekom, and when the health check fails it fails over to Vodafone.
This works well most of the time.

Now we have a device connected to Telekom with a very weak signal:
Signal Strength RSRP: -105 dBm RSSI: -78 dBm
Signal Quality RSRQ: -10 dB SNR: 17 dB
(or below: RSRP -110 dBm, RSRQ -12 dB). The device is visible in InControl2, but I’m not able to connect to Web Admin, and it doesn’t fail over to Vodafone because the health check still passes.
It’s not possible to really work over that connection.

Do you have a recommendation for the minimum signal values that are needed?
Is it possible to implement a health check based on configurable minimum signal values?
I don’t think a benchmark, as requested in another post, is necessary.


#6

Hello @HoRu,
Can you upgrade to firmware 7.1.0? The latest firmware lets you set a signal threshold level for failover, which may solve your problem.

Please keep us informed of how you go.
Happy to Help,
Marcus :slight_smile:


#7

Hi @HoRu,

I’ve made a quick video (needs a little more work), but I think this explains the Cellular Signal Threshold feature, available from firmware 7.1.0

Cellular Signal Threshold Video

Thanks,

Steve


#8

Hi Steve,

Thanks for your fast reply. It seems to be the feature I’m looking for. I have a test router at work and am currently updating it to 7.1.0 via cellular. It’s a good test for eventually rolling out new firmware to my other devices :slight_smile:

Will soon report what’s going on.

Sincerely

Holger


#9

Hi Marcus,

Thanks for your fast reply. I have seen the video made by Steve. It seems to be the feature I’m looking for. I have a test router at work and am currently updating it to 7.1.0 via cellular. It’s a good test for eventually rolling out new firmware to my other devices :slight_smile:

Will soon report what’s going on.

Sincerely

Holger


#10

Hello,

Does anybody know what values are used when selecting the threshold level via the bar diagrams?

KR,
Bert


#11

Nope. But you can set your own:


#12

I know, I’ve watched the movie :slight_smile:
However it would be useful to know the predefined values.


#13

Hi @Bert_Verhaeghe

I’ve found out the values used by the Threshold scale. For LTE / LTE-A (Primary band) we use RSRP, and for 3G we use RSSI. Here are the appropriate values for each “jump” on the scale:

[Image: threshold values for each step on the scale]

Hope this helps,

Steve