Failover-Time - HA vs 2 separate devices

KPS · September 3, 2018, 1:33am

Hi!

I am just doing some tests with a pair of Balance 710s as an HA pair. As there is no real session sync between the two devices, the failover is quite slow:

Slave-Device detects “no VRRP-packets for 3 seconds”
Slave-Device initiates IPs
Peer-VPNs do a restart, negotiate, set routes, etc.

→ >30s outage.

The alternative would be:

Configure two “independent” 710s
Connect every branch-office to both 710s

→ more config-work
→ faster failover

What do you think? How do you handle this?

Regards,
KPS

MartinLangmaid · September 6, 2018, 1:21pm

I always deploy active/active pairs of Balance devices in a single datacenter / HQ. VRRP is messy, cumbersome and antiquated IMO.

The only time I end up deploying devices as a VRRP HA pair is when the customer's compliance procedures dictate that an active/passive HA pair is needed and those procedures are too cumbersome to change.

KPS · September 7, 2018, 12:48am

Hi Martin!

@MartinLangmaid
OSPF routes? How long is the client offline if the “primary device” in the DC fails?
How do you configure the path costs? The same path costs from the branch office to both DC Peplinks, or different ones?

MartinLangmaid · September 7, 2018, 7:48am

OSPF routes, yes. The client is generally failing over between two active tunnels, so that’s almost immediate. Not convinced I have ever timed it, admittedly…
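If anyone wants to actually time it, here is a minimal sketch in Python. It assumes you already have a log of timestamped reachability probes (for example one ping per second from a branch client to a host behind the DC devices); the function name `longest_outage` and the sample format are made up for illustration and are not part of any Peplink tooling.

```python
from typing import Iterable, Tuple


def longest_outage(samples: Iterable[Tuple[float, bool]]) -> float:
    """Return the longest outage in seconds.

    Each sample is (timestamp_in_seconds, probe_succeeded).
    An outage is measured from the last successful probe before a run
    of failures to the first successful probe after it. Failures that
    never recover, or that come before any success, are not counted.
    """
    longest = 0.0
    last_ok = None      # timestamp of the most recent successful probe
    in_outage = False   # True while we are inside a run of failed probes
    for ts, ok in samples:
        if ok:
            if in_outage and last_ok is not None:
                longest = max(longest, ts - last_ok)
            last_ok = ts
            in_outage = False
        else:
            in_outage = True
    return longest


# Hypothetical probe log: two probes lost between t=1 and t=4.
probes = [(0.0, True), (1.0, True), (2.0, False),
          (3.0, False), (4.0, True), (5.0, True)]
print(longest_outage(probes))
```

The resolution is limited by the probe interval, so a shorter interval gives a tighter estimate of the failover time.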

Path costs are, as you would expect, higher for the DC-based Peplink in the secondary role. Of course you can also split the remote peers into two groups and make one DC Peplink the primary for one half and the other DC Peplink the primary for the other half, so the remote peers are distributed across the pair of DC-based Peplink devices.
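A rough sketch of that split as a planning helper in Python. The device names (`dc-a`, `dc-b`), the peer names and the cost values 10 and 20 are all placeholders I have picked for illustration; the only thing that matters is that the primary gets the lower cost. This just produces a table you would then enter by hand, it does not talk to any device.

```python
from typing import Dict, List, Tuple

PRIMARY_COST = 10    # hypothetical value: lower cost = preferred path
SECONDARY_COST = 20  # hypothetical value: higher cost = backup path


def assign_costs(
    peers: List[str],
    dc_devices: Tuple[str, str] = ("dc-a", "dc-b"),
) -> Dict[str, Dict[str, int]]:
    """Alternate the primary DC device across the sorted peer list.

    Returns {peer: {dc_device: cost}} so that roughly half of the peers
    prefer the first DC device and the other half prefer the second.
    """
    plan: Dict[str, Dict[str, int]] = {}
    for i, peer in enumerate(sorted(peers)):
        primary = dc_devices[i % 2]
        plan[peer] = {
            dev: PRIMARY_COST if dev == primary else SECONDARY_COST
            for dev in dc_devices
        }
    return plan


plan = assign_costs(["branch-1", "branch-2", "branch-3", "branch-4"])
for peer, costs in plan.items():
    print(peer, costs)
```

If one DC device fails, the peers that had it as primary fall back to the higher-cost tunnel to the other device, while the remaining peers are not affected at all.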