FusionHub High Availability?

florentber · March 9, 2017, 12:31am

Hi,

I have recently encountered instability with my FusionHub deployed in vSphere:
the device becomes unreachable via InControl (it flips between online and offline) and after some time the tunnels cease to work while still appearing online.

It has happened 2-3 times in the last 2 months, and each time I ended up rebooting the VM (which is relatively quick) to solve the problem.

I am not sure where this comes from; it seems to occur more often since I upgraded the Balance routers to firmware v7.

Is there by any chance a possibility to deploy FusionHub in an HA setup, or do you recommend anything to proactively act on a software failure?

Cheers,
Florent

TK_Liew · March 9, 2017, 1:15am

May I know whether the problem occurred recently? If so, please open a ticket and let us know the date and time of the most recent incident.

Thanks.

MartinLangmaid · March 12, 2017, 3:15pm

Hi Florent,
I have always recommended configuring FusionHubs for critical systems in active/active pairs. That way the remote device monitors the health of the FusionHub (via the availability of the tunnel) and automatically routes traffic over the secondary FusionHub if the primary is unavailable.

An added benefit (depending on your topology) is that you can then have the FusionHubs in completely different datacenters, hosted by two different providers, so that if it's a provider issue (as it often is) the other FusionHub is still available.
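
The failover behaviour described above can be sketched roughly as follows. This is only an illustration of the decision the remote device makes, not Peplink code: the `Hub` class, the `tunnel_up` flag and `pick_hub` are hypothetical names, and the tunnel state is assumed to be the only health signal.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Hub:
    name: str
    tunnel_up: bool  # hypothetical flag: True while the tunnel to this hub is established


def pick_hub(primary: Hub, secondary: Hub) -> Optional[str]:
    """Return the hub to route through: primary if its tunnel is up, else secondary."""
    if primary.tunnel_up:
        return primary.name
    if secondary.tunnel_up:
        return secondary.name
    return None  # both tunnels are down, nothing to route over


if __name__ == "__main__":
    # Primary tunnel is down, so traffic moves to the secondary hub.
    print(pick_hub(Hub("fh-dc1", False), Hub("fh-dc2", True)))
```

With the hubs in two different datacenters, the same check covers a provider outage as well as a software failure on one hub.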

florentber · May 31, 2017, 3:08am

I am encountering a strange performance issue.
I have one Balance 380 router (firmware v7.0.0s072 build 2074) connected to a FusionHub (v6.3.2s002 build 1425).

VPN Analyzer gives me the following:
VPN WAN1 to FH (UP): 1.72 Mbps Tx Avg, 1.92 Mbps Tx Max, 2.12% Packet loss, RTT 38.35ms
VPN WAN2 to FH (UP): 5.97 Mbps Tx Avg, 7.34 Mbps Tx Max, 0.55% Packet loss, RTT 19.91ms

At the same time, if I run iPerf to another VM in the same vSphere, I get the following results:

WAN1
date && iperf -c 188.165.xxx.xxx -P 5 -r
Wed May 31 12:27:55 CEST 2017

Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)

Client connecting to 188.165.xxx.xxx, TCP port 5001
TCP window size: 45.0 KByte (default)

[ 8] local 10.42.0.60 port 35465 connected with 188.165.xxx.xxx port 5001
[ 6] local 10.42.0.60 port 35463 connected with 188.165.xxx.xxx port 5001
[ 7] local 10.42.0.60 port 35464 connected with 188.165.xxx.xxx port 5001
[ 3] local 10.42.0.60 port 35462 connected with 188.165.xxx.xxx port 5001
[ 9] local 10.42.0.60 port 35466 connected with 188.165.xxx.xxx port 5001
[ ID] Interval Transfer Bandwidth
[ 9] 0.0-10.1 sec 42.4 MBytes 35.3 Mbits/sec
[ 6] 0.0-10.1 sec 41.9 MBytes 34.8 Mbits/sec
[ 7] 0.0-10.1 sec 45.4 MBytes 37.7 Mbits/sec
[ 8] 0.0-10.1 sec 45.5 MBytes 37.6 Mbits/sec
[ 3] 0.0-10.3 sec 43.5 MBytes 35.4 Mbits/sec
[SUM] 0.0-10.3 sec 219 MBytes 178 Mbits/sec

WAN2
date && iperf -c 188.165.xxx.xxx -P 5 -r
Wed May 31 12:29:57 CEST 2017

Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)

Client connecting to 188.165.xxx.xxx, TCP port 5001
TCP window size: 45.0 KByte (default)

[ 6] local 10.42.0.60 port 35468 connected with 188.165.xxx.xxx port 5001
[ 3] local 10.42.0.60 port 35467 connected with 188.165.xxx.xxx port 5001
[ 7] local 10.42.0.60 port 35469 connected with 188.165.xxx.xxx port 5001
[ 8] local 10.42.0.60 port 35470 connected with 188.165.xxx.xxx port 5001
[ 9] local 10.42.0.60 port 35471 connected with 188.165.xxx.xxx port 5001
[ ID] Interval Transfer Bandwidth
[ 8] 0.0-10.0 sec 24.0 MBytes 20.1 Mbits/sec
[ 9] 0.0-10.1 sec 26.6 MBytes 22.0 Mbits/sec
[ 6] 0.0-10.2 sec 13.0 MBytes 10.7 Mbits/sec
[ 3] 0.0-11.0 sec 25.1 MBytes 19.2 Mbits/sec
[ 7] 0.0-11.0 sec 24.9 MBytes 19.0 Mbits/sec
[SUM] 0.0-11.0 sec 114 MBytes 86.6 Mbits/sec
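
As a sanity check on the two [SUM] lines above, the bandwidth can be recomputed from the transfer size and the interval. This is a small sketch that assumes iperf counts an MByte as 2^20 bytes and an Mbit as 10^6 bits (which matches the figures shown); `mbits_per_sec` is just a helper name for this example.

```python
def mbits_per_sec(mbytes: float, seconds: float) -> float:
    """Convert an iperf transfer (MBytes, assumed 2**20 bytes) and interval to Mbits/sec (10**6 bits)."""
    bits = mbytes * 2**20 * 8
    return bits / seconds / 1e6


if __name__ == "__main__":
    wan1 = mbits_per_sec(219, 10.3)  # WAN1 [SUM] line
    wan2 = mbits_per_sec(114, 11.0)  # WAN2 [SUM] line
    print(f"WAN1 {wan1:.1f} Mbits/sec, WAN2 {wan2:.1f} Mbits/sec, ratio {wan1 / wan2:.2f}")
```

The recomputed values land within rounding of the reported 178 and 86.6 Mbits/sec, so WAN1 carries roughly twice what WAN2 does in this test.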

Any idea what could cause the performance difference?

Cheers,
Florent

TK_Liew · June 1, 2017, 2:29am

The VPN Analyzer throughput test was done within the SpeedFusion tunnel.

florentber:

at the same time if I run Iperf to another VM in same vSphere, I get the following results:

It looks like this throughput test was done outside the SpeedFusion tunnel, since the iPerf client connected to a public IP.

SpeedFusion uses UDP, but your iPerf test is using TCP. Please open a ticket if you still have a problem.

HA13029 · December 4, 2017, 3:51am

Hi Martin,

Can you please explain a little how you perform HA using FusionHub?
On the server side, what is the default gateway (or which routes are defined)? There is no VRRP support on FH, if I'm not wrong.

Regards,

HA

MartinLangmaid · December 5, 2017, 3:26pm

Sure. It looks like this:

Remote devices have a primary PepVPN / SpeedFusion tunnel configured to one hosted Fusionhub node and a backup to a secondary Fusionhub. Remote sites are distributed across Fusionhubs. Fusionhubs have a PepVPN between each other. Job done.
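
One way to picture "remote sites are distributed across FusionHubs" is the sketch below. It assumes a simple round-robin split, which is only an assumption for illustration (the post does not say how sites are distributed), and `assign_hubs` plus the site and hub names are hypothetical.

```python
from typing import Dict, List


def assign_hubs(sites: List[str], hubs: List[str]) -> Dict[str, Dict[str, str]]:
    """Give each site a primary hub (round-robin) and the next hub in the list as backup."""
    if len(hubs) < 2:
        raise ValueError("need at least two hubs for a primary/backup pair")
    plan = {}
    for i, site in enumerate(sites):
        plan[site] = {
            "primary": hubs[i % len(hubs)],       # primary tunnel endpoint
            "backup": hubs[(i + 1) % len(hubs)],  # backup tunnel endpoint
        }
    return plan


if __name__ == "__main__":
    for site, hubs in assign_hubs(["site1", "site2", "site3"], ["fh-a", "fh-b"]).items():
        print(site, hubs)
```

Each hub ends up as primary for some sites and backup for the others, so both hubs carry traffic in normal operation.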

Stefano_Santandrea · June 28, 2018, 11:17pm

Hi Martin,

Can you please explain how the traffic is routed from the datacenter to the FusionHub?
Are there any specific FusionHub requirements to support this configuration?

Thank you

MartinLangmaid · June 29, 2018, 12:13am

Sure, all traffic at the datacenter is routed via the firewall appliances (using the Send all traffic via LAN setting on the FH).
So for one peer to route to another on the same node:

Peer1 → nodeA FusionHub → nodeA firewall → nodeA FusionHub → Peer2

Between nodes:
Peer1 → nodeA FusionHub → nodeA firewall → nodeA FusionHub → nodeB FusionHub → nodeB firewall → nodeB FusionHub → Peer3
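
The two paths above follow one rule, which can be written out as a small sketch. `hop_path` and its arguments are hypothetical names for illustration only; the code just lists the hops when every hub sends all traffic via its own node firewall.

```python
from typing import List


def hop_path(src_peer: str, src_node: str, dst_peer: str, dst_node: str) -> List[str]:
    """List the hops between two peers when each hub hairpins traffic through its node firewall."""
    # Traffic always enters the source node's hub and passes through that node's firewall.
    path = [src_peer, f"{src_node} FusionHub", f"{src_node} firewall", f"{src_node} FusionHub"]
    if dst_node != src_node:
        # Crossing nodes: the destination node's firewall is traversed as well.
        path += [f"{dst_node} FusionHub", f"{dst_node} firewall", f"{dst_node} FusionHub"]
    path.append(dst_peer)
    return path


if __name__ == "__main__":
    print(" → ".join(hop_path("Peer1", "nodeA", "Peer2", "nodeA")))
    print(" → ".join(hop_path("Peer1", "nodeA", "Peer3", "nodeB")))
```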

Since the firewalls are inline with all traffic I get really granular control at the firewall level as to which peer LAN devices can communicate with other peers, at an IP level but also a TCP level.

I can also add more public IPs to the firewall to provide inbound NAT over PepVPN to LAN devices connected using MAX routers on dynamic cellular IPs.

This lets me build complex multi-tenanted, multi-FusionHub deployments across multiple datacentres really easily. And I can add in SSL / OpenVPN / TINC / IPSEC VPNs from the firewalls back to the customer's corporate resources, add existing IPSEC remote sites (using 3rd party routers) or provide any type of client VPN access as an enhancement to PepVPN for remote site access.

Here is an example where we provide remote CCTV connectivity as a service to multiple CCTV companies across Europe. Firewall & routing rules let any remote CCTV system (using a MAX) connect to any of our hybridNET nodes whilst limiting access to the correct service provider / customer.

We effectively become a virtual Network operator for the CCTV service provider companies.

Stefano_Santandrea · June 29, 2018, 2:24am

Hi Martin,

Sorry, but I'm a bit confused. Maybe I have to explain my scenario:

I'm looking to deploy FusionHub in an HA cluster in the same DC (as we can do with the Balance) to achieve 100% SpeedFusion uptime for branch offices.
We are now routing the traffic to the branches with static routes to the FusionHub private IP address (172.16.x.x). The same IP is NATed to the internet with a public IP.

If we install a secondary FusionHub, how can we do HA, and how will we route the traffic to the cluster?
Maybe the cluster provides a virtual IP, like VRRP on the Balance?

Thank you
Stefano

MartinLangmaid · June 29, 2018, 3:01am

Hi Stefano.
FusionHub does not support traditional HA/VRRP. Since it is a virtual machine, you would use the hypervisor's capabilities to provide high availability and resilience for the VM itself, and build out an HA core network infrastructure from your hypervisor to and from the internet to guarantee its availability.

When building public cloud hosted environments I always deploy FusionHub in an active/active way, as detailed above, on different hosting platforms (i.e. AWS & Azure). I find it hard to justify the expense of HA appliances in a single location, compared to the versatility and improved disaster recovery capabilities of a pair of appliances hosted in two separate locations.

All Peplink devices support connecting to at least two remote peers via PepVPN/SpeedFusion for failover between two hub devices - whether they are deployed in active/active or active/failover.

In your case, where you have a single DC at a single location, I would suggest you run two separate FH appliances, a primary and a secondary. All remote devices create two tunnels, one to each FH. The secondary FH appliance has a higher metric set for the remote peers, and the remote peers also have a higher metric set for the secondary FH profile. Then use OSPF between the FusionHubs and the DC core router, so that the primary is used until it becomes unavailable, at which point OSPF updates send the traffic via the secondary.
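
The effect of the metrics can be illustrated with the sketch below. It is not OSPF and not a FusionHub configuration, only the "lowest metric among the routes still available" selection that the setup relies on; `Route`, `best_route` and the metric values 10 and 20 are made up for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Route:
    via: str     # next hop, e.g. the primary or secondary FH
    metric: int  # lower is preferred; 10 and 20 below are arbitrary example values
    up: bool     # whether the route is still being advertised


def best_route(routes: List[Route]) -> Optional[Route]:
    """Pick the available route with the lowest metric, or None if no route is up."""
    candidates = [r for r in routes if r.up]
    return min(candidates, key=lambda r: r.metric) if candidates else None


if __name__ == "__main__":
    routes = [Route("primary FH", 10, True), Route("secondary FH", 20, True)]
    print(best_route(routes).via)  # primary FH while it is up
    routes[0].up = False           # primary becomes unavailable
    print(best_route(routes).via)  # traffic now goes via the secondary FH
```

The same selection has to hold in both directions (on the remote peers and at the DC core router), which is why the higher metric is set on both sides for the secondary.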

Kindest,

Martin

adrummond · September 17, 2018, 8:38pm

Martin,

I’m trying to implement your suggestion “two separate FH appliances, primary and secondary. All remote devices create two tunnels one to each FH. The Secondary FH appliance will have a higher metric set for the remote peers, the remote peers will also have higher metric set for the secondary FH profile. Then use OSPF between the Fusionhubs and the DC core router so that the primary is used until it is unavailable at which point OSPF updates send the traffic via the secondary.”

How do you configure OSPF within the Fusion Hubs?
