HA/VRRP tug o' war

Hi everyone. I’m back again, this time with a question about the HA/VRRP setup on my Balance 380s.

I am currently running two Balance 380s in an HA/VRRP config. Both 380s are the same hardware version, are running the exact same firmware, and (for the most part) are operating as expected. However, I am experiencing a small issue with how each unit handles the master/slave role, specifically upon a loss of internet connectivity.

I have two WAN connections in my particular setup: one is a cable modem (WAN-1), and the other is a cellular modem (WAN-2). Also worth mentioning is that the cellular modem is a Sierra Wireless ES450 with a physical Ethernet port, not a USB modem connected to one of the USB ports on the router.

WAN-1 (cable modem) is connected to a Netgear 5-port smart switch, which then feeds the WAN-1 port on both 380s. WAN-2 (cellular modem) is connected to its own Netgear 5-port smart switch, which then feeds the WAN-2 port on both 380s. In the past, I have experienced issues with both the cable modem and the cellular modem becoming locked up for various reasons. To avoid a total loss of connectivity while I am away, I have a schedule running on one of my PDUs that power cycles both modems, along with their respective 5-port switches, every day during the early morning hours.

This morning I noticed that my cable modem went offline sometime during the night and was apparently unable to recover with the scheduled power cycle. This left the cellular modem being the sole gateway for the entire network. When the PDU schedule cycled the modems at 4 AM, both routers began to transition back and forth, from master to slave and back again. According to the logs, this behavior continued for the next two and a half hours. This led me to believe that the cellular modem might not have come back online immediately, but I was able to confirm that this was not the case.

In the past, I have seen these routers fight over who gets to be the master. The problem is that during this tug o’ war between the two, there is no network connectivity, and managing the router(s) via the VIP is impossible. I tried setting the “preferred role” to master on my main/primary router, but that causes another unnecessary transition, and possibly another game of tug o’ war, whenever the main/primary router becomes available again while the secondary/backup router is acting as master.

From what I understand, periodic VRRP advertisement packets are sent out from the master device to a VRRP-specific IP multicast address via the LAN port. If both units are never disconnected from the LAN, and both units are aware of the other’s connectivity on the WAN ports, why would they fight over the role of master?
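For reference, the timing rule that standard VRRPv2 (RFC 3768) uses to decide when a backup takes over can be sketched as below. This is only an illustration under assumptions: the priorities, the 1-second advertisement interval, and the helper names `master_down_interval` and `takeover_times` are hypothetical, and nothing here is taken from the Balance firmware or its logs. It shows that a backup learns the master is alive only from advertisements, so a gap longer than the master-down interval is enough to trigger a promotion, and a swap back once advertisements from the higher-priority unit resume.

```python
def master_down_interval(priority: int, advert_interval: float = 1.0) -> float:
    """Seconds of silence after which a VRRPv2 backup declares the master down.

    RFC 3768: Skew_Time = (256 - Priority) / 256
              Master_Down_Interval = 3 * Advertisement_Interval + Skew_Time
    """
    if not 1 <= priority <= 255:
        raise ValueError("priority must be in the range 1..255")
    skew_time = (256 - priority) / 256.0
    return 3 * advert_interval + skew_time


def takeover_times(advert_times, priority, advert_interval=1.0):
    """Return the moments at which a backup would promote itself to master.

    advert_times: times (in seconds) at which the backup actually receives
    an advertisement from the higher-priority master. Simplification: the
    backup is assumed to fall back to the backup role as soon as it hears
    the master again.
    """
    timeout = master_down_interval(priority, advert_interval)
    takeovers = []
    deadline = timeout  # timer starts at t = 0
    for t in sorted(advert_times):
        if t > deadline:
            # The timer expired before this advertisement arrived.
            takeovers.append(deadline)
        deadline = t + timeout  # every advertisement restarts the timer
    return takeovers


if __name__ == "__main__":
    # Hypothetical trace: advertisements stop arriving twice for ~5 seconds.
    received = [1, 2, 3, 8, 9, 10, 15]
    print(master_down_interval(100))         # 3.609375
    print(takeover_times(received, 100))     # [6.609375, 13.609375]
```

In this made-up trace the backup promotes itself twice, each time a little over 3.6 seconds after the last advertisement it heard. Whether anything similar happened here can only be confirmed from the device logs.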

Has anyone else experienced this issue and if so, what was your solution to fix it?

@User187

Please open a support ticket so the support team can check the issue. There are too many factors that can influence this, and it is not easy to investigate without reviewing the device logs. Please make sure you include the following in the ticket so that the support team has the information it needs to look into this.

  1. Diagnostic report taken when the issue happens
  • Please make sure you include the timestamp of when the issue happened.
  2. Physical connections for the HA setup
  • This is very important, as most HA flapping issues are caused by network influences.
  3. Background info for the issue
  • First time it has happened
  • Intermittent
  • Can the issue be reproduced?
  • Other
  4. How did you recover from the issue?