Add SIP Back-to-Back User Agent (B2BUA) to Improve VoIP Stability

We love our Balance 30, and we would like to see the addition of a SIP back-to-back User Agent (B2BUA) to improve the reliability of VoIP telephony during a WAN link failure. A B2BUA acts as the registrar for every device on the LAN and the device for every registrar on the WAN, effectively creating two call segments. When a WAN link fails, the B2BUA could send a re-register for all of its client devices to the “real” SIP registrar without the client devices ever noticing a problem. This would prevent the need to re-power or restart IP phones when a WAN link fails. If executed fast enough, it could also continue calls-in-progress with only minimal drop out at the moment of a WAN failure.

Keep up the great work!

Thanks, Trey. I haven’t had much experience with B2BUAs for SIP, but I think this is an element of the SIP protocol and is supposed to be implemented on the SIP server/client side?


Kurt, a B2BUA could be implemented just about anywhere in a SIP call path. Increasingly, B2BUAs or SIP proxies (google Kamailio) are being implemented in border devices like Peplink’s to prevent SIP sessions from failing when a WAN link goes down. Let me try to lay out the problem:

  1. When a SIP device such as a desk phone or softphone is powered on, it sends a REGISTER request to the SIP registrar or outbound proxy it’s configured to use for completing calls.
  2. If the SIP device is behind a “normal” single-WAN router, its UDP (and, if the NAT is worth its salt, TCP) packets will have their private LAN IPs rewritten to the public IP of the WAN link. This works well.
  3. When, however, a SIP device is behind a multi-WAN router, things can go wrong: when a WAN link fails, neither the device nor its registrar will immediately know that the public IP address the device is using has changed. This is partly because the NAT function conceals the broken link from the device, which doesn’t know it needs to re-register. Consequently, outbound media and signaling may continue to function, but inbound call signaling and media delivery can be interrupted. Alternatively, the device may lose all connectivity until it is re-powered and forced to register again.
  4. A B2BUA or SIP proxy fixes this problem by letting all SIP devices within a LAN register with a single, internal appliance, which then conceals underlying WAN topology changes. For example, the proxy (rather than the SIP provider’s server) would respond to the device’s REGISTER request, sending its own REGISTER request to the server. If the proxy is on the border device (such as a Peplink), it can know when a WAN link has failed and automatically send a new REGISTER request (a re-registration) for each device it’s serving to the service provider’s server, using the public IP of the new WAN link. In many cases, this can happen so quickly that calls-in-progress are affected only minimally, if at all.
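To make step 4 a bit more concrete, here is a rough Python sketch of the bookkeeping an edge B2BUA would need. It is only a toy model under stated assumptions: the class and method names (`EdgeB2bua`, `on_lan_register`, `on_wan_failover`) are made up for illustration, the addresses are documentation examples, and nothing here sends real SIP traffic or reflects how Peplink or Kamailio actually implement this.

```python
from dataclasses import dataclass, field


@dataclass
class EdgeB2bua:
    """Toy model of an edge B2BUA (hypothetical, for illustration only)."""
    wan_ip: str                                   # public IP of the active WAN link
    bindings: dict = field(default_factory=dict)  # AOR -> LAN contact of the device
    sent: list = field(default_factory=list)      # upstream REGISTERs we "sent"

    def _register_upstream(self, aor: str) -> None:
        # A real B2BUA would build and send a SIP REGISTER to the provider.
        # This sketch only records the (AOR, Contact) pair it would advertise.
        user = aor.split(":", 1)[1].split("@", 1)[0]
        self.sent.append((aor, f"sip:{user}@{self.wan_ip}"))

    def on_lan_register(self, aor: str, lan_contact: str) -> None:
        # The phone registers with the B2BUA, which registers on its behalf.
        self.bindings[aor] = lan_contact
        self._register_upstream(aor)

    def on_wan_failover(self, new_wan_ip: str) -> None:
        # The router knows the link changed, so it refreshes every binding
        # with the new public IP. The phones are never involved.
        self.wan_ip = new_wan_ip
        for aor in self.bindings:
            self._register_upstream(aor)


b2bua = EdgeB2bua(wan_ip="198.51.100.10")
b2bua.on_lan_register("sip:101@example.com", "sip:101@192.168.1.21")
b2bua.on_lan_register("sip:102@example.com", "sip:102@192.168.1.22")
b2bua.on_wan_failover("203.0.113.7")
print(b2bua.sent[-2:])
```

The point of the sketch is that the failover handler touches only the upstream side: the LAN contacts in `bindings` never change, which is why the phones would not need a reboot.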

Most existing solutions to this problem are either kludgy or aimed only at service providers, rather than small businesses. For example, there are a lot of SMB Enterprise Session Border Controllers on the market that implement a proxy or B2BUA (along with some other functions), but virtually ALL of them presume the existence of legacy analog or TDM voice links. No one (that I or my two outside IT guys have found) makes an all-IP device designed for small business VoIP. I know the SIP Express Router (now Kamailio) project has versions that can run on a DD-WRT install, which makes me think it should be straightforward to port that functionality to a more advanced embedded device like the Balance series.

I know our office would love to have this functionality: It would stop us from having to restart our phones every time a link goes down or recovers. Peplink would also have a tremendous cost advantage over all the expensive-to-implement TDM-based kludge-boxes.

Thanks for considering!

I agree with Trey. The clients that I have moved and/or will move to Peplink desperately need better SIP functionality. I deal with SMBs and have tried numerous multi-WAN routers looking for just this functionality. I can keep them on the net, but I can’t keep their phones up when an ISP drops out. The company that fills this need is going to have a substantial leg up on the competition!

I agree with Trey. I use a Peplink with VoIP service in my business and this would really improve the stability of the VoIP service.

Guys, we very much appreciate the feedback. This is well heard. We will investigate this seriously. Thanks! :slight_smile:


+1 for me as well! What Trey is suggesting would be a huge improvement and a welcome addition to our Balance 380!

+3 for us too! This would be huge for our Balance 380 and 2x Balance 20s.

Excellent feature. Definitely should be added.

+1 excellent feature…


Any update on this request?

We’ve had some interest from our members (~6,400 9-1-1 centers in the U.S. and Canada) in multi-WAN solutions for VoIP deployments. Haven’t seen any industry support yet, however.

Thanks!

Am I allowed to upvote this again?

+1 for any updates?

Trey,

Is this still an issue? I’m using a Balance 20 with dual wired WANs and 5 Cisco SPA504G IP phones registered directly to the provider. I’ve tested WAN failures, and my VoIP sessions continue after a brief interruption whose length depends on the health check settings.

Am I the only one whose SIP phones currently survive a WAN failover without disconnecting calls? It’s nice to be special.

Cyclops, can you share any details about your setup? What provider are you using, and do you have any special configuration? I’m sure other members of the community would love to know.


Sure. I actually switched from another dual WAN router to Peplink for this very reason. SIP failover between WANs simply didn’t work properly with my previous device.

-WAN1 60/4 cable HSI set to active. Static IP. Always on.
-WAN2 6/512k DSL set to active. DHCP. Always on.
-Five in-office Cisco SPA504G IP phones connected directly to provider (auto attendant, voicemail all hosted remotely)
-Anveo SIP trunking
-QoS highest priority for VoIP

That’s it. I tested it fairly crudely at first by pulling WAN cords with an active call on speaker and using a solid key tone. In the initial setup, with WAN2 set to backup, there was a delay of nearly 10 seconds before the call continued. When I set WAN2 to active as well, the delay before the call continued was 3-6 seconds.

Cyclops/Tim:
Calls-in-progress tend not to be the real issue, as SIP is pretty resilient to endpoint IP changes during a media session. The bigger issue is usually inbound calling after a change in external IP: the proxy or server at your provider’s end doesn’t have a way to detect the change in IP address, so it keeps sending INVITE requests to the non-working WAN IP until the phone re-registers, either because of a reboot or because its previous registration expired. Since the registration and call handling functions tend to be separate for many BroadWorks-based providers (the dominant VoIP switching platform), even if a call-in-progress adjusts to the change and stays up, the proxy/server won’t necessarily follow suit.

If your provider has set a very low interval between REGISTER requests, that might account for your not having any problems (assuming you don’t…it isn’t mentioned). However, most providers try to keep that interval as long as possible to reduce network overhead.

Having a proxy on the edge device solves all of these problems and can, in some circumstances, maintain internal communications during an all-WAN outage. It can also reduce your total bandwidth utilization, which is beneficial for metered or capped connections, by handling internal calls internally, without the involvement of your cloud-hosted IP-PBX provider.
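For anyone who wants to see the stale-registration problem in miniature, here is a small Python sketch of what the provider’s side looks like. It is an assumption-laden toy, not any provider’s actual logic: `ProviderRegistrar` and its methods are hypothetical names, times are plain seconds, and the 3600-second expiry is just an illustrative value.

```python
class ProviderRegistrar:
    """Toy registrar: maps an address-of-record to (contact_ip, expires_at)."""

    def __init__(self):
        self.bindings = {}

    def register(self, aor, contact_ip, now, expires=3600):
        # Each REGISTER overwrites the previous binding for that AOR.
        self.bindings[aor] = (contact_ip, now + expires)

    def route_invite(self, aor, now):
        """Return the IP an inbound INVITE would be sent to, or None."""
        binding = self.bindings.get(aor)
        if binding is None or now >= binding[1]:
            return None  # binding expired or never existed
        return binding[0]


registrar = ProviderRegistrar()
aor = "sip:101@example.com"

# t=0: the phone registers through WAN1 (198.51.100.10).
registrar.register(aor, "198.51.100.10", now=0)

# t=600: WAN1 fails. Nobody tells the registrar, so at t=700 an inbound
# INVITE is still routed to the dead WAN1 address.
print(registrar.route_invite(aor, now=700))

# t=3600: the phone finally re-registers, now through WAN2 (203.0.113.7).
registrar.register(aor, "203.0.113.7", now=3600)
print(registrar.route_invite(aor, now=3700))
```

Between the link failure and the next REGISTER, every inbound call in this model is sent to an address that no longer reaches the phone, which is the gap an edge proxy would close by re-registering immediately.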

Good information Trey, thanks for sharing!


Trey,

Thanks for the explanation. I’m afraid I’m still confused about how this isn’t a proxy issue. I understand how the proxy is a benefit for reducing overhead internally and providing several trunking options should provider 1 have temporary issues, but it seems that this then introduces this failure when the external IP changes (WAN failure). Unless the proxy fails, the phones don’t automatically reboot. Shouldn’t the proxy act as the B2BUA device?

What about another way to handle this issue? Could DNS SRV records for SIP be set up to handle inbound calls gracefully should link 1 fail? Or would that not apply in this situation?

I have the phones I manage set with 180-second registrations. I believe that when a WAN failure happens, the phones automatically re-register at the next interval, so there is a period of up to 3 minutes where inbound calls may fail.
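As a back-of-the-envelope check on that 3-minute figure, here is a tiny Python sketch. It assumes the router switches links instantly, the phone re-registers exactly once per interval, and the failure lands at a uniformly random point within the interval; the function name is just for illustration.

```python
def inbound_outage_window(register_interval_s: float) -> tuple[float, float]:
    """Return (worst_case_s, average_s) for which inbound calls may fail.

    Worst case: the link fails right after a REGISTER, so the stale
    binding lasts a full interval. Average: half an interval, assuming
    the failure time is uniformly distributed within the interval.
    """
    worst = register_interval_s
    average = register_interval_s / 2
    return worst, average


worst, average = inbound_outage_window(180)
print(f"worst case: {worst:.0f} s, average: {average:.0f} s")
```

Under those assumptions, a 180-second interval gives a worst case of 180 seconds and an average of 90 seconds, so the real-world number would be that plus however long the health check takes to notice the failure.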

Cyclops:
It sounds like you’re using an internal IP-PBX/softswitch, so you wouldn’t experience the problems this feature request is intended to deal with, anyway. For most of us who don’t have the in-house IT mojo to support that kind of configuration (e.g., an Asterisk box with SIP trunking), things work rather differently. We use a cloud-hosted softswitch provider (SimpleSignal). They handle all the provisioning, MACD, PSTN interconnect, etc. and serve our handsets directly. Same goes for other providers like Nextiva, 8x8, etc. In that configuration, with an off-site switch, you really need the B2BUA or proxy on the edge device, since there’s a lot more signaling (and relatively less media transfer) going on.

T