The issue I’m having is the router works fine for a day or so, then all the traffic gets routed to WAN2 and WAN1 gets almost no traffic. This results in all sorts of problems with throughput, streaming services not working, etc. as the customer needs both WAN connections in order to do their work.
When this happens, if I reboot the router, it works fine for a day or two, and then the issue recurs.
What should I be looking for in order to diagnose the issue?
Can you send a copy of the outbound policy rules that you have on Network // Outbound Policy?
Check the WAN speed settings on Network // WAN (Upload Bandwidth and Download Bandwidth). I once entered the upload speed in place of the download speed, and the Peplink reduced the traffic it sent over that WAN because of a Least Used policy.
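To illustrate how a mis-entered Download Bandwidth could starve a WAN under a least-used style policy, here is a rough Python sketch. It is not Peplink's actual code: it assumes the policy simply picks the link with the most free download bandwidth (configured value minus current usage), and the function name and all the numbers are made up for the example.

```python
def pick_least_used(wans):
    """Return the WAN name with the most free download bandwidth.

    wans maps a WAN name to (configured_download_mbps, current_download_mbps).
    Assumption: "free" is simply configured minus current usage.
    """
    return max(wans, key=lambda name: wans[name][0] - wans[name][1])


# Hypothetical case: both lines are really 50 Mbps down, but WAN2 was
# entered with its 5 Mbps upload figure in the Download Bandwidth field.
MISCONFIGURED = {"WAN1": (50.0, 20.0), "WAN2": (5.0, 1.0)}

# Same usage, with WAN2's download bandwidth entered correctly.
CORRECT = {"WAN1": (50.0, 20.0), "WAN2": (50.0, 1.0)}

if __name__ == "__main__":
    print("misconfigured:", pick_least_used(MISCONFIGURED))
    print("correct:", pick_least_used(CORRECT))
```

With the wrong figure, WAN2 looks nearly full even when it is idle, so new sessions keep landing on WAN1.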
Not really - this router isn’t under warranty, so it’s running 5.4.9 build 1732, which is as far as it’s going. Considering it’s worked fine for years and this appears to be recent behavior, I’m thinking an upgrade shouldn’t be necessary.
Reading the docs on load-balancing, the router allocates traffic based on the assigned bandwidth of each WAN connection. From what I’m seeing, I’m thinking that if both WAN connections have the same bandwidth, then the router locks all the traffic to one WAN connection regardless of the actual usage on both WAN lines. This can result in one WAN line being overloaded while the other WAN line is empty of traffic.
This is the only explanation I can come up with, and it really pisses me off to think I’ve been using this router for so long and have been getting effective use of only one WAN line’s bandwidth when I’ve been paying for 2 all these years.
Normal Application Compatibility hasn’t changed much - I got a video running, and then flipped through some FB pages in rapid order. One line is almost saturated, the other line has almost no traffic.
Tried a disconnect on the offending WAN line and then a reconnect, the router reported a number of userid/password failures, with an eventual connection.
Called the ISP, and they saw an “old” connection hanging around, so what might’ve been happening was the router making multiple attempts to connect, and then somehow getting a login, while the ISP’s system still had the “old” connection open for some reason.
Checking the log file shows no failed attempts to log in, but there is a “no cable detected” error - although the router was connected to a modem the entire time.
I’ve put the router back to “load balanced” mode and it appears to be working fine now.
There needs to be more / better diagnostic information recorded and reported - particularly for failed login attempts.
Some services and sites depend on a persistent source IP. An HTTPS session must all go out the same WAN, because the server treats a changing IP within an HTTPS session as invalid. Major streaming services consist of many parallel HTTP connections that get replaced every x minutes, and all of these must come from the same IP, or the service views it as a user logon/session error. Same deal for Skype connections, and a lot more. Even with simple web surfing, a site using logon cookies could decide it’s a duplicated logon and reject it.
Load sharing only works 100% for simple tasks. The complex apps and services do their own kind of load sharing, by splitting up the session across multiple TCP connections. These servers will not accept these coming from different subnets.
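The persistence idea can be sketched in a few lines of Python. This is only an illustration of the concept, not how the Peplink implements it: the function name, the WAN list and the choice of a SHA-256 hash are all assumptions. The point is that the WAN is chosen from the client's address (and optionally the destination) rather than per connection, so every TCP connection in a session leaves through the same WAN.

```python
import hashlib

WANS = ["WAN1", "WAN2"]  # hypothetical link names


def pick_wan_persistent(src_ip, dst_ip=None, wans=WANS):
    """Pick a WAN from the source (and optionally destination) address.

    The same inputs always map to the same WAN, so parallel connections
    belonging to one session all share one public IP.
    """
    key = src_ip if dst_ip is None else f"{src_ip}->{dst_ip}"
    digest = hashlib.sha256(key.encode("ascii")).digest()
    # Use the first 4 bytes of the digest as a stable integer.
    return wans[int.from_bytes(digest[:4], "big") % len(wans)]


if __name__ == "__main__":
    # Five connections from the same LAN client land on the same WAN.
    for _ in range(5):
        print(pick_wan_persistent("192.168.1.25"))
```

Keying on the source address alone keeps everything from one LAN client on one WAN, which is the safer choice for streaming services that spread a session across several servers.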
You will need the HTTPS rule. You will need to add rules to converge streaming services onto one WAN IP. If each WAN and ISP has a CDN (Akamai, etc), then each WAN will give a different IP for DNS lookups to sites they host, and you will need to add rules to direct those connections to the appropriate WAN. To get really good results, you need to add the entire ASN IP subnet table, for each WAN, to your Peplink.
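As a rough picture of what those destination rules do, here is a small Python sketch that maps destination subnets to a WAN. It is not Peplink configuration syntax. The subnets are documentation-only placeholder ranges, not real CDN or ASN prefixes, and the rule list and function name are hypothetical.

```python
import ipaddress

# Hypothetical rule table: (destination subnet, WAN to use).
# Real rules would list the prefixes belonging to each ISP's CDN/ASN.
RULES = [
    (ipaddress.ip_network("192.0.2.0/24"), "WAN1"),
    (ipaddress.ip_network("198.51.100.0/24"), "WAN2"),
]


def wan_for_destination(dst_ip, rules=RULES, default=None):
    """Return the WAN of the first rule whose subnet contains dst_ip.

    Falls back to `default` (e.g. the normal load-balancing policy)
    when no rule matches.
    """
    addr = ipaddress.ip_address(dst_ip)
    for network, wan in rules:
        if addr in network:
            return wan
    return default


if __name__ == "__main__":
    for ip in ("192.0.2.10", "198.51.100.7", "203.0.113.1"):
        print(ip, "->", wan_for_destination(ip))
```

Rules are checked in order and the first match wins, so more specific subnets should go above broader ones.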
It’s a complicated task to get true load sharing and success with complex services.