DNS problems / failures with new Peplink 310

I recently configured a Peplink 310 for a client of mine located in Suriname, in South America. Since then we’ve had persistent, but intermittent, DNS failures and timeouts. I suspect something is wrong with the Peplink, but can’t rule out coincidental issues with DNS within the country. (On the other hand, we’ve not seen DNS problems there in the past 3 years, only since the Peplink went into place.)

The config is a bit unusual on the WAN2 port. I cannot reconfigure the internet router (DSL) to do an unnumbered connection (no NAT), so I’ve set up the Peplink to do IP Forwarding on the WAN2 connection. I put a 192.168.12.x/24 address on WAN2 and on the LAN port of the WAN2 router. All NAT is done in the WAN2 router, and that router has only had its private IP address changed (to match the WAN2 port). The WAN2 router is doing a lot of port forwarding of inbound connections to a firewall on the inside of the Peplink, and I didn’t want to redo all of those settings to forward to addresses on the WAN2 port of the Peplink. I just put a static route on the WAN2 router that points to the WAN2 address of the Peplink to reach the LAN side of the Peplink, so inbound data comes in to the exact same addresses it did before. Inbound traffic is working fine on the WAN2 side. Essentially I put the Peplink in between the old router and the old firewall and gave the LAN side the address of the old router so I didn’t have to redo the firewall addressing.

The WAN1 port is configured more typically. It is doing NAT and has multiple public IP addresses configured. Static NAT is configured on the Peplink and several inbound connections are defined which point to addresses on the firewall connected on the LAN side. (That firewall is doing dynamic and static NAT and has been working fine for years).

OK - symptoms that occurred with the installation of the Peplink. Almost immediately we started seeing failures when sending email out (550 host unknown). Browsing (through a proxy) became slow as well. The firewall, which points to two internal DNS servers and the DNS proxy on the Peplink, shows intermittent but frequent DNS ‘down’ status.

I started testing from the inside using NSLOOKUP. I eventually determined that it wasn’t possible to get DNS working to any servers outside the ISP’s own DNS servers on the WAN2 side, and connections to the ISP’s DNS servers themselves were very unreliable at best. This was not true before inserting the Peplink between the ISP’s router and the firewall.
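For anyone who wants to put a number on "very unreliable" rather than eyeballing NSLOOKUP output, here is a rough Python sketch of the same kind of test. It is not something I ran on site. It builds a plain A-record query by hand, sends it over UDP to a given server, and counts how many attempts fail or time out. The function names, the query ID, the 2-second timeout and the 20-attempt default are all made up for illustration.

```python
import socket
import struct
import time

QUERY_ID = 0x1234  # arbitrary ID used to match responses to our query


def build_query(hostname, query_id=QUERY_ID):
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (recursion desired), 1 question, no other records
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label prefixed by its length, ended by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN)
    return header + qname + struct.pack("!HH", 1, 1)


def parse_header(response):
    """Return (query_id, rcode, answer_count) from a DNS response header."""
    query_id, flags, _qd, ancount, _ns, _ar = struct.unpack(
        "!HHHHHH", response[:12]
    )
    return query_id, flags & 0x000F, ancount


def probe(server, hostname, timeout=2.0):
    """Send one query to `server`; return (ok, elapsed_seconds)."""
    packet = build_query(hostname)
    start = time.monotonic()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(packet, (server, 53))
            data, _addr = sock.recvfrom(512)
        except OSError:  # covers timeouts and unreachable networks
            return False, time.monotonic() - start
    elapsed = time.monotonic() - start
    if len(data) < 12:
        return False, elapsed
    query_id, rcode, ancount = parse_header(data)
    # Success means: our ID came back, no error code, at least one answer
    return (query_id == QUERY_ID and rcode == 0 and ancount > 0), elapsed


def failure_rate(server, hostname, attempts=20):
    """Fraction of attempts (0.0 to 1.0) that failed or timed out."""
    failures = sum(1 for _ in range(attempts) if not probe(server, hostname)[0])
    return failures / attempts
```

Calling something like failure_rate("8.8.8.8", "example.com") from a machine behind the Peplink, and again from a laptop plugged in ahead of it, would give two figures to compare. The hostname here is just a placeholder.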

On the WAN1 side, DNS was working better. WAN1 is a faster link, but a new ISP, and we have no history of reliability.

Eventually I found that configuring the DNS proxy on the Peplink gave the least unreliable results, and I enabled the option to forward to Google DNS servers in the event of failure of the WAN1 and WAN2 DNS servers. Looking at Status, Active Sessions on the Peplink sometimes shows 20-25 DNS connections to 8.8.8.8 (Google), indicating that the local DNS services have failed. When I see these connections, it generally means that DNS is failing completely, and the connections to 8.8.8.8 are just attempts at DNS rather than successful lookups.

I’ve tried using Service Forwarding of DNS to the proxy both on and off. I’ve configured the internal DNS servers to point to the Peplink LAN address as the main forwarder.

Due to DNS not working at all through WAN2, I’ve set up an Enforced Outbound Policy to send all DNS requests out the WAN1 side.

I’ve hooked up a test laptop on the WAN1 side, and done some preliminary DNS testing there. DNS seems fine when not going through the Peplink on the WAN1 side.

We’ve now had a week of frustration with bounced emails and slow browsing. I can’t think of anything else to try on the Peplink, and I’m wondering if a firmware revision is the only fix. The firmware (Balance 310) on the Peplink now is 5.3.12 build 1150.

Craig

I recommend starting over with a clean config. It looks like you have made a lot of changes already, and it’s hard to tell at this point which one is causing the problem.

WAN2 does not need IP forwarding mode, and I recommend drop-in mode for that WAN. Here are more details:

Problem solved, though I never understood fully why it failed to work in the configuration I had. The key was to enable NAT on both interfaces, at which point DNS started working without a problem (also ICMP). With IP Forwarding on one interface and NAT on the other, something just wasn’t right. I tried pinging a host on the internet, and with WAN2 enabled, it failed. With WAN2 disabled it worked. Enabled WAN2 again, and ping failed again. Changed to NAT on WAN2, and it worked again.

So I left NAT on, and then went through the exercise of setting up a bunch of NATs and services (port forwarding) on WAN2 and reconfiguring port forwarding on the WAN2 router so inbound worked again.

Along the way, I discovered a couple of disturbing things about Peplink. First, although I repeatedly checked for new firmware in the Balance 310, it always said no update was found. This was in spite of the unit running 5.3.12 when 5.4.7 had been released some time ago. I manually updated the firmware.

During the manual update, I discovered that the downloaded update file contained two .BIN files - with no explanation whatsoever in the readme as to which one to use or even how to install. Searching on the internet uncovered a mention that the versions corresponded to the hardware version of the Balance 310, so I chose the correct one and went about my business. The firmware update made no difference to the problem.

After setting up NAT on both interfaces, I was still surprised to see that all the HTTP browsing was being done on the slower WAN link. (512/256k versus 4/2M). In case the 310 didn’t understand K versus M, I changed the 4/2 to 4000/2000k, but still all HTTP was on the slower link. I could not find any documentation on what sort of load balance algorithm is used by default on the Balance 310. Eventually I just put in an Outbound Policy for HTTP and gave it a 90/10 ratio. That worked, but I’d like to know what the 310 is supposed to do by default. Ratio by bytes received? Packets received? Alphabetical order of the link name? Currently it’s a minor mystery.
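To make the 90/10 figure a bit more concrete, here is a small Python sketch of what a weight-based split does. This is purely an illustration and not a description of the Balance 310's actual algorithm, which I still don't know. It assumes the faster link gets the 90, uses "WAN1" and "WAN2" as placeholder names for the faster and slower link, and uses a generic smooth weighted round-robin that I picked for the example. It also shows that splitting by configured downlink speed (4000k versus 512k) would land close to the same ratio.

```python
def bandwidth_shares(downlink_kbps):
    """Share of traffic each link would get if split by downlink speed."""
    total = sum(downlink_kbps.values())
    return {name: rate / total for name, rate in downlink_kbps.items()}


def assign_sessions(weights, sessions):
    """Assign sessions to links with smooth weighted round-robin.

    Illustrative only: NOT claimed to be what the Peplink does.
    """
    total = sum(weights.values())
    current = {name: 0 for name in weights}
    counts = {name: 0 for name in weights}
    for _ in range(sessions):
        # Every link earns credit in proportion to its weight
        for name, weight in weights.items():
            current[name] += weight
        # The link with the most credit takes this session and pays for it
        chosen = max(current, key=current.get)
        current[chosen] -= total
        counts[chosen] += 1
    return counts


if __name__ == "__main__":
    print(bandwidth_shares({"WAN1": 4000, "WAN2": 512}))
    print(assign_sessions({"WAN1": 90, "WAN2": 10}, 100))
```

With these inputs, a speed-based split gives the faster link a bit under 89% of the traffic, so the manual 90/10 policy is roughly what a bandwidth-proportional default would have produced.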