Surf SOHO MK3 never finishes connecting to cable modem if modem reboots


#22

Add me to the list of those with similar problems.

I have a Comcast-compatible Arris TM822 modem and SOHO MK3, FW 7.1.0 Build 1284. I have unsuccessfully tried 1000 Mbps full duplex, un-advertised speed, auto MTU, and health check disabled. If my modem reboots, whether intentionally by me or remotely by Comcast, the router most often does not reconnect - WAN status gets stuck on “connecting.” To reconnect, I must reboot the router, either by disconnecting power or through the browser interface, though no other measures such as disconnecting the LAN cable or modem reboot are necessary.

The above isn’t helpful for a solution, but perhaps it adds to the case that the issue is somewhat widespread.


#23

Ticket 785172 is still open and we have not heard anything since 07/02/18. We suspect the ticket has been abandoned by Peplink, although Remote Assistance is still on. I sent more information about the situation via e-mail on 07/02, 07/07, and 07/09 – without response.

Here’s the (very sad) situation with the Balance 20 … We know a bit more now since we found a computer on the LAN was inadvertently left on and is running TeamViewer. We were able to get into the B20 via the LAN-side. Here’s the situation:

  1. Both instances of PepVPN have not started. From the subject B20 we see “Creating tunnels …” and from the destination routers we see “Starting …” It has been this way for several days.
  2. Both WANs are shown to be up, although one showed the cable was disconnected a few times. (It was not. Both the Balance 20 and the IOGear wi-fi/ethernet adapter are connected with a short cat 6 cable, are on a good quality sine-wave UPS and no power failures have been reported.)
  3. There are several IoT devices on one of the B20’s subnets. Normally, these devices connect and disconnect from the internet periodically. Some, including both Honeywell thermostats, are presently shown as unreachable by their hosts but from the LAN side we can see they are on-line. We wonder if their attempts to connect are blocked as both WANs are marked as unavailable.

The WAN cable disconnects have been experienced numerous times and were the subject of the original ticket. The situation of PepVPNs failing to start has also been experienced before and has never been resolved. The situation of LAN clients not connecting is new.

We are confident this router has a serious issue and either Peplink must revise the firmware or there is a hardware problem. We suspect the former.

So, two questions:

To Peplink: Is the ticket still active or has it been abandoned? I ask this question in the forum because it appears communications via the usual support e-mail address, as well as a PM, have failed. We’re confident a reset of the router will solve the problem – in the short term. If you want to see what’s going on from the inside, NOW is the time. Much will be lost when the device is reset. Absent further communication we’ll reset the B20 24 hours from now.

To other users of Balance 20s: Have you seen such behaviors as I describe? PepVPNs not starting? Cables to WAN devices being reported as disconnected when they are not? LAN devices not making outbound connections when the router presently shows both WANs to be healthy? I ask this because we need to make a decision as to the device that will replace this B20. Reliability is absolutely critical and this B20 is not reliable.


#24

@Rick-DC, your ticket has not been abandoned. The technical support engineer handling your case was on leave due to an urgent matter. I will follow up with you in the ticket.

Sorry for the inconvenience caused.


#25

OK. Thank you very much, TK. Your assistance is very much appreciated. I hope @sitloongs is OK!

[I’ll write this for others who may have an interest or be experiencing similar issues …] The present issues are:

  1. Low memory situation on the B20 which apparently caused both PepVPNs not to make an outbound connection. (Your discovery today – again, TU)
  2. Periodic “no cable detected” on WAN2
  3. B20 sending health check e-mails when the log says otherwise. (We are now getting several health check failure e-mails PER MINUTE while the system log shows no failures and a ping of the B20 via PepVPN shows no connectivity issues. This just started after the reboot.)

Again – thanks for looking into this.

Rick


#26

I have a flickering non-connect when my laptop wakes from sleep - sometimes. The only way I can fix the issue is to reboot the Surf SOHO (Mk3, firmware Build 1289).

I run a VPN (tried two different ones recently) and it’s the same with both. One possible common denominator is that both used my preferred client, Tunnelblick for OpenVPN. I think the auto-connect when Tunnelblick launches might be causing a problem, so now I’m trying the VPN with the provider’s client in lieu of Tunnelblick. If that doesn’t work I’ll select manual launch instead of auto, then maybe no VPN.

I have no idea if this bears on the issue but I thought I’d mention it.

Notes:
15 July 2018: I have removed Tunnelblick OpenVPN and used the provider’s app instead: I’m still having intermittent internet disconnects and a frequent (many times a day) need to reboot the router. I have now removed the VPN and am testing that. If I need to reboot this time I’ll submit a ticket.
16 July: Changed out modems and routers and it appears to be the Pepwave Surf SOHO. The WiFi to the router works but doesn’t reach the internet, either by WiFi or by Ethernet. Submitted a ticket.

18 July: Firmware problem. Reverted to Build 1284 from Build 1289. Issue resolved.


#27

I got notified last night by @sitloongs that Peplink’s Engineering had acknowledged the issue we’ve discovered and is working on a fix. So here are the details of the problem we found in my case and the workaround I have put in place in the meantime.
The specific issue I reported when creating this thread was that, if my Comcast modem were to do its startup sequence again (either because of a reboot/power cycle or because of service interruptions), then in most cases the SOHO would get stuck in “Connecting…” mode on the WAN port and would never recover the internet access. This happened with multiple cable modem brands and models.
After thorough investigations the root cause of the issue was discovered and validated as an issue by Peplink: If any of your SOHO MK3 ports are configured to anything other than Trunk/Any (basically Access or even Trunk with a single or multiple select allowed VLANs) then most LAN ARP broadcasts and occasionally some other random LAN traffic leak onto the WAN port.
You might think that is a severe security concern, but what does this have to do with the problem at hand here? Well, it turns out that those Comcast modems are all configured to only allow access to the internet to a single MAC address (unless you purchase more than one IP). Typically that is the first MAC it sees when it locks onto the internet connection. When local traffic leaks onto the WAN port, the modem sees it, and if it happens to be before the SOHO’s WAN MAC address has been seen, which is very common if you have reasonable traffic on your LAN, then your SOHO will not have internet access, and the DHCP requests it sends to Comcast are never acknowledged. Thus the SOHO getting stuck in “Connecting…”. Worse, it is your random LAN device MAC that will be bound to internet access, potentially allowing some of those leaked LAN packets to leave onto the internet under certain conditions.
In my case the problem was exacerbated by a configured printer that is often offline, causing all the local PCs to broadcast ARPs asking who owns the printer IP every 1 or 2 seconds, thus guaranteeing those would almost certainly be what the modem sees first, thus binding to the MAC of one of the local PCs instead of the SOHO’s.
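For anyone who wants the race spelled out, here is a toy Python sketch. It only models the behaviour described above (the modem binds to the first source MAC it sees); it is not actual modem or SOHO logic, and the MAC addresses and function names are made up for illustration.

```python
# Toy model only: assumes the modem binds to the first source MAC it sees
# after coming up, as described above. Not real modem firmware behaviour.

ROUTER_WAN_MAC = "00:1a:dd:00:00:01"  # hypothetical SOHO WAN MAC
PC_MAC = "3c:52:82:00:00:02"          # hypothetical LAN PC MAC


def bound_mac(frames):
    """Return the source MAC of the first frame seen on the WAN link, or None."""
    for src_mac, _description in frames:
        return src_mac
    return None


def router_gets_lease(frames, router_wan_mac):
    """The router's DHCP requests are answered only if the modem bound to its MAC."""
    return bound_mac(frames) == router_wan_mac


# Leaky case: a PC's ARP for the offline printer reaches the WAN port first.
leaky = [(PC_MAC, "ARP who-has printer"), (ROUTER_WAN_MAC, "DHCP discover")]
# Clean case: only the router's own traffic appears on the WAN port.
clean = [(ROUTER_WAN_MAC, "DHCP discover")]

print(router_gets_lease(leaky, ROUTER_WAN_MAC))  # stuck in "Connecting..."
print(router_gets_lease(clean, ROUTER_WAN_MAC))
```

With ARPs going out every second or two, the leaky ordering is by far the more likely one, which matches what I saw.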
Peplink’s Engineering is now working on a fix. The ETA for that fix has not been communicated back to me, and I don’t know if this will make it to 7.1.1, I hope it does… it is a serious issue.
So the workaround: clearly it is to set all your ports to Trunk/Any. Now obviously, if you had them set to something else, then you need wired VLAN tagging, like me. So what I ended up doing is purchasing a NetGear GS108Ev3 managed switch. They are inexpensive, and they support 802.1Q VLAN tagging. So I can do all the tagging in the switch and trunk the VLANs to the SOHO. Since doing this, I have not had the cable modem problem once (and I did many, many artificial CM reboots to make sure), and all WAN port captures are now clean and free of LAN ARPs and other leaky traffic.
I hope this helps others until a software fix is available. Even if you don’t have the CM problem I had, you may want to proactively check for leaky traffic if your ports are not set to Trunk/Any. If I get the fix ETA soon I will post it here.
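If you want to check a WAN-side capture for leaks yourself, here is a minimal sketch of the test I would apply to each captured frame. It assumes you already have the raw Ethernet frames as bytes (from whatever capture tool you use) and that you know your own LAN subnet; the 192.168.50.0/24 subnet, the MAC address, and the function names are just examples, not anything from Peplink.

```python
import ipaddress
import struct

ETHERTYPE_ARP = 0x0806
ETHERTYPE_VLAN = 0x8100  # 802.1Q tag


def arp_sender_ip(frame: bytes):
    """Return the ARP sender IPv4 address of a raw Ethernet frame, or None."""
    if len(frame) < 14:
        return None
    (ethertype,) = struct.unpack("!H", frame[12:14])
    offset = 14
    if ethertype == ETHERTYPE_VLAN:  # skip a single 802.1Q tag if present
        if len(frame) < 18:
            return None
        (ethertype,) = struct.unpack("!H", frame[16:18])
        offset = 18
    if ethertype != ETHERTYPE_ARP or len(frame) < offset + 28:
        return None
    # ARP layout: htype(2) ptype(2) hlen(1) plen(1) oper(2) sha(6) spa(4) ...
    ptype, hlen, plen = struct.unpack("!HBB", frame[offset + 2:offset + 6])
    if ptype != 0x0800 or hlen != 6 or plen != 4:
        return None
    return ipaddress.ip_address(frame[offset + 14:offset + 18])


def is_leaked_lan_arp(frame: bytes, lan_network) -> bool:
    """True if the frame is an ARP whose sender IP belongs to the given LAN subnet."""
    ip = arp_sender_ip(frame)
    return ip is not None and ip in lan_network


# Example: a hand-built ARP request from a LAN PC (all values are made up).
lan = ipaddress.ip_network("192.168.50.0/24")
pc_mac = bytes.fromhex("3c5282000002")
arp = (struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1) + pc_mac
       + bytes([192, 168, 50, 10]) + b"\x00" * 6 + bytes([192, 168, 50, 99]))
frame = b"\xff" * 6 + pc_mac + struct.pack("!H", ETHERTYPE_ARP) + arp
print(is_leaked_lan_arp(frame, lan))
```

Any frame on the WAN side that this flags should not be there. Matching against your own subnet rather than "any private address" avoids false alarms from ISP equipment that also uses private ranges.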


#30

It sounds like there is another solution:

Disable Wifi, unplug any and all LAN Ethernet cables and reboot the router (perhaps after rebooting the modem?)

Then enable Wifi and plug in the Ethernet devices. Very impressive debugging, by the way.


#31

OK, so I got confirmation that the fix will not be included in 7.1.1 but will be included in the following official releases. In the meantime, they have a special 7.1.1 build available for those with the issue. So if you have this problem, open a support ticket so that you can get access to that build.


#32

Peparn, thanks for the great troubleshooting. I, too, have been having the same issue with a SURF SOHO MK3 connected to a Spectrum (now Spectrum/TimeWarner) D3G1604W WIFI modem in a home office. I’ve monkeyed with it for quite a few months because I was busy with work during the day and the family wanted to use it at night. So I provided the quick fix of disconnecting from the modem, cycling power and bringing all devices back online. I will be opening a support ticket so I can get the fix. Many thanks for dogging it so systematically!


#33

Happy if the work done on this issue ends up benefiting you too @roughrider :smile: . That is the beauty of these forums. Also their support is very proactive (at least it was my experience) so you will be in good hands. Cheers!


#34

Did the special 7.1.1 build work? I’m considering buying the SOHO, but it looks like it won’t work with my SB6121 modem. The SOHO purchase would max out the budget, so I am not looking forward to having to buy different modems until I find one that works.


#35

The special build worked well for me. Also, if you are not using wired VLANs (i.e., not assigning the SOHO LAN ports to specific VLANs) and leave all the ports set to Trunk/Any, then the problem mentioned in this thread will not happen. So based on your configuration you might not even need the special build at all.
The SOHO is a great router in my opinion, well worth the money, relative to what it can do.


#36

Below is the firmware that fixes the ARP traffic leak issue, which may have caused the service provider’s modem to refuse the WAN connection.

SOHO MK3:
https://download.peplink.com/firmware/br1ac/fw-max_br1mk2_hotspot_sohomk3-7.1.1s022-build1344.bin

For those having the issue explained here, please upgrade your device using the firmware above.


#37

I can confirm the fix was not included in firmware 7.1.1. Learned this the hard way while connecting to a Netgear CM600 cable modem.


#38

The patched firmware is branded for a Surf SOHO MK3. Is it also valid for a second generation HW2 Surf SOHO?


#39

I have the same issue with a Surf SOHO MK3 + Arris SB6190. Thank you everyone for the time spent tracking down the root cause.


#40

@Michael234

The issue is only valid for the SOHO MK3.


#41

The word “issue” could be taken two ways.

Did you mean the issued firmware is only valid for third generation MK3 Surf SOHOs? Or, did you mean the issue/problem was only observed on MK3 Surf SOHOs?

If the latter, I can attest that the problem happens on second generation HW2 models too.


#42

Yes, it was only observed on SOHO MK3 devices.


#43

This has also been observed on a second generation HW2 Surf SOHO. I got around the problem by having the router powered on and booted before powering on the cable modem (Netgear in my case).

At first I did the reverse, powering on the cable modem first, but that failed. I tried setting the WAN port to GB full duplex both with and without advertising but that did not help.