I got notified last night by @sitloongs that Peplink’s Engineering has acknowledged the issue we discovered and is working on a fix. So here are the details of the problem as it showed up in my case, and the workaround I have put in place in the meantime.
The specific issue I reported when creating this thread was this: whenever my Comcast modem went through its startup sequence again (because of a reboot/power cycle or a service interruption), in most cases the SOHO would get stuck in “Connecting…” mode on the WAN port and would never recover internet access. This happened with multiple cable modem brands and models.
After a thorough investigation, the root cause was found and confirmed as a bug by Peplink: if any of your SOHO MK3 ports is configured as anything other than Trunk/Any (that is, Access, or Trunk with one or more selected allowed VLANs), then most LAN ARP broadcasts, and occasionally some other random LAN traffic, leak onto the WAN port.
You might think that is a severe security concern, but what does it have to do with the problem at hand? Well, it turns out that those Comcast modems are all configured to allow internet access to a single MAC address only (unless you purchase more than one IP). Typically that is the first MAC the modem sees when it locks onto the internet connection. When local traffic leaks onto the WAN port, the modem sees it. If that happens before the modem has seen the SOHO’s WAN MAC address, which is very common if you have reasonable traffic on your LAN, then the SOHO will not have internet access and the DHCP requests it sends to Comcast are never acknowledged. Hence the SOHO getting stuck in “Connecting…”. Worse, it is the MAC of some random LAN device that gets bound to internet access, potentially allowing some of those leaked LAN packets to go out onto the internet under certain conditions.
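To make the race easier to picture, here is a tiny Python sketch of the behavior as I understand it. It is only a toy model, not how the modem firmware actually works: `bound_mac` and both MAC addresses are made up for illustration, and the only rule it encodes is "the first source MAC the modem sees wins".

```python
from typing import Iterable, Optional

# Hypothetical addresses, for illustration only.
SOHO_WAN_MAC = "00:1a:dd:00:00:01"
LAN_PC_MAC = "3c:52:82:00:00:02"


def bound_mac(frames_seen_by_modem: Iterable[str]) -> Optional[str]:
    """Return the source MAC a 'first MAC seen wins' modem would bind to.

    Each item is the source MAC of one frame, in arrival order.
    Returns None if the modem has not seen any frame yet.
    """
    for src_mac in frames_seen_by_modem:
        return src_mac  # the very first frame decides the binding
    return None


# Healthy case: only the SOHO's own frames reach the modem.
clean = [SOHO_WAN_MAC, SOHO_WAN_MAC]

# Leaky case: a LAN ARP broadcast arrives before the SOHO's DHCP request.
leaky = [LAN_PC_MAC, SOHO_WAN_MAC, SOHO_WAN_MAC]

for name, frames in (("clean", clean), ("leaky", leaky)):
    winner = bound_mac(frames)
    state = "SOHO online" if winner == SOHO_WAN_MAC else "SOHO stuck in Connecting..."
    print(f"{name}: modem bound to {winner} -> {state}")
```

In the leaky case the later SOHO frames do not matter: by the time they arrive, the binding has already gone to the LAN device.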
In my case the problem was exacerbated by a configured printer that is often offline. That causes all the local PCs to broadcast ARPs every 1 or 2 seconds asking who owns the printer IP, so those broadcasts were almost certainly the first thing the modem saw, and it would bind to the MAC of one of the local PCs instead of the SOHO’s.
Peplink’s Engineering is now working on a fix. The ETA for that fix has not been communicated back to me, and I don’t know if it will make it into 7.1.1. I hope it does, because this is a serious issue.
So the workaround is clearly to set all your ports to Trunk/Any. Now obviously, if you had them set to something else, then you need wired VLAN tagging, like me. What I ended up doing was purchasing a NetGear GS108Ev3 managed switch. They are inexpensive, and they support 802.1Q VLAN tagging, so I can do all the tagging in the switch and trunk the VLANs to the SOHO. Since doing this, I have not hit the cable modem problem once (and I did many artificial CM reboots to make sure), and all WAN port captures are now clean and free of LAN ARPs and other leaky traffic.
I hope this helps others until a software fix is available. Even if you don’t have the CM problem I had, you may want to proactively check for leaky traffic if your ports are not set to Trunk/Any. If I get the fix ETA soon I will post it here.
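If you want to check a WAN port capture for leaks without eyeballing it, something like the following Python sketch could help. It is a rough example under a few assumptions: the capture is saved in classic pcap format (not pcapng) with an Ethernet link type, and you supply the MAC addresses of your own LAN devices. The function name `leaked_frames` is mine, not part of any tool.

```python
import struct
from collections import Counter
from typing import Iterable

ETHERTYPE_ARP = 0x0806
LINKTYPE_ETHERNET = 1


def leaked_frames(pcap_bytes: bytes, lan_macs: Iterable[str]) -> Counter:
    """Count frames in a WAN capture whose source MAC is a known LAN device.

    Keys are (source MAC, "ARP" or hex EtherType); values are frame counts.
    Only classic pcap files with Ethernet framing are handled.
    """
    magic = pcap_bytes[:4]
    if magic in (b"\xd4\xc3\xb2\xa1", b"\x4d\x3c\xb2\xa1"):
        endian = "<"  # little-endian file (microsecond or nanosecond variant)
    elif magic in (b"\xa1\xb2\xc3\xd4", b"\xa1\xb2\x3c\x4d"):
        endian = ">"  # big-endian file
    else:
        raise ValueError("not a classic pcap file (pcapng is not handled)")

    # The link type is the last 4 bytes of the 24-byte global header.
    linktype = struct.unpack(endian + "I", pcap_bytes[20:24])[0]
    if linktype != LINKTYPE_ETHERNET:
        raise ValueError("expected an Ethernet capture")

    watch = {mac.lower() for mac in lan_macs}
    counts: Counter = Counter()
    offset = 24
    while offset + 16 <= len(pcap_bytes):
        # Per-record header: ts_sec, ts_frac, captured length, original length.
        _sec, _frac, incl_len, _orig_len = struct.unpack(
            endian + "IIII", pcap_bytes[offset:offset + 16]
        )
        offset += 16
        frame = pcap_bytes[offset:offset + incl_len]
        offset += incl_len
        if len(frame) < 14:
            continue  # too short to hold an Ethernet header

        # Ethernet header: 6 bytes destination, 6 bytes source, 2 bytes EtherType.
        src = ":".join(f"{b:02x}" for b in frame[6:12])
        ethertype = struct.unpack("!H", frame[12:14])[0]
        if src in watch:
            kind = "ARP" if ethertype == ETHERTYPE_ARP else hex(ethertype)
            counts[(src, kind)] += 1
    return counts
```

To use it, read the capture file as bytes (for example `open("wan.pcap", "rb").read()`, where the file name is just a placeholder) and pass in the MACs of a few busy LAN devices. An empty result means none of those devices showed up as a frame source on the WAN side during the capture.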