Using these excellent routers for 5 years, and still so much to learn

We commonly run across an issue when we use SFVPN. I would like to know what we are doing wrong.

This is the situation:
Poor throughput with two SIMs in the bond, but excellent throughput with only one (either one) of the installed SIMs in the bond.
Both SIMs are with the same provider, very similar individual latencies and similar throughput.
The tests were run from a laptop on the Pepwave LAN, connecting across the SFVPN to a FusionHub and on to a single server on the private network on the LAN side of the FH.

Does this sound familiar to anyone else?

Test 1:
Cel3 Only in the bond.

20 Mbps download (SMB file download traffic), 31 Mbps upload (TTCP generated 14 TCP streams)

Test 2:
Cel2 Only in the bond.

15 Mbps download (SMB file download traffic), 31 Mbps upload (TTCP generated 14 TCP streams)

Test 3:
Cel2&3 in the bond.

4.5 Mbps download (SMB file download traffic), 3 Mbps upload (TTCP generated 14 TCP streams)
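
For reference, the gap between the bonded and single-SIM results can be put into numbers with a few lines of Python. This is only arithmetic on the figures from Tests 1 to 3. The "ideal sum" is an optimistic upper bound that assumes the two SIMs do not compete for the same cell capacity, and the name `bond_efficiency` is just an illustrative helper.

```python
# Throughput figures (Mbps) copied from Tests 1-3 above.
single_sim = {
    "download": {"Cel3": 20.0, "Cel2": 15.0},
    "upload": {"Cel3": 31.0, "Cel2": 31.0},
}
bonded = {"download": 4.5, "upload": 3.0}


def bond_efficiency(direction):
    """Return bonded throughput as a fraction of (best single SIM, sum of both SIMs)."""
    rates = single_sim[direction].values()
    return bonded[direction] / max(rates), bonded[direction] / sum(rates)


for direction in ("download", "upload"):
    vs_best, vs_sum = bond_efficiency(direction)
    print(f"{direction}: {vs_best:.1%} of best single SIM, {vs_sum:.1%} of the ideal sum")
```

In both directions the bond delivers well under a quarter of what the better SIM manages on its own, so this is a collapse rather than a failure to aggregate.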

The big downside of bonding is that one poor connection can drag down the overall bond. One of the connections could be spiking in latency and affecting the whole tunnel. Have you set any parameters on the SF tunnel for latency cut-off and suspension time?

I find this often helps, especially if you have anything latency sensitive in transit via the SF tunnel.

Thanks for your thoughts, cgreen. I should have added that we also ran a PepVPN test, which produced 35 Mbps bidirectional from Peplink to Peplink. So the bonding itself is transmitting and receiving fine across the given WAN media.
The problem is client to server, and back.
But the client and server prove capable when only one SIM is in the bond.

I’m thinking that there must be others who have seen this behavior. We do a fair bit of SFVPN bonding in our network, successfully aggregating WAN media.

From time to time we encounter cases where it should work, but for an unidentifiable reason the bonding algorithm doesn’t transport client-server sessions anything like as well as the PepVPN test does. So I am reaching out to anyone who might know what the cause is and what to do about it.

This is a POC for a potential customer. In my experience it should just work, because the relative transmission characteristics of the two WANs are similar enough.

I think I must be missing something.

Regards,
Dana

For the client to server issue there could be out-of-order packets. Try adding a receive buffer on each side of the VPN (100 ms for example) to see if there is an improvement.
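
To illustrate why a receive buffer can help with reordering, here is a toy Python simulation. It is a sketch under stated assumptions, not SpeedFusion's actual algorithm: packets are striped round-robin over two links, the link delays (40 ms and 90 ms) and the 10 ms send interval are made-up values, and all function names are hypothetical.

```python
def arrival_order(num_packets, send_interval_ms, link_delays_ms):
    """Stripe packets round-robin over the links; return (arrival_ms, seq) sorted by arrival."""
    arrivals = []
    for seq in range(num_packets):
        delay = link_delays_ms[seq % len(link_delays_ms)]
        arrivals.append((seq * send_interval_ms + delay, seq))
    return sorted(arrivals)


def count_out_of_order(seqs):
    """Count packets that show up after a higher sequence number has already been seen."""
    highest, late = -1, 0
    for seq in seqs:
        if seq < highest:
            late += 1
        else:
            highest = seq
    return late


def deliver_with_buffer(arrivals, buffer_ms):
    """Hold packets for up to buffer_ms while waiting for gaps to fill; return delivery order."""
    delivered = []
    pending = {}   # seq -> arrival time of packets currently held
    next_seq = 0   # next sequence number we want to hand to the client

    def drain():
        nonlocal next_seq
        while next_seq in pending:
            del pending[next_seq]
            delivered.append(next_seq)
            next_seq += 1

    for now, seq in arrivals:
        # Stop waiting for gaps below any packet that has been held for buffer_ms.
        expired = [s for s, t in pending.items() if t + buffer_ms <= now]
        if expired:
            limit = max(expired)
            for s in sorted(s for s in pending if s <= limit):
                del pending[s]
                delivered.append(s)
            next_seq = limit + 1
            drain()
        if seq < next_seq:
            delivered.append(seq)   # arrived too late, passed on out of order
        else:
            pending[seq] = now
            drain()
    delivered.extend(sorted(pending))   # flush whatever is still held at the end
    return delivered


# Hypothetical scenario: 20 packets, one every 10 ms, links with 40 ms and 90 ms delay.
arrivals = arrival_order(20, 10, [40, 90])
print("late packets, no buffer:", count_out_of_order(deliver_with_buffer(arrivals, 0)))
print("late packets, 100 ms buffer:", count_out_of_order(deliver_with_buffer(arrivals, 100)))
```

The trade-off in this model is the same one a real buffer has: packets reach the client in order, but some of them are held back first, which adds latency.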

Thanks Ron, I do appreciate the suggestion and we will try it.

If you have an idea about it: why would packets be out of order only for the client-server flows when the SFVPN contains two SIMs but not one, when the PepVPN test isn’t similarly affected with two SIMs in the bond?

Regards

Dana

This would not be normal. However, based on your description, bonding is working and the problem is client to server, and back.

A network capture from the FH would provide more information to troubleshoot this. For example, an MTU adjustment might be needed on the FH side for client VPN users going through FusionHub.

The suggestion was to help narrow the problem down, but a network capture will be helpful as well.

Hi Ron,

This sounds like my ticket 9050123 …

Regards
Dennis

Hi, a 100 ms receive buffer increases throughput only marginally. We will get a packet capture and add it to the ticket we have open. FYI, it is 9050226. I don’t recognize 9050123.

Thanks.
Dana

Hi

9050123 is a ticket that Dennis has with us regarding an issue similar to the one you are seeing in ticket 9050226.

We have been doing a lot of testing over the last few weeks and are finding that the TCP stack in Windows struggles badly with jitter compared to Linux (Ubuntu) in like-for-like testing. The receive buffer can help by reducing the amount of jitter (at the cost of increased latency). WAN smoothing can also help a lot in this case (but uses double the bandwidth) and is really useful for dealing with packet loss, which has a big impact on all TCP file transfer throughput.

We are still running through testing scenarios and will show some of the results of the testing when using various settings and connection types.

thanks
James

Thank you James,
Knowing that you acknowledge the problem, are working on it, and have some concrete understanding of its mechanics makes a huge difference.
We have not found a useful improvement with a 100 ms receive buffer. We will speak with the customer about the operating system used for the test and the option of WAN smoothing, although I don’t think they will like the data volume impact.

Please keep us posted because this must affect a lot of use cases.

Kind regards,
Dana

Hi DK,

"Both SIMs are with the same provider"
This is probably the cause of the low throughput. If you really have no other ISP available, try selecting different bands in the advanced band selection, for example Modem 1 on Band 20 and Modem 2 on Band 1. Available bands differ per country and provider.

Make sure that no two connections from the same ISP inside the tunnel use the same band (frequency) when bonding.

Set a latency difference cutoff of 100 ms inside the SpeedFusion profile, like Ron proposed. Leave the receive buffer at 0 and the suspension time after packet loss at 0.

Hope the above tips help you a bit.

Greetings,
Wouter

Thanks Wouter!

Is this based upon the idea that any single band will have a capacity limit that will be exceeded by two SIMs if they are both using that band? Interesting.
We will try that, cheers.

Ron’s recommendation was receive buffer tuning, not cutoff, but I am intrigued by your suggestion.

Kind regards
Dana

This is how LTE works: say you use provider A on Band 20 with both modems. The two modems then share bandwidth capacity, because there is a maximum throughput supported by the band (frequency) and the cell. If you bond them, this will only work against you.

But if you use Band 20 + Band 1 from provider A, you are simulating carrier aggregation when bonding (like LTE-A, but with two bands on the upload as well). That increases your bandwidth, provided both lines are healthy and not more than 100 ms apart in latency. Hence the latency difference cutoff.

Provider A Band 20 + Provider B Band 20 bonded will also increase speeds in most cases, as ISPs with their own cell networks use different MHz ranges inside the Band 20 spectrum.

If you are using LTE-A, in most cases you can also fine-tune this, but you need to know which bands are being used, and you may need to tune one modem back to LTE only.
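
As a rough illustration of what a latency difference cutoff does, here is a minimal Python sketch. It assumes the cutoff is measured against the lowest-latency link in the tunnel. The function name `links_within_cutoff` and the latency samples are hypothetical, not values from the tests in this thread.

```python
def links_within_cutoff(latencies_ms, cutoff_ms):
    """Return the links whose latency is within cutoff_ms of the lowest-latency link."""
    if not latencies_ms:
        return []
    best = min(latencies_ms.values())
    return [name for name, latency in latencies_ms.items() if latency - best <= cutoff_ms]


# Hypothetical latency samples for the two cellular WANs (not measured values).
steady = {"Cel2": 48.0, "Cel3": 55.0}
spiking = {"Cel2": 48.0, "Cel3": 210.0}

print(links_within_cutoff(steady, 100))   # both links stay in use
print(links_within_cutoff(spiking, 100))  # the spiking link is left out
```

With a 100 ms cutoff, a link only carries traffic while it stays within 100 ms of the best link, so a latency spike on one SIM stops it from slowing down the whole bond.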

Greetings,
Wouter

Dear Wouter,

Thanks very much for taking the time to explain that to me. That sounds sensible.

Much appreciated,
Dana
