BufferBloat

I will let you know what I come up with. In the meantime I’ll see if I can prod Peplink into doing something.

First idea: PEPLINK folks should run the new bufferbloat speed tests and post the results with and without SPEEDFUSION. If SPEEDFUSION eliminates or greatly reduces bufferbloat, they may find a whole new subpopulation of users interested in trying SPEEDFUSION. This is easy; it should take a tech five minutes or so to get the results.

Second idea: If SPEEDFUSION does NOT have a marked effect on bufferbloat, then algorithms could be built into the SPEEDFUSION link and advertised as a cure for bufferbloat. This is pretty complex and will require some development, although ideas have already been explored by DD-WRT and the LEDE project.

Third idea: Build bufferbloat-control algorithms into the next firmware package. This is also pretty complex and will require some development, although ideas have been explored by DD-WRT and the LEDE project.

Fourth idea: Develop an add-on appliance that could be purchased separately to control BufferBloat.

Fifth idea: Add bufferbloat control to the new line of switches PEPLINK has just introduced… I’d buy one for sure.

IQRouter even has illustrations that look rather like some of the SPEEDFUSION illustrations.

Heck, even if they just offered the established SQM methods as a beta or add-on feature, I would help test it for sure.

I bought the other lag eliminator to see if it can even do the job with my connections. I am beginning to think that I will always have some bufferbloat beyond my control because I don’t have a dedicated line. My ISP uses an oversubscription model, so there is potentially a bottleneck at my first hop. Fingers crossed.

I purchased a Netgear R6300v2 from eBay for $37, put LEDE-Project firmware on it, and have pretty much made it into a “transparent” router that exists solely to apply luci-app-sqm bufferbloat control. A nice write-up on how to configure a LEDE router for bufferbloat control may be found here:

https://lede-project.org/docs/howto/sqm?s[]=bufferbloat
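For anyone curious, this is roughly what those SQM settings boil down to on the LEDE box, expressed as UCI commands. Just a sketch: the queue name, interface, and shaping rates below are placeholders (the option names come from the LEDE/OpenWrt sqm-scripts package), so adjust them to your own hardware and line speeds.

```sh
# Sketch of luci-app-sqm settings as UCI commands on a LEDE/OpenWrt box.
# Interface and rates are placeholders; shape a bit below the real line rate.
uci set sqm.wanq=queue
uci set sqm.wanq.enabled='1'
uci set sqm.wanq.interface='eth0'     # interface facing the bottleneck link
uci set sqm.wanq.download='9000'      # ingress shaping in kbit/s
uci set sqm.wanq.upload='650'         # egress shaping in kbit/s
uci set sqm.wanq.qdisc='fq_codel'
uci set sqm.wanq.script='simple.qos'
uci commit sqm
/etc/init.d/sqm restart
```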

Peplink continues to handle WAN aggregation and DHCP functions.

There is a problem with applying bufferbloat SQM on the “fat pipe” between the Peplink and the switch: on the upload side, the SQM would only kick in when the aggregate upload approaches 4 x 700 kbps, or 2800 kbps, and most of the time the total upload is much less than that, even though a single 700 kbps link can already be saturated and bloating. Upload is responsible for the greatest bufferbloat on my system.

So I decided that I really need 4 bufferbloat-control appliances, one for each modem. I will be buying 3 more routers from eBay.

This follows the kind advice I received from IQRouters (EvenRoute):


Sandy Fowler (EvenRoute)
Feb 27, 09:42 AST

It might not work in the configuration you propose due to the bufferbloat occurring individually on each of the 4 WAN links. Depending on how the PEPlink load-balances and the actual traffic patterns, you could have any one of the 4 links saturate and bloat out. That can happen when the traffic is ‘sticky’ to the IP of a WAN link, as in an HTTPS session where there is much data being sent outbound, such as an iPhone synching new photos to iCloud.

So bloat must be controlled on each line individually, ahead of the PEPLink. Another reason is that while the lines are nominally the same, they might in actuality be syncing at slightly different rates, which then means the traffic control settings are different for each. And over time, one might have more issues than another and require unique settings.

So to truly correct the bloat, it would require four IQrouters placed between the modems and the PEPlink.

DHCP and other unnecessary functions could be turned off, or ignored.


Next I followed Sandy’s advice and went ahead and put my LEDE R6300v2 between my transparent-bridge modem and the Peplink on one WAN connection. I disabled DHCP on it, and it works just fine.

It took a while to get the PEPLINK to use a static IP on WAN4 and see the LEDE router as its gateway, but I eventually came up with the correct settings.

This actually changed my bufferbloat grades on the DSLReports speed test from straight Fs to generally Cs. The reason, I believe, is that I have an outbound rule that uses the lowest-latency connection, so the one WAN with luci-app-sqm is generally chosen first and the others are not used until it is “full”.

The following, more quantitative evaluation is from the SourceForge speed test:

Something that is working now:

Well, after giving up on bufferbloat appliances between my PEPLINK and switch, and then trying and failing to get something that would not hamper my connections between my PEPLINK and modems, I went back to the position between the PEPLINK and the switch.

My initial attempt there had been very similar, if not identical, to Orangetek’s solution over at the LEDE-project forums: namely, run a LAN cable into the switch/LAN side of the LEDE router and back out again to the switch, and apply bufferbloat control not to the bridge (br-lan) but to the switch interface eth1. As I reported previously, in that position I had poor results applying bufferbloat control to my “fat pipe” from the PEPLINK, which is the aggregate of four 10 Mbps connections.

Strangely, in the same position, using the WAN port of the R6300v2 as the point of contact for the single LAN cable from the PEPLINK, and using the 4 LAN ports as connections to my game computers, I have a satisfactory connection.

Every web page opens well, and by setting a download speed of 16000 kbps (less than half of the roughly 38000 kbps maximum) and an upload speed of 1500 kbps (about 3/5 of what is possible), I always get an A or A+ on bufferbloat from the DSLReports speed test.

So the WAN port on the LEDE box is given an IP of 192.168.1.2 and has a static reservation on the PEPLINK, which has the default address of 192.168.1.1. The LAN ports on the LEDE box are on 192.168.2.1 and hand out DHCP to the game computers, which also have static reservations on the LEDE box.
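For completeness, this is roughly what that addressing looks like in UCI on the LEDE box. It is only a sketch: the ‘wan’ and ‘lan’ section names are the LEDE defaults and the /24 netmask is an assumption.

```sh
# Static WAN toward the Peplink, separate LAN subnet for the game PCs
# ('wan'/'lan' are the default LEDE section names; /24 netmask assumed).
uci set network.wan.proto='static'
uci set network.wan.ipaddr='192.168.1.2'    # reserved for this box on the Peplink
uci set network.wan.netmask='255.255.255.0'
uci set network.wan.gateway='192.168.1.1'   # the Peplink's default LAN address
uci set network.lan.ipaddr='192.168.2.1'    # LEDE hands out DHCP on this subnet
uci commit network
/etc/init.d/network restart
```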

Now, why this works so much better on the traffic on the fat pipe from the PEPLINK I do not know, but in practice it does.

A nice thing about this arrangement is that most of the office equipment, etc., stays connected as normal, and since bufferbloat is not a problem for those machines, no speed-reducing control is applied to them.

I’ve not tested enough to see whether this arrangement would work with a single LAN cable going from the LEDE box to the switch… so that all machines would have bufferbloat control. Initial testing suggests the R6300v2 is not up to the task: with just the 4 game computers on it, it frequently hits a CPU load of 0.3 to 0.5, and I’m guessing a higher-powered LEDE router with a faster CPU might be necessary in that position.

2 Likes

QoS typically only impacts LAN-to-WAN traffic. When you had them set up LAN to LAN, they weren’t doing any shaping. Most LAN traffic runs at wire speed (in most cases, plain IP traffic), and with common speeds being 10/100/1000 Mbps there isn’t much shaping that will make an impact.

I feel I understand what bufferbloat is, and why buffers should exist. From what I have learned, the buffer is the router’s way of handling the difference in speed between the LAN and the WAN. The LAN runs at 100 Mbps; the WAN runs slower than that - for this example, 20 Mbps. When a LAN client starts sending data toward the WAN, it sends at 100 Mbps until it is told to adjust its window size. Any packets that come in (and can’t make it onto the WAN link yet) end up in the buffer. When the buffer fills, packets start being dropped, which is a signal to the sending device to slow down (reduce its window size). So the rule of thumb is to have your buffer hold 200 ms or less worth of packets. To determine the number of packets to allow into the buffer, take your link speed and convert it from Mbps to kbps. Then, assuming a standard 1500-byte (12000-bit) packet, divide your speed in bps by 12000. An easier way: take your speed in kbps and divide by 12 (12000 bits == 12 kbits).

For me, I have a 26/4 Mbps and a 16/4 Mbps WAN connection. I am shooting for buffer delays of less than 100 ms. Starting from the download speeds, that gives the following packet rates at line speed:
26000 kbps / 12 = 2166.6 packets per second down
16000 kbps / 12 = 1333.3 packets per second down

We don’t want to buffer data for a full second; that is an eternity for real-time applications. So I am shooting for 100 ms of buffer time, which means taking those packets-per-second values and dividing by 10. That gives me the following WAN buffer sizes:
2166.6 / 10 = 216 packets buffer size
1333.3 / 10 = 133 packets buffer size

That works well without dropping packets, but maybe we can go to 50 ms - divide those values in half, and now we have:
216 / 2 = 108 packets
133 / 2 = 67 packets
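The same arithmetic as a quick shell snippet, using the 26000 kbps link and the 1500-byte packet assumption from above:

```sh
#!/bin/sh
# Buffer size (packets) = packets per second at line rate * max queue delay.
DOWN_KBPS=26000                        # downstream link rate in kbit/s
KBIT_PER_PKT=12                        # 1500 bytes = 12000 bits = 12 kbit
PPS=$(( DOWN_KBPS / KBIT_PER_PKT ))    # ~2166 packets/s at line rate
BUF_100MS=$(( PPS / 10 ))              # ~216 packets buffers 100 ms
BUF_50MS=$(( BUF_100MS / 2 ))          # ~108 packets buffers 50 ms
echo "$PPS pkt/s -> $BUF_100MS pkts (100 ms), $BUF_50MS pkts (50 ms)"
```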

I have been running like this and have gotten good results on bufferbloat tests. I also use the bandwidth reservations and keep 10% of my bandwidth reserved for non-existent devices - this means I never get full capacity, but I always have low latency, which is my goal. I am fine giving up 4-6 Mbps in exchange for a better gaming experience.

What I don’t understand is why there would ever be any bufferbloat on a download. Nothing coming from WAN to LAN should ever need to be buffered. The LAN side should always be able to deliver the packets immediately to the devices on the LAN because LAN link speed is greater than WAN link speed.

Ideally, it would be nice to have an outbound AND an inbound buffer size setting. Since many internet connections are lopsided, the upload buffer needs to be smaller than the download buffer - at least in my head.

For those wondering, the buffer is meant to be just large enough to absorb the packets already in flight when an endpoint halves its receive window, without having to drop them. So the buffer can hold any packets that were sent prior to the window-adjustment packet. The buffer should drain back down to zero as the receive window starts to grow again.

1 Like

Well, for what it is worth.

I had read this thread as well as many others and decided to play around.

Initial baseline bufferbloat after numerous tests was a C, with an occasional D, but never better.

I then tried WAN buffer numbers between 1000 and 186, and at around 500 I regularly get an A with an occasional B or C. Based on these findings I will let it ride at 500 for a while and see if I notice any overall improvement during latency-sensitive activities.

For comparison, this is what the system looked like prior to the change. I did not see a measurable difference otherwise, meaning the transfer rates and ping/latency may have improved, but not drastically, realizing that there are too many uncontrolled variables to do a true immediate comparison.

Equipment used

2 Cradlepoint AER2100s with Cat 6 modems, both WANs active and fed into the Balance operating in balanced mode, 1:1 because the speeds are the same on both 4G lines.
Static IPs
DL = 60 Mbps average
UP = 18 Mbps average
Ping/latency = 35 ms average

UPDATE…

Sadly, although the bufferbloat tested much better, in real-world use the changes made the low-latency gaming activities unstable to the point that my son kept getting disconnected, so I reverted the WAN buffer settings back to default and his gaming returned to normal, meaning without issues.

This was an interesting exercise and I remain very interested in finding a viable way to mitigate the bufferbloat situation and offer my willingness to participate in testing.

Thank you

1 Like

Which game?

I played FIFA 17 on PS4 with low buffers (40 and 10) and it worked fine. But I haven’t tried it under heavy load yet.

I suppose that, under load, a low buffer is better, as upload UDP packets won’t get stuck in a slow queue, and latency is very important in FIFA. But, on the other hand, many packets will be lost, and that will also have a bad effect on the game.

I guess the only way to play low latency games under heavy load is if Peplink QoS actually works, but its port-based rules are for destination ports only, making it useless for games, as the destination may not be using the default ports.

He was playing online multiplayer WarFrame at the time on his Windows based PC.

My gut says that even though the local tweaks have an effect, the issue is likely exacerbated depending on how many hops there are and the buffer state at each of them.

More than anything I find it interesting, as I have always been about getting the most out of the systems. We have good transfer rates and ping/latency as typically measured, and yet when I run a speed test I can see spikes and drops along the route and suspect that bufferbloat may be the culprit.

This problem has been solved using the fq_codel algorithm available in the Linux kernel… Does Peplink have any plans to implement this “easy mode” in the fight against bufferbloat? A $50 router has this feature and is good up to at least 60 Mbps. Sure, the $50 router isn’t as good at fail-over, but if voice/real-time apps don’t always work, what’s the point?

2 Likes

So, what about it, Peplink? Is it time for Peplink to step up to the new ways of handling QoS and bufferbloat? Do you want to continue forever with the old way, where an expert has to tune QoS and then it is never right for all situations? Or should Peplink adopt fq_codel, which automatically tunes itself to the types of TCP/IP connections currently active and to the amount of bandwidth your DSL or cable provider is delivering at different times of the day?

The new way with fq_codel automatically adjusts the speed of each TCP/IP stream to make sure your broadband connection(s) are not overloaded and incurring bufferbloat, by measuring queuing delay and keeping it, and thus the RTT (round-trip time), low for each connection. And it does so “fairly” (the F in fq_codel), dynamically handling whatever types of TCP/IP connections are currently active. How many product discussions are currently pending in the Peplink forums, such as “Single Youtube upload killing Internet traffic” or “Limit download throughput per WAN”, that would be non-issues with fq_codel?
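For anyone who wants to see what this looks like at the Linux level, here is a minimal egress-only sketch of the underlying plumbing: shape traffic slightly below the uplink rate with HTB so the queue forms in your own router, then let fq_codel manage that queue. The interface name and the 3600 kbit rate are placeholders; packaged solutions such as LEDE’s sqm-scripts wrap this kind of setup (plus ingress shaping) for you.

```sh
# Minimal fq_codel upload setup (sketch; eth0 and 3600kbit are placeholders).
# Shape below the real uplink rate so the bottleneck queue lives here,
# where fq_codel can keep its delay low.
tc qdisc replace dev eth0 root handle 1: htb default 1
tc class add dev eth0 parent 1: classid 1:1 htb rate 3600kbit ceil 3600kbit
tc qdisc add dev eth0 parent 1:1 handle 10: fq_codel
```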

To date, Peplink has given us ways to set the number of buffers, which as commented on earlier in this thread doesn’t really work, fixing one problem while introducing another. And as you will find on the web if you research, “Quality of Service (QoS) settings will help, but won’t solve bufferbloat completely. Why not? Any prioritization scheme works by pushing certain packets to the head of the queue, so they’re transmitted first. Packets farther back in the queue still must be sent eventually. New traffic that hasn’t been prioritized gets added to the end of the queue, and waits behind those previously queued packets. QoS settings don’t have any way to inform the big senders that they’re sending too fast/too much, so packets from those flows simply accumulate, increasing delay for all. Furthermore, you can spend a lot of time updating priorities, setting up new filters, and checking to see whether VoIP, gaming, ssh, netflix, torrent, etc. are “balanced”. (There is a whole cottage industry in updating WonderShaper rule sets. They all have terrible flaws, and they don’t help a lot.) Worst of all, these rules create a maintenance hassle. Each new rule has to be adjusted in the face of new kinds of traffic. And if the router changes, or speed changes, or there’s new traffic in the mix, then they need to be adjusted again.”

The $50 router mentioned in the previous post is most likely the Ubiquiti EdgeRouter X, which has fq_codel and works GREAT at reducing bufferbloat. Fq_codel, a proven Active Queue Management (AQM) algorithm, is available in an increasing number of commercial routers. For example, besides the EdgeRouters, it is a component of Qualcomm’s “StreamBoost” QoS system. It is in Netgear’s “Dynamic QoS” feature. The Zyxel Armor Z2 AC2600 MU-MIMO Wireless Router offers StreamBoost. And of course LEDE (now back to being OpenWrt), etc.

The new DOCSIS 3.1 standard includes DOCSIS-PIE Active Queue Management for all DOCSIS 3.1 cable modems (too bad many cable providers don’t enable it).

Peplink, we are ready for you to step up to the plate. Yes, you have a lot of QoS and prioritization code that users have tuned their existing installations with, so you can’t just replace it with fq_codel. You can, however, give us fq_codel as an option to enable. Please do so!

------------------------- Notes to Peplink users struggling with Bufferbloat ------------------------

I have provided a number of customers as well as friends and relatives with a lot of routers. Here is what I use and why.

Decide whether or not bufferbloat will be a problem for you. If you have enough broadband bandwidth, you can get away without fq_codel - i.e., Peplink is great. That said, I always recommend a router with fq_codel for DSL situations.

For a simple home situation that will experience bufferbloat, the IQrouter is great. Don’t plan on doing anything sophisticated like VLANs, though.

The Ubiquiti EdgeRouters are truly excellent if you can get them configured and are willing to use APs for Wi-Fi. They are great for situations that fall neatly into the configurations their wizards can create and where you need to do little else, especially CLI commands as opposed to GUI configuration. For example, they have a wizard for a WAN plus two LAN subnets. On the other hand, you probably don’t want to do multiple WANs with VLANs, for example, unless you have lots of time to become an EdgeOS expert.

Ubiquiti has recently introduced the new EdgeRouter ER-4 and EdgeRouter ER-6P. These can do 1 Gbps at $200 and $300 respectively, as long as you don’t enable fq_codel, which takes additional CPU power and reduces throughput. The Peplink Balance One is limited to 600 Mbps.

To the individuals who have tried to add an outboard fq_codel router between the modem and a multi-WAN router: I’ve tried this configuration with both Ubiquiti and Netgear fq_codel. Netgear wasn’t reliable enough for me. Ubiquiti always did its job reliably, but unfortunately the multi-WAN router I was using at the time (not Peplink) would sometimes declare a WAN link out of service and never bring it back into service, so I didn’t stick with that setup.

Wrapping this up, I like the IQrouter for its simplicity. I like the power and reliability of Edgerouters. I love the Peplink for its easy configuration, reliability, and support. That’s why I am hoping Peplink adds fq_codel support soon.

1 Like