Traffic Congestion Control

NETWizz

Okay, it is NOT at all uncommon to get customers who complain that traffic is slow at their sites and expect you to work some miracle, like making a 3 Mbps WAN circuit carry everything for 45 people...

The first thing you are going to need to do is figure out what is sucking up their resources.

To do this, you really need a managed router or switch that is the gateway for the slow subnet... You need to use NetFlow to look at the traffic. Luckily, it's cake.

First you need some program that will run and collect the flows... I recommend the free SolarWinds Real-Time NetFlow Analyzer:

http://www.solarwinds.com/freetools/real-time-netflow-analyzer.aspx

Once you get that up and running, just open it and the window will be blank... Now go to your router...

Here is the config:

ip flow-export version 9
ip flow-export destination <your IP here> 2055


interface SomeLink0/0
description Customer WAN facing Interface
ip address 10.1.2.3 255.255.255.252
ip flow ingress
!
interface SlowEthernet0/0
description Customer LAN facing Interface
ip address 192.168.1.1 255.255.255.0
ip flow ingress
!


You don't typically enable both ingress and egress on the same interface. Just do the in direction on the WAN-facing interface, and if you don't get enough info, find the LAN-facing interface and add the same command (as in the config above). Whatever comes in one interface goes out another, so ingress on both covers both directions.

You might also want to turn on SNMP, for diagnostic purposes at least:

snmp-server community secretpub RO


***************************

The beauty is you don't have to do anything at all... It will simply populate in the software once it sees some flow data coming in.

Now click the little keys icon on the right and simply type your SNMP community string (i.e. secretpub).

It will fully populate ALL interfaces, device type etc... It will look like this:

upload_2016-5-24_10-1-52.png
Now click on Start Capture... and wait about 5 minutes before trying to make sense of it. You want to grab enough traffic because it is hard to judge a congestion problem based on 10 seconds' worth of flows.

Here our heavy-hitter is HTTP, TCP Port 80 (No surprise)...

upload_2016-5-24_10-5-16.png


Now, it becomes clear where the worst part of the traffic is coming from:

upload_2016-5-24_10-8-0.png

Now let's look at the conversation:


upload_2016-5-24_10-10-49.png



You could leave this running ALL day or even all week then come back and look at what is troubling your customer before coming up with an action plan if you want.
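
What the analyzer does with the flow data boils down to adding up bytes per port, per endpoint, and per conversation, then sorting. Here is a minimal Python sketch of that idea. The flow records, the field names, and the top_talkers helper are all made up for illustration; this is not the SolarWinds export format.

```python
from collections import Counter, namedtuple

# Hypothetical flow records; a real collector exports many more fields.
Flow = namedtuple("Flow", ["src", "dst", "port", "octets"])

flows = [
    Flow("203.0.113.10", "192.168.1.20", 80, 48_000_000),
    Flow("203.0.113.10", "192.168.1.20", 80, 32_000_000),
    Flow("198.51.100.7", "192.168.1.31", 443, 6_000_000),
    Flow("198.51.100.9", "192.168.1.44", 25, 1_000_000),
]


def top_talkers(records, field, n=3):
    """Sum bytes per value of `field` and return (value, bytes, percent) rows."""
    totals = Counter()
    for record in records:
        totals[getattr(record, field)] += record.octets
    grand_total = sum(totals.values())
    return [
        (value, octets, round(100.0 * octets / grand_total, 1))
        for value, octets in totals.most_common(n)
    ]


for row in top_talkers(flows, "port"):
    print("port", row)
for row in top_talkers(flows, "dst"):
    print("local host", row)
```

In this made-up data set, port 80 and the local host 192.168.1.20 account for roughly 92% of the bytes, which is the kind of lopsided result you are hunting for.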
 

I actually picked the wrong one above... In this case the conversation should match pubmedgov-vip... blah blah blah. The conversation view will tell you who locally is eating the bandwidth. Sometimes, you may see one (1) computer that is taking 95% or something silly like that.

DO keep in mind that if you have a 3 Mbps circuit on a 100 Mbps interface, you will probably need to add something like "bandwidth 3000" (the value is in kbps) on the interface for the program to recognize that 3 Mbps is 100%. Otherwise, it will probably calculate the flow percentages against the interface wire-speed instead of the real circuit speed.
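
To see why this matters, here is a quick Python sketch of the percentage math. The 2.4 Mbps traffic figure is a made-up number for illustration, and I am assuming the tool simply divides the measured rate by whatever speed the interface reports.

```python
def utilization_percent(flow_bps: float, reference_bps: float) -> float:
    """Return a traffic rate as a percentage of the reference bandwidth."""
    return 100.0 * flow_bps / reference_bps


WIRE_SPEED_BPS = 100_000_000  # 100 Mbps physical interface
CIRCUIT_BPS = 3000 * 1000     # "bandwidth 3000" is in kbps, i.e. 3 Mbps
measured_bps = 2_400_000      # hypothetical 2.4 Mbps of measured traffic

print(utilization_percent(measured_bps, WIRE_SPEED_BPS))  # looks idle
print(utilization_percent(measured_bps, CIRCUIT_BPS))     # actually near the limit
```

The same traffic shows as about 2.4% against wire-speed but 80% against the real circuit, so without the bandwidth statement a nearly full link can look idle.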

If you like this NetFlow tool and want to implement something more permanent, I recommend ManageEngine NetFlow Analyzer or SolarWinds' paid product, Orion! That said, the tool is less important than what you do with it. If you do a LONG capture, you can simply stop the capture and select a Start & End time to look at your capture. You can also open the SolarWinds Real-Time NetFlow Analyzer, select File > Open, and point it to a capture file.

The first thing is to let the customer know what traffic is killing them and/or who is responsible (i.e. which computer) and what is downloaded.

I often go to Robtex:

https://www.robtex.com/

Then paste in an IP or Domain name and find out who owns it... This sheds light on what the transfer was really about.
 
Okay, now I promised in a previous post to show how to shape traffic, but I have been putting it off because there is much to explain. Basically, there are classes of service, TOS bits, Priority bits, and my favorite DSCP (Differentiated Services Code Point). These fall under Quality of Service. The irony is these are all simply bits that can be set in packets to flag them for a router later on down the line to do something with them. These flags alone do absolutely nothing by themselves. Think of them like putting a Post-It note on a folder before handing it to a coworker who does something with it...

The Post-It note does nothing, but if it is your co-worker's policy to immediately act on something with a Post-It note marked IMPORTANT, then it gets the job done.

The one that comes to mind is EF, which is the typical highest-priority flag on virtually ANY system. It stands for Expedited Forwarding.

Now there are tons of different strategies for dealing with Quality of Service, particularly when it comes to VOIP, which is where most of this flagging is used, but doing the actual work involves the magic of queues and the TCP sliding window (particularly in policing).
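
Since these flags really are just bits, a few lines of Python show where they sit. DSCP occupies the upper six bits of the old ToS byte in the IP header, and EF is DSCP value 46. The function names below are my own, purely for illustration.

```python
DSCP_EF = 46  # Expedited Forwarding, binary 101110


def dscp_to_tos_byte(dscp: int) -> int:
    """Place a 6-bit DSCP value in the upper six bits of the ToS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP is a 6-bit value (0-63)")
    return dscp << 2


def tos_byte_to_dscp(tos: int) -> int:
    """Recover the DSCP value from a ToS byte."""
    return tos >> 2


def ip_precedence(tos: int) -> int:
    """The older 3-bit IP Precedence is just the top three bits of the same byte."""
    return tos >> 5


ef_tos = dscp_to_tos_byte(DSCP_EF)
print(hex(ef_tos), tos_byte_to_dscp(ef_tos), ip_precedence(ef_tos))
```

A packet marked EF carries 0xB8 in that byte, which an older device reading only IP Precedence sees as priority 5. Either way it is still just the Post-It note; something downstream has to act on it.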

***********

Now, before we get all nuts let's make some definitions... and a little background for how network flows really work.

Let's say you have two (2) sites, one with a 5 Mbps connection and the other with a 10 Mbps connection, through an MPLS cloud, Internet VPN, ISP VPN... whatever. If the 10 Mbps site is transmitting to the site with 5 Mbps, something has got to give....

What basically happens is the line gets saturated and congested on the incoming interface with the 5 Mbps link, and whatever cannot make it simply gets dropped. TCP has no idea up front where the ceiling is, so the sender keeps growing its window a little at a time until it hits the ceiling and a SEGMENT (think of it as a Layer-4 packet belonging to a connection set up by a three-way handshake) is lost. When this happens, most operating systems cut the window roughly in half and start ramping up again... This keeps happening over and over, and the net result is that you are receiving at about a 5 Mbps rate with some added jitter.

This is analogous to what happens in Policing. In policing, you set an artificial HARD limit, and anything that goes above that limit gets a HARD drop. These packets are simply deleted from the Router or Switch's memory and are gone forever unless they ultimately get re-sent.... Actually, policing has separate actions for traffic that Conforms, Exceeds, and Violates, and different thresholds can be set for each, but your options are limited to Transmit, Mark/Tag, or Drop... The only action that really slows anything down is DROP. But we need not discuss each of the extra buckets to get our hands dirty. Personally, I just set the main bucket and do NOT configure the optional ones because the network device will automatically calculate reasonable defaults for the others!
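
Here is a toy Python simulation of that ramp-up-and-halve behavior. It is a deliberately simplified sketch: the window grows by one segment per round and halves on a loss, with no slow start, no buffering, and no timers, and simulate_aimd is a made-up name.

```python
def simulate_aimd(capacity_segments: int, rounds: int, start_window: int = 1):
    """Return the number of segments delivered in each round trip."""
    window = start_window
    delivered = []
    for _ in range(rounds):
        if window > capacity_segments:
            # The bottleneck dropped the excess: only what fits gets through,
            # and the sender halves its window for the next round.
            delivered.append(capacity_segments)
            window = max(1, window // 2)
        else:
            delivered.append(window)
            window += 1  # additive increase: probe for more bandwidth
    return delivered


history = simulate_aimd(capacity_segments=10, rounds=40)
print(history)
print("average per round:", sum(history) / len(history))
```

The output is the classic sawtooth: the sender climbs to the ceiling, loses a segment, falls back to half, and climbs again, so the delivered rate hovers at or below the ceiling and never settles.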
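
The "bucket" idea can be sketched in a few lines of Python. This is a single-rate, single-bucket policer with only two outcomes (transmit or drop); the class name and numbers are made up, and a real device's conform/exceed/violate logic is more involved.

```python
class Policer:
    """Minimal token-bucket policer: packets either conform or get dropped."""

    def __init__(self, rate_bps: int, burst_bytes: int):
        self.rate_bytes_per_sec = rate_bps / 8
        self.burst_bytes = burst_bytes
        self.tokens = float(burst_bytes)  # bucket starts full
        self.last_time = 0.0

    def offer(self, now: float, size_bytes: int) -> str:
        # Refill tokens for the time that has passed, capped at the burst size.
        elapsed = now - self.last_time
        self.last_time = now
        self.tokens = min(self.burst_bytes,
                          self.tokens + elapsed * self.rate_bytes_per_sec)
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return "transmit"  # conforming traffic
        return "drop"          # exceeding traffic: gone, no queueing


policer = Policer(rate_bps=8000, burst_bytes=1500)
for t in (0.0, 0.5, 2.0):
    print(t, policer.offer(t, 1500))
```

Notice there is no queue anywhere in it. A packet that arrives when the bucket is short on tokens is not delayed; it is simply dropped.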

Shaping on the other hand is SOFT...
The key thing to know is that packets are basically queued... and then they are delayed.
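
For contrast with dropping, here is a rough Python sketch of queue-and-delay. It assumes an unlimited queue and a perfectly paced output, which no real router has (a finite queue eventually drops too), and the shape function is a made-up helper.

```python
def shape(arrivals, rate_bps):
    """Return the time each packet finishes leaving a link paced at rate_bps.

    arrivals: list of (arrival_time_seconds, size_bytes), sorted by time.
    """
    link_free_at = 0.0
    departures = []
    for arrival_time, size_bytes in arrivals:
        # A packet waits in the queue until the packets ahead of it have left.
        start = max(arrival_time, link_free_at)
        link_free_at = start + (size_bytes * 8) / rate_bps
        departures.append(link_free_at)
    return departures


# Four 500-byte packets arrive in one burst at t=0 on a link shaped to 8000 bps.
burst = [(0.0, 500), (0.0, 500), (0.0, 500), (0.0, 500)]
print(shape(burst, rate_bps=8000))
```

All four packets get through; they just leave evenly spaced half a second apart instead of as one burst.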

Now it is important to know where these actions take place... Policing is typically done on ingress traffic coming into an interface (though it can be applied outbound too), and Shaping is done on egress traffic going out of an interface toward another device.


Think about it for a moment. You basically want to immediately mark traffic coming in, but you really cannot shape traffic coming in because it has already arrived. If you are getting hammered and the traffic has already arrived, the only action you can really take is to police the traffic, dropping it down so the other side starts treating your link as slower than it actually is and other traffic can start creeping in... The problem with this strategy is that the pipe is already full from the far end.

In contrast, shaping is done on outbound interfaces. In our example above, we might SHAPE the traffic coming off the 10 Mbps router toward the 5 Mbps router so as NOT to saturate that router's slower link. The beauty of SHAPING is that it creates a nice laminar flow.

Prioritization and Bandwidth actions basically work like SHAPING by specifying what amount of the bandwidth is reserved... Say you have a 10 Mbps pipe with some VOIP that requires 3 Mbps. When the link is congested, they basically set aside 3 Mbps by holding everything else to no more than 7 Mbps.
 
Okay, let's say we have WSUS set up and it is hammering the Wide-Area-Network because someone set up the policies wrong and the computers are going across the slow WAN to get their updates...

Let's say we have this setup: WSUS Server (192.168.1.5/24) > R1 (10 Mbps Router) >> MPLS Cloud >> R2 (5 Mbps Router) > PC (192.168.17.3/24) getting updates.

Where you would want to shape this traffic is on R1, outbound on its WAN interface connecting to the MPLS cloud.

*****************

Shaping the traffic...

The first thing we would do is recognize or "match" the traffic. That's easily accomplished with an Access-List, but we probably want to match only traffic going to R2, since there might be an R3 we are not talking about which has a 100 Mbps link and would not suffer saturation, so there is no reason to shape traffic going everywhere.

In this case, I am going to match the WSUS traffic going to the entire LAN subnet for R2:


On R1:

ip access-list extended Match_WSUS
permit tcp host 192.168.1.5 eq www 192.168.17.0 0.0.0.255
!

With an Access-List used as a filter, Permit passes traffic... Deny drops traffic. When it is used to Classify Traffic, Permit means the packet matches the class (i.e. returns true), and Deny means it does not (returns false).

In the permit line above, "host 192.168.1.5 eq www" is the source and "192.168.17.0 0.0.0.255" is the destination.

This presumes WSUS is running on TCP Port 80, which has an alias of www. DO keep in mind if you are matching a Port, too, you MUST specify TCP or UDP in the access list because each protocol has its own set of ports. IP itself does NOT have ports...

Keep in mind Access-Lists have an implicit deny at the end, so if traffic doesn't match any line it returns FALSE.

Therefore our list conceptually looks like this:

ip access-list extended Match_WSUS
permit tcp host 192.168.1.5 eq www 192.168.17.0 0.0.0.255
<implicit deny all>
!
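
If the wildcard mask and the implicit deny feel abstract, this Python sketch mimics the decision the Match_WSUS list makes for a single packet. The function names are made up, and it only models this one permit line, not how IOS evaluates access lists in general.

```python
import ipaddress


def wildcard_match(address: str, base: str, wildcard: str) -> bool:
    """Compare two IPv4 addresses, ignoring the bits set in the wildcard mask."""
    addr = int(ipaddress.IPv4Address(address))
    base_int = int(ipaddress.IPv4Address(base))
    care_bits = ~int(ipaddress.IPv4Address(wildcard)) & 0xFFFFFFFF
    return (addr & care_bits) == (base_int & care_bits)


def match_wsus(protocol: str, src_ip: str, src_port: int, dst_ip: str) -> bool:
    """Mimic: permit tcp host 192.168.1.5 eq www 192.168.17.0 0.0.0.255"""
    if (protocol == "tcp"
            and src_ip == "192.168.1.5"
            and src_port == 80  # "www" is the alias for TCP port 80
            and wildcard_match(dst_ip, "192.168.17.0", "0.0.0.255")):
        return True   # permit: the packet belongs to the class
    return False      # implicit deny: everything else does not match


print(match_wsus("tcp", "192.168.1.5", 80, "192.168.17.3"))   # the PC behind R2
print(match_wsus("tcp", "192.168.1.5", 80, "192.168.18.3"))   # some other subnet
```

The first call is the WSUS server answering the PC behind R2, so it matches. The second goes to a different subnet, falls through to the implicit deny, and is left out of the class.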

Now we need to build a Class:

class-map match-all CLASSNAME
match access-group name Match_WSUS
!

Now, DO keep in mind, you could use NBAR to match a protocol right under the class-map. You can match DSCP flags, Priority, TOS, lots of things.

match-all = AND operator
match-any = OR operator

You CAN make it match multiple access lists for example simply by adding another "match access-group" line.

Samples (matches are per line):
! match-all works like an AND operator: traffic must match BOTH lists
class-map match-all SAMPLE_MUST_MATCH_BOTH
match access-group name FIRST_LIST
match access-group name SECOND_LIST
!
! match-any works like an OR operator: traffic can match EITHER list
class-map match-any SAMPLE_MUST_MATCH_EITHER
match access-group name FIRST_LIST
match access-group name SECOND_LIST
!
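
In programming terms the two keywords behave like Python's all() and any() over the per-line match results. A tiny sketch, with a made-up class_matches helper:

```python
def class_matches(mode: str, line_results) -> bool:
    """Combine the true/false result of each 'match' line in a class-map."""
    results = list(line_results)
    if mode == "match-all":
        return all(results)  # AND: every match line must be true
    if mode == "match-any":
        return any(results)  # OR: one true match line is enough
    raise ValueError("mode must be 'match-all' or 'match-any'")


# A packet that hits FIRST_LIST but not SECOND_LIST:
hits = [True, False]
print(class_matches("match-all", hits))
print(class_matches("match-any", hits))
```

That packet would fall outside SAMPLE_MUST_MATCH_BOTH but inside SAMPLE_MUST_MATCH_EITHER.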


**********************

Wonderful... Now we need to do something with the class or classes, so we need to put that into a Policy Map.

policy-map SHAPE25PERCENT
class CLASSNAME
shape average 1240000

!

Note: The rate is in bits per second and must be evenly divisible by 8000. 1240000 is 1.24 Mbps (155 x 8000), which for all intents and purposes is close enough to 25% of 5 Mbps. Hence, the above would SHAPE our matched traffic to only about a quarter of the bandwidth of that 5 Mbps far-side router.
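
If you want to work out a rate for some other link or percentage, the arithmetic is easy to script. This Python sketch rounds down to the nearest multiple of 8000, taking the divisible-by-8000 rule above at face value; shape_rate_bps is a made-up helper name.

```python
def shape_rate_bps(link_bps: int, percent: int, step: int = 8000) -> int:
    """Return the largest multiple of `step` not exceeding percent% of the link."""
    target = link_bps * percent // 100
    return target - (target % step)


rate = shape_rate_bps(5_000_000, 25)
print(rate, rate % 8000 == 0)
print(1_240_000 % 8000 == 0)  # the value used in the policy-map is valid too
```

For 25% of 5 Mbps this gives 1248000, slightly closer to the target than 1240000, though either value satisfies the rule.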

***************************

That said, we still need to implement the policy.

interface SomeLink0/0
description Customer WAN Facing Interface
! ...existing interface configuration...
service-policy output SHAPE25PERCENT
! ...existing interface configuration...
!

Now, it is ONLY matching what is in our Access-List, so if a PC on the other side wants to copy files, pull a website from a different server, pull from a web server on a different port on the same server, etc., that traffic will NOT be put in the shaping queue. Hence you have complete control!




***************************


Here is a summary of my rambling:

ip access-list extended Match_WSUS
permit tcp host 192.168.1.5 eq www 192.168.17.0 0.0.0.255
!
class-map match-all CLASSNAME
match access-group name Match_WSUS
!
policy-map SHAPE25PERCENT
class CLASSNAME
shape average 1240000
!
interface SomeLink0/0
description Customer WAN Facing Interface
service-policy output SHAPE25PERCENT
!
 