Linux Traffic Control (tc)

The Linux tc command, for traffic control, allows the attachment of any implemented queuing discipline (23 Queuing and Scheduling) to any network interface (usually of a router). A hierarchical example appears in 24.11 Linux HTB. The tc command is also used extensively by Mininet to control, for example, link queue capacities. An explicit example of adding the fq queuing discipline appears immediately above.

The two examples presented in this section involve “simple” token-bucket filtering, using tbf, and then “classful” token-bucket filtering, using htb. We will use the latter example to apply token-bucket filtering only to one class of connections; other connections receive no filtering.

The granularity of tc-tbf rate control is limited by the CPU-interrupt timer granularity; typically tbf is able to schedule packets only every 10 ms. If the transmission rate is 6 MB/s, or about four 1500-byte packets per millisecond, then tbf will schedule 40 packets for transmission every 10 ms. They will, however, most likely be sent as a burst at the start of the 10-ms interval. Some tc schedulers are able to achieve much finer pacing control; eg the ‘fq’ qdisc of 30.7 TCP Competition: Reno vs BBR above.
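The burst arithmetic here is easy to check directly (an illustrative calculation, using the 6 MB/s rate and 1500-byte packets from the paragraph above):

```python
# How many packets does a 10 ms timer granularity force tbf to
# release in each burst?  (Illustrative arithmetic, not kernel code.)
rate = 6_000_000        # bytes/sec: the 6 MB/s rate from the text
packet_size = 1500      # bytes
tick = 0.010            # 10 ms timer granularity

bytes_per_tick = rate * tick                          # 60,000 bytes
packets_per_tick = int(bytes_per_tick / packet_size)
print(packets_per_tick)                               # 40 packets per burst
```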

The Mininet topology in both cases involves a single router between two hosts, h1—r—h2. We will here use the routerline.py example with the option -N 1; the router is then r1 with interfaces r1-eth0 connecting to h1 and r1-eth1 connecting to h2. The desired topology can also be built using competition.py and then ignoring the third host.

To send data we will use sender.py (30.6.1.1 sender.py), though with the default TCP congestion algorithm. To receive data we will use dualreceive.py, though initially with just one connection sending any significant data. We will set the constant PRINT_CUMULATIVE to False, so dualreceive.py prints at intervals the number of bytes received during the most recent interval; we will call this modified version dualreceive_incr.py. We will also redirect the stderr messages to /dev/null, and start this on h2:

python3 dualreceive_incr.py 2>/dev/null

We start the main sender on h1 with the following, where h2 has IPv4 address 10.0.1.10 and 1,000,000 is the number of blocks:

python3 sender.py 1000000 10.0.1.10 5430

The dualreceive program will not do any reading until both connections are enabled, so we also need to create a second connection from h1 in order to get started; this second connection sends only a single block of data:

python3 sender.py 1 10.0.1.10 5431

At this point dualreceive should generate output somewhat like the following (with timestamps in the first column rounded to the nearest millisecond). The byte-count numbers in the middle column are rather hardware-dependent.

1.016 14079000       0
1.106 12702000       0
1.216 14724000       0
1.316 13666448       0
1.406 11877552       0

An Introduction to Computer Networks, Release 2.0.4

This means that, on average, h2 is receiving about 13 MB every 100ms, which is about 1.0 Gbps.
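The conversion behind that estimate is a one-liner (a back-of-the-envelope check, using a representative 13 MB middle-column value from the output above):

```python
# Convert dualreceive_incr.py's per-interval byte counts into a bit rate:
# 13 MB every 100 ms is about 1.04 Gbps.
bytes_per_interval = 13_000_000   # representative middle-column value
interval = 0.1                    # seconds between reports
gbps = bytes_per_interval / interval * 8 / 1e9
print(round(gbps, 2))             # 1.04
```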

Now we run the command below on r1 to reduce the rate (tc requires the abbreviation mbit for megabit/sec; it treats mbps as MegaBytes per second). The token-bucket filter parameters are rate and burst. The purpose of the limit parameter – used by netem and several other qdiscs as well – is to specify the maximum queue size for the waiting packets. Its value here is not very significant, but too low a value can lead to packet loss and thus to momentarily plunging bandwidth. Too high a value, on the other hand, can lead to bufferbloat (21.5.1 Bufferbloat).

tc qdisc add dev r1-eth1 root tbf rate 40mbit burst 50kb limit 200kb

We get output something like this:

1.002     477840      0
1.102     477840      0
1.202     477840      0
1.302     482184      0
1.402     473496      0

477840 bytes per 100 ms is 38.2 Mbps. That is received application data; the extra 5% or so to 40 Mbps corresponds mostly to packet headers (66 bytes out of every 1514, though to see this with Wireshark we need to disable TSO, 17.5 TCP Offloading).
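That 5% can be checked directly (a back-of-the-envelope calculation; 1448 data bytes per 1514-byte packet corresponds to the 66-byte header overhead just mentioned):

```python
# Expected application data per 100 ms at a 40 mbit/s cap, after
# discounting 66 bytes of TCP/IP headers in each 1514-byte packet.
rate_bps = 40_000_000
data_fraction = 1448 / 1514          # payload share of each full packet
expected = rate_bps / 8 * 0.1 * data_fraction
print(round(expected))               # ~478,000; close to the observed 477,840
```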

We can also change the rate dynamically:

tc qdisc change dev r1-eth1 root tbf rate 20mbit burst 100kb limit 200kb
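The rate and burst parameters drive the classic token-bucket algorithm, which can be sketched in a few lines of Python (a simplified model of the idea, not the kernel implementation; the 40mbit/50kb figures match the first tc command above):

```python
class TokenBucket:
    """Toy token-bucket model: tokens accrue at `rate` bytes/sec,
    capped at `burst` bytes; a packet is sent only if enough tokens
    are on hand."""
    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.tokens = burst      # bucket starts full
        self.last = 0.0

    def allow(self, size, now):
        # credit tokens for elapsed time, capped at the bucket size
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= size:
            self.tokens -= size
            return True
        return False

tb = TokenBucket(rate=5_000_000, burst=50_000)   # 40mbit = 5 MB/s, 50kb burst
# An initial flood of 1500-byte packets: only the first 33 (49,500 bytes)
# fit within the 50,000-byte burst allowance.
sent = sum(tb.allow(1500, 0.0) for _ in range(40))
print(sent)   # 33
```

Packets that allow() refuses are what the limit parameter's queue holds; once that queue is full, further packets are dropped.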

 

The above use of tbf allows us to throttle (or police) all traffic through interface r1-eth1. Suppose, though, that we want to police only selected traffic. Then we can use the hierarchical token bucket, htb. We set up an htb root node, with no limits, and then create two child classes, one for policed traffic and one for default traffic.


To create the htb hierarchy we will first create the root qdisc and associated root class. We need the raw interface rate, here taken to be 1000mbit. Class identifiers are of the form major:minor, where major is the integer root “handle” and minor is another integer.


tc qdisc add dev r1-eth1 root handle 1: htb default 10
tc class add dev r1-eth1 parent 1: classid 1:1 htb rate 1000mbit

We now create the two child classes (not qdiscs), one for the rate-limited traffic and one for default traffic. The rate-limited class has classid 1:2 here; the default class has classid 1:10.

tc class add dev r1-eth1 parent 1: classid 1:2 htb rate 40mbit
tc class add dev r1-eth1 parent 1: classid 1:10 htb rate 1000mbit

We still need a classifier (or filter) to assign selected traffic to class 1:2. Our goal is to police traffic to port 5430 (by default, dualreceive.py accepts traffic at ports 5430 and 5431).

There are several classifiers available; for example u32 (man tc-u32) and bpf (man tc-bpf). The latter is based on the Berkeley Packet Filter virtual machine for packet recognition. However, what we use here – mainly because it seems to work most reliably – is the iptables fwmark mechanism, used earlier in 13.6 Routing on Other Attributes. Iptables is intended for filtering – and sometimes modifying – packets; we can associate a fwmark value of 2 to packets bound for TCP port 5430 with the command below (the fwmark value does not become part of the packet; it exists only while the packet remains in the kernel).

iptables --append FORWARD --table mangle --protocol tcp --dport 5430 --jump MARK --set-mark 2

 

When this is run on r1, packets forwarded by r1 to TCP port 5430 receive the fwmark upon arrival.

The next step is to tell the tc subsystem that packets with a fwmark value of 2 are to be placed in class 1:2; this is the rate-limited class above. In the following command, flowid may be used as a synonym for classid.

tc filter add dev r1-eth1 parent 1:0 protocol ip handle 2 fw classid 1:2
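The combined effect of the iptables rule and the fw filter can be modeled in a few lines (a hypothetical sketch of the classification logic, not kernel code):

```python
def fwmark(dport):
    # model of the iptables mangle rule: TCP dport 5430 is marked 2
    return 2 if dport == 5430 else 0

def classify(mark):
    # model of the tc fw filter: mark 2 selects class 1:2;
    # everything else falls through to the htb default, class 1:10
    return "1:2" if mark == 2 else "1:10"

print(classify(fwmark(5430)))   # 1:2  (policed at 40mbit)
print(classify(fwmark(5431)))   # 1:10 (default, 1000mbit)
```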

We can view all these settings with

tc qdisc show dev r1-eth1
tc class show dev r1-eth1
tc filter show dev r1-eth1 parent 1:0
iptables --table mangle --list

We now verify that all this works. As with tbf, we start dualreceive_incr.py on h2 and two senders on h1. This time, both senders send large amounts of data:

h2: python3 dualreceive_incr.py 2>/dev/null
h1: python3 sender.py 500000 10.0.1.10 5430
h1: python3 sender.py 500000 10.0.1.10 5431

If everything works, then shortly after the second sender starts we should see something like the output below (taken after both TCP connections' cwnd values have stabilized). The middle column is the number of data bytes received at the policed port, 5430.

1.000 453224 10425600
1.100 457568 10230120
1.200 461912 9934728


1.300 476392 10655832
1.401 438744 10230120

 

With 66 bytes of TCP/IP headers in every 1514-byte packet, our requested 40 mbit data-rate cap should yield about 478,000 bytes every 0.1 sec. The slight reduction above appears to be related to TCP competition; the full 478,000-byte rate is achieved after the port-5431 connection terminates.
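Averaging the five policed-port samples above quantifies that slight reduction:

```python
# Middle-column values (bytes per 100 ms to port 5430) from the output above.
samples = [453224, 457568, 461912, 476392, 438744]
mean = sum(samples) / len(samples)
mbps = mean * 10 * 8 / 1e6          # bytes per 100 ms -> Mbps
print(round(mbps, 1))               # 36.6, a bit under the 38.2 Mbps data rate
```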
