Chapter 4. IPFW Dummynet and Traffic Shaping

FreeBSD’s dummynet is not a network for dummies. It is a sophisticated traffic shaping facility for controlling bandwidth usage and applying scheduling algorithms. In this use of ipfw, the focus is not on ruleset development, although rules are still used to select the traffic passed to dummynet objects. Instead, the focus is on setting up a system to shape traffic flows. dummynet provides tools to model links, scheduling, and queuing much as they behave on the real-world Internet.

dummynet works with three main types of objects - a pipe, a queue, and a sched (short for scheduler) - which are also the three keywords examined next.

A pipe (not to be confused with a Unix pipe(2)!) is a model of a network link with a configurable bandwidth and propagation delay.

A queue is an abstraction used to implement packet scheduling using one of several different scheduling algorithms. Packets sent to a queue are first grouped into flows according to a mask on a 5-tuple (protocol, source address, source port, destination address, destination port) specification. Flows are then passed to the scheduler associated with the queue, and each flow uses scheduling parameters (weight, bandwidth, etc.) as configured in the queue itself. A sched (scheduler) in turn is connected to a pipe (an emulated link) and arbitrates the link’s bandwidth among backlogged flows according to weights and to the features of the scheduling algorithm in use.

Network performance testing is a complex subject that can encompass many variables across many different testing strategies. To understand the basics behind dummynet, it is not necessary to dive into the deepest levels of network performance testing - only enough to understand how to use dummynet. Also, these tests are restricted to using IP and TCP exclusively.

Setting Up for Traffic Measurement

Most of the examples in this section can be done with the architecture used in the original lab setup in Chapter 2, copied here for reference:

IPFW Lab for dummynet Examples. Refer to paragraphs below.
Figure 1. IPFW Lab for dummynet Examples

Use this bridge and tap configuration on the FreeBSD host system:

% sudo /bin/sh mkbr.sh reset bridge0 tap0 tap1 tap2
% /bin/sh swim.sh  # or scim.sh for screen(1)
% /bin/sh runvm.sh firewall external1 external2

Apply the correct addressing for each VM in this example.

Where necessary, additional virtual machines can be created and added to the bridge.

4.1. Measuring Default Throughput

The idea behind dummynet is that it lets one model and/or shape network speeds, available bandwidth, and scheduling algorithms. But it is first necessary to establish the baseline transfer speed of the current environment (QEMU virtual machines over a FreeBSD bridge). To find out, here is a short detour to learn iperf3, the network bandwidth testing tool used to perform simple transfer and bitrate measurements.

iperf3 can determine the effective throughput of data transfer for a network. Sometimes called "goodput", this is the basic speed the user sees when transferring data across the network - the value unencumbered by protocol type and overhead.

To use iperf3, ensure that the software is installed on the firewall VM and on the external1 VM (and on external2 and external3), and that ipfw on the firewall VM is disabled (# kldunload ipfw).
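
A minimal preparation sketch, assuming iperf3 is installed from the FreeBSD package collection:

# pkg install iperf3        <--- on the firewall, external1, and external2 VMs
# kldunload ipfw            <--- on the firewall VM, if ipfw is currently loaded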

The basic operation of iperf3 is as a client-server architecture, so on the external1 VM system, start the iperf3 software in server mode:

# iperf3 -s 		<--- run iperf3 in server mode
-----------------------------------------------------------
Server listening on 5201 (test #1)
-----------------------------------------------------------
  . . .

Then, on the firewall VM, run the client:

# iperf3 -c 203.0.113.10        <--- connect to external1 server and send test data
Connecting to host 203.0.113.10, port 5201
[  5] local 203.0.113.50 port 19359 connected to 203.0.113.10 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.03   sec  12.5 MBytes   102 Mbits/sec    0   1.07 MBytes
[  5]   1.03-2.09   sec  13.8 MBytes   108 Mbits/sec    0   1.07 MBytes
[  5]   2.09-3.07   sec  12.5 MBytes   107 Mbits/sec    0   1.07 MBytes
[  5]   3.07-4.09   sec  12.5 MBytes   103 Mbits/sec    0   1.07 MBytes
[  5]   4.09-5.08   sec  12.5 MBytes   106 Mbits/sec    0   1.07 MBytes
[  5]   5.08-6.09   sec  12.5 MBytes   105 Mbits/sec    0   1.07 MBytes
[  5]   6.09-7.07   sec  12.5 MBytes   107 Mbits/sec    0   1.07 MBytes
[  5]   7.07-8.05   sec  12.5 MBytes   107 Mbits/sec    0   1.07 MBytes
[  5]   8.05-9.04   sec  12.5 MBytes   106 Mbits/sec    0   1.07 MBytes
[  5]   9.04-10.02  sec  12.5 MBytes   107 Mbits/sec    0   1.07 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.02  sec   126 MBytes   106 Mbits/sec    0             sender
[  5]   0.00-10.02  sec   126 MBytes   106 Mbits/sec                  receiver

iperf Done.
#

A key test for measuring throughput is to send a file of data and measure the transfer speed. To create the file, use jot(1) on the firewall VM:

# jot -r -s "" 10000000 0 9 > A.bin

This command creates a file of random ASCII digits exactly 10,000,001 bytes long. (Note that it can take anywhere from 30 seconds to two minutes to create the file on a QEMU virtual machine.)
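
To confirm the size, wc(1) can be used; it should report 10,000,001 bytes (the digits plus a trailing newline):

# wc -c A.bin
 10000001 A.bin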

To transfer the file to the server on the external1 VM use this command:

# iperf3 -F A.bin -c 203.0.113.10 -t 10
Connecting to host 203.0.113.10, port 5201
[  5] local 203.0.113.50 port 51657 connected to 203.0.113.10 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.04   sec  12.5 MBytes   101 Mbits/sec    0    490 KBytes
[  5]   1.04-1.52   sec  5.81 MBytes   101 Mbits/sec    0    490 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-1.52   sec  18.3 MBytes   101 Mbits/sec    0             sender
        Sent 18.3 MByte / 18.3 MByte (100%) of A.bin
[  5]   0.00-1.52   sec  18.3 MBytes   101 Mbits/sec                  receiver
iperf Done.
#

Running this command several times shows that a consistent average bitrate for throughput on this system is about 101 Mbits/second - roughly 12.6 MBytes/second. (Your values will differ on your local machine.)

There is now a baseline TCP-based "goodput" value for testing dummynet traffic shaping commands.

4.2. IPFW Commands for Dummynet

To use dummynet, load the kernel module dummynet.ko in addition to the ipfw.ko module on the firewall VM:

# kldload ipfw
# kldload dummynet
load_dn_sched dn_sched FIFO loaded
load_dn_sched dn_sched QFQ loaded
load_dn_sched dn_sched RR loaded
load_dn_sched dn_sched WF2Q+ loaded
load_dn_sched dn_sched PRIO loaded
load_dn_sched dn_sched FQ_CODEL loaded
load_dn_sched dn_sched FQ_PIE loaded
load_dn_aqm dn_aqm CODEL loaded
load_dn_aqm dn_aqm PIE loaded
#

dummynet announces the scheduler and AQM modules it has loaded.

4.2.1. Simple Pipe Configuration

Recall that dummynet uses pipes, queues, and sched (schedulers) to shape traffic.

To see dummynet in action, create a pipe with limited bandwidth, and assign it to a rule matching traffic to the external1 VM:

# Load the ipfw kernel module if needed:
# kldload ipfw
ipfw2 (+ipv6) initialized, divert loadable, nat loadable, default to deny, logging disabled
#
# ipfw pipe 1 config bw 300Kbit/s
# ipfw pipe 1 show
00001: 300.000 Kbit/s    0 ms burst 0
q131073  50 sl. 0 flows (1 buckets) sched 65537 weight 0 lmax 0 pri 0 droptail
 sched 65537 type FIFO flags 0x0 0 buckets 0 active
#

The above output shows the pipe configuration limiting bandwidth (bw) to 300 Kbits/sec.

Recent versions of FreeBSD now use the command alias dnctl for configuration of pipes, queues, and schedulers. See dnctl(8) for details.
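
For example, on a version that provides dnctl, the pipe above could equivalently be configured and inspected with the following sketch (the syntax mirrors the ipfw pipe, queue, and sched subcommands):

# dnctl pipe 1 config bw 300Kbit/s
# dnctl pipe 1 show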

Now add ipfw rules to send traffic between the firewall VM and the external1 VM through the pipe:

# ipfw add 100 check-state
00100 check-state :default
#
# ipfw add 1000 pipe 1 ip from any to any
01000 pipe 1 ip from any to any
#
# ipfw list
00100 check-state :default
01000 pipe 1 ip from any to any
65535 deny ip from any to any
#

By adding the matching phrase "ip from any to any" and assigning it to pipe 1, the firewall VM is directed to send all IP-based traffic through pipe 1, now configured as a 300 Kbit/sec link.

Re-running the basic iperf3 file transfer (this time with --length 1460) shows the difference:

# iperf3 -F A.bin -c 203.0.113.10 -t 10 --length 1460
Connecting to host 203.0.113.10, port 5201
[  5] local 203.0.113.50 port 39558 connected to 203.0.113.10 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.01   sec  75.6 KBytes   612 Kbits/sec    0   25.6 KBytes
[  5]   1.01-2.01   sec  41.3 KBytes   339 Kbits/sec    0   51.0 KBytes
[  5]   2.01-3.01   sec  45.6 KBytes   374 Kbits/sec    0   62.3 KBytes
[  5]   3.01-4.01   sec  27.1 KBytes   222 Kbits/sec    0   66.6 KBytes
[  5]   4.01-5.01   sec  35.6 KBytes   292 Kbits/sec    0   66.6 KBytes
[  5]   5.01-6.01   sec  44.2 KBytes   362 Kbits/sec    0   66.6 KBytes
[  5]   6.01-7.01   sec  21.4 KBytes   175 Kbits/sec    0   66.6 KBytes
[  5]   7.01-8.01   sec  37.1 KBytes   304 Kbits/sec    0   66.6 KBytes
[  5]   8.01-9.01   sec  48.5 KBytes   397 Kbits/sec    0   66.6 KBytes
[  5]   9.01-10.01  sec  22.8 KBytes   187 Kbits/sec    0   66.6 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.01  sec   399 KBytes   327 Kbits/sec    0            sender
        Sent  399 KByte / 9.54 MByte (4%) of A.bin
[  5]   0.00-10.73  sec   379 KBytes   289 Kbits/sec                  receiver

iperf Done.

If the system returns the error "iperf3: error - control socket has closed unexpectedly", simply re-run the command.

Here, during iperf3's 10-second run, the ipfw dummynet configuration limited the transfer speed to an average of about 327 Kbits/sec, and only about 4% of the entire 10MB file was transferred.

To see how to use dummynet to configure different link speeds, set up a second pipe:

# ipfw pipe 2 config bw 3Mbit/s
# ipfw pipe show
00001: 300.000 Kbit/s    0 ms burst 0
q131073  50 sl. 0 flows (1 buckets) sched 65537 weight 0 lmax 0 pri 0 droptail
 sched 65537 type FIFO flags 0x0 0 buckets 0 active
00002:   3.000 Mbit/s    0 ms burst 0
q131074  50 sl. 0 flows (1 buckets) sched 65538 weight 0 lmax 0 pri 0 droptail
 sched 65538 type FIFO flags 0x0 0 buckets 0 active
#

This pipe is configured to be 10 times faster than pipe 1 (3 Mbit/sec instead of 300 Kbit/sec). To test this pipe, start up the external2 VM and run iperf3 -s on it. Then reconfigure the ipfw rules to send traffic between the firewall VM and the external2 VM through pipe 2:

# ipfw list
00100 check-state :default
01000 pipe 1 ip from any to any
65535 deny ip from any to any
#
# ipfw delete 1000
#
# ipfw add 1000 pipe 1 ip from me to 203.0.113.10 // external1
01000 pipe 1 ip from me to 203.0.113.10
#
# ipfw add 1100 pipe 1 ip from 203.0.113.10 to me // external1
01100 pipe 1 ip from 203.0.113.10 to me
#
# ipfw add 2000 pipe 2 ip from me to 203.0.113.20 // external2
02000 pipe 2 ip from me to 203.0.113.20
#
# ipfw add 2100 pipe 2 ip from 203.0.113.20 to me // external2
02100 pipe 2 ip from 203.0.113.20 to me
#
# ipfw list
00100 check-state :default
01000 pipe 1 ip from me to 203.0.113.10
01100 pipe 1 ip from 203.0.113.10 to me
02000 pipe 2 ip from me to 203.0.113.20
02100 pipe 2 ip from 203.0.113.20 to me
65535 deny ip from any to any
#

As expected, pipe 2 is approximately 10 times faster than pipe 1:

# iperf3 -F A.bin -c 203.0.113.20 -t 10 --length 1460
Connecting to host 203.0.113.20, port 5201
[  5] local 203.0.113.50 port 48108 connected to 203.0.113.20 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.01   sec   399 KBytes  3.24 Mbits/sec    0   64.0 KBytes
[  5]   1.01-2.01   sec   358 KBytes  2.93 Mbits/sec    0   64.0 KBytes
[  5]   2.01-3.01   sec   359 KBytes  2.94 Mbits/sec    0   64.0 KBytes
[  5]   3.01-4.01   sec   364 KBytes  2.98 Mbits/sec    0   64.0 KBytes
[  5]   4.01-5.01   sec   368 KBytes  3.01 Mbits/sec    0   66.9 KBytes
[  5]   5.01-6.01   sec   332 KBytes  2.72 Mbits/sec    0   66.9 KBytes
[  5]   6.01-7.01   sec   362 KBytes  2.97 Mbits/sec    0   66.9 KBytes
[  5]   7.01-8.01   sec   355 KBytes  2.91 Mbits/sec    0   66.9 KBytes
[  5]   8.01-9.01   sec   345 KBytes  2.83 Mbits/sec    0   66.9 KBytes
[  5]   9.01-10.01  sec   344 KBytes  2.81 Mbits/sec    0   66.9 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.01  sec  3.50 MBytes  2.93 Mbits/sec    0            sender
        Sent 3.50 MByte / 9.54 MByte (36%) of A.bin
[  5]   0.00-10.06  sec  3.48 MBytes  2.90 Mbits/sec                  receiver

iperf Done.

Next, change the pipe configuration without changing the ruleset. Below, the pipe 1 bandwidth is changed to the equivalent of a telecommunications T1 line as in the days of old:

# ipfw pipe 1 config bw 1544Kbit/s
# ipfw pipe show
00001:   1.544 Mbit/s    0 ms burst 0
q131073  50 sl. 0 flows (1 buckets) sched 65537 weight 0 lmax 0 pri 0 droptail
 sched 65537 type FIFO flags 0x0 0 buckets 0 active
00002:   3.000 Mbit/s    0 ms burst 0
q131074  50 sl. 0 flows (1 buckets) sched 65538 weight 0 lmax 0 pri 0 droptail
 sched 65538 type FIFO flags 0x0 0 buckets 0 active
#

Resending the 10MB file across the T1-configured link shows these results:

root@firewall:~ # iperf3 -F A.bin -c 203.0.113.10 -t 10 --length 1460
Connecting to host 203.0.113.10, port 5201
[  5] local 203.0.113.50 port 35768 connected to 203.0.113.10 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.01   sec   235 KBytes  1.91 Mbits/sec    0   25.6 KBytes
[  5]   1.01-2.01   sec   173 KBytes  1.41 Mbits/sec    0   25.6 KBytes
[  5]   2.01-3.01   sec   195 KBytes  1.60 Mbits/sec    0   27.0 KBytes
[  5]   3.01-4.01   sec   174 KBytes  1.42 Mbits/sec    0   45.0 KBytes
[  5]   4.01-5.01   sec   182 KBytes  1.50 Mbits/sec    0   62.1 KBytes
[  5]   5.01-6.01   sec   178 KBytes  1.46 Mbits/sec    0   62.1 KBytes
[  5]   6.01-7.01   sec   174 KBytes  1.42 Mbits/sec    0   62.1 KBytes
[  5]   7.01-8.01   sec   180 KBytes  1.47 Mbits/sec    0   62.1 KBytes
[  5]   8.01-9.01   sec   204 KBytes  1.67 Mbits/sec    0   62.1 KBytes
[  5]   9.01-10.01  sec   178 KBytes  1.46 Mbits/sec    0   62.1 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.01  sec  1.83 MBytes  1.53 Mbits/sec    0            sender
        Sent 1.83 MByte / 9.54 MByte (19%) of A.bin
[  5]   0.00-10.21  sec  1.81 MBytes  1.48 Mbits/sec                  receiver

iperf Done.

This is roughly half of the 3 Mbits/sec speed of pipe 2 - again as expected.

By definition, a pipe has just one queue, and it is subject to "First In First Out" (FIFO) operation. All traffic that flows through this pipe shares the same characteristics.

However, creating a pipe also does something else. It creates a default sched (scheduler) that governs the pipe:

Start with no pipes or schedulers

#
# ipfw pipe list
#
# ipfw sched list
#

Create a simple pipe.

# ipfw pipe 1 config bw 100KBit/s
#
# ipfw pipe list
00001: 100.000 Kbit/s    0 ms burst 0
q131073  50 sl. 0 flows (1 buckets) sched 65537 weight 0 lmax 0 pri 0 droptail
 sched 65537 type FIFO flags 0x0 0 buckets 0 active
#

Observe the default scheduler for this pipe

# ipfw sched list
00001: 100.000 Kbit/s    0 ms burst 0
 sched 1 type WF2Q+ flags 0x0 0 buckets 0 active
#

The default scheduler for a new pipe is of type WF2Q+, a version of the Weighted Fair Queueing algorithm for packet transfer.

The result is a single FIFO pipe managed by a WF2Q+ scheduling algorithm.

The ipfw(8) man page makes note of several other scheduling algorithms. These can be selected by using the "type" keyword on the pipe command. The type keyword selects the type of scheduler applied to the pipe - not the type of the pipe itself (the pipe remains FIFO):

# ipfw pipe list
#
# ipfw sched list
#

Create a pipe and assign a scheduler of type Round Robin (Deficit Round Robin)

# ipfw pipe 1 config bw 100KBit/s type rr
#
# ipfw pipe list
00001: 100.000 Kbit/s    0 ms burst 0
q131073  50 sl. 0 flows (1 buckets) sched 65537 weight 0 lmax 0 pri 0 droptail
 sched 65537 type FIFO flags 0x0 0 buckets 0 active
#

View the new scheduler of type RR (Deficit Round Robin)

# ipfw sched list
00001: 100.000 Kbit/s    0 ms burst 0
 sched 1 type RR flags 0x0 0 buckets 0 active
#

pipes and scheds (schedulers) are tightly bound. In fact, there is no command to delete a scheduler. The scheduler is deleted when the pipe is deleted.

Note however that the scheduler can be configured independently if desired. Here is a change to the scheduler type from the above type RR to QFQ, a variant of WF2Q+:

#
# ipfw sched 1 config type qfq
Bump qfq weight to 1 (was 0)
Bump qfq maxlen to 1500 (was 0)
#
# ipfw sched list
00001: 100.000 Kbit/s    0 ms burst 0
 sched 1 type QFQ flags 0x0 0 buckets 0 active
#

There are other keywords that can be added to a pipe specification: delay, burst, profile, weight, buckets, mask, noerror, plr, queue, red or gred, codel, and pie. These are described in the ipfw(8) man page.

A contrived example might be:

Start fresh

# ipfw pipe 1 delete
#
# ipfw pipe 1 config bw 100kbit/s delay 20 burst 2000 weight 40 buckets 256 mask src-ip 0x000000ff noerror plr 0.01 queue 75 red .3/25/30/.5 type qfq
#
# ipfw pipe list
00001: 100.000 Kbit/s   20 ms burst 2000
q131073  75 sl.plr 0.010000 0 flows (1 buckets) sched 65537 weight 40 lmax 0 pri 0
          RED w_q 0.299988 min_th 25 max_th 30 max_p 0.500000
 sched 65537 type FIFO flags 0x1 256 buckets 0 active
    mask:  0x00 0x000000ff/0x0000 -> 0x00000000/0x0000
#
# ipfw sched list
00001: 100.000 Kbit/s   20 ms burst 2000
 sched 1 type QFQ flags 0x1 256 buckets 0 active
    mask:  0x00 0x000000ff/0x0000 -> 0x00000000/0x0000
#

Setting up two separate pipes to send data to the same destination is overkill. It is like setting up two separate network links between the two points. While that may be desirable for redundancy or high-availability, it makes no difference for bandwidth allocation. (Yes, link aggregation is possible, but that is not being considered here.)

What is usually needed is a way to separate traffic into different "lanes" and assign different "speed limits" to each lane. That is exactly what queues are for.

4.2.2. Simple Pipe and Queue Configuration

Before going further, it is useful to disambiguate the two meanings of the word "queue".

In a pipe definition, by default, the pipe is assigned a queue where incoming packets are held before processing and transit. The size of this "pipe queue" is by default 50 packets, but can be changed with the queue keyword on the pipe definition:

# ipfw pipe 1 config bw 200Kbit/s
#
# ipfw pipe list
00001: 200.000 Kbit/s    0 ms burst 0
q131073  50 sl. 0 flows (1 buckets) sched 65537 weight 0 lmax 0 pri 0 droptail
 sched 65537 type FIFO flags 0x0 0 buckets 0 active
#
# ipfw pipe 2 config bw 200Kbit/s  queue 75
#
# ipfw pipe list
00001: 200.000 Kbit/s    0 ms burst 0
q131073  50 sl. 0 flows (1 buckets) sched 65537 weight 0 lmax 0 pri 0 droptail
 sched 65537 type FIFO flags 0x0 0 buckets 0 active
00002: 200.000 Kbit/s    0 ms burst 0
q131074  75 sl. 0 flows (1 buckets) sched 65538 weight 0 lmax 0 pri 0 droptail
 sched 65538 type FIFO flags 0x0 0 buckets 0 active
#

In contrast, dummynet has the concept of flow queues: virtual groupings of packets assigned to a flow according to the mask given in their ipfw queue definitions.

Configuring a queue is almost as simple as configuring a pipe.

Start with a clean slate (all objects and rules deleted):

# kldunload dummynet
# kldunload ipfw
# kldload ipfw
ipfw2 (+ipv6) initialized, divert loadable, nat loadable, default to deny, logging disabled
# kldload dummynet
load_dn_sched dn_sched FIFO loaded
load_dn_sched dn_sched QFQ loaded
load_dn_sched dn_sched RR loaded
load_dn_sched dn_sched WF2Q+ loaded
load_dn_sched dn_sched PRIO loaded
load_dn_sched dn_sched FQ_CODEL loaded
load_dn_sched dn_sched FQ_PIE loaded
load_dn_aqm dn_aqm CODEL loaded
load_dn_aqm dn_aqm PIE loaded
#
# ipfw queue 1 config pipe 1
#
# ipfw queue show
q00001  50 sl. 0 flows (1 buckets) sched 1 weight 0 lmax 0 pri 0 droptail

Here is one queue of size 50 packets that was created and assigned to pipe 1. Since there is no assigned weight, the weight is 0 (zero), which is the least weight possible. The queue currently has 0 flows, meaning that this queue has no traffic flowing through it.

Notice, however, that the queue was created before the pipe it references. That is why the weight shows as 0 rather than the default queue weight of 1; the configuration was done out of order. To maintain a readable configuration, it is best to configure the objects in the following order:

  1. pipes (also creates a scheduler, which can be assigned a specific scheduler type)

  2. queues - create queues and assign weights, source and destination masks, delay, and other characteristics to the queue

  3. Assign rules to match traffic using standard 5-tuples or as needed

dummynet also has the ability to separate out different flows within the same pipe to perform different scheduling algorithms. An example of this capability is shown later in this section.

When transferring a file to the external1 VM and attempting to type interactively on the external1 VM at the same time, the ability to type at speed is dramatically reduced. The file transfer packets, being much larger than interactive typing packets, hog all the bandwidth. This effect is a well-known phenomenon to anyone who edits documents on a remote site. Since a file transfer program creates packets much faster than anyone can type, the outbound queue is almost always full of large packets, leaving keystrokes separated by large amounts of file transfer data in the queue.

Try this out on the firewall VM by resetting the pipe 1 bandwidth to 300 Kbit/sec and, in one session, running iperf3 -c 203.0.113.10 -t 60. Then, in another session, add rules for ssh traffic if needed, ssh to the external1 VM, and try to enter text into a scratch file, as sketched below. The typing delay is almost unbearable.
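
A possible sequence, assuming the pipe rules from the previous section that send traffic to and from the external1 VM through pipe 1 are still in place (adjust rule numbers and addresses as needed):

# ipfw pipe 1 config bw 300Kbit/s        <--- slow link again
# iperf3 -c 203.0.113.10 -t 60           <--- session one: 60-second bulk transfer
# ssh 203.0.113.10                       <--- session two: try typing into an editor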

To control traffic flow between the firewall VM and any external VM host, set up individual queues to separate traffic within a pipe. Queues can be either static - defined with ipfw queue N config ... - or dynamic, created automatically when the mask keyword is used. Masks for queues are called flow masks. The mask determines whether a packet entering or leaving the firewall is selected for a particular queue. Consider the following example:

# ipfw pipe 1 config bw 200Kbit/s mask src-ip 0x000000ff

Each host in the /24 network transferring data through pipe 1 (based on suitable rules) will have its own dynamic queue and pipe instance, each regulated by the bandwidth configured for the pipe (see Dynamic Pipes later in this chapter).

If a different data transfer is started that does not match the rules feeding the pipe, it has no effect on the traffic in the pipe and queue; dummynet keeps such transfers separate from the pipe and queue operations.
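
A sketch of a rule that would feed such traffic into the masked pipe (the rule number is arbitrary); each source host in the /24 network then gets its own dynamic flow through pipe 1:

# ipfw pipe 1 config bw 200Kbit/s mask src-ip 0x000000ff
# ipfw add 1000 pipe 1 ip from 203.0.113.0/24 to any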

If instead, the goal is to create separate individual queues with different characteristics such as different weights or delay, create static queues and then assign them to individual pipes as desired:

#
# ipfw pipe 1 config bw 300kbit/s
#
# ipfw pipe show
00001: 300.000 Kbit/s    0 ms burst 0
q131073  50 sl. 0 flows (1 buckets) sched 65537 weight 0 lmax 0 pri 0 droptail
 sched 65537 type FIFO flags 0x0 0 buckets 0 active
#
# ipfw queue 1 config pipe 1 weight 10 mask dst-ip 0xffffffff dst-port 5201
Bump flowset buckets to 64 (was 0)
#
# ipfw queue 2 config pipe 1 weight 10 mask dst-ip 0xffffffff dst-port 5202
Bump flowset buckets to 64 (was 0)
#
# ipfw queue show
q00001  50 sl. 0 flows (64 buckets) sched 1 weight 10 lmax 0 pri 0 droptail
    mask:  0x00 0x00000000/0x0000 -> 0xffffffff/0x1451
q00002  50 sl. 0 flows (64 buckets) sched 1 weight 10 lmax 0 pri 0 droptail
    mask:  0x00 0x00000000/0x0000 -> 0xffffffff/0x1452
#
# ipfw add 10 allow icmp from any to any
00010 allow icmp from any to any
#
# ipfw add 100 check-state
00100 check-state :default
#
# ipfw add 1000 queue 1 tcp from me to 203.0.113.10 5201 setup  keep-state
01000 queue 1 tcp from me to 203.0.113.10 5201 setup keep-state :default
#
# ipfw add 1100 queue 2 tcp from me to 203.0.113.20 5202 setup  keep-state
01100 queue 2 tcp from me to 203.0.113.20 5202 setup keep-state :default
#
# ipfw list
00010 allow icmp from any to any
00100 check-state :default
01000 queue 1 tcp from me to 203.0.113.10 5201 setup keep-state :default
01100 queue 2 tcp from me to 203.0.113.20 5202 setup keep-state :default
65535 deny ip from any to any
#

Later versions of FreeBSD may not print any output for ipfw queue configuration statements; the configuration still completes successfully.

Running

# iperf3 -c 203.0.113.10 -p 5201 -t 30 -O 5 --length 1460

produces the output below.

The output is the result of using the "omit" flag (-O) on the sender to ignore the first five seconds of output. This removes the "slow start" portion of the TCP test, and focuses instead on the "steady state" that occurs after slow start gets up to speed.

Testing Separate Static Queues and Pipes. Refer to paragraphs below.
Figure 2. Testing Separate Static Queues and Pipes

This example shows the steady state results of transmitting data through one queue - queue 1. The bitrate was consistently about 293Kbits/sec.

Later versions of FreeBSD and iperf3 may differ from the display in the above figure. Assess the correctness of the queue setup by examining the transfer summary printed at the end of the iperf3 command output. Use of the iperf3 --length parameter may provide additional clarity for transfers.

During the transmission, a view of the queue status was:

# ipfw queue show
q00001  50 sl. 2 flows (64 buckets) sched 1 weight 10 lmax 0 pri 0 droptail
    mask:  0x00 0x00000000/0x0000 -> 0xffffffff/0x1451
BKT Prot    Source IP/port         Dest. IP/port     Tot_pkt/bytes Pkt/Byte Drp
136 ip           0.0.0.0/0        203.0.113.10/5201  2293  3425216 42 63000   0
 50 ip           0.0.0.0/0        203.0.113.50/1040   752    39104  1   52   0
q00002  50 sl. 0 flows (64 buckets) sched 1 weight 10 lmax 0 pri 0 droptail
    mask:  0x00 0x00000000/0x0000 -> 0xffffffff/0x1452
#

The queue mask, set to show the full destination address and destination port, is highlighted.

Note that port numbers are displayed in hexadecimal. A decimal/hexadecimal calculator may be helpful when looking at a lot of queue displays.
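
Alternatively, printf(1) can do the conversion; for example, the destination port 0x1451 shown above:

# printf "%d\n" 0x1451
5201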

The next example shows the result of starting two transmissions, one for each queue.

On the external1 VM, set up the command iperf3 -s -p 5201, and on external2 use the command iperf3 -s -p 5202.

Start the transfer to external1 on the firewall VM with the command:

# iperf3 -c 203.0.113.10 -p 5201 -t 180 -O 30

and start the second transfer from a different session on the firewall VM with the command:

# iperf3 -c 203.0.113.20 -p 5202 -t 180 -O 30

Notice how the queue is adjusted to accommodate the presence of a second queue of equal weight:

Testing Separate Static Queues and Pipes Equally Weighted. Refer to paragraphs below.
Figure 3. Testing Two Static Queues and Pipes

Since the queues were equally weighted, the result was that the transmission bitrate for both queues was reduced to about half of the transmission bitrate before the second transmission started.

The highlighted area shows how the first queue adapted.

Queue characteristics can be changed at any time, even during an active flow. Consider the case below where, during simultaneous transmission through queues of equal weight, the queue weight of the second queue was modified as follows:

queue 1: original weight 10, modified weight 10 (no change)

queue 2: original weight 10, modified weight 50 (increased)

This change can be effected by the command:

# ipfw queue 2 config weight 50
Testing Two Static Queues and Queue 2 Changed In-flight. Refer to paragraphs below.
Figure 4. Testing Two Static Queues and Pipes Changed In-flight

The transmission bitrate for queue 1 dropped from an average of about 140Kbits/sec to an average of about 50Kbits/sec; while the rate for queue 2 expanded during and after the reconfiguration.

Note however, that the above command had a side effect:

# ipfw queue show
q00001  50 sl. 0 flows (64 buckets) sched 1 weight 10 lmax 0 pri 0 droptail
    mask:  0x00 0x00000000/0x0000 -> 0xffffffff/0x1451
q00002  50 sl. 0 flows (1 buckets) sched 1 weight 50 lmax 0 pri 0 droptail
#

The flow mask for queue 2 has been deleted. In fact, all settings not explicitly repeated in the new config command revert to their default values. Here is a complicated queue setup:

# ipfw queue 1 config pipe 1 weight 40 buckets 256 mask src-ip 0x000000ff dst-ip 0x0000ffff noerror plr 0.01 queue 75 red .3/25/30/.5
#
# ipfw queue show
q00001  75 sl.plr 0.010000 0 flows (256 buckets) sched 1 weight 40 lmax 0 pri 0
          RED w_q 0.299988 min_th 25 max_th 30 max_p 0.500000
    mask:  0x00 0x000000ff/0x0000 -> 0x0000ffff/0x0000
#

And if, similar to the previous example, only the weight is changed:

# ipfw queue 1 config weight 20
#
# ipfw queue show
q00001  50 sl. 0 flows (1 buckets) sched 1 weight 20 lmax 0 pri 0 droptail
#

All the other parameters of the queue are reset to their defaults. Therefore, it is best to retain the original commands used to construct queues, pipes, and schedulers, even if changing only one parameter. That way, all other parameters can be replicated on the command line. Otherwise it may be necessary to reconstruct the parameters from the output of ipfw queue show which can be quite tedious.
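
For example, to change only the weight of the elaborate queue shown above while keeping everything else intact, repeat the full original command with just the weight altered:

# ipfw queue 1 config pipe 1 weight 20 buckets 256 mask src-ip 0x000000ff dst-ip 0x0000ffff noerror plr 0.01 queue 75 red .3/25/30/.5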

4.2.3. Relationships

As described throughout this section, pipes, queues, and scheds (schedulers) are interrelated. Here are some simplified principles:

  • Bandwidth - the bandwidth of a particular pipe determines the highest rate at which data will move through the pipe under optimal conditions. With lower configured bandwidth, less data is transferred per unit time, which affects how quickly the queue fills.

  • Queue size - the number of packets or, if expressed in KBytes or MBytes, the amount of data waiting to be transferred through a pipe. If the queue fills up or overflows, packets are dropped, which may result in retransmissions depending on the protocol or application involved. That said, best practice is to configure smaller rather than larger queue sizes. See RFC 2309 for a thorough discussion.

  • Delay - delay can be configured in a pipe to add transit time to each packet passing through it. It is distinct from bandwidth in that it can only slow traffic down, never speed it up.

  • Packet Loss - packet loss can be configured in a pipe to simulate lossy transmission media. It simulates how well the receiver can correctly "hear" the transmissions. Packet loss may also result in retransmissions.

  • Scheduling - scheduling determines the allocation of bandwidth among flows. If there is only one queue in a pipe, and one flow in that queue, the scheduler has little to do. However, if there are multiple queues in a pipe, each with its own flow, the scheduler determines the order of service based on the selected algorithm (RR, QFQ, WF2Q+, etc.) and the queue weights.

  • Queue weight - a numerical value used to influence the scheduler to prefer certain flows over others. Higher weights give a flow a larger share of the available bandwidth. However, even with a minimal weight, a flow will never starve - it will still eventually be serviced by the scheduler.

Additional detail is contained in ipfw(8).
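
As a combined illustration of these principles, the sketch below (pipe number and values chosen arbitrarily) emulates a 1 Mbit/s link with 50 ms of added delay, 1-in-1000 packet loss, and a 30-packet queue:

# ipfw pipe 3 config bw 1Mbit/s delay 50 plr 0.001 queue 30
# ipfw pipe 3 show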

4.2.4. Dynamic Pipes

Here, note that the simplest setup for pipes creates dynamic pipes when needed:

# ipfw pipe 1 config bw 300kbit/s weight 10 mask src-ip 0x0000ffff dst-ip 0xffffffff
Bump sched buckets to 64 (was 0)
#
# ipfw pipe show
00001: 300.000 Kbit/s    0 ms burst 0
q131073  50 sl. 0 flows (1 buckets) sched 65537 weight 10 lmax 0 pri 0 droptail
 sched 65537 type FIFO flags 0x1 64 buckets 0 active
    mask:  0x00 0x0000ffff/0x0000 -> 0xffffffff/0x0000
#
# ipfw list
00050 allow icmp from any to any
00100 check-state :default
65535 deny ip from any to any
#
# ipfw add 1000 pipe 1 tcp from me to 203.0.113.0/24 5201-5203 setup keep-state
01000 pipe 1 tcp from me to 203.0.113.0/24 5201-5203 setup keep-state :default
#
# ipfw list
01000 pipe 1 tcp from me to 203.0.113.0/24 5201-5203 setup keep-state :default
65535 deny ip from any to any
#

Sending some data with this configuration:

# ipfw pipe show
00001: 300.000 Kbit/s    0 ms burst 0
q131073  50 sl. 0 flows (1 buckets) sched 65537 weight 10 lmax 0 pri 0 droptail
 sched 65537 type FIFO flags 0x1 64 buckets 4 active
    mask:  0x00 0x0000ffff/0x0000 -> 0xffffffff/0x0000
BKT Prot    Source IP/port         Dest. IP/port     Tot_pkt/bytes Pkt/Byte Drp
  6 ip         0.0.10.10/0        203.0.113.50/0      236    12272  0    0   0
 78 ip         0.0.10.50/0        203.0.113.10/0     1493  2225216 43 64500   0
 80 ip         0.0.10.50/0        203.0.113.20/0     1355  2018216 42 63000   0
 58 ip         0.0.10.20/0        203.0.113.50/0      366    19032  0    0   0
#
# ipfw list
00050 allow icmp from any to any
00100 check-state :default
01000 pipe 1 tcp from me to 203.0.113.0/24 5201-5203 setup keep-state :default
65535 deny ip from any to any
#

All three transmissions running together, single pipe:

# ipfw pipe show
00001: 300.000 Kbit/s    0 ms burst 0
q131073  50 sl. 0 flows (1 buckets) sched 65537 weight 10 lmax 0 pri 0 droptail
 sched 65537 type FIFO flags 0x1 64 buckets 6 active
    mask:  0x00 0x0000ffff/0x0000 -> 0xffffffff/0x0000
BKT Prot    Source IP/port         Dest. IP/port     Tot_pkt/bytes Pkt/Byte Drp
  6 ip         0.0.10.10/0        203.0.113.50/0      588    30576  0    0   0
 78 ip         0.0.10.50/0        203.0.113.10/0     1508  2247716 43 64500   0
 80 ip         0.0.10.50/0        203.0.113.20/0     1357  2021216 43 64500   0
 90 ip         0.0.10.50/0        203.0.113.30/0     1322  1981552 41 61500   0
 46 ip         0.0.10.30/0        203.0.113.50/0       34     1768  0    0   0
 58 ip         0.0.10.20/0        203.0.113.50/0      702    36504  0    0   0

Because of the ipfw rule:

01000 pipe 1 tcp from me to 203.0.113.0/24 5201-5203 setup keep-state :default

All are getting about 290 Kbit/sec from iperf3. Because the pipe was configured with a flow mask, each flow receives its own dynamic pipe at the full configured bandwidth, so all are treated equally.

If iperf3 is changed to send to a different port for each system (5201, 5202, and 5203 on the external1, external2, and external3 VMs respectively), there is no change. Only by using queues, and assigning weights to individual flows, can the allocation be changed, as sketched below.
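
A sketch of one way to do that, replacing the single catch-all pipe rule with two statically weighted queues feeding the same pipe (queue numbers, rule numbers, and weights are chosen for illustration):

# ipfw pipe 1 config bw 300kbit/s                    <--- shared link, no mask this time
# ipfw delete 1000
# ipfw queue 1 config pipe 1 weight 60 mask dst-ip 0xffffffff
# ipfw queue 2 config pipe 1 weight 20 mask dst-ip 0xffffffff
# ipfw add 1000 queue 1 tcp from me to 203.0.113.10 5201 setup keep-state
# ipfw add 1100 queue 2 tcp from me to 203.0.113.20 5202 setup keep-state

With this configuration, the flow to external1 should receive roughly three times the bandwidth of the flow to external2.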

Below are examples of different masks and their effect on traffic flow:

* dst-ip 0x0000ffff

# ipfw pipe show
00001: 300.000 Kbit/s    0 ms burst 0
q131073  50 sl. 0 flows (1 buckets) sched 65537 weight 0 lmax 0 pri 0 droptail
 sched 65537 type FIFO flags 0x1 64 buckets 4 active
    mask:  0x00 0x00000000/0x0000 -> 0x0000ffff/0x0000
BKT Prot    Source IP/port         Dest. IP/port     Tot_pkt/bytes Pkt/Byte Drp
 10 ip           0.0.0.0/0           0.0.10.10/0     1183  1760218 43 64500   0
 20 ip           0.0.0.0/0           0.0.10.20/0      974  1446718 42 63000   0
 30 ip           0.0.0.0/0           0.0.10.30/0      688  1017718 35 52500   0
 50 ip           0.0.0.0/0           0.0.10.50/0     1717    89284  0    0   0


* dst-ip 0xffffffff

# ipfw pipe show
00001: 300.000 Kbit/s    0 ms burst 0
q131073  50 sl. 0 flows (1 buckets) sched 65537 weight 0 lmax 0 pri 0 droptail
 sched 65537 type FIFO flags 0x1 64 buckets 4 active
    mask:  0x00 0x00000000/0x0000 -> 0xffffffff/0x0000
BKT Prot    Source IP/port         Dest. IP/port     Tot_pkt/bytes Pkt/Byte Drp
 18 ip           0.0.0.0/0        203.0.113.50/0      402    20888  0    0   0
 42 ip           0.0.0.0/0        203.0.113.10/0      144   204722  0    0   0
 52 ip           0.0.0.0/0        203.0.113.20/0      359   525971  0    0   0
 62 ip           0.0.0.0/0        203.0.113.30/0      562   843000 37 55500   0


* src-ip 0x0000ffff

# ipfw pipe show
00001: 300.000 Kbit/s    0 ms burst 0
q131073  50 sl. 0 flows (1 buckets) sched 65537 weight 0 lmax 0 pri 0 droptail
 sched 65537 type FIFO flags 0x1 64 buckets 4 active
    mask:  0x00 0x0000ffff/0x0000 -> 0x00000000/0x0000
BKT Prot    Source IP/port         Dest. IP/port     Tot_pkt/bytes Pkt/Byte Drp
 20 ip         0.0.10.10/0             0.0.0.0/0      361    19348  0    0   0
100 ip         0.0.10.50/0             0.0.0.0/0     2102  3079974 36 54000  27
 40 ip         0.0.10.20/0             0.0.0.0/0      193    10416  0    0   0
 60 ip         0.0.10.30/0             0.0.0.0/0       47     2612  0    0   0


* mask src-ip 0x0000ffff dst-ip 0x0000ffff     <--- the mask keyword needs to be specified only once

# ipfw pipe show
00001: 300.000 Kbit/s    0 ms burst 0
q131073  50 sl. 0 flows (1 buckets) sched 65537 weight 0 lmax 0 pri 0 droptail
 sched 65537 type FIFO flags 0x1 64 buckets 6 active
    mask:  0x00 0x0000ffff/0x0000 -> 0x0000ffff/0x0000
BKT Prot    Source IP/port         Dest. IP/port     Tot_pkt/bytes Pkt/Byte Drp
 14 ip         0.0.10.30/0           0.0.10.50/0      253    13156  0    0   0
 26 ip         0.0.10.20/0           0.0.10.50/0       61     3172  0    0   0
 38 ip         0.0.10.10/0           0.0.10.50/0      771    40094  0    0   0
110 ip         0.0.10.50/0           0.0.10.10/0      853  1265218 40 60000   0
112 ip         0.0.10.50/0           0.0.10.20/0      723  1083052 37 55500   0
122 ip         0.0.10.50/0           0.0.10.30/0      644   951718 34 51000   0


* mask src-ip 0x0000ffff dst-ip 0x0000ffff dst-port 5201

# ipfw pipe show
00001: 300.000 Kbit/s    0 ms burst 0
q131073  50 sl. 0 flows (1 buckets) sched 65537 weight 0 lmax 0 pri 0 droptail
 sched 65537 type FIFO flags 0x1 64 buckets 6 active
    mask:  0x00 0x0000ffff/0x0000 -> 0x0000ffff/0x1451
BKT Prot    Source IP/port         Dest. IP/port     Tot_pkt/bytes Pkt/Byte Drp
204 ip         0.0.10.50/0           0.0.10.10/5201  2132  3183718 43 64500   0
 14 ip         0.0.10.30/0           0.0.10.50/4096   823    42796  0    0   0
210 ip         0.0.10.50/0           0.0.10.20/5201  2001  2987218 43 64500   0
152 ip         0.0.10.20/0           0.0.10.50/4161   663    34476  0    0   0
216 ip         0.0.10.50/0           0.0.10.30/5201  1981  2957218 43 64500   0
164 ip         0.0.10.10/0           0.0.10.50/65     471    24492  0    0   0


* mask src-ip 0xffffffff dst-ip 0xffffffff

# ipfw pipe 1 show
00001: 300.000 Kbit/s    0 ms burst 0
q131073  50 sl. 0 flows (1 buckets) sched 65537 weight 0 lmax 0 pri 0 droptail
 sched 65537 type FIFO flags 0x1 64 buckets 6 active
    mask:  0x00 0xffffffff/0x0000 -> 0xffffffff/0x0000
BKT Prot    Source IP/port         Dest. IP/port     Tot_pkt/bytes Pkt/Byte Drp
 64 ip      203.0.113.50/0        203.0.113.20/0     1215  1808218 43 64500   0
 74 ip      203.0.113.50/0        203.0.113.30/0     1023  1533052 43 64500   0
 22 ip      203.0.113.10/0        203.0.113.50/0      746    38792  0    0   0
 94 ip      203.0.113.50/0        203.0.113.10/0     1863  2780218 42 63000   0
 42 ip      203.0.113.20/0        203.0.113.50/0      481    25012  0    0   0
 62 ip      203.0.113.30/0        203.0.113.50/0      159     8268  0    0   0

4.2.5. Other Pipe and Queue Commands

To delete pipes and queues use the following syntax:

For queues, specify the queue number on the command line:

# ipfw queue delete 1

For pipes, specify the pipe number on the command line:

# ipfw pipe delete 1

Note however that:

# ipfw delete pipe 1    <-----  does not throw error, and does not delete the pipe.

The same is true for the corresponding queue keyword. Take care to use the proper syntax.

It is possible to delete a pipe with a pipe statement still in the ruleset. ipfw will not throw an error - but any data transfer matching the pipe statement will not work.

scheds (schedulers) and pipes are tightly bound. To delete a scheduler, first delete the pipe, and then re-create the pipe. The scheduler for the new pipe is reset to the default scheduler. However, it is possible to change the current scheduler type at any time:

To change the scheduler type:
# ipfw sched 1 config type wf2q+        <--- or rr, qfq, or any other scheduler type

4.3. Adding Additional Virtual Machines

Up to this point, only two or three virtual machines have been used for exploring ipfw. The later material in this book requires the use of several additional virtual machines.

The NAT chapter calls for several more VMs for the following configurations:


Setting Up Simple NAT. Refer to paragraphs below.
Figure 5. Setting Up Simple NAT

Setting Up Load Sharing NAT. Refer to paragraphs below.
Figure 6. Setting up Load Sharing NAT

Setting Up NAT64 and DNS64. Refer to paragraphs below.
Figure 7. Setting Up NAT64 and DNS64

Setting Up 464XLAT. Refer to paragraphs below.
Figure 8. Setting Up 464XLAT

If you have not already done so, finish setting up the remaining VMs as described in Appendix A.

Also, ensure each virtual machine is set up to boot with a serial console by adding console="comconsole" to /boot/loader.conf.
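
One way to append that entry on each VM:

# echo 'console="comconsole"' >> /boot/loader.conf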

Finally, adjust the number of active windows in swim.sh (or scim.sh) by uncommenting the appropriate lines in the script.