kernel/5227: In pf using a 2nd queue for lowdelay does not match all lowdelay packets

Steve Welham-2
>Number:         5227
>Category:       kernel
>Synopsis:       In pf using a 2nd queue for lowdelay does not match all lowdelay packets
>Confidential:   yes
>Severity:       non-critical
>Priority:       low
>Responsible:    bugs
>State:          open
>Quarter:        
>Keywords:      
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Sep 05 17:40:01 GMT 2006
>Closed-Date:
>Last-Modified:
>Originator:     Steve Welham
>Release:        OpenBSD 3.9-stable (GENERIC) #1: Wed Aug 30 21:49:15 BST 2006
>Organization:
net
>Environment:
       
        System      : OpenBSD 3.9
        Architecture: OpenBSD.i386
        Machine     : i386
>Description:
       
The option of specifying dual queues to separate traffic requesting lowdelay is defined in "man pf.conf" as follows:

Packets can be assigned to queues based on filter rules by using the queue keyword.  Normally only one queue is specified; when a second one is specified it will instead be used for packets which have a TOS of lowdelay and for TCP ACKs with no data payload.

The dual queue lowdelay tos check is done by an equality comparison with IPTOS_LOWDELAY which is defined in netinet/ip.h as 0x10. The desired meaning of "lowdelay" depends on which RFC we subscribe to:

RFC791 - cares only about the existence of the lowdelay bit, so correct comparison would be a bitwise AND

RFC1349 - cares about the value of bits 3-6 (where MSB is 0 and LSB is 7), so correct comparison would be a bitwise AND with the mask 0x1E and then equality comparison with IPTOS_LOWDELAY (0x10).

RFC2474 - completely redefines the TOS field and makes the lowdelay bit meaningless. (Diffserv)

So the current operation, a comparison of the full 8-bit field, is incorrect under both the RFC 791 and RFC 1349 definitions. For instance, a packet with a TOS of 0xF0 requests low delay with an IP precedence of 7, but it is not treated as low-delay because the byte does not equal 0x10.

>How-To-Repeat:
       
Install hping from ports (/usr/ports/net/hping/) to reproduce packets easily.

This cannot be demonstrated with pflog, so we need to use the queue counters.

Here are the commands and output:
# cat /etc/pf.conf
#

set skip on lo

altq on fxp0 cbq bandwidth 8Mb queue { std, https }
queue std bandwidth 50% cbq(default borrow)
queue https bandwidth 50% cbq(borrow) { https_fast, https_std }
queue  https_std bandwidth 50% priority 7 cbq(borrow)
queue  https_fast bandwidth 50% cbq(borrow)

pass all
pass log all tos 0x10
pass log all tos 0x8
pass log all tos 0x80
pass log inet proto tcp from any to any port 443 queue(https_std, https_fast)
# pfctl -sr
pass all
pass log all tos 0x10
pass log all tos 0x08
pass log all tos 0x80
pass log inet proto tcp from any to any port = https queue(https_std, https_fast)
# pfctl -sq
queue root_fxp0 bandwidth 8Mb priority 0 cbq( wrr root ) {std, https}
queue  std bandwidth 4Mb cbq( borrow default )
queue  https bandwidth 4Mb cbq( borrow ) {https_std, https_fast}
queue   https_std bandwidth 2Mb priority 7 cbq( borrow )
queue   https_fast bandwidth 2Mb cbq( borrow )
# pfctl -sn
# pfctl -f /etc/pf.conf
# pfctl -vsq
queue root_fxp0 bandwidth 8Mb priority 0 cbq( wrr root ) {std, https}
  [ pkts:        255  bytes:     186870  dropped pkts:      0 bytes: 0 ]
  [ qlength:   0/ 50  borrows:      0  suspends:      0 ]
queue  std bandwidth 4Mb cbq( borrow default )
  [ pkts:        255  bytes:     186870  dropped pkts:      0 bytes: 0 ]
  [ qlength:   0/ 50  borrows:      0  suspends:      0 ]
queue  https bandwidth 4Mb cbq( borrow ) {https_std, https_fast}
  [ pkts:          0  bytes:          0  dropped pkts:      0 bytes: 0 ]
  [ qlength:   0/ 50  borrows:      0  suspends:      0 ]
queue   https_std bandwidth 2Mb priority 7 cbq( borrow )
  [ pkts:          0  bytes:          0  dropped pkts:      0 bytes: 0 ]
  [ qlength:   0/ 50  borrows:      0  suspends:      0 ]
queue   https_fast bandwidth 2Mb cbq( borrow )
  [ pkts:          0  bytes:          0  dropped pkts:      0 bytes: 0 ]
  [ qlength:   0/ 50  borrows:      0  suspends:      0 ]
# hping -I fxp0 -S 192.168.1.1 -p 443 -c 1 -o 10
HPING 192.168.1.1 (bridge0 192.168.1.1): S set, 40 headers + 0 data bytes

--- 192.168.1.1 hping statistic ---
1 packets tramitted, 0 packets received, 100% packet loss
round-trip min/avg/max = 0.0/0.0/0.0 ms
# pfctl -vsq
queue root_fxp0 bandwidth 8Mb priority 0 cbq( wrr root ) {std, https}
  [ pkts:       2055  bytes:    1502974  dropped pkts:      0 bytes: 0 ]
  [ qlength:   0/ 50  borrows:      0  suspends:      0 ]
queue  std bandwidth 4Mb cbq( borrow default )
  [ pkts:       2054  bytes:    1502920  dropped pkts:      0 bytes: 0 ]
  [ qlength:   0/ 50  borrows:      0  suspends:      0 ]
queue  https bandwidth 4Mb cbq( borrow ) {https_std, https_fast}
  [ pkts:          0  bytes:          0  dropped pkts:      0 bytes: 0 ]
  [ qlength:   0/ 50  borrows:      0  suspends:      0 ]
queue   https_std bandwidth 2Mb priority 7 cbq( borrow )
  [ pkts:          0  bytes:          0  dropped pkts:      0 bytes: 0 ]
  [ qlength:   0/ 50  borrows:      0  suspends:      0 ]
queue   https_fast bandwidth 2Mb cbq( borrow )
  [ pkts:          1  bytes:         54  dropped pkts:      0 bytes: 0 ]
  [ qlength:   0/ 50  borrows:      0  suspends:      0 ]
# hping -I fxp0 -S 192.168.1.1 -p 443 -c 1 -o 0
HPING 192.168.1.1 (bridge0 192.168.1.1): S set, 40 headers + 0 data bytes

--- 192.168.1.1 hping statistic ---
1 packets tramitted, 0 packets received, 100% packet loss
round-trip min/avg/max = 0.0/0.0/0.0 ms
# pfctl -vsq
queue root_fxp0 bandwidth 8Mb priority 0 cbq( wrr root ) {std, https}
  [ pkts:       2558  bytes:    1865284  dropped pkts:      0 bytes: 0 ]
  [ qlength:   0/ 50  borrows:      0  suspends:      0 ]
queue  std bandwidth 4Mb cbq( borrow default )
  [ pkts:       2556  bytes:    1865176  dropped pkts:      0 bytes: 0 ]
  [ qlength:   0/ 50  borrows:      0  suspends:      0 ]
queue  https bandwidth 4Mb cbq( borrow ) {https_std, https_fast}
  [ pkts:          0  bytes:          0  dropped pkts:      0 bytes: 0 ]
  [ qlength:   0/ 50  borrows:      0  suspends:      0 ]
queue   https_std bandwidth 2Mb priority 7 cbq( borrow )
  [ pkts:          1  bytes:         54  dropped pkts:      0 bytes: 0 ]
  [ qlength:   0/ 50  borrows:      0  suspends:      0 ]
queue   https_fast bandwidth 2Mb cbq( borrow )
  [ pkts:          1  bytes:         54  dropped pkts:      0 bytes: 0 ]
  [ qlength:   0/ 50  borrows:      0  suspends:      0 ]
# hping -I fxp0 -S 192.168.1.1 -p 443 -c 1 -o f0
HPING 192.168.1.1 (bridge0 192.168.1.1): S set, 40 headers + 0 data bytes

--- 192.168.1.1 hping statistic ---
1 packets tramitted, 0 packets received, 100% packet loss
round-trip min/avg/max = 0.0/0.0/0.0 ms
# pfctl -vsq
queue root_fxp0 bandwidth 8Mb priority 0 cbq( wrr root ) {std, https}
  [ pkts:       2956  bytes:    2163117  dropped pkts:      0 bytes: 0 ]
  [ qlength:   0/ 50  borrows:      0  suspends:      0 ]
queue  std bandwidth 4Mb cbq( borrow default )
  [ pkts:       2953  bytes:    2162955  dropped pkts:      0 bytes: 0 ]
  [ qlength:   0/ 50  borrows:      0  suspends:      0 ]
queue  https bandwidth 4Mb cbq( borrow ) {https_std, https_fast}
  [ pkts:          0  bytes:          0  dropped pkts:      0 bytes: 0 ]
  [ qlength:   0/ 50  borrows:      0  suspends:      0 ]
queue   https_std bandwidth 2Mb priority 7 cbq( borrow )
  [ pkts:          2  bytes:        108  dropped pkts:      0 bytes: 0 ]
  [ qlength:   0/ 50  borrows:      0  suspends:      0 ]
queue   https_fast bandwidth 2Mb cbq( borrow )
  [ pkts:          1  bytes:         54  dropped pkts:      0 bytes: 0 ]
  [ qlength:   0/ 50  borrows:      0  suspends:      0 ]

Above we tested with three hping runs, with TOS set to 0x10, 0x0 and 0xF0. Only 0x10 matches as lowdelay and is queued in https_fast. However, 0xF0 is also lowdelay under both the RFC791 and RFC1349 definitions, yet it does not match and is queued in https_std.

>Fix:
       
The fix belongs in pf.c, and its exact form depends on which RFC we subscribe to. Since lowdelay is meaningless under Diffserv (RFC2474), we may as well assume that people using the dual queue are doing old-style ToS.

Currently:
if (pqid || pd.tos == IPTOS_LOWDELAY)

RFC791:
if (pqid || pd.tos & IPTOS_LOWDELAY)

RFC1349:
if (pqid || (pd.tos & 0x1E) == IPTOS_LOWDELAY)

Below is a patch to resolve according to RFC791:

--- pf.c.orig   Wed Aug 30 21:08:33 2006
+++ pf.c.issue2 Thu Aug 31 11:46:10 2006
@@ -6030,7 +6030,7 @@

 #ifdef ALTQ
        if (action == PF_PASS && r->qid) {
-               if (pqid || pd.tos == IPTOS_LOWDELAY)
+               if (pqid || pd.tos & IPTOS_LOWDELAY)
                        pd.pf_mtag->qid = r->pqid;
                else
                        pd.pf_mtag->qid = r->qid;
@@ -6376,7 +6376,7 @@

 #ifdef ALTQ
        if (action == PF_PASS && r->qid) {
-               if (pd.tos == IPTOS_LOWDELAY)
+               if (pd.tos & IPTOS_LOWDELAY)
                        pd.pf_mtag->qid = r->pqid;
                else
                        pd.pf_mtag->qid = r->qid;

Finally my dmesg:

OpenBSD 3.9-stable (GENERIC) #2: Thu Aug 31 00:03:15 BST 2006
    [hidden email]:/usr/src/sys/arch/i386/compile/GENERIC
cpu0: Intel Pentium III ("GenuineIntel" 686-class) 1 GHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,SER,MMX,FXSR,SSE
real mem  = 534810624 (522276K)
avail mem = 480972800 (469700K)
using 4278 buffers containing 26845184 bytes (26216K) of memory
mainbus0 (root)
bios0 at mainbus0: AT/286+(57) BIOS, date 12/01/01, BIOS32 rev. 0 @ 0xfd870
apm0 at bios0: Power Management spec V1.2
apm0: AC on, battery charge unknown
apm0: flags 30102 dobusy 0 doidle 1
pcibios0 at bios0: rev 2.1 @ 0xfd800/0x800
pcibios0: PCI IRQ Routing Table rev 1.0 @ 0xfdf30/176 (9 entries)
pcibios0: PCI Interrupt Router at 000:31:0 ("Intel 82371FB ISA" rev 0x00)
pcibios0: PCI bus #1 is the last bus
bios0: ROM list: 0xc0000/0xc000 0xcc000/0x600! 0xdf000/0x1000! 0xe0000/0x4000!
cpu0 at mainbus0
pci0 at mainbus0 bus 0: configuration mode 1 (no bios)
pchb0 at pci0 dev 0 function 0 "Intel 82815 Hub" rev 0x04: rng active, 9Kb/sec
vga1 at pci0 dev 2 function 0 "Intel 82815 Graphics" rev 0x04: aperture at 0xf0000000, size 0x4000000
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
ppb0 at pci0 dev 30 function 0 "Intel 82801BA AGP" rev 0x05
pci1 at ppb0 bus 1
xl0 at pci1 dev 1 function 0 "3Com 3c905C 100Base-TX" rev 0x78: irq 3, address 00:01:02:d8:94:ab
bmtphy0 at xl0 phy 24: Broadcom 3C905C internal PHY, rev. 7
fxp0 at pci1 dev 8 function 0 "Intel 82562" rev 0x03, i82562: irq 9, address 00:04:23:15:a7:c0
inphy0 at fxp0 phy 1: i82562ET 10/100 PHY, rev. 0
ichpcib0 at pci0 dev 31 function 0 "Intel 82801BA LPC" rev 0x05
pciide0 at pci0 dev 31 function 1 "Intel 82801BA IDE" rev 0x05: DMA, channel 0 wired to compatibility, channel 1 wired to compatibility
wd0 at pciide0 channel 0 drive 0: <ST320410A>
wd0: 16-sector PIO, LBA, 19458MB, 39851760 sectors
wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 5
atapiscsi0 at pciide0 channel 1 drive 0
scsibus0 at atapiscsi0: 2 targets
cd0 at scsibus0 targ 0 lun 0: <OEM, CD-ROM F522B, 1.10> SCSI0 5/cdrom removable
cd0(pciide0:1:0): using PIO mode 4, Ultra-DMA mode 2
uhci0 at pci0 dev 31 function 2 "Intel 82801BA USB" rev 0x05: irq 9
usb0 at uhci0: USB revision 1.0
uhub0 at usb0
uhub0: Intel UHCI root hub, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
ichiic0 at pci0 dev 31 function 3 "Intel 82801BA SMBus" rev 0x05: irq 5
iic0 at ichiic0
uhci1 at pci0 dev 31 function 4 "Intel 82801BA USB" rev 0x05: irq 11
usb1 at uhci1: USB revision 1.0
uhub1 at usb1
uhub1: Intel UHCI root hub, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
auich0 at pci0 dev 31 function 5 "Intel 82801BA AC97" rev 0x05: irq 5, ICH2 AC97
ac97: codec id 0x41445360 (Analog Devices AD1885)
ac97: codec features headphone, Analog Devices Phat Stereo
audio0 at auich0
isa0 at ichpcib0
isadma0 at isa0
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard, using wsdisplay0
pcppi0 at isa0 port 0x61
midi0 at pcppi0: <PC speaker>
spkr0 at pcppi0
lpt0 at isa0 port 0x378/4 irq 7
npx0 at isa0 port 0xf0/16: using exception 16
pccom0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
pccom0: console
fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
fd0 at fdc0 drive 0: 1.44MB 80 cyl, 2 head, 18 sec
biomask ff65 netmask ff6d ttymask ffef
pctr: 686-class user-level performance counters enabled
mtrr: Pentium Pro MTRR support
uhidev0 at uhub1 port 1 configuration 1 interface 0
uhidev0: Logitech USB Receiver, rev 1.10/17.00, addr 2, iclass 3/1
ukbd0 at uhidev0: 8 modifier keys, 6 key codes
wskbd1 at ukbd0 mux 1
wskbd1: connecting to wsdisplay0
uhidev1 at uhub1 port 1 configuration 1 interface 1
uhidev1: Logitech USB Receiver, rev 1.10/17.00, addr 2, iclass 3/1
uhidev1: 4 report ids
ums0 at uhidev1 reportid 1: 16 buttons and Z dir.
wsmouse0 at ums0 mux 0
uhid0 at uhidev1 reportid 2: input=2, output=0, feature=0
uhid1 at uhidev1 reportid 3: input=1, output=0, feature=0
uhid2 at uhidev1 reportid 4: input=3, output=0, feature=0
dkcsum: wd0 matches BIOS drive 0x80
root on wd0a
rootdev=0x0 rrootdev=0x300 rawdev=0x30


>Release-Note:
>Audit-Trail:
>Unformatted: