pf modulate state & TCP option SACK modulation by pf - patch

classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

pf modulate state & TCP option SACK modulation by pf - patch

K_Pfaff
[pf modulate state & TCP option SACK modulation by pf - patch]

Hello

I have noticed that when pf is configured to modulate state,

(pf.conf contains:
pass out quick on $ext_if from <localnet> to <internet> modulate state
)

then SYN and ACK numbers are modulated, but the values in SACK options
are not.
Therefore SACK values do not correspond with SYN numbers of remote host.
In a result, SACK option is being used, but does not function properly.
To make matter worse, such packets with invalid SACK value are dropped
by some firewalls
(I have experienced it with one Linux machine - all packets with SACK
option were rejected and the connection in one direction was getting
blocked).

I have modified the  pf.c  file to support TCP modulation in SACK options.
The patch is being provided.
I applied it to the OpenBSD 3.8 official release.
I have tested in on test machines and later in productive network.
Currently it is used being used in my company's network to process all
the internet traffic.
It seems to work fine.
(This patch can also be applied to pf.c 1.508 current, it should work,
but has not been tested)

I attach the file with logs (logs.tgz and short SACK_diffs.txt)
containing tcpdump output with SYN / ACK / SACK values on internal
interface and on external after modulating the sequence numbers.

Please notice that I am a newbie in OpenBSD programming ;)
The code presented in patch file can probably be improved and corrected,
therefore any corrections, suggestions, hints etc. are welcome.
What may be a problem is, eg. that it currently allocates additional
memory for array "thopt[MAXOPTSIZE]" to store options.

Anyway, I hope this code could be useful.

This code can be (rather easily(?)) modified to perform TCP options
checking for packets passing pf (eg. if option lengths are correct). Let
me know if someone finds it useful.

Thank you for your attention,
Krzysztof Pfaff

[demime 1.01d removed an attachment of type application/x-compressed which had a name of logs.tgz]
--- /sys/net/pf.c Sat Nov 26 19:14:50 2005
+++ /sys/net/pf.c Tue Dec  6 14:53:11 2005
@@ -134,6 +134,12 @@
     struct pf_addr *, struct pf_addr *, u_int16_t,
     u_int16_t *, u_int16_t *, u_int16_t *,
     u_int16_t *, u_int8_t, sa_family_t);
+
+u_int32_t pf_modulate_sack (struct mbuf *m, int off,
+    u_int8_t *thopt, int *thopt_len, sa_family_t af,
+    struct tcphdr *th, tcp_seq seqdiff,
+    tcp_seq seqdifftest);
+
 void pf_send_tcp(const struct pf_rule *, sa_family_t,
     const struct pf_addr *, const struct pf_addr *,
     u_int16_t, u_int16_t, u_int32_t, u_int32_t,
@@ -1427,6 +1433,136 @@
  }
 }
 
+u_int32_t
+pf_modulate_sack (struct mbuf *m, int off, u_int8_t *thopt, int *thopt_len,
+    sa_family_t af, struct tcphdr *th, tcp_seq seqdiff, tcp_seq seqdifftest)
+{
+/* by Krzysztof Pfaff. Free to use/change etc., no warranty
+ * Fixes TCP_SACK option when using "modulate state" option with pf.
+ * It works. But any corrections, suggestions, etc. are welcome.
+ */
+ int       hlen;
+ int       sackModulated = 0;
+ u_int8_t  optlen;
+ u_int8_t  *opt;
+ tcp_seq   *sackvalp;   /* will point to the SACK option values */
+
+ hlen = (th->th_off << 2) - sizeof (struct tcphdr);
+ if (hlen == 0) {
+ return (0);
+ }
+ if ( ( hlen < 0 ) || ( hlen > MAX_TCPOPTLEN) ) {
+ if (pf_status.debug >= PF_DEBUG_MISC) {
+ printf("pf: modul_sack: incorrect tcp options "
+    "length (on start): %d not in range %d - %d\n",
+    hlen, 0, MAX_TCPOPTLEN);
+ }
+ return (0);
+ }
+ /* if *thopt_len == -1 then tcp options were not loaded into thopt */
+ /* if (hlen == 0) { *thoptlen = 0 }  */
+ if (*thopt_len < 0) {
+ if (pf_pull_hdr(m, off + sizeof (struct tcphdr), thopt, hlen,
+    NULL, NULL, af)) {
+ *thopt_len = hlen;
+ } else {
+ if (pf_status.debug >= PF_DEBUG_MISC) {
+ printf("pf: modul_sack: pf_pull_hdr failed");
+        }
+ return (0);
+ }
+ } else {
+ if (hlen != *thopt_len) {
+ if (pf_status.debug >= PF_DEBUG_MISC) {
+     printf("pf: modul_sack: hlen: %d != "
+    "thopt_len: %d", hlen, *thopt_len);
+        }
+ return (0);
+ }
+ }
+ /* At this point: thopt contains buffer with TCP options */
+ /* thopt_len contains length of valid data in buffer */
+ opt = thopt;
+ while (hlen >= 2 + 2 * sizeof(tcp_seq)) { /* min TCP_SACK option length */
+ switch (*opt) {
+ case TCPOPT_EOL:
+ case TCPOPT_NOP:
+ hlen--;  
+ opt++;
+ break;
+ case TCPOPT_SACK:
+ optlen = *(++opt);
+ opt++;  
+ hlen -= 2;
+ if ((optlen-2) % TCPOLEN_SACK) {
+ if (pf_status.debug >= PF_DEBUG_MISC) {
+ printf("pf: modul_sack: incorrect SACK "
+    "optlen size: %d\n", optlen);
+ }
+ return (0);
+ }
+ sackvalp = (tcp_seq*)opt;
+ /* sackval is more clear than (tcp_seq*)opt */
+ while (hlen >= (2 * sizeof(tcp_seq))) {
+ /* change 1st SACK option value */
+ if (pf_status.debug >= PF_DEBUG_NOISY) {
+ printf("pf: modul_sack: changed SACK "
+    "at %d (-%d), from %u ",
+    (u_int32_t)sackvalp -
+    (u_int32_t)thopt,
+    hlen, ntohl(*sackvalp));
+ }
+ pf_change_a(sackvalp, &th->th_sum,
+    htonl( ntohl(*sackvalp) - seqdiff), 0);
+ if (pf_status.debug >= PF_DEBUG_NOISY) {
+ printf("( - %u ) to %u\n",
+    seqdiff, ntohl(*sackvalp));
+ }
+ hlen -= sizeof(tcp_seq);
+ sackvalp++;
+
+ /* change 2nd SACK option value */
+ if (pf_status.debug >= PF_DEBUG_NOISY) {
+ printf("pf: modul_sack: changed SACK "
+    "at %d (-%d), from %u ",
+    (u_int32_t)sackvalp -
+    (u_int32_t)thopt, hlen,
+    ntohl(*sackvalp));
+ }
+ pf_change_a(sackvalp, &th->th_sum,
+    htonl( ntohl(*sackvalp) - seqdiff), 0);
+ if (pf_status.debug >= PF_DEBUG_NOISY) {
+ printf("( - %u ) to %u\n",
+    seqdiff, ntohl(*sackvalp));
+ }
+ hlen -= sizeof(tcp_seq);
+ sackvalp++;
+
+ sackModulated = 1;
+ };
+ opt = (u_int8_t*)(sackvalp);
+ break;
+ default:
+ optlen = *(opt+1);
+ if ((optlen < 2) || (optlen > hlen)) {
+ if (pf_status.debug >= PF_DEBUG_MISC) {
+ printf("pf: modul_sack: TCP option "
+    "length incorrect: %d, error at: "
+    "-%d from header end\n",
+    optlen, hlen);
+ }
+ /* optlen = 2;  not necessary, value will not be used */
+ return (sackModulated);
+ }
+ hlen -= optlen;
+ opt += optlen;
+ break;
+ }
+ }
+ return (sackModulated);  /* return value = 1 means SACK was modulated */
+}  /* end of function  pf_modulate_sack */
+
+
 void
 pf_send_tcp(const struct pf_rule *r, sa_family_t af,
     const struct pf_addr *saddr, const struct pf_addr *daddr,
@@ -2968,6 +3104,8 @@
  ;
  pf_change_a(&th->th_seq, &th->th_sum,
     htonl(s->src.seqlo + s->src.seqdiff), 0);
+ /* Since this is the SYN packet, there is no need for
+     * pf_modulate_sack - SACK option correction */
  rewrite = 1;
  } else
  s->src.seqdiff = 0;
@@ -4003,11 +4141,16 @@
 {
  struct pf_state_cmp key;
  struct tcphdr *th = pd->hdr.tcp;
+ u_int8_t thopt [MAX_TCPOPTLEN + 1];
+ /* buffer for TCP opts + 1 byte reserved for DEBUG*/
+ int thopt_len = -1;
+ /* thopt_len == -1 means buffer not initialized, 0 means no options */
  u_int16_t win = ntohs(th->th_win);
  u_int32_t ack, end, seq, orig_seq;
  u_int8_t sws, dws;
  int ackskew;
  int copyback = 0;
+ int copyback_thopt = 0;
  struct pf_state_peer *src, *dst;
 
  key.af = pd->af;
@@ -4155,6 +4298,9 @@
  pf_change_a(&th->th_seq, &th->th_sum, htonl(seq +
     src->seqdiff), 0);
  pf_change_a(&th->th_ack, &th->th_sum, htonl(ack), 0);
+ /* pf_modulate_sack should not be necessary here. It is
+ * the first packet from this end, no SACK should appear
+ */
  copyback = 1;
  } else {
  ack = ntohl(th->th_ack);
@@ -4207,6 +4353,11 @@
  pf_change_a(&th->th_seq, &th->th_sum, htonl(seq +
     src->seqdiff), 0);
  pf_change_a(&th->th_ack, &th->th_sum, htonl(ack), 0);
+ copyback_thopt |= pf_modulate_sack(
+    m, off, thopt, &thopt_len, pd->af, th,
+    dst->seqdiff, ack + dst->seqdiff);
+ /* TCP header containing checksum must be copied back
+ * if options were changed (SACK options) */
  copyback = 1;
  }
  end = seq + pd->p_len;
@@ -4425,9 +4576,24 @@
     &th->th_sum, &(*state)->lan.addr,
     (*state)->lan.port, 0, pd->af);
  m_copyback(m, off, sizeof(*th), th);
+
+ /* Copyback sequence modulation in SACK option */
+ if ((copyback_thopt) && (thopt_len>0) &&
+    (thopt_len<=MAX_TCPOPTLEN)) {
+ m_copyback(m, off + sizeof(struct tcphdr),
+    thopt_len, thopt);
+ }
+
  } else if (copyback) {
  /* Copyback sequence modulation or stateful scrub changes */
  m_copyback(m, off, sizeof(*th), th);
+
+ /* Copyback sequence modulation in SACK option */
+ if ((copyback_thopt) && (thopt_len>0) &&
+    (thopt_len<=MAX_TCPOPTLEN)) {
+ m_copyback(m, off + sizeof(struct tcphdr),
+    thopt_len, thopt);
+ }
  }
 
  return (PF_PASS);
@@ -4783,6 +4949,7 @@
  if (src->seqdiff) {
  pf_change_a(&th.th_seq, icmpsum,
     htonl(seq), 0);
+ /* Is pf_modulate_sack necessary in here ??? */
  copyback = 1;
  }
External interface dump (after SEQ / ACK / SACK modulation):

09:50:01.252315 localnet.2930 > remote.80: . [tcp sum ok] 1272002286:1272002286(0) ack 3891953630 win 33580 (DF) (ttl 128, id 4597, len 40)
09:50:01.266885 remote.80 > localnet.2930: . 3891953630:3891955090(1460) ack 1272002286 win 7707 (DF) (ttl 59, id 54995, len 1500)
Segment lost ... 3891955090:3891956550(1460), id 54996  
09:50:01.274762 remote.80 > localnet.2930: . 3891956550:3891958010(1460) ack 1272002286 win 7707 (DF) (ttl 59, id 54997, len 1500)
09:50:01.277256 localnet.2930 > remote.80: . [tcp sum ok] 1272002286:1272002286(0) ack 3891955090 win 33580 <nop,nop,sack 1 {3891956550:3891958010} > (DF) (ttl 128, id 4600, len 52)

The difference is (showing mixed Internal interface dump and external, line by line):
Int: 09:50:01.275543 localnet.2930 > remote.80: . [tcp sum ok]
Ext: 09:50:01.277256 localnet.2930 > remote.80: . [tcp sum ok]

Int: 2902725082:2902725082(0) ack 1880207564 , sack 1 {1880209024:1880210484} >id 4600
Ext: 1272002286:1272002286(0) ack 3891955090 , sack 1 {3891956550:3891958010} >id 4600

Int:  0000: 4500 0034 11f8 4000 8006 4956 0a01 0154  Int
Ext:  0000: 4500 0034 11f8 4000 8006 4956 0a01 0154  Ext

Int:  0010: 3e59 55c8 0b72 0050 ad04 11da 7011 b0cc  I  ... SYN & ACK
Ext:  0010: 3e59 55c8 0b72 0050 4bd1 36ee e7fa 8592  E

Int:  0020: 8010 832c 18c4 0000 0101 050a 7011 b680  I  ... 05=SackOpt, 0a=10d=length
Ext:  0020: 8010 832c 6ed5 0000 0101 050a e7fa 8b46  E   and then SACK values follow

Int:  0030: 7011 bc34       SACK option values are changed by the same value as
Ext:  0030: e7fa 90fa        ACK value is changed

Dec  6 09:50:01 ux1 /bsd: pf: modul_sack: changed SACK at 4 (-8),
from 1880209024=7011b680 - 2283219770=88172b3a to 3891958010=e7fa90fa

Dec  6 09:50:01 ux1 /bsd: pf: modul_sack: changed SACK at 8 (-4),
from 1880210484=7011bc34 - 2283219770=88172b3a to 3891958010=e7fa90fa


External interface dump continues:

09:50:01.282940 remote.80 > localnet.2930: . 3891958010:3891959470(1460) ack 1272002286 win 7707 (DF) (ttl 59, id 54998, len 1500)
  0000: 4500 05dc d6d6 4000 3b06 c3cf 3e59 55c8  E..\VV@.;.CO>YUH
  0010: 0a01 0154 0050 0b72 e7fa 90fa 4bd1 36ee  ...T.P.rgz.zKQ6n
  0020: 5010 1e1b 1831 0000 0a3f 1b74 04d0 1821  P....1...?.t.P.!
  0030: 0293 1a7b 4303 a74a 2fdb cfb6 afae db12  ...{C.'J/[O6/.[.
  0040: a0ea fea3 15ac 10dd fcb9 14ca 69da 7c9c   j~#.,.]|9.JiZ|.
  0050: b81e                                     8.

09:50:01.285401 localnet.2930 > remote.80: . [tcp sum ok] 1272002286:1272002286(0) ack 3891955090 win 33580 <nop,nop,sack 1 {3891956550:3891959470} > (DF) (ttl 128, id 4601, len 52)
...
09:50:01.465719 remote.80 > localnet.2930: . 3891987210:3891988670(1460) ack 1272002286 win 7707 (DF) (ttl 59, id 55018, len 1500)
09:50:01.468219 localnet.2930 > remote.80: . [tcp sum ok] 1272002286:1272002286(0) ack 3891955090 win 33580 <nop,nop,sack 1 {3891956550:3891988670} > (DF) (ttl 128, id 4626, len 52)
  0000: 4500 0034 1212 4000 8006 493c 0a01 0154  E..4..@...I<...T
  0010: 3e59 55c8 0b72 0050 4bd1 36ee e7fa 8592  >YUH.r.PKQ6ngz..
  0020: 8010 832c f710 0000 0101 050a e7fa 8b46  ...,w.......gz.F
  0030: e7fb 08be                                g{.>

09:50:01.474258 remote.80 > localnet.2930: P 3891955090:3891956550(1460) ack 1272002286 win 7707 (DF) (ttl 59, id 55019, len 1500)
09:50:01.475470 localnet.2930 > remote.80: . [tcp sum ok] 1272002286:1272002286(0) ack 3891988670 win 33580 (DF) (ttl 128, id 4629, len 40)