sasyncd and tdb replay counter updates

nathanael-3
When pfsync receives a tdb update, it preemptively increases the replay
counter. This is explained in src/sys/net/if_pfsync.c:

   * When a failover happens, the master's rpl is probably above
   * what we see here (we may be up to a second late), so
   * increase it a bit to manage most such situations.
   *
   * For now, just add an offset that is likely to be larger
   * than the number of packets we can see in one second. The RFC
   * just says the next packet must have a higher seq value.
   *
   * XXX What is a good algorithm for this? We could use
   * a rate-determined increase, but to know it, we would have
   * to extend struct tdb.

This only makes sense for an outbound tdb. The replay counter should
not be preemptively increased for an inbound tdb, but the current code
adds the offset on the receiving side for every update, regardless of
direction. The patch below accounts for this by moving the preemptive
increase to the originator of the update, rather than the receivers,
and distinguishing between inbound and outbound tdbs. This is also
consistent with any future rate-based increase (hinted at in the
comment above), since it is easier to monitor the rate at the
originator than at the receivers.
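
The originator-side logic can be sketched in isolation as follows. This is a userland illustration under stated assumptions, not the kernel code: rpl_to_advertise() mirrors the one changed line in pfsync_update_tdb(), while inbound_seq_acceptable() is a deliberately simplified, hypothetical stand-in for the real replay-window check (no bitmap, no wraparound). The post does not spell out why the offset is wrong for inbound tdbs; one likely reason, assumed here, is that an inflated high-water mark pushes legitimate incoming sequence numbers outside the replay window.

```c
#include <stdint.h>
#include <arpa/inet.h>

#define RPL_INCR 16384	/* same value as in the patch */

/*
 * Counter value to place in a pfsync tdb update, in network byte order.
 * The offset is added only when the update comes from the output path.
 */
static uint32_t
rpl_to_advertise(uint32_t tdb_rpl, int output)
{
	return htonl(tdb_rpl + (output ? RPL_INCR : 0));
}

/*
 * Simplified inbound check (hypothetical): accept a packet if it is
 * newer than the highest sequence number seen, or less than `wnd`
 * behind it.
 */
static int
inbound_seq_acceptable(uint32_t highest_seen, uint32_t wnd, uint32_t seq)
{
	if (seq > highest_seen)
		return 1;
	return highest_seen - seq < wnd;
}
```

In this simplified model, with a window of 32 and a true high-water mark of 100, sequence number 101 is accepted; if the backup had instead stored 100 + RPL_INCR, the same packet would be rejected as too old.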

With this patch and the pfkey patch I posted yesterday, sasyncd
behaves much better in my test environment, but it is still not
completely reliable.

Nathanael

--- src-clean/sys/net/if_pfsync.h Fri Nov  4 16:24:14 2005
+++ src/sys/net/if_pfsync.h Sun Apr 30 23:00:13 2006
@@ -330,7 +330,7 @@
  pfsync_pack_state(PFSYNC_ACT_DEL, (st), \
     PFSYNC_FLAG_COMPRESS); \
 } while (0)
-int pfsync_update_tdb(struct tdb *);
+int pfsync_update_tdb(struct tdb *, int);
 #endif

 #endif /* _NET_IF_PFSYNC_H_ */
--- src-clean/sys/net/if_pfsync.c Tue Feb 21 04:12:14 2006
+++ src/sys/net/if_pfsync.c Mon May  1 07:42:45 2006
@@ -1553,24 +1553,7 @@
  s = spltdb();
  tdb = gettdb(pt->spi, &pt->dst, pt->sproto);
  if (tdb) {
- /*
- * When a failover happens, the master's rpl is probably above
- * what we see here (we may be up to a second late), so
- * increase it a bit to manage most such situations.
- *
- * For now, just add an offset that is likely to be larger
- * than the number of packets we can see in one second. The RFC
- * just says the next packet must have a higher seq value.
- *
- * XXX What is a good algorithm for this? We could use
- * a rate-determined increase, but to know it, we would have
- * to extend struct tdb.
- * XXX pt->rpl can wrap over MAXINT, but if so the real tdb
- * will soon be replaced anyway. For now, just don't handle
- * this edge case.
- */
-#define RPL_INCR 16384
- pt->rpl = ntohl(pt->rpl) + RPL_INCR;
+ pt->rpl = ntohl(pt->rpl);
  pt->cur_bytes = betoh64(pt->cur_bytes);

  /* Neither replay nor byte counter should ever decrease. */
@@ -1596,7 +1579,7 @@

 /* One of our local tdbs have been updated, need to sync rpl with others */
 int
-pfsync_update_tdb(struct tdb *tdb)
+pfsync_update_tdb(struct tdb *tdb, int output)
 {
  struct ifnet *ifp = &pfsyncif.sc_if;
  struct pfsync_softc *sc = ifp->if_softc;
@@ -1682,7 +1665,25 @@
  pt->sproto = tdb->tdb_sproto;
  }

- pt->rpl = htonl(tdb->tdb_rpl);
+ /*
+ * When a failover happens, the master's rpl is probably above
+ * what we see here (we may be up to a second late), so
+ * increase it a bit for outbound tdbs to manage most such
+ * situations.
+ *
+ * For now, just add an offset that is likely to be larger
+ * than the number of packets we can see in one second. The RFC
+ * just says the next packet must have a higher seq value.
+ *
+ * XXX What is a good algorithm for this? We could use
+ * a rate-determined increase, but to know it, we would have
+ * to extend struct tdb.
+ * XXX pt->rpl can wrap over MAXINT, but if so the real tdb
+ * will soon be replaced anyway. For now, just don't handle
+ * this edge case.
+ */
+#define RPL_INCR 16384
+ pt->rpl = htonl(tdb->tdb_rpl + (output ? RPL_INCR : 0));
  pt->cur_bytes = htobe64(tdb->tdb_cur_bytes);

  if (h->count == sc->sc_maxcount ||
--- src-clean/sys/netinet/ip_esp.c Tue Dec 20 21:36:28 2005
+++ src/sys/netinet/ip_esp.c Sun Apr 30 22:36:55 2006
@@ -588,7 +588,7 @@
     tdb->tdb_wnd, &(tdb->tdb_bitmap), 1)) {
  case 0: /* All's well */
 #if NPFSYNC > 0
- pfsync_update_tdb(tdb);
+ pfsync_update_tdb(tdb,0);
 #endif
  break;

@@ -883,7 +883,7 @@
  bcopy((caddr_t) &replay, mtod(mo, caddr_t) + sizeof(u_int32_t),
     sizeof(u_int32_t));
 #if NPFSYNC > 0
- pfsync_update_tdb(tdb);
+ pfsync_update_tdb(tdb,1);
 #endif
  }

--- src-clean/sys/netinet/ip_ah.c Tue Dec 20 21:36:28 2005
+++ src/sys/netinet/ip_ah.c Sun Apr 30 23:04:38 2006
@@ -814,7 +814,7 @@
     tdb->tdb_wnd, &(tdb->tdb_bitmap), 1)) {
  case 0: /* All's well. */
 #if NPFSYNC > 0
- pfsync_update_tdb(tdb);
+ pfsync_update_tdb(tdb,0);
 #endif
  break;

@@ -1104,7 +1104,7 @@
  if (!(tdb->tdb_flags & TDBF_NOREPLAY)) {
  ah->ah_rpl = htonl(tdb->tdb_rpl++);
 #if NPFSYNC > 0
- pfsync_update_tdb(tdb);
+ pfsync_update_tdb(tdb,1);
 #endif
  }