FW: [Bug 243590] TCP ECN not adhering extremely strictly to RFC3168 can cause massive TCP perf issues

Scheffenegger, Richard-2
Hi,

After reading through the TCP ECN code on CVSweb for each project's MAIN branch, it is clear that OpenBSD and NetBSD (and probably all other BSD variants and derived operating systems) suffer from the same problem as FreeBSD when running a transactional TCP session with ECN against a Linux client. "Transactional" here means that the data direction changes frequently, rather than one bulk transfer using only one half-connection; examples are NFS, iSCSI and SMB.

Because Linux processes CWR only on segments that also carry data, there is a good chance that when OpenBSD places CWR on an arbitrary next packet (as it does now), Linux ignores the CWR and ECE remains latched. This in turn causes the BSD sender to shrink the cwnd further, until by chance the CWR ends up on a data segment. By then the cwnd may be very small, and a couple of seconds may have passed.
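The receiver behaviour described above can be sketched as a small model. This is only an illustration under stated assumptions, not code from Linux or any BSD: the struct, the function name and the parameters are hypothetical, and the "CWR counts only on data segments" rule is modelled as a simple payload-length check.

```c
#include <stdbool.h>

/* Hypothetical, simplified receiver state; not taken from any kernel. */
struct ecn_receiver {
	bool ece_latched;	/* true while ACKs keep echoing ECE */
};

/*
 * Process one incoming segment.
 *   ce_marked:   the IP header carried Congestion Experienced (CE)
 *   cwr:         the TCP header had CWR set
 *   payload_len: number of data bytes in the segment
 * Returns whether the next ACK would carry ECE.
 */
static bool
ecn_receiver_input(struct ecn_receiver *r, bool ce_marked, bool cwr,
    int payload_len)
{
	/* Strict reading: CWR is honoured only on segments carrying data. */
	if (cwr && payload_len > 0)
		r->ece_latched = false;

	/* A CE mark (re)latches ECE until an accepted CWR arrives. */
	if (ce_marked)
		r->ece_latched = true;

	return r->ece_latched;
}
```

Feeding this model a CE-marked data segment, then a CWR on a pure ACK, then a CWR on a data segment leaves ECE set after the first two steps and clears it only on the third.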

The issue is documented in this FreeBSD bug report:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=243590

Although it is not 1:1 applicable to the OpenBSD code, this is the diff I provided for FreeBSD:

https://reviews.freebsd.org/D23364
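The idea behind the change, sending CWR only together with new data, can be sketched roughly as follows. This is not the diff itself and not kernel code: the struct, the function and the FLAG_CWR name are hypothetical and serve only to illustrate the policy; 0x80 is the CWR bit in the TCP flags byte per RFC 3168.

```c
#include <stdbool.h>

#define FLAG_CWR 0x80	/* CWR bit in the TCP flags byte (RFC 3168) */

/* Hypothetical sender state, for illustration only. */
struct ecn_sender {
	bool cwr_pending;	/* cwnd was reduced, CWR not yet signalled */
};

/*
 * Decide whether an outgoing segment should carry CWR.
 *   len:       payload bytes in the segment
 *   is_rexmit: the segment retransmits previously sent data
 * Returns FLAG_CWR or 0.
 */
static int
ecn_sender_flags(struct ecn_sender *s, int len, bool is_rexmit)
{
	/* Hold CWR back until a segment with new data goes out. */
	if (s->cwr_pending && len > 0 && !is_rexmit) {
		s->cwr_pending = false;
		return FLAG_CWR;
	}
	return 0;
}
```

With this policy a pending CWR is never spent on a pure ACK or a retransmission, so a receiver that only honours CWR on data segments still sees it.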

Note that the problem does not show up with "typical" bulk-transfer testing, only when data is sent alternately by both ends, e.g. an NFS request for a large file block, the server sending the NFS response, and so on.


Richard Scheffenegger


-----Original Message-----
From: [hidden email] <[hidden email]>
Sent: Thursday, May 28, 2020 00:35
To: [hidden email]
Subject: [Bug 243590] TCP ECN not adhering extremely strictly to RFC3168 can cause massive TCP perf issues

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=243590

--- Comment #3 from [hidden email] ---
A commit references this bug:

Author: rscheff
Date: Wed May 27 22:34:47 UTC 2020
New revision: 361565
URL: https://svnweb.freebsd.org/changeset/base/361565

Log:
  MFS r361436: MFC r361347: With RFC3168 ECN, CWR SHOULD only be sent with new data.

  Overly conservative data receivers may ignore the CWR flag on other
  packets, and keep ECE latched. This can result in continuous reduction
  of the congestion window, and very poor performance when ECN is
  enabled.

  This does NOT contain the merge of the change to RACK since at this
  time that code does not exist in stable/11, and there is no plan to
  merge RACK to stable/11.

  PR:           243590
  Reviewed by:  rgrimes (mentor), rrs
  Approved by:  re(gjb)
  Sponsored by: NetApp, Inc.
  Differential Revision:        https://reviews.freebsd.org/D23364

Changes:
_U  releng/11.4/
  releng/11.4/sys/netinet/tcp_input.c
  releng/11.4/sys/netinet/tcp_output.c
