> Synopsis: Basic Regular Expression (BRE) bug in \{m,n\} with \(\) and \n
> Category: library
> Environment:
    System      : OpenBSD 6.7
    Details     : OpenBSD 6.7 (GENERIC) #7: Wed Jan 6 15:19:25 MST 2021
                  [hidden email]:/usr/src/sys/arch/amd64/compile/GENERIC
    Architecture: OpenBSD.amd64
    Machine     : amd64
> Description:
    Certain BRE expressions fail or misbehave unexpectedly. The failures
    are the same in both grep and sed (without -E). They occur only with
    certain combinations of the \{\}, \(\), and \n (where n is a digit)
    syntax; dropping any one of these generally avoids the bug.

    The bug can be seen most clearly in the unexpected behavior of the
    \{m,n\} portion in the given context. If more of the (apparently
    dependent) context is removed, the bug doesn't show up. E.g. some of
    the clearest cases involve replacing * with \{0,\} in the BRE and
    getting quite unexpected results (one would expect the results to be
    the same).

    These same BREs work under both Solaris 11 and GNU/Linux with their
    sed and grep.
> How-To-Repeat:
    The example code below illustrates the problem. It covers both cases
    where the bug shows up and slightly differing contexts where the bug
    does not occur. In each case, the output should be the STRING we echo
    into grep/sed where we use our BRE, but in the bug cases we get no
    output.

    It is also suggested that test cases be added to the code to catch
    possible regressions, should the issue recur. :-)

    Example code to show where bug does (and doesn't) show up:

    (
    exec 2>&1
    set -- \
        'YYxx' 'Y*\(x\)\1' \
        'YYxx' 'Y\{0,\}\(x\)\1' \
        'YYxx' 'Y\{2,\}\(x\)\1' \
        'YYxx' 'Y\{0,\}\(x\)' \
        'YYxx' 'Y\{2,\}x' \
        'YYxx' 'Y\{2,\}x\{1,\}' \
        'YYxx' 'Y\{2,\}x\{0,\}' \
        'YYxxz' 'Y\{2,\}x\{0,\}z' \
        'YYxxz' 'Y\{0,\}x\{0,\}z' \
        'YYxyxy' 'Y\{2,\}\(xy\)\1' \
        'YYxyxy' 'Y\{0,\}\(xy\)\1' \
        'YYxyxy' 'Y*\(xy\)\1' \
        'YYxyxy' 'Y\{0,\}\(xy\)xy'
    while [ "$#" -ge 2 ]
    do
        STRING="$1"; shift; BRE="$1"; shift
        set -x
        echo "$STRING" | grep -e "$BRE"
        echo "$STRING" | sed -ne "s/$BRE/&/p"
        set +x
    done
    )

    Example run of above code.
    Bug is present where our STRING echoed into grep/sed fails to appear
    in the output:

    + echo YYxx
    + grep -e Y*\(x\)\1
    YYxx
    + echo YYxx
    + sed -ne s/Y*\(x\)\1/&/p
    YYxx
    + set +x
    + echo YYxx
    + grep -e Y\{0,\}\(x\)\1
    + echo YYxx
    + sed -ne s/Y\{0,\}\(x\)\1/&/p
    + set +x
    + echo YYxx
    + grep -e Y\{2,\}\(x\)\1
    YYxx
    + echo YYxx
    + sed -ne s/Y\{2,\}\(x\)\1/&/p
    YYxx
    + set +x
    + echo YYxx
    + grep -e Y\{0,\}\(x\)
    YYxx
    + echo YYxx
    + sed -ne s/Y\{0,\}\(x\)/&/p
    YYxx
    + set +x
    + echo YYxx
    + grep -e Y\{2,\}x
    YYxx
    + echo YYxx
    + sed -ne s/Y\{2,\}x/&/p
    YYxx
    + set +x
    + echo YYxx
    + grep -e Y\{2,\}x\{1,\}
    YYxx
    + echo YYxx
    + sed -ne s/Y\{2,\}x\{1,\}/&/p
    YYxx
    + set +x
    + echo YYxx
    + grep -e Y\{2,\}x\{0,\}
    YYxx
    + echo YYxx
    + sed -ne s/Y\{2,\}x\{0,\}/&/p
    YYxx
    + set +x
    + echo YYxxz
    + grep -e Y\{2,\}x\{0,\}z
    YYxxz
    + echo YYxxz
    + sed -ne s/Y\{2,\}x\{0,\}z/&/p
    YYxxz
    + set +x
    + echo YYxxz
    + grep -e Y\{0,\}x\{0,\}z
    YYxxz
    + echo YYxxz
    + sed -ne s/Y\{0,\}x\{0,\}z/&/p
    YYxxz
    + set +x
    + echo YYxyxy
    + grep -e Y\{2,\}\(xy\)\1
    YYxyxy
    + echo YYxyxy
    + sed -ne s/Y\{2,\}\(xy\)\1/&/p
    YYxyxy
    + set +x
    + echo YYxyxy
    + grep -e Y\{0,\}\(xy\)\1
    + echo YYxyxy
    + sed -ne s/Y\{0,\}\(xy\)\1/&/p
    + set +x
    + echo YYxyxy
    + grep -e Y*\(xy\)\1
    YYxyxy
    + echo YYxyxy
    + sed -ne s/Y*\(xy\)\1/&/p
    YYxyxy
    + set +x
    + echo YYxyxy
    + grep -e Y\{0,\}\(xy\)xy
    YYxyxy
    + echo YYxyxy
    + sed -ne s/Y\{0,\}\(xy\)xy/&/p
    YYxyxy
    + set +x
> Fix:
    No known general work-around
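
    As a possible starting point for the regression tests suggested
    above, here is a minimal sketch that compares each * form with its
    \{0,\} equivalent and counts mismatches. The helper name check_bre,
    the cb_* variables, and the failures counter are made up for this
    illustration and are not part of any existing test suite. It assumes
    the BRE contains no "/" character, since the BRE is spliced into a
    sed s/// command as in the example code.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): succeeds only if
# both grep and sed, in BRE mode, reproduce STRING for the given BRE.
# Assumption: the BRE contains no "/" (it is spliced into sed's s///).
check_bre() {
    cb_string=$1
    cb_bre=$2
    cb_grep=$(printf '%s\n' "$cb_string" | grep -e "$cb_bre" || true)
    cb_sed=$(printf '%s\n' "$cb_string" | sed -ne "s/$cb_bre/&/p" || true)
    [ "$cb_grep" = "$cb_string" ] && [ "$cb_sed" = "$cb_string" ]
}

# STRING/BRE pairs taken from the report; within each pair of lines the
# "*" form and the "\{0,\}" form are expected to behave identically.
set -- \
    'YYxx'   'Y*\(x\)\1' \
    'YYxx'   'Y\{0,\}\(x\)\1' \
    'YYxyxy' 'Y*\(xy\)\1' \
    'YYxyxy' 'Y\{0,\}\(xy\)\1'

failures=0
while [ "$#" -ge 2 ]
do
    if check_bre "$1" "$2"
    then
        printf 'ok   %s  %s\n' "$1" "$2"
    else
        printf 'FAIL %s  %s\n' "$1" "$2"
        failures=$((failures + 1))
    fi
    shift 2
done
printf 'failures: %s\n' "$failures"
```

    On a system showing the behavior reported above, the two \{0,\}
    lines would be expected to print FAIL; where the BREs work, every
    line should print ok and the failure count should be 0.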