update devel/pcre2 10.32 -> 10.33

Nam Nguyen
This is an update for devel/pcre2 10.33, released April 16, 2019. I
tested it with wget. Changelog: https://www.pcre.org/changelog.txt

Here is some commentary on relevant parts of the changelog.
--8<---------------cut here---------------start------------->8---
3. Added support for callouts from pcre2_substitute(). After 10.33-RC1, but
prior to release, fixed a bug that caused a crash if pcre2_substitute() was
called with a NULL match context.

comment: minor bump pcre2-{8, 16, 32} because added new function

+T pcre2_set_substitute_callout_8

4. The POSIX functions are now all called pcre2_regcomp() etc., with wrapper
functions that use the standard POSIX names. However, in pcre2posix.h the POSIX
names are defined as macros. This should help avoid linking with the wrong
library in some environments while still exporting the POSIX names for
pre-existing programs that use them. (The Debian alternative names are also
defined as macros, but not documented.)

comment: minor bump pcre2-posix. Added new functions, while redefining
old symbols, like regcomp --> pcre2_regcomp.

+T pcre2_regcomp
+T pcre2_regerror
+T pcre2_regexec
+T pcre2_regfree

-PCRE2POSIX_EXP_DECL int regcomp(regex_t *, const char *, int);
-PCRE2POSIX_EXP_DECL int regexec(const regex_t *, const char *, size_t,
+PCRE2POSIX_EXP_DECL int pcre2_regcomp(regex_t *, const char *, int);
+PCRE2POSIX_EXP_DECL int pcre2_regexec(const regex_t *, const char *, size_t,
                      regmatch_t *, int);
-PCRE2POSIX_EXP_DECL size_t regerror(int, const regex_t *, char *, size_t);
-PCRE2POSIX_EXP_DECL void regfree(regex_t *);
+PCRE2POSIX_EXP_DECL size_t pcre2_regerror(int, const regex_t *, char *, size_t);
+PCRE2POSIX_EXP_DECL void pcre2_regfree(regex_t *);
+
+#define regcomp  pcre2_regcomp
+#define regexec  pcre2_regexec
+#define regerror pcre2_regerror
+#define regfree  pcre2_regfree

6. Implement PCRE2_EXTRA_ESCAPED_CR_IS_LF (see Bugzilla 2315).

comment: minor bump because new symbol in pcre2.h

+#define PCRE2_EXTRA_ESCAPED_CR_IS_LF         0x00000010u  /* C */

10. Implement PCRE2_COPY_MATCHED_SUBJECT for pcre2_match() (including JIT via
pcre2_match()) and pcre2_dfa_match(), but *not* the pcre2_jit_match() fast
path. Also, when a match fails, set the subject field in the match data to NULL
for tidiness - none of the substring extractors should reference this after
match failure.

comment: minor bump because new symbol in pcre2.h

+#define PCRE2_COPY_MATCHED_SUBJECT        0x00004000u

23. The RunGrepTest script used to cut out the test of NUL characters for
Solaris and MacOS as printf and sed can't handle them. It seems that the *BSD
systems can't either. I've inverted the test so that only those OS that are
known to work (currently only Linux) try to run this test.

comment: Now that the test checks for Linux, I am proposing to
s/Linux/OpenBSD/ in the patch that jca@ added. This seems to be the
easiest way to run this test now.

previous version:
uname=`uname`
if [ "$uname" != "SunOS" -a "$uname" != "Darwin" ] ; then
...

newest version:
uname=`uname`
case $uname in
  Linux)
...
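
A minimal sketch of what "inverted the test" means in practice, assuming only the two conditionals quoted above. The function names are hypothetical and are not part of RunGrepTest; each one just reports whether the NUL test would run for a given uname value.

```shell
#!/bin/sh
# Old logic (deny-list): run everywhere except the systems known to fail.
old_runs_nul_test() {
    if [ "$1" != "SunOS" ] && [ "$1" != "Darwin" ]; then
        echo run
    else
        echo skip
    fi
}

# New logic (allow-list): run only on the systems known to work.
new_runs_nul_test() {
    case $1 in
        Linux) echo run ;;
        *)     echo skip ;;
    esac
}

# Compare both decisions for a few uname values.
for os in Linux OpenBSD Darwin; do
    printf '%s old=%s new=%s\n' "$os" \
        "$(old_runs_nul_test "$os")" "$(new_runs_nul_test "$os")"
done
```

With the allow-list form, OpenBSD falls into the default branch, which is why the port has to patch the case label to keep running the real test.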

26. With PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL set, escape sequences such as \s
which are valid in character classes, but not as the end of ranges, were being
treated as literals. An example is [_-\s] (but not [\s-_] because that gave an
error at the *start* of a range). Now an "invalid range" error is given
independently of PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL.

comment: minor bump because new error code in pcre2_error.c "invalid
range in character class\0"

31. Implemented PCRE2_EXTRA_ALT_BSUX to support ECMAScript 6's \u{hhh}
construct.

comment: minor bump because new symbol in pcre2.h
+#define PCRE2_EXTRA_ALT_BSUX                 0x00000020u  /* C */

--8<---------------cut here---------------end--------------->8---
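
A rough sketch of how a symbol comparison like the "+T" lines above might be scripted. The two lists are hard-coded sample data mirroring the pcre2-posix case; real lists could be saved from something like `nm -g` on the old and new libraries, which is an assumption and not necessarily how the lists above were produced. The major/minor rule in the script is the usual shared-library convention (removals break the ABI, additions only extend it), stated here as an assumption rather than quoted from the thread.

```shell
#!/bin/sh
# Sketch: classify a shared-library bump from two exported-symbol lists.
LC_ALL=C; export LC_ALL          # comm(1) needs the same collation as sort(1)

old_syms=$(mktemp) || exit 1
new_syms=$(mktemp) || exit 1

# Sample data only: old library exports the POSIX names, new library
# keeps them as wrappers and adds the pcre2_-prefixed functions.
printf '%s\n' regcomp regerror regexec regfree | sort >"$old_syms"
printf '%s\n' pcre2_regcomp pcre2_regerror pcre2_regexec pcre2_regfree \
              regcomp regerror regexec regfree | sort >"$new_syms"

added=$(comm -13 "$old_syms" "$new_syms")     # lines only in the new list
removed=$(comm -23 "$old_syms" "$new_syms")   # lines only in the old list
rm -f "$old_syms" "$new_syms"

if [ -n "$removed" ]; then
    verdict=major        # something disappeared: incompatible change
elif [ -n "$added" ]; then
    verdict=minor        # only additions: compatible extension
else
    verdict=none
fi

printf 'added:\n%s\nverdict: %s\n' "$added" "$verdict"
```

Changed function signatures or struct layouts do not show up in a name-only comparison, so a check like this can only suggest a lower bound for the bump.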

diff:
? my_test
? pcre2.h.patch
? pcre2_16_new
? pcre2_16_old
? pcre2_32_new
? pcre2_32_old
? pcre2_new
? pcre2_old
? pcre2_posix_new
? pcre2_posix_old
? pcre2posix.h.patch
Index: Makefile
===================================================================
RCS file: /cvs/ports/devel/pcre2/Makefile,v
retrieving revision 1.9
diff -u -p -r1.9 Makefile
--- Makefile 3 Feb 2019 22:40:38 -0000 1.9
+++ Makefile 30 Apr 2019 22:07:59 -0000
@@ -2,12 +2,12 @@
 
 COMMENT = perl-compatible regular expression library, version 2
 
-DISTNAME = pcre2-10.32
+DISTNAME = pcre2-10.33
 
-SHARED_LIBS +=  pcre2-16                  0.2 # 7.1
-SHARED_LIBS +=  pcre2-32                  0.2 # 7.1
-SHARED_LIBS +=  pcre2-8                   0.3 # 7.1
-SHARED_LIBS +=  pcre2-posix               0.2 # 2.1
+SHARED_LIBS +=  pcre2-16                  0.3 # 8.0
+SHARED_LIBS +=  pcre2-32                  0.3 # 8.0
+SHARED_LIBS +=  pcre2-8                   0.4 # 8.0
+SHARED_LIBS +=  pcre2-posix               0.3 # 2.2
 
 CATEGORIES = devel
 
Index: distinfo
===================================================================
RCS file: /cvs/ports/devel/pcre2/distinfo,v
retrieving revision 1.4
diff -u -p -r1.4 distinfo
--- distinfo 31 Jan 2019 17:40:30 -0000 1.4
+++ distinfo 30 Apr 2019 22:07:59 -0000
@@ -1,2 +1,2 @@
-SHA256 (pcre2-10.32.tar.gz) = nKm+cuGgTyK+MIMjyqjAbr0MUe/pnuESeBhsr7xP468=
-SIZE (pcre2-10.32.tar.gz) = 2169349
+SHA256 (pcre2-10.33.tar.gz) = 4uKJmpdIn8atGwzD2nlSx8ypkbSg99tmSbddlyECXTE=
+SIZE (pcre2-10.33.tar.gz) = 2234905
Index: pkg/PLIST
===================================================================
RCS file: /cvs/ports/devel/pcre2/pkg/PLIST,v
retrieving revision 1.3
diff -u -p -r1.3 PLIST
--- pkg/PLIST 26 Apr 2018 13:06:01 -0000 1.3
+++ pkg/PLIST 30 Apr 2019 22:07:59 -0000
@@ -82,6 +82,7 @@ lib/pkgconfig/libpcre2-posix.pc
 @man man/man3/pcre2_set_parens_nest_limit.3
 @man man/man3/pcre2_set_recursion_limit.3
 @man man/man3/pcre2_set_recursion_memory_management.3
+@man man/man3/pcre2_set_substitute_callout.3
 @man man/man3/pcre2_substitute.3
 @man man/man3/pcre2_substring_copy_byname.3
 @man man/man3/pcre2_substring_copy_bynumber.3
@@ -182,6 +183,7 @@ share/doc/pcre2/html/pcre2_set_offset_li
 share/doc/pcre2/html/pcre2_set_parens_nest_limit.html
 share/doc/pcre2/html/pcre2_set_recursion_limit.html
 share/doc/pcre2/html/pcre2_set_recursion_memory_management.html
+share/doc/pcre2/html/pcre2_set_substitute_callout.html
 share/doc/pcre2/html/pcre2_substitute.html
 share/doc/pcre2/html/pcre2_substring_copy_byname.html
 share/doc/pcre2/html/pcre2_substring_copy_bynumber.html
Index: patches/patch-RunGrepTest
===================================================================
RCS file: /cvs/ports/devel/pcre2/patches/patch-RunGrepTest,v
retrieving revision 1.1
diff -u -p -r1.1 patch-RunGrepTest
--- patches/patch-RunGrepTest 3 Feb 2019 22:40:38 -0000 1.1
+++ patches/patch-RunGrepTest 30 Apr 2019 22:08:10 -0000
@@ -5,12 +5,15 @@ Our sed(1) doesn't cope with NUL bytes a
 Index: RunGrepTest
 --- RunGrepTest.orig
 +++ RunGrepTest
-@@ -723,7 +723,7 @@ printf '%c--------------------------- Test N7 --------
+@@ -723,9 +723,9 @@ $valgrind $vjs $pcre2grep -n --newline=anycrlf "^(abc|
+ printf '%c--------------------------- Test N7 ------------------------------\r\n' - >>testtrygrep
  uname=`uname`
- if [ "$uname" != "SunOS" -a "$uname" != "Darwin" ] ; then
-   printf 'abc\0def' >testNinputgrep
--  $valgrind $vjs $pcre2grep -na --newline=nul "^(abc|def)" testNinputgrep | sed 's/\x00/ZERO/' >>testtrygrep
-+  $valgrind $vjs $pcre2grep -na --newline=nul "^(abc|def)" testNinputgrep | gsed 's/\x00/ZERO/' >>testtrygrep
-   echo "" >>testtrygrep
- else
-   echo '1:abcZERO2:def' >>testtrygrep
+ case $uname in
+-  Linux)
++  OpenBSD)
+     printf 'abc\0def' >testNinputgrep
+-    $valgrind $vjs $pcre2grep -na --newline=nul "^(abc|def)" testNinputgrep | sed 's/\x00/ZERO/' >>testtrygrep
++    $valgrind $vjs $pcre2grep -na --newline=nul "^(abc|def)" testNinputgrep | gsed 's/\x00/ZERO/' >>testtrygrep
+     echo "" >>testtrygrep
+     ;;
+   *)

Re: update devel/pcre2 10.32 -> 10.33

Jeremie Courreges-Anglas-2
On Tue, Apr 30 2019, Nam Nguyen <[hidden email]> wrote:
> This is an update for devel/pcre2 10.33, released April 16, 2019. I
> tested it with wget.

Also tested with devel/vte3 and shells/fish but the update looked safe
anyway.

> Changelog: https://www.pcre.org/changelog.txt
>
> Here is some commentary on relevant parts of the changelog.
>
> --8<---------------cut here---------------start------------->8---
> 3. Added support for callouts from pcre2_substitute(). After 10.33-RC1, but
> prior to release, fixed a bug that caused a crash if pcre2_substitute() was
> called with a NULL match context.
>
> comment: minor bump pcre2-{8, 16, 32} because added new function
>
> +T pcre2_set_substitute_callout_8

yep

> 4. The POSIX functions are now all called pcre2_regcomp() etc., with wrapper
> functions that use the standard POSIX names. However, in pcre2posix.h the POSIX
> names are defined as macros. This should help avoid linking with the wrong
> library in some environments while still exporting the POSIX names for
> pre-existing programs that use them. (The Debian alternative names are also
> defined as macros, but not documented.)
>
> comment: minor bump pcre2-posix. Added new functions, while redefining
> old symbols, like regcomp --> pcre2_regcomp.
>
> +T pcre2_regcomp
> +T pcre2_regerror
> +T pcre2_regexec
> +T pcre2_regfree
>
> -PCRE2POSIX_EXP_DECL int regcomp(regex_t *, const char *, int);
> -PCRE2POSIX_EXP_DECL int regexec(const regex_t *, const char *, size_t,
> +PCRE2POSIX_EXP_DECL int pcre2_regcomp(regex_t *, const char *, int);
> +PCRE2POSIX_EXP_DECL int pcre2_regexec(const regex_t *, const char *, size_t,
>                       regmatch_t *, int);
> -PCRE2POSIX_EXP_DECL size_t regerror(int, const regex_t *, char *, size_t);
> -PCRE2POSIX_EXP_DECL void regfree(regex_t *);
> +PCRE2POSIX_EXP_DECL size_t pcre2_regerror(int, const regex_t *, char *, size_t);
> +PCRE2POSIX_EXP_DECL void pcre2_regfree(regex_t *);
> +
> +#define regcomp  pcre2_regcomp
> +#define regexec  pcre2_regexec
> +#define regerror pcre2_regerror
> +#define regfree  pcre2_regfree

yep

> 6. Implement PCRE2_EXTRA_ESCAPED_CR_IS_LF (see Bugzilla 2315).
>
> comment: minor bump because new symbol in pcre2.h
>
> +#define PCRE2_EXTRA_ESCAPED_CR_IS_LF         0x00000010u  /* C */
>
> 10. Implement PCRE2_COPY_MATCHED_SUBJECT for pcre2_match() (including JIT via
> pcre2_match()) and pcre2_dfa_match(), but *not* the pcre2_jit_match() fast
> path. Also, when a match fails, set the subject field in the match data to NULL
> for tidiness - none of the substring extractors should reference this after
> match failure.
>
> comment: minor bump because new symbol in pcre2.h
>
> +#define PCRE2_COPY_MATCHED_SUBJECT        0x00004000u

> 26. With PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL set, escape sequences such as \s
> which are valid in character classes, but not as the end of ranges, were being
> treated as literals. An example is [_-\s] (but not [\s-_] because that gave an
> error at the *start* of a range). Now an "invalid range" error is given
> independently of PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL.
>
> comment: minor bump because new error code in pcre2_error.c "invalid
> range in character class\0"

> 31. Implemented PCRE2_EXTRA_ALT_BSUX to support ECMAScript 6's \u{hhh}
> construct.
>
> comment: minor bump because new symbol in pcre2.h
> +#define PCRE2_EXTRA_ALT_BSUX                 0x00000020u  /* C */

Technically not symbol additions, but fine, they might warrant a minor
bump anyway.

> 23. The RunGrepTest script used to cut out the test of NUL characters for
> Solaris and MacOS as printf and sed can't handle them. It seems that the *BSD
> systems can't either. I've inverted the test so that only those OS that are
> known to work (currently only Linux) try to run this test.
>
> comment: Now that the test checks for Linux, I am proposing to
> s/Linux/OpenBSD/ in the patch that jca@ added. This seems to be the
> easiest way to run this test now.
>
> previous version:
> uname=`uname`
> if [ "$uname" != "SunOS" -a "$uname" != "Darwin" ] ; then
> ...
>
> newest version:
> uname=`uname`
> case $uname in
>   Linux)
> ...

This is just badly implemented, portability-wise.  sed can't portably
handle NUL characters, so sed shouldn't be used here.  Your change
looks fine as a workaround for our ports tree, though.

Diff committed, thanks!

--
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE

Re: update devel/pcre2 10.32 -> 10.33

Jeremie Courreges-Anglas-2
On Thu, May 02 2019, Jeremie Courreges-Anglas <[hidden email]> wrote:

[...]

>> 23. The RunGrepTest script used to cut out the test of NUL characters for
>> Solaris and MacOS as printf and sed can't handle them. It seems that the *BSD
>> systems can't either. I've inverted the test so that only those OS that are
>> known to work (currently only Linux) try to run this test.
>>
>> comment: Now that the test checks for Linux, I am proposing to
>> s/Linux/OpenBSD/ in the patch that jca@ added. This seems to be the
>> easiest way to run this test now.
>>
>> previous version:
>> uname=`uname`
>> if [ "$uname" != "SunOS" -a "$uname" != "Darwin" ] ; then
>> ...
>>
>> newest version:
>> uname=`uname`
>> case $uname in
>>   Linux)
>> ...
>
> This is just badly implemented, portability-wise.  sed can't portably
> handle NUL characters, so sed shouldn't be used here.  Your change
> looks fine as a workaround for our ports tree, though.

If you'd like to get rid of that patch, here's another approach using
tr(1) that could be pushed upstream (I'm not volunteering).  I'd expect
all tr(1) implementations to support NUL bytes now.  See APPLICATION
USAGE and RATIONALE:

  http://pubs.opengroup.org/onlinepubs/9699919799/utilities/tr.html

I did not add a patch for testdata/grepoutputN because it contains CR
bytes, hence the sed -i hack.
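
A minimal standalone sketch of the tr(1) idea, assuming only a POSIX sh, printf and tr. The function name is hypothetical, and the input imitates the two NUL-separated records written to testNinputgrep rather than real pcre2grep output.

```shell
#!/bin/sh
# Replace every NUL byte on stdin with '@', using the octal escape
# that tr(1) accepts, instead of relying on sed's \x00 handling.
nul_to_at() {
    tr '\000' '@'
}

# Two records separated by a NUL byte, as in Test N7's input file.
result=$(printf 'abc\0def' | nul_to_at)
printf '%s\n' "$result"
```

Because the NUL is translated before the data reaches a shell variable or a text comparison, nothing downstream has to cope with NUL bytes.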


Index: Makefile
===================================================================
RCS file: /cvs/ports/devel/pcre2/Makefile,v
retrieving revision 1.10
diff -u -p -r1.10 Makefile
--- Makefile 1 May 2019 21:47:27 -0000 1.10
+++ Makefile 1 May 2019 22:47:11 -0000
@@ -39,4 +39,7 @@ CONFIGURE_ARGS += --disable-jit
 CONFIGURE_ENV = CPPFLAGS="-I${LOCALBASE}/include" \
  LDFLAGS="-L${LOCALBASE}/lib"
 
+post-extract:
+ sed -i 's/1:abcZERO/1:abc@/' ${WRKSRC}/testdata/grepoutputN
+
 .include <bsd.port.mk>
Index: patches/patch-RunGrepTest
===================================================================
RCS file: /cvs/ports/devel/pcre2/patches/patch-RunGrepTest,v
retrieving revision 1.2
diff -u -p -r1.2 patch-RunGrepTest
--- patches/patch-RunGrepTest 1 May 2019 21:47:27 -0000 1.2
+++ patches/patch-RunGrepTest 1 May 2019 22:47:11 -0000
@@ -1,19 +1,35 @@
 $OpenBSD: patch-RunGrepTest,v 1.2 2019/05/01 21:47:27 jca Exp $
 
 Our sed(1) doesn't cope with NUL bytes and \x00-style notation.
+Use tr(1) instead.
 
 Index: RunGrepTest
 --- RunGrepTest.orig
 +++ RunGrepTest
-@@ -723,9 +723,9 @@ $valgrind $vjs $pcre2grep -n --newline=anycrlf "^(abc|
+@@ -714,24 +714,9 @@ $valgrind $vjs $pcre2grep -n --newline=any "^(abc|def|
+ printf '%c--------------------------- Test N6 ------------------------------\r\n' - >>testtrygrep
+ $valgrind $vjs $pcre2grep -n --newline=anycrlf "^(abc|def|ghi|jkl)" testNinputgrep >>testtrygrep
+
+-# It seems impossible to handle NUL characters easily in many operating
+-# systems, including Solaris (aka SunOS), where the version of sed explicitly
+-# doesn't like them, and also MacOS (Darwin), OpenBSD, FreeBSD, and NetBSD. So
+-# now we run this test only on OS that are known to work. For the rest, we
+-# fudge the output so that the comparison works.
+-
  printf '%c--------------------------- Test N7 ------------------------------\r\n' - >>testtrygrep
- uname=`uname`
- case $uname in
+-uname=`uname`
+-case $uname in
 -  Linux)
-+  OpenBSD)
-     printf 'abc\0def' >testNinputgrep
+-    printf 'abc\0def' >testNinputgrep
 -    $valgrind $vjs $pcre2grep -na --newline=nul "^(abc|def)" testNinputgrep | sed 's/\x00/ZERO/' >>testtrygrep
-+    $valgrind $vjs $pcre2grep -na --newline=nul "^(abc|def)" testNinputgrep | gsed 's/\x00/ZERO/' >>testtrygrep
-     echo "" >>testtrygrep
-     ;;
-   *)
+-    echo "" >>testtrygrep
+-    ;;
+-  *)
+-    echo '1:abcZERO2:def' >>testtrygrep
+-    ;;
+-esac
++printf 'abc\0def' >testNinputgrep
++$valgrind $vjs $pcre2grep -na --newline=nul "^(abc|def)" testNinputgrep | tr '\000' '@' >>testtrygrep
+
+ $cf $srcdir/testdata/grepoutputN testtrygrep
+ if [ $? != 0 ] ; then exit 1; fi

--
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE