mg: crash on re-search with a file containing an empty line

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

mg: crash on re-search with a file containing an empty line

Hiltjo Posthuma
Hi,

I noticed a bug in mg with regex search.

A way to reproduce the issue:

* Create a file with atleast one empty line, for example:

"a

b"

* Move the cursor after "a".
* M-x re-search-forward, use a term like "^$".

Result: Segmentation fault (core dumped)

Reproduced on -current on amd64.

Backtrace:

(gdb) bt
#0  sslow (m=0x7f7fffff6008, start=0x1 <Address 0x1 out of bounds>, stop=0x0, startst=1,
    stopst=2) at regex/engine.c:779
#1  0x0000013fa9b9a9d0 in regexec (preg=Variable "preg" is not available.
) at regex/engine.c:201
#2  0x0000013d432f55de in re_forwsrch () at /usr/src/usr.bin/mg/re_search.c:335
#3  0x0000013d432f5273 in re_forwsearch (f=0, n=1) at /usr/src/usr.bin/mg/re_search.c:62
#4  0x0000013d432e8615 in extend (f=0, n=1) at /usr/src/usr.bin/mg/extend.c:549
#5  0x0000013d432efb8c in mgwrap (funct=0x13d432e84d0 <extend>, f=0, n=1)
    at /usr/src/usr.bin/mg/kbd.c:461
#6  0x0000013d432efac1 in doin () at /usr/src/usr.bin/mg/kbd.c:167
#7  0x0000013d432f3020 in main (argc=0, argv=0x7f7fffff6350) at /usr/src/usr.bin/mg/main.c:211

--
Kind regards,
Hiltjo

Reply | Threaded
Open this post in threaded view
|

Re: mg: crash on re-search with a file containing an empty line

Theo Buehler-3
On Sun, Jun 28, 2020 at 06:58:22PM +0200, Hiltjo Posthuma wrote:

> Hi,
>
> I noticed a bug in mg with regex search.
>
> A way to reproduce the issue:
>
> * Create a file with atleast one empty line, for example:
>
> "a
>
> b"
>
> * Move the cursor after "a".
> * M-x re-search-forward, use a term like "^$".
>
> Result: Segmentation fault (core dumped)
>
> Reproduced on -current on amd64.

That's because regexec() doesn't like to be passed NULL for its string
argument. This happens for an empty line as ltext(clp) is NULL in that
case.

Diff below is a quick workaround for this problem and marks the places
where similar crashes can be produced. Perhaps it would be better to
check for llength(clp) instead of ltext(clp)?

Index: re_search.c
===================================================================
RCS file: /var/cvs/src/usr.bin/mg/re_search.c,v
retrieving revision 1.33
diff -u -p -r1.33 re_search.c
--- re_search.c 6 Aug 2017 04:39:45 -0000 1.33
+++ re_search.c 28 Jun 2020 18:05:00 -0000
@@ -332,8 +332,8 @@ re_forwsrch(void)
  while (clp != (curbp->b_headp)) {
  regex_match[0].rm_so = tbo;
  regex_match[0].rm_eo = llength(clp);
- error = regexec(&regex_buff, ltext(clp), RE_NMATCH, regex_match,
-    REG_STARTEND);
+ error = regexec(&regex_buff, ltext(clp) ? ltext(clp) : "",
+    RE_NMATCH, regex_match, REG_STARTEND);
  if (error != 0) {
  clp = lforw(clp);
  tdotline++;
@@ -389,8 +389,9 @@ re_backsrch(void)
  * do this character-by-character after the first match since
  * POSIX regexps don't give you a way to do reverse matches.
  */
- while (!regexec(&regex_buff, ltext(clp), RE_NMATCH, regex_match,
-    REG_STARTEND) && regex_match[0].rm_so < tbo) {
+ while (!regexec(&regex_buff, ltext(clp) ? ltext(clp) : "",
+    RE_NMATCH, regex_match, REG_STARTEND) &&
+    regex_match[0].rm_so < tbo) {
  memcpy(&lastmatch, &regex_match[0], sizeof(regmatch_t));
  regex_match[0].rm_so++;
  regex_match[0].rm_eo = llength(clp);
@@ -538,8 +539,8 @@ killmatches(int cond)
  /* see if line matches */
  regex_match[0].rm_so = 0;
  regex_match[0].rm_eo = llength(clp);
- error = regexec(&regex_buff, ltext(clp), RE_NMATCH, regex_match,
-    REG_STARTEND);
+ error = regexec(&regex_buff, ltext(clp) ? ltext(clp) : "",
+    RE_NMATCH, regex_match, REG_STARTEND);
 
  /* Delete line when appropriate */
  if ((cond == FALSE && error) || (cond == TRUE && !error)) {
@@ -613,8 +614,8 @@ countmatches(int cond)
  /* see if line matches */
  regex_match[0].rm_so = 0;
  regex_match[0].rm_eo = llength(clp);
- error = regexec(&regex_buff, ltext(clp), RE_NMATCH, regex_match,
-    REG_STARTEND);
+ error = regexec(&regex_buff, ltext(clp) ? ltext(clp) : "",
+    RE_NMATCH, regex_match, REG_STARTEND);
 
  /* Count line when appropriate */
  if ((cond == FALSE && error) || (cond == TRUE && !error))

Reply | Threaded
Open this post in threaded view
|

Re: mg: crash on re-search with a file containing an empty line

Hiltjo Posthuma
On Sun, Jun 28, 2020 at 08:10:05PM +0200, Theo Buehler wrote:

> On Sun, Jun 28, 2020 at 06:58:22PM +0200, Hiltjo Posthuma wrote:
> > Hi,
> >
> > I noticed a bug in mg with regex search.
> >
> > A way to reproduce the issue:
> >
> > * Create a file with atleast one empty line, for example:
> >
> > "a
> >
> > b"
> >
> > * Move the cursor after "a".
> > * M-x re-search-forward, use a term like "^$".
> >
> > Result: Segmentation fault (core dumped)
> >
> > Reproduced on -current on amd64.
>
> That's because regexec() doesn't like to be passed NULL for its string
> argument. This happens for an empty line as ltext(clp) is NULL in that
> case.
>
> Diff below is a quick workaround for this problem and marks the places
> where similar crashes can be produced. Perhaps it would be better to
> check for llength(clp) instead of ltext(clp)?
>
> Index: re_search.c
> ===================================================================
> RCS file: /var/cvs/src/usr.bin/mg/re_search.c,v
> retrieving revision 1.33
> diff -u -p -r1.33 re_search.c
> --- re_search.c 6 Aug 2017 04:39:45 -0000 1.33
> +++ re_search.c 28 Jun 2020 18:05:00 -0000
> @@ -332,8 +332,8 @@ re_forwsrch(void)
>   while (clp != (curbp->b_headp)) {
>   regex_match[0].rm_so = tbo;
>   regex_match[0].rm_eo = llength(clp);
> - error = regexec(&regex_buff, ltext(clp), RE_NMATCH, regex_match,
> -    REG_STARTEND);
> + error = regexec(&regex_buff, ltext(clp) ? ltext(clp) : "",
> +    RE_NMATCH, regex_match, REG_STARTEND);
>   if (error != 0) {
>   clp = lforw(clp);
>   tdotline++;
> @@ -389,8 +389,9 @@ re_backsrch(void)
>   * do this character-by-character after the first match since
>   * POSIX regexps don't give you a way to do reverse matches.
>   */
> - while (!regexec(&regex_buff, ltext(clp), RE_NMATCH, regex_match,
> -    REG_STARTEND) && regex_match[0].rm_so < tbo) {
> + while (!regexec(&regex_buff, ltext(clp) ? ltext(clp) : "",
> +    RE_NMATCH, regex_match, REG_STARTEND) &&
> +    regex_match[0].rm_so < tbo) {
>   memcpy(&lastmatch, &regex_match[0], sizeof(regmatch_t));
>   regex_match[0].rm_so++;
>   regex_match[0].rm_eo = llength(clp);
> @@ -538,8 +539,8 @@ killmatches(int cond)
>   /* see if line matches */
>   regex_match[0].rm_so = 0;
>   regex_match[0].rm_eo = llength(clp);
> - error = regexec(&regex_buff, ltext(clp), RE_NMATCH, regex_match,
> -    REG_STARTEND);
> + error = regexec(&regex_buff, ltext(clp) ? ltext(clp) : "",
> +    RE_NMATCH, regex_match, REG_STARTEND);
>  
>   /* Delete line when appropriate */
>   if ((cond == FALSE && error) || (cond == TRUE && !error)) {
> @@ -613,8 +614,8 @@ countmatches(int cond)
>   /* see if line matches */
>   regex_match[0].rm_so = 0;
>   regex_match[0].rm_eo = llength(clp);
> - error = regexec(&regex_buff, ltext(clp), RE_NMATCH, regex_match,
> -    REG_STARTEND);
> + error = regexec(&regex_buff, ltext(clp) ? ltext(clp) : "",
> +    RE_NMATCH, regex_match, REG_STARTEND);
>  
>   /* Count line when appropriate */
>   if ((cond == FALSE && error) || (cond == TRUE && !error))
>

Hi,

Thanks for looking into it, it looks better.

With this patch I noticed re_backsrch() does not work fully correct yet.

I think the line:

> +    regex_match[0].rm_so < tbo) {

should be:

> +    regex_match[0].rm_so <= tbo) {

because tbo is already adjusted at the top of the function re_backsrch().

With this line change searching backwards with "^$" finds a result.

--
Kind regards,
Hiltjo

Reply | Threaded
Open this post in threaded view
|

Re: mg: crash on re-search with a file containing an empty line

Theo Buehler-3
On Mon, Jun 29, 2020 at 01:17:56PM +0200, Hiltjo Posthuma wrote:

> Hi,
>
> Thanks for looking into it, it looks better.
>
> With this patch I noticed re_backsrch() does not work fully correct yet.
>
> I think the line:
>
> > +    regex_match[0].rm_so < tbo) {
>
> should be:
>
> > +    regex_match[0].rm_so <= tbo) {
>
> because tbo is already adjusted at the top of the function re_backsrch().
>
> With this line change searching backwards with "^$" finds a result.

Indeed, thank you. I must have messed up my testing...

However, changing < to <= changes the behavior of all backward regex
searches.

Type "aaa" on a line and leave the cursor after the third a on the same
line. If you re-search-backward for "a" on -current, the cursor is
placed on the middle "a" (different from emacs which places the cursor
on the third a).  If you then re-search-backward for "aa", the search
fails (as it does on emacs also).

With your suggested change (as in the diff below), the cursor is placed
on the third "a" as in emacs. However, with the cursor on the middle a
re-search-backward for "aa" now succeeds (different from emacs).

I'm not sure this is desirable. I'm no user of either emacs or mg, so I
think more thought and care is needed but this will have to be done by
someone else...

Index: re_search.c
===================================================================
RCS file: /var/cvs/src/usr.bin/mg/re_search.c,v
retrieving revision 1.33
diff -u -p -r1.33 re_search.c
--- re_search.c 6 Aug 2017 04:39:45 -0000 1.33
+++ re_search.c 29 Jun 2020 11:26:07 -0000
@@ -332,8 +332,8 @@ re_forwsrch(void)
while (clp != (curbp->b_headp)) {
  regex_match[0].rm_so = tbo;
  regex_match[0].rm_eo = llength(clp);
- error = regexec(&regex_buff, ltext(clp), RE_NMATCH, regex_match,
-    REG_STARTEND);
+ error = regexec(&regex_buff, ltext(clp) ? ltext(clp) : "",
+    RE_NMATCH, regex_match, REG_STARTEND);
  if (error != 0) {
  clp = lforw(clp);
  tdotline++;
@@ -389,8 +389,9 @@ re_backsrch(void)
  * do this character-by-character after the first match since
  * POSIX regexps don't give you a way to do reverse matches.
  */
- while (!regexec(&regex_buff, ltext(clp), RE_NMATCH, regex_match,
-    REG_STARTEND) && regex_match[0].rm_so < tbo) {
+ while (!regexec(&regex_buff, ltext(clp) ? ltext(clp) : "",
+    RE_NMATCH, regex_match, REG_STARTEND) &&
+    regex_match[0].rm_so <= tbo) {
  memcpy(&lastmatch, &regex_match[0], sizeof(regmatch_t));
  regex_match[0].rm_so++;
  regex_match[0].rm_eo = llength(clp);
@@ -538,8 +539,8 @@ killmatches(int cond)
  /* see if line matches */
  regex_match[0].rm_so = 0;
  regex_match[0].rm_eo = llength(clp);
- error = regexec(&regex_buff, ltext(clp), RE_NMATCH, regex_match,
-    REG_STARTEND);
+ error = regexec(&regex_buff, ltext(clp) ? ltext(clp) : "",
+    RE_NMATCH, regex_match, REG_STARTEND);
 
  /* Delete line when appropriate */
  if ((cond == FALSE && error) || (cond == TRUE && !error)) {
@@ -613,8 +614,8 @@ countmatches(int cond)
  /* see if line matches */
  regex_match[0].rm_so = 0;
  regex_match[0].rm_eo = llength(clp);
- error = regexec(&regex_buff, ltext(clp), RE_NMATCH, regex_match,
-    REG_STARTEND);
+ error = regexec(&regex_buff, ltext(clp) ? ltext(clp) : "",
+    RE_NMATCH, regex_match, REG_STARTEND);
 
  /* Count line when appropriate */
  if ((cond == FALSE && error) || (cond == TRUE && !error))

Reply | Threaded
Open this post in threaded view
|

Re: mg: crash on re-search with a file containing an empty line

Hiltjo Posthuma
On Mon, Jun 29, 2020 at 02:36:31PM +0200, Theo Buehler wrote:

> On Mon, Jun 29, 2020 at 01:17:56PM +0200, Hiltjo Posthuma wrote:
> > Hi,
> >
> > Thanks for looking into it, it looks better.
> >
> > With this patch I noticed re_backsrch() does not work fully correct yet.
> >
> > I think the line:
> >
> > > +    regex_match[0].rm_so < tbo) {
> >
> > should be:
> >
> > > +    regex_match[0].rm_so <= tbo) {
> >
> > because tbo is already adjusted at the top of the function re_backsrch().
> >
> > With this line change searching backwards with "^$" finds a result.
>
> Indeed, thank you. I must have messed up my testing...
>
> However, changing < to <= changes the behavior of all backward regex
> searches.
>
> Type "aaa" on a line and leave the cursor after the third a on the same
> line. If you re-search-backward for "a" on -current, the cursor is
> placed on the middle "a" (different from emacs which places the cursor
> on the third a).  If you then re-search-backward for "aa", the search
> fails (as it does on emacs also).
>
> With your suggested change (as in the diff below), the cursor is placed
> on the third "a" as in emacs. However, with the cursor on the middle a
> re-search-backward for "aa" now succeeds (different from emacs).
>
> I'm not sure this is desirable. I'm no user of either emacs or mg, so I
> think more thought and care is needed but this will have to be done by
> someone else...
>
> Index: re_search.c
> ===================================================================
> RCS file: /var/cvs/src/usr.bin/mg/re_search.c,v
> retrieving revision 1.33
> diff -u -p -r1.33 re_search.c
> --- re_search.c 6 Aug 2017 04:39:45 -0000 1.33
> +++ re_search.c 29 Jun 2020 11:26:07 -0000
> @@ -332,8 +332,8 @@ re_forwsrch(void)
> while (clp != (curbp->b_headp)) {
>   regex_match[0].rm_so = tbo;
>   regex_match[0].rm_eo = llength(clp);
> - error = regexec(&regex_buff, ltext(clp), RE_NMATCH, regex_match,
> -    REG_STARTEND);
> + error = regexec(&regex_buff, ltext(clp) ? ltext(clp) : "",
> +    RE_NMATCH, regex_match, REG_STARTEND);
>   if (error != 0) {
>   clp = lforw(clp);
>   tdotline++;
> @@ -389,8 +389,9 @@ re_backsrch(void)
>   * do this character-by-character after the first match since
>   * POSIX regexps don't give you a way to do reverse matches.
>   */
> - while (!regexec(&regex_buff, ltext(clp), RE_NMATCH, regex_match,
> -    REG_STARTEND) && regex_match[0].rm_so < tbo) {
> + while (!regexec(&regex_buff, ltext(clp) ? ltext(clp) : "",
> +    RE_NMATCH, regex_match, REG_STARTEND) &&
> +    regex_match[0].rm_so <= tbo) {
>   memcpy(&lastmatch, &regex_match[0], sizeof(regmatch_t));
>   regex_match[0].rm_so++;
>   regex_match[0].rm_eo = llength(clp);
> @@ -538,8 +539,8 @@ killmatches(int cond)
>   /* see if line matches */
>   regex_match[0].rm_so = 0;
>   regex_match[0].rm_eo = llength(clp);
> - error = regexec(&regex_buff, ltext(clp), RE_NMATCH, regex_match,
> -    REG_STARTEND);
> + error = regexec(&regex_buff, ltext(clp) ? ltext(clp) : "",
> +    RE_NMATCH, regex_match, REG_STARTEND);
>  
>   /* Delete line when appropriate */
>   if ((cond == FALSE && error) || (cond == TRUE && !error)) {
> @@ -613,8 +614,8 @@ countmatches(int cond)
>   /* see if line matches */
>   regex_match[0].rm_so = 0;
>   regex_match[0].rm_eo = llength(clp);
> - error = regexec(&regex_buff, ltext(clp), RE_NMATCH, regex_match,
> -    REG_STARTEND);
> + error = regexec(&regex_buff, ltext(clp) ? ltext(clp) : "",
> +    RE_NMATCH, regex_match, REG_STARTEND);
>  
>   /* Count line when appropriate */
>   if ((cond == FALSE && error) || (cond == TRUE && !error))
>

Hi,

I don't use (GNU/)emacs either.

Can this patch get committed in the current form?

--
Kind regards,
Hiltjo

Reply | Threaded
Open this post in threaded view
|

Re: mg: crash on re-search with a file containing an empty line

Theo Buehler-3
> Can this patch get committed in the current form?

I don't mind committing either version of the patch.

Unless I hear objections, I'm going to commit the patch below soon.

Index: re_search.c
===================================================================
RCS file: /var/cvs/src/usr.bin/mg/re_search.c,v
retrieving revision 1.33
diff -u -p -r1.33 re_search.c
--- re_search.c 6 Aug 2017 04:39:45 -0000 1.33
+++ re_search.c 8 Jul 2020 13:00:40 -0000
@@ -332,8 +332,8 @@ re_forwsrch(void)
  while (clp != (curbp->b_headp)) {
  regex_match[0].rm_so = tbo;
  regex_match[0].rm_eo = llength(clp);
- error = regexec(&regex_buff, ltext(clp), RE_NMATCH, regex_match,
-    REG_STARTEND);
+ error = regexec(&regex_buff, ltext(clp) ? ltext(clp) : "",
+    RE_NMATCH, regex_match, REG_STARTEND);
  if (error != 0) {
  clp = lforw(clp);
  tdotline++;
@@ -389,8 +389,9 @@ re_backsrch(void)
  * do this character-by-character after the first match since
  * POSIX regexps don't give you a way to do reverse matches.
  */
- while (!regexec(&regex_buff, ltext(clp), RE_NMATCH, regex_match,
-    REG_STARTEND) && regex_match[0].rm_so < tbo) {
+ while (!regexec(&regex_buff, ltext(clp) ? ltext(clp) : "",
+    RE_NMATCH, regex_match, REG_STARTEND) &&
+    regex_match[0].rm_so <= tbo) {
  memcpy(&lastmatch, &regex_match[0], sizeof(regmatch_t));
  regex_match[0].rm_so++;
  regex_match[0].rm_eo = llength(clp);
@@ -538,8 +539,8 @@ killmatches(int cond)
  /* see if line matches */
  regex_match[0].rm_so = 0;
  regex_match[0].rm_eo = llength(clp);
- error = regexec(&regex_buff, ltext(clp), RE_NMATCH, regex_match,
-    REG_STARTEND);
+ error = regexec(&regex_buff, ltext(clp) ? ltext(clp) : "",
+    RE_NMATCH, regex_match, REG_STARTEND);
 
  /* Delete line when appropriate */
  if ((cond == FALSE && error) || (cond == TRUE && !error)) {
@@ -613,8 +614,8 @@ countmatches(int cond)
  /* see if line matches */
  regex_match[0].rm_so = 0;
  regex_match[0].rm_eo = llength(clp);
- error = regexec(&regex_buff, ltext(clp), RE_NMATCH, regex_match,
-    REG_STARTEND);
+ error = regexec(&regex_buff, ltext(clp) ? ltext(clp) : "",
+    RE_NMATCH, regex_match, REG_STARTEND);
 
  /* Count line when appropriate */
  if ((cond == FALSE && error) || (cond == TRUE && !error))