[PATCH] Folding of Comment Lines in make/lowparse.c

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

[PATCH] Folding of Comment Lines in make/lowparse.c

William Ahern-2
The routine skip_empty_lines_and_read_char() is an optimization to skip over
blocks of comment lines. When it reads an unescaped '#' it uses the helper
routine skip_to_end_of_line(). But skip_to_end_of_line() doesn't fold lines
as it should and like its parent caller does. (See patch at end of message.)

I found this bug perusing the BearSSL source code.

  25 # The lines below are a horrible hack that nonetheless works. On a
  26 # "make" utility compatible with Single Unix v4 (this includes GNU and
  27 # BSD make), the '\' at the end of a command line counts as an escape
  28 # for the newline character, so the next line is still a comment.
  29 # However, Microsoft's nmake.exe (that comes with Visual Studio) does
  30 # not interpret the final '\' that way in a comment. The end result is
  31 # that when using nmake.exe, this will include "mk/Win.mk", whereas
  32 # GNU/BSD make will include "mk/Unix.mk".
  33
  34 # \
  35 !ifndef 0 # \
  36 !include mk/NMake.mk # \
  37 !else
  38 .POSIX:
  39 include mk/SingleUnix.mk
  40 # Extra hack for OpenBSD make.
  41 ifndef: all
  42 0: all
  43 endif: all
  44 # \
  45 !endif

  -- https://bearssl.org/gitweb/?p=BearSSL;a=blob;f=Makefile;hb=9dc62112

It's definitely a bug. POSIX says,

  There are three kinds of comments: blank lines, empty lines, and a
  <number-sign> ( '#' ) and all following characters up to the first
  unescaped <newline> character.

and

  When an escaped <newline> (one preceded by a <backslash>) is found
  anywhere in the makefile except in a command line, an include line, or a
  line immediately preceding an include line, it shall be replaced, along
  with any leading white space on the following line, with a single <space>.

Fortunately POSIX leaves unspecified what happens to an escaped newline
preceding an include line. My patch should be the end of the
matter unless there's some other bug lurking. Here's a test case,

  # \
  broken:;@echo broken
  fixed:;@echo fixed

and the patch,

Index: lowparse.c
===================================================================
RCS file: /cvs/src/usr.bin/make/lowparse.c,v
retrieving revision 1.35
diff -u -r1.35 lowparse.c
--- lowparse.c 21 Oct 2016 16:12:38 -0000 1.35
+++ lowparse.c 23 Feb 2018 20:02:59 -0000
@@ -247,20 +247,21 @@
 static int
 skip_to_end_of_line(void)
 {
- if (current->F) {
- if (current->end - current->ptr > 1)
- current->ptr = current->end - 1;
- if (*current->ptr == '\n')
- return *current->ptr++;
- return EOF;
- } else {
- int c;
+ int escaped = 0, c;
 
- do {
- c = read_char();
- } while (c != '\n' && c != EOF);
- return c;
+ while (EOF != (c = read_char())) {
+ if (escaped) {
+ if (c == '\n')
+ current->origin.lineno++;
+ escaped = 0;
+ } else if (c == '\\') {
+ escaped = 1;
+ } else if (c == '\n') {
+ break;
+ }
  }
+
+ return c;
 }
 
 

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Fix Comment Line Unfolding in make/lowparse.c

William Ahern-2
On Fri, Feb 23, 2018 at 12:27:14PM -0800, William Ahern wrote:
> The routine skip_empty_lines_and_read_char() is an optimization to skip over
> blocks of comment lines. When it reads an unescaped '#' it uses the helper
> routine skip_to_end_of_line(). But skip_to_end_of_line() doesn't fold lines
> as it should and like its parent caller does. (See patch at end of message.)

FWIW, I meant skip_to_end_of_line() doesn't _unfold_ lines as it should. I
rephrased the subject line in case anyone ignored this patch thinking I was
submitting a patch to reformat comment lines in lowparse.c itself.

Is there any interest in accepting this patch? Would there be any interest
in a patch which rewrote skip_empty_lines_and_read_char itself? I think it
could be substantially simplified using the same, simple state machine I
used to fix skip_to_end_of_lines. The performance benefit of the current
approach seems dubious relative to the complexity, and especially
considering its (wrongly) predicated on the ability to avoid parsing escape
sequences past '#' comments without violating the syntax rules. I just
figured it wasn't worth fixing what wasn't broken as long as I could easily
fix what was broken.

<snip>

> Index: lowparse.c
> ===================================================================
> RCS file: /cvs/src/usr.bin/make/lowparse.c,v
> retrieving revision 1.35
> diff -u -r1.35 lowparse.c
> --- lowparse.c 21 Oct 2016 16:12:38 -0000 1.35
> +++ lowparse.c 23 Feb 2018 20:02:59 -0000
> @@ -247,20 +247,21 @@
>  static int
>  skip_to_end_of_line(void)
>  {
> - if (current->F) {
> - if (current->end - current->ptr > 1)
> - current->ptr = current->end - 1;
> - if (*current->ptr == '\n')
> - return *current->ptr++;
> - return EOF;
> - } else {
> - int c;
> + int escaped = 0, c;
>  
> - do {
> - c = read_char();
> - } while (c != '\n' && c != EOF);
> - return c;
> + while (EOF != (c = read_char())) {
> + if (escaped) {
> + if (c == '\n')
> + current->origin.lineno++;
> + escaped = 0;
> + } else if (c == '\\') {
> + escaped = 1;
> + } else if (c == '\n') {
> + break;
> + }
>   }
> +
> + return c;
>  }
>  
>