grep -ob: behave like ggrep, adapt documentation

Christopher Zimmermann-2
Hi,

I just tried to find the byte position of a string within a binary file
using grep. Our base grep -bo let me down because it will only display
the position of the '\n' delimited line, not the position of the
pattern.

That's what our grep(1) says:
-b The offset in bytes of a matched pattern is displayed in
        front of the respective matched line.

What it actually does is display the offset of the line, not the
pattern. I think it should read as follows:

-b      Each output line is preceded by its position (in bytes) in the
        file.  If option -o is also specified, the offset in bytes of
        the actual matched pattern is displayed.
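
The arithmetic behind the proposed -b/-o wording can be sketched in C. This is only an illustration, assuming POSIX regcomp(3)/regexec(3); match_offset() is a hypothetical helper and not code from grep itself. The reported position is the offset of the line within the file plus the start of the match within that line (rm_so).

```c
#include <regex.h>
#include <sys/types.h>

/*
 * Hypothetical helper, not part of grep: return the byte offset in the
 * file of the first match of `pattern` in `line`, given that the line
 * itself starts at byte `line_off`.  Returns -1 if the pattern does not
 * compile or does not match.
 */
long long
match_offset(const char *pattern, const char *line, long long line_off)
{
	regex_t re;
	regmatch_t pm;
	long long off = -1;

	if (regcomp(&re, pattern, REG_EXTENDED) != 0)
		return -1;
	/* rm_so is the start of the match relative to the start of `line`. */
	if (regexec(&re, line, 1, &pm, 0) == 0)
		off = line_off + (long long)pm.rm_so;
	regfree(&re);
	return off;
}
```

For a line that starts at byte 100 and reads "foo bar", the pattern "bar" starts 4 bytes into the line, so the helper returns 104 rather than 100.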


Here's the diff. Regression tests still pass. OK?

Christopher



Index: grep.1
===================================================================
RCS file: /cvs/src/usr.bin/grep/grep.1,v
retrieving revision 1.45
diff -u -p -r1.45 grep.1
--- grep.1 10 Dec 2017 09:17:24 -0000 1.45
+++ grep.1 16 Jul 2019 21:16:18 -0000
@@ -131,8 +131,11 @@ and
 .Fl C
 options.
 .It Fl b
-The offset in bytes of a matched pattern is
-displayed in front of the respective matched line.
+Each output line is preceded by its position (in bytes) in the file.
+If option
+.Fl o
+is also specified, the offset in bytes of the actual matched pattern is
+displayed.
 .It Fl C Ns Oo Ar num Oc , Fl -context Ns Op = Ns Ar num
 Print
 .Ar num
Index: util.c
===================================================================
RCS file: /cvs/src/usr.bin/grep/util.c,v
retrieving revision 1.59
diff -u -p -r1.59 util.c
--- util.c 23 Jan 2019 23:00:54 -0000 1.59
+++ util.c 16 Jul 2019 21:16:18 -0000
@@ -250,9 +250,9 @@ print:
  if (Bflag > 0)
  printqueue();
  linesqueued = 0;
- printline(l, ':', oflag ? &pmatch : NULL);
+ printline(l, ':', &pmatch);
  } else {
- printline(l, '-', oflag ? &pmatch : NULL);
+ printline(l, '-', &pmatch);
  tail--;
  }
  }
@@ -651,12 +651,14 @@ printline(str_t *line, int sep, regmatch
  if (bflag) {
  if (n)
  putchar(sep);
- printf("%lld", (long long)line->off);
+ printf("%lld",
+    (long long)line->off
+    + (pmatch && oflag ? pmatch->rm_so : 0));
  ++n;
  }
  if (n)
  putchar(sep);
- if (pmatch)
+ if (oflag && pmatch)
  fwrite(line->dat + pmatch->rm_so,
     pmatch->rm_eo - pmatch->rm_so, 1, stdout);
  else


--
http://gmerlin.de
OpenPGP: http://gmerlin.de/christopher.pub
CB07 DA40 B0B6 571D 35E2  0DEF 87E2 92A7 13E5 DEE1

Re: grep -ob: behave like ggrep, adapt documentation

Ted Unangst-6
Christopher Zimmermann wrote:

> Hi,
>
> I just tried to find the byte position of a string within a binary file
> using grep. Our base grep -bo let me down because it will only display
> the position of the '\n' delimited line, not the position of the
> pattern.
>
> That's what our grep(1) says:
> -b The offset in bytes of a matched pattern is displayed in
> front of the respective matched line.
>
> What it actually does is display the offset of the line, not the
> pattern. I think it should read as follows:
>
> -b Each output line is preceded by its position (in bytes) in the
> file.  If option -o is also specified, the offset in bytes of
> the actual matched pattern is displayed.

This seems like the right behavior, but can the diff be simpler?

> Index: util.c
> ===================================================================
> RCS file: /cvs/src/usr.bin/grep/util.c,v
> retrieving revision 1.59
> diff -u -p -r1.59 util.c
> --- util.c 23 Jan 2019 23:00:54 -0000 1.59
> +++ util.c 16 Jul 2019 21:16:18 -0000
> @@ -250,9 +250,9 @@ print:
>   if (Bflag > 0)
>   printqueue();
>   linesqueued = 0;
> - printline(l, ':', oflag ? &pmatch : NULL);
> + printline(l, ':', &pmatch);
>   } else {
> - printline(l, '-', oflag ? &pmatch : NULL);
> + printline(l, '-', &pmatch);
>   tail--;
>   }
>   }
> @@ -651,12 +651,14 @@ printline(str_t *line, int sep, regmatch
>   if (bflag) {
>   if (n)
>   putchar(sep);
> - printf("%lld", (long long)line->off);
> + printf("%lld",
> +    (long long)line->off
> +    + (pmatch && oflag ? pmatch->rm_so : 0));
>   ++n;
>   }
>   if (n)
>   putchar(sep);
> - if (pmatch)
> + if (oflag && pmatch)
>   fwrite(line->dat + pmatch->rm_so,
>      pmatch->rm_eo - pmatch->rm_so, 1, stdout);
>   else

I don't see why you needed to change this much.

- printf("%lld", (long long)line->off);
+ printf("%lld",
+    (long long)line->off
+    + (pmatch ? pmatch->rm_so : 0));

That should be enough, no?
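
A minimal sketch of why the shorter change may suffice, based on the call sites quoted above: the unpatched code passes `oflag ? &pmatch : NULL` to printline(), so pmatch is non-NULL only when -o is given, and a plain NULL check already distinguishes the two cases. displayed_offset() is a hypothetical stand-in for the expression inside printline(), not a function in grep.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the offset expression in printline().
 * With the original call sites, pmatch is NULL unless -o was given,
 * so no separate oflag test is needed here.
 */
long long
displayed_offset(long long line_off, const regmatch_t *pmatch)
{
	return line_off + (pmatch != NULL ? (long long)pmatch->rm_so : 0);
}
```

Without -o the caller passes NULL and the line offset is printed unchanged; with -o the start of the match is added to it.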