linebuffering diff for tr(1)

Jan Klemkow
Hi,

here is a diff that adds optional line buffering to tr(1) with the
command line switch -u, like in sed(1).  I need this to remove '\r'
characters from a continuous input stream whose lines have to be
available immediately.

Please let me know if something is wrong with this diff or the change
itself.  I will fix it.

bye,
Jan

Index: tr.1
===================================================================
RCS file: /cvs/src/usr.bin/tr/tr.1,v
retrieving revision 1.20
diff -u -p -r1.20 tr.1
--- tr.1 14 Aug 2013 08:39:27 -0000 1.20
+++ tr.1 19 Nov 2013 20:46:33 -0000
@@ -41,18 +41,18 @@
 .Nd translate characters
 .Sh SYNOPSIS
 .Nm tr
-.Op Fl cs
+.Op Fl csu
 .Ar string1 string2
 .Nm tr
-.Op Fl c
+.Op Fl cu
 .Fl d
 .Ar string1
 .Nm tr
-.Op Fl c
+.Op Fl cu
 .Fl s
 .Ar string1
 .Nm tr
-.Op Fl c
+.Op Fl cu
 .Fl ds
 .Ar string1 string2
 .Sh DESCRIPTION
@@ -86,6 +86,14 @@ or
 .Ar string2 )
 in the input into a single instance of the character.
 This occurs after all deletion and translation is completed.
+.It Fl u
+Force output to be line buffered,
+printing each line as it becomes available.
+By default, output is line buffered when standard output is a terminal
+and block buffered otherwise.
+See
+.Xr setbuf 3
+for a more detailed explanation.
 .El
 .Pp
 In the first synopsis form, the characters in
@@ -284,6 +292,10 @@ The
 utility is compliant with the
 .St -p1003.1-2008
 specification.
+.Pp
+The flag
+.Op Fl u
+is an extension to that specification.
 .Pp
 System V has historically implemented character ranges using the syntax
 .Dq [c-c]
Index: tr.c
===================================================================
RCS file: /cvs/src/usr.bin/tr/tr.c,v
retrieving revision 1.15
diff -u -p -r1.15 tr.c
--- tr.c 27 Oct 2009 23:59:46 -0000 1.15
+++ tr.c 19 Nov 2013 20:46:33 -0000
@@ -88,7 +88,7 @@ main(int argc, char *argv[])
  int cflag, dflag, sflag, isstring2;
 
  cflag = dflag = sflag = 0;
- while ((ch = getopt(argc, argv, "cds")) != -1)
+ while ((ch = getopt(argc, argv, "cdsu")) != -1)
  switch((char)ch) {
  case 'c':
  cflag = 1;
@@ -99,6 +99,9 @@ main(int argc, char *argv[])
  case 's':
  sflag = 1;
  break;
+ case 'u':
+ setlinebuf(stdout);
+ break;
  case '?':
  default:
  usage();
@@ -239,9 +242,9 @@ static void
 usage(void)
 {
  fprintf(stderr,
-    "usage: tr [-cs] string1 string2\n"
-    "       tr [-c] -d string1\n"
-    "       tr [-c] -s string1\n"
-    "       tr [-c] -ds string1 string2\n");
+    "usage: tr [-csu] string1 string2\n"
+    "       tr [-cu] -d string1\n"
+    "       tr [-cu] -s string1\n"
+    "       tr [-cu] -ds string1 string2\n");
  exit(1);
 }
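
[Editor's sketch] For readers without the OpenBSD tree at hand, the effect of
the `case 'u':` hunk on the '\r' use case can be approximated in a few lines
of standard C.  This is only a minimal sketch, not the tr(1) source:
strip_cr() and make_line_buffered() are hypothetical helper names, and
setvbuf(3) with _IOLBF is used here as the ISO C counterpart of the
setlinebuf(3) call in the diff.

```c
#include <stdio.h>

/*
 * Hypothetical helper: switch a stream to line buffering.
 * Call it before any other I/O on the stream.
 * Returns 0 on success, non-zero on failure.
 */
int
make_line_buffered(FILE *fp)
{
	return setvbuf(fp, NULL, _IOLBF, 0);
}

/*
 * Hypothetical helper: copy `in` to `out`, dropping every '\r'.
 * Returns the number of bytes written, or -1 on a write error.
 */
long
strip_cr(FILE *in, FILE *out)
{
	long n = 0;
	int ch;

	while ((ch = getc(in)) != EOF) {
		if (ch == '\r')
			continue;
		if (putc(ch, out) == EOF)
			return -1;
		n++;
	}
	return n;
}
```

A caller would typically run make_line_buffered(stdout) once at startup and
then strip_cr(stdin, stdout), so that each completed line is handed to the
next program in the pipeline as soon as its newline is written.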

Re: linebuffering diff for tr(1)

Theo de Raadt
In general, new non-standard options are bad.

Basically, if we add this someone will use it in a script.  Then it will
become non-portable.  You cannot just invent something on your own like
this, without doing research to find out if someone else added a different
option.  I don't see evidence of that, so the gut answer is no.

Re: linebuffering diff for tr(1)

Stuart Henderson
On 2013/11/19 14:10, Theo de Raadt wrote:
> In general, new non-standard options are bad.
>
> Basically, if we add this someone will use it in a script.  Then it will
> become non-portable.  You cannot just invent something on your own like
> this, without doing research to find out if someone else added a different
> option.  I don't see evidence of that, so the gut answer is no.

Only FreeBSD/Dragonfly seem to use -u in tr.  How about sed -u instead?

Re: linebuffering diff for tr(1)

Jan Klemkow
On Tue, Nov 19, 2013 at 09:34:22PM +0000, Stuart Henderson wrote:
> On 2013/11/19 14:10, Theo de Raadt wrote:
> > In general, new non-standard options are bad.

I know, and in general this is my own opinion too.

> > Basically, if we add this someone will use it in a script.  Then it will
> > become non-portable.  You cannot just invent something on your own like
> > this, without doing research to find out if someone else added a different
> > option.  I don't see evidence of that, so the gut answer is no.

You are right, I forgot to look at other UNIX OSes before writing
this diff.  I just saw that -u in sed(1) is an extension to POSIX, as
this diff would be.  So I thought, maybe it is a good extension?!?

Thanks to Stuart for doing my homework :-)
> Only FreeBSD/Dragonfly seem to use -u in tr.  How about sed -u instead?

At the moment I use sed(1) for that job, but for deleting one character
sed is too big.  (Just my opinion and feeling while writing that script.)

Is there a way to get the -u option in tr(1) accepted as a BSD extension
like the others?  In that case I would change my diff from line buffering
to unbuffered, like FreeBSD does.

Re: linebuffering diff for tr(1)

Christian Weisgerber
In reply to this post by Jan Klemkow
Jan Klemkow <[hidden email]> wrote:

> here is a diff that adds optional line buffering to tr(1) with the
> command line switch -u, like in sed(1).  I need this to remove '\r'
> characters from a continuous input stream whose lines have to be
> available immediately.

It's really odd to make tr output line-buffered, since tr doesn't
process lines to begin with.  FreeBSD's tr -u is unbuffered.

Maybe, instead of adding such flags to more and more utilities, we
should have a general unbuffer tool that runs things through a pty.
(And I'd be surprised if something like this wasn't already floating
around.)

--
Christian "naddy" Weisgerber                          [hidden email]
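
[Editor's sketch] The distinction raised in this message, line-buffered versus
unbuffered output, can be observed with a small POSIX C experiment.  This is a
sketch under stated assumptions, not code from tr(1) or from FreeBSD:
bytes_visible_before_flush() is a hypothetical helper that writes a string
through a stdio stream attached to a pipe and reports how many bytes a reader
could already see before any flush.

```c
#define _POSIX_C_SOURCE 200809L	/* for fdopen(3) under strict -std= modes */

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: write `s` to a pipe through a stdio stream using
 * buffering `mode` (_IONBF or _IOLBF), then report how many bytes have
 * reached the pipe before the stream is flushed or closed.
 * Returns that byte count, or -1 on a setup error.
 */
long
bytes_visible_before_flush(int mode, const char *s)
{
	int fds[2];
	FILE *w;
	char buf[64];
	ssize_t n;

	if (pipe(fds) == -1)
		return -1;
	/* Non-blocking read end: an empty pipe yields -1 instead of hanging. */
	if (fcntl(fds[0], F_SETFL, O_NONBLOCK) == -1) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	if ((w = fdopen(fds[1], "w")) == NULL) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	if (setvbuf(w, NULL, mode, 0) != 0) {
		fclose(w);
		close(fds[0]);
		return -1;
	}

	fputs(s, w);			/* deliberately no fflush() here */
	n = read(fds[0], buf, sizeof(buf));

	fclose(w);
	close(fds[0]);
	return n < 0 ? 0 : (long)n;
}
```

With _IONBF a partial line such as "ab" is visible to the reader at once,
while with _IOLBF nothing arrives until a newline is written.  For a filter
like tr, which has no notion of lines, that is the behavioural difference
between the FreeBSD-style -u and the one proposed in the diff.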

Re: linebuffering diff for tr(1)

Shawn K. Quinn
In reply to this post by Theo de Raadt
On Tue, Nov 19, 2013, at 03:10 PM, Theo de Raadt wrote:
> In general, new non-standard options are bad.
>
> Basically, if we add this someone will use it in a script.  Then it will
> become non-portable.  You cannot just invent something on your own like
> this, without doing research to find out if someone else added a different
> option.  I don't see evidence of that, so the gut answer is no.

FreeBSD and Dragonfly BSD have this option in tr. So, this actually
improves portability.

--
  Shawn K. Quinn
  [hidden email]

Re: linebuffering diff for tr(1)

Stuart Henderson
On 2013/11/20 07:40, Shawn K. Quinn wrote:

> On Tue, Nov 19, 2013, at 03:10 PM, Theo de Raadt wrote:
> > In general, new non-standard options are bad.
> >
> > Basically, if we add this someone will use it in a script.  Then it will
> > become non-portable.  You cannot just invent something on your own like
> > this, without doing research to find out if someone else added a different
> > option.  I don't see evidence of that, so the gut answer is no.
>
> FreeBSD and Dragonfly BSD have this option in tr. So, this actually
> improves portability.

People on those OSes which *do* have this have to take extra care to make
sure they aren't writing unportable scripts.  It isn't even supported on
Linux (instead they have a generic utility in coreutils, stdbuf).

Re: linebuffering diff for tr(1)

Theo de Raadt
In reply to this post by Jan Klemkow
> On Tue, Nov 19, 2013, at 03:10 PM, Theo de Raadt wrote:
> > In general, new non-standard options are bad.
> >
> > Basically, if we add this someone will use it in a script.  Then it will
> > become non-portable.  You cannot just invent something on your own like
> > this, without doing research to find out if someone else added a different
> > option.  I don't see evidence of that, so the gut answer is no.
>
> FreeBSD and Dragonfly BSD have this option in tr. So, this actually
> improves portability.

Totally wrong there.

Re: linebuffering diff for tr(1)

Ted Unangst
In reply to this post by Theo de Raadt
> On 2013/11/20 07:40, Shawn K. Quinn wrote:

>> FreeBSD and Dragonfly BSD have this option in tr. So, this actually
>> improves portability.

It's just spreading the disease. portable means it works everywhere.
Increasing the number of people who can write nonportable code is not
the same as increasing portability.

Re: linebuffering diff for tr(1)

Shawn K. Quinn
On Wed, Nov 20, 2013, at 12:49 PM, Ted Unangst wrote:
> > On 2013/11/20 07:40, Shawn K. Quinn wrote:
>
> >> FreeBSD and Dragonfly BSD have this option in tr. So, this actually
> >> improves portability.
>
> It's just spreading the disease. portable means it works everywhere.
> Increasing the number of people who can write nonportable code is not
> the same as increasing portability.

How many others have to adopt it before it's considered portable, then?

Would you feel the same way if this were the -l option on ls, GNU had
it, and none of the BSD descendants did?

It's possible, as mentioned elsewhere, that simply making tr be
unbuffered by default is the better move, and ignore -u for
compatibility with FreeBSD and Dragonfly BSD.

--
  Shawn K. Quinn
  [hidden email]

Re: linebuffering diff for tr(1)

Theo de Raadt
In reply to this post by Jan Klemkow
> > >> FreeBSD and Dragonfly BSD have this option in tr. So, this actually
> > >> improves portability.
> >
> > It's just spreading the disease. portable means it works everywhere.
> > Increasing the number of people who can write nonportable code is not
> > the same as increasing portability.
>
> How many others have to adopt it before it's considered portable, then?

It is portable when all of them have it.  Since you can't fix the past,
we must be very conservative in our approach.

> Would you feel the same way if this were the -l option on ls, GNU had
> it, and none of the BSD descendants did?

Uhm, that's a pretty weak argument.

> It's possible, as mentioned elsewhere, that simply making tr be
> unbuffered by default is the better move, and ignore -u for
> compatibility with FreeBSD and Dragonfly BSD.

Re: linebuffering diff for tr(1)

Franco Fichtner
On 20 Nov 2013, at 21:40, Theo de Raadt <[hidden email]> wrote:

>>>>> FreeBSD and Dragonfly BSD have this option in tr. So, this actually
>>>>> improves portability.
>>>
>>> It's just spreading the disease. portable means it works everywhere.
>>> Increasing the number of people who can write nonportable code is not
>>> the same as increasing portability.
>>
>> How many others have to adopt it before it's considered portable, then?
>
> It is portable when all of them have it.  Since you can't fix the past,
> we must be very conservative in our approach.

In this case `portable' simply means `unavailable'.  And that's good.  :)
DragonFly has it solely because of the shared FreeBSD history, not because
it's being used a lot.

>> It's possible, as mentioned elsewhere, that simply making tr be
>> unbuffered by default is the better move, and ignore -u for
>> compatibility with FreeBSD and Dragonfly BSD.

How will that make things better?

Re: linebuffering diff for tr(1)

Theo de Raadt
In reply to this post by Jan Klemkow
> >>>>> FreeBSD and Dragonfly BSD have this option in tr. So, this actually
> >>>>> improves portability.
> >>>
> >>> It's just spreading the disease. portable means it works everywhere.
> >>> Increasing the number of people who can write nonportable code is not
> >>> the same as increasing portability.
> >>
> >> How many others have to adopt it before it's considered portable, then?
> >
> > It is portable when all of them have it.  Since you can't fix the past,
> > we must be very conservative in our approach.
>
> In this case `portable' simply means `unavailable'.  And that's good.  :)
> DragonFly has it solely because of the shared FreeBSD history, not because
> it's being used a lot.

We always face a mix of goals:

   - portability (for common tools, don't diverge unless value is high enough)
   - improvements (diverge if the value is high enough)
   - standards (when portability is mandated, strongly follow that)
   - de facto standards (non-official standards also exist)

It takes a pretty big cry to start making changes.  In part that is why
this conversation isn't dead yet.

It is quite telling that FreeBSD added it a long long time ago, and it
has not been adopted elsewhere.  Pushes back against the urgency.

> >> It's possible, as mentioned elsewhere, that simply making tr be
> >> unbuffered by default is the better move, and ignore -u for
> >> compatibility with FreeBSD and Dragonfly BSD.
>
> How will that make things better?

Standing alone, "compatibility with FOO" is not a very strong
argument.  What next, "compatibility with Xenix and Windows"?