locale in comm(1)

Jan Stary
Does comm(1) need to setlocale(3)?

By default it compares lines with strcoll(3), which ignores the locale
and does what strcmp(3) does; with -f it uses strcasecmp(3),
which ignores the locale too.

So remove the setlocale(3) call, remove the <locale.h> include,
remove the LC_* entries in comm.1 that have been commented out
since the initial revision in 1995,
and just use strcmp(3) or strcasecmp(3).

        Jan



Index: comm.1
===================================================================
RCS file: /cvs/src/usr.bin/comm/comm.1,v
retrieving revision 1.15
diff -u -p -r1.15 comm.1
--- comm.1 3 Sep 2010 11:09:28 -0000 1.15
+++ comm.1 10 Jan 2019 20:24:35 -0000
@@ -83,14 +83,6 @@ printed in column number three will have
 .Nm
 assumes that the files are lexically sorted; all characters
 participate in line comparisons.
-.\" .Sh ENVIRONMENT
-.\" .Bl -tag -width indent
-.\" .It Ev LANG
-.\" .It Ev LC_ALL
-.\" .It Ev LC_CTYPE
-.\" .It Ev LC_COLLATE
-.\" .It Ev LC_MESSAGES
-.\" .El
 .Sh EXIT STATUS
 .Ex -std comm
 .Sh SEE ALSO
Index: comm.c
===================================================================
RCS file: /cvs/src/usr.bin/comm/comm.c,v
retrieving revision 1.10
diff -u -p -r1.10 comm.c
--- comm.c 9 Oct 2015 01:37:07 -0000 1.10
+++ comm.c 10 Jan 2019 20:24:35 -0000
@@ -35,7 +35,6 @@
 
 #include <err.h>
 #include <limits.h>
-#include <locale.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
@@ -59,13 +58,11 @@ main(int argc, char *argv[])
 	char **p, line1[MAXLINELEN], line2[MAXLINELEN];
 	int (*compare)(const char * ,const char *);
 
-	setlocale(LC_ALL, "");
-
 	if (pledge("stdio rpath", NULL) == -1)
 		err(1, "pledge");
 
 	flag1 = flag2 = flag3 = 1;
-	compare = strcoll;
+	compare = strcmp;
 	while ((ch = getopt(argc, argv, "123f")) != -1)
 		switch(ch) {
 		case '1':
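
For readers who want to see the effect of the proposed change in isolation, here is a minimal sketch of how the comparison routine would be selected if the diff above were applied. The helper name pick_compare and the compare_fn typedef are hypothetical and do not appear in comm.c; the real code assigns to the compare function pointer directly inside main().

```c
#include <string.h>
#include <strings.h>	/* strcasecmp(3) lives here on some systems */

/* Same shape as the "compare" function pointer declared in comm.c. */
typedef int (*compare_fn)(const char *, const char *);

/*
 * Hypothetical helper, for illustration only: return the bytewise
 * comparison by default, or the case-insensitive one when -f is given.
 */
static compare_fn
pick_compare(int fold_case)
{
	return fold_case ? strcasecmp : strcmp;
}
```

Neither function consults the locale for non-ASCII input here, which is the premise of the proposal and also the reason the -f case remains an open question for multibyte text.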

Re: locale in comm(1)

Ingo Schwarze
Hi,

Jan Stary wrote on Thu, Jan 10, 2019 at 09:30:19PM +0100:

> Does comm(1) need to setlocale(3)?

schwarze@cvs $ grep -A 2 '\<comm\>' ~/TODO-UTF8.txt
comm - character case folding (non-standard '-f' flag)
       [requires wcscasecmp(3)]

So yes, it does need setlocale(LC_CTYPE, ""),
and no, this diff is not OK.

It would merely hide an open TODO item.

Yours,
  Ingo
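
To illustrate what the TODO item quoted above is asking for, here is a rough sketch of a case-folding comparison built on wcscasecmp(3). It is not the OpenBSD implementation: the helper name mb_casecmp, the SKETCH_MAXLEN limit, and the fallback to strcmp(3) on invalid input are all assumptions made for the example. It only folds non-ASCII characters if the program has called setlocale(LC_CTYPE, "") beforehand, which is why that call would have to stay.

```c
#define _POSIX_C_SOURCE 200809L	/* wcscasecmp(3) is POSIX.1-2008 */
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

#define SKETCH_MAXLEN 1024	/* hypothetical limit, not comm(1)'s MAXLINELEN */

/*
 * Hypothetical helper: compare two multibyte strings ignoring case,
 * according to the LC_CTYPE locale currently in effect.  If either
 * string is invalid in the current encoding or does not fit in the
 * buffer, fall back to a bytewise comparison.
 */
static int
mb_casecmp(const char *s1, const char *s2)
{
	wchar_t w1[SKETCH_MAXLEN], w2[SKETCH_MAXLEN];
	size_t n1, n2;

	/* mbstowcs(3) returns (size_t)-1 on an invalid sequence. */
	n1 = mbstowcs(w1, s1, SKETCH_MAXLEN);
	n2 = mbstowcs(w2, s2, SKETCH_MAXLEN);
	if (n1 == (size_t)-1 || n2 == (size_t)-1 ||
	    n1 == SKETCH_MAXLEN || n2 == SKETCH_MAXLEN)
		return strcmp(s1, s2);

	return wcscasecmp(w1, w2);
}
```

A function with this signature could be assigned to the existing compare pointer in the -f case. Whether falling back to strcmp(3) is the right behaviour for invalid input is a design question the thread does not settle.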