Non-English manual pages in ports

Ingo Schwarze
Hi,

while scouring our tree for USE_GROFF, i stumbled over a few cases
of non-English pages and looked around a bit in that respect.
To put it politely, the current situation is only partly coherent.

 1. Many ports install non-English manual pages.
    A few don't, even though upstream provides something.

 2. Many ports install unformatted manual pages,
    a few install preformatted with groff (or jnroff).
    I already fixed one or two where formatting broke the encoding,
    but i suspect that some remain.

 3. Many ports install UTF-8 encoded,
    quite a few install other encodings.
    In most cases, the encoding is not specified anywhere.

 4. Most install to man/language/manN, which is good.
    Some install to man/language_REGION/manN, a few even for
    languages where /language/ also exists.
    Some install to man/language.ENCODING/manN, both for UTF-8
    and for other encodings.
    Very few install to man/language@variant/manN.
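
Since the encoding is usually not declared anywhere, one rough way to find pages that are not UTF-8 is to check whether each file decodes cleanly. The following is a minimal sketch, not an existing ports tool: the function names `is_utf8` and `list_non_utf8` are made up for illustration, and it assumes an iconv(1) binary is available (on OpenBSD that would come from converters/libiconv). It cannot tell which legacy encoding a failing file uses.

```shell
# is_utf8 FILE: succeed if FILE decodes cleanly as UTF-8.
# Pure ASCII files also pass, because ASCII is a subset of UTF-8.
# (Hypothetical helper name, not part of any existing tool.)
is_utf8() {
	iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# list_non_utf8 DIR: print every regular file below DIR that is
# not valid UTF-8.  DIR would typically be a man/language directory.
list_non_utf8() {
	find "$1" -type f | while IFS= read -r f; do
		is_utf8 "$f" || printf '%s\n' "$f"
	done
}
```

For example, `list_non_utf8 /usr/local/man/ru` would list the Russian pages that still need conversion. Compressed or preformatted pages would need extra handling that this sketch does not attempt.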

I don't say this is a very high priority project, also because there
are only about 500 non-English manuals in the ports tree at this
time, but i think it is relatively cheap to improve quality and
consistency a bit by following these rules of thumb (take these as
Requests For Comments, not as Fiats):

 1. If upstream provides non-English manual pages, install them if
    that is possible without jumping through hoops, and unless there
    are specific reasons not to.  "They are outdated" is not a
    good reason to not install them, as long as upstream still
    includes them in the official distribution.

 2. Never install any encoding except UTF-8.
    If upstream provides UTF-8, great.
    Otherwise, set BUILD_DEPENDS = converters/libiconv
    and use iconv(1) in the post-build target.
    (I already discussed this item with a few porters and they agreed.)

 3. If mandoc(1) copes, which you can check in exactly the same
    way as with English manuals, simply install the UTF-8 source
    code to man/language/manN/*.N and do not USE_GROFF.

 4. If mandoc(1) does not cope - which should be very rare by now,
    we have fewer than 130 ports left in the whole tree that still
    USE_GROFF, even for English manuals - the proper order of
    operations is iconv(1) -t UTF-8, then preconv(1), then nroff(1),
    never some other way round.

 5. If possible, install to man/language/manN, without any "_", ".",
    or "@": KISS.  There are rare exceptions, most notably zh_CN
    vs. zh_HK vs. zh_TW.
    Never include the encoding in the path name, and make sure
    the /language/ part never contains "." (a dot).

 6. This is all about special needs.  If your port has special needs
    and you understand them and care, by all means do whatever makes
    sense, don't squeeze it into a scheme that doesn't fit.
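
Rules 2 and 4 can be sketched as two small shell functions. This is only an illustration under stated assumptions: the names `to_utf8` and `format_utf8` are hypothetical, the source encoding has to be known from upstream because it cannot be guessed reliably, and the `preconv -e` and `nroff -mandoc -Tutf8` invocations are one plausible way to spell the pipeline, not necessarily what the ports framework runs.

```shell
# to_utf8 ENCODING SRC DST: convert one manual page source file
# from ENCODING to UTF-8 (rule 2).  Hypothetical helper name.
to_utf8() {
	iconv -f "$1" -t UTF-8 "$2" > "$3"
}

# format_utf8 ENCODING SRC DST: only for the rare pages that
# mandoc(1) cannot handle (rule 4).  The order matters:
# iconv(1) first, then preconv(1), then nroff(1).
format_utf8() {
	iconv -f "$1" -t UTF-8 "$2" |
	    preconv -e UTF-8 |
	    nroff -mandoc -Tutf8 > "$3"
}
```

A call such as `to_utf8 KOI8-R mc.1.in mc.1` would then run from the post-build target. One plausible sanity check afterwards is `mandoc -T lint` on the result, though the procedure is not spelled out here.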

If the above is followed, people can do the following with no
changes to any part of the default configuration:

  $ doas pkg_add mc
  $ export LC_CTYPE=en_US.UTF-8
  $ alias ruman="man -m /usr/local/man/ru"
  $ ruman mc

Any opinions?

If you think the above makes sense, i'm likely to do some of that
myself while continuing work on USE_GROFF, but i don't promise
to check all 500 pages, of course.


Item 1 above might need an explanation, so here you go:

I don't say that translating documentation is always a good idea.
Quite to the contrary, i think that it is not unusual to do more
harm than good.  But the reasons why that is so are purely practical
reasons (mostly excessive workload and translations being poor in
the first place and getting outdated over time); in principle,
having information available in more than one language could make
the world a better place, and tooling should not discourage people
trying to prove that for some projects, it is also feasible in
practice.

Do not try to judge the quality of the translated pages, or whether
they are up to date.  Just install them anyway.  Most port maintainers
won't even know the languages in question, and even if they do,
they should not waste their time assessing the quality for each
update.  Even if they tried, there would be no way to define a
reasonable quality threshold.  For a user having very serious
problems reading English at all, even a badly outdated translation
can help getting started.  For a user who more or less understands
English, even tiny translation errors can make a translated manual
seriously harmful.  Even if a quality standard could be defined,
removing and re-adding the translations for every second update,
following the inevitably oscillating quality, wouldn't make sense.

If people use translated manuals, they are hopefully aware that
they are less exact than the original and more likely to not be up
to date, and deplorably often enough quite badly so.  That's just
common sense.  We always provide rope because that is useful to
some, and if others abuse it to hang themselves (or their friends
or enemies), that is unfortunate, but not an argument against ropes.

Finally, even if a translation is known to be outdated, updating
it is often less work than starting over from scratch, so it is
useful for people to know what exists.  Kicking away outdated pages
doesn't encourage users to update them and send their work upstream.
But OpenBSD wants to encourage users to participate in the free
software life cycle.  It is a developer-oriented system.

Oh, and we are talking about ports here.  Many ports manuals are
low quality anyway, even the English ones.  That is upstream's
responsibility, not the porter's.  We don't kick away English manual
pages either, no matter how much upstream may have neglected
keeping them up to date.

Yours,
  Ingo


P.S.
Do not start sending stuff for /usr/share/man/*/manN/.
We *know* that we don't have the resources to maintain that.

Re: Non-English manual pages in ports

Marc Espie-2
On Mon, May 15, 2017 at 11:38:54PM +0200, Ingo Schwarze wrote:
>  5. If possible, install to man/language/manN, without any "_", ".",
>     or "@": KISS.  There are rare exceptions, most notably zh_CN
>     vs. zh_HK vs. zh_TW.
>     Never include the encoding in the path name, and make sure
>     the /language/ part never contains "." (a dot).

What are the other exceptions ?

> If you think the above makes sense, i'm likely to do some of that
> myself while continuing work on USE_GROFF, but i don't promise
> to check all 500 pages, of course.

Sounds like a good plan.

> I don't say that translating documentation is always a good idea.
> Quite to the contrary, i think that it is not unusual to do more
> harm than good.  But the reasons why that is so are purely practical
> reasons (mostly excessive workload and translations being poor in
> the first place and getting outdated over time); in principle,
> having information available in more than one language could make
> the world a better place, and tooling should not discourage people
> trying to prove that for some projects, it is also feasible in
> practice.

There might also be cases where the reference documentation is not English
because the project is not based in an English-speaking country.

I remember when I started playing with Japanese ports. A big difficulty
was bootstrap, because the code comments were in Japanese, and most decent
documentation was in Japanese as well.  Didn't help that we didn't have
any decent editor handling Japanese at the time -- hence our antiquated
jvim, nor anything that could print Japanese... the size of the fonts was
a problem with printers at the time -- hence kanjips.

The situation is better these days, but there do exist some worthwhile
projects where the reference texts are NOT in English.

Re: Non-English manual pages in ports

Ingo Schwarze
Hi Marc,

Marc Espie wrote on Tue, May 16, 2017 at 10:38:07AM +0200:
> On Mon, May 15, 2017 at 11:38:54PM +0200, Ingo Schwarze wrote:

>>  5. If possible, install to man/language/manN, without any "_", ".",
>>     or "@": KISS.  There are rare exceptions, most notably zh_CN
>>     vs. zh_HK vs. zh_TW.
>>     Never include the encoding in the path name, and make sure
>>     the /language/ part never contains "." (a dot).

> What are the other exceptions ?

I don't think we should try to define an exhaustive list.
A real-world need for a _REGION-coded manual page may come up
in any port at any time and is hard to predict.

That said, the following currently exist:

sysutils/deja-dup/pkg/PLIST:@man man/en_AU/man1/deja-dup-preferences.1
sysutils/deja-dup/pkg/PLIST:@man man/en_AU/man1/deja-dup.1
sysutils/deja-dup/pkg/PLIST:@man man/en_CA/man1/deja-dup-preferences.1
sysutils/deja-dup/pkg/PLIST:@man man/en_CA/man1/deja-dup.1
sysutils/deja-dup/pkg/PLIST:@man man/en_GB/man1/deja-dup-preferences.1
sysutils/deja-dup/pkg/PLIST:@man man/en_GB/man1/deja-dup.1
comms/wammu/pkg/PLIST:@man man/en_GB/man1/wammu-configure.1
comms/wammu/pkg/PLIST:@man man/en_GB/man1/wammu.1
games/wesnoth/pkg/PLIST:@man man/en_GB/man6/wesnoth.6
games/wesnoth/pkg/PLIST:@man man/en_GB/man6/wesnothd.6
sysutils/deja-dup/pkg/PLIST:@man man/fr_CA/man1/deja-dup-preferences.1
sysutils/deja-dup/pkg/PLIST:@man man/fr_CA/man1/deja-dup.1

I think those are silly and should simply be deleted.
I mean, come on, British translations of American manuals?
Or Québécois ones of French manuals?
That does not make any sense, really.


Then there are a number of /pt_BR/ in addition to /pt/.
That looks suspicious, but i don't speak Portuguese,
so it may or may not make sense, i don't really know.

That's it, at this time, as far as i'm aware.
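
A listing like the one above can be reproduced with a short search over the packing lists. This is a sketch under assumptions: the function name `list_region_mandirs` is made up, the ports tree location is passed as an argument, and the pattern relies on the `@man man/...` line format visible in the listing.

```shell
# list_region_mandirs PORTSDIR: print PLIST lines that install
# manual pages into a man/language_REGION/ directory.
# /dev/null is passed to grep so that file names are always shown.
list_region_mandirs() {
	find "$1" -type f -name 'PLIST*' -exec \
	    grep -E '^@man man/[a-z]+_[A-Z]+/' /dev/null {} +
}
```

Running `list_region_mandirs /usr/ports` prints lines in `file:@man path` form, with each file name prefixed by the directory given.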


[...]
> There might also be cases where the reference documentation is not
> English because the project is not based in an English-speaking
> country.

Good point.  The hanterm(1) port that i fixed a few days ago is such
a case.  The documentation is exclusively provided in Korean.

Yours,
  Ingo

Re: Non-English manual pages in ports

Marc Espie-2
On Tue, May 16, 2017 at 05:39:54PM +0200, Ingo Schwarze wrote:

> Hi Marc,
>
> Marc Espie wrote on Tue, May 16, 2017 at 10:38:07AM +0200:
> > On Mon, May 15, 2017 at 11:38:54PM +0200, Ingo Schwarze wrote:
>
> >>  5. If possible, install to man/language/manN, without any "_", ".",
> >>     or "@": KISS.  There are rare exceptions, most notably zh_CN
> >>     vs. zh_HK vs. zh_TW.
> >>     Never include the encoding in the path name, and make sure
> >>     the /language/ part never contains "." (a dot).
>
> > What are the other exceptions ?
>
> I don't think we should try to define an exhaustive list.

But having guidelines is good.  I understand why zh_* makes sense.

> A real-world need for a _REGION-coded manual page may come up
> in any port at any time and is hard to predict.
>
> That said, the following currently exist:
>
> sysutils/deja-dup/pkg/PLIST:@man man/en_AU/man1/deja-dup-preferences.1
> sysutils/deja-dup/pkg/PLIST:@man man/en_AU/man1/deja-dup.1
> sysutils/deja-dup/pkg/PLIST:@man man/en_CA/man1/deja-dup-preferences.1
> sysutils/deja-dup/pkg/PLIST:@man man/en_CA/man1/deja-dup.1
> sysutils/deja-dup/pkg/PLIST:@man man/en_GB/man1/deja-dup-preferences.1
> sysutils/deja-dup/pkg/PLIST:@man man/en_GB/man1/deja-dup.1
> comms/wammu/pkg/PLIST:@man man/en_GB/man1/wammu-configure.1
> comms/wammu/pkg/PLIST:@man man/en_GB/man1/wammu.1
> games/wesnoth/pkg/PLIST:@man man/en_GB/man6/wesnoth.6
> games/wesnoth/pkg/PLIST:@man man/en_GB/man6/wesnothd.6
> sysutils/deja-dup/pkg/PLIST:@man man/fr_CA/man1/deja-dup-preferences.1
> sysutils/deja-dup/pkg/PLIST:@man man/fr_CA/man1/deja-dup.1
>
> I think those are silly and should simply be deleted.
> I mean, come on, British translations of American manuals?
> Or Québécois ones of French manuals?
> That does not make any sense, really.
Yes, I definitely agree with that. And they may offend some people, so
it's really a good idea to delete these :D

> Then there are a number of /pt_BR/ in addition to /pt/.
> That looks suspicious, but i don't speak Portuguese,
> so it may or may not make sense, i don't really know.

I don't think BR makes sense, but I'll let Portuguese/Brazilians chime in.

(just for other people joining the discussion, we're only talking *manpages*.
For other localisation aspects, *of course* it does make sense to have
en_GB, en_AU, pt_BR !)

Re: Non-English manual pages in ports

Stuart Henderson
On 2017/05/16 18:01, Marc Espie wrote:
> > I think those are silly and should simply be deleted.
> > I mean, come on, British translations of American manuals?
> > Or Québécois ones of French manuals?
> > That does not make any sense, really.
> Yes, I definitely agree with that. And they may offend some people, so
> it's really a good idea to delete these :D

heh.

> > Then there are a number of /pt_BR/ in addition to /pt/.
> > That looks suspicious, but i don't speak Portuguese,
> > so it may or may not make sense, i don't really know.
>
> I don't think BR makes sense, but I'll let Portuguese/Brazilians chime in.

I suspect you'll find that pt_BR is often better maintained than pt.

> (just for other people joining the discussion, we're only talking *manpages*.
> For other localisation aspects, *of course* it does make sense to have
> en_GB, en_AU, pt_BR !)

Phew, asterisk-sounds/*/en_GB is safe :-)

Re: Non-English manual pages in ports

Diogo Galvao
In reply to this post by Marc Espie-2
On Tue, May 16, 2017 at 1:01 PM, Marc Espie <[hidden email]> wrote:
>> Then there are a number of /pt_BR/ in addition to /pt/.
>> That looks suspicious, but i don't speak Portuguese,
>> so it may or may not make sense, i don't really know.
>
> I don't think BR makes sense, but I'll let Portuguese/Brazilians chime in.
>

Brazilian user chiming in: we understand Portuguese from Portugal just
fine but some differences are quite evident to us, so we'd definitely
prefer a local version when available.

I grepped ports for man/pt_BR and found only three that also have
man/pt, if I did it correctly:

games/wesnoth: pt.po for manpages is much more complete and accurate
than pt_BR.po. Even if pt_BR inherits translations from pt.po when no
substitute is provided, some old translations seem just plain wrong.
If it were only for the language differences, I'd prefer the BR
variant as it sounds more familiar.

sysutils/deja-dup: there's a perl script to generate manpages from
--help options, so I couldn't really compare them without installing
the port. But judging from the .po files for the whole program, pt_BR
seems much more complete to a point where pt alone wouldn't suffice,
even for Portuguese users.

x11/xfce4/terminal: there are some different words and spellings but
in general everything is the same, except that only pt_BR mentions
--color-table as a general option, as does man/C, so it's slightly
more in sync if we're nitpicking.

I'm not sure these examples help to decide on the proposed rule 5 for
pt_BR, but I'd say that if upstream took the time to provide language
variants of manpages, then users of those languages may benefit from
them.
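
The exact commands behind the grep mentioned above are not given. A minimal sketch of one way to find packing lists that install both directories could look like this; the function name `ports_with_pt_and_pt_br` is hypothetical and the ports tree location is passed as an argument.

```shell
# ports_with_pt_and_pt_br PORTSDIR: print each PLIST that installs
# manual pages into both man/pt/ and man/pt_BR/.
ports_with_pt_and_pt_br() {
	find "$1" -type f -name 'PLIST*' | while IFS= read -r plist; do
		if grep -q '^@man man/pt/' "$plist" &&
		    grep -q '^@man man/pt_BR/' "$plist"; then
			printf '%s\n' "$plist"
		fi
	done
}
```

Ports that split their packing list across several PLIST files would be missed by this per-file check.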

Re: Non-English manual pages in ports

Theo de Raadt-2
> On Tue, May 16, 2017 at 1:01 PM, Marc Espie <[hidden email]> wrote:
> >> Then there are a number of /pt_BR/ in addition to /pt/.
> >> That looks suspicious, but i don't speak Portuguese,
> >> so it may or may not make sense, i don't really know.
> >
> > I don't think BR makes sense, but I'll let Portuguese/Brazilians chime in.
> >
>
> Brazilian user chiming in: we understand Portuguese from Portugal just
> fine but some differences are quite evident to us, so we'd definitely
> prefer a local version when available.

colour/color gray/grey.  I can handle stuff like that, you can handle
it also.

The problem is internationalization efforts are far from free.  It is
a lot of work by unqualified people, leading to development feedback
loops which are long and error prone, resulting in other far more
important development efforts not receiving their due.

There is no "one framework" that can pick the right files.  Even the
upstreams don't have a single selection roadmap, and force each
downstream to make some decisions.  Major downstreams are far better
funded, ie. receiving actual revenue.  Ingo's proposals are trying to
help you by simplifying things so they are manageable rather than
OpenBSD choosing to go English-only.

How many of you realize we are only 10 minutes from having the Balkan
community pipe up?

Don't know the story?  That's why the OpenBSD web pages are only in
English now; they used to have copies in all other languages, but those
got removed.  Why?  Language groups fought email battles, and we of the
English majority _DOING ALL THE WORK_ said the minority has no
right to assign work.  The other languages were removed.

Actual participation isn't built upon complaints.  Careful what you
ask for.  Your requests may smell like demands.  Demands aren't well
received.

Re: Non-English manual pages in ports

Marc Espie-2
In reply to this post by Diogo Galvao
On Tue, May 16, 2017 at 11:42:32PM -0300, Diogo Galvao wrote:

> On Tue, May 16, 2017 at 1:01 PM, Marc Espie <[hidden email]> wrote:
> >> Then there are a number of /pt_BR/ in addition to /pt/.
> >> That looks suspicious, but i don't speak Portuguese,
> >> so it may or may not make sense, i don't really know.
> >
> > I don't think BR makes sense, but I'll let Portuguese/Brazilians chime in.
> >
>
> Brazilian user chiming in: we understand Portuguese from Portugal just
> fine but some differences are quite evident to us, so we'd definitely
> prefer a local version when available.
>
> I grepped ports for man/pt_BR and found only three that also have
> man/pt, if I did it correctly:

> games/wesnoth: pt.po for manpages is much more complete and accurate
> than pt_BR.po. Even if pt_BR inherits translations from pt.po when no
> substitute is provided, some old translations seem just plain wrong.
> If it were only for the language differences, I'd prefer the BR
> variant as it sounds more familiar.
Not relevant: a .po file is not a manpage.  Like I said.

> sysutils/deja-dup: there's a perl script to generate manpages from
> --help options, so I couldn't really compare them without installing
> the port. But judging from the .po files for the whole program, pt_BR
> seems much more complete to a point where pt alone wouldn't suffice,
> even for Portuguese users.
>
> x11/xfce4/terminal: there are some different words and spellings but
> in general everything is the same, except that only pt_BR mentions
> --color-table as a general option, as does man/C, so it's slightly
> more in sync if we're nitpicking.
>
> I'm not sure these examples help to decide on the proposed rule 5 for
> pt_BR, but I'd say that if upstream took the time to provide language
> variants of manpages, then users of those languages may benefit from
> them.

It corresponds to what sthen said, that it's likely _BR is better
maintained, so we should prefer it.

Re: Non-English manual pages in ports

Ingo Schwarze
In reply to this post by Diogo Galvao
Hi Diogo,

Diogo Galvao wrote on Tue, May 16, 2017 at 11:42:32PM -0300:
> On Tue, May 16, 2017 at 1:01 PM, Marc Espie <[hidden email]> wrote:
>> Ingo Schwarze wrote:

>>> Then there are a number of /pt_BR/ in addition to /pt/.
>>> That looks suspicious, but i don't speak Portuguese,
>>> so it may or may not make sense, i don't really know.

>> I don't think BR makes sense,
>> but I'll let Portuguese/Brazilians chime in.

> Brazilian user chiming in:

Thanks for explaining the situation.

Given what was said so far, i think the best practice for /pt/ and
/pt_BR/ is to simply install whatever upstream provides, under the
name upstream provides: Deleting /pt_BR/ would be a bad idea because
it's often better, deleting /pt/ would be a bad idea because that's
our standard name (and possibly used as fallback in some situations),
and renaming stuff would just cause confusion.

Besides, we can't really expect port maintainers to judge whether
/pt/ or /pt_BR/ is better maintained for any given port, and the
relative quality may also change over time.

Note that this also means less work for maintainers.  Just
installing what upstream provides is usually simplest; removing
stuff often requires some additional "post-install: rm ..."
target in the Makefile.

So, that leaves us with the following rules of thumb in this respect:

 * install to /language/ if possible
 * never use an ".encoding" suffix
 * try hard to avoid "@variant" suffixes
 * usually avoid "_REGION" suffixes
 * exception: use /zh_CN/ and /zh_TW/, not /zh/
 * exception: tolerate both /pt/ and /pt_BR/
 * these are rules of thumb, not set in stone;
   if your port has special needs and you care,
   do whatever makes most sense for that port
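
The rules of thumb above can be turned into a small classifier for the directory name between man/ and /manN. This is only a sketch: the function name `check_mandir` and the wording of its verdicts are made up for illustration, and per the last rule a port with special needs may legitimately ignore the result.

```shell
# check_mandir NAME: classify a language directory name such as
# "ru", "pt_BR" or "ja.UTF-8" according to the rules of thumb.
# The order of the patterns matters: the first match wins.
check_mandir() {
	case $1 in
	*.*)			echo "reject: encoding suffix" ;;
	*@*)			echo "avoid: variant suffix" ;;
	zh_CN|zh_TW|pt_BR)	echo "ok: tolerated exception" ;;
	zh)			echo "avoid: use zh_CN or zh_TW" ;;
	*_*)			echo "avoid: region suffix" ;;
	*)			echo "ok" ;;
	esac
}
```

For instance, `check_mandir en_GB` reports a region suffix to avoid, while `check_mandir pt_BR` reports a tolerated exception.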

Yours,
  Ingo

P.S.
I don't see a need to complain about complaining.
A question was asked and we got an answer containing useful
information, thanks for that.  Yes, the answer also contained a few
sentences where the question was misunderstood, but no harm done.  :)