heads-up: DESCR limitation

heads-up: DESCR limitation

Marc Espie-2
There was some internal discussion, and Ingo Schwarze made a compelling
argument for this, as our resident UTF-8 expert: pkg_* tools are mostly
used while setting up machines, and in some cases the tty might not be
100% utf-8 compliant.

So, effective immediately, all of pkg_* printouts will replace weird
stuff (including utf-8 sequences) with ?...
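As a rough illustration of what that replacement amounts to, here is a minimal Python sketch. It is not the pkg_* code (those tools are written in Perl). The function name `sanitize`, the set of characters let through (printable ASCII, tab, newline), and the choice of one `?` per offending byte are all assumptions made for the example.

```python
def sanitize(raw: bytes) -> str:
    """Replace every byte that is not printable ASCII, tab, or newline with '?'.

    Assumption: one '?' per offending byte, so a two-byte UTF-8 sequence
    becomes '??'. The real tools may collapse sequences differently.
    """
    out = []
    for b in raw:
        if b in (0x09, 0x0A) or 0x20 <= b <= 0x7E:
            out.append(chr(b))
        else:
            out.append("?")
    return "".join(out)


if __name__ == "__main__":
    # 'cafe' with an e-acute encoded as UTF-8 (0xC3 0xA9), then an ANSI escape.
    print(sanitize(b"caf\xc3\xa9 \x1b[31mred\n"), end="")
```

Working on bytes rather than decoded text means the result does not depend on the locale or on whether the input is valid UTF-8, which is the situation described above: a tty that may not handle UTF-8 at all.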

To avoid unpleasantness, pkg_create WILL downright refuse diacritics
and similar stuff in DESCR.

I've run a full bulk. This affected about 10 ports.

To prove Ingo right, my current setup got somewhat wacky while editing
those DESCR. It's probably not even 100% consistent (I got different behavior
between home and work...)

Maybe, just maybe, in 5 years, we'll get good enough utf-8 support across
the board *including network* *including !openbsd behavior* *including
editor defaults* that I can actually turn on /u in the safe sanitizer.

As of now, losing accents in those 10 ports seems like a small price to
pay for increased resilience...

Re: heads-up: DESCR limitation

Chris Bennett-3
I just wanted to be clear on something. Basically almost everything I do
is in UTF-8 encoding.
Is it OK to submit in UTF-8 encoding or should I use C locale?
No problem for me to skip accents or such either way.

Thanks,
Chris Bennett

Re: heads-up: DESCR limitation

Marc Espie-2
On Tue, Feb 27, 2018 at 07:29:32PM -0800, Chris Bennett wrote:
> I just wanted to be clear on something. Basically almost everything I do
> is in UTF-8 encoding.
> Is it OK to submit in UTF-8 encoding or should I use C locale?
> No problem for me to skip accents or such either way.
>
> Thanks,
> Chris Bennett

It's just the DESCR that needs to be sanitized.  And pkg_create will now tell
you when a DESCR contains such characters.
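For anyone who wants to check a DESCR before running pkg_create, a small Python sketch along these lines could flag the offending lines. It is only an approximation: the helper names `non_ascii_lines` and `check_descr` are hypothetical, and the rule it applies (any byte above 0x7F is a problem) is an assumption, not the exact test pkg_create performs.

```python
def non_ascii_lines(data: bytes):
    """Return (line_number, text) for every line containing a non-ASCII byte."""
    hits = []
    for n, line in enumerate(data.splitlines(), start=1):
        # Assumption: anything outside 7-bit ASCII counts as a problem.
        if any(b > 0x7F for b in line):
            hits.append((n, line.decode("utf-8", errors="replace")))
    return hits


def check_descr(path: str) -> bool:
    """Print each offending line of the file at `path`; return True if clean."""
    with open(path, "rb") as f:
        problems = non_ascii_lines(f.read())
    for n, text in problems:
        print(f"{path}:{n}: non-ASCII character: {text}")
    return not problems
```

Calling `check_descr("pkg/DESCR")` from a port directory would then list the lines to fix, assuming the usual `pkg/DESCR` layout.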

Obviously, ports contents themselves will do utf-8 most of the time.

It's a completely different story.