locale thread support

classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

locale thread support

Marc Espie-2
I've had to neuter some setlocale() in mlt, since our code is definitely NOT
thread-safe.  No biggie, since we do not have support for LC_NUMERIC right now.

I think we might want the thread-specific functions, namely stuff like
strtod_l, or sprintf_l  and friends.

Even if they do not do anything specific right now with a locale object,
at least they would prevent re-entrency issues.

Note that the current issue is a time-bomb, because it is a race, so it does only
manifest itself as  a double free in freegl...


uselocale is fine, but it is not on windows, so highly portable code tends to
prefer strtod_l...

Reply | Threaded
Open this post in threaded view
|

Re: locale thread support

Joerg Sonnenberger-2
On Fri, May 22, 2020 at 01:50:44PM +0200, Marc Espie wrote:
> uselocale is fine, but it is not on windows, so highly portable code tends to
> prefer strtod_l...

The problem with uselocale is two fold and why I explicitly decided to not
implement it in NetBSD:
(1) It can come at a significant performance cost as all locale-aware
operations need to access thread-specific memory. While that is moderate
fast on x86, other architectures are not so nice. Even on x86, it can
easily result a factor of two or three for things like the ctype.h
macros.
(2) It can easily POLA for code that different threads have differnt
locales. There is a lot of existing code written that don't know how to
deal with.

Add that the explicit *_l functions are the preferred mechanism for
implementing the primitives in higher languages and it becomes a
no-brainer.

Joerg

Reply | Threaded
Open this post in threaded view
|

Re: locale thread support

Mark Kettenis
In reply to this post by Marc Espie-2
> Date: Fri, 22 May 2020 13:50:44 +0200
> From: Marc Espie <[hidden email]>
>
> I've had to neuter some setlocale() in mlt, since our code is
> definitely NOT thread-safe.  No biggie, since we do not have support
> for LC_NUMERIC right now.

Hmm, setlocale() is explicitly mentionded as not thread-safe in POSIX.

> I think we might want the thread-specific functions, namely stuff like
> strtod_l, or sprintf_l  and friends.

That is <xlocale.h>.  which isn't actually standardized.  But there is
a de-facto standard set by Apple.  Most OSes have an implementation by
now and our libc++ needs it (and uses a stub implementation for now).
The Darwin and/or FreeBSD man pages are probably the best
documentation for these interfaces.

> Even if they do not do anything specific right now with a locale object,
> at least they would prevent re-entrency issues.

Yes, implementing these shouldn't be hard.  We already have wctype_l()
and iswctype_l() (which are standardized) which do care about the
locale.  But everything else probably doesn't so these functions can
be simple wrappers around the non-_l functions that simply ignore the
locale.  It's just work but really doesn't require sepcific skills.
I'm sure Ingo will be able to give some constructive feedback once a
diff exists.

> Note that the current issue is a time-bomb, because it is a race, so
> it does only manifest itself as a double free in freegl...
>
> uselocale is fine, but it is not on windows, so highly portable code
> tends to prefer strtod_l...

Which is funny since strtod_l is even less standard than uselocale() ;).

Reply | Threaded
Open this post in threaded view
|

Re: locale thread support

Marc Espie-2
On Fri, May 22, 2020 at 02:31:38PM +0200, Mark Kettenis wrote:
> > Date: Fri, 22 May 2020 13:50:44 +0200
> > From: Marc Espie <[hidden email]>
> >
> > I've had to neuter some setlocale() in mlt, since our code is
> > definitely NOT thread-safe.  No biggie, since we do not have support
> > for LC_NUMERIC right now.
>
> Hmm, setlocale() is explicitly mentionded as not thread-safe in POSIX.

From a security standpoint, is there a "cheap" way to make setlocale abort()
instead of trying to do a double free on  when running in a race condition ?

> > I think we might want the thread-specific functions, namely stuff like
> > strtod_l, or sprintf_l  and friends.
>
> That is <xlocale.h>.  which isn't actually standardized.  But there is
> a de-facto standard set by Apple.  Most OSes have an implementation by
> now and our libc++ needs it (and uses a stub implementation for now).
> The Darwin and/or FreeBSD man pages are probably the best
> documentation for these interfaces.
>
> > Even if they do not do anything specific right now with a locale object,
> > at least they would prevent re-entrency issues.
>
> Yes, implementing these shouldn't be hard.  We already have wctype_l()
> and iswctype_l() (which are standardized) which do care about the
> locale.  But everything else probably doesn't so these functions can
> be simple wrappers around the non-_l functions that simply ignore the
> locale.  It's just work but really doesn't require sepcific skills.
> I'm sure Ingo will be able to give some constructive feedback once a
> diff exists.


> > Note that the current issue is a time-bomb, because it is a race, so
> > it does only manifest itself as a double free in freegl...
> >
> > uselocale is fine, but it is not on windows, so highly portable code
> > tends to prefer strtod_l...
>
> Which is funny since strtod_l is even less standard than uselocale() ;).

Well, I had a look at the current code, they changed it yet again, so it's
probably going to fail in newer, more funny ways.

Reply | Threaded
Open this post in threaded view
|

Re: locale thread support

Todd C. Miller-3
On Fri, 22 May 2020 14:57:11 +0200, Marc Espie wrote:

> From a security standpoint, is there a "cheap" way to make setlocale abort()
> instead of trying to do a double free on  when running in a race condition ?

We could use _THREAD_PRIVATE_MUTEX as we do in other parts of libc.

 - todd

Reply | Threaded
Open this post in threaded view
|

Re: locale thread support

Mark Kettenis
> From: "Todd C. Miller" <[hidden email]>
> Date: Fri, 22 May 2020 07:23:55 -0600
>
> On Fri, 22 May 2020 14:57:11 +0200, Marc Espie wrote:
>
> > From a security standpoint, is there a "cheap" way to make setlocale abort()
> > instead of trying to do a double free on  when running in a race condition ?
>
> We could use _THREAD_PRIVATE_MUTEX as we do in other parts of libc.

That might eliminate two threads racing eachoither in setlocale(), but
it wouldn't stop threads that actually access the locale from
use-after-free type bugs.  Unless you use the lock there as well.  But
that could have a major performance impact.

Reply | Threaded
Open this post in threaded view
|

Re: locale thread support

Marc Espie-2
On Fri, May 22, 2020 at 04:02:31PM +0200, Mark Kettenis wrote:

> > From: "Todd C. Miller" <[hidden email]>
> > Date: Fri, 22 May 2020 07:23:55 -0600
> >
> > On Fri, 22 May 2020 14:57:11 +0200, Marc Espie wrote:
> >
> > > From a security standpoint, is there a "cheap" way to make setlocale abort()
> > > instead of trying to do a double free on  when running in a race condition ?
> >
> > We could use _THREAD_PRIVATE_MUTEX as we do in other parts of libc.
>
> That might eliminate two threads racing eachoither in setlocale(), but
> it wouldn't stop threads that actually access the locale from
> use-after-free type bugs.  Unless you use the lock there as well.  But
> that could have a major performance impact.

Well, I'd be happy to have an abort() in case a race is detected.

As we said, this is 100% unsupported behavior! but it's reasonably hard to detect.

Maybe a simple way to poison setlocale is it's called from two different threads
in a given process ?... I have zero idea.

My point is NOT to make broken code work, but to make it obvious it's broken!
because right now, it's a sporadic race... and even though this is complicated, I
fear it's a security issue waiting to happen.

Reply | Threaded
Open this post in threaded view
|

Re: locale thread support

Marc Espie-2
In reply to this post by Mark Kettenis
On Fri, May 22, 2020 at 02:31:38PM +0200, Mark Kettenis wrote:

> > Date: Fri, 22 May 2020 13:50:44 +0200
> > From: Marc Espie <[hidden email]>
> >
> > I've had to neuter some setlocale() in mlt, since our code is
> > definitely NOT thread-safe.  No biggie, since we do not have support
> > for LC_NUMERIC right now.
>
> Hmm, setlocale() is explicitly mentionded as not thread-safe in POSIX.
>
> > I think we might want the thread-specific functions, namely stuff like
> > strtod_l, or sprintf_l  and friends.
>
> That is <xlocale.h>.  which isn't actually standardized.  But there is
> a de-facto standard set by Apple.  Most OSes have an implementation by
> now and our libc++ needs it (and uses a stub implementation for now).
> The Darwin and/or FreeBSD man pages are probably the best
> documentation for these interfaces.
>
> > Even if they do not do anything specific right now with a locale object,
> > at least they would prevent re-entrency issues.
>
> Yes, implementing these shouldn't be hard.  We already have wctype_l()
> and iswctype_l() (which are standardized) which do care about the
> locale.  But everything else probably doesn't so these functions can
> be simple wrappers around the non-_l functions that simply ignore the
> locale.  It's just work but really doesn't require sepcific skills.
> I'm sure Ingo will be able to give some constructive feedback once a
> diff exists.

Looks like for the header part we can mostly borrow from FreeBSD, and I gather
the libc part shouldn't be that hard (minus locale support of course).

I'd wait to see what Ingo thinks about that.

Reply | Threaded
Open this post in threaded view
|

Re: locale thread support

Ingo Schwarze
Hi,

my impression is there are two totally independent topics in this
thread.

 1. Whether and how to make things safer, ideally terminating the
    program in a controlled manner, if it uses functions that are
    not thread-safe in an invalid manner.  I'm not addressing that
    topic in this mail.

 2. Whether we want additional non-standard xlocale.h functions
    in our libc.

Regarding topic 2, let me say up front that i do not feel strongly
either way.  It seems to me this is mostly a question for porters.
If some functions are widely used - even if non-standard - and help
avoid porting hassle, then sure, let's have them unless they cause
excessive fuss.  But the latter does not seem likely here.

There are a number of functions that seem likely useful to me off
the top of my head, for example: querylocale, nl_langinfo_l,
MB_CUR_MAX_L, wcwidth_l

In general, i must say i like the whole xlocale.h business not all
that much: in a nutshell, it is doubling the number of half the
functions under the sun, which usually indicates poor design in the
first place, and then the vast majority of these additional functions
are almost useless on any operating system and totally useless on
OpenBSD.  Like, i mean, btowc_l(3)?  Or even the isdigit_l(3) which
we already have?  You must be choking, Mr. Chisnall!  I don't think
stuff should be added because it is out there, but only if there is
at least some porting benefit.

Regarding the FreeBSD headers, i like them even less.  There are both
terrible design choices and terrible implementation choices.  Regarding
design, it appears that you *must* #include <xlocale.h> after other
headers - if you include it before, it won't work.  Regarding
implementation, there is quite repulsive macro abuse in xlocale/_ctype.h;
but probably that can be unrolled in LibreSSL style.  Either way, none
of that hinders providing them if doing so yields benefit.

See below for a list of functions that i drafted extremely quickly,
so beware of errors and omissions.  The point isn't to present a
definite plan.  The point is merely to demonstrate the sheer fatness
of the animal and to give a first impression of the degree to which
it stinks and grunts from most of its ends.

If any of this is needed, do porters already know which functions
are in particularly high demand?  Do you have any idea how to find
out which ones are actually useful for porting purposes?

Yours,
  Ingo


Functions we already have marked are with a '*'.

<xlocale/_locale.h>
  useful: querylocale

<xlocale/_langinfo.h>
  useful: nl_langinfo_l*

<xlocale/_stdlib.h>
  useful: MB_CUR_MAX_L
  marginally useful: mbtowc_l mbstowcs_l wctomb_l wcstombs_l
  useless: atof_l atoi_l atol_l atoll_l
           strtod_l strtof_l strtol_l strtold_l strtoll_l strtoul_l strtoull_l
           mblen_l

<xlocale/_wchar.h>
  useful: wcwidth_l wcswidth_l
  marginally useful: fgetwc_l fgetws_l fputwc_l fputws_l getwc_l getwchar_l
                     putwc_l putwchar_l ungetwc_l
                     fwprintf_l fwscanf_l swprintf_l swscanf_l
                     vfwprintf_l vswprintf_l vwprintf_l wprintf_l wscanf_l
                     vfwscanf_l vswscanf_l vwscanf_l
                     mbrlen_l mbrtowc_l mbsinit_l mbsrtowcs_l wcrtomb_l
                     wcsrtombs_l wctob_l mbsnrtowcs_l wcsnrtombs_l
  useless: btowc_l wcsftime_l wcstod_l wcstol_l wcstoul_l
                   wcstof_l wcstold_l wcstoll_l wcstoull_l

<xlocale/_string.h>
  useless: strcasestr_l strcoll_l* strxfrm_l*

<xlocale/_strings.h>
  useless: strcasecmp_l* strncasecmp_l*

<xlocale/_inttypes.h>
  useless: strtoimax_l strtoumax_l wcstoimax_l wcstoumax_l

<xlocale/_monetary.h>
  useless: strfmon_l

<xlocale/_time.h>
  useless: strftime_l* strptime_l

<xlocale/_ctype.h>
  marginally useful: towlower_l* towupper_l* iswctype_l* towctrans_l*
  not sure: nextwctype_l wctype_l wctrans_l* digittoint_l
  useless: isalnum_l* etc.  tolower_l* toupper_l*

include <xlocale/_stdio.h>
  useless: asprintf_l dprintf_l fprintf_l fscanf_l printf_l
           scanf_l snprintf_l sprintf_l sscanf_l
           vfprintf_l vprintf_l vsprintf_l vfscanf_l vscanf_l
           vsnprintf_l vsscanf_l vdprintf_l vasprintf_l

<xlocale/uchar.h>
  not sure: c16rtomb_l c32rtomb_l mbrtoc16_l mbrtoc32_l

Reply | Threaded
Open this post in threaded view
|

Re: locale thread support

Marc Espie-2
On Tue, May 26, 2020 at 12:13:02AM +0200, Ingo Schwarze wrote:

> Hi,
>
> my impression is there are two totally independent topics in this
> thread.
>
>  1. Whether and how to make things safer, ideally terminating the
>     program in a controlled manner, if it uses functions that are
>     not thread-safe in an invalid manner.  I'm not addressing that
>     topic in this mail.
>
>  2. Whether we want additional non-standard xlocale.h functions
>     in our libc.
>
> Regarding topic 2, let me say up front that i do not feel strongly
> either way.  It seems to me this is mostly a question for porters.
> If some functions are widely used - even if non-standard - and help
> avoid porting hassle, then sure, let's have them unless they cause
> excessive fuss.  But the latter does not seem likely here.
>
> There are a number of functions that seem likely useful to me off
> the top of my head, for example: querylocale, nl_langinfo_l,
> MB_CUR_MAX_L, wcwidth_l
>
> In general, i must say i like the whole xlocale.h business not all
> that much: in a nutshell, it is doubling the number of half the
> functions under the sun, which usually indicates poor design in the
> first place, and then the vast majority of these additional functions
> are almost useless on any operating system and totally useless on
> OpenBSD.  Like, i mean, btowc_l(3)?  Or even the isdigit_l(3) which
> we already have?  You must be choking, Mr. Chisnall!  I don't think
> stuff should be added because it is out there, but only if there is
> at least some porting benefit.
>
> Regarding the FreeBSD headers, i like them even less.  There are both
> terrible design choices and terrible implementation choices.  Regarding
> design, it appears that you *must* #include <xlocale.h> after other
> headers - if you include it before, it won't work.  Regarding
> implementation, there is quite repulsive macro abuse in xlocale/_ctype.h;
> but probably that can be unrolled in LibreSSL style.  Either way, none
> of that hinders providing them if doing so yields benefit.
>
> See below for a list of functions that i drafted extremely quickly,
> so beware of errors and omissions.  The point isn't to present a
> definite plan.  The point is merely to demonstrate the sheer fatness
> of the animal and to give a first impression of the degree to which
> it stinks and grunts from most of its ends.
>
> If any of this is needed, do porters already know which functions
> are in particularly high demand?  Do you have any idea how to find
> out which ones are actually useful for porting purposes?

Serendipity

Today, I looked at an older library  for audit purposes.

Library is called udns.

If you look at the code, there's a lot of apparently useless reinvention,
like the code explicitly checks for digits using c >= '0' && c <= '9' patterns
(or isprint for that matter).

Thinking some more, I realized why: this is library code, so it can't assume
it's running under any kind of sensible locale unless it has asserted it
itself.

In that specific context, isdigit_l and friends make a lot of sense.
Either that, or systematic uselocale wrappers for any function that can be
called from outside.

(yes, I now that isdigit/isspace are not a problem on OpenBSD, but they can
be elsewhere!)