

Clarify that drand48 returns values not including 1.0.
Index: src/lib/libc/stdlib/rand48.3
===================================================================
RCS file: /cvs/src/lib/libc/stdlib/rand48.3,v
retrieving revision 1.20
diff u r1.20 rand48.3
 src/lib/libc/stdlib/rand48.3 10 Nov 2015 23:48:18 0000 1.20
+++ src/lib/libc/stdlib/rand48.3 20 Dec 2019 00:08:24 0000
@@ 101,7 +101,8 @@
return values of type double.
The full 48 bits of r(n+1) are
loaded into the mantissa of the returned value, with the exponent set
such that the values produced lie in the interval [0.0, 1.0].
+such that the values produced lie in the interval (0.0, 1.0] (which
+excludes 1.0 and includes 0.0).
.Pp
.Fn lrand48
and


On Thu, Dec 19, 2019 at 04:19:36PM 0800, [hidden email] wrote:
>
> Clarify that drand48 returns values not including 1.0.
>
> Index: src/lib/libc/stdlib/rand48.3
> ===================================================================
> RCS file: /cvs/src/lib/libc/stdlib/rand48.3,v
> retrieving revision 1.20
> diff u r1.20 rand48.3
>  src/lib/libc/stdlib/rand48.3 10 Nov 2015 23:48:18 0000 1.20
> +++ src/lib/libc/stdlib/rand48.3 20 Dec 2019 00:08:24 0000
> @@ 101,7 +101,8 @@
> return values of type double.
> The full 48 bits of r(n+1) are
> loaded into the mantissa of the returned value, with the exponent set
> such that the values produced lie in the interval [0.0, 1.0].
> +such that the values produced lie in the interval (0.0, 1.0] (which
> +excludes 1.0 and includes 0.0).
> .Pp
> .Fn lrand48
> and
...except that your patch says the reverse.
Joerg


[hidden email] wrote:
> Clarify that drand48 returns values not including 1.0.
>
> Index: src/lib/libc/stdlib/rand48.3
> ===================================================================
> RCS file: /cvs/src/lib/libc/stdlib/rand48.3,v
> retrieving revision 1.20
> diff u r1.20 rand48.3
>  src/lib/libc/stdlib/rand48.3 10 Nov 2015 23:48:18 0000 1.20
> +++ src/lib/libc/stdlib/rand48.3 20 Dec 2019 00:08:24 0000
> @@ 101,7 +101,8 @@
> return values of type double.
> The full 48 bits of r(n+1) are
> loaded into the mantissa of the returned value, with the exponent set
> such that the values produced lie in the interval [0.0, 1.0].
> +such that the values produced lie in the interval (0.0, 1.0] (which
> +excludes 1.0 and includes 0.0).
> .Pp
> .Fn lrand48
> and
It's a mathematical notation that anyone using this page should
understand because it comes with the territory. Once you are touching
doubles, soon you'll be looking at functions in libm proper, where this
type of notation is even more common. Will you then propose to change
all those as well?
I think understanding the landscape's notation is a requirement, and we
don't need to say things a 2nd time in baby talk.


On Thu, Dec 19, 2019 at 10:05 PM Theo de Raadt < [hidden email]> wrote:
> It's a mathematical notation that anyone using this page should
> understand because it comes with the territory.
> [snip]
>
> I think understanding the landscape's notation is a requirement, and we
> don't need to say things a 2nd time in baby talk.
I agree it doesn't need to be repeated, but I think there's value in
explicitly showing whether an interval is open or closed.
Though, in this case, the interval would be correctly expressed as
[0.0, 1.0)
or
[0.0, 1.0[
rather than how j's diff does it.
I attached a diff which I feel concisely does this. I elected to not
change the latter two of the three intervals in the man page, since
they already included 1 in their upper bound. But I also have that
as an option, via largediff.txt


Not to appeal to majority, but to compare and contrast...
FreeBSD, NetBSD, POSIX, and Solaris all use the correct (or the more
explicit) interval notation for [0.0, 1.0) in drand48.3
https://www.freebsd.org/cgi/man.cgi?query=drand48&apropos=0&sektion=3&manpath=FreeBSD+12.1RELEASE+and+Ports&arch=default&format=htmlhttps://netbsd.gw.com/cgibin/mancgi?drand48+3https://pubs.opengroup.org/onlinepubs/9699919799/functions/drand48.htmlhttps://docs.oracle.com/cd/E88353_01/html/E37843/drand483c.htmlhttps://docs.oracle.com/cd/E26505_01/html/8165168/drand483c.html(though, Solaris's latter two of three intervals are wrong, unless
they really do mean their upper bound is inclusive)
On Thu, Dec 19, 2019 at 10:48 PM Andras Farkas
< [hidden email]> wrote:
>
> On Thu, Dec 19, 2019 at 10:05 PM Theo de Raadt < [hidden email]> wrote:
> > It's a mathematical notation that anyone using this page should
> > understand because it comes with the territory.
> > [snip]
> >
> > I think understanding the landscape's notation is a requirement, and we
> > don't need to say things a 2nd time in baby talk.
> I agree it doesn't need to be repeated, but I think there's value in
> explicitly showing whether an interval is open or closed.
> Though, in this case, the interval would be correctly expressed as
> [0.0, 1.0)
> or
> [0.0, 1.0[
> rather than how j's diff does it.
>
> I attached a diff which I feel concisely does this. I elected to not
> change the latter two of the three intervals in the man page, since
> they already included 1 in their upper bound. But I also have that
> as an option, via largediff.txt


Yes, it should be fixed. I'll let the math nerds crawl through the
tree looking for additional errors.
My objection stands, that we should not dumb this down.
Andras Farkas < [hidden email]> wrote:
> Not to appeal to majority, but to compare and contrast...
> FreeBSD, NetBSD, POSIX, and Solaris all use the correct (or the more
> explicit) interval notation for [0.0, 1.0) in drand48.3
> https://www.freebsd.org/cgi/man.cgi?query=drand48&apropos=0&sektion=3&manpath=FreeBSD+12.1RELEASE+and+Ports&arch=default&format=html> https://netbsd.gw.com/cgibin/mancgi?drand48+3> https://pubs.opengroup.org/onlinepubs/9699919799/functions/drand48.html> https://docs.oracle.com/cd/E88353_01/html/E37843/drand483c.html> https://docs.oracle.com/cd/E26505_01/html/8165168/drand483c.html> (though, Solaris's latter two of three intervals are wrong, unless
> they really do mean their upper bound is inclusive)
>
> On Thu, Dec 19, 2019 at 10:48 PM Andras Farkas
> < [hidden email]> wrote:
> >
> > On Thu, Dec 19, 2019 at 10:05 PM Theo de Raadt < [hidden email]> wrote:
> > > It's a mathematical notation that anyone using this page should
> > > understand because it comes with the territory.
> > > [snip]
> > >
> > > I think understanding the landscape's notation is a requirement, and we
> > > don't need to say things a 2nd time in baby talk.
> > I agree it doesn't need to be repeated, but I think there's value in
> > explicitly showing whether an interval is open or closed.
> > Though, in this case, the interval would be correctly expressed as
> > [0.0, 1.0)
> > or
> > [0.0, 1.0[
> > rather than how j's diff does it.
> >
> > I attached a diff which I feel concisely does this. I elected to not
> > change the latter two of the three intervals in the man page, since
> > they already included 1 in their upper bound. But I also have that
> > as an option, via largediff.txt


OK ok ok. I admit and agree my original patch was flawed. Dumbing down
need
not be done just for my benefit.
J
On 20191219 21:34, Theo de Raadt wrote:
> Yes, it should be fixed. I'll let the math nerds crawl through the
> tree looking for additional errors.
>
> My objection stands, that we should not dumb this down.
>
> Andras Farkas < [hidden email]> wrote:
>
>> Not to appeal to majority, but to compare and contrast...
>> FreeBSD, NetBSD, POSIX, and Solaris all use the correct (or the more
>> explicit) interval notation for [0.0, 1.0) in drand48.3
>> https://www.freebsd.org/cgi/man.cgi?query=drand48&apropos=0&sektion=3&manpath=FreeBSD+12.1RELEASE+and+Ports&arch=default&format=html>> https://netbsd.gw.com/cgibin/mancgi?drand48+3>> https://pubs.opengroup.org/onlinepubs/9699919799/functions/drand48.html>> https://docs.oracle.com/cd/E88353_01/html/E37843/drand483c.html>> https://docs.oracle.com/cd/E26505_01/html/8165168/drand483c.html>> (though, Solaris's latter two of three intervals are wrong, unless
>> they really do mean their upper bound is inclusive)
>>
>> On Thu, Dec 19, 2019 at 10:48 PM Andras Farkas
>> < [hidden email]> wrote:
>> >
>> > On Thu, Dec 19, 2019 at 10:05 PM Theo de Raadt < [hidden email]> wrote:
>> > > It's a mathematical notation that anyone using this page should
>> > > understand because it comes with the territory.
>> > > [snip]
>> > >
>> > > I think understanding the landscape's notation is a requirement, and we
>> > > don't need to say things a 2nd time in baby talk.
>> > I agree it doesn't need to be repeated, but I think there's value in
>> > explicitly showing whether an interval is open or closed.
>> > Though, in this case, the interval would be correctly expressed as
>> > [0.0, 1.0)
>> > or
>> > [0.0, 1.0[
>> > rather than how j's diff does it.
>> >
>> > I attached a diff which I feel concisely does this. I elected to not
>> > change the latter two of the three intervals in the man page, since
>> > they already included 1 in their upper bound. But I also have that
>> > as an option, via largediff.txt


For completeness:
Index: src/lib/libc/stdlib/rand48.3
===================================================================
RCS file: /cvs/src/lib/libc/stdlib/rand48.3,v
retrieving revision 1.20
diff u r1.20 rand48.3
 src/lib/libc/stdlib/rand48.3 10 Nov 2015 23:48:18 0000 1.20
+++ src/lib/libc/stdlib/rand48.3 20 Dec 2019 18:07:57 0000
@@ 101,7 +101,7 @@
return values of type double.
The full 48 bits of r(n+1) are
loaded into the mantissa of the returned value, with the exponent set
such that the values produced lie in the interval [0.0, 1.0].
+such that the values produced lie in the interval [0.0, 1.0).
.Pp
.Fn lrand48
and


On Fri, Dec 20, 2019 at 10:09:46AM 0800, [hidden email] wrote:
> For completeness:
thanks, committed


[hidden email] wrote:
>
> Clarify that drand48 returns values not including 1.0.
It's not clear from the documentation whether drand48 can generate
a denormal number. If it can't, you can exclude 0.0 because it's
a denormal ;)

Alex


Hi,
Alexander Nasonov wrote on Fri, Dec 20, 2019 at 08:33:40PM +0000:
> [hidden email] wrote:
>> Clarify that drand48 returns values not including 1.0.
> It's not clear from the documentation whether drand48 can generate
> a denormal number. If it can't, you can exclude 0.0 because it's
> a denormal ;)
Looking at our code in lib/libc/stdlib/drand48.c, i conclude that
drand48(3) does return 0.0 with a probability of 2^48.
More generally, the function returns a uniform distribution of
numbers from the set {2^48 * n  n integer and 0 <= n < 2^48}.
Talking about loading bits into the mantissa and adjusting the
exponent feels mildly confusing, given that the distribution is
simply uniform over a fixed finite set. I'm not sending a patch
because i never looked at how floating point representation works
internally, so i would likely only make matters worse.
Yours,
Ingo


Ingo Schwarze wrote:
> Looking at our code in lib/libc/stdlib/drand48.c, i conclude that
> drand48(3) does return 0.0 with a probability of 2^48.
I looked at the code too and I have some comments.
> More generally, the function returns a uniform distribution of
> numbers from the set {2^48 * n  n integer and 0 <= n < 2^48}.
You don't need three ldexp calls to compose 2^48 * n:
uint64_t n = (uint64_t)xseed[2] << 32  xseed[1] << 16  xseed[0];
return ldexp((double)n, 48);
> Talking about loading bits into the mantissa and adjusting the
> exponent feels mildly confusing, given that the distribution is
> simply uniform over a fixed finite set. I'm not sending a patch
> because i never looked at how floating point representation works
> internally, so i would likely only make matters worse.
Not sure if you wanted to patch the man page or the code.
If the latter, take a binary representation of 1.0 and replace 52
least significant bits with (pseudo)random bits then subtract 1.0.
double d = 1.0;
uint64_t u = 0;
memcpy(&u, &d, sizeof(d));
u = n << 4; /* scale 48 bits to 52 bits, n is from the previous snippet */
memcpy(&d, &u, sizeof(u));
return d  1.0;
It will only work with IEEE 754 compliant doubles, though.

Alex


Ingo Schwarze < [hidden email]> wrote:
> Alexander Nasonov wrote on Fri, Dec 20, 2019 at 08:33:40PM +0000:
> > [hidden email] wrote:
>
> >> Clarify that drand48 returns values not including 1.0.
>
> > It's not clear from the documentation whether drand48 can generate
> > a denormal number. If it can't, you can exclude 0.0 because it's
> > a denormal ;)
>
> Looking at our code in lib/libc/stdlib/drand48.c, i conclude that
> drand48(3) does return 0.0 with a probability of 2^48.
>
> More generally, the function returns a uniform distribution of
> numbers from the set {2^48 * n  n integer and 0 <= n < 2^48}.
>
> Talking about loading bits into the mantissa and adjusting the
> exponent feels mildly confusing, given that the distribution is
> simply uniform over a fixed finite set. I'm not sending a patch
> because i never looked at how floating point representation works
> internally, so i would likely only make matters worse.
In which mode, the deterministic mode or the actual random mode? :)


Hi Theo,
Theo de Raadt wrote on Sat, Dec 21, 2019 at 04:29:24PM 0700:
> Ingo Schwarze < [hidden email]> wrote:
>> Alexander Nasonov wrote on Fri, Dec 20, 2019 at 08:33:40PM +0000:
>>> [hidden email] wrote:
>>>> Clarify that drand48 returns values not including 1.0.
>>> It's not clear from the documentation whether drand48 can generate
>>> a denormal number. If it can't, you can exclude 0.0 because it's
>>> a denormal ;)
>> Looking at our code in lib/libc/stdlib/drand48.c, i conclude that
>> drand48(3) does return 0.0 with a probability of 2^48.
>>
>> More generally, the function returns a uniform distribution of
>> numbers from the set {2^48 * n  n integer and 0 <= n < 2^48}.
>>
>> Talking about loading bits into the mantissa and adjusting the
>> exponent feels mildly confusing, given that the distribution is
>> simply uniform over a fixed finite set. I'm not sending a patch
>> because i never looked at how floating point representation works
>> internally, so i would likely only make matters worse.
> In which mode, the deterministic mode or the actual random mode? :)
You are right, it's thrilling to talk about probability distributions in
deterministic mode! Here we go, in deterministic mode, the distribution
is as follows:
* p=1 for the deterministic value requested by the user
* p=0 for any other value
The integral of this probability measure over the whole sample space,
as expected, is 1.
Yours,
;)
P.S.
Please, everybody refrain from sending a patch adding the above
information to the manual page. Describing the distribution in
actual random mode seems sufficient to me.


Alexander Nasonov < [hidden email]> wrote:
> Ingo Schwarze wrote:
> > Looking at our code in lib/libc/stdlib/drand48.c, i conclude that
> > drand48(3) does return 0.0 with a probability of 2^48.
>
> I looked at the code too and I have some comments.
>
> > More generally, the function returns a uniform distribution of
> > numbers from the set {2^48 * n  n integer and 0 <= n < 2^48}.
>
> You don't need three ldexp calls to compose 2^48 * n:
>
> uint64_t n = (uint64_t)xseed[2] << 32  xseed[1] << 16  xseed[0];
> return ldexp((double)n, 48);
xseed?
Yikes.
That version of the code is not reached in OpenBSD unless
__rand48_deterministic is set, which means srand48_deterministic or
seed48_deterministic or lcong48_deterministic were called...
We've seen noone going out of their way, so why touch that code...


Theo de Raadt wrote:
> Alexander Nasonov < [hidden email]> wrote:
>
> > Ingo Schwarze wrote:
> > > Looking at our code in lib/libc/stdlib/drand48.c, i conclude that
> > > drand48(3) does return 0.0 with a probability of 2^48.
> >
> > I looked at the code too and I have some comments.
> >
> > > More generally, the function returns a uniform distribution of
> > > numbers from the set {2^48 * n  n integer and 0 <= n < 2^48}.
> >
> > You don't need three ldexp calls to compose 2^48 * n:
> >
> > uint64_t n = (uint64_t)xseed[2] << 32  xseed[1] << 16  xseed[0];
> > return ldexp((double)n, 48);
>
> xseed?
s/xseed/rseed/g

Alex


On Sat, Dec 21, 2019 at 11:29:20PM +0000, Alexander Nasonov wrote:
> Ingo Schwarze wrote:
> > Looking at our code in lib/libc/stdlib/drand48.c, i conclude that
> > drand48(3) does return 0.0 with a probability of 2^48.
>
> I looked at the code too and I have some comments.
>
> > More generally, the function returns a uniform distribution of
> > numbers from the set {2^48 * n  n integer and 0 <= n < 2^48}.
>
> You don't need three ldexp calls to compose 2^48 * n:
>
> uint64_t n = (uint64_t)xseed[2] << 32  xseed[1] << 16  xseed[0];
> return ldexp((double)n, 48);
>
> > Talking about loading bits into the mantissa and adjusting the
> > exponent feels mildly confusing, given that the distribution is
> > simply uniform over a fixed finite set. I'm not sending a patch
> > because i never looked at how floating point representation works
> > internally, so i would likely only make matters worse.
>
> Not sure if you wanted to patch the man page or the code.
>
> If the latter, take a binary representation of 1.0 and replace 52
> least significant bits with (pseudo)random bits then subtract 1.0.
>
> double d = 1.0;
> uint64_t u = 0;
> memcpy(&u, &d, sizeof(d));
> u = n << 4; /* scale 48 bits to 52 bits, n is from the previous snippet */
> memcpy(&d, &u, sizeof(u));
> return d  1.0;
>
> It will only work with IEEE 754 compliant doubles, though.
A very clear way to generate an equidistributed floating point number in
the range [0.0, 1.0) given a number n in the range [0, 2^481] is simply:
return n * 0x1.0p48
Requiring only C99 and <stdint.h>.
>
> 
> Alex

/ Raimo Niskanen, Erlang/OTP, Ericsson AB

