Clarify that drand48 returns values not including 1.0. Index: src/lib/libc/stdlib/rand48.3 =================================================================== RCS file: /cvs/src/lib/libc/stdlib/rand48.3,v retrieving revision 1.20 diff -u -r1.20 rand48.3 --- src/lib/libc/stdlib/rand48.3 10 Nov 2015 23:48:18 -0000 1.20 +++ src/lib/libc/stdlib/rand48.3 20 Dec 2019 00:08:24 -0000 @@ -101,7 +101,8 @@ return values of type double. The full 48 bits of r(n+1) are loaded into the mantissa of the returned value, with the exponent set -such that the values produced lie in the interval [0.0, 1.0]. +such that the values produced lie in the interval (0.0, 1.0] (which +excludes 1.0 and includes 0.0). .Pp .Fn lrand48 and |
On Thu, Dec 19, 2019 at 04:19:36PM -0800, [hidden email] wrote:
> > Clarify that drand48 returns values not including 1.0. > > Index: src/lib/libc/stdlib/rand48.3 > =================================================================== > RCS file: /cvs/src/lib/libc/stdlib/rand48.3,v > retrieving revision 1.20 > diff -u -r1.20 rand48.3 > --- src/lib/libc/stdlib/rand48.3 10 Nov 2015 23:48:18 -0000 1.20 > +++ src/lib/libc/stdlib/rand48.3 20 Dec 2019 00:08:24 -0000 > @@ -101,7 +101,8 @@ > return values of type double. > The full 48 bits of r(n+1) are > loaded into the mantissa of the returned value, with the exponent set > -such that the values produced lie in the interval [0.0, 1.0]. > +such that the values produced lie in the interval (0.0, 1.0] (which > +excludes 1.0 and includes 0.0). > .Pp > .Fn lrand48 > and ...except that your patch says the reverse. Joerg |
In reply to this post by j-62
[hidden email] wrote:
> Clarify that drand48 returns values not including 1.0. > > Index: src/lib/libc/stdlib/rand48.3 > =================================================================== > RCS file: /cvs/src/lib/libc/stdlib/rand48.3,v > retrieving revision 1.20 > diff -u -r1.20 rand48.3 > --- src/lib/libc/stdlib/rand48.3 10 Nov 2015 23:48:18 -0000 1.20 > +++ src/lib/libc/stdlib/rand48.3 20 Dec 2019 00:08:24 -0000 > @@ -101,7 +101,8 @@ > return values of type double. > The full 48 bits of r(n+1) are > loaded into the mantissa of the returned value, with the exponent set > -such that the values produced lie in the interval [0.0, 1.0]. > +such that the values produced lie in the interval (0.0, 1.0] (which > +excludes 1.0 and includes 0.0). > .Pp > .Fn lrand48 > and It's a mathematical notation that anyone using this page should understand because it comes with the territory. Once you are touching doubles, soon you'll be looking at functions in libm proper, where this type of notation is even more common. Will you then propose to change all those as well? I think understanding the landscape's notation is a requirement, and we don't need to say things a 2nd time in baby talk. |
On Thu, Dec 19, 2019 at 10:05 PM Theo de Raadt <[hidden email]> wrote:
> It's a mathematical notation that anyone using this page should > understand because it comes with the territory. > [snip] > > I think understanding the landscape's notation is a requirement, and we > don't need to say things a 2nd time in baby talk. I agree it doesn't need to be repeated, but I think there's value in explicitly showing whether an interval is open or closed. Though, in this case, the interval would be correctly expressed as [0.0, 1.0) or [0.0, 1.0[ rather than how j's diff does it. I attached a diff which I feel concisely does this. I elected to not change the latter two of the three intervals in the man page, since they already included -1 in their upper bound. But I also have that as an option, via largediff.txt |
Not to appeal to majority, but to compare and contrast...
FreeBSD, NetBSD, POSIX, and Solaris all use the correct (or the more explicit) interval notation for [0.0, 1.0) in drand48.3 https://www.freebsd.org/cgi/man.cgi?query=drand48&apropos=0&sektion=3&manpath=FreeBSD+12.1-RELEASE+and+Ports&arch=default&format=html https://netbsd.gw.com/cgi-bin/man-cgi?drand48+3 https://pubs.opengroup.org/onlinepubs/9699919799/functions/drand48.html https://docs.oracle.com/cd/E88353_01/html/E37843/drand48-3c.html https://docs.oracle.com/cd/E26505_01/html/816-5168/drand48-3c.html (though, Solaris's latter two of three intervals are wrong, unless they really do mean their upper bound is inclusive) On Thu, Dec 19, 2019 at 10:48 PM Andras Farkas <[hidden email]> wrote: > > On Thu, Dec 19, 2019 at 10:05 PM Theo de Raadt <[hidden email]> wrote: > > It's a mathematical notation that anyone using this page should > > understand because it comes with the territory. > > [snip] > > > > I think understanding the landscape's notation is a requirement, and we > > don't need to say things a 2nd time in baby talk. > I agree it doesn't need to be repeated, but I think there's value in > explicitly showing whether an interval is open or closed. > Though, in this case, the interval would be correctly expressed as > [0.0, 1.0) > or > [0.0, 1.0[ > rather than how j's diff does it. > > I attached a diff which I feel concisely does this. I elected to not > change the latter two of the three intervals in the man page, since > they already included -1 in their upper bound. But I also have that > as an option, via largediff.txt |
Yes, it should be fixed. I'll let the math nerds crawl through the
tree looking for additional errors. My objection stands, that we should not dumb this down. Andras Farkas <[hidden email]> wrote: > Not to appeal to majority, but to compare and contrast... > FreeBSD, NetBSD, POSIX, and Solaris all use the correct (or the more > explicit) interval notation for [0.0, 1.0) in drand48.3 > https://www.freebsd.org/cgi/man.cgi?query=drand48&apropos=0&sektion=3&manpath=FreeBSD+12.1-RELEASE+and+Ports&arch=default&format=html > https://netbsd.gw.com/cgi-bin/man-cgi?drand48+3 > https://pubs.opengroup.org/onlinepubs/9699919799/functions/drand48.html > https://docs.oracle.com/cd/E88353_01/html/E37843/drand48-3c.html > https://docs.oracle.com/cd/E26505_01/html/816-5168/drand48-3c.html > (though, Solaris's latter two of three intervals are wrong, unless > they really do mean their upper bound is inclusive) > > On Thu, Dec 19, 2019 at 10:48 PM Andras Farkas > <[hidden email]> wrote: > > > > On Thu, Dec 19, 2019 at 10:05 PM Theo de Raadt <[hidden email]> wrote: > > > It's a mathematical notation that anyone using this page should > > > understand because it comes with the territory. > > > [snip] > > > > > > I think understanding the landscape's notation is a requirement, and we > > > don't need to say things a 2nd time in baby talk. > > I agree it doesn't need to be repeated, but I think there's value in > > explicitly showing whether an interval is open or closed. > > Though, in this case, the interval would be correctly expressed as > > [0.0, 1.0) > > or > > [0.0, 1.0[ > > rather than how j's diff does it. > > > > I attached a diff which I feel concisely does this. I elected to not > > change the latter two of the three intervals in the man page, since > > they already included -1 in their upper bound. But I also have that > > as an option, via largediff.txt |
OK ok ok. I admit and agree my original patch was flawed. Dumbing down
need not be done just for my benefit. --J On 2019-12-19 21:34, Theo de Raadt wrote: > Yes, it should be fixed. I'll let the math nerds crawl through the > tree looking for additional errors. > > My objection stands, that we should not dumb this down. > > Andras Farkas <[hidden email]> wrote: > >> Not to appeal to majority, but to compare and contrast... >> FreeBSD, NetBSD, POSIX, and Solaris all use the correct (or the more >> explicit) interval notation for [0.0, 1.0) in drand48.3 >> https://www.freebsd.org/cgi/man.cgi?query=drand48&apropos=0&sektion=3&manpath=FreeBSD+12.1-RELEASE+and+Ports&arch=default&format=html >> https://netbsd.gw.com/cgi-bin/man-cgi?drand48+3 >> https://pubs.opengroup.org/onlinepubs/9699919799/functions/drand48.html >> https://docs.oracle.com/cd/E88353_01/html/E37843/drand48-3c.html >> https://docs.oracle.com/cd/E26505_01/html/816-5168/drand48-3c.html >> (though, Solaris's latter two of three intervals are wrong, unless >> they really do mean their upper bound is inclusive) >> >> On Thu, Dec 19, 2019 at 10:48 PM Andras Farkas >> <[hidden email]> wrote: >> > >> > On Thu, Dec 19, 2019 at 10:05 PM Theo de Raadt <[hidden email]> wrote: >> > > It's a mathematical notation that anyone using this page should >> > > understand because it comes with the territory. >> > > [snip] >> > > >> > > I think understanding the landscape's notation is a requirement, and we >> > > don't need to say things a 2nd time in baby talk. >> > I agree it doesn't need to be repeated, but I think there's value in >> > explicitly showing whether an interval is open or closed. >> > Though, in this case, the interval would be correctly expressed as >> > [0.0, 1.0) >> > or >> > [0.0, 1.0[ >> > rather than how j's diff does it. >> > >> > I attached a diff which I feel concisely does this. I elected to not >> > change the latter two of the three intervals in the man page, since >> > they already included -1 in their upper bound. But I also have that >> > as an option, via largediff.txt |
In reply to this post by j-62
For completeness:
Index: src/lib/libc/stdlib/rand48.3 =================================================================== RCS file: /cvs/src/lib/libc/stdlib/rand48.3,v retrieving revision 1.20 diff -u -r1.20 rand48.3 --- src/lib/libc/stdlib/rand48.3 10 Nov 2015 23:48:18 -0000 1.20 +++ src/lib/libc/stdlib/rand48.3 20 Dec 2019 18:07:57 -0000 @@ -101,7 +101,7 @@ return values of type double. The full 48 bits of r(n+1) are loaded into the mantissa of the returned value, with the exponent set -such that the values produced lie in the interval [0.0, 1.0]. +such that the values produced lie in the interval [0.0, 1.0). .Pp .Fn lrand48 and |
On Fri, Dec 20, 2019 at 10:09:46AM -0800, [hidden email] wrote:
> For completeness: thanks, committed |
In reply to this post by j-62
[hidden email] wrote:
> > Clarify that drand48 returns values not including 1.0. It's not clear from the documentation whether drand48 can generate a denormal number. If it can't, you can exclude 0.0 because it's a denormal ;-) -- Alex |
Hi,
Alexander Nasonov wrote on Fri, Dec 20, 2019 at 08:33:40PM +0000: > [hidden email] wrote: >> Clarify that drand48 returns values not including 1.0. > It's not clear from the documentation whether drand48 can generate > a denormal number. If it can't, you can exclude 0.0 because it's > a denormal ;-) Looking at our code in lib/libc/stdlib/drand48.c, i conclude that drand48(3) does return 0.0 with a probability of 2^-48. More generally, the function returns a uniform distribution of numbers from the set {2^-48 * n | n integer and 0 <= n < 2^48}. Talking about loading bits into the mantissa and adjusting the exponent feels mildly confusing, given that the distribution is simply uniform over a fixed finite set. I'm not sending a patch because i never looked at how floating point representation works internally, so i would likely only make matters worse. Yours, Ingo |
Ingo Schwarze wrote:
> Looking at our code in lib/libc/stdlib/drand48.c, i conclude that > drand48(3) does return 0.0 with a probability of 2^-48. I looked at the code too and I have some comments. > More generally, the function returns a uniform distribution of > numbers from the set {2^-48 * n | n integer and 0 <= n < 2^48}. You don't need three ldexp calls to compose 2^-48 * n: uint64_t n = (uint64_t)xseed[2] << 32 | xseed[1] << 16 | xseed[0]; return ldexp((double)n, -48); > Talking about loading bits into the mantissa and adjusting the > exponent feels mildly confusing, given that the distribution is > simply uniform over a fixed finite set. I'm not sending a patch > because i never looked at how floating point representation works > internally, so i would likely only make matters worse. Not sure if you wanted to patch the man page or the code. If the latter, take a binary representation of 1.0 and replace 52 least significant bits with (pseudo)random bits then subtract 1.0. double d = 1.0; uint64_t u = 0; memcpy(&u, &d, sizeof(d)); u |= n << 4; /* scale 48 bits to 52 bits, n is from the previous snippet */ memcpy(&d, &u, sizeof(u)); return d - 1.0; It will only work with IEEE 754 compliant doubles, though. -- Alex |
In reply to this post by Ingo Schwarze
Ingo Schwarze <[hidden email]> wrote:
> Alexander Nasonov wrote on Fri, Dec 20, 2019 at 08:33:40PM +0000: > > [hidden email] wrote: > > >> Clarify that drand48 returns values not including 1.0. > > > It's not clear from the documentation whether drand48 can generate > > a denormal number. If it can't, you can exclude 0.0 because it's > > a denormal ;-) > > Looking at our code in lib/libc/stdlib/drand48.c, i conclude that > drand48(3) does return 0.0 with a probability of 2^-48. > > More generally, the function returns a uniform distribution of > numbers from the set {2^-48 * n | n integer and 0 <= n < 2^48}. > > Talking about loading bits into the mantissa and adjusting the > exponent feels mildly confusing, given that the distribution is > simply uniform over a fixed finite set. I'm not sending a patch > because i never looked at how floating point representation works > internally, so i would likely only make matters worse. In which mode, the deterministic mode or the actual random mode? :) |
Hi Theo,
Theo de Raadt wrote on Sat, Dec 21, 2019 at 04:29:24PM -0700: > Ingo Schwarze <[hidden email]> wrote: >> Alexander Nasonov wrote on Fri, Dec 20, 2019 at 08:33:40PM +0000: >>> [hidden email] wrote: >>>> Clarify that drand48 returns values not including 1.0. >>> It's not clear from the documentation whether drand48 can generate >>> a denormal number. If it can't, you can exclude 0.0 because it's >>> a denormal ;-) >> Looking at our code in lib/libc/stdlib/drand48.c, i conclude that >> drand48(3) does return 0.0 with a probability of 2^-48. >> >> More generally, the function returns a uniform distribution of >> numbers from the set {2^-48 * n | n integer and 0 <= n < 2^48}. >> >> Talking about loading bits into the mantissa and adjusting the >> exponent feels mildly confusing, given that the distribution is >> simply uniform over a fixed finite set. I'm not sending a patch >> because i never looked at how floating point representation works >> internally, so i would likely only make matters worse. > In which mode, the deterministic mode or the actual random mode? :) You are right, it's thrilling to talk about probability distributions in deterministic mode! Here we go, in deterministic mode, the distribution is as follows: * p=1 for the deterministic value requested by the user * p=0 for any other value The integral of this probability measure over the whole sample space, as expected, is 1. Yours, ;-) P.S. Please, everybody refrain from sending a patch adding the above information to the manual page. Describing the distribution in actual random mode seems sufficient to me. |
In reply to this post by Alexander Nasonov
Alexander Nasonov <[hidden email]> wrote:
> Ingo Schwarze wrote: > > Looking at our code in lib/libc/stdlib/drand48.c, i conclude that > > drand48(3) does return 0.0 with a probability of 2^-48. > > I looked at the code too and I have some comments. > > > More generally, the function returns a uniform distribution of > > numbers from the set {2^-48 * n | n integer and 0 <= n < 2^48}. > > You don't need three ldexp calls to compose 2^-48 * n: > > uint64_t n = (uint64_t)xseed[2] << 32 | xseed[1] << 16 | xseed[0]; > return ldexp((double)n, -48); xseed? Yikes. That version of the code is not reached in OpenBSD unless __rand48_deterministic is set, which means srand48_deterministic or seed48_deterministic or lcong48_deterministic were called... We've seen noone going out of their way, so why touch that code... |
Theo de Raadt wrote:
> Alexander Nasonov <[hidden email]> wrote: > > > Ingo Schwarze wrote: > > > Looking at our code in lib/libc/stdlib/drand48.c, i conclude that > > > drand48(3) does return 0.0 with a probability of 2^-48. > > > > I looked at the code too and I have some comments. > > > > > More generally, the function returns a uniform distribution of > > > numbers from the set {2^-48 * n | n integer and 0 <= n < 2^48}. > > > > You don't need three ldexp calls to compose 2^-48 * n: > > > > uint64_t n = (uint64_t)xseed[2] << 32 | xseed[1] << 16 | xseed[0]; > > return ldexp((double)n, -48); > > xseed? s/xseed/rseed/g -- Alex |
In reply to this post by Alexander Nasonov
On Sat, Dec 21, 2019 at 11:29:20PM +0000, Alexander Nasonov wrote:
> Ingo Schwarze wrote: > > Looking at our code in lib/libc/stdlib/drand48.c, i conclude that > > drand48(3) does return 0.0 with a probability of 2^-48. > > I looked at the code too and I have some comments. > > > More generally, the function returns a uniform distribution of > > numbers from the set {2^-48 * n | n integer and 0 <= n < 2^48}. > > You don't need three ldexp calls to compose 2^-48 * n: > > uint64_t n = (uint64_t)xseed[2] << 32 | xseed[1] << 16 | xseed[0]; > return ldexp((double)n, -48); > > > Talking about loading bits into the mantissa and adjusting the > > exponent feels mildly confusing, given that the distribution is > > simply uniform over a fixed finite set. I'm not sending a patch > > because i never looked at how floating point representation works > > internally, so i would likely only make matters worse. > > Not sure if you wanted to patch the man page or the code. > > If the latter, take a binary representation of 1.0 and replace 52 > least significant bits with (pseudo)random bits then subtract 1.0. > > double d = 1.0; > uint64_t u = 0; > memcpy(&u, &d, sizeof(d)); > u |= n << 4; /* scale 48 bits to 52 bits, n is from the previous snippet */ > memcpy(&d, &u, sizeof(u)); > return d - 1.0; > > It will only work with IEEE 754 compliant doubles, though. A very clear way to generate an equidistributed floating point number in the range [0.0, 1.0) given a number n in the range [0, 2^48-1] is simply: return n * 0x1.0p-48 Requiring only C99 and <stdint.h>. > > -- > Alex -- / Raimo Niskanen, Erlang/OTP, Ericsson AB |
Free forum by Nabble | Edit this page |