size of size_t (diff angle)

size of size_t (diff angle)

zeurkous
Haai,

The definition of size_t keeps biting me.

Some background: in nnx, me's been using the equiv of caddr_t for
counts. This works well; yet, while writing against existing code that
uses size_t, an issue has surfaced.

First of all, let us reflect upon the definition of size_t in C99.

>         size_t
> which is the unsigned integer type of the result of the sizeof
> operator;

That's not very specific. It kind-of implies that SIZE_MAX (defined
later in the standard) is the largest possible offset, but not
necessarily the largest possible address. This reeks of i86 real mode
semantics, obsolete (for general-purpose machines) already when the
PDP-11 was new.

POSIX is even less helpful:

> size_t
>         Used for sizes of objects.

(Let me note in passing that medisapproves of the significant overlap
between C99 and POSIX, and the shameless disregard, in both, of the
byte-oriented nature of UNIX and C).

So, as meknows of no better place to ask (take that as a compliment
folks!), mehas the following question *cue drums*:

Is SIZE_MAX guaranteed to *not* be greater than the highest address?

Me'd be grateful for any insight anyone can offer.
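
As a minimal sketch of the question, the two limits can at least be compared on a given implementation. This assumes uintptr_t exists (it is optional in C99) and uses UINTPTR_MAX only as a stand-in for "the highest address", since C99 defines no constant for that. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Returns 1 if SIZE_MAX is larger than UINTPTR_MAX, 0 otherwise.
 * UINTPTR_MAX bounds the integer form of a pointer; it is only a
 * stand-in for "the highest address". */
int size_max_exceeds_uintptr_max(void)
{
    return SIZE_MAX > UINTPTR_MAX;
}

/* Prints both limits so they can be compared on a given platform. */
void print_limits(void)
{
    /* C99 7.18.3 only promises that SIZE_MAX is at least 65535. */
    assert(SIZE_MAX >= 65535u);
    printf("SIZE_MAX    = %ju\n", (uintmax_t)SIZE_MAX);
    printf("UINTPTR_MAX = %ju\n", (uintmax_t)UINTPTR_MAX);
}
```

A result of 0 on one platform says nothing about any other; the comparison only reports what this particular implementation chose.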

Thanks in advance,

Baai,

        --zeurkous.

--
Friggin' Machines!

Re: size of size_t (diff angle)

Anders Andersson
On Tue, Feb 25, 2020 at 12:14 PM <[hidden email]> wrote:

>
> Haai,
>
> The definition of size_t keeps biting me.
>
> Some background: in nnx, me's been using the equiv of caddr_t for
> counts. This works well; yet, while writing against existing code that
> uses size_t, an issue has surfaced.
>
> First of all, let us reflect upon the definition of size_t in C99.
>
> >         size_t
> > which is the unsigned integer type of the result of the sizeof
> > operator;
>
> That's not very specific. It kind-of implies that SIZE_MAX (defined
> later in the standard) is the largest possible offset, but not
> necessarily the largest possible address. This reeks of i86 real mode
> semantics, obsolete (for general-purpose machines) already when the
> PDP-11 was new.

I think it's pretty clear, size_t is for the size of objects, not for
offsets or pointers. The C standard frowns upon mixing up pointers and
integers, to much grief from low-level developers.



> Is SIZE_MAX guaranteed to *not* be greater than the highest address?

I'm almost certain that C99 offers no such guarantees, since a pointer
to a float does not have to be the same size as a pointer to int, for
example. Maybe if you're being a little more specific. There are some
exceptions for void * and char *.
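
A small sketch of that point, with hypothetical function names: the first check reflects a guarantee in C99 6.2.5 (void * and pointers to character types share representation and alignment), while the second merely reports what the current implementation happens to do.

```c
#include <assert.h>
#include <stddef.h>

/* void * and char * share representation and alignment requirements,
 * so their sizes must match on every conforming implementation. */
int void_and_char_ptr_same_size(void)
{
    return sizeof(void *) == sizeof(char *);
}

/* No such rule links float * and int *; this reports the local choice. */
int float_and_int_ptr_same_size(void)
{
    return sizeof(float *) == sizeof(int *);
}

/* Only the guaranteed property is asserted. */
void check_guaranteed_pointer_sizes(void)
{
    assert(void_and_char_ptr_same_size());
}
```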

In fact, the standard only *recommends* that implementations keep
SIZE_MAX as small as possible but not smaller. Since it is only a
recommendation, it can be inferred that the standard acknowledges that
an implementation with SIZE_MAX > highest address is valid.

"The types used for size_t and ptrdiff_t should not have an integer
conversion rank greater than that of signed long int unless the
implementation supports objects large enough to make this necessary."

Or my interpretation: "Just because there is now a new and fancy
64-bit long long in C99 doesn't mean that you should make size_t a
long long just because you can, because it's pointless if your
compiler/target only has a 32-bit address space."
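
Integer conversion rank cannot be queried from within a program, so the following sketch uses the maxima as a rough proxy for the recommendation quoted above. The function names are hypothetical, and a result of 0 would not by itself mean the implementation is non-conforming.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Proxy check: does every size_t value fit in unsigned long? */
int size_t_fits_unsigned_long(void)
{
    return SIZE_MAX <= ULONG_MAX;
}

/* Same proxy for ptrdiff_t against signed long int. */
int ptrdiff_t_fits_long(void)
{
    return PTRDIFF_MAX <= LONG_MAX && PTRDIFF_MIN >= LONG_MIN;
}

/* The only hard numbers C99 7.18.3 gives are these minimum magnitudes. */
void check_c99_minimums(void)
{
    assert(SIZE_MAX >= 65535u);
    assert(PTRDIFF_MAX >= 65535);
}
```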

Re: size of size_t (diff angle)

Marc Espie-2
In reply to this post by zeurkous
On Tue, Feb 25, 2020 at 08:56:06AM +0100, [hidden email] wrote:

> Haai,
>
> The definition of size_t keeps biting me.
>
> Some background: in nnx, me's been using the equiv of caddr_t for
> counts. This works well; yet, while writing against existing code that
> uses size_t, an issue has surfaced.
>
> First of all, let us reflect upon the definition of size_t in C99.
>
> >         size_t
> > which is the unsigned integer type of the result of the sizeof
> > operator;
>
> That's not very specific. It kind-of implies that SIZE_MAX (defined
> later in the standard) is the largest possible offset, but not
> necessarily the largest possible address. This reeks of i86 real mode
> semantics, obsolete (for general-purpose machines) already when the
> PDP-11 was new.
>
> POSIX is even less helpful:
>
> > size_t
> >         Used for sizes of objects.
>
> (Let me note in passing that medisapproves of the significant overlap
> between C99 and POSIX, and the shameless disregard, in both, of the
> byte-oriented nature of UNIX and C).

You're looking at the wrong type. size_t is very good for what it does.

Try uintptr_t
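
For illustration, a sketch of what uintptr_t does promise, assuming the implementation provides it (the type is optional in C99). Helper names are made up. Only the round trip through void * is guaranteed; the numeric value itself is implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Converts a pointer to its integer form. The numeric value is
 * implementation-defined. */
uintptr_t addr_of(void *p)
{
    return (uintptr_t)p;
}

/* Converts back. C99 7.18.1.4 guarantees the result compares equal to
 * the original pointer to void. */
void *ptr_of(uintptr_t a)
{
    return (void *)a;
}

/* Round-trips the address of a local object. */
void check_round_trip(void)
{
    int x = 0;
    void *p = &x;

    assert(ptr_of(addr_of(p)) == p);
}
```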

RE: size of size_t (diff angle)

zeurkous
In reply to this post by Anders Andersson
Haai,

"Anders Andersson" <[hidden email]> wrote:

> On Tue, Feb 25, 2020 at 12:14 PM <[hidden email]> wrote:
>>
>> First of all, let us reflect upon the definition of size_t in C99.
>>
>> > size_t
>> > which is the unsigned integer type of the result of the sizeof
>> > operator;
>>
>> That's not very specific. It kind-of implies that SIZE_MAX (defined
>> later in the standard) is the largest possible offset, but not
>> necessarily the largest possible address. This reeks of i86 real mode
>> semantics, obsolete (for general-purpose machines) already when the
>> PDP-11 was new.
>
> I think it's pretty clear, size_t is for the size of objects, not for
> offsets or pointers.

At least in C, the differences between sizes, offsets, and addresses are
semantic in nature. Even B, as meunderstands, treats memory as a flat
array (just of words instead of bytes).

> The C standard frowns upon mixing up pointers and
> integers, to much grief from low-level developers.

Menoticed that too. Given the nature and background of C, it's pretty
weird (not to say inappropriate) for 'the standard'(tm) to do so.

>> Is SIZE_MAX guaranteed to *not* be greater than the highest address?
>
> I'm almost certain that C99 offers no such guarantees, since a pointer
> to a float does not have to be the same size as a pointer to int, for
> example. Maybe if you're being a little more specific. There are some
> exceptions for void * and char *.
>
> In fact, the standard only *recommends* that implementations keep
> SIZE_MAX as small as possible but not smaller. Since it is only a
> recommendation, it can be inferred that the standard acknowledges that
> an implementation with SIZE_MAX > highest address is valid.

Mesupposes that's tolerable, as long as sizeof() indeed never returns a
value greater than the largest address... mehas observed no such
guarantee either, however.

> "The types used for size_t and ptrdiff_t should not have an integer
> conversion rank greater than that of signed long int unless the
> implementation supports objects large enough to make this necessary."

Yes, menoticed. Strange for the standard to suddenly group sizes and
addresses (address "diffs" even!) together. Quite schizophrenic if you
ask me...

> Or my interpretation: "Just because there is now a new and fancy
> 64-bit long long in C99 doesn't mean that you should make size_t a
> long long just because you can, because it's pointless if your
> compiler/target only has a 32-bit address space."

The issue mebumped into was that SIZE_MAX being an arbitrary value, with
no specified relation to the highest address, makes some compat code
pretty messy. Hence mehoped that there was some kind of guarantee; megot
the answer mefeared.

Thanks, your answer was quite helpful.

Baai,

        --zeurkous.

--
Friggin' Machines!

RE: size of size_t (diff angle)

zeurkous
In reply to this post by Marc Espie-2
Haai,

"Marc Espie" <[hidden email]> wrote:
> On Tue, Feb 25, 2020 at 08:56:06AM +0100, [hidden email] wrote:
>
> You're looking at the wrong type. size_t is very good for what it does.

Yes; meproblem is with the 'what it does' part.

> Try uintptr_t

Are you proposing a change to struct iovec?

        --zeurkous.

--
Friggin' Machines!

Re: size of size_t (diff angle)

Marc Espie-2
On Wed, Feb 26, 2020 at 11:01:56PM +0100, [hidden email] wrote:
> Haai,
>
> "Marc Espie" <[hidden email]> wrote:
> > On Tue, Feb 25, 2020 at 08:56:06AM +0100, [hidden email] wrote:
> >
> > You're looking at the wrong type. size_t is very good for what it does.
>
> Yes; meproblem is with the 'what it does' part.
It represents memory sizes. It works on anything with a sane
memory model.

> > Try uintptr_t
>
> Are you proposing a change to struct iovec?

Why should I ? readv works with sizes, so size_t is adequate.

You were mentioning caddr_t earlier. intptr_t and uintptr_t are
the adequate types for working with addresses. size_t is the adequate
family for working with sizes.
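
To make the split concrete: struct iovec in <sys/uio.h> pairs an address (iov_base, a void *) with a size (iov_len, a size_t). The sketch below, with a hypothetical helper iov_total(), sums the sizes and refuses to wrap around. It assumes a POSIX system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Sums the iov_len fields of an iovec array.
 * Returns 0 and stores the sum in *total, or -1 if the sum would not
 * fit in a size_t. */
int iov_total(const struct iovec *iov, int iovcnt, size_t *total)
{
    size_t sum = 0;
    int i;

    assert(iov != NULL && total != NULL && iovcnt >= 0);
    for (i = 0; i < iovcnt; i++) {
        if (iov[i].iov_len > SIZE_MAX - sum)
            return -1;          /* unsigned wraparound would occur */
        sum += iov[i].iov_len;
    }
    *total = sum;
    return 0;
}

/* Returns 1 if both the normal and the overflow path behave as described. */
int check_iov_total(void)
{
    char a[3], b[4];
    struct iovec v[2] = {
        { .iov_base = a, .iov_len = sizeof a },
        { .iov_base = b, .iov_len = sizeof b }
    };
    size_t total = 0;

    if (iov_total(v, 2, &total) != 0 || total != 7)
        return 0;
    v[0].iov_len = SIZE_MAX;    /* force the overflow path */
    return iov_total(v, 2, &total) == -1;
}
```

POSIX readv() itself is stricter: it may fail with EINVAL when the sum of the iov_len values overflows an ssize_t.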

POSIX kind-of implies readv, which means that both realms tend to
mesh.

If you're on something where they don't, you're fucked.

Good luck.

What are you doing asking questions on an OpenBSD list, btw ?

RE: size of size_t (diff angle)

zeurkous
Haai,

"Marc Espie" <[hidden email]> wrote:
>>> You're looking at the wrong type. size_t is very good for what it does.
>>
>> Yes; meproblem is with the 'what it does' part.
>
> It represents memory sizes. It works on anything with a sane
> memory model.

The way meunderstands it, it's just an offset, plain and simple. Which
on a sane machine is indeed of the same type as an address[0].

Unfortunately, C99 does not appear to reflect that. Now, to what degree
(if!) we should respect C99, or take it much seriously at all, is
another matter...

>>> Try uintptr_t
>>
>> Are you proposing a change to struct iovec?
>
> Why should I ? readv works with sizes, so size_t is adequate.

Yes, why should you? That was me implied question. You told me to use
uintptr_t, but that will hardly solve things on the exact problem mewas
working on (medidn't specify what it was, and you didn't ask), unless we
change struct iovec (cue an 'over my dead body' response from theo, and
with respect to compat, he'd be damn right).

> You were mentioning caddr_t earlier. intptr_t and uintptr_t are
> the adequate types for working with addresses. size_t is the adequate
> family for working with sizes.

Me's found that such statements emerge from a shallow understanding of
the nature of C. C doesn't know sizes: indeed, it barely knows indices
and offsets. If sizeof() would have been defined to return the index
of the final byte, instead of the count of bytes, then the C99
definition for size_t would've been pre-empted.
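
A rough sketch of the count-versus-index distinction, with hypothetical helper names. A count is always one more than the index of the final byte, so the largest index (SIZE_MAX) has no representable count in the same type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Index of the final byte of an object of the given size (size > 0). */
size_t last_byte_index(size_t size)
{
    assert(size > 0);
    return size - 1;
}

/* Inverse mapping: byte count from the index of the final byte.
 * Returns -1 when the count (last + 1) is not representable in size_t,
 * which happens exactly when last == SIZE_MAX. */
int count_from_last_index(size_t last, size_t *count)
{
    if (last == SIZE_MAX)
        return -1;
    *count = last + 1;
    return 0;
}

/* Returns 1 if an 8-byte object maps to index 7 and back to count 8. */
int check_count_index(void)
{
    size_t count = 0;

    return count_from_last_index(last_byte_index(8), &count) == 0
        && count == 8;
}
```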

> POSIX kind-of implies readv, which means that both realms tend to
> mesh.

Yes, that's an obvious layer error. C as a language should not be
confused with libc, or UNIX in general. In fact, C and UNIX appear to
only have two concrete things in common: ASCII, and the byte as the
fundamental type. That's it.

> If you're on something where they don't, you're fucked.

Me's never been the type to play it safe. The path forward is not blind
obedience to the ravings of committees, especially those that pretend to
set a universal standard.

> Good luck.

Thanks. Me's decided to ditch the {read,write}v compat wrappers and take
the performance hit. It's all preparation for a real OS, after all:
me'll do it right in there.

> What are you doing asking questions on an OpenBSD list, btw ?

nnx runs on OpenBSD. You must be confusing it with NetNIX, which is the
OS that will eventually emerge.

NetNIX will not have size_t.

Baai,

        --zeurkous.

[0] Except, of course, it's an 'offset + 1'. Oops. But that's the least
    of the problems if SIZE_MAX is not guaranteed to be the highest
    address...

--
Friggin' Machines!

Re: size of size_t (diff angle)

Claudio Jeker
This has not much to do with OpenBSD.
As for OpenBSD, it only runs on two types of machines: ILP32 and I32LP64.
Any other type of machine that is not covered by these two types will
not run OpenBSD.

In both cases size_t is defined as unsigned long which is the same as
uintptr_t and the same size as pointer.

Now, whether SIZE_MAX is the highest address is a different matter.
On OpenBSD 0..SIZE_MAX will cover the address space (in most cases
it actually covers more than what is possible). The highest valid
address is in most cases less than SIZE_MAX.
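
A minimal sketch of how a program might tell the two models apart. It assumes 8-bit bytes and looks only at int, long, and pointer widths; the function names are made up for illustration.

```c
#include <assert.h>
#include <string.h>

/* Classifies the implementation by the sizes of int, long and pointers.
 * Assumes 8-bit bytes; anything that is neither ILP32 nor I32LP64 is
 * reported as "other". */
const char *data_model(void)
{
    if (sizeof(int) == 4 && sizeof(long) == 4 && sizeof(void *) == 4)
        return "ILP32";
    if (sizeof(int) == 4 && sizeof(long) == 8 && sizeof(void *) == 8)
        return "I32LP64";
    return "other";
}

/* On both named models a pointer and unsigned long have equal size. */
int pointer_matches_unsigned_long(void)
{
    return sizeof(void *) == sizeof(unsigned long);
}

/* The size equality is only asserted when one of the two models applies. */
void check_data_model(void)
{
    if (strcmp(data_model(), "other") != 0)
        assert(pointer_matches_unsigned_long());
}
```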

--
:wq Claudio


On Thu, Feb 27, 2020 at 01:36:39AM +0100, [hidden email] wrote:

> Haai,
>
> "Marc Espie" <[hidden email]> wrote:
> >>> You're looking at the wrong type. size_t is very good for what it does.
> >>
> >> Yes; meproblem is with the 'what it does' part.
> >
> > It represents memory sizes. It works on anything with a sane
> > memory model.
>
> The way meunderstands it, it's just an offset, plain and simple. Which
> on a sane machine is indeed of the same type as an address[0].
>
> Unfortunately, C99 does not appear to reflect that. Now, to what degree
> (if!) we should respect C99, or take it much seriously at all, is
> another matter...
>
> >>> Try uintptr_t
> >>
> >> Are you proposing a change to struct iovec?
> >
> > Why should I ? readv works with sizes, so size_t is adequate.
>
> Yes, why should you? That was me implied question. You told me to use
> uintptr_t, but that will hardly solve things on the exact problem mewas
> working on (medidn't specify what it was, and you didn't ask), unless we
> change struct iovec (cue an 'over my dead body' response from theo, and
> with respect to compat, he'd be damn right).
>
> > You were mentioning caddr_t earlier. intptr_t and uintptr_t are
> > the adequate types for working with addresses. size_t is the adequate
> > family for working with sizes.
>
> Me's found that such statements emerge from a shallow understanding of
> the nature of C. C doesn't know sizes: indeed, it barely knows indices
> and offsets. If sizeof() would have been defined to return the index
> of the final byte, instead of the count of bytes, then the C99
> definition for size_t would've been pre-empted.
>
> > POSIX kind-of implies readv, which means that both realms tend to
> > mesh.
>
> Yes, that's an obvious layer error. C as a language should not be
> confused with libc, or UNIX in general. In fact, C and UNIX appear to
> only have two concrete things in common: ASCII, and the byte as the
> fundamental type. That's it.
>
> > If you're on something where they don't, you're fucked.
>
> Me's never been the type to play it safe. The path forward is not blind
> obedience to the ravings of committees, especially those that pretend to
> set a universal standard.
>
> > Good luck.
>
> Thanks. Me's decided to ditch the {read,write}v compat wrappers and take
> the performance hit. It's all preparation for a real OS, after all:
> me'll do it right in there.
>
> > What are you doing asking questions on an OpenBSD list, btw ?
>
> nnx runs on OpenBSD. You must be confusing it with NetNIX, which is the
> OS that will eventually emerge.
>
> NetNIX will not have size_t.
>
> Baai,
>
>         --zeurkous.
>
> [0] Except, of course, it's an 'offset + 1'. Oops. But that's the least
>     of the problems if SIZE_MAX is not guaranteed to be the highest
>     address...
>
> --
> Friggin' Machines!
>

RE: size of size_t (diff angle)

zeurkous
Haai,

"Claudio Jeker" <[hidden email]> wrote:
> This has not much to do with OpenBSD.

On the contrary: these issues touch the fundaments of UNIX programming.

> As for OpenBSD, it only runs on two types of machines: ILP32 and I32LP64.
> Any other type of machine that is not covered by these two types will
> not run OpenBSD.

Oh yes, this is not NetBSD, me's well aware... And yet, metries hard to
satisfy basic portability when feasible. This is consistent with OpenBSD
practice, at least if the manual pages are anything to go by.

> In both cases size_t is defined as unsigned long which is the same as
> uintptr_t and the same size as pointer.

Of course, in practice that's the case. You'll really get no argument
from me there.

> Now, whether SIZE_MAX is the highest address is a different matter.
> On OpenBSD 0..SIZE_MAX will cover the address space (in most cases
> it actually covers more than what is possible). The highest valid
> address is in most cases less than SIZE_MAX.

Yes, the {,in}famous halfway split... for calculations involving
already valid {addresse,offset,size}s that hardly matters, however.

What *does* matter, is the potential lack of equivalence of the types.
Which, as you pointed out, does not affect OpenBSD (at this time), yet
might be a portability issue. Hence me raising it.
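
One way compat code could make that assumption explicit is a build-time guard, sketched here under stated assumptions: uintptr_t exists, and the address space is flat, so that the integer form of a pointer grows with the address. C99 guarantees neither, and span_bytes() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Refuse to build where the two ranges differ. */
#if SIZE_MAX != UINTPTR_MAX
#error "size_t and uintptr_t have different ranges; compat code needs review"
#endif

/* Number of bytes in [lo, hi). Valid only on platforms with a flat
 * address space; the pointer-to-integer mapping is implementation-defined. */
size_t span_bytes(const void *lo, const void *hi)
{
    uintptr_t a = (uintptr_t)lo;
    uintptr_t b = (uintptr_t)hi;

    assert(a <= b);
    return (size_t)(b - a);
}

/* Measures a local buffer from its first byte to one past its last. */
void check_span(void)
{
    char buf[16];

    assert(span_bytes(buf, buf + sizeof buf) == sizeof buf);
}
```

On a platform where the guard fires, the build stops instead of silently mixing the two types.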

Baai,

        --zeurkous.

--
Friggin' Machines!

Re: size of size_t (diff angle)

Claudio Jeker
On Thu, Feb 27, 2020 at 02:07:36PM +0100, [hidden email] wrote:

> Haai,
>
> "Claudio Jeker" <[hidden email]> wrote:
> > This has not much to do with OpenBSD.
>
> On the contrary: these issues touch the fundaments of UNIX programming.
>
> > As for OpenBSD, it only runs on two types of machines: ILP32 and I32LP64.
> > Any other type of machine that is not covered by these two types will
> > not run OpenBSD.
>
> Oh yes, this is not NetBSD, me's well aware... And yet, metries hard to
> satisfy basic portability when feasible. This is consistent with OpenBSD
> practice, at least if the manual pages are anything to go by.
>
> > In both cases size_t is defined as unsigned long which is the same as
> > uintptr_t and the same size as pointer.
>
> Of course, in practice that's the case. You'll really get no argument
> from me there.
>
> > Now, whether SIZE_MAX is the highest address is a different matter.
> > On OpenBSD 0..SIZE_MAX will cover the address space (in most cases
> > it actually covers more than what is possible). The highest valid
> > address is in most cases less than SIZE_MAX.
>
> Yes, the {,in}famous halfway split... for calculations involving
> already valid {addresse,offset,size}s that hardly matters, however.
>
> What *does* matter, is the potential lack of equivalence of the types.
> Which, as you pointed out, does not affect OpenBSD (at this time), yet
> might be a portability issue. Hence me raising it.

The time of non-ILP32, non-I32LP64 UNIX systems is over (at least when it
comes to userland processes). If you want a UNIX-like OS where code will
work then those are your only options. The ecosystem is not able to handle
anything else anymore. All the other discussions are theoretical and will
not result in anything that is usable to run UNIX software.

--
:wq Claudio

RE: size of size_t (diff angle)

zeurkous
Haai,

"Claudio Jeker" <[hidden email]> wrote:

>>> Now, whether SIZE_MAX is the highest address is a different matter.
>>> On OpenBSD 0..SIZE_MAX will cover the address space (in most cases
>>> it actually covers more than what is possible). The highest valid
>>> address is in most cases less than SIZE_MAX.
>>
>> Yes, the {,in}famous halfway split... for calculations involving
>> already valid {addresse,offset,size}s that hardly matters, however.
>>
>> What *does* matter, is the potential lack of equivalence of the types.
>> Which, as you pointed out, does not affect OpenBSD (at this time), yet
>> might be a portability issue. Hence me raising it.
>
> The time of non-ILP32, non-I32LP64 UNIX systems is over (at least when it
> comes to userland processes).

This just adds fuel to me argument that we should ditch size_t,
uintptr_t, et al., in favour of a simple 'char *' (by that or any other
name (such as caddr_t)).

> If you want a UNIX-like OS where code will
> work then those are your only options. The ecosystem is not able to handle
> anything else anymore. All the other discussions are theoretical and will
> not result in anything that is usable to run UNIX software.

Then me'd say that it's high time the relevant standards are updated to
reflect that reality.

The latter is, of course, outside the purview of OpenBSD. (But we can
set a good example.)

Thank you.

Baai,

         --zeurkous.

--
Friggin' Machines!