Multiple cmsghdrs in msghdr

classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

Multiple cmsghdrs in msghdr

William Orr-2
Hey,

I was debugging a few CPython test failures yesterday, and I noticed
that attaching multiple cmsg structures causes unp_internalize to return
EINVAL.

I've looked in unix(4) and sendmsg(2), and this caveat isn't documented
anywhere.

I looked at other OSes, and Linux supports this, FreeBSD fails in
interesting ways and OS X returns E2BIG.

Is this behavior intentional, and the documentation is missing this
failure mode? Or is the behavior unintentional? I'm happy to submit a
patch for either, I just want to know which behavior is intended.

For reference, the code that returns EINVAL follows:

int
unp_internalize(struct mbuf *control, struct proc *p)
{
        struct filedesc *fdp = p->p_fd;
        struct cmsghdr *cm = mtod(control, struct cmsghdr *);
        struct file **rp, *fp;
        int i, error;
        int nfds, *ip, fd, neededspace;

        /*
         * Check for two potential msg_controllen values because
         * IETF stuck their nose in a place it does not belong.
         */
        if (cm->cmsg_type != SCM_RIGHTS || cm->cmsg_level != SOL_SOCKET ||
            !(cm->cmsg_len == control->m_len ||
            control->m_len == CMSG_ALIGN(cm->cmsg_len)))
                return (EINVAL);
...

My super-awful test, also follows:
#include <sys/socket.h>
#include <sys/types.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <err.h>
#include <string.h>

void
child(int sock)
{
        struct msghdr msg;
        memset(&msg, 0, sizeof(msg));
        recvmsg(sock, &msg, 0);

        printf("controllen: %zu\n", msg.msg_controllen);
        printf("control: %p\n", msg.msg_control);
}

void
parent(int sock)
{
        int fds[] = { -1, -1 };
        struct msghdr msg;
        struct cmsghdr  *cmsg;
        union {
                struct cmsghdr hdr;
                unsigned char    buf[2 * CMSG_SPACE(sizeof(int))];
        } cmsgbuf;
        char sfn[30];

        memset(&msg, 0, sizeof(msg));
        for (int i = 0; i < sizeof(fds); i++) {
                (void)strlcpy(sfn, "/tmp/worrtest.XXXXXX", sizeof(sfn));
                if ((fds[i] = mkstemp(sfn)) == -1) {
                        err(1, "mkstemp");
                }
        }

        msg.msg_control = &cmsgbuf.buf;
        msg.msg_controllen = sizeof(cmsgbuf.buf);

        cmsg = CMSG_FIRSTHDR(&msg);
        cmsg->cmsg_len = CMSG_LEN(sizeof(int));
        cmsg->cmsg_level = SOL_SOCKET;
        cmsg->cmsg_type = SCM_RIGHTS;
        *(int *)CMSG_DATA(cmsg) = fds[0];

        cmsg = CMSG_NXTHDR(&msg, cmsg);
        cmsg->cmsg_len = CMSG_LEN(sizeof(int));
        cmsg->cmsg_level = SOL_SOCKET;
        cmsg->cmsg_type = SCM_RIGHTS;
        *(int *)CMSG_DATA(cmsg) = fds[1];

        if (sendmsg(sock, &msg, 10240) == -1)
                err(1, "sendmsg");
}

int
main(void)
{
        int sock[] = {-1, -1};

        if (socketpair(AF_UNIX, SOCK_STREAM, 0, sock) == -1)
                err(1, "socket");

        switch (fork()) {
                case 0:
                        child(sock[0]);
                        exit(0);
                case -1:
                        err(1, "fork");
                default:
                        parent(sock[1]);
                        exit(0);
        }
}


Thanks,
William Orr






signature.asc (836 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Multiple cmsghdrs in msghdr

Otto Moerbeek

Always fun, cmsg structures. Anyway, I'll try to find some info on
this (maybe Stevens has something to say here).

But your experiments indicate that various Unixes do not agree, so
it's probably better to avoid sending multiple cmsg structures. Or
there is a subtle error in the way you build your messages...

        -Otto

On Tue, Apr 14, 2015 at 09:26:25PM -0400, William Orr wrote:

> Hey,
>
> I was debugging a few CPython test failures yesterday, and I noticed
> that attaching multiple cmsg structures causes unp_internalize to return
> EINVAL.
>
> I've looked in unix(4) and sendmsg(2), and this caveat isn't documented
> anywhere.
>
> I looked at other OSes, and Linux supports this, FreeBSD fails in
> interesting ways and OS X returns E2BIG.
>
> Is this behavior intentional, and the documentation is missing this
> failure mode? Or is the behavior unintentional? I'm happy to submit a
> patch for either, I just want to know which behavior is intended.
>
> For reference, the code that returns EINVAL follows:
>
> int
> unp_internalize(struct mbuf *control, struct proc *p)
> {
> struct filedesc *fdp = p->p_fd;
> struct cmsghdr *cm = mtod(control, struct cmsghdr *);
> struct file **rp, *fp;
> int i, error;
> int nfds, *ip, fd, neededspace;
>
> /*
> * Check for two potential msg_controllen values because
> * IETF stuck their nose in a place it does not belong.
> */
> if (cm->cmsg_type != SCM_RIGHTS || cm->cmsg_level != SOL_SOCKET ||
>    !(cm->cmsg_len == control->m_len ||
>    control->m_len == CMSG_ALIGN(cm->cmsg_len)))
> return (EINVAL);
> ...
>
> My super-awful test, also follows:
> #include <sys/socket.h>
> #include <sys/types.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <unistd.h>
> #include <err.h>
> #include <string.h>
>
> void
> child(int sock)
> {
> struct msghdr msg;
> memset(&msg, 0, sizeof(msg));
> recvmsg(sock, &msg, 0);
>
> printf("controllen: %zu\n", msg.msg_controllen);
> printf("control: %p\n", msg.msg_control);
> }
>
> void
> parent(int sock)
> {
> int fds[] = { -1, -1 };
> struct msghdr msg;
> struct cmsghdr  *cmsg;
> union {
> struct cmsghdr hdr;
> unsigned char    buf[2 * CMSG_SPACE(sizeof(int))];
> } cmsgbuf;
> char sfn[30];
>
> memset(&msg, 0, sizeof(msg));
> for (int i = 0; i < sizeof(fds); i++) {
> (void)strlcpy(sfn, "/tmp/worrtest.XXXXXX", sizeof(sfn));
> if ((fds[i] = mkstemp(sfn)) == -1) {
> err(1, "mkstemp");
> }
> }
>
> msg.msg_control = &cmsgbuf.buf;
> msg.msg_controllen = sizeof(cmsgbuf.buf);
>
> cmsg = CMSG_FIRSTHDR(&msg);
> cmsg->cmsg_len = CMSG_LEN(sizeof(int));
> cmsg->cmsg_level = SOL_SOCKET;
> cmsg->cmsg_type = SCM_RIGHTS;
> *(int *)CMSG_DATA(cmsg) = fds[0];
>
> cmsg = CMSG_NXTHDR(&msg, cmsg);
> cmsg->cmsg_len = CMSG_LEN(sizeof(int));
> cmsg->cmsg_level = SOL_SOCKET;
> cmsg->cmsg_type = SCM_RIGHTS;
> *(int *)CMSG_DATA(cmsg) = fds[1];
>
> if (sendmsg(sock, &msg, 10240) == -1)
> err(1, "sendmsg");
> }
>
> int
> main(void)
> {
> int sock[] = {-1, -1};
>
> if (socketpair(AF_UNIX, SOCK_STREAM, 0, sock) == -1)
> err(1, "socket");
>
> switch (fork()) {
> case 0:
> child(sock[0]);
> exit(0);
> case -1:
> err(1, "fork");
> default:
> parent(sock[1]);
> exit(0);
> }
> }
>
>
> Thanks,
> William Orr
>
>
>
>
>


Reply | Threaded
Open this post in threaded view
|

Re: Multiple cmsghdrs in msghdr

Mark Kettenis
In reply to this post by William Orr-2
> Date: Tue, 14 Apr 2015 21:26:25 -0400
> From: William Orr <[hidden email]>
>
> Hey,
>
> I was debugging a few CPython test failures yesterday, and I noticed
> that attaching multiple cmsg structures causes unp_internalize to return
> EINVAL.
>
> I've looked in unix(4) and sendmsg(2), and this caveat isn't documented
> anywhere.
>
> I looked at other OSes, and Linux supports this, FreeBSD fails in
> interesting ways and OS X returns E2BIG.
>
> Is this behavior intentional, and the documentation is missing this
> failure mode? Or is the behavior unintentional? I'm happy to submit a
> patch for either, I just want to know which behavior is intended.

The behaviour is intentional.  The additional complexity of supporting
multiple cmsghdrs has caused many bugs (and associated security
issues) in the past.  The alignment fuckups in various OSes make it
hard to use this functionality in a portable way anyway.  And we only
support SCM_RIGHTS, so there is no real reason to use multiple
cmsghdrs in your code.

Reply | Threaded
Open this post in threaded view
|

Re: Multiple cmsghdrs in msghdr

Otto Moerbeek
On Wed, Apr 15, 2015 at 11:32:11AM +0200, Mark Kettenis wrote:

> > Date: Tue, 14 Apr 2015 21:26:25 -0400
> > From: William Orr <[hidden email]>
> >
> > Hey,
> >
> > I was debugging a few CPython test failures yesterday, and I noticed
> > that attaching multiple cmsg structures causes unp_internalize to return
> > EINVAL.
> >
> > I've looked in unix(4) and sendmsg(2), and this caveat isn't documented
> > anywhere.
> >
> > I looked at other OSes, and Linux supports this, FreeBSD fails in
> > interesting ways and OS X returns E2BIG.
> >
> > Is this behavior intentional, and the documentation is missing this
> > failure mode? Or is the behavior unintentional? I'm happy to submit a
> > patch for either, I just want to know which behavior is intended.
>
> The behaviour is intentional.  The additional complexity of supporting
> multiple cmsghdrs has caused many bugs (and associated security
> issues) in the past.  The alignment fuckups in various OSes make it
> hard to use this functionality in a portable way anyway.  And we only
> support SCM_RIGHTS, so there is no real reason to use multiple
> cmsghdrs in your code.

Plus it *is* possible to send multiple fd's in one message.

        -Otto

Reply | Threaded
Open this post in threaded view
|

Re: Multiple cmsghdrs in msghdr

William Orr-2
On 4/15/15 5:37 AM, Otto Moerbeek wrote:

> On Wed, Apr 15, 2015 at 11:32:11AM +0200, Mark Kettenis wrote:
>
>>> Date: Tue, 14 Apr 2015 21:26:25 -0400
>>> From: William Orr <[hidden email]>
>>>
>>> Hey,
>>>
>>> I was debugging a few CPython test failures yesterday, and I noticed
>>> that attaching multiple cmsg structures causes unp_internalize to return
>>> EINVAL.
>>>
>>> I've looked in unix(4) and sendmsg(2), and this caveat isn't documented
>>> anywhere.
>>>
>>> I looked at other OSes, and Linux supports this, FreeBSD fails in
>>> interesting ways and OS X returns E2BIG.
>>>
>>> Is this behavior intentional, and the documentation is missing this
>>> failure mode? Or is the behavior unintentional? I'm happy to submit a
>>> patch for either, I just want to know which behavior is intended.
>>
>> The behaviour is intentional.  The additional complexity of supporting
>> multiple cmsghdrs has caused many bugs (and associated security
>> issues) in the past.  The alignment fuckups in various OSes make it
>> hard to use this functionality in a portable way anyway.  And we only
>> support SCM_RIGHTS, so there is no real reason to use multiple
>> cmsghdrs in your code.
>
> Plus it *is* possible to send multiple fd's in one message.
>
> -Otto
>
Yeah, I was wondering why this was allowed on some OSes in the first
place, since it seems redundant.

Once I'm not in an airport, I'll submit a docs patch just so that it's
clear.

re: CPython's test suite, I have a patch in the queue that only enables
this behavior on Linux.

Thanks,
William Orr


signature.asc (836 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Multiple cmsghdrs in msghdr

Philip Guenther-2
On Wed, Apr 15, 2015 at 11:21 AM, William Orr <[hidden email]> wrote:
> On 4/15/15 5:37 AM, Otto Moerbeek wrote:
>> On Wed, Apr 15, 2015 at 11:32:11AM +0200, Mark Kettenis wrote:
>>
>>>> Date: Tue, 14 Apr 2015 21:26:25 -0400
>>>> From: William Orr <[hidden email]>
...

>>>> Is this behavior intentional, and the documentation is missing this
>>>> failure mode? Or is the behavior unintentional? I'm happy to submit a
>>>> patch for either, I just want to know which behavior is intended.
>>>
>>> The behaviour is intentional.  The additional complexity of supporting
>>> multiple cmsghdrs has caused many bugs (and associated security
>>> issues) in the past.  The alignment fuckups in various OSes make it
>>> hard to use this functionality in a portable way anyway.  And we only
>>> support SCM_RIGHTS, so there is no real reason to use multiple
>>> cmsghdrs in your code.
>>
>> Plus it *is* possible to send multiple fd's in one message.
>
> Yeah, I was wondering why this was allowed on some OSes in the first
> place, since it seems redundant.
>
> Once I'm not in an airport, I'll submit a docs patch just so that it's
> clear.
>
> re: CPython's test suite, I have a patch in the queue that only enables
> this behavior on Linux.

If CPython exposes the raw C APIs, then it would be good to patch
their docs too, to indicate that multiple fds should be passed in one
cmsg and not as separate cmsgs...


Philip Guenther

Reply | Threaded
Open this post in threaded view
|

[PATCH] Re: Multiple cmsghdrs in msghdr

William Orr-2
In reply to this post by William Orr-2
This documents the error code when passing multiple cmsg structs. Let me
know if the wording needs to be improved.

Index: lib/libc/sys/send.2
===================================================================
RCS file: /cvs/src/lib/libc/sys/send.2,v
retrieving revision 1.31
diff -u -p -r1.31 send.2
--- lib/libc/sys/send.2 9 Sep 2014 06:32:37 -0000 1.31
+++ lib/libc/sys/send.2 16 Apr 2015 12:48:32 -0000
@@ -223,6 +223,17 @@ values in the
 .Fa msg_iov
 array overflowed an
 .Em ssize_t .
+.It Bq Er EINVAL
+The socket
+.Fa s
+is a
+.Xr unix 4
+socket, and
+.Em controlmsg
+is an invalid size or multiple
+.Em controlmsg
+structures were passed as part of
+.Fa msg .
 .It Bq Er EMSGSIZE
 The
 .Fa msg_iovlen


signature.asc (836 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Re: Multiple cmsghdrs in msghdr

Jason McIntyre-2
On Thu, Apr 16, 2015 at 09:19:13AM -0400, William Orr wrote:

> This documents the error code when passing multiple cmsg structs. Let me
> know if the wording needs to be improved.
>
> Index: lib/libc/sys/send.2
> ===================================================================
> RCS file: /cvs/src/lib/libc/sys/send.2,v
> retrieving revision 1.31
> diff -u -p -r1.31 send.2
> --- lib/libc/sys/send.2 9 Sep 2014 06:32:37 -0000 1.31
> +++ lib/libc/sys/send.2 16 Apr 2015 12:48:32 -0000
> @@ -223,6 +223,17 @@ values in the
>  .Fa msg_iov
>  array overflowed an
>  .Em ssize_t .
> +.It Bq Er EINVAL
> +The socket
> +.Fa s
> +is a
> +.Xr unix 4
> +socket, and
> +.Em controlmsg
> +is an invalid size or multiple
> +.Em controlmsg
> +structures were passed as part of
> +.Fa msg .
>  .It Bq Er EMSGSIZE
>  The
>  .Fa msg_iovlen
>

anyone want to take this? it's fine by me, if someone can verify its
accuracy.

jmc

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Re: Multiple cmsghdrs in msghdr

Theo de Raadt
In reply to this post by William Orr-2
> On Thu, Apr 16, 2015 at 09:19:13AM -0400, William Orr wrote:
> > This documents the error code when passing multiple cmsg structs. Let me
> > know if the wording needs to be improved.
> >
> > Index: lib/libc/sys/send.2
> > ===================================================================
> > RCS file: /cvs/src/lib/libc/sys/send.2,v
> > retrieving revision 1.31
> > diff -u -p -r1.31 send.2
> > --- lib/libc/sys/send.2 9 Sep 2014 06:32:37 -0000 1.31
> > +++ lib/libc/sys/send.2 16 Apr 2015 12:48:32 -0000
> > @@ -223,6 +223,17 @@ values in the
> >  .Fa msg_iov
> >  array overflowed an
> >  .Em ssize_t .
> > +.It Bq Er EINVAL
> > +The socket
> > +.Fa s
> > +is a
> > +.Xr unix 4
> > +socket, and
> > +.Em controlmsg
> > +is an invalid size or multiple
> > +.Em controlmsg
> > +structures were passed as part of
> > +.Fa msg .
> >  .It Bq Er EMSGSIZE
> >  The
> >  .Fa msg_iovlen
> >
>
> anyone want to take this? it's fine by me, if someone can verify its
> accuracy.

It seems we cannot exhaustively document the scenarios that lead to
certain errnos being returned (in particular, EAGAIN and EINVAL).

Overdocumentation leads to reader fatigue; reader fatigue leads to
the documentation being ignored.

I think it is clearly understood that controlmsg layout is complex,
and there are many illegal layouts, even non-portable ones at that.
This is a complex space, under-documented perhaps?  In any case all
those details are missing from this manual page.  So it seems
completely wrong to document just this one narrow detail.

So I don't think the proposed diff improves the manual page.

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Re: Multiple cmsghdrs in msghdr

Jason McIntyre-2
On Mon, Apr 20, 2015 at 12:46:20PM -0600, Theo de Raadt wrote:

> > On Thu, Apr 16, 2015 at 09:19:13AM -0400, William Orr wrote:
> > > This documents the error code when passing multiple cmsg structs. Let me
> > > know if the wording needs to be improved.
> > >
> > > Index: lib/libc/sys/send.2
> > > ===================================================================
> > > RCS file: /cvs/src/lib/libc/sys/send.2,v
> > > retrieving revision 1.31
> > > diff -u -p -r1.31 send.2
> > > --- lib/libc/sys/send.2 9 Sep 2014 06:32:37 -0000 1.31
> > > +++ lib/libc/sys/send.2 16 Apr 2015 12:48:32 -0000
> > > @@ -223,6 +223,17 @@ values in the
> > >  .Fa msg_iov
> > >  array overflowed an
> > >  .Em ssize_t .
> > > +.It Bq Er EINVAL
> > > +The socket
> > > +.Fa s
> > > +is a
> > > +.Xr unix 4
> > > +socket, and
> > > +.Em controlmsg
> > > +is an invalid size or multiple
> > > +.Em controlmsg
> > > +structures were passed as part of
> > > +.Fa msg .
> > >  .It Bq Er EMSGSIZE
> > >  The
> > >  .Fa msg_iovlen
> > >
> >
> > anyone want to take this? it's fine by me, if someone can verify its
> > accuracy.
>
> It seems we cannot exhaustively document the scenarios that lead to
> certain errnos being returned (in particular, EAGAIN and EINVAL).
>
> Overdocumentation leads to reader fatigue; reader fatigue leads to
> the documentation being ignored.
>
> I think it is clearly understood that controlmsg layout is complex,
> and there are many illegal layouts, even non-portable ones at that.
> This is a complex space, under-documented perhaps?  In any case all
> those details are missing from this manual page.  So it seems
> completely wrong to document just this one narrow detail.
>
> So I don't think the proposed diff improves the manual page.
>

fine, i'll drop it.
jmc