ssh(1), getrrsetbyname(3), SSHFP and DNSSEC

classic Classic list List threaded Threaded
20 messages Options
Reply | Threaded
Open this post in threaded view
|

ssh(1), getrrsetbyname(3), SSHFP and DNSSEC

Jesper Wallin
Hi all,

I recently decided to add SSHFP records for my servers, since I never
memorize or write down my key fingerprints.  I learned that if I
want ssh(1) to trust these records, DNSSEC needs to be enabled for my
zone.  To validate these records, ssh(1) is using getrrsetbyname(3),
which checks if the AD (Authentic Data) flag is set in the response.

To get a response with the AD flag set, the request itself also needs
to have the AD flag set.  It turns out that getrrsetbyname(3) doesn't
set this and will therefore never get a validated response, unless the
resolver is configured to always send it, no matter what the client
requested.

It seems like unwind(8) behaves this way but it also responds with the
RRSIG records, which is extra overhead and ignored by getrrsetbyname(3).

This was mentioned a few years ago [0] and the solution suggested was
to add the RES_USE_DNSSEC to _res.options, which also makes the resolver
respond with the extra RRSIG records.

Instead, by only setting the AD flag, both the request and the response
has the same size as without the flag set.  The patch below will add
RES_USE_AD as an option to _res.options and set it by default.
This is also the default behaviour in dig(1), which I understand is a
bit different, but that sure added some confusion while debugging this.

This let you run unbound(8) or any other validating resolver on your
local network and getrrsetbyname(3) will trust it.  Do read the CAVEATS
in the manual of getrrsetbyname(3) though.

As a side note, I noticed that the default value of _res.options was the
same value as RES_DEFAULT, so I changed it to RES_DEFAULT instead, for
the sake of consistency.

Thoughts?


Yours,
Jesper Wallin

[0] https://marc.info/?t=149983843200002&r=1&w=2


Index: include/resolv.h
===================================================================
RCS file: /cvs/src/include/resolv.h,v
retrieving revision 1.22
diff -u -p -r1.22 resolv.h
--- include/resolv.h 14 Jan 2019 06:23:06 -0000 1.22
+++ include/resolv.h 17 Jul 2020 21:25:30 -0000
@@ -191,8 +191,9 @@ struct __res_state_ext {
 /* DNSSEC extensions: use higher bit to avoid conflict with ISC use */
 #define RES_USE_DNSSEC 0x20000000 /* use DNSSEC using OK bit in OPT */
 #define RES_USE_CD 0x10000000 /* set Checking Disabled flag */
+#define RES_USE_AD 0x80000000 /* set Authentic Data flag */
 
-#define RES_DEFAULT (RES_RECURSE | RES_DEFNAMES | RES_DNSRCH)
+#define RES_DEFAULT (RES_RECURSE | RES_DEFNAMES | RES_DNSRCH | RES_USE_AD)
 
 /*
  * Resolver "pfcode" values.  Used by dig.
Index: lib/libc/asr/asr.c
===================================================================
RCS file: /cvs/src/lib/libc/asr/asr.c,v
retrieving revision 1.64
diff -u -p -r1.64 asr.c
--- lib/libc/asr/asr.c 6 Jul 2020 13:33:05 -0000 1.64
+++ lib/libc/asr/asr.c 17 Jul 2020 21:25:30 -0000
@@ -511,7 +511,7 @@ asr_ctx_create(void)
  if ((ac = calloc(1, sizeof(*ac))) == NULL)
  return (NULL);
 
- ac->ac_options = RES_RECURSE | RES_DEFNAMES | RES_DNSRCH;
+ ac->ac_options = RES_DEFAULT;
  ac->ac_refcount = 1;
  ac->ac_ndots = 1;
  ac->ac_family[0] = AF_INET;
Index: lib/libc/asr/res_mkquery.c
===================================================================
RCS file: /cvs/src/lib/libc/asr/res_mkquery.c,v
retrieving revision 1.13
diff -u -p -r1.13 res_mkquery.c
--- lib/libc/asr/res_mkquery.c 14 Jan 2019 06:49:42 -0000 1.13
+++ lib/libc/asr/res_mkquery.c 17 Jul 2020 21:25:30 -0000
@@ -62,6 +62,8 @@ res_mkquery(int op, const char *dname, i
  h.flags |= RD_MASK;
  if (ac->ac_options & RES_USE_CD)
  h.flags |= CD_MASK;
+ if (ac->ac_options & RES_USE_AD)
+ h.flags |= AD_MASK;
  h.qdcount = 1;
  if (ac->ac_options & (RES_USE_EDNS0 | RES_USE_DNSSEC))
  h.arcount = 1;
Index: lib/libc/asr/res_send_async.c
===================================================================
RCS file: /cvs/src/lib/libc/asr/res_send_async.c,v
retrieving revision 1.39
diff -u -p -r1.39 res_send_async.c
--- lib/libc/asr/res_send_async.c 28 Sep 2019 11:21:07 -0000 1.39
+++ lib/libc/asr/res_send_async.c 17 Jul 2020 21:25:30 -0000
@@ -378,6 +378,8 @@ setup_query(struct asr_query *as, const
  h.flags |= RD_MASK;
  if (as->as_ctx->ac_options & RES_USE_CD)
  h.flags |= CD_MASK;
+ if (as->as_ctx->ac_options & RES_USE_AD)
+ h.flags |= AD_MASK;
  h.qdcount = 1;
  if (as->as_ctx->ac_options & (RES_USE_EDNS0 | RES_USE_DNSSEC))
  h.arcount = 1;
Index: lib/libc/net/res_init.3
===================================================================
RCS file: /cvs/src/lib/libc/net/res_init.3,v
retrieving revision 1.4
diff -u -p -r1.4 res_init.3
--- lib/libc/net/res_init.3 25 Apr 2020 21:06:17 -0000 1.4
+++ lib/libc/net/res_init.3 17 Jul 2020 21:25:30 -0000
@@ -206,6 +206,8 @@ uses 4096 bytes as input buffer size.
 Request that the resolver uses
 Domain Name System Security Extensions (DNSSEC),
 as defined in RFCs 4033, 4034, and 4035.
+.It Dv RES_USE_AD
+Set the Authentic Data flag on queries.
 .It Dv RES_USE_CD
 Set the Checking Disabled flag on queries.
 .El

Reply | Threaded
Open this post in threaded view
|

Re: ssh(1), getrrsetbyname(3), SSHFP and DNSSEC

Jeremy C. Reed
On Fri, 17 Jul 2020, Jesper Wallin wrote:

> To get a response with the AD flag set, the request itself also needs
> to have the AD flag set.

...

> -#define RES_DEFAULT    (RES_RECURSE | RES_DEFNAMES | RES_DNSRCH)
> +#define RES_DEFAULT    (RES_RECURSE | RES_DEFNAMES | RES_DNSRCH | RES_USE_AD)

Can you share some examples of "needs" AD flag set in
combination with DO flag? It's not required to set the AD
flag when the DO flag is already set. You can set the DO flag without AD
flag also set and get AD flag set in the response. (As a brief
explanation, if the validator already authenticated it, when setting the
AD flag without the DO flag in the query, the validator can send back
the AD without the RRSIG records.)

(I don't have any argument with adding the feature and enabling it, but
I am curious what specific example caused you to require it.)

Reply | Threaded
Open this post in threaded view
|

Re: ssh(1), getrrsetbyname(3), SSHFP and DNSSEC

Jeremy C. Reed
On Fri, 17 Jul 2020, Jeremy C. Reed wrote:

> > To get a response with the AD flag set, the request itself also needs
> > to have the AD flag set.

I re-read your post again and see you already clarified this.

Reply | Threaded
Open this post in threaded view
|

Re: ssh(1), getrrsetbyname(3), SSHFP and DNSSEC

Peter J. Philipp-3
In reply to this post by Jesper Wallin
On Fri, Jul 17, 2020 at 11:45:22PM +0200, Jesper Wallin wrote:
> Thoughts?
>
>
> Yours,
> Jesper Wallin

I found this very interesting.  Too bad you didn't quote any RFC's that
support this behaviour because RFC 4033 says you shouldn't set the AD bit
in a query, RFC 4035 says something similar, but then digging some further
in RFC 6840 (internet standard) it says in section 5.7:

5.7.  Setting the AD Bit on Queries

   The semantics of the Authentic Data (AD) bit in the query were
   previously undefined.  Section 4.6 of [RFC4035] instructed resolvers
   to always clear the AD bit when composing queries.

   This document defines setting the AD bit in a query as a signal
   indicating that the requester understands and is interested in the
   value of the AD bit in the response.  This allows a requester to
   indicate that it understands the AD bit without also requesting
   DNSSEC data via the DO bit.

I just wanted to add this as a thought supporting your patch.

Best Regards,
-peter

Reply | Threaded
Open this post in threaded view
|

Re: ssh(1), getrrsetbyname(3), SSHFP and DNSSEC

Jeremie Courreges-Anglas-2
In reply to this post by Jesper Wallin
On Fri, Jul 17 2020, Jesper Wallin <[hidden email]> wrote:

> Hi all,
>
> I recently decided to add SSHFP records for my servers, since I never
> memorize or write down my key fingerprints.  I learned that if I
> want ssh(1) to trust these records, DNSSEC needs to be enabled for my
> zone.  To validate these records, ssh(1) is using getrrsetbyname(3),
> which checks if the AD (Authentic Data) flag is set in the response.
>
> To get a response with the AD flag set, the request itself also needs
> to have the AD flag set.  It turns out that getrrsetbyname(3) doesn't
> set this and will therefore never get a validated response, unless the
> resolver is configured to always send it, no matter what the client
> requested.
>
> It seems like unwind(8) behaves this way but it also responds with the
> RRSIG records, which is extra overhead and ignored by getrrsetbyname(3).
>
> This was mentioned a few years ago [0] and the solution suggested was
> to add the RES_USE_DNSSEC to _res.options, which also makes the resolver
> respond with the extra RRSIG records.
>
> Instead, by only setting the AD flag, both the request and the response
> has the same size as without the flag set.  The patch below will add
> RES_USE_AD as an option to _res.options and set it by default.
> This is also the default behaviour in dig(1), which I understand is a
> bit different, but that sure added some confusion while debugging this.
>
> This let you run unbound(8) or any other validating resolver on your
> local network and getrrsetbyname(3) will trust it.  Do read the CAVEATS
> in the manual of getrrsetbyname(3) though.
>
> As a side note, I noticed that the default value of _res.options was the
> same value as RES_DEFAULT, so I changed it to RES_DEFAULT instead, for
> the sake of consistency.
>
> Thoughts?

Thanks for addressing this longstanding issue.

I think using the AD bit in queries is a good idea. IIRC Peter J.
Philipp (cc'd) suggested using it but I was not thrilled because:

1. the meaning of the AD bit in queries is relatively recent (2013
  I think[0])
2. getrrsetbyname also collects signature records, and for this you need
  EDNS + the DO bit set.  Implementing this in was not 100% trivial,
  I think we had something working but Eric or I were not 100% happy
  with it.

1. is probably not a concern, after all you're supposed to use
a trustworthy resolver, which should be modern and understand the
purpose of the AD bit in queries.

2. is probably not a concern either.  I guess that all getrrsetbyname(3)
callers only care about the target records and the AD bit, not about the
signature records.  (Why would they use it for anyway?).  In the base
system, only ssh(1) and traceroute(8) use getrrsetbyname(3).
AFAIK no other system provides getrrsetbyname(3), and ISC has removed
getrrsetbyname and the whole lwres API in 2017[1].  So I'd say we're
free to improve our version of getrrsetbyname(3) as we see fit.

Now let's discuss the implementation.  There are two main things
I dislike with your diff:
- it affects all DNS queries done through libc, even if the user doesn't
  care at all about DNSSEC.  I suspect there's potential for fallout
  here, but I have no data to back this.
- trusting the AD bit by default looks a bit too easy.  The
  documentation says that people should only trust the AD bit /
  RRSET_VALIDATED if the communication between libc and the resolver is
  secure, but who actually reads the documentation?

For those two reasons I think the feature should be opt-in.  That's what
Florian Weimer did in glibc, with the implementation of RES_TRUSTAD[2][3].
The diff below implements the same semantics:
- by default, don't send queries the AD flag, and don't trust its value
  in replies
- if "options trust-ad" is set in resolv.conf(5), set the AD flag in
  queries and allow applications to look at the AD flag in replies.

There's one more thing I'd like to check in the res_* API, but I'm
sitting on this idea since too long already, and I'm happy with my test
results so far.

Thoughts?  Comments?

[0] https://tools.ietf.org/html/rfc6840#section-5.7
[1] https://gitlab.isc.org/isc-projects/bind9/-/commit/8eb88aafee951859264e36c315b1289cd8c2088b
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1164339
[3] https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=446997ff1433d33452b81dfa9e626b8dccf101a4



Index: include/resolv.h
===================================================================
RCS file: /d/cvs/src/include/resolv.h,v
retrieving revision 1.22
diff -u -p -r1.22 resolv.h
--- include/resolv.h 14 Jan 2019 06:23:06 -0000 1.22
+++ include/resolv.h 23 Jul 2020 13:28:15 -0000
@@ -186,6 +186,8 @@ struct __res_state_ext {
 #define RES_INSECURE2 0x00000800 /* type 2 security disabled */
 #define RES_NOALIASES 0x00001000 /* shuts off HOSTALIASES feature */
 #define RES_USE_INET6 0x00002000 /* use/map IPv6 in gethostbyname() */
+/* GNU extension: use higher bit to avoid conflict with ISC use */
+#define RES_TRUSTAD 0x80000000 /* query with AD, trust AD in reply */
 /* KAME extensions: use higher bit to avoid conflict with ISC use */
 #define RES_USE_EDNS0 0x40000000 /* use EDNS0 */
 /* DNSSEC extensions: use higher bit to avoid conflict with ISC use */
Index: lib/libc/net/res_init.3
===================================================================
RCS file: /d/cvs/src/lib/libc/net/res_init.3,v
retrieving revision 1.4
diff -u -p -r1.4 res_init.3
--- lib/libc/net/res_init.3 25 Apr 2020 21:06:17 -0000 1.4
+++ lib/libc/net/res_init.3 25 Jul 2020 02:09:58 -0000
@@ -179,6 +179,17 @@ This option has no effect.
 In the past, it turned off the legacy
 .Ev HOSTALIASES
 feature.
+.It Dv RES_TRUSTAD
+If set, the resolver routines will set the AD flag in DNS queries and
+preserve the value of the AD flag in DNS replies (this flag is stripped
+by default).
+Useful when applications want to ensure that the DNS resources they
+look up have been signed with DNSSEC and validated by the name server.
+Direct use of this option to enable AD bit processing is discouraged.
+Instead the use of trusted name servers should be annotated with
+.Dq options trust-ad
+in
+.Xr resolv.conf 5 .
 .It Dv RES_USE_INET6
 With this option
 .Xr gethostbyname 3
Index: share/man/man5/resolv.conf.5
===================================================================
RCS file: /d/cvs/src/share/man/man5/resolv.conf.5,v
retrieving revision 1.60
diff -u -p -r1.60 resolv.conf.5
--- share/man/man5/resolv.conf.5 25 Apr 2020 14:22:04 -0000 1.60
+++ share/man/man5/resolv.conf.5 25 Jul 2020 02:08:17 -0000
@@ -285,6 +285,15 @@ first as an absolute name before any sea
 .It Cm tcp
 Forces the use of TCP for queries.
 Normal behaviour is to query via UDP but fall back to TCP on failure.
+.It Cm trust-ad
+Sets the AD flag in DNS queries, and preserve the value of the AD flag
+in DNS replies (this flag is stripped by default).
+Useful when applications want to ensure that the DNS resources they
+look up have been signed with DNSSEC and validated by the
+.Ic nameserver .
+This option should be used only if the transport between the
+.Ic nameserver
+and the resolver is secure.
 .El
 .El
 .Pp
Index: lib/libc/asr/asr.c
===================================================================
RCS file: /d/cvs/src/lib/libc/asr/asr.c,v
retrieving revision 1.64
diff -u -p -r1.64 asr.c
--- lib/libc/asr/asr.c 6 Jul 2020 13:33:05 -0000 1.64
+++ lib/libc/asr/asr.c 23 Jul 2020 23:36:06 -0000
@@ -640,6 +640,8 @@ pass0(char **tok, int n, struct asr_ctx
  ac->ac_options |= RES_USEVC;
  else if (!strcmp(tok[i], "edns0"))
  ac->ac_options |= RES_USE_EDNS0;
+ else if (!strcmp(tok[i], "trust-ad"))
+ ac->ac_options |= RES_TRUSTAD;
  else if ((!strncmp(tok[i], "ndots:", 6))) {
  e = NULL;
  d = strtonum(tok[i] + 6, 1, 16, &e);
Index: lib/libc/asr/asr_debug.c
===================================================================
RCS file: /d/cvs/src/lib/libc/asr/asr_debug.c,v
retrieving revision 1.26
diff -u -p -r1.26 asr_debug.c
--- lib/libc/asr/asr_debug.c 3 Jul 2019 03:24:03 -0000 1.26
+++ lib/libc/asr/asr_debug.c 23 Jul 2020 23:36:06 -0000
@@ -285,6 +285,7 @@ _asr_dump_config(FILE *f, struct asr *a)
  PRINTOPT(RES_NOALIASES, "NOALIASES");
  PRINTOPT(RES_USE_EDNS0, "USE_EDNS0");
  PRINTOPT(RES_USE_DNSSEC, "USE_DNSSEC");
+ PRINTOPT(RES_TRUSTAD, "TRUSTAD");
  if (o)
  fprintf(f, " 0x%08x", o);
  fprintf(f, "\n");
Index: lib/libc/asr/getrrsetbyname_async.c
===================================================================
RCS file: /d/cvs/src/lib/libc/asr/getrrsetbyname_async.c,v
retrieving revision 1.11
diff -u -p -r1.11 getrrsetbyname_async.c
--- lib/libc/asr/getrrsetbyname_async.c 23 Feb 2017 17:04:02 -0000 1.11
+++ lib/libc/asr/getrrsetbyname_async.c 23 Jul 2020 23:36:06 -0000
@@ -32,7 +32,8 @@
 #include "asr_private.h"
 
 static int getrrsetbyname_async_run(struct asr_query *, struct asr_result *);
-static void get_response(struct asr_result *, const char *, int);
+static void get_response(struct asr_query *, struct asr_result *, const char *,
+    int);
 
 struct asr_query *
 getrrsetbyname_async(const char *hostname, unsigned int rdclass,
@@ -150,7 +151,7 @@ getrrsetbyname_async_run(struct asr_quer
  break;
  }
 
- get_response(ar, ar->ar_data, ar->ar_datalen);
+ get_response(as, ar, ar->ar_data, ar->ar_datalen);
  free(ar->ar_data);
  async_set_state(as, ASR_STATE_HALT);
  break;
@@ -255,7 +256,8 @@ static void free_dns_response(struct dns
 static int count_dns_rr(struct dns_rr *, u_int16_t, u_int16_t);
 
 static void
-get_response(struct asr_result *ar, const char *pkt, int pktlen)
+get_response(struct asr_query *as, struct asr_result *ar, const char *pkt,
+    int pktlen)
 {
  struct rrsetinfo *rrset = NULL;
  struct dns_response *response = NULL;
@@ -287,7 +289,7 @@ get_response(struct asr_result *ar, cons
  rrset->rri_nrdatas = response->header.ancount;
 
  /* check for authenticated data */
- if (response->header.ad == 1)
+ if (as->as_ctx->ac_options & RES_TRUSTAD && response->header.ad == 1)
  rrset->rri_flags |= RRSET_VALIDATED;
 
  /* copy name from answer section */
Index: lib/libc/asr/res_mkquery.c
===================================================================
RCS file: /d/cvs/src/lib/libc/asr/res_mkquery.c,v
retrieving revision 1.13
diff -u -p -r1.13 res_mkquery.c
--- lib/libc/asr/res_mkquery.c 14 Jan 2019 06:49:42 -0000 1.13
+++ lib/libc/asr/res_mkquery.c 24 Jul 2020 12:34:27 -0000
@@ -62,6 +62,9 @@ res_mkquery(int op, const char *dname, i
  h.flags |= RD_MASK;
  if (ac->ac_options & RES_USE_CD)
  h.flags |= CD_MASK;
+ if ((ac->ac_options & RES_TRUSTAD) &&
+    !(ac->ac_options & RES_USE_DNSSEC))
+ h.flags |= AD_MASK;
  h.qdcount = 1;
  if (ac->ac_options & (RES_USE_EDNS0 | RES_USE_DNSSEC))
  h.arcount = 1;
Index: lib/libc/asr/res_send_async.c
===================================================================
RCS file: /d/cvs/src/lib/libc/asr/res_send_async.c,v
retrieving revision 1.39
diff -u -p -r1.39 res_send_async.c
--- lib/libc/asr/res_send_async.c 28 Sep 2019 11:21:07 -0000 1.39
+++ lib/libc/asr/res_send_async.c 24 Jul 2020 13:02:38 -0000
@@ -378,6 +378,9 @@ setup_query(struct asr_query *as, const
  h.flags |= RD_MASK;
  if (as->as_ctx->ac_options & RES_USE_CD)
  h.flags |= CD_MASK;
+ if ((as->as_ctx->ac_options & RES_TRUSTAD) &&
+    !(as->as_ctx->ac_options & RES_USE_DNSSEC))
+ h.flags |= AD_MASK;
  h.qdcount = 1;
  if (as->as_ctx->ac_options & (RES_USE_EDNS0 | RES_USE_DNSSEC))
  h.arcount = 1;
@@ -700,6 +703,9 @@ validate_packet(struct asr_query *as)
  /* Must be a response */
  if ((h.flags & QR_MASK) == 0)
  goto inval;
+ /* Zap the AD flag unless the server is considered trusted. */
+ if (!(as->as_ctx->ac_options & RES_TRUSTAD))
+ h.flags &= ~(AD_MASK);
 
  as->as.dns.rcode = RCODE(h.flags);
  as->as.dns.ancount = h.ancount;


--
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE

Reply | Threaded
Open this post in threaded view
|

Re: ssh(1), getrrsetbyname(3), SSHFP and DNSSEC

Sebastian Benoit-3
Jeremie Courreges-Anglas([hidden email]) on 2020.07.25 14:01:06 +0200:

> On Fri, Jul 17 2020, Jesper Wallin <[hidden email]> wrote:
> > Hi all,
> >
> > I recently decided to add SSHFP records for my servers, since I never
> > memorize or write down my key fingerprints.  I learned that if I
> > want ssh(1) to trust these records, DNSSEC needs to be enabled for my
> > zone.  To validate these records, ssh(1) is using getrrsetbyname(3),
> > which checks if the AD (Authentic Data) flag is set in the response.
> >
> > To get a response with the AD flag set, the request itself also needs
> > to have the AD flag set.  It turns out that getrrsetbyname(3) doesn't
> > set this and will therefore never get a validated response, unless the
> > resolver is configured to always send it, no matter what the client
> > requested.
> >
> > It seems like unwind(8) behaves this way but it also responds with the
> > RRSIG records, which is extra overhead and ignored by getrrsetbyname(3).
> >
> > This was mentioned a few years ago [0] and the solution suggested was
> > to add the RES_USE_DNSSEC to _res.options, which also makes the resolver
> > respond with the extra RRSIG records.
> >
> > Instead, by only setting the AD flag, both the request and the response
> > has the same size as without the flag set.  The patch below will add
> > RES_USE_AD as an option to _res.options and set it by default.
> > This is also the default behaviour in dig(1), which I understand is a
> > bit different, but that sure added some confusion while debugging this.
> >
> > This let you run unbound(8) or any other validating resolver on your
> > local network and getrrsetbyname(3) will trust it.  Do read the CAVEATS
> > in the manual of getrrsetbyname(3) though.
> >
> > As a side note, I noticed that the default value of _res.options was the
> > same value as RES_DEFAULT, so I changed it to RES_DEFAULT instead, for
> > the sake of consistency.
> >
> > Thoughts?
>
> Thanks for addressing this longstanding issue.
>
> I think using the AD bit in queries is a good idea. IIRC Peter J.
> Philipp (cc'd) suggested using it but I was not thrilled because:
>
> 1. the meaning of the AD bit in queries is relatively recent (2013
>   I think[0])
> 2. getrrsetbyname also collects signature records, and for this you need
>   EDNS + the DO bit set.  Implementing this in was not 100% trivial,
>   I think we had something working but Eric or I were not 100% happy
>   with it.
>
> 1. is probably not a concern, after all you're supposed to use
> a trustworthy resolver, which should be modern and understand the
> purpose of the AD bit in queries.
>
> 2. is probably not a concern either.  I guess that all getrrsetbyname(3)
> callers only care about the target records and the AD bit, not about the
> signature records.  (Why would they use it for anyway?).  In the base
> system, only ssh(1) and traceroute(8) use getrrsetbyname(3).
> AFAIK no other system provides getrrsetbyname(3), and ISC has removed
> getrrsetbyname and the whole lwres API in 2017[1].  So I'd say we're
> free to improve our version of getrrsetbyname(3) as we see fit.

This is a concern for the stub resolver, because edns and AD does not work
everywhere. Indeed when we switched unbound to validate by default we
learned that this is not a good idea for everyone - which lead to the
development of unwind(8).
 

> Now let's discuss the implementation.  There are two main things
> I dislike with your diff:
> - it affects all DNS queries done through libc, even if the user doesn't
>   care at all about DNSSEC.  I suspect there's potential for fallout
>   here, but I have no data to back this.
> - trusting the AD bit by default looks a bit too easy.  The
>   documentation says that people should only trust the AD bit /
>   RRSET_VALIDATED if the communication between libc and the resolver is
>   secure, but who actually reads the documentation?
>
> For those two reasons I think the feature should be opt-in.  That's what
> Florian Weimer did in glibc, with the implementation of RES_TRUSTAD[2][3].
> The diff below implements the same semantics:
> - by default, don't send queries the AD flag, and don't trust its value
>   in replies
> - if "options trust-ad" is set in resolv.conf(5), set the AD flag in
>   queries and allow applications to look at the AD flag in replies.
>
> There's one more thing I'd like to check in the res_* API, but I'm
> sitting on this idea since too long already, and I'm happy with my test
> results so far.
>
> Thoughts?  Comments?

Your observation that this should not be on by default is probably correct.

If you enable trust-ad on a system that moves around, e.g. your laptop, you
will experience failures, which is why unwind tests for this and falls back
to non-trusting dhcp learned resolvers in such cases. In fact it also falls
back to the stub resolver, i.e. our libc resolver. Which makes me think that
trust-ad should not be used when you run unwind (because the whole point of
the fallback is that dnssec does not work). But thats a documentation issue.

The patch reads ok, fwiw. i'm not familiar with the code.

> [0] https://tools.ietf.org/html/rfc6840#section-5.7
> [1] https://gitlab.isc.org/isc-projects/bind9/-/commit/8eb88aafee951859264e36c315b1289cd8c2088b
> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1164339
> [3] https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=446997ff1433d33452b81dfa9e626b8dccf101a4
>
>
>
> Index: include/resolv.h
> ===================================================================
> RCS file: /d/cvs/src/include/resolv.h,v
> retrieving revision 1.22
> diff -u -p -r1.22 resolv.h
> --- include/resolv.h 14 Jan 2019 06:23:06 -0000 1.22
> +++ include/resolv.h 23 Jul 2020 13:28:15 -0000
> @@ -186,6 +186,8 @@ struct __res_state_ext {
>  #define RES_INSECURE2 0x00000800 /* type 2 security disabled */
>  #define RES_NOALIASES 0x00001000 /* shuts off HOSTALIASES feature */
>  #define RES_USE_INET6 0x00002000 /* use/map IPv6 in gethostbyname() */
> +/* GNU extension: use higher bit to avoid conflict with ISC use */
> +#define RES_TRUSTAD 0x80000000 /* query with AD, trust AD in reply */
>  /* KAME extensions: use higher bit to avoid conflict with ISC use */
>  #define RES_USE_EDNS0 0x40000000 /* use EDNS0 */
>  /* DNSSEC extensions: use higher bit to avoid conflict with ISC use */
> Index: lib/libc/net/res_init.3
> ===================================================================
> RCS file: /d/cvs/src/lib/libc/net/res_init.3,v
> retrieving revision 1.4
> diff -u -p -r1.4 res_init.3
> --- lib/libc/net/res_init.3 25 Apr 2020 21:06:17 -0000 1.4
> +++ lib/libc/net/res_init.3 25 Jul 2020 02:09:58 -0000
> @@ -179,6 +179,17 @@ This option has no effect.
>  In the past, it turned off the legacy
>  .Ev HOSTALIASES
>  feature.
> +.It Dv RES_TRUSTAD
> +If set, the resolver routines will set the AD flag in DNS queries and
> +preserve the value of the AD flag in DNS replies (this flag is stripped
> +by default).
> +Useful when applications want to ensure that the DNS resources they
> +look up have been signed with DNSSEC and validated by the name server.
> +Direct use of this option to enable AD bit processing is discouraged.
> +Instead the use of trusted name servers should be annotated with
> +.Dq options trust-ad
> +in
> +.Xr resolv.conf 5 .
>  .It Dv RES_USE_INET6
>  With this option
>  .Xr gethostbyname 3
> Index: share/man/man5/resolv.conf.5
> ===================================================================
> RCS file: /d/cvs/src/share/man/man5/resolv.conf.5,v
> retrieving revision 1.60
> diff -u -p -r1.60 resolv.conf.5
> --- share/man/man5/resolv.conf.5 25 Apr 2020 14:22:04 -0000 1.60
> +++ share/man/man5/resolv.conf.5 25 Jul 2020 02:08:17 -0000
> @@ -285,6 +285,15 @@ first as an absolute name before any sea
>  .It Cm tcp
>  Forces the use of TCP for queries.
>  Normal behaviour is to query via UDP but fall back to TCP on failure.
> +.It Cm trust-ad
> +Sets the AD flag in DNS queries, and preserve the value of the AD flag
> +in DNS replies (this flag is stripped by default).
> +Useful when applications want to ensure that the DNS resources they
> +look up have been signed with DNSSEC and validated by the
> +.Ic nameserver .
> +This option should be used only if the transport between the
> +.Ic nameserver
> +and the resolver is secure.
>  .El
>  .El
>  .Pp
> Index: lib/libc/asr/asr.c
> ===================================================================
> RCS file: /d/cvs/src/lib/libc/asr/asr.c,v
> retrieving revision 1.64
> diff -u -p -r1.64 asr.c
> --- lib/libc/asr/asr.c 6 Jul 2020 13:33:05 -0000 1.64
> +++ lib/libc/asr/asr.c 23 Jul 2020 23:36:06 -0000
> @@ -640,6 +640,8 @@ pass0(char **tok, int n, struct asr_ctx
>   ac->ac_options |= RES_USEVC;
>   else if (!strcmp(tok[i], "edns0"))
>   ac->ac_options |= RES_USE_EDNS0;
> + else if (!strcmp(tok[i], "trust-ad"))
> + ac->ac_options |= RES_TRUSTAD;
>   else if ((!strncmp(tok[i], "ndots:", 6))) {
>   e = NULL;
>   d = strtonum(tok[i] + 6, 1, 16, &e);
> Index: lib/libc/asr/asr_debug.c
> ===================================================================
> RCS file: /d/cvs/src/lib/libc/asr/asr_debug.c,v
> retrieving revision 1.26
> diff -u -p -r1.26 asr_debug.c
> --- lib/libc/asr/asr_debug.c 3 Jul 2019 03:24:03 -0000 1.26
> +++ lib/libc/asr/asr_debug.c 23 Jul 2020 23:36:06 -0000
> @@ -285,6 +285,7 @@ _asr_dump_config(FILE *f, struct asr *a)
>   PRINTOPT(RES_NOALIASES, "NOALIASES");
>   PRINTOPT(RES_USE_EDNS0, "USE_EDNS0");
>   PRINTOPT(RES_USE_DNSSEC, "USE_DNSSEC");
> + PRINTOPT(RES_TRUSTAD, "TRUSTAD");
>   if (o)
>   fprintf(f, " 0x%08x", o);
>   fprintf(f, "\n");
> Index: lib/libc/asr/getrrsetbyname_async.c
> ===================================================================
> RCS file: /d/cvs/src/lib/libc/asr/getrrsetbyname_async.c,v
> retrieving revision 1.11
> diff -u -p -r1.11 getrrsetbyname_async.c
> --- lib/libc/asr/getrrsetbyname_async.c 23 Feb 2017 17:04:02 -0000 1.11
> +++ lib/libc/asr/getrrsetbyname_async.c 23 Jul 2020 23:36:06 -0000
> @@ -32,7 +32,8 @@
>  #include "asr_private.h"
>  
>  static int getrrsetbyname_async_run(struct asr_query *, struct asr_result *);
> -static void get_response(struct asr_result *, const char *, int);
> +static void get_response(struct asr_query *, struct asr_result *, const char *,
> +    int);
>  
>  struct asr_query *
>  getrrsetbyname_async(const char *hostname, unsigned int rdclass,
> @@ -150,7 +151,7 @@ getrrsetbyname_async_run(struct asr_quer
>   break;
>   }
>  
> - get_response(ar, ar->ar_data, ar->ar_datalen);
> + get_response(as, ar, ar->ar_data, ar->ar_datalen);
>   free(ar->ar_data);
>   async_set_state(as, ASR_STATE_HALT);
>   break;
> @@ -255,7 +256,8 @@ static void free_dns_response(struct dns
>  static int count_dns_rr(struct dns_rr *, u_int16_t, u_int16_t);
>  
>  static void
> -get_response(struct asr_result *ar, const char *pkt, int pktlen)
> +get_response(struct asr_query *as, struct asr_result *ar, const char *pkt,
> +    int pktlen)
>  {
>   struct rrsetinfo *rrset = NULL;
>   struct dns_response *response = NULL;
> @@ -287,7 +289,7 @@ get_response(struct asr_result *ar, cons
>   rrset->rri_nrdatas = response->header.ancount;
>  
>   /* check for authenticated data */
> - if (response->header.ad == 1)
> + if (as->as_ctx->ac_options & RES_TRUSTAD && response->header.ad == 1)
>   rrset->rri_flags |= RRSET_VALIDATED;
>  
>   /* copy name from answer section */
> Index: lib/libc/asr/res_mkquery.c
> ===================================================================
> RCS file: /d/cvs/src/lib/libc/asr/res_mkquery.c,v
> retrieving revision 1.13
> diff -u -p -r1.13 res_mkquery.c
> --- lib/libc/asr/res_mkquery.c 14 Jan 2019 06:49:42 -0000 1.13
> +++ lib/libc/asr/res_mkquery.c 24 Jul 2020 12:34:27 -0000
> @@ -62,6 +62,9 @@ res_mkquery(int op, const char *dname, i
>   h.flags |= RD_MASK;
>   if (ac->ac_options & RES_USE_CD)
>   h.flags |= CD_MASK;
> + if ((ac->ac_options & RES_TRUSTAD) &&
> +    !(ac->ac_options & RES_USE_DNSSEC))
> + h.flags |= AD_MASK;
>   h.qdcount = 1;
>   if (ac->ac_options & (RES_USE_EDNS0 | RES_USE_DNSSEC))
>   h.arcount = 1;
> Index: lib/libc/asr/res_send_async.c
> ===================================================================
> RCS file: /d/cvs/src/lib/libc/asr/res_send_async.c,v
> retrieving revision 1.39
> diff -u -p -r1.39 res_send_async.c
> --- lib/libc/asr/res_send_async.c 28 Sep 2019 11:21:07 -0000 1.39
> +++ lib/libc/asr/res_send_async.c 24 Jul 2020 13:02:38 -0000
> @@ -378,6 +378,9 @@ setup_query(struct asr_query *as, const
>   h.flags |= RD_MASK;
>   if (as->as_ctx->ac_options & RES_USE_CD)
>   h.flags |= CD_MASK;
> + if ((as->as_ctx->ac_options & RES_TRUSTAD) &&
> +    !(as->as_ctx->ac_options & RES_USE_DNSSEC))
> + h.flags |= AD_MASK;
>   h.qdcount = 1;
>   if (as->as_ctx->ac_options & (RES_USE_EDNS0 | RES_USE_DNSSEC))
>   h.arcount = 1;
> @@ -700,6 +703,9 @@ validate_packet(struct asr_query *as)
>   /* Must be a response */
>   if ((h.flags & QR_MASK) == 0)
>   goto inval;
> + /* Zap the AD flag unless the server is considered trusted. */
> + if (!(as->as_ctx->ac_options & RES_TRUSTAD))
> + h.flags &= ~(AD_MASK);
>  
>   as->as.dns.rcode = RCODE(h.flags);
>   as->as.dns.ancount = h.ancount;
>
>
> --
> jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE
>

Reply | Threaded
Open this post in threaded view
|

Re: ssh(1), getrrsetbyname(3), SSHFP and DNSSEC

Jesper Wallin
In reply to this post by Jeremie Courreges-Anglas-2
On Sat, Jul 25, 2020 at 02:01:06PM +0200, Jeremie Courreges-Anglas wrote:
>
> For those two reasons I think the feature should be opt-in.

Yeah, I agree with you.  My first approach was to have it check what
kind of DNS record that was requested, and add the AD-flag only if type
was SSHFP, but that felt even uglier.  I also wasn't so sure my approach
was the right one after reading the RFCs Peter J. Philipp mentioned.

Perhaps another approach would be to make use of the currently unused
flags argument in getrrsetbyname(3)?  This way, only getrrsetbyname(3)
and certain requests are affected by it.


Yours,
Jesper Wallin

Reply | Threaded
Open this post in threaded view
|

Re: ssh(1), getrrsetbyname(3), SSHFP and DNSSEC

Jeremie Courreges-Anglas-2
In reply to this post by Sebastian Benoit-3
On Sat, Jul 25 2020, Sebastian Benoit <[hidden email]> wrote:

> Jeremie Courreges-Anglas([hidden email]) on 2020.07.25 14:01:06 +0200:
>> On Fri, Jul 17 2020, Jesper Wallin <[hidden email]> wrote:
>> > Hi all,
>> >
>> > I recently decided to add SSHFP records for my servers, since I never
>> > memorize or write down my key fingerprints.  I learned that if I
>> > want ssh(1) to trust these records, DNSSEC needs to be enabled for my
>> > zone.  To validate these records, ssh(1) is using getrrsetbyname(3),
>> > which checks if the AD (Authentic Data) flag is set in the response.
>> >
>> > To get a response with the AD flag set, the request itself also needs
>> > to have the AD flag set.  It turns out that getrrsetbyname(3) doesn't
>> > set this and will therefore never get a validated response, unless the
>> > resolver is configured to always send it, no matter what the client
>> > requested.
>> >
>> > It seems like unwind(8) behaves this way but it also responds with the
>> > RRSIG records, which is extra overhead and ignored by getrrsetbyname(3).
>> >
>> > This was mentioned a few years ago [0] and the solution suggested was
>> > to add the RES_USE_DNSSEC to _res.options, which also makes the resolver
>> > respond with the extra RRSIG records.
>> >
>> > Instead, by only setting the AD flag, both the request and the response
>> > has the same size as without the flag set.  The patch below will add
>> > RES_USE_AD as an option to _res.options and set it by default.
>> > This is also the default behaviour in dig(1), which I understand is a
>> > bit different, but that sure added some confusion while debugging this.
>> >
>> > This let you run unbound(8) or any other validating resolver on your
>> > local network and getrrsetbyname(3) will trust it.  Do read the CAVEATS
>> > in the manual of getrrsetbyname(3) though.
>> >
>> > As a side note, I noticed that the default value of _res.options was the
>> > same value as RES_DEFAULT, so I changed it to RES_DEFAULT instead, for
>> > the sake of consistency.
>> >
>> > Thoughts?
>>
>> Thanks for addressing this longstanding issue.
>>
>> I think using the AD bit in queries is a good idea. IIRC Peter J.
>> Philipp (cc'd) suggested using it but I was not thrilled because:
>>
>> 1. the meaning of the AD bit in queries is relatively recent (2013
>>   I think[0])
>> 2. getrrsetbyname also collects signature records, and for this you need
>>   EDNS + the DO bit set.  Implementing this in was not 100% trivial,
>>   I think we had something working but Eric or I were not 100% happy
>>   with it.
>>
>> 1. is probably not a concern, after all you're supposed to use
>> a trustworthy resolver, which should be modern and understand the
>> purpose of the AD bit in queries.
>>
>> 2. is probably not a concern either.  I guess that all getrrsetbyname(3)
>> callers only care about the target records and the AD bit, not about the
>> signature records.  (Why would they use it for anyway?).  In the base
>> system, only ssh(1) and traceroute(8) use getrrsetbyname(3).
>> AFAIK no other system provides getrrsetbyname(3), and ISC has removed
>> getrrsetbyname and the whole lwres API in 2017[1].  So I'd say we're
>> free to improve our version of getrrsetbyname(3) as we see fit.
>
> This is a concern for the stub resolver, because edns and AD does not work
> everywhere.
> Indeed when we switched unbound to validate by default we
> learned that this is not a good idea for everyone - which lead to the
> development of unwind(8).

Do you by chance have any data regarding fallout because of the AD bit
set in queries?  I would expect it to be ignored when not supported.
EDNS and the DNSSEC DO bit is a different story indeed.

>> Now let's discuss the implementation.  There are two main things
>> I dislike with your diff:
>> - it affects all DNS queries done through libc, even if the user doesn't
>>   care at all about DNSSEC.  I suspect there's potential for fallout
>>   here, but I have no data to back this.
>> - trusting the AD bit by default looks a bit too easy.  The
>>   documentation says that people should only trust the AD bit /
>>   RRSET_VALIDATED if the communication between libc and the resolver is
>>   secure, but who actually reads the documentation?
>>
>> For those two reasons I think the feature should be opt-in.  That's what
>> Florian Weimer did in glibc, with the implementation of RES_TRUSTAD[2][3].
>> The diff below implements the same semantics:
>> - by default, don't send queries the AD flag, and don't trust its value
>>   in replies
>> - if "options trust-ad" is set in resolv.conf(5), set the AD flag in
>>   queries and allow applications to look at the AD flag in replies.
>>
>> There's one more thing I'd like to check in the res_* API, but I'm
>> sitting on this idea since too long already, and I'm happy with my test
>> results so far.
>>
>> Thoughts?  Comments?
>
> Your observation that this should not be on by default is probably correct.
>
> If you enable trust-ad on a system that moves around, e.g. your laptop, you
> will experience failures, which is why unwind tests for this and falls back
> to non-trusting dhcp learned resolvers in such cases. In fact it also falls
> back to the stub resolver, i.e. our libc resolver. Which makes me think that
> trust-ad should not be used when you run unwind (because the whole point of
> the fallback is that dnssec does not work). But thats a documentation issue.

Thanks for pointing this out.  I would expect from unwind(8) that it
always clears the AD bit from its responses if it could not validate
them.  Inclusing when it falls back to dynamically learned resolvers or
the libc resolver.

Sadly "options trust-ad" is not per-nameserver, so you can't have
trust-ad work for 127.0.0.1 or ::1 and not random nameservers learned
by DHCP.  I'll investiate whet would be needed on the documentation
side.

> The patch reads ok, fwiw. i'm not familiar with the code.

Thanks for the feedback!

>> [0] https://tools.ietf.org/html/rfc6840#section-5.7
>> [1] https://gitlab.isc.org/isc-projects/bind9/-/commit/8eb88aafee951859264e36c315b1289cd8c2088b
>> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1164339
>> [3] https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=446997ff1433d33452b81dfa9e626b8dccf101a4
>>
>>
>>
>> Index: include/resolv.h
>> ===================================================================
>> RCS file: /d/cvs/src/include/resolv.h,v
>> retrieving revision 1.22
>> diff -u -p -r1.22 resolv.h
>> --- include/resolv.h 14 Jan 2019 06:23:06 -0000 1.22
>> +++ include/resolv.h 23 Jul 2020 13:28:15 -0000
>> @@ -186,6 +186,8 @@ struct __res_state_ext {
>>  #define RES_INSECURE2 0x00000800 /* type 2 security disabled */
>>  #define RES_NOALIASES 0x00001000 /* shuts off HOSTALIASES feature */
>>  #define RES_USE_INET6 0x00002000 /* use/map IPv6 in gethostbyname() */
>> +/* GNU extension: use higher bit to avoid conflict with ISC use */
>> +#define RES_TRUSTAD 0x80000000 /* query with AD, trust AD in reply */
>>  /* KAME extensions: use higher bit to avoid conflict with ISC use */
>>  #define RES_USE_EDNS0 0x40000000 /* use EDNS0 */
>>  /* DNSSEC extensions: use higher bit to avoid conflict with ISC use */
>> Index: lib/libc/net/res_init.3
>> ===================================================================
>> RCS file: /d/cvs/src/lib/libc/net/res_init.3,v
>> retrieving revision 1.4
>> diff -u -p -r1.4 res_init.3
>> --- lib/libc/net/res_init.3 25 Apr 2020 21:06:17 -0000 1.4
>> +++ lib/libc/net/res_init.3 25 Jul 2020 02:09:58 -0000
>> @@ -179,6 +179,17 @@ This option has no effect.
>>  In the past, it turned off the legacy
>>  .Ev HOSTALIASES
>>  feature.
>> +.It Dv RES_TRUSTAD
>> +If set, the resolver routines will set the AD flag in DNS queries and
>> +preserve the value of the AD flag in DNS replies (this flag is stripped
>> +by default).
>> +Useful when applications want to ensure that the DNS resources they
>> +look up have been signed with DNSSEC and validated by the name server.
>> +Direct use of this option to enable AD bit processing is discouraged.
>> +Instead the use of trusted name servers should be annotated with
>> +.Dq options trust-ad
>> +in
>> +.Xr resolv.conf 5 .
>>  .It Dv RES_USE_INET6
>>  With this option
>>  .Xr gethostbyname 3
>> Index: share/man/man5/resolv.conf.5
>> ===================================================================
>> RCS file: /d/cvs/src/share/man/man5/resolv.conf.5,v
>> retrieving revision 1.60
>> diff -u -p -r1.60 resolv.conf.5
>> --- share/man/man5/resolv.conf.5 25 Apr 2020 14:22:04 -0000 1.60
>> +++ share/man/man5/resolv.conf.5 25 Jul 2020 02:08:17 -0000
>> @@ -285,6 +285,15 @@ first as an absolute name before any sea
>>  .It Cm tcp
>>  Forces the use of TCP for queries.
>>  Normal behaviour is to query via UDP but fall back to TCP on failure.
>> +.It Cm trust-ad
>> +Sets the AD flag in DNS queries, and preserve the value of the AD flag
>> +in DNS replies (this flag is stripped by default).
>> +Useful when applications want to ensure that the DNS resources they
>> +look up have been signed with DNSSEC and validated by the
>> +.Ic nameserver .
>> +This option should be used only if the transport between the
>> +.Ic nameserver
>> +and the resolver is secure.
>>  .El
>>  .El
>>  .Pp
>> Index: lib/libc/asr/asr.c
>> ===================================================================
>> RCS file: /d/cvs/src/lib/libc/asr/asr.c,v
>> retrieving revision 1.64
>> diff -u -p -r1.64 asr.c
>> --- lib/libc/asr/asr.c 6 Jul 2020 13:33:05 -0000 1.64
>> +++ lib/libc/asr/asr.c 23 Jul 2020 23:36:06 -0000
>> @@ -640,6 +640,8 @@ pass0(char **tok, int n, struct asr_ctx
>>   ac->ac_options |= RES_USEVC;
>>   else if (!strcmp(tok[i], "edns0"))
>>   ac->ac_options |= RES_USE_EDNS0;
>> + else if (!strcmp(tok[i], "trust-ad"))
>> + ac->ac_options |= RES_TRUSTAD;
>>   else if ((!strncmp(tok[i], "ndots:", 6))) {
>>   e = NULL;
>>   d = strtonum(tok[i] + 6, 1, 16, &e);
>> Index: lib/libc/asr/asr_debug.c
>> ===================================================================
>> RCS file: /d/cvs/src/lib/libc/asr/asr_debug.c,v
>> retrieving revision 1.26
>> diff -u -p -r1.26 asr_debug.c
>> --- lib/libc/asr/asr_debug.c 3 Jul 2019 03:24:03 -0000 1.26
>> +++ lib/libc/asr/asr_debug.c 23 Jul 2020 23:36:06 -0000
>> @@ -285,6 +285,7 @@ _asr_dump_config(FILE *f, struct asr *a)
>>   PRINTOPT(RES_NOALIASES, "NOALIASES");
>>   PRINTOPT(RES_USE_EDNS0, "USE_EDNS0");
>>   PRINTOPT(RES_USE_DNSSEC, "USE_DNSSEC");
>> + PRINTOPT(RES_TRUSTAD, "TRUSTAD");
>>   if (o)
>>   fprintf(f, " 0x%08x", o);
>>   fprintf(f, "\n");
>> Index: lib/libc/asr/getrrsetbyname_async.c
>> ===================================================================
>> RCS file: /d/cvs/src/lib/libc/asr/getrrsetbyname_async.c,v
>> retrieving revision 1.11
>> diff -u -p -r1.11 getrrsetbyname_async.c
>> --- lib/libc/asr/getrrsetbyname_async.c 23 Feb 2017 17:04:02 -0000 1.11
>> +++ lib/libc/asr/getrrsetbyname_async.c 23 Jul 2020 23:36:06 -0000
>> @@ -32,7 +32,8 @@
>>  #include "asr_private.h"
>>  
>>  static int getrrsetbyname_async_run(struct asr_query *, struct asr_result *);
>> -static void get_response(struct asr_result *, const char *, int);
>> +static void get_response(struct asr_query *, struct asr_result *, const char *,
>> +    int);
>>  
>>  struct asr_query *
>>  getrrsetbyname_async(const char *hostname, unsigned int rdclass,
>> @@ -150,7 +151,7 @@ getrrsetbyname_async_run(struct asr_quer
>>   break;
>>   }
>>  
>> - get_response(ar, ar->ar_data, ar->ar_datalen);
>> + get_response(as, ar, ar->ar_data, ar->ar_datalen);
>>   free(ar->ar_data);
>>   async_set_state(as, ASR_STATE_HALT);
>>   break;
>> @@ -255,7 +256,8 @@ static void free_dns_response(struct dns
>>  static int count_dns_rr(struct dns_rr *, u_int16_t, u_int16_t);
>>  
>>  static void
>> -get_response(struct asr_result *ar, const char *pkt, int pktlen)
>> +get_response(struct asr_query *as, struct asr_result *ar, const char *pkt,
>> +    int pktlen)
>>  {
>>   struct rrsetinfo *rrset = NULL;
>>   struct dns_response *response = NULL;
>> @@ -287,7 +289,7 @@ get_response(struct asr_result *ar, cons
>>   rrset->rri_nrdatas = response->header.ancount;
>>  
>>   /* check for authenticated data */
>> - if (response->header.ad == 1)
>> + if (as->as_ctx->ac_options & RES_TRUSTAD && response->header.ad == 1)
>>   rrset->rri_flags |= RRSET_VALIDATED;
>>  
>>   /* copy name from answer section */
>> Index: lib/libc/asr/res_mkquery.c
>> ===================================================================
>> RCS file: /d/cvs/src/lib/libc/asr/res_mkquery.c,v
>> retrieving revision 1.13
>> diff -u -p -r1.13 res_mkquery.c
>> --- lib/libc/asr/res_mkquery.c 14 Jan 2019 06:49:42 -0000 1.13
>> +++ lib/libc/asr/res_mkquery.c 24 Jul 2020 12:34:27 -0000
>> @@ -62,6 +62,9 @@ res_mkquery(int op, const char *dname, i
>>   h.flags |= RD_MASK;
>>   if (ac->ac_options & RES_USE_CD)
>>   h.flags |= CD_MASK;
>> + if ((ac->ac_options & RES_TRUSTAD) &&
>> +    !(ac->ac_options & RES_USE_DNSSEC))
>> + h.flags |= AD_MASK;
>>   h.qdcount = 1;
>>   if (ac->ac_options & (RES_USE_EDNS0 | RES_USE_DNSSEC))
>>   h.arcount = 1;
>> Index: lib/libc/asr/res_send_async.c
>> ===================================================================
>> RCS file: /d/cvs/src/lib/libc/asr/res_send_async.c,v
>> retrieving revision 1.39
>> diff -u -p -r1.39 res_send_async.c
>> --- lib/libc/asr/res_send_async.c 28 Sep 2019 11:21:07 -0000 1.39
>> +++ lib/libc/asr/res_send_async.c 24 Jul 2020 13:02:38 -0000
>> @@ -378,6 +378,9 @@ setup_query(struct asr_query *as, const
>>   h.flags |= RD_MASK;
>>   if (as->as_ctx->ac_options & RES_USE_CD)
>>   h.flags |= CD_MASK;
>> + if ((as->as_ctx->ac_options & RES_TRUSTAD) &&
>> +    !(as->as_ctx->ac_options & RES_USE_DNSSEC))
>> + h.flags |= AD_MASK;
>>   h.qdcount = 1;
>>   if (as->as_ctx->ac_options & (RES_USE_EDNS0 | RES_USE_DNSSEC))
>>   h.arcount = 1;
>> @@ -700,6 +703,9 @@ validate_packet(struct asr_query *as)
>>   /* Must be a response */
>>   if ((h.flags & QR_MASK) == 0)
>>   goto inval;
>> + /* Zap the AD flag unless the server is considered trusted. */
>> + if (!(as->as_ctx->ac_options & RES_TRUSTAD))
>> + h.flags &= ~(AD_MASK);
>>  
>>   as->as.dns.rcode = RCODE(h.flags);
>>   as->as.dns.ancount = h.ancount;
>>
>>
>> --
>> jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE
>>
>
<#secure method=pgpmime mode=sign>

--
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE

Reply | Threaded
Open this post in threaded view
|

Re: ssh(1), getrrsetbyname(3), SSHFP and DNSSEC

Jeremie Courreges-Anglas-2
In reply to this post by Jesper Wallin
On Sun, Jul 26 2020, Jesper Wallin <[hidden email]> wrote:
> On Sat, Jul 25, 2020 at 02:01:06PM +0200, Jeremie Courreges-Anglas wrote:
>>
>> For those two reasons I think the feature should be opt-in.
>
> Yeah, I agree with you.  My first approach was to have it check what
> kind of DNS record that was requested, and add the AD-flag only if type
> was SSHFP, but that felt even uglier.  I also wasn't so sure my approach
> was the right one after reading the RFCs Peter J. Philipp mentioned.

The quote from RFC6840 seems clear to me, care to share why you had some
doubts if they still exist?

> Perhaps another approach would be to make use of the currently unused
> flags argument in getrrsetbyname(3)?  This way, only getrrsetbyname(3)
> and certain requests are affected by it.

I thought about that too, but getrrsetbyname(3) isn't the only function
of interest.  There's also res_query(3) and friends, which are in more
widely use in the larger ecosystem.  I guess we could restrict AD flag
tweaking to APIs where the caller can actually access the AD flag in the
response, but the "default vs opt-in" question is still present.

--
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE

Reply | Threaded
Open this post in threaded view
|

Re: ssh(1), getrrsetbyname(3), SSHFP and DNSSEC

Jesper Wallin
On Mon, Jul 27, 2020 at 03:30:52AM +0200, Jeremie Courreges-Anglas wrote:

> On Sun, Jul 26 2020, Jesper Wallin <[hidden email]> wrote:
> > On Sat, Jul 25, 2020 at 02:01:06PM +0200, Jeremie Courreges-Anglas wrote:
> >>
> >> For those two reasons I think the feature should be opt-in.
> >
> > Yeah, I agree with you.  My first approach was to have it check what
> > kind of DNS record that was requested, and add the AD-flag only if type
> > was SSHFP, but that felt even uglier.  I also wasn't so sure my approach
> > was the right one after reading the RFCs Peter J. Philipp mentioned.
>
> The quote from RFC6840 seems clear to me, care to share why you had some
> doubts if they still exist?

Sorry for being unclear, I was referring to the previous RFCs, but no,
no doubts.  RFC6840 is indeed clear on the matter.

> > Perhaps another approach would be to make use of the currently unused
> > flags argument in getrrsetbyname(3)?  This way, only getrrsetbyname(3)
> > and certain requests are affected by it.
>
> I thought about that too, but getrrsetbyname(3) isn't the only function
> of interest.  There's also res_query(3) and friends, which are in more
> widely use in the larger ecosystem.  I guess we could restrict AD flag
> tweaking to APIs where the caller can actually access the AD flag in the
> response, but the "default vs opt-in" question is still present.
>

Yeah, true.  I'd say that opt-in is the proper way to go, no matter what
approcah is taken.  For what it's worth, my vote is to go with your
trust-ad approach.  It's a lot more elegant than mine.  It would need
some manuals changes though, to cover the cases benno@ mentioned.

I still think the RES_USE_AD option might be a useful though, for when
you want to decide on an application-level whether to trust AD or not?


Yours,
Jesper Wallin

Reply | Threaded
Open this post in threaded view
|

Re: ssh(1), getrrsetbyname(3), SSHFP and DNSSEC

Jeremie Courreges-Anglas-2
On Mon, Jul 27 2020, Jesper Wallin <[hidden email]> wrote:

[...]

> I still think the RES_USE_AD option might be a useful though, for when
> you want to decide on an application-level whether to trust AD or not?

RES_TRUSTAD can also be used for this, but as the proposed documentation
points out it would be better to let the admin decide.  One
(dis)advantage of using that name is that some external applications
already grok it (eg ports/mail/postfix).

--
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE

Reply | Threaded
Open this post in threaded view
|

Re: ssh(1), getrrsetbyname(3), SSHFP and DNSSEC

Jesper Wallin
On Wed, Jul 29, 2020 at 02:46:06PM +0200, Jeremie Courreges-Anglas wrote:

> On Mon, Jul 27 2020, Jesper Wallin <[hidden email]> wrote:
>
> [...]
>
> > I still think the RES_USE_AD option might be a useful though, for when
> > you want to decide on an application-level whether to trust AD or not?
>
> RES_TRUSTAD can also be used for this, but as the proposed documentation
> points out it would be better to let the admin decide.  One
> (dis)advantage of using that name is that some external applications
> already grok it (eg ports/mail/postfix).
>

Oh, of course! My bad. *facepalm*

I decided to use RES_USE_AD since the last 4 options in res_init(3) was
using that naming convension.  Also, it would be silly to "use ad"
without actually trusting it? :-)

Reply | Threaded
Open this post in threaded view
|

Re: ssh(1), getrrsetbyname(3), SSHFP and DNSSEC

Sebastian Benoit-3
In reply to this post by Jeremie Courreges-Anglas-2
Jeremie Courreges-Anglas([hidden email]) on 2020.07.26 22:52:47 +0200:

> On Sat, Jul 25 2020, Sebastian Benoit <[hidden email]> wrote:
> > Jeremie Courreges-Anglas([hidden email]) on 2020.07.25 14:01:06 +0200:
> >> On Fri, Jul 17 2020, Jesper Wallin <[hidden email]> wrote:
> >> > Hi all,
> >> >
> >> > I recently decided to add SSHFP records for my servers, since I never
> >> > memorize or write down my key fingerprints.  I learned that if I
> >> > want ssh(1) to trust these records, DNSSEC needs to be enabled for my
> >> > zone.  To validate these records, ssh(1) is using getrrsetbyname(3),
> >> > which checks if the AD (Authentic Data) flag is set in the response.
> >> >
> >> > To get a response with the AD flag set, the request itself also needs
> >> > to have the AD flag set.  It turns out that getrrsetbyname(3) doesn't
> >> > set this and will therefore never get a validated response, unless the
> >> > resolver is configured to always send it, no matter what the client
> >> > requested.
> >> >
> >> > It seems like unwind(8) behaves this way but it also responds with the
> >> > RRSIG records, which is extra overhead and ignored by getrrsetbyname(3).
> >> >
> >> > This was mentioned a few years ago [0] and the solution suggested was
> >> > to add the RES_USE_DNSSEC to _res.options, which also makes the resolver
> >> > respond with the extra RRSIG records.
> >> >
> >> > Instead, by only setting the AD flag, both the request and the response
> >> > has the same size as without the flag set.  The patch below will add
> >> > RES_USE_AD as an option to _res.options and set it by default.
> >> > This is also the default behaviour in dig(1), which I understand is a
> >> > bit different, but that sure added some confusion while debugging this.
> >> >
> >> > This let you run unbound(8) or any other validating resolver on your
> >> > local network and getrrsetbyname(3) will trust it.  Do read the CAVEATS
> >> > in the manual of getrrsetbyname(3) though.
> >> >
> >> > As a side note, I noticed that the default value of _res.options was the
> >> > same value as RES_DEFAULT, so I changed it to RES_DEFAULT instead, for
> >> > the sake of consistency.
> >> >
> >> > Thoughts?
> >>
> >> Thanks for addressing this longstanding issue.
> >>
> >> I think using the AD bit in queries is a good idea. IIRC Peter J.
> >> Philipp (cc'd) suggested using it but I was not thrilled because:
> >>
> >> 1. the meaning of the AD bit in queries is relatively recent (2013
> >>   I think[0])
> >> 2. getrrsetbyname also collects signature records, and for this you need
> >>   EDNS + the DO bit set.  Implementing this in was not 100% trivial,
> >>   I think we had something working but Eric or I were not 100% happy
> >>   with it.
> >>
> >> 1. is probably not a concern, after all you're supposed to use
> >> a trustworthy resolver, which should be modern and understand the
> >> purpose of the AD bit in queries.
> >>
> >> 2. is probably not a concern either.  I guess that all getrrsetbyname(3)
> >> callers only care about the target records and the AD bit, not about the
> >> signature records.  (Why would they use it for anyway?).  In the base
> >> system, only ssh(1) and traceroute(8) use getrrsetbyname(3).
> >> AFAIK no other system provides getrrsetbyname(3), and ISC has removed
> >> getrrsetbyname and the whole lwres API in 2017[1].  So I'd say we're
> >> free to improve our version of getrrsetbyname(3) as we see fit.
> >
> > This is a concern for the stub resolver, because edns and AD does not work
> > everywhere.
> > Indeed when we switched unbound to validate by default we
> > learned that this is not a good idea for everyone - which lead to the
> > development of unwind(8).
>
> Do you by chance have any data regarding fallout because of the AD bit
> set in queries?  I would expect it to be ignored when not supported.
> EDNS and the DNSSEC DO bit is a different story indeed.

If i remember correctly, the fallout was caused by EDNS but i might be
wrong. The unbound commit caused a developer some headscratching, because
his upstream internet did not work with such packets, which led to immediate
backout of the change, because a default config that does not work is not
good.

In that regard i worry about the resolv.conf option: it has the potential to
break peoples DNS in non obvious ways. At least dont advertise it as a way
to "enable dnssec" for the whole system, instead point people to unwind.

Reply | Threaded
Open this post in threaded view
|

Re: ssh(1), getrrsetbyname(3), SSHFP and DNSSEC

Jeremie Courreges-Anglas-2
In reply to this post by Jeremie Courreges-Anglas-2
On Sun, Jul 26 2020, Jeremie Courreges-Anglas <[hidden email]> wrote:
> On Sat, Jul 25 2020, Sebastian Benoit <[hidden email]> wrote:

[...]

>> If you enable trust-ad on a system that moves around, e.g. your laptop, you
>> will experience failures, which is why unwind tests for this and falls back
>> to non-trusting dhcp learned resolvers in such cases. In fact it also falls
>> back to the stub resolver, i.e. our libc resolver. Which makes me think that
>> trust-ad should not be used when you run unwind (because the whole point of
>> the fallback is that dnssec does not work). But thats a documentation issue.

One important thing to keep in mind: trust-ad is not so much about being
able to do DNSSEC validation.  It's about trusting the validation done
by the resolver(s).  So if the resolver can't do validation and ensures
that AD flag is clear in the reply there is no breach of contract.

> Thanks for pointing this out.  I would expect from unwind(8) that it
> always clears the AD bit from its responses if it could not validate
> them.  Inclusing when it falls back to dynamically learned resolvers or
> the libc resolver.

So I did a few tests and read some unwind(8) code, and it *appears* safe
to use unwind(8) along with "options trust-ad".

First you mention fallback to DHCP-learned resolvers.  Those you should
probably not trust indeed, but it looks like unwind(8) attempts to use
them to perform its own validation.  So the value of the AD flag in
unwind(8)'s response should be trustworthy.  At least that's what I see
when I test unwind with "preference { dhcp }".

And then there's the asr fallback, tested with "preference { stub }".
unwind(8) allocates its own asr context, ignoring the global
/etc/resolv.conf and thus "options trust-ad".  IIUC it then hands off
the answer to libunbound which cooks its own soup.  So it looks like
unwind needs no special code to disable/ignore "options trust-ad".

I have no idea how unwind/libunbound behaves when using forwarders, the
"accept bogus" feature, (opportunistic) DoT or other stuff I'm probably
overlooking.  But AFAICS using unwind + "options trust-ad" brings no
problem.  cc'ing Florian, input welcome.

I think the issue with trust-ad is the list of resolvers that end up in
resolv.conf.  I would expect people who use a resolver on localhost to
also have other resolvers listed in their /etc/resolv.conf, because of
a conscious choice (bsd.rd) or just because it should be more reliable.
It seems to me that the resolv.conf(5) manpage addition should cover
that.  Or should I make it more explicit?

Index: share/man/man5/resolv.conf.5
===================================================================
RCS file: /d/cvs/src/share/man/man5/resolv.conf.5,v
retrieving revision 1.60
diff -u -p -r1.60 resolv.conf.5
--- share/man/man5/resolv.conf.5 25 Apr 2020 14:22:04 -0000 1.60
+++ share/man/man5/resolv.conf.5 25 Jul 2020 02:08:17 -0000
@@ -285,6 +285,15 @@ first as an absolute name before any sea
 .It Cm tcp
 Forces the use of TCP for queries.
 Normal behaviour is to query via UDP but fall back to TCP on failure.
+.It Cm trust-ad
+Sets the AD flag in DNS queries, and preserve the value of the AD flag
+in DNS replies (this flag is stripped by default).
+Useful when applications want to ensure that the DNS resources they
+look up have been signed with DNSSEC and validated by the
+.Ic nameserver .
+This option should be used only if the transport between the
+.Ic nameserver
+and the resolver is secure.
 .El
 .El
 .Pp

Lazy me would be tempted to just let asr trust 127.0.0.1 / ::1 by
default and let others deal with the edge cases, but where is the fun in
that... :)

--
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE

Reply | Threaded
Open this post in threaded view
|

Re: ssh(1), getrrsetbyname(3), SSHFP and DNSSEC

Sebastian Benoit-3
Jeremie Courreges-Anglas([hidden email]) on 2020.07.29 15:51:55 +0200:

> On Sun, Jul 26 2020, Jeremie Courreges-Anglas <[hidden email]> wrote:
> > On Sat, Jul 25 2020, Sebastian Benoit <[hidden email]> wrote:
>
> [...]
>
> >> If you enable trust-ad on a system that moves around, e.g. your laptop, you
> >> will experience failures, which is why unwind tests for this and falls back
> >> to non-trusting dhcp learned resolvers in such cases. In fact it also falls
> >> back to the stub resolver, i.e. our libc resolver. Which makes me think that
> >> trust-ad should not be used when you run unwind (because the whole point of
> >> the fallback is that dnssec does not work). But thats a documentation issue.
>
> One important thing to keep in mind: trust-ad is not so much about being
> able to do DNSSEC validation.  It's about trusting the validation done
> by the resolver(s).  So if the resolver can't do validation and ensures
> that AD flag is clear in the reply there is no breach of contract.
>
> > Thanks for pointing this out.  I would expect from unwind(8) that it
> > always clears the AD bit from its responses if it could not validate
> > them.  Inclusing when it falls back to dynamically learned resolvers or
> > the libc resolver.
>
> So I did a few tests and read some unwind(8) code, and it *appears* safe
> to use unwind(8) along with "options trust-ad".
>
> First you mention fallback to DHCP-learned resolvers.  Those you should
> probably not trust indeed, but it looks like unwind(8) attempts to use
> them to perform its own validation.  So the value of the AD flag in
> unwind(8)'s response should be trustworthy.  At least that's what I see
> when I test unwind with "preference { dhcp }".
>
> And then there's the asr fallback, tested with "preference { stub }".
> unwind(8) allocates its own asr context, ignoring the global
> /etc/resolv.conf and thus "options trust-ad".  IIUC it then hands off
> the answer to libunbound which cooks its own soup.  So it looks like
> unwind needs no special code to disable/ignore "options trust-ad".

you are right, i forgot about that.
 

> I have no idea how unwind/libunbound behaves when using forwarders, the
> "accept bogus" feature, (opportunistic) DoT or other stuff I'm probably
> overlooking.  But AFAICS using unwind + "options trust-ad" brings no
> problem.  cc'ing Florian, input welcome.
>
> I think the issue with trust-ad is the list of resolvers that end up in
> resolv.conf.  I would expect people who use a resolver on localhost to
> also have other resolvers listed in their /etc/resolv.conf, because of
> a conscious choice (bsd.rd) or just because it should be more reliable.
> It seems to me that the resolv.conf(5) manpage addition should cover
> that.  Or should I make it more explicit?

I think its a good start.

Fwiw i dont have other resolvers except localhost in resolv.conf, on laptops
(all running unwind) as well as my gateway (running unbound). (Server type
systems are a different thing of course). If the local resolver does not
work, i want to notice it right away.

>
> Index: share/man/man5/resolv.conf.5
> ===================================================================
> RCS file: /d/cvs/src/share/man/man5/resolv.conf.5,v
> retrieving revision 1.60
> diff -u -p -r1.60 resolv.conf.5
> --- share/man/man5/resolv.conf.5 25 Apr 2020 14:22:04 -0000 1.60
> +++ share/man/man5/resolv.conf.5 25 Jul 2020 02:08:17 -0000
> @@ -285,6 +285,15 @@ first as an absolute name before any sea
>  .It Cm tcp
>  Forces the use of TCP for queries.
>  Normal behaviour is to query via UDP but fall back to TCP on failure.
> +.It Cm trust-ad
> +Sets the AD flag in DNS queries, and preserve the value of the AD flag
> +in DNS replies (this flag is stripped by default).
> +Useful when applications want to ensure that the DNS resources they
> +look up have been signed with DNSSEC and validated by the
> +.Ic nameserver .
> +This option should be used only if the transport between the
> +.Ic nameserver
> +and the resolver is secure.
>  .El
>  .El
>  .Pp
>
> Lazy me would be tempted to just let asr trust 127.0.0.1 / ::1 by
> default and let others deal with the edge cases, but where is the fun in
> that... :)
>
> --
> jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE
>

Reply | Threaded
Open this post in threaded view
|

Re: ssh(1), getrrsetbyname(3), SSHFP and DNSSEC

Florian Obser-2
In reply to this post by Sebastian Benoit-3
On Wed, Jul 29, 2020 at 03:51:17PM +0200, Sebastian Benoit wrote:
> If i remember correctly, the fallout was caused by EDNS but i might be
> wrong. The unbound commit caused a developer some headscratching, because
> his upstream internet did not work with such packets, which led to immediate
> backout of the change, because a default config that does not work is not
> good.

It was time. Running DNSSEC validation on a system without an RTC is
not a good idea. NTP could not fix this because it depends on working
DNS. This has since been addressed by Otto.

The edns problem is well understood and has nothing to do with turning
DNSSEC validation on in unbound since unbound always sends an edns0
option. So if your network sucks so badly that you can't edns0 you
can't use unbound, period.

--
I'm not entirely sure you are real.

Reply | Threaded
Open this post in threaded view
|

Re: ssh(1), getrrsetbyname(3), SSHFP and DNSSEC

Theo de Raadt-2
Florian Obser <[hidden email]> wrote:

> On Wed, Jul 29, 2020 at 03:51:17PM +0200, Sebastian Benoit wrote:
> > If i remember correctly, the fallout was caused by EDNS but i might be
> > wrong. The unbound commit caused a developer some headscratching, because
> > his upstream internet did not work with such packets, which led to immediate
> > backout of the change, because a default config that does not work is not
> > good.
>
> It was time. Running DNSSEC validation on a system without an RTC is
> not a good idea. NTP could not fix this because it depends on working
> DNS. This has since been addressed by Otto.

Addressed as well as possible, but perfection is not achievable.

ntpd (which is run by default) tries aggressively to repair the clock at
boot.  Later on, it is not so aggressive at correcting the clock.

For real computers, this is of no concern, it will work fine.

But on machines without a battery RTC, if internet operation is weak at
boot, but becomes better later on, there is still a window of time
(before ntpd slowly corrects the clock) where time is incoherent and
weird stuff can happen.

So my recommendation is to buy real computers.  And garbage can misbehave
like garbage will.

Reply | Threaded
Open this post in threaded view
|

Re: ssh(1), getrrsetbyname(3), SSHFP and DNSSEC

Florian Obser-2
In reply to this post by Jeremie Courreges-Anglas-2
On Wed, Jul 29, 2020 at 03:51:55PM +0200, Jeremie Courreges-Anglas wrote:
> So I did a few tests and read some unwind(8) code, and it *appears* safe
> to use unwind(8) along with "options trust-ad".

Yes.

>
> First you mention fallback to DHCP-learned resolvers.  Those you should
> probably not trust indeed, but it looks like unwind(8) attempts to use
> them to perform its own validation.  So the value of the AD flag in
> unwind(8)'s response should be trustworthy.  At least that's what I see
> when I test unwind with "preference { dhcp }".
>

unwind always does its own validation or decides it can't do
validation. In both cases you can trust the AD flag.
The AD flag is set if and only if unwind did a successful validation.

You cannot trust the RCODE (and answer) when AD is *NOT* set. That
seems like a strange and obvious statement but it is the main
difference to validating unbound. When you enable validation in
unbound and someone messes with your packets you will get a SERVFAIL.

You can trick unwind into believing that DNSSEC validation is not
possible in your current network location. It will still resolve but
it will NOT set AD.

In the context of the current discussion this means that you can't
validate SSHFPs, but you can still try to connect to your ssh server
if you have the fingerprint in known_hosts. The later is still save
even if people play tricks with DNS in your current location.
 
> And then there's the asr fallback, tested with "preference { stub }".
> unwind(8) allocates its own asr context, ignoring the global
> /etc/resolv.conf and thus "options trust-ad".  IIUC it then hands off

Yes. But even if it did not it would not matter since it will clear
AD.

> the answer to libunbound which cooks its own soup.  So it looks like

No, the asr fallback is there for when your local network is truly
fucked - think last century equipment that knows DNS is udp/53 only and
max 512 bytes. While we already lower standards considerably in unwind
when we use libunbound compared to unbound the network still needs to
provide some things. Working edns0 is one of them, it's near
impossible to turn of in libunbound.

> unwind needs no special code to disable/ignore "options trust-ad".
>
> I have no idea how unwind/libunbound behaves when using forwarders, the
> "accept bogus" feature, (opportunistic) DoT or other stuff I'm probably

Nope, as stated above, you can trust AD.

> overlooking.  But AFAICS using unwind + "options trust-ad" brings no
> problem.  cc'ing Florian, input welcome.
>
> I think the issue with trust-ad is the list of resolvers that end up in
> resolv.conf.  I would expect people who use a resolver on localhost to
> also have other resolvers listed in their /etc/resolv.conf, because of
> a conscious choice (bsd.rd) or just because it should be more reliable.
> It seems to me that the resolv.conf(5) manpage addition should cover
> that.  Or should I make it more explicit?
>
> Index: share/man/man5/resolv.conf.5
> ===================================================================
> RCS file: /d/cvs/src/share/man/man5/resolv.conf.5,v
> retrieving revision 1.60
> diff -u -p -r1.60 resolv.conf.5
> --- share/man/man5/resolv.conf.5 25 Apr 2020 14:22:04 -0000 1.60
> +++ share/man/man5/resolv.conf.5 25 Jul 2020 02:08:17 -0000
> @@ -285,6 +285,15 @@ first as an absolute name before any sea
>  .It Cm tcp
>  Forces the use of TCP for queries.
>  Normal behaviour is to query via UDP but fall back to TCP on failure.
> +.It Cm trust-ad
> +Sets the AD flag in DNS queries, and preserve the value of the AD flag
> +in DNS replies (this flag is stripped by default).
> +Useful when applications want to ensure that the DNS resources they
> +look up have been signed with DNSSEC and validated by the
> +.Ic nameserver .
> +This option should be used only if the transport between the
> +.Ic nameserver
> +and the resolver is secure.

The last sentence is the problem in my opinion. trust-ad is a button
mere mortals cannot push.

The typical answer is: Oh, I use a vpn tunel and my nameserver is
behind the tunnel. Does the tunel terminate on the nameserver?
My enterprise vpn hands me 4 resolvers. They are in a different
subnet than the vpn concentrator.

>  .El
>  .El
>  .Pp
>
> Lazy me would be tempted to just let asr trust 127.0.0.1 / ::1 by

probably only save if you know that you are talking to certain
software that does it's own validation (unwind, unbound). Not save if
you are talking to say dnscrypt-proxy.

> default and let others deal with the edge cases, but where is the fun in
> that... :)
>
> --
> jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE
>

--
I'm not entirely sure you are real.

Reply | Threaded
Open this post in threaded view
|

Re: ssh(1), getrrsetbyname(3), SSHFP and DNSSEC

Jeremie Courreges-Anglas-2
On Wed, Jul 29 2020, Florian Obser <[hidden email]> wrote:

> On Wed, Jul 29, 2020 at 03:51:55PM +0200, Jeremie Courreges-Anglas wrote:
>> So I did a few tests and read some unwind(8) code, and it *appears* safe
>> to use unwind(8) along with "options trust-ad".
>
> Yes.
>
>>
>> First you mention fallback to DHCP-learned resolvers.  Those you should
>> probably not trust indeed, but it looks like unwind(8) attempts to use
>> them to perform its own validation.  So the value of the AD flag in
>> unwind(8)'s response should be trustworthy.  At least that's what I see
>> when I test unwind with "preference { dhcp }".
>>
>
> unwind always does its own validation or decides it can't do
> validation. In both cases you can trust the AD flag.
> The AD flag is set if and only if unwind did a successful validation.
>
> You cannot trust the RCODE (and answer) when AD is *NOT* set. That
> seems like a strange and obvious statement but it is the main
> difference to validating unbound. When you enable validation in
> unbound and someone messes with your packets you will get a SERVFAIL.
>
> You can trick unwind into believing that DNSSEC validation is not
> possible in your current network location. It will still resolve but
> it will NOT set AD.

Thanks for confirming.

> In the context of the current discussion this means that you can't
> validate SSHFPs, but you can still try to connect to your ssh server
> if you have the fingerprint in known_hosts. The later is still save
> even if people play tricks with DNS in your current location.

Yep.

>> And then there's the asr fallback, tested with "preference { stub }".
>> unwind(8) allocates its own asr context, ignoring the global
>> /etc/resolv.conf and thus "options trust-ad".  IIUC it then hands off
>
> Yes. But even if it did not it would not matter since it will clear
> AD.
>
>> the answer to libunbound which cooks its own soup.  So it looks like
>
> No, the asr fallback is there for when your local network is truly
> fucked - think last century equipment that knows DNS is udp/53 only and
> max 512 bytes. While we already lower standards considerably in unwind
> when we use libunbound compared to unbound the network still needs to
> provide some things. Working edns0 is one of them, it's near
> impossible to turn of in libunbound.
>
>> unwind needs no special code to disable/ignore "options trust-ad".
>>
>> I have no idea how unwind/libunbound behaves when using forwarders, the
>> "accept bogus" feature, (opportunistic) DoT or other stuff I'm probably
>
> Nope, as stated above, you can trust AD.
>
>> overlooking.  But AFAICS using unwind + "options trust-ad" brings no
>> problem.  cc'ing Florian, input welcome.
>>
>> I think the issue with trust-ad is the list of resolvers that end up in
>> resolv.conf.  I would expect people who use a resolver on localhost to
>> also have other resolvers listed in their /etc/resolv.conf, because of
>> a conscious choice (bsd.rd) or just because it should be more reliable.
>> It seems to me that the resolv.conf(5) manpage addition should cover
>> that.  Or should I make it more explicit?
>>
>> Index: share/man/man5/resolv.conf.5
>> ===================================================================
>> RCS file: /d/cvs/src/share/man/man5/resolv.conf.5,v
>> retrieving revision 1.60
>> diff -u -p -r1.60 resolv.conf.5
>> --- share/man/man5/resolv.conf.5 25 Apr 2020 14:22:04 -0000 1.60
>> +++ share/man/man5/resolv.conf.5 25 Jul 2020 02:08:17 -0000
>> @@ -285,6 +285,15 @@ first as an absolute name before any sea
>>  .It Cm tcp
>>  Forces the use of TCP for queries.
>>  Normal behaviour is to query via UDP but fall back to TCP on failure.
>> +.It Cm trust-ad
>> +Sets the AD flag in DNS queries, and preserve the value of the AD flag
>> +in DNS replies (this flag is stripped by default).
>> +Useful when applications want to ensure that the DNS resources they
>> +look up have been signed with DNSSEC and validated by the
>> +.Ic nameserver .
>> +This option should be used only if the transport between the
>> +.Ic nameserver
>> +and the resolver is secure.
>
> The last sentence is the problem in my opinion.

Note: that sentence was adapted from getrrsetbyname(3), stripping the
example list precisely because I didn't want to be specific.  The devil
is in the details.  I'm not even sure loopback communication be
considered secure.

> trust-ad is a button
> mere mortals cannot push.

Do mere mortals care about or rely on DNSSEC? ;)

> The typical answer is: Oh, I use a vpn tunel and my nameserver is
> behind the tunnel. Does the tunel terminate on the nameserver?
> My enterprise vpn hands me 4 resolvers. They are in a different
> subnet than the vpn concentrator.

Well one of the reasons I'm proposing a knob is that I doubt we can find
a one-size-fits-all solution.

--
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE

Reply | Threaded
Open this post in threaded view
|

Re: ssh(1), getrrsetbyname(3), SSHFP and DNSSEC

Peter J. Philipp-3
In reply to this post by Florian Obser-2
On Wed, Jul 29, 2020 at 05:42:16PM +0200, Florian Obser wrote:

> > First you mention fallback to DHCP-learned resolvers.  Those you should
> > probably not trust indeed, but it looks like unwind(8) attempts to use
> > them to perform its own validation.  So the value of the AD flag in
> > unwind(8)'s response should be trustworthy.  At least that's what I see
> > when I test unwind with "preference { dhcp }".
> >
>
> unwind always does its own validation or decides it can't do
> validation. In both cases you can trust the AD flag.
> The AD flag is set if and only if unwind did a successful validation.
>
> You cannot trust the RCODE (and answer) when AD is *NOT* set. That
> seems like a strange and obvious statement but it is the main
> difference to validating unbound. When you enable validation in
> unbound and someone messes with your packets you will get a SERVFAIL.

Hi all,

I just put unwind on my own forwarding dns server and you can see what it does
in the following logs (I like it!):

--- here I restarted unwind, it looks up the root zone ---
Jul 29 18:06:19 eta delphinusdnsd[56662]: request on descriptor 16 interface \
"cnmac1" from 192.168.177.2 (ttl=64, region=255, tta=1.291ms) for "." \
type=NS(2) class=1, edns0, dnssecok, answering "FORWARD" (28/83)                            
Jul 29 18:06:19 eta delphinusdnsd[56662]: request on descriptor 16 interface \
"cnmac1" from 192.168.177.2 (ttl=64, region=255, tta=1.486ms) for "." \
type=DNSKEY(48) class=1, edns0, dnssecok, answering "FORWARD" (28/83)

Jul 29 18:06:19 eta delphinusdnsd[56662]: request on descriptor 16 interface \
"cnmac1" from 192.168.177.2 (ttl=64, region=255, tta=1.286ms) for "." \
type=DNSKEY(48) class=1, edns0, dnssecok, answering "FORWARD" (28/83)

---- it now has the anchor keys to check that these dnskeys are right ----

---- at this point I did a ping of centroid.eu -----

Jul 29 18:07:00 eta delphinusdnsd[56662]: request on descriptor 16 interface \
"cnmac1" from 192.168.177.2 (ttl=64, region=255, tta=1.347ms) for \
"centroid.eu." type=A(1) class=1, edns0, dnssecok, answering "FORWARD" (40/83)

--- it then proceeds to check the eu. DS and DNSKEY to do DNSSEC validation ---

Jul 29 18:07:00 eta delphinusdnsd[56662]: request on descriptor 16 interface \
"cnmac1" from 192.168.177.2 (ttl=64, region=255, tta=1.358ms) for "eu." \
type=DS(43) class=1, edns0, dnssecok, answering "FORWARD" (31/83)

Jul 29 18:07:00 eta delphinusdnsd[56662]: request on descriptor 16 interface \
"cnmac1" from 192.168.177.2 (ttl=64, region=255, tta=1.289ms) for "eu." \
type=DNSKEY(48) class=1, edns0, dnssecok, answering "FORWARD" (31/83)

Jul 29 18:07:00 eta delphinusdnsd[56662]: request on descriptor 16 interface \
"cnmac1" from 192.168.177.2 (ttl=64, region=255, tta=1.292ms) for \
"centroid.eu." type=DS(43) class=1, edns0, dnssecok, answering "FORWARD" (40/83)

Jul 29 18:07:00 eta delphinusdnsd[56662]: request on descriptor 16 interface \
"cnmac1" from 192.168.177.2 (ttl=64, region=255, tta=1.346ms) for \
"centroid.eu." type=DNSKEY(48) class=1, edns0, dnssecok, answering "FORWARD" \
(40/83)

--- at this point the DNSSEC validation reverse walk is complete ----

Here is the unwind .conf file I used on my network:

forwarder 192.168.177.1
preference forwarder

And here is some attempt at explaining the delphinusdnsd log format:
https://delphinusdns.org/blog/c?article=1568367256

The log FORWARD is new, it just means that delphinusdnsd gives off the
question to its own forwarding engine that forwards the request.  My setup
is as follows (in terms of dns servers):

unwind[1]->delphinusdnsd forwarder[2]->delphinusdnsd forwarder[3]->unbound[4]

The path between [2] and [3] is a secure TSIG channel on another port than 53.
[2] has a cache, [3] does not.

I assume unwind also has a cache?

Unbound does give the AD bit back here, though that's not in my logs.  I know
this though because of dnssecok and edns0 bits on the question in delphinusdnsd it is passed forwards.  I know also that unwind would get the AD flag here,

It is just like Florian describes it.  unwind does validation despite the AD
flag.  This is awesome!

I think I'll keep this setup for a while it can't hurt.  Ultimately I'd love
to get rid of [3] and [4] in this setup and have [2] use the ISP's powerdns
nameservers.  But [2] sits at my gateway (a little octeon USG) and can't get
the nameservers from the pppoe0 without a patch.  Because maintaining patches
that don't come with OpenBSD get discarded and overrun eventually I can only
point to the patch I sent about a month ago to tech@ that brings Gerhard
Roth's patch up to the times (but I broke it in some way, so it would need to
be fixed up a little).  I would really like to see this functionality in
OpenBSD for modifying this little setup above, *pretty please?* :)

Thanks guys for the great work and interesting discussion!

-peter