two issues with the new ASR code in libc

Matthieu Herrb
Hi,

I recently upgraded two servers at work from OpenBSD 5.2 to 5.4 and
got problems on both of the machines that are related to the new
asynchronous resolver in libc.

1. This machine is the SMTP server for a subset of our domains, running
sendmail. It receives mail from the outside (and runs spamd) and
delivers the messages to internal mailbox servers, according to the
alias map, which has entries of the form

 joe: joe@servera

This setup has worked perfectly for years under various OSs and also
worked under 5.2. Under 5.4 I get errors like this one from sendmail:

Jan 13 13:00:02 smtp sm-mta[12261]: s0DC02FV014174: to=joe@servera,
delay=00:00:00, xdelay=00:00:00, mailer=esmtp, pri=32928, relay=servera,
dsn=5.1.2, stat=Host unknown (Name server: servera: host not found)

And looking at DNS requests I see that sendmail is only emitting MX
queries for 'servera' and 'servera.mydomain.fr' but no AAAA or A
queries.

2. The second issue is with the dhcpd server. We assign static IPv4
addresses to about 2000 machines based on their MAC addresses. For
historical reasons the dhcpd.conf file has host names, not IP
addresses, so dhcpd has to resolve all names before starting. Again,
this worked in 5.2 (dhcpd startup is a bit slow, but that's ok). Now
in 5.4 it only starts about once in every 2 or 3 attempts; the other
attempts fail because it randomly can't resolve one of the names.

Jan 13 15:39:35 dhcp dhcpd[9242]: /etc/dhcpd.conf line 4553: fr (265): could not resolve hostname
Jan 13 15:39:35 dhcp dhcpd[9242]:     fixed-address foo.mydomain.fr;
Jan 13 15:51:19 nairobi dhcpd[15375]:                             ^
Jan 13 15:51:26 nairobi dhcpd[15375]: Configuration file errors encountered
Jan 13 15:51:26 nairobi dhcpd[15375]: exiting.

I guess a timeout in the new ASR code is shorter than in the old code
or something, causing those random failures.
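
To check this outside dhcpd, the same kind of bulk lookup can be run
from a small script. The following is only a minimal sketch: the
function name failed_lookups and the one-name-per-line input file are
made up for illustration, and it assumes that going through the libc
getaddrinfo() path (which Python's socket.getaddrinfo() wraps) is
enough to trigger the failure. It does not replicate what dhcpd does
internally.

```python
import socket
import sys


def failed_lookups(names, resolve=socket.getaddrinfo):
    """Return (name, error) pairs for every name that fails to resolve.

    `resolve` is injectable so the loop can be exercised without DNS.
    """
    failures = []
    for name in names:
        try:
            # IPv4 only, since the dhcpd.conf entries map to IPv4 addresses.
            resolve(name, None, socket.AF_INET)
        except socket.gaierror as exc:
            failures.append((name, str(exc)))
    return failures


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python3 lookups.py names.txt
    # where names.txt holds one host name per line.
    with open(sys.argv[1]) as fh:
        names = [line.strip() for line in fh if line.strip()]
    for name, error in failed_lookups(names):
        print(f"{name}: {error}")
```

Running it several times in a row over the host names from dhcpd.conf
should show whether the failing name changes from run to run, which
would be consistent with a timeout rather than a bad record.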


I know I can "fix" both issues by changing the way things are handled,
but in both cases what I do is legal (even if not optimal) and I'd
rather see those issues fixed in OpenBSD.

I'm willing to test patches...
--
Matthieu Herrb

Re: two issues with the new ASR code in libc

Matthieu Herrb
On Mon, Jan 13, 2014 at 05:03:53PM +0100, Matthieu Herrb wrote:

> Hi,
>
> I recently upgraded two servers at work from OpenBSD 5.2 to 5.4 and
> got problems on both of the machines that are related to the new
> asynchronous resolver in libc.
>
> 1. This machine is the SMTP server for a subset of our domains, running
> sendmail. It receives mail from the outside (and runs spamd) and
> delivers the messages to internal mailbox servers, according to the
> alias map, which has entries of the form
>
>  joe: joe@servera
>
> This setup has worked perfectly for years under various OSs and also
> worked under 5.2. Under 5.4 I get errors like this one from sendmail:
>
> Jan 13 13:00:02 smtp sm-mta[12261]: s0DC02FV014174: to=joe@servera,
> delay=00:00:00, xdelay=00:00:00, mailer=esmtp, pri=32928, relay=servera,
> dsn=5.1.2, stat=Host unknown (Name server: servera: host not found)
>
> And looking at DNS requests I see that sendmail is only emitting MX
> queries for 'servera' and 'servera.mydomain.fr' but no AAAA or A
> queries.

One data point: reverting libc to the 5.3 resolver code makes my setup
work again. So this confirms that the new ASR code is responsible for
the behaviour change.
OTOH, tracing what's happening in sendmail is a pain, so I have not yet
been able to explain why it behaves this way.

--
Matthieu Herrb