Bind SERVFAIL on slave zone

This is for my own future reference.

I had a problem earlier tonight where the slave DNS server for my domains wouldn’t return anything but SERVFAIL for any of them. The server was up and running and the config files hadn’t changed, but it just wouldn’t give any useful answers.

The log (in /var/log/named/bind.log) only gave messages like:

general: warning: zone expired

I tried using rndc on the slave to retransfer the zones, upped the debug trace logging, and even eventually stopped and restarted Bind. No change.

On a whim, I restarted the master Bind. This produced a more useful error:

security: error: client a.b.c.d#nnn: request has invalid signature: TSIG rickosborne: tsig verify failure (BADTIME)

That BADTIME message comes up when the clocks on the master and slave are more than 5 minutes apart (300 seconds is the usual TSIG fudge window). The master server’s clock was off by ~15 minutes.
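For next time, here is a rough sketch of the clock comparison, so I can check for skew before restarting everything in sight. The function name tsig_skew_too_large and the 300-second default are my own choices for illustration, not anything Bind ships; it only compares two epoch timestamps that you collect yourself from each box.

```shell
# Hypothetical helper: succeeds (exit 0) when two epoch timestamps differ
# by more than the allowed window. The 300-second default is an assumption
# matching the 5-minute limit described above.
tsig_skew_too_large() {
    skew_master=$1
    skew_slave=$2
    skew_fudge=${3:-300}

    # Absolute difference between the two clocks, in seconds.
    skew_diff=$(( skew_master - skew_slave ))
    if [ "$skew_diff" -lt 0 ]; then
        skew_diff=$(( -skew_diff ))
    fi

    [ "$skew_diff" -gt "$skew_fudge" ]
}

# Example usage (run `date +%s` on each server and paste in the numbers):
#   if tsig_skew_too_large 1700000900 1700000000; then
#       echo "clocks too far apart, expect BADTIME"
#   fi
```

A 15-minute difference like mine is 900 seconds, well past the window, so the check would have pointed at the clocks straight away.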

The virtual server, running on Xen, ignored any attempts to set the clock. I tried plain old date and ntpdate, and both appeared to work, but were actually silently ignored. There’s a magical incantation to fix that:

echo 1 > /proc/sys/xen/independent_wallclock
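If I ever script this, a slightly more careful version might look like the sketch below. The function name enable_independent_wallclock is made up for illustration, and the optional path argument exists only so it can be tried against a scratch file; the default path is the one from the command above. It needs root on the real thing.

```shell
# Hypothetical wrapper around the one-liner above: only writes the flag
# if the file actually exists, so it fails loudly on a box without it.
enable_independent_wallclock() {
    wallclock_knob=${1:-/proc/sys/xen/independent_wallclock}

    if [ ! -e "$wallclock_knob" ]; then
        echo "no $wallclock_knob here; nothing to change" >&2
        return 1
    fi

    # Same effect as the plain echo: let the guest keep its own clock.
    echo 1 > "$wallclock_knob"
}

# Example usage (as root), then set the clock again with date or ntpdate:
#   enable_independent_wallclock && date
```

Like anything under /proc/sys, the value is presumably gone after a reboot unless it is also set through sysctl configuration.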

So, yeah. That was a fun 2 hours of my life to track down and fix. I hope it helps someone else.

Published by Rick Osborne

I am a web geek who has been doing this sort of thing entirely too long. I rant, I muse, I whine. That is, I am not at all atypical for my breed.