This is for my own future reference.
I had a problem earlier tonight where the slave DNS server for my domains wouldn’t return anything but SERVFAIL, no matter which domain I queried. The server was up and running and the config files hadn’t changed, but it just wouldn’t give any useful answers.
The log (in /var/log/named/bind.log) only gave messages like:
general: warning: zone rickosborne.org/IN: expired
I tried using rndc on the slave to retransfer the zones, turned up the debug trace logging, and eventually even stopped and restarted BIND. No change.
On a whim, I restarted BIND on the master. This produced a more useful error:
security: error: client a.b.c.d#nnn: request has invalid signature: TSIG rickosborne: tsig verify failure (BADTIME)
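That BADTIME message comes up when the clocks on the master and slave get more than 5 minutes out of sync: TSIG-signed requests carry a timestamp, and the receiving side rejects anything outside the allowed window (300 seconds by default). The master server’s clock was off by ~15 minutes, so the zone transfers kept failing until the zones expired on the slave.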
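For next time, here is a minimal sketch of a clock-skew check, assuming the 5-minute (300 second) window mentioned above. The function names clock_skew and tsig_time_ok are my own helpers, not part of BIND, and how you fetch the other server's time is up to you.

```shell
#!/bin/sh
# Hypothetical helpers (not part of BIND): compare two Unix epoch
# timestamps and decide whether they fall within the TSIG time window.

# Print the absolute difference, in seconds, between two epoch times.
# $1 = first epoch time, $2 = second epoch time
clock_skew() {
    diff=$(( $1 - $2 ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( 0 - diff ))
    fi
    echo "$diff"
}

# Succeed (exit status 0) if the two clocks are within the window.
# $1 = local epoch time, $2 = remote epoch time
# $3 = allowed window in seconds (assumed default: 300)
tsig_time_ok() {
    fudge=${3:-300}
    [ "$(clock_skew "$1" "$2")" -le "$fudge" ]
}

# Example use, where "master" is a placeholder hostname:
#   tsig_time_ok "$(date +%s)" "$(ssh master date +%s)" \
#       || echo "clocks too far apart for TSIG"
```

A 15-minute drift like mine is 900 seconds, well past the window, so tsig_time_ok would have failed right away.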
The virtual server, running on Xen, ignored any attempts to set the clock. I tried plain old date and ntpdate, and both appeared to work, but were actually silently ignored. By default the guest takes its time from the Xen host, so the guest has to be told to keep its own clock. There’s a magical incantation to fix that:
echo 1 > /proc/sys/xen/independent_wallclock
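A slightly more defensive version of that one-liner, as a sketch: it checks that the control file is actually there and writable before touching it. The function name enable_independent_wallclock is made up for this note, and the optional path argument exists only so the function can be tried against an ordinary file.

```shell
#!/bin/sh
# Hypothetical wrapper around the one-liner above.
# $1 = control file (defaults to the Xen wallclock setting)
enable_independent_wallclock() {
    ctl=${1:-/proc/sys/xen/independent_wallclock}
    if [ ! -w "$ctl" ]; then
        # Missing or read-only: probably not a Xen guest, or not root.
        echo "cannot write $ctl (not a Xen guest, or not root?)" >&2
        return 1
    fi
    echo 1 > "$ctl"
}

# Example use (as root on the Xen guest), then set the clock again:
#   enable_independent_wallclock && ntpdate pool.ntp.org
```

Writing to /proc does not survive a reboot. As far as I know, the usual way to make it stick is a line reading xen.independent_wallclock = 1 in /etc/sysctl.conf, but I have not verified that on every distribution.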
So, yeah. That was a fun 2 hours of my life to track down and fix. I hope it helps someone else.