Infiniroot Blog: We sometimes write, too.

Of course we cannot always share details about our work with customers, but nevertheless it is nice to show our technical achievements and share some of our implemented solutions.

How to monitor stale zones/domains on PowerDNS secondary (slave) servers

Published on March 11th 2022


When running PowerDNS in a Primary-Secondary (old terminology: Master-Slave) setup, the DNS replication happens over the DNS protocol.

Hint: Check out this article to read about the differences between Native and DNS replication in PowerDNS.

This type of replication works fine for zone changes and for newly created domains on the Primary server; new domains are automatically created on the Secondary server(s) when autosecondary=yes is configured.

Note: The "autosecondary" parameter was named "superslave" prior to PowerDNS 4.5.x. 

There is one downside to this replication type, though: Zones (domains) deleted on the Primary are not deleted on the Secondary server(s). The Secondary server(s) still have the deleted domains in their database and keep serving these domains (which should be gone) as authoritative servers.

This is what we call a "stale domain": The domain was deleted on the Primary but the Secondary server(s) don't know about it.

Quick one-liner check in Bash

To check for stale domains in the local database on a Secondary server, the following Bash one-liner is helpful:

root@secondary:~# mysql -Bse "select name from powerdns.domains" | while read -r domain; do answer=$(dig -t SOA "$domain" @ns1.example.com +short); if [[ -z "$answer" ]]; then echo "Domain $domain not found on ns1.example.com"; fi; done
Domain example1.com not found on ns1.example.com
Domain example2.com not found on ns1.example.com
Domain example3.com not found on ns1.example.com
Domain example4.com not found on ns1.example.com

The one-liner above loops over all the zones found in the table domains of the local PowerDNS database (here: powerdns). For each domain, the SOA record is looked up directly on the Primary server (here: ns1.example.com). If the Primary returns no SOA record, the zone is considered stale.
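The same logic can be sketched in Python. This is a minimal illustration, not the plugin's actual code: the function names soa_on_primary and find_stale are made up for this example, and the lookup assumes the dnspython module is installed and that the Primary is addressed by IP.

```python
def soa_on_primary(domain, primary_ip, timeout=5.0):
    """Ask the Primary directly for the SOA record, similar to `dig -t SOA ... +short`."""
    # dnspython is imported here so find_stale() below works with any lookup function.
    import dns.message
    import dns.query
    import dns.rdatatype

    query = dns.message.make_query(domain, dns.rdatatype.SOA)
    response = dns.query.udp(query, primary_ip, timeout=timeout)
    # An empty answer section corresponds to empty `dig +short` output.
    return bool(response.answer)


def find_stale(domains, lookup):
    """Return the domains for which lookup(domain) finds no SOA record."""
    return [domain for domain in domains if not lookup(domain)]
```

With the IP address of the Primary at hand, a call such as find_stale(zones, lambda d: soa_on_primary(d, primary_ip)) would return the list of stale zones. Passing the lookup in as a function keeps the comparison itself easy to try out without a DNS server.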

This solution is fine for a manual check from time to time, but obviously you want to integrate this into monitoring.

Presenting check_pdns_stale monitoring plugin

That is why a new open source monitoring plugin was created: check_pdns_stale is a Python script which does the same as the above Bash one-liner, but also supports arguments and adds performance data.

The purpose of this monitoring plugin is self-explanatory from its name: Run against the MySQL database of a PowerDNS Secondary server, it identifies stale zones/domains and warns if any are found.

MySQL privileges

To be able to read the domains from the database, the MySQL user used for this plugin requires SELECT privileges on the "domains" table. Assuming your PowerDNS database is called "powerdns", you can create a monitoring user with SELECT privileges with the following command:

mysql> GRANT SELECT ON powerdns.domains TO 'monitoring'@'localhost' IDENTIFIED BY 'secret';

Note: If you're using MySQL >= 8.0, check out this article on how to create a user and grant privileges in MySQL 8.x.
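To illustrate why SELECT on the "domains" table is enough, here is a small sketch of the read side. It is an assumption of how such a query could look, not the plugin's source: fetch_zone_names and connect_mysql are hypothetical helper names, and the only column used is "name".

```python
def fetch_zone_names(connection):
    """Return all zone names from the PowerDNS `domains` table."""
    cursor = connection.cursor()
    try:
        # Only a read on `domains` is needed, hence the SELECT-only grant.
        cursor.execute("SELECT name FROM domains ORDER BY name")
        return [row[0] for row in cursor.fetchall()]
    finally:
        cursor.close()


def connect_mysql(host, user, password, database="powerdns"):
    """Open a MySQL connection with the monitoring user (needs mysql.connector)."""
    import mysql.connector  # imported lazily; only needed for a real database

    return mysql.connector.connect(
        host=host, user=user, password=password, database=database
    )
```

On a Secondary this would be used as fetch_zone_names(connect_mysql("localhost", "monitoring", "secret")). Because the query function only relies on the generic Python database API, it can be tried against any compatible connection.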

Python3 requirements

The plugin is written in Python3 and requires, obviously, Python3 itself and the following modules: mysql.connector and dnspython.

These modules can be installed using the pip3 command:

$ sudo pip3 install mysql.connector dnspython

Or by using prepared packages from the Operating System. Here on Ubuntu Linux:

$ sudo apt-get install python3-mysql.connector python3-dnspython

Usage

Here's an example where the plugin is run on a PowerDNS Secondary server itself, connecting to the local MySQL database:

ck@secondary:~$ ./check_pdns_stale.py -H localhost -u monitoring -p secret -d powerdns -P ns1.example.com
PDNS SECONDARY WARNING: 4 stale zones: ['example1.com', 'example2.com', 'example3.com', 'example4.com'] |total_zones=534;;;; stale_zones=4;;;;
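The output follows the usual monitoring plugin conventions: a status text, then a pipe, then the performance data, with the exit code telling the monitoring software the state. A minimal sketch of how such a line could be assembled is shown here. The function name build_result and the wording of the OK message are assumptions; only the WARNING line is taken from the example above.

```python
EXIT_OK = 0       # standard plugin exit code for OK
EXIT_WARNING = 1  # standard plugin exit code for WARNING


def build_result(total_zones, stale_zones):
    """Return (exit_code, output_line) in Nagios plugin format."""
    stale_zones = list(stale_zones)
    # Performance data: label=value;warn;crit;min;max (thresholds left empty).
    perfdata = f"total_zones={total_zones};;;; stale_zones={len(stale_zones)};;;;"
    if stale_zones:
        text = f"PDNS SECONDARY WARNING: {len(stale_zones)} stale zones: {stale_zones}"
        return EXIT_WARNING, f"{text} |{perfdata}"
    # Assumption: the real plugin's OK wording may differ.
    return EXIT_OK, f"PDNS SECONDARY OK: no stale zones found |{perfdata}"
```

A plugin would print the returned line and exit with the returned code, which is what makes the two values total_zones and stale_zones show up as graphs in the monitoring software.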

Integration into monitoring

Of course there are many ways to integrate this new monitoring plugin into your monitoring software, but here are a few ideas.

Using NRPE

By using NRPE, the plugin is executed on the PowerDNS Secondary server, accessing (typically) the MySQL database on localhost. 

Note: Depending on the number of domains hosted on the Secondary, the check might take some time. Increase the NRPE timeout if you run into the NRPE check timeout (default 10s).

The following NRPE command definition can be used:

command[check_pdns_stale]=/usr/lib/nagios/plugins/check_pdns_stale.py -H $ARG1$ -u $ARG2$ -p $ARG3$ -d $ARG4$ -P $ARG5$

Of course all the parameters could be hard-coded, too.

On the Nagios (or config compatible) monitoring server, a service could look like this:

# Check PowerDNS Secondary Stale Domains
define service{
  use generic-service
  host_name ns2.example.com
  service_description PowerDNS Secondary Stale Domains
  check_command check_nrpe!check_pdns_stale!-a "localhost" "monitoring" "secret" "powerdns" "ns1.example.com"
}

On Icinga 2, a Service object could look like this:

# check pdns stale via nrpe
object Service "PowerDNS Secondary Stale Domains" {
  import "generic-service"
  host_name = "ns2.example.com"
  check_command = "nrpe"
  vars.nrpe_command = "check_pdns_stale"
  vars.nrpe_timeout = "30"
  vars.nrpe_arguments = [ "localhost", "monitoring", "secret", "powerdns", "ns1.example.com" ]
}

Using CheckCommand in Icinga 2

Of course the check can also be executed on the monitoring server itself, connecting remotely to the MySQL database of the PowerDNS Secondary server (requires that the MySQL database is also listening on the public interface). Or you could use the Icinga 2 agent and directly execute the commands on the Secondary server.

Here's an example of an Icinga 2 CheckCommand:

object CheckCommand "check_pdns_stale" {
  import "plugin-check-command"

  command = [ PluginContribDir + "/check_pdns_stale.py" ]

  arguments = {
    "-H" = {
      value = "$pdns_stale_mysql_host$"
      description = "MySQL host of the PowerDNS Secondary"
    }
    "-u" = {
      value = "$pdns_stale_mysql_user$"
      description = "MySQL user to connect as"
    }
    "-p" = {
      value = "$pdns_stale_mysql_password$"
      description = "MySQL password for the given user"
    }
    "-d" = {
      value = "$pdns_stale_mysql_database$"
      description = "MySQL database name (default: powerdns)"
    }
    "-P" = {
      value = "$pdns_stale_primary$"
      description = "IP or hostname of the PowerDNS Primary server"
    }
    "--debug" = {
      set_if = "$pdns_stale_debug$"
      description = "Run plugin in debug mode"
    }
  }

  vars.pdns_stale_mysql_host = "$address$"
  vars.pdns_stale_mysql_database = "powerdns"
  vars.pdns_stale_debug = false
}

And a relevant Service check object:

# Check PowerDNS Secondary Stale Domains
object Service "PowerDNS Secondary Stale Domains" {
  import "generic-service"
  host_name = "ns2.example.com"
  check_command = "check_pdns_stale"
  vars.pdns_stale_mysql_user = "monitoring"
  vars.pdns_stale_mysql_password = "secret"
  vars.pdns_stale_primary = "ns1.example.com"
}
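The custom variables used in these objects map one-to-one onto the plugin's command line flags (-H, -u, -p, -d, -P and --debug). As a rough sketch of that mapping on the Python side, the flags could be parsed as follows. The attribute names and which flags are mandatory are assumptions made for this example; only the flags themselves and the database default "powerdns" come from the definitions above.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line flags used throughout this article."""
    parser = argparse.ArgumentParser(description="Check for stale zones on a PowerDNS Secondary")
    parser.add_argument("-H", dest="host", required=True, help="MySQL host of the PowerDNS Secondary")
    parser.add_argument("-u", dest="user", required=True, help="MySQL user to connect as")
    parser.add_argument("-p", dest="password", required=True, help="MySQL password for the given user")
    parser.add_argument("-d", dest="database", default="powerdns", help="MySQL database name")
    parser.add_argument("-P", dest="primary", required=True, help="IP or hostname of the PowerDNS Primary")
    parser.add_argument("--debug", action="store_true", help="Run plugin in debug mode")
    return parser.parse_args(argv)
```

Called with the arguments from the Usage example but without -d, the database falls back to "powerdns", which is the same default the CheckCommand sets through vars.pdns_stale_mysql_database.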

Screenshots

What would this article be without a screenshot of the implemented solution? Here we go, but with obfuscated information:

Monitoring of stale domains on PowerDNS Secondary server, integrated into Icinga 2