Counting NTP Clients

Wherever you’re running an NTP server: It is really interesting to see how many clients are using it. Either at home, in your company or worldwide at the NTP Pool Project. The problem is that ntp itself does not give you this answer of how many clients it serves. There are the “monstats” and “mrulist” queries but they are not reliable at all since they are not made for this. Hence I had to take another path in order to count NTP clients for my stratum 1 NTP servers. Let’s dig in:

This article is one of many blogposts within this NTP series. Please have a look!

Not: monstats & mrulist

If you are running an NTP server with just a few clients (let’s say: up to 100), you can use the monstats and mrulist queries to get the number of clients and their addresses. However, you should NOT use those queries on NTP servers that are under high load. They will take minutes to finish and the results are not reliable at all.

I tried monitoring my NTP servers with the “maximum addresses” output from the monstats query, but the numbers weren’t exact at all. Even adjusting the “reclaim above count” and “reclaim older than” did not succeed in any way.

Alternative: UFW with grep ‘n sed et al.

This is my solution: I installed a simple firewall on the NTP server, the uncomplicated firewall UFW, which only logs every single NTP request into syslog. With some scripts I am grepping through the logs, sorting and uniqing ;) the source IP addresses, and finally, count them.

Since my monitoring system (MRTG w/ Routers2 and RRDtool) polls every 5 minutes, I am counting the unique NTP clients during the last 5 minutes. But since common NTP clients increase the query interval up to 1024 seconds = about 17 minutes, those clients will only be listed in the “last 5 minutes” graph every third time. Hence this graph has some ups and downs. Therefore I am counting the unique source addresses over the last 20 minutes as well to get an idea of how many NTP clients are constantly using my NTP server. Note that a one-time peak will be correctly shown in the 5min graph while it will be wrongly displayed for 15 more minutes in the 20min graph. This may happen when investigating DDos attacks or when using the NTP servers within the NTP Pool Project. In summary, I am quite happy with displaying both lines in one graph to have a view about the last 5 minutes as well as a quite realistic value of all clients (20 minutes).

Setting up UFW

Please note that I am operating my NTP servers with IPv6-only. Hence all subsequent commands are mostly IPv6 orientated. If you are still using legacy IP you need to adjust some of those steps.

Have a look at this and that documentation to get an idea. I installed the UFW on some Raspberry Pis as well as some generic Ubuntu Linux servers. Every time remotely via SSH though I would not recommend it that way. ;)

At first, you need to install UFW, add an allow rule for SSH to not cut off your branches, and verify that everything is working up to now:

Normally I would have added a “sudo ufw allow ntp/udp” or with logging such as “sudo ufw allow log-all ntp/udp” but in any case, the UFW made a log limit such as “-m limit –limit 3/min –limit-burst 10”. Therefore I added my NTP rules with logging (without limits) and a custom “log-prefix” text to the before6.rules. More information here. That is: sudo nano /etc/ufw/before6.rules and adding the following lines before the last COMMIT:

Followed by a sudo ufw reload. Using  sudo ufw show raw shows many lines. The chain “ufw6-before-input” now has two more lines at the end:

After a few seconds (depending on the load of your NTP server) the pkts and bytes are increasing:

And the /var/log/syslog shows log entries with the defined [UFW ALLOW NTP] prefix:

Perfect! ;)

Counting Source IP Addresses

With some small shell commands, you can extract the source IPv6 address only, sort it, list only unique addresses, and count it. You will get something like this:

After a RIPE Atlas measurement with 50 clients I got this:

Great!

Now I wanted to grep all addresses from the last 5 minutes since my default MRTG/Routers2 installation polls every 5 minutes. That is, I want to know the count of unique IPv6 source addresses during a period of 5 minutes. I found a great awk command that extracts the last 5 minutes by Alfred Tong which worked out of the box:

Combined with my grep sort uniq wc command it is this. Hence I’m getting the count of unique addresses during the last 5 minutes. Yeah. Since the default syslog file is rotated every day I am grepping through both logfiles, the current one and the one from yesterday with the “.1” extension. (Though this is only needed for correct stats for a few minutes a day, the script reads both files completely at every execution. But I don’t care. ;))

For getting the “clients during the last 20 minutes” it’s almost the same line, but with “-20 min” instead of the “-5 min” statement.

Monitoring via SNMP

Now that I have the logs and count of clients on the NTP server itself, I wanted to get this data into my monitoring system. I decided to use SNMP and its “EXTENDING THE AGENT” section. But before using SNMP, the user account of the snmpd must be able to read the syslog file. Therefore this user must be added to the “adm” group. Please note that on a newer Raspbian the snmpd was not run anymore by a user called “snmp” but “Debian-snmp”. Hence you must add this user to the adm group:

You can now proceed with the SNMP extensions:  sudo nano /etc/snmp/snmpd.conf and adding:

Followed by a: sudo service snmpd restart. From the SNMP server you can walk the OIDs to find the correct ones:

Note that under heavy load my script takes more than 2 seconds to run. This is no problem for the Pi but for MRTG which uses a default SNMP timeout of 2 seconds. Hence I increased the value to 11 seconds as well as the retries to 5. This value is at the end of the Target line after the second colon, while the last “2” declares SNMP version 2: ::11:5::2:

One more note: This approach will give correct results for IPv6 addresses since every single IPv6 node that sends a request is a unique IPv6 client. If you are using legacy IP the count of unique source IPv4 addresses might not be exactly the count of NTP clients since NAT might hide several NTP clients behind a single IPv4 address. Avoid using IPv4!

Now let’s have a look at the MRTG Target. Following are two different Targets as a reference, plus one summary graph to sum them up:

Uff. You’re done. ;) Congratulations!

Sample Graphs

Finally here are some graphs. This one shows normal days with internal NTP clients only (about 50-60). Note the filled pink area that shows the 5 min address count, while the purple line shows the 20 min address count which gives the sum of all current clients:

These graps list the clients during NTP Pool Project participation (10 – 50 k!!!). As always you have the daily/weekly/monthly/yearly graphs to either show overviews or more details:

Finally my summary graph over four different NTP servers, showing only the 20 min address counts, daily view:

Cheers!

Featured image “Wer im Alter nicht nur Peanuts zählen will, sollte sich bereits jetzt um ‘ne vernünftige Altersvorsorge kümmern – später bleibt keine Zeit mehr dafür. Und auch wenn das Thema so dröge wie langweilig erscheinen mag, ist es einfach wichtig, sich zu informie” by ppc1337 is licensed under CC BY-SA 2.0.

Leave a Reply

Your email address will not be published. Required fields are marked *