Monitoring a process for high memory consumption using Monit


I run Pi-hole on an old PogoPlug E02 with a custom-compiled dnsmasq (or pihole-FTL, as they now call their customised version of it). Lately I have noticed my DNS queries slowing down erratically. On further investigation, it looked like pihole-FTL's memory usage balloons until it consumes all of the 256 MB available and the machine starts swapping, which brings everything to a near standstill.

Enter Monit, a highly configurable process supervisor. This is how I set up monitoring for the errant pihole-FTL process: Monit checks whether the process consumes more than 100 MB of memory for three consecutive cycles and, if it does, restarts it. This has spared me the manual tinkering I used to do whenever there were complaints of the internet being slow.

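# Restart pihole-FTL if it (together with any child processes) uses more than
# 100 MB of memory for three consecutive polling cycles.
check process pihole-FTL with pidfile /run/pihole-FTL.pid
   start program = "/usr/sbin/service pihole-FTL start" with timeout 20 seconds
   stop program = "/usr/sbin/service pihole-FTL stop"
   if totalmem > 100.0 MB for 3 cycles then restart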
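
How long "3 cycles" lasts depends on Monit's polling interval, which is set globally in the control file. The fragment below is only a sketch: the 60-second value is an assumption for illustration, not what my setup necessarily uses.

```
# Assumed polling interval: Monit wakes up and runs its checks every 60 seconds.
set daemon 60
```

With that interval, pihole-FTL would have to be over 100 MB on three checks in a row, one per minute, before Monit restarts it. A shorter interval reacts faster but is more likely to restart the process on a brief spike.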

PS: Monit has handy commands for checking the status of the processes, files, directories, and so on that it monitors: monit summary gives a succinct overview, and monit status gives more verbose output. Note that you might need to enable Monit's HTTP interface for these to work, since the command-line client talks to the running daemon through it.

soumik@pi-hole:~# monit summary
 Monit 5.20.0 uptime: 32m
 ┌─────────────────────────────────┬────────────────────────────┬───────────────┐
 │ Service Name                    │ Status                     │ Type          │
 ├─────────────────────────────────┼────────────────────────────┼───────────────┤
 │ pi-hole                         │ Running                    │ System        │
 ├─────────────────────────────────┼────────────────────────────┼───────────────┤
 │ pihole-FTL                      │ Running                    │ Process       │
 └─────────────────────────────────┴────────────────────────────┴───────────────┘
soumik@pi-hole:~# monit status
 Monit 5.20.0 uptime: 32m
 Process 'pihole-FTL'
   status                       Running
   monitoring status            Monitored
   monitoring mode              active
   on reboot                    start
   pid                          6363
   parent pid                   1
   uid                          999
   effective uid                999
   gid                          999
   uptime                       22h 51m
   threads                      6
   children                     0
   cpu                          0.2%
   cpu total                    0.2%
   memory                       8.6% [20.7 MB]
   memory total                 8.6% [20.7 MB]
   data collected               Tue, 26 Feb 2019 18:40:28
 System 'pi-hole'
   status                       Running
   monitoring status            Monitored
   monitoring mode              active
   on reboot                    start
   load average                 [0.00] [0.00] [0.07]
   cpu                          0.4%us 0.3%sy 0.3%wa
   memory usage                 43.1 MB [17.8%]
   swap usage                   8.2 MB [1.6%]
   uptime                       1d 20h 37m
   boot time                    Sun, 24 Feb 2019 22:03:33
   data collected               Tue, 26 Feb 2019 18:40:28
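
For reference, the HTTP interface mentioned above is enabled with a set httpd statement in the control file. This is a minimal sketch and an assumption about a typical setup rather than my exact configuration: port 2812 is Monit's conventional port, and access is restricted to the local machine.

```
# Assumed values: conventional port 2812, reachable from this machine only.
set httpd port 2812 and
    use address localhost   # listen on the loopback interface only
    allow localhost         # accept connections from this machine only
```

Keeping it bound to localhost is enough for monit summary and monit status. Exposing it to the network would call for credentials as well.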