@snarlysodboxer
Last active July 6, 2018 19:39
dead man switch, and other options
package main

import (
	"flag"
	"fmt"
	"os"
	"syscall"
	"time"
)

var (
	filePath = flag.String("file_path", "/root/file.txt", "The path to the file whose mtime to check")
	ttl      = flag.Int("ttl", 60, "Seconds ago within which the file must have been modified. Older than this will report failure.")
)

func main() {
	flag.Parse()

	stats := &syscall.Stat_t{}
	if err := syscall.Stat(*filePath, stats); err != nil {
		fmt.Println(err)
		os.Exit(1)
	}

	// Fail when the file's mtime is older than the TTL.
	// Note: the Mtim field of syscall.Stat_t is Linux-specific.
	currentTime := time.Now().Unix() // already an int64
	if currentTime-stats.Mtim.Sec > int64(*ttl) {
		fmt.Println("FAILURE")
		os.Exit(1)
	} else {
		fmt.Println("OK")
	}
}

monitoring cronjobs

Thoughts and approaches when monitoring cronjobs

In my experience there's often a better way to do things than with a cronjob; for some use cases, however, it's the right tool for the job.

  • If it's possible to modify the job itself to push a metric to a metrics system, this often reduces setup, coupling, and moving parts.
    • The metrics system can then alert on a lack of recent data points.
    • I've successfully used this method to monitor database backups, both with Prometheus's Pushgateway and with InfluxDB plus Grafana.
    • This also enables sending along other data points, such as duration information for the job so it can be graphed.
    • Care must be taken not to allow failed metrics code to cause the job to fail.
  • Where it's not reasonable to modify the job, here are a couple of approaches to consider:
    • Replacing the cron entry with a wrapper script that records duration and sends the metric.
      • Signal handling should be implemented and passed through to the child process (the job).
    • Creating a custom metrics exporter that checks on the results of the actions taken by the job.
      • E.g. checking S3 for recent files in the backups directory.
      • Different metrics systems would require different paradigms.
      • With Prometheus, it could be a daemon exposing a /metrics endpoint which, when hit, reaches out to S3 and formats the returned data for Prometheus to consume. Prometheus could then alert on both missing backups and an unreachable metrics exporter.