
@jjmckain
Last active March 5, 2020 11:59
Raspi-Sump InfluxDB and Grafana Integration

This Gist outlines a basic extension of the Raspi-Sump project. It takes the readings from the project's logfile and feeds them to the Telegraf agent (running locally on the Pi), which in turn feeds them to an InfluxDB time-series database, which is then graphed with Grafana. All of these tools are freely available; setting them up is beyond the scope of this Gist.

I wanted to quickly capture what I did for future reference, and to benefit others who may have been equally inspired by Al's project.

Capture Run Cycles - added March 4, 2020

I've enhanced the bash script to compare the previous and current water levels in an effort to detect run cycles. If the water level has dropped by at least 3 inches since the last reading, we write an additional entry to the Telegraf log, which gives us the ability to put some fancier data into Grafana. Yes, I'm making some presumptions here, but collecting data every 1m makes it a fairly safe bet. We don't know definitively that the pump ran; we're just presuming it did because the water level changed significantly.
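The comparison can be sketched in a few lines of bash. The sample depths below are hypothetical; the real script reads them from the day's CSV log and uses `bc` rather than `awk` for the decimal math:

```shell
# Hypothetical sample values; in the real script these come from the CSV log.
prev_depth=14.2   # water level one minute ago (inches)
curr_depth=10.9   # water level now (inches)

# Compute the drop and test it against the 3-inch threshold with awk,
# since bash arithmetic is integer-only.
drop=$(awk -v p="$prev_depth" -v c="$curr_depth" 'BEGIN { printf "%.2f", p - c }')
pump_ran=$(awk -v p="$prev_depth" -v c="$curr_depth" \
  'BEGIN { if (p - c > 3) print 1; else print 0 }')

echo "drop=${drop} pump_ran=${pump_ran}"
```

When `pump_ran` is 1, the script appends a `pump_cycle_end` row to the Telegraf logfile, which is what the "Run Cycles" panels count.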

{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": "-- Grafana --",
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}
]
},
"editable": true,
"gnetId": null,
"graphTooltip": 0,
"id": 3,
"links": [],
"panels": [
{
"datasource": "InfluxDB",
"gridPos": {
"h": 7,
"w": 6,
"x": 0,
"y": 0
},
"id": 4,
"options": {
"fieldOptions": {
"calcs": [
"last"
],
"defaults": {
"decimals": 2,
"mappings": [],
"max": 11.5,
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "#EAB839",
"value": 10.5
},
{
"color": "dark-red",
"value": 11
}
]
},
"title": ""
},
"overrides": [],
"values": false
},
"orientation": "auto",
"showThresholdLabels": false,
"showThresholdMarkers": true
},
"pluginVersion": "6.6.1",
"targets": [
{
"groupBy": [
{
"params": [
"1m"
],
"type": "time"
},
{
"params": [
"null"
],
"type": "fill"
}
],
"limit": "",
"measurement": "water_depth",
"orderByTime": "ASC",
"policy": "default",
"refId": "A",
"resultFormat": "time_series",
"select": [
[
{
"params": [
"water_depth"
],
"type": "field"
},
{
"params": [
"1"
],
"type": "top"
}
]
],
"tags": []
}
],
"timeFrom": null,
"timeShift": null,
"title": "Current Depth (in)",
"transparent": true,
"type": "gauge"
},
{
"cacheTimeout": null,
"datasource": "InfluxDB",
"description": "",
"gridPos": {
"h": 7,
"w": 6,
"x": 6,
"y": 0
},
"id": 6,
"interval": "1m",
"links": [],
"options": {
"colorMode": "value",
"fieldOptions": {
"calcs": [
"lastNotNull"
],
"defaults": {
"mappings": [
{
"id": 0,
"op": "=",
"text": "N/A",
"type": 1,
"value": "null"
}
],
"nullValueMode": "connected",
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
},
"unit": "dateTimeAsUS"
},
"overrides": [],
"values": false
},
"graphMode": "none",
"justifyMode": "auto",
"orientation": "horizontal"
},
"pluginVersion": "6.6.1",
"targets": [
{
"groupBy": [],
"measurement": "pump_cycle_end",
"orderByTime": "ASC",
"policy": "default",
"query": "SELECT \"timestamp\"*1000 FROM \"pump_cycle_end\" WHERE (\"host\" = 'raspisump') AND $timeFilter",
"rawQuery": true,
"refId": "A",
"resultFormat": "time_series",
"select": [
[
{
"params": [
"timestamp"
],
"type": "field"
}
]
],
"tags": [
{
"key": "host",
"operator": "=",
"value": "raspisump"
}
]
}
],
"timeFrom": null,
"timeShift": null,
"title": "Last Run",
"transparent": true,
"type": "stat"
},
{
"cacheTimeout": null,
"datasource": "InfluxDB",
"gridPos": {
"h": 7,
"w": 6,
"x": 12,
"y": 0
},
"id": 8,
"links": [],
"options": {
"colorMode": "value",
"fieldOptions": {
"calcs": [
"count"
],
"defaults": {
"mappings": [
{
"id": 0,
"op": "=",
"text": "0",
"type": 1,
"value": "null"
}
],
"nullValueMode": "connected",
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "#6ED0E0",
"value": 48
},
{
"color": "#EAB839",
"value": 72
},
{
"color": "red",
"value": 144
}
]
},
"unit": "none"
},
"overrides": [],
"values": false
},
"graphMode": "none",
"justifyMode": "auto",
"orientation": "horizontal"
},
"pluginVersion": "6.6.1",
"targets": [
{
"groupBy": [
{
"params": [
"$interval"
],
"type": "time"
}
],
"measurement": "pump_cycle_end",
"orderByTime": "ASC",
"policy": "default",
"query": "SELECT \"water_depth\" FROM \"pump_cycle_end\" WHERE (\"host\" = 'raspisump') AND $timeFilter",
"rawQuery": true,
"refId": "A",
"resultFormat": "time_series",
"select": [
[
{
"params": [
"water_depth"
],
"type": "field"
}
]
],
"tags": [
{
"key": "host",
"operator": "=",
"value": "raspisump"
}
]
}
],
"timeFrom": null,
"timeShift": null,
"title": "Run Cycles (12hrs)",
"transparent": true,
"type": "stat"
},
{
"cacheTimeout": null,
"datasource": "InfluxDB",
"gridPos": {
"h": 7,
"w": 6,
"x": 18,
"y": 0
},
"id": 9,
"links": [],
"options": {
"colorMode": "value",
"fieldOptions": {
"calcs": [
"count"
],
"defaults": {
"mappings": [
{
"id": 0,
"op": "=",
"text": "0",
"type": 1,
"value": "null"
}
],
"nullValueMode": "connected",
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
},
"unit": "none"
},
"overrides": [],
"values": false
},
"graphMode": "none",
"justifyMode": "auto",
"orientation": "horizontal"
},
"pluginVersion": "6.6.1",
"targets": [
{
"groupBy": [],
"measurement": "pump_cycle_end",
"orderByTime": "ASC",
"policy": "default",
"query": "SELECT \"water_depth\" FROM \"pump_cycle_end\" WHERE (\"host\" = 'raspisump')",
"rawQuery": true,
"refId": "A",
"resultFormat": "time_series",
"select": [
[
{
"params": [
"water_depth"
],
"type": "field"
}
]
],
"tags": [
{
"key": "host",
"operator": "=",
"value": "raspisump"
}
]
}
],
"timeFrom": null,
"timeShift": null,
"title": "Total Run Cycles",
"transparent": true,
"type": "stat"
},
{
"alert": {
"alertRuleTags": {},
"conditions": [
{
"evaluator": {
"params": [
11
],
"type": "gt"
},
"operator": {
"type": "and"
},
"query": {
"params": [
"A",
"5m",
"now"
]
},
"reducer": {
"params": [],
"type": "avg"
},
"type": "query"
}
],
"executionErrorState": "alerting",
"for": "3m",
"frequency": "1m",
"handler": 1,
"message": "Water depth anomaly or too high. Please check the pit!",
"name": "Water Level alert",
"noDataState": "no_data",
"notifications": []
},
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "InfluxDB",
"decimals": 2,
"description": "How much water is in the sump pit, measured in inches.",
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 17,
"w": 24,
"x": 0,
"y": 7
},
"hiddenSeries": false,
"id": 2,
"interval": "1m",
"legend": {
"alignAsTable": true,
"avg": true,
"current": true,
"hideEmpty": false,
"hideZero": false,
"max": true,
"min": true,
"rightSide": false,
"show": true,
"total": false,
"values": true
},
"lines": true,
"linewidth": 1,
"links": [
{
"targetBlank": true,
"title": "View the on-device graphs",
"url": "http://10.0.20.205/"
}
],
"nullPointMode": "null",
"options": {
"dataLinks": []
},
"percentage": false,
"pointradius": 2,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"groupBy": [
{
"params": [
"$__interval"
],
"type": "time"
}
],
"measurement": "water_depth",
"orderByTime": "ASC",
"policy": "default",
"query": "SELECT sum(\"water_depth\") FROM \"water_depth\" WHERE $timeFilter GROUP BY time($__interval)",
"rawQuery": false,
"refId": "A",
"resultFormat": "time_series",
"select": [
[
{
"params": [
"water_depth"
],
"type": "field"
},
{
"params": [],
"type": "median"
}
]
],
"tags": []
}
],
"thresholds": [
{
"colorMode": "critical",
"fill": true,
"line": true,
"op": "gt",
"value": 11
}
],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "Water Level",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"transparent": true,
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
}
],
"refresh": "1m",
"schemaVersion": 22,
"style": "dark",
"tags": [],
"templating": {
"list": []
},
"time": {
"from": "now-12h",
"to": "now"
},
"timepicker": {
"hidden": false,
"nowDelay": "",
"refresh_intervals": [
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
]
},
"timezone": "",
"title": "Raspi-Sump Water Level",
"uid": "nZ5FSklZk",
"version": 37
}
# This is a change to the project's crontab. As soon as rsump.py completes, take a
# brief snooze (purely for good measure, so the file can be written), then invoke
# the Telegraf logger script.
# m h dom mon dow command
*/1 * * * * sudo /usr/local/bin/rsump.py && sleep 2s && sudo /home/pi/raspi-sump/telegraf-logger.sh
# This file is an extension of the telegraf config on the Pi.
# save it into the /etc/telegraf/telegraf.d/ folder and restart the agent.
# This configuration allows Telegraf to monitor the logfile being written by our
# cron script. Telegraf detects newly appended lines and ships them off to InfluxDB.
# adjust paths, tags, etc. accordingly.
[[inputs.file]]
files = ["/home/pi/raspi-sump/telegraf-logs/logfile.csv"]
## Data format to consume.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "csv"
## Indicates how many rows to treat as a header. By default, the parser assumes
## there is no header and will parse the first row as data. If set to anything more
## than 1, column names will be concatenated with the name listed in the next header row.
## If `csv_column_names` is specified, the column names in header will be overridden.
csv_header_row_count = 0
## For assigning custom names to columns
## If this is specified, all columns should have a name
## Unnamed columns will be ignored by the parser.
## If `csv_header_row_count` is set to 0, this config must be used
csv_column_names = ["measurement","tag","timestamp","water_depth"]
## For assigning explicit data types to columns.
## Supported types: "int", "float", "bool", "string".
## Specify types in order by column (e.g. `["string", "int", "float"]`)
## If this is not specified, type conversion will be done on the types above.
csv_column_types = ["string","string","int","float"]
## Indicates the number of rows to skip before looking for header information.
csv_skip_rows = 0
## Indicates the number of columns to skip before looking for data to parse.
## These columns will be skipped in the header as well.
csv_skip_columns = 0
## The separator between csv fields
## By default, the parser assumes a comma (",")
csv_delimiter = ","
## The character reserved for marking a row as a comment row
## Commented rows are skipped and not parsed
csv_comment = ""
## If set to true, the parser will remove leading whitespace from fields
## By default, this is false
csv_trim_space = false
## Columns listed here will be added as tags. Any other columns
## will be added as fields.
csv_tag_columns = ["tag"]
## The column to extract the name of the metric from
csv_measurement_column = "measurement"
## The column to extract time information for the metric
## `csv_timestamp_format` must be specified if this is used
csv_timestamp_column = "timestamp"
## The format of time data extracted from `csv_timestamp_column`
## this must be specified if `csv_timestamp_column` is specified
#csv_timestamp_format = "2020-02-24T12:59:59-0500"
csv_timestamp_format = "unix"
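To make the column mapping concrete, here is a hypothetical row as the logger script writes it, split out the same way the CSV parser does using the names declared in `csv_column_names`:

```shell
# One hypothetical row appended by the logger script:
line="water_depth,raspi-sump,1583325660,10.25"

# Split on the comma, mirroring csv_column_names =
# ["measurement","tag","timestamp","water_depth"].
IFS=',' read -r measurement tag timestamp water_depth <<< "$line"

echo "measurement=${measurement} tag=${tag} ts=${timestamp} depth=${water_depth}"
```

Per the config above, Telegraf stores this as a point in the `water_depth` measurement with the tag `tag=raspi-sump`, the field `water_depth=10.25`, and the Unix epoch value as the point's timestamp (since `csv_timestamp_format = "unix"`).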
#!/bin/bash
####################
#
# Copies the last line in the active raspisump logfile over to a telegraf logfile,
# while inserting today's date as a prefix to the time (already present) along
# with adding some tags to make the data more meaningful in Grafana.
#
##################
cd /home/pi/raspi-sump
IFS=','
TELELOGS="./telegraf-logs"
# read the last two lines from the file
todayDt="$(date +'%Y%m%d')"
# get the previous entry from the file, to determine if the pump just ran
line="$(tail -n 2 ./csv/waterlevel-${todayDt}.csv | head -n1)"
read -ra prevRun <<< "$line"
echo "prevTime: ${prevRun[0]}"
echo "prevDepth: ${prevRun[1]}"
# split the line on the comma: <time>,<depth>
line="$(tail -n 1 ./csv/waterlevel-${todayDt}.csv)"
read -ra thisRun <<< "$line"
echo "time: ${thisRun[0]}"
echo "depth: ${thisRun[1]}"
# turn the time into an epoch timestamp, using today's date & concat time
timeStr="${todayDt}T${thisRun[0]}"
echo "TimeStr: $timeStr"
epochDt="$(date -d "${timeStr} 12 hours")"
epochDt="$(date --date="$epochDt" +%s)"
echo "${epochDt}"
# did more than 3 inches drain from the pit? If so, let's proclaim the pump ran.
depthChange=$(echo "scale=2; ${prevRun[1]} - ${thisRun[1]}" | bc)
boolInt=$(echo "scale=1; ${prevRun[1]} - ${thisRun[1]} > 3" | bc)
echo "did the pump run? prev=${prevRun[1]} now=${thisRun[1]} diff=${depthChange}"
if [[ $boolInt -eq 1 ]]; then
# append an entry we can use to report on run cycles
echo "It looks like the pump just ran!"
echo "pump_cycle_end,raspi-sump,${epochDt},${depthChange}" >> ${TELELOGS}/logfile.csv
fi
# once a day, rotate and compress the old logfile
if [[ ${thisRun[0]} =~ 00:00.* ]]; then
echo "It's midnight; time to rotate the Telegraf log."
gzip -S "-${todayDt}.gz" ${TELELOGS}/logfile.csv
fi
echo "Writing ${epochDt},${thisRun[1]} to Telegraf log."
echo "water_depth,raspi-sump,${epochDt},${thisRun[1]}" >> ${TELELOGS}/logfile.csv
exit 0
jjmckain commented Mar 1, 2020

Screenshot from 2020-03-01 08-41-33


alaudet commented Mar 2, 2020

Just reading more on this. I like the open source aspect of grafana. What is performance like on the pi, what version of Pi are you using?


jjmckain commented Mar 2, 2020

I have a Pi 2B; it's several years old. It's idle 100% of the time - running rsump.py every 1m is practically immeasurable. Telegraf -> InfluxDB -> Grafana is implemented as a distributed model. Only the Telegraf agent runs on the Pi (with the raspi-sump scripts). Telegraf looks for changes in particular places (one being my logfile), I believe every 15ms (as part of a larger system/resources scan), and pushes those changes to InfluxDB across the network. I run InfluxDB in a Docker container on my main computer, and Grafana similarly in its own Docker container. More on setting those up here and here.
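As a rough sketch, the off-Pi half of that distributed setup can be stood up from the official Docker images (container names, volumes, and the InfluxDB 1.x version tag here are illustrative choices, not what the author necessarily ran):

```shell
# InfluxDB 1.x listens on 8086 by default.
docker run -d --name influxdb -p 8086:8086 \
  -v influxdb-data:/var/lib/influxdb influxdb:1.7

# Grafana's web UI is on 3000 by default.
docker run -d --name grafana -p 3000:3000 \
  -v grafana-data:/var/lib/grafana grafana/grafana

# Telegraf on the Pi then points its [[outputs.influxdb]] urls setting
# at http://<docker-host>:8086.
```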

This time-series trio and its architecture are quite lean and powerful. You should read up on them and all the diverse hooks they offer - you can monitor anything from AWS CloudFormation stacks to database queries to disk I/O. Using it for Raspi-Sump was me applying work experience to home hobbies.
There are larger, more complex monitoring features I want to incorporate in phase 2; Grafana and InfluxDB will play bigger roles then. Grafana also has built-in alerting abilities; see that red line in my sample graph! Eventually I want to stream the data out of InfluxDB and do some machine learning on it (cycle times, predictive failures, possible clogs, etc.), but for now at least I'm capturing the data.

Screenshot from 2020-03-02 18-34-08

@kiloennyankey

Is it possible to replace the Matplotlib plotting on the web UI with this graph?


jjmckain commented Mar 3, 2020

I suppose you could omit Matplotlib, since it's optional, and do this instead. They aren't interchangeable offerings, though.


alaudet commented Mar 3, 2020

Correct, there is no reason at all to use the bundled Matplotlib graphing I have included with Raspi-Sump. You can use any approach you like to graph your readings.
There are many other ways; I just wanted to use a pure Python approach as part of the application and not rely on external services where the terms of service can change.

EDIT:

It is the same reason I don't use services like Twilio for the notifications when there is a python library like smtplib that can handle it. :-)


kiloennyankey commented Mar 4, 2020 via email


jjmckain commented Mar 5, 2020

Per latest revision (#3), here's how the enhanced dashboard looks after capturing run cycles.

Capture20200304


kiloennyankey commented Mar 5, 2020 via email


jjmckain commented Mar 5, 2020

Grafana is a web UI for rendering the data pulled from InfluxDB. This is a screenshot, not a static image (like Matplotlib output) - through my browser there are interactive controls (change the time range, refresh rate, and labels; rearrange/resize each panel; set warning levels and alerts; etc.). Grafana will actually email me when any given panel reaches a critical level.
In this Gist I've attempted to give you the pieces to copy/paste. There are links in the readme for setting up Telegraf/InfluxDB/Grafana in a matter of minutes. Tuning always takes time, but if you're up for an experiment, it's a rewarding one!


kiloennyankey commented Mar 5, 2020 via email
