@taikedz
Last active November 18, 2021 15:40

Plot times examples

As a demonstration of network behaviour, I repeatedly cloned a Git repository, recording timestamps produced by date '+%F %T' in columns one and two (start and end times) of a CSV file.

This Python script processes that CSV to produce a plot of the durations:

  • timedeltas calculated using datetime
  • plotting using matplotlib.pyplot
  • adding labels at spaced out intervals, and rotated
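
The first and third steps can be sketched in isolation as follows. This is only an illustration: the helper names `duration_minutes` and `spaced_ticks` and the sample timestamps are hypothetical, not taken from the script.

```python
from datetime import datetime

# Matches the output of `date '+%F %T'`
DATEFORMAT = "%Y-%m-%d %H:%M:%S"


def duration_minutes(start, end):
    """Return the elapsed time between two timestamps, in decimal minutes."""
    delta = datetime.strptime(end, DATEFORMAT) - datetime.strptime(start, DATEFORMAT)
    return delta.total_seconds() / 60.0


def spaced_ticks(count, spacing):
    """Return the x positions that receive a label, one every `spacing` points."""
    return list(range(0, count, spacing))


# Hypothetical sample values
print(duration_minutes("2021-11-18 15:00:00", "2021-11-18 15:04:30"))  # 4.5
print(spaced_ticks(25, 10))  # [0, 10, 20]
```

Labelling only every Nth point keeps the rotated timestamps from overlapping on the X-axis.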

Shell scripts

The clone test is looping_clones.sh, which clones a "large" repo, waits 10 minutes, and repeats. It records start times, end times, and sizes in human-readable form as "#" comment lines, and writes one non-commented line per clone to serve as CSV data.

extract_clonetest_csv.sh isolates that data into a proper CSV file and produces a Base64-encoded tarball of it. This made it easy to copy the file out to another machine via the clipboard (needs must...).
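
For illustration, the filtering step can be approximated in Python. This is a rough sketch, not the script itself: the function name `extract_csv_lines` and the sample data are hypothetical, and unlike `grep -vE '^#'` it also drops empty lines.

```python
import re


def extract_csv_lines(text, reponame="git-clonetest"):
    """Keep non-comment lines and drop the trailing repo name that `du -s` appends."""
    suffix = re.compile(r"\s+" + re.escape(reponame) + r"$")
    rows = []
    for line in text.splitlines():
        # Comment lines start with '#'; empty lines carry no data either
        if not line or line.startswith("#"):
            continue
        rows.append(suffix.sub("", line))
    return rows


# Hypothetical excerpt in the format written by the clone loop
sample = (
    "# START -- 2021-11-18 15:00:00\n"
    "# END -- 2021-11-18 15:04:30\n"
    "2021-11-18 15:00:00,2021-11-18 15:04:30,123456\tgit-clonetest\n"
    "# ====\n"
)
print(extract_csv_lines(sample))
# ['2021-11-18 15:00:00,2021-11-18 15:04:30,123456']
```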

The tarball is unpacked into the current directory by running base64 -di | tar xz, pasting the data, and pressing CTRL + D to end the input. This only works in Unix shell environments such as bash, sh, or ksh, and in Git Bash on Windows.
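
Where no such shell is available, the same unpacking can be sketched with Python's standard library. This is an assumption-laden alternative rather than part of the original workflow: `unpack_base64_tarball` and `make_base64_tarball` are hypothetical helpers, and the round trip below uses made-up file content. As with any tarball, only extract data you trust.

```python
import base64
import io
import os
import tarfile
import tempfile


def unpack_base64_tarball(encoded, dest="."):
    """Decode Base64 text and extract the gzipped tarball it contains into `dest`."""
    # By default b64decode discards characters outside the Base64 alphabet
    # (such as newlines), similar in spirit to `base64 -di`.
    raw = base64.b64decode(encoded)
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r:gz") as archive:
        archive.extractall(dest)
        return archive.getnames()


def make_base64_tarball(name, content):
    """Build a gzipped tarball holding one file and return it as Base64 text."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as archive:
        info = tarfile.TarInfo(name)
        info.size = len(content)
        archive.addfile(info, io.BytesIO(content))
    return base64.encodebytes(buf.getvalue()).decode("ascii")


# Round trip with hypothetical content, extracted into a temporary directory
with tempfile.TemporaryDirectory() as workdir:
    pasted = make_base64_tarball("extracted-clone-data.csv", b"start,end,size\n")
    names = unpack_base64_tarball(pasted, workdir)
    print(names, os.path.exists(os.path.join(workdir, names[0])))
```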

#!/usr/bin/env python3
import sys
from datetime import datetime

from matplotlib import pyplot

# Timestamp as printed by `date '+%F %T'`
DATEFORMAT = '%Y-%m-%d %H:%M:%S'

# Produce a label every 10 instances
LABEL_SPACING = 10


def main():
    # Each argument is a CSV file; one plot is produced per file
    for pf in sys.argv[1:]:
        durations, labels, ticks = extract_points(pf)
        plot_points(pf, durations, labels, ticks)


def get_date(timestamp):
    return datetime.strptime(timestamp, DATEFORMAT)


def extract_points(csv_file):
    durations = []
    labels = []
    # Tracking data point count, for label spacing
    count = 0

    print("Plotting "+csv_file)

    with open(csv_file) as fh:
        for line in fh:
            line = line.strip()
            if not line:
                # Skip blank lines, which would otherwise fail to unpack
                continue

            # Start time and end time in the first two columns
            start, end, *remainder = line.split(",")
            start_label = start
            start = get_date(start)
            end = get_date(end)

            duration = end-start
            # total_seconds() covers the whole interval; .seconds ignores the days part
            durations.append(duration.total_seconds()/60.0)

            if count % LABEL_SPACING == 0:
                labels.append(start_label)
            count += 1

    ticks = list(range(0, count, LABEL_SPACING))
    return durations, labels, ticks


def plot_points(csv_file, durations, labels, ticks):
    pyplot.plot(durations)

    # Spaced labels on X-axis. `yticks` exists for y axis
    pyplot.xticks(ticks=ticks, labels=labels, rotation=30)
    pyplot.title("Git Clone Duration (minutes, decimal)\n"+csv_file)

    # Ensure labels don't fall off the bottom of the rendered image:
    pyplot.tight_layout()
    pyplot.savefig(csv_file+".png", dpi=150)
    pyplot.clf() # Clear the figure


if __name__ == "__main__":
    main()

#!/usr/bin/env bash
# extract_clonetest_csv.sh

# Keep only non-comment lines, and strip the repo name that `du -s` appends
grep -vE '^#' clone-data.txt | sed -r 's/\s+git-clonetest$//' > extracted-clone-data.csv

echo
echo "Extracted data"
ls extracted-*.csv

echo
echo "Base64 tarball"
echo
tar cz extracted-clone-data.csv | base64

#!/usr/bin/env bash
# looping_clones.sh

HERE="$(dirname "$0")"
# Stop if the script directory cannot be entered, so `rm -rf` never runs elsewhere
cd "$HERE" || exit 1

reponame="git-clonetest"

# Your reference "large" repo
# Can be passed in as script argument
repourl="${1:-https://github.com/minetest/minetest}"

sleep_seconds=$(( 60 * 10 ))
datafile="clone-data.txt"

wait_for() {
    i=$1
    echo -n "Sleeping $i seconds "
    while [[ $i -gt 0 ]]; do
        dec=10 # Decrement rate
        echo -n "."
        sleep $dec
        i=$((i-dec))
    done
    echo
}

main() {
    while true; do
        # Remove any previous clone so each run starts fresh
        [[ ! -d "$reponame" ]] || rm -rf "$reponame"

        START_DATE="$(date '+%F %T')"
        # Most entries start with '#' as comments
        echo "# START -- $START_DATE" >> "$datafile"

        git clone "$repourl" "$reponame"

        END_DATE="$(date '+%F %T')"
        # `du -s` prints "<size><TAB><name>"; GNU du reports the size in 1 KiB blocks, not bytes
        BYTE_SIZE="$(du -s "$reponame")"
        echo "# Size : $BYTE_SIZE" >> "$datafile"
        echo "# Size (-h): $(du -sh "$reponame")" >> "$datafile"
        echo "# END -- $END_DATE" >> "$datafile"

        # A CSV data line
        echo "$START_DATE,$END_DATE,$BYTE_SIZE" >> "$datafile"
        echo "# ====" >> "$datafile"

        wait_for $sleep_seconds
    done
}

main "$@"

matplotlib==3.5.0