Skip to content

Instantly share code, notes, and snippets.

@iancharters
Last active April 10, 2024 20:36
Show Gist options
  • Save iancharters/403c2a7c04343b48db3c1dafb913010d to your computer and use it in GitHub Desktop.
Save iancharters/403c2a7c04343b48db3c1dafb913010d to your computer and use it in GitHub Desktop.
Finds the average size in bytes of the content of the column "Body"
#!/bin/bash
# Check if a file path is provided
if [ "$#" -ne 1 ]; then
echo "Usage: $0 <csv_file_path>"
exit 1
fi
file_path=$1
total_size=0
count=0
# Check if the file exists
if [ ! -f "$file_path" ]; then
echo "File does not exist: $file_path"
exit 1
fi
# Read the file line by line, skipping the header
while IFS=, read -r date body message; do
if [ $count -ne 0 ]; then # Skip the header row
# Calculate the size of the Body field in bytes
size=$(echo -n "$body" | wc -c | awk '{print $1}')
total_size=$((total_size + size))
fi
count=$((count + 1))
done < <(tail -n +2 "$file_path")
# Calculate the average size
if [ $count -gt 1 ]; then # Adjust for the fact that count includes the header
avg_size=$(echo "scale=2; $total_size / ($count - 1)" | bc)
echo "Average size: $avg_size bytes"
else
echo "No data to process."
fi
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment