@coltenkrauter
Last active September 20, 2023 06:26
A script for validating Google Takeout archives by checking for missing files and verifying their sizes.

Google Takeout Validator

This gist contains a script for validating Google Takeout archives. It checks that no numbered archive file is missing from the sequence and that each file is at least a given size.

Intro

This script helps you check that a multi-part Google Takeout export is complete and that each file meets a minimum size. It identifies the first and last takeout-*.tgz files in the directory and reports any sequence numbers missing between them. Files missing before the first or after the last file present cannot be detected this way. The script also reports every file smaller than the specified size (50 GB by default).
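
The first/last detection depends on how each file is named. As a rough sketch of that parsing step, the two helpers below split a file name with plain parameter expansion. The names part_number and part_prefix are hypothetical and not part of the gist, and the sample file name (including its timestamp) is only an assumed example of the takeout-<timestamp>-<number>.tgz shape the script expects.

```shell
# Hypothetical helpers for illustration; the names are not from the gist.

# Print the sequence number of a Takeout file, e.g. "003".
part_number() {
  local name num
  name=${1##*/}                # drop any leading directory
  num=${name##*-}              # keep the text after the last "-": "003.tgz"
  printf '%s\n' "${num%%.*}"   # drop the extension
}

# Print the part of the name shared by all files of one export.
part_prefix() {
  local name
  name=${1##*/}
  printf '%s\n' "${name%-*}"   # drop the last "-" and everything after it
}

# Assumed example file name.
part_number "/backups/takeout-20230920T062600Z-003.tgz"   # prints 003
part_prefix "/backups/takeout-20230920T062600Z-003.tgz"   # prints takeout-20230920T062600Z
```

Any name whose final dash-separated field is the counter should parse the same way; names without a counter would need different handling.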

Contents

  1. Script
  2. Usage
  3. Resources
  4. Collaboration
  5. Credits

Script

#!/bin/zsh

# **google-takeout-validator.zsh**
# -------------------------------
echo "\e[34;1mgoogle-takeout-validator\e[0m"
echo "----------------------------------------"

# Default directory
DIR="."
# Size in bytes (50GB)
SIZE=53687091200

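# Parse input flags for directory (-d) and minimum size in bytes (-s)
while getopts "d:s:" opt; do
  case ${opt} in
    d ) DIR=$OPTARG
      ;;
    s ) SIZE=$OPTARG
      ;;
    * ) echo "Usage: $0 [-d directory] [-s size-in-bytes]"
        exit 1
      ;;
  esac
done

# SIZE is compared numerically below, so reject values such as "50GB"
if [[ $SIZE != <-> ]]; then
  echo "Size must be a whole number of bytes (e.g. 53687091200 for 50 GB)."
  exit 1
fi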

# Phase 1: Identifying first and last files
echo "\e[32mPhase 1: Identifying first and last files.\e[0m"
FILES=($DIR/takeout-*.tgz(N))
if [[ -z $FILES ]]; then
  echo "No files found to process. Please check the directory path."
  exit 1
fi

# Extracting first and last filenames
FIRST_FILE=${FILES[1]##*/}
LAST_FILE=${FILES[-1]##*/}

echo "\e[35m↠ First file: $FIRST_FILE\e[0m"
echo "\e[35m↠ Last file: $LAST_FILE\e[0m"

# Phase 2: Checking for missing files and verifying file size
echo "\e[32mPhase 2: Checking for missing files and verifying file size.\e[0m"

FIRST_NUM=${FIRST_FILE##*-}
FIRST_NUM=${FIRST_NUM%%.*}
LAST_NUM=${LAST_FILE##*-}
LAST_NUM=${LAST_NUM%%.*}

PREFIX=${FIRST_FILE%-*}

# Checking for missing files
for i in {$FIRST_NUM..$LAST_NUM}; do
  FILE="$DIR/$PREFIX-$i.tgz"
  if [[ ! -f $FILE ]]; then
    echo "\e[35mMissing file: ${FILE//\/\//\/}\e[0m"
  fi
done

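# Checking file sizes
# Note: `stat -f%z` is the BSD/macOS form; GNU stat (Linux) uses `stat -c%s`.
THRESHOLD_GB=$(echo "scale=4; $SIZE / 1073741824" | bc)
for FILE in $DIR/takeout-*.tgz; do
  FILE_SIZE=$(stat -f%z "$FILE")
  if (( FILE_SIZE < SIZE )); then
    SIZE_GB=$(echo "scale=4; $FILE_SIZE / 1073741824" | bc)
    # Report the configured threshold rather than a hard-coded 50 GB
    echo "File ${FILE//\/\//\/} is smaller than ${THRESHOLD_GB} GB (Actual size: ${SIZE_GB} GB)"
  fi
done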

# Concluding the report with the time of day at which it finished
END_TIME=$(date +"%T")
echo "\e[34;1mReport completed at $END_TIME.\e[0m"
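
The size check above calls stat -f%z, which is the BSD/macOS spelling; GNU stat on Linux expects stat -c %s instead. The following is a minimal sketch of a wrapper that tries both and falls back to wc -c. The helper name file_size is an assumption made for illustration and is not part of the gist.

```shell
# file_size is a hypothetical helper name, not part of the gist.
# Print the size of a regular file in bytes on GNU/Linux and BSD/macOS.
file_size() {
  # Refuse anything that is not an existing regular file.
  [ -f "$1" ] || return 1
  # Try GNU stat first, then BSD/macOS stat, then POSIX wc as a last resort.
  # tr strips the padding that some wc implementations add.
  stat -c %s "$1" 2>/dev/null ||
    stat -f %z "$1" 2>/dev/null ||
    wc -c < "$1" | tr -d ' '
}
```

In the size loop, FILE_SIZE=$(file_size "$FILE") could then stand in for the direct stat call.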

Usage

To use the script, follow these steps:

  1. Download the script from this gist.

  2. Open your terminal.

  3. Navigate to the directory where the script is located.

  4. Run the script with the flags you need: -d sets the directory to scan (default: the current directory) and -s sets the minimum file size in bytes (default: 53687091200, i.e. 50 GB). If the script is not executable yet, run chmod +x google-takeout-validator.zsh first.

    Example usage:

    ./google-takeout-validator.zsh -d /path/to/your/takeout/directory -s 53687091200
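
Because the script compares the -s value numerically against byte counts, a human-readable size such as 50GB has to be converted to bytes before it is passed in. The sketch below shows one way to do that. The helper name to_bytes is hypothetical, and it assumes binary units (1 GB = 1073741824 bytes), matching the divisor the script uses when printing sizes.

```shell
# to_bytes is a hypothetical helper, not part of the gist.
# Convert sizes such as "50GB", "512mb" or "1024" to a byte count,
# assuming binary multiples (1 KB = 1024 bytes).
to_bytes() {
  local value unit factor
  value=${1%%[!0-9]*}    # leading digits, e.g. "50"
  unit=${1#"$value"}     # whatever follows them, e.g. "GB"
  if [ -z "$value" ]; then
    echo "size must start with a number: $1" >&2
    return 1
  fi
  unit=$(printf '%s' "$unit" | tr '[:lower:]' '[:upper:]')
  case $unit in
    '' | B)  factor=1 ;;
    K | KB)  factor=1024 ;;
    M | MB)  factor=1048576 ;;
    G | GB)  factor=1073741824 ;;
    T | TB)  factor=1099511627776 ;;
    *) echo "unknown size unit: $unit" >&2
       return 1 ;;
  esac
  echo $(( value * factor ))
}

to_bytes 50GB   # prints 53687091200
```

With the helper defined in the current shell, the result can be passed straight to the script: ./google-takeout-validator.zsh -d /path/to/your/takeout/directory -s "$(to_bytes 50GB)"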

Resources

  1. Google Takeout Official Website
  2. Zsh Official Documentation
  3. Bash Parameter Expansion

Video Tutorial: Working with Google Takeout Data (YouTube)

Collaboration

Feel free to collaborate on this script by suggesting improvements or reporting issues. Your input is valued.

Credits

Script developed with assistance from GPT-4.
