@JCGoran
Last active January 19, 2024 14:56
MODELDB cleaner

How to use

  1. Run getmodels in the root directory of the https://github.com/neuronsimulator/nrn-modeldb-ci/ repo. This downloads all of the models from MODELDB into the cache subdirectory.
  2. Run bash remove_useless_files.sh. This creates a subdirectory called output in the current directory, containing the models (again as ZIP files) with all of the unnecessary files removed (empty directories, temporary files, etc.).
  3. (Optional) Run TMPDIR=/some/temp/dir bash compare_old_and_new.sh to verify that nothing notable was removed. This creates a subdirectory diffs_old_and_new containing only the differences (in the file [NAME].diff, where [NAME] is the ID of the model) between the "standard" files (from cache) and the cleaned-up ones (from output).
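Before starting, it can help to confirm that the tools the scripts call are installed and that the cache directory has been populated. The sketch below is not part of the original scripts: `require_tools` and `count_zips` are hypothetical helper names, and the tool list (zip, unzip, diff) is taken from the commands the two scripts invoke.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helpers; not part of the original scripts.

# Succeed only if every command named in the arguments is on PATH.
require_tools() {
    local tool missing=0
    for tool in "$@"; do
        if ! command -v "${tool}" >/dev/null 2>&1; then
            printf 'ERROR: required tool %s not found\n' "${tool}" >&2
            missing=1
        fi
    done
    return "${missing}"
}

# Print the number of ZIP files directly inside the given directory.
count_zips() {
    local dir="$1"
    find "${dir}" -maxdepth 1 -type f -name '*.zip' | wc -l | tr -d ' '
}

# Report on the current directory without aborting if something is missing.
if require_tools zip unzip diff && [ -d cache ]; then
    printf 'INFO: %s model archives found in cache\n' "$(count_zips cache)"
else
    printf 'WARNING: tools or cache directory missing, run getmodels first\n' >&2
fi
```

Running this from the root of the nrn-modeldb-ci checkout should print the number of downloaded archives, or a warning if step 1 has not been completed yet.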
#!/usr/bin/env bash
# compare_old_and_new.sh: diff each model in cache against its cleaned copy in output
set -eu
# Unpack ${inputdir}/${name}.zip into ${outputdir}
extract(){
    inputdir="$1"
    name="$2"
    outputdir="$3"
    unzip -qq -d "${outputdir}" "${inputdir}/${name}.zip"
}
# Extract both versions of a model and record any differences between them
compare(){
    name="$1"
    # fall back to /tmp if TMPDIR is not set (set -u would otherwise abort)
    tmproot="${TMPDIR:-/tmp}"
    # for the first dir
    inputdir1="cache"
    outputdir1="$(mktemp -d -p "${tmproot}")"
    extract "${inputdir1}" "${name}" "${outputdir1}"
    # for the second dir
    inputdir2="output"
    outputdir2="$(mktemp -d -p "${tmproot}")"
    extract "${inputdir2}" "${name}" "${outputdir2}"
    # making the diff
    dfile="$(mktemp -p "${tmproot}")"
    if ! diff -r --ignore-all-space "${outputdir1}" "${outputdir2}" > "${dfile}"
    then
        if [ -s "${dfile}" ]
        then
            cp -a "${dfile}" "${diffdir}/${name}.diff"
            printf "ERROR: differences detected for model %s, see %s for details\n" "${name}" "${diffdir}/${name}.diff"
        fi
    else
        printf "INFO: no differences detected for model %s\n" "${name}"
    fi
    # clean up the temporary files for this model
    rm -fr "${outputdir1}" "${outputdir2}" "${dfile}"
}
diffdir="diffs_old_and_new"
mkdir -p "${diffdir}"
for filename in cache/*
do
    name="$(basename "${filename}")"
    name="${name%.*}"
    printf "INFO: model %s already downloaded, comparing\n" "${name}"
    compare "${name}"
done
set +eu
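Once the comparison has run, a quick overview of the reports can be easier to scan than reading each file. The following is a minimal sketch, assuming the reports contain the `Only in ...` lines that `diff -r` prints (in an English locale) for entries present in only one tree. `summarize_diffs` is a hypothetical helper, not part of the original gist.

```shell
#!/usr/bin/env bash
# Hypothetical helper: summarise the reports written by the comparison script.

# For every [NAME].diff in the given directory, print NAME and the number of
# "Only in" lines (entries that exist in just one of the two trees).
summarize_diffs() {
    local diffdir="$1" file removed
    for file in "${diffdir}"/*.diff; do
        [ -e "${file}" ] || continue   # the glob matched nothing
        # grep -c exits non-zero when the count is 0, hence the "|| true"
        removed="$(grep -c '^Only in ' "${file}" || true)"
        printf '%s\t%s\n' "$(basename "${file}" .diff)" "${removed}"
    done
}

if [ -d diffs_old_and_new ]; then
    summarize_diffs diffs_old_and_new
fi
```

Each output line holds a model ID and a count separated by a tab. A model with a low count but a large report likely has content differences in files that were kept, which is worth inspecting by hand.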
#!/usr/bin/env bash
# remove_useless_files.sh: strip build artifacts and other clutter from each model in cache
set -eu
# Create the output directory (change this path to write the cleaned ZIP files elsewhere)
output_dir="${PWD}/output"
mkdir -p "$output_dir"
# go through all of the files
for filename in cache/*
do
    name="$(basename "${filename}")"
    name="${name%.*}"
    printf "Attempting to clean up %s\n" "${name}"
    # Create a temporary directory
    temp_dir=$(mktemp -d)
    # Extract the current ZIP file to the temporary directory
    unzip -qq -d "$temp_dir" "${filename}"
    # Remove empty directories in the temporary directory
    find "$temp_dir" -type d -empty -delete
    # Remove unwanted files and directories in the temporary directory
    # NOTE: you can modify the patterns here, the below are just the most common ones
    find "$temp_dir" \( -name '__pycache__' -o -name "__MACOSX" -o -name '*.DS_Store*' -o -name 'x86_64' -o -name '*.o' -o -name '*.dll' -o -name '*.pyc' -o -name '.svn' -o -name '*.so' -o -name '~$*' \) -prune -exec rm -fr {} +
    # Remove any archive left over from a previous run (zip would otherwise update it in place)
    rm -f "$output_dir/${name}.zip"
    # Zip the modified contents of the temporary directory to the output directory;
    # the subshell leaves the working directory unchanged and lets set -e catch a failed zip
    (cd "$temp_dir" && zip -qq -r "$output_dir/${name}.zip" .)
    # Remove the temporary directory
    rm -fr "$temp_dir"
done
set +eu
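When adjusting the cleanup patterns, it may be safer to preview what they match before anything is deleted. The sketch below reuses the same `find` expression as the cleanup script but swaps `-exec rm -fr {} +` for `-print`, so it only lists matches. `list_unwanted` is a hypothetical name, and the demo tree is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical dry run: list what the cleanup patterns would match, delete nothing.

list_unwanted() {
    local dir="$1"
    find "${dir}" \( -name '__pycache__' -o -name '__MACOSX' -o -name '*.DS_Store*' \
        -o -name 'x86_64' -o -name '*.o' -o -name '*.dll' -o -name '*.pyc' \
        -o -name '.svn' -o -name '*.so' -o -name '~$*' \) -prune -print
}

# Build a small throwaway tree that mimics an extracted model.
demo="$(mktemp -d)"
mkdir -p "${demo}/model/x86_64" "${demo}/model/__pycache__"
touch "${demo}/model/cell.hoc" "${demo}/model/x86_64/cell.o" "${demo}/model/.DS_Store"

# Show the entries that the cleanup step would remove from this tree.
list_unwanted "${demo}" | sort
rm -rf "${demo}"
```

Because of `-prune`, a matched directory such as x86_64 is reported once and `find` does not descend into it, so the files inside are not listed separately. Source files such as cell.hoc match none of the patterns and are left alone.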