Last active
November 26, 2024 20:56
Script to take a remote git repo (e.g. GitHub) and create some basic documentation for it
#!/bin/bash
# Check if a repository URL was provided
if [ $# -ne 1 ]; then
  echo "Usage: $0 <repository-url>"
  exit 1
fi
REPO_URL=$1
# Extract repo name from URL
# This will get the last part of the URL without .git extension
REPO_NAME=$(basename "$REPO_URL" .git)
# Create a temporary directory for our work
TEMP_DIR=$(mktemp -d)
FILE_LIST="$TEMP_DIR/file_list.txt"
# Clone the repository
git clone "$REPO_URL" "$TEMP_DIR/project"
# Work from inside the temp dir so the LLM only sees short relative paths
cd "$TEMP_DIR" || exit 1
# Generate the file list using Claude
echo "Given a list of directories and files in a software repo, list which ones would be useful to scan to put together some documentation? Think tests and README style documents. Only return a list of the matching paths. Put the README first, if possible. Here are the paths:" | cat - <(find project -maxdepth 1 -mindepth 1) | llm -m claude-3.5-sonnet -s "Return only a list of paths with no preamble or extra explanation" > "$FILE_LIST"
# Generate the documentation
# NOTE: without -E the | below is a literal character, so this filter only
# drops lines beginning with the text ".|/" and most likely removes nothing
cat "$FILE_LIST" | grep -v '^\.|\/' | tr '\n' ' ' | xargs files-to-prompt -c > "$TEMP_DIR/prompt.txt"
# Return to the original directory so the output file lands there
cd -
# Truncate the prompt to 150000 tokens
cat "$TEMP_DIR/prompt.txt" | ttok -t 150000 > "$TEMP_DIR/prompt2.txt"
cat "$TEMP_DIR/prompt2.txt" | llm -m claude-3.5-sonnet -s 'write detailed usage documentation for this project including realistic examples. do not include a preamble, go straight into the documentation' > "${REPO_NAME}.md"
# Clean up
rm -f "$FILE_LIST"
#rm -rf "$TEMP_DIR"
echo "Documentation generated in ${REPO_NAME}.md"
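A quick check of the `basename "$REPO_URL" .git` step, since the output filename depends on it. The function name `repo_name_from_url` is hypothetical and only wraps the same call so it can be tried against a few URL shapes; the example URLs are placeholders.

```shell
#!/bin/bash
# Hypothetical wrapper around the same basename call the script uses.
repo_name_from_url() {
  # basename keeps everything after the last "/" and strips a trailing ".git"
  basename "$1" .git
}

repo_name_from_url "https://github.com/user/repo.git"   # HTTPS URL
repo_name_from_url "git@github.com:user/repo.git"       # SSH-style URL
repo_name_from_url "https://github.com/user/repo"       # no .git suffix
```

All three forms should yield the same name, because only the part after the final slash is considered.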
Cheated a bit by changing directory and back again. Also added token truncation with `ttok`.
I want to come up with a safer way of handling the cleaning up at the end, because `rm -rf` anything is terrifying in any case.
Apparently macOS, at least, cleans up any `mktemp` directories on reboot, so in theory I could ditch the clean up.
Commented it out for now, will think about it.
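One possible way to make the cleanup less scary is to guard the `rm -rf` and run it from an `EXIT` trap. This is only a sketch, not what the script currently does: the function `safe_cleanup`, the variable `WORK_DIR`, and the marker file `.docgen-temp` are all made-up names for illustration.

```shell
#!/bin/bash
# Hypothetical helper: remove a directory only if it carries our own marker file.
safe_cleanup() {
  local dir=$1
  # Refuse empty values and the filesystem root outright.
  if [ -z "$dir" ] || [ "$dir" = "/" ]; then
    echo "refusing to remove '$dir'" >&2
    return 1
  fi
  # Only delete directories that this script marked as its own.
  if [ ! -f "$dir/.docgen-temp" ]; then
    echo "no marker file in '$dir', leaving it alone" >&2
    return 1
  fi
  rm -rf -- "$dir"
}

WORK_DIR=$(mktemp -d)
touch "$WORK_DIR/.docgen-temp"          # marker (assumed name) proving ownership
trap 'safe_cleanup "$WORK_DIR"' EXIT    # runs when the script exits
```

The marker file means an unset or mistyped variable cannot expand into a path that gets deleted, because the function bails out before `rm` ever runs.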
This works but does have a flaw in that passing complex temporary folder names to LLMs can... be problematic. So I want to clean that up first before making this public.
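A rough sketch of one way to keep temporary folder names out of the prompt altogether: list the top-level entries as bare names from inside a subshell. The function name `list_top_level` and the demo files are hypothetical; the names it prints would still need to be resolved against the clone directory before being handed to `files-to-prompt`.

```shell
#!/bin/bash
# Hypothetical helper: print a directory's top-level entries as bare names.
list_top_level() {
  # The subshell keeps the cd local, so the caller's working directory is untouched.
  ( cd "$1" && find . -mindepth 1 -maxdepth 1 | sed 's|^\./||' | sort )
}

# Small demo against a throwaway directory.
demo=$(mktemp -d)
touch "$demo/README.md"
mkdir "$demo/test"
list_top_level "$demo"
# Tidy up the demo without a recursive delete.
rm "$demo/README.md"
rmdir "$demo/test" "$demo"
```

Because the `cd` happens inside parentheses, there is no need to change directory and back again in the main script.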