#!/bin/bash
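# Print the 10 Debian packages that own the most files, according to the
# Contents index for a given architecture and repository.
# Arguments: [arch] [repo]
set -e
arch=${1:-amd64} # Architecture (first argument, defaults to amd64)
repo=${2:-"http://ftp.us.debian.org/debian/dists/stable/main"} # Repository base URL (second argument)
url="${repo}/Contents-${arch}.gz"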
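# Mirror the URL layout (minus the scheme) under a local cache directory
gz_path="$HOME/.deb_content_files/$(sed -r 's|https?://||' <<< "$url")"
txt_path=${gz_path%.gz}
mkdir -p "$(dirname "$gz_path")"
cd "$(dirname "$gz_path")"
# -O writes to a fixed path, so a leftover archive never causes a Contents-*.gz.1 copy
wget -O "$gz_path" "$url"
gunzip --force "$gz_path"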
printf "Processing %'d lines\n" $(wc -l "$txt_path" | awk '{print $1}')
echo "Yeah just go grab a coffee while I do my thing..."
time awk '{print $NF}' "$txt_path" | # Get only the last column, where package names are listed (file paths may contain spaces)
sed -r 's|,|\n|g' | # Split comma-separated packages onto separate lines
sed -r 's|.*/||' | # Strip the section prefix, keeping only the package name
sort | uniq -c | # Count the number of times each package occurred
sort -rn | # Sort by count, highest first
head -n 10 # List only the top 10