@AfroThundr3007730
Last active January 9, 2023 09:44
Configuration files for local yum and apt repositories. (Retired)

Note: This gist is historical now that I've moved this to a proper repo: AfroThundr3007730/syncrepo

Below are configuration files for creating upstream and downstream yum and apt repositories. These can be hosted on a CentOS, Debian, or Ubuntu system and served via rsync or http. The only requirements are rsync, a webserver such as Apache, and debmirror. Note that not all required setup steps are listed here (TODO: add setup guide).
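For example, installing the prerequisites might look like this (package names are assumptions; on the CentOS side debmirror comes from EPEL):

# CentOS host:
yum install -y rsync httpd epel-release && yum install -y debmirror
# Debian or Ubuntu host:
apt-get install -y rsync apache2 debmirror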

| Filename | Description | Notes |
| --- | --- | --- |
| 00-syncrepo.sh | Combined all-in-one repo sync script. | Still alpha. Last tested: v1.6.5 |
| 01-yum-repoupdate-us.sh | Upstream yum repository updater script. | Older stable version |
| 02-yum-repoupdate-ds.sh | Downstream yum repository updater script. | Older stable version |
| 03-apt-repoupdate-us.sh | Upstream apt repository updater script. | Older stable version |
| 04-apt-repoupdate-ds.sh | Downstream apt repository updater script. | Older stable version |
| 05-repoupdate.service | Systemd service unit for repoupdate script. | |
| 06-repoupdate.timer | Systemd timer unit for repoupdate script. | |
| 07-yum-rsyncd.conf | Rsync config for yum repository. | Should combine these. |
| 08-apt-rsyncd.conf | Rsync config for apt repository. | ^ |
| 09-rsyncd.service | Systemd service unit for rsyncd service. | |
| 10-yum-vhost.conf | Apache vhost config for yum repository. | I need to update this |
| 11-apt-vhost.conf | Apache vhost config for apt repository. | Same here. |
| 12-centos-local.repo | CentOS package config for clients. | |
| 13-debian-sources.list | Debian package sources for clients. | |
| 14-ubuntu-sources.list | Ubuntu package sources for clients. | |
| 15-repoupdate-log.conf | Logrotate config file. | |
| 98-debmirror.pl | Copy of debmirror for convenience. | Should use the distro-packaged version |
| 99-clamavmirror.py | Copy of clamavmirror for convenience. | Cisco-Talos/clamav-faq |
#!/bin/bash
# shellcheck disable=SC2086
# Repository sync script for CentOS & Debian distros
# This script can sync the repos listed in $SOFTWARE
# Gotta keep the namespace clean
set_globals () {
AUTHOR='AfroThundr'
BASENAME="${0##*/}"
MODIFIED='20181029'
VERSION='1.7.0-rc1'
SOFTWARE='CentOS, EPEL, Debian, Ubuntu, and ClamAV'
# Global config variables (modify as necessary)
UPSTREAM=true
CENTOS_SYNC=true
EPEL_SYNC=true
DEBIAN_SYNC=true
DEBSEC_SYNC=true
UBUNTU_SYNC=true
CLAMAV_SYNC=true
LOCAL_SYNC=true
REPODIR=/srv/repository
LOCKFILE=/var/lock/subsys/reposync
LOGFILE=/var/log/reposync.log
PROGFILE=/var/log/reposync_progress.log
# More internal config variables
MIRROR=mirrors.mit.edu
UMIRROR=mirror-us.lab.local
CENTARCH=x86_64
CENTREPO=${REPODIR}/centos
CENTHOST=${MIRROR}::centos
EPELREPO=${REPODIR}/fedora-epel
EPELHOST=${MIRROR}::fedora-epel
DEBARCH=amd64
UBUNTUREPO=${REPODIR}/ubuntu
UBUNTUHOST=${MIRROR}::ubuntu
DEBIANREPO=${REPODIR}/debian
DEBIANHOST=${MIRROR}::debian
SMIRROR=security.debian.org
DEBSECREPO=${REPODIR}/debian-security
DEBSECHOST=${SMIRROR}/
CMIRROR=database.clamav.net
CLAMREPO=${REPODIR}/clamav
LOCALREPO=${REPODIR}/local
LOCALHOST=${MIRROR}::local
ROPTS="-hlmprtzDHS --stats --no-motd --del --delete-excluded --log-file=$PROGFILE"
TEELOG="tee -a $LOGFILE $PROGFILE"
}
# Parse command line options
argument_handler () {
if [[ -z $1 ]]; then
say -h 'No arguments specified, use -h for help.'
exit 0
fi
while [[ -n $1 ]]; do
if [[ $1 == -v ]]; then
say -h '%s: Version %s, updated %s by %s' \
"$BASENAME" "$VERSION" "$MODIFIED" "$AUTHOR"
ver=true
shift
elif [[ $1 == -h ]]; then
say -h 'Software repository updater script for Linux distros.'
say -h 'Can currently sync the following repositories:'
say -h '%s\n' "$SOFTWARE"
say -h 'Usage: %s [-v] (-h | -y)\n' "$BASENAME"
say -h 'Options:'
say -h ' -h Display help text.'
say -h ' -v Emit version info.'
say -h ' -y Confirm repo sync.'
# Need to add options
# -l|--log-file
# -p|--prog-log
# -u|--upstream
# --centos-sync
# --epel-sync
# --ubuntu-sync
# --debian-sync
# --debsec-sync
# --clamav-sync
# --local-sync
exit 0
elif [[ $1 == -y ]]; then
CONFIRM=true
shift
else
say -h 'Invalid argument specified, use -h for help.'
exit 0
fi
done
if [[ ! $CONFIRM == true ]]; then
if [[ ! $ver == true ]]; then
say -h 'Confirm with -y to start the sync.'
exit 10
fi
exit 0
fi
}
# Log message and print to stdout
# shellcheck disable=SC2059
say () {
if [[ $1 == -h ]]; then
shift; local s=$1; shift
tput setaf 2; printf "$s\\n" "$@"
else
if [[ $LOGFILE == no && $PROGFILE == no ]] || [[ $1 == -n ]]; then
[[ $1 == -n ]] && shift
else
local log=true
fi
[[ $1 == -t ]] && { echo > "$PROGFILE"; shift; }
if [[ $1 == info || $1 == warn || $1 == err ]]; then
[[ $1 == info ]] && tput setaf 4
[[ $1 == warn ]] && tput setaf 3
[[ $1 == err ]] && tput setaf 1
local l=${1^^}; shift
local s="$l: $1"; shift
else
local s="$1"; shift
fi
if [[ $log == true ]]; then
printf "%s: $s\\n" "$(date -u +%FT%TZ)" "$@" | $TEELOG
else
printf "%s: $s\\n" "$(date -u +%FT%TZ)" "$@"
fi
fi
tput setaf 7 # For CentOS
}
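For reference, a few illustrative calls to say() (the messages here are made up; assumes set_globals has already run):
say -h 'Usage: %s [-v] (-h | -y)' "$BASENAME" # green help text, not logged
say info 'Starting %s sync.' "$SOFTWARE" # timestamped INFO line, logged via $TEELOG
say err 'Cannot reach %s.' "$MIRROR" # timestamped ERR line, red, logged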
# Construct the sync environment
build_vars () {
# Declare more variables (CentOS/EPEL)
if [[ $CENTOS_SYNC == true || $EPEL_SYNC == true ]]; then
mapfile -t allrels <<< "$(
rsync $CENTHOST | \
awk '/^d/ && /[0-9]+\.[0-9.]+$/ {print $5}' |
sort -V
)"
mapfile -t oldrels <<< "$(
for i in "${allrels[@]}"; do
if [[ ${i%%.*} -eq $(( ${allrels[-1]%%.*} - 1 )) ]]; then
echo "$i";
fi;
done
)"
currel=${allrels[-1]}
curmaj=${currel%%.*}
cprerel=${allrels[-2]}
oldrel=${oldrels[-1]}
oldmaj=${oldrel%%.*}
oprerel=${oldrels[-2]}
centex=$(echo --include={os,extras,updates,centosplus,readme,os/$CENTARCH/{repodata,Packages}} --exclude={i386,"os/$CENTARCH/*"} --exclude="/*")
epelex=$(echo --exclude={SRPMS,aarch64,i386,ppc64,ppc64le,$CENTARCH/debug})
fi
# Declare more variables (Debian/Ubuntu)
if [[ $UBUNTU_SYNC == true ]]; then
mapfile -t uburels <<< "$(
curl -sL $MIRROR/ubuntu-releases/HEADER.html |
awk -F '[() ]' '/<li>/ && /LTS/ {print $6}'
)"
ubucur=${uburels[1],}
ubupre=${uburels[2],}
ubuntucomps="main,restricted,universe,multiverse"
ubunturel1="$ubupre,$ubupre-backports,$ubupre-updates,$ubupre-proposed,$ubupre-security"
ubunturel2="$ubucur,$ubucur-backports,$ubucur-updates,$ubucur-proposed,$ubucur-security"
ubuntuopts1="-s $ubuntucomps -d $ubunturel1 -h $MIRROR -r /ubuntu"
ubuntuopts2="-s $ubuntucomps -d $ubunturel2 -h $MIRROR -r /ubuntu"
fi
if [[ $DEBIAN_SYNC == true || $DEBSEC_SYNC == true ]]; then
mapfile -t debrels <<< "$(
curl -sL $MIRROR/debian/README.html |
awk -F '[<> ]' '/<dt>/ && /Debian/ {print $9}'
)"
debcur=${debrels[0]}
debpre=${debrels[1]}
debiancomps="main,contrib,non-free"
debianrel1="$debpre,$debpre-backports,$debpre-updates,$debpre-proposed-updates"
debianrel2="$debcur,$debcur-backports,$debcur-updates,$debcur-proposed-updates"
debianopts1="-s $debiancomps -d $debianrel1 -h $MIRROR -r /debian"
debianopts2="-s $debiancomps -d $debianrel2 -h $MIRROR -r /debian"
debsecrel1="$debpre/updates"
debsecrel2="$debcur/updates"
debsecopts1="-s $debiancomps -d $debsecrel1 -h $SMIRROR -r /"
debsecopts2="-s $debiancomps -d $debsecrel2 -h $SMIRROR -r /"
fi
if [[ $UBUNTU_SYNC == true || $DEBIAN_SYNC == true || $DEBSEC_SYNC == true ]]; then
dmirror="debmirror -a $DEBARCH --no-source --ignore-small-errors --method=rsync --retry-rsync-packages=5 -p --rsync-options="
dmirror2="debmirror -a $DEBARCH --no-source --ignore-small-errors --method=http --checksums -p"
fi
# And a few more (ClamAV)
if [[ $CLAMAV_SYNC == true ]]; then
clamsync="clamavmirror -a $CMIRROR -d $CLAMREPO -u root -g www-data"
fi
return 0
}
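As an illustration of what build_vars derives, with CentOS 7.5 current the variables would end up roughly as follows (release numbers are examples, not guaranteed):
# allrels=(6.9 6.10 7.3.1611 7.4.1708 7.5.1804)
# currel=7.5.1804 curmaj=7 cprerel=7.4.1708
# oldrel=6.10 oldmaj=6 oprerel=6.9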
centos_sync () {
# Check for older centos release directory
if [[ ! -d $CENTREPO/$oldrel ]]; then
mkdir -p "$CENTREPO/$oldrel"
ln -frs "$CENTREPO/$oldrel" "$CENTREPO/$oldmaj"
fi
# Sync older centos repository
say 'Beginning sync of legacy CentOS %s repository from %s.' \
"$oldrel" "$CENTHOST"
rsync $ROPTS $centex "$CENTHOST/$oldrel/" "$CENTREPO/$oldrel/"
say 'Done.\n'
# Check for centos release directory
if [[ ! -d $CENTREPO/$currel ]]; then
mkdir -p "$CENTREPO/$currel"
ln -frs "$CENTREPO/$currel" "$CENTREPO/$curmaj"
fi
# Sync current centos repository
say 'Beginning sync of current CentOS %s repository from %s.' \
"$currel" "$CENTHOST"
rsync $ROPTS $centex "$CENTHOST/$currel/" "$CENTREPO/$currel/"
say 'Done.\n'
# Continue to sync previous point releases til they're empty
# Check for older previous centos point release placeholder
if [[ ! -f $CENTREPO/$oprerel/readme ]]; then
# Check for older previous centos release directory
if [[ ! -d $CENTREPO/$oprerel ]]; then
mkdir -p "$CENTREPO/$oprerel"
fi
# Sync older previous centos repository
say 'Beginning sync of legacy CentOS %s repository from %s.' \
"$oprerel" "$CENTHOST"
rsync $ROPTS $centex "$CENTHOST/$oprerel/" "$CENTREPO/$oprerel/"
say 'Done.\n'
fi
# Check for previous centos point release placeholder
if [[ ! -f $CENTREPO/$cprerel/readme ]]; then
# Check for previous centos release directory
if [[ ! -d $CENTREPO/$cprerel ]]; then
mkdir -p "$CENTREPO/$cprerel"
fi
# Sync current previous centos repository
say 'Beginning sync of current CentOS %s repository from %s.' \
"$cprerel" "$CENTHOST"
rsync $ROPTS $centex "$CENTHOST/$cprerel/" "$CENTREPO/$cprerel/"
say 'Done.\n'
fi
return 0
}
epel_sync () {
# Check for older epel release directory
if [[ ! -d $EPELREPO/$oldmaj ]]; then
mkdir -p "$EPELREPO/$oldmaj"
fi
# Sync older epel repository
say 'Beginning sync of legacy EPEL %s repository from %s.' \
"$oldmaj" "$EPELHOST"
rsync $ROPTS $epelex "$EPELHOST/$oldmaj/" "$EPELREPO/$oldmaj/"
say 'Done.\n'
# Check for older epel-testing release directory
if [[ ! -d $EPELREPO/testing/$oldmaj ]]; then
mkdir -p "$EPELREPO/testing/$oldmaj"
fi
# Sync older epel-testing repository
say 'Beginning sync of legacy EPEL %s Testing repository from %s.' \
"$oldmaj" "$EPELHOST"
rsync $ROPTS $epelex "$EPELHOST/testing/$oldmaj/" "$EPELREPO/testing/$oldmaj/"
say 'Done.\n'
# Check for current epel release directory
if [[ ! -d $EPELREPO/$curmaj ]]; then
mkdir -p "$EPELREPO/$curmaj"
fi
# Sync current epel repository
say 'Beginning sync of current EPEL %s repository from %s.' \
"$curmaj" "$EPELHOST"
rsync $ROPTS $epelex "$EPELHOST/$curmaj/" "$EPELREPO/$curmaj/"
say 'Done.\n'
# Check for current epel-testing release directory
if [[ ! -d $EPELREPO/testing/$curmaj ]]; then
mkdir -p "$EPELREPO/testing/$curmaj"
fi
# Sync current epel-testing repository
say 'Beginning sync of current EPEL %s Testing repository from %s.' \
"$curmaj" "$EPELHOST"
rsync $ROPTS $epelex "$EPELHOST/testing/$curmaj/" "$EPELREPO/testing/$curmaj/"
say 'Done.\n'
return 0
}
ubuntu_sync () {
export GNUPGHOME=$REPODIR/.gpg
# Check for ubuntu release directory
if [[ ! -d $UBUNTUREPO ]]; then
mkdir -p "$UBUNTUREPO"
fi
# Sync older ubuntu repository
say 'Beginning sync of legacy Ubuntu %s repository from %s.' \
"${ubupre^}" "$UBUNTUHOST"
$dmirror"$ROPTS" $ubuntuopts1 $UBUNTUREPO | tee -a $PROGFILE
say 'Done.\n'
# Sync current ubuntu repository
say 'Beginning sync of current Ubuntu %s repository from %s.' \
"${ubucur^}" "$UBUNTUHOST"
$dmirror"$ROPTS" $ubuntuopts2 $UBUNTUREPO | tee -a $PROGFILE
say 'Done.\n'
unset GNUPGHOME
return 0
}
debian_sync () {
export GNUPGHOME=$REPODIR/.gpg
# Check for debian release directory
if [[ ! -d $DEBIANREPO ]]; then
mkdir -p "$DEBIANREPO"
fi
# Sync older debian repository
say 'Beginning sync of legacy Debian %s repository from %s.' \
"${debpre^}" "$DEBIANHOST"
$dmirror"$ROPTS" $debianopts1 $DEBIANREPO | tee -a $PROGFILE
say 'Done.\n'
# Sync current debian repository
say 'Beginning sync of current Debian %s repository from %s.' \
"${debcur^}" "$DEBIANHOST"
$dmirror"$ROPTS" $debianopts2 $DEBIANREPO | tee -a $PROGFILE
say 'Done.\n'
unset GNUPGHOME
return 0
}
debsec_sync () {
export GNUPGHOME=$REPODIR/.gpg
# Check for ubuntu release directory
if [[ ! -d $DEBIANREPO ]]; then
mkdir -p "$DEBIANREPO"
fi
# Sync older debian security repository
say 'Beginning sync of legacy Debian %s Security repository from %s.' \
"${debpre^}" "$DEBSECHOST"
$dmirror2 $debsecopts1 $DEBSECREPO &>> $PROGFILE
say 'Done.\n'
# Sync current debian security repository
say 'Beginning sync of current Debian %s Security repository from %s.' \
"${debcur^}" "$DEBSECHOST"
$dmirror2 $debsecopts2 $DEBSECREPO &>> $PROGFILE
say 'Done.\n'
unset GNUPGHOME
return 0
}
clamav_sync () {
# Check for clamav release directory
if [[ ! -d $CLAMREPO ]]; then
mkdir -p "$CLAMREPO"
fi
# Sync clamav repository
say 'Beginning sync of ClamAV repository from %s.' "$CMIRROR"
$clamsync &>> $PROGFILE
say 'Done.\n'
return 0
}
local_sync () {
# Check for local repository directory
if [[ ! -d $LOCALREPO ]]; then
mkdir -p "$LOCALREPO"
fi
# Sync local repository
say 'Beginning sync of local repository from %s.' "$MIRROR"
rsync $ROPTS $centex "$LOCALHOST/" "$LOCALREPO/"
say 'Done.\n'
return 0
}
# Where the magic happens
main () {
# Process arguments
argument_handler "$@"
# Set Globals
set_globals
# Here we go...
say -t 'Progress log reset.'
say 'Started synchronization of %s repositories.' "$SOFTWARE"
say 'Use tail -f %s to view progress.' "$PROGFILE"
# Check if the rsync script is already running
if [[ -f $LOCKFILE ]]; then
say err 'Detected lockfile: %s' "$LOCKFILE"
say err 'Repository updates are already running.'
exit 10
# Check that we can reach the public mirror
elif ! ping -c 5 $MIRROR &> /dev/null; then
say err 'Cannot reach the %s mirror server.' "$MIRROR"
exit 20
# Check that the repository is mounted
elif ! mount | grep $REPODIR &> /dev/null; then
say err 'Directory %s is not mounted.' "$REPODIR"
exit 30
# Everything is good, let's continue
else
# There can be only one...
touch "$LOCKFILE"
# Are we downstream? Then pull from the upstream mirror instead.
# This must happen before build_vars, which queries the mirror hosts.
if [[ $UPSTREAM == false ]]; then
MIRROR=$UMIRROR
CENTHOST=${MIRROR}::centos
EPELHOST=${MIRROR}::fedora-epel
UBUNTUHOST=${MIRROR}::ubuntu
DEBIANHOST=${MIRROR}::debian
LOCALHOST=${MIRROR}::local
fi
# Generate variables
build_vars
# Sync CentOS repo
[[ $CENTOS_SYNC == true ]] && centos_sync
# Sync EPEL repo
[[ $EPEL_SYNC == true ]] && epel_sync
# Sync Ubuntu repo
[[ $UBUNTU_SYNC == true ]] && ubuntu_sync
# Sync Debian repo
[[ $DEBIAN_SYNC == true ]] && debian_sync
# Sync Debian Security repo
[[ $DEBSEC_SYNC == true ]] && debsec_sync
# Sync ClamAV repo
[[ $CLAMAV_SYNC == true ]] && clamav_sync
# Sync Local repo
[[ $LOCAL_SYNC == true ]] && local_sync
# Clear the lockfile
rm -f "$LOCKFILE"
fi
# Now we're done
say 'Completed synchronization of %s repositories.\n' "$SOFTWARE"
exit 0
}
# Only execute if not being sourced
[[ ${BASH_SOURCE[0]} == "$0" ]] && main "$@"
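A typical run, assuming the script is saved as 00-syncrepo.sh (see the table above):
./00-syncrepo.sh -y # confirm and start the sync
tail -f /var/log/reposync_progress.log # watch transfer progress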
#!/bin/bash
# Yum repository updater script for CentOS (upstream)
# Currently syncs CentOS, EPEL, and EPEL Testing
# Version 1.4.2 updated 20181003 by <AfroThundr>
# Version handler
for i in "$@"; do
if [ "$i" = "-v" ]; then
v=$(head -4 "$0" | tail -1)
printf '%s\n' "$v"
exit 0
fi
done
# Declare some variables (modify as necessary)
arch=x86_64
repodir=/srv/repository
centosrepo=$repodir/centos
epelrepo=$repodir/epel
mirror=mirrors.mit.edu
centoshost=$mirror::centos
epelhost=$mirror::fedora-epel
lockfile=/var/lock/subsys/yum_rsync
logfile=/var/log/yum_rsync.log
progfile=/var/log/yum_rsync_prog.log
# Declare some more vars (don't break these)
centoslist=$(rsync $centoshost | awk '/^d/ && /[0-9]+\.[0-9.]+$/ {print $5}')
release=$(echo "$centoslist" | tr ' ' '\n' | tail -1)
majorver=${release%%.*}
oldmajorver=$((majorver-1))
oldrelease=$(echo "$centoslist" | tr ' ' '\n' | awk "/^$oldmajorver\\./" | tail -1)
prevrelease=$(echo "$centoslist" | tr ' ' '\n' | awk "/^$majorver\\./" | tail -2 | head -1)
oldprevrelease=$(echo "$centoslist" | tr ' ' '\n' | awk "/^$oldmajorver\\./" | tail -2 | head -1)
# Build the commands, with more variables
centosexclude=$(echo --include={os,extras,updates,centosplus,readme} --exclude=i386 --exclude="/*")
epelexclude=$(echo --exclude={SRPMS,aarch64,i386,ppc64,ppc64le,$arch/debug})
rsync="rsync -hlmprtzDHS --stats --no-motd --del --delete-excluded --log-file=$progfile"
teelog="tee -a $logfile $progfile"
# Here we go...
printf '%s: Progress log reset.\n' "$(date -u +%FT%TZ)" > $progfile
printf '%s: Started synchronization of CentOS and EPEL repositories.\n' "$(date -u +%FT%TZ)" | $teelog
printf '%s: Use tail -f %s to view progress.\n\n' "$(date -u +%FT%TZ)" "$progfile"
# Check if the rsync script is already running
if [ -f $lockfile ]; then
printf '%s: Error: Repository updates are already running.\n\n' "$(date -u +%FT%TZ)" | $teelog
exit 10
# Check that we can reach the public mirror
elif ! ping -c 5 $mirror &> /dev/null; then
printf '%s: Error: Cannot reach the %s mirror server.\n\n' "$(date -u +%FT%TZ)" "$mirror" | $teelog
exit 20
# Check that the repository is mounted
elif ! mount | grep $repodir &> /dev/null; then
printf '%s: Error: Directory %s is not mounted.\n\n' "$(date -u +%FT%TZ)" "$repodir" | $teelog
exit 30
else
# Check for older centos release directory
if [ ! -d "$centosrepo/$oldrelease" ]; then
# Make directory if it doesn't exist
printf '%s: Directory for CentOS %s does not exist. Creating..\n' "$(date -u +%FT%TZ)" "$oldrelease" | $teelog
cd "$centosrepo" || exit 40; mkdir -p "$oldrelease"; rm -f "$oldmajorver"; ln -s "$oldrelease" "$oldmajorver"
fi
# Create lockfile, sync older centos repo, delete lockfile
printf '%s: Beginning rsync of Legacy CentOS %s repo from %s.\n' "$(date -u +%FT%TZ)" "$oldrelease" "$centoshost" | $teelog
touch $lockfile
$rsync $centosexclude "$centoshost/$oldrelease/" "$centosrepo/$oldrelease/"
rm -f $lockfile
printf '%s: Done.\n\n' "$(date -u +%FT%TZ)" | $teelog
# Check for centos release directory
if [ ! -d "$centosrepo/$release" ]; then
# Make directory if it doesn't exist
printf '%s: Directory for CentOS %s does not exist. Creating..\n' "$(date -u +%FT%TZ)" "$release" | $teelog
cd "$centosrepo" || exit 40; mkdir -p "$release"; rm -f "$majorver"; ln -s "$release" "$majorver"
fi
# Create lockfile, sync centos repo, delete lockfile
printf '%s: Beginning rsync of CentOS %s repo from %s.\n' "$(date -u +%FT%TZ)" "$release" "$centoshost" | $teelog
touch $lockfile
$rsync $centosexclude "$centoshost/$release/" "$centosrepo/$release/"
rm -f $lockfile
printf '%s: Done.\n\n' "$(date -u +%FT%TZ)" | $teelog
# Check for older epel release directory
if [ ! -d "$epelrepo/$oldmajorver" ]; then
# Make directory if it doesn't exist
printf '%s: Directory for EPEL %s does not exist. Creating..\n' "$(date -u +%FT%TZ)" "$oldmajorver" | $teelog
mkdir -p "$epelrepo/$oldmajorver"
fi
# Create lockfile, sync older epel repo, delete lockfile
printf '%s: Beginning rsync of Legacy EPEL %s repo from %s.\n' "$(date -u +%FT%TZ)" "$oldmajorver" "$epelhost" | $teelog
touch $lockfile
$rsync $epelexclude "$epelhost/$oldmajorver/" "$epelrepo/$oldmajorver/"
rm -f $lockfile
printf '%s: Done.\n\n' "$(date -u +%FT%TZ)" | $teelog
# Check for older epel-testing release directory
if [ ! -d "$epelrepo/testing/$oldmajorver" ]; then
# Make directory if it doesn't exist
printf '%s: Directory for EPEL %s Testing does not exist. Creating..\n' "$(date -u +%FT%TZ)" "$oldmajorver" | $teelog
mkdir -p "$epelrepo/testing/$oldmajorver"
fi
# Create lockfile, sync older epel-testing repo, delete lockfile
printf '%s: Beginning rsync of Legacy EPEL %s Testing repo from %s.\n' "$(date -u +%FT%TZ)" "$oldmajorver" "$epelhost" | $teelog
touch $lockfile
$rsync $epelexclude "$epelhost/testing/$oldmajorver/" "$epelrepo/testing/$oldmajorver/"
rm -f $lockfile
printf '%s: Done.\n\n' "$(date -u +%FT%TZ)" | $teelog
# Check for epel release directory
if [ ! -d "$epelrepo/$majorver" ]; then
# Make directory if it doesn't exist
printf '%s: Directory for EPEL %s does not exist. Creating..\n' "$(date -u +%FT%TZ)" "$majorver" | $teelog
mkdir -p "$epelrepo/$majorver"
fi
# Create lockfile, sync epel repo, delete lockfile
printf '%s: Beginning rsync of EPEL %s repo from %s.\n' "$(date -u +%FT%TZ)" "$majorver" "$epelhost" | $teelog
touch $lockfile
$rsync $epelexclude "$epelhost/$majorver/" "$epelrepo/$majorver/"
rm -f $lockfile
printf '%s: Done.\n\n' "$(date -u +%FT%TZ)" | $teelog
# Check for epel-testing release directory
if [ ! -d "$epelrepo/testing/$majorver" ]; then
# Make directory if it doesn't exist
printf '%s: Directory for EPEL %s Testing does not exist. Creating..\n' "$(date -u +%FT%TZ)" "$majorver" | $teelog
mkdir -p "$epelrepo/testing/$majorver"
fi
# Create lockfile, sync epel-testing repo, delete lockfile
printf '%s: Beginning rsync of EPEL %s Testing repo from %s.\n' "$(date -u +%FT%TZ)" "$majorver" "$epelhost" | $teelog
touch $lockfile
$rsync $epelexclude "$epelhost/testing/$majorver/" "$epelrepo/testing/$majorver/"
rm -f $lockfile
printf '%s: Done.\n\n' "$(date -u +%FT%TZ)" | $teelog
# We ain't out of the woods yet; continue to sync the previous point release until it's empty
# Check for older previous centos point release placeholder
if [ ! -f "$centosrepo/$oldprevrelease/readme" ]; then
# Check for older previous centos release directory
if [ ! -d "$centosrepo/$oldprevrelease" ]; then
# Make directory if it doesn't exist
printf '%s: Directory for CentOS %s does not exist. Creating..\n' "$(date -u +%FT%TZ)" "$oldprevrelease" | $teelog
cd "$centosrepo" || exit 40; mkdir -p "$oldprevrelease"
fi
# Create lockfile, sync older previous centos repo, delete lockfile
printf '%s: Beginning rsync of CentOS %s repo from %s.\n' "$(date -u +%FT%TZ)" "$oldprevrelease" "$centoshost" | $teelog
touch $lockfile
$rsync $centosexclude "$centoshost/$oldprevrelease/" "$centosrepo/$oldprevrelease/"
rm -f $lockfile
printf '%s: Done.\n\n' "$(date -u +%FT%TZ)" | $teelog
fi
# Check for previous centos point release placeholder
if [ ! -f "$centosrepo/$prevrelease/readme" ]; then
# Check for previous centos release directory
if [ ! -d "$centosrepo/$prevrelease" ]; then
# Make directory if it doesn't exist
printf '%s: Directory for CentOS %s does not exist. Creating..\n' "$(date -u +%FT%TZ)" "$prevrelease" | $teelog
cd "$centosrepo" || exit 40; mkdir -p "$prevrelease"
fi
# Create lockfile, sync previous centos repo, delete lockfile
printf '%s: Beginning rsync of CentOS %s repo from %s.\n' "$(date -u +%FT%TZ)" "$prevrelease" "$centoshost" | $teelog
touch $lockfile
$rsync $centosexclude "$centoshost/$prevrelease/" "$centosrepo/$prevrelease/"
rm -f $lockfile
printf '%s: Done.\n\n' "$(date -u +%FT%TZ)" | $teelog
fi
fi
# Now we're done
printf '%s: Completed synchronization of CentOS and EPEL repositories.\n\n' "$(date -u +%FT%TZ)" | $teelog
exit 0
#!/bin/bash
# Yum repository updater script for CentOS (downstream)
# Currently syncs CentOS, EPEL, and EPEL Testing
# Version 1.4.2 updated 20181003 by <AfroThundr>
# Version handler
for i in "$@"; do
if [ "$i" = "-v" ]; then
v=$(head -4 "$0" | tail -1)
printf '%s\n' "$v"
exit 0
fi
done
# Declare some variables (modify as necessary)
repodir=/srv/repository
centosrepo=$repodir/centos
epelrepo=$repodir/epel
mirror=yum.dmz.lab.local
centoshost=$mirror::centos
epelhost=$mirror::fedora-epel
lockfile=/var/lock/subsys/yum_rsync
logfile=/var/log/yum_rsync.log
progfile=/var/log/yum_rsync_prog.log
# Build the commands, with more variables
rsync="rsync -hlmprtzDHS --stats --no-motd --del --delete-excluded --log-file=$progfile"
teelog="tee -a $logfile $progfile"
# Here we go...
printf '%s: Progress log reset.\n' "$(date -u +%FT%TZ)" > $progfile
printf '%s: Started synchronization of CentOS and EPEL repositories.\n' "$(date -u +%FT%TZ)" | $teelog
printf '%s: Use tail -f %s to view progress.\n\n' "$(date -u +%FT%TZ)" "$progfile"
# Check if the rsync script is already running
if [ -f $lockfile ]; then
printf '%s: Error: Repository updates are already running.\n\n' "$(date -u +%FT%TZ)" | $teelog
exit 10
# Check that we can reach the public mirror
elif ! ping -c 5 $mirror &> /dev/null; then
printf '%s: Error: Cannot reach the %s mirror server.\n\n' "$(date -u +%FT%TZ)" "$mirror" | $teelog
exit 20
# Check that the repository is mounted
elif ! mount | grep $repodir &> /dev/null; then
printf '%s: Error: Directory %s is not mounted.\n\n' "$(date -u +%FT%TZ)" "$repodir" | $teelog
exit 30
else
# Just sync everything since we're downstream
# Create lockfile, sync centos repo, delete lockfile
printf '%s: Beginning rsync of CentOS repo from %s.\n' "$(date -u +%FT%TZ)" "$centoshost" | $teelog
touch $lockfile
$rsync "$centoshost/" "$centosrepo/"
rm -f $lockfile
printf '%s: Done.\n\n' "$(date -u +%FT%TZ)" | $teelog
# Create lockfile, sync epel repo, delete lockfile
printf '%s: Beginning rsync of EPEL repo from %s.\n' "$(date -u +%FT%TZ)" "$epelhost" | $teelog
touch $lockfile
$rsync "$epelhost/" "$epelrepo/"
rm -f $lockfile
printf '%s: Done.\n\n' "$(date -u +%FT%TZ)" | $teelog
fi
# Now we're done
printf '%s: Completed synchronization of CentOS and EPEL repositories.\n\n' "$(date -u +%FT%TZ)" | $teelog
exit 0
#!/bin/bash
# Apt repository updater script for Ubuntu (upstream)
# Currently syncs Ubuntu, Debian, and Debian Security
# Version 1.4.2 updated 20181003 by <AfroThundr>
# Version handler
for i in "$@"; do
if [ "$i" = "-v" ]; then
v=$(head -4 "$0" | tail -1)
printf '%s\n' "$v"
exit 0
fi
done
# Declare some variables (modify as necessary)
arch=amd64
repodir=/srv/repository
ubunturepo=$repodir/ubuntu
debianrepo=$repodir/debian
debsecrepo=$repodir/debian-security
mirror=mirrors.mit.edu
smirror=security.debian.org
ubuntuhost=$mirror::ubuntu
debianhost=$mirror::debian
debsechost=$smirror/
lockfile=/var/lock/subsys/apt_mirror
logfile=/var/log/apt_mirror.log
progfile=/var/log/apt_mirror_prog.log
# Declare some more vars (modify as necessary)
ubuntucomps="main,restricted,universe,multiverse"
debiancomps="main,contrib,non-free"
ubunturel1="trusty,trusty-backports,trusty-proposed,trusty-security,trusty-updates"
ubunturel2="xenial,xenial-backports,xenial-proposed,xenial-security,xenial-updates"
debianrel1="wheezy,wheezy-backports,wheezy-updates,wheezy-proposed-updates"
debianrel2="jessie,jessie-backports,jessie-updates,jessie-proposed-updates"
debsecrel="wheezy/updates,jessie/updates"
# Build the commands, with more variables
ubuntuopts="-s $ubuntucomps -d $ubunturel1 -d $ubunturel2 -h $mirror -r /ubuntu"
debianopts="-s $debiancomps -d $debianrel1 -d $debianrel2 -h $mirror -r /debian"
debsecopts="-s $debiancomps -d $debsecrel -h $smirror -r /"
ropts="-hlmprtzDHS --stats --no-motd --del --delete-excluded --log-file=$progfile"
dmirror="debmirror -a $arch --no-source --ignore-small-errors --method=rsync --retry-rsync-packages=5 -p --rsync-options="
dmirror2="debmirror -a $arch --no-source --ignore-small-errors --method=http --checksums -p"
teelog="tee -a $logfile $progfile"
# Here we go...
printf '%s: Progress log reset.\n' "$(date -u +%FT%TZ)" > $progfile
printf '%s: Started synchronization of Ubuntu and Debian repositories.\n' "$(date -u +%FT%TZ)" | $teelog
printf '%s: Use tail -f %s to view progress.\n\n' "$(date -u +%FT%TZ)" "$progfile"
# Check if the rsync script is already running
if [ -f $lockfile ]; then
printf '%s: Error: Repository updates are already running.\n\n' "$(date -u +%FT%TZ)" | $teelog
exit 10
# Check that we can reach the public mirror
elif ! ping -c 5 $mirror &> /dev/null; then
printf '%s: Error: Cannot reach the %s mirror server.\n\n' "$(date -u +%FT%TZ)" "$mirror" | $teelog
exit 20
# Check that the repository is mounted
elif ! mount | grep $repodir &> /dev/null; then
printf '%s: Error: Directory %s is not mounted.\n\n' "$(date -u +%FT%TZ)" "$repodir" | $teelog
exit 30
else
export GNUPGHOME=$repodir
# Create lockfile, sync ubuntu repo, delete lockfile
printf '%s: Beginning rsync of Ubuntu repo from %s.\n' "$(date -u +%FT%TZ)" "$ubuntuhost" | $teelog
touch $lockfile
$dmirror"$ropts" $ubuntuopts $ubunturepo
rm -f $lockfile
printf '%s: Done.\n\n' "$(date -u +%FT%TZ)" | $teelog
# Create lockfile, sync debian repo, delete lockfile
printf '%s: Beginning rsync of Debian repo from %s.\n' "$(date -u +%FT%TZ)" "$debianhost" | $teelog
touch $lockfile
$dmirror"$ropts" $debianopts $debianrepo
rm -f $lockfile
printf '%s: Done.\n\n' "$(date -u +%FT%TZ)" | $teelog
# Create lockfile, sync debian security repo, delete lockfile
printf '%s: Beginning rsync of Debian Security repo from %s.\n' "$(date -u +%FT%TZ)" "$debsechost" | $teelog
touch $lockfile
$dmirror2 $debsecopts $debsecrepo
rm -f $lockfile
printf '%s: Done.\n\n' "$(date -u +%FT%TZ)" | $teelog
unset GNUPGHOME
fi
# Now we're done
printf '%s: Completed synchronization of Ubuntu and Debian repositories.\n\n' "$(date -u +%FT%TZ)" | $teelog
exit 0
#!/bin/bash
# Apt repository updater script for Ubuntu (downstream)
# Currently syncs Ubuntu, Debian, and Debian Security
# Version 1.4.2 updated 20181003 by <AfroThundr>
# Version handler
for i in "$@"; do
if [ "$i" = "-v" ]; then
v=$(head -4 "$0" | tail -1)
printf '%s\n' "$v"
exit 0
fi
done
# Declare some variables (modify as necessary)
repodir=/srv/repository
ubunturepo=$repodir/ubuntu
debianrepo=$repodir/debian
debsecrepo=$repodir/debian-security
mirror=apt.dmz.lab.local
ubuntuhost=$mirror::ubuntu
debianhost=$mirror::debian
debsechost=$mirror::debian-security
lockfile=/var/lock/subsys/apt_rsync
logfile=/var/log/apt_rsync.log
progfile=/var/log/apt_rsync_prog.log
# Build the commands, with more variables
rsync="rsync -hlmprtzDHS --stats --no-motd --del --delete-excluded --log-file=$progfile"
teelog="tee -a $logfile $progfile"
# Here we go...
printf '%s: Progress log reset.\n' "$(date -u +%FT%TZ)" > $progfile
printf '%s: Started synchronization of Ubuntu and Debian repositories.\n' "$(date -u +%FT%TZ)" | $teelog
printf '%s: Use tail -f %s to view progress.\n\n' "$(date -u +%FT%TZ)" "$progfile"
# Check if the rsync script is already running
if [ -f $lockfile ]; then
printf '%s: Error: Repository updates are already running.\n\n' "$(date -u +%FT%TZ)" | $teelog
exit 10
# Check that we can reach the public mirror
elif ! ping -c 5 $mirror &> /dev/null; then
printf '%s: Error: Cannot reach the %s mirror server.\n\n' "$(date -u +%FT%TZ)" "$mirror" | $teelog
exit 20
# Check that the repository is mounted
elif ! mount | grep $repodir &> /dev/null; then
printf '%s: Error: Directory %s is not mounted.\n\n' "$(date -u +%FT%TZ)" "$repodir" | $teelog
exit 30
else
# Just sync everything since we're downstream
# Create lockfile, sync ubuntu repo, delete lockfile
printf '%s: Beginning rsync of Ubuntu repo from %s.\n' "$(date -u +%FT%TZ)" "$ubuntuhost" | $teelog
touch $lockfile
$rsync "$ubuntuhost/" "$ubunturepo/"
rm -f $lockfile
printf '%s: Done.\n\n' "$(date -u +%FT%TZ)" | $teelog
# Create lockfile, sync debian repo, delete lockfile
printf '%s: Beginning rsync of Debian repo from %s.\n' "$(date -u +%FT%TZ)" "$debianhost" | $teelog
touch $lockfile
$rsync "$debianhost/" "$debianrepo/"
rm -f $lockfile
printf '%s: Done.\n\n' "$(date -u +%FT%TZ)" | $teelog
# Create lockfile, sync debian security repo, delete lockfile
printf '%s: Beginning rsync of Debian Security repo from %s.\n' "$(date -u +%FT%TZ)" "$debsechost" | $teelog
touch $lockfile
$rsync "$debsechost/" "$debsecrepo/"
rm -f $lockfile
printf '%s: Done.\n\n' "$(date -u +%FT%TZ)" | $teelog
fi
# Now we're done
printf '%s: Completed synchronization of Ubuntu and Debian repositories.\n\n' "$(date -u +%FT%TZ)" | $teelog
exit 0
[Unit]
Description=Updates CentOS and EPEL yum repositories.
[Service]
Type=simple
ExecStart=/usr/local/sbin/repoupdate
StandardOutput=syslog
User=root
Group=www-data
[Unit]
Description=Updates CentOS and EPEL yum repositories every 6 hours.
[Timer]
OnCalendar=3/6:15
Persistent=1
[Install]
WantedBy=timers.target
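To schedule the sync, install both units and enable the timer (unit names assumed from the filenames above):
systemctl daemon-reload
systemctl enable --now repoupdate.timer
systemctl list-timers repoupdate.timer # verify the next activation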
lock file = /var/lock/subsys/rsyncd.lock
log file = /var/log/rsyncd.log
pid file = /var/run/rsyncd.pid
[centos]
path = /srv/repository/centos
comment = CentOS Repository
uid = rsync
gid = rsync
read only = yes
list = yes
use chroot = false
[fedora-epel]
path = /srv/repository/epel
comment = EPEL Repository
uid = rsync
gid = rsync
read only = yes
list = yes
use chroot = false
lock file = /var/lock/subsys/rsyncd.lock
log file = /var/log/rsyncd.log
pid file = /var/run/rsyncd.pid
[debian]
path = /srv/repository/debian
comment = Debian Repository
uid = rsync
gid = rsync
read only = yes
list = yes
use chroot = false
[debian-security]
path = /srv/repository/debian-security
comment = Debian Security Repository
uid = rsync
gid = rsync
read only = yes
list = yes
use chroot = false
[ubuntu]
path = /srv/repository/ubuntu
comment = Ubuntu Repository
uid = rsync
gid = rsync
read only = yes
list = yes
use chroot = false
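A quick client-side check that the modules are exported (hostnames are assumptions; substitute your mirror host):
rsync rsync://yum.lab.local/ # should list the centos and fedora-epel modules
rsync rsync://apt.lab.local/debian/ # should list the top of the debian tree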
[Unit]
Description = rsync daemon
After = network.target
[Service]
Type = simple
ExecStart = /bin/rsync --daemon --no-detach
Restart = on-failure
PrivateTmp = true
PIDFile = /var/run/rsyncd.pid
[Install]
WantedBy = multi-user.target
<VirtualHost *:80>
ServerName yum.lab.local
DocumentRoot /var/www/html
DirectoryIndex index.html
<Directory "/var/www/html">
AllowOverride None
Require all granted
</Directory>
CustomLog /var/log/httpd/yum-access.log combined
ErrorLog /var/log/httpd/yum-error.log
</VirtualHost>
<VirtualHost *:80>
ServerName apt.lab.local
DocumentRoot /var/www/html
DirectoryIndex index.html
<Directory "/var/www/html">
AllowOverride None
Require all granted
</Directory>
CustomLog /var/log/httpd/apt-access.log combined
ErrorLog /var/log/httpd/apt-error.log
</VirtualHost>
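Once the vhosts are live, a quick smoke test from a client (URLs assume the repositories are exposed under /repos/ as in the client configs below):
curl -I http://yum.lab.local/repos/centos/7/os/x86_64/repodata/repomd.xml
curl -I http://apt.lab.local/repos/debian/dists/stable/Release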
# CentOS local.repo
# Note: RPM-GPG-KEY-EPEL-7 not installed by default
[Base]
name=CentOS-$releasever - Base
baseurl=http://yum.dmz.local/repos/centos/$releasever/os/$basearch/
gpgcheck=1
enabled=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7
[updates]
name=CentOS-$releasever - Updates
baseurl=http://yum.dmz.local/repos/centos/$releasever/updates/$basearch/
gpgcheck=1
enabled=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7
[extras]
name=CentOS-$releasever - Extras
baseurl=http://yum.dmz.local/repos/centos/$releasever/extras/$basearch/
gpgcheck=1
enabled=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7
[centosplus]
name=CentOS-$releasever - Plus
baseurl=http://yum.dmz.local/repos/centos/$releasever/centosplus/$basearch/
gpgcheck=1
enabled=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7
[epel]
name=Extra Packages for Enterprise Linux 7
baseurl=http://yum.dmz.local/repos/epel/$releasever/$basearch
gpgcheck=1
enabled=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7
[epel-testing]
name=Extra Packages for Enterprise Linux 7 - Testing
baseurl=http://yum.dmz.local/repos/epel/testing/$releasever/$basearch
gpgcheck=1
enabled=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7
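After dropping this file into /etc/yum.repos.d/ on a client, verify that all six repos resolve:
yum clean all && yum repolist enabled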
# Debian apt sources.list
# Replace $(version) with your actual release
deb http://apt.lab.local/repos/debian/ $(version) main contrib non-free
deb http://apt.lab.local/repos/debian/ $(version)-updates main contrib non-free
deb http://apt.lab.local/repos/debian/ $(version)-backports main contrib non-free
deb http://apt.lab.local/repos/debian/ $(version)-proposed-updates main contrib non-free
deb http://apt.lab.local/repos/debian-security/ $(version)/updates main contrib non-free
# Ubuntu apt sources.list
# Replace $(version) with your actual release
deb http://apt.lab.local/repos/ubuntu/ $(version) main restricted universe multiverse
deb http://apt.lab.local/repos/ubuntu/ $(version)-updates main restricted universe multiverse
deb http://apt.lab.local/repos/ubuntu/ $(version)-backports main restricted universe multiverse
deb http://apt.lab.local/repos/ubuntu/ $(version)-proposed main restricted universe multiverse
deb http://apt.lab.local/repos/ubuntu/ $(version)-security main restricted universe multiverse
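After substituting $(version) and installing the list on a client, refresh the package index:
apt-get update && apt-cache policy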
/var/log/repoupdate {
weekly
rotate 4
notifempty
missingok
create 0640 root root
}
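The rotation can be tested without touching the logs (the path is an assumption; adjust to wherever the config is installed):
logrotate --debug /etc/logrotate.d/repoupdate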
#!/usr/bin/perl -w
=head1 NAME
debmirror - Debian partial mirror script, with ftp, http or
rsync and package pool support
=head1 SYNOPSIS
B<debmirror> [I<options>] I<mirrordir>
=head1 DESCRIPTION
This program downloads and maintains a partial local Ubuntu mirror. It can
mirror any combination of architectures, distributions, and sections. Files
are transferred by ftp, http, https, or rsync, and package pools are fully
supported. It also does locking and updates trace files.
The partial mirror created by this program is not suitable to be used as a
public Debian mirror. If that is your aim, you should instead follow
the instructions at L<http://www.debian.org/mirrors/ftpmirror>.
This program mirrors in three steps.
=over 4
=item 1. download Packages and Sources files
First it downloads all Packages and Sources files for the subset of Ubuntu it
was instructed to get.
=item 2. download everything else
The Packages and Sources files are scanned, to build up a list of all the
files they refer to. A few other miscellaneous files are added to the list.
Then the program makes sure that each file in the list is present on the
local mirror and is up-to-date, using file size (and optionally checksum) checks.
Any necessary files are downloaded.
=item 3. clean up unknown files
Any files and directories on the local mirror that are not in the list are
removed.
=back
=cut
sub usage {
warn join(" ", @_)."\n" if @_;
warn <<EOF;
Usage: $0 [options] <mirrordir>
For details, see man page.
EOF
exit(1);
}
=head1 OPTIONS
=over 4
=item I<mirrordir>
This required (unless defined in a configuration file) parameter specifies
where the local mirror directory is. If the directory does not exist, it will
be created. Be careful; telling this program that your home directory is the
mirrordir is guaranteed to replace your home directory with an Ubuntu mirror!
=item B<-p>, B<--progress>
Displays progress bars as files are downloaded.
=item B<-v>, B<--verbose>
Displays progress between file downloads.
=item B<--debug>
Enables verbose debug output, including ftp protocol dump.
=item B<--dry-run>
Simulate a mirror run. This will still download the meta files to the
F<./.temp> working directory, but won't replace the old meta files, won't
download debs and source files and only simulates cleanup.
=item B<--skip-installer>=I<foo[,bar,..]>
Don't download debian-installer files for the specified distribution.
=item B<--help>
Display a usage summary.
=item B<-h>, B<--host>=I<remotehost>
Specify the remote host to mirror from. Defaults to I<archive.ubuntu.com>;
you are strongly encouraged to find a closer mirror.
=item B<-r>, B<--root>=I<directory>
Specifies the directory on the remote host that is the root of the Ubuntu
archive. Defaults to F<ubuntu>, which will work for most mirrors. The root
directory has a F<dists> subdirectory.
=item B<--method>=I<method>
Specify the method to download files. Currently, supported methods are
B<ftp>, B<http>, B<https>, and B<rsync>. The B<file> method is
experimentally supported.
=item B<--passive>
Download in passive mode when using ftp.
=item B<-u>, B<--user>=I<remoteusername>
Specify the remote user name to use to log into the remote host.
Defaults to C<anonymous>.
=item B<--passwd>=I<remoteuserpassword>
Specify the remote user password to use to log into the remote ftp host.
It is used with B<--user> and defaults to C<anonymous@>.
=item B<--proxy>=I<http://user:pass@url:port/>
Specifies the http proxy (like Squid) to use for http or ftp methods.
=item B<-d>, B<--dist>=I<foo[,bar,..]>
Specify the distribution (lucid, oneiric, precise) of Ubuntu to
mirror. This switch may be used multiple times, and multiple
distributions may be specified at once, separated by commas.
You may also use the suite names stable, testing, and unstable.
=item B<--omit-suite-symlinks>
With this option set, B<debmirror> will not create the
symlink from I<suite> to I<codename>.
This is needed for example when mirroring archived Debian
releases as they will all have either C<stable> or C<oldstable> as
suite in their F<Release> files.
=item B<-s>, B<--section>=I<foo[,bar,..]>
Specify the section of Ubuntu to mirror. Defaults to
C<main,contrib,non-free,main/debian-installer>.
=item B<-a>, B<--arch>=I<foo[,bar,..]>
Specify the architectures to mirror. The default is B<--arch=i386>.
Specifying B<--arch=none> will mirror no archs.
=item B<--rsync-extra>=I<foo[,bar,..]>
Allows you to also mirror files from a number of directories that are not
part of the package archive itself.
B<Debmirror> will B<always> use rsync for the transfer of these files,
irrespective of what transfer method is specified in the B<--method> option.
This
will therefore not work if your remote mirror does not support rsync, or if
the mirror needs a different B<--root> option for rsync than for the main
transfer method specified with B<--method>.
Note that excluding individual files in the directories is not supported.
The following values are supported.
=over 2
=item B<doc>
Download all files and subdirectories in F<doc> directory, and all README
files in the root directory of the archive.
=item B<indices>
Download all files and subdirectories in F<indices> directory. Note that
this directory can contain some rather large files; don't include this
type unless you know you need these files.
=item B<tools>
Download all files and subdirectories in F<tools> directory.
=item B<trace>
Download the remote mirror's trace files for the archive (F<project/trace/*>).
This is enabled by default.
=item B<none>
This can be used to disable getting extra files with rsync.
=back
If specified, the update of trace files will be done at the beginning of
the mirror run; the other types are done near the end.
This switch may be used multiple times, and multiple values may be specified
at once, separated by commas; unknown values are ignored.
=item B<--di-dist>=I<dists | foo[,bar,..]>
Mirror current Debian Installer images for the specified dists.
See further the section L<Mirroring Debian Installer images> below.
=item B<--di-arch>=I<arches | foo[,bar,..]>
Mirror current Debian Installer images for the specified architectures.
See further the section L<Mirroring Debian Installer images> below.
=item B<--source>
Include source in the mirror (default).
=item B<--nosource>
Do not include source.
=item B<--i18n>
Additionally download F<Translation-E<lt>langE<gt>.bz2> files, which contain
translations of package descriptions. Selection of specific translations is
possible using the B<--include> and B<--exclude> options. The default
is to download only the English file.
=item B<--getcontents>
Additionally download F<Contents.E<lt>archE<gt>.gz> files. Note that these
files can be relatively big and can change frequently, especially for the
testing and unstable suites. Use of the available diff files is strongly
recommended (see the B<--diff> option).
=item B<--checksums>
Use checksums to determine if files on the local mirror that are
the correct size actually have the correct content. Not enabled by default,
because it is too paranoid and too slow. When the state cache is used,
B<debmirror> will only check checksums during runs where the cache has
expired or been invalidated, so it is worth considering using these two
options together.
=item B<--ignore-missing-release>
Don't fail if the F<Release> file is missing.
=item B<--check-gpg>, B<--no-check-gpg>
Controls whether gpg signatures from the F<Release.gpg> file should be
checked. The default is to check signatures.
=item B<--keyring>=I<file>
Use I<file> as an additional gpg-format keyring. May be given multiple times.
Note that these will be used in addition to $GNUPGHOME/trustedkeys.gpg.
The latter can be removed from the set of keyrings by setting
$GNUPGHOME to something non-existent when using this option.
On a typical Debian system, the Debian archive keyring can be used
directly with this option:
debmirror --keyring /usr/share/keyrings/debian-archive-keyring.gpg ...
=item B<--ignore-release-gpg>
Don't fail if the F<Release.gpg> file is missing. If the file does exist, it
is mirrored and verified, but any errors are ignored.
=item B<--ignore>=I<regex>
Never delete any files whose filenames match the regex. May be used multiple times.
=item B<--exclude>=B<regex>
Never download any files whose filenames match the regex. May be used multiple times.
=item B<--include>=I<regex>
Don't exclude any files whose filenames match the regex. May be used multiple times.
=item B<--exclude-deb-section>=I<regex>
Never download any files whose Debian Section (games, doc, oldlibs,
science, ...) match the regex. May be used multiple times.
=item B<--limit-priority>=I<regex>
Limit download to files whose Debian Priority (required, extra,
optional, ...) match the regex. May be used multiple times.
=item B<--exclude-field>=I<fieldname>=I<regex>
Never download any binary packages where the contents of I<fieldname> match
the regex. May be used multiple times. If this option is used and the mirror
includes source packages, only those source packages corresponding to
included binary packages will be downloaded.
=item B<--include-field>=I<fieldname>=I<regex>
Don't exclude any binary packages where the contents of I<fieldname> match
the regex. May be used multiple times. If this option is used and the mirror
includes source packages, only those source packages corresponding to
included binary packages will be downloaded.
=item B<-t>, B<--timeout>=I<seconds>
Specifies the timeout to use for network operations (either FTP or rsync).
Set this to a higher value if you experience failed downloads. Defaults
to 300 seconds.
=item B<--max-batch>=I<number>
Download at most max-batch number of files (and ignore rest).
=item B<--rsync-batch>=I<number>
Download at most number of files with each rsync call and then loop.
=item B<--rsync-options>=I<options>
Specify alternative rsync options to be used. Default options are
"-aL --partial". Care must be taken when specifying alternative
options not to disrupt operations, it's best to only add to those
options.
The most likely option to add is "--bwlimit=x" to avoid saturating the
bandwidth of your link.
=item B<--postcleanup>
Clean up the local mirror but only after mirroring is complete and
only if there was no error.
This is the default, because it ensures that the mirror is consistent
at all times.
=item B<--precleanup>
Clean up the local mirror before starting mirroring.
This option may be useful if you have limited disk space, but it will result
in an inconsistent mirror when debmirror is running.
The deprecated B<--cleanup> option also enables this mode.
=item B<--nocleanup>
Do not clean up the local mirror.
=item B<--skippackages>
Don't re-download F<Packages> and F<Sources> files.
Useful if you know they are up-to-date.
=item B<--diff>=I<use|mirror|none>
If B<--diff=use> is specified and the F<Release> file contains entries for
diff files, then debmirror will attempt to use them to update F<Packages>,
F<Sources>, and F<Contents> files (which can significantly reduce the download
size for meta files), but will not include them in the mirror. This is
the default behavior and avoids having time consuming diff files for a
fast local mirror.
Specifying B<--diff=mirror> does the same as B<use>, but will also include
the downloaded diff files in the local mirror. Specify B<--diff=none> to
completely ignore diff files.
Note that if rsync is used as method to download files and the archive
being mirrored has "rsyncable" gzipped meta files, then using B<--diff=none>
may be the most efficient way to download them. See the B<gzip>(1) man page
for information about its rsyncable option.
=item B<--gzip-options>=I<options>
Specify alternative options to be used when calling B<gzip>(1) to compress meta
files after applying diffs. The default options are C<-9 -n --rsyncable>
which corresponds with the options used to gzip meta files for the main
Debian archive.
These options may need to be modified if the checksum of the file as gzipped by
debmirror does not match the checksum listed in the F<Release> file (which will
result in the gzipped file being downloaded unnecessarily after diffs were
successfully applied).
=item B<--slow-cpu>
By default debmirror saves some bandwidth by performing cpu-intensive
tasks, such as compressing files to generate .gz and .xz files. Use this
mode if the computer's CPU is slow, and it makes more sense to use more
bandwidth and less CPU.
This option implies B<--diff=none>.
=item B<--state-cache-days>=I<number>
Save the state of the mirror in a cache file between runs. The cache will
expire after the specified number of days, at which time a full check and
cleanup of the mirror will be done. While the cache is valid, B<debmirror>
will trust that the mirror is consistent with this cache.
The cache is only used for files that have a unique name, i.e. binary
packages and source files. If a mirror update fails for any reason, the
cache will be invalidated and the next run will include a full check.
Main advantage of using the state cache is that it avoids a large amount
of disk access while checking which files need to be fetched. It may also
reduce the time required for mirror updates.
=item B<--ignore-small-errors>
Normally B<debmirror> will report an error if any deb files or sources
fail to download and refuse to update the meta data to an inconsistent
mirror. Normally this is a good thing, as it indicates something went
wrong during download and should be retried. But sometimes the
upstream mirror actually is broken. Specifying B<--ignore-small-errors>
causes B<debmirror> to ignore missing or broken deb and source files but
still be pedantic about checking meta files.
=item B<--allow-dist-rename>
The directory name for a dist should be equal to its Codename and not to
a Suite. If the local mirror currently has directories named after Suites,
B<debmirror> can rename them automatically.
An existing symlink from I<codename> to I<suite> will be removed,
but B<debmirror>
will automatically create a new symlink S<suite -E<gt> codename> (immediately
after moving meta files in place). This conversion should only be needed once.
=item B<--disable-ssl-verification>
When https is used, debmirror checks that the SSL certificate is valid.
If the server has a self-signed certificate, the check can be disabled
with this option.
=item B<--debmarshal>
On each pull, keep the repository meta data from dists/* in a numbered
subdirectory, and maintain a symlink latest to the most recent pull.
This is similar to Debmarshal in tracking mode, see
debmarshal.debian.net for examples and use.
debmirror cleanup is disabled when this flag is specified.
Separate pool and snapshot cleanup utilities are available at
http://code.google.com/p/debmarshal/source/browse/#svn/trunk/repository2
=item B<--config-file>=I<file>
Specify a configuration file. This option may be repeated to read
multiple configuration files. By default debmirror reads
/etc/debmirror.conf and ~/.debmirror.conf (see section FILES).
=back
=head2 Experimental options
=over 4
=item B<--retry-rsync-packages>=I<number>
While downloading Packages and related files via rsync, try up to
this many times if rsync fails to connect. Defaults to 1, to try
only once. (A typical nondefault value is 10. To try an unlimited
number of times, use -1 or 0.)
=back
=head1 USING DEBMIRROR
=head2 Using regular expressions in options
Various options accept regular expressions that can be used to tune what
is included in the mirror. They can be any regular expression valid in
I<perl>, which also means that extended syntax is standard. Make sure to
anchor regular expressions appropriately: this is not done by debmirror.
The --include and --exclude options can be combined. This combination
for example will, if the --i18n option is used, exclude all F<Translation>
files, except for the ones for Portuguese (pt) and Brazilian (pt_BR):
--exclude='/Translation-.*\.bz2$' --include='/Translation-pt.*\.bz2$'
=head2 Mirroring Debian Installer images
Debmirror will only mirror the "current" images that are on the remote
mirror. At least one of the options --di-dist or --di-arch must be
passed to enable mirroring of the images.
The special values "dists" and "arches" can be used to tell debmirror
to use the same dists and architectures for D-I images as for the archive,
but it is also possible to specify different values. If either option is
not set, it will default to the same values as for the archive.
If you wish to create custom CD images using for example I<debian-cd>,
you will probably also want to add the option "--rsync-extra=doc,tools".
B<Limitations>
There are no progress updates displayed for D-I images.
=head2 Archive size
The tables in the file F</usr/share/doc/debmirror/mirror_size> give an
indication of the space needed to mirror the Debian archive. They are
particularly useful if you wish to set up a partial mirror.
Only the size of source and binary packages is included. You should allow
for around 1-4 GB of meta data (in F<./dists/E<lt>distE<gt>>) per suite
(depending in your settings). Plus whatever space is needed for extra
directories (e.g. F<tools>, F<doc>) you wish to mirror.
The tables also show how much additional space is required if you add
a release on top of its predecessor. Note that the additional space
needed for testing and (to a lesser extent) unstable varies during the
development cycle of a release. The additional space needed for testing
is zero immediately after a stable release and grows from that time
onwards.
B<Note>
Debmirror keeps an extra copy of all meta data. This is necessary to
guarantee that the local mirror stays consistent while debmirror is
running.
=head1 EXAMPLES
Simply make a mirror in F</srv/mirror/debian>, using all defaults (or the
settings defined in F<debmirror.conf>):
debmirror /srv/mirror/debian
Make a mirror of i386 and amd64 binaries, main and universe only, and include
both LTS and latest versions of Ubuntu; download from 'archive.ubuntu.com':
debmirror -a i386,amd64 -d lucid -d precise -s main,universe --nosource \
-h archive.ubuntu.com --progress $HOME/mirror/debian
Make a mirror using rsync (rsync server is 'ftp.debian.org::debian'),
excluding the section 'debug' and the package 'foo-doc':
debmirror -e rsync $HOME/mirror/debian --exclude='/foo-doc_' \
--exclude-deb-section='^debug$'
=head1 FILES
/etc/debmirror.conf
~/.debmirror.conf
Debmirror will look for the presence of these files and load them
in the indicated order if they exist.
See the example in /usr/share/doc/debmirror/examples for syntax.
~/.gnupg/trustedkeys.gpg
When gpg checking is enabled,
debmirror uses gpgv to verify Release and Release.gpg using the
default keyring ~/.gnupg/trustedkeys.gpg. This can be changed by
exporting GNUPGHOME resulting in $GNUPGHOME/trustedkeys.gpg being
used. (Note that keyring files can also be specified directly
with debmirror's --keyring option -- see above).
To add the right key to this keyring you can import it from the
ubuntu keyring (in case of the Ubuntu archive) using:
gpg --keyring /usr/share/keyrings/ubuntu-archive-keyring.gpg --export \
| gpg --no-default-keyring --keyring trustedkeys.gpg --import
or download the key from a keyserver:
gpg --no-default-keyring --keyring trustedkeys.gpg \
--keyserver keyserver.ubuntu.com --recv-keys <key ID>
The <key ID> can be found in the gpgv error message in debmirror:
gpgv: Signature made Tue Jan 23 09:07:53 2007 CET using DSA key ID 2D230C5F
=cut
use strict;
use Cwd;
use Storable qw(nstore retrieve);
use Getopt::Long;
use File::Temp qw/ tempfile /;
use File::Path qw(make_path);
use IO::Pipe;
use IO::Select;
use LockFile::Simple;
use Compress::Zlib;
use Digest::MD5;
use Digest::SHA;
use if $] lt "5.022", "Net::INET6Glue";
use Net::FTP;
use LWP::UserAgent;
# Yeah, I use too many global variables in this program.
our $mirrordir;
our @config_files;
our ($debug, $progress, $verbose, $passive, $skippackages, $getcontents, $i18n);
our ($ua, $proxy, $ftp);
our (@dists, @sections, @arches, @ignores, @excludes, @includes, @keyrings, @skip_installer);
our (@excludes_deb_section, @limit_priority);
our (%excludes_field, %includes_field);
our (@di_dists, @di_arches, @rsync_extra);
our $state_cache_days = 0;
our $verify_checksums = 0;
our $pre_cleanup=0;
our $post_cleanup=1;
our $no_cleanup=0;
our $do_source=1;
our $host="archive.ubuntu.com";
our $user="anonymous";
our $passwd="anonymous@";
our $remoteroot="ubuntu";
our $download_method="ftp";
our $timeout=300;
our $max_batch=0;
our $rsync_batch=300;
our $num_errors=0;
our $bytes_to_get=0;
our $bytes_gotten=0;
our $bytes_meta=0;
our $doing_meta=1;
our $ignore_missing_release=0;
our $ignore_release_gpg=0;
our $start_time = time;
our $dry_run=0;
our $do_dry_run=0;
our $rsync_options="-aL --partial";
our $ignore_small_errors=0;
our $diff_mode="use";
our $gzip_options="-9 -n --rsyncable";
our $omit_suite_symlinks=0;
our $allow_dist_rename=0;
our $debmarshal=0;
our $disable_ssl_verification;
our $retry_rsync_packages=1;
our $slow_cpu=0;
our $check_gpg=1;
our $new_mirror=0;
our $retry_rsync_packages_delay=30; # seconds
my @errlog;
my $HOME;
($HOME = $ENV{'HOME'}) or die "HOME not defined in environment!\n";
# Switch to auto-flushing mode for stdout.
select STDOUT; $|=1;
# Load in config files first so options can override them.
Getopt::Long::Configure qw(pass_through);
GetOptions('config-file=s' => \@config_files);
if (@config_files) {
foreach my $config_file (@config_files) {
die "Can't open config file $config_file!\n" if ! -r $config_file;
require $config_file;
}
} else {
require "/etc/debmirror.conf" if -r "/etc/debmirror.conf";
require "$HOME/.debmirror.conf" if -r "$HOME/.debmirror.conf";
}
# This hash contains the releases to mirror. If both codename and suite can be
# determined from the Release file, the codename is used in the key. If not,
# it can also be a suite (or whatever was requested by the user).
# The hash has three subtypes:
# - suite: if both codename and suite could be determined from the Release file,
# the codename is the key and the value is the name of the suite - used to
# update the suite -> codename symlinks;
# - mirror: set to 1 if the package archive should be mirrored for the dist;
# - d-i: set to 1 if D-I images should be mirrored for the dist.
# For the last two subtypes the key can also include a subdir.
my %distset=();
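# An illustrative (hypothetical) entry for a dist 'bookworm' whose Release
# file maps suite 'stable' to it, mirroring both the archive and D-I images:
#   $distset{'bookworm'} = { suite => 'stable', mirror => 1, 'd-i' => 1 };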
# This hash holds all the files we know about. Values are:
# - -1: file was not on mirror and download attempt failed
# - 0: file was not on mirror and either needs downloading or was
# downloaded this run
# - 1: file is on mirror and wanted according to meta data
# - 2: file is on mirror and listed in state cache, but not (yet)
# verified as wanted according to meta data
# Values -1 and 2 can occur in the state cache; see $files_cache_version
# below! Filenames should be relative to $mirrordir.
my %files;
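# Illustrative (hypothetical) entries:
#   $files{'dists/bookworm/Release'} = 1;               # on mirror, wanted
#   $files{'pool/main/f/foo/foo_1.0-1_amd64.deb'} = 0;  # needs downloading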
# Hash to record size and checksums of meta files and package files (from the
# Release file and Source/Packages files).
my %file_lists;
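# Illustrative (hypothetical) entries, as filled in from a Release file:
#   $file_lists{"$tempdir/dists/bookworm/main/binary-amd64/Packages.gz"}{SHA256} = $checksum;
#   $file_lists{"$tempdir/dists/bookworm/main/binary-amd64/Packages.gz"}{size} = 54321;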
# Hash to record which Translation files needs download. Contains size and sha1
# info. Files also get registered in %files.
my %i18n_get;
# Hash to record which DEP-11 metadata files need to be downloaded. Files
# also get registered in %files.
my %dep11_get;
# Separate hash for files belonging to Debian Installer images.
# This data is not cached.
my %di_files;
## State cache meta-data
my $use_cache = 0;
my $state_cache_exptime;
# Next variable *must* be changed if the structure of the %files hash is
# changed in a way that makes old state-cache files incompatible.
my $files_cache_version = "1.0";
my $help;
Getopt::Long::Configure qw(no_pass_through);
GetOptions('debug' => \$debug,
'progress|p' => \$progress,
'verbose|v' => \$verbose,
'source!' => \$do_source,
'checksums!' => \$verify_checksums,
'md5sums|m' => \$verify_checksums, # back compat
'passive!' => \$passive,
'host|h=s' => \$host,
'user|u=s' => \$user,
'passwd=s' => \$passwd,
'root|r=s' => \$remoteroot,
'dist|d=s' => \@dists,
'section|s=s' => \@sections,
'arch|a=s' => \@arches,
'di-dist=s' => \@di_dists,
'di-arch=s' => \@di_arches,
'rsync-extra=s' => \@rsync_extra,
'precleanup' => \$pre_cleanup,
'cleanup' => \$pre_cleanup,
'postcleanup' => \$post_cleanup,
'nocleanup' => \$no_cleanup,
'ignore=s' => \@ignores,
'skip-installer=s' => \@skip_installer,
'exclude=s' => \@excludes,
'exclude-deb-section=s' => \@excludes_deb_section,
'limit-priority=s' => \@limit_priority,
'include=s' => \@includes,
'exclude-field=s' => \%excludes_field,
'include-field=s' => \%includes_field,
'skippackages' => \$skippackages,
'i18n' => \$i18n,
'getcontents' => \$getcontents,
'method|e=s' => \$download_method,
'timeout|t=s' => \$timeout,
'max-batch=s' => \$max_batch,
'rsync-batch=s' => \$rsync_batch,
'state-cache-days=s' => \$state_cache_days,
'ignore-missing-release' => \$ignore_missing_release,
'ignore-release-gpg' => \$ignore_release_gpg,
'check-gpg!' => \$check_gpg,
'dry-run' => \$dry_run,
'proxy=s' => \$proxy,
'rsync-options=s' => \$rsync_options,
'gzip-options=s' => \$gzip_options,
'ignore-small-errors' => \$ignore_small_errors,
'diff=s' => \$diff_mode,
'omit-suite-symlinks' => \$omit_suite_symlinks,
'allow-dist-rename' => \$allow_dist_rename,
'debmarshal' => \$debmarshal,
'slow-cpu' => \$slow_cpu,
'disable-ssl-verification' => \$disable_ssl_verification,
'retry-rsync-packages=s' => \$retry_rsync_packages,
'keyring=s' => \@keyrings,
'help' => \$help,
) or usage;
usage if $help;
usage("invalid number of arguments") if $ARGV[1];
# This parameter is so important that it is the only required parameter,
# unless specified in a configuration file.
$mirrordir = shift if $ARGV[0];
usage("mirrordir not specified") unless defined $mirrordir;
# Constrain other parameters
$diff_mode="none" if $slow_cpu;
if ($download_method eq 'hftp') { # deprecated
$download_method='ftp';
}
$retry_rsync_packages =~ /^-?\d+$/
or die 'unusable retry-rsync-packages value';
$retry_rsync_packages < 1 and $retry_rsync_packages = -1;
# Check for patch binary if needed
if (!($diff_mode eq "none")) {
if (system("patch --version 2>/dev/null >/dev/null")) {
say("Patch binary missing, falling back to --diff=none");
push (@errlog,"Patch binary missing, falling back to --diff=none\n");
$diff_mode = "none";
}
if (system("ed --version 2>/dev/null >/dev/null")) {
say("Ed binary missing, falling back to --diff=none");
push (@errlog,"Ed binary missing, falling back to --diff=none\n");
$diff_mode = "none";
}
}
# Backwards compatibility: remote root dir no longer needs prefix
$remoteroot =~ s%^[:/]%% unless downloads_via_file();
# Post-process arrays. Allow commas to separate values the user entered.
# If the user entered nothing, provide defaults.
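# For example, hypothetical options "-d focal -d jammy,noble" arrive here as
# @dists = ('focal', 'jammy,noble') and become ('focal', 'jammy', 'noble').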
@dists=split(/,/,join(',',@dists));
@dists=qw(precise) unless @dists;
@sections=split(/,/,join(',',@sections));
@sections=qw(main main/debian-installer universe restricted multiverse) unless @sections;
@arches=split(/,/,join(',',@arches));
@arches=qw(i386) unless @arches;
@arches=() if (join(',',@arches) eq "none");
@di_dists=split(/,/,join(',',@di_dists));
@di_arches=split(/,/,join(',',@di_arches));
if (@di_dists) {
@di_dists = @dists if ($di_dists[0] eq "dists");
@di_arches = @arches if (!@di_arches || $di_arches[0] eq "arches");
} elsif (@di_arches) {
@di_dists = @dists if (!@di_dists);
@di_arches = @arches if ($di_arches[0] eq "arches");
}
@rsync_extra=split(/,/,join(',',@rsync_extra));
@rsync_extra="trace" unless @rsync_extra;
if (! grep { $_ eq 'trace' } @rsync_extra) {
print STDERR "Warning: --rsync-extra is not configured to mirror the trace files.\n";
print STDERR " This configuration is not recommended.\n";
}
@rsync_extra=() if grep { $_ eq "none" } @rsync_extra;
$pre_cleanup=0 if ($no_cleanup);
$pre_cleanup=0 if ($debmarshal);
$post_cleanup=0 if ($no_cleanup);
$post_cleanup=0 if ($pre_cleanup);
$post_cleanup=0 if ($debmarshal);
@skip_installer=split(/,/,join(',',@skip_installer));
@skip_installer=() unless @skip_installer;
# Display configuration.
$|=1 if $debug;
if ($passwd eq "anonymous@") {
if (downloads_via_http()) {
say("Mirroring to $mirrordir from $download_method://$host/$remoteroot/");
} else {
say("Mirroring to $mirrordir from $download_method://$user\@$host/$remoteroot/");
}
} else {
say("Mirroring to $mirrordir from $download_method://$user:XXX\@$host/$remoteroot/");
}
say("Arches: ".join(",", @arches));
say("Dists: ".join(",", @dists));
say("Sections: ".join(",", @sections));
say("Including source.") if $do_source;
say("D-I arches: ".join(",", @di_arches)) if @di_arches;
say("D-I dists: ".join(",", @di_dists)) if @di_dists;
say("Pdiff mode: $diff_mode");
say("Slow CPU mode.") if $slow_cpu;
say("Veriftying checksums.") if $verify_checksums;
say("Not checking Release gpg signatures.") if ! $check_gpg;
say("Passive mode on.") if $passive;
say("Proxy: $proxy") if $proxy;
say("Download at most $max_batch files.") if ($max_batch > 0);
say("Download at most $rsync_batch files per rsync call.") if (downloads_via_rsync());
if ($pre_cleanup) {
say("Will clean up before mirroring.");
} elsif ($post_cleanup) {
say("Will clean up after mirroring.");
} else {
say("Will NOT clean up.");
}
say("Dry run.") if $dry_run;
say("Debmarshal snapshots kept.") if $debmarshal;
say("Disable SSL verification.") if $disable_ssl_verification;
# Set up mirror directory and resolve $mirrordir to a full path for
# locking and rsync
if (! -d $mirrordir) {
make_dir($mirrordir);
$new_mirror = 1;
}
die "You need write permissions on $mirrordir" if (! -w $mirrordir);
chdir($mirrordir) or die "chdir $mirrordir: $!";
$mirrordir = cwd();
# Handle the lock file. This is the same method used by official
# Debian push mirrors.
my $hostname=`hostname -f 2>/dev/null || hostname`;
chomp $hostname;
my $lockfile="Archive-Update-in-Progress-$hostname";
say("Attempting to get lock ...");
my $lockmgr = LockFile::Simple->make(-format => "%f/$lockfile", -max => 12,
-delay => 10, -nfs => 1, -autoclean => 1,
-warn => 1, -stale => 1, -hold => 0);
my $lock = $lockmgr->lock("$mirrordir")
or die "$lockfile exists or you lack proper permissions; aborting";
$SIG{INT}=sub { $lock->release; exit 1 };
$SIG{TERM}=sub { $lock->release; exit 1 };
# Create tempdir if missing
my $tempdir=".temp";
make_dir($tempdir) if (! -d $tempdir);
die "You need write permissions on $tempdir" if (! -w $tempdir);
# Load the state cache.
load_state_cache() if $state_cache_days;
# Register the trace and lock files.
my $tracefile="project/trace/$hostname";
$files{$tracefile}=1;
$files{$lockfile}=1;
my $rsynctempfile;
END { unlink $rsynctempfile if $rsynctempfile }
sub init_connection {
$_ = $download_method;
downloads_via_http() && do {
$ua = LWP::UserAgent->new(keep_alive => 1);
$ua->timeout($timeout);
$ua->proxy('http', $ENV{http_proxy}) if $ENV{http_proxy};
$ua->proxy('http', $proxy) if $proxy;
$ua->show_progress($progress);
return;
};
downloads_via_https() && do {
$ua = LWP::UserAgent->new(keep_alive => 1, ssl_opts => {
verify_hostname => ! $disable_ssl_verification });
$ua->timeout($timeout);
$ua->proxy('https', $ENV{https_proxy}) if $ENV{https_proxy};
$ua->proxy('https', $proxy) if $proxy;
$ua->show_progress($progress);
return;
};
downloads_via_ftp() && do {
if ($proxy || $ENV{ftp_proxy}) {
$ua = LWP::UserAgent->new;
$ua->timeout($timeout);
$ua->proxy('ftp', $proxy ? $proxy : $ENV{ftp_proxy});
}
else {
my %opts = (Debug => $debug, Passive => $passive, Timeout => $timeout);
$ftp=Net::FTP->new($host, %opts) or die "$@\n";
$ftp->login($user, $passwd) or die "login failed"; # anonymous
$ftp->binary or die "could not set binary mode";
$ftp->cwd("/$remoteroot") or die "cwd to /$remoteroot failed";
$ftp->hash(\*STDOUT,102400) if $progress;
}
return;
};
downloads_via_file() && do {
$ua = LWP::UserAgent->new;
$ua->timeout($timeout);
$ua->show_progress($progress);
$host='localhost';
return;
};
downloads_via_rsync() && do {
return;
};
usage("unknown download method: $_");
}
init_connection();
# determine remote root for rsync transfers
my $rsyncremote;
if (length $remoteroot) {
if (downloads_via_file()) {
$rsyncremote = "$remoteroot/";
}
else {
$rsyncremote = "$host\:\:$remoteroot/";
if ($user ne 'anonymous') {
$rsyncremote = "$user\@$rsyncremote";
}
}
}
else {
if (downloads_via_rsync()) {
die "rsync cannot be used with a root of $remoteroot/\n";
}
}
# Update the remote trace files; also update ignores for @rsync_extra.
rsync_extra(1, @rsync_extra);
# Get Release files without caching for http
say("Getting meta files ...");
$ua->default_header( "Cache-Control" => "max-age=0" ) if ($ua);
foreach my $dist (@dists) {
my $tdir="$tempdir/.tmp/dists/$dist";
my $have_release = get_release($tdir, $dist);
next unless ($have_release || $ignore_missing_release);
my ($codename, $suite, $dist_sdir) = name_release("mirror", $tdir, $dist);
if ($have_release) {
my $next;
make_dir ("dists/$codename$dist_sdir");
make_dir ("$tempdir/dists/$codename$dist_sdir");
rename("$tdir/Release", "$tempdir/dists/$codename$dist_sdir/Release")
or die "Error while moving $tdir/Release: $!\n";
$files{"dists/$codename$dist_sdir/Release"}=1;
$files{$tempdir."/"."dists/$codename$dist_sdir/Release"}=1;
if ($debmarshal) {
$next = make_next_snapshot($mirrordir,$dist,$codename,
$dist_sdir,$tempdir);
}
if (-f "$tdir/Release.gpg") {
rename("$tdir/Release.gpg", "$tempdir/dists/$codename$dist_sdir/Release.gpg")
or die "Error while moving $tdir/Release.gpg: $!\n";
$files{"dists/$codename$dist_sdir/Release.gpg"}=1;
$files{$tempdir."/"."dists/$codename$dist_sdir/Release.gpg"}=1;
if ($debmarshal) {
link_release_into_snapshot($mirrordir,$dist,$next,$tempdir,
$codename,$dist_sdir,"Release.gpg");
}
}
if (-f "$tdir/InRelease") {
rename("$tdir/InRelease", "$tempdir/dists/$codename$dist_sdir/InRelease")
or die "Error while moving $tdir/InRelease: $!\n";
$files{"dists/$codename$dist_sdir/InRelease"}=1;
$files{$tempdir."/"."dists/$codename$dist_sdir/InRelease"}=1;
if ($debmarshal) {
link_release_into_snapshot($mirrordir,$dist,$next,$tempdir,
$codename,$dist_sdir,"InRelease");
}
}
}
}
# Check that @di_dists contains valid codenames
di_check_dists() if @di_dists;
foreach my $dist (keys %distset) {
next unless exists $distset{$dist}{mirror};
# Parse the Release and extract the files listed for all checksum types.
if (open RELEASE, "<$tempdir/dists/$dist/Release") {
my $checksum_type;
while (<RELEASE>) {
if (/^(MD5Sum|SHA\d+):/) {
$checksum_type=$1;
}
elsif (/^ / && defined $checksum_type) {
my ($checksum, $size, $filename) = /^ +([a-z0-9]+) +(\d+) +(.*)$/;
$file_lists{"$tempdir/dists/$dist/$filename"}{$checksum_type} = $checksum;
$file_lists{"$tempdir/dists/$dist/$filename"}{size} = $size;
}
}
close RELEASE;
}
}
if ($num_errors != 0 && $ignore_missing_release) {
say("Ignoring failed Release files.");
push (@errlog,"Ignoring failed Release files\n");
$num_errors = 0;
}
if ($num_errors != 0) {
print "Errors:\n ".join(" ",@errlog) if (@errlog);
die "Failed to download some Release, Release.gpg or InRelease files!\n";
}
# Enable caching again for http
init_connection if ($ua);
# Calculate expected downloads for meta files
# As we don't actually download most of the meta files (due to getting
# only one compression variant or using diffs), we keep a separate count
# of the actual downloaded amount of data in $bytes_meta.
# The root Release files have already been downloaded
$bytes_to_get = $bytes_meta;
$bytes_gotten = $bytes_meta;
sub add_bytes {
my $name=shift;
$bytes_to_get += $file_lists{"$tempdir/$name"}{size} if exists $file_lists{"$tempdir/$name"};
}
foreach my $dist (keys %distset) {
next unless exists $distset{$dist}{mirror};
foreach my $section (@sections) {
foreach my $arch (@arches) {
add_bytes("dists/$dist/$section/binary-$arch/Packages");
add_bytes("dists/$dist/$section/binary-$arch/Packages.gz");
add_bytes("dists/$dist/$section/binary-$arch/Packages.xz");
add_bytes("dists/$dist/$section/binary-$arch/Release");
add_bytes("dists/$dist/$section/binary-$arch/Packages.diff/Index") unless ($diff_mode eq "none");
}
# d-i does not have separate source sections
if ($do_source && $section !~ /debian-installer/) {
add_bytes("dists/$dist/$section/source/Sources");
add_bytes("dists/$dist/$section/source/Sources.gz");
add_bytes("dists/$dist/$section/source/Sources.xz");
add_bytes("dists/$dist/$section/source/Release");
add_bytes("dists/$dist/$section/source/Sources.diff/Index") unless ($diff_mode eq "none");
}
add_bytes("dists/$dist/$section/i18n/Index");
}
}
# Get and parse MD5SUMS files for D-I images.
# (There are not currently other checksums for these.)
di_add_files() if @di_dists;
# Get Packages and Sources files and other miscellany.
my (@package_files, @source_files);
foreach my $dist (keys %distset) {
next unless exists $distset{$dist}{mirror};
foreach my $section (@sections) {
# some suites don't have d-i
next if ($section =~ /debian-installer/ && di_skip_dist($dist) );
foreach my $arch (@arches) {
get_index("dists/$dist/$section/binary-$arch", "Packages");
link_index($dist,$section,$arch) if $debmarshal;
}
# d-i does not have separate source sections
if ($do_source && $section !~ /debian-installer/) {
get_index("dists/$dist/$section/source", "Sources");
link_index($dist,$section,"source") if $debmarshal;
}
}
}
# Set download size for meta files to actual values
$doing_meta=0;
$bytes_to_get=$bytes_meta;
$bytes_gotten=$bytes_meta;
# Sanity check. I once nuked a mirror because of this..
if (@arches && ! @package_files) {
print "Errors:\n ".join(" ",@errlog) if (@errlog);
die "Failed to download any Packages files!\n";
}
if ($do_source && ! @source_files) {
print "Errors:\n ".join(" ",@errlog) if (@errlog);
die "Failed to download any Sources files!\n";
}
if ($num_errors != 0) {
print "Errors:\n ".join(" ",@errlog) if (@errlog);
die "Failed to download some Package, Sources or Release files!\n";
}
# Activate dry-run option now if it was given. This delay is needed
# for the ftp method.
$do_dry_run = $dry_run;
# Determine size of Contents, Translation, and DEP-11 files to get.
if ($getcontents) {
# Updates of Contents files using diffs are done here; only full downloads
# are delayed.
say("Update Contents files.") if ($diff_mode ne "none");
my $update_contents_files_using_diffs = sub {
my($operational_parameters, $dist, $arch, $sect) = @_;
# In Debian Wheezy, the Contents-*.gz moved to '/dists/$dist/$sect/'.
# This handles the new location, but also checks the old location
# for backwards compatibility.
if ($diff_mode ne "none") {
if (!update_contents("dists/$dist$sect", "Contents-$arch")) {
add_bytes("dists/$dist$sect/Contents-$arch.gz");
}
} elsif (!check_lists("$tempdir/dists/$dist$sect/Contents-$arch.gz")) {
add_bytes("dists/$dist$sect/Contents-$arch.gz");
}
};
do_contents_for_each_dist_arch_sect(
$update_contents_files_using_diffs, [], {}
);
}
foreach my $dist (keys %distset) {
next unless exists $distset{$dist}{mirror};
foreach my $section (@sections) {
i18n_from_release($dist,"$section/i18n");
dep11_from_release($dist,"$section/dep11");
}
}
# close ftp connection to avoid timeouts, will reopen later
if ($ftp) { $ftp->quit; }
say("Parsing Packages and Sources files ...");
{
local $/="\n\n"; # Set input separator to read entire package
my $empty_mirror = 1;
my %arches = map { $_ => 1 } (@arches, "all");
my $include = "(".join("|", @includes).")" if @includes;
my $exclude = "(".join("|", @excludes).")" if @excludes;
my $exclude_deb_section =
"(".join("|", @excludes_deb_section).")" if @excludes_deb_section;
my $limit_priority = "(".join("|", @limit_priority).")" if @limit_priority;
my $field_filters =
scalar(keys %includes_field) || scalar(keys %excludes_field);
my %binaries;
foreach my $file (@package_files) {
next if (!-f $file);
open(FILE, "<", $file) or die "$file: $!";
for (;;) {
unless (defined( $_ = <FILE> )) {
last if eof;
die "$file: $!" if $!;
}
my ($filename)=m/^Filename:\s+(.*)/im;
$filename=~s:/+:/:; # remove redundant slashes in paths
my ($deb_section)=m/^Section:\s+(.*)/im;
my ($deb_priority)=m/^Priority:\s+(.*)/im;
my ($architecture)=m/^Architecture:\s+(.*)/im;
next if (!$arches{$architecture});
if(!(defined($include) && ($filename=~/$include/o))) {
next if (defined($exclude) && $filename=~/$exclude/o);
next if (defined($exclude_deb_section) && defined($deb_section)
&& $deb_section=~/$exclude_deb_section/o);
next if (defined($limit_priority) && defined($deb_priority)
&& ! ($deb_priority=~/$limit_priority/o));
}
next if $field_filters && !check_field_filters($_);
my ($package)=m/^Package:\s+(.*)/im;
$binaries{$package} = 1;
# File was listed in state cache, or file occurs multiple times
if (exists $files{$filename}) {
if ($files{$filename} >= 0) {
$files{$filename} = 1 if $files{$filename} == 2;
$empty_mirror = 0;
next;
} else { # download failed previous run, retry
$files{$filename} = 0;
}
}
my ($size)=m/^Size:\s+(\d+)/im;
my %checksums;
while (m/^(MD5sum|SHA\d+):\s+([A-Za-z0-9]+)/img) {
$checksums{$1}=$2;
}
if (check_file(filename => $filename, size => $size, %checksums)) {
$files{$filename} = 1;
} else {
$files{$filename} = 0;
$file_lists{$filename} = \%checksums;
$file_lists{$filename}{size} = $size;
$bytes_to_get += $size;
}
$empty_mirror = 0;
}
close(FILE);
}
foreach my $file (@source_files) {
next if (!-f $file);
open(FILE, "<", $file) or die "$file: $!";
SOURCE:
for (;;) {
my $stanza;
unless (defined( $stanza = <FILE> )) {
last if eof;
die "$file: $!" if $!;
}
my @lines=split(/\n/, $stanza);
my $directory;
my %source_files;
my $parse_source_files=sub {
my $checksum_type=shift;
while (@lines && $lines[0] =~ m/^ ([A-Za-z0-9]+ .*)/) {
my ($checksum, $size, $filename)=split(' ', $1, 3);
$source_files{$filename}{size}=$size;
$source_files{$filename}{$checksum_type}=$checksum;
shift @lines;
}
};
while (@lines) {
my $line=shift @lines;
if ($line=~/^Directory:\s+(.*)/i) {
$directory=$1;
}
elsif ($line=~/^Section:\s+(.*)/i) {
my $deb_section=$1;
next SOURCE if (defined($exclude_deb_section) && defined($deb_section)
&& $deb_section=~/$exclude_deb_section/o);
}
elsif ($line=~/^Priority:\s+(.*)/i) {
my $deb_priority=$1;
next SOURCE if (defined($limit_priority) && defined($deb_priority)
&& ! ($deb_priority=~/$limit_priority/o));
}
elsif ($line=~/^Binary:\s+(.*)/i) {
if ($field_filters) {
my @binary_names=split(/\s*,\s*/,$1);
my $fetching_binary=0;
for my $binary_name (@binary_names) {
if (exists $binaries{$binary_name}) {
$fetching_binary=1;
last;
}
}
next SOURCE unless $fetching_binary;
}
}
elsif ($line=~/^Files:/i) {
$parse_source_files->("MD5Sum");
}
elsif ($line=~/^Checksums-(\w+):/i) {
$parse_source_files->($1);
}
}
foreach my $filename (keys %source_files) {
my %file_data=%{$source_files{$filename}};
$filename="$directory/$filename";
$filename=~s:/+:/:; # remove redundant slashes in paths
if(!(defined($include) && $filename=~/$include/o)) {
next if (defined($exclude) && $filename=~/$exclude/o);
}
# File was listed in state cache, or file occurs multiple times
if (exists $files{$filename}) {
if ($files{$filename} >= 0) {
$files{$filename} = 1 if $files{$filename} == 2;
$empty_mirror = 0;
next;
} else { # download failed previous run, retry
$files{$filename} = 0;
}
}
if (check_file(filename => $filename, %file_data)) {
$files{$filename} = 1;
} else {
$files{$filename} = 0;
$file_lists{$filename} = \%file_data;
$bytes_to_get += $file_data{size};
}
}
$empty_mirror = 0;
}
close(FILE);
}
# Sanity check to avoid completely nuking a mirror.
if ($empty_mirror && ! $new_mirror) {
print "Errors:\n ".join(" ",@errlog) if (@errlog);
die "No packages after parsing Packages and Sources files!\n";
}
}
# With pre-mirror cleanup Contents, Translation, and DEP-11 files need to be
# downloaded before the cleanup as otherwise they would be deleted
# because they haven't been registered yet.
# With post-mirror cleanup it's more neat to do all downloads together.
# This could be simplified if we could register the files earlier.
# Download Contents, Translation, and DEP-11 files.
init_connection();
get_contents_files() if ($getcontents);
get_i18n_files();
get_dep11_files();
# Pre-mirror cleanup
if ($pre_cleanup) {
# close ftp connection during cleanup to avoid timeouts
if ($ftp) { $ftp->quit; }
cleanup_unknown_files();
init_connection();
}
say("Files to download: ".print_dl_size($bytes_to_get - $bytes_gotten));
# Download all package files that we need to get.
batch_get();
sub batch_get {
if (uses_LWP()) {
my $dirname;
my $i=0;
foreach my $file (sort keys %files) {
if (!$files{$file}) {
if (($dirname) = $file =~ m:(.*)/:) {
make_dir($dirname);
}
if ($ftp) {
ftp_get($file);
}
else {
http_get($file);
}
if ($max_batch > 0 && ++$i >= $max_batch) {
push (@errlog,"Batch limit exceeded, mirror run was partial\n");
$num_errors++;
last;
}
}
}
return;
}
else {
my $opt=$rsync_options;
my $fh;
my @result;
my $i=0;
my $j=0;
my @tofetch;
$opt .= " --progress" if $progress;
$opt .= " -v" if $verbose or $debug;
$opt .= " -n" if $do_dry_run;
$opt .= " --no-motd" unless $verbose;
foreach my $file (sort keys %files) {
push(@tofetch, $file) if (!$files{$file});
}
my $last = scalar(@tofetch);
foreach my $file (@tofetch) {
my $dirname;
my @dir;
($dirname) = $file =~ m:(.*/):;
@dir= split(/\//, $dirname);
for (0..$#dir) {
push (@result, "" . join('/', @dir[0..$_]) . "/");
}
push (@result, "$file");
$i++;
$j++;
say("want $file ($i/$last $j/$rsync_batch)") if ($progress || $verbose);
if ($j >= $rsync_batch || $i == $last) {
$j = 0;
($fh, $rsynctempfile) = tempfile();
if (@result) {
@result = sort(@result);
my $prev = "not equal to $result[0]";
@result = grep($_ ne $prev && ($prev = $_, 1), @result);
for (@result) {
print $fh "$_\n";
}
}
while (1) {
system (
"rsync --timeout=$timeout $opt $rsyncremote --include-from=$rsynctempfile --exclude='*' $mirrordir"
);
my $rc = $?;
last if ($rc == 0);
# Retry on connection failures
if (($rc>>8) == 5) {
die "rsync failed too many times!"
if ($retry_rsync_packages >= 1 && --$retry_rsync_packages < 1);
say("Pausing before retry...");
sleep($retry_rsync_packages_delay);
}
else {
die "rsync failed!";
}
}
close $fh;
unlink $rsynctempfile;
foreach my $dest (@result) {
if (-f $dest) {
if (!check_lists($dest)) {
say("$dest failed checksum verification");
$num_errors++;
}
} elsif (!-d $dest) {
say("$dest missing");
$num_errors++;
}
}
@result = ();
}
if ($max_batch > 0 && ($i + 1) >= $max_batch) {
print "Batch limit exceeded, mirror run will be partial\n";
push (@errlog,"Batch limit exceeded, mirror run was partial\n");
$num_errors++;
last;
}
}
return;
}
}
if (! @di_dists) {
download_finished();
}
say("Everything OK. Moving meta files ...");
if ($debmarshal) {
update_latest_links($mirrordir, $tempdir, @dists);
}
chdir($tempdir) or die "unable to chdir($tempdir): $!\n";
my $res=0;
foreach my $file (`find . -type f 2>/dev/null`) {
chomp $file;
$file=~s:^\./::;
# this skips diff files if unwanted
next if (!exists $files{$file});
print("Moving $file\n") if ($debug);
if (! $do_dry_run) {
$res &= unlink($mirrordir."/".$file) if ($mirrordir."/".$file);
"$file" =~ m,(^.*)/,;
make_dir("$mirrordir/$1");
if (!link($file, $mirrordir."/".$file)) {
$res &= system("cp $file $mirrordir/$file");
}
}
}
chdir($mirrordir) or die "chdir $mirrordir: $!";
# Get optional directories using rsync.
rsync_extra(0, @rsync_extra);
# Download D-I images.
if (@di_dists) {
di_get_files();
download_finished();
}
# Update suite->codename symlinks.
if (! $omit_suite_symlinks && ! $do_dry_run) {
my %suites;
opendir (DIR, 'dists') or die "Can't open dists/: $!\n";
foreach my $file (grep (!/^\.\.?$/, readdir (DIR))) {
if (-l "dists/$file") {
my $cur = readlink("dists/$file") or die "Error reading symlink dists/$file: $!";
if (exists $distset{$cur}{suite} &&
($file eq $distset{$cur}{suite} || $file eq "stable-$distset{$cur}{suite}")) {
$suites{$file} = "ok";
} else {
unlink("dists/$file") or die "Failed to remove symlink dists/$file: $!";
}
}
}
closedir (DIR);
foreach my $dist (keys %distset) {
next if (! exists $distset{$dist}{suite});
next if (!-d "dists/$dist");
my $suite = $distset{$dist}{suite};
if (! exists $suites{$suite}) {
symlink("$dist", "dists/$suite") or die "Failed to create symlink dists/$suite: $!";
}
if ($suite eq "proposed-updates"&& !exists $suites{"stable-$suite"}) {
symlink("$dist", "dists/stable-$suite") or die "Failed to create symlink dists/stable-$suite: $!";
}
}
}
# Write out trace file.
if (! $do_dry_run) {
make_dir("project/trace");
open OUT, ">$tracefile" or die "$tracefile: $!";
print OUT `LC_ALL=C date -u`;
close OUT;
}
# Post mirror cleanup.
cleanup_unknown_files() if ($post_cleanup && ! $debmarshal);
# Mirror cleanup for directories.
if (! $use_cache && ($pre_cleanup || $post_cleanup)) {
# Remove all empty directories. Not done as part of main cleanup
# to prevent race problems with pool download code, which
# makes directories.. Sort so they are removable in bottom-up
# order.
chdir($mirrordir) or die "chdir $mirrordir: $!";
system("find . -depth -type d ! -name . ! -name .. -print0 2>/dev/null | xargs -0 rmdir 2>/dev/null") if (! $do_dry_run);
}
if ($res != 0) {
die("Failed to move some meta files.");
}
# Save the state cache.
save_state_cache() if $state_cache_days && !$do_dry_run;
say("All done.");
$lock->release;
print "Errors:\n ".join(" ",@errlog) if (@errlog);
if ($num_errors != 0) {
print "Failed to download files ($num_errors errors)!\n";
exit 1 if (!$ignore_small_errors);
}
exit;
sub print_dl_size {
my $size=shift;
my $unit;
if ($size >= 10*1000*1024) {
$size=int($size/1024/1024);
$unit="MiB";
} elsif ($size >= 10*1000) {
$size=int($size/1024);
$unit="kiB";
} else {
$unit="B";
}
return "$size $unit";
}
sub add_bytes_gotten {
my $size=shift;
$bytes_gotten += $size;
if ($doing_meta) {
$bytes_meta += $size;
}
}
# Return true if a package stanza is permitted by
# --include-field/--exclude-field.
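# Illustrative (hypothetical) filter hashes, as produced by the options
# "--include-field Package=^(foo|bar)$" and "--exclude-field Section=games":
#   %includes_field = (Package => '^(foo|bar)$');
#   %excludes_field = (Section => 'games');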
sub check_field_filters {
my $stanza = shift;
for my $name (keys %includes_field) {
if ($stanza=~/^\Q$name\E:\s+(.*)/im) {
my $value=$1;
return 1 if $value=~/$includes_field{$name}/;
}
}
return 0 if keys %includes_field;
for my $name (keys %excludes_field) {
if ($stanza=~/^\Q$name\E:\s+(.*)/im) {
my $value=$1;
return 0 if $value=~/$excludes_field{$name}/;
}
}
return 1;
}
# Takes named parameters: filename, size.
#
# Optionally can also be passed parameters specifying expected checksums
# for the file, using checksum names as in the Release/Packages/Sources files
# ("SHA1", "MD5Sum", etc).
#
# Size is always checked; verifying the checksum is optional. However, if
# a value of -1 is passed for size, a check of the checksum is forced.
#
# It will return true if the tests show the file matches.
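# An illustrative (hypothetical) call:
#   check_file(filename => 'pool/main/f/foo/foo_1.0-1_amd64.deb',
#              size => 12345, SHA256 => $sha256, MD5Sum => $md5);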
sub check_file {
my %params=@_;
my ($filename, $size)=delete @params{qw{filename size}};
if (! -f $filename) {
say("Missing: $filename") if ($verbose);
return 0;
}
my $disksize = -s _;
if ($size == $disksize || $size == -1) {
if ($verify_checksums || $size == -1) {
# Prefer checking stronger checksums, and failing that, fall back
# to whatever checksums are present and supported, trying to prefer
# FOObignum over FOOsmallnum.
my ($summer, $checksum);
foreach my $checksum_type ("SHA512", "SHA256", "SHA1", reverse sort keys %params) {
next unless defined $params{$checksum_type};
if (lc $checksum_type eq 'md5sum') {
$summer=Digest::MD5->new;
}
elsif ($checksum_type=~/^sha(\d+)$/i) {
# returns undef on unknown/too large SHA type
$summer=Digest::SHA->new($1);
}
if (defined $summer) {
$checksum=$params{$checksum_type};
last;
}
}
if (! defined $summer) {
die "unsupported checksum type(s): ".(join(" ", keys %params))."\n";
}
open HANDLE, $filename or die "$filename: $!";
$summer->addfile(*HANDLE);
return 1 if $checksum eq $summer->hexdigest;
say(sprintf("Mismatch '$filename': sum is %s, expected %s", $summer->hexdigest, $checksum))
if ($verbose);
}
else {
return 1;
}
}
elsif ($verbose) {
say(sprintf("Mismatch '$filename': size is %d, expected %d", $disksize, $size));
}
return 0;
}
# Check uncompressed diff content against sha1sum from Index file.
sub check_diff {
my ($filename, $size, $sha1) = @_;
my $digest = Digest::SHA->new(1);
my $ret = 0;
if (-f "$filename.gz") {
system_redirect_io("gzip -d", "$filename.gz", "$filename");
if ($size == -s $filename) {
open HANDLE, $filename or die "$filename: $!";
$digest->addfile(*HANDLE);
$ret = ($sha1 eq $digest->hexdigest);
}
unlink ($filename);
}
return $ret;
}
# Check file against checksum and size from the Release file.
# It will return true if the checksum matches.
sub check_lists {
my $file = shift;
my $t = $verify_checksums;
my $ret = 1;
$verify_checksums = 1;
if (exists $file_lists{$file}) {
$ret = check_file(filename => $file, %{$file_lists{$file}});
}
$verify_checksums = $t;
return $ret;
}
sub remote_get {
my $file=shift;
my $tdir=shift;
my $res;
return 1 if ($skippackages);
$tdir=$tempdir unless $tdir;
chdir($tdir) or die "unable to chdir($tdir): $!\n";
if (uses_LWP()) {
$res=$ftp ? ftp_get($file) : http_get($file);
$res=$res && check_lists($file);
if (-f $file && !$res) {
say("$file failed checksum verification, removing");
unlink($file) if (-f $file);
}
}
else {
$res=rsync_get($file);
$res=$res && check_lists($file);
if (-f $file && !$res) {
say("$file failed checksum verification");
# FIXME: make sure the size doesn't match so it gets retried
}
}
chdir($mirrordir) or die "unable to chdir($mirrordir): $!\n";
return $res;
}
sub print_percent {
my $message=shift;
my $percent = $bytes_to_get ? (($bytes_gotten / $bytes_to_get)*100) : 0;
printf "[%3.0f%%] %s", $percent, $message;
}
# Get a file via http, or possibly ftp if a proxy is being used with that
# method. First displaying its filename if progress is on.
sub http_get {
my $oldautoflush = $|;
$| = 1;
my $file=shift;
my $url;
if ($user eq 'anonymous'){
$url="$download_method://${host}/${remoteroot}/${file}";
}
else {
$url="$download_method://${user}:${passwd}\@${host}/${remoteroot}/${file}";
}
my $ret=1;
print "$url => " if ($debug);
print_percent "Getting: $file... " if $progress or $verbose;
print "\t #" if $progress;
if (! $do_dry_run) {
unlink($file) if (-f $file);
$ret = $ua->mirror($url, $file);
print $ret->status_line . "\n" if ($debug);
if ($ret->is_error) {
$files{$file} = -1;
warn "failed " . $ret->status_line . "\n" if ($progress or $verbose);
push (@errlog,"Download of $file failed: ".$ret->status_line."\n");
$num_errors++;
} elsif ($progress || $verbose) {
print "ok\n";
}
$ret = not ( $ret->is_error );
} elsif ($progress || $verbose) {
print "ok\n";
}
# Account for actual bytes gotten
my @stat = stat $file;
add_bytes_gotten($stat[7]) if (@stat);
$| = $oldautoflush;
return $ret;
}
# Get a file via ftp, first displaying its filename if progress is on.
sub ftp_get {
my $oldautoflush = $|;
$| = 1;
my $file=shift;
my $mtime;
my @stat = stat $file;
if (@stat) { # already have the file?
my $size = $ftp->size($file);
my $mtime = $ftp->mdtm($file);
if ($mtime && $size
&& $size == $stat[7]
&& $mtime == $stat[9]) { # size and time match
print_percent "Keeping: $file\n" if $progress or $verbose;
add_bytes_gotten($size);
return 1;
}
}
print_percent "Getting: $file" if $progress or $verbose;
print "\t #" if $progress;
my $ret=1;
if (! $do_dry_run) {
unlink($file) if (-f $file);
$ret = $ftp->get($file, $file);
if ($ret) {
my $mtime=$ftp->mdtm($file);
utime($mtime, $mtime, $file) if defined $mtime;
} else {
$files{$file} = -1;
warn " failed:".$ftp->message if ($progress or $verbose);
push (@errlog,"Download of $file failed: ".$ftp->message."\n");
$num_errors++;
}
}
my $size=$ftp->size($file);
add_bytes_gotten($size) if $size;
$| = $oldautoflush;
print "\n" if (($verbose and not $progress) or ($do_dry_run and $progress));
return $ret;
}
sub rsync_get {
my $file=shift;
my $opt=$rsync_options;
(my $dirname) = $file =~ m:(.*/):;
my @dir= split(/\//, $dirname);
for (0..$#dir) {
$opt = "$opt --include=" . join('/', @dir[0..$_]) . "/";
}
$opt .= " --progress" if $progress;
$opt .= " -v" if $debug;
$opt .= " --no-motd" unless $verbose;
system ("rsync --timeout=$timeout $opt $rsyncremote --include=$file --exclude='*' .");
if ($? == 0 && -f $file) {
return 1;
} else {
$files{$file} = -1;
push (@errlog,"Download of $file failed\n");
$num_errors++;
return 0;
}
}
sub rsync_extra {
my ($early, @extras) = @_;
my @includes;
if (! defined $rsyncremote) {
say("Not able to use rsync to update remote trace files ...");
return;
}
# @ignores is updated during $early to prevent removal of files
# if cleanup is done early.
for my $type (@extras) {
if ($early) {
if ($type eq "trace") {
push(@includes, "- /project/trace/$hostname");
push(@includes, "/project/trace/*");
push(@ignores, "^project/trace/");
say("Updating remote trace files (using rsync) ...");
} elsif ($type eq "doc") {
push(@ignores, "^doc/");
push(@ignores, "^README*");
} elsif ($type eq "tools") {
push(@ignores, "^tools/");
} elsif ($type eq "indices") {
push(@ignores, "^indices/");
}
} else {
if ($type eq "doc") {
push(@includes, "/doc/***");
push(@includes, "/README*");
} elsif ($type eq "tools") {
push(@includes, "/tools/***");
} elsif ($type eq "indices") {
push(@includes, "/indices/***");
}
}
}
return if (! @includes);
if (! $early) {
@extras = grep(!/^trace$/, @extras); # drop 'trace' from list
say("Updating extra files (using rsync): @extras.");
}
rsync_extra_get(@includes);
}
sub rsync_extra_get {
my @includes = @_;
my $fh;
my @result;
my $opt=$rsync_options;
$opt .= " --progress" if $progress;
$opt .= " -v" if $verbose or $debug;
$opt .= " -n" if $do_dry_run;
$opt .= " --no-motd" unless $verbose;
($fh, $rsynctempfile) = tempfile();
foreach my $line (@includes) {
if ($line !~ /^- /) {
my $dirname;
my @dir;
($dirname) = ($line =~ m:(.*/):);
@dir= split(/\//, $dirname);
for (1..$#dir) {
push (@result, "" . join('/', @dir[0..$_]) . "/");
}
}
push (@result, "$line");
}
for (@result) {
print $fh "$_\n";
}
my $ret=system("rsync --timeout=$timeout $opt $rsyncremote --delete --include-from=$rsynctempfile --exclude='*' $mirrordir");
if ($ret != 0) {
print STDERR "Warning: failed to use rsync to download extra files.\n";
}
close $fh;
unlink $rsynctempfile;
}
# run system() with stdin and stdout redirected to files
# unlinks stdout target file first to break hard links
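# Illustrative call (this pattern is used for index handling elsewhere
# in this script):
#   system_redirect_io("gzip -d", "$tempdir/$subdir/Packages.gz",
#                      "$tempdir/$subdir/Packages");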
sub system_redirect_io {
my ($command, $fromfile, $tofile) = @_;
if (-f $tofile) {
unlink($tofile) or die "unlink($tofile) failed: $!";
}
my $cmd="$command <$fromfile >$tofile";
system("$cmd");
die "Failed: $cmd\n" if ($? != 0);
}
sub split_dist {
my $dist = shift;
my ($dist_raw) = ($dist =~ m:^([^/]+)/?:);
$dist =~ m:^[^/]+(/.*)?$:;
my $dist_sdir = $1 // "";
return ($dist_raw, $dist_sdir);
}
sub get_next_snapshot {
my ($dist) = @_;
my $latest = readlink("$mirrordir/dists/$dist/latest");
if (defined $latest) {
$latest++;
} else {
$latest = 0;
}
return $latest;
}
sub make_next_snapshot {
my ($mirrordir, $dist, $codename, $dist_sdir, $tempdir) = @_;
my $next = get_next_snapshot($dist);
make_dir("$mirrordir/dists/$dist/$next");
unlink("$mirrordir/dists/$dist/$next/Release");
link("$tempdir/dists/$codename$dist_sdir/Release",
"$mirrordir/dists/$dist/$next/Release")
or die "Error while linking $tempdir/dists/$codename$dist_sdir/Release: $!\n";
return $next;
}
sub update_latest_links {
my ($mirrordir, $tempdir, @dists) = @_;
foreach my $dist (@dists) {
system("diff","-q","$mirrordir/dists/$dist/latest/Release",
"$tempdir/dists/$dist/Release");
if ($?) {
my $next = get_next_snapshot($dist);
say("Updating $mirrordir/dists/$dist/latest to $next");
unlink("$mirrordir/dists/$dist/latest");
symlink($next,"$mirrordir/dists/$dist/latest")
or die "Error while symlinking $mirrordir/dists/$dist/latest to $next: $\n";
} else {
say("Not updating $mirrordir/dists/$dist/latest");
}
}
}
sub link_release_into_snapshot {
my ($mirrordir,$dist,$next,$tempdir,$codename,$dist_sdir,$base) = @_;
unlink("$mirrordir/dists/$dist/$next/$base");
link("$tempdir/dists/$codename$dist_sdir/$base",
"$mirrordir/dists/$dist/$next/$base")
or die "Error while linking $tempdir/dists/$codename$dist_sdir/$base: $!\n";
}
sub link_contents_into_snapshot {
my ($dist,$mirrordir,$arch,$tempdir) = @_;
my $next = get_next_snapshot($dist);
push my @sects, @sections, "";
foreach my $sect (@sects) {
if ($sect ne "") {$sect = "/$sect";}
if (exists $file_lists{"$tempdir/dists/$dist$sect/Contents-$arch.gz"}) {
unlink("$mirrordir/dists/$dist/$next$sect/Contents-$arch.gz");
link("$tempdir/dists/$dist$sect/Contents-$arch.gz",
"$mirrordir/dists/$dist/$next$sect/Contents-$arch.gz")
or die "Error while linking $tempdir/dists/$dist$sect/Contents-$arch.gz: $!\n";
}
}
}
sub link_auxfile_into_snapshot {
my ($file,$dist,$distpath,$filename,$mirrordir,$tempdir) = @_;
my $next = get_next_snapshot($dist);
my $target_path = "$mirrordir/dists/$dist/$next/$distpath";
say("linking $file");
unlink("$target_path/$filename");
make_path($target_path);
link("$tempdir/$file", "$target_path/$filename")
or die "Error while linking $tempdir/$file: $!";
}
sub gpg_verify {
my (@files) = @_;
# Check for gpg
if (system("gpgv --version >/dev/null 2>/dev/null")) {
say("gpgv failed: gpgv binary missing?");
push (@errlog,"gpgv failed: gpgv binary missing?\n");
$num_errors++;
} else {
# Verify Release signature
my $gpgv_res = 0;
my $outp = IO::Pipe->new;
my $errp = IO::Pipe->new;
my $gpgvout = "";
my $gpgverr = "";
if (my $child = fork) {
$outp->reader;
$errp->reader;
my $sel = IO::Select->new;
$sel->add($outp, $errp);
while (my @ready = $sel->can_read) {
for (@ready) {
my $buf = "";
my $bytesread = $_->read($buf, 1024);
if (!defined($bytesread)) {
die "read error: $!\n";
} elsif ($bytesread == 0) {
$sel->remove($_);
$_->close;
} else {
if ($_ == $outp) {
$gpgvout .= $buf;
}
if ($_ == $errp) {
$gpgverr .= $buf;
}
}
}
}
waitpid($child, 0) == -1
and die "was pid $child automatically reaped?\n";
$gpgv_res = not $?;
}
else {
$outp->writer;
$errp->writer;
STDOUT->fdopen(fileno($outp), "w") or die;
STDERR->fdopen(fileno($errp), "w") or die;
my @gpgv = qw(gpgv --status-fd 1);
push @gpgv, (map { ('--keyring' => $_) } @keyrings);
push @gpgv, @files;
exec(@gpgv) or die "exec: $gpgv[0]: $!\n";
}
# In debug or verbose mode, display the gpg error message on stdout.
if (! $gpgv_res || $debug) {
print $gpgvout;
print $gpgverr;
}
if ($verbose && ! $debug) {
print $gpgverr;
}
if (! $gpgv_res) {
say("$files[0] signature does not verify.");
push (@errlog,"$files[0] signature does not verify\n");
$num_errors++;
}
}
}
sub get_release {
my ($tdir, $dist) = @_;
make_dir ("$tdir");
return 0 unless remote_get("dists/$dist/Release", "$tempdir/.tmp");
# Save current error state so we can roll back if $ignore_release_gpg
# is set; needed because remote_get() can register errors
my @t_errlog = @errlog;
my $t_errors = $num_errors;
remote_get("dists/$dist/InRelease", "$tempdir/.tmp");
my @inr_t_errlog = @errlog;
my $inr_t_errors = $num_errors;
remote_get("dists/$dist/Release.gpg", "$tempdir/.tmp");
# We only need one of InRelease and Release.gpg.
if ($num_errors == $t_errors || $num_errors == $inr_t_errors) {
@errlog = @t_errlog;
$num_errors = $t_errors;
}
if (! $check_gpg) {
# Nothing to do.
}
else {
my $got_gpg = 0;
if (-f "$tdir/Release" && -f "$tdir/Release.gpg") {
$got_gpg = 1;
gpg_verify("$tdir/Release.gpg", "$tdir/Release");
}
if (-f "$tdir/InRelease") {
$got_gpg = 1;
gpg_verify("$tdir/InRelease");
}
if (! $got_gpg) {
say("Release gpg signature does not verify, file missing.");
push (@errlog,"Release gpg signature does not verify\n");
$num_errors++;
}
}
if ($ignore_release_gpg) {
@errlog = @t_errlog;
$num_errors = $t_errors;
}
return 1;
}
sub name_release {
my ($type, $tdir, $dist) = @_;
my ($buf, $codename, $suite);
my $origin = "unknown";
if (-f "$tdir/Release") {
if (open RELEASE, "<$tdir/Release") {
while (<RELEASE>) {
last if /^MD5Sum:/;
$buf = $buf . $_;
}
close RELEASE;
}
$_ = $buf;
($origin) = m/^Origin:\s+(.*)/im if (/^Origin:/im);
($codename) = m/^Codename:\s+(.*)/im;
($suite) = m/^Suite:\s+(.*)/im;
} elsif ($ignore_missing_release) {
$origin = "none";
}
# Allow for example "<codename|suite>/updates"; split into the
# raw dist (codename or suite) and the subdirectory.
my ($dist_raw, $dist_sdir) = split_dist($dist);
if ($origin eq "none") {
$codename = $dist_raw;
} elsif ($origin eq "Ubuntu" or $origin eq "Canonical") {
if ($suite) {
say("Ubuntu Release file: using Suite ($suite).");
$codename = $suite;
} else {
say("Invalid Ubuntu Release file.");
push (@errlog,"Invalid Ubuntu Release file.\n");
$num_errors++;
next;
}
} elsif ($codename) {
if ($dist_raw ne $codename && $dist_raw ne $suite) {
say("Broken Release file: neither Codename nor Suite matches $dist.");
push (@errlog,"Broken Release file: neither Codename nor Suite matches $dist\n");
$num_errors++;
next;
}
} elsif ($suite) {
say("Release file does not contain Codename; using Suite ($suite).");
$codename = $suite;
} else {
say("Release file contains neither Codename nor Suite; using $dist.");
$codename = $dist_raw;
}
# For experimental the suite is the same as the codename
$suite = "" if (! $suite || $suite eq $codename);
die("Duplicate dist $codename$dist_sdir.\n")
if exists $distset{"$codename$dist_sdir"}{$type};
$distset{"$codename$dist_sdir"}{$type} = 1;
die("Conflicting suites '$suite' and '$distset{$codename}{suite}' for $codename.\n")
if (exists $distset{"$codename"}{suite} && ($suite ne $distset{$codename}{suite}));
$distset{$codename}{suite} = "$suite" if ($suite);
# This should be a one-time conversion only
if ($suite) {
if (-d "$tempdir/dists/$suite" && !-l "$tempdir/dists/$suite") {
rename_distdir("$tempdir/dists", $codename, $suite);
}
if (-d "dists/$suite" && !-l "dists/$suite") {
rename_distdir("dists", $codename, $suite);
}
}
return ($codename, $suite, $dist_sdir);
}
# Get Index file in the passed subdirectory.
sub get_index {
my $subdir=shift;
my $file=shift;
make_dir($subdir);
make_dir("$tempdir/$subdir");
if ($diff_mode ne "none" && exists $file_lists{"$tempdir/$subdir/$file.diff/Index"}) {
if (!check_lists("$tempdir/$subdir/$file.diff/Index")) {
make_dir("$tempdir/$subdir/$file.diff");
if (!remote_get("$subdir/$file.diff/Index")) {
push (@errlog,"$subdir/$file.diff/Index failed checksum verification, removing\n");
} else {
fetch_and_apply_diffs(0, $subdir, $file);
if (check_lists("$tempdir/$subdir/$file")) {
if (! $slow_cpu) {
system_redirect_io("gzip $gzip_options", "$tempdir/$subdir/$file", "$tempdir/$subdir/$file.gz");
system_redirect_io("xz", "$tempdir/$subdir/$file", "$tempdir/$subdir/$file.xz");
}
}
}
} else {
$bytes_gotten += $file_lists{"$tempdir/$subdir/$file.diff/Index"}{size};
fetch_and_apply_diffs(0, $subdir, "$file");
if (check_lists("$tempdir/$subdir/$file")) {
if (! $slow_cpu) {
system_redirect_io("gzip $gzip_options", "$tempdir/$subdir/$file", "$tempdir/$subdir/$file.gz");
system_redirect_io("xz", "$tempdir/$subdir/$file", "$tempdir/$subdir/$file.xz");
}
}
}
$files{"$subdir/$file.diff/Index"}=1 if ($diff_mode eq "mirror");
$files{"$tempdir/$subdir/$file.diff/Index"}=1;
}
my $got_any_file=0;
if (exists $file_lists{"$tempdir/$subdir/$file.xz"}{size}) {
my $got_xz=0;
if (!check_lists("$tempdir/$subdir/$file.xz")) {
if (remote_get("$subdir/$file.xz")) {
$got_xz=1;
} else {
push (@errlog,"$subdir/$file.xz failed checksum verification\n");
$num_errors++;
}
} else {
$bytes_gotten += $file_lists{"$tempdir/$subdir/$file.xz"}{size};
$got_xz=1;
}
if ($got_xz) {
if (!check_lists("$tempdir/$subdir/$file")) {
system_redirect_io("xz -d", "$tempdir/$subdir/$file.xz", "$tempdir/$subdir/$file");
}
if (!check_lists("$tempdir/$subdir/$file.gz") && ! $slow_cpu) {
system_redirect_io("gzip $gzip_options", "$tempdir/$subdir/$file", "$tempdir/$subdir/$file.gz");
}
$files{"$subdir/$file.xz"}=1;
$files{"$tempdir/$subdir/$file.xz"}=1;
$got_any_file=1;
}
}
if (exists $file_lists{"$tempdir/$subdir/$file.gz"}{size}) {
my $got_gz=0;
if (!check_lists("$tempdir/$subdir/$file.gz")) {
if (remote_get("$subdir/$file.gz")) {
$got_gz=1;
} else {
push (@errlog,"$subdir/$file.gz failed checksum verification\n");
$num_errors++;
}
} else {
$bytes_gotten += $file_lists{"$tempdir/$subdir/$file.gz"}{size};
$got_gz=1;
}
if ($got_gz) {
if (!check_lists("$tempdir/$subdir/$file")) {
system_redirect_io("gzip -d", "$tempdir/$subdir/$file.gz", "$tempdir/$subdir/$file");
}
if (!check_lists("$tempdir/$subdir/$file.xz") && ! $slow_cpu) {
system_redirect_io("xz", "$tempdir/$subdir/$file", "$tempdir/$subdir/$file.xz");
}
$files{"$subdir/$file.gz"}=1;
$files{"$tempdir/$subdir/$file.gz"}=1;
$got_any_file=1;
}
}
if (exists $file_lists{"$tempdir/$subdir/$file"}) {
if (!check_lists("$tempdir/$subdir/$file")) {
if (remote_get("$subdir/$file")) {
$got_any_file=1;
} else {
push (@errlog,"$subdir/$file failed checksum verification\n");
$num_errors++;
}
} else {
$bytes_gotten += $file_lists{"$tempdir/$subdir/$file"}{size};
$got_any_file=1;
}
}
if ($got_any_file) {
} elsif ($ignore_missing_release) {
say("Ignoring missing Release file for $subdir/$file.gz");
push (@errlog,"Ignoring missing Release file for $subdir/$file.gz\n");
if (remote_get("$subdir/$file.gz")) {
system_redirect_io("gzip -d", "$tempdir/$subdir/$file.gz", "$tempdir/$subdir/$file");
}
} else {
if (-f "$subdir/$file.gz") {
say("$subdir/$file.gz exists locally but not in Release");
die "Won't mirror without $subdir/$file.gz signature in Release";
} else {
say("$subdir/$file.gz does not exist locally or in Release, skipping.") if ($debug);
}
}
if (exists $file_lists{"$tempdir/$subdir/Release"}) {
if (!check_lists("$tempdir/$subdir/Release")) {
if (!remote_get("$subdir/Release")) {
push (@errlog,"$subdir/Release failed checksum verification, removing\n");
}
} else {
$bytes_gotten += $file_lists{"$tempdir/$subdir/Release"}{size};
}
}
if ($file eq "Packages") {
push @package_files, "$tempdir/$subdir/$file";
} elsif ($file eq "Sources") {
push @source_files, "$tempdir/$subdir/$file";
} else {
die "get_index called with unknown type $file\n";
}
# Uncompressed files are no longer kept on the mirrors
$files{"$subdir/$file"}=1 unless exists $file_lists{"$tempdir/$subdir/$file.xz"} or exists $file_lists{"$tempdir/$subdir/$file.gz"};
$files{"$subdir/Release"}=1;
$files{"$tempdir/$subdir/$file"}=1;
$files{"$tempdir/$subdir/Release"}=1;
}
sub update_contents {
my ($subdir, $file) = @_;
my $file_ok = check_lists("$tempdir/$subdir/$file.gz");
# Get the Index file for the diffs
if (exists $file_lists{"$tempdir/$subdir/$file.diff/Index"}) {
if (!check_lists("$tempdir/$subdir/$file.diff/Index")) {
make_dir("$tempdir/$subdir/$file.diff");
if (!remote_get("$subdir/$file.diff/Index")) {
push (@errlog,"$subdir/$file.diff/Index failed checksum verification, removing\n");
return $file_ok;
}
#FIXME: before download
if (-f "$tempdir/$subdir/$file.diff/Index") {
$bytes_to_get += -s "$tempdir/$subdir/$file.diff/Index";
}
}
$files{"$subdir/$file.diff/Index"}=1 if ($diff_mode eq "mirror");
$files{"$tempdir/$subdir/$file.diff/Index"}=1;
} else {
return $file_ok;
}
if (! -f "$tempdir/$subdir/$file.gz" || $file_ok) {
# fetch diffs only
fetch_and_apply_diffs(1, $subdir, $file);
return $file_ok;
}
# Uncompress the Contents file
system_redirect_io("gzip -d", "$tempdir/$subdir/$file.gz", "$tempdir/$subdir/$file");
# Update it
fetch_and_apply_diffs(0, $subdir, $file);
# And compress it again
if (-f "$tempdir/$subdir/$file") {
system_redirect_io("gzip $gzip_options", "$tempdir/$subdir/$file", "$tempdir/$subdir/$file.gz");
unlink "$tempdir/$subdir/$file";
}
return check_lists("$tempdir/$subdir/$file.gz");
}
sub get_contents_files {
my $help = sub {
my($first, $operational_parameters, $dist, $arch, $sect) = @_;
if (!check_lists("$tempdir/dists/$dist$sect/Contents-$arch.gz")) {
if ($$first) {
say("Get Contents files.");
$$first = 0;
}
remote_get("dists/$dist$sect/Contents-$arch.gz");
}
$files{"dists/$dist$sect/Contents-$arch.gz"}=1;
$files{$tempdir."/"."dists/$dist$sect/Contents-$arch.gz"}=1;
if ($debmarshal) {
link_contents_into_snapshot($dist,$mirrordir,$arch,$tempdir);
}
};
my $first = 1;
do_contents_for_each_dist_arch_sect(
$help, [\$first], {}
);
}
# hardlink index files from tempdir to next debmarshal snapshot location
sub link_index {
my ($dist,$section,$arch) = @_;
my ($file,$archdir);
if ($arch eq "source") {
$file = "Sources";
$archdir = "source";
} else {
$file = "Packages";
$archdir = "binary-$arch";
}
my $next = get_next_snapshot($dist);
make_dir("$mirrordir/dists/$dist/$next/$section/$archdir");
unlink("$mirrordir/dists/$dist/$next/$section/$archdir/$file");
link("$tempdir/dists/$dist/$section/$archdir/$file",
"$mirrordir/dists/$dist/$next/$section/$archdir/$file")
or warn "Error while linking $tempdir/dists/$dist/$section/$archdir/$file: $!\n";
unlink("$mirrordir/dists/$dist/$next/$section/$archdir/$file.gz");
link("$tempdir/dists/$dist/$section/$archdir/$file.gz",
"$mirrordir/dists/$dist/$next/$section/$archdir/$file.gz")
or die "Error while linking $tempdir/dists/$dist/$section/$archdir/$file.gz: $!\n";
unlink("$mirrordir/dists/$dist/$next/$section/$archdir/$file.xz");
link("$tempdir/dists/$dist/$section/$archdir/$file.xz",
"$mirrordir/dists/$dist/$next/$section/$archdir/$file.xz")
or die "Error while linking $tempdir/dists/$dist/$section/$archdir/$file.xz: $!\n";
}
sub i18n_from_release {
my ($dist,$distpath) = @_;
my $subdir = "dists/$dist/$distpath";
my $compdir = $tempdir."/".$subdir;
my ($sha1, $size, $filename);
my $exclude = "(".join("|", @excludes).")" if @excludes;
my $include = "(".join("|", @includes).")" if @includes;
# Create i18n directories
make_dir($subdir);
make_dir($compdir);
# Search for translation files in file_lists
foreach my $path (keys %file_lists) {
next if length($compdir)+1>length($path); # the +1 stands for the slash after $compdir
next if substr($path, 0, length($compdir)) ne $compdir;
my $filename = substr($path, length($compdir)+1, length($path)-length($compdir)-1);
next if $filename !~ /\.(?:gz|bz2|xz)$/;
if(!(defined($include) && ($subdir."/".$filename)=~/$include/o)) {
next if (defined($exclude) && ($subdir."/".$filename)=~/$exclude/o);
}
next if ! $i18n && $filename !~ /-en/;
$files{"$subdir/$filename"}=1;
$files{$tempdir."/"."$subdir/$filename"}=1;
if (! check_lists($path)) {
$bytes_to_get += $file_lists{$path}{size};
$i18n_get{"$subdir/$filename"}{dist} = $dist;
$i18n_get{"$subdir/$filename"}{distpath} = $distpath;
$i18n_get{"$subdir/$filename"}{filename} = $filename;
}
}
}
sub get_i18n_files {
say("Get Translation files ...");
foreach my $file (sort keys %i18n_get) {
if (! check_lists("$tempdir/$file")) {
remote_get("$file");
if ($debmarshal) {
link_auxfile_into_snapshot($file,
$i18n_get{$file}{dist},
$i18n_get{$file}{distpath},
$i18n_get{$file}{filename},
$mirrordir,
$tempdir);
}
}
}
}
sub dep11_from_release {
my ($dist,$distpath) = @_;
my $subdir = "dists/$dist/$distpath";
my $compdir = $tempdir."/".$subdir;
my ($size, $filename);
my $exclude = "(".join("|", @excludes).")" if @excludes;
my $include = "(".join("|", @includes).")" if @includes;
# Create dep11 directories
make_dir($subdir);
make_dir($compdir);
# Search for DEP-11 files in file_lists
foreach my $path (keys %file_lists) {
next if length($compdir)+1>length($path); # the +1 stands for the slash after $compdir
next if substr($path, 0, length($compdir)) ne $compdir;
my $filename = substr($path, length($compdir)+1, length($path)-length($compdir)-1);
next if $filename !~ /\.(?:gz|bz2|xz)$/;
my $all_arches = "(".join("|", map(quotemeta, @arches)).")";
next if $filename =~ /^Components-/ and $filename !~ /^Components-$all_arches\./;
my $size = $file_lists{$path}{size};
if(!(defined($include) && ($subdir."/".$filename)=~/$include/o)) {
next if (defined($exclude) && ($subdir."/".$filename)=~/$exclude/o);
}
$files{"$subdir/$filename"}=1;
$files{$tempdir."/"."$subdir/$filename"}=1;
if (!check_lists("$tempdir/$subdir/$filename")) {
$bytes_to_get += $size;
$dep11_get{"$subdir/$filename"}{dist} = $dist;
$dep11_get{"$subdir/$filename"}{distpath} = $distpath;
$dep11_get{"$subdir/$filename"}{filename} = $filename;
}
}
}
sub get_dep11_files {
say("Get DEP-11 metadata files ...");
foreach my $file (sort keys %dep11_get) {
if (!check_lists("$tempdir/$file")) {
remote_get($file);
if ($debmarshal) {
link_auxfile_into_snapshot($file,
$dep11_get{$file}{dist},
$dep11_get{$file}{distpath},
$dep11_get{$file}{filename},
$mirrordir,
$tempdir);
}
}
}
}
sub fetch_and_apply_diffs {
my ($fetch_only, $subdir, $type) = @_;
local (*INDEX, *FILE);
my (%history_sha1, %history_size, %diff_sha1, %diff_size);
my ($current_sha1, $current_size, $sha1, $size, $file, $digest, $ret);
my $t = $num_errors;
# Parse DiffIndex file
open(INDEX, "$tempdir/$subdir/$type.diff/Index") or die "$tempdir/$subdir/$type.diff/Index: $!";
$_ = <INDEX>;
while (defined($_)) {
if (m/^SHA1-Current:/m) {
($current_sha1, $current_size) = m/^SHA1-Current:\s+([A-Za-z0-9]+)\s+(\d+)/m;
$_ = <INDEX>;
}
elsif (m/^SHA1-History:/m) {
while (defined($_ = <INDEX>)) {
last if (!m/^\s/m);
($sha1, $size, $file) = m/^\s+([A-Za-z0-9]+)\s+(\d+)\s+(.*)/m;
$history_sha1{$file} = $sha1;
$history_size{$file} = $size;
}
}
elsif (m/^SHA1-Patches:/m) {
while (defined($_ = <INDEX>)) {
last if (!m/^\s/m);
($sha1, $size, $file) = m/^\s+([A-Za-z0-9]+)\s+(\d+)\s+(.*)/m;
$diff_sha1{$file} = $sha1;
$diff_size{$file} = $size;
}
}
else {
$_ = <INDEX>;
}
}
close(INDEX);
# Download diff files as necessary
$ret = 1;
foreach $file (sort keys %diff_sha1) {
if (!check_diff("$tempdir/$subdir/$type.diff/$file", $diff_size{$file}, $diff_sha1{$file})) {
remote_get("$subdir/$type.diff/$file.gz");
#FIXME: before download
if (-f "$tempdir/$subdir/$type.diff/$file.gz") {
$bytes_to_get += -s "$tempdir/$subdir/$type.diff/$file.gz";
}
if (!check_diff("$tempdir/$subdir/$type.diff/$file", $diff_size{$file}, $diff_sha1{$file})) {
say("$subdir/$type.diff/$file.gz failed sha1sum check, removing");
push (@errlog,"$subdir/$type.diff/$file.gz failed sha1sum check, removing\n");
unlink "$tempdir/$subdir/$type.diff/$file.gz";
$ret = 0;
}
}
$files{"$subdir/$type.diff/$file.gz"}=1 if ($diff_mode eq "mirror");
$files{"$tempdir/$subdir/$type.diff/$file.gz"}=1;
}
$num_errors = $t if ($ignore_small_errors);
return if ($fetch_only || ! $ret);
# Apply diff files
open(FILE, "$tempdir/$subdir/$type") or return;
$digest = Digest::SHA->new(1);
$digest->addfile(*FILE);
$sha1 = $digest->hexdigest;
$size = -s "$tempdir/$subdir/$type";
foreach $file (sort keys %history_sha1) {
next unless ($sha1 eq $history_sha1{$file} && $size eq $history_size{$file});
if (system("gzip -d < \"$tempdir/$subdir/$type.diff/$file.gz\" | patch --ed \"$tempdir/$subdir/$type\"")) {
say("Patch $file failed, will fetch $subdir/$type file");
unlink "$tempdir/$subdir/$type";
return;
}
open(FILE, "$tempdir/$subdir/$type") or return;
$digest = Digest::SHA->new(1);
$digest->addfile(*FILE);
$sha1 = $digest->hexdigest;
$size = -s "$tempdir/$subdir/$type";
say("$subdir/$type patched with $subdir/$type.diff/$file.gz");
}
if (!($sha1 eq $current_sha1 && $size eq $current_size)) {
say("$subdir/$type failed sha1sum check, removing");
push (@errlog,"$subdir/$type failed sha1sum check, removing\n");
unlink "$tempdir/$subdir/$type";
}
}
# Make a directory including all needed parents.
{
my %seen;
sub make_dir {
my $dir=shift;
my @parts=split('/', $dir);
my $current='';
foreach my $part (@parts) {
$current.="$part/";
if (! $seen{$current}) {
if (! -d $current) {
mkdir($current) or die "mkdir failed: $!";
debug("Created directory: $current");
}
$seen{$current}=1;
}
}
}
}
# Mirror cleanup for unknown files that cannot be found in Packages files.
# This subroutine is called on pre- and post-cleanup and takes no arguments.
# It uses some global variables like %files, $mirrordir, @ignores.
sub cleanup_unknown_files {
print("Cleanup mirror") if ($verbose or $progress);
if ($use_cache) {
say(": using cache.");
foreach my $file (sort keys %files) {
next if (@di_dists && $file =~ m:installer-\w(-|\w)*/current/images/:);
if ($files{$file} == 2 && -f $file) {
say("deleting $file") if ($verbose);
if (! $do_dry_run) {
unlink $file or die "unlink $file: $!";
}
}
}
} else {
say($state_cache_days ? ": full." : ".");
chdir($mirrordir) or die "chdir $mirrordir: $!";
my $ignore;
$ignore = "(".join("|", @ignores).")" if @ignores;
# Remove all files in the mirror that we don't know about
foreach my $file (`find . -type f 2>/dev/null`) {
chomp $file;
$file=~s:^\./::;
next if (@di_dists && $file =~ m:installer-\w(-|\w)*/current/images/:);
unless ((exists $files{$file} && $files{$file} != 2) or
(defined($ignore) && $file=~/$ignore/o)) {
say("deleting $file") if ($verbose);
if (! $do_dry_run) {
unlink $file or die "unlink $file: $!";
}
}
}
}
# Clean up obsolete files of D-I images
di_cleanup() if @di_dists;
}
# Figure out whether debian-installer should be skipped for a given dist.
my %skip_installer=("woody" => 1, "experimental" => 1);
foreach my $skipped_dist (@skip_installer) {
$skip_installer{$skipped_dist} = 1;
}
sub di_skip_dist {
my $dist=shift;
if ( defined($skip_installer{$dist}) ) {
return 1;
}
return 0;
}
sub di_check_dists {
DI_DIST:
for my $di_dist (@di_dists) {
next if di_skip_dist($di_dist);
if (exists $distset{$di_dist}) {
# Valid dist and also mirroring the archive itself
$distset{$di_dist}{"d-i"} = 1;
} else {
foreach my $dist (keys %distset) {
my ($dist_raw, $dist_sdir) = split_dist($dist);
if ($di_dist eq $distset{$dist_raw}{suite}) {
# Suite specified, use codename instead
$distset{"$dist_raw$dist_sdir"}{"d-i"} = 1;
next DI_DIST;
}
}
# Only mirroring D-I images, not the archive itself
my $tdir="$tempdir/.tmp/dists/$di_dist";
next unless (get_release($tdir, $di_dist) || $ignore_missing_release);
name_release("d-i", $tdir, $di_dist);
unlink "$tdir/Release";
unlink "$tdir/Release.gpg";
unlink "$tdir/InRelease";
}
}
}
sub di_add_files {
my $tdir = "$tempdir/d-i";
my $exclude = "(".join("|", @excludes).")" if @excludes;
my $include = "(".join("|", @includes).")" if @includes;
foreach my $dist (keys %distset) {
next unless exists $distset{$dist}{"d-i"};
foreach my $arch (@di_arches) {
next if $arch eq "all";
my $image_dir = "dists/$dist/main/installer-$arch/current/images";
make_dir ("$tdir/$image_dir");
if (!remote_get("$image_dir/MD5SUMS", $tdir)) {
say("Failed to download $image_dir/MD5SUMS; skipping.");
return;
}
if (-f "$tdir/$image_dir/MD5SUMS") {
$bytes_to_get += -s _; # As we did not have the size earlier
}
local $/;
undef $/; # Read whole file
open(FILE, "<", "$tdir/$image_dir/MD5SUMS") or die "$tdir/$image_dir/MD5SUMS: $!";
$_ = <FILE>;
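      # Each MD5SUMS line has the form "<32-char md5>  ./<relative path>"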
while (m/^([A-Za-z0-9]{32} .*)/mg) {
my ($md5sum, $filename) = split(' ', $1, 3);
$filename =~ s:^\./::;
if(!(defined($include) && ($image_dir."/".$filename)=~/$include/o)) {
next if (defined($exclude) && ($image_dir."/".$filename)=~/$exclude/o);
}
$di_files{$image_dir}{$filename}{md5sum} = $md5sum;
# Check against the version currently on the mirror
if (check_file(filename => "$image_dir/$filename", size => -1, MD5Sum => $md5sum)) {
$di_files{$image_dir}{$filename}{status} = 1;
} else {
$di_files{$image_dir}{$filename}{status} = 0;
}
}
close(FILE);
}
}
}
# ToDo: for rsync maybe it would make sense to sync the images directly
# into place, the whole $image_dir at a time.
sub di_get_files {
say("Getting Debian Installer images.");
my $tdir = "$tempdir/d-i";
foreach my $image_dir (sort keys %di_files) {
my $lres = 1;
foreach my $file (sort keys %{ $di_files{$image_dir} }) {
next unless $di_files{$image_dir}{$file}{status} == 0;
# Fetch images into a temporary location
$file =~ m:(^.*)/:;
make_dir ("$tdir/$image_dir/$1") if $1;
if (!remote_get("$image_dir/$file", $tdir) ||
!check_file(filename => "$tdir/$image_dir/$file", size => -1, MD5Sum => $di_files{$image_dir}{$file}{md5sum})) {
$lres = 0;
last if (! $do_dry_run);
}
if (-f "$tdir/$image_dir/$file") {
$bytes_to_get += -s _; # As we did not have the size in add_di_files()
}
}
# Move images in place on mirror
if ($lres && ! $do_dry_run) {
foreach my $file (sort keys %{ $di_files{$image_dir} }) {
next unless $di_files{$image_dir}{$file}{status} == 0;
$file =~ m:(^.*)/:;
make_dir ("$image_dir/$1") if $1;
unlink "$image_dir/$file" if (-f "$image_dir/$file");
link("$tdir/$image_dir/$file", "$image_dir/$file");
}
# Move the MD5SUMS file in place on mirror
unlink "$image_dir/MD5SUMS" if (-f "$image_dir/MD5SUMS");
link("$tdir/$image_dir/MD5SUMS", "$image_dir/MD5SUMS");
} elsif (! $do_dry_run) {
say("Failed to download some files in $image_dir; not updating images.");
}
}
}
sub di_cleanup {
# Clean up obsolete files
foreach my $image_dir (`find dists/ -type d -name images 2>/dev/null`) {
next unless $image_dir =~ m:/installer-\w(-|\w)*/current/images$:;
chomp $image_dir;
chdir("$image_dir") or die "unable to chdir($image_dir): $!\n";
foreach my $file (`find . -type f 2>/dev/null`) {
chomp $file;
$file=~s:^\./::;
if (! exists $di_files{$image_dir} || ! exists $di_files{$image_dir}{$file}) {
next if (exists $di_files{$image_dir} && $file eq "MD5SUMS");
say("deleting $image_dir/$file") if ($verbose);
if (! $do_dry_run) {
unlink "$file" or die "unlink $image_dir/$file: $!\n";
}
}
}
chdir("$mirrordir") or die "unable to chdir($tempdir): $!\n";
}
# Clean up temporary D-I files (silently)
if (-d "$tempdir/d-i") {
chdir("$tempdir/d-i") or die "unable to chdir($tempdir/d-i): $!\n";
foreach my $file (`find . -type f 2>/dev/null`) {
chomp $file;
$file=~s:^\./::;
unlink "$file" or die "unlink $tempdir/d-i/$file: $!\n";
}
chdir("$mirrordir") or die "unable to chdir($mirrordir): $!\n";
}
}
sub download_finished {
if ($ftp) { $ftp->quit; }
my $total_time = time - $start_time;
if (downloads_via_rsync() || $bytes_gotten == 0) {
say("Download completed in ".$total_time."s.");
} else {
my $avg_speed = 0;
$avg_speed = sprintf("%3.0f",($bytes_gotten / $total_time)) unless ($total_time == 0);
say("Downloaded ".print_dl_size($bytes_gotten)." in ".$total_time."s at ".(int($avg_speed/1024*100)/100)." kiB/s.");
}
}
sub rename_distdir {
my ($dir, $codename, $suite) = @_;
say("The directory for a dist should be its codename, not a suite.");
if (!$allow_dist_rename) {
die("Use --allow-dist-rename to have debmirror do the conversion automatically.\n");
}
say("Starting conversion - renaming '$dir/$suite' to '$dir/$codename':");
if (-l "$dir/$codename") {
say(" removing symlink '$dir/$codename'; a new symlink for the suite will be created later");
unlink "$dir/$codename";
}
if (-d "$dir/$codename") {
die("Directory '$dir/$codename' already exists; aborting conversion.\n");
}
rename("$dir/$suite", "$dir/$codename");
say(" conversion completed successfully");
}
sub save_state_cache {
my $cache_file = "$tempdir/debmirror_state.cache";
say("Saving debmirror state cache.");
foreach my $file (keys %files) {
if ($files{$file} == 2) {
delete $files{$file};
} elsif ($files{$file} >= 0){
$files{$file} = 2;
}
}
# Add state cache meta data
my $now = time();
$files{cache_version} = $files_cache_version;
if (! $state_cache_exptime) {
$state_cache_exptime = $now + $state_cache_days * 24 * 60 * 60;
}
$files{cache_expiration_time} = $state_cache_exptime;
if (! nstore(\%files, $cache_file)) {
say("Failed to save state cache.");
unlink $cache_file if -f $cache_file;
} else {
my $expires = int(($state_cache_exptime - $now) / (60 * 60)); # hours
if ($expires > 0) {
my $days = int($expires / 24);
my $hours = $expires % 24;
say("State cache will expire in " .
($days ? "$days day(s)" : ($hours ? "" : "the next hour")) .
($hours ? ($days ? " and " : "") . "$hours hour(s)" : "") . ".");
} else {
say("State cache expired during this run; next run will not use cache.");
}
}
}
sub load_state_cache {
my $cache_file = "$tempdir/debmirror_state.cache";
if (! -f $cache_file) {
say("State cache file does not exist; doing full mirroring.");
return;
}
my $rfiles;
say("Loading debmirror state cache.");
$rfiles = retrieve($cache_file);
if (! defined $rfiles) {
say("Failed to load state cache; doing full mirror check.");
return
}
if (! exists $$rfiles{cache_version}) {
say("Cache version missing in state cache; doing full mirroring.");
return
} elsif ($$rfiles{cache_version} ne $files_cache_version) {
say("State cache is incompatible with this version of debmirror; doing full mirror check.");
return
} else {
delete $$rfiles{cache_version};
}
if (! exists $$rfiles{cache_expiration_time}) {
say("Expiration time missing in state cache; doing full mirror check.");
return
} elsif ($$rfiles{cache_expiration_time} < time()) {
say("State cache has expired; doing full mirror check.");
return
} else {
$state_cache_exptime = $$rfiles{cache_expiration_time};
delete $$rfiles{cache_expiration_time};
}
say("State cache loaded successfully; will use cache.");
%files = %$rfiles;
$use_cache = 1;
# Preserve state cache during dry runs
if ($dry_run) {
$files{$cache_file} = 1;
} else {
unlink $cache_file if -f $cache_file;
}
}
sub downloads_via_http {
local $_ = shift;
defined or $_ = $download_method;
return $_ eq 'http';
}
sub downloads_via_https {
local $_ = shift;
defined or $_ = $download_method;
return $_ eq 'https';
}
sub downloads_via_http_or_https {
return downloads_via_http(@_) || downloads_via_https(@_);
}
sub downloads_via_ftp {
local $_ = shift;
defined or $_ = $download_method;
return $_ eq 'ftp';
}
sub downloads_via_file {
local $_ = shift;
defined or $_ = $download_method;
return $_ eq 'file';
}
sub downloads_via_rsync {
local $_ = shift;
defined or $_ = $download_method;
return $_ eq 'rsync';
}
sub uses_LWP {
return !downloads_via_rsync(@_);
}
# Do some specified thing with respect to a Contents file for each
# specified distribution, architecture and section. Optionally, do
# the same thing also for the source of each specified distribution
# and section.
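# Illustrative call (hypothetical callback and arguments):
#   do_contents_for_each_dist_arch_sect(\&my_handler, [$mirrordir],
#       { do_for_source => 1 });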
sub do_contents_for_each_dist_arch_sect {
my($routine, $routine_args, $operational_params) = @_;
my @sects = ((map {"/$_"} @sections), "");
foreach my $dist (keys %distset) {
next if $dist=~/\bexperimental\b|-proposed-updates\b/o;
next unless exists $distset{$dist}{mirror};
foreach my $arch (@arches) {
my %op_params = %$operational_params;
$op_params{is_source} = $arch=~/\bsource\b/o;
unless ($op_params{is_source} && !$op_params{do_for_source}) {
foreach my $sect (@sects) {
$routine->(
@$routine_args, \%op_params, $dist, $arch, $sect
) if exists $file_lists{
"$tempdir/dists/$dist$sect/Contents-$arch.gz"
}
}
}
}
}
return 1;
}
sub say {
print join(' ', @_)."\n" if ($verbose or $progress);
}
sub debug {
print $0.': '.join(' ', @_)."\n" if $debug;
}
=head1 COPYRIGHT
This program is copyright 2000-2001, 2010-2014 by
Joey Hess <joeyh@debian.org>, under
the terms of the GNU GPL (either version 2 of the licence or, at your
option, any later version), copyright 2001-2002 by Joerg Wendland
<joergland@debian.org>, copyright 2003-2007 by Goswin von Brederlow
<goswin-v-b@web.de>, copyright 2009-2010 by Frans Pop <fjp@debian.org>,
copyright 2015 by Thaddeus H. Black <thb@debian.org>, and copyright 2016
by Colin Watson <cjwatson@debian.org>.
The author disclaims any responsibility for any mangling of your system,
unexpected bandwidth usage bills, meltdown of the Debian/Ubuntu mirror network,
etc, that this script may cause. See NO WARRANTY section of GPL.
=head1 AUTHOR
Author:
Joey Hess <joeyh@debian.org>
Previous maintainers:
Joerg Wendland <joergland@debian.org>
Goswin von Brederlow <goswin-v-b@web.de>
Frans Pop <fjp@debian.org>
Joey Hess <joeyh@debian.org>
Thaddeus H. Black <thb@debian.org>
Current maintainer:
Colin Watson <cjwatson@debian.org>
=head1 MOTTO
Waste bandwidth -- put a partial mirror on your laptop today!
=cut
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# clamavmirror.py
# Copyright (C) 2015 Andrew Colin Kissa <andrew@topdog.za.net>
# vim: ai ts=4 sts=4 et sw=4
"""ClamAV Signature Mirroring Tool
Why
---
The existing clamdownloader.pl script does not have any error
correction; it simply bails out if a downloaded file is not
valid, and it cannot retry different mirrors if one fails.
This script will retry if a download fails with an HTTP code
other than 404, and it will connect to another mirror if
retries fail, the file is not found, or the downloaded file
is invalid.
It has options to set the locations for the working and
mirror directory as well as user/group ownership for the
downloaded files. It uses locking to prevent multiple
instances from running at the same time.
Requirements
------------
DNS-Python module - http://www.dnspython.org/
Usage
-----
$ ./clamavmirror.py -h
Usage: clamavmirror.py [options]
Options:
-h, --help show this help message and exit
-a HOSTNAME, --hostname=HOSTNAME
ClamAV source server hostname
-r TXTRECORD, --text-record=TXTRECORD
ClamAV Updates TXT record
-w WORKDIR, --work-directory=WORKDIR
Working directory
-d MIRRORDIR, --mirror-directory=MIRRORDIR
The mirror directory
-u USER, --user=USER Change file owner to this user
-g GROUP, --group=GROUP
Change file group to this group
-l LOCKDIR, --locks-directory=LOCKDIR
Lock files directory
Example Usage
-------------
mkdir /tmp/clamav/{lock,mirror,tmp}
./clamavmirror.py \
-l /tmp/clamav/lock \
-d /tmp/clamav/mirror \
-w /tmp/clamav/tmp \
-a db.za.clamav.net \
-u nginx \
-g nginx
"""
import os
import pwd
import grp
import sys
import time
import fcntl
import hashlib
from shutil import move
from optparse import OptionParser
from subprocess import PIPE, Popen
from dns.resolver import query, NXDOMAIN
if sys.version_info < (3, 0):
from urllib2 import Request, URLError, urlopen
else:
from urllib.request import Request
from urllib.request import urlopen
from urllib.error import URLError
def get_file_md5(filename):
"""Get a file's MD5"""
if os.path.exists(filename):
blocksize = 65536
try:
hasher = hashlib.md5()
except ValueError:
hasher = hashlib.new('md5', usedforsecurity=False)
with open(filename, 'rb') as afile:
buf = afile.read(blocksize)
while len(buf) > 0:
hasher.update(buf)
buf = afile.read(blocksize)
return hasher.hexdigest()
else:
return ''
def get_md5(string):
"""Get a string's MD5"""
try:
hasher = hashlib.md5()
except ValueError:
hasher = hashlib.new('md5', usedforsecurity=False)
hasher.update(string.encode())
return hasher.hexdigest()
def chunk_report(bytes_so_far, total_size):
"""Display progress"""
percent = float(bytes_so_far) / total_size
percent = round(percent * 100, 2)
sys.stdout.write(
"[x] Downloaded %d of %d bytes (%0.2f%%)\r" %
(bytes_so_far, total_size, percent))
if bytes_so_far >= total_size:
sys.stdout.write('\n')
def chunk_read(response, handle, chunk_size=8192, report_hook=None):
"""Read chunks"""
total_size = int(response.info().get('Content-Length'))
bytes_so_far = 0
while 1:
chunk = response.read(chunk_size)
handle.write(chunk)
bytes_so_far += len(chunk)
if not chunk:
handle.close()
break
if report_hook:
report_hook(bytes_so_far, total_size)
return bytes_so_far
def error(msg):
"""print to stderr"""
sys.stderr.write(msg + "\n")
def info(msg):
"""print to stdout"""
print(msg)
def deploy_signature(source, dest, user=None, group=None):
"""Deploy a signature fole"""
move(source, dest)
os.chmod(dest, 0o644)
if user and group:
try:
uid = pwd.getpwnam(user).pw_uid
gid = grp.getgrnam(group).gr_gid
os.chown(dest, uid, gid)
except (KeyError, OSError):
pass
def create_file(name, content):
"Generic to write file"
with open(name, 'w') as writefile:
writefile.write(content)
def get_ip_addresses(hostname):
"""Return ip addresses from hostname"""
try:
answers = query(hostname, 'A')
return [rdata.address for rdata in answers]
except NXDOMAIN:
return []
def get_txt_record(hostname):
"""Get the text record"""
try:
answers = query(hostname, 'TXT')
return answers[0].strings[0].decode()
except (IndexError, NXDOMAIN):
return ''
def get_local_version(sigdir, sig):
"""Get the local version of a signature"""
version = None
filename = os.path.join(sigdir, '%s.cvd' % sig)
if os.path.exists(filename):
cmd = ['sigtool', '-i', filename]
sigtool = Popen(cmd, stdout=PIPE, stderr=PIPE)
while True:
line = sigtool.stdout.readline().decode()
if line:
if line.startswith('Version:'):
version = line.split()[1].rstrip()
break
else:
break
sigtool.wait()
return version
def verify_sigfile(sigdir, sig):
"""Verify a signature file"""
cmd = ['sigtool', '-i', '%s/%s.cvd' % (sigdir, sig)]
sigtool = Popen(cmd, stdout=PIPE, stderr=PIPE)
ret_val = sigtool.wait()
return ret_val == 0
def download_sig(opts, ips, sig, version=None):
"""Download signature for IP list"""
code = None
downloaded = False
for ipaddr in ips:
try:
if version:
url = 'http://%s/%s.cvd' % (ipaddr, sig)
filename = os.path.join(opts.workdir, '%s.cvd' % sig)
else:
url = 'http://%s/%s.cdiff' % (ipaddr, sig)
filename = os.path.join(opts.workdir, '%s.cdiff' % sig)
req = Request(url)
req.add_header('Host', opts.hostname)
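            # The URL targets a specific mirror IP; the explicit Host
            # header preserves the original hostname so name-based
            # virtual hosting on the mirror still works.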
response = urlopen(req)
code = response.getcode()
handle = open(filename, 'wb')
chunk_read(response, handle, report_hook=chunk_report)
if version:
if (
verify_sigfile(opts.workdir, sig) and
version == get_local_version(opts.workdir, sig)):
downloaded = True
break
else:
downloaded = True
break
except URLError as err:
if hasattr(err, 'code'):
code = err.code
continue
finally:
if 'handle' in locals():
handle.close()
return downloaded, code
def get_addrs(hostname):
"""get addrs"""
count = 1
for passno in range(1, 6):
count = passno
info("[+] Resolving hostname: %s pass: %d" % (hostname, passno))
addrs = get_ip_addresses(hostname)
if addrs:
info("=> Resolved to: %s" % ','.join(addrs))
break
else:
info("=> Resolution failed, sleeping 5 secs")
time.sleep(5)
if not addrs:
error(
"=> Resolving hostname: %s failed after %d tries" %
(hostname, count))
sys.exit(2)
return addrs
def get_record(opts):
"""Get record"""
count = 1
for passno in range(1, 5):
count = passno
info("[+] Querying TXT record: %s pass: %s" % (opts.txtrecord, passno))
record = get_txt_record(opts.txtrecord)
if record:
info("=> Query returned: %s" % record)
break
else:
info("=> Txt record query failed, sleeping 5 secs")
time.sleep(5)
if not record:
error("=> Txt record query failed after %d tries" % count)
sys.exit(3)
return record
def copy_sig(sig, opts, isdiff):
"""Deploy a sig"""
info("Deploying signature: %s" % sig)
if isdiff:
sourcefile = os.path.join(opts.workdir, '%s.cdiff' % sig)
destfile = os.path.join(opts.mirrordir, '%s.cdiff' % sig)
else:
sourcefile = os.path.join(opts.workdir, '%s.cvd' % sig)
destfile = os.path.join(opts.mirrordir, '%s.cvd' % sig)
deploy_signature(sourcefile, destfile, opts.user, opts.group)
info("=> Deployed signature: %s" % sig)
def update_sig(options, addrs, sign, vers):
"""update signature"""
info("[+] Checking signature version: %s" % sign)
localver = get_local_version(options.mirrordir, sign)
remotever = vers[sign]
if localver is None or (localver and int(localver) < int(remotever)):
info("=> Update required L: %s => R: %s" % (localver, remotever))
for passno in range(1, 6):
info("=> Downloading signature: %s pass: %d" % (sign, passno))
status, code = download_sig(options, addrs, sign, remotever)
if status:
info("=> Downloaded signature: %s" % sign)
copy_sig(sign, options, 0)
break
else:
if code == 404:
error("=> Signature: %s not found, will not retry" % sign)
break
error(
"=> Download failed: %s pass: %d, sleeping 5sec" %
(sign, passno))
time.sleep(5)
else:
info("=> No update required L: %s => R: %s" % (localver, remotever))
def update_diff(opts, addrs, sig):
"""Update diff"""
for passno in range(1, 6):
info("[+] Downloading cdiff: %s pass: %d" % (sig, passno))
status, code = download_sig(opts, addrs, sig)
if status:
info("=> Downloaded cdiff: %s" % sig)
copy_sig(sig, opts, 1)
break
else:
if code == 404:
error("=> Signature: %s not found, will not retry" % sig)
break
error(
"=> Download failed: %s pass: %d, sleeping 5sec" %
(sig, passno))
time.sleep(5)
def create_dns_file(opts, record):
"""Create the DNS record file"""
info("[+] Updating dns.txt file")
filename = os.path.join(opts.mirrordir, 'dns.txt')
localmd5 = get_file_md5(filename)
remotemd5 = get_md5(record)
if localmd5 != remotemd5:
create_file(filename, record)
info("=> dns.txt file updated")
else:
info("=> No update required L: %s => R: %s" % (localmd5, remotemd5))
def main(options):
"""The main functions"""
addrs = get_addrs(options.hostname)
record = get_record(options)
record_list = record.split(':')
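    # The TXT record is a colon-separated version list; a record such as
    # "0.100.0:58:24675:..." (illustrative only) carries the main, daily,
    # safebrowsing, and bytecode versions at fields 1, 2, 6, and 7.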
versions = {
'main': record_list[1],
'daily': record_list[2],
'safebrowsing': record_list[6],
'bytecode': record_list[7]
}
    for signature_type in versions.keys():
        if signature_type != 'main':
# download diffs
localver = get_local_version(options.mirrordir, signature_type)
remotever = versions[signature_type]
if localver is not None:
for num in range(int(localver), int(remotever) + 1):
sig_diff = '%s-%d' % (signature_type, num)
filename = os.path.join(
options.mirrordir, '%s.cdiff' % sig_diff)
if not os.path.exists(filename):
update_diff(options, addrs, sig_diff)
update_sig(options, addrs, signature_type, versions)
create_dns_file(options, record)
sys.exit(0)
if __name__ == '__main__':
PARSER = OptionParser()
PARSER.add_option(
'-a', '--hostname',
help='ClamAV source server hostname',
dest='hostname',
type='str',
default='database.clamav.net')
PARSER.add_option(
'-r', '--text-record',
help='ClamAV Updates TXT record',
dest='txtrecord',
type='str',
default='current.cvd.clamav.net')
PARSER.add_option(
'-w', '--work-directory',
help='Working directory',
dest='workdir',
type='str',
default='/var/spool/clamav-mirror')
PARSER.add_option(
'-d', '--mirror-directory',
help='The mirror directory',
dest='mirrordir',
type='str',
default='/srv/www/datafeeds.baruwa.com/clamav')
PARSER.add_option(
'-u', '--user',
help='Change file owner to this user',
dest='user',
type='str',
default='nginx')
PARSER.add_option(
'-g', '--group',
help='Change file group to this group',
dest='group',
type='str',
default='nginx')
PARSER.add_option(
'-l', '--locks-directory',
help='Lock files directory',
dest='lockdir',
type='str',
default='/var/lock/subsys')
OPTIONS, _ = PARSER.parse_args()
try:
LOCKFILE = os.path.join(OPTIONS.lockdir, 'clamavmirror')
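        # Take a non-blocking exclusive lock; if another instance already
        # holds it, lockf() raises IOError and we exit below.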
with open(LOCKFILE, 'w+') as lock:
fcntl.lockf(lock, fcntl.LOCK_EX | fcntl.LOCK_NB)
main(OPTIONS)
except IOError:
info("=> Another instance is already running")
sys.exit(254)
AfroThundr3007730 commented May 24, 2018

RepoSync TODO / Roadmap

  • Documentation
    • Changelog (prob. from git history)
    • Version Header
    • Help / Usage (WIP)
    • Also Cmd Options (WIP)
  • Refactoring
    • Export Functions (WIP)
    • Combine Apt / Yum
    • Combine US / DS (WIP)
    • Fix Assumptions
  • Config File
    • Distros / Versions
    • Sync Behavior
    • Mirror Types
    • Auto Generate Option
  • Installation
    • Auto Repo Init
    • Dependency Install
    • Service Creation
    • Rsync / HTTP Config
    • Sync System User
  • Testing

Misc. Notes & TODO

  • Move to separate repo
  • Single-file distribution
  • Build system, auto signing
  • Stick with bash or plain sh?
  • Detailed docs (built in w/ less)
  • Cross-distro portability (even BSD)
