@CaptainJH
Last active August 29, 2015 14:00
Parse a web page, download the files it links to, then decompress them
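#!/bin/bash
# Download the per-day "bill-*" files from two report servers, decompress
# them, and merge each day's content into RawText.txt under TargetRoot.
# Usage: pass the first day to export as YYYY-MM-DD; the run stops before today.
echo "=======ExportUserReport bash==========="
Root="http://10.130.19.139/19659/"
Root2="http://10.164.7.76/19659/"
DateBegin=$1
if [ -z "$DateBegin" ]; then
    echo "Usage: $0 YYYY-MM-DD" >&2
    exit 1
fi
TargetRoot="F:/UserReport/"
d=$DateBegin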
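while true; do
    echo "$d"
    # Stop at today (today itself is not exported). The string comparison
    # also ends the loop if the start date lies in the future.
    if [[ ! "$d" < "$(date +%F)" ]]; then
        break
    fi
    mkdir -p "$d"
    cd "$d" || exit 1
    index=1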
    for r in "$Root" "$Root2"; do
        subFolder="Root_$index"
        mkdir -p "$subFolder"
        cd "$subFolder" || exit 1
        rawFile="$r$d/"
        echo "$rawFile"
        # Save the directory listing, then extract the bill-* link targets.
        curl --silent "$rawFile" > raw.txt
        grep -ohE 'href="bill-[^"]+' raw.txt | sed 's@href="@@g' > list.txt
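        while read -r line; do
            echo "$line"
            curl -O --silent "$rawFile$line"
            # Decompress .dat.gz archives in place; -f overwrites output
            # left behind by an earlier run.
            if echo "$line" | grep --silent "\.dat\.gz"; then
                gzip -df "$line"
            fi
        done < list.txt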
        # Append every downloaded file (with a trailing .gz stripped from
        # its name) to the day's RawText.txt, one directory up.
        for f in $(sed 's@\.gz$@@' list.txt); do
            cat "$f" >> ../RawText.txt
        done
        index=$((index + 1))
        cd ..
    done
    targetPath="$TargetRoot$d"
    # Create the target directory if needed, then copy the merged file.
    if [ -d "$targetPath" ]; then
        echo "$targetPath Exists!"
    else
        mkdir -p "$targetPath"
    fi
    cp RawText.txt "$targetPath"
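    cd ..
    # Advance to the next day (requires GNU date); abort on an invalid
    # date instead of looping forever with an empty value.
    d=$(date -d "$d 1 day" +%F) || exit 1
done
echo "Finished"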
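
The two steps that do most of the work in the script above, pulling the bill-* link targets out of a directory listing and stepping through dates one day at a time, can be tried without touching the report servers. The following is a minimal sketch, not part of the original gist: the function names extract_bill_links and date_range and the sample HTML are made up for illustration, and GNU date is assumed for the -d option.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): read an HTML directory
# listing on stdin and print every href value that starts with "bill-".
extract_bill_links() {
    grep -oE 'href="bill-[^"]+' | sed 's/^href="//'
}

# Hypothetical helper: print each date from $1 (inclusive) up to $2
# (exclusive) as YYYY-MM-DD. ISO dates sort lexicographically, so a plain
# string comparison is enough. Assumes GNU date for the -d option.
date_range() {
    local d=$1 end=$2
    while [[ "$d" < "$end" ]]; do
        echo "$d"
        d=$(date -d "$d + 1 day" +%F) || return 1
    done
}

# Made-up sample listing; a real server's page may be laid out differently.
sample='<a href="bill-0001.dat.gz">bill-0001.dat.gz</a> <a href="readme.txt">readme.txt</a> <a href="bill-0002.dat.gz">bill-0002.dat.gz</a>'

printf '%s\n' "$sample" | extract_bill_links
date_range 2014-04-28 2014-05-01
```

Using an exclusive end date mirrors the script's behavior of stopping before today, and comparing with < rather than == means a start date in the future produces no output instead of an endless loop.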