@johanquiroga
Last active February 2, 2018 01:28
Small script to extract valid emails from a list of students in XML format
#!/bin/bash
# With no arguments, prompt interactively for the input and output paths.
if [ $# -eq 0 ]; then
    echo "Enter the full path to the XML file (or drag it here)"
    read -r file
    echo "Enter the name of the output file (with extension)"
    read -r outputFile
else
    if [ "$1" ]; then
        file=$1
    else
        echo "No file was provided to process"
        exit 1
    fi
fi
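if [[ -e $file ]]; then
    # Default the output name to the input name with a .txt extension,
    # unless one was entered at the prompt or passed as the second argument.
    if [ ! "$outputFile" ]; then
        outputFile="${file%.*}.txt"
        if [ "$2" ]; then
            outputFile=$2
        fi
    fi
    pattern='^<EMAIL>\b'
    # Note: {2,4} caps the top-level domain at four letters.
    emailRegex='[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}'
    # Keep lines that start with <EMAIL>, extract the addresses, and join them with commas.
    # The regexes are quoted so the shell does not glob-expand their brackets.
    data=$(grep -E "$pattern" "$file" | grep -Eo "$emailRegex" | tr '\n' ',')
    # Drop the trailing comma and write the list without a final newline.
    printf '%s' "${data%,}" > "$outputFile"
    exit 0
else
    echo "The file does not exist or there is a problem with it"
    exit 1
fi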
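
The extraction step can be tried on its own with a small sample file. The sketch below is not part of the original script: the `extract_emails` function name and the sample XML layout (one `<EMAIL>` element per line, at the start of the line) are assumptions made for illustration, and `\b` relies on GNU grep.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): prints the comma-separated
# addresses found on lines that start with <EMAIL>.
extract_emails() {
    local file=$1
    local pattern='^<EMAIL>\b'
    local emailRegex='[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}'
    local data
    data=$(grep -E "$pattern" "$file" | grep -Eo "$emailRegex" | tr '\n' ',')
    # Strip the trailing comma left by tr.
    printf '%s' "${data%,}"
}

# Made-up sample input; the real student export format is an assumption.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<STUDENT>
<NAME>Ana</NAME>
<EMAIL>ana@example.com</EMAIL>
</STUDENT>
<STUDENT>
<NAME>Luis</NAME>
<EMAIL>luis.perez@example.org</EMAIL>
</STUDENT>
EOF

result=$(extract_emails "$sample")
echo "$result"
rm -f "$sample"
```

With this sample the script prints `ana@example.com,luis.perez@example.org`. Lines where `<EMAIL>` is indented would be skipped, because the pattern is anchored to the start of the line.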