@sp00nman
Last active July 29, 2023 08:43
Finds files that share the same ID and prints the cat commands that merge them into a single file per read (R1, R2).
#!/bin/bash
# UPD...file listing the unique patient IDs (UPDs), one per line
# e.g. H_0047C, H_0060A, H_0062D, ... (newline separated)
UPD=$1
# SPATH...search path
SPATH=$2
# FEXTENSION...file extension to search for [eg. .fastq.gz]
FEXTENSION=$3
# DEBUG
DEBUG=0
# define usage function
usage(){
echo "Usage: $0 filename_UPD \"search_path\" \"file_extension\""
exit 1
}
# show usage if any of the three arguments is missing
if [[ -z ${UPD} ]] || [[ -z ${SPATH} ]] || [[ -z ${FEXTENSION} ]]; then
usage
fi
# print the inputs only when DEBUG is set to a non-zero value
if [[ ${DEBUG} -ne 0 ]]; then
    echo "UPD: ${UPD}"
    while IFS= read -r LINE; do
        echo "${LINE}"
    done < "${UPD}"
    echo "SPATH: ${SPATH}"
    echo "FEXTENSION: ${FEXTENSION}"
fi
# get files
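# keep paths that contain one of the IDs and the extension;
# -F matches the extension literally, so the dots in .fastq.gz are not wildcards
find "${SPATH}" | grep -f "${UPD}" | grep -F -- "${FEXTENSION}" >processed_files.txt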
for ID in $(cat "${UPD}")
do
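    #echo $ID
    # collect all R1 and R2 files that belong to this ID
    R1=$(grep -e "${ID}.*R1" processed_files.txt)
    R2=$(grep -e "${ID}.*R2" processed_files.txt)
    # print the merge commands (they are not executed here);
    # R1/R2 stay unquoted so that multiple matches end up on one line
    echo "cat" ${R1} ">${ID}_R1.fastq.gz"
    echo "cat" ${R2} ">${ID}_R2.fastq.gz"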
done
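
The loop above only prints the cat commands. If the merge should be run directly, something like the following sketch might work. The function name `merge_reads`, its argument order, and the sorting of the matched paths are assumptions for illustration and are not part of the original script. It relies on the fact that concatenated gzip files form a valid gzip stream, and it skips IDs without matches so that no empty output file is created.

```shell
#!/bin/bash
# Hypothetical helper (not in the original gist): concatenate every file whose
# path matches "<ID>.*<READ>" into <outdir>/<ID>_<READ>.fastq.gz.
# Arguments: ID, read tag (R1 or R2), file list (e.g. processed_files.txt), output dir
merge_reads() {
    local id=$1 read_tag=$2 list=$3 outdir=$4
    local -a files=()
    local f
    # Read the matching paths line by line so that paths with spaces survive;
    # sorting is an assumption to keep the concatenation order reproducible.
    while IFS= read -r f; do
        files+=("$f")
    done < <(grep -e "${id}.*${read_tag}" "$list" | sort)
    # Report IDs without matches instead of writing an empty file.
    if (( ${#files[@]} == 0 )); then
        echo "no ${read_tag} files found for ${id}" >&2
        return 1
    fi
    cat -- "${files[@]}" > "${outdir}/${id}_${read_tag}.fastq.gz"
}
```

Inside the loop it could be called as `merge_reads "${ID}" R1 processed_files.txt .` and again with `R2`.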