@nievergeltlab
Created June 30, 2021 16:46
Intersect each phenotype file with its matching covariate file to get the total N per phenotype file (subjects with a non-missing phenotype who also appear in the covariate file).
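# Phenotype file names are split on "_": field 2 -> fname, field 3 (minus .pheno) -> fname2
for path in pheno/*.pheno
do
  file=$(basename "$path")
  fname=$(echo "$file" | awk 'BEGIN {FS="_"}{print $2}')
  fname2=$(echo "$file" | awk 'BEGIN {FS="_"}{print $3}' | sed 's/\.pheno$//')
  # Phenotype-only count, without intersecting with the covariate file:
  # scount=$(awk '{print $3}' pheno/$file | grep -v NA | wc -l | awk '{print $1}')
  # Join on columns 1 and 2 (pasted together with "_"), then count non-NA values of phenotype column 3
  scount=$(LC_ALL=C join <(awk '{print $1"_"$2,$3}' "pheno/$file" | LC_ALL=C sort -k1b,1) <(awk '{print $1"_"$2}' "pheno/p2_${fname}_eur_${fname2}_agesex.cov" | LC_ALL=C sort -k1b,1) | awk '{print $2}' | grep -v NA | wc -l | awk '{print $1}')
  echo "$fname" "$fname2" "$scount"
  head -n1 "pheno/$file"
done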
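
The same intersection count can also be sketched as a single awk pass, without sorting or join. This is an alternative, not the method used in the loop above: `count_shared` is a hypothetical helper name and the toy files are invented for illustration. It matches `NA` exactly rather than as a substring, and it counts duplicated IDs once per phenotype row, so results may differ from the join version on files with repeated IDs.

```shell
# Hypothetical helper: count rows of the phenotype file whose ID pair
# (columns 1 and 2) also appears in the covariate file and whose
# phenotype value (column 3) is not NA.
count_shared() {
  pheno_file=$1
  cov_file=$2
  # First file (covariates): remember every ID pair.
  # Second file (phenotypes): count rows with a known ID and a non-NA value.
  awk 'NR == FNR { seen[$1 "_" $2] = 1; next }
       ($1 "_" $2) in seen && $3 != "NA" { n++ }
       END { print n + 0 }' "$cov_file" "$pheno_file"
}

# Toy data, invented for illustration only
tmpdir=$(mktemp -d)
printf 'F1 I1 1\nF2 I2 NA\nF3 I3 0\nF4 I4 1\n' > "$tmpdir/toy.pheno"
printf 'F1 I1 34 1\nF2 I2 51 2\nF4 I4 47 1\n' > "$tmpdir/toy.cov"

count_shared "$tmpdir/toy.pheno" "$tmpdir/toy.cov"
rm -r "$tmpdir"
```

In the toy data, F2 is dropped because its phenotype is NA and F3 is dropped because it has no covariate row, so only F1 and F4 are counted.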