@Ruborcalor
Created May 3, 2020 03:35
Code To Download Dataset
# Get the dataset. It is too big to come as a single file, so it is downloaded in two parts
!wget -O "./GSE87571_Matrix_Avg_Beta.txt.gz" "https://www.ncbi.nlm.nih.gov/geo/download/?acc=GSE87571&format=file&file=GSE87571%5Fmatrix1of2%2Etxt%2Egz"
!wget -O "./GSE87571_Matrix_Avg_Beta2.txt.gz" "https://www.ncbi.nlm.nih.gov/geo/download/?acc=GSE87571&format=file&file=GSE87571%5Fmatrix2of2%2Etxt%2Egz"
# Extract dataset
!gunzip "./GSE87571_Matrix_Avg_Beta.txt.gz"
!gunzip "./GSE87571_Matrix_Avg_Beta2.txt.gz"
# Merge the two files into one: drop the duplicate first column from the second file, then paste the files side by side
!cut -d$'\t' -f1 --complement "./GSE87571_Matrix_Avg_Beta2.txt" > newFile && mv newFile "./GSE87571_Matrix_Avg_Beta2.txt"
!paste "GSE87571_Matrix_Avg_Beta.txt" "./GSE87571_Matrix_Avg_Beta2.txt" > matrix.csv
# Keep the first column and every even-numbered column, because the columns in between have blank values
!awk '{{printf "%s ", $1}for(i=2;i<=NF;i=i+2){printf "%s ", $i}{printf "%s", RS}}' matrix.csv > final_matrix.cs
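The merge and column-filter steps above can be tried on a small toy file before running them on the full download. The sketch below is only an illustration under stated assumptions: the file names (`part1.txt`, `part2.txt`, `toy_matrix.txt`, `toy_final.txt`) and all values are hypothetical, and the columns to be dropped hold placeholder values (`x`), since awk's default whitespace splitting would collapse truly empty fields. It uses `cut -f2-`, which should behave like `-f1 --complement` here, and relies on tab being the default delimiter of `cut`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Work in a scratch directory so no real data is touched.
workdir="$(mktemp -d)"
cd "$workdir"

# Two toy tab-separated parts that share the same ID column (hypothetical data).
printf 'ID\ts1\tx1\ncg01\t0.10\tx\ncg02\t0.20\tx\n' > part1.txt
printf 'ID\ts2\tx2\ncg01\t0.30\tx\ncg02\t0.40\tx\n' > part2.txt

# Drop the duplicated ID column from the second part.
cut -f2- part2.txt > part2_noid.txt

# Join the two parts side by side, separated by a tab.
paste part1.txt part2_noid.txt > toy_matrix.txt

# Keep column 1 and every even-numbered column; output is space-separated.
awk '{ printf "%s ", $1; for (i = 2; i <= NF; i += 2) printf "%s ", $i; printf "%s", RS }' \
  toy_matrix.txt > toy_final.txt

cat toy_final.txt
```

Each output line ends with a trailing space, because the awk program prints a space after every kept field, and the result is space-separated rather than comma-separated despite the `.csv` naming used above.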