Created
May 30, 2023 11:30
Sometimes the QIIME input for pick_otus.py is too large to process as a single file, so you have to split it and end up with several output OTU files. You can't simply concatenate those outputs, because the same OTU ID would then appear on more than one line. This script merges the lines that share an OTU ID.
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Wed Sep 28 17:14:18 2022
@author: silvia
@description: sometimes the QIIME input (for pick_otus.py) is too large a file.
Therefore you need to split it and obtain several output OTU files. But you can't
just paste the outputs back together, since that would mean repeated OTU IDs. So...
use this script.
You also need to remove quotes afterwards with
cat <file> | tr -d '"' > <newfile>
"""
# Example:
#-bash-4.2$ wc -l $INPUT
#15294992 /home/silviatm/micro/tomate/tomate_23sep_dada2cleaned//tomate_23sep_ASV.fa
#-bash-4.2$ head -10000000 $INPUT > $INPUT"2.fa"
#-bash-4.2$ tail -5294992 $INPUT > $INPUT"3.fa"
#-bash-4.2$ cat ./qiime/tomate_23sep_ASV.fa2_otus.txt > ./qiime/tomate_23sep_ASV_otus.txt
#-bash-4.2$ cat ./qiime/tomate_23sep_ASV.fa3_otus.txt >> ./qiime/tomate_23sep_ASV_otus.txt
import pandas
# import csv

f = "/home/silviatm/micro/tomate/tomate_23sep_dada2cleaned//tomate_23sep_ASV_otus.txt"

# Read each line whole (the lines contain tabs, not spaces), then split off
# the OTU ID (first field) from the rest of the line.
table = pandas.read_csv(f, sep=" ", names=["mycol"])
table = table.mycol.str.split("\t", expand=True, n=1)
table.columns = ["otu", "instance"]

# Concatenate the members of every line that shares an OTU ID.
d = {}
for i, vals in table.iterrows():
    if vals.otu in d:
        d[vals.otu] = d[vals.otu] + "\t" + vals.instance
    else:
        d[vals.otu] = vals.instance

# The merged values contain tabs, so to_csv wraps them in quotes;
# strip the quotes afterwards as described in the docstring.
s = pandas.Series(d, name='DateValue')
s.to_csv("/home/silviatm/micro/tomate/tomate_23sep_dada2cleaned//tomate_23sep_ASV_otus_merged.txt",
         sep="\t", header=False)  # , quoting=csv.QUOTE_NONE, escapechar="\\")
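
The same merge can also be sketched without pandas, which writes the tabs directly and so should avoid the quote-stripping step. This is only an illustrative alternative, not part of the original script: the function names `merge_otu_lines` and `merge_otu_file` and the paths in the usage note are hypothetical, and it assumes each input line has the form `<otu_id>\t<seq>\t<seq>...` as in the concatenated pick_otus.py output above.

```python
def merge_otu_lines(lines):
    """Merge OTU map lines that share an OTU ID, keeping first-seen order.

    Hypothetical helper: each line is assumed to look like
    "<otu_id>\t<member>\t<member>...".
    """
    merged = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # Split off the OTU ID; everything after the first tab is the members.
        otu_id, _, members = line.partition("\t")
        bucket = merged.setdefault(otu_id, [])
        if members:
            bucket.append(members)
    # Rebuild one tab-separated line per OTU ID.
    return ["\t".join([otu_id] + members) for otu_id, members in merged.items()]


def merge_otu_file(in_path, out_path):
    """Read a concatenated OTU map and write the merged version (hypothetical paths)."""
    with open(in_path) as src:
        merged = merge_otu_lines(src)
    with open(out_path, "w") as dst:
        for line in merged:
            dst.write(line + "\n")
```

Usage would be along the lines of `merge_otu_file("tomate_23sep_ASV_otus.txt", "tomate_23sep_ASV_otus_merged.txt")`. Like the script above, it keeps one entry per OTU ID in memory.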