@silvtal
Created May 30, 2023 11:30
Sometimes the QIIME input (for pick_otus.py) is too large a file to process in one go. You then need to split it, which yields several output OTU files. But you can't just concatenate the outputs again, since that would leave repeated OTU IDs. So... use this script.
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Wed Sep 28 17:14:18 2022
@author: silvia
@description: sometimes the QIIME input (for pick_otus.py) is too large a file.
You then need to split it, which yields several output OTU files. But you can't
just concatenate the outputs again, since that would leave repeated OTU IDs. So...
use this script.
You also need to remove quotes afterwards with
cat <file> | tr -d '"' > <newfile>
"""
# Example:
#-bash-4.2$ wc -l $INPUT
#15294992 /home/silviatm/micro/tomate/tomate_23sep_dada2cleaned//tomate_23sep_ASV.fa
#-bash-4.2$ head -10000000 $INPUT > $INPUT"2.fa"
#-bash-4.2$ tail -5294992 $INPUT > $INPUT"3.fa"
#-bash-4.2$ cat ./qiime/tomate_23sep_ASV.fa2_otus.txt > ./qiime/tomate_23sep_ASV_otus.txt
#-bash-4.2$ cat ./qiime/tomate_23sep_ASV.fa3_otus.txt >> ./qiime/tomate_23sep_ASV_otus.txt
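import pandas
# import csv

f = "/home/silviatm/micro/tomate/tomate_23sep_dada2cleaned//tomate_23sep_ASV_otus.txt"

# Read each line as a single column (this assumes lines contain no spaces),
# then split off the OTU ID, which is the first tab-separated field.
table = pandas.read_csv(f, sep=" ", names=["mycol"])
table = table.mycol.str.split("\t", expand=True, n=1)
table.columns = ["otu", "instance"]

# Concatenate the members of every line that shares the same OTU ID.
d = {}
for i, vals in table.iterrows():
    if vals.otu in d:
        d[vals.otu] = d[vals.otu] + "\t" + vals.instance
    else:
        d[vals.otu] = vals.instance

s = pandas.Series(d, name='DateValue')
# Values that contain tabs are written quoted, hence the `tr -d '"'` step afterwards.
s.to_csv("/home/silviatm/micro/tomate/tomate_23sep_dada2cleaned//tomate_23sep_ASV_otus_merged.txt", sep="\t",
         header=False)#, quoting=csv.QUOTE_NONE,escapechar="\\")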
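
The merge step can also be sketched without pandas, which may help when checking the logic on a small sample. This is only an illustration, not part of the original script: the function name `merge_otu_lines` and the sample IDs (`otu1`, `seqA`, ...) are made up, and it assumes each line is an OTU ID followed by tab-separated sequence IDs. Because the output lines are built by joining strings directly, no quotes are added in this variant.

```python
from collections import defaultdict


def merge_otu_lines(lines):
    """Merge OTU map lines that share the same OTU ID (first tab-separated field).

    Hypothetical helper for illustration; assumes "<otu_id>\t<seq>\t<seq>..." lines.
    """
    merged = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        otu_id, *members = line.split("\t")
        merged[otu_id].extend(members)
    # dicts keep insertion order, so OTUs appear in order of first occurrence
    return ["\t".join([otu_id, *members]) for otu_id, members in merged.items()]


# Made-up sample: "otu1" appears twice, as it would after concatenating two outputs.
sample = [
    "otu1\tseqA\tseqB\n",
    "otu2\tseqC\n",
    "otu1\tseqD\n",
]
result = merge_otu_lines(sample)
print("\n".join(result))
```

To use it on real files, the lines could be read with `open(path)` and the result written back with one `write` per merged line.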