@JWDebler
Created January 6, 2022 07:18
fast5 demultiplex by barcode
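# Install the ONT fast5 tools (provides fast5_subset)
pip3 install ont-fast5-api
# List the barcodes present in the summary (column 21 holds the barcode call here; check your own header)
cut -f 21 sequencing_summary.txt | grep 'barcode[0-9]' | sort -u > barcodes.log
while read -r barcode
do
    # Per-barcode read list: header line plus every row assigned to this barcode
    head -n 1 sequencing_summary.txt > "$barcode.txt"
    grep "$barcode" sequencing_summary.txt >> "$barcode.txt"
    # Extract only this barcode's reads into its own folder
    fast5_subset -i folder/with/your/fast5s -s "output/folder/for/demultiplexed_fast5/$barcode" -l "$barcode.txt" -t 14
done < barcodes.log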
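
The script above reads the barcode call from a fixed column (cut -f 21), and the column position can differ between summary files. A minimal sketch of looking the column up by its header name instead is shown here. The helper name column_index is hypothetical, and the header name barcode_arrangement is an assumption that should be checked against the first line of your own summary file.

```shell
# Sketch: print the 1-based index of a column in a tab-separated file,
# looked up by its header name. Prints nothing if the name is absent.
column_index() {
    # $1 = summary file, $2 = exact header name
    head -n 1 "$1" | tr '\t' '\n' | grep -n -x -- "$2" | cut -d: -f1
}

# Possible use in place of the fixed "cut -f 21":
# col=$(column_index sequencing_summary.txt barcode_arrangement)  # assumed header name
# cut -f "$col" sequencing_summary.txt | grep 'barcode[0-9]' | sort -u > barcodes.log
```

If the function prints nothing, the header name was not found and the summary probably carries no barcode calls.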
@Fatihlrcfs

Hi Johannes,

Thanks for providing this command pipeline. I am wondering whether there is any way to generate separate fast5 files for each barcode from fast5_skip. Many thanks, and have a great day.

@JWDebler

JWDebler commented Feb 15, 2022 via email

@Fatihlrcfs

Hi Johannes,

First, thanks for the quick reply. The fast5_skip folder was generated by MinKNOW when I stopped basecalling (MinKNOW was still basecalling after the sequencing run had finished). MinKNOW then created an additional folder, fast5_skip, and put all the un-basecalled fast5 files into it. So I am trying to find a way to separate the fast5 files in fast5_skip by barcode. Many thanks. Sincerely.

@JWDebler

JWDebler commented Feb 16, 2022 via email

@Fatihlrcfs

Hi Johannes,

Many thanks for sending your commands; I really appreciate it. But I got some errors. First I tried Guppy version 6 and got:
Unexpected option '-o' found on command line.
Unexpected option '--min_score_mid_barcodes' found on command line.
Then I tried the previous version, Guppy 5.0.16, and got:
Unexpected option '-o' found on the command line.
Missing required option 'save_path'.
Also, I don't currently have GPU facilities and I am new to this field. Many thanks for your help.
Command used: /cluster/lrcfs/ftiras/bin/ont-guppy-cpu-5.0.16/bin/guppy_basecaller -c dna_r9.4.1_450bps_sup.cfg -i fast5_skip/ --recursive -o output_folder/ --barcode_kits EXP-NBD104 --trim_barcodes --detect_mid_strand_barcodes --min_score_mid_barcodes 60 --compress_fastq --fast5_out

Sincerely

@Fatihlrcfs

Hi Johannes,

When I removed the -o output_folder option and added the --save_path option, basecalling worked, but in the workspace folder all the fast5 files are in the same place and have the same names as the files in the fast5_skip folder.
I have also attached a JPEG showing my folder layout; I hope it gives a better idea. Many thanks. Have a great day.
Best wishes

[Attached image: Slayt1]

@JWDebler

JWDebler commented Feb 17, 2022 via email

@Fatihlrcfs

Hi Johannes,

It looks like it is working now. Many thanks. But after basecalling, the generated fast5 files are still all together in one folder (the workspace) and are named like FAP...fast5_skip.....fast5. Is there any way to separate those fast5 files by barcode? I want to use the deepsignal and tombo packages. Many thanks. :)

[Attached image: Slayt1]

Sincerely.

@JWDebler

JWDebler commented Feb 17, 2022 via email

@Fatihlrcfs

Hi Johannes,

I am still working on fixing my problem, but many thanks for your help. I really appreciate your effort. Have a great day. :)

Sincerely.

@JWDebler

JWDebler commented Feb 21, 2022 via email

@Fatihlrcfs

Hi Johannes,

I want to let you know that your code worked on my latest run, and I have been able to separate the fast5 files for each barcode. I don't know why, but it did not work on my previous run (it still generates a - folder and puts all the fast5 files in there, as in nanoporetech/ont_fast5_api#68), so I decided to basecall all my previous raw data again and see how it turns out. I want to personally say many thanks for your effort. Have a great day. :)

Best wishes.

@JWDebler

JWDebler commented Feb 28, 2022 via email

@Fatihlrcfs

Hi @BeatrizFaustino,

I used the command demux_fast5 --input fast5_skip/ --save_path ./demutiplexedfast5/ --summary_file sequencing_summary.txt to demultiplex my fast5_skip folder by barcode. You should use the MinKNOW sequencing summary file as input.

@Fatihlrcfs

Hi @BeatrizFaustino,

The sequencing summary file that I used is the one MinKNOW provides after the run, generally named sequencing_summary_FAPxxxxx.txt. It is not a file generated after basecalling, such as the one Guppy produces.
Yes, fast5_skip contains all the fast5 files that MinKNOW generated during sequencing but did not separate by barcode (for example because you stopped basecalling after sequencing had finished while basecalling was still in progress). I hope this helps. Thanks.
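
Since the demultiplexing step depends on the summary file carrying a barcode call for each read, it can help to check that before running demux_fast5. The following is only a sketch: the function name barcode_counts is hypothetical, and the default header name barcode_arrangement is an assumption to verify against your own file.

```shell
# Sketch: count reads per barcode in a tab-separated sequencing summary.
barcode_counts() {
    # $1 = summary file, $2 = header name of the barcode column (optional)
    awk -F '\t' -v name="${2:-barcode_arrangement}" '
        NR == 1 {
            # Locate the barcode column by its header name
            for (i = 1; i <= NF; i++) if ($i == name) col = i
            if (!col) { print "column not found: " name > "/dev/stderr"; exit 1 }
            next
        }
        { n[$col]++ }                               # tally one read per row
        END { for (b in n) print b "\t" n[b] }
    ' "$1" | sort
}

# Example: barcode_counts sequencing_summary.txt
```

A summary that prints only a "column not found" message, or that lists no barcode entries, is unlikely to give per-barcode folders.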

@BeatrizFaustino

Hi,
Is demux part of Guppy or of some other software?

@Fatihlrcfs

Hi @BeatrizFaustino,

I used this page for demultiplexing: ont_fast5_api

@BeatrizFaustino

Hi @Fatihlrcfs,
Thank you very much. Using the commands you gave me, the analysis has now finished successfully.

@JWDebler

JWDebler commented Dec 20, 2022 via email
