Created
October 19, 2012 19:25
Syncs a barcode.fastq file to the joined.fastq file produced by ea-utils' fastq-join script.
#!/usr/bin/perl
#
# ---------------------------------------- #
# Daniel Smith - October 1st, 2012         #
# Argonne National Laboratory              #
# Creative Commons License BY-SA 3.0       #
# ---------------------------------------- #
#
use strict;
use warnings;

my $USAGE = <<EOF;
USAGE: $0 barcodes.fastq joined_reads.fastq > barcode_subset.fastq
This script is designed to complement ea-util's fastq-join script.
Given three fastq-formatted files containing the forward, reverse, and barcode
reads, a joined and corresponding barcode file can be generated via:
fastq-join forward.fastq reverse.fastq -o out.%.fastq
$0 barcodes.fastq out.join.fastq > out.barcodes.fastq
The purpose of $0 is to remove entries from barcodes.fastq which
no longer have a corresponding read in out.join.fastq.
EOF

# Both arguments are required and both files must be readable.
my $barcode_file = shift || die ($USAGE);
my $joined_file  = shift || die ($USAGE);
die $USAGE unless (-r $barcode_file && -r $joined_file);

open (BARCODES, $barcode_file) or die ("$!\n");
open (READS, $joined_file) or die ("$!\n");

while (my $header = <READS>) {
    # Skip the sequence, '+', and quality lines of the joined read.
    my $line = <READS>;
    $line = <READS>;
    $line = <READS>;

    # Advance through the barcode file, one four-line record at a time,
    # until a header is found that is identical to the joined read's header.
    my ($bc1, $bc2, $bc3, $bc4) = map("", 1..4);
    while ($bc1 ne $header) {
        exit if (eof(BARCODES));
        $bc1 = <BARCODES>;
        $bc2 = <BARCODES>;
        $bc3 = <BARCODES>;
        $bc4 = <BARCODES>;
    }
    print $bc1, $bc2, $bc3, $bc4;
}

close READS;
close BARCODES;
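Because the inner loop compares entire header lines with `ne`, the script prints nothing if the first header in the joined file never appears verbatim in the barcode file. Any difference in the header text or in the line endings is enough to cause this. The shell sketch below is one possible way to check for this before running the script. The helper name `first_header_matches` is hypothetical and not part of the gist; it uses only standard `head` and `grep` options.

```shell
# Hypothetical helper, not part of the gist.
# Usage: first_header_matches barcodes.fastq joined.fastq
# Prints how many lines of the barcode file are identical to the
# first header line of the joined file.
first_header_matches() {
    barcodes=$1
    joined=$2
    # The first line of a FASTQ file is the header of its first record.
    header=$(head -n 1 "$joined")
    # -F: treat the header as a fixed string, not a regular expression
    # -x: only count lines that match in full
    # -c: print the number of matching lines instead of the lines
    grep -c -F -x -e "$header" "$barcodes"
}
```

A result of 0 suggests the two files do not share identical headers. Printing `head -n 1` of each file side by side should then show where they differ.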
This is a little perplexing, but the current script does not pull the barcodes correctly: I end up with an empty barcodes file for join.fastq when I install and run it. An older version of your Perl script, from when it was hosted on Dropbox years ago, runs fine. We reinstalled it from an older computer and it was able to pull the barcodes out. I was just wondering why? It seems similar to the previous poster's issue.
Hi.
I am new to bioinformatics.
Currently I am running Ubuntu in Oracle VM VirtualBox to process my data.
For your information, I have installed ea-utils and ran fastq-join on my paired-end reads using the following command:
fastq-join R1R2fastq/read1.fastq R1R2fastq/read2.fastq -o read.%.fastq
Below is the output:
Total reads: 16842367
Total joined: 13919484
Average join len: 49.21
Stdev join len: 3.23
Then I used the following Perl script to filter the barcodes against the joined reads generated by fastq-join. The barcodes came directly from the sequencing facility (Illumina MiSeq).
perl Downloads/fastq-barcode.pl Desktop\R1R2fastq\barcodes.fastq Desktop\read.join.fastq >barcodes_subset.fastq
However, I ran into a problem: it generated a 0 KB output file.
Could you give me some advice on this?
Thank you.
Regards,
MianZi
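One thing that may be worth checking in the command above, assuming it was typed into a bash-like shell on Ubuntu: the paths use backslashes, and an unquoted backslash is removed by the shell rather than treated as a directory separator. The script would then receive a file name that most likely does not exist, fail its `-r` readability test, and print only the usage text to the terminal while the redirect still creates an empty output file. The sketch below uses `printf` as a stand-in for `perl` to show what each spelling of the path turns into.

```shell
# printf stands in for perl here: it shows the argument exactly as the
# program receives it after the shell has processed the command line.

# Unquoted backslashes are stripped by the shell.
with_backslashes=$(printf '%s' Desktop\R1R2fastq\barcodes.fastq)

# Forward slashes, the Linux path separator, are passed through unchanged.
with_slashes=$(printf '%s' Desktop/R1R2fastq/barcodes.fastq)

printf '%s\n' "$with_backslashes"   # DesktopR1R2fastqbarcodes.fastq
printf '%s\n' "$with_slashes"       # Desktop/R1R2fastq/barcodes.fastq
```

If this is the cause, writing the paths with forward slashes, as in the fastq-join command earlier in the comment, should let the script find both files.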