@dansmith01
Created October 19, 2012 19:25
Syncs a barcode.fastq file to the joined.fastq file produced by the fastq-join script from ea-utils.
#!/usr/bin/perl
#
# ---------------------------------------- #
# Daniel Smith - October 1st, 2012 #
# Argonne National Laboratory #
# Creative Commons License BY-SA 3.0 #
# ---------------------------------------- #
#
use strict;
use warnings;
my $USAGE = <<EOF;
USAGE: $0 barcodes.fastq joined_reads.fastq > barcode_subset.fastq
This script is designed to complement the fastq-join script from ea-utils.
Given three fastq-formatted files containing the forward, reverse, and barcode
reads, a joined and corresponding barcode file can be generated via:
fastq-join forward.fastq reverse.fastq -o out.%.fastq
$0 barcodes.fastq out.join.fastq > out.barcodes.fastq
The purpose of $0 is to remove entries from barcodes.fastq which
no longer have a corresponding read in out.join.fastq.
EOF
my $barcode_file = shift || die ($USAGE);
my $joined_file = shift || die ($USAGE);
die $USAGE unless (-r $barcode_file && -r $joined_file);
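# Open both inputs with lexical filehandles and three-argument open.
open(my $barcodes, '<', $barcode_file) or die ("Cannot open $barcode_file: $!\n");
open(my $reads, '<', $joined_file) or die ("Cannot open $joined_file: $!\n");

# Each FASTQ record spans four lines. Only the header of a joined read is
# needed, so the other three lines are read and discarded.
while (my $header = <$reads>) {
    my $line = <$reads>;    # sequence
    $line = <$reads>;       # separator line
    $line = <$reads>;       # quality string

    # Advance through the barcode file until a record's header line is
    # identical to the joined read's header. Both files must list reads in
    # the same order, and the header lines must match exactly.
    my ($bc1, $bc2, $bc3, $bc4) = map("", 1..4);
    while ($bc1 ne $header) {
        exit if (eof($barcodes));
        $bc1 = <$barcodes>;
        $bc2 = <$barcodes>;
        $bc3 = <$barcodes>;
        $bc4 = <$barcodes>;
    }
    print $bc1, $bc2, $bc3, $bc4;
}
close $reads;
close $barcodes;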
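
The loop above keeps a barcode record only when its header line is identical to the header of a joined read. If the two files carry the same read ID but differ elsewhere on the header line (for example in a read-number field after a space), nothing matches and the output is empty. Whether that is what happens in any particular data set is an assumption, not something this gist confirms. The following is a minimal sketch of a more tolerant comparison that looks only at the text before the first whitespace in each header. The names read_id, next_record, and sync_barcodes are hypothetical and are not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: reduce a FASTQ header line to its read ID,
# assumed here to be everything before the first whitespace.
sub read_id {
    my ($header) = @_;
    my ($id) = split /\s+/, $header;
    return defined $id ? $id : '';
}

# Hypothetical helper: read one four-line FASTQ record from a filehandle.
# Returns an empty list at end of file or on a truncated record.
sub next_record {
    my ($fh) = @_;
    my @record;
    for (1 .. 4) {
        my $line = <$fh>;
        return unless defined $line;
        push @record, $line;
    }
    return @record;
}

# Print every barcode record whose read ID matches a joined read.
# Like the original script, this assumes both files are in the same order.
# Returns the number of barcode records written.
sub sync_barcodes {
    my ($barcode_fh, $reads_fh, $out_fh) = @_;
    my $kept = 0;
    READ: while (my @read = next_record($reads_fh)) {
        my $wanted = read_id($read[0]);
        while (my @barcode = next_record($barcode_fh)) {
            if (read_id($barcode[0]) eq $wanted) {
                print {$out_fh} @barcode;
                $kept++;
                next READ;
            }
        }
        last;    # barcode file exhausted before a match was found
    }
    return $kept;
}
```

To use it like the original script, open the two input files and call sync_barcodes($barcode_fh, $reads_fh, \*STDOUT). A return value of 0 means no read IDs matched, which is worth checking before assuming the output file is valid.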
@Mianzi

Mianzi commented Aug 17, 2016

Hi.
I am new to Bioinformatics.
Currently I am using Oracle VM VirtualBox, Ubuntu to run my data.
I have installed ea-utils and ran fastq-join on my paired-end reads with the following command:
fastq-join R1R2fastq/read1.fastq R1R2fastq/read2.fastq -o read.%.fastq
Below is the output:

Total reads: 16842367
Total joined: 13919484
Average join len: 49.21
Stdev join len: 3.23

Then I used the following Perl script to filter the barcodes against the joined reads generated by fastq-join. The barcodes came directly from the sequencing facility (Illumina MiSeq).

perl Downloads/fastq-barcode.pl Desktop\R1R2fastq\barcodes.fastq Desktop\read.join.fastq >barcodes_subset.fastq

However, I ran into a problem: it generated a 0 KB output file.
Could you give me some advice on this?
Thank you.

Regards,
MianZi

@benligan

@dansmith01

A little perplexing, but the current script does not pull the barcodes correctly. I end up with an empty barcodes join.fastq when I install and run it. Your older script, from when it was hosted on Dropbox years ago, seems to run fine (we reinstalled it from an older computer) and was able to pull the barcodes out. I was just wondering why? It looks similar to the previous poster's issue.
