@ivan-krukov
Last active August 29, 2015 14:07
A dumb line filter
#!/usr/bin/perl
use strict;
use warnings;
#The names of the input files are provided on the command line
if (@ARGV == 2) {
    # Read input files
    my $ids_file = $ARGV[0];
    # <ids.txt>
    #rs61733845
    #rs16823940
    #...
    my $vcf_file = $ARGV[1];
    # <YRI.exon.2010_03.sites.vcf>
    #22 43691009 rs8135982 C T . PASS AA=C;AC=13;AN=132;DP=723
    #22 48958132 rs35195493 C G . PASS AA=C;AC=4;AN=178;DP=3609
    #...
    my @ids = (); #This will contain the IDs from ids.txt
    open(my $ids_handle, '<', $ids_file) or die "Cannot open $ids_file: $!\n";
    while (my $line = <$ids_handle>) {
        chomp($line); #Remove the trailing newline from the line
        next if $line eq ''; #Skip blank lines, which would otherwise match spuriously
        push(@ids, $line); #Add the ID to the list of IDs
    }
    close($ids_handle);
    #Search the VCF for matching IDs
    open(my $vcf_handle, '<', $vcf_file) or die "Cannot open $vcf_file: $!\n";
    while (my $line = <$vcf_handle>) {
        foreach my $id (@ids) { #try to match each of the IDs
            if ($line =~ /\Q$id\E/) { #check if line contains ID (substring match)
                print $line; #print it if we found it
                last; #do not print the same line again for another matching ID
            }
        }
    }
    close($vcf_handle);
} else {
    die "USAGE: vcf-search ids.txt input.vcf\n";
}