Skip to content

Instantly share code, notes, and snippets.

@jtprince
Created May 30, 2013 16:12
Show Gist options
  • Save jtprince/5679115 to your computer and use it in GitHub Desktop.
Save jtprince/5679115 to your computer and use it in GitHub Desktop.
A simple script that displays unique tryptic peptides for proteins retrieved by uniprot accession number.
#!/usr/bin/env ruby
require 'open-uri'
require 'mspire/digester' # gem install mspire
require 'bio'
require 'set'
accessions = ARGV[0,2]
missed_cleavages = ARGV[2].to_i
abort 'usage: acc1 acc2 [missed_cleavages]' unless accessions.size == 2
puts "Missed cleavages: #{missed_cleavages}"
base_uri = 'http://www.uniprot.org/uniprot/'
ext = '.fasta'
Protein = Struct.new(:acc, :seq, :peptides)
trypsin = Mspire::Digester[:trypsin]
prots = accessions.map do |acc|
uri = base_uri + acc + ext
prot = Protein.new(acc, Bio::FastaFormat.new( open(uri) {|io| io.read } ).seq )
prot.peptides = trypsin.digest(prot.seq, missed_cleavages).to_set
prot
end
[:first, :last].each do |focus|
prot = prots.send(focus)
other = prots.send( (focus == :first) ? :last : :first )
uniq = (prot.peptides - other.peptides).to_a
puts "Unique to #{prot.acc} (#{uniq.size} of #{prot.peptides.size}):"
puts uniq.join(" ")
puts
if focus == :last
shared = (prot.peptides & other.peptides).to_a
puts "Shared (#{shared.size}): "
puts shared.join(" ")
puts
end
end
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment