@paulghaddad
Last active August 29, 2015 14:10
Exercism: Hamming
# Hamming
Write a program that can calculate the Hamming distance between two DNA strands.

A mutation is simply a mistake that occurs during the creation or
copying of a nucleic acid, in particular DNA. Because nucleic acids are
vital to cellular functions, mutations tend to cause a ripple effect
throughout the cell. Although mutations are technically mistakes, a very
rare mutation may equip the cell with a beneficial attribute. In fact,
the macro effects of evolution are attributable to the accumulated
result of beneficial microscopic mutations over many generations.

The simplest and most common type of nucleic acid mutation is a point
mutation, which replaces one base with another at a single nucleotide.

By counting the number of differences between two homologous DNA strands
taken from different genomes with a common ancestor, we get a measure of
the minimum number of point mutations that could have occurred on the
evolutionary path between the two strands.

This is called the 'Hamming distance'.

    GAGCCTACTAACGGGAT
    CATCGTAATGACGGCCT
    ^ ^ ^  ^ ^    ^^

The Hamming distance between these two DNA strands is 7.
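
As a rough check of the example above, the count can be reproduced in a few lines of Ruby. This is a minimal sketch, assuming both strands have the same length; `hamming_distance` is a hypothetical helper name used only for illustration and is not part of the submission further down.

```ruby
# Hypothetical helper for illustration only; assumes equal-length strands.
def hamming_distance(first, second)
  # Pair up the bases position by position and count the pairs that differ.
  first.chars.zip(second.chars).count { |a, b| a != b }
end

puts hamming_distance('GAGCCTACTAACGGGAT', 'CATCGTAATGACGGCCT') # prints 7
```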
# Implementation notes
The Hamming distance is only defined for sequences of equal length. Because
the definition says nothing about strands of unequal length, each language
track may handle that case differently.
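
Two common ways of handling unequal lengths can be sketched as follows. Neither is prescribed by the problem statement; the module and method names (`HammingPolicies`, `strict`, `truncating`) are assumptions made up for this illustration.

```ruby
# Hypothetical module showing two possible policies for unequal-length strands.
module HammingPolicies
  # Strict policy: reject input that the definition does not cover.
  def self.strict(first, second)
    unless first.length == second.length
      raise ArgumentError, 'strands must be of equal length'
    end

    first.chars.zip(second.chars).count { |a, b| a != b }
  end

  # Truncating policy: compare only the positions both strands share.
  def self.truncating(first, second)
    overlap = [first.length, second.length].min
    (0...overlap).count { |i| first[i] != second[i] }
  end
end
```

The truncating policy matches what the tests in this gist expect, since they ignore any extra length on the longer strand. The strict policy stays closer to the mathematical definition.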
## Source
The Calculating Point Mutations problem at Rosalind [view source](http://rosalind.info/problems/hamm/)
####### Submission for Hamming Exercism Problem #########
class Hamming
  # Returns the number of positions at which the two strands differ.
  # Extra bases on the longer strand are ignored.
  def self.compute(strand, comparison_strand)
    return 0 if strand == comparison_strand

    compute_difference(strand.chars, comparison_strand.chars)
  end

  # Walks the shorter strand and counts mismatches against the longer one.
  def self.compute_difference(strand_1, strand_2)
    shorter_strand, longer_strand = order_strands([strand_1, strand_2])
    shorter_strand.each_with_index.count do |base, index|
      base != longer_strand[index]
    end
  end

  # Sorts the strands by length, shortest first.
  def self.order_strands(strands)
    strands.sort_by(&:size)
  end
end
require 'minitest/autorun'
require_relative 'hamming'

# Minitest::Test is the test base class in Minitest 5 and later.
class HammingTest < Minitest::Test
  def test_no_difference_between_identical_strands
    assert_equal 0, Hamming.compute('A', 'A')
  end

  def test_complete_hamming_distance_of_for_single_nucleotide_strand
    assert_equal 1, Hamming.compute('A', 'G')
  end

  def test_complete_hamming_distance_of_for_small_strand
    assert_equal 2, Hamming.compute('AG', 'CT')
  end

  def test_small_hamming_distance
    assert_equal 1, Hamming.compute('AT', 'CT')
  end

  def test_small_hamming_distance_in_longer_strand
    assert_equal 1, Hamming.compute('GGACG', 'GGTCG')
  end

  def test_ignores_extra_length_on_first_strand_when_longer
    assert_equal 1, Hamming.compute('AGAGACTTA', 'AAA')
  end

  def test_ignores_extra_length_on_other_strand_when_longer
    assert_equal 2, Hamming.compute('AGG', 'AAAACTGACCCACCCCAGG')
  end

  def test_large_hamming_distance
    assert_equal 4, Hamming.compute('GATACA', 'GCATAA')
  end

  def test_hamming_distance_in_very_long_strand
    assert_equal 9, Hamming.compute('GGACGGATTCTG', 'AGGACGGATTCT')
  end
end