Last active
August 29, 2015 14:10
Exercism: Hamming
# Hamming

Write a program that can calculate the Hamming distance between two DNA strands.

A mutation is simply a mistake that occurs during the creation or
copying of a nucleic acid, in particular DNA. Because nucleic acids are
vital to cellular functions, mutations tend to cause a ripple effect
throughout the cell. Although mutations are technically mistakes, a very
rare mutation may equip the cell with a beneficial attribute. In fact,
the macro effects of evolution are attributable to the accumulated
result of beneficial microscopic mutations over many generations.

The simplest and most common type of nucleic acid mutation is a point
mutation, which replaces one base with another at a single nucleotide.

By counting the number of differences between two homologous DNA strands
taken from different genomes with a common ancestor, we get a measure of
the minimum number of point mutations that could have occurred on the
evolutionary path between the two strands.

This is called the 'Hamming distance'.

    GAGCCTACTAACGGGAT
    CATCGTAATGACGGCCT
    ^ ^ ^  ^ ^    ^^

The Hamming distance between these two DNA strands is 7.
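
To make the count above concrete, here is a minimal Ruby sketch that pairs up the bases of the two example strands and counts the mismatches. The method name `hamming_distance` is hypothetical and is not part of the submission in this gist; the sketch assumes both strands have the same length.

```ruby
# Hypothetical helper for illustration only; assumes equal-length strands.
def hamming_distance(first, second)
  # Pair the bases position by position, then count the pairs that differ.
  first.chars.zip(second.chars).count { |a, b| a != b }
end

puts hamming_distance('GAGCCTACTAACGGGAT', 'CATCGTAATGACGGCCT') # prints 7
```

The mismatches fall at zero-based positions 0, 2, 4, 7, 9, 14 and 15, which are the columns marked with carets in the example.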
# Implementation notes

The Hamming distance is only defined for sequences of equal length. Because
the definition says nothing about strands of unequal length, each language
may handle that case differently.
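
As an illustration of that choice, here is a minimal sketch of two possible policies for unequal lengths: rejecting the input, or comparing only the overlapping part. Both method names are hypothetical. The tests in this gist expect the second, lenient behaviour, but other tracks or later versions of the exercise may require something else.

```ruby
# Hypothetical strict variant: refuses strands of unequal length.
def strict_hamming_distance(first, second)
  unless first.length == second.length
    raise ArgumentError, 'strands must be of equal length'
  end

  first.chars.zip(second.chars).count { |a, b| a != b }
end

# Hypothetical lenient variant: compares only the positions both strands share.
def truncating_hamming_distance(first, second)
  overlap = [first.length, second.length].min
  (0...overlap).count { |i| first[i] != second[i] }
end

puts strict_hamming_distance('GGACG', 'GGTCG')       # prints 1
puts truncating_hamming_distance('AGAGACTTA', 'AAA') # prints 1
```

Calling `strict_hamming_distance('AGAGACTTA', 'AAA')` raises `ArgumentError` instead of returning a number, which makes a length mismatch visible to the caller rather than silently ignoring the extra bases.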

## Source

The Calculating Point Mutations problem at Rosalind [view source](http://rosalind.info/problems/hamm/)
####### Submission for Hamming Exercism Problem #########

class Hamming
  # Counts the positions at which the two strands differ. Extra bases on
  # the longer strand are ignored.
  def self.compute(strand, comparison_strand)
    return 0 if strand == comparison_strand

    strand_array = strand.chars
    comparison_strand_array = comparison_strand.chars

    compute_difference(strand_array, comparison_strand_array)
  end

  def self.compute_difference(strand_1, strand_2)
    shorter_strand, longer_strand = order_strands([strand_1, strand_2])
    hamming_difference = 0

    # Walk the shorter strand so positions past its end are never compared.
    shorter_strand.each_with_index do |base, index|
      hamming_difference += 1 unless base == longer_strand[index]
    end

    hamming_difference
  end

  # Returns the strands ordered from shortest to longest.
  def self.order_strands(strands)
    strands.sort_by(&:size)
  end
end
require 'minitest/autorun'
require_relative 'hamming'

class HammingTest < Minitest::Test
  def test_no_difference_between_identical_strands
    assert_equal 0, Hamming.compute('A', 'A')
  end

  def test_complete_hamming_distance_for_single_nucleotide_strand
    assert_equal 1, Hamming.compute('A', 'G')
  end

  def test_complete_hamming_distance_for_small_strand
    assert_equal 2, Hamming.compute('AG', 'CT')
  end

  def test_small_hamming_distance
    assert_equal 1, Hamming.compute('AT', 'CT')
  end

  def test_small_hamming_distance_in_longer_strand
    assert_equal 1, Hamming.compute('GGACG', 'GGTCG')
  end

  def test_ignores_extra_length_on_first_strand_when_longer
    assert_equal 1, Hamming.compute('AGAGACTTA', 'AAA')
  end

  def test_ignores_extra_length_on_other_strand_when_longer
    assert_equal 2, Hamming.compute('AGG', 'AAAACTGACCCACCCCAGG')
  end

  def test_large_hamming_distance
    assert_equal 4, Hamming.compute('GATACA', 'GCATAA')
  end

  def test_hamming_distance_in_very_long_strand
    assert_equal 9, Hamming.compute('GGACGGATTCTG', 'AGGACGGATTCT')
  end
end