The Embedly Challenge, http://apply.embed.ly
# Problem 1 of 3: Math
# ====================
#
# n! means n * (n - 1) * ... * 3 * 2 * 1
# For example, 10! = 10 * 9 * ... * 3 * 2 * 1 = 3628800
# Let R(n) equal the sum of the digits in the number n!
# For example, R(10) is 3 + 6 + 2 + 8 + 8 + 0 + 0 = 27.
# Find the lowest value for n where R(n) is 8001.
#
# Result: 787.

# Tail-recursive factorial with an accumulator.
def fact(num, acc = 1)
  return acc if num <= 1
  fact(num - 1, acc * num)
end

# Sum of the decimal digits of num.
def sum(num)
  num.to_s.each_char.inject(0) { |res, ch| res + ch.to_i }
end

i = 10
loop do
  if sum(fact(i)) == 8001
    p i
    break
  end
  i += 1
end
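As a quick cross-check (a sketch, not part of the original solution): building the factorial incrementally avoids recomputing it from scratch for every candidate n, and should reproduce both the worked example R(10) = 27 and the stated answer of 787. The `digit_sum` helper and the 1000 search bound are illustrative choices, not from the original.

```ruby
# Incremental search: carry n! forward instead of recomputing it each step.
def digit_sum(num)
  num.to_s.each_char.inject(0) { |acc, ch| acc + ch.to_i }
end

f = 1
answer = nil
(1..1000).each do |n|
  f *= n # f is now n!
  if digit_sum(f) == 8001
    answer = n
    break
  end
end
p answer
```

This runs in a few seconds at most, since each step is a single bignum multiplication rather than a fresh factorial.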
# Problem 2 of 3: HTML
# ====================
#
# One way to exclude miscellaneous text from an article is to find the standard deviation
# of the depth of the <p> tags for the <article>. For http://apply.embed.ly/static/data/2.html
# find the standard deviation of all the <p> tags within the <article> tag. Round to the nearest tenth: X.X.
#
# Result: 1.4.
require 'nokogiri'
require 'open-uri'

# Depth of an element relative to the enclosing <article> tag.
def depth(element, num = 1)
  return num if element.parent.name == 'article'
  depth(element.parent, num + 1)
end

s = 'http://apply.embed.ly/static/data/2.html'
doc = Nokogiri::HTML(URI.open(s)) # Kernel#open no longer opens URLs on Ruby 3+
res = doc.xpath('//p').map { |el| depth(el) }
#=> [1, 2, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 2, 2, 1, 3, 1]
avg = res.sum.to_f / res.size
diffs = res.map { |n| (n - avg)**2 }
p Math.sqrt(diffs.sum / res.size).round(1)
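The recorded depth array makes the arithmetic checkable offline (a sketch; the array is copied from the `#=>` comment above, so no network access or Nokogiri is needed):

```ruby
# Population standard deviation of the recorded <p> depths.
depths = [1, 2, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 4,
          5, 5, 5, 5, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 2, 2, 1, 3, 1]
mean = depths.sum.to_f / depths.size
variance = depths.map { |d| (d - mean)**2 }.sum / depths.size
p Math.sqrt(variance).round(1) # => 1.4
```

Note this is the population standard deviation (dividing by n, as the original does), not the sample standard deviation (dividing by n - 1).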
# Problem 3 of 3: Zipf's law
# ==========================
#
# A simplified version of Zipf's law:
# "For a given body of text, the most frequent word will occur approximately twice as often
# as the second most frequent word, three times as often as the third most frequent word, etc.
# [x, x/2, x/3, x/4, x/5, ...]"
# The following is a frequency set of words in a body of text that follows Zipf's law:
# [
#   ('the', 2520),
#   ('of', 1260),
#   ('and', 840),
#   ('a', 630),
#   ('to', 504)
#   ...
# ]
# Given that the text has 900 unique words, how many unique words, starting with the most frequently used word, make up half the text?
#
# Result: 22.

# Total word count implied by Zipf's law: the (n+1)-th most frequent
# of the 900 unique words occurs round(2520 / (n + 1)) times.
count = 0
num = 2520
900.times { |n| count += (num.to_f / (n + 1)).round }
half = (count.to_f / 2).round

# Accumulate frequencies from the most common word down
# until half the text is covered.
sum = 0
res = 0
900.times do |n|
  break if sum >= half
  sum += (num.to_f / (n + 1)).round
  res += 1
end
p res
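The same answer falls out of harmonic numbers (a sketch that skips the per-word rounding above, so the word counts differ slightly but the result agrees): the total text length is 2520 * H_900, where H_k = 1 + 1/2 + ... + 1/k, and we want the smallest m with H_m >= H_900 / 2. The 2520 factor cancels entirely.

```ruby
# Cumulative harmonic numbers H_1 .. H_900.
hs = []
acc = 0.0
(1..900).each do |k|
  acc += 1.0 / k
  hs << acc
end

# Smallest m such that H_m >= H_900 / 2.
target = hs.last / 2
m = hs.index { |h| h >= target } + 1
p m # => 22
```

H_22 is about 3.6908 and H_900 / 2 about 3.6901, so the threshold is crossed exactly at the 22nd word, matching the rounding-based loop above.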