himansudesai/average_time_of_day.rb Secret

## average_time_of_day.rb
# Himansu Desai (himansudesai@hotmail.com)

require 'cyc_time.rb'
require 'set.rb'

=begin
Program Notes:
-------------------
1. The algorithm/approach was more interesting for this problem than actual coding techniques + the code was getting large so
    an attempt was made to reduce program size by not doing much error checking or other features that would be needed in a
    real deployed application.  Hopefully that does not disqualify my contest entry.
2. The program was shortened to not use optargs or spit out useful error messages to the user.  It does not accept spaces
    between times.  Hopefully this will not disqualify the entry.  The program can be used as:
    ruby average_time_of_day "7:01am","7:02am","7:03am"  - without space after the comma.

Algorithm Notes:
---------------------
1. Times are assumed to fall within a contiguous 24 HOUR period.  There seemed no other obvious ways to deal with this.  If the
    flight arrives at 2pm and 3pm, there is no way to know if it is 3pm the next day or 10 days from now or 1 day early.  So all times
    are treated as falling within one circular 24 hr period.
2. Given a set of times, a chain of segments are created between them to do further analysis.
3. Time is treated as cyclical.  A "segment" has two times (head and tail) and a length in minutes.  Given two times, there can only
    be one segment between them, the one that yields in a smaller segment.  For example, between 7pm and 6am, there is a segment
    with 7pm as the head and 6am as the tail and it is 11 hrs long.  The other segment with 6am as the head would be 13 hrs long and
    is not valid.  If the times are exactly 12 hrs off, then we take any one of the two segments for further analysis.
4.  Out of multiple valid segments that have time t1 as the 'head', take the shortest one of them and that is the ONE segment with
    t1 as the head that we will now use for creating a chain
5. Given a chain of segments, it can be "closed" or "open".  For example, the 3 segments
                    7pm(head)-->1am(tail)
                    1am(head)-->4am(tail) and
                    4am(head)-->6am(tail)
                    constitute an open chain because there is no valid segment from 6am to 7pm closing it.  On the other hand, the 4 segments
                    7pm(head)-->1am(tail)
                    1am(head)-->4am(tail) and
                    4am(head)-->10am(tail)
                    10am(head)-->7pm(tail)
                    constitute a closed chain because the tail of the last segment is the head of the first.

    a) For open chains, it is easy - use the head of the first segment as the head of the entire chain.  Compute differences from that head
        time and then add the average difference back to the head time to get the average flight arrival time.  NOTE:  If there is more than
        one arrival at the same time, this averaging method give extra weight to that time because that time will be counted multiple times.
    b) For close chains, throw out the "largest" segment.  What remains is the chain we use to compute the average per (a) above.  The
        idea is that time is circular so we pick the times that are closely clustered together to figure out when the "earliest" (head of the chain)
        arrival time is and when the "latest" (tail of the chain) is.
        Consider 7pm, 12am, 5am, 10am.  The way we know 7pm as the earliest time and 10am as the latest time (instead of any other
        combination) is that the segment from 10am to 7pm is 9hrs long which is longer than any of the other 5 hr segments.
6. Times can be uniformly spread across a 24 hr period - For example 12:10am, 12:10pm, 6:10am, 6:10pm
    a) Uniform spread with no duplicates - OUT OF LUCK.  You don't have a good guess on when you need to be at the airport
    b) Uniform spread with duplicate times - Take the time with the largest recurrence and suggest that to the user as when the plane is
        most likely to arrive.
=end

all_times = []
ARGV.first.split(",").each do |e|
    all_times << CycTime.new(e)
end

unique_times = CycTime.remove_duplicates(all_times)

duplicate_times = (unique_times.size < all_times.size) ? true : false

segments = CycTime.compute_segments(unique_times)
if (segments.size == 1)
    seg = segments[segments.keys[0]]
    head = seg[:head]
    tail = seg[:tail]
    diff = ( tail.minutes_from(head) / 2).to_i
    avg_time = head.add(diff)
    puts "Average flight arrival time = #{avg_time}"
else
    seg_chain = CycTime.analyze_segments(segments)
    if (seg_chain[:uniform_segments])
        if (!duplicate_times)
            puts "WARNING!  Times are uniformly distributed over a 24 hours period. Show up at the airport when you can."
        else
            time = CycTime.most_repeating(all_times)
            puts "Average flight arrival time = #{time}"
        end
    else
        head = seg_chain[:head]
        total_time = 0
        all_times.each do |t|
            total_time += t.minutes_from(head)
        end
        avg_diff = (total_time / (all_times.size)).to_i
        puts "#{head.add(avg_diff)}"
    end
end

## cyc_time.rb
# Himansu Desai (himansudesai@hotmail.com)

class CycTime < Object
    attr_accessor :hrs, :mins, :pm
    class CycSegment  < Struct.new(:head, :tail, :mins); end
    class CycSegmentChain < Struct.new(:head, :uniform_segments); end

    def self.remove_duplicates(times)
        h = Hash.new
        midnight = CycTime.new("12:00am")
        times.each {|t| h[t.minutes_from(midnight)] = t}
        s = Set.new
        h.each_value { |t|  s << t}
        s
    end

    def self.most_repeating(times)
        h = Hash.new
        midnight = CycTime.new("12:00am")
        times.each do |t|
            mins = t.minutes_from(midnight)
        end
        times.each { |t| (h[t] == nil) ? h[t] = 1 : h[t] = h[t] + 1}
        max_count = 0
        max_time = nil
        h.each_pair { |time, count| if h[time] > max_count then max_count = h[time]; max_time = time; end }
        max_time
    end

    def self.compute_segments(times)
        segments = {}
        times.each do | t |
            compare_list = times.clone.delete(t)
            short_seg = segments[t]
            compare_list.each do |other|
                cur_seg = t.circ_segment(other)
                short_seg = cur_seg if ( (cur_seg[:head] == t) && ((short_seg == nil) || (short_seg[:mins] > cur_seg[:mins]) ))
            end
            segments[t] = short_seg if (short_seg != nil)
        end
        segments
    end

    def self.analyze_segments(segments)
        segs = segments.clone
        head_seg = tail_seg = segs.delete(segs.keys[0])
        count = segs.size
        while (count > 1)
            segs.each_pair do |time, seg|
                if (seg[:tail] == head_seg[:head])
                    head_seg = seg
                    count -= 1
                    segs.delete(time)
                    break
                elsif (seg[:head] == tail_seg[:tail])
                    tail_seg = seg
                    count -= 1
                    segs.delete(time)
                    break
                end
            end
        end
        last_seg = segs[segs.keys[0]]
        if (last_seg[:head] == tail_seg[:tail])
            tail_seg = last_seg
        end
        if (last_seg[:tail] == head_seg[:head])
            head_seg = last_seg
        end
        if (head_seg == tail_seg)
            seg_size = nil
            identical = true
            segments.each_value do |seg|
                seg_size = seg[:mins] if (seg_size == nil)
                identical = false unless seg[:mins] == seg_size
            end
            if identical then return CycSegmentChain.new(segments[segments.keys[0]][:head], true); end

            largest_seg = nil
            segments.each_pair { |time, seg| largest_seg = seg if ( (largest_seg == nil) || (seg[:mins] > largest_seg[:mins]) ) }
            return CycSegmentChain.new(largest_seg[:tail], false)
        else
            return CycSegmentChain.new(head_seg[:head], false)
        end
    end

    def add(mins)
        am_pm = self.pm
        add_hrs = mins / 60
        add_mins = mins % 60
        new_mins = self.mins + add_mins
        if (new_mins > 59)
            new_hrs = self.hrs + add_hrs + 1
            new_mins = new_mins - 60
        else
            new_hrs = self.hrs + add_hrs
        end
        if (new_hrs > 11)
            new_hrs = new_hrs - 12
            am_pm = !am_pm
        end
        if (new_hrs == 0) then new_hrs = 12; end
        as_str(new_hrs, new_mins, am_pm)
#        new_mins = "#{new_mins}".rjust(2, '0')
#        str = "#{new_hrs}:#{new_mins}#{(am_pm) ? 'pm' : 'am'}"
#        str
    end

    def to_s
        as_str(self.hrs, self.mins, self.pm)
#        two_digit_mins = "#{self.mins}".rjust(2, '0')
#        "#{self.hrs}:#{two_digit_mins}#{(self.pm) ? 'pm' : 'am'}"
    end

    def mins_12hr_format
        (self.hrs == 12) ? (self.mins) : (self.hrs * 60) + self.mins
    end

    def minutes_from(time)
        if (self.pm == time.pm)
            diff = self.mins_12hr_format - time.mins_12hr_format if (self.pm == time.pm)
            return (diff < 0) ? (1440 - diff) : diff
        end
        return 720 - (time.mins_12hr_format) + self.mins_12hr_format
    end

    def initialize(time)
        if (time =~ /\d*:\d*\a*/)
            time =~ /(\d*):(\d*)(\D*)$/
            self.hrs = $1.to_i
            self.mins = $2.to_i
            self.pm = false
            self.pm = true if ($3 == 'pm')
        end
    end

    def circ_segment(other)
        if (self.pm == other.pm)
            if (other.mins_12hr_format > self.mins_12hr_format)
                return CycSegment.new(self, other,  (other.mins_12hr_format - self.mins_12hr_format))
            else
                return CycSegment.new(other, self, (self.mins_12hr_format - other.mins_12hr_format));
            end
        else
            bigger = other; smaller = self
            if (self.mins_12hr_format > other.mins_12hr_format) then bigger = self;  smaller = other; end
            return CycSegment.new(bigger, smaller, (720 - bigger.mins_12hr_format) + smaller.mins_12hr_format)
        end
    end

    private
    def as_str(h, m, pm)
        two_digit_mins = "#{m}".rjust(2, '0')
        "#{h}:#{two_digit_mins}#{(pm) ? 'pm' : 'am'}"
    end
end
	# Himansu Desai (himansudesai@hotmail.com)

	require 'cyc_time.rb'
	require 'set.rb'

	=begin
	Program Notes:
	-------------------
	1. The algorithm/approach was more interesting for this problem than actual coding techniques + the code was getting large so
	an attempt was made to reduce program size by not doing much error checking or other features that would be needed in a
	real deployed application. Hopefully that does not disqualify my contest entry.
	2. The program was shortened to not use optargs or spit out useful error messages to the user. It does not accept spaces
	between times. Hopefully this will not disqualify the entry. The program can be used as:
	ruby average_time_of_day "7:01am","7:02am","7:03am" - without space after the comma.

	Algorithm Notes:
	---------------------
	1. Times are assumed to fall within a contiguous 24 HOUR period. There seemed no other obvious ways to deal with this. If the
	flight arrives at 2pm and 3pm, there is no way to know if it is 3pm the next day or 10 days from now or 1 day early. So all times
	are treated as falling within one circular 24 hr period.
	2. Given a set of times, a chain of segments are created between them to do further analysis.
	3. Time is treated as cyclical. A "segment" has two times (head and tail) and a length in minutes. Given two times, there can only
	be one segment between them, the one that yields in a smaller segment. For example, between 7pm and 6am, there is a segment
	with 7pm as the head and 6am as the tail and it is 11 hrs long. The other segment with 6am as the head would be 13 hrs long and
	is not valid. If the times are exactly 12 hrs off, then we take any one of the two segments for further analysis.
	4. Out of multiple valid segments that have time t1 as the 'head', take the shortest one of them and that is the ONE segment with
	t1 as the head that we will now use for creating a chain
	5. Given a chain of segments, it can be "closed" or "open". For example, the 3 segments
	7pm(head)-->1am(tail)
	1am(head)-->4am(tail) and
	4am(head)-->6am(tail)
	constitute an open chain because there is no valid segment from 6am to 7pm closing it. On the other hand, the 4 segments
	7pm(head)-->1am(tail)
	1am(head)-->4am(tail) and
	4am(head)-->10am(tail)
	10am(head)-->7pm(tail)
	constitute a closed chain because the tail of the last segment is the head of the first.

	a) For open chains, it is easy - use the head of the first segment as the head of the entire chain. Compute differences from that head
	time and then add the average difference back to the head time to get the average flight arrival time. NOTE: If there is more than
	one arrival at the same time, this averaging method give extra weight to that time because that time will be counted multiple times.
	b) For close chains, throw out the "largest" segment. What remains is the chain we use to compute the average per (a) above. The
	idea is that time is circular so we pick the times that are closely clustered together to figure out when the "earliest" (head of the chain)
	arrival time is and when the "latest" (tail of the chain) is.
	Consider 7pm, 12am, 5am, 10am. The way we know 7pm as the earliest time and 10am as the latest time (instead of any other
	combination) is that the segment from 10am to 7pm is 9hrs long which is longer than any of the other 5 hr segments.
	6. Times can be uniformly spread across a 24 hr period - For example 12:10am, 12:10pm, 6:10am, 6:10pm
	a) Uniform spread with no duplicates - OUT OF LUCK. You don't have a good guess on when you need to be at the airport
	b) Uniform spread with duplicate times - Take the time with the largest recurrence and suggest that to the user as when the plane is
	most likely to arrive.
	=end

	all_times = []
	ARGV.first.split(",").each do \|e\|
	all_times << CycTime.new(e)
	end

	unique_times = CycTime.remove_duplicates(all_times)

	duplicate_times = (unique_times.size < all_times.size) ? true : false

	segments = CycTime.compute_segments(unique_times)
	if (segments.size == 1)
	seg = segments[segments.keys[0]]
	head = seg[:head]
	tail = seg[:tail]
	diff = ( tail.minutes_from(head) / 2).to_i
	avg_time = head.add(diff)
	puts "Average flight arrival time = #{avg_time}"
	else
	seg_chain = CycTime.analyze_segments(segments)
	if (seg_chain[:uniform_segments])
	if (!duplicate_times)
	puts "WARNING! Times are uniformly distributed over a 24 hours period. Show up at the airport when you can."
	else
	time = CycTime.most_repeating(all_times)
	puts "Average flight arrival time = #{time}"
	end
	else
	head = seg_chain[:head]
	total_time = 0
	all_times.each do \|t\|
	total_time += t.minutes_from(head)
	end
	avg_diff = (total_time / (all_times.size)).to_i
	puts "#{head.add(avg_diff)}"
	end
	end
	# Himansu Desai (himansudesai@hotmail.com)

	class CycTime < Object
	attr_accessor :hrs, :mins, :pm
	class CycSegment < Struct.new(:head, :tail, :mins); end
	class CycSegmentChain < Struct.new(:head, :uniform_segments); end

	def self.remove_duplicates(times)
	h = Hash.new
	midnight = CycTime.new("12:00am")
	times.each {\|t\| h[t.minutes_from(midnight)] = t}
	s = Set.new
	h.each_value { \|t\| s << t}
	s
	end

	def self.most_repeating(times)
	h = Hash.new
	midnight = CycTime.new("12:00am")
	times.each do \|t\|
	mins = t.minutes_from(midnight)
	end
	times.each { \|t\| (h[t] == nil) ? h[t] = 1 : h[t] = h[t] + 1}
	max_count = 0
	max_time = nil
	h.each_pair { \|time, count\| if h[time] > max_count then max_count = h[time]; max_time = time; end }
	max_time
	end

	def self.compute_segments(times)
	segments = {}
	times.each do \| t \|
	compare_list = times.clone.delete(t)
	short_seg = segments[t]
	compare_list.each do \|other\|
	cur_seg = t.circ_segment(other)
	short_seg = cur_seg if ( (cur_seg[:head] == t) && ((short_seg == nil) \|\| (short_seg[:mins] > cur_seg[:mins]) ))
	end
	segments[t] = short_seg if (short_seg != nil)
	end
	segments
	end

	def self.analyze_segments(segments)
	segs = segments.clone
	head_seg = tail_seg = segs.delete(segs.keys[0])
	count = segs.size
	while (count > 1)
	segs.each_pair do \|time, seg\|
	if (seg[:tail] == head_seg[:head])
	head_seg = seg
	count -= 1
	segs.delete(time)
	break
	elsif (seg[:head] == tail_seg[:tail])
	tail_seg = seg
	count -= 1
	segs.delete(time)
	break
	end
	end
	end
	last_seg = segs[segs.keys[0]]
	if (last_seg[:head] == tail_seg[:tail])
	tail_seg = last_seg
	end
	if (last_seg[:tail] == head_seg[:head])
	head_seg = last_seg
	end
	if (head_seg == tail_seg)
	seg_size = nil
	identical = true
	segments.each_value do \|seg\|
	seg_size = seg[:mins] if (seg_size == nil)
	identical = false unless seg[:mins] == seg_size
	end
	if identical then return CycSegmentChain.new(segments[segments.keys[0]][:head], true); end

	largest_seg = nil
	segments.each_pair { \|time, seg\| largest_seg = seg if ( (largest_seg == nil) \|\| (seg[:mins] > largest_seg[:mins]) ) }
	return CycSegmentChain.new(largest_seg[:tail], false)
	else
	return CycSegmentChain.new(head_seg[:head], false)
	end
	end

	def add(mins)
	am_pm = self.pm
	add_hrs = mins / 60
	add_mins = mins % 60
	new_mins = self.mins + add_mins
	if (new_mins > 59)
	new_hrs = self.hrs + add_hrs + 1
	new_mins = new_mins - 60
	else
	new_hrs = self.hrs + add_hrs
	end
	if (new_hrs > 11)
	new_hrs = new_hrs - 12
	am_pm = !am_pm
	end
	if (new_hrs == 0) then new_hrs = 12; end
	as_str(new_hrs, new_mins, am_pm)
	# new_mins = "#{new_mins}".rjust(2, '0')
	# str = "#{new_hrs}:#{new_mins}#{(am_pm) ? 'pm' : 'am'}"
	# str
	end

	def to_s
	as_str(self.hrs, self.mins, self.pm)
	# two_digit_mins = "#{self.mins}".rjust(2, '0')
	# "#{self.hrs}:#{two_digit_mins}#{(self.pm) ? 'pm' : 'am'}"
	end

	def mins_12hr_format
	(self.hrs == 12) ? (self.mins) : (self.hrs * 60) + self.mins
	end

	def minutes_from(time)
	if (self.pm == time.pm)
	diff = self.mins_12hr_format - time.mins_12hr_format if (self.pm == time.pm)
	return (diff < 0) ? (1440 - diff) : diff
	end
	return 720 - (time.mins_12hr_format) + self.mins_12hr_format
	end

	def initialize(time)
	if (time =~ /\d:\d\a*/)
	time =~ /(\d):(\d)(\D*)$/
	self.hrs = $1.to_i
	self.mins = $2.to_i
	self.pm = false
	self.pm = true if ($3 == 'pm')
	end
	end

	def circ_segment(other)
	if (self.pm == other.pm)
	if (other.mins_12hr_format > self.mins_12hr_format)
	return CycSegment.new(self, other, (other.mins_12hr_format - self.mins_12hr_format))
	else
	return CycSegment.new(other, self, (self.mins_12hr_format - other.mins_12hr_format));
	end
	else
	bigger = other; smaller = self
	if (self.mins_12hr_format > other.mins_12hr_format) then bigger = self; smaller = other; end
	return CycSegment.new(bigger, smaller, (720 - bigger.mins_12hr_format) + smaller.mins_12hr_format)
	end
	end

	private
	def as_str(h, m, pm)
	two_digit_mins = "#{m}".rjust(2, '0')
	"#{h}:#{two_digit_mins}#{(pm) ? 'pm' : 'am'}"
	end
	end