@liavkoren
Created July 6, 2017 18:45
Incorrectly calculating bounce rates
from collections import defaultdict

# Each row of site_data.csv is read as "first,second", presumably one page transition.
# first == '-1' is treated as the start of a session, second == 'B' as a bounce.
initial_pages = defaultdict(int)
bounce_rates = defaultdict(int)
with open('site_data.csv') as f:
    for line in f:
        first, second = line.rstrip().split(',')
        if first == '-1':
            initial_pages[second] += 1
        if second == 'B':
            bounce_rates[first] += 1

print('Initial states:')
total_pages = sum(initial_pages.values())
initial_page_probabilities = {key: value / total_pages for key, value in initial_pages.items()}
for page, probability in initial_page_probabilities.items():
    print(f'Page {page}: {probability}')

print('Bounce rates:')
# Dividing by the total number of bounces gives each page's share of all bounces,
# not the fraction of visits to that page that ended in a bounce.
total_bounces = sum(bounce_rates.values())
bounce_rate_probabilities = {key: value / total_bounces for key, value in bounce_rates.items()}
for page, probability in bounce_rate_probabilities.items():
    print(f'Page {page} bounce rate: {probability}')
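For contrast, here is a minimal sketch of a per-page bounce rate, where each page's bounces are divided by the number of transitions out of that page rather than by the total number of bounces. It assumes the same row layout as the script above (`first,second`, with `-1` marking a session start and `B` marking a bounce). The function name `per_page_bounce_rates` and the sample rows are hypothetical, made up for illustration, and not taken from `site_data.csv`.

```python
from collections import Counter


def per_page_bounce_rates(lines):
    """Return {page: bounces from page / all transitions out of page}."""
    visits = Counter()   # transitions leaving each page, bounces included
    bounces = Counter()  # transitions from each page that ended in 'B'
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        first, second = line.split(',')
        if first == '-1':
            continue  # session-start marker, not a real page
        visits[first] += 1
        if second == 'B':
            bounces[first] += 1
    return {page: bounces[page] / count for page, count in visits.items()}


# Hypothetical sample rows, not real site data.
sample = ['-1,1', '1,B', '-1,1', '1,2', '2,B', '-1,2', '2,B']
rates = per_page_bounce_rates(sample)
for page, rate in sorted(rates.items()):
    print(f'Page {page} bounce rate: {rate}')
```

To run it against the real file, pass an open file handle instead of the sample list, for example `per_page_bounce_rates(open('site_data.csv'))`.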