Skip to content

Instantly share code, notes, and snippets.

@jasonhdavis
Last active November 19, 2018 14:29
Show Gist options
  • Save jasonhdavis/73664baf24bdb595ffe0b66d01703c01 to your computer and use it in GitHub Desktop.
Save jasonhdavis/73664baf24bdb595ffe0b66d01703c01 to your computer and use it in GitHub Desktop.
Efficient MongoDB query in Python to group results by day. This uses a single, simple find query to get all results within a date range. We then build a nested dictionary of [year][month][day] and add results to the dictionary structure
################################
########## WHY & How ###########
################################
# I had written a query inside of a loop that pulled results for every day
# This was very ineffecient
# While I belive this could be solved with a MongoDB Aggrigation, the syntax seemed complicated
# I could not find good examples for what I needed to do (aggrigate results of a day based on timestamp in one query)
# This is likely due to my ignorance
# Nonetheless, I thought this method may be helpful to someone else having the same issues as me
# Finally, dealing with nested dictionaries in python is slightly difficult as a nested key needs to be added with 'update'
# In a non-nested dictionary, the syntax is simple, however, when nesting things get complicated
# you can not use update for each iteration because it overwrites the previous value
# This was solved by building a key list for Year and Month as they are added to the dictionary
# This keylist is converted to a String because if your date range is over 12 months, you will duplicate the month key
################################
########## Import ##############
################################
from datetime import datetime, timedelta
from pymongo import MongoClient
from pprint import pprint
################################
###### Build Dictionary ########
################################
# In my app, user supplies a date range
end = datetime.today()
start = end - timedelta(days=600)
delta = end - start
date_dict={}
date_dict.update({start.year:{start.month:{start.day: start}}})
# Incriment over days in the range, setting each key value to 0
y_keys = [str(start.year)]
m_keys = [str(start.month)+"-"+str(start.year)]
for i in range(delta.days + 1):
iter_date = start+timedelta(days=i)
year = iter_date.year
month = iter_date.month
day = iter_date.day
if str(year) in y_keys and str(month)+"-"+str(year) in m_keys :
date_dict[year][month][day] = 0
elif str(year) in y_keys :
date_dict[year].update({month:{day:0}})
m_keys.append(str(month)+"-"+str(year))
else :
date_dict.update({year:{month:{day:0}}})
y_keys.append(str(year))
m_keys.append(str(month)+"-"+str(year))
################################
# MongDB Query & Data Handling #
################################
mongo = MongoClient('mongodb://localhost:27017')
range_search = mongo.sales.orders.find({'paymentDate':{'$lt': end, '$gt':start}})
for item in range_search :
date = item['paymentDate']
date_dict[date.year][date.month][date.day]+= item['orderTotal']
################################
########## Results #############
################################
## A dictionary of days inside of the requested date range
## order totals are summed based on the payment date timestamp
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment