@stefanschmidt
Created February 4, 2022 21:35
Compare dynamic and static date parsing in Python
# In response to a Stack Overflow suggestion to parse dates with dateutil's
# parser, which does not require specifying the date format, one commenter noted:
# "Be aware that for large data amounts this might not be the most optimal
# way to approach the problem. Guessing the format every single time may
# be horribly slow." - https://stackoverflow.com/questions/466345
#
# I wondered what "horribly slow" means in practice.
#
# In a quick performance comparison, static parsing (datetime.strptime) was
# on average about 5x faster than dynamic parsing (parser.parse) across four
# timestamp formats, each populated with the current date and time. The
# comparison could be extended to more timestamp formats and to a different
# date and time for every list item.
from datetime import datetime
from dateutil import parser
import timeit

d = datetime.now()
formats = ["%m/%d/%y %I:%M:%S%p", "%d.%m.%y %H:%M:%S", "%d.%m.%Y %H:%M:%S", "%A %d. %B %Y %H:%M:%S"]

# Render the current time once in each format.
dates = []
for f in formats:
    dates.append(d.strftime(f))

def test_dynamic():
    # dateutil has to guess the format on every call.
    for f in dates:
        parser.parse(f)

def test_static():
    # strptime is given the format string that produced each date.
    for i in range(len(dates)):
        datetime.strptime(dates[i], formats[i])

print('parser.parse: ' + str(timeit.timeit("test_dynamic()", setup="from __main__ import test_dynamic", number=10000)))
print('datetime.strptime: ' + str(timeit.timeit("test_static()", setup="from __main__ import test_static", number=10000)))
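
The header comment notes that the comparison could use a different date and time for every list item. A minimal sketch of that extension follows. It is not part of the original measurement: the helper names (make_samples, run_static, run_dynamic), the 20-year date range, and the fixed seed are assumptions made for illustration, and no timings are claimed for it. The dateutil import is optional here, so the strptime half runs on the standard library alone.

```python
import random
import timeit
from datetime import datetime, timedelta

try:
    from dateutil import parser  # third-party; only needed for the dynamic run
except ImportError:
    parser = None

FORMATS = ["%m/%d/%y %I:%M:%S%p", "%d.%m.%y %H:%M:%S", "%d.%m.%Y %H:%M:%S", "%A %d. %B %Y %H:%M:%S"]


def make_samples(n, seed=0):
    """Return n (text, format) pairs, each built from a different random datetime.

    Hypothetical helper: the start date and range are arbitrary choices.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs use the same data
    start = datetime(2000, 1, 1)
    span_seconds = 20 * 365 * 24 * 3600  # roughly 20 years
    samples = []
    for i in range(n):
        moment = start + timedelta(seconds=rng.randrange(span_seconds))
        fmt = FORMATS[i % len(FORMATS)]  # cycle through the four formats
        samples.append((moment.strftime(fmt), fmt))
    return samples


def run_static(samples):
    # Static parsing: each string is parsed with the format that produced it.
    return [datetime.strptime(text, fmt) for text, fmt in samples]


def run_dynamic(samples):
    # Dynamic parsing: dateutil infers the format for every string.
    return [parser.parse(text) for text, _ in samples]


if __name__ == "__main__":
    samples = make_samples(1000)
    print("datetime.strptime:", timeit.timeit(lambda: run_static(samples), number=10))
    if parser is not None:
        print("parser.parse:", timeit.timeit(lambda: run_dynamic(samples), number=10))
```

Building the samples once, outside the timed functions, keeps string generation out of the measurement, so only parsing is timed.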