Last active
November 25, 2021 03:27
Convert ISO 8601 durations to seconds.
import re
import string
from functools import reduce

def durationToSecondsRegex(duration: str) -> int:
    """
    Parse duration string; return integer number of seconds.

    For example, the duration string "1h2m3s" (1 hour, 2 minutes, 3 seconds)
    would return 3723 seconds.

    The duration string format is similar to ISO 8601.  Any of the units may
    be omitted, but they must appear in the order of "h", "m", then "s".
    The values of each unit must be an integer.

    This is the regular expression version.
    """
    parsedDuration = re.match(
        r'^(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?$', duration)
    if parsedDuration is None:
        # Without this check, a bad string fails with an opaque AttributeError.
        raise ValueError(f'Invalid duration string, "{duration}".')
    # Fold (h, m, s) into seconds: (h * 60 + m) * 60 + s.
    return reduce(lambda x, y: 60 * x + y,
                  (0 if not n else int(n) for n in parsedDuration.groups()))

def durationToSeconds(duration: str) -> int:
    """
    Parse duration string; return integer number of seconds.

    For example, the duration string "1h2m3s" (1 hour, 2 minutes, 3 seconds)
    would return 3723 seconds.

    The duration string format is similar to ISO 8601.  Any of the units may
    be omitted, and they may appear in any order.  If any unit is repeated,
    the last one found is the only one retained.  The values of each unit must
    be an integer.

    This version doesn't use regular expressions.  It is faster, more
    flexible, and gives better diagnostics.
    """
    # Insertion order (h, m, s) matters: the final reduce() relies on it.
    timeParts = {'h': 0, 'm': 0, 's': 0}
    validUnits = timeParts.keys()
    digits = ''
    for c in duration:
        if c in string.digits:
            digits += c
        elif c in validUnits:
            if not digits:
                raise ValueError(
                    f'Found unit "{c}" without preceding digits.')
            timeParts[c] = int(digits)
            digits = ''
        else:
            raise ValueError(f'Found invalid character, "{c}".')
    if digits:
        # Trailing digits such as the "30" in "1h30" would otherwise be dropped.
        raise ValueError(f'Found digits "{digits}" without a following unit.')
    # Fold (h, m, s) into seconds: (h * 60 + m) * 60 + s.
    return reduce(lambda x, y: 60 * x + y, timeParts.values())

if __name__ == '__main__':
    for (duration, expectedSeconds) in {
        '1h2m3s': 3723,
        '9h42s': 32442,
        '1h': 3600,
        '1m': 60,
        '32s': 32,
        '9m': 540,
    }.items():
        testSeconds = durationToSeconds(duration)
        print(repr(duration), expectedSeconds, testSeconds)
        assert expectedSeconds == testSeconds
TODO:
- Add support for upper/lower case.
- Add support for optional ISO 8601 "P" (period) and "T" (time) marker characters. Maybe make it so that if "P" is given, all other characters are ignored until after "T".
- Consider using a unit multiplier ordered dictionary like {'h': 3600, 'm': 60, 's': 1} to specify the order of the units and to allow for the possible expansion to include days and weeks, which use multipliers that are not multiples of 60. It may even be faster than the functools.reduce technique.
- Similar: https://stackoverflow.com/a/35159973/543738 and https://stackoverflow.com/a/16742742/543738
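
The unit-multiplier idea from the TODO list could be sketched roughly as follows. This is only an illustration, not part of the gist: the names `UNIT_SECONDS` and `durationToSecondsMultiplier` are hypothetical, the "d" and "w" units are the proposed expansion, and lower-casing the input is one possible way to handle upper/lower case. Unlike `durationToSeconds`, this sketch adds repeated units together instead of keeping only the last one, and it does not handle the "P" and "T" markers.

```python
import string

# Hypothetical multiplier table: seconds per unit.
# Days and weeks are the assumed expansion mentioned in the TODO list.
UNIT_SECONDS = {'w': 7 * 86400, 'd': 86400, 'h': 3600, 'm': 60, 's': 1}


def durationToSecondsMultiplier(duration: str) -> int:
    """Sum each "<digits><unit>" pair, scaled by that unit's multiplier."""
    total = 0
    digits = ''
    for c in duration.lower():  # accept upper- or lower-case units
        if c in string.digits:
            digits += c
        elif c in UNIT_SECONDS:
            if not digits:
                raise ValueError(
                    f'Found unit "{c}" without preceding digits.')
            total += int(digits) * UNIT_SECONDS[c]
            digits = ''
        else:
            raise ValueError(f'Found invalid character, "{c}".')
    if digits:
        raise ValueError(f'Found digits "{digits}" without a following unit.')
    return total


print(durationToSecondsMultiplier('1h2m3s'))  # 3723
print(durationToSecondsMultiplier('1D'))      # 86400
```

Because each unit carries its own multiplier, the result no longer depends on the order of the dictionary, which is what makes units that are not multiples of 60 straightforward to add.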
Add a shell alternative:
TIMEOUT='00:30:30'
echo $(date -j -f '%F %T %z' "1970-01-01 ${TIMEOUT} +0000" '+%s')
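
The `-j` flag in the shell line above belongs to BSD/macOS `date`, so it will not run with GNU `date`. For comparison, the same "HH:MM:SS" arithmetic might be sketched in Python like this. The function name `clockToSeconds` is hypothetical, and the sketch assumes exactly three colon-separated integer fields.

```python
def clockToSeconds(clock: str) -> int:
    """Convert an "HH:MM:SS" string to an integer number of seconds."""
    # Assumes exactly three integer fields; anything else raises ValueError.
    hours, minutes, seconds = (int(part) for part in clock.split(':'))
    return (hours * 60 + minutes) * 60 + seconds


print(clockToSeconds('00:30:30'))  # 1830
```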
I mostly did this as a refresher exercise in using regular expressions with Python, but this could be a useful function. There are probably more robust implementations of this in other libraries. This may have the advantage of being much more lightweight if one only needs to convert ISO 8601 duration strings into seconds.
My colleagues prefer to avoid regular expressions, because they can be tricky to get right and to maintain later. I made a non-regex version of this and compared its performance with the regex one. The one without regex is faster, more flexible, and gives better diagnostics.
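
A comparison like the one described could be run with the standard `timeit` module. The sketch below is an assumption about how such a measurement might look, not the measurement actually used. `regexVersion` and `loopVersion` are condensed, hypothetical stand-ins for the two functions in the gist, and `compareTimings` is a made-up helper. Absolute timings will vary with the machine and the Python version.

```python
import re
import timeit
from functools import reduce


def regexVersion(duration: str) -> int:
    # Condensed stand-in for durationToSecondsRegex (no error handling).
    m = re.match(r'^(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?$', duration)
    return reduce(lambda x, y: 60 * x + y,
                  (int(n) if n else 0 for n in m.groups()))


def loopVersion(duration: str) -> int:
    # Condensed stand-in for durationToSeconds.
    parts = {'h': 0, 'm': 0, 's': 0}
    digits = ''
    for c in duration:
        if c in '0123456789':
            digits += c
        elif c in parts:
            parts[c] = int(digits)
            digits = ''
        else:
            raise ValueError(f'Found invalid character, "{c}".')
    return reduce(lambda x, y: 60 * x + y, parts.values())


def compareTimings(duration: str = '1h2m3s', number: int = 100_000) -> dict:
    """Return total seconds taken by `number` calls of each version."""
    return {
        fn.__name__: timeit.timeit(lambda: fn(duration), number=number)
        for fn in (regexVersion, loopVersion)
    }
```

Calling `compareTimings()` returns a dictionary keyed by function name, so the two totals can be printed side by side.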