@karlcow
Last active June 5, 2017 01:25
Trying different methods for extracting information
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# This Source Code Form is subject to the terms of the Mozilla Public
# License, v. 2.0. If a copy of the MPL was not distributed with this
# file, You can obtain one at http://mozilla.org/MPL/2.0/.
import urlparse
import re
import sys
BODY = "<!-- @browser: Firefox 55.0 -->\n<!-- @ua_header: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:55.0) Gecko/20100101 Firefox/55.0 -->\n<!-- @reported_with: media-decode-error -->\n\n**URL**: http://www.chia-anime.tv/player.php?id=78454\n**Browser / Version**: Firefox 55.0\n**Operating System**: Windows 10\n**Problem type**: Video doesn't play\n\n**Steps to Reproduce**\n1. Navigate to: http://www.chia-anime.tv/player.php?id=78454\r\n2. …\r\n\r\nExpected Behavior:\r\n\r\nActual Behavior:\r\n\r\nTechnical Information:\r\nError Code: NS_ERROR_DOM_MEDIA_DEMUXER_ERR (0x806e000c)\r\nDetails: class RefPtr<class mozilla::MozPromise<class mozilla::MediaResult,class mozilla::MediaResult,1> > __cdecl mozilla::MP4Demuxer::Init(void): No MP4 audio () or video () tracks\r\nResource: http://sasuke.chia-anime.tv:7779/cache/m_vTxrZNLcoL3q-_9YTC3w/1496434235/ei93pkha7qeg-650x370.html.mp4\n\n\n\n_From [webcompat.com](https://webcompat.com/) with ❤️_" # nopep8
def extract_media_info(body):
    '''Extract information from the payload body for type-media.'''
    body = body.replace('\r', '')
    match_error = re.search(r'<!-- @reported_with: media-decode-error -->\n\n\*\*URL\*\*:\s(?P<url>[^\n]+)[^T]+Technical Information:\nError Code: (?P<error>\S+)', body)  # nopep8
    url = match_error.group('url')
    domain = urlparse.urlparse(url).netloc
    media_error = match_error.group('error')
    return (media_error, domain)


def main():
    '''core program'''
    return extract_media_info(BODY)


if __name__ == "__main__":
    sys.exit(main())
→ python -m cProfile regex.py
2831 function calls (2806 primitive calls) in 0.006 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.000 0.000 <string>:1(<module>)
1 0.000 0.000 0.000 0.000 <string>:1(ParseResult)
1 0.000 0.000 0.000 0.000 <string>:1(SplitResult)
1 0.000 0.000 0.000 0.000 <string>:8(__new__)
1 0.001 0.001 0.002 0.002 collections.py:1(<module>)
1 0.000 0.000 0.000 0.000 collections.py:26(OrderedDict)
2 0.001 0.000 0.001 0.001 collections.py:293(namedtuple)
99 0.000 0.000 0.000 0.000 collections.py:337(<genexpr>)
13 0.000 0.000 0.000 0.000 collections.py:361(<genexpr>)
13 0.000 0.000 0.000 0.000 collections.py:363(<genexpr>)
1 0.000 0.000 0.000 0.000 collections.py:395(Counter)
1 0.001 0.001 0.001 0.001 heapq.py:31(<module>)
1 0.000 0.000 0.000 0.000 keyword.py:11(<module>)
1 0.000 0.000 0.001 0.001 re.py:143(search)
1 0.000 0.000 0.000 0.000 re.py:192(compile)
2 0.000 0.000 0.001 0.000 re.py:230(_compile)
1 0.000 0.000 0.001 0.001 regex.py:14(extract_media_info)
1 0.000 0.000 0.001 0.001 regex.py:25(main)
1 0.001 0.001 0.006 0.006 regex.py:7(<module>)
3 0.000 0.000 0.000 0.000 sre_compile.py:228(_compile_charset)
3 0.000 0.000 0.000 0.000 sre_compile.py:256(_optimize_charset)
4 0.000 0.000 0.000 0.000 sre_compile.py:428(_simple)
2 0.000 0.000 0.000 0.000 sre_compile.py:433(_compile_info)
4 0.000 0.000 0.000 0.000 sre_compile.py:546(isstring)
2 0.000 0.000 0.000 0.000 sre_compile.py:552(_code)
2 0.000 0.000 0.001 0.000 sre_compile.py:567(compile)
9/2 0.000 0.000 0.000 0.000 sre_compile.py:64(_compile)
18 0.000 0.000 0.000 0.000 sre_parse.py:137(__len__)
32 0.000 0.000 0.000 0.000 sre_parse.py:141(__getitem__)
4 0.000 0.000 0.000 0.000 sre_parse.py:145(__setitem__)
96 0.000 0.000 0.000 0.000 sre_parse.py:149(append)
13/6 0.000 0.000 0.000 0.000 sre_parse.py:151(getwidth)
2 0.000 0.000 0.000 0.000 sre_parse.py:189(__init__)
135 0.000 0.000 0.000 0.000 sre_parse.py:193(__next)
28 0.000 0.000 0.000 0.000 sre_parse.py:206(match)
121 0.000 0.000 0.000 0.000 sre_parse.py:212(get)
8 0.000 0.000 0.000 0.000 sre_parse.py:221(isident)
2 0.000 0.000 0.000 0.000 sre_parse.py:227(isname)
1 0.000 0.000 0.000 0.000 sre_parse.py:236(_class_escape)
9 0.000 0.000 0.000 0.000 sre_parse.py:268(_escape)
5/2 0.000 0.000 0.000 0.000 sre_parse.py:317(_parse_sub)
5/2 0.000 0.000 0.000 0.000 sre_parse.py:395(_parse)
2 0.000 0.000 0.000 0.000 sre_parse.py:67(__init__)
2 0.000 0.000 0.000 0.000 sre_parse.py:706(parse)
3 0.000 0.000 0.000 0.000 sre_parse.py:74(opengroup)
3 0.000 0.000 0.000 0.000 sre_parse.py:85(closegroup)
9 0.000 0.000 0.000 0.000 sre_parse.py:92(__init__)
1 0.000 0.000 0.000 0.000 urlparse.py:121(SplitResult)
1 0.000 0.000 0.000 0.000 urlparse.py:129(ParseResult)
1 0.000 0.000 0.000 0.000 urlparse.py:137(urlparse)
1 0.000 0.000 0.000 0.000 urlparse.py:160(_splitnetloc)
1 0.000 0.000 0.000 0.000 urlparse.py:168(urlsplit)
1 0.000 0.000 0.004 0.004 urlparse.py:29(<module>)
485 0.000 0.000 0.000 0.000 urlparse.py:332(<genexpr>)
1 0.000 0.000 0.000 0.000 urlparse.py:73(ResultMixin)
2 0.000 0.000 0.000 0.000 {_sre.compile}
13 0.000 0.000 0.000 0.000 {all}
2 0.000 0.000 0.000 0.000 {built-in method __new__ of type object at 0x1078fe428}
484 0.000 0.000 0.000 0.000 {chr}
40 0.000 0.000 0.000 0.000 {isinstance}
391/386 0.000 0.000 0.000 0.000 {len}
2 0.000 0.000 0.000 0.000 {map}
13 0.000 0.000 0.000 0.000 {method '__contains__' of 'frozenset' objects}
11 0.000 0.000 0.000 0.000 {method 'add' of 'set' objects}
408 0.000 0.000 0.000 0.000 {method 'append' of 'list' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
2 0.000 0.000 0.000 0.000 {method 'extend' of 'list' objects}
5 0.000 0.000 0.000 0.000 {method 'find' of 'bytearray' objects}
4 0.000 0.000 0.000 0.000 {method 'find' of 'str' objects}
24 0.000 0.000 0.000 0.000 {method 'format' of 'str' objects}
22 0.000 0.000 0.000 0.000 {method 'get' of 'dict' objects}
2 0.000 0.000 0.000 0.000 {method 'group' of '_sre.SRE_Match' objects}
86 0.000 0.000 0.000 0.000 {method 'isalnum' of 'str' objects}
13 0.000 0.000 0.000 0.000 {method 'isdigit' of 'str' objects}
2 0.000 0.000 0.000 0.000 {method 'items' of 'dict' objects}
4 0.000 0.000 0.000 0.000 {method 'join' of 'str' objects}
1 0.000 0.000 0.000 0.000 {method 'lower' of 'str' objects}
3 0.000 0.000 0.000 0.000 {method 'remove' of 'list' objects}
5 0.000 0.000 0.000 0.000 {method 'replace' of 'str' objects}
1 0.000 0.000 0.000 0.000 {method 'search' of '_sre.SRE_Pattern' objects}
3 0.000 0.000 0.000 0.000 {method 'split' of 'str' objects}
11 0.000 0.000 0.000 0.000 {method 'startswith' of 'str' objects}
20 0.000 0.000 0.000 0.000 {min}
88 0.000 0.000 0.000 0.000 {ord}
1 0.000 0.000 0.000 0.000 {range}
2 0.000 0.000 0.000 0.000 {repr}
2 0.000 0.000 0.000 0.000 {sys._getframe}
1 0.000 0.000 0.000 0.000 {sys.exit}
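The regex script above calls `.group()` directly on the result of `re.search`, which raises `AttributeError` when the body is not a media-decode-error report. The pattern also relies on `[^T]+`, so it stops matching as soon as a capital `T` appears between the URL and `Technical Information:`. A minimal sketch of a guarded variant follows; the names `MEDIA_ERROR_RE`, `extract_media_info_safe`, and `SAMPLE` are hypothetical and not part of the gist, and the pattern itself is unchanged.

```python
import re

# Same pattern as the script above, compiled once at import time.
MEDIA_ERROR_RE = re.compile(
    r'<!-- @reported_with: media-decode-error -->\n\n'
    r'\*\*URL\*\*:\s(?P<url>[^\n]+)'
    r'[^T]+Technical Information:\n'
    r'Error Code: (?P<error>\S+)')


def extract_media_info_safe(body):
    """Return (media_error, url), or None when the body does not match."""
    body = body.replace('\r', '')
    match = MEDIA_ERROR_RE.search(body)
    if match is None:
        return None
    return (match.group('error'), match.group('url'))


# Hypothetical shortened payload, modeled on BODY above.
SAMPLE = (
    "<!-- @reported_with: media-decode-error -->\n\n"
    "**URL**: http://www.chia-anime.tv/player.php?id=78454\n"
    "**Problem type**: Video doesn't play\n\n"
    "Technical Information:\r\n"
    "Error Code: NS_ERROR_DOM_MEDIA_DEMUXER_ERR (0x806e000c)\r\n")

print(extract_media_info_safe(SAMPLE))
print(extract_media_info_safe("no media report here"))
```

Returning `None` is only one possible convention; raising a dedicated exception would work equally well if callers prefer that.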
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# This Source Code Form is subject to the terms of the Mozilla Public
# License, v. 2.0. If a copy of the MPL was not distributed with this
# file, You can obtain one at http://mozilla.org/MPL/2.0/.
import urlparse
import sys
BODY = "<!-- @browser: Firefox 55.0 -->\n<!-- @ua_header: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:55.0) Gecko/20100101 Firefox/55.0 -->\n<!-- @reported_with: media-decode-error -->\n\n**URL**: http://www.chia-anime.tv/player.php?id=78454\n**Browser / Version**: Firefox 55.0\n**Operating System**: Windows 10\n**Problem type**: Video doesn't play\n\n**Steps to Reproduce**\n1. Navigate to: http://www.chia-anime.tv/player.php?id=78454\r\n2. …\r\n\r\nExpected Behavior:\r\n\r\nActual Behavior:\r\n\r\nTechnical Information:\r\nError Code: NS_ERROR_DOM_MEDIA_DEMUXER_ERR (0x806e000c)\r\nDetails: class RefPtr<class mozilla::MozPromise<class mozilla::MediaResult,class mozilla::MediaResult,1> > __cdecl mozilla::MP4Demuxer::Init(void): No MP4 audio () or video () tracks\r\nResource: http://sasuke.chia-anime.tv:7779/cache/m_vTxrZNLcoL3q-_9YTC3w/1496434235/ei93pkha7qeg-650x370.html.mp4\n\n\n\n_From [webcompat.com](https://webcompat.com/) with ❤️_" # nopep8
def extract_media_info(body):
    '''Extract information from the payload body for type-media.'''
    body = body.replace('\r', '')
    MATCH_1 = '<!-- @reported_with: media-decode-error -->\n\n**URL**: '
    MATCH_3 = 'Technical Information:\nError Code: '
    url = body.partition(MATCH_1)[2].split()[0]
    media_error = body.partition(MATCH_3)[2].split()[0]
    domain = urlparse.urlparse(url).netloc
    return (media_error, domain)


def main():
    '''core program'''
    return extract_media_info(BODY)


if __name__ == "__main__":
    sys.exit(main())
→ python -m cProfile slicing.py
1501 function calls (1493 primitive calls) in 0.007 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.000 0.000 <string>:1(<module>)
1 0.000 0.000 0.000 0.000 <string>:1(ParseResult)
1 0.000 0.000 0.000 0.000 <string>:1(SplitResult)
1 0.000 0.000 0.000 0.000 <string>:8(__new__)
1 0.002 0.002 0.004 0.004 collections.py:1(<module>)
1 0.000 0.000 0.000 0.000 collections.py:26(OrderedDict)
2 0.001 0.000 0.001 0.001 collections.py:293(namedtuple)
99 0.000 0.000 0.000 0.000 collections.py:337(<genexpr>)
13 0.000 0.000 0.000 0.000 collections.py:361(<genexpr>)
13 0.000 0.000 0.000 0.000 collections.py:363(<genexpr>)
1 0.000 0.000 0.000 0.000 collections.py:395(Counter)
1 0.002 0.002 0.002 0.002 heapq.py:31(<module>)
1 0.000 0.000 0.000 0.000 keyword.py:11(<module>)
1 0.000 0.000 0.000 0.000 re.py:192(compile)
1 0.000 0.000 0.000 0.000 re.py:230(_compile)
1 0.000 0.000 0.000 0.000 slicing.py:13(extract_media_info)
1 0.000 0.000 0.000 0.000 slicing.py:24(main)
1 0.001 0.001 0.007 0.007 slicing.py:7(<module>)
1 0.000 0.000 0.000 0.000 sre_compile.py:228(_compile_charset)
1 0.000 0.000 0.000 0.000 sre_compile.py:256(_optimize_charset)
1 0.000 0.000 0.000 0.000 sre_compile.py:428(_simple)
1 0.000 0.000 0.000 0.000 sre_compile.py:433(_compile_info)
2 0.000 0.000 0.000 0.000 sre_compile.py:546(isstring)
1 0.000 0.000 0.000 0.000 sre_compile.py:552(_code)
1 0.000 0.000 0.000 0.000 sre_compile.py:567(compile)
3/1 0.000 0.000 0.000 0.000 sre_compile.py:64(_compile)
6 0.000 0.000 0.000 0.000 sre_parse.py:137(__len__)
10 0.000 0.000 0.000 0.000 sre_parse.py:141(__getitem__)
1 0.000 0.000 0.000 0.000 sre_parse.py:145(__setitem__)
2 0.000 0.000 0.000 0.000 sre_parse.py:149(append)
4/2 0.000 0.000 0.000 0.000 sre_parse.py:151(getwidth)
1 0.000 0.000 0.000 0.000 sre_parse.py:189(__init__)
11 0.000 0.000 0.000 0.000 sre_parse.py:193(__next)
8 0.000 0.000 0.000 0.000 sre_parse.py:206(match)
8 0.000 0.000 0.000 0.000 sre_parse.py:212(get)
2/1 0.000 0.000 0.000 0.000 sre_parse.py:317(_parse_sub)
2/1 0.000 0.000 0.000 0.000 sre_parse.py:395(_parse)
1 0.000 0.000 0.000 0.000 sre_parse.py:67(__init__)
1 0.000 0.000 0.000 0.000 sre_parse.py:706(parse)
1 0.000 0.000 0.000 0.000 sre_parse.py:74(opengroup)
1 0.000 0.000 0.000 0.000 sre_parse.py:85(closegroup)
3 0.000 0.000 0.000 0.000 sre_parse.py:92(__init__)
1 0.000 0.000 0.000 0.000 urlparse.py:121(SplitResult)
1 0.000 0.000 0.000 0.000 urlparse.py:129(ParseResult)
1 0.000 0.000 0.000 0.000 urlparse.py:137(urlparse)
1 0.000 0.000 0.000 0.000 urlparse.py:160(_splitnetloc)
1 0.000 0.000 0.000 0.000 urlparse.py:168(urlsplit)
1 0.001 0.001 0.006 0.006 urlparse.py:29(<module>)
485 0.000 0.000 0.000 0.000 urlparse.py:332(<genexpr>)
1 0.000 0.000 0.000 0.000 urlparse.py:73(ResultMixin)
1 0.000 0.000 0.000 0.000 {_sre.compile}
13 0.000 0.000 0.000 0.000 {all}
2 0.000 0.000 0.000 0.000 {built-in method __new__ of type object at 0x1030dc428}
484 0.000 0.000 0.000 0.000 {chr}
15 0.000 0.000 0.000 0.000 {isinstance}
45/43 0.000 0.000 0.000 0.000 {len}
2 0.000 0.000 0.000 0.000 {map}
13 0.000 0.000 0.000 0.000 {method '__contains__' of 'frozenset' objects}
11 0.000 0.000 0.000 0.000 {method 'add' of 'set' objects}
29 0.000 0.000 0.000 0.000 {method 'append' of 'list' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
3 0.000 0.000 0.000 0.000 {method 'find' of 'bytearray' objects}
4 0.000 0.000 0.000 0.000 {method 'find' of 'str' objects}
24 0.000 0.000 0.000 0.000 {method 'format' of 'str' objects}
3 0.000 0.000 0.000 0.000 {method 'get' of 'dict' objects}
86 0.000 0.000 0.000 0.000 {method 'isalnum' of 'str' objects}
13 0.000 0.000 0.000 0.000 {method 'isdigit' of 'str' objects}
1 0.000 0.000 0.000 0.000 {method 'items' of 'dict' objects}
4 0.000 0.000 0.000 0.000 {method 'join' of 'str' objects}
1 0.000 0.000 0.000 0.000 {method 'lower' of 'str' objects}
2 0.000 0.000 0.000 0.000 {method 'partition' of 'str' objects}
1 0.000 0.000 0.000 0.000 {method 'remove' of 'list' objects}
5 0.000 0.000 0.000 0.000 {method 'replace' of 'str' objects}
5 0.000 0.000 0.000 0.000 {method 'split' of 'str' objects}
11 0.000 0.000 0.000 0.000 {method 'startswith' of 'str' objects}
8 0.000 0.000 0.000 0.000 {min}
2 0.000 0.000 0.000 0.000 {ord}
1 0.000 0.000 0.000 0.000 {range}
2 0.000 0.000 0.000 0.000 {repr}
2 0.000 0.000 0.000 0.000 {sys._getframe}
1 0.000 0.000 0.000 0.000 {sys.exit}
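In the two profiles above, the `urlparse.py:29(<module>)` entry (the import of `urlparse`) accounts for most of the cumulative time: 0.004 of 0.006 seconds in the regex run and 0.006 of 0.007 seconds in the slicing run. Running `python -m cProfile` on a whole script mixes that one-off import cost with the cost of the extraction itself. A minimal sketch of profiling only the function call, sorted by cumulative time, is shown below. It assumes Python 3 (for `io.StringIO` as the report stream), and `extract_error`, `profile_call`, and `SAMPLE` are hypothetical names used for illustration.

```python
import cProfile
import io
import pstats

# Hypothetical stand-in for the payload body used in the scripts above.
SAMPLE = (
    "Technical Information:\r\n"
    "Error Code: NS_ERROR_DOM_MEDIA_DEMUXER_ERR (0x806e000c)\r\n")


def extract_error(body):
    """Return the token that follows the 'Error Code: ' marker."""
    body = body.replace('\r', '')
    marker = 'Technical Information:\nError Code: '
    return body.partition(marker)[2].split()[0]


def profile_call(func, *args):
    """Run func(*args) under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time and keep only the ten most expensive entries.
    stats.sort_stats('cumulative').print_stats(10)
    return result, stream.getvalue()


result, report = profile_call(extract_error, SAMPLE)
print(result)
print(report)
```

Because the profiler is started after all imports have run, module-level work such as importing `urlparse` no longer appears in the report.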
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# This Source Code Form is subject to the terms of the Mozilla Public
# License, v. 2.0. If a copy of the MPL was not distributed with this
# file, You can obtain one at http://mozilla.org/MPL/2.0/.
import re
import sys
BODY = "<!-- @browser: Firefox 55.0 -->\n<!-- @ua_header: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:55.0) Gecko/20100101 Firefox/55.0 -->\n<!-- @reported_with: media-decode-error -->\n\n**URL**: http://www.chia-anime.tv/player.php?id=78454\n**Browser / Version**: Firefox 55.0\n**Operating System**: Windows 10\n**Problem type**: Video doesn't play\n\n**Steps to Reproduce**\n1. Navigate to: http://www.chia-anime.tv/player.php?id=78454\r\n2. …\r\n\r\nExpected Behavior:\r\n\r\nActual Behavior:\r\n\r\nTechnical Information:\r\nError Code: NS_ERROR_DOM_MEDIA_DEMUXER_ERR (0x806e000c)\r\nDetails: class RefPtr<class mozilla::MozPromise<class mozilla::MediaResult,class mozilla::MediaResult,1> > __cdecl mozilla::MP4Demuxer::Init(void): No MP4 audio () or video () tracks\r\nResource: http://sasuke.chia-anime.tv:7779/cache/m_vTxrZNLcoL3q-_9YTC3w/1496434235/ei93pkha7qeg-650x370.html.mp4\n\n\n\n_From [webcompat.com](https://webcompat.com/) with ❤️_" # nopep8
def extract_media_info(body):
    '''Extract information from the payload body for type-media.'''
    body = body.replace('\r', '')
    match_error = re.search(r'<!-- @reported_with: media-decode-error -->\n\n\*\*URL\*\*:\s(?P<url>[^\n]+)[^T]+Technical Information:\nError Code: (?P<error>\S+)', body)  # nopep8
    url = match_error.group('url')
    media_error = match_error.group('error')
    return (media_error, url)


def main():
    '''core program'''
    return extract_media_info(BODY)


if __name__ == "__main__":
    sys.exit(main())
→ python -m cProfile regex-without-urlparse.py
1340 function calls (1323 primitive calls) in 0.001 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.001 0.001 re.py:143(search)
1 0.000 0.000 0.001 0.001 re.py:230(_compile)
1 0.000 0.000 0.001 0.001 regex-without-urlparse.py:13(extract_media_info)
1 0.000 0.000 0.001 0.001 regex-without-urlparse.py:23(main)
1 0.000 0.000 0.001 0.001 regex-without-urlparse.py:7(<module>)
2 0.000 0.000 0.000 0.000 sre_compile.py:228(_compile_charset)
2 0.000 0.000 0.000 0.000 sre_compile.py:256(_optimize_charset)
3 0.000 0.000 0.000 0.000 sre_compile.py:428(_simple)
1 0.000 0.000 0.000 0.000 sre_compile.py:433(_compile_info)
2 0.000 0.000 0.000 0.000 sre_compile.py:546(isstring)
1 0.000 0.000 0.000 0.000 sre_compile.py:552(_code)
1 0.000 0.000 0.001 0.001 sre_compile.py:567(compile)
6/1 0.000 0.000 0.000 0.000 sre_compile.py:64(_compile)
12 0.000 0.000 0.000 0.000 sre_parse.py:137(__len__)
22 0.000 0.000 0.000 0.000 sre_parse.py:141(__getitem__)
3 0.000 0.000 0.000 0.000 sre_parse.py:145(__setitem__)
94 0.000 0.000 0.000 0.000 sre_parse.py:149(append)
9/4 0.000 0.000 0.000 0.000 sre_parse.py:151(getwidth)
1 0.000 0.000 0.000 0.000 sre_parse.py:189(__init__)
124 0.000 0.000 0.000 0.000 sre_parse.py:193(__next)
20 0.000 0.000 0.000 0.000 sre_parse.py:206(match)
113 0.000 0.000 0.000 0.000 sre_parse.py:212(get)
8 0.000 0.000 0.000 0.000 sre_parse.py:221(isident)
2 0.000 0.000 0.000 0.000 sre_parse.py:227(isname)
1 0.000 0.000 0.000 0.000 sre_parse.py:236(_class_escape)
9 0.000 0.000 0.000 0.000 sre_parse.py:268(_escape)
3/1 0.000 0.000 0.000 0.000 sre_parse.py:317(_parse_sub)
3/1 0.000 0.000 0.000 0.000 sre_parse.py:395(_parse)
1 0.000 0.000 0.000 0.000 sre_parse.py:67(__init__)
1 0.000 0.000 0.001 0.001 sre_parse.py:706(parse)
2 0.000 0.000 0.000 0.000 sre_parse.py:74(opengroup)
2 0.000 0.000 0.000 0.000 sre_parse.py:85(closegroup)
6 0.000 0.000 0.000 0.000 sre_parse.py:92(__init__)
1 0.000 0.000 0.000 0.000 {_sre.compile}
25 0.000 0.000 0.000 0.000 {isinstance}
346/343 0.000 0.000 0.000 0.000 {len}
379 0.000 0.000 0.000 0.000 {method 'append' of 'list' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
2 0.000 0.000 0.000 0.000 {method 'extend' of 'list' objects}
2 0.000 0.000 0.000 0.000 {method 'find' of 'bytearray' objects}
19 0.000 0.000 0.000 0.000 {method 'get' of 'dict' objects}
2 0.000 0.000 0.000 0.000 {method 'group' of '_sre.SRE_Match' objects}
1 0.000 0.000 0.000 0.000 {method 'items' of 'dict' objects}
2 0.000 0.000 0.000 0.000 {method 'remove' of 'list' objects}
1 0.000 0.000 0.000 0.000 {method 'replace' of 'str' objects}
1 0.000 0.000 0.000 0.000 {method 'search' of '_sre.SRE_Pattern' objects}
12 0.000 0.000 0.000 0.000 {min}
86 0.000 0.000 0.000 0.000 {ord}
1 0.000 0.000 0.000 0.000 {sys.exit}
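The variants without `urlparse` return the full URL rather than the domain. If the domain is still needed, one option is to cut it out with string methods only. The sketch below is an assumption about how that could look, not something the gist does; `netloc_of` is a hypothetical helper, and like `urlparse(...).netloc` it keeps any port or user information that is part of the network location.

```python
def netloc_of(url):
    """Return the network location of an absolute URL using str methods only."""
    # Everything after the scheme separator; empty string if there is none.
    rest = url.partition('://')[2]
    # The network location ends at the first '/', '?' or '#'.
    for delimiter in '/?#':
        rest = rest.partition(delimiter)[0]
    return rest


print(netloc_of('http://www.chia-anime.tv/player.php?id=78454'))
print(netloc_of('http://sasuke.chia-anime.tv:7779/cache/'))
```

This handles only the plain `scheme://host/path` shape seen in the sample body; anything more unusual is better left to a real URL parser.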
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# This Source Code Form is subject to the terms of the Mozilla Public
# License, v. 2.0. If a copy of the MPL was not distributed with this
# file, You can obtain one at http://mozilla.org/MPL/2.0/.
import sys
BODY = "<!-- @browser: Firefox 55.0 -->\n<!-- @ua_header: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:55.0) Gecko/20100101 Firefox/55.0 -->\n<!-- @reported_with: media-decode-error -->\n\n**URL**: http://www.chia-anime.tv/player.php?id=78454\n**Browser / Version**: Firefox 55.0\n**Operating System**: Windows 10\n**Problem type**: Video doesn't play\n\n**Steps to Reproduce**\n1. Navigate to: http://www.chia-anime.tv/player.php?id=78454\r\n2. …\r\n\r\nExpected Behavior:\r\n\r\nActual Behavior:\r\n\r\nTechnical Information:\r\nError Code: NS_ERROR_DOM_MEDIA_DEMUXER_ERR (0x806e000c)\r\nDetails: class RefPtr<class mozilla::MozPromise<class mozilla::MediaResult,class mozilla::MediaResult,1> > __cdecl mozilla::MP4Demuxer::Init(void): No MP4 audio () or video () tracks\r\nResource: http://sasuke.chia-anime.tv:7779/cache/m_vTxrZNLcoL3q-_9YTC3w/1496434235/ei93pkha7qeg-650x370.html.mp4\n\n\n\n_From [webcompat.com](https://webcompat.com/) with ❤️_" # nopep8
def extract_media_info(body):
    '''Extract information from the payload body for type-media.'''
    body = body.replace('\r', '')
    MATCH_1 = '<!-- @reported_with: media-decode-error -->\n\n**URL**: '
    MATCH_3 = 'Technical Information:\nError Code: '
    url = body.partition(MATCH_1)[2].split()[0]
    media_error = body.partition(MATCH_3)[2].split()[0]
    return (media_error, url)


def main():
    '''core program'''
    return extract_media_info(BODY)


if __name__ == "__main__":
    sys.exit(main())
→ python -m cProfile slicing-without-urlparse.py
10 function calls in 0.000 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.000 0.000 slicing-without-urlparse.py:12(extract_media_info)
1 0.000 0.000 0.000 0.000 slicing-without-urlparse.py:22(main)
1 0.000 0.000 0.000 0.000 slicing-without-urlparse.py:7(<module>)
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
2 0.000 0.000 0.000 0.000 {method 'partition' of 'str' objects}
1 0.000 0.000 0.000 0.000 {method 'replace' of 'str' objects}
2 0.000 0.000 0.000 0.000 {method 'split' of 'str' objects}
1 0.000 0.000 0.000 0.000 {sys.exit}
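A single run under cProfile rounds every entry here to 0.000 seconds, so it shows the difference in call counts but not how long each approach takes per call. A rough sketch using `timeit` to repeat both extractions many times is given below. The shortened `BODY`, the function names `with_regex` and `with_partition`, and the repetition count are assumptions made for illustration; absolute timings depend on the machine and Python version, so no expected numbers are given.

```python
import re
import timeit

# Hypothetical shortened payload, modeled on BODY above.
BODY = (
    "<!-- @reported_with: media-decode-error -->\n\n"
    "**URL**: http://www.chia-anime.tv/player.php?id=78454\n"
    "**Problem type**: Video doesn't play\n\n"
    "Technical Information:\r\n"
    "Error Code: NS_ERROR_DOM_MEDIA_DEMUXER_ERR (0x806e000c)\r\n")

PATTERN = re.compile(
    r'<!-- @reported_with: media-decode-error -->\n\n'
    r'\*\*URL\*\*:\s(?P<url>[^\n]+)'
    r'[^T]+Technical Information:\n'
    r'Error Code: (?P<error>\S+)')
MATCH_1 = '<!-- @reported_with: media-decode-error -->\n\n**URL**: '
MATCH_3 = 'Technical Information:\nError Code: '


def with_regex(body):
    """Regex-based extraction, as in regex-without-urlparse.py."""
    match = PATTERN.search(body.replace('\r', ''))
    return (match.group('error'), match.group('url'))


def with_partition(body):
    """partition/split extraction, as in slicing-without-urlparse.py."""
    body = body.replace('\r', '')
    url = body.partition(MATCH_1)[2].split()[0]
    media_error = body.partition(MATCH_3)[2].split()[0]
    return (media_error, url)


for func in (with_regex, with_partition):
    seconds = timeit.timeit(lambda: func(BODY), number=10000)
    print('%-15s %.4f s for 10000 calls' % (func.__name__, seconds))
```

Precompiling the pattern keeps the regex compilation out of the timed loop, so the comparison covers matching only.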
karlcow commented Jun 5, 2017

See also the slightly related issue webcompat/webcompat.com#1580.

karlcow commented Jun 5, 2017

These tests are related to webcompat/webcompat.com#1551.