@theSage21
Last active August 29, 2015 14:27
Speedtesting results on html2text
Wrote profile results to testing.py.lprof
Timer unit: 1e-06 s
Total time: 18.9654 s
File: html2text/__init__.py
Function: optwrap at line 784
Line #   Hits       Time      Per Hit   % Time  Line Contents
==============================================================
   784                                              @profile
   785                                              def optwrap(self, text):
   786                                                  """
   787                                                  Wrap all paragraphs in the provided text.
   788
   789                                                  :type text: str
   790
   791                                                  :rtype: str
   792                                                  """
   793      1          4          4.0      0.0          if not self.body_width:
   794                                                      return text
   795
   796      1          2          2.0      0.0          assert wrap, "Requires Python 2.3."
   797      1          2          2.0      0.0          result = ''
   798      1          1          1.0      0.0          newlines = 0
   799                                                  # I cannot think of a better solution for now.
   800                                                  # To avoid the non-wrap behaviour for entire paras
   801                                                  # because of the presence of a link in it
   802      1          1          1.0      0.0          if not self.wrap_links:
   803                                                      self.inline_links = False
   804      3         42         14.0      0.0          for para in text.split("\n"):
   805      2          4          2.0      0.0              if len(para) > 0:
   806      1       1943       1943.0      0.0                  if not skipwrap(para, self.wrap_links):
   807      1   18963386   18963386.0    100.0                      result = "\n".join(wrap(para, self.body_width))
   808      1          8          8.0      0.0                      if para.endswith(' '):
   809                                                                  result += " \n"
   810                                                                  newlines = 1
   811                                                              else:
   812      1         42         42.0      0.0                          result += "\n\n"
   813      1          2          2.0      0.0                          newlines = 2
   814                                                          else:
   815                                                              # Warning for the tempted!!!
   816                                                              # Be aware that obvious replacement of this with
   817                                                              # line.isspace()
   818                                                              # DOES NOT work! Explanations are welcome.
   819                                                              if not config.RE_SPACE.match(para):
   820                                                                  result += para + "\n"
   821                                                                  newlines = 1
   822                                                      else:
   823      1          2          2.0      0.0                  if newlines < 2:
   824                                                              result += "\n"
   825                                                              newlines += 1
   826      1          1          1.0      0.0          return result
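
The profile above attributes essentially all of the 18.96 s to the single `wrap(para, self.body_width)` call on line 807. As a rough way to measure that call in isolation, here is a minimal sketch that times the standard library's `textwrap.wrap` on one long paragraph. It assumes `wrap` in html2text is `textwrap.wrap`. The helper `time_wrap`, the sample paragraph and the width of 78 are hypothetical and are not taken from `testing.py`, so the timings it prints are not comparable with the numbers above.

```python
import textwrap
import time


def time_wrap(paragraph, width=78):
    """Wrap one paragraph and return (elapsed_seconds, wrapped_lines).

    Hypothetical helper: it only isolates the call that dominates
    the profile, assuming `wrap` is textwrap.wrap.
    """
    start = time.perf_counter()
    lines = textwrap.wrap(paragraph, width)
    return time.perf_counter() - start, lines


# Made-up input: a single long paragraph with no newlines, so that
# optwrap's loop would hand the whole thing to wrap() in one call.
paragraph = "lorem ipsum dolor sit amet " * 2000

elapsed, lines = time_wrap(paragraph)
print(f"{len(paragraph)} chars -> {len(lines)} lines in {elapsed:.4f} s")
```

Running this with paragraphs of increasing length shows how the cost of the one `wrap` call grows with input size on a given Python version.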