@phpdude
Created December 10, 2019 16:20
Efficient way to remove docstrings in Python source code
import ast

import astor  # read more at https://astor.readthedocs.io/en/latest/

with open('source.py') as source_file:
    parsed = ast.parse(source_file.read())

for node in ast.walk(parsed):
    # let's work only on function & class definitions
    if not isinstance(node, (ast.FunctionDef, ast.ClassDef, ast.AsyncFunctionDef)):
        continue
    if not len(node.body):
        continue
    if not isinstance(node.body[0], ast.Expr):
        continue
    # ast.Str was removed in Python 3.12; string literals are ast.Constant nodes holding a str
    if not isinstance(node.body[0].value, ast.Constant) or not isinstance(node.body[0].value.value, str):
        continue
    # Uncomment the lines below to print what is being removed and where
    # print(node)
    # print(node.body[0].value.value)
    node.body = node.body[1:]

print('***** Processed source code output ******\n=========================================')
print(astor.to_source(parsed))

> python clean.py
***** Processed source code output ******
=========================================
"""
Mycopyright (c)
"""
from abc import d
class MyClass(MotherClass):
    def __init__(self, my_param):
        self.my_param = my_param
def test_fctn():
    def _wrapped(omg):
        pass
    return True
def test_fctn():
    some_string = """
    Some Docstring
    """
    return some_string

Input file, source.py:

"""
Mycopyright (c)
"""
from abc import d
class MyClass(MotherClass):
    """
    Some;
    Multi-
    Line Docstring:
    """
    def __init__(self, my_param):
        """Docstring"""
        self.my_param = my_param
def test_fctn():
    """
    Some Docstring
    """
    def _wrapped(omg):
        "some extra docstring"
        pass
    return True
def test_fctn():
    some_string = """
    Some Docstring
    """
    return some_string

newdive commented Dec 11, 2020

In case a method or class has only a docstring as its body,
add these two lines of code to make sure the resulting code is valid:

...
node.body = node.body[1:]
# add a "pass" statement here if the body is now empty
if len(node.body) < 1:
    node.body.append(ast.Pass())
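
The fix above can be sketched end to end as follows. This is a minimal sketch, not the gist's exact script: it wraps the loop in a hypothetical `strip_docstrings` helper, checks for `ast.Constant` string values, and swaps in the standard library's `ast.unparse` (Python 3.9+) for `astor.to_source` so that it runs without third-party packages.

```python
import ast


def strip_docstrings(source: str) -> str:
    """Return `source` with function and class docstrings removed.

    Hypothetical helper for illustration; the output is regenerated from
    the AST, so comments and original formatting are not preserved.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if not isinstance(node, (ast.FunctionDef, ast.ClassDef, ast.AsyncFunctionDef)):
            continue
        if not node.body or not isinstance(node.body[0], ast.Expr):
            continue
        value = node.body[0].value
        if not (isinstance(value, ast.Constant) and isinstance(value.value, str)):
            continue
        node.body = node.body[1:]
        # A body that held only a docstring is now empty, which is not
        # valid Python, so put a "pass" statement in its place.
        if not node.body:
            node.body.append(ast.Pass())
    return ast.unparse(tree)


# A class whose only statement is a docstring ends up with a "pass" body.
print(strip_docstrings('class Empty:\n    """Only a docstring."""\n'))
```

Without the `ast.Pass()` step, the regenerated class would have no body at all and the output would fail to parse.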


mzpqnxow commented Jan 1, 2022

I needed to remove module-level docstrings as well, which was easy to do:

        if not isinstance(node, (ast.FunctionDef, ast.ClassDef, ast.AsyncFunctionDef)):

becomes

        if not isinstance(node, (ast.FunctionDef, ast.ClassDef, ast.AsyncFunctionDef, ast.Module)):
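
One consequence worth noting: in the example source.py, the module-level string is the copyright notice, which the original script leaves in place. Adding `ast.Module` removes it too. The sketch below makes that choice explicit through a hypothetical `include_module` flag. It assumes Python 3.9+, uses `ast.get_docstring` to detect docstrings, and uses `ast.unparse` in place of astor so that it is self-contained.

```python
import ast

DOC_OWNERS = (ast.FunctionDef, ast.ClassDef, ast.AsyncFunctionDef)


def strip_docstrings(source: str, include_module: bool = False) -> str:
    """Remove docstrings; `include_module` (hypothetical flag) also
    removes the module-level docstring."""
    owners = DOC_OWNERS + ((ast.Module,) if include_module else ())
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if not isinstance(node, owners):
            continue
        # get_docstring returns None unless the first statement of the
        # body is a bare string literal.
        if ast.get_docstring(node, clean=False) is None:
            continue
        node.body = node.body[1:]
        # An empty module is valid, but an empty def or class body is not.
        if not node.body and not isinstance(node, ast.Module):
            node.body.append(ast.Pass())
    return ast.unparse(tree)


sample = (
    '"""\nMycopyright (c)\n"""\n'
    'from abc import d\n'
    'def f():\n'
    '    """Doc."""\n'
    '    return d\n'
)
kept = strip_docstrings(sample)                          # header string survives
removed = strip_docstrings(sample, include_module=True)  # header string is dropped
```

If the module-level string is a license header rather than documentation, leaving `include_module` off is probably the safer default.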

@aw492267

Is it possible to modify the code above so that only docstrings are removed, but not comments?
Or only comments starting with '# TODO'? I need to clean my code data for a code-to-text model.
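
One possible direction for the first part of this question, sketched under assumptions: `ast.parse` discards comments, so any approach that regenerates source from the tree loses them. The sketch below uses the AST only to locate docstrings by line number, then deletes those lines from the original text, which leaves comments untouched. The name `strip_docstrings_keep_comments` is hypothetical. It assumes Python 3.8+ (for `end_lineno`) and that nothing else follows a docstring on its last line.

```python
import ast


def strip_docstrings_keep_comments(source: str) -> str:
    """Delete docstring lines from the original text so comments survive."""
    lines = source.splitlines(keepends=True)
    owners = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, owners) or not node.body:
            continue
        first = node.body[0]
        if not (isinstance(first, ast.Expr)
                and isinstance(first.value, ast.Constant)
                and isinstance(first.value.value, str)):
            continue
        start, end = first.lineno - 1, first.end_lineno  # 0-based slice bounds
        prefix = lines[start][:first.col_offset]
        if prefix.strip():
            # Docstring shares its line with other code (e.g. a one-line def);
            # this sketch leaves such cases alone.
            continue
        # Keep the block valid when the docstring was its only statement.
        only_stmt = len(node.body) == 1 and not isinstance(node, ast.Module)
        replacement = f"{prefix}pass\n" if only_stmt else ""
        # Same-length replacement keeps line numbers valid for later nodes.
        lines[start:end] = [replacement] + [""] * (end - start - 1)
    return "".join(lines)


example = (
    'def f():\n'
    '    """Doc."""\n'
    '    # TODO: keep me\n'
    '    return 1  # trailing comment\n'
)
cleaned = strip_docstrings_keep_comments(example)
```

Because the text is edited in place rather than regenerated, the original formatting is preserved as well.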
