This example explains Python decorators in the context of data science. The example acts as a quick reminder, rather than a complete guide.
Consider a pandas DataFrame about posts on a social media platform. The DataFrame, called `posts`, contains a column with the number of likes for each post.
post_id | ... | likes | ... |
---|---|---|---|
1 | ... | 43 | ... |
2 | ... | 92 | ... |
3 | ... | 54 | ... |
The following function calculates the average number of likes per post.

    def average_likes(data):
        return data['likes'].mean()
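As a quick check, the function can be run on a small stand-in for `posts`. The sketch below assumes the DataFrame holds only the three rows shown in the table and leaves out the elided columns.

```python
import pandas as pd

# Hypothetical stand-in for `posts`: only the rows and columns shown in the table.
posts = pd.DataFrame({"post_id": [1, 2, 3], "likes": [43, 92, 54]})


def average_likes(data):
    return data["likes"].mean()


print(average_likes(posts))  # (43 + 92 + 54) / 3 = 63.0
```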
We would like to decorate the function with a decorator that checks that the `likes` column has an integer type. Such a decorated function may look like the following.

    @data_column_has_int_type('likes')
    def average_likes(data):
        return data['likes'].mean()
The decorator can be defined in the following way.

    import pandas

    def data_column_has_int_type(column):
        def decorator(function):
            def wrapper(*args, **kwargs):
                # The DataFrame is expected as the first positional argument.
                data = args[0]
                if not pandas.api.types.is_integer_dtype(data[column]):
                    raise ValueError(f"Column {column} does not have integer type.")
                return function(*args, **kwargs)
            return wrapper
        return decorator
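To see the check in action, the following sketch calls the decorated function on two small DataFrames, one with integer likes and one where the likes are floats. The names `int_posts` and `float_posts` are made up for illustration.

```python
import pandas as pd


def data_column_has_int_type(column):
    def decorator(function):
        def wrapper(*args, **kwargs):
            data = args[0]  # assumes the DataFrame is the first positional argument
            if not pd.api.types.is_integer_dtype(data[column]):
                raise ValueError(f"Column {column} does not have integer type.")
            return function(*args, **kwargs)
        return wrapper
    return decorator


@data_column_has_int_type("likes")
def average_likes(data):
    return data["likes"].mean()


# Hypothetical inputs: same values, different dtypes.
int_posts = pd.DataFrame({"likes": [43, 92, 54]})
float_posts = pd.DataFrame({"likes": [43.0, 92.0, 54.0]})

print(average_likes(int_posts))  # 63.0

try:
    average_likes(float_posts)
except ValueError as error:
    message = str(error)
    print(message)  # Column likes does not have integer type.
```

Note that the wrapper reads `args[0]`, so calling `average_likes(data=int_posts)` with a keyword argument would raise an `IndexError` before the dtype check runs.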
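Calling the decorated function, `average_likes(posts)`, is equivalent to the following expression, in which `average_likes` refers to the undecorated function:

    data_column_has_int_type('likes')(average_likes)(posts)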
This composition of functions unwraps as:
- `data_column_has_int_type('likes')` ⟶ `decorator` with `column` set to `'likes'`, equivalent to:

      def decorator(function):
          def wrapper(*args, **kwargs):
              data = args[0]
              if not pandas.api.types.is_integer_dtype(data['likes']):  # <- See change
                  raise ValueError(f"Column {column} does not have integer type.")
              return function(*args, **kwargs)
          return wrapper
- `decorator(average_likes)` ⟶ `wrapper` with `function` set to `average_likes`, equivalent to:

      def wrapper(*args, **kwargs):
          data = args[0]
          if not pandas.api.types.is_integer_dtype(data['likes']):
              raise ValueError(f"Column {column} does not have integer type.")
          return average_likes(*args, **kwargs)  # <- See change
- `wrapper(posts)` becomes:

      if not pandas.api.types.is_integer_dtype(posts['likes']):
          raise ValueError(f"Column {column} does not have integer type.")
      average_likes(posts)  # <- See change
This flow can be viewed compactly as:

    data_column_has_int_type('likes')(average_likes)(data)
    decorator(average_likes)(data)  # column='likes'
    wrapper(data)                   # column='likes', function=average_likes
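The same three steps can also be carried out by hand, without the `@` syntax, which makes each intermediate object visible. This is a minimal sketch; `check_likes` and `checked_average_likes` are names introduced here for illustration only.

```python
import pandas as pd


def data_column_has_int_type(column):
    def decorator(function):
        def wrapper(*args, **kwargs):
            data = args[0]
            if not pd.api.types.is_integer_dtype(data[column]):
                raise ValueError(f"Column {column} does not have integer type.")
            return function(*args, **kwargs)
        return wrapper
    return decorator


def average_likes(data):  # left undecorated on purpose
    return data["likes"].mean()


# Hypothetical stand-in for `posts`.
posts = pd.DataFrame({"likes": [43, 92, 54]})

check_likes = data_column_has_int_type("likes")     # step 1: returns `decorator`
checked_average_likes = check_likes(average_likes)  # step 2: returns `wrapper`
result = checked_average_likes(posts)               # step 3: check, then average_likes

print(check_likes.__name__)            # decorator
print(checked_average_likes.__name__)  # wrapper
print(result)                          # 63.0
```

The printed names show that, without `functools.wraps`, the object bound by the `@` syntax is the inner `wrapper` function.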
@kurtu5, you are correct, and it is good to use `@wraps` in decorators. With `@wraps`, the undecorated function's name and docstring would be preserved in the decorated function. I omitted it from the short guide as I wanted to focus on the logic of decorators.
I leave the link to `@wraps` in the Python documentation here for those interested in learning more about it: https://docs.python.org/3/library/functools.html#functools.wraps
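As a minimal sketch of what that could look like, the decorator from the guide can be extended with `functools.wraps`. The docstring on `average_likes` is added here purely for illustration and is not part of the original function.

```python
import functools

import pandas as pd


def data_column_has_int_type(column):
    def decorator(function):
        @functools.wraps(function)  # copies __name__, __doc__, etc. onto wrapper
        def wrapper(*args, **kwargs):
            data = args[0]
            if not pd.api.types.is_integer_dtype(data[column]):
                raise ValueError(f"Column {column} does not have integer type.")
            return function(*args, **kwargs)
        return wrapper
    return decorator


@data_column_has_int_type("likes")
def average_likes(data):
    """Return the average number of likes per post."""
    return data["likes"].mean()


print(average_likes.__name__)  # average_likes
print(average_likes.__doc__)   # Return the average number of likes per post.
```

`functools.wraps` also sets a `__wrapped__` attribute on the wrapper that points back to the original function.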