@elmotec
Last active July 19, 2019 11:05
import pandas as pd


def top(df, n, columns=None):
    """Return the top and bottom rows of a dataframe.

    Args:
        df: input dataframe.
        n: total number of rows to return. Half come from the top and half
            from the bottom; when n is odd, the extra row goes to the top.
        columns: column name or list of column names to rank by. If omitted,
            the first and last rows (head and tail) are returned instead.

    Example:
        This is useful for exploratory data debugging. Use it like:

        >>> df = pd.DataFrame({'diff': range(100)})
        >>> df.pipe(top, 4, 'diff')
            diff
        99    99
        98    98
        1      1
        0      0
    """
    # Never ask for more rows than the dataframe holds.
    n = min(n, len(df))
    # Split n between the two ends; the top gets the extra row when n is odd.
    top_n, bot_n = n // 2 + n % 2, n // 2
    if columns:
        # Rank by the given column(s); both halves are shown largest first.
        top_rows = (df.nlargest(top_n, columns)
                    .sort_values(by=columns, ascending=False))
        bottom_rows = (df.nsmallest(bot_n, columns)
                       .sort_values(by=columns, ascending=False))
    else:
        # No ranking requested: fall back to positional head and tail.
        top_rows = df.head(top_n)
        bottom_rows = df.tail(bot_n)
    return pd.concat([top_rows, bottom_rows])


if __name__ == '__main__':
    import doctest
    doctest.testmod()