@BroaderImpact
Last active February 18, 2023 17:35
Exploratory Data Analysis


An automated Python script that runs a preliminary exploratory data analysis on a CSV input file.

Installation

Use the package manager pip to install cca_eda.

pip install cca_eda

Usage

import cca_eda

# Displays a prompt to upload a CSV dataset
cca_eda.choice("EDA")

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

License

MIT

import pandas as pd
import streamlit as st

# `choice` is assumed to be set by the surrounding app (e.g. a menu selection).
if choice == "EDA":
    st.subheader("Exploratory Data Analysis")
    data = st.file_uploader("Upload Dataset : ", type=["csv", "txt"])
    if data is not None:
        df = pd.read_csv(data)
        st.dataframe(df.head())
        # Collect the column names once so every option below can use them,
        # even when "Show Columns" is left unchecked
        all_columns = df.columns.to_list()
        # Show Shape
        if st.checkbox("Show Shape"):
            st.write(df.shape)
        # Show Columns
        if st.checkbox("Show Columns"):
            st.write(all_columns)
        # Show Summary
        if st.checkbox("Show Summary"):
            st.write(df.describe())
        # Show Value Counts (of the last column)
        if st.checkbox("Show Value Counts"):
            st.write(df.iloc[:, -1].value_counts())
        # Select Columns to Show
        if st.checkbox("Select Columns to Show"):
            selected_columns = st.multiselect("Select Columns ", all_columns)
            new_df = df[selected_columns]
            st.dataframe(new_df)
        # Show Percentage of Missing Values
        if st.checkbox("Show Percentage of Missing Values"):
            st.write(df.isna().mean().round(4) * 100)
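
The summaries behind the checkboxes can also be computed with pandas alone, which is handy for checking the numbers outside the Streamlit UI. The following is a minimal sketch, not part of cca_eda: the `summarize` helper and the sample CSV text are hypothetical and exist only for illustration.

```python
import io

import pandas as pd


def summarize(df: pd.DataFrame) -> dict:
    """Compute the same summaries that the checkboxes display (hypothetical helper)."""
    return {
        "shape": df.shape,
        "columns": df.columns.to_list(),
        "summary": df.describe(),
        # Value counts of the last column, as in "Show Value Counts"
        "value_counts": df.iloc[:, -1].value_counts(),
        # Percentage of missing values per column
        "missing_pct": df.isna().mean().round(4) * 100,
    }


# Made-up sample data standing in for an uploaded CSV file
csv_text = "age,income,label\n25,50000,yes\n32,,no\n47,82000,yes\n51,61000,\n"
df = pd.read_csv(io.StringIO(csv_text))

report = summarize(df)
print(report["shape"])
print(report["missing_pct"])
```

Returning a dictionary keeps each summary separately addressable, so a caller can print or test only the pieces it needs.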