@mapcloud
Forked from aculich/remove-empty-columns.csv
Created July 28, 2017 13:30
remove-empty-columns
foo,bar,baz
a,,1
b,,2
c,,
,,4
e,,5
# -*- coding: utf-8 -*-
# <nbformat>3.0</nbformat>
# <markdowncell>
# To drop every column that contains only a header and no data, we can use the Python [Pandas library](http://pandas.pydata.org/). The following 4-line script reads in the csv file, drops the columns where **all** the elements are missing, and saves the data to a new csv file.
# <codecell>
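import pandas as pd
data = pd.read_csv('remove-empty-columns.csv')
filtered_data = data.dropna(axis='columns', how='all')  # drop only columns that are entirely NaN
filtered_data.to_csv('empty-columns-removed.csv', index=False)  # index=False avoids writing the row index as an extra column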
# <markdowncell>
# As shown below, the sample data included in the csv file has 3 columns, all of which contain missing values.
#
# The second column, labeled **bar**, is completely empty except for the header; columns like this should be dropped. The other columns contain data and should not be dropped, even though they contain some missing values.
# <codecell>
data
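
Before dropping anything, it can help to confirm which columns are entirely missing. The following is a minimal sketch, not part of the original notebook: it assumes the csv contents match the sample shown earlier (inlined here as the hypothetical `SAMPLE_CSV` string so the snippet runs without the file) and a pandas version that provides `DataFrame.isna()`.

```python
import io

import pandas as pd

# Assumption: these are the contents of remove-empty-columns.csv,
# reconstructed from the sample shown earlier in this gist.
SAMPLE_CSV = "foo,bar,baz\na,,1\nb,,2\nc,,\n,,4\ne,,5\n"

data = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Number of missing values in each column.
missing_counts = data.isna().sum()

# Names of the columns in which every entry is missing.
empty_columns = data.columns[data.isna().all()].tolist()

print(missing_counts.to_dict())
print(empty_columns)
```

Under these assumptions only **bar** should be reported as entirely empty, while **foo** and **baz** each have a single missing value.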
# <markdowncell>
# Using the [pandas.DataFrame.dropna()](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.dropna.html) method with the **columns** axis and `how='all'`, we can drop every column in which **all** the entries are **NaN** (missing values).
# <codecell>
filtered_data = data.dropna(axis='columns', how='all')
filtered_data
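
The choice of `how='all'` matters here. As a rough illustration, the sketch below contrasts it with `how='any'` on the same sample. The inlined `SAMPLE_CSV` string is an assumption standing in for the csv file, and the names `filtered_all` and `filtered_any` are only for this example.

```python
import io

import pandas as pd

# Assumption: same sample contents as remove-empty-columns.csv.
SAMPLE_CSV = "foo,bar,baz\na,,1\nb,,2\nc,,\n,,4\ne,,5\n"

data = pd.read_csv(io.StringIO(SAMPLE_CSV))

# how='all': drop a column only if every entry in it is missing.
filtered_all = data.dropna(axis='columns', how='all')

# how='any': drop a column if it has at least one missing entry.
filtered_any = data.dropna(axis='columns', how='any')

print(list(filtered_all.columns))
print(list(filtered_any.columns))
```

With this sample, `how='all'` keeps **foo** and **baz**, whereas `how='any'` would remove every column because each one has at least one missing value.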