@kafran
Forked from aculich/remove-empty-columns.csv
Created March 1, 2016 10:49
remove-empty-columns
foo,bar,baz
a,,1
b,,2
c,,
,,4
e,,5
# -*- coding: utf-8 -*-
# <nbformat>3.0</nbformat>
# <markdowncell>
# To drop every empty column from a csv file (keeping the header row for the columns that remain) with the Python [Pandas library](http://pandas.pydata.org/), we can use the following 4-line script: it reads in the csv file, drops the columns where **all** the elements are missing, and saves the data to a new csv file.
# <codecell>
import pandas as pd
data = pd.read_csv('remove-empty-columns.csv')
filtered_data = data.dropna(axis='columns', how='all')  # drop columns in which every value is NaN
filtered_data.to_csv('empty-columns-removed.csv', index=False)  # index=False keeps the row index out of the file
# <markdowncell>
# As shown below, the sample data included in the csv file has 3 columns, all of which contain missing values.
#
# The second column, labeled **bar**, is completely empty except for its header; columns like this should be dropped. The other columns contain data and should not be dropped, even though they contain some missing values.
# <codecell>
data
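# <markdowncell>
# Before dropping anything, it can help to confirm which columns are entirely empty. The cell below is a minimal sketch of that check. It assumes an inline copy of the sample data (the `csv_text` string and the names `sample`, `all_missing` and `empty_columns` are illustrative and not part of the original script), so it runs without the csv file on disk.
# <codecell>
```python
import io

import pandas as pd

# Inline stand-in for remove-empty-columns.csv (an assumed copy of the sample data).
csv_text = "foo,bar,baz\na,,1\nb,,2\nc,,\n,,4\ne,,5\n"
sample = pd.read_csv(io.StringIO(csv_text))

# isna() marks missing cells; all() is True for a column only when
# every one of its values is missing.
all_missing = sample.isna().all()

# Keep just the labels of the columns flagged as entirely empty.
empty_columns = all_missing[all_missing].index.tolist()
print(empty_columns)
```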
# <markdowncell>
# Using the [pandas.DataFrame.dropna()](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.dropna.html) method with the **columns** axis and `how='all'`, we can drop every column in which **all** the entries are **NaN** (missing values).
# <codecell>
filtered_data = data.dropna(axis='columns', how='all')
filtered_data
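# <markdowncell>
# The choice of `how='all'` matters for this data. As a hedged illustration, the cell below contrasts it with `how='any'`, which drops a column as soon as it contains a single missing value. It again uses an assumed inline copy of the sample data; the names `kept_all`, `kept_any` and `csv_out` are illustrative only.
# <codecell>
```python
import io

import pandas as pd

# Inline stand-in for remove-empty-columns.csv (an assumed copy of the sample data).
csv_text = "foo,bar,baz\na,,1\nb,,2\nc,,\n,,4\ne,,5\n"
sample = pd.read_csv(io.StringIO(csv_text))

# how='all' removes only the columns in which every value is missing.
kept_all = sample.dropna(axis='columns', how='all')

# how='any' removes every column that has at least one missing value.
kept_any = sample.dropna(axis='columns', how='any')

print(list(kept_all.columns))
print(list(kept_any.columns))

# With no path given, to_csv returns the csv text; index=False leaves
# the row index out of the output.
csv_out = kept_all.to_csv(index=False)
print(csv_out)
```
# Since every column in the sample has at least one missing value, `how='any'` would leave no columns at all, which is why the script uses `how='all'`.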