@prashnts
Created December 21, 2016 08:47
Performance comparison of integer vs. string indexes in a pandas DataFrame
import pandas as pd
import numpy as np

LIM = 100000  # number of rows in each frame

# df_i: integer index 0 .. LIM-1
df_i = pd.DataFrame(np.random.randint(0, 100, size=(LIM, 4)), columns=list('ABCD'), index=list(range(0, LIM)))
# df_s: string index '0x0', '0x1', ... (hex form of the same integers)
df_s = pd.DataFrame(np.random.randint(0, 100, size=(LIM, 4)), columns=list('ABCD'), index=list(map(hex, range(0, LIM))))
## timed in IPython
>>> %timeit -n 100 df_i.sort_values('A')
100 loops, best of 3: 10.6 ms per loop
>>> %timeit -n 100 df_s.sort_values('A')
100 loops, best of 3: 18.1 ms per loop
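
The comparison above relies on IPython's %timeit magic and on two frames filled with different random values. A minimal sketch of the same measurement as a plain script follows, using the standard-library timeit module and giving both frames identical data so that only the index type differs. The helper names build_frames and time_sort, the seed, and the loop counts are assumptions made for illustration and are not part of the original gist; absolute timings will vary by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd


def build_frames(n=100_000, seed=0):
    """Return (df_i, df_s): same values, integer vs. hex-string index."""
    rng = np.random.default_rng(seed)
    data = rng.integers(0, 100, size=(n, 4))
    cols = list("ABCD")
    df_i = pd.DataFrame(data, columns=cols, index=list(range(n)))
    df_s = pd.DataFrame(data, columns=cols, index=[hex(i) for i in range(n)])
    return df_i, df_s


def time_sort(df, number=20, repeat=3):
    """Best-of-`repeat` time per sort_values('A') call, in milliseconds."""
    timings = timeit.repeat(lambda: df.sort_values("A"), number=number, repeat=repeat)
    return min(timings) / number * 1000.0


if __name__ == "__main__":
    df_i, df_s = build_frames()
    print("integer index: %.2f ms per sort" % time_sort(df_i))
    print("string index:  %.2f ms per sort" % time_sort(df_s))
```

Sharing one data array between the two frames removes the random values as a variable, so any remaining gap should come from reordering the index alongside the sorted rows.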