jtratner/mode-perf-comparison.md

## mode-perf-comparison.md

      
Results from doing all of value_counts() first

---- Basics - int64 ----
100 length
10000 loops, best of 3: 100 µs per loop
1000 length
10000 loops, best of 3: 166 µs per loop
10000 length
1000 loops, best of 3: 367 µs per loop
100000 length
100 loops, best of 3: 2.67 ms per loop
100000 length with few #s
100 loops, best of 3: 2.52 ms per loop
100000 length with many #s
10 loops, best of 3: 18.4 ms per loop

---- Object type with strings ----
100 length
10000 loops, best of 3: 196 µs per loop
1000 length
1000 loops, best of 3: 306 µs per loop
10000 length
1000 loops, best of 3: 1.2 ms per loop
100000 length
100 loops, best of 3: 10.7 ms per loop
100000 length with few values
100 loops, best of 3: 10.5 ms per loop
100000 length with many values
10 loops, best of 3: 54.1 ms per loop

---- Timedelta type for [should use int64] ----
100 length
1000 loops, best of 3: 205 µs per loop
1000 length
1000 loops, best of 3: 238 µs per loop
10000 length
1000 loops, best of 3: 392 µs per loop
100000 length
100 loops, best of 3: 2.08 ms per loop
100000 length with few #s
100 loops, best of 3: 2.81 ms per loop
100000 length with many #s
100 loops, best of 3: 18 ms per loop

Results from just calculating table and then iterating over its results

Still quite slow with strings! However, the last string case takes ~86 ms with value_counts() alone, so ~39 ms isn't terrible (for comparison, constructing a Series with 100K string values takes ~10 ms).
---- Basics - int64 ----
100 length
10000 loops, best of 3: 81.3 µs per loop
1000 length
10000 loops, best of 3: 96.5 µs per loop
10000 length
10000 loops, best of 3: 182 µs per loop
100000 length
1000 loops, best of 3: 1.34 ms per loop
100000 length with few #s
1000 loops, best of 3: 1.48 ms per loop
100000 length with many #s
100 loops, best of 3: 4.22 ms per loop

---- Object type with strings ----
100 length
1000 loops, best of 3: 187 µs per loop
1000 length
1000 loops, best of 3: 305 µs per loop
10000 length
1000 loops, best of 3: 1.4 ms per loop
100000 length
100 loops, best of 3: 12.1 ms per loop
100000 length with few values
100 loops, best of 3: 11.8 ms per loop
100000 length with many values
10 loops, best of 3: 38.7 ms per loop

---- Timedelta type for [should use int64] ----
100 length
10000 loops, best of 3: 209 µs per loop
1000 length
1000 loops, best of 3: 218 µs per loop
10000 length
1000 loops, best of 3: 335 µs per loop
100000 length
1000 loops, best of 3: 1.64 ms per loop
100000 length with few #s
1000 loops, best of 3: 1.5 ms per loop
100000 length with many #s
100 loops, best of 3: 4.13 ms per loop

Hurray for optimizing a pathological case [and, you know, general perf boost]!
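
The gap between the two approaches timed above can be sketched in pure Python. This is only an illustration under stated assumptions, not the pandas/Cython code that was benchmarked: both function names are hypothetical, and collections.Counter stands in for the hash table. The first function sorts every (value, count) pair before picking the winners, which is roughly what going through a sorted value_counts() result costs; the second builds the same table and scans it once.

```python
from collections import Counter


def mode_by_sorting_counts(values):
    """Count everything, sort all (value, count) pairs, keep the top group."""
    counts = Counter(values)
    # Sorting k distinct values costs O(k log k), which hurts when k is large.
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    if not ranked:
        return []
    top = ranked[0][1]
    return sorted(value for value, count in ranked if count == top)


def mode_by_single_scan(values):
    """Count everything, then walk the table once tracking the best count."""
    counts = Counter(values)
    best = 0
    modes = []
    for value, count in counts.items():
        if count > best:
            best = count
            modes = [value]      # new maximum: restart the list of modes
        elif count == best:
            modes.append(value)  # tie with the current maximum
    return sorted(modes)


if __name__ == "__main__":
    # Same shape as the "many #s" case: a few repeated values, many unique ones.
    data = [5] * 1000 + [3] * 1000 + [15] * 1000 + list(range(97000))
    print(mode_by_sorting_counts(data) == mode_by_single_scan(data))
```

With many distinct values the sort dominates, which would be consistent with the "many #s" rows improving the most once the full sort is skipped.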
Doesn't matter if the function is inlined or separate:

(In the earlier runs the counting code was copy/pasted inline; declaring it as cdef inline doesn't work in this case.)
›› ipython perf_test.ipy # separate func
---- Basics - int64 ----
100 length
10000 loops, best of 3: 81.7 µs per loop
1000 length
10000 loops, best of 3: 110 µs per loop
10000 length
1000 loops, best of 3: 313 µs per loop
100000 length
100 loops, best of 3: 2.41 ms per loop
100000 length with few #s
100 loops, best of 3: 2.57 ms per loop
100000 length with many #s
100 loops, best of 3: 4.82 ms per loop

---- Object type with strings ----
100 length
1000 loops, best of 3: 166 µs per loop
1000 length
1000 loops, best of 3: 277 µs per loop
10000 length
1000 loops, best of 3: 1.18 ms per loop
100000 length
100 loops, best of 3: 11.3 ms per loop
100000 length with few values
100 loops, best of 3: 11 ms per loop
100000 length with many values
10 loops, best of 3: 35.8 ms per loop

---- Timedelta type for [should use int64] ----
100 length
10000 loops, best of 3: 185 µs per loop
1000 length
1000 loops, best of 3: 198 µs per loop
10000 length
1000 loops, best of 3: 350 µs per loop
100000 length
100 loops, best of 3: 1.95 ms per loop
100000 length with few #s
100 loops, best of 3: 2.65 ms per loop
100000 length with many #s
100 loops, best of 3: 4.36 ms per loop

Side note: all of these timings were relatively consistent over multiple runs (I ran each about 3-5 times).

No need to optimize the DataFrame version: it just puts together Series, and that overhead is trivial.
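
For re-checking that consistency outside IPython, a rough stdlib stand-in for the "N loops, best of 3" figure might look like the sketch below. The helper names best_of and format_time are made up for this example, and unlike %timeit the loop count is fixed by the caller rather than chosen automatically.

```python
import timeit


def best_of(stmt, setup="pass", repeat=3, number=1000, namespace=None):
    """Return the best per-loop time in seconds over `repeat` runs."""
    totals = timeit.repeat(stmt, setup=setup, repeat=repeat,
                           number=number, globals=namespace)
    # Each entry is the total for `number` loops; report the fastest run.
    return min(totals) / number


def format_time(seconds):
    """Render a duration with the same units the tables above use."""
    for unit, scale in (("s", 1.0), ("ms", 1e-3), ("µs", 1e-6), ("ns", 1e-9)):
        if seconds >= scale:
            return "%.3g %s" % (seconds / scale, unit)
    return "%.3g ns" % (seconds / 1e-9)


if __name__ == "__main__":
    per_loop = best_of("sorted(data)",
                       setup="data = list(range(1000))[::-1]",
                       number=100)
    print("100 loops, best of 3: %s per loop" % format_time(per_loop))
```

Running the script a few times and comparing the printed values gives the same kind of sanity check as rerunning the .ipy files.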

Comparison to value_counts()

This is the output of running val_count_perf.ipy:
›› ipython val_count_perf.ipy
---- Basics - int64 ----
100 length
1000 loops, best of 3: 458 µs per loop
1000 length
1000 loops, best of 3: 528 µs per loop
10000 length
1000 loops, best of 3: 755 µs per loop
100000 length
100 loops, best of 3: 3.17 ms per loop
100000 length with few #s
100 loops, best of 3: 3.13 ms per loop
100000 length with many #s
10 loops, best of 3: 25.9 ms per loop

---- Object type with strings ----
100 length
1000 loops, best of 3: 668 µs per loop
1000 length
1000 loops, best of 3: 799 µs per loop
10000 length
100 loops, best of 3: 1.84 ms per loop
100000 length
100 loops, best of 3: 13.8 ms per loop
100000 length with few values
100 loops, best of 3: 13.2 ms per loop
100000 length with many values
10 loops, best of 3: 86 ms per loop

---- Timedelta type for [should use int64] ----
100 length
1000 loops, best of 3: 716 µs per loop
1000 length
1000 loops, best of 3: 810 µs per loop
10000 length
1000 loops, best of 3: 962 µs per loop
100000 length
100 loops, best of 3: 2.79 ms per loop
100000 length with few #s
100 loops, best of 3: 3.81 ms per loop
100000 length with many #s
10 loops, best of 3: 42.3 ms per loop

Abstracting the table count function causes little value_counts perf hit:

›› ipython val_count_perf.ipy # abstracted func
---- Basics - int64 ----
100 length
1000 loops, best of 3: 485 µs per loop
1000 length
1000 loops, best of 3: 535 µs per loop
10000 length
1000 loops, best of 3: 781 µs per loop
100000 length
100 loops, best of 3: 3.1 ms per loop
100000 length with few #s
100 loops, best of 3: 3.47 ms per loop
100000 length with many #s
10 loops, best of 3: 29.3 ms per loop

---- Object type with strings ----
100 length
1000 loops, best of 3: 666 µs per loop
1000 length
1000 loops, best of 3: 834 µs per loop
10000 length
1000 loops, best of 3: 1.89 ms per loop
100000 length
100 loops, best of 3: 14 ms per loop
100000 length with few values
100 loops, best of 3: 13.3 ms per loop
100000 length with many values
10 loops, best of 3: 86.3 ms per loop

---- Timedelta type for [should use int64] ----
100 length
1000 loops, best of 3: 758 µs per loop
1000 length
1000 loops, best of 3: 826 µs per loop
10000 length
1000 loops, best of 3: 961 µs per loop
100000 length
100 loops, best of 3: 2.84 ms per loop
100000 length with few #s
100 loops, best of 3: 3.72 ms per loop
100000 length with many #s
10 loops, best of 3: 42.8 ms per loop
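
To put a number on "little perf hit", the two runs can be compared directly. The sketch below uses only the slowest row of each section (100000 length with many #s / many values), copied from the two tables above; the dictionary keys and the percent_change helper are names chosen for this illustration.

```python
# Timings in ms from the two val_count_perf.ipy runs above
# (100000 length with many #s / many values).
before = {"int64": 25.9, "strings": 86.0, "timedelta section": 42.3}
after = {"int64": 29.3, "strings": 86.3, "timedelta section": 42.8}


def percent_change(old, new):
    """Relative change from old to new, in percent (positive means slower)."""
    return (new - old) / old * 100.0


for name in before:
    change = percent_change(before[name], after[name])
    print("%-18s %+.1f%%" % (name, change))
```

On these rows the int64 case slows down by roughly 13%, while the other two move by about 1% or less, which is within the run-to-run variation one would expect from timings like these.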


## test_perf.py
# Need to run in IPython
import pandas as pd
import numpy as np
import random
random.seed(1234)
np.random.seed(1234)

print("---- Basics - int64 ---- ")
print("100 length")
s1 = pd.Series(np.random.randint(-100, 100, 100))
get_ipython().magic(u'timeit s1.mode()')
print("1000 length")
s1 = pd.Series(np.random.randint(-100, 100, 1000))
get_ipython().magic(u'timeit s1.mode()')
print("10000 length")
s1 = pd.Series(np.random.randint(-100, 100, 10000))
get_ipython().magic(u'timeit s1.mode()')
print("100000 length")
s1 = pd.Series(np.random.randint(-100, 100, 100000))
get_ipython().magic(u'timeit s1.mode()')
print("100000 length with few #s")
arr = np.array([5] * 10000 + list(range(1000)) * 90)  # list() needed on Python 3
np.random.shuffle(arr)
s1 = pd.Series(arr)
get_ipython().magic(u'timeit s1.mode()')
print("100000 length with many #s")
arr = np.array([5] * 1000 + [3] * 1000 + [15] * 1000 + list(range(97000)))  # list() needed on Python 3
np.random.shuffle(arr)
s1 = pd.Series(arr)
get_ipython().magic(u'timeit s1.mode()')

print("\n---- Object type with strings ---- ")
letters = np.array(list('abcdefghijklmnopqrstuvwxyz0123456789!$#^'))
# generate 100000 random strings to use
strings = [''.join(np.random.choice(letters, np.random.randint(5,50))) for _ in range(100000)]
rand_strings = lambda n: np.random.choice(strings[:100], n)
print("100 length")
s1 = pd.Series(rand_strings(100))
get_ipython().magic(u'timeit s1.mode()')
print("1000 length")
s1 = pd.Series(rand_strings(1000))
get_ipython().magic(u'timeit s1.mode()')
print("10000 length")
s1 = pd.Series(rand_strings(10000))
get_ipython().magic(u'timeit s1.mode()')
print("100000 length")
s1 = pd.Series(rand_strings(100000))
get_ipython().magic(u'timeit s1.mode()')
print("100000 length with few values")
arr = np.array(['a'] * 10000 + list(rand_strings(90000)))
np.random.shuffle(arr)
s1 = pd.Series(arr)
get_ipython().magic(u'timeit s1.mode()')
print("100000 length with many values")
arr = np.array([rand_strings(None)] * 1000 + [rand_strings(None)] * 1000 + [rand_strings(None)] * 1000 + strings[3000:])
np.random.shuffle(arr)
s1 = pd.Series(arr)
get_ipython().magic(u'timeit s1.mode()')

print("\n---- Timedelta type for [should use int64] ----")

rand_time = lambda n, start=1, end=201: list(map(pd.Timestamp, np.random.randint(start, end, n)))

print("100 length")
s1 = pd.Series(rand_time(100))
get_ipython().magic(u'timeit s1.mode()')
print("1000 length")
s1 = pd.Series(rand_time(1000))
get_ipython().magic(u'timeit s1.mode()')
print("10000 length")
s1 = pd.Series(rand_time(10000))
get_ipython().magic(u'timeit s1.mode()')
print("100000 length")
s1 = pd.Series(rand_time(100000))
get_ipython().magic(u'timeit s1.mode()')
print("100000 length with few #s")
arr = np.array([pd.Timestamp(5)] * 10000 + list(map(pd.Timestamp, range(1000))) * 90)
np.random.shuffle(arr)
s1 = pd.Series(arr)
get_ipython().magic(u'timeit s1.mode()')
print("100000 length with many #s")
arr = np.array(list(map(pd.Timestamp, [5] * 1000 + [3] * 1000 + [15] * 1000 + list(range(97000)))))
np.random.shuffle(arr)
s1 = pd.Series(arr)
get_ipython().magic(u'timeit s1.mode()')

## val-count-perf.ipy
# Need to run in IPython
import pandas as pd
import numpy as np
import random
random.seed(1234)
np.random.seed(1234)

print("---- Basics - int64 ---- ")
print("100 length")
s1 = pd.Series(np.random.randint(-100, 100, 100))
get_ipython().magic(u'timeit s1.value_counts()')
print("1000 length")
s1 = pd.Series(np.random.randint(-100, 100, 1000))
get_ipython().magic(u'timeit s1.value_counts()')
print("10000 length")
s1 = pd.Series(np.random.randint(-100, 100, 10000))
get_ipython().magic(u'timeit s1.value_counts()')
print("100000 length")
s1 = pd.Series(np.random.randint(-100, 100, 100000))
get_ipython().magic(u'timeit s1.value_counts()')
print("100000 length with few #s")
arr = np.array([5] * 10000 + list(range(1000)) * 90)  # list() needed on Python 3
np.random.shuffle(arr)
s1 = pd.Series(arr)
get_ipython().magic(u'timeit s1.value_counts()')
print("100000 length with many #s")
arr = np.array([5] * 1000 + [3] * 1000 + [15] * 1000 + list(range(97000)))  # list() needed on Python 3
np.random.shuffle(arr)
s1 = pd.Series(arr)
get_ipython().magic(u'timeit s1.value_counts()')

print("\n---- Object type with strings ---- ")
letters = np.array(list('abcdefghijklmnopqrstuvwxyz0123456789!$#^'))
# generate 100000 random strings to use
strings = [''.join(np.random.choice(letters, np.random.randint(5,50))) for _ in range(100000)]
rand_strings = lambda n: np.random.choice(strings[:100], n)
print("100 length")
s1 = pd.Series(rand_strings(100))
get_ipython().magic(u'timeit s1.value_counts()')
print("1000 length")
s1 = pd.Series(rand_strings(1000))
get_ipython().magic(u'timeit s1.value_counts()')
print("10000 length")
s1 = pd.Series(rand_strings(10000))
get_ipython().magic(u'timeit s1.value_counts()')
print("100000 length")
s1 = pd.Series(rand_strings(100000))
get_ipython().magic(u'timeit s1.value_counts()')
print("100000 length with few values")
arr = np.array(['a'] * 10000 + list(rand_strings(90000)))
np.random.shuffle(arr)
s1 = pd.Series(arr)
get_ipython().magic(u'timeit s1.value_counts()')
print("100000 length with many values")
arr = np.array([rand_strings(None)] * 1000 + [rand_strings(None)] * 1000 + [rand_strings(None)] * 1000 + strings[3000:])
np.random.shuffle(arr)
s1 = pd.Series(arr)
get_ipython().magic(u'timeit s1.value_counts()')

print("\n---- Timedelta type for [should use int64] ----")

rand_time = lambda n, start=1, end=201: list(map(pd.Timestamp, np.random.randint(start, end, n)))

print("100 length")
s1 = pd.Series(rand_time(100))
get_ipython().magic(u'timeit s1.value_counts()')
print("1000 length")
s1 = pd.Series(rand_time(1000))
get_ipython().magic(u'timeit s1.value_counts()')
print("10000 length")
s1 = pd.Series(rand_time(10000))
get_ipython().magic(u'timeit s1.value_counts()')
print("100000 length")
s1 = pd.Series(rand_time(100000))
get_ipython().magic(u'timeit s1.value_counts()')
print("100000 length with few #s")
arr = np.array([pd.Timestamp(5)] * 10000 + list(map(pd.Timestamp, range(1000))) * 90)
np.random.shuffle(arr)
s1 = pd.Series(arr)
get_ipython().magic(u'timeit s1.value_counts()')
print("100000 length with many #s")
arr = np.array(list(map(pd.Timestamp, [5] * 1000 + [3] * 1000 + [15] * 1000 + list(range(97000)))))
np.random.shuffle(arr)
s1 = pd.Series(arr)
get_ipython().magic(u'timeit s1.value_counts()')