@podhmo
Created April 14, 2020 01:02
# 01iris.py: load iris via scikit-learn and summarize it with pandas
from sklearn import datasets
import pandas as pd
iris = datasets.load_iris()
df = pd.DataFrame(data=iris["data"], columns=iris["feature_names"])
print(df.describe())
# 02iris.py: load iris via vega_datasets (returns a pandas DataFrame)
from vega_datasets import data
df = data.iris()
print(df.describe())
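# 03iris.py: read the JSON file bundled with vega_datasets directly, without importing pandas
import importlib.util
import pathlib
import json
# find_spec locates the installed package without importing it
spec = importlib.util.find_spec("vega_datasets")
path = pathlib.Path(spec.submodule_search_locations[0]) / "_data/iris.json"
with path.open() as f:
    data = json.load(f)
print(sum(row["petalLength"] for row in data))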
$ make 00 01 02 03
time r CMD BATCH --vanilla --slave 00iris.r 00iris.r.out
        0.54 real         0.17 user         0.12 sys
time python 01iris.py  | tee 01iris.py.out
       sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)
count         150.000000        150.000000         150.000000        150.000000
mean            5.843333          3.057333           3.758000          1.199333
std             0.828066          0.435866           1.765298          0.762238
min             4.300000          2.000000           1.000000          0.100000
25%             5.100000          2.800000           1.600000          0.300000
50%             5.800000          3.000000           4.350000          1.300000
75%             6.400000          3.300000           5.100000          1.800000
max             7.900000          4.400000           6.900000          2.500000

real	0m2.241s
user	0m1.027s
sys	0m0.326s
time python 02iris.py  | tee 02iris.py.out
       petalLength  petalWidth  sepalLength  sepalWidth
count   150.000000  150.000000   150.000000  150.000000
mean      3.758000    1.199333     5.843333    3.057333
std       1.765298    0.762238     0.828066    0.435866
min       1.000000    0.100000     4.300000    2.000000
25%       1.600000    0.300000     5.100000    2.800000
50%       4.350000    1.300000     5.800000    3.000000
75%       5.100000    1.800000     6.400000    3.300000
max       6.900000    2.500000     7.900000    4.400000

real	0m0.748s
user	0m0.573s
sys	0m0.168s
time python 03iris.py  | tee 03iris.py.out
563.7000000000004

real    0m0.069s
user    0m0.037s
sys     0m0.023s
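
The gap between the three Python runs above may come largely from import cost, though the log itself does not measure that. A minimal sketch for checking it, assuming a hypothetical helper `time_import` and a placeholder module list (`json` is used so the sketch runs without extra packages; swap in `pandas`, `sklearn`, or `vega_datasets` to compare):

```python
import importlib
import time


def time_import(module_name: str) -> float:
    """Return the seconds spent importing module_name in this process.

    If the module is already in sys.modules the import is a cache hit,
    so the number is only meaningful for the first import in a fresh
    interpreter.
    """
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


# Placeholder list (an assumption, not part of the gist).
MODULES = ("json",)

if __name__ == "__main__":
    for name in MODULES:
        print(f"{name}: {time_import(name):.3f}s")
```

For a per-module breakdown without writing any code, `python -X importtime 01iris.py` (Python 3.7+) prints import timings to stderr.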
00:
	time r CMD BATCH --vanilla --slave $(shell echo $@*.r) $(shell echo $@*.r).out
01:
	time python $(shell echo $@*.py) | tee $(shell echo $@*.py).out
02:
	time python $(shell echo $@*.py) | tee $(shell echo $@*.py).out
03:
	time python $(shell echo $@*.py) | tee $(shell echo $@*.py).out
scikit-learn
pandas
vega_datasets