@pplonski
Created February 1, 2019 16:03
import openml
import pandas as pd
openml.config.apikey = "aaas8a89sd87as87d8as7d98a"  # placeholder API key; replace it with your own
dataset_id = 1590  # Adult data set, see https://www.openml.org/d/1590
# Get the data directly from OpenML
dataset = openml.datasets.get_dataset(dataset_id)
(X, y, categorical, names) = dataset.get_data(
    target=dataset.default_target_attribute,
    return_categorical_indicator=True,
    return_attribute_names=True,
)
# Build a pandas data frame with one column per attribute, plus the target
vals = {}
for i, name in enumerate(names):
    vals[name] = X[:, i]
vals["target"] = y
df = pd.DataFrame(vals)
# Print data frame
print(df)
# Save data frame to CSV file
df.to_csv("./adults.csv", index=False)
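
The column-by-column loop above can also be written as a single DataFrame construction. The following is a minimal sketch, assuming `X` is a dense 2-D NumPy array with one column per attribute name; the helper `to_frame` and the toy values are hypothetical and are not part of the original gist or of the OpenML data.

```python
import numpy as np
import pandas as pd


def to_frame(X, y, names, target_name="target"):
    """Build a DataFrame from a 2-D feature array, a target vector, and column names.

    Hypothetical helper for illustration; assumes X is dense and 2-D.
    """
    X = np.asarray(X)
    if X.ndim != 2 or X.shape[1] != len(names):
        raise ValueError("X must be 2-D with exactly one column per name")
    frame = pd.DataFrame(X, columns=list(names))
    frame[target_name] = np.asarray(y)  # append the target as the last column
    return frame


# Toy stand-in values, not taken from the OpenML data set.
X_demo = np.array([[39.0, 13.0], [50.0, 9.0]])
y_demo = np.array([0, 1])
demo = to_frame(X_demo, y_demo, ["age", "education-num"])
print(demo.columns.tolist())
```

With the variables from the script, the equivalent call would be `df = to_frame(X, y, names)`, which keeps the attribute order and puts the target last.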