@seahrh
Last active June 7, 2018 07:51
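import numpy as np
import pandas as pd

# Shuffle the rows of a DataFrame.
# reindex returns a new DataFrame, so assign the result.
cities = cities.reindex(np.random.permutation(cities.index))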
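# A minimal, self-contained sketch of the shuffle above. The small `cities`
# frame, its values, and the seed are hypothetical stand-ins chosen only for
# illustration; a seeded generator is used so the order is reproducible.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the `cities` DataFrame (illustrative values).
cities = pd.DataFrame({
    "city": ["A", "B", "C", "D"],
    "population": [100, 200, 300, 400],
})

# Seeded generator so the shuffle is reproducible (the seed is arbitrary).
rng = np.random.default_rng(seed=0)

# Permute the index labels, then reorder the rows to match.
shuffled = cities.reindex(rng.permutation(cities.index.to_numpy()))

# Same rows in a new order; each row keeps its original index label.
print(shuffled)

# pandas also offers a built-in way to get a shuffled copy.
sampled = cities.sample(frac=1, random_state=0)
```

# If a fresh 0..n-1 index is wanted afterwards, reset_index(drop=True) on the
# shuffled frame should do it.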
# Read data from Google Cloud Storage
california_housing_dataframe = pd.read_csv("https://storage.googleapis.com/mledu-datasets/california_housing_train.csv", sep=",")
# Convert a pandas DataFrame into a dict of NumPy arrays,
# where each key is a column name.
# Example:
# {'households': array([ 472., 463., 117., ..., 356., 359., 1453.]),
#  'median_income': array([1.4936, 1.82 , 1.6509, ..., 2.8462, 8.4016, 4.3644]),
#  'total_rooms': array([5612., 7650., 720., ..., 2019., 2648., 8507.]),
#  'housing_median_age': array([15., 19., 17., ..., 36., 33., 19.]),
#  'longitude': array([-114.31, -114.47, -114.56, ..., -121.39, -121.39, -121.39]),
#  'total_bedrooms': array([1283., 1901., 174., ..., 369., 357., 1470.])}
features = {key: np.array(value) for key, value in dict(features).items()}
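# A small runnable sketch of the conversion above, assuming `features` starts
# out as a DataFrame. The name `features_df` is hypothetical, and the three
# rows reuse the leading values from the example comment.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the input DataFrame.
features_df = pd.DataFrame({
    "total_rooms": [5612.0, 7650.0, 720.0],
    "households": [472.0, 463.0, 117.0],
})

# dict(DataFrame) maps each column name to its Series;
# np.array then turns each Series into a plain ndarray.
features = {key: np.array(value) for key, value in dict(features_df).items()}

print(features["total_rooms"])
```

# The index is dropped in the process: each array keeps only the column
# values, in row order.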