@aagnone3
Created October 30, 2020 00:14
sf_crime_3.py
import numpy as np
import pandas as pd

random_state = 42
train = pd.read_csv("/datasets/s3-data-bucket/train.csv")
train.drop_duplicates(inplace=True)
train.reset_index(inplace=True, drop=True)
print(f"Loaded the dataset of {train.shape[1]}-D features")
# the test set is loaded only to report its size, then freed
test = pd.read_csv("/datasets/s3-data-bucket/test.csv", index_col='Id')
print(f"# train examples: {len(train)}\n# test examples: {len(test)}")
del test
# remove clear outliers from the data set, allowing fast.ai to impute the values via `FillMissing` later on
# the dict form matches -120.5 only in column X and 90.0 only in column Y
train.replace({'X': -120.5, 'Y': 90.0}, np.nan, inplace=True)
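The outlier step above relies on the dict form of `DataFrame.replace`, which looks for a different value in each named column. A minimal sketch of that behavior follows, using a made-up three-row frame (`toy` and its coordinates are illustrative assumptions, not rows from `train.csv`):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the training data; the values are invented for illustration.
toy = pd.DataFrame({
    "X": [-122.42, -120.5, -122.40],
    "Y": [37.77, 90.0, 37.78],
})

# Dict form of `replace`: match -120.5 only in column X and 90.0 only in column Y,
# and put NaN in their place so a later imputation step can fill them.
cleaned = toy.replace({"X": -120.5, "Y": 90.0}, np.nan)

# Count the missing entries per column after the replacement.
n_missing = cleaned.isna().sum()
print(n_missing)
```

Only the second row matches, so each column ends up with one missing entry and the other coordinates are left untouched. A value of 90.0 appearing in column `X` would not be replaced, since the dict scopes each target value to its own column.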