@peacing
Last active March 26, 2021 11:57
example of bad data lake processing code
# Example: ingest_filepath = "s3://bucket/prod/users_dim/dt=2021-03-21/uuid.csv"
filepath_tokens = ingest_filepath.split('/')
file_suffix = ingest_filepath.split('.')[-1]
if file_suffix == 'csv':
    copy_to_data_lake(ingest_filepath)
else:  # non-CSV files go to the lake only if the path has at least 5 tokens
    if len(filepath_tokens) >= 5:
        copy_to_data_lake(ingest_filepath)
    else:
        move_somewhere_else(ingest_filepath)
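
For contrast, here is a minimal sketch of how the same routing decision might be written more explicitly, by parsing the S3 URI into named parts rather than counting path tokens. The function name `route_ingest_file`, the returned labels, and the `<env>/<table>/dt=<date>/<file>` key layout are assumptions for illustration and are not part of the original snippet. The sketch also assumes the intent is that only CSV files under a partitioned path belong in the lake, which is stricter than the either/or logic above.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def route_ingest_file(ingest_filepath: str) -> str:
    """Return "data_lake" or "elsewhere" for an S3 URI (hypothetical labels)."""
    parsed = urlparse(ingest_filepath)
    key = PurePosixPath(parsed.path.lstrip("/"))

    # Assumed key layout: <env>/<table>/dt=<date>/<file>
    has_expected_layout = (
        parsed.scheme == "s3"
        and len(key.parts) == 4
        and key.parts[2].startswith("dt=")
    )
    is_csv = key.suffix.lower() == ".csv"

    # Assumption: a file must satisfy both checks to be copied to the lake.
    return "data_lake" if has_expected_layout and is_csv else "elsewhere"
```

A caller could then dispatch on the returned label, for example calling `copy_to_data_lake(ingest_filepath)` when the result is `"data_lake"` and `move_somewhere_else(ingest_filepath)` otherwise.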