@rikturr
Created July 30, 2020 20:04
list_files
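import s3fs

# Anonymous access to the nyc-tlc bucket (no AWS credentials needed)
fs = s3fs.S3FileSystem(anon=True)

# fs.ls() returns paths without the scheme, so prefix each with "s3://";
# keep only yellow-taxi files whose path mentions 2017, 2018, or 2019
files = [f"s3://{x}" for x in fs.ls('s3://nyc-tlc/trip data/')
         if 'yellow' in x and ('2019' in x or '2018' in x or '2017' in x)]

# Column names of interest in the yellow-taxi trip data
cols = ['VendorID', 'tpep_pickup_datetime', 'tpep_dropoff_datetime', 'passenger_count', 'trip_distance',
        'RatecodeID', 'store_and_fwd_flag', 'PULocationID', 'DOLocationID', 'payment_type', 'fare_amount',
        'extra', 'mta_tax', 'tip_amount', 'tolls_amount', 'improvement_surcharge', 'total_amount']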
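
The filter in the list comprehension above can be tried without network access. The following is a minimal sketch, not part of the original gist: `select_yellow_files` is a hypothetical helper, and the sample keys are made-up names shaped like what `fs.ls()` might return (paths that start with the bucket name and have no `s3://` scheme).

```python
def select_yellow_files(keys, years=("2017", "2018", "2019")):
    """Return s3:// URLs for keys that mention 'yellow' and one of `years`.

    Hypothetical helper: same substring test as the gist's list comprehension,
    with the years passed in as a parameter.
    """
    return [
        f"s3://{key}"
        for key in keys
        if "yellow" in key and any(year in key for year in years)
    ]


# Made-up listing for illustration only; real bucket contents may differ.
sample_keys = [
    "nyc-tlc/trip data/yellow_tripdata_2019-01.csv",
    "nyc-tlc/trip data/green_tripdata_2019-01.csv",
    "nyc-tlc/trip data/yellow_tripdata_2016-12.csv",
    "nyc-tlc/trip data/yellow_tripdata_2017-06.csv",
]

selected = select_yellow_files(sample_keys)
for path in selected:
    print(path)
```

Because this is a plain substring match, a key that contains one of the year strings anywhere in its path would also pass; a stricter version could parse the year out of the file name instead.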