@peacing
Last active February 7, 2022 16:34
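import pyspark.pandas as ps

# data path in the cluster's distributed file system
loans_filename = '/FileStore/tables/loans.csv'

# pyspark.pandas.read_csv accepts only parse_dates=False (anything else
# raises ValueError), so load the columns first and convert them afterwards.
loans_df = ps.read_csv(
    loans_filename,
    header=None,
    names=['loan_amount', 'address', 'created_at', 'funded_at'],
)

# Convert the two timestamp columns to datetime64 explicitly.
for date_col in ['created_at', 'funded_at']:
    loans_df[date_col] = ps.to_datetime(loans_df[date_col])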