@al102964
Created August 25, 2020 19:21
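# Assumes a notebook environment (such as Databricks) where `spark` and
# `display` are already defined.
from pyspark.sql.types import (
    FloatType,
    StringType,
    StructField,
    StructType,
    TimestampType,
)

# Explicit schema for the tab-separated file, so no type inference is needed.
schema = StructType([
    StructField("central", StringType()),
    StructField("gateway", StringType()),
    StructField("board", StringType()),
    StructField("id_dispositivo", StringType()),
    StructField("valor", FloatType()),
    StructField("timestamp", TimestampType()),
    StructField("tarifa", StringType()),
])

# Read the CSV with the schema above; the first row is treated as a header.
df = (
    spark.read.format("csv")
    .schema(schema)
    .option("header", True)
    .option("sep", "\t")
    .load("/mnt/s3data/data/carga_historica_titulos.csv")
)

# Render the DataFrame as a table in the notebook.
display(df)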
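
Because the schema is supplied explicitly, it can be useful to confirm that the file's header row actually lists the same columns in the same order before loading. The following is a minimal sketch of such a check using only the Python standard library. The helper name `header_mismatches` and the sample header string are hypothetical and not part of the original gist; the column names are copied from the schema above.

```python
import csv
import io

# Column names copied from the schema, in order.
EXPECTED_COLUMNS = [
    "central", "gateway", "board", "id_dispositivo",
    "valor", "timestamp", "tarifa",
]


def header_mismatches(lines, expected=EXPECTED_COLUMNS, sep="\t"):
    """Return (position, expected, found) for each header column that differs.

    `lines` is any iterable of text lines, for example an open file.
    A missing column on either side is reported as None.
    (Hypothetical helper, not part of the original gist.)
    """
    reader = csv.reader(lines, delimiter=sep)
    header = next(reader, [])
    width = max(len(header), len(expected))
    padded_header = header + [None] * (width - len(header))
    padded_expected = list(expected) + [None] * (width - len(expected))
    return [
        (i, exp, found)
        for i, (exp, found) in enumerate(zip(padded_expected, padded_header))
        if exp != found
    ]


# Hypothetical sample header standing in for the first line of the real file.
sample = io.StringIO(
    "central\tgateway\tboard\tid_dispositivo\tvalor\ttimestamp\ttarifa\n"
)
print(header_mismatches(sample))
```

An empty list means the header matches the schema; any tuples returned point at the first columns to inspect.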