@1ambda
Created December 20, 2021 01:00
from pyspark.sql.functions import *
from pyspark.sql.types import *
# Download the CSV file into the current directory, then run the code below.
# The file has a `.csv` extension, but its fields are actually delimited by `\t` (tab).
# If you are practicing on Databricks, change the path to "/FileStore/tables/marketing_campaign.csv".
df = spark.read.load("./marketing_campaign.csv",
                     format="csv",
                     sep="\t",
                     inferSchema="true",
                     header="true")
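
The `sep="\t"` option matters because a CSV parser splits on commas by default. The following is a minimal standard-library sketch, independent of Spark, of what happens when tab-delimited text is parsed with the wrong delimiter. The sample data, its column names, and the `parse` helper are hypothetical and are not taken from `marketing_campaign.csv`.

```python
import csv
import io

# Hypothetical two-column, tab-delimited sample (NOT from marketing_campaign.csv).
SAMPLE = "id\tincome\n1\t100\n2\t200\n"


def parse(text, delimiter):
    """Parse delimited text into a list of rows, each a list of strings."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


# Comma delimiter: no commas are found, so each line stays a single field.
rows_comma = parse(SAMPLE, ",")

# Tab delimiter: each line splits into the two intended fields.
rows_tab = parse(SAMPLE, "\t")

print(len(rows_comma[0]), len(rows_tab[0]))
```

With the comma delimiter, the header row collapses into one field containing an embedded tab. With the tab delimiter, it splits into separate columns, which is the behavior the Spark reader needs here.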