@1ambda
Created December 20, 2021 11:34
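# Store the converted value in a new column named `date_joined` so it can be compared with the original `date_customer` column.
# 1. `to_date` parses the string and changes the column type to date.
# 2. `add_months` then adds 72 months (= 6 years) to the parsed value.
from pyspark.sql.functions import add_months, col, to_date

dfWithJoined = dfConverted2.withColumn("date_joined", add_months(to_date(col("date_customer"), "d-M-yyyy"), 72))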
dfWithJoined.select("date_customer", "date_joined").limit(5).show()
dfWithJoined.printSchema()
# `show()` output
+-------------+-----------+
|date_customer|date_joined|
+-------------+-----------+
|   04-09-2012| 2018-09-04|
|   08-03-2014| 2020-03-08|
|   21-08-2013| 2019-08-21|
|   10-02-2014| 2020-02-10|
|   19-01-2014| 2020-01-19|
+-------------+-----------+
# `printSchema()` output
root
|-- id: integer (nullable = true)
|-- year_birth: integer (nullable = true)
|-- education: string (nullable = true)
|-- count_kid: integer (nullable = true)
|-- count_teen: integer (nullable = true)
|-- date_customer: string (nullable = true)
|-- days_last_login: integer (nullable = true)
|-- count_children: integer (nullable = false)
|-- date_joined: date (nullable = true)
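# To sanity-check the `date_joined` values without a Spark session, the same
# transformation can be sketched in plain Python. This is only an approximation
# of what `to_date` and `add_months` do: the helper names `add_months_py` and
# `to_joined` are hypothetical, and the day-clamping rule is an assumption
# modeled on how recent Spark versions handle a day missing from the target month.

```python
import calendar
from datetime import date, datetime


def add_months_py(d: date, months: int) -> date:
    """Shift a date by whole months, clamping the day to the target month's length."""
    # Work in "months since year 0" so year rollover is handled by divmod.
    total = d.year * 12 + (d.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def to_joined(raw: str) -> date:
    """Parse a day-month-year string such as "04-09-2012" and add 72 months."""
    parsed = datetime.strptime(raw, "%d-%m-%Y").date()
    return add_months_py(parsed, 72)


# Sample values taken from the `date_customer` column shown above.
for raw in ["04-09-2012", "08-03-2014", "21-08-2013"]:
    print(raw, "->", to_joined(raw).isoformat())
```

# Each printed date should match the corresponding `date_joined` row in the `show()` output.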