dfInitial = spark.read(...)
dfFiltered = dfInitial.select(...).where(...).cache()  # call cache(); lazy, it only marks dfFiltered for caching
dfJoined = (...)
# Call an action.
# The transformations now run: dfFiltered is computed, and its
# partitions are cached in memory on the executors they are spread across.
dfJoined.write(...)