@1ambda
Created January 2, 2022 01:52
dfInitial = spark.read(...)
dfFiltered = dfInitial.select(...).where(...).cache() # cache() is called here (lazy: nothing is computed yet)
dfJoined = (...)
# Call an action.
# The transformations now run: dfFiltered is computed first,
# then cached in memory on the Executors that hold its partitions.
dfJoined.write(...)
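
The comments above describe lazy caching: cache() only marks the DataFrame, and the data is computed and stored when the first action runs. The following is a minimal pure-Python sketch of that behavior, not Spark code. The `LazyFrame` class, its `compute_count` counter, and the `collect` method are hypothetical names introduced only for illustration, and the sketch ignores partitioning across Executors.

```python
class LazyFrame:
    """Toy stand-in for a DataFrame (hypothetical, not the Spark API)."""

    def __init__(self, compute):
        self._compute = compute        # zero-argument function producing the rows
        self._cache_requested = False
        self._cached_rows = None
        self.compute_count = 0         # how many times the rows were really computed

    def cache(self):
        # Only marks the frame for caching; nothing is computed here.
        self._cache_requested = True
        return self

    def where(self, predicate):
        # Transformation: returns a new lazy frame that depends on this one.
        return LazyFrame(lambda: [r for r in self._rows() if predicate(r)])

    def _rows(self):
        if self._cached_rows is not None:
            return self._cached_rows   # served from the cache
        self.compute_count += 1
        rows = self._compute()
        if self._cache_requested:
            self._cached_rows = rows   # stored on first computation
        return rows

    def collect(self):
        # Action: triggers the computation.
        return list(self._rows())


df_initial = LazyFrame(lambda: list(range(10)))
df_filtered = df_initial.where(lambda x: x % 2 == 0).cache()
assert df_filtered.compute_count == 0  # cache() alone computed nothing

df_joined = df_filtered.where(lambda x: x > 2)
first = df_joined.collect()    # first action: df_filtered is computed and cached
second = df_joined.collect()   # second action: df_filtered comes from the cache
```

In this sketch `df_filtered.compute_count` stays at 1 after both actions, which mirrors why caching an intermediate DataFrame pays off only when more than one action or downstream branch reuses it.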