# Broadcast join: hint that graduateProgram is small enough to be copied to every executor
from pyspark.sql.functions import broadcast
joinExpr = person["grad_section_id"] == graduateProgram["id"]
person.join(broadcast(graduateProgram), joinExpr).explain()
# Output (truncated in the source):
# == Physical Plan ==
# AdaptiveSparkPlan isFinalPlan=false
# +- BroadcastHashJoin [grad_section_id#596L], [id#610L], Inner, BuildRight, false
#    :- Project [_1#586L AS id#594L, _2#587 AS name#595, _3#588L AS grad_section_id#596L, _4#589 AS subject_enrolled_id#597]
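The `BroadcastHashJoin` node in the plan above corresponds, conceptually, to building a hash table from the smaller (broadcast) side and probing it with each row of the larger side. The following is a minimal plain-Python sketch of that idea. It does not use Spark, and the `hash_join` helper, the `degree` column, and the sample rows are hypothetical, made up for illustration.

```python
from collections import defaultdict


def hash_join(left_rows, right_rows, left_key, right_key):
    """Inner-join two lists of dicts by hashing the right (build) side."""
    # Build phase: index the small side by its join key.
    index = defaultdict(list)
    for right in right_rows:
        index[right[right_key]].append(right)
    # Probe phase: look up each left row's key in the index.
    joined = []
    for left in left_rows:
        for right in index.get(left[left_key], []):
            joined.append((left, right))
    return joined


# Hypothetical sample data, not taken from the original snippets.
people = [
    {"id": 0, "name": "A", "grad_section_id": 0},
    {"id": 1, "name": "B", "grad_section_id": 1},
    {"id": 2, "name": "C", "grad_section_id": 5},
]
programs = [{"id": 0, "degree": "Masters"}, {"id": 1, "degree": "Ph.D."}]

pairs = hash_join(people, programs, "grad_section_id", "id")
```

Because only the small side is indexed, the large side never needs to be sorted or repartitioned, which is the motivation for the broadcast hint.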
# Same join without the broadcast hint: here Spark plans a sort-merge join
joinExpr = person["grad_section_id"] == graduateProgram["id"]
person.join(graduateProgram, joinExpr).explain()
# Output (truncated in the source):
# == Physical Plan ==
# AdaptiveSparkPlan isFinalPlan=false
# +- SortMergeJoin [grad_section_id#596L], [id#610L], Inner
#    :- Sort [grad_section_id#596L ASC NULLS FIRST], false, 0
#    :  +- Exchange hashpartitioning(grad_section_id#596L, 200), ENSURE_REQUIREMENTS, [id=#5805]
#    :     +- Project [_1#586L AS id#594L, _2#587 AS name#595, _3#588L AS grad_section_id#596L, _4#589 AS subject_enrolled_id#597]
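# Cross join: three ways to write it
# 1: pass "cross" as the join type (with a join condition, only the pairs that satisfy it are returned)
joinType = "cross"
graduateProgram.join(person, joinExpression, joinType)
# 2: call join() with no join condition
person.join(graduateProgram)
# 3: call crossJoin() explicitly
person.crossJoin(graduateProgram)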
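An unconditioned cross join pairs every row on the left with every row on the right, so the result has (left row count) x (right row count) rows. A minimal plain-Python sketch of that Cartesian product, using hypothetical sample values rather than Spark DataFrames:

```python
from itertools import product

# Hypothetical rows standing in for the two DataFrames.
people = ["A", "B", "C"]
programs = ["Masters", "Ph.D."]

# Every left row is paired with every right row.
cross = list(product(people, programs))
row_count = len(cross)  # 3 * 2 pairs
```

The multiplicative growth is why cross joins on large inputs are usually best avoided unless the full product is really needed.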
# Left anti join: rows of graduateProgram that have no match in person (left columns only)
joinType = "left_anti"
graduateProgram.join(person, joinExpression, joinType)
# Left semi join: rows of graduateProgram that have at least one match in person (left columns only)
joinType = "left_semi"
graduateProgram.join(person, joinExpression, joinType)
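The `left_semi` and `left_anti` join types act as filters on the left DataFrame rather than as joins that add columns. A minimal plain-Python sketch of both, assuming hypothetical sample rows (the `degree` and `name` values are made up) and no Spark:

```python
# Hypothetical rows standing in for graduateProgram (left) and person (right).
programs = [
    {"id": 0, "degree": "Masters"},
    {"id": 1, "degree": "Ph.D."},
    {"id": 2, "degree": "Masters"},
]
people = [
    {"name": "A", "grad_section_id": 0},
    {"name": "B", "grad_section_id": 1},
    {"name": "C", "grad_section_id": 1},
]

# Keys that appear on the right side.
matched_ids = {p["grad_section_id"] for p in people}

# left_semi: keep left rows that have a match; each left row appears at most once.
left_semi = [g for g in programs if g["id"] in matched_ids]

# left_anti: keep left rows that have no match.
left_anti = [g for g in programs if g["id"] not in matched_ids]
```

Note that program 1 appears only once in `left_semi` even though two people reference it, which is the difference from an inner join.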
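# Right outer join: every row of graduateProgram (right side), with nulls for person columns where nothing matches
joinType = "right_outer"
person.join(graduateProgram, joinExpression, joinType)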
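# Left outer join: every row of person (left side), with nulls for graduateProgram columns where nothing matches
joinType = "left_outer"
person.join(graduateProgram, joinExpression, joinType)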
# Full outer join: every row from both sides, with nulls wherever one side has no match
joinType = "outer"
person.join(graduateProgram, joinExpression, joinType)
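The outer join variants differ only in which unmatched rows they keep. A minimal plain-Python sketch of a full outer join follows, with `None` standing in for Spark's null. The `full_outer_join` helper and the sample rows are hypothetical and do not use Spark.

```python
def full_outer_join(left_rows, right_rows, left_key, right_key):
    """Return (left, right) pairs; an unmatched side is represented by None."""
    joined = []
    matched_right = set()
    for left in left_rows:
        found = False
        for i, right in enumerate(right_rows):
            if left[left_key] == right[right_key]:
                joined.append((left, right))
                matched_right.add(i)
                found = True
        if not found:
            joined.append((left, None))  # kept by left_outer and outer
    for i, right in enumerate(right_rows):
        if i not in matched_right:
            joined.append((None, right))  # kept by right_outer and outer
    return joined


# Hypothetical sample data; keys contain no None values
# (unlike SQL nulls, Python treats None == None as equal).
people = [
    {"name": "A", "grad_section_id": 0},
    {"name": "B", "grad_section_id": 1},
    {"name": "C", "grad_section_id": 5},
]
programs = [{"id": 0}, {"id": 1}, {"id": 2}]

rows = full_outer_join(people, programs, "grad_section_id", "id")
```

Dropping the `(None, right)` pairs from `rows` gives the left outer result, and dropping the `(left, None)` pairs gives the right outer result.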
# General form of a join
DATAFRAME_1.join(DATAFRAME_2, JOIN_CONDITION, JOIN_TYPE)
# .join() is a method available on every PySpark DataFrame
# DATAFRAME_2 is mandatory: the DataFrame to join with
# JOIN_CONDITION specifies the key column(s) or boolean expression used to match rows of the two DataFrames
# JOIN_TYPE specifies which kind of join to perform; it defaults to "inner" when omitted
# Inner join (the default when no join type is given): only rows whose keys match on both sides
joinExpression = person["grad_section_id"] == graduateProgram["id"]
person.join(graduateProgram, joinExpression)