A merge works the same way as a SQL join operation.
Select the type of merge through the how parameter.
An inner merge (how='inner') is the default option. The resulting dataframe has only the rows with the same values in the selected key column. If more than one key column is selected, all selected keys must match for the row to be kept.
With how='outer', the resulting dataframe has all rows from both dataframes.
Key values that appear in only one dataframe are kept, and the fields with no matching data are filled with NaN values.
With how='left', the resulting dataframe has all rows from the left dataframe, and non-matching rows from the right dataframe are discarded.
With how='right', the resulting dataframe has all rows from the right dataframe, and non-matching rows from the left dataframe are discarded.
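The four merge types can be compared side by side in a minimal sketch. The dataframes and the column names key, left_val and right_val are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data: "key" is the shared key column.
left = pd.DataFrame({"key": ["a", "b", "c"], "left_val": [1, 2, 3]})
right = pd.DataFrame({"key": ["b", "c", "d"], "right_val": [20, 30, 40]})

# Only keys present in both dataframes: b, c.
inner = pd.merge(left, right, on="key", how="inner")

# All keys from both dataframes: a, b, c, d (missing fields become NaN).
outer = pd.merge(left, right, on="key", how="outer")

# All keys from the left dataframe: a, b, c.
left_merge = pd.merge(left, right, on="key", how="left")

# All keys from the right dataframe: b, c, d.
right_merge = pd.merge(left, right, on="key", how="right")

print(outer)
```

Printing the other three results shows which rows each merge type keeps or discards.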
Select which columns will be the keys through the on parameter.
By default, all columns with the same name in both dataframes are used as keys.
To choose the keys yourself, pass the column names as a list to the on parameter.
Columns with the same name in both dataframes that are not specified as keys will be renamed with the _x and _y suffixes.
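A short sketch of the difference between the default keys and an explicit on list. The dataframes and the column names id, year and score are hypothetical.

```python
import pandas as pd

# Hypothetical data: both dataframes share the columns "id", "year" and "score".
left = pd.DataFrame({"id": [1, 2], "year": [2020, 2021], "score": [50, 70]})
right = pd.DataFrame({"id": [1, 2], "year": [2020, 2021], "score": [50, 10]})

# No "on": every shared column ("id", "year", "score") is used as a key,
# so only the row where all three values match is kept.
by_default = pd.merge(left, right)

# Explicit keys: "score" is shared but is not a key, so it is kept twice
# and renamed with the _x (left) and _y (right) suffixes.
by_keys = pd.merge(left, right, on=["id", "year"])

print(by_keys.columns.tolist())
```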
A join is the same as a left merge on indices. If you set the key columns as indices, join gives the same result as a left merge on those columns. It can be more efficient and faster than the equivalent merge command.
The right dataframe's key is always its index. By default the left dataframe is also matched on its index, but you can select key columns from the left dataframe with the on parameter.
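A minimal sketch of join next to its merge equivalent, using hypothetical dataframes. In pandas, DataFrame.join matches against the index of the dataframe passed as the argument, and on names a column of the calling dataframe.

```python
import pandas as pd

# Hypothetical data: both dataframes are indexed by a key label.
left = pd.DataFrame({"left_val": [1, 2, 3]}, index=["a", "b", "c"])
right = pd.DataFrame({"right_val": [20, 30, 40]}, index=["b", "c", "d"])

# Index-on-index join; the default is how="left".
joined = left.join(right)

# The equivalent left merge on indices.
merged = pd.merge(left, right, left_index=True, right_index=True, how="left")

# "on" names a column of the calling (left) dataframe, which is matched
# against the index of the other (right) dataframe.
left_with_col = pd.DataFrame({"key": ["a", "b", "c"], "left_val": [1, 2, 3]})
joined_on = left_with_col.join(right, on="key")

print(joined)
```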
Concatenation is like "stitching" or appending two dataframes.
Concatenation is done through appending rows by default.
The rows from one dataframe are "appended" to the other.
If the dataframes have different columns, the missing fields will be filled with NaN values.
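Row-wise concatenation can be sketched as follows. The dataframes top and bottom are hypothetical, and ignore_index is an optional extra not covered in the text above.

```python
import pandas as pd

# Hypothetical data: column "b" exists in both dataframes,
# columns "a" and "c" exist in only one of them.
top = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
bottom = pd.DataFrame({"b": [5, 6], "c": [7, 8]})

# Defaults: axis=0 (append rows) and join="outer" (keep every column).
stacked = pd.concat([top, bottom])

# The original row labels are kept (0, 1, 0, 1); ignore_index=True
# renumbers them 0..3 instead.
renumbered = pd.concat([top, bottom], ignore_index=True)

print(renumbered)
```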
To concatenate columns, select the option axis=1.
In this case, .concat() will "append" additional columns to the rows.
Rows that exist in only one of the dataframes will have their empty fields filled with NaN.
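A minimal sketch of column-wise concatenation, assuming two hypothetical dataframes whose row labels only partly overlap:

```python
import pandas as pd

# Hypothetical data: the dataframes share row labels 1 and 2 only.
left = pd.DataFrame({"a": [1, 2, 3]}, index=[0, 1, 2])
right = pd.DataFrame({"b": [10, 20, 30]}, index=[1, 2, 3])

# axis=1 places the columns side by side, aligning rows on the index.
# Row 0 has no "b" value and row 3 has no "a" value, so both get NaN.
side_by_side = pd.concat([left, right], axis=1)

print(side_by_side)
```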
Concatenation preserves all data by default, like a set union. This is the join='outer' option.
To eliminate data that does not have a match, like a set intersection, use the option join='inner'.
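The two join options can be compared in a short sketch, again with hypothetical dataframes that share only the column b:

```python
import pandas as pd

# Hypothetical data: only column "b" exists in both dataframes.
top = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
bottom = pd.DataFrame({"b": [5, 6], "c": [7, 8]})

# Set union of the columns: a, b, c (missing fields become NaN).
union = pd.concat([top, bottom], join="outer")

# Set intersection of the columns: only b survives, so no NaN is needed.
intersection = pd.concat([top, bottom], join="inner")

print(intersection)
```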
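There is also an .append() method, which is the same as .concat() with default values (outer join over the row axis). It was deprecated in pandas 1.4 and removed in pandas 2.0, so current code should use .concat() instead.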