Gustavo A. Castillo A. bob-ortiz

bob-ortiz / answer.md
Created July 12, 2023 05:54
Difference between a merge and a join in pandas

Note

The following answer was provided by Bart when I asked about the difference between merge and join. I thought it was good enough to share, and I also wanted to remember this important difference.

Output

The join() and merge() methods in pandas are used to combine two DataFrames. They are similar in that they both combine DataFrames based on common columns or indices. However, there are some key differences between the two methods:

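  • join() matches on the index by default. It always uses the index of the other DataFrame; setting the on parameter matches one or more columns of the calling DataFrame against that index, instead of the calling DataFrame's own index.
  • merge() can join on columns or indices. You can specify the columns to join on with the on parameter (or left_on and right_on when the column names differ), or you can join on the indices with the left_index and right_index parameters.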
  • join() by default performs a left join. This means that all rows from the left DataFrame are included in the result, even if there are no matching rows in the right DataFrame.
  • merge() by default performs an inner join. This means that only rows whose keys match in both DataFrames are included in the result.
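
The differences listed above can be seen in a small sketch. The two DataFrames, their column names (`key`, `x`, `y`), and their values are made up for illustration and are not part of the original answer; only the pandas calls themselves are real.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration.
left = pd.DataFrame({"key": ["a", "b", "c"], "x": [1, 2, 3]})
right = pd.DataFrame({"key": ["a", "b", "d"], "y": [10, 20, 40]})

# merge(): matches on the shared "key" column; the default is how="inner",
# so only keys present in both frames ("a" and "b") survive.
merged = left.merge(right, on="key")

# join(): always matches against the other frame's index, so "key" is moved
# into the index of `right` first. on="key" tells join() to use the "key"
# column of `left` instead of its index. The default is how="left",
# so "c" is kept and gets NaN for "y".
joined = left.join(right.set_index("key"), on="key")

# join() with no `on`: index-to-index matching, still a left join.
joined_on_index = left.set_index("key").join(right.set_index("key"))

# merge() can reproduce the join() behaviour when asked explicitly.
merged_left = left.merge(
    right.set_index("key"), left_on="key", right_index=True, how="left"
)

print(merged)
print(joined)
```

With this data, `merged` has two rows and `joined` has three, the extra row being the unmatched `"c"` from the left DataFrame. Passing `how="inner"` to `join()` or `how="left"` to `merge()` overrides either default.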