@sellisd
Created April 21, 2021 06:01
What types of joins are commonly used
# Install tool for cloning GitHub repositories
python -m pip install git+https://github.com/sellisd/gitrepodb.git
# Clone top 20 (based on star-rating) SQL repositories
gitrepodb init
gitrepodb query --project SQL --query 'language:SQL,sort:stars-desc:archived=False' --head 20
gitrepodb add --basepath ~/data/
gitrepodb download --project SQL
# Extract relevant lines (the glob is quoted so the shell does not expand it before find sees it)
find ~/data/ -name '*.sql' -exec grep 'JOIN' {} \; > join_types.dat
# Roughly count different types of joins:
# $ grep JOIN join_types.dat |wc -l
# 2744
# $ grep INNER join_types.dat |wc -l
# 172
# $ grep LEFT join_types.dat |wc -l
# 142
# $ grep OUTER join_types.dat |wc -l
# 63
# $ grep RIGHT join_types.dat |wc -l
# 4
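# The counts above overlap (e.g. a LEFT OUTER JOIN line is counted under both LEFT and OUTER)
# and miss lowercase keywords. A possible refinement, offered only as a sketch, is to extract
# each JOIN keyword together with its qualifiers and tally the distinct phrases.
# Assumptions: a grep that supports -o and -w (GNU or BSD); the sample lines below are
# hypothetical stand-ins for join_types.dat, not data from the repositories.

```shell
# Hypothetical sample standing in for join_types.dat
sample=$(mktemp)
cat > "$sample" <<'EOF'
SELECT * FROM a INNER JOIN b ON a.id = b.id
select * from a left outer join b on a.id = b.id
SELECT * FROM a LEFT JOIN b ON a.id = b.id
SELECT * FROM a JOIN b ON a.id = b.id
SELECT * FROM a inner join b ON a.id = b.id
EOF

# Extract each JOIN with its optional qualifiers (case-insensitive, whole words),
# normalise case and inner whitespace, then count each distinct phrase
join_counts=$(grep -owiE '((inner|left|right|full|cross|natural)[[:space:]]+)?(outer[[:space:]]+)?join' "$sample" \
  | tr '[:lower:]' '[:upper:]' \
  | sed -E 's/[[:space:]]+/ /g' \
  | sort | uniq -c | sort -rn)

echo "$join_counts"
rm -f "$sample"
```

# To try this on the real data, replace "$sample" with join_types.dat. A bare JOIN is tallied
# separately from INNER JOIN here, even though most SQL dialects treat the two as equivalent.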