@bh1995
Created January 12, 2021 19:09
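# Spark SQL query: count listings and distinct neighbourhoods per city,
# ordered by number of listings (largest first).
# Assumes `spark` is an active SparkSession and `listings_df2` is
# registered as a table or temporary view.
listing_sql = """
SELECT city,
       COUNT(id) AS total_listing,
       COUNT(DISTINCT neighbourhood_cleansed) AS total_neighbourhood
FROM listings_df2
GROUP BY city
ORDER BY total_listing DESC
"""
# Cache the result so later actions on it do not recompute the aggregation
listing_total = spark.sql(listing_sql).cache()
# Show the first 30 rows
listing_total.show(30)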
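
To see what the query returns without a Spark session, here is a minimal sketch that runs the same SQL against a tiny in-memory table. It uses Python's built-in `sqlite3` as a stand-in for Spark SQL (an assumption made purely so the example is self-contained), and the sample rows are made up for illustration; they are not taken from the real listings data.

```python
import sqlite3

# Stand-in for Spark: an in-memory SQLite database (assumption, not the gist's setup)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE listings_df2 (id INTEGER, city TEXT, neighbourhood_cleansed TEXT)"
)

# Hypothetical sample rows, invented for this illustration
rows = [
    (1, "Amsterdam", "Centrum"),
    (2, "Amsterdam", "Centrum"),
    (3, "Amsterdam", "Noord"),
    (4, "Berlin", "Mitte"),
    (5, "Berlin", "Mitte"),
]
conn.executemany("INSERT INTO listings_df2 VALUES (?, ?, ?)", rows)

# Same query text as in the gist
listing_sql = """SELECT city, count(id) as total_listing, count(DISTINCT neighbourhood_cleansed) as total_neighbourhood
FROM listings_df2
GROUP BY city
ORDER BY total_listing DESC"""

result = conn.execute(listing_sql).fetchall()
for row in result:
    print(row)
```

Each output row is `(city, total_listing, total_neighbourhood)`: `count(id)` counts every listing in the city, while `count(DISTINCT neighbourhood_cleansed)` counts each neighbourhood only once.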