@bh1995
Created January 12, 2021 20:59
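# Reviews of Beijing listings, one row per review, oldest first.
# The WHERE filter on b.city discards unmatched rows, so a plain JOIN
# returns the same rows as the original LEFT JOIN.
beijing_popularity_sql = """
SELECT a.date, b.city, b.neighbourhood_cleansed
FROM reviews_df a
JOIN listings_df2 b
  ON a.listing_id = b.id
WHERE b.city = 'Beijing'
ORDER BY a.date ASC
"""
# Assumes an active SparkSession named `spark` with reviews_df and
# listings_df2 registered as temporary views.
beijing_popularity = spark.sql(beijing_popularity_sql).cache()
beijing_popularity_pd = beijing_popularity.toPandas()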
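# Plot the number of reviews over time as a proxy for popularity
import matplotlib.pyplot as plt
import pandas as pd

# Parse the dates so the histogram bins span time rather than raw strings
review_dates = pd.to_datetime(beijing_popularity_pd['date'])
plt.figure(figsize=(16, 7))  # set figure size
plt.hist(review_dates, bins=50, alpha=0.5, color='red', edgecolor='white', linewidth=1.2)
plt.xlabel('Date', size=15)
plt.ylabel('Popularity (number of reviews)', size=15)
plt.title('Histogram of Beijing Popularity (Reviews) Over Time', size=15)
plt.show()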