plot_times.py
import time

import pandas as pd
import plotly.express as px
from joblib import Parallel, delayed

# movies_list, get_html_by_movie_id and scrape_all_titles are assumed to be
# defined elsewhere; they are not part of this snippet.
new_movies_list = movies_list * 8
times_taken = []
for i in range(50, len(new_movies_list), 50):
    print(i)
    movies_to_process = new_movies_list[:i]
    # Multiprocess:
    s = time.perf_counter()
    result_multiprocess = Parallel(n_jobs=8)(
        delayed(get_html_by_movie_id)(movie_id) for movie_id in movies_to_process
    )
    time_joblib = time.perf_counter() - s
    # Asyncio (top-level await works in Jupyter/IPython; a plain script needs asyncio.run)
    s = time.perf_counter()
    result_asyncio = await scrape_all_titles(movies_to_process)
    time_asyncio = time.perf_counter() - s
    times_taken.append([i, "Joblib", time_joblib])
    times_taken.append([i, "Asyncio", time_asyncio])
timedf = pd.DataFrame(times_taken, columns=['num_movies', 'process', 'time_taken'])
fig = px.line(timedf, x='num_movies', y='time_taken', color='process')
fig.show()
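
The snippet above relies on top-level `await`, which only works inside a notebook or IPython session. A minimal sketch of the same timing loop as a plain script might look like the following. It uses only the standard library, and every name in it is a hypothetical stand-in: `fetch_blocking` and `fetch_all_async` take the place of `get_html_by_movie_id` and `scrape_all_titles`, sleeps simulate the network, and a `ThreadPoolExecutor` is swapped in for joblib's `Parallel`, so the timings are illustrative only.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_blocking(item_id):
    """Hypothetical stand-in for a blocking fetch such as get_html_by_movie_id."""
    time.sleep(0.01)  # simulate network latency
    return f"html-{item_id}"


async def fetch_async(item_id):
    """Hypothetical stand-in for a single asynchronous request."""
    await asyncio.sleep(0.01)  # simulate network latency without blocking
    return f"html-{item_id}"


async def fetch_all_async(item_ids):
    """Hypothetical stand-in for scrape_all_titles: run all requests concurrently."""
    return await asyncio.gather(*(fetch_async(i) for i in item_ids))


def benchmark(item_ids, step, workers=8):
    """Return [batch_size, label, seconds] rows, the same shape as times_taken."""
    rows = []
    for n in range(step, len(item_ids) + 1, step):
        batch = item_ids[:n]

        # Thread pool (used here in place of joblib's Parallel)
        start = time.perf_counter()
        with ThreadPoolExecutor(max_workers=workers) as pool:
            list(pool.map(fetch_blocking, batch))
        rows.append([n, "Threads", time.perf_counter() - start])

        # asyncio.run replaces the notebook-only top-level await
        start = time.perf_counter()
        asyncio.run(fetch_all_async(batch))
        rows.append([n, "Asyncio", time.perf_counter() - start])
    return rows


if __name__ == "__main__":
    for row in benchmark(list(range(40)), step=20):
        print(row)
```

The rows returned by `benchmark` can be passed to `pd.DataFrame(..., columns=['num_movies', 'process', 'time_taken'])` and plotted exactly as in the original snippet.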