@hannesdatta
Created September 2, 2022 11:24
Solution to exercise 3.9 in my Python Bootcamp Tutorial (https://odcm.hannesdatta.com/docs/tutorials/pythonbootcamp/)
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 3.9 Tying things together\n",
"\n",
"Now it's your turn. Use the concepts from above to...\n",
"\n",
"- Create an array, holding ten subreddit names of your choice\n",
"- Write a function that returns as a dictionary the following data points from the about page of a subreddit: `display_name`, `title`, `subscribers`, and the date of creation, `created` (e.g., this is the link to the viewable about page for the [subreddit \"University\"](https://www.reddit.com/r/University/about), and this is the link to the [JSON version of the same page](https://www.reddit.com/r/University/about/.json)).\n",
"- Write a loop to retrieve data for the ten subreddits, and store the data in a new-line separated JSON file called `my_first_web_data.json`.\n",
"\n",
"<div class=\"alert alert-block alert-info\"><b>Tips:</b>\n",
" \n",
"<ul>\n",
" <li>Did you know you can \"look\" at the API output directly in Firefox or Chrome? Just open the URL that is called for a particular subreddit in your browser. Try it with <a href='https://www.reddit.com/r/University/about.json'>this one first (click)!</a></li>\n",
" <li>You can use <code>f.write</code> multiple times in your code. To write a new line to the file, use <code>f.write('\\n')</code>.</li>\n",
" <li>Please pay attention to where you open the file for the first time, and how (<code>'a'</code> vs. <code>'w'</code>)</li>\n",
" \n",
"</ul> \n",
" \n",
"</div>\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__Solution__"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"# import relevant packages\n",
"import requests\n",
"import json\n",
"\n",
"subreddits = ['skateboarding', 'climbing', 'tennis']\n",
"\n",
"# function to retrieve some data from reddit\n",
"def get_data(subreddit):\n",
" url = 'https://www.reddit.com/r/' + subreddit + '/about.json'\n",
" print(url)\n",
" content = requests.get(url, headers = {'User-agent': 'I am learning Python.'}).json()\n",
"\n",
" result = {\"display_name\": content['data']['display_name'],\n",
" \"title\": content['data']['title'],\n",
" \"subscribers\": content['data']['subscribers'],\n",
" \"timestamp\": content['data']['created']}\n",
" \n",
" return(result)"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"skateboarding\n",
"https://www.reddit.com/r/skateboarding/about.json\n",
"climbing\n",
"https://www.reddit.com/r/climbing/about.json\n",
"tennis\n",
"https://www.reddit.com/r/tennis/about.json\n"
]
}
],
"source": [
"# write data\n",
"f=open('my_data.json','w',encoding='utf-8')\n",
"\n",
"# loop through all subreddits\n",
"for subreddit in subreddits:\n",
" print(subreddit)\n",
" f.write(json.dumps(get_data(subreddit)))\n",
" f.write('\\n')\n",
"\n",
"# close data file\n",
"f.close()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.3"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
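The tip about `'a'` vs. `'w'` can be tried out without calling the Reddit API at all. The sketch below is only an illustration, not part of the original solution: the helper names `write_records` and `read_records`, the made-up sample records, and the temporary file `demo.json` are all assumptions introduced here. It writes the same two records several times and counts how many lines end up in the file.

```python
import json
import os
import tempfile


def write_records(path, records, mode):
    """Write each record as one JSON object per line, using the given file mode."""
    with open(path, mode, encoding='utf-8') as f:
        for record in records:
            f.write(json.dumps(record))
            f.write('\n')


def read_records(path):
    """Read a new-line separated JSON file back into a list of dictionaries."""
    with open(path, 'r', encoding='utf-8') as f:
        return [json.loads(line) for line in f if line.strip()]


# made-up records standing in for the dictionaries returned by get_data()
sample = [{'display_name': 'example_a', 'subscribers': 1},
          {'display_name': 'example_b', 'subscribers': 2}]

# hypothetical file in a fresh temporary directory, so nothing is overwritten
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.json')

write_records(demo_path, sample, 'w')  # 'w' creates the file and writes 2 lines
write_records(demo_path, sample, 'w')  # 'w' again truncates first: still 2 lines
count_after_w = len(read_records(demo_path))

write_records(demo_path, sample, 'a')  # 'a' appends to what is there: 4 lines
count_after_a = len(read_records(demo_path))

print(count_after_w, count_after_a)
```

Opening the file with `'w'` inside the loop would therefore keep only the last subreddit, while opening it with `'a'` adds duplicate lines each time the notebook cell is re-run. Opening it once with `'w'` before the loop avoids both problems.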