Last active
April 22, 2024 14:04
Summarise Abstracts from Proposals
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"id": "599074dc", | |
"metadata": {}, | |
"source": [ | |
"### Fetch Schedule info with Pretalx API" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 1, | |
"id": "6d4e8695", | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"import requests\n", | |
"SCHEDULE_URL = \"https://program.europython.eu/europython-2023/schedule/export/schedule.json\"\n", | |
"reponse = requests.get(SCHEDULE_URL)\n", | |
"all_proposals = reponse.json()" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 2, | |
"id": "e942c154", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Processing Day 2023-07-17\n", | |
"Processing Day 2023-07-18\n", | |
"Processing Day 2023-07-19\n", | |
"Processing Day 2023-07-20\n", | |
"Processing Day 2023-07-21\n", | |
"Processing Day 2023-07-22\n", | |
"Processing Day 2023-07-23\n" | |
] | |
} | |
], | |
"source": [ | |
"PROCESS_TYPE = (\"Talk\",\"Tutorial\")\n", | |
"processed_proposals = []\n", | |
"titled_processed = []\n", | |
"for proposal_days in all_proposals[\"schedule\"][\"conference\"][\"days\"]:\n", | |
" print(f\"Processing Day {proposal_days['date']}\")\n", | |
" for room, proposals in proposal_days['rooms'].items():\n", | |
" date_wise_proposals = {}\n", | |
" for proposal in proposals:\n", | |
" _proposal = {}\n", | |
" if proposal.get('persons') and proposal[\"type\"] in PROCESS_TYPE:\n", | |
" _proposal[\"title\"] = proposal[\"title\"]\n", | |
" if _proposal[\"title\"] in titled_processed:\n", | |
" continue\n", | |
" else:\n", | |
" titled_processed.append(_proposal[\"title\"])\n", | |
" _proposal[\"slug\"] = proposal[\"slug\"]\n", | |
" _proposal[\"url\"] = proposal[\"url\"]\n", | |
" _proposal[\"type\"] = proposal[\"type\"]\n", | |
" _proposal[\"speaker\"] = ([i['public_name'] for i in proposal['persons']])\n", | |
" _proposal[\"abstract\"] = proposal['abstract']\n", | |
" processed_proposals.append(_proposal)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 3, | |
"id": "2b815c0e", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"141" | |
] | |
}, | |
"execution_count": 3, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"len(processed_proposals)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "8319dd50", | |
"metadata": {}, | |
"source": [ | |
"### Summarise Schedule Abstract" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 4, | |
"id": "ed027c1a", | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"from transformers import pipeline\n", | |
"summarizer = pipeline(\"summarization\", model=\"knkarthick/MEETING_SUMMARY\")" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 5, | |
"id": "a461bb27", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"processed Asyncio without Asyncio\n", | |
"processed Geospatial Data Processing in Python: A Comprehensive Tutorial\n", | |
"processed How to MLOps: Experiment tracking & deployment ๐\n", | |
"processed Build, Serve, and Deploy a Fast, Production-Ready API with Python and Robyn\n", | |
"processed Decorators - A Deep Dive\n" | |
] | |
}, | |
{ | |
"name": "stderr", | |
"output_type": "stream", | |
"text": [ | |
"\n", | |
"KeyboardInterrupt\n", | |
"\n" | |
] | |
} | |
], | |
"source": [ | |
"tweets = []\n", | |
"YEAR = 2024\n", | |
"for processed_proposal in processed_proposals:\n", | |
" abstract = summarizer(processed_proposal['abstract'])[0]['summary_text']\n", | |
" speakers = \", \".join(processed_proposal[\"speaker\"]) +\"'s\"\n", | |
" link = processed_proposal[\"url\"]\n", | |
" ptype = processed_proposal[\"type\"].lower()\n", | |
" tweets.append(f\"Check out {speakers} {ptype} at #EuroPython{YEAR}, in their words: {abstract} {link} ๐.\")\n", | |
" print(f\"processed {processed_proposal['title']}\")" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 6, | |
"id": "7dd1c9be", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Check out Yan Yanchii's tutorial at #EuroPython2024, in their words: In this tutorial, we will demystify the builtin module of the Python Standard Library and build a scheduler and an asynchronous http proxy. https://program.europython.eu/europython-2023/talk/8XKCB8/ ๐. \n", | |
"\n", | |
"Check out Martin Christen's tutorial at #EuroPython2024, in their words: In this tutorial, you will learn about the various Python modules for processing geospatial data. https://program.europython.eu/europython-2023/talk/3HDWUZ/ ๐. \n", | |
"\n", | |
"Check out Jeroen Overschie, Yke Rusticus's tutorial at #EuroPython2024, in their words: In this tutorial, you will learn about MLOps and take your first steps in a hands-on way. https://program.europython.eu/europython-2023/talk/PNYMHE/ ๐. \n", | |
"\n", | |
"Check out Sanskar Jethi's tutorial at #EuroPython2024, in their words: How to build fast, production-ready APIs using Robyn, a web framework for Python. https://program.europython.eu/europython-2023/talk/SEBRJA/ ๐. \n", | |
"\n", | |
"Check out Mike Mรผller's tutorial at #EuroPython2024, in their words: This tutorial is an in-depth introduction to decorators. Python offers decorator to implement re-usable code for cross-cutting task. https://program.europython.eu/europython-2023/talk/BGEYP7/ ๐. \n", | |
"\n" | |
] | |
} | |
], | |
"source": [ | |
"for tweet in tweets:\n", | |
" print(tweet, \"\\n\")" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 7, | |
"id": "38725140", | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"import csv\n", | |
"with open('tweets.csv', 'w', newline='') as file:\n", | |
" writer = csv.writer(file)\n", | |
" writer.writerow(tweets)" | |
] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Python 3 (ipykernel)", | |
"language": "python", | |
"name": "python3" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 3 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython3", | |
"version": "3.8.18" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 5 | |
} |
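
The tweet-building and CSV cells above can be sketched as two small functions that run without downloading the model. This is a minimal sketch under stated assumptions, not the notebook's own code: `build_tweet`, `write_tweets`, `first_sentence`, and the sample proposal (including its speaker names and `example.org` URL) are hypothetical, the trailing emoji from the notebook's template is omitted, and writing one tweet per CSV row is an assumption about the intended layout.

```python
import csv
import io


def build_tweet(proposal, summarize, year):
    """Format one announcement, similar in shape to the notebook's f-string."""
    speakers = ", ".join(proposal["speaker"]) + "'s"
    ptype = proposal["type"].lower()
    summary = summarize(proposal["abstract"])
    return (
        f"Check out {speakers} {ptype} at #EuroPython{year}, "
        f"in their words: {summary} {proposal['url']}"
    )


def write_tweets(tweets, file):
    """Write one tweet per CSV row (assumed layout, see the note above)."""
    writer = csv.writer(file)
    for tweet in tweets:
        writer.writerow([tweet])


def first_sentence(text):
    """Hypothetical stand-in for the summarizer: keep only the first sentence."""
    return text.split(". ")[0].rstrip(".") + "."


# Hypothetical proposal with the same keys the notebook collects.
sample = {
    "title": "Decorators - A Deep Dive",
    "type": "Tutorial",
    "speaker": ["Ada Example", "Bob Example"],
    "abstract": "This tutorial introduces decorators. It then covers advanced uses.",
    "url": "https://example.org/talk/ABC123/",
}

tweet = build_tweet(sample, first_sentence, 2024)

# Round-trip through an in-memory file to check the row layout.
buffer = io.StringIO()
write_tweets([tweet, tweet], buffer)
rows = list(csv.reader(io.StringIO(buffer.getvalue())))
```

With the real pipeline, the `summarize` argument could be `lambda text: summarizer(text)[0]['summary_text']`, which mirrors the call used in the notebook. Passing the summarizer in as a function keeps the formatting logic separate from the slow model call.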