@Sangarshanan
Last active April 22, 2024 14:04
Summarise Abstracts from Proposals
{
"cells": [
{
"cell_type": "markdown",
"id": "599074dc",
"metadata": {},
"source": [
"### Fetch Schedule info with Pretalx API"
]
},
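{
"cell_type": "code",
"execution_count": 1,
"id": "6d4e8695",
"metadata": {},
"outputs": [],
"source": [
"import requests\n",
"\n",
"# Schedule export published by the conference's Pretalx instance\n",
"SCHEDULE_URL = \"https://program.europython.eu/europython-2023/schedule/export/schedule.json\"\n",
"response = requests.get(SCHEDULE_URL)\n",
"response.raise_for_status()  # stop early on an HTTP error instead of parsing an error page\n",
"all_proposals = response.json()"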
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "e942c154",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Processing Day 2023-07-17\n",
"Processing Day 2023-07-18\n",
"Processing Day 2023-07-19\n",
"Processing Day 2023-07-20\n",
"Processing Day 2023-07-21\n",
"Processing Day 2023-07-22\n",
"Processing Day 2023-07-23\n"
]
}
],
"source": [
"PROCESS_TYPE = (\"Talk\", \"Tutorial\")\n",
"processed_proposals = []\n",
"titled_processed = set()  # titles already seen, so repeated sessions are skipped\n",
"for proposal_days in all_proposals[\"schedule\"][\"conference\"][\"days\"]:\n",
"    print(f\"Processing Day {proposal_days['date']}\")\n",
"    for room, proposals in proposal_days['rooms'].items():\n",
"        for proposal in proposals:\n",
"            # Keep only sessions that have speakers and are of a wanted type\n",
"            if proposal.get('persons') and proposal[\"type\"] in PROCESS_TYPE:\n",
"                if proposal[\"title\"] in titled_processed:\n",
"                    continue\n",
"                titled_processed.add(proposal[\"title\"])\n",
"                _proposal = {}\n",
"                _proposal[\"title\"] = proposal[\"title\"]\n",
"                _proposal[\"slug\"] = proposal[\"slug\"]\n",
"                _proposal[\"url\"] = proposal[\"url\"]\n",
"                _proposal[\"type\"] = proposal[\"type\"]\n",
"                _proposal[\"speaker\"] = [i['public_name'] for i in proposal['persons']]\n",
"                _proposal[\"abstract\"] = proposal['abstract']\n",
"                processed_proposals.append(_proposal)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "2b815c0e",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"141"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(processed_proposals)"
]
},
{
"cell_type": "markdown",
"id": "8319dd50",
"metadata": {},
"source": [
"### Summarise Schedule Abstract"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "ed027c1a",
"metadata": {},
"outputs": [],
"source": [
"from transformers import pipeline\n",
"summarizer = pipeline(\"summarization\", model=\"knkarthick/MEETING_SUMMARY\")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "a461bb27",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"processed Asyncio without Asyncio\n",
"processed Geospatial Data Processing in Python: A Comprehensive Tutorial\n",
"processed How to MLOps: Experiment tracking & deployment 📊\n",
"processed Build, Serve, and Deploy a Fast, Production-Ready API with Python and Robyn\n",
"processed Decorators - A Deep Dive\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n",
"KeyboardInterrupt\n",
"\n"
]
}
],
"source": [
"tweets = []\n",
"YEAR = 2024\n",
"for processed_proposal in processed_proposals:\n",
"    abstract = summarizer(processed_proposal['abstract'])[0]['summary_text']\n",
"    speakers = \", \".join(processed_proposal[\"speaker\"]) + \"'s\"\n",
"    link = processed_proposal[\"url\"]\n",
"    ptype = processed_proposal[\"type\"].lower()\n",
"    tweets.append(f\"Check out {speakers} {ptype} at #EuroPython{YEAR}, in their words: {abstract} {link} 🐍.\")\n",
"    print(f\"processed {processed_proposal['title']}\")"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "7dd1c9be",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Check out Yan Yanchii's tutorial at #EuroPython2024, in their words: In this tutorial, we will demystify the builtin module of the Python Standard Library and build a scheduler and an asynchronous http proxy. https://program.europython.eu/europython-2023/talk/8XKCB8/ 🐍. \n",
"\n",
"Check out Martin Christen's tutorial at #EuroPython2024, in their words: In this tutorial, you will learn about the various Python modules for processing geospatial data. https://program.europython.eu/europython-2023/talk/3HDWUZ/ 🐍. \n",
"\n",
"Check out Jeroen Overschie, Yke Rusticus's tutorial at #EuroPython2024, in their words: In this tutorial, you will learn about MLOps and take your first steps in a hands-on way. https://program.europython.eu/europython-2023/talk/PNYMHE/ 🐍. \n",
"\n",
"Check out Sanskar Jethi's tutorial at #EuroPython2024, in their words: How to build fast, production-ready APIs using Robyn, a web framework for Python. https://program.europython.eu/europython-2023/talk/SEBRJA/ 🐍. \n",
"\n",
"Check out Mike Müller's tutorial at #EuroPython2024, in their words: This tutorial is an in-depth introduction to decorators. Python offers decorator to implement re-usable code for cross-cutting task. https://program.europython.eu/europython-2023/talk/BGEYP7/ 🐍. \n",
"\n"
]
}
],
"source": [
"for tweet in tweets:\n",
"    print(tweet, \"\\n\")"
]
},
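{
"cell_type": "code",
"execution_count": 7,
"id": "38725140",
"metadata": {},
"outputs": [],
"source": [
"import csv\n",
"\n",
"# Write one tweet per row; UTF-8 keeps the emoji and accented names intact\n",
"with open('tweets.csv', 'w', newline='', encoding='utf-8') as file:\n",
"    writer = csv.writer(file)\n",
"    writer.writerows([tweet] for tweet in tweets)"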
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.18"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
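The filtering step in the notebook above can also be written as a small standalone function, which makes it easy to try without hitting the network. This is only a sketch: the function name `extract_proposals` and the miniature `sample_schedule` (titles, speakers, example.org URLs) are made up for illustration, and the dictionary layout is assumed to match the one the notebook reads from the schedule export.

```python
def extract_proposals(schedule, allowed_types=("Talk", "Tutorial")):
    """Flatten a schedule dict into a list of unique talks/tutorials.

    Assumes the layout the notebook reads:
    schedule["schedule"]["conference"]["days"][i]["rooms"][room] -> list of sessions.
    """
    seen_titles = set()
    proposals = []
    for day in schedule["schedule"]["conference"]["days"]:
        for sessions in day["rooms"].values():
            for session in sessions:
                # Skip breaks and other entries without speakers
                if not session.get("persons"):
                    continue
                # Skip session types we do not want to announce
                if session["type"] not in allowed_types:
                    continue
                # Skip sessions that were already collected under the same title
                if session["title"] in seen_titles:
                    continue
                seen_titles.add(session["title"])
                proposals.append({
                    "title": session["title"],
                    "slug": session["slug"],
                    "url": session["url"],
                    "type": session["type"],
                    "speaker": [p["public_name"] for p in session["persons"]],
                    "abstract": session["abstract"],
                })
    return proposals


# Hypothetical sample data, not taken from the real schedule
tutorial = {
    "title": "Intro to Widgets", "type": "Tutorial", "slug": "intro-widgets",
    "url": "https://example.org/talk/AAAA/", "abstract": "A hands-on tour.",
    "persons": [{"public_name": "Ada Example"}],
}
talk = {
    "title": "Widgets at Scale", "type": "Talk", "slug": "widgets-scale",
    "url": "https://example.org/talk/BBBB/", "abstract": "Lessons learned.",
    "persons": [{"public_name": "Bo Sample"}, {"public_name": "Cy Demo"}],
}
keynote = {
    "title": "Opening", "type": "Keynote", "slug": "opening",
    "url": "https://example.org/talk/CCCC/", "abstract": "Welcome.",
    "persons": [{"public_name": "Di Host"}],
}
lunch = {"title": "Lunch", "type": "Break", "slug": "lunch",
         "url": "", "abstract": "", "persons": []}

sample_schedule = {"schedule": {"conference": {"days": [
    {"date": "2023-07-17", "rooms": {"Room A": [tutorial, lunch]}},
    {"date": "2023-07-18", "rooms": {"Room B": [tutorial, keynote, talk]}},
]}}}

proposals = extract_proposals(sample_schedule)
for p in proposals:
    print(p["type"], "-", p["title"], "-", ", ".join(p["speaker"]))
```

Tracking seen titles in a set keeps the duplicate check cheap, and returning a list from a function means the same logic can be pointed at a different year's export without editing the loop.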