Sangarshanan/AbstractSummariser.ipynb

## AbstractSummariser.ipynb
{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "599074dc",
   "metadata": {},
   "source": [
    "### Fetch Schedule Info with the Pretalx API"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "6d4e8695",
   "metadata": {},
   "outputs": [],
   "source": [
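    "import requests\n",
    "\n",
    "# Pretalx schedule export for EuroPython 2023\n",
    "SCHEDULE_URL = \"https://program.europython.eu/europython-2023/schedule/export/schedule.json\"\n",
    "response = requests.get(SCHEDULE_URL, timeout=30)\n",
    "response.raise_for_status()  # fail early on an HTTP error\n",
    "all_proposals = response.json()"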
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "e942c154",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Processing Day 2023-07-17\n",
      "Processing Day 2023-07-18\n",
      "Processing Day 2023-07-19\n",
      "Processing Day 2023-07-20\n",
      "Processing Day 2023-07-21\n",
      "Processing Day 2023-07-22\n",
      "Processing Day 2023-07-23\n"
     ]
    }
   ],
   "source": [
    "PROCESS_TYPE = (\"Talk\", \"Tutorial\")\n",
    "processed_proposals = []\n",
    "titled_processed = set()  # titles already seen, to skip repeated sessions\n",
    "for proposal_days in all_proposals[\"schedule\"][\"conference\"][\"days\"]:\n",
    "    print(f\"Processing Day {proposal_days['date']}\")\n",
    "    for proposals in proposal_days[\"rooms\"].values():\n",
    "        for proposal in proposals:\n",
    "            # Keep only talks and tutorials that have at least one speaker\n",
    "            if not proposal.get(\"persons\") or proposal[\"type\"] not in PROCESS_TYPE:\n",
    "                continue\n",
    "            if proposal[\"title\"] in titled_processed:\n",
    "                continue\n",
    "            titled_processed.add(proposal[\"title\"])\n",
    "            processed_proposals.append({\n",
    "                \"title\": proposal[\"title\"],\n",
    "                \"slug\": proposal[\"slug\"],\n",
    "                \"url\": proposal[\"url\"],\n",
    "                \"type\": proposal[\"type\"],\n",
    "                \"speaker\": [person[\"public_name\"] for person in proposal[\"persons\"]],\n",
    "                \"abstract\": proposal[\"abstract\"],\n",
    "            })"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "2b815c0e",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "141"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "len(processed_proposals)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8319dd50",
   "metadata": {},
   "source": [
    "### Summarise Schedule Abstracts"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "ed027c1a",
   "metadata": {},
   "outputs": [],
   "source": [
    "from transformers import pipeline\n",
    "summarizer = pipeline(\"summarization\", model=\"knkarthick/MEETING_SUMMARY\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "a461bb27",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "processed Asyncio without Asyncio\n",
      "processed Geospatial Data Processing in Python: A Comprehensive Tutorial\n",
      "processed How to MLOps: Experiment tracking & deployment 📊\n",
      "processed Build, Serve, and Deploy a Fast, Production-Ready API with Python and Robyn\n",
      "processed Decorators - A Deep Dive\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\n",
      "KeyboardInterrupt\n",
      "\n"
     ]
    }
   ],
   "source": [
    "tweets = []\n",
    "YEAR = 2024\n",
    "for processed_proposal in processed_proposals:\n",
    "    # The pipeline returns a list with one result dict per input text\n",
    "    abstract = summarizer(processed_proposal[\"abstract\"])[0][\"summary_text\"]\n",
    "    # Join all speaker names and add a possessive suffix to the last one\n",
    "    speakers = \", \".join(processed_proposal[\"speaker\"]) + \"'s\"\n",
    "    link = processed_proposal[\"url\"]\n",
    "    ptype = processed_proposal[\"type\"].lower()\n",
    "    tweets.append(f\"Check out {speakers} {ptype} at #EuroPython{YEAR}, in their words: {abstract} {link} 🐍.\")\n",
    "    print(f\"processed {processed_proposal['title']}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "7dd1c9be",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Check out Yan Yanchii's tutorial at #EuroPython2024, in their words: In this tutorial, we will demystify the builtin module of the Python Standard Library and build a scheduler and an asynchronous http proxy. https://program.europython.eu/europython-2023/talk/8XKCB8/ 🐍. \n",
      "\n",
      "Check out Martin Christen's tutorial at #EuroPython2024, in their words: In this tutorial, you will learn about the various Python modules for processing geospatial data. https://program.europython.eu/europython-2023/talk/3HDWUZ/ 🐍. \n",
      "\n",
      "Check out Jeroen Overschie, Yke Rusticus's tutorial at #EuroPython2024, in their words: In this tutorial, you will learn about MLOps and take your first steps in a hands-on way. https://program.europython.eu/europython-2023/talk/PNYMHE/ 🐍. \n",
      "\n",
      "Check out Sanskar Jethi's tutorial at #EuroPython2024, in their words: How to build fast, production-ready APIs using Robyn, a web framework for Python. https://program.europython.eu/europython-2023/talk/SEBRJA/ 🐍. \n",
      "\n",
      "Check out Mike Müller's tutorial at #EuroPython2024, in their words: This tutorial is an in-depth introduction to decorators. Python offers decorator to implement re-usable code for cross-cutting task. https://program.europython.eu/europython-2023/talk/BGEYP7/ 🐍. \n",
      "\n"
     ]
    }
   ],
   "source": [
    "for tweet in tweets:\n",
    "    print(tweet, \"\\n\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "38725140",
   "metadata": {},
   "outputs": [],
   "source": [
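    "import csv\n",
    "\n",
    "# Write one tweet per row; UTF-8 keeps the emoji and accented names intact\n",
    "with open(\"tweets.csv\", \"w\", newline=\"\", encoding=\"utf-8\") as file:\n",
    "    writer = csv.writer(file)\n",
    "    writer.writerows([tweet] for tweet in tweets)"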
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.18"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
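
The filtering step and the tweet template in the notebook above can also be tried without network access or the summarisation model. What follows is a minimal sketch, not the notebook's own code: `flatten_schedule`, `format_tweet`, and the `SAMPLE` payload (including its example.org URL and speaker names) are hypothetical and introduced here for illustration. The sample only mimics the fields the notebook reads from the schedule export, and the summary string stands in for the model output.

```python
from typing import Any, Dict, List

PROCESS_TYPE = ("Talk", "Tutorial")


def flatten_schedule(schedule: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Collect unique talks and tutorials that have at least one speaker."""
    seen_titles = set()
    sessions = []
    for day in schedule["schedule"]["conference"]["days"]:
        for room_sessions in day["rooms"].values():
            for session in room_sessions:
                if not session.get("persons") or session.get("type") not in PROCESS_TYPE:
                    continue
                if session["title"] in seen_titles:
                    continue  # same title already collected from another slot
                seen_titles.add(session["title"])
                sessions.append({
                    "title": session["title"],
                    "url": session["url"],
                    "type": session["type"],
                    "speaker": [p["public_name"] for p in session["persons"]],
                    "abstract": session["abstract"],
                })
    return sessions


def format_tweet(session: Dict[str, Any], summary: str, year: int) -> str:
    """Fill the notebook's tweet template for one session."""
    speakers = ", ".join(session["speaker"]) + "'s"
    return (
        f"Check out {speakers} {session['type'].lower()} at #EuroPython{year}, "
        f"in their words: {summary} {session['url']} 🐍."
    )


# Hypothetical sample data shaped like the fields used above
TALK = {
    "title": "Intro to Widgets",
    "type": "Talk",
    "url": "https://example.org/talk/ABC123/",
    "abstract": "Widgets, explained.",
    "persons": [{"public_name": "Ada Example"}, {"public_name": "Bo Example"}],
}
BREAK = {"title": "Lunch", "type": "Break", "url": "", "abstract": "", "persons": []}
SAMPLE = {
    "schedule": {"conference": {"days": [
        {"date": "2023-07-17", "rooms": {"Room A": [TALK, BREAK], "Room B": [TALK]}},
    ]}}
}

sessions = flatten_schedule(SAMPLE)
tweet = format_tweet(sessions[0], "A short summary.", 2023)
print(tweet)
```

Keeping the filter and the template as separate functions makes it possible to check the deduplication and the tweet wording on a small payload before running the slower summarisation loop over every abstract.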