@jacobeisenstein
Last active April 8, 2024 10:03
conference travel co2.ipynb
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"provenance": [],
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/jacobeisenstein/ae0e13e270f3b00c9c2046b52297d018/conference-travel-co2.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"source": [
"# CO2 Emissions for ACL 2023\n",
"\n",
"* Jacob Eisenstein & Roy Schwartz, December 2023\n",
"\n",
"The carbon footprint of AI training and inference is deservedly receiving increasing attention. However, large models are only one way in which AI contributes to global warming: more prosaically, AI's [exponentially growing conference scene](https://thegradient.pub/neurips-2019-too-big/) involves thousands of flights all over the world.\n",
"\n",
"This notebook contains code to estimate the CO2 emissions associated with air travel to ACL 2023. We estimate that ACL's **2500 attendees** flew **23.6 million kilometers**, amounting to approximately **2400 metric tons** of CO2 emissions. To put this in perspective, it is equivalent to the emissions of:\n",
"* Powering **1572 A100 GPUs for an entire year** at maximum load (see calculations below)\n",
"* **512 person-years** of emissions at the current global per-capita rate.\n",
"* Training [Llama](https://arxiv.org/pdf/2302.13971.pdf) 2.4 times, or training [Llama 2](https://arxiv.org/abs/2307.09288) 4.4 times. (This is based on reported emissions of 1000 mT CO2e and 540 mT CO2e respectively.)\n",
"\n"
],
"metadata": {
"id": "uz5iPtVqFFPj"
}
},
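{
"cell_type": "markdown",
"source": [
"As a rough check of the equivalences above, the conversions can be sketched in a few lines. The three constants are the ones defined in the `constants` cell further down in this notebook, and 2402.3 mT is the ICCT-based total printed in the emissions-estimate output. The function name `equivalents` is ours and is used for illustration only.\n",
"\n",
"```python\n",
"KG_CO2E_PER_KWH = 0.436    # average grid carbon intensity (notebook constant)\n",
"A100_KW = 0.4              # assumed A100 power draw at maximum load\n",
"KG_CO2E_PER_PERSON = 4690  # global per-capita emissions per year\n",
"\n",
"def equivalents(mt_co2: float) -> dict:\n",
"    '''Convert metric tons of CO2 into A100-years and person-years.'''\n",
"    kg = mt_co2 * 1000\n",
"    a100_hours = kg / (KG_CO2E_PER_KWH * A100_KW)\n",
"    return {\n",
"        'a100_years': a100_hours / (365 * 24),\n",
"        'person_years': kg / KG_CO2E_PER_PERSON,\n",
"    }\n",
"\n",
"print(equivalents(2402.3))\n",
"```\n",
"\n",
"This gives roughly 1572 A100-years and 512 person-years, in line with the figures quoted above."
],
"metadata": {}
},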
{
"cell_type": "markdown",
"source": [
"## Methodology\n",
"\n",
"Our estimates are based on per-attendee city-level addresses (provided by the ACL), in combination with publicly available flight data.\n",
"\n",
"* **Airports**: For each participant, we identify a set of candidate airports near their self-reported city-level home address.\n",
"* **Flights**: We assume that each participant takes an itinerary with the fewest connections and, among itineraries with equal numbers of connections, the one with the shortest combined distance.\n",
"* **Emissions**: Emissions per flight can be calculated in two ways: using a piecewise linear function from [ICCT](https://theicct.org/sites/default/files/publications/ICCT_CO2-commercl-aviation-2018_20190918.pdf), or using a quadratic formula from [myclimate](https://www.myclimate.org/en/information/about-myclimate/downloads/flight-emission-calculator/).\n",
"\n",
"More details on each step are given in the code below."
],
"metadata": {
"id": "Ql3Jd3t9MYmt"
}
},
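{
"cell_type": "markdown",
"source": [
"To make the emissions step concrete, here is a minimal sketch of the piecewise linear estimate, using the per-kilometer rates that appear in the code later in this notebook (0.16, 0.11 and 0.09 kg CO2 per passenger-km). The 6000 km trip and its split into two 3000 km legs are hypothetical numbers chosen for illustration, and `icct_kg` is our own name for the helper.\n",
"\n",
"```python\n",
"def icct_kg(distance_km: float) -> float:\n",
"    '''Piecewise linear CO2 estimate (kg per passenger) for one flight leg.'''\n",
"    first = min(distance_km, 500) * 0.16                  # first 500 km\n",
"    second = max(0, min(distance_km - 500, 500)) * 0.11   # 500 to 1000 km\n",
"    rest = max(0, distance_km - 1000) * 0.09              # beyond 1000 km\n",
"    return first + second + rest\n",
"\n",
"direct = icct_kg(6000)\n",
"connecting = icct_kg(3000) + icct_kg(3000)\n",
"print(f'direct: {direct:.0f} kg, connecting: {connecting:.0f} kg')\n",
"```\n",
"\n",
"Because the first 1000 km of every leg are charged at a higher rate, splitting the same distance across two legs raises the estimate (585 kg versus 630 kg in this example). This is one reason why modeling connections matters."
],
"metadata": {}
},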
{
"cell_type": "markdown",
"source": [
"\n",
"## Previous estimates\n",
"\n",
"There have been several prior estimates for emissions associated with conferences, although none this thorough for ACL.\n",
"\n",
"- [Caines and Rei (2019)](https://www.marekrei.com/blog/geographic-diversity-of-nlp-conferences/) estimated **2100 metric tons** (mT) of CO2e from **10.5 million km** of air travel for all NLP conferences in 2019. Their estimate was less precise in a few ways: for example, it was based on authors rather than attendees, and did not incorporate information about flight routes. It assumed 200g CO2e per passenger-km, which is higher than the estimates in the sources we found.\n",
"- [Skiles et al. (2021)](https://www.nature.com/articles/s41893-021-00823-2#Sec22) estimated **3900 tons** CO2e for ICLR 2019 (in New Orleans), roughly **1.5 tons per attendee**. However, this was based on only country-level geolocation per attendee, using the geographical center of the country. They had per-city attendee origins for two other conferences (NAMS and AAS), and estimated per-attendee emissions of roughly 1 ton CO2e. Emissions were computed using [myclimate](https://www.myclimate.org/en/information/about-myclimate/downloads/flight-emission-calculator/), using the great circle distance between attendee origin and the conference location.\n",
"- [Yakar and Kwee (2020)](https://www.sciencedirect.com/science/article/pii/S0720048X20300589#bib0040) estimated 39,500 mT CO2e for 22000 attendees to a radiology conference in Chicago in 2017, working out to roughly **1.8 mT CO2e per participant**. This estimate was based on state- and country-level geographical locations per participant. They also used the myclimate calculator.\n",
"- [Jäckle (2022)](https://link.springer.com/chapter/10.1007/978-981-16-4911-0_2) examined a series of political science conferences, estimating emissions between **1-4 mT CO2e per participant** depending on the location and the estimated greenhouse gas intensity of air travel. The estimate was based on the great circle route distance between the conference location and the home institution of paper presenters at the conference.\n",
"- [Przybyła and Shardlow (2022)](https://aclanthology.org/2022.findings-acl.304.pdf) used the addresses of ACL Anthology authors to track emissions over time. They estimated roughly **1 tonne** CO2e per participant for most conferences, with total emissions rising over time with the number of papers. The emissions per flight were based on a regression to account for change in efficiency over time. Similar to other prior work, flight distances were based on great circle route distance between the conference location and the home institution of paper authors.\n",
"\n",
"Previous estimates have not used flight connection data, which can help distinguish the effect of locating a conference near a well-connected international airport. Country-level geolocations are too coarse-grained, as many conference participants are from large countries like Canada, China, and the United States.\n",
"\n",
"The myclimate-based estimates include a multiplier to account for non-CO2 radiative forcing [[1](https://en.wikipedia.org/wiki/Environmental_effects_of_aviation#Climate_change), [2](https://www.myclimate.org/en/information/about-myclimate/downloads/flight-emission-calculator/)]. This is why per-participant emissions are larger in the studies that use this estimator. While it seems likely that CO2 emissions underestimate the climate impact of air travel, there is considerable uncertainty about the exact value of the multiplier (e.g., https://phys.org/news/2023-12-nasa-boeing-jet-contrails-science.html). For simplicity we examine only CO2 emissions, but we emphasize that the warming impact of air travel is almost surely larger than CO2 emissions alone would suggest."
],
"metadata": {
"id": "n-dwibcgManP"
}
},
{
"cell_type": "markdown",
"source": [
"# Code"
],
"metadata": {
"id": "kF_tET6nMnP1"
}
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"id": "mPVWwZW4oseV",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "ef320b3c-5b6e-42d3-c795-1f85d7d6e033",
"collapsed": true,
"cellView": "form"
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Collecting haversine\n",
" Downloading haversine-2.8.0-py2.py3-none-any.whl (7.7 kB)\n",
"Installing collected packages: haversine\n",
"Successfully installed haversine-2.8.0\n",
"Collecting fastparquet\n",
" Downloading fastparquet-2023.10.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.7 MB)\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.7/1.7 MB\u001b[0m \u001b[31m9.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[?25hRequirement already satisfied: pandas>=1.5.0 in /usr/local/lib/python3.10/dist-packages (from fastparquet) (1.5.3)\n",
"Requirement already satisfied: numpy>=1.20.3 in /usr/local/lib/python3.10/dist-packages (from fastparquet) (1.23.5)\n",
"Collecting cramjam>=2.3 (from fastparquet)\n",
" Downloading cramjam-2.7.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.6 MB)\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.6/1.6 MB\u001b[0m \u001b[31m18.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[?25hRequirement already satisfied: fsspec in /usr/local/lib/python3.10/dist-packages (from fastparquet) (2023.6.0)\n",
"Requirement already satisfied: packaging in /usr/local/lib/python3.10/dist-packages (from fastparquet) (23.2)\n",
"Requirement already satisfied: python-dateutil>=2.8.1 in /usr/local/lib/python3.10/dist-packages (from pandas>=1.5.0->fastparquet) (2.8.2)\n",
"Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas>=1.5.0->fastparquet) (2023.3.post1)\n",
"Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.8.1->pandas>=1.5.0->fastparquet) (1.16.0)\n",
"Installing collected packages: cramjam, fastparquet\n",
"Successfully installed cramjam-2.7.0 fastparquet-2023.10.1\n"
]
}
],
"source": [
"# @title imports\n",
"\n",
"from enum import Enum\n",
"import matplotlib\n",
"import matplotlib.pyplot as plt\n",
"import networkx as nx\n",
"import numpy as np\n",
"import pandas as pd\n",
"import seaborn as sns\n",
"from scipy import spatial\n",
"from scipy.spatial import cKDTree\n",
"from google.colab import drive, files\n",
"\n",
"import logging\n",
"logger = logging.getLogger('matplotlib.font_manager')\n",
"logger.setLevel(logging.WARNING)\n",
"\n",
"!pip install haversine\n",
"from haversine import haversine, Unit\n",
"\n",
"!pip install fastparquet # for file io\n",
"\n",
"import os"
]
},
{
"cell_type": "code",
"source": [
"# @title constants\n",
"HOME_DIR = 'drive/MyDrive/projects/acl_co2/' # @param\n",
"AIRPORT_FILENAME = 'airports.dat' # @param\n",
"ATTENDEE_FILENAME = 'acl_locations_with_lat_lon_fixed.csv' # @param\n",
"ROUTES_FILENAME = 'routes.dat' # @param\n",
"AIRPORT_DATA_PATH = os.path.join(HOME_DIR, AIRPORT_FILENAME)\n",
"ATTENDEE_DATA_PATH = os.path.join(HOME_DIR, ATTENDEE_FILENAME)\n",
"ROUTE_DATA_PATH = os.path.join(HOME_DIR, ROUTES_FILENAME)\n",
"\n",
"LOAD_SHORTEST_PATHS = True # @param # For caching previous computations\n",
"ASSUME_DIRECT = False # @param # Assume a direct flight exists between source and destination\n",
"EARTH_RADIUS = 6371\n",
"KG_CO2E_PER_PERSON = 4690 # @param # https://www.statista.com/statistics/268753/co2-emissions-per-capita-worldwide-since-1990/#:~:text=Global%20per%20capita%20carbon%20dioxide%20emissions%20averaged%204.69%20metric%20tons%20in%202021.\n",
"KG_CO2E_PER_KWH = 0.436 # @param # https://ourworldindata.org/grapher/carbon-intensity-electricity?tab=chart\n",
"A100_KW = 0.4 # @param\n",
"MAX_GROUND_TRANSPORT_KM = 150 # @param\n",
"NROWS = 100000 # debug\n",
"\n",
"# Two ways to compute CO2, based on myclimate and ICCT.\n",
"Method = Enum('Method', ['ICCT', 'MYCLIMATE'])\n",
"CO2_METHOD = Method.ICCT # @param\n",
"\n",
"drive.mount('/content/drive')"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "plwhIhImNifR",
"outputId": "c1d1a8c0-88e5-49b7-f68f-96bf73e2e103"
},
"execution_count": 3,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Mounted at /content/drive\n"
]
}
]
},
{
"cell_type": "markdown",
"source": [
"## Loading data\n",
"\n",
"There are three data sources:\n",
"\n",
"- **Conference attendee locations**: City/state/nation data for each attendee.\n",
"We wrote a script to geocode these locations, using the OpenStreetMap provider in the Python `geocoder` library (https://geocoder.readthedocs.io/providers/OpenStreetMap.html). That code is available on request.\n",
"- **Airport locations**: Airport code and latitude/longitude, from https://github.com/jpatokal/openflights/tree/master/data.\n",
"- **Flights**: All airport-to-airport flights, again from https://github.com/jpatokal/openflights/tree/master/data. Unfortunately, this data hasn't been updated since 2014."
],
"metadata": {
"id": "AppDYliLyG81"
}
},
{
"cell_type": "code",
"source": [
"# @title data loading code\n",
"def latlon_to_cartesian(lat, lon):\n",
"  \"\"\"Converts latitude and longitude (degrees) to unit-sphere coordinates for the KD tree.\"\"\"\n",
"  lat, lon = np.radians(lat), np.radians(lon)\n",
"  x = np.cos(lat) * np.cos(lon)\n",
"  y = np.cos(lat) * np.sin(lon)\n",
"  z = np.sin(lat)\n",
"  return x, y, z\n",
"\n",
"def read_attendees(attendee_path: str):\n",
"  \"\"\"Reads the conference attendee file (one row per location, with a head count).\"\"\"\n",
"  names = ['count', 'city', 'state', 'country', 'lon', 'lat']\n",
"\n",
"  df_attendees = pd.read_csv(attendee_path, header=None, names=names, nrows=NROWS)\n",
"\n",
"  df_attendees['coordinates'] = df_attendees.apply(\n",
"      lambda row: (float(row['lat']), float(row['lon'])), axis=1)\n",
"\n",
"  df_attendees['xyz'] = df_attendees.apply(\n",
"      lambda row: latlon_to_cartesian(*row['coordinates']), axis=1)\n",
"\n",
"  total_attendees = df_attendees['count'].sum()\n",
"  print(f\"Total attendees: {total_attendees}\")\n",
"  print(f\"Total locations: {len(df_attendees)}\")\n",
"\n",
"  return df_attendees\n",
"\n",
"def load_all_data(airport_path: str = AIRPORT_DATA_PATH,\n",
"                  routes_path: str = ROUTE_DATA_PATH,\n",
"                  attendees_path: str = ATTENDEE_DATA_PATH):\n",
"  \"\"\"Loads airports, routes, and attendee locations as dataframes.\"\"\"\n",
"  df_airports = pd.read_csv(airport_path, header=None,\n",
"      names=['name', 'city', 'country', 'code1', 'code2', 'lat', 'lon',\n",
"             'code3', 'code4', 'code5', 'timezone', 'code7', 'code8'])\n",
"\n",
"  # clean up some tiny airports (those with a missing code)\n",
"  df_airports = df_airports[df_airports.code1 != '\\\\N']\n",
"  df_routes = pd.read_csv(routes_path, header=None, names=[\n",
"      'airline', 'airline_id', 'source', 'source_id', 'destination',\n",
"      'destination_id', 'codeshare', 'stops', 'equipment'])\n",
"\n",
"  # drop duplicate routes\n",
"  df_routes = df_routes.drop_duplicates(subset=['source', 'destination']\n",
"                                        ).reset_index(drop=True)\n",
"  all_airports = sorted(\n",
"      list(set(\n",
"          df_routes['source'].unique()).union(df_routes['destination'].unique())))\n",
"\n",
"  # Keep only airports that appear in at least one route.\n",
"  df_airports = df_airports[\n",
"      df_airports['code1'].isin(all_airports)].set_index('code1')\n",
"  df_airports['coordinates'] = df_airports.apply(\n",
"      lambda row: (row['lat'], row['lon']), axis=1)\n",
"  # Cartesian coordinates for distance computations.\n",
"  df_airports['xyz'] = df_airports.apply(\n",
"      lambda row: latlon_to_cartesian(*row['coordinates']), axis=1)\n",
"\n",
"  df_attendees = read_attendees(attendees_path)\n",
"  return df_airports, df_routes, df_attendees"
],
"metadata": {
"id": "Oq1WfRUjpbVq",
"cellView": "form"
},
"execution_count": 4,
"outputs": []
},
{
"cell_type": "code",
"source": [
"df_airports, df_routes, df_attendees = load_all_data()"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "LdP7f-3nOCsX",
"outputId": "34ca1852-f572-4603-96fe-266b072f8ebb"
},
"execution_count": 13,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Total attendees: 2588\n",
"Total locations: 918\n"
]
}
]
},
{
"cell_type": "markdown",
"source": [
"## Finding airports for each attendee location\n",
"\n",
"For each attendee, we find airports that they are likely to use. We do this in stages. If there are any airports within 75km, we include those. If not, we include all airports within 125km. If there are none, we include all airports within 250km. This covers >99% of attendees."
],
"metadata": {
"id": "9O7Xp04gyC1s"
}
},
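{
"cell_type": "markdown",
"source": [
"The staged search described above can be sketched without a KD tree, using brute-force great-circle distances. This is only an illustration: the airport coordinates below are approximate, and the helper names `haversine_km` and `candidate_airports` are ours. The notebook itself uses `cKDTree.query_ball_point` on unit-sphere coordinates, which scales better to thousands of airports.\n",
"\n",
"```python\n",
"import math\n",
"\n",
"EARTH_RADIUS_KM = 6371\n",
"\n",
"def haversine_km(a, b):\n",
"    '''Great-circle distance in km between two (lat, lon) pairs in degrees.'''\n",
"    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))\n",
"    h = (math.sin((lat2 - lat1) / 2) ** 2\n",
"         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)\n",
"    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))\n",
"\n",
"def candidate_airports(home, airports, radii_km=(75, 125, 250)):\n",
"    '''Return the airports inside the smallest radius that contains any.'''\n",
"    for radius in radii_km:\n",
"        hits = [code for code, pos in airports.items()\n",
"                if haversine_km(home, pos) <= radius]\n",
"        if hits:\n",
"            return hits\n",
"    return None\n",
"\n",
"# Approximate coordinates, for illustration only.\n",
"airports = {'DTW': (42.21, -83.35), 'YQG': (42.28, -82.96), 'ORD': (41.98, -87.90)}\n",
"ann_arbor = (42.28, -83.74)\n",
"print(candidate_airports(ann_arbor, airports))\n",
"```\n",
"\n",
"Widening the radius only when the smaller one is empty keeps the candidate set small for attendees in large cities while still covering attendees who live far from any airport."
],
"metadata": {}
},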
{
"cell_type": "code",
"source": [
"# @title\n",
"def add_attendee_airport_codes(df_attendees: pd.DataFrame,\n",
"                               df_airports: pd.DataFrame) -> pd.DataFrame:\n",
"  \"\"\"Uses a KD tree to find candidate airport codes near each attendee.\"\"\"\n",
"\n",
"  # Build KDTree over unit-sphere coordinates.\n",
"  kdtree = cKDTree(df_airports['xyz'].tolist())\n",
"\n",
"  def find_closest_airport_codes(\n",
"      attendee: pd.Series,\n",
"      radii: tuple[float, ...] = (\n",
"          75/EARTH_RADIUS, 125/EARTH_RADIUS, 250/EARTH_RADIUS)):\n",
"    \"\"\"Finds all airports within the smallest radius that contains any.\n",
"\n",
"    Radii are in km divided by the Earth's radius, i.e. angles in radians.\n",
"    The KD tree measures chord length on the unit sphere, which is nearly\n",
"    equal to the angle at these scales.\n",
"    \"\"\"\n",
"    for radius in radii:\n",
"      locs = kdtree.query_ball_point(attendee['xyz'], r=radius)\n",
"      candidates = df_airports.iloc[locs]\n",
"      if len(candidates):\n",
"        return candidates.index.tolist()\n",
"    return None\n",
"\n",
"  return df_attendees.assign(airports=df_attendees.apply(\n",
"      lambda row: find_closest_airport_codes(row), axis=1))\n",
"\n",
"df_attendees = add_attendee_airport_codes(df_attendees, df_airports)\n",
"print(\"Random sample of cities and linked airports:\")\n",
"print(df_attendees.query('count > 20').sample(5).reset_index(drop=True)[\n",
" ['city', 'state', 'country', 'airports']])"
],
"metadata": {
"id": "aRbkY5lkBus7",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "3dbbf4a0-3bba-4e51-b88b-363fc8948321"
},
"execution_count": 14,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Random sample of cities and linked airports:\n",
" city state country airports\n",
"0 vancouver bc canada [BLI, YXX, YCD, YYJ, YVR, CXH, ESD]\n",
"1 los_angeles ca united_states [BUR, LAX, ONT, SNA, LGB]\n",
"2 ann_arbor mi united_states [DTW, YQG]\n",
"3 seattle wa united_states [BFI, SEA]\n",
"4 sunnyvale ca united_states [SJC, OAK, SFO]\n"
]
}
]
},
{
"cell_type": "markdown",
"source": [
"## Shortest paths between all pairs of airports\n",
"\n",
"The main computational challenge is to find likely flights for each attendee. To do this, we build a graph of airports and then find the shortest paths between all pairs of nodes in the graph. Path length is defined first by the number of hops and then by the distance, so that a path with $n$ hops will be preferred over all paths with $n+1$ hops.\n",
"\n",
"This section also contains code for estimating CO2 emissions by flight distance, using two methods that reflect the higher emissions associated with takeoff: a piecewise linear function ([ICCT](https://theicct.org/sites/default/files/publications/ICCT_CO2-commercl-aviation-2018_20190918.pdf)) and a quadratic formula ([myclimate](https://www.myclimate.org/en/information/about-myclimate/downloads/flight-emission-calculator/))."
],
"metadata": {
"id": "8nnbkVPp0a-n"
}
},
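{
"cell_type": "markdown",
"source": [
"The hop-then-distance ordering can be obtained from a single edge weight: give each edge a cost of 1 plus a small multiple of its distance, so that the hop count dominates and distance only breaks ties. Below is a minimal sketch with a hand-written Dijkstra search on a toy graph. The node names and distances are made up, and `best_route` is a hypothetical helper rather than part of this notebook. The trick relies on the total route distance staying below the scale factor (100,000 km here).\n",
"\n",
"```python\n",
"import heapq\n",
"\n",
"def best_route(edges, source, target, scale=1e5):\n",
"    '''Dijkstra with edge cost 1 + km/scale: fewest hops, then shortest.'''\n",
"    graph = {}\n",
"    for a, b, km in edges:\n",
"        graph.setdefault(a, []).append((b, km))\n",
"        graph.setdefault(b, []).append((a, km))\n",
"    queue = [(0.0, source, [source])]\n",
"    seen = set()\n",
"    while queue:\n",
"        cost, node, path = heapq.heappop(queue)\n",
"        if node == target:\n",
"            return path\n",
"        if node in seen:\n",
"            continue\n",
"        seen.add(node)\n",
"        for nxt, km in graph.get(node, []):\n",
"            if nxt not in seen:\n",
"                heapq.heappush(queue, (cost + 1 + km / scale, nxt, path + [nxt]))\n",
"    return None\n",
"\n",
"# Toy network: (airport, airport, distance in km).\n",
"edges = [\n",
"    ('A', 'D', 9000),                    # direct, one hop\n",
"    ('A', 'B', 3000), ('B', 'D', 5000),  # two hops, 8000 km in total\n",
"    ('A', 'C', 2000), ('C', 'D', 8000),  # two hops, 10000 km in total\n",
"]\n",
"print(best_route(edges, 'A', 'D'))      # the one-hop route wins\n",
"print(best_route(edges[1:], 'A', 'D'))  # without it, the shorter two-hop route wins\n",
"```\n",
"\n",
"In the toy network the direct route is chosen even though a two-hop route is shorter in total distance, which is exactly the preference described above for paths with $n$ versus $n+1$ hops."
],
"metadata": {}
},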
{
"cell_type": "code",
"source": [
"# @title Shortest path and other code\n",
"\n",
"def co2_kg_by_distance_icct(distance: float) -> float:\n",
"  \"\"\"Computes CO2 emissions by flight distance in km based on the ICCT method\n",
"  (https://theicct.org/sites/default/files/publications/ICCT_CO2-commercl-aviation-2018_20190918.pdf).\n",
"  \"\"\"\n",
"  co2_1 = min(distance, 500) * 0.16  # first 500 km\n",
"  co2_2 = max(0, min(distance-500, 500)) * 0.11  # 500 to 1000 km\n",
"  co2_3 = max(0, distance-1000) * 0.09  # beyond 1000 km\n",
"  return co2_1 + co2_2 + co2_3\n",
"\n",
"def co2_kg_by_distance_myclimate(distance: float) -> float:\n",
"  \"\"\"Computes CO2e emissions by flight distance in km based on the myclimate method\n",
"  (https://www.myclimate.org/en/information/about-myclimate/downloads/flight-emission-calculator/).\n",
"  \"\"\"\n",
"  # x: Flight distance [km], defined as the sum of GCD (great circle distance)\n",
"  #    and DC (distance correction for detours and holding patterns)\n",
"  #    and inefficiencies in the air traffic control systems [km]\n",
"  # S: Average number of seats (total across all cabin classes)\n",
"  # PLF: Passenger load factor\n",
"  # CF: Cargo factor\n",
"  # CW: Cabin class weighting factor\n",
"  # EF: CO₂ emission factor for jet fuel combustion (kerosene)\n",
"  # M: Multiplier accounting for potential non-CO₂ effects\n",
"  # P: CO₂e emission factor for pre-production jet fuel, kerosene\n",
"  # AF: Aircraft factor\n",
"  # A: Airport infrastructure emissions\n",
"\n",
"  # Index 0 holds short-haul parameters, index 1 long-haul parameters.\n",
"  aircrafts = ['Standard short-haul flight', 'Standard long-haul flight']\n",
"  DC = 95\n",
"\n",
"  x = distance + DC\n",
"\n",
"  S = [157.86, 302.58]\n",
"  PLF = [.796, 0.82]\n",
"  CF = 2*[0.26]\n",
"  CW = 2*[1]\n",
"  EF = 2*[3.16]\n",
"  P = 2*[0.538]\n",
"  M = 2*[3]\n",
"  AF = 2*[0.00034]\n",
"  A = 2*[11.68]\n",
"  a = [0.000007, 0.00029]\n",
"  b = [2.775, 3.475]\n",
"  c = [1260.608, 3259.691]\n",
"\n",
"  MAX_SHORT = 1500\n",
"  MIN_LONG = 2500\n",
"  def co2(x, S, PLF, CF, CW, EF, P, M, AF, A, a, b, c):\n",
"    return (a * x**2+b*x+c)/(S*PLF) *(1-CF) * CW*(EF*M+P) + AF*x+A\n",
"\n",
"  def co2_for(index):\n",
"    return co2(x, S[index], PLF[index], CF[index], CW[index],\n",
"               EF[index], P[index], M[index], AF[index], A[index],\n",
"               a[index], b[index], c[index])\n",
"\n",
"  if distance <= MAX_SHORT:\n",
"    return co2_for(0)\n",
"  if distance >= MIN_LONG:\n",
"    return co2_for(1)\n",
"\n",
"  # Between the two regimes, interpolate linearly from the short-haul value\n",
"  # (at MAX_SHORT) to the long-haul value (at MIN_LONG).\n",
"  coefficient = (distance-MAX_SHORT)/(MIN_LONG-MAX_SHORT)\n",
"  return (1-coefficient)*co2_for(0) + coefficient*co2_for(1)\n",
"\n",
"\n",
"def build_shortest_paths(df_airports: pd.DataFrame,\n",
"                         df_routes: pd.DataFrame) -> pd.DataFrame:\n",
"  \"\"\"Build a dataframe of paths and lengths between pairs of airports.\"\"\"\n",
"  G = nx.Graph()\n",
"\n",
"  for code, airport in df_airports.iterrows():\n",
"    G.add_node(code, pos=airport['coordinates'])\n",
"\n",
"  # Calculate the Haversine distances between airports and add weighted edges to\n",
"  # the graph\n",
"  for _, route in df_routes.iterrows():\n",
"    source = route['source']\n",
"    destination = route['destination']\n",
"    if source in df_airports.index and destination in df_airports.index:\n",
"      coordinates_source = df_airports.loc[source]['coordinates']\n",
"      coordinates_destination = df_airports.loc[destination]['coordinates']\n",
"      distance = haversine(coordinates_source, coordinates_destination,\n",
"                           unit=Unit.KILOMETERS)\n",
"      G.add_edge(source, destination, weight=distance,\n",
"                 weight_with_hops=(1 + distance/1e5),\n",
"                 kg_co2e_icct=co2_kg_by_distance_icct(distance),\n",
"                 kg_co2e_myclimate=co2_kg_by_distance_myclimate(distance))\n",
"\n",
"  # Use NetworkX's all_pairs_dijkstra_path to find the shortest paths between\n",
"  # all pairs of airports.\n",
"  # weight_with_hops will choose the shortest combined distance of the\n",
"  # fewest-hop routes\n",
"  shortest_paths = dict(nx.all_pairs_dijkstra_path(G, weight='weight_with_hops'))\n",
"  # Distance and CO2 are each minimized on their own edge weight, so they are\n",
"  # lower bounds that need not correspond to the path stored above.\n",
"  shortest_path_lengths = dict(nx.all_pairs_dijkstra_path_length(G))\n",
"  shortest_path_co2_icct = dict(nx.all_pairs_dijkstra_path_length(G, weight='kg_co2e_icct'))\n",
"  shortest_path_co2_myclimate = dict(nx.all_pairs_dijkstra_path_length(G, weight='kg_co2e_myclimate'))\n",
"\n",
"  records = []\n",
"  for source in shortest_paths.keys():\n",
"    for destination in shortest_paths[source].keys():\n",
"      records.append({\"source\": source, \"destination\": destination,\n",
"                      \"path\": shortest_paths[source][destination],\n",
"                      \"distance\": shortest_path_lengths[source][destination],\n",
"                      \"kg_co2e_icct\": shortest_path_co2_icct[source][destination],\n",
"                      \"kg_co2e_myclimate\": shortest_path_co2_myclimate[source][destination]})\n",
"  return pd.DataFrame(records)\n",
"\n",
"def find_best_paths_per_attendee(\n",
"    df_attendees: pd.DataFrame,\n",
"    paths_to_target: pd.DataFrame,\n",
"    max_ground_travel: float = MAX_GROUND_TRANSPORT_KM) -> pd.DataFrame:\n",
"  \"\"\"Finds best paths to the target destination per attendee.\"\"\"\n",
"  df_per_attendee_airport = pd.merge(\n",
"      df_attendees.explode('airports').reset_index(),\n",
"      paths_to_target,\n",
"      left_on='airports', right_on='source')\n",
"  # Among an attendee's candidate airports, keep the one with the shortest route.\n",
"  best_rows = df_per_attendee_airport.groupby('index')['distance'].idxmin()\n",
"  df_best_paths = df_per_attendee_airport.loc[best_rows].copy()\n",
"\n",
"  target = paths_to_target.destination.unique()[0]\n",
"  ground_distances = df_best_paths['coordinates'].apply(\n",
"      lambda x: haversine(x, df_airports.loc[target]['coordinates'],\n",
"                          unit=Unit.KILOMETERS))\n",
"  df_best_paths['ground_distance'] = ground_distances\n",
"\n",
"  # Attendees close to the venue are assumed to travel by ground, with zero\n",
"  # emissions.\n",
"  ground_travel = df_best_paths['ground_distance'] < max_ground_travel\n",
"  df_best_paths.loc[ground_travel, 'num_hops'] = 0\n",
"  df_best_paths.loc[ground_travel, 'kg_co2e_icct'] = 0\n",
"  df_best_paths.loc[ground_travel, 'kg_co2e_myclimate'] = 0\n",
"  df_best_paths.loc[ground_travel, 'distance'] = 0\n",
"  df_best_paths['path'] = df_best_paths.apply(\n",
"      lambda row: [target] if row['ground_distance'] < max_ground_travel\n",
"      else row['path'], axis=1)\n",
"  return df_best_paths\n",
"\n",
"\n",
"def find_best_direct_path_per_attendee(\n",
"    df_attendees: pd.DataFrame,\n",
"    target: str,\n",
"    target_coordinates,\n",
"    max_ground_travel: float = MAX_GROUND_TRANSPORT_KM) -> pd.DataFrame:\n",
"  \"\"\"\n",
"  An ablation study:\n",
"  Compute direct distance to target destination per attendee,\n",
"  instead of using flight information.\n",
"  Use ground distance as a proxy (not taking airport locations into account)\n",
"  \"\"\"\n",
"  ground_distances = df_attendees['coordinates'].apply(\n",
"      lambda row: haversine(row, target_coordinates, unit=Unit.KILOMETERS))\n",
"\n",
"  df_best_paths = pd.DataFrame()\n",
"  df_best_paths['distance'] = ground_distances\n",
"  df_best_paths['kg_co2e_icct'] = ground_distances.apply(\n",
"      co2_kg_by_distance_icct)\n",
"  df_best_paths['kg_co2e_myclimate'] = ground_distances.apply(\n",
"      co2_kg_by_distance_myclimate)\n",
"\n",
"  # For distances shorter than max_ground_travel, assume attendees didn't fly,\n",
"  # and assume 0 kg_co2e.\n",
"  ground_travel = df_best_paths['distance'] < max_ground_travel\n",
"  df_best_paths.loc[ground_travel, 'num_hops'] = 0\n",
"  df_best_paths.loc[ground_travel, 'kg_co2e_icct'] = 0\n",
"  df_best_paths.loc[ground_travel, 'kg_co2e_myclimate'] = 0\n",
"  df_best_paths.loc[ground_travel, 'distance'] = 0\n",
"\n",
"  air_travel = df_best_paths['distance'] >= max_ground_travel\n",
"  df_best_paths.loc[air_travel, 'num_hops'] = 1\n",
"\n",
"  df_best_paths['path'] = df_attendees.apply(\n",
"      lambda row: [str(row['city'])+','+str(row['state'])+','+str(row['country']), target], axis=1)\n",
"\n",
"  df_best_paths['count'] = df_attendees['count']\n",
"\n",
"  df_best_paths['path'] = df_best_paths.apply(\n",
"      lambda row: [target] if row['distance'] < max_ground_travel\n",
"      else row['path'], axis=1)\n",
"  return df_best_paths\n",
"\n",
"\n",
"def analyze_best_paths(df_best_paths: pd.DataFrame, df_attendees: pd.DataFrame,\n",
"                       co2_method: Method = CO2_METHOD):\n",
"  \"\"\"Some helper code to print out details of the best paths.\"\"\"\n",
"  pct_accounted_for = df_best_paths['count'].sum() / df_attendees['count'].sum()\n",
"  counts = df_best_paths['count'].to_numpy()\n",
"  distances = df_best_paths['distance'].to_numpy()\n",
"  co2_method_str = \"icct\" if co2_method==Method.ICCT else \"myclimate\"\n",
"  kg_co2e = df_best_paths[f'kg_co2e_{co2_method_str}'].to_numpy()\n",
"\n",
"  # A factor of 2 is applied to both distance and co2e to account for roundtrips\n",
"  total_distance = 2 * counts @ distances\n",
"  kg_co2e = 2 * kg_co2e @ counts\n",
"\n",
"  people_years = kg_co2e / KG_CO2E_PER_PERSON\n",
"  a100_co2_per_hour = KG_CO2E_PER_KWH * A100_KW\n",
"  a100_hours = kg_co2e / a100_co2_per_hour\n",
"\n",
"  # Note: ground travel (0 hops) is counted together with direct flights here.\n",
"  direct_flights = np.array(df_best_paths['num_hops'] <= 1)\n",
"  pct_direct_flights = (counts @ direct_flights) / counts.sum()\n",
"\n",
"  ground_travel = np.array(df_best_paths['num_hops'] == 0)\n",
"  pct_ground_travel = (counts @ ground_travel) / counts.sum()\n",
"\n",
"  return {\"kilometers flight\": total_distance,\n",
"          \"mt co2e\": kg_co2e / 1000,\n",
"          \"a100 hours\": a100_hours,\n",
"          \"a100 years\": a100_hours / (365 * 24),\n",
"          \"person_years_co2e\": people_years,\n",
"          \"pct direct flights\": pct_direct_flights,\n",
"          \"pct ground travel\": pct_ground_travel,\n",
"          \"pct accounted for:\": pct_accounted_for}\n",
"\n",
"def print_path_stats(paths, df_attendees, co2_method: Method = Method.ICCT):\n",
"  for key, val in analyze_best_paths(paths, df_attendees,\n",
"                                     co2_method=co2_method).items():\n",
"    if val > 10000:\n",
"      val_str = f\"{val:.2e}\"\n",
"    elif val > 1:\n",
"      val_str = f\"{val:.1f}\"\n",
"    else:\n",
"      val_str = f\"{val:.3f}\"\n",
"    print(f\"{key}: {val_str}\")"
],
"metadata": {
"id": "64txKHxG14Ll",
"cellView": "form"
},
"execution_count": 15,
"outputs": []
},
{
"cell_type": "code",
"source": [
"LOAD_SHORTEST_PATHS = True"
],
"metadata": {
"id": "RlnpaILp8d6-"
},
"execution_count": 16,
"outputs": []
},
{
"cell_type": "code",
"source": [
"# Load or compute best paths\n",
"shortest_path_path = os.path.join(HOME_DIR, \"shortest_paths.parquet\")\n",
"if LOAD_SHORTEST_PATHS:\n",
"  df_paths = pd.read_parquet(shortest_path_path,\n",
"                             engine=\"fastparquet\")\n",
"else:\n",
"  df_paths = build_shortest_paths(df_airports, df_routes)  # this takes ~20 minutes\n",
"  df_paths['num_hops'] = df_paths['path'].apply(len) - 1\n",
"  df_paths.to_parquet(shortest_path_path, engine=\"fastparquet\")"
],
"metadata": {
"id": "6d38f_rQ6Xfh"
},
"execution_count": 17,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"## Emissions estimate\n",
"\n",
"Finally, we can find the shortest path to the conference venue for each attendee. We make the conservative assumption that everyone located within 150 km of the Toronto airport (YYZ) emitted zero carbon while traveling to the conference."
],
"metadata": {
"id": "nKWl6dzh9-aM"
}
},
{
"cell_type": "code",
"source": [
"# Print best paths\n",
"best_paths = find_best_paths_per_attendee(\n",
" df_attendees,\n",
" df_paths[df_paths['destination']=='YYZ'],\n",
")\n",
"\n",
"print_path_stats(best_paths, df_attendees, co2_method=Method.ICCT)"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "soe4XE_wA4M_",
"outputId": "514657a0-100b-484f-b4ad-b757bdd021ef"
},
"execution_count": 18,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"kilometers flight: 2.36e+07\n",
"mt co2e: 2402.3\n",
"a100 hours: 1.38e+07\n",
"a100 years: 1572.5\n",
"person_years_co2e: 512.2\n",
"pct direct flights: 0.587\n",
"pct ground travel: 0.089\n",
"pct accounted for:: 1.000\n"
]
}
]
},
{
"cell_type": "code",
"source": [
"# myclimate estimate\n",
"print_path_stats(best_paths, df_attendees, co2_method=Method.MYCLIMATE)"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "mFGeYR57GEa9",
"outputId": "49c1aa91-bd2a-4c95-ca93-8c2994b99018"
},
"execution_count": 19,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"kilometers flight: 2.36e+07\n",
"mt co2e: 4662.6\n",
"a100 hours: 2.67e+07\n",
"a100 years: 3051.9\n",
"person_years_co2e: 994.2\n",
"pct direct flights: 0.587\n",
"pct ground travel: 0.089\n",
"pct accounted for:: 1.000\n"
]
}
]
},
{
"cell_type": "markdown",
"source": [
"The myclimate estimate is nearly twice as high. As noted above, the main reason for this difference is that myclimate incorporates a large multiplier for non-CO2 warming effects."
],
"metadata": {
"id": "VqoHZvWVGJ-T"
}
},
{
"cell_type": "markdown",
"source": [
"### Ablation\n",
"\n",
"One distinctive aspect of our methodology is the use of flight routes rather than direct city-to-city distances. We test the effect of this on our overall emissions estimate."
],
"metadata": {
"id": "NUGBxGPFF2Z4"
}
},
{
"cell_type": "code",
"source": [
"# @title\n",
"def flight_path_ablation(target: str):\n",
"  \"\"\"Ablate the effect of modeling flight paths.\"\"\"\n",
"  best_paths = find_best_paths_per_attendee(\n",
"      df_attendees,\n",
"      df_paths[df_paths['destination']==target]\n",
"  )\n",
"\n",
"  best_path_stats = analyze_best_paths(best_paths, df_attendees)\n",
"  direct_paths = find_best_direct_path_per_attendee(\n",
"      df_attendees, target, df_airports.loc[target].coordinates)\n",
"  direct_path_stats = analyze_best_paths(direct_paths, df_attendees)\n",
"\n",
"  best_path_co2 = best_path_stats['mt co2e']\n",
"  direct_path_co2 = direct_path_stats['mt co2e']\n",
"\n",
"  pct_connecting = (1 - best_path_stats['pct direct flights'])*100\n",
"  # The difference is expressed as a share of the route-based estimate.\n",
"  co2_increase = (1 - direct_path_co2/best_path_co2)*100\n",
"  print(f\"{co2_increase:.1f}% increase in emissions from {pct_connecting:.1f}%\"\n",
"        f\" of attendees taking connecting flights to {target} \"\n",
"        f\"({direct_path_co2:.0f} to {best_path_co2:.0f} mt co2e).\")"
],
"metadata": {
"id": "JoZsWc8hEfMG"
},
"execution_count": 20,
"outputs": []
},
{
"cell_type": "code",
"source": [
"flight_path_ablation('YYZ')"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "4Bc2gGz9YqqD",
"outputId": "9f0a1b7e-302f-483f-bf93-57893ae1c259"
},
"execution_count": 21,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"4.1% increase in emissions from 41.3% of attendees taking connecting flights to YYZ (2305 to 2402 mt co2e).\n"
]
}
]
},
{
"cell_type": "markdown",
"source": [
"The difference is relatively small, but this is in part because Toronto has a well-connected airport with a relatively high proportion of direct flights (59\\%). The impact of modeling flight routes is greater if we pick a less well-connected city such as Montreal (37.5\\% direct flights):"
],
"metadata": {
"id": "Z97gWw3bXGXR"
}
},
{
"cell_type": "code",
"source": [
"flight_path_ablation('YUL')"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "VGO20zYgYtLn",
"outputId": "f1dd9c9c-015f-4904-c906-b448627c392e"
},
"execution_count": 22,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"7.8% increase in emissions from 62.5% of attendees taking connecting flights to YUL (2326 to 2523 mt co2e).\n"
]
}
]
},
{
"cell_type": "markdown",
"source": [
"For Montreal, modeling flight connections raises the CO2 estimate from 2326 to 2523 mt CO2e, a difference equal to 7.8\\% of the routed total. In reality, the increase is likely to be higher, because we have assumed that each attendee chose the most carbon-efficient flight path.\n",
"\n",
"Note that these estimates assume the same geographic distribution of participants regardless of where the conference is held, so the absolute emissions numbers should not be treated as accurate predictions."
],
"metadata": {
"id": "_lnGWDFeZ7Ra"
}
},
{
"cell_type": "markdown",
"source": [
"# Limitations\n",
"\n",
"- We don't actually know how people got to ACL; we can only make an educated guess based on their addresses and commercial flight routes.\n",
"- The flight routes are based on data from 2014. If there are more direct flights now, the total emissions would be somewhat lower. On the other hand, we assume maximally carbon-efficient flight connections, which underestimates emissions.\n",
"- We don't incorporate differences between airplanes. For example, flights from remote airports might be on smaller planes with higher per-passenger emissions.\n",
"- The A100 estimate (13.8 million hours) is based on the global average greenhouse gas intensity of electricity production. Many data centers are powered by cleaner sources of energy. ACL 2023's A100-equivalent would be four times greater if we used the much lower greenhouse gas intensity of electricity production in Jacob's home state of Washington (https://www.washingtonpost.com/climate-environment/interactive/2023/clean-energy-electricity-sources/).\n",
"- If people didn't go to ACL, maybe they would have flown somewhere else. For example, maybe they would have visited Toronto for vacation (it was very fun!), or maybe they would have gone to a different conference.\n",
"- Canceling ACL would not have eliminated all of these emissions because most of the flights would have happened even if there were a few more empty seats.\n",
"\n",
"The last two points are counterfactuals, and whether they are a helpful way to think about emissions reductions is a philosophical question that is beyond the scope of this colab. It seems uncontroversial that a broad-based reduction in conference travel (e.g., across the computer science research community) could have a significant impact.\n",
"\n",
"### Comparison with LLMs\n",
"\n",
"Time to address the elephant (llama?) in the room. The ACL 2023 travel emissions are on a similar scale to the numbers reported in papers that quantify the emissions associated with training some of the largest language models. However, LLM training emissions are increasingly mitigated by the transition to low-carbon sources of energy. For commercial air travel, the situation is different: we may be decades away from alternatives to fossil fuels. Until then, flying less is the only way to reduce travel-related emissions.\n"
],
"metadata": {
"id": "vm7QNsTJ9ij9"
}
},
{
"cell_type": "markdown",
"source": [
"# Bonus visualization"
],
"metadata": {
"id": "VRAbSM-OZO14"
}
},
{
"cell_type": "code",
"source": [
"!pip install basemap"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 516
},
"id": "AeNFu45yZR-n",
"outputId": "1419c324-0ed2-467c-cb59-7d1b3b59dbbd"
},
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Collecting basemap\n",
" Downloading basemap-1.3.8-cp310-cp310-manylinux1_x86_64.whl (860 kB)\n",
"Collecting basemap-data<1.4,>=1.3.2 (from basemap)\n",
"  Downloading basemap_data-1.3.2-py2.py3-none-any.whl (30.5 MB)\n",
"Requirement already satisfied: pyshp<2.4,>=1.2 in /usr/local/lib/python3.10/dist-packages (from basemap) (2.3.1)\n",
"Requirement already satisfied: matplotlib<3.8,>=1.5 in /usr/local/lib/python3.10/dist-packages (from basemap) (3.7.1)\n",
"Requirement already satisfied: pyproj<3.7.0,>=1.9.3 in /usr/local/lib/python3.10/dist-packages (from basemap) (3.6.1)\n",
"Requirement already satisfied: numpy<1.26,>=1.21 in /usr/local/lib/python3.10/dist-packages (from basemap) (1.23.5)\n",
"Requirement already satisfied: contourpy>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib<3.8,>=1.5->basemap) (1.2.0)\n",
"Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.10/dist-packages (from matplotlib<3.8,>=1.5->basemap) (0.12.1)\n",
"Requirement already satisfied: fonttools>=4.22.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib<3.8,>=1.5->basemap) (4.45.1)\n",
"Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib<3.8,>=1.5->basemap) (1.4.5)\n",
"Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib<3.8,>=1.5->basemap) (23.2)\n",
"Requirement already satisfied: pillow>=6.2.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib<3.8,>=1.5->basemap) (9.4.0)\n",
"Requirement already satisfied: pyparsing>=2.3.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib<3.8,>=1.5->basemap) (3.1.1)\n",
"Requirement already satisfied: python-dateutil>=2.7 in /usr/local/lib/python3.10/dist-packages (from matplotlib<3.8,>=1.5->basemap) (2.8.2)\n",
"Requirement already satisfied: certifi in /usr/local/lib/python3.10/dist-packages (from pyproj<3.7.0,>=1.9.3->basemap) (2023.11.17)\n",
"Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.7->matplotlib<3.8,>=1.5->basemap) (1.16.0)\n",
"Installing collected packages: basemap-data, basemap\n",
"Successfully installed basemap-1.3.8 basemap-data-1.3.2\n"
]
},
{
"output_type": "display_data",
"data": {
"application/vnd.colab-display-data+json": {
"pip_warning": {
"packages": [
"mpl_toolkits"
]
}
}
},
"metadata": {}
}
]
},
{
"cell_type": "code",
"source": [
"import matplotlib.pyplot as plt\n",
"from mpl_toolkits.basemap import Basemap"
],
"metadata": {
"id": "yMQv9n4R3iNO"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
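"# Note: .loc slices by index label (both ends inclusive), not by position;\n",
"# .iloc would be the positional equivalent.\n",
"top_sources = df_attendees.sort_values('count', ascending=False).loc[1:7]\n",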
"city_list = top_sources['city'].to_list()\n",
"coordinates_list = top_sources['coordinates'].to_list()\n",
"count_list = top_sources['count'].to_list()\n",
"city_counts = df_attendees.set_index('city')['count'].to_dict()"
],
"metadata": {
"id": "_6mVNFIm41qv"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"# @title\n",
"# Convert Cartesian coordinates back to geographical coordinates\n",
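"def convert_to_geographical(x, y, z):\n",
"  # Normalize first so that non-unit vectors (e.g., a mean of unit vectors)\n",
"  # still yield a valid latitude.\n",
"  r = np.sqrt(x**2 + y**2 + z**2)\n",
"  lat_rad = np.arcsin(z / r)\n",
"  lon_rad = np.arctan2(y, x)\n",
"  lat = np.degrees(lat_rad)\n",
"  lon = np.degrees(lon_rad)\n",
"  return lat, lon\n",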
"\n",
"toronto_lat, toronto_lon = 43.70, -79.42\n",
"destinations = {city_i: coordinates_i for city_i, coordinates_i in\n",
" zip(city_list, coordinates_list)}\n",
"# Find center of bounding box containing all destinations\n",
"locs_xyz = [latlon_to_cartesian(*coords) for coords in\n",
" list(destinations.values()) + [(toronto_lat, toronto_lon)]]\n",
"bb_center = .5 * (np.array(locs_xyz).max(0) + np.array(locs_xyz).min(0))\n",
"bb_center_latlon = convert_to_geographical(*np.array(locs_xyz).mean(0))\n",
"# bb_center_latlon = convert_to_geographical(*bb_center)\n",
"\n",
"# Create a Basemap with an orthographic projection centered north-east of Toronto\n",
"# (the offsets below are set by hand)\n",
"m = Basemap(\n",
" projection='ortho',\n",
" lat_0=toronto_lat + 20, # bb_center_latlon[0], # + 25,\n",
" lon_0=toronto_lon + 5, # bb_center_latlon[1], # + 10, # avoid this hack\n",
" resolution='c',\n",
")\n",
"\n",
"# Draw coastlines and countries\n",
"coastlines = m.drawcoastlines()\n",
"coastlines.set_alpha(0.5)\n",
"countries = m.drawcountries()\n",
"countries.set_alpha(0.4)\n",
"\n",
"# Plot Toronto\n",
"x_toronto, y_toronto = m(toronto_lon, toronto_lat)\n",
"# m.plot(x_toronto, y_toronto, 'bo', markersize=8, label='Toronto')\n",
"\n",
"# Plot destinations and draw great circle routes\n",
"for city, (lat, lon) in destinations.items():\n",
"  x_city, y_city = m(lon, lat)\n",
"  m.drawgreatcircle(toronto_lon, toronto_lat, lon, lat,\n",
"                    linewidth=1 + 2 * city_counts[city] / 100,\n",
"                    color='r', linestyle='-', alpha=0.8)\n",
"  m.plot(x_city, y_city, 'g*', markersize=5, label=city)\n",
"  # New York's label is left-aligned; all other labels are right-aligned.\n",
"  ha = 'left' if city == \"new_york\" else 'right'\n",
"  plt.text(x_city, y_city, city.title().replace('_', ' '), fontsize=9,\n",
"           ha=ha, va='bottom', color='black', weight='bold')\n",
"\n",
"# Add a title\n",
"plt.title('Great Circle Routes from Toronto to Top Attendee Addresses')\n",
"\n",
"# Show the map\n",
"plt.show()"
],
"metadata": {
"id": "FBKFWEKU3bKr",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 428
},
"cellView": "form",
"outputId": "f5a213bb-84ba-4ff8-8fad-a293b399d952"
},
"execution_count": null,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
],
"image/png": "\n"
},
"metadata": {}
}
]
},
{
"cell_type": "markdown",
"source": [
"### Acknowledgements\n",
"\n",
"We are grateful to the ACL for providing the aggregated conference participation data, and to Sasha Luccioni for her feedback."
],
"metadata": {
"id": "IlFxwbJF7S8B"
}
}
]
}