This was the initial funding submission.
The Open Green Web
Assign the project to an area. {Civic Tech, Data Security, Data Literacy, Infrastructure} *
Data infrastructure
Max. 700 characters
Since 2006, the Green Web Foundation Directory, set up by René Post and Arend-Jan Tetteroo, has tracked which sites on the internet run on renewable power and which don’t, letting you check this online for yourself.
It works by checking which datacentre a website is running in, who operates that datacentre, and how that operator sources its power. Where there is insufficient data, companies can add their own information, and after it has been checked, the data about the site is updated.
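The lookup described above can be sketched roughly as follows. This is only an illustration of the idea: the IP ranges and datacentre names are placeholder values, not real directory data, and the real check involves far more detail about operators and power sourcing.

```python
import ipaddress
import socket

# Hypothetical example data: IP ranges belonging to datacentres known
# to run on renewable power. The real directory holds far richer data.
GREEN_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),     # "Example Solar DC"
    ipaddress.ip_network("198.51.100.0/24"),  # "Example Wind DC"
]

def is_green_ip(ip: str) -> bool:
    """Return True if the IP falls inside a known green datacentre range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in network for network in GREEN_RANGES)

def check_site(hostname: str) -> bool:
    """Resolve a hostname, then check whether it sits in a green datacentre."""
    ip = socket.gethostbyname(hostname)
    return is_green_ip(ip)
```

In practice the hard part is not this lookup, but keeping the underlying ranges and power-sourcing data accurate, which is why the directory accepts and verifies submissions from the companies themselves.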
The aim of the Open Green Web is to take the Green Web Foundation Directory and open it up: open source the codebase, extend the API, and publish an open dataset from the last 10+ years of this data.
Max. 700 characters
When working with machine learning, it’s common to hear “the algorithms are not the hard part. Getting good data is the hard part”.
Good data about what power different datacentres use is time-consuming to collect, as gathering it currently relies on sending lots of emails to multiple departments in multiple organisations.
One of the goals of the Open Green Web project is to create this good data: a freely usable, open dataset showing which parts of the internet (i.e. which datacentres sites use) run on renewable power, so it can serve as a base for building new products and services.
Max. 700 characters
Right now, because we have so little transparency when it comes to finding out how the web is powered, most of the web runs on coal.
What we use to power the internet matters: IT as a sector is a larger emitter of greenhouse gases than the aviation industry. What’s more, these emissions are avoidable. Clean, affordable alternatives exist, and when a tech company’s users find out and pressure them to switch to green power, they usually do.
The goal of the Open Green Web project is to help speed the transition of the web to green energy, by making the information about which services use renewable power, and which do not, as widely and freely available as possible.
Max. 1300 characters
We have three main areas of work: trust, ecosystem, and reach.
We want to make it easy to see how we power the web now, and trust the data. To do this we will:
- publish a detailed methodology explaining how we check whether a site is using green power
- open source the code for the platform, and the browser plug-ins
- update the documentation for understanding how it all works, and how to contribute to the project
Next, we want to make it easy to use the data by extending the API. Right now, the API is read-only; to update the directory, you need to sign in to the website to add information about a site.
We intend to allow users to update the green web directory through an API, so it can be incorporated into existing tools.
We will expose more information in the API responses, so tools using the API already can do more, and new tools can be built on top of it.
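To make this concrete, here is a minimal sketch of how a tool might talk to the API. The endpoint path, the JSON field names, and the shape of a future write call are all assumptions for illustration, not the final design; the real API documentation should be consulted for the actual interface.

```python
import json

# Assumed base URL and endpoint shape - check the current API docs
# for the real paths before using this.
API_BASE = "https://api.thegreenwebfoundation.org"

def greencheck_url(domain: str) -> str:
    """Build the read-only green check URL for a domain."""
    return f"{API_BASE}/greencheck/{domain}"
    # A live check would then be e.g.:
    # urllib.request.urlopen(greencheck_url("example.com")).read()

def parse_greencheck(payload: str) -> bool:
    """Extract the green/grey verdict from a JSON response body."""
    return bool(json.loads(payload).get("green", False))

def build_update(domain: str, hosted_by: str, green: bool) -> bytes:
    """Illustrative body for the proposed write API: how a company
    might submit an update about a site, once updates are supported."""
    body = json.dumps({"url": domain, "hostedby": hosted_by, "green": green})
    return body.encode("utf-8")
```

The point of the write API is that a hosting company could POST an update like the one `build_update` produces directly from its own tooling, instead of signing in to the website by hand.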
Finally, we will publish open datasets from the data collected, so it can be used for analysis by academia or industry.
One goal we have is creating a dataset that can easily be incorporated into existing resources like the HTTP Archive, a well-known open dataset about the top million sites, used by industry to track the state of the web.
Max. 400 characters
Ecograder is a tool designed to look at a single website, and provide tips on how to make it greener. It uses the current Green Web Foundation API.
The HTTP Archive is a dataset on the million most popular websites on the internet. It is frequently cited by industry and academics to discuss the state of the web, and how it is built. It contains no data about how a site is powered.
Max. 700 characters
There are two target groups: makers of the web, and users of the web.
For the makers of the web, we will focus on developer communities like Climate Action Tech, a cross-company group of tech employees already trying to move to renewable power, and continue to run workshops and give talks at industry conferences and meetup groups.
For users of the web, we will use channels we already have had success with, like the Green Web Foundation’s browser plugins, and work with existing campaigns.
Examples include Greenpeace’s Clicking Clean campaign, which requests from high-profile companies some of the data we already collect, and Mozilla’s Internet Health Report, which has also focused on this area since 2017. I was a contributor to the 2017 report, and each year Mozilla looks for data to help communicate these issues.
Have you already worked on the idea? If so, briefly describe the current status and explain the proposed changes. *
Max. 700 characters
The Green Web Foundation is in production: around 1,200 companies update the Directory regularly, and every day around 400,000 checks come in through the browser plugins.
In March, while working on a research project, I contacted René at the Green Web Foundation to ask about using their API. After using their API in my analysis and publishing a dataset based on it, we discussed how we might work together and extend the project.
Together we agreed that open sourcing the software would help with building a community of users and contributors around it, and by opening the data, we can contribute missing data infrastructure needed to help move the web to running entirely on green energy.
https://www.thegreenwebfoundation.org/
Briefly sketch the most important milestones which you (and the team) want to implement in the funding period.
Max. 700 characters
We will start by preparing the platform code so that it is easy to understand how both the API and the browser plugins work, and easy to contribute and accept changes to the codebase. When we merge our first pull request from a new contributor, we will have passed our first milestone.
Next we will extend the API to allow updates to the data. When the first company sends updates to us over the new API, we will have met the next milestone.
Our final goal is people using the data we publish. When the dataset is cited by an academic paper, or incorporated into a larger dataset like the HTTP Archive, or used as a basis for a visualisation for Earth Day 2019, we will have met this milestone.