@igorcosta
Last active April 17, 2024 13:56
Get a list of all programming languages used by the repositories in a GitHub Organisation

get_org_languages.py 🚀

Python script that lists every programming language used in a GitHub organization and returns the total number of bytes of code written in each language across the organization's repositories.

Hard requirements: 🐍

  • Make sure you have Python 3 or newer installed.
  • Make sure you have a GitHub Personal Access Token with the read:org and repo:status permissions. 🔐
  • Install the Python dependency (json is part of the standard library and does not need to be installed):
pip install requests

GITHUB_TOKEN

The GitHub token requirements for this script are as follows:

The token must have the read:org permission. This permission is required to read the organization's repositories. The token must have the repo:status permission. This permission is required to get the commit status of the repositories.
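One way to confirm that a token carries these permissions is to inspect the X-OAuth-Scopes header that the GitHub API returns for classic personal access tokens. The sketch below only parses such a header value; the names missing_scopes and IMPLIED_BY are hypothetical, and the small hierarchy table (repo covering repo:status, write:org and admin:org covering read:org) is an assumption you should check against the GitHub scope documentation.

```python
# Assumed hierarchy for classic tokens: a broader scope also grants the narrower one.
IMPLIED_BY = {
    "repo:status": {"repo"},
    "read:org": {"write:org", "admin:org"},
}


def missing_scopes(scopes_header, required=("read:org", "repo:status")):
    """Return the required scopes not covered by an X-OAuth-Scopes header value."""
    # The header is a comma-separated list such as "read:org, repo:status".
    granted = {s.strip() for s in (scopes_header or "").split(",") if s.strip()}
    missing = []
    for scope in required:
        acceptable = {scope} | IMPLIED_BY.get(scope, set())
        if not (acceptable & granted):
            missing.append(scope)
    return missing


print(missing_scopes("read:org, gist"))   # ['repo:status']
print(missing_scopes("repo, admin:org"))  # []
```

With requests, the header value could be read from response.headers.get("X-OAuth-Scopes") on any authenticated response. Fine-grained tokens may not send this header at all, in which case the check reports every scope as missing.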

Once you have a GitHub token with the required permissions, you can set it as an environment variable on your machine. This will allow you to run the script without having to specify the token each time.

To set the token as an environment variable, you can use the following command:

export GITHUB_TOKEN=<your_token>

Once you have set the token as an environment variable, you can run the script by passing the name of the GitHub organization as a command-line argument.
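As a minimal sketch of how a script could pick up the exported variable, the helper below reads GITHUB_TOKEN from the environment and builds the Authorization header. The function name build_headers is hypothetical, and the environ parameter exists only so the helper can be tried without touching the real environment.

```python
import os


def build_headers(environ=os.environ):
    """Build request headers from the GITHUB_TOKEN environment variable."""
    token = environ.get("GITHUB_TOKEN")
    if not token:
        # Fail early with a hint instead of sending an unauthenticated request.
        raise RuntimeError(
            "GITHUB_TOKEN is not set; run: export GITHUB_TOKEN=<your_token>")
    return {"Authorization": "token {}".format(token)}


# Example with an explicit mapping instead of the real environment.
print(build_headers({"GITHUB_TOKEN": "example-token"}))
```

Reading the token at run time keeps it out of the source file, so the script can be shared without leaking credentials.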

Usage: 💻

python get_org_languages.py <organization_name>

Example:

python get_org_languages.py my-organization
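The command line above takes a single positional argument. One possible way to parse it is with argparse from the standard library, which also produces a usage message when the argument is missing. This is a sketch only; the helper name parse_args and the help text are illustrative.

```python
import argparse


def parse_args(argv=None):
    """Parse the organization name from the command line."""
    parser = argparse.ArgumentParser(
        description="List the programming languages used in a GitHub organization.")
    parser.add_argument(
        "organization_name",
        help="name of the GitHub organization to inspect")
    # argv=None makes argparse fall back to sys.argv[1:].
    return parser.parse_args(argv)


args = parse_args(["my-organization"])
print(args.organization_name)  # my-organization
```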

Output: 📄

Programming languages used in `organization-name`:
Ruby: 266071
JavaScript: 311370
CSS: 135455
HTML: 30939
Shell: 54565
C: 1586069
Vim Script: 58
Python: 11074
Dockerfile: 1143
Perl: 363
Raku: 42
CoffeeScript: 8368
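The numbers in this output are byte counts, printed in whatever order the languages were first seen. If a ranked view is easier to read, a small post-processing step could sort the totals and add each language's share. The sketch below uses a few of the figures above as sample input; language_shares is a hypothetical helper, not part of the script.

```python
def language_shares(totals):
    """Return (language, bytes, percent) tuples sorted by bytes, largest first."""
    grand_total = sum(totals.values())
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [
        (language, count, 100.0 * count / grand_total if grand_total else 0.0)
        for language, count in ranked
    ]


# A subset of the sample output shown above.
sample = {"Ruby": 266071, "JavaScript": 311370, "C": 1586069, "Raku": 42}
for language, count, share in language_shares(sample):
    print("{}: {} bytes ({:.1f}%)".format(language, count, share))
```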

License: ⚖️

MIT License

Contributing: 🤝

If you have any suggestions or bug reports, please feel free to write a comment here.

Additional information: ℹ️

This script uses the GitHub API to get all the repositories in the organization and then gets the programming languages used in each repository. The script returns a dictionary of programming languages, with the total number of bytes of code written in each language across all repositories in the organization.
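The aggregation step can be illustrated offline. The sketch below assumes each repository has already been reduced to a dictionary mapping language names to byte counts, which is the shape of the languages response, and sums them with collections.Counter. It also counts how many repositories use each language, which is an extra the script itself does not report. The function name and the sample data are made up for illustration.

```python
from collections import Counter


def aggregate_languages(per_repo_languages):
    """Sum byte counts per language and count the repositories using each one."""
    bytes_per_language = Counter()
    repos_per_language = Counter()
    for languages in per_repo_languages:
        # Updating with a mapping adds its values to the running totals.
        bytes_per_language.update(languages)
        # Updating with the keys adds one per language present in this repo.
        repos_per_language.update(languages.keys())
    return dict(bytes_per_language), dict(repos_per_language)


# Hypothetical data for two repositories.
sample = [{"Python": 100, "Shell": 20}, {"Python": 50, "C": 300}]
total_bytes, repo_counts = aggregate_languages(sample)
print(total_bytes)
print(repo_counts)
```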

This script can be used to get a better understanding of the programming languages used in an organization. This information can be used to make decisions about which programming languages to support and which programming languages to train employees in.

I hope you like this.

import os
import sys

import requests

API_URL = "https://api.github.com"
# Read the token from the environment (export GITHUB_TOKEN=<your_token>).
GITHUB_TOKEN = os.environ.get("GITHUB_TOKEN", "")
HEADERS = {"Authorization": "token {}".format(GITHUB_TOKEN)}


def get_org_repos(org_name):
    """Gets all the repositories in a GitHub organization.

    Args:
        org_name: The name of the GitHub organization.

    Returns:
        A list of repository dictionaries.
    """
    url = "{}/orgs/{}/repos".format(API_URL, org_name)
    repos = []
    page = 1
    while True:
        # Results are paginated; request up to 100 repositories per page
        # and stop at the first empty page.
        response = requests.get(
            url, headers=HEADERS, params={"per_page": 100, "page": page})
        response.raise_for_status()
        batch = response.json()
        if not batch:
            return repos
        repos.extend(batch)
        page += 1


def get_repo_languages(repo):
    """Gets the programming languages used in a GitHub repository.

    Args:
        repo: A GitHub repository dictionary.

    Returns:
        A dictionary of programming languages, with the number of bytes of
        code written in each language.
    """
    url = "{}/repos/{}/{}/languages".format(
        API_URL, repo["owner"]["login"], repo["name"])
    response = requests.get(url, headers=HEADERS)
    # Fail loudly on errors instead of aggregating an error payload.
    response.raise_for_status()
    return response.json()


def get_org_languages(org_name):
    """Gets all the programming languages used in a GitHub organization.

    Args:
        org_name: The name of the GitHub organization.

    Returns:
        A dictionary of programming languages, with the total number of bytes
        of code written in each language across all repositories in the
        organization.
    """
    org_languages = {}
    for repo in get_org_repos(org_name):
        for language, bytes_of_code in get_repo_languages(repo).items():
            org_languages[language] = (
                org_languages.get(language, 0) + bytes_of_code)
    return org_languages


def main():
    # Expect exactly one argument: the organization name.
    if len(sys.argv) != 2:
        sys.exit("Usage: python get_org_languages.py <organization_name>")
    if not GITHUB_TOKEN:
        sys.exit("Set the GITHUB_TOKEN environment variable first.")
    org_name = sys.argv[1]
    org_languages = get_org_languages(org_name)
    print("Programming languages used in {}:".format(org_name))
    for language, bytes_of_code in org_languages.items():
        print("{}: {}".format(language, bytes_of_code))


if __name__ == "__main__":
    main()