OpenAI GPT versions gist
GPT Version,Training data size,Datasets & data sources,Parameter count,Release Date
GPT-1,4.5 GB of text,BookCorpus: 4.5 GB of text from 7000 unpublished books,117 million,2018
GPT-2,40 GB of text,WebText: 40 GB of text (8 million documents from 45 million webpages upvoted on Reddit),1.5 billion,14.02.2019
GPT-3,570 GB of text,570 GB of plaintext (~0.4 trillion tokens); mostly CommonCrawl; WebText; English Wikipedia; and two books corpora (Books1 and Books2),175 billion,2020
GPT-3.5,45 TB of data,Fine-tuned version of GPT-3,175 billion,15.03.2022
GPT-4,Undisclosed,Training data and parameter count are not officially disclosed; the figures here reflect rumored estimates,100 trillion (rumored),14.03.2023
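If it helps to consume the table programmatically, here is a minimal sketch that reads it with Python's standard csv module; it assumes the gist has been saved locally as gpt-versions.csv with the column headers shown above.

```python
import csv

# Minimal sketch: load gpt-versions.csv (saved locally from this gist)
# and print each model's parameter count and release date.
with open("gpt-versions.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        print(f"{row['GPT Version']}: {row['Parameter count']} parameters ({row['Release Date']})")
```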