Created August 11, 2020 20:20
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "ai-generated-content.ipynb",
"provenance": [],
"authorship_tag": "ABX9TyM8YEp0mpwiuT92OlguqohJ",
"include_colab_link": true
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
"cells": [
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
"source": [
"<a href=\"\" target=\"_parent\"><img src=\"\" alt=\"Open In Colab\"/></a>"
"cell_type": "code",
"metadata": {
"id": "HBMCqUqiDMpE",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 102
"outputId": "81b0aea6-fcdb-4c35-a2a2-0bc8eee57d5f"
"source": [
"# This will install the Transformers library directly\n",
"# from GitHub.\n",
"!pip install -q git+"
"execution_count": 1,
"outputs": [
"output_type": "stream",
"text": [
"\u001b[K |████████████████████████████████| 3.0MB 2.8MB/s \n",
"\u001b[K |████████████████████████████████| 1.1MB 26.0MB/s \n",
"\u001b[K |████████████████████████████████| 890kB 38.9MB/s \n",
"\u001b[?25h Building wheel for transformers ( ... \u001b[?25l\u001b[?25hdone\n",
" Building wheel for sacremoses ( ... \u001b[?25l\u001b[?25hdone\n"
"name": "stdout"
"cell_type": "code",
"metadata": {
"id": "PUvscABODQ3q",
"colab_type": "code",
"colab": {}
"source": [
"import tensorflow as tf\n",
"import textwrap\n",
"from transformers import TFGPT2LMHeadModel, GPT2Tokenizer"
"execution_count": 12,
"outputs": []
"cell_type": "code",
"metadata": {
"id": "BaYoChPhDhJT",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 85
"outputId": "61fff41e-8009-4bed-9e7d-a90efb859a9b"
"source": [
"# We are going to be using GPT-2 Medium. The first time we run this\n",
"# cell, the models will be downloaded. They are large.\n",
"tokenizer = GPT2Tokenizer.from_pretrained(\"gpt2-medium\")\n",
"model = TFGPT2LMHeadModel.from_pretrained(\"gpt2-medium\")"
"execution_count": 14,
"outputs": [
"output_type": "stream",
"text": [
"All model checkpoint weights were used when initializing TFGPT2LMHeadModel.\n",
"All the weights of TFGPT2LMHeadModel were initialized from the model checkpoint at gpt2-medium.\n",
"If your task is similar to the task the model of the checkpoint was trained on, you can already use TFGPT2LMHeadModel for predictions without further training.\n"
"name": "stderr"
"cell_type": "code",
"metadata": {
"id": "8az6BrxLDTSC",
"colab_type": "code",
"colab": {}
"source": [
"# This is the seed text. You can put some text here that will\n",
"# help the AI focus on a specific topic. The AI will generate text\n",
"# that continues from this seed.\n",
"SEED_TEXT = \"\"\"\n",
" Clean Code should be as enjoyable as reading a good novel. Clean Code tells a story.\n",
" Clean Code is a pleasure to read. It's orderly and elegant. Everything is there for a reason. \n",
"\"\"\"\n",
"\n",
"# This represents the maximum length of the generated sequence, in tokens\n",
"# (not words). The seed text counts toward this limit.\n",
"MAX_LENGTH = 250"
"execution_count": 27,
"outputs": []
"cell_type": "code",
"metadata": {
"id": "bki6iMqjDsc7",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
"outputId": "056b4187-2d79-4d62-f5f9-367967643cbf"
"source": [
"# Here we encode the seed text and generate the text.\n",
"encoded_text = tokenizer.encode(SEED_TEXT, return_tensors='tf')\n",
"output = model.generate(\n",
" encoded_text,\n",
" do_sample=True, \n",
" max_length=MAX_LENGTH, \n",
" top_k=50, \n",
" top_p=0.95, \n",
" num_return_sequences=1\n",
")"
"execution_count": 28,
"outputs": [
"output_type": "stream",
"text": [
"Setting `pad_token_id` to 50256 (first `eos_token_id`) to generate sequence\n"
"name": "stderr"
"cell_type": "code",
"metadata": {
"id": "mb07t44BEMIm",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 153
"outputId": "9d6c8be3-fece-4a3a-99a0-1cf056cfd52e"
"source": [
"# Printing out the generated text.\n",
"for sample in output:\n",
" text = tokenizer.decode(sample, skip_special_tokens=True) \n",
" text = text[len(SEED_TEXT):].strip()\n",
" print(textwrap.fill(text, 80))"
"execution_count": 29,
"outputs": [
"output_type": "stream",
"text": [
"Clean Code takes some ideas that you have from other languages and adds some\n",
"extra stuff for a more functional look. You'll also notice some interesting\n",
"features such as object-oriented design and an introduction to functional\n",
"programming. And of course, you can look at how the author was able to use these\n",
"features. I would like to see such a book printed around the world. I'd have\n",
"loved to have seen the book from the perspective of a programmer, but to read in\n",
"clean and simple style? That would have been quite nice. The editor might have\n",
"just been wrong, right? Anyway, thank you for making this review!\n"
"name": "stdout"
"cell_type": "code",
"metadata": {
"id": "EGmiZOHhFRrv",
"colab_type": "code",
"colab": {}
"source": [
"execution_count": null,
"outputs": []
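The generation cell above passes `do_sample=True`, `top_k=50`, and `top_p=0.95` to `model.generate`. As a rough illustration of what those two filters do to the next-token distribution, here is a minimal pure-Python sketch. It is a simplified stand-in written for this note, not the Transformers implementation, and the helper names (`softmax`, `top_k_top_p_filter`, `sample_next_token`) are hypothetical.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_top_p_filter(probs, top_k=50, top_p=0.95):
    """Return {token_id: probability} for the tokens that survive both filters."""
    # Rank token ids from most to least likely.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # top-k: keep only the k most likely tokens, then renormalize.
    ranked = ranked[:top_k]
    k_total = sum(probs[i] for i in ranked)
    k_probs = {i: probs[i] / k_total for i in ranked}

    # top-p (nucleus): keep the smallest prefix of the ranking whose
    # cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for token_id in ranked:
        kept.append(token_id)
        cumulative += k_probs[token_id]
        if cumulative >= top_p:
            break

    # Renormalize the survivors so they form a valid distribution.
    p_total = sum(k_probs[i] for i in kept)
    return {i: k_probs[i] / p_total for i in kept}


def sample_next_token(logits, top_k=50, top_p=0.95, rng=random):
    """Sample one token id from the filtered distribution."""
    candidates = top_k_top_p_filter(softmax(logits), top_k, top_p)
    ids = list(candidates)
    weights = [candidates[i] for i in ids]
    return rng.choices(ids, weights=weights, k=1)[0]
```

Lower values of `top_k` or `top_p` shrink the candidate set, which tends to make the generated text more predictable; higher values allow less likely tokens through.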