@dribnet
Forked from aparrish/tracery-with-data.ipynb
Created July 18, 2017 04:50
Tracery and Python
{
"origin": "#hello.capitalize#, #location#!",
"hello": ["hello", "greetings", "howdy", "hey"],
"location": ["world", "solar system", "galaxy", "universe"]
}
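The grammar file shown above is plain JSON, so Python's standard `json` module can turn it into a dictionary before it is handed to Tracery. The following is a minimal sketch of that step, using only the standard library; the variable names `grammar_text` and `rules` are chosen here for illustration, and the grammar is held in a string rather than read from disk.

```python
import json

# The grammar from the file above, held in a string for illustration.
grammar_text = """
{
  "origin": "#hello.capitalize#, #location#!",
  "hello": ["hello", "greetings", "howdy", "hey"],
  "location": ["world", "solar system", "galaxy", "universe"]
}
"""

# json.loads() parses JSON text into Python data structures.
rules = json.loads(grammar_text)

# Each key is a rule name; each value is a string or a list of expansions.
for name, expansions in rules.items():
    print(name, "->", expansions)
```

The resulting `rules` dictionary has the same shape as the one the notebook below passes to `tracery.Grammar()`.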
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Working with Tracery in Python\n",
"\n",
"By [Allison Parrish](http://www.decontextualize.com)\n",
"\n",
"This tutorial shows you how to use [Tracery](http://tracery.io) in your Python programs. In particular, it shows a handful of useful patterns for incorporating large amounts of data into your Tracery grammars, something that would be impractical or inconvenient with a Tracery generator on its own.\n",
"\n",
"Tracery, made by [Kate Compton](http://www.galaxykate.com/), is an easy-to-use but powerful language and toolset for generating text from grammars. If you're not familiar with how Tracery works, try [the official tutorial](http://www.crystalcodepalace.com/traceryTut.html) or [this tutorial I wrote](http://air.decontextualize.com/tracery/).\n",
"\n",
"This tutorial is written for Python 3+ and should also work on 2.7."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Simple example\n",
"\n",
"In order to generate text from a Tracery grammar in Python, you'll need to install the [Tracery Python module](https://pypi.python.org/pypi/tracery). It's easiest to do this with `pip` at the command line, like so:\n",
"\n",
"    pip install tracery\n",
"    \n",
"(If you get a permissions error, try `pip install --user tracery`.)\n",
"\n",
"Once you've installed the `tracery` module, try the following example program:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Hello, galaxy!\n"
]
}
],
"source": [
"import tracery\n",
"from tracery.modifiers import base_english\n",
"\n",
"# put your grammar here as the value assigned to \"rules\"\n",
"rules = {\n",
"    \"origin\": \"#hello.capitalize#, #location#!\",\n",
"    \"hello\": [\"hello\", \"greetings\", \"howdy\", \"hey\"],\n",
"    \"location\": [\"world\", \"solar system\", \"galaxy\", \"universe\"]\n",
"}\n",
"\n",
"grammar = tracery.Grammar(rules) # create a grammar object from the rules\n",
"grammar.add_modifiers(base_english) # add pre-programmed modifiers\n",
"print(grammar.flatten(\"#origin#\")) # and flatten, starting with origin rule"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This program takes a Tracery grammar (in the form of a Python dictionary) and \"flattens\" it, printing its output to standard output. You can take the content of a Tracery grammar you've written and paste it into this Python program as the value being assigned to the variable `rules` (unless your Tracery grammar uses some aspect of JSON formatting that works differently in Python, like Unicode escapes). Run the program from the command line (or in the cell above, if you're viewing this in Jupyter Notebook) and you'll get a line of output from your grammar."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Reading a Tracery grammar from a JSON file\n",
"\n",
"You may already have a set of Tracery grammar files that you want to generate from, or you may not want to copy and paste the grammar into your Python script. If this is the case, no problem! You can use Python's `json` library to load a Tracery grammar from any JSON file. The program below shows how this works.\n",
"\n",
"Included is a sample grammar called `test-grammar.json`. Let's have a look."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\r\n",
" \"origin\": \"#hello.capitalize#, #location#!\",\r\n",
" \"hello\": [\"hello\", \"greetings\", \"howdy\", \"hey\"],\r\n",
" \"location\": [\"world\", \"solar system\", \"galaxy\", \"universe\"]\r\n",
"}"
]
}
],
"source": [
"!cat test-grammar.json"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Python's `json` module provides functions for reading JSON-formatted data into Python as Python data structures, and exporting Python data structures to JSON format. The `.loads()` function from the module parses a string containing JSON-formatted data and returns the corresponding Python data structure (usually a dictionary or a list)."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Greetings, galaxy!\n",
"Howdy, galaxy!\n",
"Howdy, galaxy!\n",
"Hey, solar system!\n",
"Hello, galaxy!\n",
"Hey, world!\n",
"Howdy, world!\n",
"Hello, world!\n",
"Hey, galaxy!\n",
"Hello, world!\n"
]
}
],
"source": [
"import tracery\n",
"from tracery.modifiers import base_english\n",
"import json\n",
"\n",
"# use json.loads() and open() to read in a JSON file as a Python data structure\n",
"rules = json.loads(open(\"test-grammar.json\").read())\n",
"\n",
"grammar = tracery.Grammar(rules)\n",
"grammar.add_modifiers(base_english)\n",
"\n",
"# print ten random outputs\n",
"for i in range(10):\n",
"    print(grammar.flatten(\"#origin#\"))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The above example uses a `for` loop to call the `.flatten()` method multiple times."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Using external data in Tracery rules\n",
"\n",
"An interesting affordance of using Tracery in Python is the ability to *fill in* parts of the grammar using external data. By \"external\" data, I mean data that isn't written directly in the grammar itself but is inserted into the grammar when your program runs. One reason to do this might be to make the output of your grammar dynamic, using (e.g.) data returned from a web API. A simpler reason is that large Tracery grammars can be difficult to edit and navigate, especially when you're working with rules that might have hundreds or thousands of possible replacements.\n",
"\n",
"To demonstrate, let's start with the generator discussed in [my Tracery tutorial](http://air.decontextualize.com/tracery/) that generates takes on the \"Dammit Jim, I'm a doctor, not a `OTHER PROFESSION`\" snowclone/trope. Such a grammar might look like this:\n",
"\n",
"    {\n",
"        \"origin\": \"#interjection#, #name#! I'm a #profession#, not a #profession#!\",\n",
"        \"interjection\": [\"alas\", \"congratulations\", \"eureka\", \"fiddlesticks\",\n",
"            \"good grief\", \"hallelujah\", \"oops\", \"rats\", \"thanks\", \"whoa\", \"yes\"],\n",
"        \"name\": [\"Jim\", \"John\", \"Tom\", \"Steve\", \"Kevin\", \"Gary\", \"George\", \"Larry\"],\n",
"        \"profession\": [\n",
"            \"accountant\",\n",
"            \"butcher\",\n",
"            \"economist\",\n",
"            \"forest fire prevention specialist\",\n",
"            \"mathematician\",\n",
"            \"proofreader\",\n",
"            \"singer\",\n",
"            \"teacher assistant\",\n",
"            \"travel agent\",\n",
"            \"welder\"\n",
"        ]\n",
"    }\n",
"    \n",
"An immediately recognizable shortcoming of this grammar is that it doesn't have a large number of alternatives. If we want there to be more professions that, dammit Jim, I'm not, we need to type them into the grammar by hand. The selection of names is also woefully small.\n",
"\n",
"It would be nice if we could *supplement* the grammar by adding rule expansions from existing databases. For example, [Corpora](https://github.com/dariusk/corpora/) has [a list of occupations](https://raw.githubusercontent.com/dariusk/corpora/master/data/humans/occupations.json) and [a list of first names](https://raw.githubusercontent.com/dariusk/corpora/master/data/humans/firstNames.json), which we could incorporate into our grammar. One way to do this would be simply to copy/paste the relevant part of the JSON file into the grammar. But we can also load the data directly *into* the grammar using Python.\n",
"\n",
"The program in the following cell specifies a *partial* Tracery grammar in a Python dictionary assigned to the variable `rules`. The grammar is then augmented with data loaded from JSON files obtained from Corpora. Using the `json` library, we load the Corpora Project JSON files, find the data we need, and then assign it to new rules in the grammar. To get the example to work, we'll need to download [firstNames.json](https://raw.githubusercontent.com/dariusk/corpora/master/data/humans/firstNames.json) and [occupations.json](https://raw.githubusercontent.com/dariusk/corpora/master/data/humans/occupations.json), so let's do that first using `wget`."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"--2017-07-18 23:01:27-- https://raw.githubusercontent.com/dariusk/corpora/master/data/humans/firstNames.json\n",
"Resolving raw.githubusercontent.com... 151.101.164.133\n",
"Connecting to raw.githubusercontent.com|151.101.164.133|:443... connected.\n",
"HTTP request sent, awaiting response... 200 OK\n",
"Length: 5647 (5.5K) [text/plain]\n",
"Saving to: 'firstNames.json'\n",
"\n",
"firstNames.json 100%[===================>] 5.51K --.-KB/s in 0s \n",
"\n",
"2017-07-18 23:01:28 (43.4 MB/s) - 'firstNames.json' saved [5647/5647]\n",
"\n",
"--2017-07-18 23:01:28-- https://raw.githubusercontent.com/dariusk/corpora/master/data/humans/occupations.json\n",
"Resolving raw.githubusercontent.com... 151.101.164.133\n",
"Connecting to raw.githubusercontent.com|151.101.164.133|:443... connected.\n",
"HTTP request sent, awaiting response... 200 OK\n",
"Length: 22680 (22K) [text/plain]\n",
"Saving to: 'occupations.json'\n",
"\n",
"occupations.json 100%[===================>] 22.15K --.-KB/s in 0.03s \n",
"\n",
"2017-07-18 23:01:28 (815 KB/s) - 'occupations.json' saved [22680/22680]\n",
"\n"
]
}
],
"source": [
"!wget https://raw.githubusercontent.com/dariusk/corpora/master/data/humans/firstNames.json\n",
"!wget https://raw.githubusercontent.com/dariusk/corpora/master/data/humans/occupations.json"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The key trick here is that when creating the grammar, we refer to rules that don't yet exist. Later in the code, we add those rules (and their associated expansions, from the Corpora Project JSON files) by assigning values to keys in the `rules` dictionary. We're essentially building the grammar up gradually over the course of the program, instead of writing it all at once."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Whoa, Alexis! I'm a power tool repairer, not a custom sewer!\n"
]
}
],
"source": [
"import tracery\n",
"from tracery.modifiers import base_english\n",
"import json\n",
"\n",
"# the grammar refers to \"name\" and \"profession\" rules. we're not including them in the grammar\n",
"# here, but adding them later on (using corpora project data!)\n",
"rules = {\n",
"    \"origin\": \"#interjection.capitalize#, #name#! I'm #profession.a#, not #profession.a#!\",\n",
"    \"interjection\": [\"alas\", \"congratulations\", \"eureka\", \"fiddlesticks\",\n",
"        \"good grief\", \"hallelujah\", \"oops\", \"rats\", \"thanks\", \"whoa\", \"yes\"],\n",
"}\n",
"\n",
"# load the JSON data from files downloaded from corpora project\n",
"names_data = json.loads(open(\"firstNames.json\").read())\n",
"occupation_data = json.loads(open(\"occupations.json\").read())\n",
"\n",
"# set the values for \"name\" and \"profession\" rules with corpora data\n",
"rules[\"name\"] = names_data[\"firstNames\"]\n",
"rules[\"profession\"] = occupation_data[\"occupations\"]\n",
"\n",
"# generate!\n",
"grammar = tracery.Grammar(rules)\n",
"grammar.add_modifiers(base_english)\n",
"print(grammar.flatten(\"#origin#\"))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> EXERCISE: Write a Tracery grammar that changes its output based on the current time of day. You'll need to use something like [`datetime`](https://docs.python.org/2.7/library/datetime.html#datetime.datetime.now) for this; after you've imported it, the expression `datetime.datetime.now().hour` evaluates to the current hour of the day."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Doing a Tracery \"mail merge\" with CSV data\n",
"\n",
"In [my CSV tutorial](https://gist.github.com/aparrish/f8e7eab47542678a39a39dddbca4ec2f), the final example shows how you might build sentences from data in a CSV file (in particular, a CSV exported from [this spreadsheet](https://docs.google.com/spreadsheets/d/1SmxsgSAcNqYahUXa9-XpecTg-fhy_Ko_-RMD3oT2Ukw/edit?usp=sharing) containing data about the dogs of NYC, originally from [here](https://project.wnyc.org/dogs-of-nyc/)). The method chosen in that example for constructing sentences is suboptimal: there's a lot of just slamming strings together with the `+` operator, which makes it hard to build in variation. It would be nice if we could build a Tracery grammar for generating these sentences instead!\n",
"\n",
"The following example does exactly this. As with the example in the previous section, this example constructs a *partial* Tracery grammar, and then adds rules to the grammar with new information. The difference is that instead of loading in data once at the beginning, we generate a sentence for each of many records. For each row in the CSV file, we create a fresh copy of the rules dictionary, then add rule/expansion pairs with the relevant data from the row. Still inside the `for` loop, we construct a new grammar object from that copy and print the \"flattened\" (i.e. expanded) text."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Arf! I love living in Manhattan and they call me Buddy. My breed is Afghan Hound and my coat is brindle.\n",
"Ruff-ruff! Manhattan is where I call home and they call me Nicole. I'm the cleverest Afghan Hound you'll ever meet and you could say my coat is black.\n",
"Yip! I live in Manhattan and they call me Abby. I'm the friendliest Afghan Hound you'll ever meet and I've got a black coat.\n",
"Howdy! My name is Chloe. I've got a white coat and my breed is Afghan Hound. I live in Manhattan. Arf!\n",
"Ruff-ruff! I'm from Manhattan and they call me Jazzle. My breed is Afghan Hound and my coat is blond.\n",
"Hi! My name is Trouble. My coat is blond and I'm an Afghan Hound. the Bronx is where I call home. Ruff-ruff!\n",
"Hello! Arf! My name is Grace and I'm the most playful Afghan Hound you'll ever meet. I'm from Manhattan and I've got a cream coat.\n",
"Hey! Ruff-ruff! My name is Sisu and I'm the cleverest Afghan Hound you'll ever meet. Manhattan is where I call home and you could say my coat is black.\n",
"Arf! I'm from Queens and they call me Jakie. I'm the friendliest Afghan Hound you'll ever meet and you could say my coat is white.\n",
"Hey! Arf! My name is Geo and I'm an Afghan Hound. I'm from the Bronx and my coat is orange.\n",
"Hello! My name is Ginger. I've got a tan coat and my breed is Afghan Hound. the Bronx is where I call home. Woof!\n",
"Woof! The Bronx is where I call home and they call me Misty. I'm an Afghan Hound and my coat is tan.\n",
"Howdy! My name is Troy. You could say my coat is blond and I'm the friendliest Afghan Hound you'll ever meet. I live in Staten Island. Bow-wow!\n",
"Bow-wow! I'm from Queens and they call me Nick. I'm the cutest Afghan Hound you'll ever meet and you could say my coat is black.\n",
"Hello! Arf! My name is Prince and I'm the most loyal Afghan Hound you'll ever meet. Queens is where I call home and my coat is tan.\n",
"Howdy! Yip! My name is KIKU-NO-HM and I'm the most playful Akita you'll ever meet. I love living in Manhattan and I've got a gray coat.\n",
"Hey! My name is Sasha. My coat is white and I'm the strongest Akita you'll ever meet. I live in Manhattan. Bow-wow!\n",
"Yip! Queens is where I call home and they call me Bernie. I'm the strongest Akita you'll ever meet and my coat is white.\n",
"Howdy! My name is Coffee. You could say my coat is brown and I'm the strongest Akita you'll ever meet. I'm from the Bronx. Yip!\n",
"Hi! Arf! My name is Elsa and I'm an Akita. I'm from Brooklyn and you could say my coat is tan.\n",
"Hey! My name is Jason. I've got a black coat and my breed is Akita. I'm from Queens. Ruff-ruff!\n",
"Ruff-ruff! I love living in Queens and they call me Socrates. I'm an Akita and my coat is white.\n",
"Arf! I live in Staten Island and they call me Suki. My breed is Akita and you could say my coat is rust.\n",
"Hello! Woof! My name is Angel and I'm an Akita. Staten Island is where I call home and my coat is rust.\n",
"Hey! My name is Aspen. My coat is brown and my breed is Akita. I'm from Brooklyn. Woof!\n",
"Hi! Woof! My name is Bear and my breed is Akita. Queens is where I call home and my coat is black.\n",
"Hey! Ruff-ruff! My name is Buster and I'm the strongest Akita you'll ever meet. I love living in Queens and you could say my coat is white.\n",
"Hi! Arf! My name is CHIYO-KYRA and I'm the most playful Akita you'll ever meet. I love living in Queens and you could say my coat is brindle.\n",
"Howdy! My name is Chula. I've got a white coat and my breed is Akita. Queens is where I call home. Ruff-ruff!\n",
"Hi! Bow-wow! My name is Kita and I'm an Akita. Brooklyn is where I call home and you could say my coat is rust.\n",
"Yip! I'm from the Bronx and they call me Nicki. I'm an Akita and you could say my coat is black.\n",
"Howdy! Ruff-ruff! My name is Nikita and I'm an Akita. I love living in Queens and I've got a white coat.\n",
"Arf! Queens is where I call home and they call me Ralph. I'm the strongest Akita you'll ever meet and my coat is tan.\n",
"Hey! Yip! My name is Rambo and I'm an Akita. I love living in Queens and my coat is tan.\n",
"Hey! Woof! My name is Shoko and I'm the strongest Akita you'll ever meet. I'm from Brooklyn and my coat is tan.\n",
"Ruff-ruff! I live in the Bronx and they call me Bear. My breed is Akita and I've got a blond coat.\n",
"Hello! Bow-wow! My name is DARIUS and I'm an Akita. I live in Queens and my coat is white.\n",
"Bow-wow! I live in Queens and they call me Lady. I'm an Akita and you could say my coat is tan.\n",
"Hi! My name is Oreo. You could say my coat is black and my breed is Akita. I live in Staten Island. Woof!\n",
"Hello! Arf! My name is Prince and I'm an Akita. Queens is where I call home and you could say my coat is blond.\n",
"Hello! Woof! My name is Sabastian and my breed is Akita. I live in Manhattan and I've got a white coat.\n",
"Arf! I love living in the Bronx and they call me NICKO. I'm the strongest Akita you'll ever meet and you could say my coat is tan.\n",
"Hey! Bow-wow! My name is AKIRO and I'm the friendliest Akita you'll ever meet. I live in the Bronx and I've got a tan coat.\n",
"Hello! My name is Annie. I've got a white coat and I'm the cutest Akita you'll ever meet. Manhattan is where I call home. Bow-wow!\n",
"Howdy! My name is King. You could say my coat is brown and I'm the most playful Akita you'll ever meet. the Bronx is where I call home. Yip!\n",
"Howdy! Woof! My name is SCOOBYDORA and I'm the friendliest Akita you'll ever meet. I love living in Queens and my coat is brown.\n",
"Woof! I love living in Manhattan and they call me TENSHI. I'm an Akita and my coat is tan.\n",
"Hello! My name is Yuki. You could say my coat is black and I'm the cutest Akita you'll ever meet. the Bronx is where I call home. Ruff-ruff!\n",
"Howdy! My name is Grizz. My coat is rust and I'm an Akita. I'm from Manhattan. Ruff-ruff!\n",
"Hi! Bow-wow! My name is Bear and I'm an Akita. I'm from Manhattan and my coat is white.\n",
"Ruff-ruff! I live in Manhattan and they call me Sukiyaki. I'm an Akita and you could say my coat is brindle.\n",
"Ruff-ruff! I live in Manhattan and they call me Amy. My breed is Akita and I've got a brown coat.\n",
"Ruff-ruff! I'm from the Bronx and they call me Asia. I'm an Akita and my coat is black.\n",
"Howdy! Bow-wow! My name is Laki and my breed is Akita. Queens is where I call home and I've got a black coat.\n",
"Hey! Woof! My name is Lucy and I'm the cleverest Akita you'll ever meet. I'm from Manhattan and my coat is white.\n",
"Woof! Brooklyn is where I call home and they call me Luna. I'm an Akita and you could say my coat is white.\n",
"Hi! My name is Mugs. I've got a tan coat and I'm the friendliest Akita you'll ever meet. I live in Staten Island. Ruff-ruff!\n",
"Howdy! My name is Nicki. You could say my coat is white and my breed is Akita. I love living in Queens. Arf!\n",
"Yip! Queens is where I call home and they call me PUKI. I'm an Akita and I've got a black coat.\n",
"Howdy! Woof! My name is Star and I'm the cleverest Akita you'll ever meet. Queens is where I call home and I've got a white coat.\n",
"Howdy! My name is Sydney. My coat is brown and my breed is Akita. I love living in Manhattan. Woof!\n",
"Woof! I love living in the Bronx and they call me Yoshi. My breed is Akita and I've got a brindle coat.\n",
"Hey! Arf! My name is Babe and my breed is Akita. I'm from Queens and my coat is white.\n",
"Howdy! My name is HEAVENSENT. I've got a brown coat and my breed is Akita. Queens is where I call home. Yip!\n",
"Hey! Bow-wow! My name is Kobi and I'm the strongest Akita you'll ever meet. I live in the Bronx and my coat is brindle.\n",
"Hello! My name is Nikita. My coat is brindle and my breed is Akita. I love living in Manhattan. Ruff-ruff!\n",
"Woof! I'm from Queens and they call me Rocky. I'm an Akita and my coat is white.\n",
"Hi! Arf! My name is Romeo and I'm the friendliest Akita you'll ever meet. I love living in Brooklyn and I've got a brindle coat.\n",
"Yip! I love living in Manhattan and they call me Tara. My breed is Akita and my coat is tan.\n",
"Arf! I live in Queens and they call me Tara. I'm the most playful Akita you'll ever meet and you could say my coat is white.\n",
"Hi! My name is Tasha. My coat is brown and I'm the friendliest Akita you'll ever meet. I love living in Queens. Ruff-ruff!\n",
"Ruff-ruff! I love living in Staten Island and they call me Bella. I'm an Akita and I've got a black coat.\n",
"Yip! I'm from Queens and they call me Yoji. I'm the strongest Akita you'll ever meet and you could say my coat is black.\n",
"Woof! I live in Queens and they call me Tyson. My breed is Akita and I've got a white coat.\n",
"Bow-wow! I live in Staten Island and they call me Cookie. I'm an Akita and you could say my coat is black.\n",
"Howdy! Woof! My name is Jax and I'm the most playful Akita you'll ever meet. I'm from Queens and you could say my coat is black.\n",
"Ruff-ruff! I live in Brooklyn and they call me KELBY. I'm an Akita and I've got a black coat.\n",
"Hello! Yip! My name is Kiko and I'm an Akita. I'm from Staten Island and my coat is brown.\n",
"Hey! Ruff-ruff! My name is n/a and I'm the most loyal Akita you'll ever meet. Manhattan is where I call home and my coat is blond.\n",
"Hi! My name is Snowball. I've got a white coat and my breed is Akita. I live in Brooklyn. Woof!\n",
"Hi! Yip! My name is Bogie and I'm the strongest Akita you'll ever meet. I'm from the Bronx and my coat is black.\n",
"Bow-wow! I'm from Queens and they call me WON. I'm the cleverest Akita you'll ever meet and my coat is white.\n",
"Yip! Brooklyn is where I call home and they call me Ginger. I'm the friendliest Akita you'll ever meet and you could say my coat is white.\n",
"Hello! Arf! My name is Bear and I'm an Akita. I love living in Brooklyn and I've got a brindle coat.\n",
"Yip! I love living in the Bronx and they call me Buddy. I'm the most playful Akita you'll ever meet and you could say my coat is black.\n",
"Hey! My name is Coquito. You could say my coat is tan and I'm the friendliest Akita you'll ever meet. the Bronx is where I call home. Bow-wow!\n",
"Hi! Arf! My name is HENESSEY and I'm the cleverest Akita you'll ever meet. I love living in Brooklyn and my coat is black.\n",
"Bow-wow! I live in Brooklyn and they call me KICHI. I'm the strongest Akita you'll ever meet and you could say my coat is brown.\n",
"Woof! I love living in Queens and they call me Yogi. I'm the most playful Akita you'll ever meet and my coat is white.\n",
"Hello! Yip! My name is Brandy and I'm the friendliest Akita you'll ever meet. I love living in Queens and I've got a tan coat.\n",
"Howdy! Arf! My name is Chester and I'm the most loyal Akita you'll ever meet. Queens is where I call home and my coat is black.\n",
"Hey! My name is Jazzy. My coat is blond and I'm the friendliest Akita you'll ever meet. I'm from Queens. Yip!\n",
"Woof! Queens is where I call home and they call me Seven. I'm the cutest Akita you'll ever meet and I've got a blond coat.\n",
"Bow-wow! I love living in Brooklyn and they call me KIRA-A. I'm the strongest Akita you'll ever meet and my coat is rust.\n",
"Bow-wow! I love living in Queens and they call me Layla. My breed is Akita and you could say my coat is black.\n",
"Hey! Bow-wow! My name is Charlie and my breed is Akita. I live in Brooklyn and I've got a brindle coat.\n",
"Woof! Brooklyn is where I call home and they call me Reggie. I'm the most loyal Akita you'll ever meet and my coat is rust.\n",
"Hey! My name is Rocco. You could say my coat is silver and I'm an Akita. I love living in Staten Island. Yip!\n",
"Hello! My name is Saki. My coat is white and my breed is Akita. I love living in the Bronx. Arf!\n",
"Hey! Arf! My name is Stella and I'm the strongest Akita you'll ever meet. I'm from Brooklyn and my coat is white.\n"
]
}
],
"source": [
"import tracery\n",
"from tracery.modifiers import base_english\n",
"import json\n",
"import csv\n",
"\n",
"# create the \"template\" grammar, which will be copied and augmented for each record of the CSV\n",
"rules = {\n",
"    \"origin\": [\n",
"        \"#greeting.capitalize#! My name is #name#. #coatdesc.capitalize# and #breeddesc#. #homedesc.capitalize#. #woof.capitalize#!\",\n",
"        \"#greeting.capitalize#! #woof.capitalize#! My name is #name# and #breeddesc#. #homedesc.capitalize# and #coatdesc#.\",\n",
"        \"#woof.capitalize#! #homedesc.capitalize# and they call me #name#. #breeddesc.capitalize# and #coatdesc#.\"\n",
"    ],\n",
"    \"greeting\": [\"hi\", \"howdy\", \"hello\", \"hey\"],\n",
"    \"woof\": [\"woof\", \"arf\", \"bow-wow\", \"yip\", \"ruff-ruff\"],\n",
"    \"coatdesc\": [\n",
"        \"my coat is #color#\",\n",
"        \"I've got #color.a# coat\",\n",
"        \"you could say my coat is #color#\"\n",
"    ],\n",
"    \"breeddesc\": [\n",
"        \"I'm #breed.a#\",\n",
"        \"my breed is #breed#\",\n",
"        \"I'm the #superlative# #breed# you'll ever meet\"\n",
"    ],\n",
"    \"superlative\": [\"cutest\", \"strongest\", \"most playful\", \"friendliest\", \"cleverest\", \"most loyal\"],\n",
"    \"homedesc\": [\n",
"        \"I'm from #borough#\",\n",
"        \"I live in #borough#\",\n",
"        \"I love living in #borough#\",\n",
"        \"#borough# is where I call home\"\n",
"    ]\n",
"}\n",
"\n",
"# iterate over the first 100 rows in the CSV file\n",
"for row in list(csv.DictReader(open(\"dogs-of-nyc.csv\")))[:100]:\n",
"    # copy rules so we're not continuously overwriting values\n",
"    rules_copy = dict(rules) # make a copy of the rules\n",
"    \n",
"    # now assign new rule/expansion pairs with the data from the current row\n",
"    rules_copy[\"name\"] = row[\"dog_name\"]\n",
"    rules_copy[\"color\"] = row[\"dominant_color\"].lower()\n",
"    rules_copy[\"breed\"] = row[\"breed\"]\n",
"    # little bit of fluency clean-up...\n",
"    if row[\"borough\"] == \"Bronx\":\n",
"        rules_copy[\"borough\"] = \"the \" + row[\"borough\"]\n",
"    else:\n",
"        rules_copy[\"borough\"] = row[\"borough\"]\n",
"\n",
"    # now generate!\n",
"    grammar = tracery.Grammar(rules_copy)\n",
"    grammar.add_modifiers(base_english)\n",
"    print(grammar.flatten(\"#origin#\"))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As you can see, this technique allows us to combine the expressive strengths of Tracery-based text generation with Python's CSV parser to generate simple \"stories\" from spreadsheet data. Pretty neat!"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.1"
}
},
"nbformat": 4,
"nbformat_minor": 1
}
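The per-row copying step in the CSV "mail merge" section can also be tried without Tracery installed or the dogs-of-nyc file downloaded. The sketch below is a standard-library-only illustration under several assumptions: the sample rows, the cut-down `template` dictionary, and the helper name `rules_for_row` are all invented here, and only the column names come from the notebook. It also uses `itertools.islice` to stop after a fixed number of rows, as one alternative to building a full list and slicing it.

```python
import csv
import io
from itertools import islice

# Hypothetical sample rows standing in for dogs-of-nyc.csv (same column names).
sample_csv = io.StringIO(
    "dog_name,dominant_color,breed,borough\n"
    "Buddy,BRINDLE,Afghan Hound,Manhattan\n"
    "Trouble,BLOND,Afghan Hound,Bronx\n"
    "Sasha,WHITE,Akita,Manhattan\n"
)

# A cut-down template grammar; per-row rules are added to a copy of it.
template = {
    "origin": "My name is #name# and I'm #breed.a#. I live in #borough#.",
}

def rules_for_row(row):
    """Return a copy of the template with rules filled in from one CSV row."""
    rules = dict(template)  # shallow copy: adding keys leaves the template unchanged
    rules["name"] = row["dog_name"]
    rules["color"] = row["dominant_color"].lower()
    rules["breed"] = row["breed"]
    rules["borough"] = "the Bronx" if row["borough"] == "Bronx" else row["borough"]
    return rules

# islice() stops after the first two rows without reading the rest of the file.
per_row_rules = [rules_for_row(row) for row in islice(csv.DictReader(sample_csv), 2)]
for rules in per_row_rules:
    print(rules["name"], rules["color"], rules["borough"])
```

Each dictionary in `per_row_rules` could then be passed to `tracery.Grammar()` in the same way as `rules_copy` in the notebook. Because `dict(template)` is a shallow copy, this pattern is safe as long as the per-row code only assigns new keys and does not modify the template's existing lists in place.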