Computational tools and statistical analysis are often deployed as a method to “read” texts. But what about using these same techniques to write them? In this workshop, we’ll investigate the state of the art of natural language processing with an eye toward using the sometimes-unintuitive abstractions of language produced by computational models to make programs that create surprising and poetic creative writing. Topics include: a whirlwind tour of spaCy for parsing English into syntactic constituents; a discussion of techniques for classifying and summarizing documents; and an explanation and demonstration of “word vectors” (like Google’s word2vec), an innovative language technology that allows computers to process written language less as discrete units and more like a continuous signal. Workshop participants will develop a number of small projects in text analysis and poetics using a public domain text of their choice. In becoming familiar with contemporary techniques for computational language analysis, critics and researchers will be able to reason better about language-based media on the Internet. Artists and writers, meanwhile, might just learn a few new techniques to add to their creative palette.
To install spaCy with Anaconda, open a Terminal window (or the equivalent on your operating system) and type:
conda install -c conda-forge spacy==2.0.11
This line installs the library. You'll also need to download a language model. For that, type:
python -m spacy download en_core_web_md
(Replace en with the language code for your desired language, if there's a model available for it.) The language model contains the statistical information necessary to parse text into sentences and sentences into parts of speech. Note that this download is several hundred megabytes, so it might take a while!
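Once both commands have finished, a quick check like the following can confirm that everything is in place. This is only a minimal sketch: the helper names `is_installed`, `check_setup`, and `tag_sentences` are made up for illustration, and the check assumes the model was installed as a regular Python package (which is how `spacy download` normally installs it). If both pieces are found, the script loads the model and prints each sentence of a short sample text with its part-of-speech tags.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package with this name can be imported."""
    return importlib.util.find_spec(package_name) is not None


def check_setup(model_name="en_core_web_md"):
    """Report whether spaCy and the named language model are available."""
    return {
        "spacy": is_installed("spacy"),
        model_name: is_installed(model_name),
    }


def tag_sentences(text, model_name="en_core_web_md"):
    """Split text into sentences and pair every word with its part of speech."""
    import spacy  # imported here so the checks above work without spaCy

    nlp = spacy.load(model_name)
    doc = nlp(text)
    return [[(token.text, token.pos_) for token in sent] for sent in doc.sents]


status = check_setup()
print(status)

# Only try to parse if both the library and the model were found.
if all(status.values()):
    sample = "This is a sentence. Here is another one."
    for sentence in tag_sentences(sample):
        print(sentence)
```

If either entry in the printed dictionary is `False`, re-run the corresponding install command above before continuing.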
If time:
- Reading and Writing Electronic Text, the computational poetry class I teach at NYU