
@johnlaudun
johnlaudun / lcm.txt
Created February 6, 2024 15:46
"The Layton Court Mystery" by Anthony Berkeley (1925)
The Layton Court Mystery
by Anthony Berkeley
Contents
I. Eight o’Clock in the Morning
II. An Interrupted Breakfast
@johnlaudun
johnlaudun / sentiments.py
Last active March 11, 2019 22:22
Python script to compare Sentiment Analyses available in Python
#! /usr/bin/env python
'''
sentiments.py compares the outputs of the sentiment-analysis modules listed below.
Functionality to be added: normalization and smoothing.
(I haven't implemented the NLTK solution because I don't have classified texts.)
'''
# Imports
import matplotlib.pyplot as plt
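The normalization flagged above as future functionality could be sketched as simple min-max scaling, so that scores from different modules land on a shared 0-to-1 range. This is my own sketch, not part of the original script; the function name is invented for illustration:

```python
def normalize(scores):
    """Min-max scale a list of sentiment scores onto [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # A flat series carries no shape to compare; map everything to 0.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]
```

With the outputs of each module rescaled this way, their curves can be plotted on the same axes for comparison.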
\documentclass[12pt, letter]{article}
\usepackage[margin=1in]{geometry}
\setlength{\parindent}{0em}
\setlength{\parskip}{0.5em}
\newif\ifdraft
\drafttrue % or \draftfalse
\begin{document}
What I've been working on for the past few days is in preparation for attempting a topic model using the more established LDA instead of the NMF, to see how well they compare -- with the understanding that, since there is rarely a one-to-one matchup within either method, there will be no such match across them.
Because LDA does not filter out common words on its own, the way the NMF method does, you have to start with a stoplist. I know we can begin with Blei's and a few other established lists, but I would also like to be able to compare those against our own results. My first thought was to build a dictionary of words and their frequency within the corpus. For convenience's sake, I am using NLTK.
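The comparison between a corpus-derived stoplist and an established one could be sketched like this. The toy corpus, the frequency threshold, and the tiny "established" set are all invented stand-ins (a real run would use the talks and a list like Blei's); I use `collections.Counter` here rather than NLTK so the sketch is self-contained:

```python
from collections import Counter

# Toy corpus standing in for the talks; real code would load them from the CSV.
docs = [
    "the model learns topics from the corpus",
    "the corpus contains many common words",
    "topic models need a stoplist of common words",
]

# Count word frequencies across the whole corpus.
counts = Counter(w for d in docs for w in d.split())

# Words above a frequency threshold become stoplist candidates.
candidates = {w for w, c in counts.items() if c >= 2}

# Compare against an established list (a tiny stand-in for Blei's).
established = {"the", "a", "of", "and"}
overlap = candidates & established
```

The interesting words are the ones in `candidates - established`: corpus-specific high-frequency terms that a generic stoplist would miss.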
Just as a record of what I've done, here's the usual code for loading the talks from the CSV with everything in it:
```python
import pandas
import re

# The CSV's path is not given here; "talks.csv" stands in as a placeholder.
talks = pandas.read_csv("talks.csv")
```
Here's the text as I wrote it in Markdown and as it sits in the WP editing pane:
We still need to identify which talks have floats for values and determine what impact, if any, it has on the project.
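One way to spot those float-valued talks could be sketched with plain Python; the sample values below are invented for illustration (the real ones would come from the CSV):

```python
# Invented sample of a values column; the real data comes from the talks CSV.
values = [100, 99.5, 42, 3.25, 7]

# Indices of entries that are floats with a genuine fractional part.
float_rows = [i for i, v in enumerate(values)
              if isinstance(v, float) and not v.is_integer()]
```

The resulting indices identify which talks would need inspection before deciding whether the floats matter for the project.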
```python
import nltk

tt_tokens = nltk.word_tokenize(all_words)  # all_words: the talks joined into one string
tt_freq = {}
for word in tt_tokens:
    tt_freq[word] = tt_freq.get(word, 0) + 1
```

In a recent post on ProfHacker I described a classroom technique I have used when teaching fiction. The roll-your-own dramatic interpretation, mixed in with some reality-television competition, is one way I have found to plunge students into immersive encounters with texts and with each other. The exercise hacks conventional classroom dynamics, but perhaps ProfHacker readers yearn for something, well, hackier. And what if I told you that the hacks I have pursued reverse engineer text mining, so that it becomes a lived process in the classroom?

As perhaps a lot of people who have tried to get undergraduates to read texts as

@johnlaudun
johnlaudun / lda.py
Created July 16, 2016 16:22 — forked from aronwc/lda.py
Example using GenSim's LDA and sklearn
""" Example using GenSim's LDA and sklearn. """
import numpy as np
from gensim import matutils
from gensim.models.ldamodel import LdaModel
from sklearn import linear_model
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import CountVectorizer
0.6,0.25,0.5,0.5,-0.85,-3.05,0.25,0,0,1,-1.3,0,0.1,0.6,-0.25,0.8,0.25,2,0,0,0.6,0.75,1.5,1,-1.2,-2.55,2.5,1.5,0.7,3.55,-1.2,1.5,1.45,0,-0.6,1.1,-0.4,0.25,0.1,1.2,0.5,-3.55,-0.6,0,-0.5,-0.35,0,0.5,-2.45,2.35,-1.5,0.75,0,0.5,0.8,-1.35,0,1.05,0,-0.75,1.85,0.25,1.25,0,-3.5,0,0,-0.2,-1,1.7,0.65,-2,0,-1.5,0.75,0.5,-3.45,0,0.5,2.4,0,-0.75,-1.5,0,-3,1,0,0.5,-0.75,-1.05,-0.75,0,-2,0,0.5,0.75,0.5,0.5,0.2,-0.5,0.25,0,-0.75,0.5,1.25,1.3,0,0.8,0,0,0,1.2,0,0.5,0,2.15,-0.75,0.5,0.1,-0.35,0,1.5,0,-0.25,-0.25,-2.1,-1.25,0.25,0,0,-0.5,-0.75,0.9,0,0,0,0.8,0.8,1.25,0.75,0.8,0.5,0,0.5,0,0.1,-0.5,-0.25,0.55,0.25,0.85,1.6,-2.3,-2.05,-0.5,1.05,0,-0.65,-2.35,-1.25,-0.6,-1.75,0,0.75,1.5,0.55,0,-1.25,-0.5,0,0,-0.7,-1.35,-0.15,0.45,0,0,0.85,2.6,0,0,-0.75,-0.25,0,2.25,0,-0.5,0.5,0,1.5,0,1.1,0,0.8,-1,-0.5,1.55,-0.25,0.25,0,0,-0.2,0,-0.5,0.5,0,-0.75,0.25,-1.25,0,-0.25,0,0,0.4,1.3,0.05,-0.5,0,0.6,0.75,-0.25,0,2.3,1.55,-0.25,0.25,0,0,1.65,0,0.5,0.75,0,-1,-0.5,-0.25,0,-0.75,0,0.6,1.5,0.25,0,0,0.8,1.25,-0.15,0,0,0,0,0.75,-0.5,0,0,1.25,1.35,2.1
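The long comma-separated line above reads like a per-sentence sentiment series, which connects to the smoothing mentioned earlier as work to be done. A minimal sketch of that smoothing, using a simple moving average (the window size is my own choice, not from the source):

```python
def moving_average(values, window=3):
    """Smooth a sentiment series with a centered moving average."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        # Clamp the window at the edges so the output has the same length.
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        chunk = values[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

# The first few values from the series above.
series = [0.6, 0.25, 0.5, 0.5, -0.85, -3.05]
smoothed = moving_average(series)
```

Widening the window trades local detail for a clearer overall emotional arc.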
# Syuzhet of a Novel
```R
library(syuzhet)
library(readr)

# Load the novel and split it into sentences.
pog_v <- get_sentences(read_file("../texts/banks/Player_of_Games.txt"))
```