johnlaudun / lcm.txt
Created February 6, 2024 15:46
"The Layton Court Mystery" by Anthony Berkeley (1925)
The Layton Court Mystery
by Anthony Berkeley
Contents
I. Eight o’Clock in the Morning
II. An Interrupted Breakfast
```latex
\documentclass[12pt, letterpaper]{article}
\usepackage[margin=1in]{geometry}
\setlength{\parindent}{0em}
\setlength{\parskip}{0.5em}
\newif\ifdraft
\drafttrue % or \draftfalse
\begin{document}
```
What I've been working on for the past few days is preparation for attempting a topic model using the more established LDA instead of NMF, to see how well the two compare -- with the understanding that, since there is rarely a one-to-one matchup of topics within either method, there will be no such match across them.
Because LDA does not filter out common words on its own the way the NMF method does, you have to start with a stoplist. I know we can begin with Blei's and a few other established lists, but I would also like to be able to compare those against our own results. My first thought was to build a dictionary of words and their frequencies within the corpus. For convenience's sake, I am using the NLTK.
Just as a record of what I've done, here's the usual code for loading the talks from the CSV with everything in it:
```python
import pandas
import re

# Filename and column name are placeholders for the project's actual CSV.
df = pandas.read_csv("talks.csv")
all_words = " ".join(str(talk) for talk in df["text"])
```
Here's the text as I wrote it in Markdown and as it sits in the WP editing pane:
We still need to identify which talks have floats for values and determine what impact, if any, it has on the project.
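As a first pass at that check, something like the following would flag the offending rows (a sketch that assumes the `df` and `text` placeholders from the loading code above):

```python
# Rows whose 'text' field came through as a float -- typically NaN from an
# empty CSV cell -- rather than as a string; column name is a placeholder.
float_talks = df[df["text"].apply(lambda value: isinstance(value, float))]
print(len(float_talks), "talks have float values")
```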
```python
import nltk  # word_tokenize needs the 'punkt' models: nltk.download('punkt')

tt_tokens = nltk.word_tokenize(all_words)
tt_freq = {}
for word in tt_tokens:
    tt_freq[word] = tt_freq.get(word, 0) + 1
```
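From there, the most frequent words become the candidates to set beside Blei's and the other established stoplists; a minimal sketch (the cutoff of 100 is arbitrary):

```python
# Sort words by frequency and keep the most common ones for inspection
# as stoplist candidates; 100 is an arbitrary cutoff.
by_freq = sorted(tt_freq.items(), key=lambda item: item[1], reverse=True)
stoplist_candidates = [word for word, count in by_freq[:100]]
print(stoplist_candidates[:20])
```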
johnlaudun / sentiments.py
Last active March 11, 2019 22:22
Python script to compare Sentiment Analyses available in Python
```python
#! /usr/bin/env python
'''
sentiments.py compares the outputs of the sentiment modules listed below.
Functionality to be added: normalization and smoothing.
(I haven't implemented the NLTK solution because I don't have classified texts.)
'''
# Imports
import matplotlib.pyplot as plt
```
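The preview cuts off after the first import. As a sketch of the comparison the docstring describes, here are two sentiment modules commonly set side by side this way, TextBlob and VADER; their selection is my assumption, since the gist's actual module list isn't shown:

```python
import matplotlib.pyplot as plt
from textblob import TextBlob
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

def compare_sentiments(sentences):
    """Score each sentence with both analyzers for side-by-side plotting."""
    analyzer = SentimentIntensityAnalyzer()
    textblob_scores = [TextBlob(s).sentiment.polarity for s in sentences]
    vader_scores = [analyzer.polarity_scores(s)["compound"] for s in sentences]
    return textblob_scores, vader_scores

tb, vd = compare_sentiments(["I love this.", "This is terrible.", "It is fine."])
plt.plot(tb, label="TextBlob")
plt.plot(vd, label="VADER")
plt.legend()
plt.show()
```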
johnlaudun / lda.py
Created July 16, 2016 16:22 — forked from aronwc/lda.py
Example using GenSim's LDA and sklearn
""" Example using GenSim's LDA and sklearn. """
import numpy as np
from gensim import matutils
from gensim.models.ldamodel import LdaModel
from sklearn import linear_model
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import CountVectorizer
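The preview stops at the imports, but the pattern these imports set up is to vectorize with scikit-learn and hand the sparse matrix to gensim through matutils.Sparse2Corpus. A minimal sketch continuing from the imports above (topic and feature counts are arbitrary, and get_feature_names_out assumes a recent scikit-learn):

```python
# Vectorize a small slice of 20 Newsgroups with sklearn.
news = fetch_20newsgroups(subset="train", remove=("headers", "footers", "quotes"))
vectorizer = CountVectorizer(stop_words="english", max_features=5000)
X = vectorizer.fit_transform(news.data[:500])

# CountVectorizer puts documents in rows, but gensim expects them in
# columns, so documents_columns=False flips the orientation.
corpus = matutils.Sparse2Corpus(X, documents_columns=False)
id2word = dict(enumerate(vectorizer.get_feature_names_out()))

lda = LdaModel(corpus, num_topics=10, id2word=id2word, passes=5)
for topic_id, words in lda.show_topics(num_topics=5, formatted=True):
    print(topic_id, words)
```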
```
0.6,0.25,0.5,0.5,-0.85,-3.05,0.25,0,0,1,-1.3,0,0.1,0.6,-0.25,0.8,0.25,2,0,0,0.6,0.75,1.5,1,-1.2,-2.55,2.5,1.5,0.7,3.55,-1.2,1.5,1.45,0,-0.6,1.1,-0.4,0.25,0.1,1.2,0.5,-3.55,-0.6,0,-0.5,-0.35,0,0.5,-2.45,2.35,-1.5,0.75,0,0.5,0.8,-1.35,0,1.05,0,-0.75,1.85,0.25,1.25,0,-3.5,0,0,-0.2,-1,1.7,0.65,-2,0,-1.5,0.75,0.5,-3.45,0,0.5,2.4,0,-0.75,-1.5,0,-3,1,0,0.5,-0.75,-1.05,-0.75,0,-2,0,0.5,0.75,0.5,0.5,0.2,-0.5,0.25,0,-0.75,0.5,1.25,1.3,0,0.8,0,0,0,1.2,0,0.5,0,2.15,-0.75,0.5,0.1,-0.35,0,1.5,0,-0.25,-0.25,-2.1,-1.25,0.25,0,0,-0.5,-0.75,0.9,0,0,0,0.8,0.8,1.25,0.75,0.8,0.5,0,0.5,0,0.1,-0.5,-0.25,0.55,0.25,0.85,1.6,-2.3,-2.05,-0.5,1.05,0,-0.65,-2.35,-1.25,-0.6,-1.75,0,0.75,1.5,0.55,0,-1.25,-0.5,0,0,-0.7,-1.35,-0.15,0.45,0,0,0.85,2.6,0,0,-0.75,-0.25,0,2.25,0,-0.5,0.5,0,1.5,0,1.1,0,0.8,-1,-0.5,1.55,-0.25,0.25,0,0,-0.2,0,-0.5,0.5,0,-0.75,0.25,-1.25,0,-0.25,0,0,0.4,1.3,0.05,-0.5,0,0.6,0.75,-0.25,0,2.3,1.55,-0.25,0.25,0,0,1.65,0,0.5,0.75,0,-1,-0.5,-0.25,0,-0.75,0,0.6,1.5,0.25,0,0,0.8,1.25,-0.15,0,0,0,0,0.75,-0.5,0,0,1.25,1.35,2.1
```
# Syuzhet of a Novel
```R
library(syuzhet)
library(readr)
# Load file
pog_v <- get_sentences(read_file("../texts/banks/Player_of_Games.txt"))
# Syuzhet Outputs
# (completion sketch: score each sentence, then plot the raw trajectory)
pog_sentiment <- get_sentiment(pog_v, method = "syuzhet")
plot(pog_sentiment, type = "l", xlab = "Sentence", ylab = "Sentiment")
```
## Sentiment by Sentence for 4 Small Texts
```R
# Load libraries
library(syuzhet)
library(readr)
```