### Keybase proof
I hereby claim:
* I am stevecreedon on github.
* I am stevecreedon_cat (https://keybase.io/stevecreedon_cat) on keybase.
* I have a public key ASAR9J7Ys7MmQO5ylyehskr2kafV6ff_UHKTIvsFieDGdQo
To claim this, I am signing this object:

Keybase proof

I hereby claim:

  • I am stevecreedon on github.
  • I am stevecreedon (https://keybase.io/stevecreedon) on keybase.
  • I have a public key ASCFoLDFFgWiluQMVvgldwEJLh53F9n3ocCpKcPbLBCfywo

To claim this, I am signing this object:

@stevecreedon
stevecreedon / gist:de563ade880488b213b989543b9cc931
Last active September 7, 2016 17:31
Attempt to understand LDA Topic Analysis
At resolver.co.uk we need some form of topic discovery for our large corpus of email conversations. I'm not trying to understand LDA in its full scientific or mathematical depth, just how it works, so that we can get the best out of it.
This is my best shot:
Say we have 1000 documents.
Let's make some huge assumptions:
1. Distributed across these documents we have 10 topics. We don't know what the topics are, and we don't know which documents contain which of these mystery topics.