Skip to content

Instantly share code, notes, and snippets.

@hofmannsven
hofmannsven / README.md
Last active May 3, 2024 15:30
Git CLI Cheatsheet
@visenger
visenger / install_scala_sbt.sh
Last active January 31, 2023 19:10
Scala and sbt installation on ubuntu 12.04
#!/bin/sh
# one way (older scala version will be installed)
# sudo apt-get install scala
#2nd way
sudo apt-get remove scala-library scala
wget http://www.scala-lang.org/files/archive/scala-2.11.4.deb
sudo dpkg -i scala-2.11.4.deb
sudo apt-get update
@mattb
mattb / gist:3888345
Created October 14, 2012 11:53
Some pointers for Natural Language Processing / Machine Learning

Here are the areas I've been researching, some things I've read and some open source packages...

Nearly all text processing starts by transforming text into vectors: http://en.wikipedia.org/wiki/Vector_space_model

Often it uses transforms such as TFIDF to normalise the data and control for outliers (words that are too frequent or too rare confuse the algorithms): http://en.wikipedia.org/wiki/Tf%E2%80%93idf

Collocations is a technique to detect when two or more words occur more commonly together than separately (e.g. "wishy-washy" in English) - I use this to group words into n-gram tokens because many NLP techniques consider each word as if it's independent of all the others in a document, ignoring order: http://matpalm.com/blog/2011/10/22/collocations_1/

@Swind
Swind / unzip.scala
Last active April 22, 2018 08:09
[Unzip in Scala] #Scala #zip
import java.util.zip.ZipFile
import java.io.FileInputStream
import java.io.FileOutputStream
import scala.collection.JavaConversions._
import java.util.zip.ZipEntry
import java.io.InputStream
import java.io.OutputStream
import java.io.File
class ZipArchive {