This gist demonstrates an idiomatic way to implement various kinds of sorting through functional programming.

import scala.math.Ordering
import scala.util.control.TailCalls._

trait Sort {
  def sort[O: Ordering](xs: List[O]): List[O]
  def safeSort[O: Ordering](xs: List[O]): List[O]
}
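The preview above only shows the interface. As a minimal sketch of how the two methods might differ, the block below implements `sort` as a plain recursive merge sort and `safeSort` as the same algorithm trampolined through `scala.util.control.TailCalls`. The object name `MergeSort` and the helpers `merge`, `mergeT` and `sortT` are hypothetical, and the `Sort` trait is restated only so the block compiles on its own; the gist's actual implementation may differ.

```scala
import scala.math.Ordering
import scala.util.control.TailCalls._

// Restated from the gist preview so this sketch is self-contained.
trait Sort {
  def sort[O: Ordering](xs: List[O]): List[O]
  def safeSort[O: Ordering](xs: List[O]): List[O]
}

// Hypothetical implementation: merge sort, with and without trampolining.
object MergeSort extends Sort {
  // Plain recursive merge: simple, but recursion depth grows with the list length.
  private def merge[O](l: List[O], r: List[O])(implicit ord: Ordering[O]): List[O] =
    (l, r) match {
      case (Nil, _) => r
      case (_, Nil) => l
      case (a :: as, b :: bs) =>
        if (ord.lteq(a, b)) a :: merge(as, r) else b :: merge(l, bs)
    }

  def sort[O: Ordering](xs: List[O]): List[O] =
    if (xs.lengthCompare(1) <= 0) xs
    else {
      val (l, r) = xs.splitAt(xs.length / 2)
      merge(sort(l), sort(r))
    }

  // Trampolined merge: each recursive step is suspended with `tailcall`
  // and the pending cons is recorded with `map` instead of a stack frame.
  private def mergeT[O](l: List[O], r: List[O])(implicit ord: Ordering[O]): TailRec[List[O]] =
    (l, r) match {
      case (Nil, _) => done(r)
      case (_, Nil) => done(l)
      case (a :: as, b :: bs) =>
        if (ord.lteq(a, b)) tailcall(mergeT(as, r)).map(a :: _)
        else tailcall(mergeT(l, bs)).map(b :: _)
    }

  private def sortT[O: Ordering](xs: List[O]): TailRec[List[O]] =
    if (xs.lengthCompare(1) <= 0) done(xs)
    else {
      val (l, r) = xs.splitAt(xs.length / 2)
      for {
        sl <- tailcall(sortT(l))
        sr <- tailcall(sortT(r))
        m  <- mergeT(sl, sr)
      } yield m
    }

  // Running the trampoline with `result` evaluates the suspended steps in a loop.
  def safeSort[O: Ordering](xs: List[O]): List[O] = sortT(xs).result
}
```

Both methods should return the same result, for example `MergeSort.sort(List(3, 1, 2))` and `MergeSort.safeSort(List(3, 1, 2))`. The trampolined variant trades some allocation overhead for a call stack that is intended to stay shallow on long inputs.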
@shengc
shengc / poly tensor.md
Last active June 12, 2018 04:11
A polymorphic way of converting generic Scala collections to TensorFlow tensors

Recently, I worked on a project that uses neural networks to learn and recognize certain patterns in text. The project is an interesting problem in its own right, which I ended up solving with a conditional random field aided by a recurrent neural network. Once the model had been trained, I needed to write a Java library to make the model accessible from a JVM process.

The deep learning library I have been using is TensorFlow. I know PyTorch has been picking up steam steadily in recent years, but to me TensorFlow is the best choice for building industrial-strength applications, given the diverse set of client languages one can choose from. Luckily, Java is one of those client languages.

TensorFlow's SavedModel format lets one export a model to disk so that any client language TensorFlow supports can read it back; in Java, the loaded model is represented by the class SavedModelBundle. Once the model is imported into the process, using it is as simple as: 1. feeding the input tensors with the actual input values

Viterbi decoding is one of the most important techniques for sequence tagging. In natural language processing, Viterbi decoding usually finds the maximum-likelihood tagging of a sequence for tasks such as part-of-speech (POS) tagging and named entity recognition (NER). I will describe how it works by predicting POS tags for simple sentences, based on maximum likelihood under a first-order (bigram) hidden Markov model (HMM).

First of all, let's create some sentences in which each element consists of a word and its corresponding POS tag, separated by '/':

sentences = [
  "the/DT can/NN is/VB in/IN the/DT shed/NN",
  "the/DT dog/NN can/VB see/VB the/DT cat/NN"
]
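A bigram HMM is estimated from two sets of counts: how often one tag follows another (transitions) and how often a tag is attached to a word (emissions). The block below is a minimal Scala sketch of that counting step over the same two sentences, not the gist's own code. The object name `HmmCounts`, the helper `parse` and the start-of-sentence marker `<s>` are assumptions introduced for illustration.

```scala
object HmmCounts {
  // The same two training sentences as above, each token written as word/TAG.
  val sentences: List[String] = List(
    "the/DT can/NN is/VB in/IN the/DT shed/NN",
    "the/DT dog/NN can/VB see/VB the/DT cat/NN"
  )

  // Hypothetical start-of-sentence marker; it is not part of the original data.
  val Start = "<s>"

  // Split each "word/TAG" token into a (word, tag) pair.
  def parse(sentence: String): List[(String, String)] =
    sentence.split(" ").toList.map { token =>
      val i = token.lastIndexOf('/')
      (token.substring(0, i), token.substring(i + 1))
    }

  val tagged: List[List[(String, String)]] = sentences.map(parse)

  // transitions((prev, next)) = number of times tag `next` directly follows tag `prev`.
  val transitions: Map[(String, String), Int] =
    tagged
      .flatMap { s =>
        val tags = Start :: s.map(_._2)
        tags.zip(tags.tail)
      }
      .groupBy(identity)
      .map { case (k, v) => k -> v.size }

  // emissions((tag, word)) = number of times `tag` is attached to `word`.
  val emissions: Map[(String, String), Int] =
    tagged.flatten
      .map { case (w, t) => (t, w) }
      .groupBy(identity)
      .map { case (k, v) => k -> v.size }
}
```

Dividing each count by the total for its conditioning tag gives the maximum likelihood estimates of the transition and emission probabilities that Viterbi decoding combines. Note that "can" appears once as NN and once as VB, which is exactly the kind of ambiguity the decoder has to resolve from context.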
import java.util.Date

import org.apache.lucene.document._
import shapeless._
import shapeless.ops.hlist._
import shapeless.ops.record._

// extending scala.annotation.StaticAnnotation is not necessary, but it serves as a reminder that this is an annotation
final class indexable() extends scala.annotation.StaticAnnotation 
// addCompilerPlugin("org.spire-math" %% "kind-projector" % "0.9.3")

import scala.language.higherKinds

trait Leibniz[A, B] {
  def subst[F[_]](fa: F[A]): F[B]
}

object Leibniz {

Arithmetic solver for http://blog.plover.com/math/17-puzzle.html

sealed trait Op extends Product with Serializable
case object Add extends Op { override def toString() = "+" }
case object Min extends Op { override def toString() = "-" }
case object Mul extends Op { override def toString() = "*" }
case object Div extends Op { override def toString() = "/" }
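The preview stops after the operator definitions. As a rough sketch of how a solver for this kind of puzzle could work, the block below does a brute-force search: it repeatedly replaces two terms with one combined term until a single term remains, using exact rational arithmetic so that intermediate fractions are not lost. Everything here (`PuzzleSketch`, `Rat`, `combine`, `search`, `solve`) is hypothetical, and for brevity the sketch builds expression strings directly instead of using the `Op` type above.

```scala
object PuzzleSketch {
  // Exact rational number; always constructed in lowest terms with a positive denominator.
  final case class Rat(n: BigInt, d: BigInt) {
    def +(o: Rat): Rat = Rat.make(n * o.d + o.n * d, d * o.d)
    def -(o: Rat): Rat = Rat.make(n * o.d - o.n * d, d * o.d)
    def *(o: Rat): Rat = Rat.make(n * o.n, d * o.d)
    // Division by zero yields None rather than throwing.
    def /(o: Rat): Option[Rat] =
      if (o.n == 0) None else Some(Rat.make(n * o.d, d * o.n))
  }

  object Rat {
    // Normalise the sign and reduce by the gcd so equal values compare equal (requires d != 0).
    def make(n: BigInt, d: BigInt): Rat = {
      val g = n.gcd(d)
      val s = BigInt(if (d < 0) -1 else 1)
      Rat(s * n / g, s * d / g)
    }
    def of(i: Int): Rat = Rat(BigInt(i), BigInt(1))
  }

  // A value together with the expression that produced it.
  type Term = (Rat, String)

  // All ways to combine two terms with the four operations (both orders for - and /).
  def combine(a: Term, b: Term): List[Term] = {
    val ((x, sx), (y, sy)) = (a, b)
    List(
      Some((x + y, s"($sx + $sy)")),
      Some((x * y, s"($sx * $sy)")),
      Some((x - y, s"($sx - $sy)")),
      Some((y - x, s"($sy - $sx)")),
      (x / y).map(r => (r, s"($sx / $sy)")),
      (y / x).map(r => (r, s"($sy / $sx)"))
    ).flatten
  }

  // Pick any two terms, replace them by one combined term, and recurse.
  def search(terms: List[Term], target: Rat): List[String] = terms match {
    case (v, s) :: Nil => if (v == target) List(s) else Nil
    case _ =>
      for {
        i <- terms.indices.toList
        j <- terms.indices.toList if i < j
        rest = terms.zipWithIndex.collect { case (t, k) if k != i && k != j => t }
        c <- combine(terms(i), terms(j))
        r <- search(c :: rest, target)
      } yield r
  }

  // Every expression that uses each number exactly once and evaluates to `target`.
  def solve(numbers: List[Int], target: Int): List[String] =
    search(numbers.map(n => (Rat.of(n), n.toString)), Rat.of(target))
}
```

For example, `PuzzleSketch.solve(List(6, 6, 5, 2), 17)` searches for ways to make 17 from 6, 6, 5 and 2; these inputs are assumed here to match the linked puzzle. One solution is 6 * (2 + 5/6), which needs the fraction 5/6 as an intermediate value and is why the sketch avoids integer or floating-point division. The search may return the same solution more than once in different but equivalent forms.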
-- In Haskell

import Control.Applicative

newtype Compose f g a = Compose { getCompose :: f (g a) }

instance (Functor f, Functor g) => Functor (Compose f g) where
  fmap f (Compose fga) = Compose $ (fmap . fmap) f fga

An idiomatic way to implement zipWithIndex on List[A] in FP is to use a State monad:

  import scalaz.State

  def zipWithIndex[A](list: List[A]): List[(A, Int)] = 
    list.foldLeft(State.state[Int, List[(A, Int)]](List.empty))({ (state, a) => 
      for {
        ls <- state
        i <- State.get[Int]