Tommaso Teofili tteofili
@tteofili
tteofili / gist:4f03a755145b40ee620e
Last active August 29, 2015 14:05
script to find changed artifact versions since oak 1.0.0 release
#!/bin/bash
version=$1
if [ -z "$version" ]; then
#echo no version specified, finding latest released version
version=`svn log . | grep "jackrabbit-oak" -m 1 | tail -n 1 | awk '{print $4}'`
fi
#echo checking since $version
for i in $( ls -d */ ); do
changed=true
lv=$version
package com.github.tteofili.samples.lucene.benchmarks;
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
@tteofili
tteofili / RNN.java
Last active April 27, 2018 10:17
min char/word-level vanilla RNN (Java, nd4j, commons-math3)
/**
* Apache License 2.0
* see https://www.apache.org/licenses/LICENSE-2.0
*/
import org.apache.commons.math3.distribution.EnumeratedDistribution;
import org.apache.commons.math3.util.Pair;
import org.nd4j.linalg.api.iter.NdIndexIterator;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.api.ops.impl.transforms.SetRange;
import org.nd4j.linalg.api.ops.impl.transforms.SoftMax;
@tteofili
tteofili / NNFreqScoringSimilarity.java
Created January 23, 2018 13:47
Using index, term, doc frequencies to teach a neural network to rank docs
package com.github.tteofili.looseen.dl4j;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.similarities.BasicStats;
import org.apache.lucene.search.similarities.SimilarityBase;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.api.buffer.FloatBuffer;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
@tteofili
tteofili / AverageWordEmbeddingsReranker.java
Last active November 14, 2018 16:37
Anserini Reranker based on mean averaged word embeddings nearest neighbour
package io.anserini.rerank.lib;
import io.anserini.rerank.Reranker;
import io.anserini.rerank.RerankerContext;
import io.anserini.rerank.ScoredDocuments;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexReader;
@tteofili
tteofili / ReduceVectors.java
Last active October 2, 2019 19:43
reducing word vectors
package io.anserini.embeddings.nn;
import org.deeplearning4j.models.embeddings.loader.WordVectorSerializer;
import org.deeplearning4j.models.embeddings.wordvectors.WordVectors;
import org.deeplearning4j.models.word2vec.wordstore.VocabCache;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.dimensionalityreduction.PCA;
import java.io.IOException;
import java.nio.file.Path;
@tteofili
tteofili / ppa_pca_ppa.ipynb
Created October 10, 2019 06:33
PPA-PCA-PPA