shagunsodhani / cogroup.md
Created November 28, 2015 11:34
Example of cogroup operation

result dataset

query url
query1 url1
query2 url2
query1 url3
query2 url4

revenue dataset
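The preview cuts off before the revenue rows, but assuming a revenue table of (query, revenue) pairs (the revenue values below are made up for illustration), Spark's cogroup semantics can be sketched in plain Python:

```python
from collections import defaultdict

def cogroup(left, right):
    """Group two (key, value) datasets by key, like Spark's cogroup:
    each key maps to (list of left values, list of right values)."""
    groups = defaultdict(lambda: ([], []))
    for k, v in left:
        groups[k][0].append(v)
    for k, v in right:
        groups[k][1].append(v)
    return dict(groups)

urls = [("query1", "url1"), ("query2", "url2"),
        ("query1", "url3"), ("query2", "url4")]
revenue = [("query1", 10), ("query2", 5)]  # hypothetical revenue values
print(cogroup(urls, revenue))
# {'query1': (['url1', 'url3'], [10]), 'query2': (['url2', 'url4'], [5])}
```

Note that, unlike a join, cogroup keeps each key's values grouped together, so keys present in only one dataset still appear (with an empty list on the other side).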

shagunsodhani / GraphX.md
Created December 10, 2015 15:00
Notes about GraphX Paper

This week I read about GraphX, a distributed graph computation framework that unifies graph-parallel and data-parallel computation. Graph-parallel systems efficiently express iterative algorithms (by exploiting the static graph structure) but do not perform well on operations that require a more general view of the graph, like operations that move data out of the graph. Data-parallel systems perform well on such tasks, but directly implementing graph algorithms on data-parallel systems is inefficient due to complex joins and excessive data movement. GraphX fills this gap by allowing the same data to be viewed and operated upon both as a graph and as a table.

Preliminaries

Let G = (V, E) be a graph where V = {1, ..., n} is the set of vertices and E is the set of m directed edges. Each directed edge is a tuple of the form (i, j) ∈ E where i ∈ V is the source vertex and j ∈ V is the target vertex. The vertex p
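The edge-tuple formulation above can be illustrated with a tiny directed graph (the concrete vertices and edges below are illustrative only):

```python
# G = (V, E): n = 3 vertices, m = 3 directed edges
V = {1, 2, 3}
E = {(1, 2), (2, 3), (1, 3)}  # each edge is a (source, target) tuple

# out-neighbours of vertex 1, i.e. all j with (1, j) in E
out_1 = {j for (i, j) in E if i == 1}
print(out_1)  # {2, 3}
```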

Keybase proof

I hereby claim:

  • I am shagunsodhani on github.
  • I am shagun (https://keybase.io/shagun) on keybase.
  • I have a public key whose fingerprint is 052A 444C 26A2 DFAA 1F4E 9A79 48D9 E5AB AD77 C174

To claim this, I am signing this object:

shagunsodhani / RecurrentNeuralNetworkRegularization.md
Created July 24, 2016 15:01
Notes for 'Recurrent Neural Network Regularization' paper

Recurrent Neural Network Regularization

Introduction

  • The paper explains how to apply dropout to LSTMs and how it could reduce overfitting in tasks like language modelling, speech recognition, image caption generation and machine translation.
  • Link to the paper
  • Regularisation method that drops out (or temporarily removes) units in a neural network.
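A minimal numpy sketch of the paper's central idea: apply (inverted) dropout only to the non-recurrent, layer-to-layer connection of the LSTM, while the recurrent hidden state flows along the time dimension untouched. Function names and shapes here are my own, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(x, p, train=True):
    """Inverted dropout: zero units with probability p,
    rescale survivors by 1/(1-p) so expectations match at test time."""
    if not train or p == 0.0:
        return x
    mask = (rng.random(x.shape) >= p) / (1.0 - p)
    return x * mask

def lstm_step_input(x_t, h_prev, p=0.5):
    # Dropout on the non-recurrent (layer-to-layer) input only;
    # the recurrent state h_prev is passed through untouched.
    return np.concatenate([dropout(x_t, p), h_prev])
```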
shagunsodhani / Evaluating Prerequisite Qualities for Learning End-to-end Dialog Systems.md
Created August 8, 2016 03:46
Summary of "Evaluating Prerequisite Qualities for Learning End-to-end Dialog Systems" paper

Evaluating Prerequisite Qualities for Learning End-to-end Dialog Systems

Introduction

  • The paper presents a suite of benchmark tasks to evaluate end-to-end dialogue systems such that performing well on the tasks is a necessary (but not sufficient) condition for a fully functional dialogue agent.
  • Link to the paper

Dataset

  • Created using large-scale real-world sources - OMDB (Open Movie Database), MovieLens and Reddit.
shagunsodhani / Neural Generation of Regular Expressions from Natural Language with Minimal Domain Knowledge.md
Last active August 28, 2016 04:20
Notes for paper "Neural Generation of Regular Expressions from Natural Language with Minimal Domain Knowledge"

Neural Generation of Regular Expressions from Natural Language with Minimal Domain Knowledge

Introduction

  • Task of translating natural language queries into regular expressions without using domain specific knowledge.
  • Proposes a methodology for collecting a large corpus of regular expressions paired with natural language descriptions.
  • Reports performance gain of 19.6% over state-of-the-art models.
  • Link to the paper
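An illustrative input/output pair for the task (this example is hypothetical, not drawn from the paper's corpus):

```python
import re

# Natural language query (hypothetical): "lines that start with a digit"
# Target regular expression the model should generate:
regex = r"^[0-9].*"

print(bool(re.match(regex, "3 apples")))  # True
print(bool(re.match(regex, "apples")))    # False
```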

Architecture

shagunsodhani / QRN.md
Last active September 24, 2016 19:01
Notes for "Query Regression Networks for Machine Comprehension" Paper

Query Regression Networks for Machine Comprehension

Introduction

  • Machine Comprehension (MC) - given a natural language text (e.g. a short story), answer a natural language question about it.
  • End-To-End MC - cannot use language resources like dependency parsers; the only supervision during training is the correct answer.
  • Query Regression Network (QRN) - Variant of Recurrent Neural Network (RNN).
  • Link to the paper
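As a rough sketch, the QRN recurrence can be read as a gated update of a hidden "regressed query" as each sentence is read. The sketch below assumes this gated form; the weight matrices, names, and shapes are illustrative stand-ins, not the paper's exact parameterisation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def qrn_layer(sentences, query, W_z, W_h):
    """One QRN-style layer: for each sentence x_t, compute an update
    gate and a candidate 'regressed query', then blend with the state."""
    h = np.zeros_like(query)
    for x_t in sentences:
        xq = np.concatenate([x_t, query])
        z_t = sigmoid(W_z @ xq)        # update gate
        h_cand = np.tanh(W_h @ xq)     # candidate regressed query
        h = z_t * h_cand + (1 - z_t) * h
    return h
```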

Related Work

shagunsodhani / proposal-keras-dtdc.md
Last active October 22, 2016 12:17
Proposal for Keras talk at Delhi Twitter Developer Community Meetup

Topic

Introduction to Deep Learning with Keras

Description

Keras is a high-level neural networks library, written in Python and capable of running on top of either TensorFlow or Theano. It was developed with a focus on enabling fast experimentation. Being able to go from idea to result with the least possible delay is key to doing good research.

In the talk, I would introduce Keras and show how it can be used to accomplish workflows like image classification and sequence modelling.

shagunsodhani / entropy.py
Created October 26, 2016 11:01
Script to calculate entropy for any column in a file
# Script to calculate entropy for any column in a file.
from __future__ import print_function
import csv
import numpy as np

def read_column(file_path, sep, col_index):
    '''Yield the values of column col_index from a file
    where columns are separated by sep.'''
    with open(file_path) as f:
        for row in csv.reader(f, delimiter=sep):
            yield row[col_index]

def entropy(file_path, sep, col_index, col_name):
    '''Print the Shannon entropy (in bits) of column col_index
    in a file where columns are separated by sep.'''
    distribution = np.asarray(list(read_column(file_path, sep, col_index)))
    _, counts = np.unique(distribution, return_counts=True)
    probs = counts / float(counts.sum())
    print(col_name, -np.sum(probs * np.log2(probs)))
shagunsodhani / Improving Word Representations via Global Context and Multiple Word Prototypes.md
Last active December 18, 2016 17:56
Summary of paper "Improving Word Representations via Global Context and Multiple Word Prototypes"

Improving Word Representations via Global Context and Multiple Word Prototypes

Introduction

  • This paper pre-dated papers like GloVe and Word2Vec and proposed an architecture that
    • combines local and global context while learning word embeddings to capture the word semantics.
    • learns multiple embeddings per word to account for homonymy and polysemy.
  • Link to the paper
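A minimal sketch of the multiple-prototype idea: cluster a word's context-window vectors and keep the cluster centroids as that word's prototype vectors. This is a simplification of the paper's pipeline (a plain k-means with illustrative parameters, not the exact method):

```python
import numpy as np

rng = np.random.default_rng(0)

def multi_prototypes(context_vectors, k=3, iters=10):
    """Run k-means over one word's context-window vectors and return
    the k centroids (the word's prototype vectors) and the labels
    assigning each occurrence of the word to a prototype."""
    cv = np.asarray(context_vectors, dtype=float)
    centroids = cv[rng.choice(len(cv), size=k, replace=False)]
    for _ in range(iters):
        # squared distances of every context vector to every centroid
        dists = ((cv[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = cv[labels == j].mean(axis=0)
    return centroids, labels
```

Each occurrence of an ambiguous word (e.g. "bank") is then relabelled with its cluster, so homonyms and polysemes get distinct embeddings.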

Global Context-Aware Neural Language Model