Aneesh Joshi aneesh-joshi

## eval_w2v_avg.py
import os

from gensim.similarity_learning import WikiQAExtractor

wikiqa = WikiQAExtractor(os.path.join("..", "data", "WikiQACorpus", "WikiQA-train.tsv"))
data = wikiqa.get_data()

# Below commented code is for making a dict for word vectors and pickling it
# w2v = {}

# with open('glove.6B.50d.txt') as f:
# 	for line in f:
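
The commented-out lines describe building a word-to-vector dict from a GloVe text file and pickling it, but the preview is cut off before the loop body. A minimal sketch of that idea, assuming the usual GloVe layout of one word followed by space-separated floats per line. The helper names `load_glove` and `save_pickle`, the tiny sample file, and the output name `w2v.pkl` are hypothetical and not taken from the gist:

```python
import os
import pickle
import tempfile

import numpy as np


def load_glove(path):
    """Parse a GloVe-style text file into a {word: vector} dict."""
    w2v = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            if len(parts) < 2:
                continue  # skip blank or malformed lines
            word, values = parts[0], parts[1:]
            w2v[word] = np.array([float(v) for v in values], dtype=np.float32)
    return w2v


def save_pickle(obj, path):
    """Serialize obj to path with pickle."""
    with open(path, "wb") as f:
        pickle.dump(obj, f)


# Tiny stand-in for 'glove.6B.50d.txt' (3 dimensions instead of 50).
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, "sample_vectors.txt")
with open(sample_path, "w", encoding="utf-8") as f:
    f.write("king 0.1 0.2 0.3\nqueen 0.4 0.5 0.6\n")

w2v = load_glove(sample_path)
pickle_path = os.path.join(tmp_dir, "w2v.pkl")  # hypothetical output name
save_pickle(w2v, pickle_path)

# Reload to confirm the round trip.
with open(pickle_path, "rb") as f:
    restored = pickle.load(f)
```

Pickling the dict once avoids re-parsing the text file on every run. Only unpickle files you created yourself.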

## 17 May, 2018, SL Discussion.md

      
aneesh-joshi / 17 May, 2018, SL Discussion.md (last active May 18, 2018 07:10)

17 May, 2018 Discussion:

Predecided Objectives:

- Come up with a way of evaluating models (in the form of a script)
- Look for more datasets to evaluate models on

Datasets:

- WikiQA [Ranking/Regression]
- QuoraQP [Binary Classification]
- The Stanford Natural Language Inference (SNLI) Corpus [Multi-Class Classification]


## LSTM_POS_Tagger_notes.md

      
aneesh-joshi / LSTM_POS_Tagger_notes.md (created March 8, 2018 05:55)

Notes on LSTM POS Tagger Shapes

X: NumPy array of shape (number of samples, padding length), for example (64, 1000).

    [[0, 0, ..., 52, 16, 23],
     [0, 0, ..., 23, 64, 12]]

The array above has shape (2, 1000), since the padding length is 1000. Each row corresponds to one sentence, left-padded with zeros.
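
To make the shape concrete, here is a minimal NumPy sketch that left-pads two short integer sequences to a padding length of 1000. The helper `pad_left` is hypothetical, and the integers are placeholder indices taken from the example above. The notes do not say which padding routine was actually used.

```python
import numpy as np


def pad_left(sequences, pad_len, pad_value=0):
    """Left-pad integer sequences to pad_len, keeping only the last pad_len items."""
    X = np.full((len(sequences), pad_len), pad_value, dtype=np.int64)
    for i, seq in enumerate(sequences):
        trimmed = seq[-pad_len:]  # truncate from the front if too long
        if trimmed:
            X[i, -len(trimmed):] = trimmed
    return X


# Two "sentences" given as placeholder integer indices.
sentences = [[52, 16, 23], [23, 64, 12]]
X = pad_left(sentences, pad_len=1000)

print(X.shape)    # (number of samples, padding length)
print(X[0, -3:])  # the real tokens sit at the end of each row
```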


## word2vec_tftut.py
import tensorflow as tf
import numpy as np

corpus_raw = 'He is the king . The king is royal . She is the royal queen '

# convert to lower case
corpus_raw = corpus_raw.lower()

words = []
for word in corpus_raw.split():
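
The preview stops at the loop header, so the loop body is not shown. A minimal sketch of how such a loop could collect tokens and build a vocabulary, assuming the stand-alone periods should be dropped. That filter and the `word2int` mapping are assumptions for illustration, not taken from the preview:

```python
corpus_raw = 'He is the king . The king is royal . She is the royal queen '

# convert to lower case
corpus_raw = corpus_raw.lower()

words = []
for word in corpus_raw.split():
    if word != '.':  # assumption: sentence separators are not treated as words
        words.append(word)

# Hypothetical next step: map each distinct word to an integer id.
vocab = sorted(set(words))
word2int = {w: i for i, w in enumerate(vocab)}

print(vocab)
```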