vfive (fivejjs)
🏠 Working from home
  • Data scientist and engineer
  • Sydney, Australia

Expand The Edinburgh Twitter FSD Corpus

The Python scripts attached here take care of the following tedious work and should help one get started quickly with real work on the corpus (a minimal sketch of the resulting loop follows the list):

  • Respect the Twitter API rate limits and throttle API hits.
  • Don't hit the API for already expanded tweet IDs, so you can resume tweet expansion after stopping midway.
  • Parse the API response and dump it into the correct column in the sqlite3 database.
  • Gracefully handle exceptions while acquiring tweets from the API.
  • Wrap version 1.1 of the Twitter API.
  • Start from a specified tweet ID, assuming the input file is sorted in increasing order of tweet ID.
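A minimal sketch of the expansion loop implied by the points above; the fetch_tweet callable, the tweets table layout, and the sleep interval are illustrative assumptions, not the scripts' actual wrapper around the v1.1 API:

import sqlite3, time

def expand(db_path, tweet_ids, fetch_tweet, interval=1.0):
    """Resume-safe expansion: skip IDs already filled in, throttle between API hits."""
    conn = sqlite3.connect(db_path)
    cur = conn.cursor()
    for tid in tweet_ids:  # assumes the input is sorted in increasing order of tweet ID
        row = cur.execute("SELECT text FROM tweets WHERE id = ?", (tid,)).fetchone()
        if row and row[0]:           # already expanded -> don't hit the API again
            continue
        try:
            text = fetch_tweet(tid)  # hypothetical callable wrapping the v1.1 statuses/show endpoint
        except Exception as exc:     # deleted tweets, protected accounts, rate-limit errors, ...
            print("skipping %s: %s" % (tid, exc))
        else:
            cur.execute("INSERT OR REPLACE INTO tweets (id, text) VALUES (?, ?)", (tid, text))
            conn.commit()
        time.sleep(interval)         # crude throttle to respect the rate limits
    conn.close()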
import nltk

text = """The Buddha, the Godhead, resides quite as comfortably in the circuits of a digital
computer or the gears of a cycle transmission as he does at the top of a mountain
or in the petals of a flower. To think otherwise is to demean the Buddha...which is
to demean oneself."""

# Used when tokenizing words
sentence_re = r'''(?x)          # set flag to allow verbose regexps
    (?:[A-Z]\.)+                # abbreviations, e.g. U.S.A.
  | \w+(?:-\w+)*                # words with optional internal hyphens
'''
tokens = nltk.regexp_tokenize(text, sentence_re)
@fivejjs
fivejjs / gist:6babb65de8505af5e2ca
Created January 13, 2016 14:00 — forked from karpathy/gist:587454dc0146a6ae21fc
An efficient, batched LSTM.
"""
This is a batched LSTM forward and backward pass
"""
import numpy as np
import code
class LSTM:
  @staticmethod
  def init(input_size, hidden_size, fancy_forget_bias_init = 3):
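The listing truncates the preview at the initializer. A minimal sketch of what such an init typically does — Xavier-scaled weights for all four gates stacked in a single matrix, with an optional positive forget-gate bias; this is an illustration, not necessarily the gist's exact code:

import numpy as np

def init_lstm(input_size, hidden_size, fancy_forget_bias_init=3):
    # one (input+hidden+1) x (4*hidden) matrix: row 0 holds the biases, the rest the gate weights
    WLSTM = np.random.randn(input_size + hidden_size + 1, 4 * hidden_size) / np.sqrt(input_size + hidden_size)
    WLSTM[0, :] = 0  # start all biases at zero
    if fancy_forget_bias_init != 0:
        # a positive forget-gate bias keeps the forget gates open early in training
        WLSTM[0, hidden_size:2 * hidden_size] = fancy_forget_bias_init
    return WLSTM

WLSTM = init_lstm(10, 4)  # e.g. 10 inputs, 4 hidden units -> a (15, 16) parameter matrix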
@fivejjs
fivejjs / readme.md
Created May 7, 2016 03:49 — forked from baraldilorenzo/readme.md
VGG-16 pre-trained model for Keras

## VGG16 model for Keras

This is the Keras model of the 16-layer network used by the VGG team in the ILSVRC-2014 competition.

It has been obtained by directly converting the Caffe model provided by the authors.

Details about the network architecture can be found in the following arXiv paper:

Very Deep Convolutional Networks for Large-Scale Image Recognition

K. Simonyan, A. Zisserman
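For reference, an equivalent pre-trained VGG-16 now ships with Keras itself; a minimal usage sketch (the image path is a placeholder, and this uses keras.applications rather than the converted Caffe weights described above):

import numpy as np
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input, decode_predictions
from tensorflow.keras.preprocessing import image

model = VGG16(weights="imagenet")  # downloads the pre-trained ImageNet weights

img = image.load_img("cat.jpg", target_size=(224, 224))  # VGG-16 expects 224x224 RGB input
x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
preds = model.predict(x)
print(decode_predictions(preds, top=3)[0])  # top-3 ImageNet classes with scores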

@fivejjs
fivejjs / 0_reuse_code.js
Created August 24, 2016 02:40
Here are some things you can do with Gists in GistBox.
// Use Gists to store code you would like to remember later on
console.log(window); // log the "window" object to the console
- word2vec https://arxiv.org/abs/1310.4546
- sentence2vec, paragraph2vec, doc2vec https://cs.stanford.edu/~quocle/paragraph_vector.pdf
- tweet2vec http://arxiv.org/abs/1605.03481
- tweet2vec http://socialmachines.media.mit.edu/wp-content/uploads/sites/27/2016/05/tweet2vec_vvr.pdf
- author2vec http://dl.acm.org/citation.cfm?id=2889382
- item2vec http://arxiv.org/abs/1603.04259
- lda2vec https://arxiv.org/abs/1605.02019
- illustration2vec http://dl.acm.org/citation.cfm?id=2820907
- tag2vec http://ktsaurabh.weebly.com/uploads/3/1/7/8/31783965/distributed_representations_for_content-based_and_personalized_tag_recommendation.pdf
- category2vec http://www.anlp.jp/proceedings/annual_meeting/2015/pdf_dir/C4-3.pdf
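As a concrete anchor for the methods listed above, a minimal word2vec training run with gensim (gensim >= 4 API; the toy corpus and hyperparameters are illustrative placeholders):

from gensim.models import Word2Vec

# toy corpus: one tokenized sentence per list entry
sentences = [
    ["deep", "learning", "learns", "distributed", "representations"],
    ["word2vec", "maps", "words", "to", "dense", "vectors"],
]

# skip-gram (sg=1) with small dimensions, purely for illustration
model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, sg=1)
print(model.wv.most_similar("word2vec", topn=3))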
@fivejjs
fivejjs / stuns
Created October 22, 2016 03:39 — forked from zziuni/stuns
STUN server list
# source : http://code.google.com/p/natvpn/source/browse/trunk/stun_server_list
# A list of available STUN servers.
stun.l.google.com:19302
stun1.l.google.com:19302
stun2.l.google.com:19302
stun3.l.google.com:19302
stun4.l.google.com:19302
stun01.sipphone.com
stun.ekiga.net
@fivejjs
fivejjs / howto.md
Created February 2, 2017 09:07 — forked from persiyanov/howto.md
How to get an Amazon EC2 instance and do machine learning on it, with a Jupyter 4.0.6 server and Python 2.7.

Goal

We want to move computation to a machine with more power. We will set up Anaconda 4.0.0 and XGBoost 0.4 (which is tricky to install).

Preliminaries

Let's start

AWS Console and launching EC2 Instance.
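The howto drives this step through the AWS Console; for completeness, the same launch can be scripted with boto3 (the region, AMI ID, key pair, and instance type below are placeholders):

import boto3

ec2 = boto3.resource("ec2", region_name="us-east-1")  # placeholder region
instances = ec2.create_instances(
    ImageId="ami-xxxxxxxx",    # placeholder AMI, e.g. an Ubuntu image
    InstanceType="p2.xlarge",  # placeholder GPU instance type for ML workloads
    KeyName="my-key-pair",     # placeholder key pair for SSH access
    MinCount=1,
    MaxCount=1,
)
print(instances[0].id)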

@fivejjs
fivejjs / min-char-rnn.py
Created October 18, 2017 13:47 — forked from karpathy/min-char-rnn.py
Minimal character-level language model with a Vanilla Recurrent Neural Network, in Python/numpy
"""
Minimal character-level Vanilla RNN model. Written by Andrej Karpathy (@karpathy)
BSD License
"""
import numpy as np
# data I/O
data = open('input.txt', 'r').read() # should be simple plain text file
chars = list(set(data))
data_size, vocab_size = len(data), len(chars)