Let p(x) be the probability mass function of a random variable X over a discrete set of symbols X:
p(x) = P(X = x)
For example, if we toss two fair coins and count the number of heads, we have a random variable with p(0) = 1/4, p(1) = 1/2 and p(2) = 1/4.
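This two-coin example can be checked by enumerating the sample space; a small sketch using exact fractions rather than floats:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate the four equally likely outcomes of two fair coin tosses
outcomes = list(product('HT', repeat=2))
# X = number of heads; count how often each value of X occurs
counts = Counter(outcome.count('H') for outcome in outcomes)
# p(x) = P(X = x)
pmf = {x: Fraction(n, len(outcomes)) for x, n in counts.items()}
# pmf == {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
```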
"""Query AlchemyAPI to determine number of API calls still available""" | |
# -*- coding: utf-8 -*- | |
import json | |
import requests | |
def get_api_key(): | |
# Load API key (40 HEX character key) from local file | |
key = open('api_key.txt').readline().strip() | |
return key |
import nltk

text = """The Buddha, the Godhead, resides quite as comfortably in the circuits of a digital
computer or the gears of a cycle transmission as he does at the top of a mountain
or in the petals of a flower. To think otherwise is to demean the Buddha...which is
to demean oneself."""

# Used when tokenizing words
sentence_re = r'''(?x)      # set flag to allow verbose regexps
    ([A-Z])(\.[A-Z])+\.?    # abbreviations, e.g. U.S.A.
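The pattern above is cut off at the abbreviation branch. A hypothetical completion of such a verbose tokenizer regex (the extra branches are my own illustration, not the original's, and I use non-capturing groups so `re.findall` returns whole tokens):

```python
import re

# Hypothetical completion of a verbose word-tokenizing pattern
sentence_re = r'''(?x)        # set flag to allow verbose regexps
    (?:[A-Z]\.)+              # abbreviations, e.g. U.S.A.
  | \w+(?:-\w+)*              # words with optional internal hyphens
  | [.,;"'?():_`-]            # standalone punctuation tokens
'''
tokens = re.findall(sentence_re, "The U.S.A. sells cycle-transmissions.")
# tokens == ['The', 'U.S.A.', 'sells', 'cycle-transmissions', '.']
```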
""" | |
Programming task | |
================ | |
Implement the method iter_sample below to make the Unit test pass. iter_sample | |
is supposed to peek at the first n elements of an iterator, and determine the | |
minimum and maximum values (using their comparison operators) found in that | |
sample. To make it more interesting, the method is supposed to return an | |
iterator which will return the same exact elements that the original one would | |
have yielded, i.e. the first n elements can't be missing. |
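The task's exact signature isn't shown in this excerpt; one plausible sketch buffers the peeked elements and chains them back in front of the remainder:

```python
import itertools

def iter_sample(iterator, n):
    """Peek at the first n elements of `iterator` and return
    (minimum, maximum, restored_iterator), where the restored
    iterator still yields all of the original elements."""
    sample = list(itertools.islice(iterator, n))
    # Chain the consumed sample back in front of the untouched remainder
    return min(sample), max(sample), itertools.chain(sample, iterator)

it = iter([5, 1, 9, 3, 7])
lo, hi, restored = iter_sample(it, 3)
# lo == 1, hi == 9, and list(restored) == [5, 1, 9, 3, 7]
```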
""" | |
Programming task | |
================ | |
The following is an implementation of a simple Named Entity Recognition (NER). | |
NER is concerned with identifying place names, people names or other special | |
identifiers in text. | |
Here we make a very simple definition of a named entity: A sequence of | |
at least two consecutive capitalized words. E.g. "Los Angeles" is a named |
""" | |
This is a script used to clean control characters from the
- NTU Multilingual Corpus (http://web.mysites.ntu.edu.sg/fcbond/open/pubs/2012-ijalp-ntumc.pdf)
- SeedLing Corpus (http://www.aclweb.org/anthology/W/W14/W14-2211.pdf)
- DSL Corpus Collection (https://comparable.limsi.fr/bucc2014/4.pdf)
"""
import re
import unicodedata
# A full list of Unicode characters.
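The cleaning logic itself is truncated above. One common approach, sketched here as an assumption rather than the script's actual code, filters on Unicode general categories:

```python
import unicodedata

def remove_control_chars(text):
    # Keep ordinary whitespace, drop characters whose Unicode general
    # category starts with 'C' (Cc = control, Cf = format, etc.)
    return ''.join(ch for ch in text
                   if ch in '\t\n\r'
                   or not unicodedata.category(ch).startswith('C'))

cleaned = remove_control_chars('ab\x00c\u200bd\ne')
# cleaned == 'abcd\ne'  (NUL and zero-width space removed, newline kept)
```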
With NLTK version 3.1 and the Stanford NER tool (2015-12-09 release), it is possible to hack StanfordNERTagger._stanford_jar
to include the other .jar
files that are necessary for the new tagger.
First, set up the environment variables as instructed at https://github.com/nltk/nltk/wiki/Installing-Third-Party-Software
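The hack amounts to extending the classpath string stored in `_stanford_jar`. The jar paths below are hypothetical placeholders; a sketch of building the combined classpath:

```python
import os

# Hypothetical locations -- substitute the paths on your own machine
stanford_jar = '/usr/local/stanford-ner-2015-12-09/stanford-ner.jar'
extra_jars = [
    '/usr/local/stanford-ner-2015-12-09/lib/joda-time.jar',
    '/usr/local/stanford-ner-2015-12-09/lib/jollyday.jar',
]

# _stanford_jar is the classpath string the tagger hands to Java, so
# extra jars can be appended with the platform's path separator
classpath = os.pathsep.join([stanford_jar] + extra_jars)
```

After constructing the tagger, the idea is to assign this string back, e.g. `tagger._stanford_jar = classpath`; note this pokes at a private attribute, so it may break in later NLTK versions.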
# -*- coding: utf-8 -*-
"""BLEU.
Usage:
    bleu.py --reference FILE --translation FILE [--weights STR] [--smooth STR] [--smooth-epsilon STR] [--smooth-alpha STR] [--smooth-k STR] [--segment-level]
    bleu.py -r FILE -t FILE [-w STR] [--smooth STR] [--segment-level]
Options:
    -h --help    Show this screen.
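The script's body is not shown here. As a reference point, a minimal sentence-level BLEU with uniform weights and no smoothing (my sketch, not necessarily this script's implementation):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def sentence_bleu(reference, hypothesis, max_n=4):
    # Modified n-gram precisions: hypothesis counts clipped by reference counts
    precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = Counter(ngrams(hypothesis, n))
        ref_counts = Counter(ngrams(reference, n))
        overlap = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = max(sum(hyp_counts.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0  # unsmoothed BLEU is zero if any precision is zero
    # Geometric mean of the precisions, uniform weights
    log_avg = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty for hypotheses shorter than the reference
    bp = 1.0 if len(hypothesis) >= len(reference) else \
        math.exp(1 - len(reference) / len(hypothesis))
    return bp * math.exp(log_avg)

ref = "the cat is on the mat".split()
# A perfect match scores 1.0
score = sentence_bleu(ref, ref)
```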
from nltk.corpus import wordnet as wn
from nltk.stem import PorterStemmer, WordNetLemmatizer
from nltk.tag import PerceptronTagger
#from nltk import pos_tag, word_tokenize

# Pywsd's Lemmatizer.
porter = PorterStemmer()
wnl = WordNetLemmatizer()
Firstly, I strongly think that if you're working with NLP/ML/AI-related tools, getting things to work on Linux and Mac OS is much easier and saves you quite a lot of time.
Disclaimer: I am not affiliated with Continuum (conda), Git, Java, Windows OS, Stanford NLP or the MaltParser group. The steps presented below are how I, IMHO, would set up a Windows computer if I owned one.
Please, please, please understand the solution, don't just copy and paste!!! We're not monkeys typing Shakespeare ;P