def normalize(train, test):
    # Use training-set statistics for both splits so that no
    # information from the test set leaks into the scaling.
    mean, std = train.mean(), train.std()
    train = (train - mean) / std
    test = (test - mean) / std
    return train, test
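A minimal standard-library sketch of the same idea, assuming plain Python lists rather than arrays. The name `normalize_lists` is hypothetical, and `pstdev` is used because the population standard deviation is what NumPy's `std` computes by default; other libraries may default to the sample version.

```python
from statistics import fmean, pstdev


def normalize_lists(train, test):
    """Scale both lists using the mean and std of the training list only."""
    mean = fmean(train)
    std = pstdev(train)  # population standard deviation (ddof = 0)
    train_scaled = [(value - mean) / std for value in train]
    test_scaled = [(value - mean) / std for value in test]
    return train_scaled, test_scaled


# Illustrative data only
train_scaled, test_scaled = normalize_lists([1.0, 2.0, 3.0], [2.0, 4.0])
print(train_scaled)
print(test_scaled)
```

After scaling, the training list has mean 0 and standard deviation 1; the test list generally does not, which is expected.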
import matplotlib.pyplot as plt
import sklearn.linear_model

# Train the logistic regression classifier
# (X, y and plot_decision_boundary are assumed to be defined elsewhere)
clf = sklearn.linear_model.LogisticRegressionCV()
clf.fit(X, y)

# Plot the decision boundary
plot_decision_boundary(lambda x: clf.predict(x))
plt.title("Logistic Regression")
"""
### Problem Statement ###
Let's say you have a square matrix of cosine similarities (values between 0 and 1).
This square matrix can be of any size.
You want to find clusters that maximize the similarity values between the elements within each cluster.

For example, for the following matrix:

  | A   | B   | C   | D
A | 1.0 | 0.1 | 0.6 | 0.4
B | 0.1 | 1.0 | 0.1 | 0.2
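def fix_encoding(some_str):
    # Keep only printable ASCII characters: space (0x20) through '~' (0x7E).
    # The previous upper bound of 0x78 ('x') also dropped 'y', 'z', '{', '|', '}' and '~'.
    return ''.join([c for c in some_str if 0x20 <= ord(c) <= 0x7E])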
# ===================================================================================
# Many thanks to:
# https://uoa-eresearch.github.io/eresearch-cookbook/recipe/2014/11/20/conda/
#
# More info:
# https://www.continuum.io/blog/developer-blog/python-packages-and-environments-conda
# https://conda-forge.github.io/#about
# ===================================================================================
# conda info --env
from functools import lru_cache


@lru_cache(maxsize=100)
def fibonacci(n):
    # Check that the input is a positive integer
    if type(n) != int:
        raise TypeError("n must be a positive int")
    if n < 1:
        raise ValueError("n must be a positive int")
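The preview stops after the input checks. One way the rest of the function might look is sketched below. The base cases (the first two terms both equal 1) and the name `fibonacci_sketch` are assumptions, not taken from the original.

```python
from functools import lru_cache


@lru_cache(maxsize=100)
def fibonacci_sketch(n):
    # Same input checks as in the snippet above
    if type(n) != int:
        raise TypeError("n must be a positive int")
    if n < 1:
        raise ValueError("n must be a positive int")
    # Assumed base cases: the first and second terms are both 1
    if n <= 2:
        return 1
    # lru_cache stores earlier results, so each term is computed only once
    return fibonacci_sketch(n - 1) + fibonacci_sketch(n - 2)


print([fibonacci_sketch(i) for i in range(1, 11)])
# -> [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```

Without the cache, the naive recursion recomputes the same terms many times over; with it, each call for a new `n` needs only two cached lookups.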
def foo(s):
    if len(s) <= 0:
        return None
    else:
        output, curr_char, curr_count = '', '', 0
        for idx in range(0, len(s)):
            if s[idx] == curr_char:
                curr_count += 1
            else:
                output += curr_char + str(curr_count) if curr_count > 0 else curr_char
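The loop above appears to build a run-length encoding: each character followed by the length of its run. As a hedged comparison, here is a compact sketch of that technique using `itertools.groupby`. The name `run_length_encode` and the exact output format are assumptions, since the original function is cut off; this version also returns an empty string rather than `None` for empty input.

```python
from itertools import groupby


def run_length_encode(s):
    # groupby yields one (character, run) pair per block of equal characters
    return ''.join(f"{char}{len(list(run))}" for char, run in groupby(s))


print(run_length_encode("aaabcc"))  # -> a3b1c2
```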
from sklearn.linear_model import SGDRegressor


# https://adventuresindatascience.wordpress.com/2014/12/30/minibatch-learning-for-large-scale-data-using-scikit-learn/
def iter_minibatches(chunksize, numtrainingpoints):
    # Provide chunks one by one
    chunkstartmarker = 0
    while chunkstartmarker < numtrainingpoints:
        chunkrows = range(chunkstartmarker, chunkstartmarker + chunksize)
        X_chunk, y_chunk = getrows(chunkrows)
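The loop above builds `range(chunkstartmarker, chunkstartmarker + chunksize)`, which can run past `numtrainingpoints` on the final chunk when the total is not a multiple of `chunksize`. Below is a minimal sketch of the index bookkeeping alone, with the last chunk clamped. The name `iter_minibatch_ranges` is hypothetical, and fetching the rows (`getrows` in the snippet) is deliberately left out.

```python
def iter_minibatch_ranges(chunksize, numtrainingpoints):
    """Yield consecutive index ranges covering 0..numtrainingpoints-1."""
    start = 0
    while start < numtrainingpoints:
        # Clamp the end so the final chunk never exceeds the data size
        stop = min(start + chunksize, numtrainingpoints)
        yield range(start, stop)
        start = stop


for rows in iter_minibatch_ranges(4, 10):
    print(list(rows))
# -> [0, 1, 2, 3]
# -> [4, 5, 6, 7]
# -> [8, 9]
```

Each yielded range could then be handed to whatever row-loading function is in use, with the resulting chunk passed to an incremental learner.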
# -*- coding: utf-8 -*-
from afinn import Afinn
import spacy
import re


class TargetedSentimentAnalysis(object):
    def __init__(self):
        self.afinn = Afinn(emoticons=True)
import argparse
import sys


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--x', type=float, default=1.0,
                        help='What is the first number?')
    parser.add_argument('--y', type=float, default=1.0,
                        help='What is the second number?')
    parser.add_argument('--operation', type=str, default='add',