@undarmaa
undarmaa / understanding-word-vectors.ipynb
Created March 13, 2019 00:36 — forked from aparrish/understanding-word-vectors.ipynb
Understanding word vectors: A tutorial for "Reading and Writing Electronic Text," a class I teach at ITP. (Python 2.7) Code examples released under CC0 https://creativecommons.org/choose/zero/, other text released under CC BY 4.0 https://creativecommons.org/licenses/by/4.0/
@undarmaa
undarmaa / text_cluster.py
Created October 2, 2018 07:19 — forked from gdbassett/text_cluster.py
Basic script for text->vectorization->TF-IDF->canopies->kmeans->clusters. Initially tested on VCDB breach summaries.
#!/usr/bin/env python
# -*- encoding: utf-8 -*-
# based on http://scikit-learn.org/stable/auto_examples/document_clustering.html
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans, MiniBatchKMeans
from sklearn.metrics.pairwise import pairwise_distances
import numpy as np
from time import time
from collections import defaultdict
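
The TF-IDF step of the pipeline named in the description can be illustrated with a small standard-library sketch. This is not the script's own code: the `tfidf` function, the whitespace tokenizer, and the sample documents are all assumptions made for illustration. The weighting follows the smoothed formula documented for scikit-learn's default settings, idf(t) = ln((1 + n) / (1 + df(t))) + 1, followed by L2 normalization of each row.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return (vocabulary, rows) where each row is an L2-normalized TF-IDF vector.

    Hypothetical helper for illustration; tokenization is a plain
    lowercase whitespace split, which is simpler than a real vectorizer.
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    # Smoothed inverse document frequency.
    idf = {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1 for t in vocab}

    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        row = [counts[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(w * w for w in row)) or 1.0
        rows.append([w / norm for w in row])
    return vocab, rows


# Made-up sample documents, not taken from any real dataset.
docs = ["stolen laptop", "stolen password", "phishing email"]
vocab, rows = tfidf(docs)
```

In the first row, "laptop" receives a larger weight than "stolen" because "stolen" appears in two of the three documents. Rows like these are what a clustering step would then consume.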

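Getting Started with TensorFlow

Author: 김정주 (haje01@gmail.com)

This document is based on the content of the official TensorFlow site.


Introduction

TensorFlow is an open-source library created by Google for machine learning and deep learning. It is built around a data flow graph model.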
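
The data flow graph idea can be shown with a toy example in plain Python. This is only a sketch of the concept and does not use or imitate the TensorFlow API; `Node`, `constant`, `add`, `mul`, and `run` are hypothetical names invented here. Building the graph only records which operation depends on which inputs; nothing is computed until the graph is run.

```python
class Node:
    """One vertex in a toy data flow graph (hypothetical, not TensorFlow)."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op          # "const", "add", or "mul"
        self.inputs = inputs  # upstream nodes whose outputs flow into this one
        self.value = value    # only used by "const" nodes


def constant(value):
    return Node("const", value=value)


def add(a, b):
    return Node("add", inputs=(a, b))


def mul(a, b):
    return Node("mul", inputs=(a, b))


def run(node):
    """Evaluate a node by first evaluating everything it depends on."""
    if node.op == "const":
        return node.value
    left, right = (run(n) for n in node.inputs)
    if node.op == "add":
        return left + right
    if node.op == "mul":
        return left * right
    raise ValueError("unknown op: " + node.op)


# Graph construction: no arithmetic happens on these lines.
a = constant(3.0)
b = constant(4.0)
total = add(a, b)
product = mul(total, b)

# Execution: values flow along the edges from inputs to the requested node.
result = run(product)
```

Separating construction from execution is what lets a framework inspect, optimize, or distribute the graph before any values are computed.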

@undarmaa
undarmaa / canopy.py
Created May 8, 2018 02:27 — forked from gdbassett/canopy.py
Efficient python implementation of canopy clustering. (A method for efficiently generating centroids and clusters, most commonly as input to a more robust clustering algorithm.)
from sklearn.metrics.pairwise import pairwise_distances
import numpy as np
# X should be a numpy matrix, very likely sparse matrix: http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.sparse.csr_matrix.html#scipy.sparse.csr_matrix
# T1 = Distance to centroid point to not include in other clusters
# T2 = Distance to centroid point to include in cluster
# T1 > T2 for overlapping clusters
# T1 < T2 will have points which reside in no clusters
# T1 == T2 will cause all points to reside in mutually exclusive clusters
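
The general canopy clustering idea can be sketched with the standard library alone. This is an illustrative sketch, not the gist's implementation: the `canopy` function, its parameter names `include_radius` and `remove_radius`, and the sample points are assumptions, and no mapping onto the T1 and T2 names above is claimed. Each round picks a remaining candidate as a center, collects every point within `include_radius` as members, and drops every candidate within `remove_radius` from the pool of future centers.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def canopy(points, include_radius, remove_radius, distance=euclidean):
    """Return a list of (center_index, member_indices) pairs.

    Hypothetical sketch of canopy clustering:
    - include_radius: points this close to a center join its canopy
    - remove_radius: candidates this close to a center can no longer
      become centers themselves
    """
    candidates = list(range(len(points)))
    canopies = []
    while candidates:
        center = candidates[0]
        dists = [distance(points[center], p) for p in points]
        members = [i for i, d in enumerate(dists) if d <= include_radius]
        canopies.append((center, members))
        # The center is at distance 0, so it is always removed here,
        # which guarantees the loop terminates.
        candidates = [i for i in candidates if dists[i] > remove_radius]
    return canopies


# Made-up one-dimensional data embedded in 2-D for illustration.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (10.0, 0.0)]
canopies = canopy(points, include_radius=1.5, remove_radius=1.0)
```

With these values the point at index 1 lands in two canopies, because the inclusion radius is larger than the removal radius. The resulting centers are the kind of seed a later k-means step could start from.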
@undarmaa
undarmaa / Dockerfile
Created July 28, 2016 01:18 — forked from eduardschaeli/Dockerfile
Creates a Docker image with IPython Notebook installed.
# iPython Notebook with per-user storage and config
#
# Based on crosbymichael/ipython
# Creates a Docker image with IPython Notebook installed.
#
# It expects to be run like this:
#
# docker run -v /home/eduard/notebooks/eduard:/notebooks benthoo/ipython-user
#
# You provide a folder per user on the host system. This folder will hold the user's notebooks and also needs to contain the
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
import numpy as np
X, y = make_regression(n_samples=200, n_features=5000, random_state=0)
alpha = 1
model = Lasso(alpha=alpha, fit_intercept=False, max_iter=1000)
model.fit(X, y)
@undarmaa
undarmaa / knn_wine.py
Last active August 29, 2015 14:23 — forked from glamp/knn_wine.py
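import numpy as np
import pandas as pd
import pylab as pl
from sklearn.neighbors import KNeighborsClassifier
df = pd.read_csv("https://s3.amazonaws.com/demo-datasets/wine.csv")
# Boolean mask that is True for roughly 30% of the rows
test_idx = np.random.uniform(0, 1, len(df)) <= 0.3
# Note: as written, the ~30% of rows where the mask is True go to `train`
train = df[test_idx]
test = df[~test_idx]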
@undarmaa
undarmaa / cluster.py
Last active August 29, 2015 14:22 — forked from erogol/cluster.py
import numpy as np
import numpy
import theano
import theano.tensor as T
from theano import function, config, shared, sandbox
from theano import ProfileMode
from sklearn import cluster, datasets
import matplotlib.pyplot as plt
def rsom(data, cluster_num, alpha, epochs = -1, batch = 1, verbose = False):