import threading


class Queue(object):
    def __init__(self, max_size):
        self.max_size = max_size
        self.mutex = threading.Lock()
        self.is_full = threading.Condition(self.mutex)
        self.is_empty = threading.Condition(self.mutex)
        self._queue = []
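The preview above stops after the constructor. A minimal sketch of how the two condition variables could be used for a bounded blocking queue follows, written for Python 3. The class name `BoundedQueue` and the `put`/`get` methods are assumptions for illustration and are not taken from the original file; only the attribute names come from the snippet.

```python
import threading


class BoundedQueue(object):
    """Hypothetical completion of the Queue preview (put/get are assumed)."""

    def __init__(self, max_size):
        self.max_size = max_size
        self.mutex = threading.Lock()
        # Both conditions share one lock, as in the original constructor.
        self.is_full = threading.Condition(self.mutex)    # producers wait here
        self.is_empty = threading.Condition(self.mutex)   # consumers wait here
        self._queue = []

    def put(self, item):
        with self.is_full:
            # Re-check in a loop: a woken thread must confirm there is room.
            while len(self._queue) >= self.max_size:
                self.is_full.wait()
            self._queue.append(item)
            self.is_empty.notify()  # wake one waiting consumer

    def get(self):
        with self.is_empty:
            while not self._queue:
                self.is_empty.wait()
            item = self._queue.pop(0)
            self.is_full.notify()  # wake one waiting producer
            return item
```

Sharing a single lock between both conditions means `put` and `get` never run concurrently, which keeps the list operations safe without extra locking.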
class ThreadUrl(threading.Thread):
    def __init__(self, queue, visited, lock):
        super(ThreadUrl, self).__init__()
        self.queue = queue
        self.visited = visited
        self.lock = lock

    def run(self):
        while True:
# coding: utf-8
# version 1.0.3
# #![Spark Logo](http://spark-mooc.github.io/web-assets/images/ta_Spark-logo-small.png) + ![Python Logo](http://spark-mooc.github.io/web-assets/images/python-logo-master-v3-TM-flattened_small.png)
# # **Text Analysis and Entity Resolution**
# ####Entity resolution is a common, yet difficult problem in data cleaning and integration. This lab will demonstrate how we can use Apache Spark to apply powerful and scalable text analysis techniques and perform entity resolution across two datasets of commercial products.
# #### Entity Resolution, or "[Record linkage][wiki]" is the term used by statisticians, epidemiologists, and historians, among others, to describe the process of joining records from one data source with another that describe the same entity. Other terms with the same meaning include "entity disambiguation/linking", "duplicate detection", "deduplication", "record matching", "(reference) reconciliation", "object identification", "data/information integration", and "conf
#!/usr/bin/env python


def curry(func):
    """
    Decorator to curry a function, typical usage:
    >>> @curry
    ... def foo(a, b, c):
    ...     return a + b + c
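The preview is cut off before the decorator body. One way such a `curry` decorator could be written is sketched below for Python 3. It assumes positional arguments only and reads the arity with `inspect.signature`; the helper name `accumulate` is hypothetical, and the original implementation may differ.

```python
import inspect
from functools import wraps


def curry(func):
    """Sketch: collect positional arguments until func's arity is reached."""
    arity = len(inspect.signature(func).parameters)

    def accumulate(collected):
        @wraps(func)
        def wrapper(*args):
            combined = collected + args
            if len(combined) >= arity:
                # Enough arguments gathered: call the real function.
                return func(*combined)
            # Otherwise return a new callable that remembers what we have.
            return accumulate(combined)
        return wrapper

    return accumulate(())


@curry
def foo(a, b, c):
    return a + b + c


print(foo(1)(2)(3))
print(foo(1, 2)(3))
print(foo(1, 2, 3))
```

All three calls add the same numbers, so each prints 6. Because every partial application builds a fresh tuple, a partially applied function such as `foo(1)` can be reused safely.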
"""
# generate ListNode from unpacking list
>>> l = ListNode(*range(10))
>>> l
<ListNode [0]>
>>> print l
<ListNode [0]>: 0 -> 1 -> 2 -> 3 -> 4 -> 5 -> 6 -> 7 -> 8 -> 9
# encoding: utf-8
'''
>>> l = fn_list([3, 2, 1])
>>> l.len()
3
>>> l.map(lambda x: x + 1).map(lambda x: x * 2)
[8, 6, 4]
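The doctest above shows the interface but the class body is not visible. A minimal sketch that would satisfy these three lines is given below, assuming `fn_list` is a plain `list` subclass; that assumption is ours and the original may be implemented differently.

```python
class fn_list(list):
    """Sketch: a list subclass with chainable helper methods (assumed design)."""

    def len(self):
        # Inside the method, `len` still refers to the builtin.
        return len(self)

    def map(self, func):
        # Return a new fn_list so that calls can be chained.
        return fn_list(func(x) for x in self)


l = fn_list([3, 2, 1])
print(l.len())
print(l.map(lambda x: x + 1).map(lambda x: x * 2))
```

Subclassing `list` keeps the normal list repr, which is why the chained call displays as `[8, 6, 4]` in the doctest.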
# encoding: utf-8
"""
Use kNN algorithm to recognize digits.
Download files here: http://download.csdn.net/detail/zouxy09/6610571
├── digits
│   ├── testDigits
│   └── trainingDigits
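The classifier itself is not visible in this preview. As a rough illustration of the kNN idea only, the sketch below classifies a point by majority vote among its `k` nearest training vectors. The function names and the toy data are hypothetical and are not taken from the digits dataset or the original file.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(sample, train_vectors, train_labels, k=3):
    """Return the most common label among the k nearest training vectors."""
    order = sorted(range(len(train_vectors)),
                   key=lambda i: euclidean(sample, train_vectors[i]))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Toy data (made up for illustration): two well-separated clusters.
train_vectors = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_labels = ["a", "a", "a", "b", "b", "b"]

print(knn_classify((0.2, 0.1), train_vectors, train_labels))
print(knn_classify((5.5, 5.5), train_vectors, train_labels))
```

For digit recognition, each training vector would be a flattened image and each label a digit, but the voting step stays the same.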
import random

import numpy as np


def distance(v1, v2):
    """
    Euclidean distance between v1 and v2.
    v1 and v2 are both n-dimensional vectors (NumPy arrays of equal length).
    """
    return np.sqrt(np.sum(np.power(v1 - v2, 2)))
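As a quick sanity check of the formula, the sketch below evaluates `distance` on a 3-4-5 right triangle and compares it with `np.linalg.norm`, which computes the same quantity for 1-D arrays. The sample vectors `a` and `b` are made up for illustration.

```python
import numpy as np


def distance(v1, v2):
    """Euclidean distance between two equal-length 1-D arrays."""
    return np.sqrt(np.sum(np.power(v1 - v2, 2)))


a = np.array([0.0, 0.0])
b = np.array([3.0, 4.0])

# sqrt(3^2 + 4^2) = 5, so both lines print 5.0.
print(distance(a, b))
print(np.linalg.norm(a - b))
```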
#include <iostream>
#include <ext/hash_map>
using namespace std;
using namespace __gnu_cxx;

template <class K, class T>
struct Node{
    K key;
    T data;
    Node *prev, *next;