/*******************************************************************************
 * Copyright (c) 2000, 2009 IBM Corporation and others.
 * All rights reserved. This program and the accompanying materials
 * are made available under the terms of the Eclipse Public License v1.0
 * which accompanies this distribution, and is available at
 * http://www.eclipse.org/legal/epl-v10.html
 *
 * Contributors:
 *     IBM Corporation - initial API and implementation
 *******************************************************************************/
import re
#italian
ital_text = re.sub('[^a-z]+', ' ', open('italian.txt').read().lower()).strip() + ' '
ital = {}
for i in xrange(len(ital_text) - 1):
    bigram = ital_text[i : i + 2]
    if bigram in ital:
        continue
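The listing above is cut off after `continue`, so what it goes on to do with each bigram is not shown. As a rough illustration of the same normalise-then-slide-a-window idea, here is a minimal sketch that counts how often each letter bigram occurs. The helper name `bigram_counts`, the use of `collections.Counter`, and the sample string are assumptions made for this example and are not taken from the original script.

```python
import re
from collections import Counter


def bigram_counts(text):
    """Count letter bigrams in text (hypothetical helper, not from the original).

    The text is normalised the same way as in the listing above: lower-cased,
    every run of non a-z characters collapsed to one space, and a single
    trailing space appended so the last letter also forms a bigram.
    """
    cleaned = re.sub('[^a-z]+', ' ', text.lower()).strip() + ' '
    # Slide a window of width 2 over the cleaned text.
    return Counter(cleaned[i:i + 2] for i in range(len(cleaned) - 1))


# Small in-memory sample instead of reading italian.txt.
counts = bigram_counts("Ciao, ciao!")
```

Counting rather than skipping repeated bigrams keeps the frequency information, which is what a language profile built from bigrams would normally need; whether the original script does this cannot be told from the truncated listing.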
#!/usr/bin/env python2
"""\
Compute the probability of features 1 through 16.
Each feature represents one pixel in a 4 x 4 bitmap.
Only the probability that a pixel is black is modelled. This simplifies
the calculations, because the probability of white is the probability
of not black: P(white) = 1 - P(black).
"""
import math
import numpy
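The docstring describes the model only in words and the rest of the script is not shown. The sketch below is one possible reading of it, under stated assumptions: each bitmap is a flat sequence of 16 values with 1 for black and 0 for white, the per-pixel probability of black is estimated by counting with Laplace smoothing, and a bitmap is scored by summing log probabilities. The function names, the smoothing constant `alpha`, and the sample data are all hypothetical. It uses plain lists and `math` rather than `numpy` so that it runs without extra dependencies.

```python
import math


def pixel_black_probs(bitmaps, alpha=1.0):
    """Estimate P(pixel i is black) for each of the 16 pixels.

    bitmaps: list of 16-element sequences, 1 = black, 0 = white.
    alpha:   Laplace smoothing constant (an assumption, not from the original).
    """
    n = len(bitmaps)
    counts = [0] * 16
    for bitmap in bitmaps:
        for i, pixel in enumerate(bitmap):
            counts[i] += pixel
    # Two outcomes per pixel (black, white), hence 2 * alpha in the denominator.
    return [(c + alpha) / (n + 2.0 * alpha) for c in counts]


def log_likelihood(bitmap, probs):
    """Sum of log P(pixel) over all pixels, using P(white) = 1 - P(black)."""
    total = 0.0
    for pixel, p in zip(bitmap, probs):
        total += math.log(p if pixel else 1.0 - p)
    return total


# Two made-up 4 x 4 bitmaps, flattened row by row.
samples = [
    [1, 1, 1, 1,  0, 0, 0, 0,  0, 0, 0, 0,  0, 0, 0, 0],
    [1, 1, 1, 1,  0, 0, 0, 0,  0, 0, 0, 0,  1, 0, 0, 0],
]
probs = pixel_black_probs(samples)
score = log_likelihood(samples[0], probs)
```

Storing only P(black) per pixel halves the bookkeeping, and the smoothing keeps every probability strictly between 0 and 1 so that `math.log` never receives zero.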