Javier Alba fjavieralba

import logging
import logging.handlers
import sys
if len(sys.argv) < 2:
    # The script needs exactly one positional argument: <NAME>
    print("ERROR: usage: syslog_generator.py <NAME>")
    sys.exit(1)
my_logger = logging.getLogger('MyLogger')
my_logger.setLevel(logging.DEBUG)
@fjavieralba
fjavieralba / python_resources.md
Created June 9, 2014 08:50 — forked from jookyboi/python_resources.md
Python-related modules and guides.

Packages

  • lxml - Pythonic binding for the C libraries libxml2 and libxslt.
  • boto - Python interface to Amazon Web Services
  • Django - High-level Python Web framework that encourages rapid development and clean, pragmatic design.
  • Fabric - Library and command-line tool for streamlining the use of SSH for application deployment or systems administration tasks.
  • PyMongo - Tools for working with MongoDB; the recommended way to work with MongoDB from Python.
  • Celery - Task queue to distribute work across threads or machines.
  • pytz - Brings the Olson tz database into Python, allowing accurate, cross-platform timezone calculations on Python 2.4 or higher.

Guides

@fjavieralba
fjavieralba / 0_reuse_code.js
Created June 9, 2014 08:49
Here are some things you can do with Gists in GistBox.
// Use Gists to store code you would like to remember later on
console.log(window); // log the "window" object to the console
@fjavieralba
fjavieralba / KafkaLocal.java
Last active March 23, 2021 09:57
Embedding Kafka+Zookeeper for testing purposes. Tested with Apache Kafka 0.8
import java.io.IOException;
import java.util.Properties;
import kafka.server.KafkaConfig;
import kafka.server.KafkaServerStartable;
public class KafkaLocal {
    public KafkaServerStartable kafka;
    public ZooKeeperLocal zookeeper;
@fjavieralba
fjavieralba / gist:4633857
Created January 25, 2013 11:50
A simple but maybe useful distance definition for texts
from difflib import SequenceMatcher
def distance(url1, url2):
    ratio = SequenceMatcher(None, url1, url2).ratio()
    return 1.0 - ratio
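A short usage sketch of the distance above. The sample strings and the parameter names text1 and text2 are made up for illustration; the function body is the same as in the gist. Identical strings give 0.0, strings with no characters in common give 1.0, and near-duplicates fall close to 0.

```python
from difflib import SequenceMatcher

def distance(text1, text2):
    # 1.0 minus difflib's similarity ratio, so smaller means more alike
    ratio = SequenceMatcher(None, text1, text2).ratio()
    return 1.0 - ratio

# Hypothetical inputs, chosen only to show the range of the measure
same = distance("http://example.com/a", "http://example.com/a")
disjoint = distance("abc", "xyz")
close = distance("http://example.com/a", "http://example.com/b")
print(same, disjoint, close)
```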
@fjavieralba
fjavieralba / basic_sentiment_score.py
Created October 28, 2012 20:36
Basic measure of sentiment score of a tagged text
def value_of(sentiment):
    if sentiment == 'positive': return 1
    if sentiment == 'negative': return -1
    return 0

def sentiment_score(review):
    # Score the review passed in, not a global variable
    return sum([value_of(tag) for sentence in review for token in sentence for tag in token[2]])
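A minimal worked example of the score. It assumes each token is a 3-tuple whose third element is a list of tags, which is what the token[2] lookup suggests; the (word, lemma, tags) layout and the sample data are assumptions, not taken from the gist.

```python
def value_of(sentiment):
    if sentiment == 'positive': return 1
    if sentiment == 'negative': return -1
    return 0

def sentiment_score(tagged_sentences):
    # Sum +1 / -1 over every tag of every token; other tags count as 0
    return sum(value_of(tag)
               for sentence in tagged_sentences
               for token in sentence
               for tag in token[2])

# Hypothetical tagged review: (word, lemma, [tags]) per token
tagged = [
    [('staff', 'staff', ['NN']), ('nice', 'nice', ['positive', 'JJ'])],
    [('uninspired', 'uninspired', ['negative', 'JJ']),
     ('expensive', 'expensive', ['negative', 'JJ'])],
]
score = sentiment_score(tagged)
print(score)
```

One positive and two negative tags give a net score below zero; part-of-speech tags such as 'NN' and 'JJ' contribute nothing.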
@fjavieralba
fjavieralba / dictionary_tagger.py
Last active September 6, 2018 09:50
Python class for tagging text with dictionaries
import yaml

class DictionaryTagger(object):
    def __init__(self, dictionary_paths):
        dictionaries = []
        for path in dictionary_paths:
            # 'with' closes each file even if parsing fails
            with open(path, 'r') as dict_file:
                dictionaries.append(yaml.safe_load(dict_file))
        self.dictionary = {}
        self.max_key_size = 0
        for curr_dict in dictionaries:
            for key in curr_dict:
                if key in self.dictionary:
@fjavieralba
fjavieralba / preprocessing_text.py
Last active June 21, 2017 06:15
Simple wrapper classes for Splitting and POS-Tagging text using NLTK
text = """What can I say about this place. The staff of the restaurant is nice and the eggplant is not bad. Apart from that, very uninspired food, lack of atmosphere and too expensive. I am a staunch vegetarian and was sorely dissapointed with the veggie options on the menu. Will be the last time I visit, I recommend others to avoid."""
splitter = Splitter()
postagger = POSTagger()
splitted_sentences = splitter.split(text)
print(splitted_sentences)
[['What', 'can', 'I', 'say', 'about', 'this', 'place', '.'], ['The', 'staff', 'of', 'the', 'restaurant', 'is', 'nice', 'and', 'eggplant', 'is', 'not', 'bad', '.'], ['apart', 'from', 'that', ',', 'very', 'uninspired', 'food', ',', 'lack', 'of', 'atmosphere', 'and', 'too', 'expensive', '.'], ['I', 'am', 'a', 'staunch', 'vegetarian', 'and', 'was', 'sorely', 'dissapointed', 'with', 'the', 'veggie', 'options', 'on', 'the', 'menu', '.'], ['Will', 'be', 'the', 'last', 'time', 'I', 'visit', ',', 'I', 'recommend', 'others', 'to', 'avoid', '.']]
@fjavieralba
fjavieralba / NonOverlappingTagging.java
Created March 23, 2012 10:52
[JAVA] Non overlapping tagging of a sentence based on a dictionary of expressions
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
/*
Result is only one tagging of all the possible ones.
The resulting tagging is determined by these two priority rules:
- longest matches have higher priority
- search is made from left to right
*/
@fjavieralba
fjavieralba / non_overlapping_tagging.py
Created March 23, 2012 10:51
[PYTHON] Non overlapping tagging of a sentence based on a dictionary of expressions
def non_overlapping_tagging(sentence, dict, max_key_size):
    """
    Result is only one tagging of all the possible ones.
    The resulting tagging is determined by these two priority rules:
    - longest matches have higher priority
    - search is made from left to right
    """
    tag_sentence = []
    N = len(sentence)
    if max_key_size == -1:
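The preview above is cut off, so here is a minimal, independent sketch of the two priority rules from the docstring (longest match first, scanning left to right). It is not the gist's code: the name tag_longest_first, the use of None instead of -1 as the "no limit" value, the space-joined dictionary keys, and the (text, tag) output pairs are all assumptions made for illustration.

```python
def tag_longest_first(tokens, expressions, max_key_size=None):
    """Greedy left-to-right tagging; at each position the longest match wins."""
    if max_key_size is None:
        max_key_size = len(tokens)
    tagged = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first, shrinking down to one token
        longest = min(max_key_size, len(tokens) - i)
        for size in range(longest, 0, -1):
            candidate = ' '.join(tokens[i:i + size])
            if candidate in expressions:
                tagged.append((candidate, expressions[candidate]))
                i += size  # consume the whole match so taggings never overlap
                break
        else:
            # No expression starts here: keep the token untagged and move on
            tagged.append((tokens[i], None))
            i += 1
    return tagged

# Hypothetical dictionary: 'not bad' must win over the shorter 'bad'
expressions = {'not bad': 'positive', 'bad': 'negative', 'nice': 'positive'}
result = tag_longest_first(['nice', 'and', 'not', 'bad'], expressions)
print(result)
```

Because the two-token key 'not bad' is tried before 'bad', the shorter negative entry never fires on that span.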