@amferraz
amferraz / iso_to_utf8.sh
Created June 5, 2014 19:29
Change the encoding of all files to UTF-8.
for f in $(find . -not -iwholename '*.git*' -type f -exec file --mime-encoding {} \; | grep -Ev "(utf-8|ascii|binary)" | awk '{print $1}' | tr -d ":"); do iconv -f iso-8859-1 -t utf-8 "$f" > "$f.utf8" && mv -v "$f.utf8" "$f"; done
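The same conversion can be sketched in Python. This is a minimal sketch, not the gist's method: the function name `convert_to_utf8` is hypothetical, and it assumes, as the shell loop does, that any file that is not already valid UTF-8 is encoded in ISO-8859-1.

```python
from pathlib import Path


def convert_to_utf8(path, source_encoding="iso-8859-1"):
    """Rewrite `path` as UTF-8; return True if the file was changed.

    Hypothetical helper: assumes any file that is not already valid
    UTF-8 is encoded in `source_encoding`.
    """
    path = Path(path)
    raw = path.read_bytes()
    try:
        raw.decode("utf-8")  # ASCII and UTF-8 files decode cleanly
        return False  # already UTF-8, nothing to do
    except UnicodeDecodeError:
        pass
    text = raw.decode(source_encoding)
    path.write_bytes(text.encode("utf-8"))
    return True
```

Unlike `file --mime-encoding`, this sketch does not detect binary files, so it should only be pointed at text files.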
amferraz / Main.java
Created May 8, 2014 12:09
A sample app that converts a file to HTML with Apache Tika
package jusbrasil.test_tika;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.sax.SAXTransformerFactory;
>>> from pyne2014 import girls
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name girls
>>>
# install via MacPorts
sudo port install openssl
sudo env ARCHFLAGS="-arch x86_64" LDFLAGS="-L/opt/local/lib" CFLAGS="-I/opt/local/include" pip install cryptography
# install via Homebrew
brew install openssl
env ARCHFLAGS="-arch x86_64" LDFLAGS="-L/usr/local/opt/openssl/lib" CFLAGS="-I/usr/local/opt/openssl/include" pip install cryptography
amferraz / print_ics_file.py
Created April 22, 2014 16:03
Reads an ICS file and prints the events
# coding: utf-8
# File 'BrazilHolidays.ics' can be obtained here:
# http://www.mozilla.org/en-US/projects/calendar/holidays/
from icalendar import Calendar, Event
g = open('BrazilHolidays.ics','rb')
gcal = Calendar.from_ical(g.read())
for component in gcal.walk():
RBD
Victor e Leo
One Direction
Onze:20
Belo
Justin Bieber
Lorde
Guns N' Roses
Miley Cyrus
Gospel
amferraz / artists_without_a.sh
Created April 19, 2014 17:46
This is a toy script I made to find artists without the letter "A" in their names. The regex might need some improvement.
#!/bin/bash
# This script prints the artists in the top 1000 on letras.mus.br
# whose names do not contain the letter 'a'.
# Requires package html-xml-utils
curl http://letras.mus.br/top-artistas/ | \
hxnormalize -l 240 -x | \
hxselect -s '\n' -c "#cnt_top > div.left > div > div > ol > li > a > span" | \
grep -vi '[aáàãä]'
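The description notes that the regex might need improvement: the character class `[aáàãä]` misses other accented forms such as `â`. One possible alternative, sketched here in Python rather than shell, is to strip diacritics before testing for the letter. The function names are hypothetical and the sample names are for illustration only.

```python
import unicodedata


def strip_accents(text):
    """Return `text` with combining diacritics removed (e.g. 'â' -> 'a')."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def names_without_a(names):
    """Keep only the names that contain no 'a', accented or not."""
    return [n for n in names if "a" not in strip_accents(n).lower()]


# Illustrative input only; not scraped from the site.
sample = ["Guns N' Roses", "João", "Ângela", "Justin Bieber"]
print(names_without_a(sample))
```

Normalizing to NFD splits each accented letter into a base letter plus combining marks, so a single test for plain `a` covers every accented variant without listing them by hand.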
amferraz / main.py
Last active December 25, 2015 18:59
A simple template scraper
# coding: utf-8
import requests
from lxml import html
home = requests.get('http://www.submarino.com.br/')
home_tree = html.fromstring(home.text)
products_links_xpath = '//*[@id="tab"]/div/ul/li/div/a/@href'
products_links = home_tree.xpath(products_links_xpath)
Function                                   Shortcut
previous tab                               ⌘ + left arrow
next tab                                   ⌘ + right arrow
go to tab                                  ⌘ + number
go to window                               ⌘ + Option + number
go to split pane by direction              ⌘ + Option + arrow
go to split pane by order of use           ⌘ + ] , ⌘ + [
split window horizontally (same profile)   ⌘ + Shift + D
split window vertically (same profile)     ⌘ + D
amferraz / main.py
Last active December 23, 2015 22:29
An example of ItemPipeline with FollowAllSpider
from twisted.internet import reactor
from scrapy.crawler import Crawler
from scrapy.settings import Settings
from scrapy import signals
from testspiders.spiders.followall import FollowAllSpider
class MyPipeline(object):
    def process_item(self, item, spider):