Helder Geovane Gomes de Lima he7d3r

@turicas
turicas / requirements.txt
Created November 10, 2013 00:50
L² Hackathon WikiMedia
requests
@xadhix-zz
xadhix-zz / Facebook Lookback Downloader
Created February 4, 2014 12:18
Extracts the HD video link from the Facebook lookback page.
var xLBD = {};
xLBD.c = function () {
    // The player's "flashvars" attribute holds escaped JSON after a 7-character prefix.
    xLBD.f = unescape(document.querySelector("[flashvars]").getAttribute("flashvars")).substring(7);
    // Drop anything after the last closing brace, then read the HD source URL.
    xLBD.f = JSON.parse(xLBD.f.substring(0, xLBD.f.lastIndexOf("}") + 1)).video_data[0].hd_src;
    xLBD.a = "<div style='position:absolute;top:100px;height:300px;left:15%;background:#fff;border:10px solid #000;font-size:5em;padding:100px;'>Click <a download='lookback.mp4' href='"+xLBD.f+"'>here<\/a> to download your lookBack video.</div>";
    document.body.innerHTML += xLBD.a;
};
// Run immediately if the page has already loaded; otherwise wait for the load event.
if (document.readyState == "complete") {
    xLBD.c();
} else {
    window.onload = xLBD.c;
}
$ python demonstrate_extractor.py
Extracting features for http://en.wikipedia.org/wiki/?oldid=626489778&diff=prev
<added_badwords_ratio>: 211.95999999999998
<added_misspellings_ratio>: 1.4638121546961327
<badwords_added>: 3
<bytes_changed>: 133
<chars_added>: 145
<day_of_week_in_utc>: 6
<hour_of_day_in_utc>: 15
<is_custom_comment>: True
#!/usr/bin/python
# -*- coding: utf-8 -*-
"""
@ Author: [[Usuário:Danilo.mac]]
@ License: GNU General Public License 3.0 (GPL V3) and Creative Commons Attribution/Share-Alike (CC-BY-SA)
Description: Script for finding references in the page-history dump of the Portuguese-language Wikipedia.
"""
>>> import revscores
>>> dir(revscores)
['__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__']
>>> from revscores import languages
>>> dir(languages)
['Language', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', 'english', 'language', 'portuguese']
Notice that the first dir() call doesn't list languages. This is because languages is not imported by default when you import revscores.
But when we run dir() on languages, we can see "english", "portuguese" and "language". This is because these modules are imported by default when languages is imported.
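The behaviour described above can be reproduced with a throwaway package. The sketch below is an illustration only, not revscores itself: demo_pkg and its modules are hypothetical names created in a temporary directory, and it assumes that the submodules show up because languages/__init__.py imports them.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root):
    """Create a hypothetical package layout: demo_pkg/languages/{english,portuguese}.py."""
    pkg = Path(root) / "demo_pkg"
    langs = pkg / "languages"
    langs.mkdir(parents=True)
    # The top-level __init__.py is empty, so it imports nothing by default.
    (pkg / "__init__.py").write_text("")
    # The subpackage's __init__.py imports its submodules explicitly.
    (langs / "__init__.py").write_text("from . import english, portuguese\n")
    (langs / "english.py").write_text("NAME = 'english'\n")
    (langs / "portuguese.py").write_text("NAME = 'portuguese'\n")


with tempfile.TemporaryDirectory() as root:
    build_demo_package(root)
    sys.path.insert(0, root)
    importlib.invalidate_caches()
    try:
        import demo_pkg

        # Nothing has imported the subpackage yet, so dir() does not list it.
        before = "languages" in dir(demo_pkg)

        from demo_pkg import languages

        # Importing the subpackage binds it as an attribute of its parent.
        after = "languages" in dir(demo_pkg)

        # The submodules are visible because languages/__init__.py imported them.
        submodules = sorted(n for n in dir(languages) if not n.startswith("_"))
    finally:
        sys.path.remove(root)

print(before, after, submodules)
```

If the import line in languages/__init__.py were removed, english and portuguese would stay out of dir(languages) until something imported them explicitly.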
@atcuno
atcuno / gist:3425484ac5cce5298932
Last active March 25, 2024 13:55
HowTo: Privacy & Security Conscious Browsing

The purpose of this document is to make recommendations on how to browse in a privacy- and security-conscious manner. This information is compiled from a number of sources, which are referenced throughout the document, as well as from my own experience with the described technologies.

I welcome contributions and comments on the information contained here. Please see the How to Contribute section for details on contributing your own knowledge.


@halfak
halfak / create_virtualenv.md
Last active January 17, 2023 22:50
Setting up a python 3.x Virtual Environment

Step 0: Set up python virtualenv

virtualenv is a command-line utility that lets you encapsulate a Python environment. Ubuntu calls the package that installs this utility "python-virtualenv". You can install it with $ sudo apt-get install python-virtualenv.
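As a point of comparison, Python 3.3 and later ship a standard-library module, venv, that creates a similar isolated environment without an extra package. The sketch below uses venv rather than the virtualenv tool installed above, so treat it as an alternative and not as this guide's method; demo_env is a placeholder name, and with_pip=False is chosen here so the example does not depend on ensurepip being available.

```python
import sys
import tempfile
import venv
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    # demo_env is a placeholder directory name for the new environment.
    env_dir = Path(root) / "demo_env"

    # Create the environment without bootstrapping pip into it.
    venv.create(env_dir, with_pip=False)

    # Every venv-created environment gets a pyvenv.cfg at its root.
    has_config = (env_dir / "pyvenv.cfg").is_file()

    # Executables live in "bin" on POSIX systems and "Scripts" on Windows.
    bin_dir = env_dir / ("Scripts" if sys.platform == "win32" else "bin")
    has_bin = bin_dir.is_dir()

print(has_config, has_bin)
```

The temporary directory is deleted when the with block exits, so nothing is left behind; for a real project you would pass a permanent path instead.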

Step 1: Create the virtualenv directory

In this sequence, I'm going to assume that Python 3.5 is the installed version.

$ cd ~
$ python
Python 3.4.0 (default, Apr 11 2014, 13:05:11)
[GCC 4.8.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from revscoring.languages import english
>>> english.is_badword("foobar")
False
>>> english.is_badword("shitty")
True

Step 1: Make a project directory

You'll want to keep all of your local repos in the same folder. I like to use ~/projects/ for this, but others like ~/workspace/. Choose a name you like typing.

$ cd ~
$ mkdir projects/

Step 2: Get the repos

@Ladsgroup
Ladsgroup / Cluster.py
Created August 27, 2015 18:59
Clustering reverted edits in Wikipedia
import codecs
import math
import sklearn.cluster
import matplotlib.pyplot as plt
x = set()
c = 0
path = '/home/amir/Downloads/featuresetsforclustering/ptwiki.features_reverted.20k.tsv'
with codecs.open(path, 'r', 'utf-8') as f:
    for line in f: