César de Pablo zdepablo

zdepablo /
Created Dec 22, 2017 — forked from neubig/
A small sequence labeler in DyNet
DyNet implementation of a sequence labeler (POS tagger).
This is a translation of this tagger in PyTorch:
Basic architecture:
- take words
- run through a bidirectional GRU
- predict labels one word at a time (left to right), using a recurrent neural network "decoder"
The decoder updates hidden state based on:
- most recent word
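The architecture outlined above can be sketched without DyNet. What follows is a minimal pure-Python illustration, not the gist's code: the weights are random and untrained, the GRU cell omits biases, and all names (`GRUCell`, `run`, `tag`) and dimensions are hypothetical. Because the list of decoder inputs is cut off in this preview, feeding the previous predicted label back into the decoder is an assumption.

```python
import math
import random

def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector (list of floats).
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class GRUCell:
    """Bias-free GRU cell; each weight matrix acts on [input, state]."""
    def __init__(self, in_dim, hid_dim, rng):
        self.hid_dim = hid_dim
        self.W_z = rand_matrix(hid_dim, in_dim + hid_dim, rng)  # update gate
        self.W_r = rand_matrix(hid_dim, in_dim + hid_dim, rng)  # reset gate
        self.W_h = rand_matrix(hid_dim, in_dim + hid_dim, rng)  # candidate state

    def step(self, x, h):
        z = [sigmoid(a) for a in matvec(self.W_z, x + h)]
        r = [sigmoid(a) for a in matvec(self.W_r, x + h)]
        gated = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a) for a in matvec(self.W_h, x + gated)]
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

def run(cell, inputs):
    # Run a cell over a sequence and collect every hidden state.
    h, states = [0.0] * cell.hid_dim, []
    for x in inputs:
        h = cell.step(x, h)
        states.append(h)
    return states

def tag(word_ids, n_words=20, n_labels=4, emb=6, hid=8, seed=0):
    rng = random.Random(seed)
    word_emb = rand_matrix(n_words, emb, rng)
    label_emb = rand_matrix(n_labels + 1, emb, rng)  # extra row: start symbol
    fwd, bwd = GRUCell(emb, hid, rng), GRUCell(emb, hid, rng)
    dec = GRUCell(2 * hid + emb, hid, rng)
    W_out = rand_matrix(n_labels, hid, rng)

    # Bidirectional encoder: one pass left to right, one right to left.
    xs = [word_emb[w] for w in word_ids]
    f = run(fwd, xs)
    b = run(bwd, xs[::-1])[::-1]
    enc = [fi + bi for fi, bi in zip(f, b)]  # concatenate both directions

    # Greedy decoder: one label per word, left to right.
    # Assumption: the previous label is fed back alongside the current word.
    state, prev, labels = [0.0] * hid, n_labels, []
    for e in enc:
        state = dec.step(e + label_emb[prev], state)
        scores = matvec(W_out, state)
        prev = max(range(n_labels), key=scores.__getitem__)
        labels.append(prev)
    return labels
```

Calling `tag([3, 1, 4, 1, 5])` returns one label index per input word. With untrained weights the labels are meaningless; the sketch only shows the data flow.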
zdepablo / install_submodules.R
Last active Dec 5, 2017
R: how to install a package from git, including its submodules
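install_submodule_git <- function(x, ...) {
  # clone the repository, including its submodules, into a temporary directory
  install_dir <- tempfile()
  system(paste("git clone --recursive", shQuote(x), shQuote(install_dir)))
  # install the package from the local clone
  devtools::install(install_dir, ...)
}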
View gist:daf71447c82391c1b4311ffcceec2ebe
# java -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=12605 Main # Name of .class program
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/pr/cloudera/parcels/CDH-5.5.2-1.cdh5.5.2.p0.4/lib/hadoop/lib/native
java -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=12611 -cp target/da_record_linkage-0.0.1-SNAPSHOT-jar-with-dependencies.jar da_record_linkage.TestSnappy
netstat -plten | grep LISTEN | grep :120* # See if there is any open port
zdepablo / split_strat_scale.r
Last active Aug 29, 2015 — forked from multidis/split_strat_scale.r
Stratified sampling: training / test data split preserving class distribution (caret functions) and scaling (standardize) the data. Stratified folds for CV.
## select training indices preserving class distribution
in.train <- createDataPartition(yclass, p=0.8, list=FALSE)
ytra <- yclass[in.train]; summary(factor(ytra))
ytst <- yclass[-in.train]; summary(factor(ytst))
## standardize features: reuse the training set's scaling parameters for the test part
Xtra <- scale(X[in.train,])
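The same idea can be sketched in plain Python: split indices class by class so both parts keep the class proportions, then compute the scaling parameters on the training rows only and reuse them for the test rows. This is an illustrative stand-in, not caret's `createDataPartition` algorithm; the function names and the toy data are hypothetical.

```python
import random
import statistics
from collections import defaultdict

def stratified_split(labels, p=0.8, seed=0):
    """Return (train_idx, test_idx), sampling a fraction p within each class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train = []
    for idx in by_class.values():
        rng.shuffle(idx)
        k = max(1, round(p * len(idx)))  # keep at least one row per class
        train.extend(idx[:k])
    train_set = set(train)
    test = [i for i in range(len(labels)) if i not in train_set]
    return sorted(train), test

def fit_scaler(rows):
    """Column means and sample standard deviations of the training rows."""
    cols = list(zip(*rows))
    means = [statistics.mean(c) for c in cols]
    sds = [statistics.stdev(c) or 1.0 for c in cols]  # guard constant columns
    return means, sds

def apply_scaler(rows, means, sds):
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]

# Toy data: 10 rows of class "a", 5 rows of class "b".
y = ["a"] * 10 + ["b"] * 5
X = [[float(i), float(i % 3)] for i in range(15)]

train_idx, test_idx = stratified_split(y)
means, sds = fit_scaler([X[i] for i in train_idx])
X_train = apply_scaler([X[i] for i in train_idx], means, sds)
X_test = apply_scaler([X[i] for i in test_idx], means, sds)  # training parameters
```

Scaling the test rows with the training means and deviations keeps information about the test set out of the preprocessing step.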
zdepablo / hive-receipts
Last active Aug 29, 2015
Hive recipes
# Overwrite a non-partitioned table with its own contents
CREATE table xx_COPY LIKE xx;
# Overwrite a partitioned table with its own contents
CREATE table xx_COPY LIKE xx;
zdepablo / hadoop-fs-receipts
Last active Aug 29, 2015
Quick recipes for the Hadoop filesystem
# Reference:
# Show disk usage in human format
hadoop fs -du -s -h /user/hive/warehouse/da_cdepablo*
# Show permissions
hadoop fs -getfacl /user/hive/warehouse/da_cdepablo*
# Change permissions
hadoop fs -setfacl -R -m other::rwx /user/hive/warehouse/da_cdepablo
zdepablo / gist:3587a6755b080b85136c
Last active Aug 29, 2015
textalytics-queries per use
#Number of active users per service - with a cutoff
SELECT `service`, COUNT(*) num_users
SELECT `service`, `hash_key`, COUNT(*) num_requests
FROM `log`
WHERE `date_operation` > '2014-12-01'
GROUP BY `service`, `hash_key`
ORDER BY num_requests DESC
zdepablo / 0_reuse_code.js
Last active Aug 29, 2015
Here are some things you can do with Gists in GistBox.
// Use Gists to store code you would like to remember later on
console.log(window); // log the "window" object to the console
zdepablo /
Last active Aug 29, 2015
Extract UEFA football team rankings from an HTML table
# -*- coding: utf-8 -*-
from lxml import html,etree
import requests
import unicodecsv
def group(iterator, count):
    itr = iter(iterator)
    while True:
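The preview cuts off inside `group`, so its body is unknown. One plausible reading, and it is only a guess, is that it batches a flat stream of table cells into fixed-size rows. A minimal sketch of such a helper, under the hypothetical name `chunked`:

```python
from itertools import islice

def chunked(iterable, count):
    """Yield successive tuples of up to `count` items from `iterable`."""
    itr = iter(iterable)
    while True:
        chunk = tuple(islice(itr, count))
        if not chunk:
            return  # input exhausted
        yield chunk

rows = list(chunked(range(7), 3))
```

The last tuple is shorter when the input length is not a multiple of `count`.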
# Credit
for branch in `git branch -r | grep -v HEAD`;do echo -e `git show --format="%ci %cr" $branch | head -n 1` \\t$branch; done | sort -r