César de Pablo zdepablo
View gist:5594c47bed63cabb225e93679e62408a
{"nbformat": 4, "metadata": {"kernelspec": {"name": "ir", "language": "R", "display_name": "R"}, "language_info": {"version": "3.4.2", "name": "R", "file_extension": ".r", "mimetype": "text/x-r-source", "pygments_lexer": "r", "codemirror_mode": "r"}}, "nbformat_minor": 1, "cells": [{"metadata": {"_uuid": "c1204e7335dd129c2c7d6e4d07ffe0d2bfea4574", "_cell_guid": "59426159-4fac-4472-baf4-2e9ddc973d48"}, "execution_count": null, "source": ["## WIDS 2018 DATATHON ##\n", "\n", "# Initialization\n", "{\n", " gc()\n", " cat(\"\\014\")\n", " rm(list = setdiff(ls(), c()))\n", "\n", " packages = function(x) {\n", " x = as.character(match.call()[[2]])\n", " if (!require(x,character.only = TRUE)){\n", " install.packages(pkgs = x, repos = \"http://cran.r-project.org\", dependencies = T, quiet = T)\n", " require(x, character.only = TRUE)\n", " }\n", " }\n", "\n", " suppressMessages(\n", " {\n", " packages(\"data.table\")\n", " packages(\"dplyr\")\n", " packages(\"xgboost\")\n", "
@zdepablo
zdepablo / dynet-tagger.py
Created Dec 22, 2017 — forked from neubig/dynet-tagger.py
A small sequence labeler in DyNet
View dynet-tagger.py
"""
DyNet implementation of a sequence labeler (POS tagger).
This is a translation of this tagger in PyTorch: https://gist.github.com/hal3/8c170c4400576eb8d0a8bd94ab231232
Basic architecture:
- take words
- run through bidirectional GRU
- predict labels one word at a time (left to right), using a recurrent neural network "decoder"
The decoder updates hidden state based on:
- most recent word
@zdepablo
zdepablo / install_submodules.R
Last active Dec 5, 2017
R: how to install a package whose repository uses git submodules
View install_submodules.R
install_submodule_git <- function(x, ...) {
  install_dir <- tempfile()
  system(paste("git clone --recursive", shQuote(x), shQuote(install_dir)))
  devtools::install(install_dir, ...)
}
install_submodule_git("https://github.com/jonkeane/mocapGrip")
View gist:daf71447c82391c1b4311ffcceec2ebe
# java -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=12605 Main # Name of .class program
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/pr/cloudera/parcels/CDH-5.5.2-1.cdh5.5.2.p0.4/lib/hadoop/lib/native
java -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=12611 -cp target/da_record_linkage-0.0.1-SNAPSHOT-jar-with-dependencies.jar da_record_linkage.TestSnappy
netstat -plten | grep LISTEN | grep :120* # See if there is any open port
@zdepablo
zdepablo / split_strat_scale.r
Last active Aug 29, 2015 — forked from multidis/split_strat_scale.r
Stratified sampling: training / test data split preserving class distribution (caret functions) and scaling (standardize) the data. Stratified folds for CV.
View split_strat_scale.r
library(caret)
## select training indices preserving class distribution
in.train <- createDataPartition(yclass, p=0.8, list=FALSE)
summary(factor(yclass))
ytra <- yclass[in.train]; summary(factor(ytra))
ytst <- yclass[-in.train]; summary(factor(ytst))
## standardize features: training parameters of scaling for test-part
Xtra <- scale(X[in.train,])
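The same idea can be sketched in Python using only the standard library: sample a fraction of indices within each class, then standardize the test rows with the mean and standard deviation learned from the training rows. This is an illustrative sketch, not caret's `createDataPartition` algorithm; the function names `stratified_split`, `fit_scaler`, and `apply_scaler` and the toy data are hypothetical.

```python
import random
from collections import defaultdict
from statistics import mean, stdev


def stratified_split(labels, p=0.8, seed=0):
    """Return (train_idx, test_idx), sampling a fraction p within each class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train = []
    for idx in by_class.values():
        rng.shuffle(idx)
        k = max(1, round(p * len(idx)))  # keep at least one row per class
        train.extend(idx[:k])
    train_set = set(train)
    test = [i for i in range(len(labels)) if i not in train_set]
    return sorted(train), test


def fit_scaler(rows):
    """Column means and sample standard deviations of the training rows."""
    cols = list(zip(*rows))
    return [mean(c) for c in cols], [stdev(c) for c in cols]


def apply_scaler(rows, means, sds):
    """Standardize rows with previously fitted parameters (assumes sd > 0)."""
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


# Toy data (assumption): 10 rows of class "a", 5 rows of class "b".
labels = ["a"] * 10 + ["b"] * 5
X = [[float(i), float(i * i)] for i in range(15)]

train_idx, test_idx = stratified_split(labels, p=0.8, seed=42)
means, sds = fit_scaler([X[i] for i in train_idx])
X_train = apply_scaler([X[i] for i in train_idx], means, sds)
X_test = apply_scaler([X[i] for i in test_idx], means, sds)  # training parameters reused
```

Fitting the scaler on the training rows only keeps information from the test rows out of the preprocessing step.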
@zdepablo
zdepablo / hive-receipts
Last active Aug 29, 2015
Hive recipes
View hive-receipts
-- Overwrite a non-partitioned table with its own contents
CREATE TABLE xx_COPY LIKE xx;
INSERT OVERWRITE TABLE xx
SELECT * FROM xx;
-- Overwrite a partitioned table with its own contents
CREATE TABLE xx_COPY LIKE xx;
SHOW PARTITIONS ABC;
@zdepablo
zdepablo / hadoop-fs-receipts
Last active Aug 29, 2015
Quick recipes for the Hadoop filesystem
View hadoop-fs-receipts
# Reference: http://hadoop.apache.org/docs/r2.7.0/hadoop-project-dist/hadoop-common/FileSystemShell.html
# Show disk usage in human-readable format
hadoop fs -du -s -h /user/hive/warehouse/da_cdepablo*
# Show permissions
hadoop fs -getfacl /user/hive/warehouse/da_cdepablo*
# Change permissions
hadoop fs -setfacl -R -m other::rwx /user/hive/warehouse/da_cdepablo
@zdepablo
zdepablo / gist:3587a6755b080b85136c
Last active Aug 29, 2015
textalytics-queries per use
View gist:3587a6755b080b85136c
#Number of active users per service - with a cutoff
SELECT `service`, COUNT(*) num_users
FROM
(
SELECT `service`, `hash_key`, COUNT(*) num_requests
FROM `log`
WHERE `date_operation` > '2014-12-01'
GROUP BY `service`, `hash_key`
ORDER BY num_requests DESC
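The preview above stops before the cutoff is applied. One way to express "active users per service, with a cutoff" is to filter the inner per-user counts with `HAVING`. The Python sketch below runs such a query against an in-memory SQLite table; the threshold, the `HAVING` clause, the function name `active_users_per_service`, and the sample rows are all assumptions, not part of the original query.

```python
import sqlite3


def active_users_per_service(conn, since, min_requests):
    """Count users per service with at least min_requests requests after `since`."""
    query = """
        SELECT service, COUNT(*) AS num_users
        FROM (
            SELECT service, hash_key, COUNT(*) AS num_requests
            FROM log
            WHERE date_operation > ?
            GROUP BY service, hash_key
            HAVING COUNT(*) >= ?      -- assumed cutoff
        ) AS per_user
        GROUP BY service
        ORDER BY service
    """
    return conn.execute(query, (since, min_requests)).fetchall()


# Hypothetical sample data mirroring the columns used in the query above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (service TEXT, hash_key TEXT, date_operation TEXT)")
conn.executemany(
    "INSERT INTO log VALUES (?, ?, ?)",
    [
        ("sentiment", "u1", "2014-12-02"),
        ("sentiment", "u1", "2014-12-03"),
        ("sentiment", "u2", "2014-12-05"),
        ("topics", "u1", "2014-12-04"),
        ("topics", "u1", "2014-12-06"),
        ("topics", "u3", "2014-11-30"),
    ],
)
result = active_users_per_service(conn, "2014-12-01", 2)
```

Passing the date and the threshold as bound parameters avoids building the SQL string by hand.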
@zdepablo
zdepablo / 0_reuse_code.js
Last active Aug 29, 2015
Here are some things you can do with Gists in GistBox.
View 0_reuse_code.js
// Use Gists to store code you would like to remember later on
console.log(window); // log the "window" object to the console
@zdepablo
zdepablo / extractranks.py
Last active Aug 29, 2015
Extract UEFA rankings for football teams from an HTML table
View extractranks.py
#!/usr/bin/python
# -*- coding: utf-8 -*-
from lxml import html,etree
import requests
import unicodecsv
def group(iterator, count):
    itr = iter(iterator)
    while True:
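The body of `group` is cut off in this preview. Judging by its signature, it probably splits an iterator into fixed-size chunks, for example to turn a flat list of table cells into rows. A minimal standalone sketch of that technique follows; keeping a shorter final chunk is an assumption, and the original may behave differently.

```python
from itertools import islice


def group(iterable, count):
    """Yield successive tuples of up to `count` items from `iterable`."""
    itr = iter(iterable)
    while True:
        chunk = tuple(islice(itr, count))
        if not chunk:  # iterator exhausted
            return
        yield chunk


# Hypothetical usage: regroup a flat sequence of cells into rows of three.
rows = list(group(range(7), 3))
```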