@databill86
databill86 / gpt-2-wikitext-103.py
Created July 24, 2019 13:32 — forked from thomwolf/gpt-2-wikitext-103.py
A very small and self-contained gist to train a GPT-2 transformer model on wikitext-103
# Copyright (c) 2019-present, Thomas Wolf.
# All rights reserved. This source code is licensed under the MIT-style license.
""" A very small and self-contained gist to train a GPT-2 transformer model on wikitext-103 """
import os
from collections import namedtuple
from tqdm import tqdm
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from ignite.engine import Engine, Events
databill86 / deployment-tool-ansible-puppet-chef-salt.md
Created March 28, 2019 14:37 — forked from jaceklaskowski/deployment-tool-ansible-puppet-chef-salt.md
Choosing a deployment tool - ansible vs puppet vs chef vs salt

Requirements

  • no upfront installation/agents on remote/slave machines - ssh should be enough
  • application components should use third-party software, e.g. HDFS, Spark's cluster, deployed separately
  • configuration templating
  • environment requires/asserts, e.g. a JVM of a given version must be present before deployment starts
  • deployment process run from Jenkins
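
As a rough illustration of the "environment requires/asserts" requirement above, the following sketch checks that a JVM meets a minimum major version. It assumes the text printed by `java -version` has already been collected from the target machine (for example over ssh); how it is collected is out of scope here. The function names `parse_java_major` and `assert_min_java` are hypothetical and do not come from any of the tools being compared.

```python
import re


def parse_java_major(version_output: str) -> int:
    """Extract the major Java version from the text printed by `java -version`."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("could not find a version string in the output")
    first, second = int(match.group(1)), match.group(2)
    # Legacy scheme: "1.8.0_292" denotes Java 8, so the second field is the major version.
    if first == 1 and second is not None:
        return int(second)
    return first


def assert_min_java(version_output: str, minimum: int) -> int:
    """Raise RuntimeError if the detected JVM is older than `minimum`."""
    major = parse_java_major(version_output)
    if major < minimum:
        raise RuntimeError(f"JVM {major} found, but {minimum} or newer is required")
    return major
```

A deployment script could call `assert_min_java(output, 8)` before any other step and abort on the exception, which is roughly the behaviour the requirement describes.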

Solution

databill86 / simple_python_datasource.py
Created March 4, 2019 14:31 — forked from linar-jether/simple_python_datasource.py
Grafana python datasource - using pandas for timeseries and table data. inspired by and compatible with the simple json datasource
from flask import Flask, request, jsonify, json, abort
from flask_cors import CORS, cross_origin
import pandas as pd
app = Flask(__name__)
cors = CORS(app)
app.config['CORS_HEADERS'] = 'Content-Type'
databill86 / dynamic_tasks.py
Created March 4, 2019 14:30 — forked from linar-jether/dynamic_tasks.py
Dynamic celery tasks - remote execution of arbitrary callables and DAGs, using dill to serialize and send executable code to worker. This also shows a way to map an iterable returned from one task to a group of tasks (distributed map), with an optional reducer (chord) to be executed when the group tasks complete
# Task primitives, allows pipeline execution using celery
@app.task
def dmap(it, callback, final=None):
    # Map a callback over an iterator and return as a group
    callback = subtask(callback)
    # Hack for mapping a chain to values, due to a bug where args are not copied in group creation
    if isinstance(callback, chain):
        if final:
            raise ValueError('task_processor: Cannot run reducer for dmap executed with a chain.')
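
The map-then-optional-reduce shape described for `dmap` can be tried locally without a Celery worker. The sketch below is a stand-in that uses a thread pool from the standard library instead of Celery groups and chords, so it only illustrates the control flow, not the gist's serialization or distribution. The name `local_dmap` and the `max_workers` parameter are assumptions made for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def local_dmap(items, callback, final=None, max_workers=4):
    """Apply `callback` to every item in parallel, then optionally reduce.

    `final` plays the role of the optional reducer: it receives the full
    list of results once every mapped call has completed.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the inputs
        results = list(pool.map(callback, items))
    return final(results) if final is not None else results
```

For example, `local_dmap(range(4), lambda x: x + 1, final=sum)` maps the increment over four inputs and then sums the results.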
databill86 / celery_task_monitor.py
Created March 4, 2019 14:30 — forked from linar-jether/celery_task_monitor.py
Celery task monitor, logs task state to MongoDB
import pickle
import threading
from queue import Queue  # Python 3 module name; Python 2 called it "Queue"
import time
from bson import InvalidDocument
from celery.utils.log import get_task_logger
logger = get_task_logger(__name__)
databill86 / min-char-rnn.py
Created January 8, 2019 11:34 — forked from karpathy/min-char-rnn.py
Minimal character-level language model with a Vanilla Recurrent Neural Network, in Python/numpy
"""
Minimal character-level Vanilla RNN model. Written by Andrej Karpathy (@karpathy)
BSD License
"""
import numpy as np
# data I/O
data = open('input.txt', 'r').read() # should be simple plain text file
chars = list(set(data))
data_size, vocab_size = len(data), len(chars)
databill86 / xrpaway.py
Created December 2, 2018 19:33
XRP Away™️ - automatically block XRP fanatics sliding into your Twitter mentions
# Requirement: pip install tweepy
import tweepy
# Credentials go here (generate at: https://apps.twitter.com)
auth = tweepy.OAuthHandler('consumer_key', 'consumer_secret')
auth.set_access_token('access_token', 'access_token_secret')
# Connect to Twitter
api = tweepy.API(auth)