@joshuarobinson
joshuarobinson / SimpleDownloader.ipynb
Last active April 25, 2019 10:15
Trivial example to illustrate how to use Spark to parallelize URL downloads.
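
The notebook itself is not shown in this listing, so the following is only a minimal sketch of the general idea, not the gist's actual code. It assumes a `pyspark.SparkContext` is passed in as `sc`; the helper names `fetch` and `parallel_download`, the `num_slices` default, and the timeout are all hypothetical choices made for illustration.

```python
import urllib.request


def fetch(url, timeout=10):
    """Download one URL and return (url, byte_count), or (url, None) on failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return url, len(resp.read())
    except Exception:
        # Swallow per-URL failures so one bad link does not fail the whole job.
        return url, None


def parallel_download(sc, urls, num_slices=8, fetch_fn=fetch):
    """Spread the URLs over Spark tasks and gather the results on the driver.

    `sc` is assumed to be a pyspark.SparkContext (hypothetical wiring).
    """
    if not urls:
        return []
    # Each partition becomes a task; map() runs fetch_fn on the executors.
    return sc.parallelize(urls, num_slices).map(fetch_fn).collect()
```

With a live context this would be called as `parallel_download(sc, list_of_urls)`. Because downloads are I/O-bound, it may help to set `num_slices` higher than the number of cores, though the right value depends on the cluster and the remote servers.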
FROM openjdk:8-slim
RUN apt-get update && apt-get install -y curl python --no-install-recommends \
    && rm -rf /var/lib/apt/lists/*
# Download and extract the Presto package.
ARG PRESTO_VER=0.221
RUN curl https://repo1.maven.org/maven2/com/facebook/presto/presto-server/$PRESTO_VER/presto-server-$PRESTO_VER.tar.gz \
    | tar xvz -C /opt/ \
    && ln -s /opt/presto-server-$PRESTO_VER /opt/presto-server \
<configuration>
  <property>
    <name>metastore.thrift.uris</name>
    <value>thrift://10.62.205.205:9083</value>
  </property>
  <property>
    <name>metastore.task.threads.always</name>
    <value>org.apache.hadoop.hive.metastore.events.EventCleanerTask</value>
  </property>
  <property>
@joshuarobinson
joshuarobinson / purewatch.yaml
Last active June 16, 2020 23:37
Example Prometheus+Grafana Standalone with PureExporter
---
apiVersion: v1
kind: Service
metadata:
  name: purewatch
  labels:
    app: purewatch
spec:
  clusterIP: None
  ports:
{
  "annotations": {
    "list": [
      {
        "builtIn": 1,
        "datasource": "-- Grafana --",
        "enable": true,
        "hide": true,
        "iconColor": "rgba(0, 211, 255, 1)",
        "name": "Annotations & Alerts",
apiVersion: v1
kind: Service
metadata:
  name: confluent
  labels:
    app: confluent
spec:
  clusterIP: None
  ports:
  - name: kafka-port
#!/usr/bin/python3
import boto3
import sys
# Hard-coded endpoint override, update this for your use.
FB_DATAVIP='10.62.64.200'
if len(sys.argv) != 3:
    print("Usage: {} bucketname key".format(sys.argv[0]))
import boto3
from datetime import datetime
import os
import sys
import time
from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk
import urllib3