Jay Swan jayswan

jayswan /
Created Jun 7, 2018
Splunk/ELK Comparison

Comparing Splunk and ELK is complicated, and the answer depends on what you want to optimize. Probably the biggest issue is the ecosystem around post-search data manipulation.

Places where ES shines

ES is amazing at searching for tokens and returning documents. The aggregations are also superb -- actually much faster than Splunk under most conditions. Plugins can extend that functionality. Stuff like fuzzy search, regex queries, indexed terms lookups, significant terms aggregations, and nested aggregations can be extremely powerful if you know how to use them well.
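As a rough illustration of the nested aggregations mentioned above, here is a minimal sketch of a search body that buckets documents by one field and then runs a significant terms aggregation inside each bucket. The field names (`src_ip`, `user_agent`) and the aggregation names (`by_outer`, `unusual_inner`) are hypothetical; the code only builds the JSON body and does not contact a cluster.

```python
import json


def nested_terms_query(outer_field, inner_field, size=10):
    """Build a search body: terms buckets on outer_field, each holding
    a significant_terms sub-aggregation on inner_field."""
    return {
        "size": 0,  # ask for aggregation results only, no document hits
        "aggs": {
            "by_outer": {  # hypothetical aggregation name
                "terms": {"field": outer_field, "size": size},
                "aggs": {
                    "unusual_inner": {  # hypothetical aggregation name
                        "significant_terms": {"field": inner_field}
                    }
                },
            }
        },
    }


# Hypothetical field names, chosen only for illustration.
body = nested_terms_query("src_ip", "user_agent")
print(json.dumps(body, indent=2))
```

The resulting dictionary could be sent as the body of a search request with whatever client you already use.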

Trouble areas

ES has a reputation for stability problems. These are mostly solvable by running an appropriately sized cluster on a recent version with proper circuit breaker settings. Much of the FUD I've seen about this is incorrect, but the biggest problem remains that you can't kill a misbehaving query or constrain its resource use after it has started; if your circuit breakers aren't working correctly, you're out of luck.
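To make the circuit breaker point concrete, the sketch below builds a cluster settings body for the parent, request, and fielddata breakers. The setting names are the standard `indices.breaker.*` keys, but the percentages are placeholders rather than recommendations, and the check that a child limit does not exceed the parent is a sanity check added for this example, not something the original describes.

```python
import json


def breaker_settings(total="70%", request="40%", fielddata="40%"):
    """Return a cluster settings body for the main circuit breakers.

    The default percentages are illustrative assumptions, not tuning advice.
    """
    def percent(value):
        return float(value.rstrip("%"))

    # Sanity check added for this sketch: a child breaker limit above the
    # parent limit would never be the one that trips.
    for name, limit in (("request", request), ("fielddata", fielddata)):
        if percent(limit) > percent(total):
            raise ValueError(f"{name} limit {limit} exceeds total limit {total}")

    return {
        "persistent": {
            "indices.breaker.total.limit": total,
            "indices.breaker.request.limit": request,
            "indices.breaker.fielddata.limit": fielddata,
        }
    }


settings = breaker_settings()
print(json.dumps(settings, indent=2))
```

A body like this would typically be applied through the cluster settings API; check the defaults for your ES version before changing anything.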

Chaining data processing


from __future__ import print_function
import os
import sys
from netmiko import ConnectHandler
target_mac = os.environ['TARGET_MAC']
router_ip = os.environ['ROUTER_IP']
router_user = os.environ['ROUTER_USER']
password = os.environ['ROUTER_PW']
jayswan /
Created Sep 28, 2016
pipe-able script to check the existence of a GitHub username; returns 200 if found
# Usage: some_command_that_outputs_usernames |
# subject to anonymous API rate limits
xargs -I {} curl -w "%{http_code}\n" -sI -o /dev/null{}
jayswan /
Created Jul 26, 2016
Scripts to retrieve CIDR blocks for various services
# Fastly
curl -s | jq -r '.addresses | .[]'
# Google
dig @ +short txt | awk '{gsub("ip4:","");for (col=2; col<NF;++col) print $col}'
# AWS
curl -s | \
jq --raw-output '.prefixes | map(.ip_prefix) | .[]'
jayswan /
Created Jul 10, 2016
Elasticsearch scripted aggregation with joined fields

This script allows you to do SQL GROUP BY-like aggregations on multiple fields in an Elasticsearch index.

Performance will likely be poor on large data sets.

Saved Groovy script in <elasticsearch_dir>/config/scripts/join-param-list.groovy:

return fields.collect { doc[it].value }.join(delimiter);
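The idea behind the one-line Groovy script can be emulated outside Elasticsearch: join the values of several fields into a single composite key, then count documents per key, which is what a terms aggregation over the scripted value would do. The sketch below is a plain-Python analogy, not the Elasticsearch API; the sample documents, field names, and the `|` delimiter are made up for illustration.

```python
from collections import Counter


def join_key(doc, fields, delimiter):
    """Mirror of the Groovy one-liner: join the listed field values."""
    return delimiter.join(str(doc[field]) for field in fields)


def group_by(docs, fields, delimiter="|"):
    """Count documents per composite key, like a terms aggregation
    over the scripted value would."""
    return Counter(join_key(doc, fields, delimiter) for doc in docs)


# Made-up sample documents, for illustration only.
docs = [
    {"src": "10.0.0.1", "dst": "10.0.0.9", "port": 443},
    {"src": "10.0.0.1", "dst": "10.0.0.9", "port": 443},
    {"src": "10.0.0.2", "dst": "10.0.0.9", "port": 22},
]

counts = group_by(docs, ["src", "dst", "port"])
for key, count in counts.most_common():
    print(key, count)
```

Pick a delimiter that cannot appear in the field values, otherwise two different field combinations can collapse into the same key.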
jayswan / add-json.bro
Created Apr 28, 2016 — forked from J-Gras/add-json.bro
Additional JSON logging for Bro.
# Add additional JSON logging
module Log;
export {
	## Enables JSON-logfiles for all active streams
	const enable_all_json = T &redef;
	## Streams not to generate JSON-logfiles for
	const exclude_json: set[Log::ID] = { } &redef;
	## Streams to generate JSON-logfiles for
jayswan /
Created Feb 25, 2016
Get a List of Google CIDR Blocks
dig @ +short txt | awk '{gsub("ip4:","");for (col=2; col<NF;++col) print $col}'
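For readers less familiar with awk, here is a rough Python equivalent of the filter above: it strips every `ip4:` prefix, then drops the first and last whitespace-separated fields (the `v=spf1` tag and the trailing `~all`). The sample record uses documentation address ranges and is not real DNS output. The second function, `spf_ip4_only`, is a stricter variant added for this sketch that keeps only `ip4:` mechanisms.

```python
def spf_ip4_blocks(txt_record):
    """Same logic as the awk filter: remove 'ip4:' everywhere, then
    keep fields 2 through NF-1."""
    fields = txt_record.replace("ip4:", "").split()
    return fields[1:-1]


def spf_ip4_only(txt_record):
    """Stricter variant (an addition in this sketch): keep only tokens
    that are ip4 mechanisms, so ip6 or include entries are skipped."""
    tokens = txt_record.strip('"').split()
    return [token[len("ip4:"):] for token in tokens if token.startswith("ip4:")]


# Made-up TXT answer in the quoted form that `dig +short txt` prints.
sample = '"v=spf1 ip4:192.0.2.0/24 ip4:198.51.100.0/24 ~all"'

blocks = spf_ip4_blocks(sample)
for block in blocks:
    print(block)
```

Note that the awk version, like `spf_ip4_blocks`, would pass through any non-`ip4:` mechanism sitting between the first and last fields unchanged.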
jayswan /
Created Feb 19, 2016
Convert AWS IP Prefixes to SiLK IP Set
curl -s | \
jq --raw-output '.prefixes | map(.ip_prefix) | .[]' > prefixes.txt
rwsetbuild prefixes.txt aws.ipset
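The jq step can be sketched in Python as follows: parse the JSON document, take each entry under `prefixes`, and emit its `ip_prefix`, one per line, which is the text format the pipeline above saves as `prefixes.txt`. The sample document is made up (documentation address ranges, placeholder region and service names) and only mimics the shape the jq filter expects; no download is performed.

```python
import json

# Made-up sample with the structure the jq filter expects.
SAMPLE = """
{
  "prefixes": [
    {"ip_prefix": "192.0.2.0/24", "region": "example-1", "service": "EXAMPLE"},
    {"ip_prefix": "198.51.100.0/24", "region": "example-2", "service": "EXAMPLE"}
  ]
}
"""


def ip_prefixes(document):
    """Equivalent of: jq '.prefixes | map(.ip_prefix) | .[]'"""
    return [entry["ip_prefix"] for entry in json.loads(document)["prefixes"]]


def to_lines(prefixes):
    """One CIDR block per line, matching the contents of prefixes.txt."""
    return "\n".join(prefixes) + "\n"


prefixes = ip_prefixes(SAMPLE)
print(to_lines(prefixes), end="")
```

Writing the output of `to_lines` to a file would give the same input that the shell version feeds to `rwsetbuild`.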
jayswan / gist:a8d9920ef74516a02fe1
Last active Aug 26, 2020
Elasticsearch Python bulk index API example
>>> import itertools
>>> import string
>>> from elasticsearch import Elasticsearch, helpers
>>> es = Elasticsearch()
>>> # k is a generator expression that produces
... # a series of dictionaries containing test data.
... # The test data are just letter permutations
... # created with itertools.permutations.
... #
... # We then reference k as the iterator that's