Andrew Montalenti amontalenti

Keybase proof

I hereby claim:

  • I am amontalenti on github.
  • I am amontalenti (https://keybase.io/amontalenti) on keybase.
  • I have a public key ASCtP_ibM-nhDduaxOmYb5jdmpYNLt16lMGBT3yiuykq2Ao

To claim this, I am signing this object:

amontalenti / Storm-Topology-Freeze-Issues.md
Last active December 8, 2017 07:17
Storm 0.9.2 exceptions related to multiple workers/supervisors

Conditions

Two Storm supervisors running in a 3-node configuration: one "head" node (nimbus) and two worker nodes. Each supervisor node runs a supervisor under supervision.

Topology

Multi-lang topology using Python and our streamparse library. Works fine when running on a single machine. Spout reads off Kafka and inserts tuples into

{
"action": "pageview",
"apikey": "mashable.com",
"campaign_id": "facebook",
"display": true,
"display_avail_height": 735,
"display_avail_width": 1280,
"display_pixel_depth": 24,
"display_total_height": 800,
"display_total_width": 1280,

Apply to be a Designer at Parse.ly

We are hiring a designer to work with our product and marketing teams to share knowledge of our real-time analytics platform with the wider world.

Parse.ly is an analytics platform that helps digital storytellers at some of the web's best sites, such as Ars Technica, The New Yorker, The Telegraph, TechCrunch, and many more.

To see an example of how we work as an engineering team, check out

import re
from pymongo import MongoReplicaSetClient
from pymongo.read_preferences import ReadPreference
REPLICA_SET = "ptrack-mongo1" # or, parsely_articles for old CrawlDB
conn = MongoReplicaSetClient(read_preference=ReadPreference.SECONDARY, replicaSet=REPLICA_SET)
# optional collection filter for crawlDB in particular
# simple script converts my Google Chrome bookmarks for my "Dashes"
# folder into a Markdown list so I can share it on the Parse.ly wiki
import json
import os
POSITION = 8
bookmarks = json.load(open("%s/.config/google-chrome/Default/Bookmarks" % os.environ["HOME"]))
dash_links = bookmarks["roots"]["bookmark_bar"]["children"][POSITION]["children"]
class BadStr(str):
    def __str__(self):
        berak
    def strip(self):
        return self
import sys
# surprisingly, no exception thrown for this
sys.stdout.write(BadStr())
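
One possible reading of the "no exception" behavior above, sketched under the assumption of CPython 3: a text stream's `write()` accepts any `str` instance (subclasses included) and reads its character data directly, so an overridden `__str__` is never invoked. The class name `LoudStr` and the use of `io.StringIO` as a stand-in for `sys.stdout` are illustrative choices, not part of the original snippet.

```python
import io


class LoudStr(str):
    """Hypothetical str subclass whose __str__ always fails."""

    def __str__(self):
        raise RuntimeError("__str__ was called")


# io.StringIO stands in for sys.stdout so the result can be inspected.
buf = io.StringIO()
buf.write(LoudStr("hello"))  # write() uses the string data; __str__ is not called
written = buf.getvalue()

# An explicit conversion, by contrast, does go through __str__.
try:
    str(LoudStr("hello"))
    raised = False
except RuntimeError:
    raised = True

print(written, raised)
```

If this reading is right, the broken `__str__` body in `BadStr` would only surface when something converts the object explicitly, for example via `str()` or `print()`.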
SELECT title, views FROM [test.parselyblog]
-- utilities
-- STRFTIME_UTC_USEC(pub_date, "%Y-%m")
-- SUM(views)
-- COUNT(distinct url)
-- find dupes
-- ----------
line:
/plogger/?rand=1459810331846&idsite=deadspin.com&url=http%3A%2F%2Fscreengrabber.deadspin.com%2Fadrien-broner-tries-to-call-out-floyd-mayweather-turns-1768674744%3Futm_medium%3Dsharefromsite%26utm_source%3DScreengrabber_facebook&urlref=http%3A%2F%2Ffacebook.com%2Finstantarticles&screen=360x640%7C360x640%7C32&data=%7B%22parsely_uuid%22%3A%2227bdf3ea-1f69-4db0-aa78-8e90e1e217f1%22%2C%22parsely_site_uuid%22%3A%22c0349257-2cea-4864-a74d-18acd8e7c631%22%7D&sid=1&surl=http%3A%2F%2Fscreengrabber.deadspin.com%2Fadrien-broner-tries-to-call-out-floyd-mayweather-turns-1768674744%3Futm_medium%3Dsharefromsite%26utm_source%3DScreengrabber_facebook&sref=&sts=1459810321317&slts=0&date=Mon+Apr+04+2016+17%3A52%3A11+GMT-0500+(CDT)&action=heartbeat&inc=5 HTTP/1.1 || 200 || 236 || http://screengrabber.deadspin.com/adrien-broner-tries-to-call-out-floyd-mayweather-turns-1768674744?utm_medium=sharefromsite&utm_source=Screengrabber_facebook || [FBIA/FB4A;FBAV/70.0.0.22.83;] Mozilla/5.0 (Linux; Android 5.0; SM-N900T Build/LRX21V
import turtle
t = turtle.Turtle()
t.color('orange')
# self-test
t.left(180)
t.left(-180)
# this is a U
t.right(90)