

@swinton
Created June 29, 2016 22:40

TWITTER HERON IN PRACTICE

@louis_fumaosong, @heronstreaming

  • Twitter == real time y’all 🤘
  • Heron (successor to Apache Storm) == real-time data processing framework 💪
  • Heron terminology
    • Topology
      • A DAG of spouts and bolts; data flows from spouts through bolts
    • Spouts
      • Source of data
    • Bolts
      • Process data
      • Horizontally scalable
  • Architecture:
    • Submit topology to scheduler; scheduler handles everything else
    • Containerized?
  • Originally Apache Storm; Heron has been in production for 2 years and has now fully replaced Apache Storm
  • Use cases:
    • Realtime ETL
    • Realtime BI
    • Realtime trends
    • Realtime…
      • Ops, media, …
    • Realtime
  • Microstream engine… small sub-systems
  • Straggler (bad host) handling
    • drop data
    • throttle senders
    • detect and reschedule stragglers
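
A rough sketch of the first two straggler options above (drop data vs. throttle senders), as a toy model of one spout feeding one slow bolt through a bounded buffer. This is not Heron's API or implementation: BoundedChannel, run, and every parameter are hypothetical names made up for illustration, and the third option (detect and reschedule) is not modeled.

```python
from collections import deque


class BoundedChannel:
    """Hypothetical fixed-capacity buffer between a spout and a slow bolt."""

    def __init__(self, capacity, policy):
        if policy not in ("drop", "throttle"):
            raise ValueError("policy must be 'drop' or 'throttle'")
        self.capacity = capacity
        self.policy = policy
        self.buffer = deque()
        self.dropped = 0

    def offer(self, item):
        """Return True if the sender may move on to its next item."""
        if len(self.buffer) < self.capacity:
            self.buffer.append(item)
            return True
        if self.policy == "drop":
            self.dropped += 1   # buffer full: discard the item, sender continues
            return True
        return False            # buffer full: sender must pause and retry

    def poll(self):
        """Return the oldest buffered item, or None if the buffer is empty."""
        return self.buffer.popleft() if self.buffer else None


def run(policy, n_items=10, capacity=2, bolt_every=3):
    """Spout emits at most one item per tick; the straggling bolt
    processes one item every `bolt_every` ticks (here: doubles it)."""
    channel = BoundedChannel(capacity, policy)
    processed = []
    next_item = 0
    tick = 0
    while next_item < n_items or channel.buffer:
        tick += 1
        # Spout side: advance only if the channel accepted (or dropped) the item.
        if next_item < n_items and channel.offer(next_item):
            next_item += 1
        # Bolt side: the slow host only gets work done on some ticks.
        if tick % bolt_every == 0:
            item = channel.poll()
            if item is not None:
                processed.append(item * 2)
    return processed, channel.dropped, tick


if __name__ == "__main__":
    for policy in ("drop", "throttle"):
        processed, dropped, ticks = run(policy)
        print(policy, "processed:", len(processed), "dropped:", dropped, "ticks:", ticks)
```

In this toy model, "drop" keeps the spout running at full speed but loses items, while "throttle" loses nothing but slows the spout down to the straggler's pace.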

Metadata


Wed Jun 29 15:21:52 PDT 2016