@rgabo
Created December 10, 2013 22:12
Strata + Hadoop World 2013 Road Report

Talking Points

  • Recently became CTO, was VP Engineering for a long time
    • Dealt with day-to-day things, but now need to understand where we need to be in 3-6 months' time
  • We started hitting the limits of our current architecture
  • Even for a young company, our data needs evolved continuously, often in parallel, and as a result our data is often siloed away
    • We couldn't easily answer questions that came up
    • We wanted to use data in new ways that we hadn't thought of before
  • We started looking at Hadoop
    • Hadoop at its core has a simple premise: scale-out storage and scale-out computing
    • The real strength is in its ecosystem, a set of tools solving different problems and workloads on the same groundwork
    • Hadoop is architecture, both technical and organizational
  • Hadoop is confusing, many vendors, many technologies, many solutions to many problems
  • Let's go to Strata + Hadoop World 2013!
    • 3000 people, sell-out crowd
    • 9 tracks, hundreds of sessions, a lot of marketing
    • More confused than ever
  • Why was it useful to go to New York
    • Everyone's there: people who build it, people who augment it, people who use it, and people who want to use it, like us
    • You hear people talk about it and give their perspective, which in turn helps you understand it and apply it to your problems and your organization
    • Everything's happening in those three days (Pig meetup, Hive meetup, Data After Dark)
    • Talk to people

Takeaways

  • Mike Olson (CSO and Chairman, Cloudera)
    • Hadoop is moving from the periphery to the architectural center of the data center, emerging as an enterprise data hub
    • Work with the data where it lives
    • "This architecture is too powerful, it is too right", allows you to rethink data
  • Doug Cutting (Architect, Cloudera)
    • Transition to OS technologies, platform technologies (Linux, Android)
    • Hadoop is general (platform) - "De facto standard operating system for big data"
    • Becoming ever more general
    • The ecosystem is adding more and more functionality, bringing workloads to Hadoop (other technologies diminishing)
    • Enterprise data hub, the center of a data-driven organization
  • Ken Rudin (Analytics @ Facebook)
    • Hadoop is technology, Big Data is about business needs
    • Big data = Hadoop + Relational
    • It's got to be about value
    • Train everyone
    • "Actionable insights are the goal" => "Impact out of insight", actually doing it
      • Impact your business (Move a metric)
      • Change a product
      • Change a process
      • Own the outcome: "If nothing changes, we have made no impact", "it doesn't make a difference whether or not you work at that company"