@kinlane
Created February 5, 2023 02:14
twitter-engineer.json
This file has been truncated.
[
{
"title": "Introducing Twitter Image Pipeline iOS framework for open source",
"body": "<p>Today, we’re excited to open source our Twitter Image Pipeline iOS framework, or TIP for short.<br /> </p> \n<h5><b>Why a new framework?</b></h5> \n<p>In 2014, Twitter embraced sharing and displaying images in its apps, which was a great enhancement to the communication platform built on brevity. But with that we had accrued a large swath of tech debt, were missing some key features, struggled with reliability, and had many different complex approaches to our images. We took some time to assess where we were at and identify where we wanted to go before building a new framework to solve those pain points. Once we adopted this new iOS framework, we were able to take advantage of its features and have iterated on it over the course of 2 years. Today we’re happy to share it with you so you can enjoy the same benefits.</p> \n<h5><b>What does TIP offer?</b></h5> \n<p>TIP is a framework that attempts to solve many problems around image loading and caching. To give an overview of what it offers, let’s dig into some of the main parts of TIP.</p> \n<p><b>Caches</b></p> \n<p>A major part of being an efficient pipeline for loading images is the caches. From the very start, we knew we needed a strong caching system. Before TIP, Twitter had some serious problems with its image cache and we needed to have those solved from the start so that once we completed our migration to TIP we could be rid of those issues.</p> \n<p>First, we needed to address a user concern around account logout. Since we had a singular caching system, all cached images were shared between all Twitter accounts that were logged into the app. Because Twitter for iOS supports multiple accounts, this presented a problem when an account was logged out. We couldn’t clear images for the logged out account, and we didn’t want to purge the entire cache to lose all the performance it offers. This was a waste of storage and left behind unwanted images in the cache. To fundamentally solve this up front, we elected to silo TIP via “pipelines,” with each pipeline having its own distinct caches which can be individually purged. This use case might not translate to all apps and is completely optional, but it was an important starting point for building TIP.</p> \n<p>The caching per pipeline in TIP is separated into 3 caches: the first synchronous access cache is in-memory images that are already rendered to a specific resolution, the second asynchronous cache is in-memory images that aren’t necessarily scaled or even decoded yet, and the last asynchronous cache is an on-disk cache for longer persistence. This offers great performance in loading images while balancing the tradeoffs of how far images are cached. When building these caches, we focused on thread safety and data consistency. Twitter had a long-standing fundamental flaw with our previous caching system as well. There was a race condition which, when triggered, would cause the same image to be loaded from the network twice, and both loads would attempt to write to the same file, resulting in a corrupted image in the cache. Slow networking and larger images exacerbated the chances corruption would occur. TIP does not share this flaw and the issue has been eliminated.</p> \n<p>A particularly problematic part of our previous cache was that it was unbounded. The previous design didn’t build in the concept of limiting the cache size and relied solely on TTLs (time-to-live values) to purge images. This led to our image cache reaching 2GB or even larger with no recourse for the user. 
TIP, however, uses an LRU (least-recently-used) mechanism for purging, has configurable limits to how big the cache can get, and has the ability to be cleared (either per pipeline or globally). After a full migration to TIP, Twitter no longer feels the pain of 2GB caches and achieves a better end-user experience with a 128 MB limit of on-disk storage. The TIP framework also allows us to add app data usage settings to clear the cache.</p> \n<p>The last need we had when building and adopting TIP was a way to transition from the legacy caches to the new TIP caches. We elected to build support for plugging in additional caches to each image pipeline so that when TIP didn’t have an image, the image could be loaded from the legacy cache before trying the network. This allowed our caches to progressively transition over to TIP without the impact of extra networking for the user.</p> \n<p><b>Fetching Images</b></p> \n<p>The core of an image pipeline is the fetching of a requested image. When building TIP, we had existing use cases that needed to be met and table stakes in performance to achieve, but we also wanted to really build out a robust system for image loading so we could maximize the user experience while minimizing resource impact, particularly with networking resources.</p> \n<p>It was noted early on that we have many different sizes of images, a.k.a. variants, that we use in the Twitter app. We were being wasteful having different parts of the app load many different variants. It wasted cache space, cost data, and took time to load new variants over the network. The improvement was to have TIP offer easy-to-use support for loading existing images that were cached even if they weren’t the correct variant. Variants that are larger than the one being fetched can simply be scaled to the appropriate size, avoiding the network. Variants that are smaller than the one being fetched can be offered as a preview for the user to see while the full image is being loaded, presenting content to the user immediately so they can see something while they wait for the higher-quality variant to show up.</p> \n<p>We really cared about the network utilization, and wanted TIP to be completely transparent and robust in how it loads images over the network. TIP will automatically coalesce any two (or more) fetches that are for the same image variant, saving on networking. TIP also supports image download resuming; instead of a failed or canceled image load throwing away whatever progress was made, TIP will persist that data so the next time that image is fetched, it can be resumed from where it left off.</p> \n<p>A principle we adopted as we started building TIP was to get content to the user as soon as possible. With that in mind, we set out to ensure that TIP had support for progressive loading of images, which gets the content to the user as soon as possible while the image continues to load and gain in fidelity. It proved very successful as a feature, and we’ve since transitioned all Tweet images to be PJPEG. Users with fast internet will see no difference from JPEG, but the majority of the world that accesses the internet through a slow connection can see images on screen much faster.</p> \n<p>Supporting a diverse set of image codecs was something we learned we needed as we were using TIP. We experimented more and more with different image codecs and ended up making 3 iterations that really benefit TIP. First, we added animated image support (specifically with GIFs). 
Second, we added support for all ImageIO supported codecs. Last, we abstracted out how TIP encodes and decodes so that codecs are pluggable and any custom codec can easily be added to TIP, either to supplement the existing codecs or replace any of them. With this diverse codec support, Twitter has been able to easily experiment with many image formats including PNG, JPEG, PJPEG, GIF, JPEG-2000, WebP and even custom codecs.</p> \n<p>The last critical component to TIP being robust for Twitter’s image fetching needs is the networking it executes with. Twitter has its own network layer abstraction through which all networking goes, which is necessary for things like robust network performance measurements and control over networking operations. By abstracting out networking, TIP allows any network layer to be plugged in, but for simplicity uses an NSURLSession-based option by default.</p> \n<p>Beyond fetching images, TIP supports storing images to the caches, a feature that gives us a lot of flexibility in our user experience. We can now easily support taking the image that is being Tweeted and putting it into the cache so that when that Tweet shows in the user’s timeline, the image is already there and doesn’t have to be loaded over the network. To round off control over caches, specific images can be purged too.</p> \n<p><b>Debugging and Observability</b></p> \n<p>Beyond being a framework that provides features, TIP offers detailed insights into what is happening and how things are performing. Consumers of TIP can observe pipeline behavior, be notified of problems that crop up, see detailed logging, and handle robust errors. As an added utility for developers, we built in tools for use at runtime. With TIP’s UIImageView subclass, it is trivial to view debug details of a fetched image with an overlay on the image view. Even more useful is the ability to inspect caches to see what images are cached, including both complete and incomplete loads, with the details of each image. Twitter uses this feature with a simple debug UI for our developers to trivially inspect our caches and debug.</p> \n<h5><b>Who is TIP for?</b></h5> \n<p>After TIP’s evolution to where it is today, we feel that TIP has enough bang for the buck to be an asset to any app developer or development team that really wants the most control and performance with their images in a single framework.</p> \n<h5><b>What about Android?</b></h5> \n<p>Great question! Twitter has tried a number of iterations on Android for image loading as well, and has adopted <a href=\"http://frescolib.org/\">Facebook’s Fresco Library</a>. It’s feature-rich, reliable, thoroughly documented and well maintained. We couldn’t be happier with using their open source project as an image pipeline for Android.</p> \n<h5><b>Where can I find TIP?</b></h5> \n<p>As with all Twitter open source projects, the Twitter Image Pipeline framework for iOS project can be found on Twitter’s GitHub page, specifically at <a href=\"https://github.com/twitter/ios-twitter-image-pipeline\">https://github.com/twitter/ios-twitter-image-pipeline</a>. We hope you like it!</p> \n<p>Be sure to check out our other open source projects, including the Twitter Logging Service framework for iOS (TLS) which can be found at <a href=\"https://github.com/twitter/ios-twitter-logging-service\">https://github.com/twitter/ios-twitter-logging-service</a>.</p>\n\n<div class=\"tweet-error-text\">This Tweet is unavailable",
"date": "2017-03-01T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/topics/open-source/2017/introducing-twitter-image-pipeline-ios-framework-for-open-source",
"domain": "engineering"
},
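The TIP entry above centers on a byte-bounded cache with LRU eviction and whole-cache purges per pipeline. Below is a minimal sketch of just that eviction idea; TIP itself is an Objective-C framework, so this Java code is an illustration of the concept under assumptions, not TIP's actual API.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of an LRU cache bounded by total bytes rather than entry count,
// clearable as a whole to support per-pipeline purges. Illustrative only.
public final class LruByteCache<K> {
  // accessOrder=true makes iteration order least- to most-recently used.
  private final LinkedHashMap<K, byte[]> map = new LinkedHashMap<>(16, 0.75f, true);
  private final long maxBytes;  // e.g. the article's 128 MB on-disk budget
  private long usedBytes = 0;

  public LruByteCache(long maxBytes) { this.maxBytes = maxBytes; }

  public synchronized byte[] get(K key) { return map.get(key); }

  public synchronized void put(K key, byte[] image) {
    byte[] old = map.put(key, image);
    if (old != null) usedBytes -= old.length;
    usedBytes += image.length;
    // Evict least-recently-used entries until back under the byte budget.
    Iterator<Map.Entry<K, byte[]>> it = map.entrySet().iterator();
    while (usedBytes > maxBytes && it.hasNext()) {
      Map.Entry<K, byte[]> eldest = it.next();
      usedBytes -= eldest.getValue().length;
      it.remove();
    }
  }

  // Purging one pipeline's cache (or the global cache) maps to clear().
  public synchronized void clear() { map.clear(); usedBytes = 0; }
}
```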
{
"title": "Optimizing Twitter Heron",
"body": "<p>We designed Twitter <a href=\"https://engineering/2015/flying-faster-with-twitter-heron\">Heron</a>, our next generation streaming engine, because we needed a system that scaled better, was easier to debug, had better performance, was easier to deploy and manage, and worked in a shared multi-tenant cluster environment. Heron has been in production at Twitter for more than two and a half years and has proven to deliver against our design criteria. We were so thrilled we <a href=\"https://engineering/2016/open-sourcing-twitter-heron\">open-sourced</a> it in May 2016. While Heron immediately delivered on its promise of better scalability compared to Storm (as we reported in our <a href=\"http://dl.acm.org/citation.cfm?id=2742788\">SIGMOD</a> 2015 paper), we’ve recently identified additional lower-level performance optimization opportunities. Since improvements in performance directly translate to efficiency, we identified the performance bottlenecks in the system and optimized them. In this blog we describe how we profiled Heron to identify performance limiting components, we highlight the optimizations, and we show how these optimizations improved throughput by 400-500% and reduced latency by 50-60%.</p> \n<p>Before we can discuss our performance optimizations, let us revisit some of the core Heron concepts. A Heron topology consists of spouts and bolts. Spouts tap into a data source and inject data tuples into a stream, which is processed by a downstream topology of bolts. Bolts are processing elements that apply logic on incoming tuples and emit outgoing tuples. A Heron topology is an assembly of spouts and bolts that create a logical processing plan in the form of a directed acyclic graph. The logical plan is translated into a physical plan that includes the number of instances for each spout and bolt as well as the number of containers these instances should be deployed on. </p>\n\n<div class=\"tweet-error-text\">This Tweet is unavailable",
"date": "2017-03-16T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/topics/open-source/2017/optimizing-twitter-heron",
"domain": "engineering"
},
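The spout/bolt/DAG vocabulary in the Heron entry above maps directly to code. Below is a minimal, hypothetical topology written against the Storm-compatible API that Heron supports (per the "Open Sourcing Twitter Heron" entry later in this file); the class names, parallelism numbers, and synthetic data source are illustrative assumptions, not taken from the post.

```java
import java.util.Map;
import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;
import org.apache.storm.utils.Utils;

public class WordTopology {
  // Spout: taps a data source and injects tuples into the stream.
  public static class SentenceSpout extends BaseRichSpout {
    private SpoutOutputCollector collector;
    public void open(Map conf, TopologyContext ctx, SpoutOutputCollector c) { collector = c; }
    public void nextTuple() {
      Utils.sleep(100);  // throttle the synthetic source
      collector.emit(new Values("hello heron world"));
    }
    public void declareOutputFields(OutputFieldsDeclarer d) { d.declare(new Fields("sentence")); }
  }

  // Bolt: applies logic to incoming tuples and emits outgoing tuples.
  public static class SplitBolt extends BaseRichBolt {
    private OutputCollector collector;
    public void prepare(Map conf, TopologyContext ctx, OutputCollector c) { collector = c; }
    public void execute(Tuple t) {
      for (String word : t.getStringByField("sentence").split(" ")) {
        collector.emit(new Values(word));
      }
    }
    public void declareOutputFields(OutputFieldsDeclarer d) { d.declare(new Fields("word")); }
  }

  public static void main(String[] args) throws Exception {
    // The logical plan: a DAG with an edge from the spout to the bolt.
    TopologyBuilder builder = new TopologyBuilder();
    builder.setSpout("sentences", new SentenceSpout(), 2);  // 2 spout instances
    builder.setBolt("split", new SplitBolt(), 4)            // 4 bolt instances
           .shuffleGrouping("sentences");
    // The physical plan (instance counts, containers) is derived at submit time.
    new LocalCluster().submitTopology("words", new Config(), builder.createTopology());
  }
}
```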
{
"title": "How we built Twitter Lite",
"body": "<p>We’re excited to introduce you to <a href=\"https://engineering/2017/introducing-twitter-lite/\">Twitter Lite,</a> a <a href=\"https://developers.google.com/web/fundamentals/getting-started/codelabs/your-first-pwapp/\">Progressive Web App</a> that is available at mobile.twitter.com. Twitter Lite is fast and responsive, uses less data, takes up less storage space, and supports push notifications and offline use in modern browsers. The web is becoming a platform for lightweight apps that can be accessed on-demand, installed without friction, and incrementally updated. Over the last year we’ve adopted new, open web APIs and significantly improved the performance and user experience.</p> \n<h4>Architecture overview</h4> \n<p>Twitter Lite is a client-side JavaScript application and a small, simple <a href=\"https://nodejs.org/en/\">Node.js</a> server. The server handles user authentication, constructs the initial state of the app, and renders the initial HTML application shell. Once loaded in the browser, the app requests data directly from the Twitter API. The simplicity of this basic architecture has helped us deliver exceptional service reliability and efficiency at scale – Twitter Lite an order of magnitude less expensive to run than our server-rendered desktop website.</p> \n<p>The client-side JavaScript application is developed, built, and tested with many open source libraries including <a href=\"https://facebook.github.io/react/\">React</a>, <a href=\"http://redux.js.org/\">Redux</a>, <a href=\"https://github.com/paularmstrong/normalizr\">Normalizr</a>, <a href=\"https://github.com/globalizejs/globalize\">Globalize</a>, <a href=\"https://babeljs.io/\">Babel</a>, <a href=\"https://webpack.js.org/\">Webpack</a>, <a href=\"https://facebook.github.io/jest/\">Jest</a>, <a href=\"http://webdriver.io/\">WebdriverIO</a>, and <a href=\"https://yarnpkg.com/\">Yarn</a>. Relying on established open source software has allowed us to spend more time improving the user-experience, increasing our iteration speed, and working on Twitter-specific problems such as processing and manipulating Timeline and Tweet data.</p> \n<p>We write modern JavaScript (ES2015 and beyond) that is compiled with Babel and bundled with Webpack. API response data is first processed by Normalizr – which allows us to de-duplicate items and transform data into more efficient forms – before being sent to various Redux modules used for fetching, storing, and retrieving remote and local data. The UI is implemented with several hundred React components that do everything from render text to manage virtual lists, lazy load modules, and defer rendering. Twitter Lite supports 42 languages, and we use Globalize to deliver localized numbers, dates, and messages.</p> \n<h4>Designing for performance</h4> \n<p>Hundreds of millions of people visit mobile.twitter.com every month. We want Twitter Lite to be the best way to use Twitter when your connectivity is slow, unreliable, limited, or expensive. 
We have been able to achieve speed and reliability through a series of incremental performance improvements known as the <a href=\"https://developers.google.com/web/fundamentals/performance/prpl-pattern/\">PRPL pattern</a> and by using the new capabilities of modern browsers on Android (e.g., Google Chrome) which include <a href=\"https://github.com/w3c/ServiceWorker\">Service Worker</a>, <a href=\"https://w3c.github.io/IndexedDB/\">IndexedDB</a>, <a href=\"https://developers.google.com/web/fundamentals/engage-and-retain/app-install-banners/\">Web App Install Banners</a>, and <a href=\"https://developers.google.com/web/fundamentals/engage-and-retain/push-notifications/\">Web Push Notifications</a>.</p> \n<h5>Availability</h5> \n<p>Twitter Lite is network resilient. To reach every person on the planet, we need to reach people on slow and unreliable networks. When available, we use a Service Worker to enable temporary offline browsing and near-instant loading on repeat visits, regardless of the network conditions. The Service Worker caches the HTML application shell and static assets, along with a few popular emoji. And when scripts or data fail to load we provide “Retry” buttons to help users recover from the failure. Altogether, these changes improve reliability and contribute to significantly faster loading and startup times on repeat visits.</p> \n<h5>Progressive loading</h5> \n<p>Twitter Lite is interactive in under 5 seconds over 3G on most devices. Most of the world is using 2G or 3G networks; a fast initial experience is essential. Over the last 3 months we’ve reduced average load times by over 30% and 99th percentile time-to-interactive latency by over 25%. To achieve this, the app streams the initial HTML response to the browser, sending instructions to preload critical resources while the server constructs the initial app state. Using webpack, the app’s scripts are broken up into granular pieces and loaded on demand. This means that the initial load only requires resources needed for the visible screen. (When available, a Service Worker will precache additional resources and allow instant future navigations to other screens.) These changes allow us to progressively load the app so people can consume and create Tweets sooner.</p> \n<h5>Rendering</h5> \n<p>Twitter Lite breaks up expensive rendering work. Although we’ve taken care to optimize the rendering of our components, the Tweet is a complex composite component, and rendering infinite lists of Tweets requires additional performance considerations. We implemented our own virtualized list component; it only renders the content visible within the viewport, incrementally renders items over multiple frames using the requestAnimationFrame API, and preserves scroll position across screens. Further improvements to perceived performance were possible by deferring non-critical rendering to idle periods using the <a href=\"https://w3c.github.io/requestidlecallback/\">requestIdleCallback</a> API.</p> \n<h5>Data usage</h5> \n<p>Twitter Lite reduces data use by default, serving smaller media resources and relying on cached data. We’ve optimized images to reduce their impact on data usage by as much as 40% as you scroll through a timeline. “Data saver” mode further reduces data usage by replacing images in Tweets and Direct Messages with a small, blurred preview. A HEAD request for the image helps us display its size alongside a button to load it on demand. 
And at 1-3% the size of our native apps, Twitter Lite requires only a fraction of the device storage space.</p> \n<h5>Design systems and iteration speed</h5> \n<p>Increasing our capability to iterate quickly helps us to maintain a high quality user experience. We rely heavily on flexbox for layout and a small, fixed number of colors, font sizes, and lengths. Twitter Lite is built from a component-based responsive design system that allows the app to fit any form factor. Working with UI components has helped us establish a shared vocabulary between design and engineering that encourages rapid iteration and reuse of existing building blocks. Some of our most complex features, such as mixed-content Timelines, can be created from as few as 30 lines of code configuring and connecting a Redux module to a React component.</p> \n<h5>Looking ahead</h5> \n<p>Building a fast web app at this scale, and keeping it fast, is a significant challenge involving design, product, and engineering from multiple teams at Twitter. We’re excited about our progress and experimenting with HTTP/2, GraphQL, and alternative compression formats to further reduce load times and data consumption. In the coming months, we’ll be shipping more improvements to the accessibility, safety, design, functionality, and performance of Twitter Lite.</p>\n\n<div class=\"tweet-error-text\">This Tweet is unavailable",
"date": "2017-04-06T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/topics/open-source/2017/how-we-built-twitter-lite",
"domain": "engineering"
},
{
"title": "Introducing Torch Decision Trees",
"body": "<p>The Twitter timelines team had been looking for a faster implementation of <a href=\"https://en.wikipedia.org/wiki/Gradient_boosting#Gradient_tree_boosting\">gradient boosted decision trees</a> (GBDT). Twitter Cortex provides DeepBird, which is an ML platform built around Torch. No GBDT solution was available in the Torch ecosystem, so we decided to build our own.</p> \n<p>Today we are excited to announce the <a href=\"https://github.com/twitter/torch-decisiontree\">torch-decisiontree</a> library, which is implemented in Lua and C using <a href=\"http://torch.ch/\">Torch</a>’s fast and easy-to-use tensor library. In the tradition of the Torch community and Twitter, we are open-sourcing the library so that others may further benefit and contribute.</p> \n<p><b>Example use case: applying decision forests to Tweets</b></p> \n<p>Torch-decisiontree provides the means to train GBDT and random forests. By organizing the data into a forest of trees, these techniques allow us to obtain richer features from data. For example, consider a dataset where each example is a Tweet represented as a <a href=\"https://en.wikipedia.org/wiki/Bag-of-words_model\">bag-of-words</a>. Furthermore, consider that we want the resulting trees to help predict if the tweet will be favorited. After training, each tree would map nodes to words in such a way that a Tweet’s nodes are easier to use to predict if the Tweet will be favorited or not.<br /> </p>\n\n<div class=\"tweet-error-text\">This Tweet is unavailable",
"date": "2017-10-09T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/topics/open-source/2017/Introducing-Torch-Decision-Trees",
"domain": "engineering"
},
{
"title": "Introducing Serial: improved data serialization on Android",
"body": "<p>Smooth timeline scrolling on the Twitter for Android app is important for the user experience, and we’re always looking for ways to improve it. With some profiling, we discovered that serializing and deserializing data to and from the database using standard Android Externalizable classes was taking around 15% of the UI thread time. Existing libraries provided little support for making iterative changes, and any changes that broke serialization caused bugs that were difficult to identify and fix without wiping the database clean.</p> \n<p>We set out to fix this, and today we are excited to announce <b>Serial, a new open source library for serialization.</b><br /> </p> \n<p>When we started developing Serial, we identified four main pain points around the standard Android serialization libraries:</p> \n<ul> \n <li>Performance: Slow serialization was directly impacting the user experience.</li> \n <li>Debuggability: When there was a bug in our serialized data, the debugging information was obtuse and provided very little insight into how to approach a fix.</li> \n <li>Backwards compatibility: Android libraries had little to no support for making changes to objects that are serialized without wiping the serialized data completely, which made iteration difficult.</li> \n <li>Flexibility: We wanted a library that could be easily adopted by our existing code and model structure.</li> \n</ul> \n<p>While other Java serialization libraries like Kryo and Flatbuffer attempt to solve some overlapping problems, no libraries that we found fit these needs effectively on Android. The libraries tend to target performance and backwards compatibility, but ignore debuggability and often require major changes to the existing codebase in order to adopt the framework.</p> \n<p><b>Performance</b><br /> </p> \n<p>We pinpointed reflection as a culprit for performance. With Externalizable, information about the class, including the class name and package, are added to the byte array when serializing the object. This allows the framework to identify which object to instantiate with the serialized data, and where to find that class in the app package structure, but it is a time consuming process.<br /> </p> \n<p>To remove this inefficiency, Serial allows the developer to define a Serializer for each object that needs to be serialized. The Serializer can explicitly enumerate how each field should be written to and read from the serialized stream. This removes the need for reflection and dynamic lookups.<br /> </p> \n<p>Note: The real Tweet object and other model objects in our codebase that get the most benefit from these changes are significantly larger than this example, but for simplicity this is a scaled down version.<br /> </p>\n\n<div class=\"tweet-error-text\">This Tweet is unavailable",
"date": "2017-11-06T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/topics/open-source/2017/introducing-serial",
"domain": "engineering"
},
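The Serial entry above refers to a scaled-down serializer example that did not survive in this scrape. As a stand-in, here is a minimal sketch of the pattern it describes (a hand-written serializer that explicitly enumerates each field instead of using reflection), written against plain java.io rather than Serial's actual API; the ExampleTweet model is hypothetical.

```java
import java.io.*;

// Sketch of the pattern Serial uses: instead of reflection discovering fields
// at runtime, a hand-written serializer enumerates exactly how each field is
// written and read. Plain java.io is used here, not Serial's actual API.
final class ExampleTweet {
  final long id;
  final String text;
  ExampleTweet(long id, String text) { this.id = id; this.text = text; }
}

final class ExampleTweetSerializer {
  // No class name or package metadata is written; the caller already knows
  // which serializer to use, which is what removes the reflective lookup.
  static void serialize(DataOutput out, ExampleTweet t) throws IOException {
    out.writeLong(t.id);
    out.writeUTF(t.text);
  }

  static ExampleTweet deserialize(DataInput in) throws IOException {
    long id = in.readLong();  // fields are read in the exact order written
    String text = in.readUTF();
    return new ExampleTweet(id, text);
  }
}

public class SerialSketch {
  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    ExampleTweetSerializer.serialize(new DataOutputStream(bytes), new ExampleTweet(42L, "hello"));
    ExampleTweet back = ExampleTweetSerializer.deserialize(
        new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
    System.out.println(back.id + " " + back.text);  // 42 hello
  }
}
```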
{
"title": "Introducing Vireo: a lightweight and versatile video processing library",
"body": "<p>Twitter is the place to see what’s happening in the world. On the video team, we strive to make watching and sharing videos seamless, fast and enjoyable for people around the world. </p> \n<p>We built Vireo to simplify the processing and delivery of video and today we’re happy to announce that we are open-sourcing it. Vireo is a lightweight and versatile video processing library that powers our video transcoding service, deep learning recognition systems and more.</p> \n<p>Vireo is written in C++11 and built with functional programming principles. It also optionally comes with Scala wrappers that enable us to build scalable video processing applications within our backend services. It is built on top of best of class open-source libraries (we did not reinvent the wheel), and defines a unified and modular interface for these libraries to communicate easily and efficiently. Performance was a strong focus, as well as memory consumption: only the strictly required objects are kept in memory at all times, and we pick the fastest code, with negligible overhead. Some operations, such as trimming or remuxing, are blazing fast even on mobile!</p> \n<p>Thanks to a unified interface, it is easy to write new modules (e.g. supporting a new codec) or swap out existing ones in favor of others (e.g. proprietary or hardware H.264 decoder).<br /> </p> \n<p>Here’s an example code snippet that transcodes an input video file in H.264 format and saves it as an MP4 file:</p>\n\n<div class=\"tweet-error-text\">This Tweet is unavailable",
"date": "2017-12-15T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/topics/open-source/2017/introducing-vireo",
"domain": "engineering"
},
{
"title": "Simplify Service Dependencies with Nodes",
"body": "<p>RPC services can be messy to implement. It can be even messier if you go with microservices. Rather than writing a monolithic piece of software that’s simple to deploy but hard to maintain, you write many small services each dedicated to a specific functionality, with a clean scope of features. They can be running in different environments, written at different times, and managed by different teams. They can be as far as a remote service across the continent, or as close as a logical component living in your own server process providing its work through an interface, synchronous or asynchronous. All said, you want to put together the work of many smaller components to process your request.</p> \n<p>This was exactly the problem that Blender, one of the major components of the Twitter Search backend, was facing. As one of the most complicated services in Twitter, it makes more than 30 calls to different services with complex interdependencies for a typical search request, eventually reaching hundreds of machines in the data center darkness. <a href=\"https://engineering/2011/twitter-search-is-now-3x-faster\">Since its deployment in 2011</a>, it has gone through several generations of refactoring regarding its dependency handling.</p> \n<p>At the beginning, it was simple. We had to call several backend services and put together a search response with the data collected. It’s easy to notice the dependencies between them: you need to get something from service A if you want to create a request for service B and C, whose responses will be used to create something to query service D with, and so on. We can draw a Directed Acyclic Graph (DAG) for dependencies between services in the workflow that processes the request, like this one:</p>\n\n<div class=\"tweet-error-text\">This Tweet is unavailable",
"date": "2016-11-11T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/topics/open-source/2016/simplify-service-dependencies-with-nodes",
"domain": "engineering"
},
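The A → (B, C) → D dependency graph described in the Nodes entry above can be made concrete. The sketch below expresses it with plain CompletableFuture rather than the Nodes library's own API; the service names and payload types are hypothetical.

```java
import java.util.concurrent.CompletableFuture;

// Library-agnostic sketch of the DAG described above (not the Nodes API):
// call A first, fan out to B and C with A's result, then combine B and C
// into a request for D. Each call runs as soon as its inputs are ready.
public class SearchDag {
  static CompletableFuture<String> callService(String name, String input) {
    // Stand-in for an async RPC; real code would return the RPC's future.
    return CompletableFuture.supplyAsync(() -> name + "(" + input + ")");
  }

  public static void main(String[] args) {
    CompletableFuture<String> a = callService("A", "query");
    // B and C each depend only on A, so they run concurrently once A is done.
    CompletableFuture<String> b = a.thenCompose(ra -> callService("B", ra));
    CompletableFuture<String> c = a.thenCompose(ra -> callService("C", ra));
    // D depends on both B and C.
    CompletableFuture<String> d =
        b.thenCombine(c, (rb, rc) -> rb + "+" + rc)
         .thenCompose(in -> callService("D", in));
    System.out.println(d.join());  // A -> {B, C} -> D, in dependency order
  }
}
```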
{
"title": "Reinforcement Learning for Torch: Introducing torch-twrl",
"body": "<p data-emptytext=\"Text\"></p>\n\n<div class=\"tweet-error-text\">This Tweet is unavailable",
"date": "2016-09-16T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/topics/open-source/2016/reinforcement-learning-for-torch-introducing-torch-twrl",
"domain": "engineering"
},
{
"title": "Open Sourcing Twitter Heron",
"body": "<p>Last year we <a href=\"https://engineering/2015/flying-faster-with-twitter-heron\" style=\"\">announced</a> the introduction of our new distributed stream computation system, Heron. Today we are excited to announce that we are open sourcing Heron under the permissive Apache v2.0 license. Heron is a proven, production-ready, real-time stream processing engine, which has been powering all of Twitter’s real-time analytics for over two years. Prior to Heron, we used <a href=\"http://storm.apache.org/\" style=\"\">Apache Storm</a>, which we <a href=\"https://engineering/2011/storm-coming-more-details-and-plans-release\" style=\"\">open sourced in 2011</a>. Heron features a wide array of architectural improvements and is backward compatible with the Storm ecosystem for seamless adoption.<br /> </p> \n<p>Everything that happens in the world happens on Twitter. That generates a huge volume of information in the form of billions of live Tweets and engagements. We need to process this constant stream of data in real-time to capture trending topics and conversations and provide better relevance to our users. This requires a streaming system that continuously examines the data in motion and computes analytics in real-time.</p> \n<p>Heron is a streaming system that was born out of the challenges we faced due to increases in volume and diversity of data being processed, as well as the number of use cases for real-time analytics. We needed a system that scaled better, was easier to debug, had better performance, was easier to deploy and manage, and worked in a shared multi-tenant cluster environment.</p> \n<p>To address these requirements, we weighed the options of whether to extend Storm, switch to another platform, or to develop a new system. Extending Storm would have required extensive redesign and rewrite of its core components. The next option we considered was using an existing open-source solution. However, there are a number of issues with respect to making several open systems work in their current form at our scale. In addition, these systems are not compatible with Storm’s API. Rewriting the existing topologies with a different API would have been time consuming, requiring our internal customers to go through a very long migration process. Furthermore, there are different libraries that have been developed on top of the Storm API, such as Summingbird. If we changed the underlying API of the streaming platform, we would have had to rewrite other higher-level components of our stack.</p> \n<p>We concluded that our best option was to rewrite the system from the ground-up, reusing and building upon some of the existing components within Twitter.</p> \n<p><b>Enter Heron.</b></p> \n<p>Heron represents a fundamental change in streaming architecture from a thread-based system to a process-based system. It is written in industry-standard languages (Java/C++/Python) for efficiency, maintainability, and easier community adoption. Heron is also designed for deployment in modern cluster environments by integrating with powerful open source schedulers, such as <a href=\"http://mesos.apache.org/\">Apache Mesos</a>, <a href=\"http://aurora.apache.org/\">Apache Aurora</a>, <a href=\"http://reef.apache.org/\">Apache REEF</a>, <a href=\"http://slurm.schedmd.com/\">Slurm</a>.</p> \n<p>One of our primary requirements for Heron was ease of debugging and profiling. 
Heron addresses this by running each task in a process of its own, resulting in increased developer productivity as developers are able to quickly identify errors, profile tasks, and isolate performance issues.</p> \n<p>To process large amounts of data in real-time, we designed Heron for high scale, as topologies can run on several hundred machines. At such a scale, optimal resource utilization is critical. We’ve seen <a href=\"http://dl.acm.org/citation.cfm?id=2742788\">2-5x better efficiency</a> with Heron, which has saved us significant OPEX and CAPEX costs. This level of efficiency was made possible by both the custom IPC layer and the simplification of the computational components’ architecture.</p> \n<p>Running at Twitter-scale is not just about speed, it’s also about ease of deployment and management. Heron is designed as a library to simplify deployment. Furthermore, by integrating with off-the-shelf schedulers, Heron topologies safely run alongside critical services in a shared cluster, thereby simplifying management. Heron has proved to be reliable and easy to support, resulting in an order of magnitude reduction of incidents.</p> \n<p>We built Heron on the basis of valuable knowledge garnered from our years of experience running Storm at Twitter. We are open sourcing Heron because we would like to share our insights and knowledge and continue to learn from and collaborate with the real-time streaming community.</p> \n<p>Our early partners include both Fortune 500 companies, including Microsoft, and startups who are already using Heron for an expanding set of real-time use cases, including ETL, model enhancement, anomaly/fraud detection, IoT/IoE applications, embedded systems, VR/AR, advertisement bidding, financial, security, and social media.</p> \n<p>“Heron enables organizations to deploy a unique real-time solution proven for the scale and reach of Twitter,” says Raghu Ramakrishnan, Chief Technology Officer (CTO) for the Data Group at Microsoft. “In working with Twitter, we are contributing an implementation of Heron that could be deployed on Apache Hadoop clusters running YARN and thereby opening up this technology to the entire big data ecosystem.”</p> \n<p>We are currently considering moving Heron to an independent open source foundation. If you want to join this discussion, see this <a href=\"https://github.com/twitter/heron/issues/606\">issue</a> on GitHub. To join the Heron community, we recommend getting started at <a href=\"http://heronstreaming.io/\">heronstreaming.io</a>, joining the discussion on Twitter at <a href=\"https://twitter.com/heronstreaming\" class=\"has-hover-card\">@heronstreaming</a> and viewing the <a href=\"https://github.com/twitter/heron\">source</a> on GitHub.</p> \n<p><b>Acknowledgements</b></p> \n<p>Large projects like Heron would not have been possible without the help of many people.</p> \n<p>Thanks to: <a href=\"https://twitter.com/Louis_Fumaosong\">Maosong Fu</a>, <a href=\"https://twitter.com/vikkyrk\">Vikas R. 
Kedigehalli</a>, <a href=\"https://twitter.com/saileshmittal\">Sailesh Mittal,</a><a href=\"https://twitter.com/billgraham\">Bill Graham</a>, <a href=\"https://twitter.com/luneng90\">Neng Lu</a>, <a href=\"https://twitter.com/JingweiWu\">Jingwei Wu</a>, <a href=\"https://twitter.com/cckellogg\">Christopher Kellogg</a>, <a href=\"https://twitter.com/ajorgensen\">Andrew Jorgensen</a>, <a href=\"https://twitter.com/brianhatfield\">Brian Hatfield</a>, <a href=\"https://twitter.com/msb5014\">Michael Barry</a>, <a href=\"https://twitter.com/zhilant\">Zhilan Zweiger</a>, <a href=\"https://twitter.com/lucperkins\">Luc Perkins</a>, <a href=\"https://twitter.com/sanjeevrk\">Sanjeev Kulkarni</a>, <a href=\"https://twitter.com/staneja\">Siddharth Taneja</a>, <a href=\"https://twitter.com/challenger_nik\">Nikunj Bhagat</a>, <a href=\"https://twitter.com/MengdieH\">Mengdie Hu</a>, <a href=\"https://twitter.com/lawrencey99\">Lawrence Yuan</a>, <a href=\"https://twitter.com/hitonyc\">Zuyu Zhang</a>, and <a href=\"https://twitter.com/pateljm\">Jignesh Patel</a> who worked on architecting, developing, and productionizing Heron.</p> \n<p>Thanks to the open source and legal teams: <a href=\"https://twitter.com/sasa\">Sasa Gargenta</a>, <a href=\"https://twitter.com/douglashudson\">Douglas Hudson</a>, <a href=\"https://twitter.com/cra\">Chris Aniszczyk</a>.</p> \n<p>Thanks to early testers who gave us valuable feedback on deployment and documentation.</p> \n<p><b>References</b></p> \n<p>[1] <a href=\"http://dl.acm.org/citation.cfm?id=2742788\">Twitter Heron: Streaming at Scale</a>, Proceedings of ACM SIGMOD Conference, Melbourne, Australia, June 2015.</p> \n<p>[2] <a href=\"http://dl.acm.org/citation.cfm?id=2595641\">Storm@Twitter</a>, Proceedings of ACM SIGMOD Conference, Snowbird, Utah, June 2014.</p>\n\n<div class=\"tweet-error-text\">This Tweet is unavailable",
"date": "2016-05-25T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/topics/open-source/2016/open-sourcing-twitter-heron",
"domain": "engineering"
},
{
"title": "2013",
"body": "",
"date": null,
"url": "https://engineering/engineering/en_us/a/2013",
"domain": "engineering"
},
{
"title": "Students: Apply now for Summer of Code",
"body": "<p>We are thrilled to have an opportunity again to participate and support the <a href=\"http://www.google-melange.com/gsoc/accepted_orgs/google/gsoc2013\">Summer of Code</a> program, especially since we enjoyed being involved so much <a href=\"http://engineering.twitter.com/2012/08/how-we-spent-our-summer-of-code.html\">last year for the first time</a>.<br><br> Unlike many Summer of Code participating organizations that focus on a single ecosystem, we have a variety of projects spanning multiple programming languages and open source communities. Here are a few of our <a href=\"https://github.com/twitter/twitter.github.com/wiki/Google-Summer-of-Code-2013\">project ideas</a> for this year:<br><br>Finagle<br><br><a href=\"http://twitter.github.io/finagle/\">Finagle</a> is a protocol-agnostic, asynchronous RPC system for the JVM that makes it easy to build robust clients and servers in Java, <a href=\"http://www.scala-lang.org/\">Scala</a> or any JVM-hosted language. It is extensively used within Twitter and other companies to run their backend services. This summer, we’re offering these project ideas:</p> \n<ul>\n <li>Distributed debugging: DTrace is a very powerful and versatile tool for debugging local application. We would like to employ similar types of instrumentation on a cluster of machines that form a distributed system, tracing requests based on specific conditions like the state of the server.</li> \n <li>Pure Finagle-based ZooKeeper client: ZooKeeper is the open sourced library of cluster coordination that we use at Twitter. We would like to implement a ZooKeeper client purely in Finagle.</li> \n</ul>\n<p><br> If you’re new to Scala, we recommend you check out our <a href=\"http://twitter.github.io/scala_school/\">Scala School</a> and <a href=\"http://twitter.github.io/effectivescala/\">Effective Scala</a> guides on GitHub.<br><br>Mesos<br><br><a href=\"http://incubator.apache.org/mesos/\">Apache Mesos</a> is a cluster manager that provides efficient resource isolation and sharing across distributed applications (or frameworks). It is extensively used at Twitter to run all sorts of jobs and applications. We are looking for a student to help us add <a href=\"https://issues.apache.org/jira/browse/MESOS-418\">security and authentication support</a> to Mesos (including integration with LDAP). We recommend signing up on the Mesos <a href=\"http://incubator.apache.org/mesos/mailing-lists.html%20\">mailing list</a> and if you want to learn more about Mesos, you might enjoy this <a href=\"http://www.wired.com/wiredenterprise/2013/03/google-borg-twitter-mesos/\">article</a> in Wired.<br><br>Scalding<br><br><a href=\"https://github.com/twitter/scalding\">Scalding</a> is a Scala library that makes it easy to specify Hadoop MapReduce jobs. Scalding is built on top of <a href=\"http://www.cascading.org/\">Cascading</a>, a Java library that abstracts away low-level Hadoop details. Scalding is comparable to <a href=\"http://pig.apache.org/%20\">Pig</a>, but offers tight integration with Scala, bringing advantages of Scala to your MapReduce jobs. This summer, we’re looking for students to help with:</p> \n<ul>\n <li>Scalding Read-eval-print-loop (REPL): Make a REPL to allow playing with scalding in local and remote mode with a REPL. The challenge here is scheduling which portions of the items can be scheduled to run, and which portions are not yet ready to run. 
You will build a DAG and when one is materialized, you schedule the part of the job that is dependent on that output.</li> \n <li>Integrate Algebird and Spire: Spire is a Scala library modeling many algebraic concepts. Algebird is a Twitter library that is very similar and has a subset of the objects in Spire. We would like to use the type-classes of Spire in Algebird. Algebird is focused on streaming/aggregation algorithms, which are a subset of Spire’s use cases.</li> \n</ul>\n<p><br> You can view all of our project ideas on our <a href=\"https://github.com/twitter/twitter.github.com/wiki/Google-Summer-of-Code-2013\">wiki</a>.<br><br> We strongly recommend that you <a href=\"http://www.google-melange.com/gsoc/org/google/gsoc2013/twitter\">submit your application</a> early and discuss your ideas with the respective project mentors. The deadline is <a href=\"http://www.google-melange.com/gsoc/events/google/gsoc2013\">May 03 at 19:00 UTC</a> and late applications cannot be accepted for any reason. You can always update your application and answer our questions after you submit it. If you have any questions not covered in the <a href=\"https://github.com/twitter/twitter.github.com/wiki/Google-Summer-of-Code-2013\">wiki</a>, ask them on our Summer of Code <a href=\"https://groups.google.com/forum/?fromgroups#!forum/twitter-gsoc\">mailing list</a>. We look forward to reading your applications and working with you on open source projects over the summer.<br><br> Good luck!<br><br> - Chris Aniszczyk, Manager of Open Source (<a href=\"https://twitter.com/cra\">@cra</a>)<br><br></p>",
"date": "2013-04-30T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2013/students-apply-now-for-summer-of-code",
"domain": "engineering"
},
{
"title": "Drinking from the Streaming API",
"body": "<p>Today we’re open-sourcing the <a href=\"https://github.com/twitter/hbc\">Hosebird Client</a> (hbc) under the <a href=\"https://github.com/twitter/hbc/blob/master/LICENSE\">ALv2 license</a> to provide a robust Java HTTP library for consuming Twitter’s <a href=\"https://dev.twitter.com/docs/streaming-apis\">Streaming API</a>. The client is full featured: it offers support for GZip, OAuth and partitioning; automatic reconnections with appropriate backfill counts; access to raw bytes payload; proper retry schemes, and relevant statistics. Even better, it’s been battle-tested in production by our internal teams. We highly recommend you take advantage of the Hosebird Client if you plan on working with the Streaming API.<br><br>Using Hosebird<br><br> The Hosebird Client is broken into two main modules: hbc-core and hbc-twitter4j. The hbc-core module uses a simple message queue that a consumer can poll for messages. The hbc-twitter4j module lets you use the superb <a href=\"http://twitter4j.org/\">Twitter4J</a> project and its data model on top of the message queue to provide a parsing layer.<br><br> The first step to use Hosebird is to setup the client using the ClientBuilder API:<br><br>// Create an appropriately sized blocking queue<br>BlockingQueueString&gt; queue = new LinkedBlockingQueueString&gt;(10000);<br>// Authenticate via OAuth<br>Authentication auth = new OAuth1(consumerKey, consumerSecret, token, secret);<br>// Build a hosebird client<br>ClientBuilder builder = new ClientBuilder()<br> &nbsp;&nbsp;&nbsp;.hosts(Constants.STREAM_HOST)<br> &nbsp;&nbsp;&nbsp;.authentication(auth)<br> &nbsp;&nbsp;&nbsp;.endpoint(new StatusesSampleEndpoint())<br> &nbsp;&nbsp;&nbsp;.processor(new StringDelimitedProcessor(queue))<br> &nbsp;&nbsp;&nbsp;.eventMessageQueue(queue);<br>Client hosebirdClient = builder.build();<br><br> After we have created a Client, we can connect and process messages:<br><br>client.connect();<br>while (!client.isDone()) {<br> &nbsp;String message = queue.take();<br> &nbsp;System.out.println(message); // print the message}<br><br>Hosebird Examples<br><br> We recommend you learn from the <a href=\"https://github.com/twitter/hbc/tree/master/hbc-example\">examples</a> on GitHub or contribute your own.<br><br> If you want a quick example, set these properties in hbc-example/pom.xml:<br>SECRET<br>SECRET<br>SECRET<br>SECRET<br><br> Then you can run this command on the command line:<br> mvn exec:java -pl hbc-example <br><br> This will connect to the <a href=\"https://dev.twitter.com/docs/api/1/get/statuses/sample\">sample stream API</a> and print 1000 JSON items from the API.<br><br> Acknowledgements<br> The Hosebird Client was primarily authored by Steven Liu (<a href=\"https://twitter.com/steven\">@steven</a>) and Kevin Oliver (<a href=\"https://twitter.com/kevino\">@kevino</a>). We’d also like to thank the <a href=\"https://twitter.com/TwitterAPI\">@TwitterAPI</a> team for their thoughtful suggestions and help.<br><br> On behalf of the Hosebird team, <br> - Chris Aniszczyk, Manager of Open Source (<a href=\"https://twitter.com/cra\">@cra</a>)</p>",
"date": "2013-02-28T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2013/drinking-from-the-streaming-api",
"domain": "engineering"
},
{
"title": "Twitter Typeahead.js: You Autocomplete Me",
"body": "<p>Twitter <a href=\"http://twitter.github.com/typeahead.js\">typeahead.js</a> is a fast and battle-tested jQuery plugin for auto completion. Today we’re open sourcing the code on <a href=\"https://github.com/twitter/typeahead.js\">GitHub</a> under the <a href=\"https://github.com/twitter/typeahead.js/blob/master/LICENSE\">MIT license</a>. By sharing a piece of our infrastructure with the open source community, we hope to evolve typeahead.js further with community input.<br><br><a href=\"http://4.bp.blogspot.com/-vUN5jO5VvfY/USPaIZbU5yI/AAAAAAAAAeU/Tix7jRANNpI/s1600/typeahead+image.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/twitter_typeaheadjsyouautocompleteme95.thumb.1280.1280.png\" alt=\"Twitter Typeahead.js: You Autocomplete Me\"></a><br><br><br> If your web application needs a fully-featured queryable search box, typeahead.js can help. Some of its capabilities and features include:</p> \n<ul>\n <li>Search data on the client, server, or both</li> \n <li>Handle multiple inputs on a single page with shared data and caching</li> \n <li>Suggest multiple types of data (e.g. searches and accounts) in a single input</li> \n <li>Support for international languages, including right-to-left (RTL) and input method editors (IME)</li> \n <li>Define custom matching and ranking functions</li> \n <li>Grey text hints that help explain what hitting tab will do</li> \n</ul>\n<p><br> It’s also optimized for large local datasets, so it’s fast for high-latency networks.<br><br>Examples<br><br> We recommend you take a look at our <a href=\"http://twitter.github.com/typeahead.js/examples/\">examples</a> page. There are three ways to get data:<br><br> Using local, hard-coded data passed on page render:</p> \n<pre>$('#input').typeahead([<br>{<br>name: 'planets',<br>local: [ \"Mercury\", \"Venus\", \"Earth\", \"Mars\", \"Jupiter\", \"Saturn\", \"Uranus\", \"Neptune\" ]<br>}<br>]);</pre> \n<p>Using a prefetch URL that will be hit to grab data on pageload and then stored in localStorage:</p> \n<pre>$('#input').typeahead([<br>{<br>name: 'countries',<br>prefetch: '/countries.json',<br>}<br>]);</pre> \n<p><br> Or using a queryable API that returns results as-you-type (with the query being passed in the ?q= parameter):</p> \n<pre>$('#input').typeahead([<br>{<br>name: 'countries',<br>remote: '/countries.json',<br>}<br>]);</pre> \n<p><br> You can also combine local or prefetch with a remote fallback for the performance of local data combined with the coverage of a remote query API (e.g. quickly search your friends but be able to find anyone on your site). There are lots of options for configuring everything from ranking, matching, rendering, templating engines, and more; check out the <a href=\"https://github.com/twitter/typeahead.js#readme\">README</a> for those details.<br><br> If you want to use this with a project like <a href=\"http://twitter.github.com/bootstrap\">Bootstrap</a>, all you have to do is include the JavaScript file for typeahead.js after Bootstrap’s JavaScript file and use our configuration options.<br><br> We initially built typeahead.js to support our needs; now we look forward to improvements and suggestions from the community. To learn more about how typeahead.js works, check out our detailed <a href=\"https://github.com/twitter/typeahead.js#readme\">documentation</a>. To stay in touch, follow <a href=\"https://twitter.com/typeahead\">@typeahead</a> and submit <a href=\"https://github.com/twitter/typeahead.js/issues\">issues</a> on GitHub. 
Also, if building web application frameworks like typeahead.js interests you, why not consider <a href=\"https://twitter.com/jobs/engineering\">joining the flock</a>?<br><br>Acknowledgements<br> Typeahead.js was primarily authored by Tim Trueman (<a href=\"https://twitter.com/timtrueman\">@timtrueman</a>), Veljko Skarich (<a href=\"https://twitter.com/vskarich\">@vskarich</a>) and Jake Harding (<a href=\"https://twitter.com/jakeharding\">@jakeharding</a>).</p>",
"date": "2013-02-19T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2013/twitter-typeaheadjs-you-autocomplete-me",
"domain": "engineering"
},
{
"title": "New Twitter search results",
"body": "<p>We just <a href=\"http://engineering/2013/02/search-and-discover-improvements-get.html\">shipped a new version of the Twitter app</a> with a brand new search experience that blends the most relevant content - Tweets, user accounts, images, news, related searches, and more - into a single stream of results. This is a major shift from how we have previously partitioned results by type (for instance, Tweet search vs. people search). We think this simplified experience makes it easier to find great content on Twitter using your mobile device. <br><br> A typical search scores items of the same type and picks the top-scoring results. In a blended search experience, this is not straightforward. The scores of different content types are computed by different services, and thus not directly comparable for blending. Another challenge is to decide which type of content to mix, as not all content types are always desirable to display. This post discusses our approach to solving these challenges.<br><br>Ranking<br><br> When a user searches, different types of content are searched separately, returning a sequence of candidate results for each content type with a type-specific score for each. For certain content types that are displayed as a single group or gallery unit, such as users or images, we assign the maximum score of results as the representative score of this content type. The result sequences for some content types may be trimmed or discarded entirely at this point.<br><br> Once results of different content types are prepared, each type-specific score is converted into a universally compatible score, called a “uniscore”. Uniscores of different modules are used as a means to blend content types as in a merge-sort, except for the penalization of content type transition. This is to avoid over-diversification of content types in the blended result.<br><br></p> \n<p><a href=\"http://3.bp.blogspot.com/-McCkggNJrU0/URKuInWEznI/AAAAAAAAAeE/pNNT8WnWwx8/s1600/Fig%2B1.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/new_twitter_searchresults95.thumb.1280.1280.png\" alt=\"New Twitter search results \"></a></p> \n<p>Fig. 1: Search ranker chose News1 followed by Tweet1 so far and is presented with three candidatesTweet2, User Group, and News2 to pick the content after Tweet1. News2 has the highest uniscore but search ranker picks Tweet2, instead of News2 as we penalize change in type between consecutive content by decreasing the score of News2 from 0.65 to 0.55, for instance.<br><br><br>Score unification<br><br> Individual content is assigned a type-specific score, which is called a “raw” score, by its corresponding service. To facilitate blending and ranking content of different types as described above, raw scores are converted into uniscores using type-specific log-linear score conversion functions – where the chance of a converted score to take its value in [0, 1] is at least 95%, as estimated from observed dataset.<br><br>Content selection and boosting<br><br> Certain types of content may not have many relevant items to show for a particular input query, in which case we may choose not to include this type of content in search results. In other cases, for instance if query volume or matched item counts have an unusual spike (what we call a “burst”), we show this type and may also boost it to appear at a higher place in the results. 
To facilitate this, we represent trends in searches or matching result counts as a single number that is proportional to the level of “burstiness”.<br><br> For example, consider measuring “burstiness” for the number of images and news content matching the query “photos”. We first obtain three sequences of term frequencies, e.g.:<br><br></p> \n<p><a href=\"http://3.bp.blogspot.com/-8RGFpYCO0eU/URKtx5XfFBI/AAAAAAAAAd0/RnDQO0Ysc6g/s1600/Fig%2B2.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/new_twitter_searchresults96.thumb.1280.1280.png\" alt=\"New Twitter search results \"></a></p> \n<p>Fig. 2: Three sequences of Tweet counts over eight 15-minute buckets, from bucket 1 (2 hours ago) to 8 (most recent).<br> Tweet: counts of Tweets that match the query “photos”.<br> Image: counts of Tweets that match the query “photos” and contain image links.<br> News: counts of Tweets that match the query “photos” and contain news links.<br> The query “photos” is shown not only to match Tweets with image links more than those with news links, but also to be increasing over time.<br><br> Our approach to computing the burstiness of image and news facets is an extension of original work by Jon Kleinberg on bursty structure detection, which in essence matches the current level of burst to one of a predefined set of bursty states, while minimizing overly diverse changes in matched states for smooth estimation [1].<br><br> In our extension, the burstiness of mixable content types, including images, users, news, and Tweets, is computed simultaneously to reflect the relative difference in bursty levels between types, and we use the distance of the observed rate from each state’s bursty level as the state cost. This is because accurately estimating the probability of occurrence is infeasible in real time due to its computational cost and the possible introduction of zero intervals between probability states from numerical approximation. Optimal state sequences for images and news are estimated as shown in Fig. 3.<br><br></p> \n<p><a href=\"http://1.bp.blogspot.com/-LaEnOU0Rkms/URKtqth29GI/AAAAAAAAAdo/wWnnvrvAaCA/s1600/Fig%2B3.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/new_twitter_searchresults97.thumb.1280.1280.png\" alt=\"New Twitter search results \"></a></p> \n<p>Fig. 3: Normalized image and news counts are matched to one of n=5 states: 1 average, 2 above, and 2 below. The matched-state curves show a more stable quantization of the original sequence, which has the effect of removing small noisy peaks.<br><br><br> Finally, the burstiness of each content type is computed as an exponential moving average of state IDs in the optimal state sequence. As shown in Fig. 3, jointly optimizing the sum of state cost and transition cost yields a smooth quantization of the original sequence, which automatically filters out small noisy peaks in the original counts. Also, this maps both trending (bursty) and steadily high sequences to a high burstiness value.<br><br> Burstiness computed this way is used to filter out content types with low or no bursts. It’s also used to boost the score of corresponding content types, as a feature for a multi-class classifier that predicts the most likely content type for a query, and in additional components of the ranking system.<br><br>References<br><br> [1] J. Kleinberg, Bursty and Hierarchical Structure in Streams, Proc. 8th ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining, 2002. 
(<a href=\"http://www.cs.cornell.edu/home/kleinber/bhs.pdf\">PDF</a>)<br><br> Posted by <a href=\"http://twitter.com/glassyocean\">Youngin Shin</a><br> Search-Quality Team</p>",
"date": "2013-02-06T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2013/new-twitter-search-results",
"domain": "engineering"
},
{
"title": "Introducing Flight: a web application framework",
"body": "<p>Last year we rolled out a <a href=\"http://engineering.twitter.com/2012/05/improving-performance-on-twittercom.html\">major reimplementation</a> of the Twitter website. In addition to shifting the rendering of our page content to the server (which achieved significant performance gains), we re-envisioned the entire client-side infrastructure with a clean, robust and easy-to-learn framework which we call <a href=\"http://twitter.github.com/flight/\">Flight</a>. Today we’re making Flight available to the open source community under the liberal <a href=\"https://github.com/twitter/flight/blob/master/LICENSE\">MIT license</a> as a framework for structuring web applications.</p> \n<p>Whether you use Flight as the JavaScript framework for your next web project, or just as source for new ideas, we look forward to learning from diverse perspectives via community feedback and contributions on <a href=\"https://github.com/twitter/flight\">GitHub</a>.<br><br><strong>Why Flight?<br></strong><br> Flight is distinct from existing frameworks in that it doesn’t prescribe or provide any particular approach to rendering or providing data to a web application. It’s agnostic on how requests are routed, which templating language you use, or even if you render your HTML on the client or the server. While some web frameworks encourage developers to arrange their code around a prescribed model layer, Flight is organized around the existing DOM model with functionality mapped directly to DOM nodes.<br><br> Not only does this obviate the need for additional data structures that will inevitably influence the broader architecture, but by mapping our functionality directly onto the native web we get to take advantage of native features. For example, we get custom event propagation for free by piggybacking off DOM event bubbling, and our event handling infrastructure works equally well with both native and custom events.<br><br><strong>How does it work?<br></strong><br> Flight enforces strict separation of concerns. When you create a component you don’t get a handle to it. Consequently, components cannot be referenced by other components and cannot become properties of the global object tree. This is by design. Components do not engage each other directly; instead, they broadcast their actions as events which are subscribed to by other components. <br><br><strong>Why events?<br></strong><br> Events are open-ended. When a component triggers an event, it has no knowledge of how its request will be satisfied or by whom. This enforced decoupling of functionality allows the engineer to consider each component in isolation rather than having to reason about the growing complexity of the application as a whole.<br><br> By making DOM node events proxies for component events, we let the web work for us:</p> \n<ul>\n <li>we get event propagation for free</li> \n <li>a component can subscribe to a given event type at the document level or it can choose to listen only those events originating from within a specified DOM Node</li> \n <li>subscribing components do not distinguish between custom events from other components (e.g. ‘dataMailItemsServed’) and native DOM node events (e.g. 
‘click’), and process both types of event in an identical fashion.</li> \n</ul>\n<p><a href=\"http://3.bp.blogspot.com/-kOtb8efsqvs/UQqgWfqxXhI/AAAAAAAAAdI/2lNVUu3dIyU/s1600/ss1.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/introducing_flightawebapplicationframework95.thumb.1280.1280.png\" alt=\"Introducing Flight: a web application framework\"></a></p> \n<p><strong>Mobility and testing<br></strong><br> Each component is a module that, aside from a minimal set of standard dependencies (relevant Flight utilities and mixins), has no reference to the outside world. Thus a given component will respond to a given event in the same way, regardless of environment. This makes testing simple and reliable — events are essentially the only variable, and a production event is easy to replicate in testing. You can even debug a component by triggering events in the console.<br><br><strong>Mixins<br></strong><br> A mixin defines a set of functionality that is useful to more than one object. Flight comes with built-in support for <a href=\"http://javascriptweblog.wordpress.com/2011/05/31/a-fresh-look-at-javascript-mixins/\">functional mixins</a>, including protection against unintentional overrides and duplicate mixins. While classical JavaScript patterns support only single inheritance, a component prototype (or other object) can have multiple mixins applied to it. Moreover, mixins requires a fraction of the boilerplate required to form traditional classical hierarchies out of constructor-prototypes hybrids, and don’t suffer the leaky abstractions of the latter (‘super’, ‘static’, ‘const’ etc.)<br><br><strong>Documentation and demo<br></strong><br> Our GitHub page includes <a href=\"https://github.com/twitter/flight/blob/master/README.md\">full documentation</a> as well as a <a href=\"https://github.com/twitter/flight/tree/gh-pages/demo\">sample app</a> in the form of an <a href=\"http://twitter.github.com/flight/demo/\">email client</a>:<br><br></p> \n<p><a href=\"http://4.bp.blogspot.com/-jG6yq1qafBc/UQqgcn9JrsI/AAAAAAAAAdU/IqsPjTLGg1Q/s1600/ss2.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/introducing_flightawebapplicationframework96.thumb.1280.1280.png\" alt=\"Introducing Flight: a web application framework\"></a></p> \n<p><br><strong>Future work<br></strong><br> Flight is an ongoing and evolving project. We’re planning to add a full testing framework and make available more of the utilities that we use for the Twitter website frontend. We also look forward to your contributions and comments. We know we haven’t thought of everything, and with your help we can continue to improve Flight for the benefit of everyone. <br><br><strong>Acknowledgments<br></strong><br> Flight was a group effort. These folks have contributed to the project: <a href=\"https://twitter.com/angustweets\">Angus Croll</a>, <a href=\"https://twitter.com/danwrong\">Dan Webb</a>, <a href=\"https://twitter.com/kpk\">Kenneth Kufluk</a>, along with other members the Twitter web team. A special thank you to <a href=\"https://github.com/twitter/flight#authors\">folks in the web community</a> who took the time to review the code.</p>",
"date": "2013-01-31T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2013/introducing-flight-a-web-application-framework",
"domain": "engineering"
},
{
"title": "Braindump",
"body": "<h2>Cross-posted from <a href=\"http://blog.oskarsson.nu/post/40196324612/the-twitter-stack\">@skr’s blog</a>.&nbsp; </h2> \n<p><br><br></p> \n<h2>The Twitter stack</h2> \n<p></p> \n<p>For various reasons, including performance and cost, Twitter has poured significant engineering effort into breaking down the site backend into smaller JVM based services. As a nice side effect we’ve been able to open source several of the libraries and other useful tools that came out of this effort.</p> \n<p>While there is a fair amount of information about these projects available as docs or slides I found no simple, high level introduction to what we can unofficially call the Twitter stack. So here it is. It’s worth noting that all this information is about open source projects, that it is public already and that I am not writing this as part of my job at Twitter or on their behalf.</p> \n<p>Now, granted these were not all conceived at Twitter and plenty of other companies have similar solutions. However I think the software mentioned below is quite powerful and with most of it released as open source it is a fairly compelling platform to base new services off of.</p> \n<p>I will describe the projects from a Scala perspective, but quite a few are useful in Java programs as well. See the&nbsp;<a href=\"http://twitter.github.com/scala_school/\">Twitter Scala</a>&nbsp;school for an intro to the language, although that is not required to understand this post.</p> \n<h1>Finagle</h1> \n<p></p> \n<p>At the heart of a service lies the&nbsp;<a href=\"https://github.com/twitter/finagle\">Finagle</a>&nbsp;library. By abstracting away the fundamental underpinnings of an RPC system, Finagle greatly reduces the complexity that service developers have to deal with. It allows us to focus on writing application-specific business logic instead of dwelling on lower level details of distributed systems. Ultimately the website itself uses these services to perform operations or fetch data needed to render the HTML. At Twitter the internal services use the&nbsp;<a href=\"http://thrift.apache.org/\">Thrift</a>&nbsp;protocol, but Finagle supports other protocols too such as Protocol buffers and HTTP.<br><br>Setting up a service using Finagle<br> A quick dive into how you would set up a Thrift service using Finagle.</p> \n<ol>\n <li>Write a Thrift file defining your API. It should contain the structs, exceptions and methods needed to describe the service functionality. See&nbsp;<a href=\"http://thrift.apache.org/docs/idl/\">Thrift Interface Description Language (IDL) docs</a>, in particular the examples at the end for more info.</li> \n <li>Use the Thrift file as input for a code generator that spits out code in your language. For Scala and Finagle based projects I would recommend&nbsp;<a href=\"https://github.com/twitter/scrooge\">Scrooge</a>.</li> \n <li>Implement the Scala trait generated from your Thrift IDL. This is where the actual functionality of your service goes.</li> \n <li>Provide the Finagle server builder an instance of the implementation above, a port to bind to and any other settings you might need and start it up.</li> \n</ol>\n<p><br> That looks pretty similar to just using plain Thrift without Finagle. However, there are quite a few improvements such as excellent monitoring support, tracing and Finagle makes it easy to write your service in an asynchronous fashion. More about these features later.<br><br> You can also use Finagle as a client. 
It takes care of all the boring stuff such as timeouts, retries and load balancing for you.</p> \n<h1>Ostrich</h1> \n<p></p> \n<p>So let’s say we have a Finagle Thrift service running. It’s doing very important work. Obviously you want to make sure it keeps doing that work and that it performs well. This is where&nbsp;<a href=\"https://github.com/twitter/ostrich\">Ostrich</a>&nbsp;comes in.<br><br>Metrics<br> Ostrich makes it easy to expose various metrics from your service. Let’s say you want to count how many times a particular piece of code is run. In your service you’d write a line of code that looks something like this:<br><br><code>Stats.incr(“some_important_counter”)</code><br><br> As simple as that. The counter named some_important_counter will be incremented by 1.&nbsp;<br><br> In addition to just straight up counters you can get gauges that report on the value of a variable:<br><br><code>Stats.addGauge(\"current_temperature\") { myThermometer.temperature }</code><br><br> or you can time a snippet of code to track its performance:<br><br><code>Stats.time(\"translation\") {<br> &nbsp;document.translate(\"de\", \"en\")<br> }</code><br><br> Those and other examples can be found in the&nbsp;<a href=\"https://github.com/twitter/ostrich\">Ostrich readme</a>.<br><br>Export metrics<br> Ostrich runs a small HTTP admin interface to expose these metrics and other functionality. To fetch them you would simply hit <a href=\"http://hostname:port/stats.json\" target=\"_blank\" rel=\"nofollow\">http://hostname:port/stats.json</a> to get the current snapshot of the metrics as JSON. At Twitter the stats from each service will be ingested from Ostrich by our internal observability stack, providing us with fancy graphs, alerting and so on.<br><br> To tie this back to our previous section: If you provide a Finagle client or server builder with an Ostrich-backed StatsReceiver it’ll happily splurt out tons of metrics about how the service is performing, the latencies for the RPC calls and the number of calls to each method, to name a few.<br><br> Ostrich can also deal with configuring your service, shutting down all the components gracefully and more.</p> \n<p><br><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/braindump95.thumb.1280.1280.png\" alt=\"Braindump\"><br> This is an example of what a dashboard could look like with stats gathered from Ostrich by our observability stack. Screenshot from <a href=\"https://twitter.com/intent/user?screen_name=raffi\">@raffi</a>’s&nbsp;<a href=\"http://www.slideshare.net/raffikrikorian/realtime-systems-at-twitter\">presentation</a>&nbsp;deck.<br><br></p> \n<h1>Zipkin</h1> \n<p></p> \n<p>Ostrich and Finagle combined give us good service level metrics. However, one downside of a more service oriented architecture is that it’s hard to get a high level performance overview of a single request throughout the stack.<br> Perhaps you are a developer tasked with improving performance of a particular external API endpoint. With Zipkin you can get a visual representation of where most of the time to fulfill the request was spent. Think Firebug or Chrome developer tools for the back end.&nbsp;<a href=\"https://github.com/twitter/zipkin/\">Zipkin</a>&nbsp;is an implementation of a tracing system based off of the Google Dapper paper.<br><br>Finagle-Zipkin<br> So how does it work? 
There’s a&nbsp;<a href=\"https://github.com/twitter/finagle/tree/master/finagle-zipkin\">finagle-zipkin</a>&nbsp;module that will hook into the transmission logic of Finagle and time each operation performed by the service. It also passes request identifiers down to any services it relies on; this is how we can tie all the tracing data together. The tracing data is logged to the Zipkin backend and finally we can display and visualize that data in the Zipkin UI.<br><br> Let’s say we use Zipkin to inspect a request and we see that it spent most of its time waiting for a query to a MySQL database. We could then also see the actual SQL query sent and draw some conclusions from it. Other times perhaps a GC in a Scala service was at fault. Either way, the hope is that a glance at the trace view will reveal where the developer should spend effort improving performance.<br><br> Enabling tracing for Finagle services is often as simple as adding</p> \n<p><code>.tracerFactory(ZipkinTracer())</code></p> \n<p>to your ClientBuilder or ServerBuilder. Setting up the whole Zipkin stack is a bit more work though; check out the docs for further assistance.</p> \n<p><br><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/braindump96.thumb.1280.1280.png\" alt=\"Braindump\"><br> Trace view, taken from my Strange Loop&nbsp;<a href=\"http://www.slideshare.net/johanoskarsson/zipkin-strangeloop\">talk</a>&nbsp;about Zipkin.</p> \n<h1>Mesos</h1> \n<p></p> \n<p><a href=\"http://incubator.apache.org/mesos/\">Mesos</a>&nbsp;describes itself as “a cluster manager that provides efficient resource isolation and sharing across distributed applications, or frameworks”. I’ll try to go through this section without using buzzwords such as “private cloud”, although technically I just did.<br><br> The core Mesos project is an open source Apache incubator project. On top of it you can run schedulers that deal with more specific technologies, for example Storm and Hadoop. The idea is that the same hardware can be used for multiple purposes, reducing wasted resources.<br><br> In addition to using&nbsp;<a href=\"http://storm-project.net/\">Storm</a>&nbsp;on top of Mesos we deploy some of our JVM-based services to internal Mesos clusters. With the proper configuration it takes care of concerns such as rack diversity, rescheduling if a machine goes down and so on.&nbsp;<br><br> The constraints imposed by Mesos have the positive side effect of enforcing adherence to various good distributed systems practices. For example:</p> \n<ul>\n <li>Service owners shouldn’t make any assumptions about jobs’ lifetimes, as the Mesos scheduler can move jobs to new hosts at any time.</li> \n <li>Jobs shouldn’t write to local disk, since persistence is not guaranteed.</li> \n <li>Deploy tooling and configs shouldn’t use static server lists, since Mesos implies deployment to a dynamic environment.</li> \n</ul>\n<p></p> \n<h1>Iago</h1> \n<p></p> \n<p>Before putting your new service into production you might want to check how it performs under load. That’s where&nbsp;<a href=\"https://github.com/twitter/iago\">Iago</a>&nbsp;(formerly Parrot) comes in handy. It’s a load testing framework that is pretty easy to use.<br><br> The process might look something like this:</p> \n<ol>\n <li>Collect relevant traffic logs that you want to use as the basis for your load test.</li> \n <li>Write a configuration file for the test. 
It contains the hostnames to send load to, the number of requests per second, the load pattern and so on.</li> \n <li>Write the actual load test. It receives a log line, which you transform into a request to a client.</li> \n <li>Run the load test. At Twitter this will start up a few tasks in a Mesos cluster, send the traffic and log metrics.</li> \n</ol>\n<p><br>Example<br> A load test class could be as simple as this:<br><br><code>class LoadTest(parrotService: ParrotService[ParrotRequest, Array[Byte]]) extends <br> &nbsp;ThriftRecordProcessor(parrotService) {<br><br> &nbsp;val client = new YourService.FinagledClient(service, new TBinaryProtocol.Factory())<br><br> &nbsp;def processLines(job: ParrotJob, lines: Seq[String]) {<br> &nbsp;&nbsp;&nbsp;lines foreach { line =&gt; client.doSomething(line) }<br> &nbsp;}<br> }&nbsp;</code><br><br> This class will feed each log line to your service’s doSomething method, according to the parameters defined in the configuration of parrotService.</p> \n<h1>ZooKeeper</h1> \n<p></p> \n<p>ZooKeeper is an Apache project that is handy for all kinds of distributed systems coordination.&nbsp;<br><br> One use case for ZooKeeper within Twitter is service discovery. Finagle services register themselves in ZooKeeper using our ServerSet library; see&nbsp;<a href=\"https://github.com/twitter/finagle/tree/master/finagle-serversets\">finagle-serversets</a>. This allows clients to simply say they’d like to communicate with “the production cluster for service a in data centre b” and the ServerSet implementation will ensure an up-to-date host list is available. Whenever new capacity is added the client will automatically be aware and will start load balancing across all servers.<br><br></p> \n<h1>Scalding</h1> \n<p></p> \n<p>From the&nbsp;<a href=\"https://github.com/twitter/scalding\">Scalding</a>&nbsp;github page: “Scalding is a Scala library that makes it easy to write MapReduce jobs in Hadoop. Instead of forcing you to write raw map and reduce functions, Scalding allows you to write code that looks like natural Scala”.<br><br> As it turns out, services that receive a lot of traffic generate tons of log entries. These can provide useful insights into user behavior, or perhaps you need to transform them to be suitable as Iago load test input.<br><br> I have to admit I was a bit sceptical about Scalding at first. It seemed there were already plenty of ways to write Hadoop jobs: Pig, Hive, plain MapReduce, Cascading and so on. However, when the rest of your project is in Scala it is very handy to be able to write Hadoop jobs in the same language. The syntax is often very close to the one used by Scala’s collection library, so you feel right at home, the difference being that with Scalding you might process terabytes of data with the same lines of code.<br><br> A simple word count example from their tutorial:</p> \n<p>&nbsp;&nbsp;<code>TextLine(args(\"input\"))<br> &nbsp;&nbsp;&nbsp;.read<br> &nbsp;&nbsp;&nbsp;.flatMap('line -&gt; 'word){ line : String =&gt; line.split(\"\\\\s\")}<br> &nbsp;&nbsp;&nbsp;.groupBy('word){group =&gt; group.size}<br> &nbsp;&nbsp;&nbsp;.write(Tsv(args(\"output\")))</code></p> \n<h1>jvmgcprof</h1> \n<p></p> \n<p>One of the well known downsides of relying on the JVM for time sensitive requests is that garbage collection pauses could ruin your day. If you’re unlucky a GC pause might hit at the wrong time, causing some requests to perform poorly or even time out. 
Worst case, that might have knock-on effects that lead to downtime.<br><br> As a first line of defence against GC issues you should of course tweak your JVM startup parameters to suit the kind of work the service is undertaking. I’ve found these&nbsp;<a href=\"http://www.slideshare.net/aszegedi/everything-i-ever-learned-about-jvm-performance-tuning-twitter\">slides</a>&nbsp;from Twitter alumnus Attila Szegedi extremely helpful.<br><br> Of course, you could minimize GC issues by reducing the amount of garbage your service generates. Start your service with jvmgcprof and it’ll help you reach that goal. If you already use Ostrich to track metrics in your service you can tell jvmgcprof which metric represents the work completed. For example you might want to know how many kilobytes of garbage are generated per incoming Thrift request. The jvmgcprof output for that could look something like this:<br><br><code> 2797MB w=101223 (231MB/s 28kB/w)<br> 50.00% &nbsp;8 &nbsp;&nbsp;297<br> 90.00% &nbsp;14 &nbsp;542<br> 95.00% &nbsp;15 &nbsp;572<br> 99.00% &nbsp;61 &nbsp;2237<br> 99.90% &nbsp;2620 &nbsp;&nbsp;&nbsp;94821<br> 99.99% &nbsp;2652 &nbsp;&nbsp;&nbsp;95974</code><br><br> On the first line you can see that the number of requests (the work count) was 101223 for the period monitored, with 231MB/s of garbage or 28kB per request. The garbage per request can easily be compared after changes have been made to see if they had a positive or negative impact on garbage generation. See the&nbsp;<a href=\"https://github.com/twitter/jvmgcprof\">jvmgcprof readme</a>&nbsp;for more information.</p> \n<h1>Summary</h1> \n<p></p> \n<p>It’s no surprise, but it turns out that having a common stack is very beneficial. Improvements and bug fixes made by one team will benefit others. There is of course another side to that coin: sometimes bugs are introduced that might be triggered only in your service. However, as an example, when developing Zipkin it was immensely helpful to be able to assume that everyone used Finagle. That way they would get tracing for free once we were done.<br><br> I have left out some of the benefits of the Twitter stack and how we use Scala, such as the very convenient way Futures allow you to deal with results from asynchronous requests. I hope to write a more in depth post on how to set up a Twitter style service that would deal with the details omitted in this article. In the meantime you can check out the&nbsp;<a href=\"http://twitter.github.com/scala_school/\">Scala school</a>&nbsp;for more information.<br><br> Thanks to everyone who worked on the projects mentioned in this article, too many to name but you know who you are.<br><br> Posted by <a href=\"https://twitter.com/skr\">Johan Oskarsson</a></p>",
"date": "2013-01-28T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2013/braindump",
"domain": "engineering"
},
{
"title": "Mobile app development: Catching crashers",
"body": "<p>Before Twitter for iOS code reaches production, we run it through static analysis, automated testing, code review, manual verification, and employee dogfooding. In this last step, we distribute beta builds to our employees to collect real-world feedback on products and monitor code stability through crash reports.</p> \n<p>Detailed crash data has had a huge impact to our development process, and significantly improved mobile app performance and quality. This post describes two types of bugs, and how we used detailed data from Crashlytics to diagnose and fix them.</p> \n<p><strong>Micro-bugs</strong><br>The most common crashers are not elaborate:</p> \n<ul>\n <li>You forgot to nil out delegate properties when the delegate is deallocated</li> \n <li>You didn’t check whether a block was nil before calling it</li> \n <li>You released an object in the wrong order</li> \n</ul>\n<p>These mistakes are solved with rote patterns. Sometimes you haven’t learned a pattern. Sometimes you know them, but lack rigor. We experienced the latter with category prefixing.</p> \n<p>Objective-C’s <a href=\"http://en.wikipedia.org/wiki/Objective-C#Categories\">Categories</a> allow you to attach new methods to classes you don’t own. For instance, we could attach isTweetable to NSString, which returns YES if the string is less than or equal to 140 characters.</p> \n<p>It’s part of Cocoa best practices to prefix categories with something unique. Instead of isTweetable, call your method tw_isTweetable. Then your app is safe if iOS adds a method with the same name to that class in the future. In the past, we usually did this, but missed a few categories.</p> \n<p>Late last year, we noticed several crashes consistently plaguing a small number of users. They were related to innocuous categories, but iOS documentation didn’t point to any name collisions.</p> \n<p>If we can’t reproduce the crash, we try to isolate the problem with crash environment data. Does it affect users of an older version of iOS? Certain hardware? Is it under a low-memory situation?</p> \n<p>Crashlytics revealed most of these crashes were on jailbroken devices. It turned out the jailbreak environment added its own unprefixed categories to core classes, and they shared the same names as our own categories.</p> \n<p>We discovered another set of crashes related to categories, on non-jailbroken devices. Older categories were inconsistently prefixed and collided with private categories used by the system frameworks. Sometimes you do the right thing — and just have bad luck.</p> \n<p><strong>Macro-bugs</strong><br>You should start with the simplest solution that works, but one day you will outgrow that solution. With a more complex architecture come more complex edge cases and their bugs.</p> \n<p>For example, Twitter for iPhone originally used NSKeyedArchiver for storage. To get more granular control over what we loaded from disk at launch, we moved to SQLite. Benchmarking on older hardware revealed that if we wanted to keep the main thread responsive, we had to enter the thorny world of SQLite multithreading.</p> \n<p>In our first implementation, we used a background queue to write incoming Tweets to a staging table. Then we bounced back to the main thread to replace the main table with the staging table.</p> \n<p>This looked fine during testing, but crash reports revealed database lock errors. SQLite’s write locks are not table level, but global. 
We had to serialize all write operations, so we rewrote the framework to use one GCD queue for reads and one for writes.</p> \n<p>Fixing that crash cleared the way for the next one: you should not share the same database connection between threads. You should open the connection on the thread where you’re going to use it. However, GCD makes no promises that items in one operation queue are dispatched to the same thread. We rewrote the framework to use native threads instead of GCD, and watched the graph of crashes dramatically drop.</p> \n<p>What lingered were database schema lock errors. We traced them back to the <a href=\"http://www.sqlite.org/sharedcache.html\">SQLite Shared Cache</a>. Disabling it eliminated the remaining crashes.</p> \n<p>While iTunes Connect collects crash reports from production builds, Crashlytics lets us collect crashes from employee dogfood builds, which dramatically reduces the feedback cycle. Instead of iterating on several public releases, we quickly address the crashes internally, and ship a better version to users.</p> \n<p><strong>On Crashlytics</strong><br>Last year, our Twitter for iOS team –– Satoshi Nakagawa, Bob Cottrell, Zhen Ma and I –– started using Crashlytics as an internal crash reporting framework. It was clear their analysis was the best available, immediately catching crashes other frameworks missed. Their web front-end was far more mature than what we were building internally. We liked them so much, we <a href=\"http://www.crashlytics.com/blog/crashlytics-is-joining-forces-with-twitter/\">welcomed them to the flock</a>.</p> \n<p>Today, the <a href=\"http://www.crashlytics.com/blog/its-finally-here-announcing-crashlytics-for-android/\">Crashlytics team released the Android SDK</a>, which our Android engineers have been beta testing. We’d all recommend you give it a try.</p> \n<p>Posted by Ben Sandofsky (<a href=\"https://twitter.com/intent/user?screen_name=sandofsky\">@sandofsky</a>)<br>Tech Lead, Twitter for Mac (previously, Twitter for iOS)</p>",
"date": "2013-05-30T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2013/mobile-app-development-catching-crashers",
"domain": "engineering"
},
{
"title": "CSP to the Rescue: Leveraging the Browser for Security",
"body": "<p>Programming is difficult — and difficult things generally don’t have a <em>perfect</em> solution. As an example, cross-site scripting (XSS) is still very much unsolved. It’s very easy to think you’re doing the right thing at the right time, but there are two opportunities to fail here: the fix might not be correct, and it might not be applied correctly. Escaping content (while still the most effective way to mitigate XSS) has a lot of “gotchas” (such as contextual differences and browser quirks) that show up time and time again. <a href=\"http://en.wikipedia.org/wiki/Content_Security_Policy\">Content Security Policy</a> (CSP) is an additional layer of security on top existing controls.</p> \n<p>Twitter has recently expanded our use of response headers in order to leverage the protection they provide via the browser. Headers like X-Frame-Options (for clickjacking protection) and Strict Transport Security (for enforcing SSL) are somewhat common these days, but we’re here to discuss and recommend Content Security Policy.</p> \n<p><strong>What is Content Security Policy?</strong></p> \n<p>Content Security Policy (CSP) is a whitelisting mechanism that allows you to declare what behavior is allowed on a given page. This includes where assets are loaded from, where forms can send data, and most importantly, what JavaScript is allowed to execute on a page. This is not the first time we’ve <a href=\"http://engineering.twitter.com/2011/03/improving-browser-security-with-csp.html\">blogged about CSP</a>&nbsp;or have dealt with CSP related&nbsp;vulnerabilities:</p> \n<div class=\"g-tweet\"> \n <blockquote class=\"twitter-tweet\"> \n <p><a href=\"http://t.co/V1r84QSL\">http://t.co/V1r84QSL</a>のjavascript:のリンクを埋め込めたXSSなおった。ちなみにFirefoxからではCSPによって実行をブロックされた。実際のXSSでCSPが反応したのを見たのはこれが初めて。 <a href=\"http://t.co/ytFYeR4w\">pic.twitter.com/ytFYeR4w</a></p> — Masato Kinugawa (\n <a href=\"https://twitter.com/intent/user?screen_name=kinugawamasato\">@kinugawamasato</a>) \n <a href=\"https://twitter.com/kinugawamasato/statuses/200002373068918787\">May 8, 2012</a>\n </blockquote>",
"date": "2013-06-06T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2013/csp-to-the-rescue-leveraging-the-browser-for-security",
"domain": "engineering"
},
{
"title": "Libcrunch and CRUSH Maps",
"body": "<p>When we <a href=\"https://engineering/2012/blobstore-twitter%E2%80%99s-house-photo-storage-system\">introduced our in-house photo storage system Blobstore to the world</a>, we discussed a mapping framework called libcrunch for Blobstore that maps virtual buckets to storage nodes. The libcrunch implementation was heavily inspired by the seminal <a href=\"http://www.ssrc.ucsc.edu/Papers/weil-sc06.pdf\">paper</a> on the CRUSH algorithm. Today, we are open-sourcing <a href=\"https://github.com/twitter/libcrunch\">libcrunch on GitHub</a> under the Apache Public License 2.0 to share some of our code with the greater community.</p> \n<h5>Why libcrunch?</h5> \n<p>In developing Blobstore, we knew we wanted a mapping framework that can:</p> \n<ul>\n <li>support a flexible topology definition</li> \n <li>support placement rules such as rack diversity</li> \n <li>handle replication factor (RF)</li> \n <li>distribute data with a good balance</li> \n <li>cause stable data movement as a result of topology change (when a small change in the topology occurs, the resulting data movement should be similarly small)</li> \n <li>exhibit a good mean time to recovery when a node fails</li> \n</ul>\n<p>In addition to these fairly standard requirements, we zeroed in on another critical factor, which we call Replica Distribution Factor (RDF). In our initial research, we didn’t find any open source libraries that met our needs on the JVM, so we developed our own. There are mapping algorithms that may satisfy some of these criteria (such as consistent hashing) but none satisfies all of them, especially RDF.</p> \n<h5>What is RDF?</h5> \n<p>RDF is defined as the number of data nodes that share any data with a node. To understand RDF, you can look at one data node and how many other data nodes share any data with that node. In an extreme case, you can imagine data mapping where each data node is completely replicated by another data node. In this case, RDF would be 1. In another extreme case, every single data node may participate in having replicas of every other node. In that case, RDF would be as large as the size of the entire cluster.</p> \n<p>The key concern RDF seeks to address is the (permanent) loss of any data. It would be useful to think about the following scenario. Suppose the replication factor is 2 (2 replicas for each datum). And suppose that we lost one data node for any reason (disk failures and loss of racks). Then the number of replicas for any data that was on that lost data node is down to 1. At this point, if we lose any of those replicas, that piece of data is permanently lost. Assuming that the probability of losing one data node is small and independent (a crude but useful approximation), one can recognize that the probability of losing any data increases proportionally with the number of data nodes that share data with the lost node. And that is the definition of RDF. The bigger the RDF the bigger the probability of losing any data in case of data node loss. By tuning RDF down to a smaller number, one can mitigate the probability of permanent data loss to an acceptable level.</p> \n<p>As you can imagine, RDF becomes much more relevant if the replication factor (RF) is small. 
One can adopt a larger RF to address the risk of data loss but it would come at a cost, and a prohibitively expensive one if the data size is large.</p> \n<p>Libcrunch is designed to deliver these functionalities, including RDF.</p> \n<h5>How does it work?</h5> \n<p>The libcrunch implementation uses the basic CRUSH algorithm as the building block of how it computes mapping. The CRUSH algorithm provides a number of functionalities that are mentioned in the paper. By using this algorithm to store and retrieve data, we can avoid a single point of failure and scale easily.</p> \n<p>To be able to limit the size of the RDF, we use a two-pass approach. In the first pass, we compute what we call the RDF mapping using the same cluster topology but using each data node (or its identifier) as the data. This way, we can come up with a fairly well-defined RDF set from which data mapping can be handled later. In the second pass, we compute the actual data mapping. But for a given data object, we don’t use the full cluster to select the data nodes. Instead, we limit the selection to one of the RDF sets we computed in the first pass.</p> \n<h5>Example code</h5> \n<p>A mapping function is needed when you have a number of data objects you want to distribute to a number of nodes or containers. For example, you may want to distribute files to a number of storage machines. But it need not be limited to physical storage. Any time logical data is mapped to a logical container, you can use a mapping function.</p> \n<p>Creating and using the libcrunch mapping functions is pretty straightforward. The key part is to implement the placement rules you desire (such as rack isolation rules), and set up your cluster topology in terms of the type Node provided by libcrunch. Then you get the mapping result via the MappingFunction.computeMapping() method. For example:</p> \n<pre>// set up the placement rules<br>PlacementRules rules = createPlacementRules();<br><br>// instantiate the mapping function<br>MappingFunction mappingFunction = new RDFMapping(rdf, rf, rules, targetBalance);<br><br>// prepare your data<br>List&lt;Long&gt; data = prepareYourDataIds();<br><br>// set up the topology<br>Node root = createTopology();<br><br>// compute the mapping<br>Map&lt;Long,List&lt;Node&gt;&gt; mapping = mappingFunction.computeMapping(data, root);</pre> \n<h5>Future work</h5> \n<p>In the near future, we look forward to improving documentation and nurturing a community around libcrunch. We are also constantly looking for ways to improve various aspects of the algorithm such as balance and stability. We also plan to adopt libcrunch in other storage systems we are developing at Twitter.</p> \n<p>If you’d like to help work on any features or have any bug fixes, we’re always looking for contributions or people to <a href=\"https://twitter.com/jobs/engineering\">join the flock</a> to build out our core storage technology. Just submit a pull request to say hello or reach out to us on the <a href=\"https://groups.google.com/forum/?fromgroups#!forum/twitter-libcrunch\">mailing list</a>. If you find something broken or have feature request ideas, report it in the <a href=\"https://github.com/twitter/libcrunch/issues\">issue tracker</a>.</p> \n<h5>Acknowledgements</h5> \n<p>Libcrunch is a team effort by Twitter’s Core Storage team (<a href=\"https://twitter.com/intent/user?screen_name=corestorage\">@corestorage</a>). 
It was primarily authored by Jerry Xu (<a href=\"https://twitter.com/intent/user?screen_name=jerryxu\">@jerryxu</a>) and Sangjin Lee (<a href=\"https://twitter.com/intent/user?screen_name=sjlee\">@sjlee</a>). The idea for libcrunch came out of discussions by Peter Schuller (<a href=\"https://twitter.com/intent/user?screen_name=scode\">@scode</a>), Boaz Avital (<a href=\"https://twitter.com/intent/user?screen_name=bx\">@bx</a>)&nbsp;and Chris Goffinet (<a href=\"https://twitter.com/intent/user?screen_name=lenn0x\">@lenn0x</a>), who also has made a number of direct contributions to the current manifestation. We’d also like to thank Stu Hood (<a href=\"https://twitter.com/intent/user?screen_name=stuhood\">@stuhood</a>) for his invaluable feedback and contributions.</p>",
"date": "2013-06-19T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2013/libcrunch-and-crush-maps",
"domain": "engineering"
},
{
"title": "Zippy Traces with Zipkin in your Browser",
"body": "<p>Last summer, we <a href=\"https://engineering/2012/distributed-systems-tracing-zipkin\">open-sourced Zipkin</a>, a distributed tracing system that helps us gather timing and dependency data for the many services involved in managing requests to Twitter. As we continually improve Zipkin, today we’re adding a <a href=\"https://github.com/twitter/zipkin/tree/master/zipkin-browser-extension/firefox\">Firefox extension</a> to Zipkin that makes it easy to see trace visualizations in your browser as you navigate your website.</p> \n<div class=\"g-tweet\"> \n <blockquote class=\"twitter-tweet\"> \n <p>Get Zippy Traces with Zipkin in your browser! <a href=\"https://t.co/v75OcKry32\">https://t.co/v75OcKry32</a></p> — Zipkin project (\n <a href=\"https://twitter.com/intent/user?screen_name=zipkinproject\">@zipkinproject</a>) \n <a href=\"https://twitter.com/zipkinproject/statuses/347083739362365440\">June 18, 2013</a>\n </blockquote>",
"date": "2013-06-19T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2013/zippy-traces-zipkin-your-browser",
"domain": "engineering"
},
{
"title": "hRaven and the @HadoopSummit",
"body": "<p>Today marks the start of the <a href=\"http://hadoopsummit.org/san-jose/\">Hadoop Summit</a>, and we are thrilled to be a part of it. A few of our engineers will be participating in talks about our Hadoop usage at the summit:</p> \n<ul>\n <li>Day 1, 4:05pm: <a href=\"http://parquet.io/\">Parquet</a>: Columnar storage for the People</li> \n <li>Day 1, 4:55pm: A cluster is only as strong as its weakest link</li> \n <li>Day 2, 11:00am: A Birds-Eye View of Pig and <a href=\"https://github.com/twitter/scalding\">Scalding</a> Jobs with hRaven</li> \n <li>Day 2, 1:40pm: Hadoop Hardware at Twitter: Size does matter!</li> \n</ul>\n<p>As Twitter’s use of Hadoop and MapReduce rapidly expands, tracking usage on our clusters grows correspondingly more difficult. With an ever-increasing job load (tens of thousands of jobs per day), and a reliance on higher level abstractions such as Apache <a href=\"http://pig.apache.org/\">Pig</a> and Scalding, the utility of existing tools for viewing job history decreases rapidly.</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/hraven_and_the_hadoopsummit95.thumb.1280.1280.png\" width=\"700\" height=\"196\" alt=\"hRaven and the @HadoopSummit\"></p> \n<p>Note how paging through a very large number of jobs becomes unrealistic, especially when newly finished jobs push jobs rapidly through pages before you can navigate there.</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/hraven_and_the_hadoopsummit96.thumb.1280.1280.png\" width=\"527\" height=\"73\" alt=\"hRaven and the @HadoopSummit\"></p> \n<p>Extracting insights and browsing thousands of jobs becomes a challenge using the existing JobTracker user interface. We created <a href=\"https://github.com/twitter/hraven\">hRaven</a> to improve this situation and are open sourcing the code on GitHub today at the Hadoop Summit under the Apache Public License 2.0 to share with the greater Hadoop community.</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/hraven_and_the_hadoopsummit97.thumb.1280.1280.png\" width=\"203\" height=\"91\" alt=\"hRaven and the @HadoopSummit\"></p> \n<h5>Why hRaven?</h5> \n<p>There were many questions we wanted to answer when we were created hRaven. For example, how many Pig versus Scalding jobs do we run? What cluster capacity do jobs in my pool take? How many jobs do we run each day? What percentage of jobs have more than 30,000 tasks? Why do I need to hand-tune these (hundreds) of jobs, can’t the cluster learn and do it?</p> \n<p>We found that the existing tools were unable to start answering these questions at our scale.</p> \n<h5>How does it work?</h5> \n<p>hRaven archives the full history and metrics from all MapReduce jobs on clusters and strings together each job from a Pig or Scalding script execution into a combined flow. From this archive, we can easily derive aggregate resource utilization by user, pool, or application. Historical trending of an individual application allows us to perform runtime optimization of resource scheduling. 
The key concepts in hRaven are:</p> \n<ul>\n <li><strong>cluster</strong>: each cluster has a unique name mapping to the JobTracker</li> \n <li><strong>user</strong>: MapReduce jobs are run as a given user</li> \n <li><strong>application</strong>: a Pig or Scalding script (or plain MapReduce job)</li> \n <li><strong>flow</strong>: the combined DAG of jobs from a single execution of an application</li> \n <li><strong>version</strong>: changes impacting the DAG are recorded as a new version of the same application</li> \n</ul>\n<p>hRaven stores statistics, job configuration, timing and counters for every MapReduce job on every cluster. The key metrics stored are:</p> \n<ul>\n <li>Submit, launch and finish timestamps</li> \n <li>Total map and reduce tasks</li> \n <li>HDFS bytes read and written</li> \n <li>File bytes read and written</li> \n <li>Total map slot milliseconds</li> \n <li>Total reduce slot milliseconds</li> \n</ul>\n<p>This structured data around the full DAG of MapReduce jobs allows you to query for historical trending information or, better yet, perform job optimization based on historical execution information. A concrete example is a custom Pig parallelism estimator querying hRaven that we use to automatically adjust reducer count.</p> \n<p>Data is loaded into hRaven in three steps, coordinated through ProcessRecords, which record processing state in HBase:</p> \n<ol>\n <li>JobFilePreprocessor</li> \n <li>JobFileRawLoader</li> \n <li>JobFileProcessor</li> \n</ol>\n<p>First, the HDFS JobHistory location is scanned and the JobHistory and JobConfiguration file names of newly completed jobs are added to a sequence file. Then a MapReduce job runs on the source cluster to load the JobHistory and JobConfiguration files into HBase in parallel. Finally, in the third step, a MapReduce job runs on the HBase cluster to parse the JobHistory and store individual stats and indexes.</p> \n<p>hRaven provides access to all of its stored data via a REST API, allowing auxiliary services such as web UIs and other tools to be built on it with ease.</p> \n<p>Below is a screenshot of an internal Twitter reporting application based on hRaven data showing overall cluster growth. Similarly we can visualize spikes in load over time, changes in reads and writes by application and by pool, as well as aspects such as pool usage vs. allocation. We also use hRaven data to calculate compute cost along varying dimensions.</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/hraven_and_the_hadoopsummit98.thumb.1280.1280.png\" width=\"700\" height=\"260\" alt=\"hRaven and the @HadoopSummit\"></p> \n<h5>Future work</h5> \n<p>In the near future, we want to add real-time data loading from JobTracker and come up with a full flow-centric replacement for the JobTracker user interface (on top of integrating with the <a href=\"https://github.com/twitter/ambrose\">Ambrose</a> project). We would also like hRaven to be enhanced to capture flow information from jobs run by frameworks other than Pig and Cascading, for instance Hive. Furthermore, we are in the process of supporting Hadoop 2.0 and want to focus on building a community around hRaven.</p> \n<p>The project is still young, so if you’d like to help work on any features or have any bug fixes, we’re always looking for contributions or people to <a href=\"https://twitter.com/jobs/positions?jvi=oJBqXfwr,Job\">join the flock</a> to expand our <a href=\"https://twitter.com/intent/user?screen_name=corestorage\">@corestorage</a> team. 
In particular, we are looking for <a href=\"https://twitter.com/jobs/positions?jvi=oJBqXfwr,Job\">engineers</a> with Hadoop and HBase experience. To say hello, just submit a pull request, follow <a href=\"https://twitter.com/intent/user?screen_name=TwitterHadoop\">@TwitterHadoop</a> or reach out to us on the <a href=\"http://groups.google.com/group/hraven-dev\">mailing list</a>. If you find something broken or have feature request ideas, report it in the <a href=\"https://github.com/twitter/hraven/issues\">issue tracker</a>.</p> \n<h5>Acknowledgements</h5> \n<p>hRaven was primarily authored by Gary Helmling (<a href=\"https://twitter.com/intent/user?screen_name=gario\">@gario</a>), Joep Rottinghuis (<a href=\"https://twitter.com/intent/user?screen_name=joep\">@joep</a>), Vrushali Channapattan (<a href=\"https://twitter.com/intent/user?screen_name=vrushalivc\">@vrushalivc</a>) and Chris Trezzo (<a href=\"https://twitter.com/intent/user?screen_name=ctrezzo\">@ctrezzo</a>) from the Twitter <a href=\"https://twitter.com/intent/user?screen_name=corestorage\">@corestorage</a> Hadoop team. In addition, we’d like to acknowledge the following folks who contributed to the project either directly or indirectly: Bill Graham (<a href=\"https://twitter.com/intent/user?screen_name=billgraham\">@billgraham</a>), Chandler Abraham (<a href=\"https://twitter.com/intent/user?screen_name=cba\">@cba</a>), Chris Aniszczyk (<a href=\"https://twitter.com/intent/user?screen_name=cra\">@cra</a>), Michael Lin (<a href=\"https://twitter.com/intent/user?screen_name=mlin\">@mlin</a>) and Dmitriy Ryaboy (<a href=\"https://twitter.com/intent/user?screen_name=squarecog\">@squarecog</a>).</p>",
"date": "2013-06-26T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2013/hraven-and-the-hadoopsummit",
"domain": "engineering"
},
{
"title": "Mesos Graduates from Apache Incubation",
"body": "<p>The Apache Software Foundation (ASF) has <a href=\"https://blogs.apache.org/foundation/entry/the_apache_software_foundation_announces45\">announced</a> the graduation of <a href=\"http://mesos.apache.org\">Apache Mesos</a>, the open source cluster manager that is used and heavily supported by Twitter, from its incubator.</p> \n<div class=\"g-tweet\"> \n <blockquote class=\"twitter-tweet\"> \n <p>we’ve become a Top-Level Project at <a href=\"https://twitter.com/search?q=%23Apache&amp;src=hash\">#Apache</a>! <a href=\"https://t.co/qk7CxCOLfi\">https://t.co/qk7CxCOLfi</a> <a href=\"https://twitter.com/search?q=%23mesos&amp;src=hash\">#mesos</a></p> — Apache Mesos (\n <a href=\"https://twitter.com/intent/user?screen_name=ApacheMesos\">@ApacheMesos</a>) \n <a href=\"https://twitter.com/ApacheMesos/statuses/360039441500340224\">July 24, 2013</a>\n </blockquote>",
"date": "2013-07-24T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2013/mesos-graduates-from-apache-incubation",
"domain": "engineering"
},
{
"title": "Announcing Parquet 1.0: Columnar Storage for Hadoop",
"body": "<p>In March we <a href=\"http://blog.cloudera.com/blog/2013/03/introducing-parquet-columnar-storage-for-apache-hadoop/\">announced</a> the <a href=\"http://parquet.io/\">Parquet</a> project, the result of a collaboration between Twitter and <a href=\"https://twitter.com/clouderaeng\">Cloudera</a> intended to create an open-source columnar storage format library for Apache <a href=\"http://hadoop.apache.org/\">Hadoop</a>.</p> \n<div class=\"g-tweet\"> \n <blockquote class=\"twitter-tweet\"> \n <p>We’re happy to release Parquet 1.0.0, more at: <a href=\"https://t.co/xKilQU22a5\">https://t.co/xKilQU22a5</a> 90+ merged pull requests since announcement: <a href=\"https://t.co/lrKdrNiUQA\">https://t.co/lrKdrNiUQA</a></p> — Parquet Format (\n <a href=\"https://twitter.com/intent/user?screen_name=ParquetFormat\">@ParquetFormat</a>) \n <a href=\"https://twitter.com/ParquetFormat/statuses/362265505144381440\">July 30, 2013</a>\n </blockquote>",
"date": "2013-07-30T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2013/announcing-parquet-10-columnar-storage-for-hadoop",
"domain": "engineering"
},
{
"title": "Login verification on Twitter for iPhone and Android",
"body": "<p><span>At Twitter, we want to make it easy as possible to secure your account. Designing a secure authentication protocol is tough; designing one that is also simple and intuitive is even harder. We think <a href=\"https://engineering/2013/improvements-to-login-verification-photos-and-more\">our new login verification feature</a> is an improvement in both security and usability, and we’re excited to share it with you.</span></p> \n<h5>More secure, you say?</h5> \n<p>Traditional two-factor authentication protocols require a shared secret between the user and the service. For instance, OTP protocols use a shared secret modulated by a counter (<a href=\"http://en.wikipedia.org/wiki/HMAC-based_One-time_Password_Algorithm\">HOTP</a>) or timer (<a href=\"http://en.wikipedia.org/wiki/Time-based_One-time_Password_Algorithm\">TOTP</a>). A weakness of these protocols is that the shared secret can be compromised if the server is compromised. We chose a design that is resilient to a compromise of the server-side data’s confidentiality: Twitter doesn’t persistently store secrets, and the private key material needed for approving login requests never leaves your phone.</p> \n<p>Other previous attacks against two-factor authentication have taken advantage of compromised SMS delivery channels. This solution avoids that because the key necessary to approve requests never leaves your phone. Also, our updated login verification feature provides additional information about the request to help you determine if the login request you see is the one you’re making.</p> \n<h5>And easier to use, as well?</h5> \n<p>Now you can enroll in login verification and approve login requests right from the Twitter app on iOS and Android. Simply tap a button on your phone, and you’re good to go. This means you don’t have to wait for a text message and then type in the code each time you sign in on twitter.com.</p> \n<h5>So how does it work?</h5> \n<p>When you enroll, your phone generates an asymmetric 2048-bit RSA keypair, which stores the private key locally on the device and sends the public key, which Twitter stores as part of your user object in our backend store, to the server.</p> \n<p>Whenever you initiate a login request by sending your username and password, Twitter will generate a challenge and request ID –– each of which is a 190-bit (32 alphanumerics) random nonce –– and store them in memcached. The request ID nonce is returned to the browser or client attempting to authenticate, and then a push notification is sent to your phone, letting you know you have a login verification request.</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/login_verificationontwitterforiphoneandandroid95.thumb.1280.1280.png\" width=\"640\" height=\"360\" alt=\"Login verification on Twitter for iPhone and Android\"></p> \n<p>Within your Twitter app, you can then view the outstanding request, which includes several key pieces of information: time, geographical location, browser, and the login request’s challenge nonce. At that point, you can choose to approve or deny the request. If you approve the request, the client will use its private key to respond by signing the challenge. If the signature is correct, the login request will be marked as verified.</p> \n<p>In the meantime, the original browser will poll the server with the request ID nonce. 
When the request is verified, the polling will return a session token and the user will be signed in.</p> \n<blockquote class=\"g-quote g-tweetable\"> \n <p>Login verification is more secure and easier to use. And you can still sign in even if you lose your phone.</p> \n</blockquote> \n<h5>What happens if I don’t have my phone?</h5> \n<p>The private key is only stored on the phone. However, there’s still a way to sign in to Twitter even if you don’t have your phone or can’t connect to Twitter –– by using your backup code. We encourage you to store it somewhere safe.</p> \n<p>To make the backup code work without sharing secrets, we use an algorithm inspired by <a href=\"http://www.ece.northwestern.edu/CSEL/skey/skey_eecs.html\">S/KEY</a>. During enrollment, your phone generates a 64-bit random seed, SHA256 hashes it 10,000 times, and turns it into a 60-bit (12 characters of readable base32) string. It sends this string to our servers. The phone then asks you to write down the next backup code, which is the same seed hashed 9,999 times. Later, when you send us the backup code to sign in, we hash it one time, and then verify that the resulting value matches the value we initially stored. Then, we store the value you sent us, and the next time you generate a backup code it will hash the seed 9,998 times.</p> \n<p>This means that you don’t have to be connected to Twitter to generate a valid backup code. And, due to the one-way property of the hash algorithm, even if an attacker could read the data on our servers, he or she wouldn’t be able to generate one.</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/login_verificationontwitterforiphoneandandroid96.thumb.1280.1280.png\" width=\"640\" height=\"360\" alt=\"Login verification on Twitter for iPhone and Android\"></p> \n<p>In addition to storing your backup codes safely, we encourage you to back up your phone. If you have an iPhone, you should make an <a href=\"http://support.apple.com/kb/ht4946\">encrypted backup</a>, which stores the cryptographic material necessary to recover your account easily in case you lose your phone or upgrade to a new phone. Also if you’re upgrading, you can simply un-enroll on your old phone and re-enroll on your new phone.</p> \n<h5>What if I want to sign in to Twitter on a service that doesn’t support login verification?</h5> \n<p>Twitter clients that use <a href=\"https://dev.twitter.com/docs/oauth/xauth\">XAuth authentication</a>, which expects a username and a password, don’t always support login verification directly. Instead, you’ll need to go to twitter.com on the web, navigate to your password settings, and generate a temporary password. You can use this password instead of your regular password to sign in over XAuth.</p> \n<h5>What’s next?</h5> \n<p>We’ll continue to make improvements so signing in to Twitter is even easier and more secure. For example, we’re working on building login verification into our clients and exposing a login verification API for other XAuth clients so people who don’t have access to the web also have a seamless login experience.</p>",
"date": "2013-08-06T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2013/login-verification-on-twitter-for-iphone-and-android",
"domain": "engineering"
},
{
"title": "Visualizing Epics and Dependencies in JIRA",
"body": "<p>The <a href=\"https://twitter.com/twittertpm\">@TwitterTPM</a> team has been exploring ways to visualize work across a number of teams and programs within engineering. One of our key goals is to highlight work that has dependencies and understand whether those dependencies have been met. We’ve arrived at a solution that combines <a href=\"https://www.atlassian.com/software/jira\">JIRA</a>, <a href=\"http://atlassian.com/software/greenhopper\">GreenHopper</a> and <a href=\"https://www.atlassian.com/software/confluence\">Confluence</a>.</p> \n<div class=\"g-tweet\"> \n <blockquote class=\"twitter-tweet\"> \n <p>Ancient TPM proverb: “a watched JIRA never moves”</p> — Twitter TPM Team (\n <a href=\"https://twitter.com/intent/user?screen_name=TwitterTPM\">@TwitterTPM</a>) \n <a href=\"https://twitter.com/TwitterTPM/statuses/366989702470975488\">August 12, 2013</a>\n </blockquote>",
"date": "2013-08-12T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2013/visualizing-epics-and-dependencies-in-jira",
"domain": "engineering"
},
{
"title": "Twitter University: Building a world-class engineering organization",
"body": "<p>As Twitter has scaled, so too has our engineering organization. To help our engineers grow, it’s important for them to have access to world-class technical training, along with opportunities to teach the skills they’ve mastered. To that end, we’re establishing Twitter University.</p> \n<blockquote class=\"g-quote g-tweetable\"> \n <p>We want Twitter to be the best place in the world for engineers to work.</p> \n</blockquote> \n<p>Twitter University builds on several existing efforts at Twitter. We currently offer employees a whole swath of technical trainings, from orientation classes for new engineers to iOS Bootcamp, JVM Fundamentals, Distributed Systems, Scala School, and more for those who want to develop new skills. Most of these classes are taught by our own team members, and many of them have been organized during our quarterly Hack Weeks –– a testament to our engineers’ passion for learning and education.</p> \n<p>I’ve been inspired by these efforts. Being able to continually learn on the job and develop a sense of expertise or mastery is a fundamental factor in success in the technology industry and long term happiness at a company. Twitter University will be a vital foundation for our engineering organization.</p> \n<p>To lead the program, we’ve acquired <a href=\"http://marakana.com\">Marakana</a>, a company dedicated to open source training. We’ve been working with them for several months. The founders, Marko and Sasa Gargenta, have impressed us with their entrepreneurial leadership, commitment to learning and technical expertise.</p> \n<p>The Marakana team has cultivated a tremendous community of engineers in the Bay Area, and we look forward to engaging with all of you at meet-ups and technical events. Additionally, we’ll continue to <a href=\"https://github.com/twitter\">contribute to open source software</a>, and we aim to release some of the Twitter University content online to anyone who’d like to learn. You can keep up with Twitter University by following <a href=\"https://twitter.com/university\">@university</a>.</p> \n<p>We want Twitter to be the best place in the world for engineers to work.&nbsp;<a href=\"https://twitter.com/jobs/engineering\">Join us</a>.</p> \n<p><em><span><strong>Update</strong>: Changed Twitter University’s username to <a href=\"https://twitter.com/intent/user?screen_name=university\">@university</a>.&nbsp;</span></em></p>",
"date": "2013-08-13T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2013/twitter-university-building-a-world-class-engineering-organization",
"domain": "engineering"
},
{
"title": "New Tweets per second record, and how!",
"body": "<p><span>Recently, something remarkable happened on Twitter: On Saturday, August 3 in Japan, people watched an airing of <a href=\"http://en.wikipedia.org/wiki/Castle_in_the_Sky\">Castle in the Sky</a>, and at one moment they took to Twitter so much that we hit a one-second peak of 143,199 Tweets per second. (August 2 at 7:21:50 PDT; August 3 at 11:21:50 JST)</span></p> \n<p>To give you some context of how that compares to typical numbers, we normally take in more than 500 million Tweets a day which means about 5,700 Tweets a second, on average. This particular spike was around 25 times greater than our steady state.</p> \n<p><span><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/new_tweets_per_secondrecordandhow95.thumb.1280.1280.png\" width=\"700\" height=\"209\" alt=\"New Tweets per second record, and how!\"></span></p> \n<p><span>During this spike, our users didn’t experience a blip on Twitter. That’s one of our goals: to make sure Twitter is always available no matter what is happening around the world.</span></p> \n<blockquote class=\"g-quote g-tweetable\"> \n <p>New Tweets per second (TPS) record: 143,199 TPS. Typical day: more than 500 million Tweets sent; average 5,700 TPS.</p> \n</blockquote> \n<p><span>This goal felt unattainable three years ago, when the 2010 World Cup put Twitter squarely in the center of a </span><span></span><a href=\"https://engineering/2010/2010-world-cup-global-conversation\">real-time, global conversation</a><span>. The influx of Tweets –– from every shot on goal, penalty kick and yellow or red card –– repeatedly took its toll and made Twitter unavailable for short periods of time. Engineering worked throughout the nights during this time, desperately trying to find and implement order-of-magnitudes of efficiency gains. Unfortunately, those gains were quickly swamped by Twitter’s rapid growth, and engineering had started to run out of low-hanging fruit to fix.</span></p> \n<p>After that experience, we determined we needed to step back. We then determined we needed to re-architect the site to support the continued growth of Twitter and to keep it running smoothly. Since then we’ve worked hard to make sure that the service is resilient to the world’s impulses. We’re now able to withstand events like Castle in the Sky viewings, the Super Bowl, and the global New Year’s Eve celebration. This re-architecture has not only made the service more resilient when traffic spikes to record highs, but also provides a more flexible platform on which to build more features faster, including synchronizing direct messages across devices, Twitter cards that allow Tweets to become richer and contain more content, and a rich search experience that includes stories and users. And more features are coming.</p> \n<p>Below, we detail how we did this. We learned a lot. We changed our engineering organization. And, over the next few weeks, we’ll be publishing additional posts that go into more detail about some of the topics we cover here.<br><br><strong>Starting to re-architect</strong></p> \n<p>After the 2010 World Cup dust settled, we surveyed the state of our engineering. Our findings:</p> \n<ul>\n <li>We were running one of the world’s largest Ruby on Rails installations, and we had pushed it pretty far –– at the time, about 200 engineers were contributing to it and it had gotten Twitter through some explosive growth, both in terms of new users as well as the sheer amount of traffic that it was handling. 
This system was also monolithic: everything we did, from managing raw database and memcache connections through to rendering the site and presenting the public APIs, was in one codebase. Not only was it increasingly difficult for an engineer to be an expert in how it was put together, but it was also organizationally challenging for us to manage and parallelize our engineering team.</li> \n <li>We had reached the limit of throughput on our storage systems –– we were relying on a MySQL storage system that was temporally sharded and had a single master. That system was having trouble ingesting tweets at the rate that they were showing up, and we were operationally having to create new databases at an ever-increasing rate. We were experiencing read and write hot spots throughout our databases.</li> \n <li>We were “throwing machines at the problem” instead of engineering thorough solutions –– our front-end Ruby machines were not handling the number of transactions per second that we thought was reasonable, given their horsepower. From previous experiences, we knew that those machines could do a lot more.</li> \n <li>Finally, from a software standpoint, we found ourselves pushed into an “optimization corner” where we had started to trade off readability and flexibility of the codebase for performance and efficiency.</li> \n</ul>\n<p>We concluded that we needed to start a project to re-envision our system. We set three goals and challenges for ourselves:</p> \n<ul>\n <li>We wanted big infrastructure wins in performance, efficiency, and reliability –– we wanted to improve the median latency that users experience on Twitter as well as bring in the outliers to give a uniform experience to Twitter. We wanted to reduce the number of machines needed to run Twitter by 10x. We also wanted to isolate failures across our infrastructure to prevent large outages –– this is especially important as the number of machines we use goes up, because it means that the chance of any single machine failing is higher. Failures are also inevitable, so we wanted to have them happen in a much more controllable manner.</li> \n <li>We wanted cleaner boundaries with “related” logic being in one place –– we felt the downsides of running our particular monolithic codebase, so we wanted to experiment with a loosely coupled services oriented model. Our goal was to encourage the best practices of encapsulation and modularity, but this time at the systems level rather than at the class, module, or package level.</li> \n <li>Most importantly, we wanted to launch features faster. We wanted to be able to run small and empowered engineering teams that could make local decisions and ship user-facing changes, independent of other teams.</li> \n</ul>\n<p>We prototyped the building blocks for a proof of concept re-architecture. Not everything we tried worked and not everything we tried, in the end, met the above goals. But we were able to settle on a set of principles, tools, and an infrastructure that has gotten us to a much more desirable and reliable state today.</p> \n<p><br><strong>The JVM vs the Ruby VM</strong><br>First, we evaluated our front-end serving tier across three dimensions: CPU, RAM, and network. Our Ruby-based machinery was being pushed to the limit on the CPU and RAM dimensions –– but we weren’t serving that many requests per machine nor were we coming close to saturating our network bandwidth. Our Rails servers, at the time, had to be effectively single-threaded and handle only one request at a time. 
Each Rails host was running a number of Unicorn processes to provide host-level concurrency, but the duplication there translated to wasteful resource utilization. When it came down to it, our Rails servers were only capable of serving 200 - 300 requests / sec / host.</p> \n<p>Twitter’s usage is always growing rapidly, and doing the math, it would take a lot of machines to keep up with the growth curve.</p> \n<p>At the time, Twitter had experience deploying fairly large scale JVM-based services –– our search engine was written in Java, and our Streaming API infrastructure as well as Flock, <a href=\"https://engineering/2010/introducing-flockdb\">our social graph system</a>, was written in Scala. We were enamored with the level of performance that the JVM gave us. It wasn’t going to be easy to get our performance, reliability, and efficiency goals out of the Ruby VM, so we embarked on writing code to be run on the JVM instead. We estimated that rewriting our codebase could get us &gt; 10x performance improvement, on the same hardware –– and now, today, we push on the order of 10 - 20K requests / sec / host.</p> \n<p>There was a level of trust that we all had in the JVM. A lot of us had come from companies where we had experience working with, tuning, and operating large scale JVM installations. We were confident we could pull off a sea change for Twitter in the world of the JVM. Now, we had to decompose our architecture and figure out how these different services would interact.</p> \n<p><br><strong>Programming model</strong><br>In Twitter’s Ruby systems, concurrency is managed at the process level: a single network request is queued up for a process to handle. That process is completely consumed until the network request is fulfilled. Adding to the complexity, architecturally, we were taking Twitter in the direction of having one service compose the responses of other services. Given that the Ruby process is single-threaded, Twitter’s “response time” would be additive and extremely sensitive to the variances in the back-end systems’ latencies. There were a few Ruby options that gave us concurrency; however, there wasn’t one standard way to do it across all the different VM options. The JVM had constructs and primitives that supported concurrency and would let us build a real concurrent programming platform.</p> \n<p>It became evident that we needed a single and uniform way to think about concurrency in our systems and, specifically, in the way we think about networking. As we all know, writing concurrent code (and concurrent networking code) is hard and can take many forms. In fact, we began to experience this. As we started to decompose the system into services, each team took slightly different approaches. For example, the failure semantics from clients to services didn’t interact well: we had no consistent back-pressure mechanism for servers to signal back to clients and we experienced “thundering herds” from clients aggressively retrying latent services. These failure domains informed us of the importance of having a unified, and complementary, client and server library that would bundle in notions of connection pools, failover strategies, and load balancing. 
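</p> \n<p>The composition style we were after is easier to show than to describe. Below is a minimal sketch using the Future type from Twitter’s util library; the two backend stubs are made up for illustration and are not our production services:</p> \n<pre>import com.twitter.util.{Await, Future}<br><br>case class User(id: Long, name: String)<br>case class Tweet(id: Long, text: String)<br><br>// Hypothetical asynchronous backends; real ones would be Finagle clients.<br>def fetchUser(id: Long): Future[User] =<br>  Future.value(User(id, \"jack\"))<br>def fetchTimeline(id: Long): Future[Seq[Tweet]] =<br>  Future.value(Seq(Tweet(1L, \"just setting up my twttr\")))<br><br>// Both calls are dispatched concurrently and combined without blocking a<br>// thread, so latency approaches the max of the two, not their sum.<br>def profile(id: Long): Future[(User, Seq[Tweet])] =<br>  fetchUser(id).join(fetchTimeline(id))<br><br>println(Await.result(profile(12L)))</pre> \n<p>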
To help us all get in the same mindset, <a href=\"https://engineering/2011/finagle-protocol-agnostic-rpc-system\">we put together both Futures and Finagle</a>.</p> \n<p>Now, not only did we have a uniform way to do things, but we also baked into our core libraries everything that all our systems needed so we could get off the ground faster. And rather than worry too much about how each and every system operated, we could focus on the application and service interfaces.</p> \n<p><br><strong>Independent systems</strong><br>The largest architectural change we made was to move from our monolithic Ruby application to one that is more services oriented. We focused first on creating Tweet, timeline, and user services –– our “core nouns”. This move afforded us cleaner abstraction boundaries and team-level ownership and independence. In our monolithic world, we either needed experts who understood the entire codebase or clear owners at the module or class level. Sadly, the codebase was getting too large to have global experts and, in practice, having clear owners at the module or class level wasn’t working. Our codebase was becoming harder to maintain, and teams constantly spent time going on “archeology digs” to understand certain functionality. Or we’d organize “whale hunting expeditions” to try to understand large scale failures that occurred. At the end of the day, we’d spend more time on this than on shipping features, which we weren’t happy with.</p> \n<p>Our theory was, and still is, that a services oriented architecture allows us to develop the system in parallel –– we agree on networking RPC interfaces, and then go develop the system internals independently –– but it also meant that the logic for each system was self-contained. If we needed to change something about Tweets, we could make that change in one location, the Tweet service, and then that change would flow throughout our architecture. In practice, however, we find that not all teams plan for change in the same way: for example, a change in the Tweet service may require other services to do an update if the Tweet representation changed. On balance, though, this works out more often than not.</p> \n<p><span><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/new_tweets_per_secondrecordandhow96.thumb.1280.1280.png\" width=\"700\" height=\"398\" alt=\"New Tweets per second record, and how!\"></span></p> \n<p><span>This system architecture also mirrored the way we wanted to, and now do, run the Twitter engineering organization. Engineering is set up with (mostly) self-contained teams that can run independently and very quickly. This means that we bias toward teams spinning up and running their own services that can call the back end systems. This has huge implications for operations, however.</span></p> \n<p><br><strong>Storage</strong><br>Even if we broke apart our monolithic application into services, a huge bottleneck that remained was storage. Twitter, at the time, was storing tweets in a single master MySQL database. We had taken the strategy of storing data temporally –– each row in the database was a single tweet, we stored the tweets in order in the database, and when the database filled up we spun up another one and reconfigured the software to start populating the next database. 
This strategy had bought us some time, but we were still having issues ingesting massive tweet spikes because they would all be serialized into a single database master, and we were experiencing read load concentration on a small number of database machines. We needed a different partitioning strategy for Tweet storage.</p> \n<p>We took Gizzard, our framework to create sharded and fault-tolerant distributed databases, and applied it to tweets. We created T-Bird. In this case, Gizzard was fronting a series of MySQL databases –– every time a tweet comes into the system, Gizzard hashes it, and then chooses an appropriate database. Of course, this means we lose the ability to rely on MySQL for unique ID generation. <a href=\"https://engineering/2010/announcing-snowflake\">Snowflake</a> was born to solve that problem. Snowflake allows us to create an almost-guaranteed globally unique identifier. We rely on it to create new tweet IDs, at the tradeoff of no longer having “increment by 1” identifiers. Once we have an identifier, we can rely on Gizzard then to store it. Assuming our hashing algorithm works and our tweets are close to uniformly distributed, we increase our throughput by the number of destination databases. Our reads are also then distributed across the entire cluster, rather than being pinned to the “most recent” database, allowing us to increase throughput there too.</p> \n<p><br><strong>Observability and statistics</strong><br>We’ve traded our fragile monolithic application for a more robust and encapsulated, but also complex, services oriented application. We had to invest in tools to make managing this beast possible. Given the speed with which we were creating new services, we needed to make it incredibly easy to gather data on how well each service was doing. By default, we wanted to make data-driven decisions, so we needed to make it trivial and frictionless to get that data.</p> \n<p>As we were going to be spinning up more and more services in an increasingly large system, we had to make this easier for everybody. Our Runtime Systems team created two tools for engineering: Viz and <a href=\"https://engineering/2012/distributed-systems-tracing-zipkin\">Zipkin</a>. Both of these tools are exposed and integrated with Finagle, so all services that are built using Finagle get access to them automatically.</p> \n<pre>stats.timeFuture(\"request_latency_ms\") {<br>// dispatch to do work<br>}</pre> \n<p>The above code block is all that is needed for a service to report statistics into Viz. From there, anybody using Viz can write a query that will generate a timeseries and graph of interesting data like the 50th and 99th percentile of request_latency_ms.</p> \n<p></p> \n<p><strong>Runtime configuration and testing</strong><br>Finally, as we were putting this all together, we hit two seemingly unrelated snags: launches had to be coordinated across a series of different services, and we didn’t have a place to stage services that ran at “Twitter scale”. We could no longer rely on deployment as the vehicle to get new user-facing code out there, and coordination was going to be required across the application. In addition, given the relative size of Twitter, it was becoming difficult for us to run meaningful tests in a fully isolated environment. We had relatively few issues testing for correctness in our isolated systems –– what we needed was a way to test at large scale. We embraced runtime configuration.</p> \n<p>We integrated a system we call Decider across all our services. 
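</p> \n<p>In application code, the Decider pattern looks roughly like the sketch below; the API shown is hypothetical and simplified, not Decider’s actual interface:</p> \n<pre>import scala.util.Random<br><br>// Hypothetical runtime-configuration client; availability values can be<br>// changed centrally, in near-real time, without redeploying the service.<br>trait Decider {<br>  def availability(feature: String): Int // 0 to 10000, read at runtime<br>  def isAvailable(feature: String): Boolean =<br>    Random.nextInt(10000) &lt; availability(feature)<br>}<br><br>def renderTimeline(decider: Decider): String =<br>  if (decider.isAvailable(\"new_timeline_path\"))<br>    \"new implementation\" // deployed dark, then ramped up gradually<br>  else<br>    \"old implementation\"</pre> \n<p>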
It allows us to flip a single switch and have multiple systems across our infrastructure all react to that change in near-real time. This means software and multiple systems can go into production when teams are ready, but a particular feature doesn’t need to be “active”. Decider also allows us to have the flexibility to do binary and percentage-based switching, such as having a feature available for x% of traffic or users. We can deploy code in the fully “off” and safe setting, and then gradually turn it up and down until we are confident it’s operating correctly and systems can handle the new load. All this alleviates our need to do any coordination at the team level, and instead we can do it at runtime.</p> \n<p><br><strong>Today</strong><br>Twitter is more performant, efficient and reliable than ever before. We’ve sped up the site incredibly across the 50th (p50) through 99th (p99) percentile distributions and the number of machines involved in serving the site itself has been decreased anywhere from 5x-12x. Over the last six months, Twitter has flirted with four 9s of availability.</p> \n<p><span><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/new_tweets_per_secondrecordandhow97.thumb.1280.1280.png\" width=\"700\" height=\"375\" alt=\"New Tweets per second record, and how!\"></span></p> \n<p><span>Twitter engineering is now set up to mimic our software stack. We have teams that are ready for long term ownership and to be experts on their part of the Twitter infrastructure. Those teams own their interfaces and their problem domains. Not every team at Twitter needs to worry about scaling Tweets, for example. Only a few teams –– those that are involved in the running of the Tweet subsystem (the Tweet service team, the storage team, the caching team, etc.) –– have to scale the writes and reads of Tweets, and the rest of Twitter engineering gets APIs to help them use it.</span></p> \n<p>Two goals drove us as we did all this work: Twitter should always be available for our users, and we should spend our time making Twitter more engaging, more useful and simply better for our users. Our systems and our engineering team now enable us to launch new features faster and in parallel. We can dedicate different teams to work on improvements simultaneously and have minimal logjams when those features collide. Services can be launched and deployed independently from each other (in the last week, for example, we had more than 50 deploys across all Twitter services), and we can defer putting everything together until we’re ready to make a new build for iOS or Android.</p> \n<p>Keep an eye on this blog and <a href=\"https://twitter.com/twittereng\">@twittereng</a> for more posts that will dive into details on some of the topics mentioned above.</p> \n<p><em>Thanks go to Jonathan Reichhold (<a href=\"https://twitter.com/jreichhold\">@jreichhold</a>), David Helder (<a href=\"https://twitter.com/dhelder\">@dhelder</a>), Arya Asemanfar (<a href=\"https://twitter.com/a_a\">@a_a</a>), Marcel Molina (<a href=\"https://twitter.com/noradio\">@noradio</a>), and Matt Harris (<a href=\"https://twitter.com/themattharris\">@themattharris</a>) for helping contribute to this blog post.</em></p>",
"date": "2013-08-16T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2013/new-tweets-per-second-record-and-how",
"domain": "engineering"
},
{
"title": "Bootstrap 3.0",
"body": "<p>We are thrilled to see the <a href=\"https://github.com/twbs/bootstrap\">Bootstrap</a> community announce the <a href=\"http://blog.getbootstrap.com/2013/08/19/bootstrap-3-released/\">3.0 release</a>:</p> \n<div class=\"g-tweet\"> \n <blockquote class=\"twitter-tweet\"> \n <p>the <a href=\"https://twitter.com/twbootstrap\">@twbootstrap</a> team released 3.0, check it out <a href=\"http://t.co/11DcMjnRCF\">http://t.co/11DcMjnRCF</a></p> — Twitter Open Source (\n <a href=\"https://twitter.com/intent/user?screen_name=TwitterOSS\">@TwitterOSS</a>) \n <a href=\"https://twitter.com/TwitterOSS/statuses/369567959871537152\">August 19, 2013</a>\n </blockquote>",
"date": "2013-08-19T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2013/bootstrap-30",
"domain": "engineering"
},
{
"title": "Improving accessibility of twitter.com",
"body": "<p><span>Timelines are the core content of Twitter. And with </span><a href=\"http://en.wikipedia.org/wiki/Device_independence\">device independence</a><span> being one of the core principles of accessibility, we’ve spent the past quarter tuning and enhancing keyboard access for timelines. Our goal was to provide a first-class user experience for consuming and interacting with timelines using the keyboard.</span></p> \n<p>The need for keyboard access isn’t always obvious. It’s sometimes considered a power or pro user feature, but for users who are unable to use the mouse, keyboard access is a necessity. For example, visually impaired users often rely entirely on the keyboard as the mouse requires the user to be able to see the screen. Other users suffer physical injuries which make it either impossible or painful to use the mouse. In <a href=\"http://www.youtube.com/watch?v=ZoLOyyS5700\">The Pointerless Web</a>, <a href=\"https://twitter.com/slicknet\">@slicknet</a> tells a personal story about how the keyboard became essential for him as a result of his RSI.</p> \n<p>The list of use cases is endless. What’s most important is to remember that humans are all different, and because of our differences, we all have different needs. The more options we have as users, the better. Through that lens, the importance of the keyboard is elevated as it offers another option to the user.</p> \n<p></p> \n<p><strong>Keyboard shortcuts</strong></p> \n<p><span>Since the introduction of our keyboard shortcuts in 2010, users have been able to navigate through timelines using the j and k keys to move the selection cursor up or down. With a clearer understanding of the people who benefit from keyboard shortcuts, we were able to identify and fix gaps in our implementation.</span></p> \n<p></p> \n<blockquote class=\"g-quote g-tweetable\"> \n <p>Our goal was to provide a first-class user experience for consuming and interacting with timelines using the keyboard.</p> \n</blockquote> \n<p><strong>A lack of focus</strong></p> \n<p>The first and most egregious problem when we started to tackle this was that the shortcuts for timeline navigation didn’t manipulate DOM focus. Specifically, when the user pressed j or k, a Tweet would only be rendered visually selected through the application of a class. This meant the timeline’s selection model was out of sync with the default navigational mechanism provided by the browser, and for practical purposes, limited keyboard access to actions defined by our keyboard shortcuts. For example, you could favorite a selected Tweet by pressing f, but couldn’t easily navigate to a link within the selected Tweet as the subsequent Tab keypress would end up moving focus to some other control in the page.</p> \n<p>To remedy this issue, the navigational shortcuts now set the tabIndex of the selected item to -1 and focus it. This enables j and k to function as macro-level navigational shortcuts between items in a timeline, and Tab / Shift + Tab to facilitate micro-level navigation between all of the various focusable controls within a Tweet. In other words, our shortcuts function as highways, and the Tab key as a local street. Further, browser-provided tab navigation is more robust in that it guarantees the user complete access to all of the actions within a Tweet. 
Any shortcuts we add for actions are just sugar.</p> \n<p>Here’s a video illustrating the difference between macro- and micro-navigation.</p> \n<p></p>\n<div class=\"video video-youtube\">\n <iframe width=\"100%\" src=\"https://www.youtube.com/embed/PFV9F7MRbQM\" frameborder=\"0\" webkitallowfullscreen mozallowfullscreen allowfullscreen></iframe>",
"date": "2013-08-22T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2013/improving-accessibility-of-twittercom",
"domain": "engineering"
},
{
"title": "Streaming MapReduce with Summingbird",
"body": "<p>Today we are open sourcing <a href=\"https://github.com/twitter/summingbird\">Summingbird</a> on GitHub under the ALv2.</p> \n<div class=\"g-tweet\"> \n <blockquote class=\"twitter-tweet\"> \n <p>we’re thrilled to open source <a href=\"https://twitter.com/summingbird\">@summingbird</a>, streaming mapreduce with <a href=\"https://twitter.com/scalding\">@scalding</a> and <a href=\"https://twitter.com/stormprocessor\">@stormprocessor</a> <a href=\"https://twitter.com/search?q=%23hadoop&amp;src=hash\">#hadoop</a> <a href=\"https://t.co/cV3LkCdCot\">https://t.co/cV3LkCdCot</a></p> — Twitter Open Source (\n <a href=\"https://twitter.com/intent/user?screen_name=TwitterOSS\">@TwitterOSS</a>) \n <a href=\"https://twitter.com/TwitterOSS/statuses/374925805572222976\">September 3, 2013</a>\n </blockquote>",
"date": "2013-09-03T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2013/streaming-mapreduce-with-summingbird",
"domain": "engineering"
},
{
"title": "Observability at Twitter",
"body": "<p>As Twitter has moved from a monolithic to a distributed architecture, our scalability has increased dramatically.</p> \n<p>Because of this, the overall complexity of systems and their interactions has also escalated. This decomposition has led to Twitter managing hundreds of services across our datacenters. Visibility into the health and performance of our diverse service topology has become an important driver for quickly determining the root cause of issues, as well as increasing Twitter’s overall reliability and efficiency. Debugging a complex program might involve instrumenting certain code paths or running special utilities; similarly Twitter needs a way to perform this sort of debugging for its distributed systems.</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/observability_attwitter95.thumb.1280.1280.png\" width=\"474\" height=\"414\" alt=\"Observability at Twitter\" class=\"align-center\"></p> \n<p>One important metric we track is the overall success rate of the Twitter API. When problems arise, determining the cause requires a system capable of handling metrics from our heterogenous service stack. To understand the health and performance of our services, many things need to be considered together: operating systems, in-house and open-source JVM-based applications, core libraries such as <a href=\"http://twitter.github.io/finagle/\">Finagle</a>, storage dependencies such as caches and databases, and finally, application-level metrics.</p> \n<p>Engineers at Twitter need to determine the performance characteristics of their services, the impact on upstream and downstream services, and get notified when services are not operating as expected. It is the Observability team’s mission to analyze such problems with our unified platform for collecting, storing, and presenting metrics. Creating a system to handle this job at Twitter scale is really difficult. Below, we explain how we capture, store, query, visualize and automate this entire process.</p> \n<p><strong>Architecture</strong></p> \n<p><strong><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/observability_attwitter96.thumb.1280.1280.png\" width=\"480\" height=\"376\" alt=\"Observability at Twitter\" class=\"align-center\"></strong></p> \n<p></p> \n<p><strong>Collection</strong></p> \n<p>A key component to allowing performant yet very fine grained instrumentation of code and machines is to optimize for the lowest cost aggregation and rollup possible. This aggregation is is generally implemented via in-memory counters and approximate histograms in a service or application memory space, and then exported to the Observability stack over a consistent interface. Such metrics are collected from tens of thousands of endpoints representing approximately 170 million individual metrics every minute. In the vast majority of cases, data is pulled by the Observability stack from endpoints, however a hybrid aggregation and application-push model is used for specific applications which cannot or do not provide in-memory rollups.</p> \n<p>An endpoint is generally an HTTP server which serves a consistent view of all metrics it exports. For our applications and services, these metrics are exported by in-JVM stats libraries — such as the <a href=\"http://twitter.github.io/twitter-server/\">open-source Twitter-Server framework</a> — which provide convenient functions for adding instrumentation. 
Depending on the level of instrumentation and which internal libraries are used (such as Finagle, which exports rich datasets), applications commonly export anywhere from 50 to over 10,000 individual metrics per instance.</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/observability_attwitter97.thumb.1280.1280.png\" width=\"329\" height=\"259\" alt=\"Observability at Twitter\"></p> \n<p>For applications and metrics which do not use our core common libraries, data is made available via a host-agent over the same HTTP interface. This includes machine and operating system statistics, such as disk health, CPU, memory, and overall network traffic.</p> \n<p>All metrics are identified by multiple dimensions, including an underlying service name, and the origin of the data: such as a host, dynamically scheduled application instance identifier, or other identifier specific to the service, and a metric name. Numeric metrics are written to a time series database, which we will cover in more detail in the next section. For batch-processing and non-numeric values, the data is routed to HDFS using Scribe. <a href=\"https://github.com/twitter/scalding\">Scalding</a> and Pig jobs can be run to produce reports for situations that are not time-sensitive.</p> \n<p>Determining the network location of applications running in a multi-tenant scheduled environment such as <a href=\"http://mesos.apache.org/\">Mesos</a> adds additional complexity to metric collection when compared to ones deployed on statically allocated hosts. Applications running in Mesos may be co-located with other applications and are dynamically assigned network addresses, so we leverage Zookeeper to provide dynamic service discovery in order to determine where an application is running. This data is centralized for use in collector configurations and is queried around 15,000 times per minute by users and automated systems.</p> \n<p>In some cases, the default collection period is insufficient for surfacing emergent application behavior. To supplement the existing pipeline, Observability also supports a self-service feature to collect data at a user-specified interval, down to one second, and serve it from an ephemeral store. This enables engineers to focus on key metrics during an application deployment or other event where resolution is important but durability and long term storage are less critical.</p> \n<p><strong>Storage</strong></p> \n<p>Collected metrics are stored in and queried from a time series database developed at Twitter. As the quantity and dimension of time series data for low-latency monitoring at Twitter grew, existing solutions were no longer able to provide the features or performance required. As such, the Observability team developed a time series storage and query service which served as the abstraction layer for a multitude of Observability products. The database is responsible for filtering, validating, aggregating, and reliably writing the collected metrics to durable storage clusters, as well as hosting the query interface.</p> \n<p>There are separate online clusters for different data sets: application and operating system metrics, performance critical write-time aggregates, long term archives, and temporal indexes. A typical production instance of the time series database is based on four distinct Cassandra clusters, each responsible for a different dimension (real-time, historical, aggregate, index) due to different performance constraints. 
These clusters are amongst the largest Cassandra clusters deployed in production today and account for over 500 million individual metric writes per minute. Archival data is stored at a lower resolution for trending and long term analysis, whereas higher resolution data is periodically expired. Aggregation is generally performed at write-time to avoid extra storage operations for metrics that are expected to be immediately consumed. Indexing occurs along several dimensions –– service, source, and metric names –– to give users some flexibility in finding relevant data.</p> \n<p><strong>Query Language</strong></p> \n<p>An important characteristic of any database is the ability to locate relevant information. For Observability, query functionality is exposed as a service by our time series database over HTTP and Thrift. Queries are written using a declarative, functionally inspired language. The language allows for cross-correlation of multiple metrics from multiple sources across multiple databases spread across geographical boundaries. Using this query interface and language provides a unified interface to time series data that all processing and visualization tools at Twitter use. This consistency means engineers need to learn only a single technology to debug, visualize and alert on performance data.</p> \n<p><strong>Example Queries</strong></p> \n<p>Show the slow query count summed for all machines in the role. This can be used in CLI tools or visualizations. The arguments are aggregate function, service name, sources and metric name:</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/observability_attwitter98.thumb.1280.1280.png\" width=\"480\" height=\"28\" alt=\"Observability at Twitter\"></p> \n<p>Metrics can also be correlated; for instance, the percentage of queries which were slow:</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/observability_attwitter99.thumb.1280.1280.png\" width=\"612\" height=\"35\" alt=\"Observability at Twitter\"><span>Alerts use the same language. The first value is for warning and the second is for critical. 7 of 10 minutes means the value exceeded either threshold for any 7 minutes out of a 10 minute window. Note it is identical to the earlier query except for the addition of a threshold and time window:</span></p> \n<p><span><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/observability_attwitter100.thumb.1280.1280.png\" width=\"627\" height=\"34\" alt=\"Observability at Twitter\"></span></p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/observability_attwitter101.thumb.1280.1280.png\" width=\"657\" height=\"398\" alt=\"Observability at Twitter\" class=\"align-center\"></p> \n<p>On average, the Observability stack processes 400,000 queries per minute between ad-hoc user requests, dashboards and monitoring. The vast majority of these queries come from the monitoring system.</p> \n<p><strong>Visualization</strong></p> \n<p>While collecting and storing the data is important, it is of no use to our engineers unless it is visualized in a way that can immediately tell a relevant story. Engineers use the unified query language to retrieve and plot time series data on charts using a web application called Viz. A chart is the most basic visualization unit in Observability products. 
Charts are often created ad hoc in order to quickly share information within a team during a deploy or an incident, but they can also be created and saved in dashboards. A command line tool for dashboard creation, libraries of reusable components for common metrics, and an API for automation are available to engineers.</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/observability_attwitter102.thumb.1280.1280.png\" width=\"700\" height=\"303\" alt=\"Observability at Twitter\" class=\"align-center\"></p> \n<p>Dashboards are equipped with many tools to help engineers analyze the data. They can toggle between chart types (stacked and/or filled), chart scales (linear or logarithmic), and intervals (per-minute, per-hour, per-day). Additionally, they can choose between live near real-time data and historical data dating back to when the service began collection. The average dashboard at Twitter contains 47 charts. It’s common to see these dashboards on big screens or on engineers’ monitors if you stroll through the offices. Engineers at Twitter live in these dashboards!</p> \n<p>Visualization use cases include hundreds of charts per dashboard and thousands of data points per chart. To provide the required chart performance, an in-house charting library was developed.</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/observability_attwitter103.thumb.1280.1280.png\" width=\"700\" height=\"210\" alt=\"Observability at Twitter\" class=\"align-center\"></p> \n<p><strong>Monitoring</strong></p> \n<p>Our monitoring system allows users to define alert conditions and notifications in the same query language they use for ad hoc queries and building dashboards. The addition of a predicate (e.g. &gt; 50 for 20 minutes) to a query expresses the condition the user wants to be notified about. The evaluation of these queries is partitioned across multiple boxes for scalability and redundancy with failover in the case of node failure. This system evaluates over 10,000 queries each minute across Twitter’s data centers. Notifications are sent out via email, voice call, or text message.</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/observability_attwitter104.thumb.1280.1280.png\" width=\"700\" height=\"346\" alt=\"Observability at Twitter\" class=\"align-center\"></p> \n<p>The monitoring system also provides an API that is used to create a rich UI for teams to actively monitor the state of their service. Users can view the historical metrics for an alert as well as snooze an alert, specific rules, or specific machines during maintenance.</p> \n<p><strong>Related Systems</strong></p> \n<p>In addition to conventional per-minute metric data, we have two other significant internal systems for learning about the health and performance of services. <a href=\"https://engineering/2012/distributed-systems-tracing-zipkin\">Zipkin</a>, our distributed tracing system, does not contribute to monitoring but is integral in many debugging situations. Finally, an in-house exception logging and reporting application is an important tool for investigating the health of many systems and is often the first thing checked when Twitter’s success rate fluctuates.</p> \n<p><strong>Future Work</strong></p> \n<p>Twitter’s Observability stack is a distributed system just like the systems it provides visibility into. 
About 1.5% of the machines in a data center are used for collection, storage, query processing, visualization, and monitoring (0.3% if storage is excluded). It is common to see a dashboard or chart open within Twitter’s engineering areas. Our charts update as new data is available and many charts contain multiple queries. We serve around 200 million queries a day that end up in charts. This gives our engineers access to a huge amount of data for diagnosing, monitoring or just checking on health.</p> \n<p>This architecture has enabled us to keep up with the breakneck pace of growth and the tremendous scale of Twitter. Our challenges are not over. As Twitter continues to grow, it is becoming more complex and services are becoming more numerous. Thousands of service instances with millions of data points require high performance visualizations and automation for intelligently surfacing interesting or anomalous signals to the user. We seek to continually improve the stability and efficiency of our stack while giving users more flexible ways of interacting with the entire corpus of data that Observability manages.</p> \n<p>These are some of the complex problems that are being solved at Twitter. It is our hope that we have provided a thorough overview of the problems faced when monitoring distributed systems and how Twitter works to solve them. In future posts, we’ll dive deeper into parts of the stack to give more technical detail and discussion of the implementations. We’ll also discuss ways the system can be improved and what we are doing to make that happen.</p> \n<p><strong>Acknowledgements</strong></p> \n<p><span>The entire Observability team contributed to creating this overview: Charles Aylward (<a href=\"https://twitter.com/intent/user?screen_name=charmoto\">@charmoto</a>), Brian Degenhardt (<a href=\"https://twitter.com/intent/user?screen_name=bmdhacks\">@bmdhacks</a>), Micheal Benedict (<a href=\"https://twitter.com/intent/user?screen_name=micheal\">@micheal</a>), Zhigang Chen (<a href=\"https://twitter.com/intent/user?screen_name=zhigangc\">@zhigangc</a>), Jonathan Cao (<a href=\"https://twitter.com/intent/user?screen_name=jonathancao\">@jonathancao</a>), Stephanie Guo (<a href=\"https://twitter.com/intent/user?screen_name=stephanieguo\">@stephanieguo</a>), Franklin Hu (<a href=\"https://twitter.com/intent/user?screen_name=thisisfranklin\">@thisisfranklin</a>), Megan Kanne (<a href=\"https://twitter.com/intent/user?screen_name=megankanne\">@megankanne</a>), Justin Nguyen (<a href=\"https://twitter.com/intent/user?screen_name=JustANguyen\">@JustANguyen</a>), Ryan O’Neill (<a href=\"https://twitter.com/intent/user?screen_name=rynonl\">@rynonl</a>), Steven Parkes (<a href=\"https://twitter.com/intent/user?screen_name=smparkes\">@smparkes</a>), Kamil Pawlowski (<a href=\"https://twitter.com/intent/user?screen_name=oo00o0o00oo\">@oo00o0o00oo</a>), Krishnan Raman (<a href=\"https://twitter.com/intent/user?screen_name=dxbydt_jasq\">@dxbydt_jasq</a>), Yann Ramin (<a href=\"https://twitter.com/intent/user?screen_name=theatrus\">@theatrus</a>), Nik Shkrob (<a href=\"https://twitter.com/intent/user?screen_name=nshkrob\">@nshkrob</a>), Daniel Sotolongo (<a href=\"https://twitter.com/intent/user?screen_name=sortalongo\">@sortalongo</a>), Chang Su (<a href=\"https://twitter.com/intent/user?screen_name=changsmi\">@changsmi</a>), Michael Suzuki (<a href=\"https://twitter.com/intent/user?screen_name=michaelsuzuki\">@michaelsuzuki</a>), Tung Vo (<a 
href=\"https://twitter.com/intent/user?screen_name=tungv0\">@tungv0</a>), Daniel Wang, Cory Watson (<a href=\"https://twitter.com/intent/user?screen_name=gphat\">@gphat</a>), Alex Yarmula (<a href=\"https://twitter.com/intent/user?screen_name=twalex\">@twalex</a>), and Jennifer Yip (<a href=\"https://twitter.com/intent/user?screen_name=lunchbag\">@lunchbag</a>).</span></p>",
"date": "2013-09-09T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2013/observability-at-twitter",
"domain": "engineering"
},
{
"title": "Dremel made simple with Parquet",
"body": "<p>Columnar storage is a popular technique to optimize analytical workloads in parallel RDBMs. The performance and compression benefits for storing and processing large amounts of data are well documented in academic literature as well as several <a href=\"http://people.csail.mit.edu/tdanford/6830papers/stonebraker-cstore.pdf\">commercial</a> <a href=\"http://vldb.org/pvldb/vol5/p1790_andrewlamb_vldb2012.pdf%E2%80%8E\">analytical</a> <a href=\"http://www.monetdb.org/\">databases</a>.</p> \n<p>The goal is to keep I/O to a minimum by reading from a disk only the data required for the query. Using <a href=\"https://engineering/2013/announcing-parquet-10-columnar-storage-for-hadoop\">Parquet at Twitter</a>, we experienced a reduction in size by one third on our large datasets. Scan times were also reduced to a fraction of the original in the common case of needing only a subset of the columns. The principle is quite simple: instead of a traditional row layout, the data is written one column at a time. While turning rows into columns is straightforward given a flat schema, it is more challenging when dealing with nested data structures.</p> \n<p>We recently <a href=\"https://engineering/2013/announcing-parquet-10-columnar-storage-for-hadoop\">introduced Parquet</a>, an open source file format for Hadoop that provides columnar storage. Initially a joint effort between Twitter and Cloudera, it now has <a href=\"https://github.com/Parquet/parquet-mr/graphs/contributors\">many other contributors</a> including companies like Criteo. Parquet stores nested data structures in a flat columnar format using a technique outlined in the <a href=\"http://research.google.com/pubs/pub36632.html\">Dremel paper</a> from Google. Having implemented this model based on the paper, we decided to provide a more accessible explanation. We will first describe the general model used to represent nested data structures. Then we will explain how this model can be represented as a flat list of columns. Finally we’ll discuss why this representation is effective.</p> \n<p>To illustrate what columnar storage is all about, here is an example with three columns.</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/dremel_made_simplewithparquet95.thumb.1280.1280.png\" width=\"113\" height=\"102\" alt=\"Dremel made simple with Parquet\"></p> \n<p>In a row-oriented storage, the data is laid out one row at a time as follows:</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/dremel_made_simplewithparquet96.thumb.1280.1280.png\" width=\"295\" height=\"42\" alt=\"Dremel made simple with Parquet\"></p> \n<p>Whereas in a column-oriented storage, it is laid out one column at a time:</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/dremel_made_simplewithparquet97.thumb.1280.1280.png\" width=\"295\" height=\"42\" alt=\"Dremel made simple with Parquet\"></p> \n<p>There are several advantages to columnar formats.</p> \n<ul>\n <li>Organizing by column allows for better compression, as data is more homogenous. The space savings are very noticeable at the scale of a Hadoop cluster.</li> \n <li>I/O will be reduced as we can efficiently scan only a subset of the columns while reading the data. 
Better compression also reduces the bandwidth required to read the input.</li> \n <li>As we store data of the same type in each column, we can use encodings better suited to the modern processors’ pipeline by making instruction branching more predictable.</li> \n</ul>\n<h5>The model</h5> \n<p>To store in a columnar format we first need to describe the data structures using a <a href=\"https://github.com/Parquet/parquet-mr/tree/master/parquet-column/src/main/java/parquet/schema\">schema</a>. This is done using a model similar to <a href=\"http://en.wikipedia.org/wiki/Protocol_Buffers\">Protocol buffers</a>. This model is minimalistic in that it represents nesting using groups of fields and repetition using repeated fields. There is no need for any other complex types like Maps, Lists or Sets as they all can be mapped to a combination of repeated fields and groups.</p> \n<p>The root of the schema is a group of fields called a message. Each field has three attributes: a repetition, a type and a name. The type of a field is either a group or a primitive type (e.g., int, float, boolean, string) and the repetition can be one of the three following cases:</p> \n<ul>\n <li><strong>required</strong>: exactly one occurrence</li> \n <li><strong>optional</strong>: 0 or 1 occurrence</li> \n <li><strong>repeated</strong>: 0 or more occurrences</li> \n</ul>\n<p>For example, here’s a schema one might use for an address book:</p> \n<pre>message AddressBook {<br> required string owner;<br> repeated string ownerPhoneNumbers;<br> repeated group contacts {<br> required string name;<br> optional string phoneNumber;<br> }<br>}</pre> \n<p><span>Lists (or Sets) can be represented by a repeated field.</span></p> \n<p><span><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/dremel_made_simplewithparquet98.thumb.1280.1280.png\" width=\"628\" height=\"173\" alt=\"Dremel made simple with Parquet\"><span></span></span></p> \n<p><span><span>A Map is equivalent to a repeated field containing groups of key-value pairs where the key is required.</span></span></p> \n<p><span><span><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/dremel_made_simplewithparquet99.thumb.1280.1280.png\" width=\"627\" height=\"273\" alt=\"Dremel made simple with Parquet\"></span></span></p> \n<h5>Columnar format</h5> \n<p>A columnar format provides more efficient encoding and decoding by storing together values of the same primitive type. To store nested data structures in columnar format, we need to map the schema to a list of columns in a way that we can write records to flat columns and read them back to their original nested data structure. In Parquet, we create one column per primitive type field in the schema. 
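</p> \n<p>To make that mapping concrete, here is a minimal sketch using the parquet-mr schema API. Note that in the concrete schema syntax strings are declared with the primitive type binary; the method names below are from the Java API of the time and may differ across versions:</p> \n<pre>import parquet.schema.MessageTypeParser<br>import scala.collection.JavaConverters._<br><br>val schema = MessageTypeParser.parseMessageType(<br>  \"\"\"message AddressBook {<br>       required binary owner;<br>       repeated binary ownerPhoneNumbers;<br>       repeated group contacts {<br>         required binary name;<br>         optional binary phoneNumber;<br>       }<br>     }\"\"\")<br><br>// One column per primitive leaf, with its maximum repetition and<br>// definition levels (these match the table later in this post).<br>schema.getColumns.asScala.foreach { column =&gt;<br>  println(column.getPath.mkString(\".\") +<br>    \" R=\" + column.getMaxRepetitionLevel +<br>    \" D=\" + column.getMaxDefinitionLevel)<br>}</pre> \n<p>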
If we represent the schema as a tree, the primitive types are the leaves of this tree.</p> \n<p>AddressBook example as a tree:</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/dremel_made_simplewithparquet100.thumb.1280.1280.png\" width=\"491\" height=\"228\" alt=\"Dremel made simple with Parquet\"></p> \n<p>To represent the data in columnar format we create one column per primitive type cell shown in blue.</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/dremel_made_simplewithparquet101.thumb.1280.1280.png\" width=\"700\" height=\"181\" alt=\"Dremel made simple with Parquet\"></p> \n<p>The structure of the record is captured for each value by two integers called repetition level and definition level. Using definition and repetition levels, we can fully reconstruct the nested structures. This will be explained in detail below.</p> \n<h5>Definition levels</h5> \n<p>To support nested records we need to store the level at which the field is null. This is what the definition level is for: from 0 at the root of the schema up to the maximum level for this column. When a field is defined, all its parents are defined too; but when it is null, we need to record the level at which it started being null to be able to reconstruct the record.</p> \n<p>In a flat schema, an optional field is encoded on a single bit using 0 for null and 1 for defined. In a nested schema, we use an additional value for each level of nesting (as shown in the example). Finally, if a field is required, it does not need a definition level.</p> \n<p>For example, consider the simple nested schema below:</p> \n<pre>message ExampleDefinitionLevel {<br> optional group a {<br> optional group b {<br> optional string c;<br> }<br> }<br>}</pre> \n<p>It contains one column: <strong>a.b.c</strong>, where all fields are optional and can be null. When <strong>c</strong> is defined, then necessarily <strong>a</strong> and <strong>b</strong> are defined too, but when <strong>c</strong> is null, we need to save the level of the null value. There are 3 nested optional fields, so the maximum definition level is 3.</p> \n<p>Here is the definition level for each of the following cases:</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/dremel_made_simplewithparquet102.thumb.1280.1280.png\" width=\"628\" height=\"163\" alt=\"Dremel made simple with Parquet\"></p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/dremel_made_simplewithparquet103.thumb.1280.1280.png\" width=\"565\" height=\"219\" alt=\"Dremel made simple with Parquet\"></p> \n<p>The maximum possible definition level is 3, which indicates that the value is defined. Values 0 to 2 indicate at which level the null field occurs.</p> \n<p>A required field is always defined and does not need a definition level. Let’s reuse the same example with the field <strong>b</strong> now <strong>required</strong>:</p> \n<pre>message ExampleDefinitionLevel {<br> optional group a {<br><strong>required</strong> group b {<br> optional string c;<br> }<br> }<br>}</pre> \n<p>The maximum definition level is now 2, as <strong>b</strong> does not need one. 
The value of the definition level for the fields below b changes as follows:</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/dremel_made_simplewithparquet104.thumb.1280.1280.png\" width=\"425\" height=\"167\" alt=\"Dremel made simple with Parquet\"></p> \n<p>Making definition levels small is important as the goal is to store the levels in as few bits as possible.</p> \n<h5>Repetition levels</h5> \n<p>To support repeated fields we need to store when new lists are starting in a column of values. This is what repetition level is for: it is the level at which we have to create a new list for the current value. In other words, the repetition level can be seen as a marker of when to start a new list and at which level.&nbsp;For example consider the following representation of a list of lists of strings:</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/dremel_made_simplewithparquet105.thumb.1280.1280.png\" width=\"627\" height=\"510\" alt=\"Dremel made simple with Parquet\"></p> \n<p>The column will contain the following repetition levels and values:</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/dremel_made_simplewithparquet106.thumb.1280.1280.png\" width=\"232\" height=\"220\" alt=\"Dremel made simple with Parquet\"></p> \n<p>The repetition level marks the beginning of lists and can be interpreted as follows:</p> \n<ul>\n <li>0 marks every new record and implies creating a new level1 and level2 list</li> \n <li>1 marks every new level1 list and implies creating a new level2 list as well.</li> \n <li>2 marks every new element in a level2 list.</li> \n</ul>\n<p>On the following diagram we can visually see that it is the level of nesting at which we insert records:</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/dremel_made_simplewithparquet107.thumb.1280.1280.png\" width=\"545\" height=\"448\" alt=\"Dremel made simple with Parquet\"></p> \n<p>A repetition level of 0 marks the beginning of a new record. In a flat schema there is no repetition and the repetition level is always 0. <a href=\"https://github.com/Parquet/parquet-mr/blob/8f93adfd0020939b9a58f092b88a5f62fd14b834/parquet-column/src/main/java/parquet/schema/GroupType.java#L199\">Only levels that are repeated need a Repetition level</a>: optional or required fields are never repeated and can be skipped while attributing repetition levels.</p> \n<h5>Striping and assembly</h5> \n<p>Now using the two notions together, let’s consider the AddressBook example again.&nbsp;This table shows the maximum repetition and definition levels for each column with explanations on why they are smaller than the depth of the column:</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/dremel_made_simplewithparquet108.thumb.1280.1280.png\" width=\"627\" height=\"176\" alt=\"Dremel made simple with Parquet\"></p> \n<p>In particular for the column <strong>contacts.phoneNumber</strong>, a defined phone number will have the maximum definition level of 2, and a contact without phone number will have a definition level of 1. 
In the case where contacts are absent, it will be 0.</p> \n<pre>AddressBook {<br> owner: \"Julien Le Dem\",<br> ownerPhoneNumbers: \"555 123 4567\",<br> ownerPhoneNumbers: \"555 666 1337\",<br> contacts: {<br> name: \"Dmitriy Ryaboy\",<br> phoneNumber: \"555 987 6543\",<br> },<br> contacts: {<br> name: \"Chris Aniszczyk\"<br> }<br>}<br>AddressBook {<br> owner: \"A. Nonymous\"<br>}</pre> \n<p>We’ll now focus on the column <strong>contacts.phoneNumber</strong> to illustrate this.</p> \n<p>Once projected, the records have the following structure:</p> \n<pre>AddressBook {<br> contacts: {<br> phoneNumber: \"555 987 6543\"<br> }<br> contacts: {<br> }<br>}<br>AddressBook {<br>}</pre> \n<p>The data in the column will be as follows (R = Repetition Level, D = Definition Level):</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/dremel_made_simplewithparquet109.thumb.1280.1280.png\" width=\"577\" height=\"222\" alt=\"Dremel made simple with Parquet\"></p> \n<p>To write the column, we iterate through the record data for this column:</p> \n<ul>\n <li>contacts.phoneNumber: “555 987 6543” \n <ul>\n <li>new record: R = 0</li> \n <li>value is defined: D = maximum (2)</li> \n </ul></li> \n <li>contacts.phoneNumber: null \n <ul>\n <li>repeated contacts: R = 1</li> \n <li>only defined up to contacts: D = 1</li> \n </ul></li> \n <li>contacts: null \n <ul>\n <li>new record: R = 0</li> \n <li>only defined up to AddressBook: D = 0</li> \n </ul></li> \n</ul>\n<p>The column contains the following data:</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/dremel_made_simplewithparquet110.thumb.1280.1280.png\" width=\"211\" height=\"134\" alt=\"Dremel made simple with Parquet\"></p> \n<p>Note that NULL values are represented here for clarity but are not stored at all: a definition level strictly lower than the maximum (here 2) indicates a NULL value.</p> \n<p>To reconstruct the records, we iterate through the column:</p> \n<ul>\n <li><strong>R=0, D=2, Value = “555 987 6543”</strong>: \n <ul>\n <li>R = 0 means a new record. We recreate the nested records from the root down to the definition level (here 2).</li> \n <li>D = 2 is the maximum: the value is defined and is inserted.</li> \n </ul></li> \n <li><strong>R=1, D=1</strong>: \n <ul>\n <li>R = 1 means a new entry in the contacts list at level 1.</li> \n <li>D = 1 means contacts is defined but not phoneNumber, so we just create an empty contacts.</li> \n </ul></li> \n <li><strong>R=0, D=0</strong>: \n <ul>\n <li>R = 0 means a new record. We recreate the nested records from the root down to the definition level.</li> \n <li>D = 0 means contacts is actually null, so we only have an empty AddressBook.</li> \n </ul></li> \n</ul>\n<h5><span>Storing definition levels and repetition levels efficiently</span></h5> \n<p>In terms of storage, this effectively boils down to creating three sub-columns for each primitive type. However, the overhead for storing these sub-columns is low thanks to the columnar representation. That’s because levels are bound by the depth of the schema and can be stored efficiently using only a few bits per value (a single bit stores levels up to 1, 2 bits store levels up to 3, and 3 bits store levels up to 7). In the address book example above, the column <strong>owner</strong> has a depth of one and the column <strong>contacts.name</strong> has a depth of two. The levels will always have zero as a lower bound and the depth of the column as an upper bound. 
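</p> \n<p>As a small sketch of that arithmetic (a hypothetical helper, assuming plain binary packing before the run-length encoding discussed below):</p> \n<pre>class LevelWidth {<br> // Bits needed to encode a level in [0, maxLevel]:<br> // maxLevel 1 needs 1 bit, 3 needs 2 bits, 7 needs 3 bits, 0 needs none.<br> static int bitWidth(int maxLevel) {<br> return 32 - Integer.numberOfLeadingZeros(maxLevel);<br> }<br>}</pre> \n<p>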
Even better, fields that are not repeated do not need a repetition level and required fields do not need a definition level, bringing down the upper bound.</p> \n<p>In the special case of a flat schema with all fields required (equivalent of NOT NULL in SQL), the repetition levels and definition levels are omitted completely (they would always be zero) and we only store the values of the columns. This is effectively the same representation we would choose if we had to support only flat tables.</p> \n<p>These characteristics make for a very compact representation of nesting that can be efficiently encoded using a combination of <a href=\"https://github.com/Parquet/parquet-mr/tree/master/parquet-column/src/main/java/parquet/column/values/rle\">Run Length Encoding and bit packing</a>. A sparse column with a lot of null values will compress to almost nothing, similarly an optional column which is actually always set will cost very little overhead to store millions of 1s. In practice, space occupied by levels is negligible. This representation is a generalization of how we would represent the simple case of a flat schema: writing all values of a column sequentially and using a bitfield for storing nulls when a field is optional.</p> \n<h5>Get Involved</h5> \n<p>Parquet is still a young project; to learn more about the project see our <a href=\"https://github.com/Parquet/parquet-mr/blob/master/README.md\">README</a> or look for the “<a href=\"https://github.com/Parquet/parquet-mr/issues?labels=pick+me+up%21&amp;state=open\">pick me up!</a>” label on GitHub. We do our best to review pull requests in a timely manner and give thorough and constructive reviews.</p> \n<p>You can also join our <a href=\"mailto:dev@parquet.apache.org\">mailing list</a> and tweet at <a href=\"https://twitter.com/intent/user?screen_name=ApacheParquet\">@ApacheParquet</a>&nbsp;to join the discussion.</p>",
"date": "2013-09-11T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2013/dremel-made-simple-with-parquet",
"domain": "engineering"
},
{
"title": "Summer of Code 2013 Results",
"body": "<p>For the second time, we had the opportunity to participate in the <a href=\"http://code.google.com/soc/\">Google Summer of Code</a> (GSoC) and we want to share news on the resulting open source activities. Unlike many GSoC participating organizations that focus on a single ecosystem, we have a <a href=\"http://twitter.github.io/\">variety of projects</a> that span multiple programming languages and communities.</p> \n<p>We worked on three projects with three amazing students over the summer.</p> \n<div class=\"g-tweet\"> \n <blockquote class=\"twitter-tweet\"> \n <p>thank you to our <a href=\"https://twitter.com/gsoc\">@gsoc</a> participants for another fun summer of code! <a href=\"https://t.co/VMBmpSjZG0\">https://t.co/VMBmpSjZG0</a> <a href=\"http://t.co/XYi69OVNQT\">pic.twitter.com/XYi69OVNQT</a></p> — Twitter Open Source (\n <a href=\"https://twitter.com/intent/user?screen_name=TwitterOSS\">@TwitterOSS</a>) \n <a href=\"https://twitter.com/TwitterOSS/statuses/384741733554061312\">September 30, 2013</a>\n </blockquote>",
"date": "2013-10-01T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2013/summer-of-code-2013-results",
"domain": "engineering"
},
{
"title": "Netty 4 at Twitter: Reduced GC Overhead",
"body": "<p>At Twitter, <a href=\"http://netty.io/\">Netty</a> (<a href=\"https://twitter.com/intent/user?screen_name=netty_project\">@netty_project</a>) is used in core places requiring networking functionality.</p> \n<p>For example:</p> \n<ul>\n <li><a href=\"http://twitter.github.io/finagle/\">Finagle</a> is our <a href=\"https://engineering/2011/finagle-protocol-agnostic-rpc-system\">protocol agnostic RPC system</a> whose transport layer is built on top of Netty, and it is used to implement most services internally like <a href=\"https://engineering/2011/twitter-search-now-3x-faster\">Search</a></li> \n <li>TFE (Twitter Front End) is our proprietary <a href=\"http://wiki.squid-cache.org/SpoonFeeding\">spoon-feeding</a> <a href=\"http://en.wikipedia.org/wiki/Reverse_proxy\">reverse proxy</a> which serves most of public-facing HTTP and <a href=\"http://en.wikipedia.org/wiki/SPDY\">SPDY</a> traffic using Netty</li> \n <li><a href=\"https://github.com/twitter/cloudhopper-smpp\">Cloudhopper</a> sends billions of SMS messages every month to hundreds of mobile carriers all around the world using Netty</li> \n</ul>\n<p>For those who aren’t aware, Netty is an open source <a href=\"http://en.wikipedia.org/wiki/New_I/O\">Java NIO</a> framework that makes it easier to create high-performing protocol servers. An older version of Netty v3 used Java objects to represent I/O events. This was simple, but could generate a lot of <a href=\"http://en.wikipedia.org/wiki/Garbage_collection_(computer_science)\">garbage</a> especially at our scale. In the new Netty 4 release, changes were made so that instead of short-lived event objects, methods on long-lived channel objects are used to handle I/O events. There is also a specialized buffer allocator that uses pools.</p> \n<p>We take the performance, usability, and sustainability of the Netty project seriously, and we have been working closely with the Netty community to improve it in all aspects. In particular, we will discuss our usage of Netty 3 and will aim to show why migrating to Netty 4 has made us more efficient.</p> \n<h4><span>Reducing GC pressure and memory bandwidth consumption</span></h4> \n<p><span>A problem was Netty 3’s reliance on the JVM’s memory management for buffer allocations. Netty 3 creates a new heap buffer whenever a new message is received or a user sends a message to a remote peer. This means a ‘new byte[capacity]’ for each new buffer. These buffers caused GC pressure and consumed memory bandwidth: allocating a new byte array consumes memory bandwidth to fill the array with zeros for safety. However, the zero-filled byte array is very likely to be filled with the actual data, consuming the same amount of memory bandwidth. We could have reduced the consumption of memory bandwidth to 50% if the Java Virtual Machine (JVM) provided a way to create a new byte array which is not necessarily filled with zeros, but there’s no such way at this moment.</span></p> \n<p>To address this issue, we made the following changes for Netty 4.</p> \n<h5><span>Removal of event objects</span></h5> \n<p>Instead of creating event objects, Netty 4 defines different methods for different event types. In Netty 3, the <a href=\"http://docs.jboss.org/netty/3.2/api/org/jboss/netty/channel/ChannelHandler.html\">ChannelHandler</a> has a single method that handles all event objects:</p> \n<pre>class Before implements ChannelUpstreamHandler {<br> void handleUpstream(ctx, ChannelEvent e) {<br> if (e instanceof MessageEvent) { ... 
}<br> else if (e instanceof ChannelStateEvent) { ... }<br> ...<br> }<br>}</pre> \n<p>Netty 4 has as many handler methods as the number of event types:</p> \n<pre>class After implements ChannelInboundHandler {<br> void channelActive(ctx) { ... }<br> void channelInactive(ctx) { ... }<br> void channelRead(ctx, msg) { ... }<br> void userEventTriggered(ctx, evt) { ... }<br> ...<br>}</pre> \n<p>Note that a handler now has a method called ‘<a href=\"http://netty.io/4.0/api/io/netty/channel/ChannelInboundHandler.html#userEventTriggered(io.netty.channel.ChannelHandlerContext,%20java.lang.Object)\">userEventTriggered</a>’ so that it does not lose the ability to define a custom event object.</p> \n<h5><span>Buffer pooling</span></h5> \n<p>Netty 4 also introduced a new interface, ‘<a href=\"http://netty.io/4.0/api/io/netty/buffer/ByteBufAllocator.html\">ByteBufAllocator</a>’, and now provides a buffer pool implementation through that interface: a pure Java variant of <a href=\"https://www.facebook.com/notes/facebook-engineering/scalable-memory-allocation-using-jemalloc/480222803919\">jemalloc</a>, which implements <a href=\"http://en.wikipedia.org/wiki/Buddy_memory_allocation\">buddy memory allocation</a> and <a href=\"http://en.wikipedia.org/wiki/Slab_allocation\">slab allocation</a>.</p> \n<p>Now that Netty has its own memory allocator for buffers, it doesn’t waste memory bandwidth by filling buffers with zeros. However, this approach opens another can of worms: reference counting. Because we cannot rely on GC to put the unused buffers into the pool, we have to be very careful about leaks. Even a single handler that forgets to release a buffer can make our server’s memory usage grow boundlessly.</p> \n<h4>Was it worthwhile to make such big changes?</h4> \n<p>Because of the changes mentioned above, Netty 4 has no backward compatibility with Netty 3. This means our projects built on top of Netty 3, as well as other community projects, have to spend a non-trivial amount of time on migration. Is it worth doing that?</p> \n<p>We compared two <a href=\"http://en.wikipedia.org/wiki/Echo_Protocol\">echo</a> protocol servers built on top of Netty 3 and 4 respectively. (Echo is simple enough that any garbage created is Netty’s fault, not the protocol’s.) We let them serve the same distributed echo protocol clients with 16,384 concurrent connections sending 256-byte random payloads repetitively, nearly saturating gigabit ethernet.</p> \n<p>According to our test results, Netty 4 had:</p> \n<ul>\n <li>5 times less frequent GC pauses: <strong>45.5 vs. 9.2 times/min</strong></li> \n <li>5 times less garbage production: <strong>207.11 vs 41.81 MiB/s</strong></li> \n</ul>\n<p>We also wanted to make sure our buffer pool is fast enough. Here’s a graph where the X and Y axes denote the size of each allocation and the time taken to allocate a single buffer respectively:</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/netty_4_at_twitterreducedgcoverhead95.thumb.1280.1280.png\" width=\"635\" height=\"380\" alt=\"Netty 4 at Twitter: Reduced GC Overhead\"></p> \n<p>As you can see, the buffer pool is much faster than the JVM allocator as the size of the buffer increases. It is even more noticeable for direct buffers. However, it could not beat the JVM for small heap buffers, so we have something to work on there.</p> \n<h4>Moving forward</h4> \n<p>Although some parts of our services have already migrated from Netty 3 to 4 successfully, we are performing the migration gradually. 
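</p> \n<p>To give a feel for the reference-counting discipline this imposes, here is a minimal sketch of a Netty 4 inbound handler that releases the buffer it consumes (an illustration, not code from our services):</p> \n<pre>import io.netty.buffer.ByteBuf;<br>import io.netty.channel.ChannelHandlerContext;<br>import io.netty.channel.ChannelInboundHandlerAdapter;<br>import io.netty.util.ReferenceCountUtil;<br><br>public class ConsumingHandler extends ChannelInboundHandlerAdapter {<br> @Override<br> public void channelRead(ChannelHandlerContext ctx, Object msg) {<br> ByteBuf buf = (ByteBuf) msg;<br> try {<br> // ... consume the bytes ...<br> } finally {<br> // Pooled buffers are not reclaimed by GC; a missing release()<br> // is exactly the kind of leak described above.<br> ReferenceCountUtil.release(buf);<br> }<br> }<br>}</pre> \n<p>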
We discovered some barriers that slow our adoption, which we hope to address in the near future:</p> \n<ul>\n <li><strong>Buffer leaks</strong>: Netty has a simple leak reporting facility, but it does not provide information detailed enough to fix the leak easily.</li> \n <li><strong>Simpler core</strong>: Netty is a community-driven project with many stakeholders, and non-core features tend to lead to collateral changes that destabilize the core. We want to make sure only the real core features remain in the core and that other features stay out of it.</li> \n</ul>\n<p>We are also thinking of adding more cool features such as:</p> \n<ul>\n <li>HTTP/2 implementation</li> \n <li>HTTP and SOCKS proxy support for the client side</li> \n <li>Asynchronous DNS resolution (see <a href=\"https://github.com/netty/netty/pull/1622\">pull request</a>)</li> \n <li>Native extensions for Linux that work directly with <a href=\"http://en.wikipedia.org/wiki/Epoll\">epoll</a> via JNI</li> \n <li>Prioritization of connections with strict response time constraints</li> \n</ul>\n<h4>Getting Involved</h4> \n<p>What’s interesting about Netty is that it is used by many different people and companies worldwide, most of them not from Twitter. It is an independent and very healthy open source project with many <a href=\"https://github.com/netty/netty/graphs/contributors\">contributors</a>. If you are interested in building ‘the future of network programming’, why don’t you visit the project <a href=\"http://netty.io/\">web site</a>, follow <a href=\"https://twitter.com/intent/user?screen_name=netty_project\">@netty_project</a>, jump right into the <a href=\"https://github.com/netty/netty/\">source code</a> at GitHub, or even consider&nbsp;<a href=\"https://twitter.com/jobs/engineering\">joining the flock</a>&nbsp;to help us improve Netty?</p> \n<h4>Acknowledgements</h4> \n<p>The Netty project was founded by Trustin Lee (<a href=\"https://twitter.com/intent/user?screen_name=trustin\">@trustin</a>), who joined the flock in 2011 to help build Netty 4. We would also like to thank Jeff Pinner (<a href=\"https://twitter.com/intent/user?screen_name=jpinner\">@jpinner</a>) from the TFE team, who gave many of the great ideas mentioned in this article and became a guinea pig for Netty 4 without hesitation. Furthermore, Norman Maurer (<a href=\"https://twitter.com/intent/user?screen_name=normanmaurer\">@normanmaurer</a>), one of the core Netty committers, made an enormous effort to help us materialize these ideas into actually shippable pieces of code as part of the Netty project. There are also countless individuals who gladly tried many unstable releases, keeping up with all the breaking changes we had to make. In particular, we would like to thank: Berk Demir (<a href=\"https://twitter.com/intent/user?screen_name=bd\">@bd</a>), Charles Yang (<a href=\"https://twitter.com/intent/user?screen_name=cmyang\">@cmyang</a>), Evan Meagher (<a href=\"https://twitter.com/intent/user?screen_name=evanm\">@evanm</a>), Larry Hosken (<a href=\"https://twitter.com/intent/user?screen_name=lahosken\">@lahosken</a>), Sonja Keserovic (<a href=\"https://twitter.com/intent/user?screen_name=thesonjake\">@thesonjake</a>), and Stu Hood (<a href=\"https://twitter.com/intent/user?screen_name=stuhood\">@stuhood</a>).</p>",
"date": "2013-10-15T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2013/netty-4-at-twitter-reduced-gc-overhead",
"domain": "engineering"
},
{
"title": "Twitter University on YouTube",
"body": "<p>Here at Twitter, we embrace <a href=\"https://twitter.com/twitteross\">open source</a> and support the ongoing education of our team, as well as the general public. The intersection of these domains is particularly important to us, and led to the establishment of <a href=\"https://twitter.com/university\">Twitter University</a>, which supports these efforts and helps make Twitter the best place in the world for engineers to work.</p> \n<div class=\"g-tweet\"> \n <blockquote class=\"twitter-tweet\"> \n <p>Let’s start Friday morning off right, with an Introduction to <a href=\"https://twitter.com/summingbird\">@summingbird</a> with <a href=\"https://twitter.com/posco\">@posco</a> &amp; <a href=\"https://twitter.com/sritchie\">@sritchie</a> - <a href=\"http://t.co/ho3vexgxtY\">http://t.co/ho3vexgxtY</a> <a href=\"https://twitter.com/TwitterOSS\">@TwitterOSS</a></p> — Twitter University (\n <a href=\"https://twitter.com/intent/user?screen_name=university\">@university</a>) \n <a href=\"https://twitter.com/university/statuses/383624562362941442\">September 27, 2013</a>\n <span>&nbsp;</span>\n </blockquote>",
"date": "2013-10-25T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2013/twitter-university-on-youtube",
"domain": "engineering"
},
{
"title": "Slave Recovery in Apache Mesos",
"body": "<p><em>Cross-posted from the Apache Mesos <a href=\"http://mesos.apache.org/blog/slave-recovery-in-apache-mesos/\">blog</a>.</em></p> \n<p>With the latest <a href=\"http://mesos.apache.org/\">Mesos</a> release, <a href=\"http://mesos.apache.org/downloads/\">0.14.1</a>, we are bringing high availability to the slaves by introducing a new feature called Slave Recovery. In a nutshell, slave recovery enables:</p> \n<ul>\n <li>Tasks and their executors to keep running when the slave process is down</li> \n <li>A restarted slave process to reconnect with running executors/tasks on the slave</li> \n</ul>\n<div class=\"g-tweet\"> \n <blockquote class=\"twitter-tweet\"> \n <p>Announcing a new version of <a href=\"https://twitter.com/search?q=%23Mesos&amp;src=hash\">#Mesos</a>, 0.14.1! Available for download now: <a href=\"http://t.co/e6UmTdjSrD\">http://t.co/e6UmTdjSrD</a> Release notes online: <a href=\"https://t.co/XtRzj2y7Tt\">https://t.co/XtRzj2y7Tt</a></p> — Apache Mesos (\n <a href=\"https://twitter.com/intent/user?screen_name=ApacheMesos\">@ApacheMesos</a>) \n <a href=\"https://twitter.com/ApacheMesos/statuses/392715592173514752\">October 22, 2013</a>\n </blockquote>",
"date": "2013-11-11T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2013/slave-recovery-in-apache-mesos",
"domain": "engineering"
},
{
"title": "Forward Secrecy at Twitter",
"body": "<p><span>As part of our continuing effort to keep our users’ information as secure as possible, we’re happy to announce that we recently enabled forward secrecy for traffic on twitter.com, api.twitter.com, and mobile.twitter.com. On top of the usual confidentiality and integrity properties of HTTPS, forward secrecy adds a new property. If an adversary is currently recording all Twitter users’ encrypted traffic, and they later crack or steal Twitter’s private keys, they should not be able to use those keys to decrypt the recorded traffic. As <a href=\"https://www.eff.org/deeplinks/2013/08/pushing-perfect-forward-secrecy-important-web-privacy-protection\">the Electronic Frontier Foundation points out</a>, this type of protection is increasingly important on today’s Internet.</span></p> \n<p>Under traditional HTTPS, the client chooses a random session key, encrypts it using the server’s public key, and sends it over the network. Someone in possession of the server’s private key and some recorded traffic can decrypt the session key and use that to decrypt the entire session. In order to support forward secrecy, we’ve enabled the EC Diffie-Hellman cipher suites. Under those cipher suites, the client and server manage to come up with a shared, random session key without ever sending the key across the network, even under encryption. The details of this remarkable and counterintuitive key exchange are explained at <a href=\"https://en.wikipedia.org/wiki/Diffie%E2%80%93Hellman_key_exchange\">Wikipedia’s excellent article on Diffie-Hellman key exchange</a>. The server’s private key is only used to sign the key exchange, preventing man-in-the-middle attacks.</p> \n<p>There are two main categories of Diffie-Hellman key exchange. Traditional Diffie-Hellman (DHE) depends on the hardness of the <a href=\"https://www.khanacademy.org/math/applied-math/cryptography/modern-crypt/v/discrete-logarithm-problem\">Discrete Logarithm Problem</a> and uses significantly more CPU than RSA, the most common key exchange used in SSL. Elliptic Curve Diffie-Hellman (ECDHE) is only a little more expensive than RSA for an equivalent security level. Vincent Bernat (<a href=\"https://twitter.com/intent/user?screen_name=vince2_\">@vince2_</a>)&nbsp;<a href=\"http://vincent.bernat.im/en/blog/2011-ssl-perfect-forward-secrecy.html#some-benchmarks\">benchmarked ECDHE</a> at a 15% overhead relative to RSA over 2048-bit keys. DHE, by comparison, used 310% more CPU than RSA.</p> \n<p></p> \n<blockquote class=\"g-quote g-tweetable\"> \n <p>Forward secrecy is just the latest way in which Twitter is trying to defend and protect the user’s voice.</p> \n</blockquote> \n<p>In practical deployment, we found that enabling and prioritizing ECDHE cipher suites actually caused negligible increase in CPU usage. HTTP keepalives and <a href=\"http://vincent.bernat.im/en/blog/2011-ssl-session-reuse-rfc5077.html\">session resumption</a> mean that most requests do not require a full handshake, so handshake operations do not dominate our CPU usage. We find 75% of Twitter’s client requests are sent over connections established using ECDHE. The remaining 25% consists mostly of older clients that don’t yet support the ECDHE cipher suites.</p> \n<p>The last obstacle to correctly implementing forward secrecy was our use of <a href=\"http://www.ietf.org/rfc/rfc5077.txt\">TLS session tickets</a>. 
We use TLS session tickets to allow clients to reconnect quickly using an abbreviated handshake if they still have a session ticket from a recent connection. Beside the CPU savings, this saves one network round-trip, commonly around 150ms and often much more on mobile networks.</p> \n<p><span>Session tickets enable pools of servers to support session resumption without need for a shared session cache. However, as Adam Langley (<a href=\"https://twitter.com/intent/user?screen_name=agl__\">@agl__</a>)&nbsp;points out in his blog post </span><a href=\"https://www.imperialviolet.org/2013/06/27/botchingpfs.html\">How to Botch TLS Forward Secrecy</a><span>:</span></p> \n<p class=\"indent-30\"><em>If you run several servers then they all need to share the same session ticket key otherwise they can’t decrypt each other’s session tickets. In this case your forward secrecy is limited by the security of that file on disk. Since your private key is probably kept on the same disk, enabling forward secure cipher suites probably hasn’t actually achieved anything other than changing the file that the attacker needs to steal!</em></p> \n<p class=\"indent-30\"><em>You need to generate session ticket keys randomly, distribute them to the servers without ever touching persistent storage and rotate them frequently.</em></p> \n<p>We implemented such a key rotation system with a few additional goals: It should be simple to implement, simple to maintain, resistant to failure, and we should be able to restart the frontend process during deploys without waiting to receive new session ticket keys.</p> \n<p>To do so, we have a set of key generator machines, of which one is the leader. The leader generates a fresh session ticket key every twelve hours and zeroes old keys after thirty-six hours. Keys are stored in tmpfs (a RAM-based filesystem), with no swap partitions configured. It’s important that there be no swap, because tmpfs will use swap if available, which could write keys to long-term disk storage.</p> \n<p>Every five minutes, our frontends fetch the latest ticket keys from a key generator machine via SSH. This traffic is also forward secret by SSH’s Diffie-Hellman key exchange. Again, the keys are stored on a tmpfs, so they survive a restart of the frontend process without being leaked to disk.</p> \n<p>When a new ticket key K is available, we don’t want to start encrypting new tickets with it until we expect that each frontend has a copy of K and can decrypt those tickets. We use a timestamp in K’s filename to determine its age; once it is at least twenty minutes old, it is considered current and frontends will start encrypting tickets with it.</p> \n<p>In general, frontends will have three ticket keys available: the current key, and the two previous keys. They can decrypt session tickets using any of those three. When a client resumes a session using one of the previous keys, a frontend will finish the abbreviated handshake and assign a new session ticket using the current key. This logic is implemented in a callback provided to <a href=\"http://openssl.6102.n7.nabble.com/openssl-org-2697-documentation-for-SSL-CTX-set-tlsext-ticket-key-cb-td36421.html\">OpenSSL’s SSL_CTX_set_tlsext_ticket_key_cb</a>.</p> \n<p><strong>The New Normal</strong></p> \n<p>At the end of the day, we are writing this not just to discuss an interesting piece of technology, but to present what we believe should be the new normal for web service owners. A year and a half ago, Twitter was first served completely over HTTPS. 
Since then, it has become clearer and clearer how important that step was to protecting our users’ privacy.</p> \n<p>If you are a webmaster, we encourage you to implement HTTPS for your site and make it the default. If you already offer HTTPS, ensure your implementation is hardened with <a href=\"http://en.wikipedia.org/wiki/HTTP_Strict_Transport_Security\">HTTP Strict Transport Security</a>, <a href=\"http://en.wikipedia.org/wiki/HTTP_cookie#Secure_and_HttpOnly\">secure cookies</a>, <a href=\"https://www.owasp.org/index.php/Certificate_and_Public_Key_Pinning\">certificate pinning</a>, and <a href=\"http://vincent.bernat.im/en/blog/2011-ssl-perfect-forward-secrecy.html\">Forward Secrecy</a>. The security gains have never been more important to implement.</p> \n<p>If you don’t run a website, demand that the sites you use implement HTTPS to help protect your privacy, and make sure you are using an up-to-date web browser so you are getting the latest security improvements.</p> \n<p>Security is an ever-changing world. Our work on deploying forward secrecy is just the latest way in which Twitter is trying to defend and protect the user’s voice in that world.</p> \n<p><strong>Acknowledgements</strong></p> \n<p>Like all projects at Twitter, this has been a collaboration. We’d like to thank all involved, including: <a href=\"https://twitter.com/jmhodges\">Jeff Hodges</a>, <a href=\"https://twitter.com/jpinner\">Jeffrey Pinner</a>, <a href=\"https://twitter.com/kylerandolph\">Kyle Randolph</a>, <a href=\"https://twitter.com/jschauma\">Jan Schaumann</a>, <a href=\"https://twitter.com/mfinifter\">Matt Finifter</a>, the entirety of our Security and Twitter Frontend teams, and the broader security and cryptography communities.</p>",
"date": "2013-11-22T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2013/forward-secrecy-at-twitter",
"domain": "engineering"
},
{
"title": "CocoaSPDY: SPDY for iOS / OS X",
"body": "<p>For over a year now, Twitter has supported the <a href=\"http://en.wikipedia.org/wiki/SPDY\">SPDY protocol</a> and today it accounts for a significant percentage of our web traffic. SPDY aims to improve upon a number of HTTP’s shortcomings and one client segment in particular that has a lot of potential to benefit is mobile devices. Cellular networks still suffer from high latency, so reducing client-server roundtrips can have a pronounced impact on a user’s experience.</p> \n<p>Up until now, the primary clients that supported SPDY were browsers, led by Chrome and Firefox. Today we’re happy to announce that in addition to rolling out SPDY to our iOS app users around the world, we’re also open sourcing a SPDY framework, <a href=\"https://github.com/twitter/CocoaSPDY\">CocoaSPDY</a>, for iOS and OS X apps under the permissive Apache 2.0 license. The code is written in Objective-C against the CoreFoundation and Cocoa/Cocoa Touch interfaces and has no external dependencies.</p> \n<h5>What exactly is SPDY?</h5> \n<p>SPDY was originally designed at Google as an experimental successor to HTTP. It’s a binary protocol (rather than human-readable like HTTP), but is fully compatible with HTTP. In fact, current draft work on <a href=\"https://github.com/http2/http2-spec\">HTTP/2.0</a> is largely based on the SPDY protocol and its real-world success. At Twitter, we have been participating with other companies in the evolution of the specification and adopting SPDY across our technology stack. We’ve also contributed implementations to open source projects such as <a href=\"http://netty.io/\">Netty</a>. For more details about SPDY, we encourage you to read the <a href=\"http://www.chromium.org/spdy/spdy-whitepaper\">whitepaper</a> and the <a href=\"http://www.chromium.org/spdy/spdy-protocol/spdy-protocol-draft3-1\">latest specification</a>.</p> \n<h5>Enabling SPDY</h5> \n<p>One of our primary goals with our SPDY implementation was to make integration with an existing application as easy, transparent, and seamless as possible. With that in mind, we created two integration mechanisms—one for NSURLConnection and one for NSURLSession—each of which could begin handling an application’s HTTP calls with as little as a one-line change to the code.</p> \n<p>While we’ve since added further options for customized behavior and configuration to the interface, it’s still possible to enable SPDY in an iOS or OS X application with a diff as minimal as this:</p> \n<pre>+[SPDYURLConnectionProtocol registerOrigin:@\"<a href=\"https://api.twitter.com:443\" target=\"_blank\" rel=\"nofollow\">https://api.twitter.com:443</a>\"];</pre> \n<p>The simplicity of this integration is in part thanks to Apple’s own well-designed NSURLProtocol interface.</p> \n<h5>Performance Improvements</h5> \n<p>We’re still actively experimenting with and tuning our SPDY implementation in order to improve the user’s experience in our app as much as possible. 
However, we have measured as much as a <strong>30% decrease</strong> in latency in the wild for API requests carried over SPDY relative to those carried over HTTP.&nbsp;</p> \n<p><span>In particular, we’ve observed SPDY helping <strong>more as a user’s network conditions get worse</strong>.&nbsp;</span><span>This can be seen in the long tail of the following charts tracking a sample of cellular API traffic:</span></p> \n<p><span><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/cocoaspdy_spdy_foriososx95.thumb.1280.1280.png\" width=\"640\" height=\"480\" alt=\"CocoaSPDY: SPDY for iOS / OS X\"></span></p> \n<p></p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/cocoaspdy_spdy_foriososx96.thumb.1280.1280.png\" width=\"640\" height=\"480\" alt=\"CocoaSPDY: SPDY for iOS / OS X\"></p> \n<p><span>The x-axis is divided into the 50th percentile (median), 75th, 95th and 99th percentiles of HTTP and SPDY requests. The y-axis represents the raw, unadjusted request latency in milliseconds. Note that latency is a fairly coarse metric that can’t fully demonstrate the benefits of SPDY.</span></p> \n<h5>Future Work and Getting Involved</h5> \n<p>CocoaSPDY is a young project under active development; we welcome feedback and contributions. If you’re interested in getting involved beyond usage, some of our future plans include:</p> \n<ul>\n <li><a href=\"https://github.com/twitter/CocoaSPDY/issues/1\">Server Push</a></li> \n <li><a href=\"https://github.com/twitter/CocoaSPDY/issues/2\">Discretionary Request Dispatch</a></li> \n <li>HTTP/2.0 Support</li> \n</ul>\n<p>Feature requests or bugs should be reported on the GitHub <a href=\"https://github.com/twitter/CocoaSPDY/issues\">issue tracker</a>.</p> \n<h5>Acknowledgements</h5> \n<p>The CocoaSPDY library and framework were originally authored by Michael Schore (<a href=\"https://twitter.com/intent/user?screen_name=goaway\">@goaway</a>) and Jeff Pinner (<a href=\"https://twitter.com/intent/user?screen_name=jpinner\">@jpinner</a>) and draw significantly from the <a href=\"https://github.com/netty/netty/tree/master/codec-http/src/main/java/io/netty/handler/codec/spdy\">Netty implementation</a>. Testing and deploying the protocol has been an ongoing collaboration between Twitter’s TFE and iOS Platform teams, with many individuals contributing to its success. We would also like to extend a special thanks to Steve Algernon (<a href=\"https://twitter.com/intent/user?screen_name=salgernon\">@salgernon</a>) at Apple for providing his insight and assistance with some of the nuances of the Cocoa networking stack.</p>",
"date": "2013-12-19T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2013/cocoaspdy-spdy-for-ios-os-x",
"domain": "engineering"
},
{
"title": "Right-to-left support for Twitter Mobile",
"body": "<p>Thanks to the efforts of our <a href=\"https://translate.twitter.com/welcome\">translation volunteers</a>, last week we were able to <a href=\"http://engineering/2012/03/twitter-now-available-in-arabic-farsi.html\">launch</a> right-to-left language support for our mobile website in Arabic and Farsi. Two interesting challenges came up during development for this feature:<br><br> 1) We needed to support a timeline that has both right-to-left (RTL) and left-to-right (LTR) tweets. We also needed to make sure that specific parts of each tweet, such as usernames and URLs, are always displayed as LTR.<br><br> 2) For our touch website, we wanted to flip our UI so that it was truly an RTL experience. But this meant we would need to change a lot of our CSS rules to have reversed values for properties like padding, margins, etc. — both time-consuming and unsustainable for future development. We needed a solution that would let us make changes without having to worry about adding in new CSS rules for RTL every time.<br><br> In this post, I detail how we handled these two challenges and offer some general RTL tips and other findings we gleaned during development.<br><br>General RTL tips<br><br> The basis for supporting RTL lies in the dir element attribute, which can be set to either ltr or rtl. This allows you to set an element’s content direction, so that any text/children nodes would render in the orientation specified. You can see the difference below:<br><br></p> \n<p><a href=\"http://3.bp.blogspot.com/-VeYcasPQ_Dw/UNTaXWC_OHI/AAAAAAAAAcU/G8HJyzd6s4I/s1600/RTL1.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/right-to-left_supportfortwittermobile95.thumb.1280.1280.png\" alt=\"Right-to-left support for Twitter Mobile\"></a></p> \n<p><br> In the first row, the text in the LTR column is correct, but in the second it’s the text in the RTL column. <br><br> Since this attribute can be used on any element, it can a) be used to change the direction of inline elements, such as links (see “Handling bidirectional tweet content” below) and b) if added to the root html node then the browser will flip the order of all the elements on the page automatically (see “Creating a right-to-left UI” below).<br><br> The other way to change content direction lies in the direction and unicode-bidi CSS properties. Just like the dir attribute, the direction property allows you to specify the direction within an element. However, there is one key difference: while direction will affect any block-level elements, for it to affect inline elements the unicode-bidi property must be set to embed or override. Using the dir attribute acts as if both those properties were applied, and is the preferred method as bidi should be considered a document change, not a styling one.<br><br> For more on this, see the “W3C directionality specs” section below.<br><br>Handling bidirectional tweet content<br><br> One of the things we had to think about was how to properly align each tweet depending on the dominant directionality of the content characters. For example, a tweet with mostly RTL characters should be right-aligned and read from right to left. To figure out which chars were RTL, we used this regex:<br><br>/[\\u0600-\\u06FF]|[\\u0750-\\u077F]|[\\u0590-\\u05FF]|[\\uFE70-\\uFEFF]/m<br><br> Then depending on how many chars matched, we could figure out the direction we’d want to apply to the tweet. <br><br> However, this would also affect the different entities that are in a tweet’s content. 
Tweet entities are special parts included in the text that has their own context applied to them, such as usernames and hashtags. Usernames and URLs should always be displayed as LTR, while hashtags may be RTL or LTR depending on what the first character is. To solve this, while parsing out entities we also make sure that the correct direction was applied to the element the entities were contained in.<br><br> If you are looking to add RTL support for your site and you have dynamic text with mixed directionality, besides using the dir attribute or direction property, you could also look into the \\u200e () and the \\u200f () characters. These are invisible control markers that tell the browser how the following text should be displayed. But be careful; conflicts can arise if both the dir / direction and character marker methods are used together. Or if you are using Ruby, Twitter has a great localization gem called <a href=\"https://github.com/twitter/twitter-cldr-rb\">TwitterCldr</a> which can take a string and insert these markers appropriately.<br><br>Creating a right-to-left UI <br><br> For our mobile touch website, we would first detect what language the user’s browser is set in. When it’s one of our supported RTL languages, we add the dir attribute to our page. The browser will then flip the layout of the site so that everything was rendered on the right-hand side first. <br><br> This worked fairly well on basic alignment of the page; however, this did not change how all the elements are styled. Properties like padding, margin, text-align, and float will all have the same values, which means that the layout will look just plain wrong in areas where these are applied. This can be the most cumbersome part of adding RTL support to a website, as it usually means adding special rules to your stylesheets to handle this flipped layout. <br><br> For our mobile touch website, we are using<a href=\"http://code.google.com/p/closure-stylesheets/\"> Google Closure</a> as our stylesheet compiler. This has an extremely convenient flag called —output-orientation, which will go through your stylesheets and adjust the rules according to the value (LTR or RTL) you pass in. By running the stylesheet compilation twice, once with this flag set to RTL, we get two stylesheets that are the mirror images of each other. This fixed nearly all styling issues that came from needing to flip CSS values. In the end, there were only two extra rules that we needed to add to the RTL stylesheet - those were put into rtl.css which gets added on as the last input file for the RTL compilation, thusly overriding any previous rules that were generated.<br><br> After that, it’s just a matter of including the right stylesheet for the user’s language and voila! a very nicely RTL’d site with minimal extra effort on the development side.<br><br> One last thing that we needed to think about was element manipulation with JS. Since elements will now be pushed as far to the right as possible instead of to the far left, the origin point in which an element starts at may be very different than what you’d expect - possibly even out of the visible area in a container. <br><br> For example, we had to change the way that the media strip in our photo gallery moved based on the page’s directionality. Besides coordinates changing, an LTR user would drag starting from the right, then ending to the left in order to see more photos. For an RTL user, the natural inclination would be to start at a left point and drag to the right. 
This is something that can’t be handled automatically as with our stylesheet compiler, so it comes down to good old-fashioned programming to figure out how we wanted elements to move.<br><br>Improving translations<br><br> We would like to thank our amazing translations community for helping us get to this point. Without your efforts,we would not have been able to launch this feature onto mobile Twitter. And although we’ve made great strides in supporting RTL, we still have more work to do. <br><br> We would love to have more translations for other languages that are not complete yet, such as our other two RTL languages Hebrew and Urdu. Visit <a href=\"http://translate.twitter.com/\">translate.twitter.com</a> to see how you can help us add more languages to Twitter.<br><br>Helpful Resources<br><br> W3C directionality specs:</p> \n<ul>\n <li><a href=\"http://www.w3.org/TR/2004/WD-xhtml2-20040722/mod-bidi.html\">dir</a></li> \n <li><a href=\"http://www.w3.org/TR/CSS2/visuren.html#direction\">direction / unicode-bidi </a></li> \n <li><a href=\"http://www.w3.org/International/questions/qa-bidi-css-markup\">dir versus direction</a></li> \n</ul>\n<p>More resources:</p> \n<ul>\n <li><a href=\"https://developer.mozilla.org/en-US/docs/CSS/direction\">Mozilla</a></li> \n <li><a href=\"https://github.com/twitter/twitter-cldr-rb\">TwitterCLDR Ruby Gem</a></li> \n <li><a href=\"http://code.google.com/p/closure-stylesheets/\">Google Closure Stylesheets</a></li> \n <li><a href=\"http://xkcd.com/1137/\">XKCD’s example of another control character </a></li> \n</ul>\n<p><br> Posted by Christine Tieu (<a href=\"https://twitter.com/intent/user?screen_name=ctieu\">@ctieu</a>)<br> Engineer, Mobile Web Team</p>",
"date": "2012-12-21T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/right-to-left-support-for-twitter-mobile",
"domain": "engineering"
},
{
"title": "How our photo filters came into focus",
"body": "<p>The old adage “a picture is worth a thousand words” is very apt for Twitter: a single photo can express what otherwise might require many Tweets. Photos help capture whatever we’re up to: kids’ birthday parties, having fun with our friends, the world we see when we travel.</p> \n<p>Like so many of you, lots of us here at Twitter really love sharing filtered photos in our tweets. As we got into doing it more often, we began to wonder if we could make that experience better, easier and faster. After all, the now-familiar process for tweeting a filtered photo has required a few steps:</p> \n<p>1. Take the photo (with an app)<br> 2. Filter the photo (probably another app)<br> 3. Finally, tweet it!</p> \n<p>Constantly needing to switch apps takes time, and results in frustration and wasted photo opportunities. So we challenged ourselves to make the experience as fast and simple as possible. We wanted everyone to be able to easily tweet photos that are beautiful, timeless, and meaningful.</p> \n<p>With last week’s photo filters release, we think we accomplished that on the latest versions of Twitter for Android and Twitter for iPhone. Now we’d like to tell you a little more about what went on behind the scenes in order to develop this new photo filtering experience.</p> \n<p>It’s all about the filters</p> \n<p>Our guiding principle: to create filters that amplify what you want to express, and to help that expression stand the test of time. We began with research, user stories, and sketches. We designed and tested multiple iterations of the photo-taking experience, and relied heavily on user research to make decisions about everything from filters nomenclature and iconography to the overall flow. We refined and distilled until we felt we had the experience right.</p> \n<p></p> \n<p><a href=\"http://2.bp.blogspot.com/-TdUzWMo2yzw/UNNJfRIT6KI/AAAAAAAAAbc/qlh-4jNypAg/s1600/blog_eng1.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/how_our_photo_filterscameintofocus95.thumb.1280.1280.png\" alt=\"How our photo filters came into focus\"></a></p> \n<p></p> \n<p>We spent many hours poring over the design of the filters. Since every photo is different, we did our analyses across a wide range of photos including portraits, scenery, indoor, outdoor and low-light shots. We also calibrated details ranging from color shifts, saturation, and contrast, to the shape and blend of the vignettes before handing the specifications over to Aviary, a company specializing in photo editing. They applied their expertise to build the algorithms that matched our filter specs.</p> \n<p></p> \n<p><a href=\"http://4.bp.blogspot.com/-X9SKAtHFsuo/UNNObx5nrRI/AAAAAAAAAbw/9t31YkNwQ4w/s1600/blog_eng2.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/how_our_photo_filterscameintofocus96.thumb.1280.1280.png\" alt=\"How our photo filters came into focus\"></a></p> \n<p><br>Make it fast!</p> \n<p>Our new photo filtering system is a tight integration of Aviary’s cross-platform GPU-accelerated photo filtering technology with our own user interface and visual specifications for filters. Implementing this new UI presented some unique engineering challenges. 
The main one was the need to create an experience that feels instant and seamless to use — while working within constraints of memory usage and processing speed available on the wide range of devices our apps support.</p> \n<p>To make our new filtering experience work, our implementation keeps up to four full-screen photo contexts in memory at once: we keep three full-screen versions of the image for when you’re swiping through photos (the one you’re currently looking at plus the next to the right and the left), and the fourth contains nine small versions of the photo for the grid view. And every time you apply or remove a crop or magic enhance, we update the small images in the grid view to reflect those changes, so it’s always up to date.</p> \n<p>Without those, you could experience a lag when scrolling between photos — but mobile phones just don’t have a lot of memory. If we weren’t careful about when and how we set up these chunks of memory, one result could be running out of memory and crashing the app. So we worked closely with Aviary’s engineering team to achieve a balance that would work well for many use cases.</p> \n<p>Test and test some more</p> \n<p>As soon as engineering kicked off, we rolled out this new feature internally so that we could work out the kinks, sanding down the rough spots in the experience. At first, the team tested it, and then we opened it up to all employees to get lots of feedback. We also engaged people outside the company for user research. All of this was vital to get a good sense about which aspects of the UI would resonate, or wouldn’t.</p> \n<p>After much testing and feedback, we designed an experience in which you can quickly and easily choose between different filtering options – displayed side by side, and in a grid. Auto-enhancement and cropping are both a single tap away in an easy-to-use interface.</p> \n<p></p> \n<p><a href=\"http://3.bp.blogspot.com/-h3ui89b-SAo/UNNPRNClFBI/AAAAAAAAAb8/ZRZegKIBLr8/s1600/blog_eng3.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/how_our_photo_filterscameintofocus97.thumb.1280.1280.png\" alt=\"How our photo filters came into focus\"></a></p> \n<p></p> \n<p>Finally, a collaborative team of engineers, designers and product managers were able to ship a set of filters wrapped in a seamless UI that anyone with our Android or iPhone app can enjoy. And over time, we want our filters to evolve so that sharing and connecting become even more delightful. It feels great to be able to share it with all of you at last.</p> \n<p><br> Posted by <a href=\"https://twitter.com/intent/user?screen_name=ryfar\">@ryfar</a><br> Tweet Composer Team<br><br><br><br></p>",
"date": "2012-12-20T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/how-our-photo-filters-came-into-focus",
"domain": "engineering"
},
{
"title": "Class project: “Analyzing Big Data with Twitter”",
"body": "<p>Twitter partnered with UC Berkeley this past semester to teach <a href=\"http://blogs.ischool.berkeley.edu/i290-abdt-s12/\">Analyzing Big Data with Twitter</a>, a class with <a href=\"http://people.ischool.berkeley.edu/~hearst/\">Prof. Marti Hearst</a>. In the first half of the semester, Twitter engineers went to UC Berkeley to talk about the <a href=\"http://engineering.twitter.com/\">technology behind Twitter</a>: from the basics of <a href=\"http://engineering.twitter.com/2012/06/building-and-profiling-high-performance.html\">scaling</a> up a <a href=\"http://engineering.twitter.com/2012/06/distributed-systems-tracing-with-zipkin.html\">service</a> to the algorithms behind <a href=\"http://engineering.twitter.com/2012/03/generating-recommendations-with.html\">user recommendations</a> and <a href=\"http://engineering.twitter.com/2011/05/engineering-behind-twitters-new-search.html\">search</a>. These talks are available online, on the course <a href=\"http://blogs.ischool.berkeley.edu/i290-abdt-s12/\">website</a>. <br><br> In the second half of the course, students applied their knowledge and creativity to build data-driven applications on top of Twitter. They came up with a range of products that included tracking bands or football teams, monitoring Tweets to find calls for help, and identifying communities on Twitter. Each project was mentored by one of our engineers.<br><br> Last week, 40 of the students came to Twitter HQ to demo their final projects in front of a group of our engineers, designers and engineering leadership team.</p> \n<p><a href=\"http://2.bp.blogspot.com/-HnOe0xampRc/UMoOdW2gAuI/AAAAAAAAAbI/Hl2WnAEbc_A/s1600/IMG_8847.jpg\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/class_project_analyzingbigdatawithtwitter95.thumb.1280.1280.png\" alt=\"Class project: “Analyzing Big Data with Twitter” \"></a></p> \n<p><br> The students’ enthusiasm and creativity inspired and impressed all of us who were involved. The entire experience was really fun, and we hope to work with Berkeley more in the future.<br><br> Many thanks to the volunteer Twitter engineers, to Prof. Hearst, and of course to our fantastic students!<br><br></p> \n<div class=\"g-tweet\">\n <blockquote class=\"twitter-tweet\"> \n <p>Cannot believe that I’ll be visiting the Twitter HQ tomorrow for our project presentations. Yay! Yes, had to tweet about it. <a href=\"https://twitter.com/search/%23ilovetwitter\">#ilovetwitter</a></p> — Priya Iyer (\n <a href=\"https://twitter.com/intent/user?screen_name=myy_precious\">@myy_precious</a>) \n <a href=\"https://twitter.com/myy_precious/status/276513438702923779\">December 6, 2012</a>\n </blockquote>",
"date": "2012-12-13T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/class-project-analyzing-big-data-with-twitter",
"domain": "engineering"
},
{
"title": "Blobstore: Twitter’s in-house photo storage system",
"body": "<p>Millions of people turn to Twitter to share and discover photos. To make it possible to upload a photo and attach it to your Tweet directly from Twitter, we partnered with Photobucket in 2011. As soon as photos became a more native part of the Twitter experience, more and more people began using this feature to share photos. <br><br> In order to introduce new features and functionality, such as <a href=\"http://engineering/2012/12/twitter-photos-put-filter-on-it.html\">filters</a>, and continue to improve the photos experience, Twitter’s Core Storage team began building an in-house photo storage system. In September, we began to use this new system, called Blobstore.<br><br>What is Blobstore?<br><br> Blobstore is Twitter’s low-cost and scalable storage system built to store photos and other binary large objects, also known as blobs. When we set out to build Blobstore, we had three design goals in mind:<br><br></p> \n<ul>\n <li>Low Cost: Reduce the amount of money and time Twitter spent on storing Tweets with photos.</li> \n <li>High Performance: Serve images in the low tens of milliseconds, while maintaining a throughput of hundreds of thousands of requests per second.</li> \n <li>Easy to Operate: Be able to scale operational overhead with Twitter’s continuously growing infrastructure.</li> \n</ul>\n<p><br>How does it work?<br><br> When a user tweets a photo, we send the photo off to one of a set of Blobstore front-end servers. The front-end understands where a given photo needs to be written, and forwards it on to the servers responsible for actually storing the data. These storage servers, which we call storage nodes, write the photo to a disk and then inform a Metadata store that the image has been written and instruct it to record the information required to retrieve the photo. This Metadata store, which is a non-relational key-value store cluster with automatic multi-DC synchronization capabilities, spans across all of Twitter’s data centers providing a consistent view of the data that is in Blobstore.<br><br> The brain of Blobstore, the blob manager, runs alongside the front-ends, storage nodes, and index cluster. The blob manager acts as a central coordinator for the management of the cluster. It is the source of all of the front-ends’ knowledge of where files should be stored, and it is responsible for updating this mapping and coordinating data movement when storage nodes are added, or when they are removed due to failures.<br><br> Finally, we rely on Kestrel, Twitter’s existing asynchronous queue server, to handle tasks such as replicating images and ensuring data integrity across our data centers. <br><br> We guarantee that when an image is successfully uploaded to Twitter, it is immediately retrievable from the data center that initially received the image. Within a short period of time, the image is replicated to all of our other data centers, and is retrievable from those as well. 
Because we rely on a multi-data-center Metadata store for the central index of files within Blobstore, we are aware in a very short amount of time whether an image has been written to its original data center; we can route requests there until the Kestrel queues are able to replicate the data.<br><br>Blobstore Components</p> \n<p><a href=\"http://2.bp.blogspot.com/-ck7Xv1OZygs/UMecZAbUr0I/AAAAAAAAAaY/5Dw0ntVVL4w/s1600/Blobstore%2BComponents.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/blobstore_twittersin-housephotostoragesystem95.thumb.1280.1280.png\" alt=\"Blobstore: Twitter’s in-house photo storage system \"></a></p> \n<p><br>How is the data found?<br><br> When an image is requested from Blobstore, we need to determine its location in order to access the data. There are a few approaches to solving this problem, each with its own pros and cons. One such approach is to map or hash each image individually to a given server by some method. This method has a fairly major downside in that it makes managing the movement of images much more complicated. For example, if we were to add or remove a server from Blobstore, we would need to recompute a new location for each individual image affected by the change. This adds operational complexity, as it would necessitate a rather large amount of bookkeeping to perform the data movement.<br><br> We instead created a fixed-size container for individual blobs of data, called a “virtual bucket”. We map images to these containers, and then we map the containers to the individual storage nodes. We keep the total number of virtual buckets unchanged for the entire lifespan of our cluster. In order to determine which virtual bucket a given image is stored in, we perform a simple hash on the image’s unique ID. As long as the number of virtual buckets remains the same, this hashing will remain stable. The advantage of this stability is that we can reason about the movement of data at a much more coarse-grained level than the individual image.<br><br>How do we place the data?<br><br> When mapping virtual buckets to physical storage nodes, we keep some rules in mind to make sure that we don’t lose data when we lose servers or hard drives. For example, if we were to put all copies of a given image on a single rack of servers, losing that rack would mean that particular image would be unavailable.<br><br> If we were to completely mirror the data on a given storage node on another storage node, it would be unlikely that we would ever have unavailable data, as the likelihood of losing both nodes at once is fairly low. However, whenever we were to lose a node, we would only have a single node to source from to re-replicate the data. We would have to recover slowly, so as to not impact the performance of the single remaining node.<br><br> If we were to take the opposite approach and allow any server in the cluster to share a range of data on all servers, then we would avoid a bottleneck when recovering lost replicas, as we would essentially be able to read from the entire cluster in order to re-replicate data. However, we would also have a very high likelihood of data loss if we were to lose more than the replication factor of the cluster (two) per data center, as the chance that any two nodes would share some piece of data would be high. 
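As an aside before settling that placement question, the virtual-bucket mapping described above can be sketched in a few lines (hypothetical Scala for illustration, not Blobstore’s actual code):</p> \n<pre><code>object BucketPlacement {\n  // Fixed for the entire lifespan of the cluster, so the hash stays stable.\n  val NumVirtualBuckets = 65536\n\n  // Stable image-to-bucket mapping: depends only on the image ID, so it\n  // never changes as storage nodes come and go.\n  def bucketFor(imageId: Long): Int =\n    math.abs(imageId.hashCode) % NumVirtualBuckets\n\n  // Bucket-to-node mapping: a small table that is recomputed only when the\n  // cluster topology changes (the job libcrunch takes on).\n  def nodesFor(bucket: Int, bucketMap: Map[Int, Seq[String]]): Seq[String] =\n    bucketMap(bucket)\n}\n</code></pre> \n<p>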
So, the optimal approach would be somewhere in the middle: for a given piece of data, there would be a limited number of machines that could share the range of data of its replica - more than one but less than the entire cluster.<br><br> We took all of these things into account when we determined the mapping of data to our storage nodes. As a result, we built a library called “libcrunch” which understands the various data placement rules such as rack-awareness, understands how to replicate the data in a way that minimizes the risk of data loss while also maximizing the throughput of data recovery, and attempts to minimize the amount of data that needs to be moved upon any change in the cluster topology (such as when nodes are added or removed). It also gives us the power to fully map the network topology of our data center, so storage nodes have better data placement and we can take into account rack awareness and placement of replicas across PDU zones and routers.<br><br> Keep an eye out for a blog post with more information on libcrunch.<br><br>How is the data stored?<br><br> Once we know where a given piece of data is located, we need to be able to efficiently store and retrieve it. Because of their relatively high storage density, we are using standard hard drives inside our storage nodes (3.5” 7200 RPM disks). Since this means that disk seeks are very expensive, we attempted to minimize the number of disk seeks per read and write.<br><br> We pre-allocate ‘fat’ files of around 256MB each on each storage node disk using fallocate(). We store each blob of data sequentially within a fat file, along with a small header. The offset and length of the data is then stored in the Metadata store, which uses SSDs internally, as the access pattern for index reads and writes is very well-suited for solid state media. Furthermore, splitting the index from the data saves us from needing to scale out memory on our storage nodes because we don’t need to keep any local indexes in RAM for fast lookups. The only time we end up hitting disk on a storage node is once we already have the fat file location and byte offset for a given piece of data. This means that we can generally guarantee a single disk seek for that read.<br><br></p> \n<p><a href=\"http://3.bp.blogspot.com/-c-gnLltkKUM/UMecoVXkGBI/AAAAAAAAAak/P9H75Wc8Cyg/s1600/How%2Bis%2Bthe%2Bdata%2Bstored.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/blobstore_twittersin-housephotostoragesystem96.thumb.1280.1280.png\" alt=\"Blobstore: Twitter’s in-house photo storage system \"></a></p> \n<p><br>Topology Management<br><br> As the number of disks and nodes increases, the rate of failure increases. Capacity needs to be added, disks and nodes need to be replaced after failures, servers need to be moved. To make Blobstore operationally easy, we put a lot of time and effort into libcrunch and the tooling associated with making cluster changes. <br><br></p> \n<p><a href=\"http://4.bp.blogspot.com/-3hbpHmPwUOA/UMecvGKyH7I/AAAAAAAAAaw/irUWMopcGwk/s1600/blobStore_frontend.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/blobstore_twittersin-housephotostoragesystem97.thumb.1280.1280.png\" alt=\"Blobstore: Twitter’s in-house photo storage system \"></a></p> \n<p><br> When a storage node fails, data that was hosted on that node needs to be copied from a surviving replica to restore the correct replication factor. 
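In outline, that recovery bookkeeping amounts to diffing the old and new bucket-to-node maps (a toy, hypothetical Scala sketch; libcrunch itself is far more sophisticated):</p> \n<pre><code>object RecoveryPlanner {\n  // Diff the bucket-to-node maps computed before and after a node failure\n  // and emit (bucket, destinationNode) copy instructions.\n  def recoveryPlan(oldMap: Map[Int, Set[String]],\n                   newMap: Map[Int, Set[String]]): Seq[(Int, String)] =\n    newMap.toSeq.flatMap { case (bucket, nodes) =&gt;\n      (nodes -- oldMap.getOrElse(bucket, Set.empty)).toSeq\n        .map(dest =&gt; (bucket, dest))\n    }\n}\n</code></pre> \n<p>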
The failed node is marked as unavailable in the cluster topology, and so libcrunch computes a change in the mapping from the virtual buckets to the storage nodes. From this mapping change, the storage nodes are instructed to copy and migrate virtual buckets to new locations.<br><br>Zookeeper<br> Topology and placement rules are stored internally in one of our Zookeeper clusters. The blob manager deals with this interaction, and uses the information stored in Zookeeper when an operator makes a change to the system. A topology change can consist of adjusting the replication factor, adding, failing, or removing nodes, as well as adjusting other input parameters for libcrunch. <br><br>Replication across Data centers<br><br> Kestrel is used for cross data center replication. Because Kestrel is a durable queue, we use it to asynchronously replicate our image data across data centers. <br><br>Data center-aware Routing<br><br> TFE (Twitter Frontend) is one of Twitter’s core components for routing. We wrote a custom plugin for TFE that extends the default routing rules. Our Metadata store spans multiple data centers, and because the metadata stored per blob is small (a few bytes), we typically replicate this information much faster than the blob data. If a user tries to access a blob that has not been replicated to the nearest data center they are routed to, we look up this metadata information and proxy requests to the nearest data center that has the blob data stored. This gives us the property that if replication gets delayed, we can still route requests to the data center that stored the original blob, serving the user the image at the cost of a little higher latency until it’s replicated to the closer data center.<br><br>Future work<br><br> We have shipped the first version of Blobstore internally. Although Blobstore started with photos, we are adding support for other features and use cases that require blob storage. We are also continuously iterating on it to make it more robust, scalable, and easier to maintain.<br><br>Acknowledgments<br><br> Blobstore was a group effort. The following folks have contributed to the project: Meher Anand (<a href=\"https://twitter.com/meher_anand\">@meher_anand</a>), Ed Ceaser (<a href=\"https://twitter.com/asdf\">@asdf</a>), Harish Doddi (<a href=\"https://twitter.com/thinkingkiddo\">@thinkingkiddo</a>), Chris Goffinet (<a href=\"https://twitter.com/lenn0x\">@lenn0x</a>), Jack Gudenkauf (<a href=\"https://twitter.com/_jg\">@_jg</a>), and Sangjin Lee (<a href=\"https://twitter.com/sjlee\">@sjlee</a>). <br><br> Posted by Armond Bigian <a href=\"https://twitter.com/armondbigian\">@armondbigian</a><br> Engineering Director, Core Storage &amp; Database Engineering</p>",
"date": "2012-12-11T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/blobstore-twitter-s-in-house-photo-storage-system",
"domain": "engineering"
},
{
"title": "Implementing pushState for twitter.com",
"body": "<p>As part of our <a href=\"http://engineering.twitter.com/2012/05/improving-performance-on-twittercom.html\">continuing effort to improve the performance of twitter.com</a>, we’ve recently implemented <a href=\"https://developer.mozilla.org/en-US/docs/DOM/Manipulating_the_browser_history#Adding_and_modifying_history_entries\">pushState</a>. With this change, users experience a perceivable decrease in latency when navigating between sections of twitter.com; in some cases near zero latency, as we’re now caching responses on the client.</p> \n<p>This post provides an overview of the pushState API, a summary of our implementation, and details some of the pitfalls and gotchas we experienced along the way.</p> \n<h3>API Overview</h3> \n<p>pushState is part of the <a href=\"https://developer.mozilla.org/en-US/docs/DOM/Manipulating_the_browser_history\">HTML 5 History API</a>— a set of tools for managing state on the client. The pushState() method enables mapping of a state object to a URL. The address bar is updated to match the specified URL without actually loading the page.</p> \n<p><code>history.pushState([page data], [page title], [page URL])</code></p> \n<p>While the pushState() method is used when navigating forward from A to B, the History API also provides a “popstate” event—used to mange back/forward button navigation. The event’s “state” property maps to the data passed as the first argument to pushState().</p> \n<p>If the user presses the back button to return to the initial point from which he/she first navigated via pushState, the “state” property of the “popstate” event will be undefined. To set the state for the initial, full-page load use the replaceState() method. It accepts the same arguments as the pushState() method.</p> \n<p><code>history.replaceState([page data], [page title], [page URL])</code></p> \n<p>The following diagram illustrates how usage of the History API comes together.</p> \n<p><br><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/diagram_illustratinguseofthehtml5historyapi95.thumb.1280.1280.png\" alt=\"Diagram illustrating use of the HTML 5 History API\"><br><br></p> \n<h3>Progressive Enhancement</h3> \n<p>Our pushState implementation is a <a href=\"http://en.wikipedia.org/wiki/Progressive_enhancement\">progressive enhancement</a> on top of <a href=\"http://engineering.twitter.com/2012/05/improving-performance-on-twittercom.html\">our previous work</a>, and could be described as <a href=\"http://en.wikipedia.org/wiki/Hijax\">Hijax</a> + server-side rendering. By maintaining view logic on the server, we keep the client light, and maintain support for browsers that don’t support pushState with the same URLs. This approach provides the additional benefit of enabling us to disable pushState at any time without jeopardizing any functionality.</p> \n<h3>On the Server</h3> \n<p>On the server, we configured each endpoint to return either full-page responses, or a JSON payload containing a partial, server-side rendered view, along with its corresponding JavaScript components. 
The decision of what response to send is determined by checking the Accept header and looking for “application/json.”</p> \n<p>The same views are used to render both types of requests; to support pushState the views format the pieces used for the full-page responses into JSON.</p> \n<p>Here are two example responses for the Interactions page to illustrate the point:</p> \n<h4>pushState response</h4> \n<pre><code>{\n // Server-rendered HTML for the view\n page: \"…\",\n // Path to the JavaScript module for the associated view\n module: \"app/pages/connect/interactions\",\n // Initialization data for the current view\n init_data: {…},\n title: \"Twitter / Interactions\"\n}\n</code>\n</pre> \n<h4>Full page response</h4> \n<pre><code>&lt;b&gt;{{title}}&lt;/b&gt;\n{{page}}</code></pre> \n<h3>Client Architecture</h3> \n<p>Several aspects of our existing client architecture made it particularly easy to enhance twitter.com with pushState.</p> \n<p>By contract, our components attach themselves to a single DOM node, listen to events via delegation, fire events on the DOM, and those events are broadcast to other components via DOM event bubbling. This allows our components to be even more loosely coupled—a component doesn’t need a reference to another component in order to listen for its events.</p> \n<p>Secondly, all of our components are defined using AMD, enabling the client to make decisions about what components to load.</p> \n<p>With this client architecture we implemented pushState by adding two components: one responsible for managing the UI, the other data. Both are attached to the document, listen for events across the entire page, and broadcast events available to all components.</p> \n<h4>UI Component</h4> \n<ul>\n <li>Manages the decision to pushState URLs by listening for document-wide clicks, and keyboard shortcuts</li> \n <li>Broadcasts an event to initiate pushState navigation</li> \n <li>Updates the UI in response to events from the data component</li> \n</ul>\n<h4>DATA Component</h4> \n<ul>\n <li>Only included if we’re using pushState</li> \n <li>Manages XHRs and caching of responses</li> \n <li>Provides eventing around the HTML 5 history API to provide a single interface for UI components</li> \n</ul>\n<h4>Example pushState() Navigation LifeCycle</h4> \n<ol>\n <li>The user clicks on a link with a specialized class (we chose “js-nav”); the click is caught by the UI component which prevents the default behavior and triggers a custom event to initiate pushState navigation.</li> \n <li>The data component listens for that event and…<br>\n <ol>\n <li>Writes the current view to cache and, only before initial pushState navigation, calls replaceState() to set the state data for the view</li> \n <li>Fetches the JSON payload for the requested URL (either via XHR or from cache)</li> \n <li>Updates the cache for the URL</li> \n <li>Calls pushState() to update the URL</li> \n <li>Triggers an event indicating the UI should be updated</li> \n </ol></li> \n <li>The UI component resumes control by handling the event from the data component and…<br>\n <ol>\n <li>JavaScript components for the current view are torn down (event listeners detached, associated state is cleaned up)</li> \n <li>The HTML for the current view is replaced with the new HTML</li> \n <li>The script loader only fetches modules not already loaded</li> \n <li>The JavaScript components for the current view are initialized</li> \n <li>An event is triggered to alert all components that the 
view is rendered and initialized</li> \n </ol></li> \n</ol>\n<h3>Pitfalls, Gotchas, etc.</h3> \n<p>It’ll come as no surprise to experienced frontend engineers that the majority of the problems and annoyances with implementing pushState stem from either 1) inconsistencies in browser implementations of the HTML 5 History API, or 2) having to replicate behaviors or functionality you would otherwise get for free with full-page reloads.</p> \n<h4>Don’t believe the API: title updates are manual</h4> \n<p>All browsers currently disregard the title attribute passed to the pushState() and replaceState() methods. Any updates to the page title need to be done manually.</p> \n<h4>popstate Event Inconsistencies</h4> \n<p>At the time of this writing, WebKit (and only WebKit) fires an extraneous popstate event after initial page load. This appears to be <a href=\"https://bugs.webkit.org/show_bug.cgi?id=93506\">a known bug in WebKit</a>, and is easy to work around by ignoring popstate events if the “state” property is undefined.</p> \n<h4>State Object Size Limits</h4> \n<p>Firefox imposes a <a href=\"https://developer.mozilla.org/en-US/docs/DOM/Manipulating_the_browser_history#The_pushState().C2.A0method\">640KB character limit</a> on the serialized state object passed to pushState(), and will throw an exception if that limit is exceeded. We hit this limit in the early days of our implementation, and moved to storing state in memory. We limit the size of the serialized JSON we cache on the client per URL, and can adjust that number via a server-owned config.</p> \n<p>It’s worth noting that due to the aforementioned popstate bug in WebKit, we pass an empty object as the first argument to pushState() to distinguish WebKit’s extraneous popstate events from those triggered in response to back/forward navigation.</p> \n<h4>Thoughtful State Management Around Caching</h4> \n<p>The bulk of the work implementing pushState went into designing a simple client framework that would facilitate caching and provide the right events to enable components to both prepare themselves to be cached, and restore themselves from cache. This was solved through a few simple design decisions:</p> \n<ol>\n <li>All events that trigger navigation (clicks on links, keyboard shortcuts, and back/forward button presses) are abstracted by the pushState UI component, routed through the same path in the data component, and subsequently fire the same events. This allows the UI to be both cached and handle updates in a uniform way.</li> \n <li>The pushState UI component fires events around the rendering of updates: one before the DOM is updated, and another after the update is complete. The former enables UI components such as dialogs and menus to be collapsed in advance of the page being cached; the latter enables UI components like timelines to update their timestamps when rendered from cache.</li> \n <li>POST &amp; DELETE operations bust the client-side cache.</li> \n</ol>\n<h4>Re-implementing Browser Functionality</h4> \n<p>As is often the case, changing the browser’s default behavior in an effort to make the experience faster or simpler for the end-user typically requires more work on behalf of developers and designers. 
Here are some pieces of browser functionality that we had to re-implement:</p> \n<ul>\n <li>Managing the position of the scrollbar as the user navigates forward and backward.</li> \n <li>Preserving context menu functionality when preventing a link’s default click behavior.</li> \n <li>Accounting for especially fast, indecisive user clicks by ensuring the response you’re rendering is in sync with the last requested URL.</li> \n <li>Canceling outbound XHRs when the user requests a new page to avoid unnecessary UI updates.</li> \n <li>Implementing the <a href=\"http://preloaders.net/\">canonical AJAX spinner</a>, so the user knows the page is loading.</li> \n</ul>\n<h3>Final Thoughts</h3> \n<p>Despite the usual browser inconsistencies and other gotchas, we’re pretty happy with the HTML 5 History API. Our implementation has enabled us to deliver the fast initial page rendering times and robustness we associate with traditional, server-side rendered sites and the lightning-quick in-app navigation and state changes associated with client-side rendered web applications.</p> \n<h3>Helpful Resources</h3> \n<ul>\n <li>Mozilla’s (<a href=\"https://twitter.com/mozilla\">@Mozilla</a>) <a href=\"https://developer.mozilla.org/en-US/docs/DOM/Manipulating_the_browser_history\">HTML 5 History API documentation</a></li> \n <li>Chris Wanstrath’s (<a href=\"https://twitter.com/defunkt\">@defunkt</a>) <a href=\"https://github.com/defunkt/jquery-pjax/\">pjax (pushState + ajax = pjax) plugin for jQuery project on GitHub</a></li> \n <li>Benjamin Lupton’s (<a href=\"https://twitter.com/balupton\">@balupton</a>) <a href=\"https://github.com/balupton/History.js/\">history.js project on GitHub</a></li> \n <li>Modernizr’s (<a href=\"https://twitter.com/Modernizr\">@Modernizr</a>) <a href=\"https://github.com/Modernizr/Modernizr\">pushState capability detection</a></li> \n</ul>\n<p>—Todd Kloots, Engineer, Web Core team (<a href=\"https://twitter.com/todd\">@todd</a>)</p>",
"date": "2012-12-07T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/implementing-pushstate-for-twittercom",
"domain": "engineering"
},
{
"title": "Twitter and SMS Spoofing",
"body": "<p>Over the past two days, <a href=\"http://arstechnica.com/security/2012/12/tweeting-with-sms-can-open-door-to-hacks-on-your-twitter-account/\">a few articles</a> have been published about a potential problem concerning the ability to post false updates to another user’s SMS-enabled Twitter account, and it has been misreported that US-based Twitter users are currently vulnerable to this type of attack.<br><br> The general concern is that if a user has a Twitter account configured for <a href=\"https://support.twitter.com/articles/14014-twitter-via-sms-faq#\">SMS updates</a>, and an attacker knows that user’s phone number, it could be possible for the attacker to send a fake SMS message to Twitter that looks like it’s coming from that user’s phone number, which would result in a fake post to that user’s timeline.<br><br> Most Twitter users interact over the SMS channel using a “shortcode.” In the US, for instance, <a href=\"https://support.twitter.com/articles/20170024\">this shortcode</a> is 40404. &nbsp;Because of the way that shortcodes work, it is not possible to send an SMS message with a fake source addressed to them, which eliminates the possibility of an SMS spoofing attack to those numbers. <br><br> However, in some countries a Twitter shortcode is <a href=\"https://support.twitter.com/articles/14226#\">not yet available</a>, and in those cases Twitter users interact over the SMS channel using a “longcode.” A longcode is basically just a normal looking phone number. &nbsp;Given that it is possible to send an SMS message with a fake source address to these numbers, we have offered <a href=\"https://support.twitter.com/groups/34-apps-sms-and-mobile/topics/153-twitter-via-sms/articles/20169928-how-to-use-pins-with-sms#\">PIN protection</a> to users who sign up with a longcode since 2007. &nbsp;As of August of this year, we have additionally disallowed posting through longcodes for users that have an available shortcode.<br><br> It has been misreported that US-based Twitter users are currently vulnerable to a spoofing attack because PIN protection is unavailable for them. &nbsp;By having a shortcode, PIN protection isn’t necessary for US-based Twitter users, because they are not vulnerable to SMS spoofing. &nbsp;We only provide the option for PIN protection in cases where a user could have registered with a longcode that is susceptible to SMS spoofing.<br><br> We work hard to protect our users from these kinds of threats and many others, and will continue to keep Twitter a site deserving of your trust.&nbsp; <br><br> Posted by Moxie Marlinspike - <a href=\"https://twitter.com/moxie\">@moxie</a><br> Engineering Manager, Product Security <br><br></p>",
"date": "2012-12-05T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/twitter-and-sms-spoofing",
"domain": "engineering"
},
{
"title": "Discover with a new lens: Twitter cards",
"body": "<p>As you already know, there’s a myriad of things shared on Twitter every day, and not just 140 characters of text. There are links to breaking news stories, images from current events, and the latest activity from those you follow.</p> \n<p>We want Discover to be the place where you find the best of that content relevant to you, even if you don’t necessarily know everyone involved. This is why we’ve introduced several improvements to Discover on twitter.com over the last few months. For example we redesigned it to show a <a href=\"http://engineering/2012/09/more-tweets-to-discover.html\">continuous stream of Tweets</a> with photos and links to websites, in which you can also now see Tweets from activity, based on what your network favorites. We also added new signals to better blend together all the most relevant Tweets for you, and implemented <a href=\"https://twitter.com/twitter/status/266338665540755457\">polling</a> so you know whenever there are fresh Tweets to see.</p> \n<div class=\"g-tweet\">\n <blockquote class=\"twitter-tweet\"> \n <p>Have you checked Discover lately? A new notification at the top of your stream shows when new Tweets are available. <a href=\"http://t.co/dL2NYafx\">twitter.com/twitter/status…</a></p> — Twitter (\n <a href=\"https://twitter.com/intent/user?screen_name=twitter\">@twitter</a>) \n <a href=\"https://twitter.com/twitter/status/266338665540755457\">November 8, 2012</a>\n </blockquote>",
"date": "2012-11-15T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/discover-with-a-new-lens-twitter-cards",
"domain": "engineering"
},
{
"title": "Dimension Independent Similarity Computation (DISCO)",
"body": "<p>MapReduce is a programming model for processing large data sets, typically used to do distributed computing on clusters of commodity computers. With large amount of processing power at hand, it’s very tempting to solve problems by brute force. However, we often combine clever sampling techniques with the power of MapReduce to extend its utility.<br><br> Consider the problem of finding all pairs of similarities between D indicator (0/1 entries) vectors, each of dimension N. In particular we focus on cosine similarities between all pairs of D vectors in R^N. Further assume that each dimension is L-sparse, meaning each dimension has at most L non-zeros across all points. For example, typical values to compute similarities between all pairs of a subset of Twitter users can be:<br><br> D = 10M<br> N = 1B<br> L = 1000<br><br> Since the dimensions are sparse, it is natural to store the points dimension by dimension. To compute cosine similarities, we can easily feed each dimension t into MapReduce by using the following Mapper and Reducer combination</p> \n<p><a href=\"http://4.bp.blogspot.com/-_3IPLL6vJLg/UKFsYMRgbTI/AAAAAAAAAZQ/dqreJhdAjUo/s1600/image02.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/dimension_independentsimilaritycomputationdisco95.thumb.1280.1280.png\" alt=\"Dimension Independent Similarity Computation (DISCO)\"></a></p> \n<p><br> Where #(w) counts the number of dimensions in which point w occurs, and #(w1, w2) counts the number of dimensions in which w1 and w2 co-occur, i.e., the dot product between w1 and w2. The steps above compute all dot products, which will then be scaled by the cosine normalization factor.<br><br> There are two main complexity measures for MapReduce: “shuffle size”, and “reduce-key complexity”, defined shortly (Ashish Goel and Kamesh Munagala 2012). It can be easily shown that the above mappers will output on the order of O(NL^2) emissions, which for the example parameters we gave is infeasible. The number of emissions in the map phase is called the “shuffle size”, since that data needs to be shuffled around the network to reach the correct reducer.<br><br> Furthermore, the maximum number of items reduced to a single key is at most #(w1, w2), which can be as large as N. Thus the “reduce-key complexity” for the above scheme is N.<br><br> We can drastically reduce the shuffle size and reduce-key complexity by some clever sampling:</p> \n<p><a href=\"http://1.bp.blogspot.com/-34ZR2UwBq58/UKFsEADJLVI/AAAAAAAAAZE/2wSxCTdGU5A/s1600/image01.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/dimension_independentsimilaritycomputationdisco96.thumb.1280.1280.png\" alt=\"Dimension Independent Similarity Computation (DISCO)\"></a></p> \n<p></p> \n<p><a href=\"http://1.bp.blogspot.com/-9E4sZjm3FF8/UKFqmjiPIPI/AAAAAAAAAYs/V6ejserefzs/s1600/image00.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/dimension_independentsimilaritycomputationdisco97.thumb.1280.1280.png\" alt=\"Dimension Independent Similarity Computation (DISCO)\"></a></p> \n<p><br> Notation: p and ε are oversampling parameters.<br><br> In this case, the output of the reducers are random variables whose expectations are the cosine similarities. Two proofs are needed to justify the effectiveness of this scheme. First, that the expectations are indeed correct and obtained with high probability, and second, that the shuffle size is greatly reduced. 
<br><br> We prove both of these claims in (Reza Bosagh-Zadeh and Ashish Goel 2012). In particular, in addition to correctness, we prove that the shuffle size of the above scheme is only O(DL log(D)/ε), with no dependence on the “dimension” N, hence the name.<br><br> This means as long as you have enough mappers to read your data, you can use the DISCO sampling scheme to make the shuffle size tractable. Furthermore, each reduce key gets at most O(log(D)/ε) values, thus making the reduce-key complexity tractable too. <br><br> Within Twitter, we use the DISCO sampling scheme to compute similar users. We have also used the scheme to find highly similar pairs of words, by taking each dimension to be the indicator vector that signals in which Tweets the word appears. We further empirically verify the claims and observe large reductions in shuffle size, with details in the <a href=\"http://arxiv.org/abs/1206.2082\">paper</a>.<br><br> Finally, this sampling scheme can be used to implement many other similarity measures. For Jaccard Similarity, we improve the implementation of the well-known MinHash (<a href=\"http://en.wikipedia.org/wiki/MinHash\" target=\"_blank\" rel=\"nofollow\">http://en.wikipedia.org/wiki/MinHash</a>) scheme on Map-Reduce.<br><br> Posted by <br> Reza Zadeh (<a href=\"http://twitter.com/Reza_Zadeh\">@Reza_Zadeh</a>) and Ashish Goel (<a href=\"http://twitter.com/ashishgoel\">@ashishgoel</a>) - Personalization &amp; Recommendation Systems Group and Revenue Group<br><br><br>Bosagh-Zadeh, Reza and Goel, Ashish (2012), <a href=\"http://arxiv.org/abs/1206.2082\">Dimension Independent Similarity Computation, arXiv:1206.2082</a><br><br> Goel, Ashish and Munagala, Kamesh (2012), <a href=\"http://www.stanford.edu/~ashishg/papers/mapreducecomplexity.pdf\">Complexity Measures for Map-Reduce, and Comparison to Parallel Computing</a>, <a href=\"http://www.stanford.edu/~ashishg/papers/mapreducecomplexity.pdf\" target=\"_blank\" rel=\"nofollow\">http://www.stanford.edu/~ashishg/papers/mapreducecomplexity.pdf</a></p>",
"date": "2012-11-12T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/dimension-independent-similarity-computation-disco",
"domain": "engineering"
},
{
"title": "Bolstering our infrastructure",
"body": "<p>Last night, the world tuned in to Twitter to share the election results as U.S. voters chose a president and settled many other campaigns. Throughout the day, people sent more than 31 million election-related Tweets (which contained certain key terms and relevant hashtags). And as results rolled in, we tracked the surge in election-related Tweets at 327,452 Tweets per minute (TPM). These numbers reflect the largest election-related Twitter conversation during our 6 years of existence, though they don’t capture the total volume of all Tweets yesterday.<br><br> As an engineering team, we keep an eye on all of the activity across the platform –– in particular, on the number of Tweets per second (TPS). Last night, Twitter averaged about 9,965 TPS from 8:11pm to 9:11pm PT, with a one-second peak of 15,107 TPS at 8:20pm PT and a one-minute peak of 874,560 TPM. Seeing a sustained peak over the course of an entire event is a change from the way people have previously turned to Twitter during live events. <br><br> In the past, we’ve generally experienced short-lived roars related to the clock striking midnight on <a href=\"http://engineering/2011/01/celebrating-new-year-with-new-tweet.html\">New Year’s Eve</a> (6,939 Tweets per second, or TPS), the <a href=\"https://twitter.com/twitter/status/92754546824200193\">end of a soccer game</a> (7,196 TPS), or <a href=\"https://twitter.com/twitter/status/92754546824200193\">Beyonce’s pregnancy announcement</a> (8,868 TPS). Those spikes tended to last seconds, maybe minutes at most. Now, rather than brief spikes, we are seeing sustained peaks for hours. Last night is just another example of the traffic pattern we’ve experienced this year –– we also saw this during the <a href=\"http://engineering/2012/06/courtside-tweets.html\">NBA Finals</a>, <a href=\"http://engineering/2012/08/olympic-and-twitter-records.html\">Olympics Closing Ceremonies</a>, <a href=\"http://engineering/2012/09/the-vmas-look-back-via-twitter.html\">VMAs</a>, and <a href=\"http://engineering/2012/10/twitters-hip-hop-firmament-barsandstars.html\">Hip-Hop Awards</a>.<br><br> Last night’s numbers demonstrate that as Twitter usage patterns change, Twitter the service can remain resilient. Over time, we have been working to build an infrastructure that can withstand an ever-increasing load. For example, we’ve been steadily <a href=\"http://engineering.twitter.com/2011/03/building-faster-ruby-garbage-collector.html\">optimizing the Ruby runtime</a>. And, as part of our ongoing migration away from Ruby, we’ve reconfigured the service so traffic from our mobile clients hits the Java Virtual Machine (JVM) stack, avoiding the Ruby stack altogether. <br><br> Of course, we still have plenty more to do. We’ll continue to measure and evaluate event-based traffic spikes, including their size and duration. We’ll continue studying the best ways to accommodate expected and unexpected traffic surges and high volume conversation during planned real-time events such as elections and championship games, as well as unplanned events such as natural disasters.<br><br> The bottom line: No matter when, where or how people use Twitter, we need to remain accessible 24/7, around the world. We’re hard at work delivering on that vision.<br><br> - Mazen Rawashdeh, VP of Infrastructure Operations Engineering (<a href=\"http://twitter.com/mazenra\">@mazenra</a>)</p>",
"date": "2012-11-07T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/bolstering-our-infrastructure",
"domain": "engineering"
},
{
"title": "Open Sourcing Clutch.IO",
"body": "<p>Clutch is an easy-to-integrate library for native iOS applications designed to help you develop faster, deploy instantly and run A/B tests. When Clutch co-founders Eric Florenzano (<a href=\"http://twitter.com/ericflo\">@ericflo</a>) and Eric Maguire (<a href=\"http://twitter.com/etmaguire\">@etmaguire</a>) recently joined the flock, they <a href=\"http://blog.clutch.io/post/29340796276/clutch-joins-the-flock\">promised</a> that everything you need to run Clutch on your own infrastructure would be available.<br><br></p> \n<div class=\"g-tweet\">\n <blockquote class=\"twitter-tweet\"> \n <p>We are incredibly excited to announce that Twitter has acquired the IP of Clutch.io and we start work there today! <a href=\"http://t.co/chQ1iatB\">blog.clutch.io/post/293407962…</a></p> — Clutch IO (\n <a href=\"http://twitter.com/clutchio\">@clutchio</a>) \n <a href=\"https://twitter.com/clutchio/status/235043048952840192\">August 13, 2012</a>\n </blockquote>",
"date": "2012-10-11T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/open-sourcing-clutchio",
"domain": "engineering"
},
{
"title": "Scalding 0.8.0 and Algebird",
"body": "<p>Earlier this year we open sourced <a href=\"https://github.com/twitter/scalding\">Scalding</a>, a Scala API for <a href=\"http://www.cascading.org/\">Cascading</a> that makes it easy to write big data jobs in a syntax that’s simple and concise. We use Scalding heavily — for everything from custom ad targeting algorithms to PageRank on the Twitter graph. Since open sourcing Scalding, we’ve been improving our documentation by adding a <a href=\"https://github.com/twitter/scalding/wiki/Getting-Started\">Getting Started</a> guide and a <a href=\"https://github.com/twitter/scalding/wiki/Rosetta-Code\">Rosetta Code</a> page that contains several MapReduce tasks translated from other frameworks (e.g., Pig and Hadoop Streaming) into Scalding.</p> \n<p>Today we are excited to tell you about the 0.8.0 release of Scalding.</p> \n<h3>What’s new</h3> \n<p>There are a lot of <a href=\"https://github.com/twitter/scalding/blob/develop/CHANGES.md\">new features</a>, for example, Scalding now includes a <a href=\"https://github.com/twitter/scalding/wiki/Matrix-API-Reference\">type-safe Matrix API</a>. The Matrix API makes expressing matrix sums, products, and simple algorithms like cosine similarity trivial. The <a href=\"https://github.com/twitter/scalding/wiki/Type-safe-api-reference\">type-safe Pipe API</a> has some new functions and a few bug fixes.</p> \n<p>In the familiar <a href=\"https://github.com/twitter/scalding/wiki/Fields-based-API-Reference\">Fields API</a>, we’ve added the ability to add type information to fields which allows scalding to pick up Ordering instances so that grouping on almost any scala collection becomes easy. There is now a function to estimate set size in groupBy: <a href=\"https://github.com/twitter/scalding/blob/develop/src/main/scala/com/twitter/scalding/ReduceOperations.scala#L62\">approxUniques</a> (a naive implementation requires two groupBys, but this function uses <a href=\"http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.142.9475\">HyperLogLog</a>). Since many aggregations are simple transformations of existing Monoids (associative operations with a zero), we added mapPlusMap to simplify implementation of many reducing operations (<a href=\"https://github.com/twitter/scalding/blob/develop/src/main/scala/com/twitter/scalding/ReduceOperations.scala#L50\">count how many functions are implemented in terms of mapPlusMap</a>).</p> \n<p>Cascading and scalding try to optimize your job to some degree, but in some cases for optimal performance, some hand-tuning is needed. This release adds three features to make that easier:</p> \n<ul>\n <li><a href=\"https://github.com/twitter/scalding/blob/develop/src/main/scala/com/twitter/scalding/RichPipe.scala#L253\">forceToDisk</a> forces a materialization and helps when you know the prior operation filters almost all data and should not be limited to just before a join or merge.</li> \n <li>Map-side aggregation in Cascading is done in memory with a threshold on when to spill and poor tuning can result in performance issues or out of memory errors. 
To help alleviate these issues, we now expose a function in groupBy to specify the <a href=\"https://github.com/twitter/scalding/blob/develop/src/main/scala/com/twitter/scalding/GroupBuilder.scala#L91\">spillThreshold.</a></li> \n <li>We make it easy for Scalding jobs to control the Hadoop configuration by allowing <a href=\"https://github.com/twitter/scalding/blob/develop/src/main/scala/com/twitter/scalding/Job.scala#L98\">overriding</a> of the config.</li> \n</ul>\n<h3>Algebird</h3> \n<p><a href=\"https://github.com/twitter/algebird\">Algebird</a> is our lightweight abstract algebra library for Scala and is targeted for building aggregation systems (such as <a href=\"https://github.com/nathanmarz/storm\">Storm</a>). It was originally developed as part of Scalding’s Matrix API, but almost all of the common reduce operations we care about in Scalding turn out to be instances of Monoids. This common library gives Map-merge, Set-union, List-concatenation, primitive-type algebra, and some fancy Monoids such as HyperLogLog for set cardinality estimation. Algebird has no dependencies and should be easy to use from any Scala project that is doing aggregation of data or data structures. For instance, in the Algebird repo, type “sbt console” and then:</p> \n<pre>scala&gt; import com.twitter.algebird.Operators._\nimport com.twitter.algebird.Operators._</pre> \n<pre>scala&gt; Map(1 -&gt; 3, 2 -&gt; 5, 3 -&gt; 7, 5 -&gt; 1) + Map(1 -&gt; 1, 2 -&gt; 1)\nres0: scala.collection.immutable.Map[Int,Int] = Map(1 -&gt; 4, 2 -&gt; 6, 3 -&gt; 7, 5 -&gt; 1)</pre> \n<pre>scala&gt; Set(1,2,3) + Set(3,4,5)\nres1: scala.collection.immutable.Set[Int] = Set(5, 1, 2, 3, 4)</pre> \n<pre>scala&gt; List(1,2,3) + List(3,4,5)\nres2: List[Int] = List(1, 2, 3, 3, 4, 5)</pre> \n<pre>scala&gt; Map(1 -&gt; 3, 2 -&gt; 4, 3 -&gt; 1) * Map(2 -&gt; 2)\nres3: scala.collection.immutable.Map[Int,Int] = Map(2 -&gt; 8)</pre> \n<pre>scala&gt; Map(1 -&gt; Set(2,3), 2 -&gt; Set(1)) + Map(2 -&gt; Set(2,3))\nres4: scala.collection.immutable.Map[Int,scala.collection.immutable.Set[Int]] = Map(1 -&gt; Set(2, 3), 2 -&gt; Set(1, 2, 3))</pre> \n<h3>Future work</h3> \n<p>We are thrilled to see industry recognition of Scalding; the project has received a <a href=\"http://www.infoworld.com/slideshow/65089/bossie-awards-2012-the-best-open-source-databases-202354#slide3\">Bossie Award</a> and there’s a community building around Scalding, with adopters like Etsy and eBay using it in production. In the near future, we are looking at adding optimized skew joins, refactoring the code base into smaller components and using LZO-compressed Thrift and Protobuf data. On the whole, we look forward to improving documentation and nurturing a community around Scalding as we approach a 1.0 release.</p> \n<p>If you’d like to help work on any features or have any bug fixes, we’re always looking for contributions. Just submit a pull request to say hello or reach out to us on the <a href=\"https://groups.google.com/forum/?fromgroups#!forum/cascading-user\">mailing list</a>. If you find something missing or broken, report it in the <a href=\"https://github.com/twitter/scalding/issues\">issue tracker</a>.</p> \n<h3>Acknowledgements</h3> \n<p>Scalding and Algebird are built by a community. 
We’d like to acknowledge the following folks who contributed to the project: Oscar Boykin (<a href=\"https://twitter.com/posco\">@posco</a>), Avi Bryant (<a href=\"https://twitter.com/avibryant\">@avibryant</a>), Edwin Chen (<a href=\"https://twitter.com/echen\">@echen</a>), Sam Ritchie (<a href=\"https://twitter.com/sritchie\">@sritchie</a>), Flavian Vasile (<a href=\"https://twitter.com/flavianv\">@flavianv</a>) and Argyris Zymnis (<a href=\"https://twitter.com/argyris\">@argyris</a>).</p> \n<p>Follow <a href=\"https://twitter.com/scalding\">@scalding</a> on Twitter to stay in touch!</p> \n<p>Posted by Chris Aniszczyk <a href=\"https://twitter.com/cra\">@cra</a><br> Manager, Open Source</p> \n<p></p>",
"date": "2012-09-24T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/scalding-080-and-algebird",
"domain": "engineering"
},
{
"title": "Joining the Linux Foundation",
"body": "<p>Today Twitter officially joins the Linux Foundation (<a href=\"https://twitter.com/linuxfoundation\">@linuxfoundation</a>), the nonprofit consortium dedicated to protecting and promoting the growth of the Linux operating system.</p> \n<div class=\"g-tweet\">\n <blockquote class=\"twitter-tweet\"> \n <p>happy 21st birthday to <a href=\"https://twitter.com/search/?q=%23linux\"><s>#</s>linux</a>, we’re proud to support the <a href=\"https://twitter.com/linuxfoundation\"><s>@</s>linuxfoundation</a> <a href=\"http://t.co/sJQxwdtF\">wired.com/thisdayintech/…</a></p> — Twitter Open Source (\n <a href=\"https://twitter.com/intent/user?screen_name=TwitterOSS\">@TwitterOSS</a>) \n <a href=\"https://twitter.com/TwitterOSS/status/239413692314300417\">August 25, 2012</a>\n </blockquote>",
"date": "2012-08-27T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/joining-the-linux-foundation",
"domain": "engineering"
},
{
"title": "How we spent our Summer of Code",
"body": "<p>For the first time, <a href=\"http://engineering.twitter.com/2012/05/summer-of-code-at-twitter.html\">Twitter participated</a> in the <a href=\"http://code.google.com/soc/\">Google Summer of Code</a> (GSoC) and we want to share news on the resulting open source activities. Unlike many GSoC participating organizations that focus on a single ecosystem, we have a variety of projects spanning multiple programming languages and communities.</p> \n<div class=\"g-tweet\"> \n <blockquote class=\"twitter-tweet\"> \n <p>it’s “pencils down” for <a href=\"https://twitter.com/gsoc\">@gsoc</a>, thank you so much to our mentors and student interns <a href=\"https://twitter.com/KL_7\">@KL_7</a> <a href=\"https://twitter.com/fbru02\">@fbru02</a> <a href=\"https://twitter.com/rubeydoo\">@rubeydoo</a> for hacking with us this summer</p> — Twitter Open Source (\n <a href=\"https://twitter.com/intent/user?screen_name=TwitterOSS\">@TwitterOSS</a>) \n <a href=\"https://twitter.com/TwitterOSS/statuses/238707212116185088\">August 23, 2012</a>\n </blockquote>",
"date": "2012-08-24T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/how-we-spent-our-summer-of-code",
"domain": "engineering"
},
{
"title": "Crowdsourced data analysis with Clockwork Raven",
"body": "<p>Today, we’re excited to open source <a href=\"http://twitter.github.com/clockworkraven/\">Clockwork Raven</a>, a web application that allows users to easily submit data to <a href=\"http://en.wikipedia.org/wiki/Amazon_Mechanical_Turk\">Mechanical Turk</a> for manual review and then analyze that data. Clockwork Raven steps in to do what algorithms cannot: it sends your data analysis tasks to real people and gets fast, cheap and accurate results. We use Clockwork Raven to gather tens of thousands of judgments from Mechanical Turk users every week.</p> \n<h3>Motivation</h3> \n<p>We’re huge fans of human evaluation at Twitter and how it can aid data analysis. In the past, we’ve used systems like Mechanical Turk and CrowdFlower, as well as an internal system where we train dedicated reviewers and have them come in to our offices. However, as we scale up our usage of human evaluation, we needed a better system. This is why we built Clockwork Raven and designed it with several important goals in mind:</p> \n<ul>\n <li>Requires little technical skill to use: The current Mechanical Turk web interface requires knowledge of HTML to do anything beyond very basic tasks.</li> \n <li>Uniquely suited for our needs: Many of our evaluations require us to embed tweets and timelines in the task. We wanted to create reusable components that would allow us to easily add these widgets to our tasks.</li> \n <li>Scalable: Manually training reviews doesn’t scale as well as a system that crowd sources the work through Mechanical Turk.</li> \n <li>Reliable: We wanted controls over who’s allowed to complete our evaluations, so we can ensure we’re getting top-notch results.</li> \n <li>Low barrier of entry: We wanted a tool that everyone in the company could use to launch evaluations.</li> \n <li>Integrated analysis: We wanted a tool that would analyze the data we gather, in addition to provide the option to export a JSON or CSV to import into tools like R or a simple spreadsheet.</li> \n</ul>\n<h3>Features</h3> \n<p>In Clockwork Raven, you create an evaluation by submitting a table of data (CSV or JSON). Each row of this table corresponds to a task that a human will complete. We build a template for the tasks in the Template Builder, then submit them to Mechanical Turk and Clockwork Raven tracks how many responses we’ve gotten. Once all the tasks are complete, we can import the results into Clockwork Raven where they’re presented in a configurable bar chart and can be exported to a number of data formats.</p> \n<p>Here’s the features we’ve built into Clockwork Raven to address the goals above:</p> \n<ul>\n <li>Clockwork Raven has a simple drag-and-drop builder not unlike the form builder in Google Docs. We can create headers and text sections, add multiple-choice and free-response questions, and insert data from a column in the uploaded data. <br><br><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/crowdsourced_dataanalysiswithclockworkraven95.thumb.1280.1280.png\" alt=\"Crowdsourced data analysis with Clockwork Raven\"></li> \n <li>The template builder has pre-built components for common items we need to put in our evaluations, like users and Tweets. It’s easy to build new components, so you can design your own. In the template builder, we can pass parameters (like the identifier of the Tweet we’re embedding) into the component. 
Here’s how we insert a tweet:<br><br><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/crowdsourced_dataanalysiswithclockworkraven96.thumb.1280.1280.png\" alt=\"Crowdsourced data analysis with Clockwork Raven\"></li> \n <li>Clockwork Raven submits jobs to Mechanical Turk. We can get back thousands of judgments in an hour or less. And because Mechanical Turk workers come from all over the world, we get results whenever we want them.</li> \n <li>Clockwork Raven allows you to manage a list of Trusted Workers. We’ve found that having a hand-picked list of workers is the best way to get great results. We can expand our pool by opening up our tasks beyond our hand-picked set and choosing workers who are doing a great job with our tasks.</li> \n <li>Clockwork Raven authenticates against any LDAP directory (or you can manage user accounts manually). That means that you can give a particular LDAP group at your organization access to Clockwork Raven, and they can log in with their own username and password. No shared accounts, and full accountability for who’s spending what. You can also give “unprivileged” access to some users, allowing them to try Clockwork Raven out and submit evaluations to the Mechanical Turk sandbox (which is free), but not allowing them to submit tasks that cost money without getting approval.</li> \n <li>Clockwork Raven has a built-in data analysis tool that lets you chart your results across multiple dimensions of data and view individual results:<br><br><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/crowdsourced_dataanalysiswithclockworkraven97.thumb.1280.1280.png\" alt=\"Crowdsourced data analysis with Clockwork Raven\"></li> \n</ul>\n<h3>Future Work</h3> \n<p>We’re actively developing Clockwork Raven and improving it over time. Our target for the next release is a comprehensive REST API that works with JSON (possibly Thrift as well). We’re hoping this will allow us to build Clockwork Raven into our workflows, as well as enable its use for real-time human evaluation. We’re also working on better ways of managing workers, by automatically managing the group of trusted workers through qualification tasks and automated analysis of untrusted users’ work.</p> \n<p>If you’d like to help work on these features, or have any bug fixes, other features, or documentation improvements, we’re always looking for contributions. Just submit a pull request to say hello or reach out to us on the <a href=\"https://groups.google.com/forum/?fromgroups#!forum/twitter-clockworkraven\">mailing list</a>. If you find something missing or broken, report it in the <a href=\"https://github.com/twitter/clockworkraven/issues\">issue tracker</a>.</p> \n<h3>Acknowledgements</h3> \n<p>Clockwork Raven was primarily authored by Ben Weissmann (<a href=\"https://twitter.com/benweissmann\">@benweissmann</a>). In addition, we’d like to acknowledge the following folks who contributed to the project: Edwin Chen (<a href=\"https://twitter.com/echen\">@echen</a>) and Dave Buchfuhrer (<a href=\"https://twitter.com/daveFNbuck\">@daveFNbuck</a>).</p> \n<p>Follow <a href=\"https://twitter.com/clockworkraven\">@clockworkraven</a> on Twitter to stay in touch!</p> \n<p>- Chris Aniszczyk, Manager of Open Source (<a href=\"https://twitter.com/cra\">@cra</a>)</p>",
"date": "2012-08-16T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/crowdsourced-data-analysis-with-clockwork-raven",
"domain": "engineering"
},
{
"title": "Visualizing Hadoop with HDFS-DU",
"body": "<p>We are a heavy adopter of <a href=\"https://github.com/twitter/hdfs-du\">Apache Hadoop</a> with a large set of data that resides in its clusters, so it’s important for us to understand how these resources are utilized. At our July <a href=\"http://engineering/2012/01/hack-week-twitter.html\">Hack Week</a>, we experimented with developing <a href=\"https://github.com/twitter/hdfs-du\">HDFS-DU</a> to provide us an interactive visualization of the underlying Hadoop Distributed File System (HDFS). The project aims to monitor different snapshots for the entire HDFS system in an interactive way, showing the size of the folders and the rate at which the size changes. It can also effectively identify efficient and inefficient file storage and highlight nodes in the file system where this is happening.</p> \n<p>HDFS-DU provides the following in a web user interface:</p> \n<ul>\n <li>A TreeMap visualization where each node is a folder in HDFS. The area of each node can be relative to the size or number of descendents</li> \n <li>A tree visualization showing the topology of the file system</li> \n</ul>\n<p>HDFS-DU is built using the following front-end technologies:</p> \n<ul>\n <li><a href=\"http://d3js.org/\">D3.js</a>: for tree visualization</li> \n <li><a href=\"http://thejit.org/\">JavaScript InfoVis Toolkit</a>: for TreeMap visualization</li> \n</ul>\n<h3>Details</h3> \n<p>Below is a screenshot of the HDFS-DU user interface (directory names scrubbed). The user interface is made up of two linked visualizations. The left visualization is a TreeMap and shows parent-child relationships through containment. The right visualization is a tree layout, which displays two levels of depth from the current selected node in the file system. The tree visualization displays extra information for each node on hover.</p> \n<p><a href=\"http://1.bp.blogspot.com/-4VPTuVaxVpk/UCFYwAIdpmI/AAAAAAAAAK8/72tbXz3Nd9I/s1600/1.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/visualizing_hadoopwithhdfs-du95.thumb.1280.1280.png\" alt=\"Visualizing Hadoop with HDFS-DU\"></a></p> \n<p>You can drill down on the TreeMap by clicking on a node, this would create the same effect as clicking on any tree node. There are two possible layouts for the TreeMap. The default one encodes file size in the area of each node. The second one encodes number of descendents in the area of each node. In the second view it’s interesting to spot nodes where storage is inefficient.</p> \n<p><a href=\"http://2.bp.blogspot.com/-qSpx6IiGlpc/UCFY-h3c13I/AAAAAAAAALI/eVHAfwuSKA4/s1600/2.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/visualizing_hadoopwithhdfs-du96.thumb.1280.1280.png\" alt=\"Visualizing Hadoop with HDFS-DU\"></a></p> \n<p></p> \n<h3>Future Work</h3> \n<p>This project was created at our July Hack Week and we still consider it beta but useful software. In the future, we would love to improve the front-end client and create a new back-end for a different runtime environment. On the front end, the directory browser, currently on the right, is poorly suited to the task of showing the directory structure. A view which looks more like a traditional filesystem browser would be more immediately recognizable and make better use of space (it is likely that a javascript file browser exists and could be used instead). 
Also, the integration between the current file browser and the TreeMap needs improvement.</p> \n<p>We initially envisioned the TreeMap as a <a href=\"http://en.wikipedia.org/wiki/Voronoi_diagram\">Voronoi TreeMap</a>; however, our current implementation of that code ran too slowly to be practical. We would love to get the Voronoi TreeMap code to work fast enough. We would also like to add the option to use different values to size and color the TreeMap areas: for example, change in size, creation time, last access time, or frequency of access.</p> \n<h3>Acknowledgements</h3> \n<p>HDFS-DU was primarily authored by Travis Crawford (<a href=\"https://twitter.com/tc/\">@tc</a>), Nicolas Garcia Belmonte (<a href=\"https://twitter.com/philogb\">@philogb</a>) and Robert Harris (<a href=\"https://twitter.com/trebor\">@trebor</a>). Given that this is a young project, we always appreciate bug fixes, features and documentation improvements. Feel free to fork the project and send us a pull request on GitHub to say hello. Finally, if you’re interested in visualization and distributed file systems like Hadoop, we’re always looking for engineers to <a href=\"https://twitter.com/jobs\">join the flock.</a></p> \n<p>Follow <a href=\"https://twitter.com/hdfsdu\">@hdfsdu</a> on Twitter to stay in touch!</p> \n<p>- Chris Aniszczyk, Manager of Open Source (<a href=\"https://twitter.com/cra\">@cra</a>)</p>",
"date": "2012-08-07T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/visualizing-hadoop-with-hdfs-du",
"domain": "engineering"
},
{
"title": "Trident: a high-level abstraction for realtime computation",
"body": "<p>Trident is a new high-level abstraction for doing realtime computing on top of <a href=\"http://storm-project.net/\">Twitter Storm</a>, available in <a href=\"http://storm-project.net/downloads.html\">Storm 0.8.0</a> (released today). It allows you to seamlessly mix high throughput (millions of messages per second), stateful stream processing with low latency distributed querying. If you’re familiar with high level batch processing tools like <a href=\"http://pig.apache.org/\">Pig</a> or <a href=\"http://www.cascading.org/\">Cascading</a>, the concepts of Trident will be very familiar - Trident has joins, aggregations, grouping, functions, and filters. In addition to these, Trident adds primitives for doing stateful, incremental processing on top of any database or persistence store. Trident has consistent, exactly-once semantics, so it is easy to reason about Trident topologies.</p> \n<p>We’re really excited about Trident and believe it is a major step forward in Big Data processing. It builds upon Storm’s foundation to make realtime computation as easy as batch computation.</p> \n<h2>Example</h2> \n<p>Let’s look at an illustrative example of Trident. This example will do two things:</p> \n<ol>\n <li> <p>Compute streaming word count from an input stream of sentences</p> </li> \n <li> <p>Implement queries to get the sum of the counts for a list of words</p> </li> \n</ol>\n<p>For the purposes of illustration, this example will read an infinite stream of sentences from the following source:</p> \n<p></p> \n<p>This spout cycles through that set of sentences over and over to produce the sentence stream. Here’s the code to do the streaming word count part of the computation:</p> \n<p></p> \n<p>Let’s go through the code line by line. First a TridentTopology object is created, which exposes the interface for constructing Trident computations. TridentTopology has a method called newStream that creates a new stream of data in the topology reading from an input source. In this case, the input source is just the FixedBatchSpout defined from before. Input sources can also be queue brokers like Kestrel or Kafka. Trident keeps track of a small amount of state for each input source (metadata about what it has consumed) in Zookeeper, and the “spout1” string here specifies the node in Zookeeper where Trident should keep that metadata.</p> \n<p>Trident processes the stream as small batches of tuples. For example, the incoming stream of sentences might be divided into batches like so:</p> \n<p><a href=\"http://4.bp.blogspot.com/-dvyMvMMVyQg/UBqwN1OQnoI/AAAAAAAAAKQ/uaYoI0UH9fU/s1600/batched-stream.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/trident_a_high-levelabstractionforrealtimecomputation95.thumb.1280.1280.png\" alt=\"Trident: a high-level abstraction for realtime computation\"></a></p> \n<p>Generally the size of those small batches will be on the order of thousands or millions of tuples, depending on your incoming throughput.</p> \n<p>Trident provides a fully fledged batch processing API to process those small batches. The API is very similar to what you see in high level abstractions for Hadoop like Pig or Cascading: you can do group by’s, joins, aggregations, run functions, run filters, and so on. 
Of course, processing each small batch in isolation isn’t that interesting, so Trident provides functions for doing aggregations across batches and persistently storing those aggregations - whether in memory, in Memcached, in Cassandra, or some other store. Finally, Trident has first-class functions for querying sources of realtime state. That state could be updated by Trident (like in this example), or it could be an independent source of state.</p> \n<p>Back to the example, the spout emits a stream containing one field called “sentence”. The next line of the topology definition applies the Split function to each tuple in the stream, taking the “sentence” field and splitting it into words. Each sentence tuple creates potentially many word tuples - for instance, the sentence “the cow jumped over the moon” creates six “word” tuples. Here’s the definition of Split:</p> \n<p></p> \n<p>As you can see, it’s really simple. It simply grabs the sentence, splits it on whitespace, and emits a tuple for each word.</p> \n<p>The rest of the topology computes word count and keeps the results persistently stored. First the stream is grouped by the “word” field. Then, each group is persistently aggregated using the Count aggregator. The persistentAggregate function knows how to store and update the results of the aggregation in a source of state. In this example, the word counts are kept in memory, but this can be trivially swapped to use Memcached, Cassandra, or any other persistent store. Swapping this topology to store counts in Memcached is as simple as replacing the persistentAggregate line with this (using <a href=\"https://github.com/nathanmarz/trident-memcached\">trident-memcached</a>), where the “serverLocations” variable is a list of host/ports for the Memcached cluster:</p> \n<p></p> \n<p>The values stored by persistentAggregate represent the aggregation of all batches ever emitted by the stream.</p> \n<p>One of the cool things about Trident is that it has fully fault-tolerant, exactly-once processing semantics. This makes it easy to reason about your realtime processing. Trident persists state in such a way that if failures occur and retries are necessary, it won’t perform multiple updates to the database for the same source data.</p> \n<p>The persistentAggregate method transforms a Stream into a TridentState object. In this case the TridentState object represents all the word counts. We will use this TridentState object to implement the distributed query portion of the computation.</p> \n<p>The next part of the topology implements a low latency distributed query on the word counts. The query takes as input a whitespace-separated list of words and returns the sum of the counts for those words. These queries are executed just like normal RPC calls, except they are parallelized in the background. Here’s an example of how you might invoke one of these queries:</p> \n<p></p> \n<p>As you can see, it looks just like a regular remote procedure call (RPC), except it’s executing in parallel across a Storm cluster. The latency for small queries like this is typically around 10ms. More intense DRPC queries can take longer of course, although the latency largely depends on how many resources you have allocated for the computation.</p> \n<p>The implementation of the distributed query portion of the topology looks like this:</p> \n<p></p> \n<p>The same TridentTopology object is used to create the DRPC stream, and the function is named “words”. 
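<p>As a sketch, again following the public Trident tutorial examples, the distributed-query portion and a client invocation look roughly like this (the server location and port here are illustrative):</p> \n<pre>// A sketch of the DRPC portion of the topology\ntopology.newDRPCStream(\"words\")\n    .each(new Fields(\"args\"), new Split(), new Fields(\"word\"))\n    .groupBy(new Fields(\"word\"))\n    .stateQuery(wordCounts, new Fields(\"word\"), new MapGet(), new Fields(\"count\"))\n    .each(new Fields(\"count\"), new FilterNull())\n    .aggregate(new Fields(\"count\"), new Sum(), new Fields(\"sum\"));\n\n// A sketch of invoking the query from a client\nDRPCClient client = new DRPCClient(\"drpc.server.location\", 3772);\nSystem.out.println(client.execute(\"words\", \"cat dog the man\"));</pre> 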
The function name corresponds to the function name given in the first argument of execute when using a DRPCClient.</p> \n<p>Each DRPC request is treated as its own little batch processing job that takes as input a single tuple representing the request. The tuple contains one field called “args” that contains the argument provided by the client. In this case, the argument is a whitespace-separated list of words.</p> \n<p>First, the Split function is used to split the arguments for the request into its constituent words. The stream is grouped by “word”, and the stateQuery operator is used to query the TridentState object that the first part of the topology generated. stateQuery takes in a source of state - in this case, the word counts computed by the other portion of the topology - and a function for querying that state. In this case, the MapGet function is invoked, which gets the count for each word. Since the DRPC stream is grouped the exact same way as the TridentState was (by the “word” field), each word query is routed to the exact partition of the TridentState object that manages updates for that word.</p> \n<p>Next, words that didn’t have a count are filtered out via the FilterNull filter and the counts are summed using the Sum aggregator to get the result. Then, Trident automatically sends the result back to the waiting client.</p> \n<p>Trident is intelligent about how it executes a topology to maximize performance. There are two interesting things happening automatically in this topology:</p> \n<ol>\n <li> <p>Operations that read from or write to state (like persistentAggregate and stateQuery) automatically batch operations to that state. So if there are 20 updates that need to be made to the database for the current batch of processing, rather than do 20 read requests and 20 write requests to the database, Trident will automatically batch up the reads and writes, doing only 1 read request and 1 write request (and in many cases, you can use caching in your State implementation to eliminate the read request). So you get the best of both worlds: convenience - being able to express your computation in terms of what should be done with each tuple - and performance.</p> </li> \n <li> <p>Trident aggregators are heavily optimized. Rather than transfer all tuples for a group to the same machine and then run the aggregator, Trident will do partial aggregations when possible before sending tuples over the network. For example, the Count aggregator computes the count on each partition, sends the partial count over the network, and then sums together all the partial counts to get the total count. This technique is similar to the use of combiners in MapReduce.</p> </li> \n</ol>\n<p>Let’s look at another example of Trident.</p> \n<h2>Reach</h2> \n<p>The next example is a pure DRPC topology that computes the reach of a URL on Twitter on demand. Reach is the number of unique people exposed to a URL on Twitter. To compute reach, you need to fetch all the people who ever tweeted a URL, fetch all the followers of all those people, unique that set of followers, and then count that uniqued set. Computing reach is too intense for a single machine - it can require thousands of database calls and tens of millions of tuples. With Storm and Trident, it’s easy to parallelize the computation of each step across a cluster.</p> \n<p>This topology will read from two sources of state. One database maps URLs to a list of people who tweeted that URL. 
The other database maps a person to a list of followers for that person. The topology definition looks like this:</p> \n<p></p> \n<p>The topology creates TridentState objects representing each external database using the newStaticState method. These can then be queried in the topology. Like all sources of state, queries to these databases will be automatically batched for maximum efficiency.</p> \n<p>The topology definition is straightforward - it’s just a simple batch processing job. First, the urlToTweeters database is queried to get the list of people who tweeted the URL for this request. That returns a list, so the ExpandList function is invoked to create a tuple for each tweeter.</p> \n<p>Next, the followers for each tweeter must be fetched. It’s important that this step be parallelized, so shuffle is invoked to evenly distribute the tweeters among all workers for the topology. Then, the followers database is queried to get the list of followers for each tweeter. You can see that this portion of the topology is given a large parallelism since this is the most intense portion of the computation.</p> \n<p>Next, the set of followers is uniqued and counted. This is done in two steps. First a “group by” is done on the batch by “follower”, running the “One” aggregator on each group. The “One” aggregator simply emits a single tuple containing the number one for each group. Then, the ones are summed together to get the unique count of the followers set. Here’s the definition of the “One” aggregator:</p> \n<p></p> \n<p>This is a “combiner aggregator”, which knows how to do partial aggregations before transferring tuples over the network to maximize efficiency. Sum is also defined as a combiner aggregator, so the global sum done at the end of the topology will be very efficient.</p> \n<p>Let’s now look at Trident in more detail.</p> \n<h2>Fields and tuples</h2> \n<p>The Trident data model is the TridentTuple which is a named list of values. During a topology, tuples are incrementally built up through a sequence of operations. Operations generally take in a set of input fields and emit a set of “function fields”. The input fields are used to select a subset of the tuple as input to the operation, while the “function fields” name the fields emitted by the operation.</p> \n<p>Consider this example. Suppose you have a stream called “stream” that contains the fields “x”, “y”, and “z”. To run a filter MyFilter that takes in “y” as input, you would say:</p> \n<p></p> \n<p>Suppose the implementation of MyFilter is this:</p> \n<p></p> \n<p>This will keep all tuples whose “y” field is less than 10. The TridentTuple given as input to MyFilter will only contain the “y” field. Note that Trident is able to project a subset of a tuple extremely efficiently when selecting the input fields: the projection is essentially free.</p> \n<p>Let’s now look at how “function fields” work. Suppose you had this function:</p> \n<p></p> \n<p>This function takes two numbers as input and emits two new values: the addition of the numbers and the multiplication of the numbers. Suppose you had a stream with the fields “x”, “y”, and “z”. You would use this function like this:</p> \n<p></p> \n<p>The output of functions is additive: the fields are added to the input tuple. So the output of this each call would contain tuples with the five fields “x”, “y”, “z”, “added”, and “multiplied”. 
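<p>For concreteness, sketches of MyFilter and AddAndMultiply, reconstructed from the descriptions above, might look like this:</p> \n<pre>// Keeps tuples whose single input field (\"y\") is less than 10\npublic class MyFilter extends BaseFilter {\n    public boolean isKeep(TridentTuple tuple) {\n        return tuple.getInteger(0) &lt; 10;\n    }\n}\n\n// Emits the sum and the product of its two input fields\npublic class AddAndMultiply extends BaseFunction {\n    public void execute(TridentTuple tuple, TridentCollector collector) {\n        int i1 = tuple.getInteger(0);\n        int i2 = tuple.getInteger(1);\n        collector.emit(new Values(i1 + i2, i1 * i2));\n    }\n}\n\n// Usage on a stream with fields \"x\", \"y\", \"z\"\nstream.each(new Fields(\"x\", \"y\"), new AddAndMultiply(), new Fields(\"added\", \"multiplied\"));</pre> 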
“added” corresponds to the first value emitted by AddAndMultiply, while “multiplied” corresponds to the second value.</p> \n<p>With aggregators, on the other hand, the function fields replace the input tuples. So if you had a stream containing the fields “val1” and “val2”, and you did this:</p> \n<p></p> \n<p>The output stream would only contain a single tuple with a single field called “sum”, representing the sum of all “val2” fields in that batch.</p> \n<p>With grouped streams, the output will contain the grouping fields followed by the fields emitted by the aggregator. For example:</p> \n<p></p> \n<p>In this example, the output will contain the fields “val1” and “sum”.</p> \n<h2>State</h2> \n<p>A key problem to solve with realtime computation is how to manage state so that updates are idempotent in the face of failures and retries. It’s impossible to eliminate failures, so when a node dies or something else goes wrong, batches need to be retried. The question is - how do you do state updates (whether external databases or state internal to the topology) so that it’s like each message was processed only once?</p> \n<p>This is a tricky problem, and can be illustrated with the following example. Suppose that you’re doing a count aggregation of your stream and want to store the running count in a database. If you store only the count in the database and it’s time to apply a state update for a batch, there’s no way to know if you applied that state update before. The batch could have been attempted before, succeeded in updating the database, and then failed at a later step. Or the batch could have been attempted before and failed to update the database. You just don’t know.</p> \n<p>Trident solves this problem by doing two things:</p> \n<ol>\n <li> <p>Each batch is given a unique id called the “transaction id”. If a batch is retried it will have the exact same transaction id.</p> </li> \n <li> <p>State updates are ordered among batches. That is, the state updates for batch 3 won’t be applied until the state updates for batch 2 have succeeded.</p> </li> \n</ol>\n<p>With these two primitives, you can achieve exactly-once semantics with your state updates. Rather than store just the count in the database, what you can do instead is store the transaction id with the count in the database as an atomic value. Then, when updating the count, you can just compare the transaction id in the database with the transaction id for the current batch. If they’re the same, you skip the update - because of the strong ordering, you know for sure that the value in the database incorporates the current batch. If they’re different, you increment the count.</p> \n<p>Of course, you don’t have to do this logic manually in your topologies. This logic is wrapped by the State abstraction and done automatically. Nor is your State object required to implement the transaction id trick: if you don’t want to pay the cost of storing the transaction id in the database, you don’t have to. In that case the State will have at-least-once-processing semantics in the case of failures (which may be fine for your application). You can read more about how to implement a State and the various fault-tolerance tradeoffs possible <a href=\"https://github.com/nathanmarz/storm/wiki/Trident-state\">in this doc</a>.</p> \n<p>A State is allowed to use whatever strategy it wants to store state. So it could store state in an external database or it could keep the state in-memory but backed by HDFS (like how HBase works). 
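<p>To make the transaction id technique described above concrete, here is a minimal sketch, with a hypothetical in-memory map standing in for the database; this is not Trident’s actual State API, which wraps this logic for you:</p> \n<pre>// Sketch: exactly-once counter updates via the transaction id trick\nimport java.util.HashMap;\nimport java.util.Map;\n\nclass TransactionalCounter {\n    // The transaction id is stored atomically alongside the count\n    static class Stored {\n        final long txid;\n        final long count;\n        Stored(long txid, long count) { this.txid = txid; this.count = count; }\n    }\n\n    private final Map&lt;String, Stored&gt; db = new HashMap&lt;&gt;();  // stand-in for a real store\n\n    void applyBatch(String key, long batchTxid, long delta) {\n        Stored current = db.get(key);\n        if (current != null &amp;&amp; current.txid == batchTxid) {\n            return;  // this batch was already applied; skip the update\n        }\n        long base = (current == null) ? 0 : current.count;\n        db.put(key, new Stored(batchTxid, base + delta));  // must be an atomic write\n    }\n}</pre> 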
States are not required to hold onto state forever. For example, you could have an in-memory State implementation that only keeps the last X hours of data available and drops anything older. Take a look at the implementation of the <a href=\"https://github.com/nathanmarz/trident-memcached\">Memcached integration</a> for an example State implementation.</p> \n<h2>Execution of Trident topologies</h2> \n<p>Trident topologies compile down into as efficient a Storm topology as possible. Tuples are only sent over the network when a repartitioning of the data is required, such as when you do a groupBy or a shuffle. So if you had this Trident topology:</p> \n<p><a href=\"http://4.bp.blogspot.com/-LZD5mUEkC2s/UBqwZbpKT9I/AAAAAAAAAKc/-2eKlRrAt_Y/s1600/trident-to-storm1.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/trident_a_high-levelabstractionforrealtimecomputation96.thumb.1280.1280.png\" alt=\"Trident: a high-level abstraction for realtime computation\"></a></p> \n<p>It would compile into Storm spouts/bolts like this:</p> \n<p><a href=\"http://1.bp.blogspot.com/-WpJ9YiaCn7c/UBqwpnK6cpI/AAAAAAAAAKo/2tV2bXfgzEE/s1600/trident-to-storm2.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/trident_a_high-levelabstractionforrealtimecomputation97.thumb.1280.1280.png\" alt=\"Trident: a high-level abstraction for realtime computation\"></a></p> \n<p>As you can see, Trident colocates operations within the same bolt as much as possible.</p> \n<h2>Conclusion</h2> \n<p>Trident makes realtime computation elegant. You’ve seen how high throughput stream processing, state manipulation, and low-latency querying can be seamlessly intermixed via Trident’s API. Trident lets you express your realtime computations in a natural way while still getting maximal performance. To get started with Trident, take a look at these <a href=\"https://github.com/nathanmarz/storm-starter/tree/master/src/jvm/storm/starter/trident\">sample Trident topologies</a> and the <a href=\"https://github.com/nathanmarz/storm/wiki/Documentation\">Trident documentation</a>.</p> \n<p>- Nathan Marz, Software Engineer (<a href=\"https://twitter.com/nathanmarz\">@nathanmarz</a>)</p>",
"date": "2012-08-02T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/trident-a-high-level-abstraction-for-realtime-computation",
"domain": "engineering"
},
{
"title": "TwitterCLDR: Improving Internationalization Support in Ruby",
"body": "<p>We recently open sourced <a href=\"https://github.com/twitter/twitter-cldr-rb\">TwitterCLDR</a> under the Apache Public License 2.0. TwitterCLDR is an <a href=\"http://site.icu-project.org\">“ICU level”</a> internationalization library for Ruby that supports dates, times, numbers, currencies, world languages, sorting, text normalization, time spans, plurals, and unicode code point data. By sharing our code with the community we hope to collaborate together and improve internationalization support for websites all over the world. If your company is considering supporting multiple languages, then you can try TwitterCLDR to help your internationalization efforts.</p> \n<h3>Motivation</h3> \n<p>Here’s a test. Say this date out loud: 2/1/2012</p> \n<p>If you said, “February first, 2012”, you’re probably an American. If you said, “January second, 2012”, you’re probably of European or possibly Asian descent. If you said, “January 12, 1902”, you’re probably a computer. The point is that as humans, we almost never think about formatting dates, plurals, lists, and the like. If you’re creating a platform available around the world, however, these kinds of minutiae make a big difference to users.</p> \n<p>The <a href=\"http://www.unicode.org/consortium/consort.html\">Unicode Consortium</a> publishes and maintains a bunch of data regarding formatting dates, numbers, lists, and more, called the <a href=\"http://cldr.unicode.org\">Common Locale Data Repository (CLDR)</a>. IBM maintains International Components for Unicode (ICU), a library that uses the Unicode Consortium’s data to make it easier for programmers to use. However, this library is targeted at Java and C/C++ developers and not Ruby programmers, which is one of the programming languages used at Twitter. For example, Ruby and TwitterCLDR helps power our <a href=\"http://translate.twttr.com/welcome\">Translation Center</a>. TwitterCLDR provides a way to use the same CLDR data that Java uses, but in a Ruby environment. Hence, formatting dates, times, numbers, currencies and plurals should now be much easier for the typical Rubyist. Let’s go over some real world examples.</p> \n<h2>Example Code</h2> \n<p>Dates, Numbers, and Currencies</p> \n<p>Let’s format a date in Spanish (es):</p> \n<pre>$&gt; DateTime.now.localize(:es).to_full_s\n$&gt; \"lunes, 12 de diciembre de 2011 21:44:57 UTC -0800\"</pre> \n<p>Too long? Make it shorter:</p> \n<pre>$&gt; DateTime.now.localize(:es).to_short_s\n$&gt; \"12/12/11 21:44\" </pre> \n<p>Built in support for relative times lets you do this:</p> \n<pre>$&gt; (DateTime.now - 1).localize(:en).ago.to_s\n$&gt; \"1 day ago\"\n$&gt; (DateTime.now + 1).localize(:en).until.to_s\n$&gt; \"In 1 day\"</pre> \n<p>Number formatting is easy:</p> \n<pre>$&gt; 1337.localize(:en).to_s\n$&gt; \"1,337\"\n$&gt; 1337.localize(:fr).to_s\n$&gt; \"1 337\"</pre> \n<p>We’ve got you covered for currencies and decimals too:</p> \n<pre>$&gt; 1337.localize(:es).to_currency.to_s(:currency =&gt; \"EUR\")\n$&gt; \"1.337,00 €\"\n$&gt; 1337.localize(:es).to_decimal.to_s(:precision =&gt; 3)\n$&gt; \"1.337,000\"</pre> \n<p>Currency data? 
Absolutely:</p> \n<pre>$&gt; TwitterCldr::Shared::Currencies.for_country(\"Canada\")\n$&gt; { :currency =&gt; \"Dollar\", :symbol =&gt; \"$\", :code =&gt; \"CAD\" }</pre> \n<p>Plurals</p> \n<p>Get the plural rule for a number:</p> \n<pre>$&gt; TwitterCldr::Formatters::Plurals::Rules.rule_for(1, :ru)\n$&gt; :one\n$&gt; TwitterCldr::Formatters::Plurals::Rules.rule_for(3, :ru)\n$&gt; :few\n$&gt; TwitterCldr::Formatters::Plurals::Rules.rule_for(10, :ru)\n$&gt; :many</pre> \n<p>Embed plurals right in your translatable phrases using JSON syntax:</p> \n<pre>$&gt; str = 'there %&lt;{ \"horse_count\": { \"one\": \"is one horse\", \"other\": \"are %{horse_count} horses\" } }&gt; in the barn'\n$&gt; str.localize % { :horse_count =&gt; 3 }\n$&gt; \"there are 3 horses in the barn\"</pre> \n<p>Unicode Data</p> \n<p>Get attributes for any Unicode code point:</p> \n<pre>$&gt; code_point = TwitterCldr::Shared::CodePoint.for_hex(\"1F3E9\")\n$&gt; code_point.name\n$&gt; \"LOVE HOTEL\"\n$&gt; code_point.category\n$&gt; \"So\"</pre> \n<p>Normalize strings using Unicode’s standard algorithms (NFD, NFKD, NFC, or NFKC):</p> \n<pre>$&gt; \"español\".localize.code_points\n$&gt; [\"0065\", \"0073\", \"0070\", \"0061\", \"00F1\", \"006F\", \"006C\"]\n$&gt; \"español\".localize.normalize(:using =&gt; :NFKD).code_points\n$&gt; [\"0065\", \"0073\", \"0070\", \"0061\", \"006E\", \"0303\", \"006F\", \"006C\"]</pre> \n<p>Sorting (Collation)</p> \n<p>TwitterCLDR includes a pure Ruby, from-scratch implementation of the <a href=\"http://unicode.org/reports/tr10/\">Unicode Collation Algorithm</a> (with tailoring) that enables locale-aware sorting capabilities.</p> \n<p>Alphabetize a list using regular Ruby sort:</p> \n<pre>$&gt; [\"Art\", \"Wasa\", \"Älg\", \"Ved\"].sort\n$&gt; [\"Art\", \"Ved\", \"Wasa\", \"Älg\"]</pre> \n<p>Alphabetize a list using TwitterCLDR’s locale-aware sort:</p> \n<pre>$&gt; [\"Art\", \"Wasa\", \"Älg\", \"Ved\"].localize(:de).sort.to_a\n$&gt; [\"Älg\", \"Art\", \"Ved\", \"Wasa\"]</pre> \n<p>NOTE: Most of these methods can be customized to your liking.</p> \n<h3>JavaScript Support</h3> \n<p>What good is all this internationalization support in Ruby if I can’t expect the same output on the client side too? To bridge the gap between the client and server sides, TwitterCLDR also contains a JavaScript implementation (known as twitter-cldr-js) whose compiled files are maintained in a <a href=\"https://github.com/twitter/twitter-cldr-js\">separate GitHub repo</a>. At the moment, twitter-cldr-js supports dates, times, relative times, and plural rules. We’re working on expanding its capabilities, so stay tuned.</p> \n<h3>Future Work</h3> \n<p>In the future, we hope to add even more internationalization capabilities to TwitterCLDR, including Rails integration, phone number and postal code validation, support for Unicode characters in Ruby 1.8 strings and regular expressions, and the ability to translate timezone names via the TZInfo gem and ActiveSupport. We would love to have the community use TwitterCLDR and help us improve the code to reach everyone in the world.</p> \n<h3>Acknowledgements</h3> \n<p>TwitterCLDR was primarily authored by Cameron Dutro (<a href=\"https://twitter.com/camertron\">@camertron</a>). 
In addition, we’d like to acknowledge the following folks who contributed to the project either directly or indirectly: Kirill Lashuk (<a href=\"https://twitter.com/kl_7\">@kl_7</a>), Nico Sallembien (<a href=\"https://twitter.com/nsallembien\">@nsallembien</a>), Sumit Shah (<a href=\"https://twitter.com/omnidactyl\">@omnidactyl</a>), Katsuya Noguchi (<a href=\"https://twitter.com/kn\">@kn</a>), Timothy Andrew (<a href=\"https://twitter.com/timothyandrew\">@timothyandrew</a>) and Kristian Freeman (<a href=\"https://twitter.com/imkmf\">@imkmf</a>).</p> \n<p><br> - Chris Aniszczyk, Manager of Open Source (<a href=\"https://twitter.com/cra\">@cra</a>)</p>",
"date": "2012-08-01T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/twittercldr-improving-internationalization-support-in-ruby",
"domain": "engineering"
},
{
"title": "Caching with Twemcache",
"body": "<p>Update - July 11, 2012, 9:45am<br><br>We want to correct an error regarding the slab calcification problem we mentioned in the original post. This problem only applied to our v1.4.4 fork of Memcached; this correction is reflected below. The recent Memcached version has addressed some of these problems. <br><br> We built <a href=\"http://github.com/twitter/twemcache\">Twemcache</a> because we needed a more robust and manageable version of Memcached, suitable for our large-scale production environment. Today, we are open-sourcing Twemcache under the New BSD license. As one of the largest adopters of <a href=\"http://memcached.org/\">Memcached</a>, a popular open source caching system, we have used Memcached over the years to help us scale our ever-growing traffic. Today, we have hundreds of dedicated cache servers keeping over 20TB of data from over 30 services in-memory, including crucial data such as user information and Tweets. Collectively these servers handle almost 2 trillion queries on any given day (that’s more than 23 million queries per second). As we continued to grow, we needed a more robust and manageable version of Memcached suitable for our large scale production environment. <br><br> We have been running Twemcache in production for more than a year and a half. Twemcache is based on a fork of Memcached v1.4.4 that is heavily modified to improve maintainability and help us monitor our cache servers better. We improved performance, removed code that we didn’t find necessary, refactored large source files and added observability related features. The following sections will provide more details on why we did this and what those new features are.<br><br>Motivation<br><br> Almost all of our cache use cases fall into two categories:<br><br></p> \n<ul>\n <li>as an optimization for disk where cache is used as the in-memory serving layer to shed load from databases.</li> \n <li>as an optimization for cpu where cache is used as a buffer to store items that are expensive to recompute.</li> \n</ul>\n<p><br> An example of these two optimizations is “caching of Tweets”. All Tweets are persisted to disk when they are created, but most Tweets requested by users need to be served out of memory for performance reasons. We use Twemcache to store recent and frequently accessed Tweets, as an optimization for disk. When a Tweet shows up in a particular client, it takes a particular presentation - rendered Tweet - which has other metadata like number of retweets, favorites etc. We also use Twemcache to store the recently rendered Tweets, as an optimization for cpu. <br><br> To effectively address the use cases mentioned above, it’s extremely important that caches are always available and have predictable performance with respect to item hit rate even when operating at full capacity. Caches should also be able to adapt to changing item sizes on-the-fly as application data size grows or shrinks over time. Finally, it is critical to have observability into caches to monitor the health and effectiveness of our cache clusters. It turns out that all these problems are interrelated because adapting to changing item sizes usually requires a cache reconfiguration — which impacts availability and predictability. Twemcache tries to address these needs with the help of the following features:<br><br>Random Eviction<br><br> The v1.4.4 implementation of Memcached, which Twemcache is based on, suffers from a problem we call slab calcification. 
In Memcached, a slab can only store items of a given maximum size, and once a slab has been allocated to a slab class, it cannot be reassigned to another slab class. In other words, slabs once allocated are locked to their respective slab classes. This is the crux of the slab calcification problem. When items grow or shrink in size, new slabs must be allocated to store them. Over time, when caches reach full memory capacity, to store newer items we must rely on evicting existing items in the same slab class. If the newer items are of a size with no slabs allocated, write requests may fail completely. Meanwhile, slabs allocated to a different slab class may sit idle. Slab calcification leads to loss of capacity and efficiency. <br><br> To solve this problem without resorting to periodically restarting the server instances, we introduced a new eviction strategy called random eviction. In this strategy, when a new item needs to be inserted and it cannot be accommodated by the space occupied by an expired item or the available free memory, we’ll simply pick a random slab from the list of all allocated slabs, evict all items within that slab, and reallocate it to the slab class that fits the new item. <br><br> It turns out that this feature is quite powerful for two reasons:<br><br></p> \n<ul>\n <li>Cache servers can now gracefully move on-the-fly from one slab size to another for a given application. This enables our cache servers to adapt to changing item sizes and have a predictable long-term hit rate by caching an application’s active working set of items.</li> \n <li>Application developers don’t have to worry about reconfiguring their cache server when they add or delete fields from their cache item structures or if their item size grows over time.</li> \n</ul>\n<p><br> By providing a stable hit rate, random eviction prevents performance degradation due to data pattern change and system instability associated with restarts. The <a href=\"http://www.youtube.com/watch?v=EtROv2or8SE&amp;feature=youtu.be&amp;hd=1\">video</a> below illustrates how over time Twemcache is able to adapt to a shifting size pattern and still remain effective.<br><br></p>\n<div class=\"video video-youtube\">\n <iframe width=\"100%\" src=\"https://www.youtube.com/embed/EtROv2or8SE\" frameborder=\"0\" webkitallowfullscreen mozallowfullscreen allowfullscreen></iframe>",
"date": "2012-07-10T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/caching-with-twemcache",
"domain": "engineering"
},
{
"title": "Building and profiling high performance systems with Iago",
"body": "<p><a href=\"http://twitter.github.com/iago\">Iago</a> is a load generator that we created to help us test services before they encounter production traffic. While there are many load generators available in the open source and commercial software worlds, Iago provides us with capabilities that are uniquely suited for Twitter’s environment and the precise degree to which we need to test our services. <br><br> There are three main properties that make Iago a good fit for Twitter:<br><br></p> \n<ul>\n <li>High performance: In order to reach the highest levels of performance, your load generator must be equally performant. It must generate traffic in a very precise and predictable way to minimize variance between test runs and allow comparisons to be made between development iterations. Additionally, testing systems to failure is an important part of capacity planning, and it requires you to generate load significantly in excess of expected production traffic.</li> \n <li>Multi-protocol: Modelling a system as complex as Twitter can be difficult, but it’s made easier by decomposing it into component services. Once decomposed, each piece can be tested in isolation; this requires your load generator to speak each service’s protocol. Twitter has in excess of 100 such services, and Iago can and has tested most of them due to its built-in support for the protocols we use, including HTTP, Thrift and several others.</li> \n <li>Extensible: Iago is designed first and foremost for engineers. It assumes that the person building the system will also be interested in validating its performance and will know best how to do so. As such, it’s designed from the ground up to be extensible – making it easy to generate new traffic types, over new protocols and with individualized traffic sources. It is also provides sensible defaults for common use cases, while allowing for extensive configuration without writing code if that’s your preference.</li> \n</ul>\n<p><br><br> Iago is the load generator we always wished we had. Now that we’ve built it, we want to share it with others who might need it to solve similar problems. Iago is now open sourced at <a href=\"https://github.com/twitter/iago\">GitHub</a> under the Apache Public License 2.0 and we are happy to accept any feedback (or pull requests) the open source community might have.<br><br>How does Iago work?<br><br> Iago’s <a href=\"https://github.com/twitter/iago\">documentation</a> goes into more detail, but it is written in Scala and is designed to be extended by anyone writing code for the JVM platform. Non-blocking requests are generated at a specified rate, using an underlying, configurable statistical distribution (the default is to model a <a href=\"http://en.wikipedia.org/wiki/Poisson_Process\">Poisson Process</a>). The request rate can be varied as appropriate – for instance to warm up caches before handling full production load. In general the focus is on the arrival rate aspect of <a href=\"http://en.wikipedia.org/wiki/Little%27s_Law\">Little’s Law</a>, instead of concurrent users, which is allowed to float as appropriate given service latency. This greatly enhances the ability to compare multiple test runs and protects against service regressions inducing load generator slow down.<br><br> In short, Iago strives to model a system where requests arrive independently of your service’s ability to handle them. This is as opposed to load generators which model closed systems where users will patiently handle whatever latency you give them. 
This distinction allows us to closely mimic failure modes that we would encounter in production.<br><br> Part of achieving high performance is the ability to scale horizontally. Unsurprisingly, Iago is no different from the systems we test with it. A single instance of Iago is composed of cooperating processes that can generate ~10K RPS provided a number of requirements are met, including factors such as size of payload, the response time of the system under test, the number of ephemeral sockets available, and the rate at which you can actually generate the messages your protocol requires. Despite this complexity, with horizontal scaling Iago is used to routinely test systems at Twitter with well over 200K RPS. We do this internally using our <a href=\"http://incubator.apache.org/mesos/\">Apache Mesos grid</a> computing infrastructure, but Iago can adapt to any system that supports creating multiple JVM processes that can discover each other using <a href=\"http://zookeeper.apache.org/\">Apache Zookeeper</a>.<br><br>Iago at Twitter<br><br> Iago has been used at Twitter throughout our stack, from our core database interfaces, storage sub-systems and domain logic, up to the systems accepting front-end web requests. We routinely evaluate new hardware with it, have extended it to support correctness testing at scale and use it to test highly specific endpoints such as the new <a href=\"http://engineering/2012/06/tailored-trends-bring-you-closer.html\">tailored trends</a>, personalized search, and Discovery releases. We’ve used it to model anticipated load for large events as well as the overall growth of our system over time. It’s also good for providing background traffic while other tests are running, simply to provide the correct mix of usage that we will encounter in production.<br><br>Acknowledgements &amp; Future Work<br><br> Iago was primarily authored by James Waldrop (<a href=\"https://twitter.com/hivetheory\">@hivetheory</a>), but as with any such engineering effort a large number of people have contributed. A special thanks goes out to the Finagle team, Marius Eriksen (<a href=\"https://twitter.com/marius\">@marius</a>), Arya Asemanfar (<a href=\"https://twitter.com/a_a\">@a_a</a>), Evan Meagher (<a href=\"https://twitter.com/evanm\">@evanm</a>), Trisha Quan (<a href=\"https://twitter.com/trisha\">@trisha</a>) and Stephan Zuercher (<a href=\"https://twitter.com/zuercher\">@zuercher</a>) for being tireless consumers as well as contributors to the project. Furthermore, we’d like to thank Raffi Krikorian (<a href=\"https://twitter.com/raffi\">@raffi</a>) and Dave Loftesness (<a href=\"https://twitter.com/dloft\">@dloft</a>) for originally envisioning and spearheading the effort to create Iago.<br><br> To view the Iago source code and participate in the creation and development of our roadmap, please visit <a href=\"https://github.com/twitter/iago\">Iago</a> on GitHub. If you have any further questions, we suggest joining the <a href=\"https://groups.google.com/forum/?fromgroups#!forum/iago-users\">mailing list</a> and following <a href=\"https://twitter.com/iagoloadgen\">@iagoloadgen</a>. If you’re at the Velocity Conference this week in San Francisco, please swing by our <a href=\"http://velocityconf.com/velocity2012/public/schedule/detail/26222\">office hours</a> to learn more about Iago. <br><br> - Chris Aniszczyk, Manager of Open Source (<a href=\"https://twitter.com/cra\">@cra</a>)</p>",
"date": "2012-06-25T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/building-and-profiling-high-performance-systems-with-iago",
"domain": "engineering"
},
{
"title": "Twitter at the Hadoop Summit",
"body": "<p><br> Apache Hadoop is a fundamental part of Twitter infrastructure. The massive computational and storage capacity it provides us is invaluable for analyzing our data sets, continuously improving user experience, and powering features such as “who to follow” <a href=\"http://engineering/2010/07/discovering-who-to-follow.html\">recommendations</a>,<a href=\"http://engineering/2012/05/new-tailored-suggestions-for-you-to.html\"> tailored follow suggestions</a> for new users and <a href=\"http://engineering/2012/05/best-of-twitter-in-your-inbox.html\">“best of Twitter” emails</a>. We developed and open-sourced a number of technologies, including the recent <a href=\"https://github.com/twitter/elephant-twin\">Elephant Twin</a> project that help our engineers be productive with Hadoop. We will be talking about some of them at the <a href=\"http://hadoopsummit.org/\">Hadoop Summit</a> this week:<br><br>Real-time analytics with Storm and Hadoop (<a href=\"https://twitter.com/nathanmarz\">@nathanmarz</a>)<br><a href=\"https://github.com/nathanmarz/storm\">Storm</a> is a distributed and fault-tolerant real-time computation system, doing for real-time computation what Hadoop did for batch computation. Storm can be used together with Hadoop to make a potent realtime analytics stack; Nathan will discuss how we’ve combined the two technologies at Twitter to do complex analytics in real-time.<br><br>Training a Smarter Pig: Large-Scale Machine Learning at Twitter (<a href=\"https://twitter.com/lintool\">@lintool</a>)<br> We’ll present a case study of Twitter`s integration of machine learning tools into its existing Hadoop-based, Pig-centric analytics platform. In our deployed solution, common machine learning tasks such as data sampling, feature generation, training, and testing can be accomplished directly in <a href=\"http://pig.apache.org/\">Pig</a>, via carefully crafted loaders, storage functions, and user-defined functions. This means that machine learning is just another Pig script, which allows seamless integration with existing infrastructure for data management, scheduling, and monitoring in a production environment. This talk is based on a paper we presented at SIGMOD 2012.<br><br>Scalding: Twitter`s new DSL for Hadoop (<a href=\"https://twitter.com/posco\">@posco</a>)<br> Hadoop uses a functional programming model to represent large-scale distributed computation. Scala is thus a very natural match for Hadoop. We will present <a href=\"https://github.com/twitter/scalding\">Scalding</a>, which is built on top of Cascading. Scalding brings an API very similar to Scala`s collection API to allow users to write jobs as they might locally and run those Jobs at scale. This talk will present the Scalding DSL and show some example jobs for common use cases.<br><br>Hadoop and Vertica: The Data Analytics Platform at Twitter (<a href=\"https://twitter.com/billgraham\">@billgraham</a>)<br> Our data analytics platform uses a number of technologies, including Hadoop, Pig, Vertica, MySQL and ZooKeeper, to process hundreds of terabytes of data per day. Hadoop and Vertica are key components of the platform. The two systems are complementary, but their inherent differences create integration challenges. 
This talk is an overview of the overall system architecture focusing on integration details, job coordination and resource management.<br><br>Flexible In-Situ Indexing for Hadoop via Elephant Twin (<a href=\"https://twitter.com/squarecog\">@squarecog</a>)<br> Hadoop workloads can be broadly divided into two types: large aggregation queries that involve scans through massive amounts of data, and selective “needle in a haystack” queries that significantly restrict the number of records under consideration. Secondary indexes can greatly increase processing speed for queries of the second type. We will present Twitter’s generic, extensible in-situ indexing framework Elephant Twin, which was just open sourced: unlike “trojan layouts,” no data copying is necessary, and unlike Hive, our integration at the Hadoop API level means that all layers in the stack above can benefit from indexes.<br><br> As you can tell, our uses of Hadoop are wide and varied. We are looking forward to exchanging notes with other practitioners and learning about upcoming developments in the Hadoop ecosystem. Hope to see you there, and if this sort of thing gets you excited, reach out to us, as we are <a href=\"http://twitter.com/jobs\">hiring</a>!<br><br> - Dmitriy Ryaboy, Engineering Manager, Analytics (<a href=\"http://twitter.com/squarecog\">@squarecog</a>)</p>",
"date": "2012-06-13T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/twitter-at-the-hadoop-summit",
"domain": "engineering"
},
{
"title": "Distributed Systems Tracing with Zipkin",
"body": "<p><a href=\"https://github.com/twitter/zipkin\">Zipkin</a> is a distributed tracing system that we created to help us gather timing data for all the disparate services involved in managing a request to the Twitter API. As an analogy, think of it as a performance profiler, like <a href=\"http://getfirebug.com/\">Firebug</a>, but tailored for a website backend instead of a browser. In short, it makes Twitter faster. Today we’re open sourcing <a href=\"https://github.com/twitter/zipkin\">Zipkin</a> under the APLv2 license to share a useful piece of our infrastructure with the open source community and gather feedback.</p> \n<p><a href=\"http://4.bp.blogspot.com/-DsmFcaeMi4U/T9DX_XLMrdI/AAAAAAAAABg/jKF_pXToIEo/s1600/zipkin.jpg\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/distributed_systemstracingwithzipkin95.thumb.1280.1280.png\" alt=\"Distributed Systems Tracing with Zipkin\"></a><br><br></p> \n<h3>What can Zipkin do for me?</h3> \n<p>Here’s the Zipkin web user interface. This example displays the trace view for a web request. You can see the time spent in each service compared to the scale on top and all the services involved in the request on the left. You can click on those for more detailed information.</p> \n<p><a href=\"http://4.bp.blogspot.com/-b0r71ZbJdmA/T9DYhbE0uXI/AAAAAAAAABs/bXwyM76Iddc/s1600/web-screenshot.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/distributed_systemstracingwithzipkin96.thumb.1280.1280.png\" alt=\"Distributed Systems Tracing with Zipkin\"></a></p> \n<p>Zipkin has helped us find a whole slew of untapped performance optimizations, such as removing memcache requests, rewriting slow MySQL SELECTs, and fixing incorrect service timeouts. Finding and correcting these types of performance bottlenecks helps make Twitter faster.</p> \n<h3>How does Zipkin work?</h3> \n<p>Whenever a request reaches Twitter, we decide if the request should be sampled. We attach a few lightweight trace identifiers and pass them along to all the services used in that request. By only sampling a portion of all the requests we reduce the overhead of tracing, allowing us to always have it enabled in production.</p> \n<p>The Zipkin collector receives the data via Scribe and stores it in Cassandra along with a few indexes. The indexes are used by the Zipkin query daemon to find interesting traces to display in the web UI.</p> \n<p>Zipkin started out as a project during our first Hack Week. During that week we implemented a basic version of the <a href=\"http://research.google.com/pubs/pub36356.html\">Google Dapper</a> paper for Thrift. Today it has grown to include support for tracing Http, Thrift, Memcache, SQL and Redis requests. These are mainly done via our <a href=\"http://twitter.github.com/finagle/\">Finagle</a> library in Scala and Java, but we also have a gem for Ruby that includes basic tracing support. It should be reasonably straightforward to add tracing support for other protocols and in other libraries.</p> \n<h3>Acknowledgements</h3> \n<p>Zipkin was primarily authored by Johan Oskarsson (<a href=\"https://twitter.com/intent/user?screen_name=skr\">@skr</a>) and Franklin Hu (<a href=\"https://twitter.com/intent/user?screen_name=thisisfranklin\">@thisisfranklin</a>). 
The project relies on a bunch of Twitter libraries such as <a href=\"http://twitter.github.com/finagle/\">Finagle</a> and <a href=\"https://github.com/twitter/scrooge\">Scrooge</a> but also on <a href=\"http://cassandra.apache.org/\">Cassandra</a> for storage, <a href=\"http://zookeeper.apache.org/\">ZooKeeper</a> for configuration, <a href=\"https://github.com/facebook/scribe\">Scribe</a> for transport, <a href=\"http://twitter.github.com/bootstrap/\">Bootstrap</a> and <a href=\"http://d3js.org/\">D3</a> for the UI. Thanks to the authors of those projects, the authors of the Dapper paper as well as the numerous people at Twitter involved in making Zipkin a reality. A special thanks to <a href=\"https://twitter.com/intent/user?screen_name=iano\">@iano</a>, <a href=\"https://twitter.com/intent/user?screen_name=couch\">@couch</a>, <a href=\"https://twitter.com/intent/user?screen_name=zed\">@zed</a>, <a href=\"https://twitter.com/intent/user?screen_name=dmcg\">@dmcg</a>, <a href=\"https://twitter.com/intent/user?screen_name=marius\">@marius</a> and <a href=\"https://twitter.com/intent/user?screen_name=a_a\">@a_a</a> for their involvement. Last but not least we’d like to thank <a href=\"https://twitter.com/intent/user?screen_name=jeajea\">@jeajea</a> for designing the Zipkin logo.</p> \n<p>On the whole, Zipkin was initially targeted to support Twitter’s infrastructure of libraries and protocols, but can be extended to support more systems that can be used within your infrastructure. Please let us know on Github if you find any issues and pull requests are always welcome. If you want to stay in touch, follow <a href=\"https://twitter.com/intent/user?screen_name=ZipkinProject\">@ZipkinProject</a> and check out the upcoming talk at Strange Loop 2012. If distributed systems tracing interests you, consider joining the flock to make things better.</p> \n<p>- Chris Aniszczyk, Manager of Open Source (<a href=\"https://twitter.com/#!/cra\">@cra</a>)</p>",
"date": "2012-06-07T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/distributed-systems-tracing-with-zipkin",
"domain": "engineering"
},
{
"title": "Studying rapidly evolving user interests",
"body": "<p>Twitter is an amazing real-time information dissemination platform. We’ve seen events of historical importance such as the Arab Spring unfold via Tweets. We even know that Twitter is <a href=\"http://www.youtube.com/watch?v=0UFsJhYBxzY\">faster than earthquakes</a>! However, can we more scientifically characterize the real-time nature of Twitter?</p> \n<p>One way to measure the dynamics of a content system is to test how quickly the distribution of terms and phrases appearing in it changes. A recent study we’ve done does exactly this: looking at terms and phrases in Tweets and in real-time search queries, we see that the most frequent terms in one hour or day tend to be very different from those in the next — significantly more so than in other content on the web. Informally, we call this phenomenon churn.</p> \n<p>This week, we are presenting a <a href=\"http://arxiv.org/abs/1205.6855\">short paper</a> at the <a href=\"http://www.icwsm.org/2012/\">International Conference on Weblogs and Social Media (ICWSM 2012)</a>, in which <a href=\"http://www.twitter.com/gilad\">@gilad</a> and I examine this phenomenon. An extended version of the paper, titled “A Study of ‘Churn’ in Tweets and Real-Time Search Queries”, is available <a href=\"http://arxiv.org/abs/1205.6855\">here</a>. Some highlights:</p> \n<ul>\n <li>Examining all search queries from October 2011, we see that, on average, about 17% of the top 1000 query terms from one hour are no longer in the top 1000 during the next hour. In other words, 17% of the top 1000 query terms “churn over” on an hourly basis.</li> \n <li>Repeating this at a granularity of days instead of hours, we still find that about 13% of the top 1000 query terms from one day are no longer in the top 1000 during the next day.</li> \n <li>During major events, the frequency of queries spike dramatically. For example, on October 5, immediately following news of the death of Apple co-founder and CEO Steve Jobs, the query “steve jobs” spiked from a negligible fraction of query volume to 15% of the query stream — almost one in six of all queries issued! Check it out: the query volume is literally off the charts! Notice that related queries such as “apple” and “stay foolish” spiked as well.</li> \n</ul>\n<p><a href=\"http://3.bp.blogspot.com/-F7dqZbCQU9o/T8zwh16VJVI/AAAAAAAAAAk/r9qX9fqjU9U/s1600/graph-apple-queries.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/studying_rapidlyevolvinguserinterests95.thumb.1280.1280.png\" alt=\"Studying rapidly evolving user interests\"></a></p> \n<p>What does this mean? News breaks on Twitter, whether local or global, of narrow or broad interest. When news breaks, Twitter users flock to the service to find out what’s happening. Our goal is to instantly connect people everywhere to what’s most meaningful to them; the speed at which our content (and the relevance signals stemming from it) evolves make this more technically challenging, and we are hard at work continuously refining our relevance algorithms to address this. Just to give one example: search, boiled down to its basics, is about computing term statistics such as term frequency and inverse document frequency. 
Most algorithms assume some static notion of underlying distributions — which surely isn’t the case here!</p> \n<p>In addition, we’re presenting a <a href=\"http://www.aaai.org/ocs/index.php/ICWSM/ICWSM12/paper/view/4785/5095\">paper</a> at the co-located workshop on <a href=\"http://socmedvis.ucd.ie/\">Social Media Visualization</a>, where <a href=\"http://www.twitter.com/miguelrios\">@miguelrios</a> and I share some of our experiences in using data visualization techniques to generate insights from the petabytes of data in our data warehouse. You’ve seen some of these visualizations before, for example, about the <a href=\"http://engineering/2010/07/2010-world-cup-global-conversation.html\">2010 World Cup</a> and <a href=\"http://engineering/2011/06/global-pulse.html\">2011 Japan earthquake</a>. In the paper, we present another visualization, of seasonal variation of tweeting patterns for users in four different cities (New York City, Tokyo, Istanbul, and Sao Paulo). The gradient from white to yellow to red indicates amount of activity (light to heavy). Each tile in the heatmap represents five minutes of a given day and colors are normalized by day. This was developed internally to understand why growth patterns in Tweet-production experience seasonal variations.</p> \n<p><a href=\"http://2.bp.blogspot.com/-wA7QSnBjFWI/T8zw0wBouhI/AAAAAAAAAAw/pZ40XbZfh-8/s1600/sleeping_grid-miguel.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/studying_rapidlyevolvinguserinterests96.thumb.1280.1280.png\" alt=\"Studying rapidly evolving user interests\"></a></p> \n<p>We see different patterns of activity between the four cities. For example, waking/sleeping times are relatively constant throughout the year in Tokyo, but the other cities exhibit seasonal variations. We see that Japanese users’ activities are concentrated in the evening, whereas in the other cities there is more usage during the day. In Istanbul, nights get shorter during August; Sao Paulo shows a time interval during the afternoon when Tweet volume goes down, and also longer nights during the entire year compared to the other three cities.</p> \n<p>Finally, we’re also giving a keynote at the co-located workshop on <a href=\"http://www.ramss.ws/2012/\">Real-Time Analysis and Mining of Social Streams (RAMSS)</a>, fitting very much into the theme of our study. We’ll be reviewing many of the challenges of handling real-time data, including many of the issues described above.</p> \n<p>Interested in real-time systems that deliver relevant information to users? Interested in data visualization and data science? We’re <a href=\"http://twitter.com/jobs/\">hiring</a>! Join the flock!</p> \n<p>- Jimmy Lin, Research Scientist, Analytics (<a href=\"http://www.twitter.com/lintool/\">@lintool</a>)</p>",
"date": "2012-06-04T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/studying-rapidly-evolving-user-interests",
"domain": "engineering"
},
{
"title": "Improving performance on twitter.com",
"body": "<p>To connect you to information in real time, it’s important for Twitter to be fast. That’s why we’ve been reviewing our entire technology stack to optimize for speed.</p> \n<p>When we shipped <a href=\"https://twitter.com/hashtag/NewTwitter\">#NewTwitter</a> in September 2010, <a href=\"http://engineering.twitter.com/2010/09/tech-behind-new-twittercom.html\">we built it</a> around a web application architecture that pushed all of the UI rendering and logic to JavaScript running on our users’ browsers and consumed the Twitter REST API directly, in a similar way to our mobile clients. That architecture broke new ground by offering a number of advantages over a more traditional approach, but it lacked support for various optimizations available only on the server.</p> \n<p>To improve the twitter.com experience for everyone, we’ve been working to take back control of our front-end performance by moving the rendering to the server. This has allowed us to drop our initial page load times to 1/5th of what they were previously and reduce differences in performance across browsers.</p> \n<p>On top of the rendered pages, we asynchronously bootstrap a new modular JavaScript application to provide the fully-featured interactive experience our users expect. This new framework will help us rapidly develop new Twitter features, take advantage of new browser technology, and ultimately provide the best experience to as many people as possible.</p> \n<p>This week, we rolled out the re-architected version of one of our most visited pages, the <a href=\"https://twitter.com/cwninja/status/205341558285938688\">Tweet permalink page</a>. We’ll continue to roll out this new framework to the rest of the site in the coming weeks, so we’d like to take you on a tour of some of the improvements.</p> \n<h3>No more #!</h3> \n<p>The first thing that you might notice is that permalink URLs are now simpler: they no longer use the hashbang (#!). While hashbang-style URLs have a <a href=\"http://danwebb.net/2011/5/28/it-is-about-the-hashbangs\">handful of limitations</a>, our primary reason for this change is to improve initial page-load performance.</p> \n<p>When you come to twitter.com, we want you to see content as soon as possible. With hashbang URLs, the browser needs to download an HTML page, download and execute some JavaScript, recognize the hashbang path (which is only visible to the browser), then fetch and render the content for that URL. By removing the need to handle routing on the client, we remove many of these steps and reduce the time it takes for you to find out what’s happening on twitter.com.</p> \n<h3>Reducing time to first tweet</h3> \n<p>Before starting any of this work we added instrumentation to find the performance pain points and identify which categories of users we could serve better. The most important metric we used was “time to first Tweet”. This is a measurement we took from a sample of users, (using the <a href=\"http://www.w3.org/TR/navigation-timing/\">Navigation Timing API</a>) of the amount of time it takes from navigation (clicking the link) to viewing the first Tweet on each page’s timeline. The metric gives us a good idea of how snappy the site feels.</p> \n<p>Looking at the components that make up this measurement, we discovered that the raw parsing and execution of JavaScript caused massive outliers in perceived rendering speed. In our fully client-side architecture, you don’t see anything until our JavaScript is downloaded and executed. 
The problem is further exacerbated if you do not have a high-specification machine or if you’re running an older browser. The bottom line is that a client-side architecture leads to slower performance because most of the code is being executed on our users’ machines rather than our own.</p> \n<p>There are a variety of options for improving the performance of our JavaScript, but we wanted to do even better. We took the execution of JavaScript completely out of our render path. By rendering our page content on the server and deferring all JavaScript execution until well after that content has been rendered, we’ve dropped the time to first Tweet to one-fifth of what it was.</p> \n<h3>Loading only what we need</h3> \n<p>Now that we’re delivering page content faster, the next step is to ensure that our JavaScript is loaded and the application is interactive as soon as possible. To do that, we need to minimize the amount of JavaScript we use: smaller payload over the wire, fewer lines of code to parse, faster to execute. To make sure we only download the JavaScript necessary for the page to work, we needed to get a firm grip on our dependencies.</p> \n<p>To do this, we opted to arrange all our code as CommonJS modules, delivered via AMD. This means that each piece of our code explicitly declares what it needs to execute which, firstly, is a win for developer productivity. When working on any one module, we can easily understand what dependencies it relies on, rather than the typical browser JavaScript situation in which code depends on an implicit load order and globally accessible properties.</p> \n<p>Modules let us separate the loading and the evaluation of our code. This means that we can bundle our code in the most efficient manner for delivery and leave the evaluation order up to the dependency loader. We can tune how we bundle our code, lazily load parts of it, download pieces in parallel, separate it into any number of files, and more — all without the author of the code having to know or care about this. Our JavaScript bundles are built programmatically by a tool, similar to the RequireJS optimizer, that crawls each file to build a dependency tree. This dependency tree lets us design how we bundle our code, and rather than downloading the kitchen sink every time, we only download the code we need — and then only execute that code when required by the application.</p> \n<h3>What’s next?</h3> \n<p>We’re currently rolling out this new architecture across the site. Once our pages are running on this new foundation, we will do more to further improve performance. For example, we will implement the History API to allow partial page reloads in browsers that support it, and begin to overhaul the server side of the application.</p> \n<p>If you want to know more about these changes, come and see us at the <a href=\"http://fluentconf.com/fluent2012/public/schedule/detail/24679\">Fluent Conference</a> next week. We’ll speak about the details behind our rebuild of twitter.com and host a <a href=\"http://twitterhappyhour.eventbrite.com/\">JavaScript Happy Hour</a> at Twitter HQ on May 31.</p> \n<div class=\"g-tweet\">\n <blockquote class=\"twitter-tweet\"> \n <p>@<a href=\"https://twitter.com/danwrong\">danwrong</a> yey! The future is coming and it looks just like the past, but more good underneath.</p> — Tom Lea (\n <a href=\"https://twitter.com/intent/user?screen_name=cwninja\">@cwninja</a>) \n <a href=\"https://twitter.com/cwninja/status/205341558285938688\">May 23, 2012</a>\n </blockquote>",
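To make the dependency-tree idea concrete, here is a minimal sketch in Scala (our illustration, not Twitter's build tool, which operates on JavaScript files) of turning crawled module dependencies into an evaluation-safe order for bundling; the `deps` map stands in for the declarations the tool crawls:

```scala
object Bundler {
  // `deps` maps a module name to the modules it declares as dependencies,
  // as gathered by crawling each file. Returns modules in an order where
  // every module appears after its dependencies (cycles are not handled
  // in this sketch).
  def loadOrder(deps: Map[String, List[String]]): List[String] = {
    var visited = Set.empty[String]
    var order = List.empty[String]
    def visit(module: String): Unit =
      if (!visited(module)) {
        visited += module
        deps.getOrElse(module, Nil).foreach(visit) // dependencies first
        order = module :: order
      }
    deps.keys.foreach(visit)
    order.reverse
  }
}
```

For example, `loadOrder(Map("timeline" -> List("tweet"), "tweet" -> Nil))` yields `List("tweet", "timeline")`, so each module is evaluated only after its dependencies, while the bundling and download strategy remains free to vary.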
"date": "2012-05-29T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/improving-performance-on-twittercom",
"domain": "engineering"
},
{
"title": "Visualize Data Workflows with Ambrose",
"body": "<p>Last Friday at our <a href=\"http://pig.apache.org\">Apache Pig</a> Hackathon, we open-sourced Twitter <a href=\"https://github.com/twitter/ambrose\">Ambrose</a>, a tool which helps authors of large-scale data workflows keep track of the overall status of a workflow and visualize its progress.<br><br></p> \n<div class=\"g-tweet\">\n <blockquote class=\"twitter-tweet\"> \n <p>hey <a href=\"https://twitter.com/search/%2523hadoop\">#hadoop</a> folks, we open sourced @<a href=\"https://twitter.com/ambrose\">ambrose</a> today, a tool for visualization and real-time monitoring of data workflows <a href=\"https://t.co/72jVgSAy\">github.com/twitter/ambrose</a></p> — Twitter Open Source (\n <a href=\"https://twitter.com/intent/user?screen_name=TwitterOSS\">@TwitterOSS</a>) \n <a href=\"https://twitter.com/TwitterOSS/status/200985628912005120\">May 11, 2012</a>\n </blockquote>",
"date": "2012-05-16T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/visualize-data-workflows-with-ambrose",
"domain": "engineering"
},
{
"title": "Related Queries and Spelling Corrections in Search",
"body": "<p>As you may have noticed, searches on twitter.com, Twitter for iOS, and Twitter for Android now have spelling corrections and related queries next to the search results. <br><br> At the core of our related queries and spelling correction service is a simple mechanism: if we see query A in some context, and then see query B in the same context, we think they’re related. If A and B are similar, B may be a spell-corrected version of A; if they’re not, it may be interesting to searchers who find A interesting. We use both query sessions and tweets for context; if we observe a user typing [justin beiber] and then, within the same session, typing [justin bieber], we’ll consider the second query as a possible spelling correction to the first — and if the same session will also contain [selena gomez], we may consider this as a related query to the previous queries. The data we process is anonymized — we don’t track which queries are issued by a given user, only that the same (unknown) user has issued several queries in a row, or continuously tweeted. <br><br> To measure the similarity between queries, we use a variant of <a href=\"http://en.wikipedia.org/wiki/Edit_distance\">Edit Distance</a> tailored to Twitter queries; for example, in our variant we treat the beginning and end characters of a query differently from the inner characters, as spelling mistakes tend to be concentrated in those. Our variant also treats special Twitter characters (such as @ and #) differently from other characters, and has other differences from the vanilla Edit Distance. To measure the quality of the suggestions, we use a variety of signals including query frequencies (of the original query and the suggestion), statistical correlation measures such as <a href=\"http://en.wikipedia.org/wiki/Likelihood-ratio_test\">log-likelihood</a>, the quality of the search results for the suggestion, and others.<br><br> Twitter’s spelling correction has a number of unique challenges: searchers frequently type in usernames or hashtags that are not well-formed English words; there is a real-time constancy of new lingo and terms supplied by our own users; and we want to help people find those in order to join in the conversation. To address all of these issues, on top of our context-based mechanism, we also index dictionaries of trending queries and popular users that are likely to be misspelled, and use Lucene’s <a href=\"https://builds.apache.org/job/Lucene-trunk/javadoc/suggest/org/apache/lucene/search/spell/package-summary.html\">built-in</a> spelling correction library (tweaked to better serve our needs) to identify misspelling and retrieve corrections for queries.<br><br></p> \n<p><a href=\"http://4.bp.blogspot.com/-rS-RCrZIHiY/T61VpOyYatI/AAAAAAAAAUs/mSD0NuM1_oo/s1600/bieber_spelling.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/related_queries_andspellingcorrectionsinsearch95.thumb.1280.1280.png\" alt=\"Related Queries and Spelling Corrections in Search\"></a></p> \n<p><br> Initially, we started computing-related queries and spelling correction in a batch service, periodically updating our user-facing service with the latest data. But we’ve noticed that the lag this process introduced resulted in a less-than-optimal experience — it would take several hours for the models to adapt to new search trends. We then rewrote the entire service, this time as an online, real-time one. 
Queries and tweets are tracked as they come, and our models are continuously updated, just like the search results themselves. To account for the longer tail of queries that has less context from recent hours, we combine the real-time, up-to-date model with a background model computed in the same manner, but over several months of data (and updated daily).<br><br></p> \n<p><a href=\"http://3.bp.blogspot.com/-FRw8hKOODMc/T61Vtmv5cNI/AAAAAAAAAU4/P5mrboXfIMU/s1600/dark%2Bshadows.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/related_queries_andspellingcorrectionsinsearch96.thumb.1280.1280.png\" alt=\"Related Queries and Spelling Corrections in Search\"></a></p> \n<p><br> Within the first two weeks of launching our related queries and spelling corrections in late April, we’ve corrected 5 million queries and provided suggestions for 100 million more. We’re very encouraged by the high engagement rates we’re seeing so far on both features. <br><br> We’re working on more ways to help you find and discover the most relevant and engaging content in real time, so stay tuned. There are other big improvements we’ll be rolling out to Twitter search over the coming weeks and months.<br><br>Acknowledgments<br> The system was built by Gilad Mishne (<a href=\"https://twitter.com/gilad\">@gilad</a>), Zhenghua Li (<a href=\"https://twitter.com/zhenghuali\">@zhenghuali</a>) and Tian Wang (<a href=\"https://twitter.com/wangtian\">@wangtian</a>) with help from the entire Twitter Search team. Thanks also to Jeff Dalton (<a href=\"https://twitter.com/jeffd\">@jeffd</a>) for initial explorations and to Aneesh Sharma (<a href=\"https://twitter.com/aneeshs\">@aneeshs</a>) for help with the design.</p>",
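As a concrete illustration of the tailored Edit Distance idea, here is a minimal sketch in Scala; the weighting scheme and values are our own illustration, not Twitter's actual variant. It is a standard dynamic-programming edit distance in which substitutions at the boundary characters, and edits involving Twitter operators like @ and #, are weighted differently from inner-character edits:

```scala
object QueryDistance {
  // Position- and character-aware substitution cost. The weights below are
  // illustrative assumptions: operator characters (@, #) are treated as
  // rarely mistyped, and boundary characters are weighted differently
  // from inner ones.
  private def subCost(a: String, pos: Int, x: Char, y: Char): Double =
    if (x == y) 0.0
    else if (x == '@' || x == '#' || y == '@' || y == '#') 2.0
    else if (pos == 0 || pos == a.length - 1) 1.5
    else 1.0

  // Classic Levenshtein DP with weighted substitutions; insertions and
  // deletions keep unit cost in this sketch.
  def distance(a: String, b: String): Double = {
    val d = Array.ofDim[Double](a.length + 1, b.length + 1)
    for (i <- 0 to a.length) d(i)(0) = i.toDouble
    for (j <- 0 to b.length) d(0)(j) = j.toDouble
    for (i <- 1 to a.length; j <- 1 to b.length)
      d(i)(j) = math.min(
        math.min(d(i - 1)(j) + 1.0, d(i)(j - 1) + 1.0),
        d(i - 1)(j - 1) + subCost(a, i - 1, a(i - 1), b(j - 1))
      )
    d(a.length)(b.length)
  }
}
```

Under this scheme, `distance("justin beiber", "justin bieber")` stays small (two inner-character substitutions), while a mismatch on a leading @ or # is penalized more heavily.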
"date": "2012-05-11T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/related-queries-and-spelling-corrections-in-search",
"domain": "engineering"
},
{
"title": "Incubating Apache Mesos",
"body": "<p>At Twitter, <a href=\"http://incubator.apache.org/mesos/\">Apache Mesos</a> runs on hundreds of production machines and makes it easier to execute jobs that do everything from running services to handling our analytics workload. For those not familiar with it, the Mesos project originally started as a UC Berkeley research effort. It is now being developed at the Apache Software Foundation (ASF), where it just reached its first release inside the <a href=\"http://incubator.apache.org/\">Apache Incubator</a>. <br><br> Mesos aims to make it easier to build distributed applications and frameworks that share clustered resources like, CPU, RAM or hard disk space. There are Java, Python and C++ APIs for developing new parallel applications. Specifically, you can use Mesos to: <br><br></p> \n<ul>\n <li>Run <a href=\"http://hadoop.apache.org/\">Hadoop</a>, <a href=\"http://www.spark-project.org/\">Spark</a> and other frameworks concurrently on a shared pool of nodes</li> \n <li>Run multiple instances of Hadoop on the same cluster to isolate production and experimental jobs, or even multiple versions of Hadoop</li> \n <li>Scale to 10,000s of nodes using fast, event-driven C++ implementation</li> \n <li>Run long-lived services (e.g., Hypertable and HBase) on the same nodes as batch applications and share resources between them</li> \n <li>Build new cluster computing frameworks without reinventing low-level facilities for farming out tasks, and have them coexist with existing ones</li> \n <li>View cluster status and information using a web user interface</li> \n</ul>\n<p><br> Mesos is being used at <a href=\"http://www.conviva.com/\">Conviva</a>, <a href=\"http://www.berkeley.edu/\">UC Berkeley</a> and <a href=\"http://www.ucsf.edu/\">UC San Francisco</a>, as well as here. Some of our runtime systems engineers, specifically Benjamin Hindman (<a href=\"https://twitter.com/benh\">@benh</a>), Bill Farner (<a href=\"https://twitter.com/wfarner\">@wfarner</a>), Vinod Kone (<a href=\"https://twitter.com/vinodkone\">@vinodkone</a>), John Sirois (<a href=\"https://twitter.com/johnsirois\">@johnsirois</a>), Brian Wickman (<a href=\"https://twitter.com/wickman\">@wickman</a>), and Sathya Hariesh (<a href=\"https://twitter.com/sathya\">@sathya</a>) have worked hard to evolve Mesos and make it useful for our scalable engineering challenges. If you’re interested in Mesos, we invite you to <a href=\"http://www.mesosproject.org/download.html\">try it out</a>, follow <a href=\"http://twitter.com/#!/apachemesos\">@ApacheMesos</a>, join the <a href=\"http://mail-archives.apache.org/mod_mbox/incubator-mesos-dev/\">mailing list</a> and help us develop a Mesos community within the ASF.<br><br> — Chris Aniszczyk, Manager of Open Source (<a href=\"https://twitter.com/cra\">@cra</a>)</p>",
"date": "2012-05-10T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/incubating-apache-mesos",
"domain": "engineering"
},
{
"title": "Discover: Improved personalization algorithms and real-time indexing",
"body": "<p>We are beginning to roll out a <a href=\"http://engineering/2012/05/discover-better-stories.html\">new version of the Discover tab</a> that is even more personalized for you. We’ve improved our personalization algorithms to incorporate several new signals including the accounts you follow and whom they follow. All of this social data is used to understand your interests and display stories that are relevant to you in real-time.<br><br> Behind the scenes, the new Discover tab is powered by <a href=\"http://engineering.twitter.com/2011/05/engineering-behind-twitters-new-search.html\">Earlybird</a>, Twitter’s real-time search technology. When a user tweets, that Tweet is indexed and becomes searchable in seconds. Every Tweet with a link also goes through some additional processing: we extract and expand any URLs available in Tweets, and then fetch the contents of those URLs via <a href=\"http://engineering.twitter.com/2011/11/spiderduck-twitters-real-time-url.html\">SpiderDuck</a>, our real-time URL fetcher.<br><br> To generate the stories that are based on your social graph and that we believe are most interesting to you, we first use <a href=\"http://engineering.twitter.com/2012/03/cassovary-big-graph-processing-library.html\">Cassovary</a>, our graph processing library, to identify your connections and rank them according to how strong and important those connections are to you.<br><br> Once we have that network, we use Twitter’s flexible search engine to find URLs that have been shared by that circle of people. Those links are converted into stories that we’ll display, alongside other stories, in the Discover tab. Before displaying them, a final ranking pass re-ranks stories according to how many people have tweeted about them and how important those people are in relation to you. All of this happens in near-real time, which means breaking and relevant stories appear in the new Discover tab almost as soon as people start talking about them. <br><br> Our NYC engineering team, led by Daniel Loreto (<a href=\"http://twitter.com/#!/danielloreto\">@DanielLoreto</a>), along with Julian Marinus (<a href=\"http://twitter.com/#!/fooljulian\">@fooljulian</a>), Alec Thomas (<a href=\"http://twitter.com/#!/alecthomas\">@alecthomas</a>), Dave Landau (<a href=\"http://twitter.com/#!/landau\">@landau</a>), and Ugo Di Girolamo (<a href=\"http://twitter.com/#!/ugodiggi\">@ugodiggi</a>), is working hard on Discover to create new ways to bring you instantly closer to the things you care about. This update is just the beginning of this ongoing effort.<br><br> - Ori Allon, Director of Engineering (<a href=\"http://twitter.com/#!/oriallon\">@oriallon</a>)</p>",
"date": "2012-05-01T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/discover-improved-personalization-algorithms-and-real-time-indexing",
"domain": "engineering"
},
{
"title": "Sponsoring the Apache Foundation",
"body": "<p>Open source is a pervasive part of our culture. Many projects at Twitter rely on open source technologies, and as we evolve as a company, our commitment to open source continues to increase. Today, we are becoming an official sponsor of the <a href=\"http://apache.org/\">Apache Software Foundation</a> (ASF), a non-profit and volunteer-run open source foundation. </p> \n<div class=\"g-tweet\">\n <blockquote class=\"twitter-tweet\"> \n <p>Starting today, we are sponsoring The Apache Foundation. We look forward to contributing more and increasing our commitment to @<a href=\"https://twitter.com/TheASF\">TheASF</a></p> — Twitter Open Source (\n <a href=\"https://twitter.com/intent/user?screen_name=TwitterOSS\">@TwitterOSS</a>) \n <a href=\"https://twitter.com/TwitterOSS/status/193035999767576576\">April 19, 2012</a>\n </blockquote>",
"date": "2012-04-19T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/sponsoring-the-apache-foundation",
"domain": "engineering"
},
{
"title": "Introducing the Innovator’s Patent Agreement",
"body": "<p>Cross-posted on the <a href=\"http://engineering/2012/04/introducing-innovators-patent-agreement.html\">Twitter Blog</a>. <br><br> One of the great things about Twitter is working with so many talented folks who dream up and build incredible products day in and day out. Like many companies, we apply for patents on a bunch of these inventions. However, we also think a lot about how those patents may be used in the future; we sometimes worry that they may be used to impede the innovation of others. For that reason, we are publishing a draft of the Innovator’s Patent Agreement, which we informally call the “IPA”. <br><br> The IPA is a new way to do patent assignment that keeps control in the hands of engineers and designers. It is a commitment from Twitter to our employees that patents can only be used for defensive purposes. We will not use the patents from employees’ inventions in offensive litigation without their permission. What’s more, this control flows with the patents, so if we sold them to others, they could only use them as the inventor intended. <br><br> This is a significant departure from the current state of affairs in the industry. Typically, engineers and designers sign an agreement with their company that irrevocably gives that company any patents filed related to the employee’s work. The company then has control over the patents and can use them however they want, which may include selling them to others who can also use them however they want. With the IPA, employees can be assured that their patents will be used only as a shield rather than as a weapon. <br><br> We will implement the IPA later this year, and it will apply to all patents issued to our engineers, both past and present. We are still in early stages, and have just started to reach out to other companies to discuss the IPA and whether it might make sense for them too. In the meantime, we’ve <a href=\"https://github.com/twitter/innovators-patent-agreement\">posted the IPA on GitHub</a> with the hope that you will take a look, share your feedback and discuss with your companies. And, of course, <a href=\"https://twitter.com/jobs/engineering\">you can #jointheflock</a> and have the IPA apply to you. <br><br> Today is the second day of our quarterly Hack Week, which means employees – engineers, designers, and folks all across the company – are working on projects and tools outside their regular day-to-day work. The goal of this week is to give rise to the most audacious and creative ideas. These ideas will have the greatest impact in a world that fosters innovation, rather than dampening it, and we hope the IPA will play an important part in making that vision a reality. <br><br> - Adam Messinger, VP of Engineering (<a href=\"https://twitter.com/#!/adam_messinger\">@adam_messinger</a>)</p>",
"date": "2012-04-17T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/introducing-the-innovator-s-patent-agreement-0",
"domain": "engineering"
},
{
"title": "MySQL at Twitter",
"body": "<p>MySQL is the persistent storage technology behind most Twitter data: the interest graph, timelines, user data and the Tweets themselves. Due to our scale, we push MySQL a lot further than most companies. Of course, MySQL is open source software, so we have the ability to change it to suit our needs. Since we believe in sharing knowledge and that open source software facilitates innovation, we have decided to open source <a href=\"https://github.com/twitter/mysql\">our MySQL work</a> on GitHub under the BSD New license. The objectives of our work thus far has primarily been to improve the predictability of our services and make our lives easier. Some of the work we’ve done includes:</p> \n<ul>\n <li>Add additional status variables, particularly from the internals of InnoDB. This allows us to monitor our systems more effectively and understand their behavior better when handling production workloads.</li> \n <li>Optimize memory allocation on large NUMA systems: Allocate InnoDB’s buffer pool fully on startup, fail fast if memory is not available, ensure performance over time even when server is under memory pressure.</li> \n <li>Reduce unnecessary work through improved server-side statement timeout support. This allows the server to proactively cancel queries that run longer than a millisecond-granularity timeout.</li> \n <li>Export and restore InnoDB buffer pool in using a safe and lightweight method. This enables us to build tools to support rolling restarts of our services with minimal pain.</li> \n <li>Optimize MySQL for SSD-based machines, including page-flushing behavior and reduction in writes to disk to improve lifespan.</li> \n</ul>\n<p>We look forward sharing our work with upstream and other downstream MySQL vendors, with a goal to improve the MySQL community. For a more complete look at our work, please see the <a href=\"https://github.com/twitter/mysql/wiki/Change-History\">change history</a> and <a href=\"https://github.com/twitter/mysql/wiki\">documentation</a>. If you want to learn more about our usage of MySQL, we will be <a href=\"http://www.percona.com/live/mysql-conference-2012/sessions/gizzard-scale-twitters-mysql-sharding-framework\">speaking about Gizzard</a>, our sharding and replication framework on top of MySQL, at the Percona Live MySQL Conference and Expo on April 12th. Finally, <a href=\"https://github.com/twitter/mysql\">contact us on GitHub</a> or <a href=\"https://github.com/twitter/mysql/issues\">file an issue</a> if you have questions. <br><br> On behalf of the Twitter DBA and DB development teams,<br><br> - Jeremy Cole (<a href=\"https://twitter.com/intent/user?screen_name=jeremycole\">@jeremycole</a>)<br> - Davi Arnaut (<a href=\"https://twitter.com/intent/user?screen_name=darnaut\">@darnaut</a>)</p>",
"date": "2012-04-09T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/mysql-at-twitter",
"domain": "engineering"
},
{
"title": "Security Open House March 29",
"body": "<p>The past few months have been busy for the Twitter security team: we’ve turned on <a href=\"http://engineering/2012/02/securing-your-twitter-experience-with.html\">HTTPS by default</a> for everyone, added great engineers from Whisper Systems and Dasient, and had some stimulating internal discussions about how we can continue to better protect users. We want to share what we’ve been up to and discuss the world of online security, so we’ll be hosting a Security Open House on March 29 here at Twitter HQ. We’ve got a great lineup of speakers to get the conversations going:<br><br> Neil Daswani (<a href=\"https://twitter.com/intent/user?screen_name=neildaswani\">@neildaswani</a>): Online fraud and mobile application abuse<br><br> Jason Wiley (<a href=\"https://twitter.com/intent/user?screen_name=capnwiley\">@capnwiley</a>) &amp; Dino Fekaris (<a href=\"https://twitter.com/intent/user?screen_name=dino\">@dino</a>): Twitter phishing vectors and the fallout<br><br> Neil Matatall (<a href=\"https://twitter.com/intent/user?screen_name=nilematotle\">@nilematotle</a>): Brakeman: detecting security vulnerabilities in Ruby on Rails applications via static analysis<br><br> Come by to meet our Security team, hear about some of our work, and learn about opportunities to join the flock at the first <a href=\"https://twitter.com/hashtag/TwitterSec\">#TwitterSec</a>. Here’s what you need to know to get yourself signed up:</p> \n<blockquote>\n When: Thursday, March 29, 2012; 5:30pm - 9:00pm\n <br>\n <br> Where: Twitter HQ - 795 Folsom Street, San Francisco, CA\n <br>\n <br> Who: Security and privacy engineers\n <br>\n <br> RSVP: Space is limited, so \n <a href=\"http://www.eventbrite.com/event/3190558045\">reserve your spot now</a>. Hope to see you here!\n</blockquote>",
"date": "2012-03-22T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/security-open-house-march-29",
"domain": "engineering"
},
{
"title": "Cassovary: A Big Graph-Processing Library",
"body": "<p>We are open sourcing <a href=\"https://github.com/twitter/cassovary\">Cassovary</a>, a big graph-processing library for the Java Virtual Machine (JVM) written in Scala. Cassovary is designed from the ground up to efficiently handle graphs with billions of edges. It comes with some common node and graph data structures and traversal algorithms. A typical usage is to do large-scale graph mining and analysis.</p> \n<p>At Twitter, Cassovary forms the bottom layer of a stack that we use to power many of our graph-based features, including <a href=\"https://twitter.com/who_to_follow/suggestions\">“Who to Follow”</a> and <a href=\"https://twitter.com/similar_to/twitter\">“Similar to.”</a> We also use it for relevance in <a href=\"http://engineering.twitter.com/2011/05/engineering-behind-twitters-new-search.html\">Twitter Search</a> and the algorithms that determine which Promoted Products users will see. Over time, we hope to bring more non-proprietary logic from some of those product features into Cassovary.</p> \n<p>Please use, fork, and contribute to Cassovary if you can. If you have any questions, ask on the <a href=\"http://www.google.com/url?q=http%3A%2F%2Fgroups.google.com%2Fgroup%2Ftwitter-cassovary\">mailing list</a> or file <a href=\"https://github.com/twitter/cassovary/issues\">issues</a> on GitHub. Also, follow <a href=\"https://twitter.com/cassovary\">@cassovary</a> for updates.</p> \n<p>-Pankaj Gupta (@<a href=\"https://twitter.com/#!/pankaj\">pankaj</a>)</p>",
"date": "2012-03-08T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/cassovary-a-big-graph-processing-library",
"domain": "engineering"
},
{
"title": "Generating Recommendations with MapReduce and Scalding",
"body": "<p>Scalding is an in-house MapReduce framework that Twitter recently open-sourced. Like <a href=\"http://pig.apache.org/\">Pig</a>, it provides an abstraction on top of MapReduce that makes it easy to write big data jobs in a syntax that’s simple and concise. Unlike Pig, Scalding is written in pure Scala — which means all the power of Scala and the JVM is already built-in. No more UDFs, folks!</p> \n<p>At Twitter, our mission is to instantly connect people everywhere to what’s most meaningful to them. With over a hundred million active users creating more than 250 million tweets every day, this means we need to quickly analyze massive amounts of data at scale.</p> \n<p>That’s why we recently open-sourced Scalding, an in-house MapReduce framework built on top of Scala and Cascading.</p> \n<p>In 140: Instead of forcing you to write raw <code>map</code> and <code>reduce</code> functions, Scalding allows you to write natural code like:</p> \n<p>Simple to read, and just as easily run over a 10 line test file as a 10 terabyte data source in Hadoop!</p> \n<p>Like Twitter, Scalding has a powerful simplicity that we love, and in this post we’ll use the example of building a basic recommendation engine to show you why. A couple of notes before we begin:</p> \n<ul>\n <li>Scalding is open-source and lives <a href=\"https://github.com/twitter/scalding\">here on Github</a>.</li> \n <li>For a longer, tutorial-based version of this post (which goes more in-depth into the code and mathematics), see the <a href=\"http://blog.echen.me/2012/02/09/movie-recommendations-and-more-via-mapreduce-and-scalding/\">original blog entry</a>.</li> \n</ul>\n<p>We use Scalding hard and we use it often, for everything from custom ad targeting algorithms to PageRank on the Twitter graph, and we hope you will too. Let’s dive in!</p> \n<h2>Movie similarities</h2> \n<p>Imagine you run an online movie business. You have a rating system in place (people can rate movies with 1 to 5 stars) and you want to calculate similarities between pairs of movies, so that if someone watches The Lion King, you can recommend films like Toy Story.</p> \n<p>One way to define the similarity between two movies is to use their correlation:</p> \n<ul>\n <li>For every pair of movies A and B, find all the people who rated both A and B.</li> \n <li>Use these ratings to form a Movie A vector and a Movie B vector.</li> \n <li>Calculate the correlation between these two vectors.</li> \n <li>Whenever someone watches a movie, you can then recommend the movies most correlated with it.</li> \n</ul>\n<p>Here’s a snippet illustrating the code.</p> \n<p>Notice that Scalding provides higher-level functions like <code>group</code> for you (and many others, too, like <code>join</code> and <code>filter</code>), so that you don’t have to continually rewrite these patterns yourself. What’s more, if there are other abstractions you’d like to add, go ahead! It’s easy to add new functions.</p> \n<h2>Rotten Tomatoes</h2> \n<p>Let’s run this code over some real data. What dataset of movie ratings should we use?</p> \n<div class=\"g-tweet\">\n <blockquote class=\"twitter-tweet\"> \n <p>My review for ‘How to Train Your Dragon’ on Rotten Tomatoes: 4 1/2 stars &gt;<a href=\"http://t.co/YTOKWLEt\">bit.ly/xtw3d3</a></p> — Benjamin West (\n <a href=\"https://twitter.com/intent/user?screen_name=BenTheWest\">@BenTheWest</a>) \n <a href=\"https://twitter.com/BenTheWest/status/171772890121895936\">February 21, 2012</a>\n </blockquote>",
"date": "2012-03-02T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/generating-recommendations-with-mapreduce-and-scalding",
"domain": "engineering"
},
{
"title": "Simple Strategies for Smooth Animation on the iPhone",
"body": "<p>The iPhone was revolutionary for its use of direct manipulation – the feeling that you’re really holding content in your hands and manipulating it with your fingertips. While many mobile platforms have touch, it is the realistic physics and fluid animation of the iPhone that sets it apart from its competitors.</p> \n<p>However, jerky scrolling ruins the experience. The new UI of Twitter for iPhone 4.0 contains many details that could impact performance, so we had to treat 60 frame-per-second animation as a priority. If you are troubleshooting animation performance, this post should provide some useful pointers.</p> \n<h2>A review of layers</h2> \n<p>Animation on iOS is powered by Core Animation layers. Layers are a simple abstraction for working with the GPU. When animating layers, the GPU just transforms surfaces as an extended function of the hardware itself. However, the GPU is <a href=\"http://cocoawithlove.com/2011/03/mac-quartzgl-2d-drawing-on-graphics.html\">not optimized for drawing.</a> Everything in your view’s <code>drawRect:</code> is handled by the CPU, then handed off to the GPU as a texture.</p> \n<p>Animation problems fall into one of those two phases in the pipeline. Either the GPU is being taxed by expensive operations, or the CPU is spending too much time preparing the cell before handing it off to the GPU. The following sections contain simple directives, based on how we addressed each of these challenges.</p> \n<h2>GPU bottlenecks</h2> \n<p>When the GPU is overburdened, it manifests with low, but consistent, framerates. The most common reasons may be excessive compositing, blending, or pixel misalignment. Consider the following Tweet: <a href=\"http://3.bp.blogspot.com/-fIF9Kei_zME/T0PqXhdn6LI/AAAAAAAAADQ/YHoIPdEmvy0/s1600/example%2Bcell.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/simple_strategiesforsmoothanimationontheiphone95.thumb.1280.1280.png\" alt=\"Simple Strategies for Smooth Animation on the iPhone\"></a></p> \n<h3>Use direct drawing</h3> \n<p>A naive implementation of a Tweet cell might include a <code>UILabel</code> for the username, a <code>UILabel</code> for the tweet text, a <code>UIImageView</code> for the avatar, and so on.</p> \n<p><a href=\"http://1.bp.blogspot.com/-OCZPmfhEb68/T0PpVcJ2jZI/AAAAAAAAADE/RGxSC3AKtfY/s1600/naive%2Bcell.jpg\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/simple_strategiesforsmoothanimationontheiphone96.thumb.1280.1280.png\" alt=\"Simple Strategies for Smooth Animation on the iPhone\"></a></p> \n<p>Unfortunately, each view burdens Core Animation with extra compositing.</p> \n<p>Instead, our Tweet cells contain a single view with no subviews; a single<code> drawRect: </code>draws everything.</p> \n<p>We institutionalized direct drawing by creating a generic table view cell class that accepts a block for its <code>drawRect:</code>method. This is, by far, the most commonly used cell in the app.</p> \n<p><a href=\"http://1.bp.blogspot.com/-WUtJZWj4th8/T0PsZ8ZYLRI/AAAAAAAAADo/K-eprkSnckI/s1600/direct%2Bdrawing.jpg\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/simple_strategiesforsmoothanimationontheiphone97.thumb.1280.1280.png\" alt=\"Simple Strategies for Smooth Animation on the iPhone\"></a></p> \n<h3>Avoid blending</h3> \n<p>You’ll notice that Tweets in Twitter for iPhone 4.0 have a drop shadow on top of a subtle textured background. 
This presented a challenge, as blending is expensive.</p> \n<p>We solved this by reducing the area Core Animation has to consider non-opaque, by splitting the shadow areas from the content area of the cell.</p> \n<p>To quickly spot blending, select the Color Blended Layers option of the Core Animation instrument in Instruments. Green areas are opaque; red areas indicate blended surfaces.</p> \n<p><a href=\"http://2.bp.blogspot.com/-GyzdZfO3-2M/T0Pq1QJbOoI/AAAAAAAAADc/nYonoJUdxLM/s1600/blending%2Bexample.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/simple_strategiesforsmoothanimationontheiphone98.thumb.1280.1280.png\" alt=\"Simple Strategies for Smooth Animation on the iPhone\"></a></p> \n<h3>Check pixel alignment</h3> \n<p>Spot the danger in the following code:</p> \n<p><code>CGRect subframe = CGRectMake(x, y, width / 2.0, height / 2.0);</code></p> \n<p>If <code>width</code> is an odd number, then <code>subframe</code> will have a fractional width. Core Animation will accept this, but it will require anti-aliasing, which is expensive. Instead, run <code>floor</code> or <code>ceil</code> on computed values.</p> \n<p>In Instruments, check Color Misaligned Images to hunt for accidental anti-aliasing.</p> \n<h2>Cell preparation bottlenecks</h2> \n<p>The second class of animation problem is called a “pop” and occurs when new cells scroll into view. When a cell is about to appear on screen, it only has 17 milliseconds to provide content before you’ve dropped a frame.</p> \n<h3>Recycle cells</h3> \n<p>As described in the table view documentation, instead of creating and destroying cell objects whenever they appear or disappear, you should recycle cells with the help of <code>dequeueReusableCellWithIdentifier:</code></p> \n<h3>Optimize your <code>drawRect:</code></h3> \n<p>If you are direct drawing and recycling cells, and you still see a pop, check the time spent in your <code>drawRect:</code> with the Core Animation instrument in Instruments. If needed, eliminate “nice to have” details, like subtle gradients.</p> \n<h3>Pre-render if necessary</h3> \n<p>Sometimes, you can’t simplify drawing. The new <a href=\"https://twitter.com/hashtag/Discover\">#Discover</a> tab in Twitter for iPhone 4.0 displays large images in cells. No matter how simple the treatment, scaling and cropping a large image is expensive.</p> \n<p>We knew <a href=\"https://twitter.com/hashtag/Discover\">#Discover</a> had an upper bound of ten stories, so we decided to trade memory for CPU. When we receive a trending story image we pre-render the cell on a low-priority background queue, and store it in a cache. When the cell scrolls into view, we set the cell’s <code>layer.contents</code> to the prebaked <code>CGImage</code>, which requires no drawing.</p> \n<h2>Conclusion</h2> \n<p>All of these optimizations come at the cost of code complexity and developer productivity. So long as you don’t paint yourself into a corner in architecture, you can always apply these optimizations after you’ve written the simplest thing that works and collected actual measurements on hardware.</p> \n<p><a href=\"http://c2.com/cgi/wiki?PrematureOptimization\">Remember: Premature optimization is the root of all evil.</a></p> \n<h2>Acknowledgements</h2> \n<p>- Ben Sandofsky (@<a href=\"https://twitter.com/#!/sandofsky\">sandofsky</a>). Thanks to Ryan Perry (@<a href=\"https://twitter.com/#!/ryfar\">ryfar</a>) for technical review, and to the Twitter mobile team for their input.</p>",
"date": "2012-02-21T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/simple-strategies-for-smooth-animation-on-the-iphone",
"domain": "engineering"
},
{
"title": "Twitter NYC Open House",
"body": "<p>When we opened an office in New York City last October, we were excited to become a part of the city’s growing tech community, with all of its energy and innovation. Since then, we’ve been building out an engineering team in New York. Focused on search and discovery, the team works to find ways to extract value out of the more than 250 million Tweets people send every day.</p> \n<p>We want to share some of the exciting projects that we’ve been working on, so we’re holding the first <a href=\"https://twitter.com/hashtag/TwitterNYC\">#TwitterNYC</a> Engineering Open House. Come by to meet our engineering team, see some of our work, and learn about opportunities to <a href=\"http://twitter.com/jobs\">#jointheflock!</a></p> \n<p>When Thursday, February 16, 2012; 7 pm - 9 pm</p> \n<p>Where Twitter NYC</p> \n<p>Speakers </p> \n<ul>\n <li>Daniel Loreto (@<a href=\"https://twitter.com/#!/DanielLoreto\">DanielLoreto</a>): The Life of a Tweet</li> \n <li>Adam Schuck (@<a href=\"https://twitter.com/#!/AllSchuckUp\">AllSchuckUp</a>): Realtime Search at Twitter</li> \n</ul>\n<p>If you’re interested in attending, please send your name and company affiliation to openhouse@twitter.com. Space is very limited. If we have space for you, you’ll get a confirmation with more details, including the office address. If you don’t get in this time, we’ll notify you about future events.</p>",
"date": "2012-02-08T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/twitter-nyc-open-house",
"domain": "engineering"
},
{
"title": "Join the Flock: Twitter&#39;s International Engineering Open House",
"body": "<p>Next week, we will host our first Open House to present the achievements of the <a href=\"http://translate.twttr.com\">Twitter Translation Center</a>, a community platform for translating Twitter’s products. Since it’s launch in early 2011, we have released Twitter in 22 languages, up from just six in 2010.</p> \n<p>The amazing level of activity in the Twitter Translation Center has driven us to explore new avenues to scale and deliver quality translation to our international users. For instance, we’ve learned how to work with a community of 425,000 translators, who have collectively produced more than one million separate translations and voted close to five million times on those translations. Recognizing this growth, we’ve added a lot of features to make the Translation Center a better platform for community translation; these innovations include forums, translation search, translation memory, glossary management, moderation tools, and spam and cross-site scripting prevention tools.</p> \n<p>We’ve also learnt that introducing a new language requires more than just translations. For each language, we had to ensure that we supported the appropriate date and number formats, that hashtags and URLs could be properly extracted from Tweets in those languages, and that we correctly counted the number of characters in each Tweet.</p> \n<p>In order to deliver quality, we built tools to test localized versions of our applications before a launch. Given that all of our translators are volunteers, we wanted to give our translator community a chance to review and test their output before they released it to our users.</p> \n<h2>What’s next?</h2> \n<p>Twitter’s impact on the world inspires us to look for new ways to connect and interact with our international users. We continue to address the challenges and lessons on how best to serve community localization at Twitter. To hear more about these tools and topics, join us for an evening of technical discussions and networking during our first International Engineering Open House.</p> \n<p>When Thursday, February 2, 2012; 7 pm - 9 pm</p> \n<p>Where Twitter HQ, 795 Folsom St, Suite 600, San Francisco, CA 94107</p> \n<p>Speakers </p> \n<ul>\n <li>Nico Sallembien (@<a href=\"https://twitter.com/#!/nsallembien\">nsallembien</a>): A look at Unicode character distribution in Tweets</li> \n <li>Laura Gomez (@<a href=\"https://twitter.com/#!/laura\">laura</a>): Community Localization and Twitter: Experience, Engagement, and Scalability</li> \n <li>Yoshimasa Niwa (@<a href=\"https://twitter.com/#!/niw\">niw</a>): Software development for Japanese mobile phones</li> \n</ul>\n<p>-@<a href=\"https://twitter.com/#!/international\">international</a></p>",
"date": "2012-01-27T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2012/join-the-flock-twitters-international-engineering-open-house",
"domain": "engineering"
},
{
"title": "SpiderDuck: Twitter&#39;s Real-time URL Fetcher",
"body": "<p>Tweets often contain URLs or links to a variety of content on the web, including images, videos, news articles and blog posts. SpiderDuck is a service at Twitter that fetches all URLs shared in Tweets in real-time, parses the downloaded content to extract metadata of interest and makes that metadata available for other Twitter services to consume within seconds.</p> \n<p>Several teams at Twitter need to access the linked content, typically in real-time, to improve Twitter products. For example:</p> \n<ul>\n <li>Search to index resolved URLs and improve relevance</li> \n <li>Clients to display certain types of media, such as photos, next to the Tweet</li> \n <li>Tweet Button to count how many times each URL has been shared on Twitter</li> \n <li>Trust &amp; Safety to aid in detecting malware and spam</li> \n <li>Analytics to surface a variety of aggregated statistics about links shared on Twitter</li> \n</ul>\n<h2>Background</h2> \n<p>Prior to SpiderDuck, Twitter had a service that resolved all URLs shared in Tweets by issuing HEAD requests and following redirects. While this service was simple and met the needs of the company at the time, it had a few limitations:</p> \n<ul>\n <li>It resolved the URLs but did not actually download the content. The resolution information was stored in an in-memory cache but not persisted durably to disk. This meant that if the in-memory cache instance was restarted, data would be lost.</li> \n <li>It did not implement politeness rules typical of modern bots, for example, rate limiting and following robots.txt directives.</li> \n</ul>\n<p>Clearly, we needed to build a real URL fetcher that overcame the above limitations and would meet the company’s needs in the long term. Our first thought was to use or build on top of an existing open source URL crawler. We realized though that almost all of the available crawlers have two properties that we didn’t need:</p> \n<ul>\n <li>They are recursive crawlers. That is, they are designed to fetch pages and then recursively crawl the links extracted from those pages. Recursive crawling involves significant complexity in crawl scheduling and long term queuing, which isn’t relevant to our use case.</li> \n <li>They are optimized for large batch crawls. What we needed was a fast, real-time URL fetcher.</li> \n</ul>\n<p>Therefore, we decided to design a new system that could meet Twitter’s real-time needs and scale horizontally with its growth. Rather than reinvent the wheel, we built the new system largely on top of open source building blocks, thus still leveraging the contributions of the open source community.</p> \n<p>This is typical of many engineering problems at Twitter – while they resemble problems at other large Internet companies, the requirement that everything work in real-time introduces unique and interesting challenges.</p> \n<h2>System Overview</h2> \n<p>Here’s an overview of how SpiderDuck works. 
The following diagram illustrates its main components.</p> \n<p><a href=\"http://3.bp.blogspot.com/-C99QChtEdoY/TsGF6ZUCHEI/AAAAAAAAACU/23tXAxtAHe4/s1600/SpiderDuck-Architecture.jpg\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/spiderduck_twittersreal-timeurlfetcher95.thumb.1280.1280.png\" alt=\"SpiderDuck: Twitter's Real-time URL Fetcher\"></a>The SpiderDuck architecture</p> \n<p>Kestrel: This is a <a href=\"https://github.com/robey/kestrel\">message queuing system</a> widely used at Twitter for queuing incoming Tweets.</p> \n<p>Schedulers: These jobs determine whether to fetch a URL, schedule the fetch, and follow redirect hops, if any. After the fetch, they parse the downloaded content, extract metadata, and write the metadata to the Metadata Store and the raw content to the Content Store. Each scheduler performs its work independently of the others; that is, any number of schedulers can be added to horizontally scale the system as Tweet and URL volume grows.</p> \n<p>Fetchers: These are <a href=\"http://thrift.apache.org/\">Thrift</a> servers that maintain short-term fetch queues of URLs, issue the actual HTTP fetch requests and implement rate limiting and robots.txt processing. Like the Schedulers, Fetchers scale horizontally with fetch rate.</p> \n<p>Memcached: This is a <a href=\"http://memcached.org/\">distributed cache</a> used by the fetchers to temporarily store robots.txt files.</p> \n<p>Metadata Store: This is a <a href=\"http://Cassandra.apache.org/\">Cassandra</a>-based distributed hash table that stores page metadata and resolution information keyed by URL, as well as fetch status for every URL recently encountered by the system. This store serves clients across Twitter that need real-time access to URL metadata.</p> \n<p>Content Store: This is an <a href=\"http://Hadoop.apache.org/hdfs/\">HDFS</a> cluster for archiving downloaded content and all fetch information.</p> \n<p>We will now describe the two main components of SpiderDuck — the URL Scheduler and the URL Fetcher — in more detail.</p> \n<h2>The URL Scheduler</h2> \n<p>The following diagram illustrates the various stages of processing in the SpiderDuck Scheduler.</p> \n<p><a href=\"http://3.bp.blogspot.com/-svA2cS0hmqE/TsGGMIvzOaI/AAAAAAAAACg/MWbjyNxrEiQ/s1600/spiderduck_blog_post.Scheduler.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/spiderduck_twittersreal-timeurlfetcher96.thumb.1280.1280.png\" alt=\"SpiderDuck: Twitter's Real-time URL Fetcher\"></a>The URL Scheduler</p> \n<p>Like most of SpiderDuck, the Scheduler is built on top of an open source asynchronous RPC framework developed at Twitter called <a href=\"http://engineering.twitter.com/2011/08/finagle-protocol-agnostic-rpc-system.html\">Finagle</a>. (In fact, this was one of the earliest projects to utilize Finagle.) Each box in the diagram above, except for the Kestrel Reader, is a Finagle Filter – an abstraction that allows a sequence of processing stages to be easily composed into a fully asynchronous pipeline. Being fully asynchronous allows SpiderDuck to handle high throughput with a small, fixed number of threads.</p> \n<p>The Kestrel Reader continuously polls for new Tweets. As Tweets come in, they are sent to the Tweet Processor, which extracts URLs from them. Each URL is then sent to the Crawl Decider stage. This stage reads the Fetch Status of the URL from the Metadata Store to check if and when SpiderDuck has seen the URL before. 
The Crawl Decider then decides whether the URL should be fetched based on a pre-defined fetch policy (that is, do not fetch if SpiderDuck has fetched it in the past X days). If the Decider determines to not fetch the URL, it logs the status to indicate that processing is complete. If it determines to fetch the URL, it sends the URL to the Fetcher Client stage.</p> \n<p>The Fetcher Client stage uses a client library to talk to the Fetchers. The client library implements the logic that determines which Fetcher will fetch a given URL; it also handles the processing of redirect hops. (It is typical to have a chain of redirects because URLs posted on Twitter are often shortened.) A context object is associated with each URL flowing through the Scheduler. The Fetcher Client adds all fetch information including status, downloaded headers, and content into the context object and passes it on to the Post Processor. The Post Processor runs the extracted page content through a metadata extractor library, which detects page encoding and parses the page with an open-source HTML5 parser. The extractor library implements a set of heuristics to retrieve page metadata such as title, description, and representative image. The Post Processor then writes all the metadata and fetch information into the Metadata Store. If necessary, the Post Processor can also schedule a set of dependent fetches. An example of dependent fetches is embedded media, such as images.</p> \n<p>After post-processing is complete, the URL context object is forwarded to the next stage that logs all the information, including full content, to the Content Store (HDFS) using an open source log aggregator called <a href=\"https://github.com/facebook/Scribe\">Scribe</a>. This stage also notifies interested listeners that the URL processing is complete. The notification uses a simple Publish-Subscribe model, which is implemented using Kestrel’s fanout queues.</p> \n<p>All processing steps are executed asynchronously – no thread ever waits for a step to complete. All state related to each URL in flight is stored in the context object associated with it, which makes the threading model very simple. The asynchronous implementation also benefits from the convenient abstractions and constructs provided by Finagle and the <a href=\"https://github.com/twitter/util/\">Twitter Util libraries.</a></p> \n<h2>The URL Fetcher</h2> \n<p>Let’s take a look at how a Fetcher processes a URL.</p> \n<p><a href=\"http://2.bp.blogspot.com/-8skikMiGAUE/TsGGc1UXjEI/AAAAAAAAACs/LEgn4_6_lmE/s1600/spiderduck_blog_postFetcher.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/spiderduck_twittersreal-timeurlfetcher97.thumb.1280.1280.png\" alt=\"SpiderDuck: Twitter's Real-time URL Fetcher\"></a>The URL Fetcher</p> \n<p>The Fetcher receives the URL through its Thrift interface. After basic validation, the Thrift handler passes the URL to a Request Queue Manager, which assigns it to the appropriate Request Queue. A scheduled task drains each Request Queue at a fixed rate. Once the URL is pulled off of its queue, it is sent to the HTTP Service for processing. The HTTP service, built on top of Finagle, first checks if the host associated with the URL is already in its cache. If not, it creates a Finagle client for it and schedules a robots.txt fetch. After the robots.txt is downloaded, the HTTP service fetches the permitted URL. 
The robots.txt file itself is cached, both in the in-process Host Cache and in Memcached, to prevent re-fetching it for every new URL that the Fetcher encounters from that host.</p> \n<p>Tasks called Vultures periodically examine the Request Queues and Host Cache to find queues and hosts that haven’t been used for a period of time; when found, they are deleted. The Vultures also report useful stats through logs and the <a href=\"http://twitter.github.com/commons\">Twitter Commons</a> stats exporting library.</p> \n<p>The Fetcher’s Request Queue serves an important purpose: rate limiting. SpiderDuck rate limits outgoing HTTP fetch requests per-domain so as not to overload web servers receiving requests. For accurate rate limiting, SpiderDuck ensures each Request Queue is assigned to exactly one Fetcher at any point in time, with automatic failover to a different Fetcher in case the assigned Fetcher fails. A cluster suite called <a href=\"http://clusterlabs.org/wiki/Pacemaker\">Pacemaker</a> assigns Request Queues to Fetchers and manages failover. URLs are assigned to Request Queues based on their domains by a Fetcher client library. The default rate limit used for all web sites can be overridden on a per-domain basis, as needed. The Fetchers also implement queue backoff logic. That is, if URLs are coming in faster than they can be drained, they reject requests to indicate to the client to back off or take other suitable action.</p> \n<p>For security purposes, the Fetchers are deployed in a special zone in Twitter data centers called a <a href=\"http://en.wikipedia.org/wiki/DMZ_(computing)\">DMZ.</a> This means that the Fetchers cannot access Twitter’s production clusters and services. Hence, it is all the more important to keep them lightweight and self-contained, a principle which guided many aspects of the design.</p> \n<h2>How Twitter uses SpiderDuck</h2> \n<p>Twitter services consume SpiderDuck data in a number of ways. Most query the Metadata Store directly to retrieve URL metadata (for example, page title) and resolution information (that is, the canonical URL after redirects). The Metadata Store is populated in real-time, typically seconds after the URL is tweeted. These services do not talk directly to Cassandra, but instead to SpiderDuck Thrift servers that proxy the requests. This intermediate layer provides SpiderDuck the flexibility to transparently switch storage systems, if necessary. It also supports an avenue for higher level API abstractions than what would be possible if the services interacted directly with Cassandra.</p> \n<p>Other services periodically process SpiderDuck logs in HDFS to generate aggregate stats for Twitter’s internal metrics dashboards or conduct other types of batch analyses. The dashboards help us answer questions like “How many images are shared on Twitter each day?” “What news sites do Twitter users most often link to?” and “How many URLs did we fetch yesterday from this specific website?”</p> \n<p>Note that services don’t typically tell SpiderDuck what to fetch; SpiderDuck fetches all URLs from incoming Tweets. Instead, services query information related to URLs after it becomes available. SpiderDuck also allows services to make requests directly to the Fetchers to fetch arbitrary content via HTTP (thus benefiting from our data center setup, rate limiting, robots.txt support and so on), but this use case is not common.</p> \n<h2>Performance numbers</h2> \n<p>SpiderDuck processes several hundred URLs every second. 
A majority of these are unique over the time window defined by SpiderDuck’s fetch policy, and hence get fetched. For URLs that get fetched, SpiderDuck’s median processing latency is under two seconds, and the 99th percentile processing latency is under five seconds. This latency is measured from Tweet creation time, which means that in under five seconds after a user clicked “Tweet,” the URL in that Tweet is extracted, prepared for fetch, all redirect hops are retrieved, the content is downloaded and parsed, and the metadata is extracted and made available to clients via the Metadata Store. Most of that time is spent either in the Fetcher Request Queues (due to rate limiting) or in actually fetching from the external web server. SpiderDuck itself adds no more than a few hundred milliseconds of processing overhead, most of which is spent in HTML parsing.</p> \n<p>SpiderDuck’s Cassandra-based Metadata Store handles close to 10,000 requests per second. Each request is typically for a single URL or a small batch (around 20 URLs), but it also processes large batch requests (200-300 URLs). The store’s median latency for reads is 4-5 milliseconds, and its 99th percentile is 50-60 milliseconds.</p> \n<h2>Acknowledgements</h2> \n<p>The SpiderDuck core team consisted of the following folks: Abhi Khune, Michael Busch, Paul Burstein, Raghavendra Prabhu, Tian Wang and Yi Zhuang. In addition, we’d like to acknowledge the following folks, spanning many teams across the company, who contributed to the project either directly, by helping with components SpiderDuck relies on (for example, Cassandra, Finagle, Pacemaker and Scribe) or with its unique data center setup: Alan Liang, Brady Catherman, Chris Goffinet, Dmitriy Ryaboy, Gilad Mishne, John Corwin, John Sirois, Jonathan Boulle, Jonathan Reichhold, Marius Eriksen, Nick Kallen, Ryan King, Samuel Luckenbill, Steve Jiang, Stu Hood and Travis Crawford. Thanks also to the entire Twitter Search team for their invaluable design feedback and support. If you want to work on projects like this, <a href=\"http://twitter.com/jobs\">join the flock!</a><br><br> - Raghavendra Prabhu (@<a href=\"http://twitter.com/omrvp\">omrvp</a>), Software Engineer</p>",
"date": "2011-11-14T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2011/spiderduck-twitters-real-time-url-fetcher",
"domain": "engineering"
},
{
"title": "Twitter’s mobile web app delivers performance",
"body": "<p>As the number of people using Twitter has grown, we’ve wanted to make sure that we deliver the best possible experience to users, regardless of platform or device. Since <a href=\"http://twitter.com/\">twitter.com</a> is not optimized for smaller screens or touch interactions familiar to many smart phones, we decided to build a cross-platform web application that felt native in its responsiveness and speed for those who prefer accessing Twitter on their phone’s or the tablet’s browser.</p> \n<h2>A better mobile user experience</h2> \n<p>When building <a href=\"http://mobile.twitter.com/\">mobile.twitter.com</a> as a web client, we used many of the tools offered in HTML5, CSS3, and JavaScript to develop an application that has the same look, feel, and performance of a native mobile application. This post focuses on four primary areas of the mobile app architecture that enabled us to meet our performance and usability goals:</p> \n<ul>\n <li>event listeners</li> \n <li>scroll views</li> \n <li>templates</li> \n <li>storage</li> \n</ul>\n<p><a href=\"http://2.bp.blogspot.com/-DdgfWY6hxvE/Tm-VFWSbpzI/AAAAAAAAACI/8Ip6S8oi0w4/s1600/Mobile%2Bblog%2B2011-09-08%2Bat%2B10.31.03%2BPM%25282%2529.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/twitter_s_mobilewebappdeliversperformance95.thumb.1280.1280.png\" alt=\"Twitter’s mobile web app delivers performance\"></a> Twitter’s mobile app architecture</p> \n<h3>Event listener</h3> \n<p>For the Twitter application to feel native, responses have to be immediate. The web application delivers this experience by using event listeners in its code.</p> \n<p>Traditionally, Javascript uses DOM-only events such as onclick, mouseover, mouseout, focus, and blur to render a page. However, because Twitter has so many unique points of interaction, we decided to optimize the resources presented to us with mobile devices. The web application we developed uses event listeners throughout the code. These syntactic events, loaded with the JavaScript on the client, listen for unique triggers that are fired, following the users’ interactions. When users retweet or favorite Tweets, the JavaScript listens for those events and responds accordingly throughout the application, updating screen views where necessary.</p> \n<p>The client-side JavaScript on the mobile application handles communication with Twitter through the Twitter API. To illustrate the use of event listeners, let’s look at how a Retweet works. When a user clicks the Retweet button on the UI, the system fires a click event that fires a Retweet request through the API.</p> \n<p>The web client application listens for an event like a Retweet and updates the rest of the application when it receives it.</p> \n<p>When that Retweet event is successful, a return event fires off a signal and the web app listens for a successful Retweet notification. When it receives the notification, the rest of the application updates appropriately.</p> \n<p>The web app’s architecture ensures that while the user-facing layer for the various web apps may differ, the app reuses the custom event listeners throughout, thus making it possible to scale across all devices. 
For instance, both the iPhone and the iPad use the same views and modules, but in different navigation contexts, while the event architecture drives the rest of the application.</p> \n<h3>ScrollViews</h3> \n<p>Mobile browsers use a viewport on top of the browser’s window to allow the user to zoom and scroll the content of an entire page. As helpful as this is, the viewport prevents web pages from using fixed-positioned elements and custom scrolling gestures. Both of these provide a better user experience because the app header is fixed and you can fit more content in a smaller area.</p> \n<p>We worked around the limitations of native scrolling by writing a ScrollView component that allows users to scroll the content using JavaScript with CSS Transforms and Transitions. The CSS Transforms use the device’s GPU to mimic the browser’s viewport.</p> \n<p>ScrollView adds a scrolling element and a wrapper container to the element that you wish to scroll. The wrapper container has a fixed width and height so that the inner contents can overflow. The JavaScript calculates the number of pixels that overflow and moves the scroll element, using CSS Transforms.</p> \n<p>ScrollView listens for three events, <code>onTouchStart</code>, <code>onTouchMove</code>, and <code>onTouchEnd</code>, to render a smooth animation:</p> \n<p><code>onTouchStart</code></p> \n<p>The mobile site stores the initial touch position, timestamp, and other variables that it will use later to calculate the distance and velocity of the scroll.</p> \n<p><code>onTouchMove</code></p> \n<p>Next, the web app simply moves the scroll element by the delta between the start and the current positions.</p> \n<p><code>onTouchEnd</code></p> \n<p>Finally, the web app checks whether the scroll element has moved. If there was no movement, the application fires a click event that stops the scrolling action. If the scroll element moved, it calculates the distance and the speed to generate inertial scrolling, and fires a timer.</p> \n<p>When the timer fires, the application uses CSS Transforms to move the scroll element to the new position while it decreases the velocity logarithmically. Once the velocity reaches a minimum speed, the application cancels the timer and completes the animation. During this process, it takes key coordinates into account to calculate the elasticity when the user scrolls past the lower or upper boundary of the scroll element.</p> \n<p>ScrollView is used to specify which content is scrollable. It can also be used to fix the navigation header to the top of the window and to implement Pull-To-Refresh and infinite Tweet timelines.</p> \n<h3>Templates</h3> \n<p>One of the many customized solutions unique to Twitter and its user experience is a templating system. Templating is a two-pass process. During the first pass, the app expands the templates and marks the places in those resulting strings where dynamic data needs to go. The app then caches the results of the first pass. When it does a second pass to add dynamic data, the app references the cache, delivering a substantial performance benefit.</p> \n<h3>Efficient storage</h3> \n<p>In addition to custom events, we reexamined the use of storage available to the web app from the native browser. Since 15 percent of all mobile applications are launched when the device is offline, the solution needed to cover both online and offline use.
Twitter’s new mobile web app makes use of HTML5’s app cache, which allows you to specify which files the browser should cache and make available to offline users. Using app cache also helps limit the amount of network activity. You can specify in your manifest file what to store; these include items such as the master index file, sprites, and other assets. When a user loads a page, the web app shows the assets from app cache; it stores the new assets when the manifest gets updated. This ensures the web app can be used even when it is offline, since an updated manifest is always waiting in the cache.</p> \n<p>The web app also uses local storage for simple items, such as user settings, user information, and strings, that are persistent throughout the application for immediate access. It uses a SQL database to handle Tweets and Profiles. Within the schema, each storage database gets a name based on the user, allowing for very quick joins between tables. Separate user tables allow for encapsulation and provide the ability to bundle data securely, by user. Given the growing use of shared devices like the iPad, especially in multilingual settings, this design allows two people using different languages on the same device to each receive their own translated strings, cached per user.</p> \n<p>In addition to using storage elements of the HTML5 spec, Twitter’s mobile application also makes use of some of the best tools of CSS3. This list includes:</p> \n<ul>\n <li>Flex box model</li> \n <li>Gradients</li> \n <li>Shadows</li> \n <li>3D transforms</li> \n <li>Transitions</li> \n <li>Animations</li> \n</ul>\n<h2>Future direction</h2> \n<p>The event framework gives us a scalable way to grow this product over time. Our goal is to add support for new devices as well as build new user-facing features and elements. We will invest in both native applications and the web. In cases where we can or should go native, we will, but in many cases we believe our web app provides an optimal approach for serving a broad set of users.</p> \n<h2>Acknowledgements</h2> \n<p>Twitter’s HTML5 mobile application was developed by Manuel Deschamps (@<a href=\"https://twitter.com/#!/manuel\">manuel</a>) and designed by Bryan Haggerty (@<a href=\"https://twitter.com/#!/bhaggs\">bhaggs</a>). Mark Percival (@<a href=\"https://twitter.com/#!/mdp\">mdp</a>) contributed to the coding of the mobile architecture.</p>",
"date": "2011-09-14T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2011/twitter-s-mobile-web-app-delivers-performance",
"domain": "engineering"
},
{
"title": "Finagle: A Protocol-Agnostic RPC System",
"body": "<p><a href=\"http://twitter.github.io/finagle/\">Finagle</a> is a protocol-agnostic, asynchronous RPC system for the JVM that makes it easy to build robust clients and servers in Java, Scala, or any JVM-hosted language.</p> \n<p>Rendering even the simplest web page on <a href=\"http://www.twitter.com\">twitter.com</a> requires the collaboration of dozens of network services speaking many different protocols. For example, in order to render the home page, the application issues requests to the Social Graph Service, Memcached, databases, and many other network services. Each of these speaks a different protocol: Thrift, Memcached, MySQL, and so on. Additionally, many of these services speak to other services — they are both servers and clients. The Social Graph Service, for instance, provides a Thrift interface but consumes from a cluster of MySQL databases.</p> \n<p>In such systems, a frequent cause of outages is poor interaction between components in the presence of failures; common failures include crashed hosts and extreme latency variance. These failures can cascade through the system by causing work queues to back up, TCP connections to churn, or memory and file descriptors to become exhausted. In the worst case, the user sees a Fail Whale.</p> \n<h2>Challenges of building a stable distributed system</h2> \n<p>Sophisticated network servers and clients have many moving parts: failure detectors, load-balancers, failover strategies, and so on. These parts need to work together in a delicate balance to be resilient to the varieties of failure that occur in a large production system.</p> \n<p>This is made especially difficult by the many different implementations of failure detectors, load-balancers, and so on, per protocol. For example, the implementation of the back-pressure strategies for Thrift differ from those for HTTP. Ensuring that heterogeneous systems converge to a stable state during an incident is extremely challenging.</p> \n<h2>Our approach</h2> \n<p>We set out to develop a single implementation of the basic components of network servers and clients that could be used for all of our protocols. <a href=\"https://github.com/twitter/finagle\">Finagle</a> is a protocol-agnostic, asynchronous Remote Procedure Call (RPC) system for the Java Virtual Machine (JVM) that makes it easy to build robust clients and servers in Java, Scala, or any JVM-hosted language. 
Finagle supports a wide variety of request/response-oriented RPC protocols and many classes of streaming protocols.</p> \n<p>Finagle provides a robust implementation of:</p> \n<ul>\n <li>connection pools, with throttling to avoid TCP connection churn;</li> \n <li>failure detectors, to identify slow or crashed hosts;</li> \n <li>failover strategies, to direct traffic away from unhealthy hosts;</li> \n <li>load-balancers, including “least-connections” and other strategies; and</li> \n <li>back-pressure techniques, to defend servers against abusive clients and dogpiling.</li> \n</ul>\n<p>Additionally, Finagle makes it easier to build and deploy a service that</p> \n<ul>\n <li>publishes standard statistics, logs, and exception reports;</li> \n <li>supports distributed tracing (a la Dapper) across protocols;</li> \n <li>optionally uses ZooKeeper for cluster management; and</li> \n <li>supports common sharding strategies.</li> \n</ul>\n<p>We believe our work has paid off — we can now write and deploy a network service with much greater ease and safety.</p> \n<h2>Finagle at Twitter</h2> \n<p>Today, Finagle is deployed in production at Twitter in several front- and back-end serving systems, including our URL crawler and HTTP Proxy. We plan to continue deploying Finagle more widely.</p> \n<p><a href=\"http://4.bp.blogspot.com/-riUycEXusDA/TlU85PnE_dI/AAAAAAAAAB4/ZDjz80Qu7NU/s1600/Finagle%2BDiagram.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/finagle_a_protocol-agnosticrpcsystem95.thumb.1280.1280.png\" alt=\"Finagle: A Protocol-Agnostic RPC System\"></a> A Finagle-based architecture (under development)</p> \n<p>The diagram illustrates a future architecture that uses Finagle pervasively. For example, the User Service is a Finagle server that uses a Finagle memcached client, and speaks to a Finagle Kestrel service.</p> \n<h2>How Finagle works</h2> \n<p>Finagle is flexible and easy to use because it is designed around a few simple, composable primitives: <code>Futures</code>, <code>Services</code>, and <code>Filters</code>.</p> \n<h3><code>Future</code> objects</h3> \n<p>In Finagle, <code>Future</code> objects are the unifying abstraction for all asynchronous computation. A <code>Future</code> represents a computation that may not yet have completed and that can either succeed or fail. The two most basic ways to use a <code>Future</code> are to:</p> \n<ul>\n <li>block and wait for the computation to return</li> \n <li>register a callback to be invoked when the computation eventually succeeds or fails</li> \n</ul> \n<h3>Future callbacks</h3> \n<p>In cases where execution should continue asynchronously upon completion of a computation, you can specify a success and a failure callback. <a href=\"https://gist.github.com/1166301\">Callbacks</a> are registered via the <code>onSuccess</code> and <code>onFailure</code> methods:</p> \n<pre class=\"brush:scala;first-line:1;\">val request: HttpRequest =\n new DefaultHttpRequest(HTTP_1_1, GET, \"/\")\nval responseFuture: Future[HttpResponse] = client(request)\n\nresponseFuture onSuccess { response =&gt;\n println(response)\n} onFailure { exception =&gt;\n println(exception)\n}\n</pre> \n<h3>Composing <code>Futures</code></h3> \n<p><code>Futures</code> can be combined and transformed in interesting ways, leading to the kind of compositional behavior commonly seen in functional programming languages.
For instance, you can convert a <code>Future[String]</code> to a <code>Future[Int]</code> by using <code>map</code>:</p> \n<pre class=\"brush:scala;first-line:1;\">val stringFuture: Future[String] = Future(\"1\")\nval intFuture: Future[Int] = stringFuture map { string =&gt;\n string.toInt\n}\n</pre> \n<p>Similarly, you can use <code>flatMap</code> to easily pipeline a sequence of <code>Futures</code>:</p> \n<pre class=\"brush:scala;first-line:1;\">val authenticatedUser: Future[User] =\n User.authenticate(email, password)\n\nval lookupTweets: Future[Seq[Tweet]] =\n authenticatedUser flatMap { user =&gt;\n Tweet.findAllByUser(user)\n }\n</pre> \n<p>In this example, <code>User.authenticate()</code> is performed asynchronously; <code>Tweet.findAllByUser()</code> is invoked on its eventual result. This is alternatively expressed in Scala, using the <code>for</code> statement:</p> \n<pre class=\"brush:scala;first-line:1;\">for {\n user &lt;- User.authenticate(email, password)\n tweets &lt;- Tweet.findAllByUser(user)\n} yield tweets\n</pre> \n<p>Handling errors and exceptions is very easy when <code>Futures</code> are pipelined using <code>flatMap</code> or the <code>for</code> statement. In the above example, if <code>User.authenticate()</code> asynchronously raises an exception, the subsequent call to <code>Tweet.findAllByUser()</code> never happens. Instead, the result of the pipelined expression is still of the type <code>Future[Seq[Tweet]]</code>, but it contains the exceptional value rather than tweets. You can respond to the exception using the <code>onFailure</code> callback or other compositional techniques.</p> \n<p>A nice property of <code>Futures</code>, as compared to other asynchronous programming techniques (such as the continuation-passing style), is that you can easily write clear and robust asynchronous code, even with more sophisticated operations such as scatter/gather:</p> \n<pre class=\"brush:scala;first-line:1;\">val severalFutures: Seq[Future[Int]] =\n Seq(Tweet.find(1), Tweet.find(2), …)\nval combinedFuture: Future[Seq[Int]] =\n Future.collect(severalFutures)\n</pre> \n<h3><code>Service</code> objects</h3> \n<p>A <code>Service</code> is a function that receives a request and returns a <code>Future</code> object as a response. Note that both clients and servers are represented as <code>Service</code> objects.</p> \n<p>To create a <code>Server</code>, you extend the abstract <code>Service</code> class and listen on a port. Here is a simple HTTP server listening on port 10000:</p> \n<pre class=\"brush:scala;first-line:1;\">val service = new Service[HttpRequest, HttpResponse] {\n def apply(request: HttpRequest) =\n Future(new DefaultHttpResponse(HTTP_1_1, OK))\n}\n\nval address = new InetSocketAddress(10000)\n\nval server: Server[HttpRequest, HttpResponse] = ServerBuilder()\n .name(\"MyWebServer\")\n .codec(Http())\n .bindTo(address)\n .build(service)\n</pre> \n<p>Building an HTTP client is even easier:</p> \n<pre class=\"brush:scala;first-line:1;\">val client: Service[HttpRequest, HttpResponse] = ClientBuilder()\n .codec(Http())\n .hosts(address)\n .build()\n\n// Issue a request, get a response:\nval request: HttpRequest =\n new DefaultHttpRequest(HTTP_1_1, GET, \"/\")\n\nclient(request) onSuccess { response =&gt;\n println(\"Received response: \" + response)\n}\n</pre> \n<h3><code>Filter</code> objects</h3> \n<p><code>Filters</code> are a useful way to isolate distinct phases of your application into a pipeline.
For example, you may need to handle exceptions, authorization, and so forth before your Service responds to a request.</p> \n<p>A <code>Filter</code> wraps a <code>Service</code> and, potentially, converts the input and output types of the Service to other types. In other words, a <code>Filter</code> is a <code>Service</code> transformer. Here is a filter that uses an asynchronous authenticator service to ensure an HTTP request has valid OAuth credentials:</p> \n<pre class=\"brush:scala;first-line:1;\">class RequireAuthentication(a: Authenticator) extends Filter[...] {\n def apply(\n request: Request,\n continue: Service[AuthenticatedRequest, HttpResponse]\n ) = {\n a.authenticate(request) flatMap {\n case AuthResult(OK, passport) =&gt;\n continue(AuthenticatedRequest(request, passport))\n case AuthResult(Error(code), _) =&gt;\n Future.exception(new RequestUnauthenticated(code))\n }\n }\n}\n</pre> \n<p>A <code>Filter</code> then decorates a <code>Service</code>, as in this example:</p> \n<pre class=\"brush:scala;first-line:1;\">val baseService = new Service[HttpRequest, HttpResponse] {\n def apply(request: HttpRequest) =\n Future(new DefaultHttpResponse(HTTP_1_1, OK))\n}\n\nval authenticate = new RequireAuthentication(…)\nval handleExceptions = new HandleExceptions(...)\n\nval decoratedService: Service[HttpRequest, HttpResponse] =\n handleExceptions andThen authenticate andThen baseService\n</pre> \n<p>Finagle is an open source project, available under the Apache License, Version 2.0. Source code and documentation are available on <a href=\"https://github.com/twitter/finagle\">GitHub.</a></p> \n<h2>Acknowledgements</h2> \n<p>Finagle was originally conceived by Marius Eriksen and Nick Kallen. Other key contributors are Arya Asemanfar, David Helder, Evan Meagher, Gary McCue, Glen Sanford, Grant Monroe, Ian Ownbey, Jake Donham, James Waldrop, Jeremy Cloud, Johan Oskarsson, Justin Zhu, Raghavendra Prabhu, Robey Pointer, Ryan King, Sam Whitlock, Steve Jenson, Wanli Yang, Wilhelm Bierbaum, William Morgan, Abhi Khune, and Srini Rajagopal.</p>",
"date": "2011-08-19T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2011/finagle-a-protocol-agnostic-rpc-system",
"domain": "engineering"
},
{
"title": "A Storm is coming: more details and plans for release",
"body": "<p>We’ve received a lot of questions about what’s going to happen to Storm now that BackType <a href=\"http://blog.backtype.com/2011/07/backtype-has-been-acquired-by-twitter/\">has been acquired</a> by Twitter. I’m pleased to announce that I will be releasing Storm at <a href=\"https://thestrangeloop.com/\">Strange Loop</a> on September 19th! Check out the <a href=\"https://thestrangeloop.com/sessions/storm-twitters-scalable-realtime-computation-system\">session info</a> for more details.</p> \n<p>In my <a href=\"http://tech.backtype.com/preview-of-storm-the-hadoop-of-realtime-proce\">preview post</a> about Storm, I discussed how Storm can be applied to a huge variety of realtime computation problems. In this post, I’ll give more details on Storm and what it’s like to use.</p> \n<p>Here’s a recap of the three broad use cases for Storm:</p> \n<ol>\n <li>Stream processing: Storm can be used to process a stream of new data and update databases in realtime. Unlike the standard approach of doing stream processing with a network of queues and workers, Storm is fault-tolerant and scalable.</li> \n <li>Continuous computation: Storm can do a continuous query and stream the results to clients in realtime. An example is streaming trending topics on Twitter into browsers. The browsers will have a realtime view on what the trending topics are as they happen.</li> \n <li>Distributed RPC: Storm can be used to parallelize an intense query on the fly. The idea is that your Storm topology is a distributed function that waits for invocation messages. When it receives an invocation, it computes the query and sends back the results. Examples of Distributed RPC are parallelizing search queries or doing set operations on large numbers of large sets.</li> \n</ol>\n<p>The beauty of Storm is that it’s able to solve such a wide variety of use cases with just a simple set of primitives.</p> \n<h2>Components of a Storm cluster</h2> \n<p>A Storm cluster is superficially similar to a Hadoop cluster. Whereas on Hadoop you run “MapReduce jobs”, on Storm you run “topologies”. “Jobs” and “topologies” themselves are very different — one key difference is that a MapReduce job eventually finishes, whereas a topology processes messages forever (or until you kill it).</p> \n<p>There are two kinds of nodes on a Storm cluster: the master node and the worker nodes. The master node runs a daemon called “Nimbus” that is similar to Hadoop’s “JobTracker”. Nimbus is responsible for distributing code around the cluster, assigning tasks to machines, and monitoring for failures.</p> \n<p>Each worker node runs a daemon called the “Supervisor”. The supervisor listens for work assigned to its machine and starts and stops worker processes as necessary based on what Nimbus has assigned to it. Each worker process executes a subset of a topology; a running topology consists of many worker processes spread across many machines.</p> \n<p><a href=\"http://1.bp.blogspot.com/-ZMLveylidh0/TjrtEuk6DtI/AAAAAAAAAAg/aBkm8nVzwT4/s1600/storm-cluster.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/a_storm_is_comingmoredetailsandplansforrelease95.thumb.1280.1280.png\" alt=\"A Storm is coming: more details and plans for release\"></a></p> \n<p>All coordination between Nimbus and the Supervisors is done through a <a href=\"http://zookeeper.apache.org/\">Zookeeper</a> cluster. Additionally, the Nimbus daemon and Supervisor daemons are fail-fast and stateless; all state is kept in Zookeeper or on local disk. 
This means you can kill -9 Nimbus or the Supervisors and they’ll start back up like nothing happened. This design leads to Storm clusters being incredibly stable. We’ve had topologies running for months without requiring any maintenance.</p> \n<h2>Running a Storm topology</h2> \n<p>Running a topology is straightforward. First, you package all your code and dependencies into a single jar. Then, you run a command like the following:</p> \n<p><code>storm jar all-my-code.jar backtype.storm.MyTopology arg1 arg2</code></p> \n<p>This runs the class backtype.storm.MyTopology with the arguments arg1 and arg2. The main function of the class defines the topology and submits it to Nimbus. The storm jar part takes care of connecting to Nimbus and uploading the jar.</p> \n<p>Since topology definitions are just Thrift structs, and Nimbus is a Thrift service, you can create and submit topologies using any programming language. The above example is the easiest way to do it from a JVM-based language.</p> \n<h2>Streams and Topologies</h2> \n<p>Let’s dig into the abstractions Storm exposes for doing scalable realtime computation. After I go over the main abstractions, I’ll tie everything together with a concrete example of a Storm topology.</p> \n<p>The core abstraction in Storm is the “stream”. A stream is an unbounded sequence of tuples. Storm provides the primitives for transforming a stream into a new stream in a distributed and reliable way. For example, you may transform a stream of tweets into a stream of trending topics.</p> \n<p>The basic primitives Storm provides for doing stream transformations are “spouts” and “bolts”. Spouts and bolts have interfaces that you implement to run your application-specific logic.</p> \n<p>A spout is a source of streams. For example, a spout may read tuples off of a <a href=\"https://github.com/robey/kestrel\">Kestrel</a> queue and emit them as a stream. Or a spout may connect to the Twitter API and emit a stream of tweets.</p> \n<p>A bolt does single-step stream transformations. It creates new streams based on its input streams. Complex stream transformations, like computing a stream of trending topics from a stream of tweets, require multiple steps and thus multiple bolts.</p> \n<p>Multi-step stream transformations are packaged into a “topology”, which is the top-level abstraction that you submit to Storm clusters for execution. A topology is a graph of stream transformations where each node is a spout or bolt. Edges in the graph indicate which bolts are subscribing to which streams. When a spout or bolt emits a tuple to a stream, it sends the tuple to every bolt that subscribed to that stream.</p> \n<p><a href=\"http://4.bp.blogspot.com/-GJC9Nc4w1Jg/TjruC3DYNLI/AAAAAAAAAAo/RIOoALcvbB8/s1600/topology.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/a_storm_is_comingmoredetailsandplansforrelease96.thumb.1280.1280.png\" alt=\"A Storm is coming: more details and plans for release\"></a></p> \n<p>Everything in Storm runs in parallel in a distributed way. Spouts and bolts execute as many threads across the cluster, and they pass messages to each other in a distributed way. Messages never pass through any sort of central router, and there are no intermediate queues. A tuple is passed directly from the thread that created it to the threads that need to consume it.</p> \n<p>Storm guarantees that every message flowing through a topology will be processed, even if a machine goes down and the messages it was processing get dropped.
How Storm accomplishes this without any intermediate queuing is the key to how it works and what makes it so fast.</p> \n<p>Let’s look at a concrete example of spouts, bolts, and topologies to solidify the concepts.</p> \n<h2>A simple example topology</h2> \n<p>The example topology I’m going to show is “streaming word count”. The topology contains a spout that emits sentences, and the final bolt emits the number of times each word has appeared across all sentences. Every time the count for a word is updated, a new count is emitted for it. The topology looks like this:</p> \n<p><a href=\"http://2.bp.blogspot.com/-OwZCkwKYRi8/TjruVkA3z1I/AAAAAAAAAAw/B0xYM7OGQiI/s1600/word-count-pic.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/a_storm_is_comingmoredetailsandplansforrelease97.thumb.1280.1280.png\" alt=\"A Storm is coming: more details and plans for release\"></a></p> \n<p>You define this topology in Java; a condensed sketch of the full definition appears at the end of this walkthrough.</p> \n<p>The spout for this topology reads sentences off of the “sentence_queue” on a Kestrel server located at kestrel.backtype.com on port 22133.</p> \n<p>The spout is inserted into the topology with a unique id using the setSpout method. Every node in the topology must be given an id, and the id is used by other bolts to subscribe to that node’s output streams. The KestrelSpout is given the id “1” in this topology.</p> \n<p>setBolt is used to insert bolts in the topology. The first bolt defined in this topology is the SplitSentence bolt, which transforms a stream of sentences into a stream of words; its implementation also appears in the sketch below.</p> \n<p>The key method is the execute method, which splits the sentence into words and emits each word as a new tuple. Another important method is declareOutputFields, which declares the schema for the bolt’s output tuples. Here it declares that it emits 1-tuples with a field called “word”.</p> \n<p>Bolts can be implemented in any language; the same bolt can also be written in just a few lines of Python.</p> \n<p>The last parameter to setBolt is the degree of parallelism you want for the bolt. The SplitSentence bolt is given a parallelism of 10, which results in 10 threads executing the bolt in parallel across the Storm cluster. To scale a topology, all you have to do is increase the parallelism for the bolts at the bottleneck of the topology.</p> \n<p>The setBolt method returns an object that you use to declare the inputs for the bolt. Continuing with the example, the SplitSentence bolt subscribes to the output stream of component “1” using a shuffle grouping. “1” refers to the KestrelSpout that was already defined. I’ll explain the shuffle grouping part in a moment. What matters so far is that the SplitSentence bolt will consume every tuple emitted by the KestrelSpout.</p> \n<p>A bolt can subscribe to multiple input streams by chaining input declarations. You would use this functionality to implement a streaming join, for instance.</p> \n<p>The final bolt in the streaming word count topology, WordCount, reads in the words emitted by SplitSentence and emits updated counts for each word. WordCount maintains a map in memory from word to count. Whenever it sees a word, it updates the count for the word in its internal map and then emits the updated count as a new tuple.
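Putting the pieces together, here is a condensed sketch of the topology definition and both bolts (assuming the early backtype.storm Java API with integer component ids; the KestrelSpout constructor arguments and the WordCount parallelism are illustrative):</p> \n<pre class=\"brush:java;first-line:1;\">import java.util.HashMap;\nimport java.util.Map;\n\nimport backtype.storm.spout.StringScheme;\nimport backtype.storm.task.TopologyContext;\nimport backtype.storm.topology.BasicOutputCollector;\nimport backtype.storm.topology.IBasicBolt;\nimport backtype.storm.topology.OutputFieldsDeclarer;\nimport backtype.storm.topology.TopologyBuilder;\nimport backtype.storm.tuple.Fields;\nimport backtype.storm.tuple.Tuple;\nimport backtype.storm.tuple.Values;\n\npublic class WordCountTopology {\n\n  // Splits each sentence into words and emits one tuple per word.\n  public static class SplitSentence implements IBasicBolt {\n    public void prepare(Map conf, TopologyContext context) {}\n    public void execute(Tuple tuple, BasicOutputCollector collector) {\n      for (String word : tuple.getString(0).split(\" \")) {\n        collector.emit(new Values(word));\n      }\n    }\n    public void cleanup() {}\n    public void declareOutputFields(OutputFieldsDeclarer declarer) {\n      declarer.declare(new Fields(\"word\"));\n    }\n  }\n\n  // Keeps a running count per word and emits the updated count.\n  public static class WordCount implements IBasicBolt {\n    private final Map&lt;String, Integer&gt; counts = new HashMap&lt;String, Integer&gt;();\n    public void prepare(Map conf, TopologyContext context) {}\n    public void execute(Tuple tuple, BasicOutputCollector collector) {\n      String word = tuple.getString(0);\n      Integer count = counts.get(word);\n      count = (count == null) ? 1 : count + 1;\n      counts.put(word, count);\n      collector.emit(new Values(word, count));\n    }\n    public void cleanup() {}\n    public void declareOutputFields(OutputFieldsDeclarer declarer) {\n      declarer.declare(new Fields(\"word\", \"count\"));\n    }\n  }\n\n  public static void main(String[] args) {\n    TopologyBuilder builder = new TopologyBuilder();\n    // Component 1: the spout reading sentences from the Kestrel queue\n    // (KestrelSpout ships separately, alongside Storm).\n    builder.setSpout(1, new KestrelSpout(\"kestrel.backtype.com\", 22133,\n        \"sentence_queue\", new StringScheme()));\n    // Component 2: split sentences into words, 10 tasks, shuffle-grouped.\n    builder.setBolt(2, new SplitSentence(), 10).shuffleGrouping(1);\n    // Component 3: count words; the fields grouping on \"word\" sends the\n    // same word to the same task.\n    builder.setBolt(3, new WordCount(), 20).fieldsGrouping(2, new Fields(\"word\"));\n    // Submitting the topology to Nimbus (or running it in local mode) is omitted.\n  }\n}\n</pre> \n<p>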
Finally, in declareOutputFields the bolt declares that it emits a stream of 2-tuples with the fields “word” and “count”.</p> \n<p>The internal map kept in memory will be lost if the task dies. If it’s important that the bolt’s state persist even if a task dies, you can use an external database like Riak, Cassandra, or Memcached to store the state for the word counts. An in-memory HashMap is used here for simplicity.</p> \n<p>Finally, the WordCount bolt declares its input as coming from component 2, the SplitSentence bolt. It consumes that stream using a “fields grouping” on the “word” field.</p> \n<p>“Fields grouping”, like the “shuffle grouping” that I glossed over before, is a type of “stream grouping”. “Stream groupings” are the final piece that ties topologies together.</p> \n<h2>Stream groupings</h2> \n<p>A stream grouping tells a topology how to send tuples between two components. Remember, spouts and bolts execute in parallel as many tasks across the cluster. If you look at how a topology is executing at the task level, it looks something like this:</p> \n<p><a href=\"http://4.bp.blogspot.com/-LeFX7F-6QNc/Tjruof9qKQI/AAAAAAAAAA4/fTEc0Vb84TY/s1600/topology-tasks.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/a_storm_is_comingmoredetailsandplansforrelease98.thumb.1280.1280.png\" alt=\"A Storm is coming: more details and plans for release\"></a></p> \n<p>When a task for Bolt A emits a tuple to Bolt B, which task should it send the tuple to?</p> \n<p>A “stream grouping” answers this question by telling Storm how to send tuples between sets of tasks. There are a few different kinds of stream groupings.</p> \n<p>The simplest kind of grouping is called a “shuffle grouping”, which sends the tuple to a random task. A shuffle grouping is used in the streaming word count topology to send tuples from KestrelSpout to the SplitSentence bolt. It has the effect of evenly distributing the work of processing the tuples across all of the SplitSentence bolt’s tasks.</p> \n<p>A more interesting kind of grouping is the “fields grouping”. A fields grouping is used between the SplitSentence bolt and the WordCount bolt. It is critical for the functioning of the WordCount bolt that the same word always go to the same task. Otherwise, more than one task will see the same word, and they’ll each emit incorrect values for the count since each has incomplete information. A fields grouping lets you group a stream by a subset of its fields. This causes equal values for that subset of fields to go to the same task. Since WordCount subscribes to SplitSentence’s output stream using a fields grouping on the “word” field, the same word always goes to the same task and the bolt produces the correct output.</p> \n<p>Fields groupings are the basis of implementing streaming joins and streaming aggregations as well as a plethora of other use cases. Under the hood, fields groupings are implemented using consistent hashing.</p> \n<p>There are a few other kinds of groupings, but talking about those is beyond the scope of this post.</p> \n<p>With that, you should now have everything you need to understand the streaming word count topology. The topology doesn’t require that much code, and it’s completely scalable and fault-tolerant.
Whether you’re processing 10 messages per second or 100K messages per second, this topology can scale up or down as necessary by just tweaking the amount of parallelism for each component.</p> \n<h2>The complexity that Storm hides</h2> \n<p>The abstractions that Storm provides are ultimately pretty simple. A topology is composed of spouts and bolts that you connect together with stream groupings to get data flowing. You specify how much parallelism you want for each component, package everything into a jar, submit the topology and code to Nimbus, and Storm keeps your topology running forever. Here’s a glimpse at what Storm does underneath the hood to implement these abstractions in an extremely robust way.</p> \n<ol>\n <li> <p>Guaranteed message processing: Storm guarantees that each tuple coming off a spout will be fully processed by the topology. To do this, Storm tracks the tree of messages that a tuple triggers. If a tuple fails to be fully processed, Storm will replay the tuple from the Spout. Storm incorporates some clever tricks to track the tree of messages in an efficient way.</p> </li> \n <li>Robust process management: One of Storm’s main tasks is managing processes around the cluster. When a new worker is assigned to a supervisor, that worker should be started as quickly as possible. When that worker is no longer assigned to that supervisor, it should be killed and cleaned up. <p>An example of a system that does this poorly is Hadoop. When Hadoop launches a task, the burden for the task to exit is on the task itself. Unfortunately, tasks sometimes fail to exit and become orphan processes, sucking up memory and resources from other tasks.</p> <p>In Storm, the burden of killing a worker process is on the supervisor that launched it. Orphaned tasks simply cannot happen with Storm, no matter how much you stress the machine or how many errors there are. Accomplishing this is tricky because Storm needs to track not just the worker processes it launches, but also subprocesses launched by the workers (a subprocess is launched when a bolt is written in another language).</p> <p>The nimbus daemon and supervisor daemons are stateless and fail-fast. If they die, the running topologies aren’t affected. The daemons just start back up like nothing happened. This is again in contrast to how Hadoop works.</p> </li> \n <li> <p>Fault detection and automatic reassignment: Tasks in a running topology heartbeat to Nimbus to indicate that they are running smoothly. Nimbus monitors heartbeats and will reassign tasks that have timed out. Additionally, all the tasks throughout the cluster that were sending messages to the failed tasks quickly reconnect to the new location of the tasks.</p> </li> \n <li> <p>Efficient message passing: No intermediate queuing is used for message passing between tasks. Instead, messages are passed directly between tasks using <a href=\"http://zeromq.org/\">ZeroMQ</a>. This is simpler and way more efficient than using intermediate queuing. ZeroMQ is a clever “super-socket” library that employs a number of tricks for maximizing the throughput of messages. For example, it will detect if the network is busy and automatically batch messages to the destination.</p> <p>Another important part of message passing between processes is serializing and deserializing messages in an efficient way. Again, Storm automates this for you. By default, you can use any primitive type, strings, or binary records within tuples. 
If you want to be able to use another type, you just need to implement a simple interface to tell Storm how to serialize it. Then, whenever Storm encounters that type, it will automatically use that serializer.</p> </li> \n <li> <p>Local mode and distributed mode: Storm has a “local mode” where it simulates a Storm cluster completely in-process. This lets you iterate on your topologies quickly and write unit tests for your topologies. You can run the same code in local mode as you run on the cluster.</p> </li> \n</ol>\n<p>Storm is easy to use, configure, and operate. It is accessible for everyone from the individual developer processing a few hundred messages per second to the large company processing hundreds of thousands of messages per second.</p> \n<h2>Relation to “Complex Event Processing”</h2> \n<p>Storm exists in the same space as “Complex Event Processing” systems like <a href=\"http://esper.codehaus.org/\">Esper</a>, <a href=\"http://www.streambase.com/\">Streambase</a>, and <a href=\"http://s4.io/\">S4</a>. Among these, the most closely comparable system is S4. The biggest difference between Storm and S4 is that Storm guarantees messages will be processed even in the face of failures whereas S4 will sometimes lose messages.</p> \n<p>Some CEP systems have a built-in data storage layer. With Storm, you would use an external database like Cassandra or Riak alongside your topologies. It’s impossible for one data storage system to satisfy all applications since different applications have different data models and access patterns. Storm is a computation system and not a storage system. However, Storm does have some powerful facilities for achieving data locality even when using an external database.</p> \n<h2>Summary</h2> \n<p>I’ve only scratched the surface on Storm. The “stream” concept at the core of Storm can be taken so much further than what I’ve shown here — I didn’t talk about things like multi-streams, implicit streams, or direct groupings. I showed two of Storm’s main abstractions, spouts and bolts, but I didn’t talk about Storm’s third, and possibly most powerful abstraction, the “state spout”. I didn’t show how you do distributed RPC over Storm, and I didn’t discuss Storm’s awesome automated deploy that lets you create a Storm cluster on EC2 with just the click of a button.</p> \n<p>For all that, you’re going to have to wait until September 19th. Until then, I will be working on adding documentation to Storm so that you can get up and running with it quickly once it’s released. We’re excited to release Storm, and I hope to see you there at Strange Loop when it happens.</p> \n<p>- Nathan Marz (<a href=\"http://twitter.com/nathanmarz\">@nathanmarz</a>)</p>",
"date": "2011-08-04T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2011/a-storm-is-coming-more-details-and-plans-for-release",
"domain": "engineering"
},
{
"title": "Fast Core Animation UI for the Mac",
"body": "<p>Starting today, Twitter is offering TwUI as an open-source framework <a href=\"https://github.com/twitter/twui\">(https://github.com/twitter/twui)</a> for developing interfaces on the Mac.</p> \n<p>Until now, there was not a simple and effective way to design interactive, hardware-accelerated interfaces on the Mac. Core Animation can create hardware-accelerated drawings, but doesn’t provide interaction mechanisms. AppKit and NSView have excellent interaction mechanisms, but the drawings operations are CPU-bound, which makes fluid scrolling, animations, and other effects difficult – if not impossible – to accomplish.</p> \n<p>UIKit on Apple’s iOS platform has offered developers a fresh start. While UIKit borrows many ideas from AppKit regarding interaction, it can offload compositing to the GPU because it is built on top of Core Animation. This architecture has enabled developers to create many applications that were, until this time, impossible to build.</p> \n<h2>TwUI as a solution</h2> \n<p>TwUI brings the philosophy of UIKit to the desktop. It is built on top of Core Animation, and it borrows interaction ideas from AppKit. It allows for all the things Mac users expect, including drag &amp; drop, mouse events, tooltips, Mac-like text selection, and so on. And, since TwUI isn’t bound by the constraints of an existing API, developers can experiment with new features like block-based drawRect and layout.</p> \n<h2>How TwUI works</h2> \n<p>You will recognize the fundamentals of TwUI if you are familiar with UIKit. For example, a “TUIView” is a simple, lightweight wrapper around a Core Animation layer – much like UIView on iOS.</p> \n<p>TUIView offers useful subclasses for operations such as scroll views, table views, buttons, and so on. More importantly, TwUI makes it easy to build your own custom interface components. And because all of these views are backed by layers, composited by Core Animation, your UI is rendered at optimal speed.</p> \n<p><a href=\"http://4.bp.blogspot.com/-h8RURBOhuaw/TgzVQ09GjNI/AAAAAAAAABw/DqqZwWYikZA/s1600/Screen%2BShot%2BExample%2BCell\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/fast_core_animationuiforthemac95.thumb.1280.1280.png\" alt=\"Fast Core Animation UI for the Mac\"></a></p> \n<p>Xcode running the TwUI example project</p> \n<h2>Ongoing development</h2> \n<p>Since TwUI forms the basis of Twitter for the Mac, it is an integral part of our shipping code. Going forward, we need to stress test it in several implementations. We’ll continue to develop additional features and make improvements. And, we encourage you to experiment, as that will help us build a robust and exciting UI framework for the Mac.</p> \n<h2>Acknowledgements</h2> \n<p>The following engineers were mainly responsible for the TwUI development:</p> \n<p>-Loren Brichter (@<a href=\"https://twitter.com/#!/lorenb\">lorenb</a>), Ben Sandofsky (@<a href=\"https://twitter.com/#!/sandofsky\">sandofsky</a>)</p>",
"date": "2011-07-01T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2011/fast-core-animation-ui-for-the-mac",
"domain": "engineering"
},
{
"title": "Join the Flock!",
"body": "<h2>Engineering Open House</h2> \n<p>Twitter’s engineering team is growing quickly. Two-thirds of our engineers were hired in the last 12 months. Those engineers joined us from cities and countries around the world and from companies of various sizes.</p> \n<p>As part of our effort to find and hire great people to build great products and solve complicated problems, last Thursday we invited several dozen engineers to Twitter HQ for our first engineering open house. Presentations from <a href=\"https://twitter.com/#!/wfarner\">@wfarner</a>, <a href=\"https://twitter.com/#!/michibusch\">@michibusch</a>, <a href=\"https://twitter.com/#!/mracus\">@mracus</a> and <a href=\"https://twitter.com/#!/esbie\">@esbie</a> showcased the depth and range of the effort required to present twitter.com to the world. The topics covered some of these key areas for development:</p> \n<ul>\n <li> <p>Dynamic deployment and resource management with Mesos - <a href=\"https://twitter.com/#!/wfarner\">@wfarner</a></p> <p>Using <a href=\"http://mesosproject.org\">Mesos</a> as a platform, we have built a private cloud system on which we hope to eventually run most, if not all, of our services. We expect this to simplify deployment and improve the reliability of our systems, while making more efficient use of our compute resources.</p> </li> \n <li> <p>Real-time search at Twitter - <a href=\"https://twitter.com/#!/michibusch\">@michibusch</a></p> <p>Since 2008, Twitter has made dramatic enhancements to our real-time search engine, scaling it from 200 QPS to 18,000 QPS. At the core of our infrastructure is <a href=\"http://engineering.twitter.com/2010/10/twitters-new-search-architecture.html\">Earlybird</a>, a version of <a href=\"http://lucene.apache.org/java/docs/index.html\">Lucene</a> modified for real-time search. This work, combined with other key infrastructure components, led to our recent revamp of the <a href=\"http://engineering.twitter.com/2011/05/engineering-behind-twitters-new-search.html\">search experience</a> and will enable future innovation in real-time search.</p> </li> \n <li> <p>The client-side architecture of #<a href=\"https://twitter.com/#!/NewTwitter\">NewTwitter</a>- <a href=\"https://twitter.com/#!/mracus\">@mracus</a> and <a href=\"https://twitter.com/#!/esbie\">@esbie</a></p> <p>Client-side applications for desktop and mobile environments have access to a class of well-rounded tools and framework components that aren’t as yet widely available for the browser. Therefore, a fully in-browser app like #<a href=\"https://twitter.com/#!/NewTwitter\">NewTwitter</a> requires investment in solid architecture in order to remain clean and extensible as it grows. At Twitter, we’re constantly iterating on the in-house and open source JavaScript tools we use to address this need.</p> </li> \n</ul>\n<p>This was Twitter’s first engineering open house, but it certainly won’t be our last. We plan to hold these regularly - every couple months or so. In the meantime, if you’re interested in keeping up with our engineering team, you can follow <a href=\"http://twitter.com/twittereng\">@twittereng</a> or check out our <a href=\"http://twitter.com/jobs/engineering\">jobs page</a>.</p> \n<p>- Mike Abbott (@<a href=\"https://twitter.com/#!/mabb0tt\">mabb0tt</a>), VP Engineering</p>",
"date": "2011-06-22T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2011/join-the-flock",
"domain": "engineering"
},
{
"title": "The Engineering Behind Twitter’s New Search Experience",
"body": "<p>Today, Twitter launched a <a href=\"http://twitter.com/search\">personalized search experience</a> to help our users find the most relevant Tweets, images, and videos. To build this product, our infrastructure needed to support two major features: relevance-filtering of search results and the identification of relevant images and photos. Both features leverage a ground-up rewrite of the search infrastructure, with <a href=\"http://engineering.twitter.com/2011/04/twitter-search-is-now-3x-faster_1656.html\">Blender</a> and <a href=\"http://engineering.twitter.com/2010/10/twitters-new-search-architecture.html\">Earlybird</a> at the core.</p> \n<h2>Investment in Search</h2> \n<p>Since the acquisition of <a href=\"http://engineering/2008/07/finding-perfect-match.html\">Summize</a> in 2008, Twitter has invested heavily in search. We’ve grown our search team from three to 15 engineers and scaled our real-time search engine by two orders of magnitude — all this, while we replaced the search infrastructure in flight, with no major service interruptions.</p> \n<p>The engineering story behind the evolution of search is compelling. The Summize infrastructure used Ruby on Rails for the front-end and MySQL for the back-end (the same architecture as the one used by Twitter and many other start-ups). At the time, Lucene and other open-source search technology did not support real-time search. As a result, we constructed our reverse indexes in MySQL, leveraging its concurrent transactions and B-tree data structures to support concurrent indexing and searching. We were able to scale our MySQL-based solution surprisingly far by partitioning the index across multiple databases and replicating the Rails front-end. In 2008, Twitter search handled an average of 20 TPS and 200 QPS. By October 2010, when we replaced MySQL with Earlybird, the system was handling 1,000 TPS and 12,000 QPS on average.</p> \n<p>Earlybird, a real-time, reverse index based on <a href=\"http://lucene.apache.org/java/docs/index.html\">Lucene</a>, not only gave us an order of magnitude better performance than MySQL for real-time search, it doubled our memory efficiency and provided the flexibility to add relevance filtering. However, we still needed to replace the Ruby on Rails front-end, which was only capable of synchronous calls to Earlybird and had accrued significant technical debt through years of scaling and transition to Earlybird.</p> \n<p>In April 2011, we launched a replacement, called Blender, which improved our search latencies by 3x, gave us 10x throughput, and allowed us to remove Ruby on Rails from the search infrastructure. Today, we are indexing an average of 2,200 TPS while serving 18,000 QPS (1.6B queries per day!). More importantly, Blender completed the infrastructure necessary to make the most significant user-facing change to Twitter search since the acquisition of Summize.</p> \n<h2>From Hack-Week Project to Production</h2> \n<p>When the team launched Earlybird, we were all excited by its potential — it was fast and the code was clean and easy to extend. While on vacation in Germany, Michael Busch, one of our search engineers, implemented a demo of image and video search. A few weeks later, during Twitter’s first <a href=\"http://engineering.twitter.com/2010/10/hack-week.html\">Hack Week</a>, the search team, along with some members of other teams, completed the first demo of our new search experience. 
Feedback from the company was so positive that the demo became part of our product roadmap.</p> \n<h2>Surfacing Relevant Tweets</h2> \n<p>There is a lot of information on Twitter — on average, more than 2,200 new Tweets every second! During large events, for example the <a href=\"https://twitter.com/hashtag/tsunami\">#tsunami</a> in Japan, this rate can increase by 3 to 4x. Often, users are interested in only the most memorable Tweets or those that other users engage with. In our new search experience, we show search results that are most relevant to a particular user. So search results are personalized, and we filter out the Tweets that do not resonate with other users.</p> \n<p>To support relevance filtering and personalization, we needed three types of signals:</p> \n<ul>\n <li>Static signals, added at indexing time</li> \n <li>Resonance signals, dynamically updated over time</li> \n <li>Information about the searcher, provided at search time</li> \n</ul>\n<p>Getting all of these signals into our index required changes to our ingestion pipeline, Earlybird (our reverse index), and Blender (our front-ends). We also created a new updater component that continually pushes resonance signals to Earlybird. In the ingestion pipeline, we added a pipeline stage that annotates Tweets with static information, for example, information about the user and the language of the Tweet’s text. The Tweets are then replicated to the Earlybird indexes (in real time), where we have extended Lucene’s internal data structures to support dynamic updates to arbitrary annotations. Dynamic updates, for example, the users’ interactions with Tweets, arrive over time from the updater. Together, Earlybird and the updater support a high and irregular rate of updates without requiring locks or slowing down searches.</p> \n<p>At query time, a Blender server parses the user’s query and passes it along with the user’s social graph to multiple Earlybird servers. These servers use a specialized ranking function that combines relevance signals and the social graph to compute a personalized relevance score for each Tweet. The highest-ranking, most-recent Tweets are returned to the Blender, which merges and re-ranks the results before returning them to the user.</p> \n<p><a href=\"http://1.bp.blogspot.com/-2eGkp7fP-JA/TeVwGcx4BbI/AAAAAAAAABM/j-qMhcuMKyI/s1600/updater_arch.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/the_engineering_behindtwittersnewsearchexperience95.thumb.1280.1280.png\" alt=\"The Engineering Behind Twitter’s New Search Experience\"></a></p> \n<p>Twitter search architecture with support for relevance</p> \n<h2>Removing Duplicates</h2> \n<p>Duplicate and near-duplicate Tweets are often not particularly helpful in Twitter search results. During popular and important events, when search should be most helpful to our users, nearly identical Tweets increase in number. Even when the quality of the duplicates is high, the searcher would benefit from a more diverse set of results. To remove duplicates we use a technique based on <a href=\"http://en.wikipedia.org/wiki/MinHash\">MinHashing</a>, where several signatures are computed per Tweet and two Tweets sharing the same set of signatures are considered duplicates. The twist? Like everything at Twitter, brevity is key: We have a very small memory budget to store the signatures. 
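To make this concrete, here is a toy version of the signature computation (an illustrative sketch only, not Twitter’s production algorithm; the tokenization and seeded hashing scheme are invented for the example):</p> \n<pre class=\"brush:java;first-line:1;\">import java.util.Arrays;\nimport java.util.Random;\n\n// Toy MinHash: for each of k seeded hash functions, keep the minimum\n// hash over the text’s tokens. Texts whose signatures agree are\n// likely near-duplicates.\npublic class MinHashToy {\n  private final int[] seeds;\n\n  public MinHashToy(int k, long seed) {\n    Random rnd = new Random(seed);\n    seeds = new int[k];\n    for (int i = 0; i &lt; k; i++) {\n      seeds[i] = rnd.nextInt();\n    }\n  }\n\n  public int[] signature(String text) {\n    int[] sig = new int[seeds.length];\n    Arrays.fill(sig, Integer.MAX_VALUE);\n    for (String token : text.toLowerCase().split(\" \")) {\n      for (int i = 0; i &lt; seeds.length; i++) {\n        int h = token.hashCode() ^ seeds[i]; // cheap seeded hash\n        if (h &lt; sig[i]) {\n          sig[i] = h;\n        }\n      }\n    }\n    return sig;\n  }\n}\n</pre> \n<p>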
Our algorithm compresses each Tweet to just 4 bytes while still identifying the vast majority of duplicates with very low computational requirements.</p> \n<h2>Personalization</h2> \n<p>Twitter is most powerful when you personalize it by choosing interesting accounts to follow, so why shouldn’t your search results be more personalized too? They are now! Our ranking function accesses the social graph and uses knowledge about the relationship between the searcher and the author of a Tweet during ranking. Although the social graph is very large, we compress the meaningful part for each user into a <a href=\"http://en.wikipedia.org/wiki/Bloom_filter\">Bloom filter</a>, which gives us space-efficient constant-time set membership operations. As Earlybird scans candidate search results, it uses the presence of the Tweet’s author in the user’s social graph as a relevance signal in its ranking function.</p> \n<p>Even users that follow few or no accounts will benefit from other personalization mechanisms; for example, we now automatically detect the searcher’s preferred language and location.</p> \n<h2>Images and Videos in Search</h2> \n<p>Images and videos have an amazing ability to describe people, places, and real-time events as they unfold. Take for example @<a href=\"https://twitter.com/#!/jkrums\">jkrums</a>’ Twitpic of <a href=\"https://twitter.com/#!/jkrums/status/1121915133\">US Airways Flight 1549 Hudson river landing</a>, and @<a href=\"https://twitter.com/#!/stefmara\">stefmara</a>’s photos and videos of <a href=\"http://twitter.com/#!/Stefmara/status/70120040241963008\">space shuttle Endeavour’s final launch</a>.</p> \n<p>There is a fundamental difference between searching for Tweets and searching for entities in Tweets, such as images and videos. In the former case, the decision about whether a Tweet matches a query can be made by looking at the text of the Tweet, with no other outside information. Additionally, per-Tweet relevance signals can be used to rank and compare matching Tweets to find the best ones. The situation is different when searching for images or videos. For example, the same image may be tweeted many times, with each Tweet containing different keywords that all describe the image. Consider the following Tweets:</p> \n<ul>\n <li>“This is my Australian Shepherd: <a href=\"http://bit.ly/kQvYGp\">http://bit.ly/kQvYGp</a>”</li> \n <li>“What a cute dog! RT This is my Australian Shepherd: <a href=\"http://bit.ly/kQvYGp\">http://bit.ly/kQvYGp</a>”.</li> \n</ul>\n<p>One possible description of the image is formed from the union of keywords in the Tweets’ text; that is, “dog”, “Australian”, and “shepherd” all describe the image. If an image is repeatedly described by a term in the Tweet’s text, it is likely to be about that term.</p> \n<p>So what makes this a difficult problem? Twitter allows you to search Tweets within seconds; images and photos in tweets should be available in realtime too! Earlybird uses <a href=\"http://en.wikipedia.org/wiki/Inverted_index\">inverted indexes</a> for search. While these data structures are extremely efficient, they do not support inline updates, which makes it nearly impossible to append additional keywords to indexed documents.</p> \n<p>If timeliness was not important, we could use <a href=\"http://en.wikipedia.org/wiki/MapReduce\">MapReduce</a> jobs that periodically aggregate keyword unions and produce inverted indexes. 
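</p> \n<p>A sketch of such a batch job, assuming a hypothetical input shape (one record per Tweet, carrying its text and its links), for illustration only:</p> \n<pre><code>from collections import defaultdict\n\ndef aggregate_keyword_unions(tweets):\n    # Offline aggregation: build one synthetic document per linked\n    # image or video, whose text is the union of the keywords from\n    # every Tweet that mentions the link.\n    docs = defaultdict(set)\n    for tweet in tweets:\n        for url in tweet['urls']:\n            docs[url].update(tweet['text'].lower().split())\n    return docs  # documents for an offline inverted-index build\n</code></pre> \n<p>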
In these offline indexes, each link to an image or photo would be a document, with the aggregated keywords as the document’s text. However, to meet our indexing latency goals, we would have to run these MapReduce jobs every few seconds, an impractical solution.</p> \n<p>Instead, we extended Earlybird’s data structures to support efficient lookups of entities contained in Tweets. At query time, we look up the images and videos for matching Tweets and store them in a custom hash map. The keys of the map are URLs and the values are score counters. Each time the same URL is added to the map, its corresponding score counter is incremented. After this aggregation is complete, the map is sorted and the best images and photos are returned for rendering.</p> \n<h2>What’s next?</h2> \n<p>The search team is excited to build innovative search products that drive discovery and help our users. While the new search experience is a huge improvement over pure real-time search, we are just getting started. In the coming months, we will improve quality, scale our infrastructure, expand our indexes, and bring relevance to mobile.</p> \n<p>If you are a talented engineer and want to work on the largest real-time search engine in the world, Twitter search is hiring for <a href=\"https://twitter.com/job.html?jvi=oE2EVfwZ,Job\">search quality</a> and <a href=\"https://twitter.com/job.html?jvi=oamhVfws,Job\">search infrastructure</a>!</p> \n<h2>Acknowledgements</h2> \n<p>The following people contributed to the launch: Abhi Khune, Abdur Chowdhury, Aneesh Sharma, Ashok Banerjee, Ben Cherry, Brian Larson, Coleen Baik, David Chen, Frost Li, Gilad Mishne, Isaac Hepworth, Jon Boulle, Josh Brewer, Krishna Gade, Michael Busch, Mike Hayes, Nate Agrin, Patrick Lok, Raghavendra Prabu, Sarah Brown, Sam Luckenbill, Stephen Fedele, Tian Wang, Yi Zhuang, Zhenghua Li.</p> \n<p>We would also like to thank the original Summize team, former team members, hack-week contributors, and management for their contributions and support.</p> \n<p>—<a href=\"https://twitter.com/intent/user?screen_name=twittersearch\">@twittersearch</a></p>",
"date": "2011-05-31T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2011/the-engineering-behind-twitter-s-new-search-experience",
"domain": "engineering"
},
{
"title": "Faster Ruby: Kiji Update",
"body": "<p>In March 2011, we shared <a href=\"http://engineering.twitter.com/2011/03/building-faster-ruby-garbage-collector.html\">Kiji</a>, an improved Ruby runtime. The initial performance gains were relatively modest, but laid the foundation for future improvements. We continued the work and now have some excellent results.</p> \n<h2>FASTER REMEMBERED SET CALCULATIONS</h2> \n<p>In Kiji 0.10, every change to the longlife heap required full recalculation of the “remembered set,” the boundary objects referenced from the longlife to the eden heap. For Kiji 0.11, we changed the calculation to an incremental model that only includes newly-allocated objects.</p> \n<p>We made this easier by disabling garbage collection during source code parsing, which has a tendency to mutate references in place. Now, if the parser needs more memory, it merely allocates a new heap chunk. This lets us allocate all AST nodes, including those created in <code>instance_eval</code>, on the longlife heap. The result is a big performance boost for applications like template engines that use lots of <code>instance_eval</code>.</p> \n<h2>MORE OBJECTS IN LONGLIFE</h2> \n<p>For Kiji 0.11, we now allocate non-transient strings in the longlife heap, along with the AST nodes. This includes strings allocated during parsing, assigned to constants (or members of a constant hash or array), and those that are members of frozen objects. With Ruby’s <code>Kernel.freeze</code> method, big parts of frozen objects are now evicted from the ordinary heap and moved to the longlife heap.</p> \n<p>This change is significant. When the twitter.com web application ran Kiji 0.10, it had 450,000 live objects after garbage collection in its ordinary heap. Kiji 0.11 places over 300,000 string objects in the longlife heap, reducing the number of live objects in the ordinary heap to under 150,000. The nearly 66 percent reduction allows the heap to collect much less frequently.</p> \n<h2>SIMPLIFIED HEAP GROWTH STRATEGY</h2> \n<p>Ruby Enterprise Edition has a set of environment variables that govern when to run the garbage collector and how to grow and shrink the heaps. After evaluating Ruby’s heap growth strategy, we replaced it with one that is much simpler to configure and works better for server workloads.</p> \n<p>As a first step, we eliminated <code>GC_MALLOC_LIMIT</code>. This environment variable prescribes when to force a garbage collection, following a set of C-level <code>malloc()</code> calls. We found this setting to be capricious; it performed best when it was set so high as to be effectively off. By eliminating the malloc limit entirely, the Kiji 0.11 garbage collector runs only when heaps are full, or when no more memory can be allocated from the operating system. This also means that under UNIX-like systems, you can more effectively size the process with <code>ulimit -u</code>.</p> \n<p>0.11 now has only these three GC-tuning environment variables:</p> \n<ul>\n <li>The first parameter is <code>RUBY_GC_HEAP_SIZE</code>. This parameter determines the number of objects in a heap slab. The value is specified in numbers of objects. Its default value is 32768.</li> \n <li> <p>The next parameter is <code>RUBY_GC_EDEN_HEAPS</code>. This parameter specifies the target number of heap slabs for the ordinary heap. Its default value is 24.</p> <p>The runtime starts out with a single heap slab, and when it fills up, it collects the garbage and allocates a new slab until it reaches the target number. 
This gradual strategy keeps fragmentation in the heaps low, as it tends to concentrate longer-lived objects in the earlier heap slabs. If the heap is forced to grow beyond the target number of slabs, the runtime releases vacated slabs after each garbage collection in order to restore the target size. Once the application reaches the target size of ordinary heap, it does not go below it.</p> <p>Since performance is tightly bound to the rate of eden collections (a classic memory for speed tradeoff), this makes the behavior of a long-lived process very predictable. We have had very good results with settings as high as 64.</p> </li> \n <li>The final parameter is <code>RUBY_GC_LONGLIFE_LAZINESS</code>, a decimal between 0 and 1, with a default of 0.05. This parameter governs a different heap growth strategy for longlife heap slabs. The runtime releases vacant longlife heap slabs when the ratio of free longlife heap slots to all longlife heap slots after the collection is higher than this parameter. Also, if the ratio is lower after collection, a new heap slab is allocated. <p>The default value is well-tuned for our typical workload and prevents memory bloat.</p> </li> \n</ul>\n<p>We also reversed the order of adding the freed slots onto the free list. Now, new allocations are fulfilled with free slots from older (presumably, more densely-populated) heap slabs first, allowing recently allocated heap slabs to become completely vacant in a subsequent GC run. This may slightly impact locality of reference, but works well for us.</p> \n<h2>ADDITIONAL CHANGES</h2> \n<p>We replaced the old profiling methods that no longer applied with our improved memory debugging.</p> \n<p>We also removed the “fastmarktable” mode, where the collector used a mark bit in the object slots. Kiji 0.11 uses only the copy-on-write friendly mark table. This lets us reset the mark bits after collection by zeroing out the entire mark table, instead of flipping a bit in every live object.</p> \n<h2>IT’S IN THE NUMBERS</h2> \n<p>We updated the <a href=\"http://1.bp.blogspot.com/-XYqJXFAnsP8/TXFuVlzXsaI/AAAAAAAAABA/fWwPaMfT16A/s1600/request-response-rates.png\">performance chart</a> from the first blog post about Kiji with the 0.11 data. As you can see, the new data shows a dramatic improvement for our example intensive workload. While Kiji 0.9 responded to all requests until 90 requests/sec and peaked at 95 responses out of 100 requests/sec, Kiji 0.11 responds to all requests until 120 requests/sec. This is a 30% improvement in throughput across the board, and 2.7x the speed of standard Ruby 1.8.</p> \n<p><a href=\"http://1.bp.blogspot.com/-EJUXzqoNNFY/TdWDcOFRuZI/AAAAAAAAAFQ/dVfmpK8vgoQ/s1600/kiji-11.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/faster_ruby_kijiupdate95.thumb.1280.1280.png\" alt=\"Faster Ruby: Kiji Update\"></a></p> \n<h2>FULL ALLOCATION TRACING</h2> \n<p>We found that in order to effectively develop Kiji 0.11, we needed to add more sophisticated memory instrumentation than is currently available for Ruby. As a result, we ended up with some really useful debugging additions that you can turn on as well.</p> \n<p>The first tool is a summary of memory stats after GC. 
It lets you cheaply measure the impact of memory-related changes:</p> \n<p><a href=\"http://1.bp.blogspot.com/-w8xHsk1Rp5k/TdWDrbIsMjI/AAAAAAAAAFY/wLkJpcSFr4Y/s1600/gc-summary.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/faster_ruby_kijiupdate96.thumb.1280.1280.png\" alt=\"Faster Ruby: Kiji Update\"></a></p> \n<p>The second tool is an allocation tracer (a replacement for BleakHouse and similar tools). After each GC, the runtime writes files containing full stack traces for the allocation points of all freed and surviving objects. You can easily parse this with AWK to list common object types, allocation sites, and number of objects allocated. This makes it easy to identify allocation hotspots, memory leaks, or objects that persist on the eden and should be manually moved to the longlife.</p> \n<p>A sample output for allocation tracing, obtained by running RubySpec under Kiji:</p> \n<p><a href=\"http://3.bp.blogspot.com/-FfYIQlFpboM/TdWD0PW3tGI/AAAAAAAAAFg/P9FVzU3azmU/s1600/allocation-trace.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/faster_ruby_kijiupdate97.thumb.1280.1280.png\" alt=\"Faster Ruby: Kiji Update\"></a></p> \n<p>For more information, refer to the <code>README-kiji</code> file in the distribution.</p> \n<h2>FUTURE DIRECTIONS</h2> \n<p>0.11 is a much more performant and operable runtime than Kiji 0.10. However, through this work we identified a practical strategy for making an even better, fully-generational version that would apply well to Ruby 1.9. Time will tell if we get to implement it.</p> \n<p>We also would like to investigate the relative performance of JRuby.</p> \n<h2>TRY IT!</h2> \n<p>We have released the <a href=\"https://github.com/twitter/rubyenterpriseedition187-248/\">Kiji REE branch</a> on GitHub.</p> \n<h2>ACKNOWLEDGEMENTS</h2> \n<p>The following engineers at Twitter contributed to the REE improvements: Rob Benson, Brandon Mitchell, Attila Szegedi, and Evan Weaver.</p> \n<p>If you want to work on projects like this, <a href=\"http://twitter.com/jobs\">join the flock!</a></p> \n<p>— Attila (<a href=\"https://twitter.com/intent/user?screen_name=asz\">@asz</a>)</p>",
"date": "2011-05-19T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2011/faster-ruby-kiji-update",
"domain": "engineering"
},
{
"title": "Twitter Search is Now 3x Faster",
"body": "<p>In the spring of 2010, the search team at Twitter started to rewrite our search engine in order to serve our ever-growing traffic, improve the end-user latency and availability of our service, and enable rapid development of new search features. As part of the effort, we launched a new <a href=\"http://engineering.twitter.com/2010/10/twitters-new-search-architecture.html\">real-time search engine</a>, changing our back-end from MySQL to a real-time version of <a href=\"http://lucene.apache.org/java/docs/index.html\">Lucene</a>. Last week, we launched a replacement for our Ruby-on-Rails front-end: a Java server we call Blender. We are pleased to announce that this change has produced a 3x drop in search latencies and will enable us to rapidly iterate on search features in the coming months.</p> \n<h2>PERFORMANCE GAINS</h2> \n<p>Twitter search is one of the most heavily-trafficked search engines in the world, serving over one billion queries per day. The week before we deployed Blender, the <a href=\"http://twitter.com/#!/search/%23tsunami\">#tsunami</a> in Japan contributed to a significant increase in query load and a related spike in search latencies. Following the launch of Blender, our 95th percentile latencies were reduced by 3x from 800ms to 250ms and CPU load on our front-end servers was cut in half. We now have the capacity to serve 10x the number of requests per machine. This means we can support the same number of requests with fewer servers, reducing our front-end service costs.</p> \n<p><a href=\"http://4.bp.blogspot.com/-CmXJmr9UAbA/TZy6AsT72fI/AAAAAAAAAAs/aaF5AEzC-e4/s1600/Blender_Tsunami.jpg\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/twitter_search_isnow3xfaster95.thumb.1280.1280.png\" alt=\"Twitter Search is Now 3x Faster\"></a></p> \n<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;95th Percentile Search API Latencies Before and After Blender Launch</p> \n<h2>TWITTER’S IMPROVED SEARCH ARCHITECTURE</h2> \n<p>In order to understand the performance gains, you must first understand the inefficiencies of our former Ruby-on-Rails front-end servers. The front ends ran a fixed number of single-threaded rails worker processes, each of which did the following:</p> \n<ul>\n <li>parsed queries</li> \n <li>queried index servers synchronously</li> \n <li>aggregated and rendered results</li> \n</ul>\n<p>We have long known that the model of synchronous request processing uses our CPUs inefficiently. Over time, we had also accrued significant technical debt in our Ruby code base, making it hard to add features and improve the reliability of our search engine. Blender addresses these issues by:</p> \n<ol>\n <li>Creating a fully asynchronous aggregation service. No thread waits on network I/O to complete.</li> \n <li>Aggregating results from back-end services, for example, the real-time, top tweet, and geo indices.</li> \n <li>Elegantly dealing with dependencies between services. Workflows automatically handle transitive dependencies between back-end services.</li> \n</ol>\n<p>The following diagram shows the architecture of Twitter’s search engine. Queries from the website, API, or internal clients at Twitter are issued to Blender via a hardware load balancer. Blender parses the query and then issues it to back-end services, using workflows to handle dependencies between the services. 
Finally, results from the services are merged and rendered in the appropriate language for the client.</p> \n<p><a href=\"http://3.bp.blogspot.com/-NezgJOPlwJI/TZy6flMqM7I/AAAAAAAAAA0/2XY00S2yxZQ/s1600/Blender_workflow.jpg\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/twitter_search_isnow3xfaster96.thumb.1280.1280.png\" alt=\"Twitter Search is Now 3x Faster\"></a></p> \n<p>Twitter Search Architecture with Blender</p> \n<h2>BLENDER OVERVIEW</h2> \n<p>Blender is a Thrift and HTTP service built on <a href=\"http://www.jboss.org/netty\">Netty</a>, a highly-scalable NIO client-server library written in Java that enables the development of a variety of protocol servers and clients quickly and easily. We chose Netty over some of its competitors, like Mina and Jetty, because it has a cleaner API, better documentation and, more importantly, because several other projects at Twitter are using this framework. To make Netty work with Thrift, we wrote a simple Thrift codec that decodes the incoming Thrift request from Netty’s channel buffer when it is read from the socket, and encodes the outgoing Thrift response when it is written to the socket.</p> \n<p>Netty defines a key abstraction, called a Channel, to encapsulate a connection to a network socket that provides an interface to do a set of I/O operations like read, write, connect, and bind. All channel I/O operations are asynchronous in nature. This means any I/O call returns immediately with a ChannelFuture instance that notifies whether the requested I/O operations succeed, fail, or are canceled.</p> \n<p>When a Netty server accepts a new connection, it creates a new channel pipeline to process it. A channel pipeline is nothing but a sequence of channel handlers that implements the business logic needed to process the request. In the next section, we show how Blender maps these pipelines to query processing workflows.</p> \n<h2>WORKFLOW FRAMEWORK</h2> \n<p>In Blender, a workflow is a set of back-end services with dependencies between them, which must be processed to serve an incoming request. Blender automatically resolves dependencies between services; for example, if service A depends on service B, B is queried first and its results are passed to A. It is convenient to represent workflows as <a href=\"http://en.wikipedia.org/wiki/Directed_acyclic_graph\">directed acyclic graphs</a> (see below).</p> \n<p><a href=\"http://2.bp.blogspot.com/-6seokrK0Jzc/TZy6sIuzodI/AAAAAAAAAA8/d9ihb-CWVDs/s1600/Blender_S1.jpg\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/twitter_search_isnow3xfaster97.thumb.1280.1280.png\" alt=\"Twitter Search is Now 3x Faster\"></a></p> \n<p>Sample Blender Workflow with 6 Back-end Services</p> \n<p>In the sample workflow above, we have 6 services {s1, s2, s3, s4, s5, s6} with dependencies between them. The directed edge from s3 to s1 means that s3 must be called before calling s1 because s1 needs the results from s3. 
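</p> \n<p>The batched ordering that the next paragraph walks through falls out of a standard topological sort. Here is a minimal sketch in Python (Blender itself is Java, and the edge set below is our own invention, chosen to be consistent with the figure):</p> \n<pre><code>def batches(deps):\n    # deps maps each service to the set of services it needs\n    # results from. Repeatedly peel off the services whose\n    # dependencies are all satisfied; each peel is one batch of\n    # back-end calls that can be issued in parallel.\n    done, order = set(), []\n    while len(done) != len(deps):\n        ready = {s for s, d in deps.items()\n                 if s not in done and d.issubset(done)}\n        if not ready:\n            raise ValueError('dependency cycle')\n        order.append(sorted(ready))\n        done.update(ready)\n    return order\n\ndeps = {'s3': set(), 's4': set(), 's1': {'s3'},\n        's5': {'s4'}, 's6': {'s4'}, 's2': {'s1', 's5', 's6'}}\nprint(batches(deps))  # [['s3', 's4'], ['s1', 's5', 's6'], ['s2']]\n</code></pre> \n<p>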
Given such a workflow, the Blender framework performs a <a href=\"http://en.wikipedia.org/wiki/Topological_sorting\">topological sort</a> on the DAG to determine the total ordering of services, which is the order in which they must be called. The execution order of the above workflow would be {(s3, s4), (s1, s5, s6), (s2)}. This means s3 and s4 can be called in parallel in the first batch, and once their responses are returned, s1, s5, and s6 can be called in parallel in the next batch, before finally calling s2.</p> \n<p>Once Blender determines the execution order of a workflow, it is mapped to a Netty pipeline. This pipeline is a sequence of handlers that the request needs to pass through for processing.</p> \n<h2>MULTIPLEXING INCOMING REQUESTS</h2> \n<p>Because workflows are mapped to Netty pipelines in Blender, we needed to route incoming client requests to the appropriate pipeline. For this, we built a proxy layer that multiplexes and routes client requests to pipelines as follows:</p> \n<ul>\n <li>When a remote Thrift client opens a persistent connection to Blender, the proxy layer creates a map of local clients, one for each of the local workflow servers. Note that all local workflow servers are running inside Blender’s JVM process and are instantiated when the Blender process starts.</li> \n <li>When the request arrives at the socket, the proxy layer reads it, figures out which workflow is requested, and routes it to the appropriate workflow server.</li> \n <li>Similarly, when the response arrives from the local workflow server, the proxy reads it and writes the response back to the remote client.</li> \n</ul>\n<p>We made use of Netty’s event-driven model to accomplish all the above tasks asynchronously so that no thread waits on I/O.</p> \n<h2>DISPATCHING BACK-END REQUESTS</h2> \n<p>Once the query arrives at a workflow pipeline, it passes through the sequence of service handlers as defined by the workflow. Each service handler constructs the appropriate back-end request for that query and issues it to the remote server. For example, the real-time service handler constructs a real-time search request and issues it to one or more real-time index servers asynchronously. We are using the <a href=\"https://github.com/twitter/commons\">twitter commons</a> library (recently open-sourced!) to provide connection-pool management, load-balancing, and dead host detection.</p> \n<p>The I/O thread that is processing the query is freed when all the back-end requests have been dispatched. A timer thread checks every few milliseconds to see if any of the back-end responses have returned from remote servers and sets a flag indicating if the request succeeded, timed out, or failed. We maintain one object over the lifetime of the search query to manage this type of data.</p> \n<p>Successful responses are aggregated and passed to the next batch of service handlers in the workflow pipeline. When all responses from the first batch have arrived, the second batch of asynchronous requests is made. This process is repeated until we have completed the workflow or the workflow’s timeout has elapsed.</p> \n<p>As you can see, throughout the execution of a workflow, no thread busy-waits on I/O. This allows us to efficiently use the CPU on our Blender machines and handle a large number of concurrent requests. 
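</p> \n<p>The dispatch pattern, reduced to a few lines of Python asyncio for illustration (Blender does the equivalent with Netty’s event loop and channel futures; <code>call</code> is a hypothetical coroutine that wraps one back-end request):</p> \n<pre><code>import asyncio\n\nasync def run_workflow(batches, call, query):\n    # Issue every service call in a batch concurrently, wait for\n    # the whole batch, then feed the results to the next batch.\n    # Production code would also enforce the workflow timeout.\n    results = {}\n    for batch in batches:\n        names = list(batch)\n        replies = await asyncio.gather(\n            *(call(name, query, results) for name in names))\n        results.update(zip(names, replies))\n    return results\n</code></pre> \n<p>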
We also save on latency as we can execute most requests to back-end services in parallel.</p> \n<h2>BLENDER DEPLOYMENT AND FUTURE WORK</h2> \n<p>To ensure a high quality of service while introducing Blender into our system, we are using the old Ruby on Rails front-end servers as proxies for routing thrift requests to our Blender cluster. Using the old front-end servers as proxies allows us to provide a consistent user experience while making significant changes to the underlying technology. In the next phase of our deploy, we will eliminate Ruby on Rails entirely from the search stack, connecting users directly to Blender and potentially reducing latencies even further.</p> \n<p>—<a href=\"https://twitter.com/intent/user?screen_name=twittersearch\">@twittersearch</a></p> \n<h2>ACKNOWLEDGEMENTS</h2> \n<p>The following Twitter engineers worked on Blender: Abhi Khune, Aneesh Sharma, Brian Larson, Frost Li, Gilad Mishne, Krishna Gade, Michael Busch, Mike Hayes, Patrick Lok, Raghavendra Prabhu, Sam Luckenbill, Tian Wang, Yi Zhuang, Zhenghua Li.</p>",
"date": "2011-04-06T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2011/twitter-search-is-now-3x-faster",
"domain": "engineering"
},
{
"title": "Improving Browser Security with CSP",
"body": "<p>If you are using Firefox 4, you now have an extra layer of security when accessing mobile.twitter.com.</p> \n<p>Over the past few weeks we’ve been testing a new security feature for our mobile site. It is called a Content Security Policy, or CSP. This policy is a standard developed by Mozilla that aims to thwart cross site scripting (XSS) attacks at their point of execution, the browser. The upcoming release of Firefox 4 implements CSP, and while the mobile site may not get a high volume of desktop browser traffic (the desktop users hitting that site typically have low bandwidth connections), it has given us an opportunity to test out a potentially powerful anti-XSS tool in a controlled setting.</p> \n<h2>CSP IN A NUTSHELL</h2> \n<p>In a typical XSS attack, the attacker injects arbitrary Javascript into a page, which is then executed by an end-user. When a website enables CSP, the browser ignores inline Javascript and only loads external assets from a set of whitelisted sites. Enabling CSP on our site was simply a matter of including the policy in the returned headers under the CSP defined key, ‘X-Content-Security-Policy’.</p> \n<p>The policy also contains a ‘reporting URI’ to which the browser sends JSON reports of any violations. This feature not only assists debugging of the CSP rules, it also has the potential to alert a site’s owner to emerging threats.</p> \n<h2>IMPLEMENTING THE FEATURE</h2> \n<p>Although activating CSP is easy, in order for it to work correctly you may need to modify your site. In our case it meant removing all inline Javascript. While it is good practice to keep inline Javascript out of your HTML, it is sometimes necessary to speed up the load times on slower high-latency mobile phones.</p> \n<p>We began our explorations by restricting the changes to browsers that support CSP (currently only Firefox 4) in order to lessen the impact on users. Next, we identified all the possible locations of our assets and built a rule set to encompass those; for example, things such as user profile images and stylesheets from our content delivery network.</p> \n<p>Our initial trials revealed that some libraries were evaluating strings of Javascript and triggering a violation, most notably jQuery 1.4, which tests the ‘eval’ function after load. This wasn’t totally unexpected and we modified some of the libraries to get them to pass. Since jQuery fixed this in 1.5, it is no longer an issue.</p> \n<h2>INITIAL RESULTS</h2> \n<p>After a soft launch, we ran into some unexpected issues. Several common Firefox extensions insert Javascript on page load, thereby triggering a report. However, even more surprising were the number of ISPs who were inadvertently inserting Javascript or altering image tags to point to their caching servers. It was the first example of how CSP gave us visibility into what was happening on the user’s end. We addressed this problem by mandating SSL for Firefox 4 users, which prevents any alteration of our content.</p> \n<p>Today CSP is one hundred percent live on mobile.twitter.com and we are logging and evaluating incoming violation reports.</p> \n<h2>FINAL THOUGHTS</h2> \n<p>Allowing sites like Twitter to disable inline Javascript and whitelist external assets is a huge step towards neutralizing XSS attacks. However, for many sites it is not going to be as simple as flipping a switch. Most sites will require some work and you may need to alter a few third-party Javascript libraries. 
That client-side cleanup could, depending on how complex your site is, entail the bulk of your effort.</p> \n<p>We hope other browsers will adopt the CSP standard, especially as more sites depend on client-side code and user-generated content. The simple option of being able to disable inline Javascript and limit external sources gives sites the ability to stop the vast majority of today’s attacks with minimal effort.</p> \n<p>Over the next couple of months we plan to implement a Content Security Policy across more of Twitter, and we encourage you to request support for this standard in your preferred browser.</p> \n<h2>ACKNOWLEDGEMENTS</h2> \n<p>The following people at Twitter contributed to the CSP effort: John Adams, Jacob Hoffman-Andrews, Kevin Lingerfelt, Bob Lord, Mark Percival, and Marcus Philips.</p> \n<h2>FURTHER READING</h2> \n<p><a href=\"http://blog.mozilla.com/security/2011/03/22/creating-a-safer-web-with-content-security-policy/\">Mozilla CSP announcement</a></p> \n<p><a href=\"https://developer.mozilla.org/en/Security/CSP\">Mozilla CSP Doc Center</a></p> \n<p><a href=\"https://wiki.mozilla.org/Security/CSP/Specification\">CSP Spec</a></p> \n<p><a href=\"http://people.mozilla.org/~bsterne/content-security-policy/demo.cgi\">CSP Demo Page</a></p> \n<p>—Mark (<a href=\"https://twitter.com/intent/user?screen_name=mdp\">@mdp</a>)</p>",
"date": "2011-03-22T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2011/improving-browser-security-with-csp",
"domain": "engineering"
},
{
"title": "The Great Migration, the Winter of 2011",
"body": "<p>If you look back at the history of Twitter, our rate of growth has largely outpaced the capacity of our hardware, software, and the company itself. Indeed, in our first five years, Twitter’s biggest challenge was coping with our unprecedented growth and sightings of the infamous Fail Whale.<br><br> These issues came to a head <a href=\"http://engineering.twitter.com/2010/06/perfect-stormof-whales.html\">last June</a> when Twitter experienced more than ten hours of downtime. However, unlike past instances of significant failure, we said at the time that that we had a long-term plan.<br><br> Last September, we began executing on this plan and undertook the most significant engineering challenge in the history of Twitter. We hope it will have a significant impact the service’s success for many years to come. During this time, the engineers and operations teams moved Twitter’s infrastructure to a new home while making changes to our infrastructure and our organization that will ensure that we can constantly stay abreast of our capacity needs; give users and developers greater reliability; and, allow for new product offerings.<br><br> This was our season of migration.<br><br></p> \n<h2>Redesigning and Rebuilding the Bird Mid-flight</h2> \n<p><a href=\"https://twitter.com/about/opensource\">Under the hood</a>, Twitter is a complex yet elegant distributed network of <a href=\"https://github.com/robey/kestrel\">queues</a>, daemons, <a href=\"https://github.com/fauna/memcached\">caches</a>, and <a href=\"https://github.com/twitter/flockdb\">databases</a>. Today, the feed and care of Twitter requires more than 200 engineers to keep the site growing and running smoothly. What did moving the entirety of Twitter while improving up-time entail? Here’s a simplified version of what we did.<br><br> First, our engineers extended many of Twitter’s core systems to replicate Tweets to multiple data centers. Simultaneously, our operations engineers divided into new teams and built new processes and software to allow us to qualify, burn-in, deploy, tear-down and monitor the thousands of servers, routers, and switches that are required to build out and operate Twitter. With hardware at a second data center in place, we moved some of our non-runtime systems there – giving us headroom to stay ahead of tweet growth. This second data center also served as a staging laboratory for our replication and migration strategies. Simultaneously, we prepped a third larger data center as our final nesting ground.<br><br> Next, we set out rewiring the rocket mid-flight by writing Tweets to both our primary data center and the second data center. Once we proved our replication strategy worked, we built out the full Twitter stack, and copied all 20TB of Tweets, from <a href=\"https://twitter.com/intent/user?screen_name=jack\">@jack</a>’s <a href=\"https://twitter.com/#!/jack/status/29\">first</a> to <a href=\"https://twitter.com/intent/user?screen_name=honeybadger\">@honeybadger</a>’s latest Tweet to the second data center. Once all the data was in place we began serving live traffic from the second data center for end-to-end testing and to continue to shed load from our primary data center. Confident that our strategy for replicating Twitter was solid, we moved on to the final leg of the migration, building out and moving all of Twitter from the first and second data centers to the final nesting grounds. 
This essentially required us to move much of Twitter two times.<br><br> What’s more, during the migration we set a new Tweet-per-second <a href=\"http://engineering/2011/01/celebrating-new-year-with-new-tweet.html\"> record</a>, <a href=\"http://engineering/2011/03/numbers.html\"> continued to grow</a>, and launched <a href=\"https://twitter.com/#!/download\">new</a> <a href=\"http://engineering/2011/02/translating-twitter-into-more-languages.html\">products</a>, all while improving the <a href=\"http://engineering/2011/03/making-twitter-more-secure-https.html\"> security</a> and up-time of our service.<br><br></p> \n<h2>A Flock</h2> \n<p>The effort and planning behind this move were huge. Vacations were put off, weekends were worked, and more than a few strategic midnight oil reserves were burned in this two-stage move. The technical accomplishments by the operations and engineering teams that made this move possible were immense. Equally great was the organization and alignment of the engineering and operations teams, and their ability to create lightweight, robust processes where none had existed before. Without this cohesion, this flocking of sorts, none of this would have been possible.<br><br> Though spring is here, and this particular season of migration is over, it represents more of a beginning than an ending. This move gives us the capacity to deliver Tweets with greater reliability and speed, and creates more runway to focus on the most interesting operations and engineering problems. It’s an immense opportunity to innovate and build the products and technologies that our users request and our talented engineers love to develop.<br><br> —The Twitter Engineering Team<br><br> P.S. Twitter is hiring across engineering and operations. If you want to develop novel systems that scale on the order of billions, <a href=\"http://twitter.com/jobs-engineering.html\">join the flock</a>.</p>",
"date": "2011-03-21T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2011/the-great-migration-the-winter-of-2011",
"domain": "engineering"
},
{
"title": "Building a Faster Ruby Garbage Collector",
"body": "<p>Since late 2009, much of www.twitter.com has run on Ruby Enterprise Edition (REE), a modified version of the standard MRI 1.8.7 Ruby interpreter. At the time, we worked with the REE team to integrate some third-party patches that allowed us to tune the garbage collector for long-lived workloads. We knew this was not a perfect choice, but a switch to a new runtime (even MRI 1.9x) would introduce compatibility problems, and testing indicated that alternative runtimes are not necessarily faster for our workload. Nevertheless, the CPU cost of REE remained too high.</p> \n<p>To address this problem, we decided to explore options for optimizing the REE runtime. We called this effort Project Kiji, after the Japanese bird.</p> \n<h2>Inefficient garbage collection</h2> \n<p>Our performance measurements revealed that even after our patches, the Ruby runtime uses a significant fraction of the CPU for running the garbage collector on twitter.com. This is largely because MRI’s garbage collector uses a single heap:</p> \n<ul>\n <li>The garbage collector’s naive stop-the-world mark-and-sweep process accesses the entire memory set several times. It first marks all objects at the “root-set” level as “in-use” and then reexamines all the objects to release the memory of those not in use. Additionally, the collector suspends the system during every sweep, thereby periodically “freezing” some of the programs.</li> \n <li>The collection process is not generational. That is, the collector does not move objects between heaps; they all stay at the same address for their lifetime. The resulting fragmented memory extracts a penalty in bookkeeping cost because it can neither be consolidated nor discarded.</li> \n</ul>\n<p>We needed to make the garbage collector more efficient but had limited options. We couldn’t easily change the runtime’s stop-the world-process because internally it relies on being single-threaded. Neither could we implement a real generational collector because our interpreter relies on objects staying at the same address in memory.</p> \n<h2>Two heaps are better than one</h2> \n<p>While we could not change the location of an allocated object, we assumed we could allocate the objects in a different location, depending on their expected lifetime. So, first, we separated the live objects into transient objects and long-lived objects. Next, we added another heap to Ruby and called it “longlife” heap.</p> \n<p>According to the current implementation, Kiji has two heaps:</p> \n<ul>\n <li>An ordinary heap that has a variety of objects, including the transient objects</li> \n <li>A longlife heap for objects that we expect to be long-lived.<br><br></li> \n</ul>\n<h2>How the longlife heap works</h2> \n<p>In the new configuration, AST (Abstract Syntax Tree) nodes occur in longlife heaps. They are a parsed representation of the Ruby programs’ source code construct (such as name, type, expression, statement, or declaration). Once loaded, they tend to stick in memory, and are largely immutable. They also occupy a large amount of memory: in twitter.com’s runtime, they account for about 60% of live objects at any given time. By placing the AST nodes in a separate longlife heap and running an infrequent garbage collection only on this heap, we saw a significant performance gain: the time the CPU spends in garbage collection reduced from 18.5% to 14%.</p> \n<p>With infrequent collection, the system retains some garbage in memory, which increases overall memory use. 
We experimented with various scheduling strategies for longlife garbage collection that balanced the tradeoff between CPU usage and memory usage. We finally selected a strategy that triggers a collection scheduled to synchronize with the 8th collection cycle of the ordinary heap if an allocation occurs in the longlife heap. If the longlife heap does not receive an allocation, subsequent collections on the longlife heap occur after the 16th, 32nd, 64th collection cycle and so on, with each occurrence increasing exponentially.</p> \n<h2>Improved mark phase</h2> \n<p>A second heap improved garbage collection but we needed to ensure that the objects in the longlife heap continued to keep alive those objects they referenced in the ordinary heap. Because the heaps were now separate, we were processing the majority of our ordinary heap collections without a longlife collection. Therefore, ordinary objects—reachable only through longlife objects—would not be marked as live and could, mistakenly, be swept as garbage. We needed to maintain a set of “remembered objects” or boundary objects that would live in the ordinary heap but were directly referenced by objects living in the longlife heap.</p> \n<p>This proved to be a far greater challenge than originally expected. At first we added objects to the remembered set whenever we constructed an AST node. However, the AST nodes are not uniformly immutable. Following a parse, the Ruby interpreter tends to rewrite them immediately to implement small optimizations on them. This frequently rewrites the pointers between objects. We overcame this problem by implementing an algorithm that is similar to the mark phase, except that it is not recursive and only discovers direct references from the longlife to the ordinary heap. We run the algorithm at ordinary collection time when we detect that prior changes have occurred in the longlife heap. The run decreases in frequency over time; the longer the process runs, the more the loaded code stagnates. In other words, we are consciously optimizing for long-running processes.</p> \n<p>An additional optimization ensures that if an ordinary object points to a longlife object, the marking never leaves the ordinary heap during the mark phase. This is because all outgoing pointers from the longlife heap to the ordinary heap are already marked as remembered objects. The mark algorithm running through the longlife heap reference chains does not mark any new objects in the ordinary heap.</p> \n<h2>Results</h2> \n<p>The graph below shows the performance curves of the twitter.com webapp on various Ruby interpreter variants on a test machine with a synthetic load (not indicative of our actual throughput). We took out-of-the-box Ruby MRI 1.8.7p248, REE 2010.02 (on which Kiji is based), and Kiji. In addition, we tested REE and Kiji in two modes: one with the default settings, the other with GC_MALLOC_LIMIT tuned to scale back speculative collection. We used httperf to stress the runtimes with increasing numbers of requests per second, and measured the rate of successful responses per second. 
As you can see, the biggest benefit comes from the GC tuning, but Kiji’s two-heap structure also creates a noticeable edge over standard REE.</p> \n<p><a href=\"http://1.bp.blogspot.com/-XYqJXFAnsP8/TXFuVlzXsaI/AAAAAAAAABA/fWwPaMfT16A/s1600/request-response-rates.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/building_a_fasterrubygarbagecollector95.thumb.1280.1280.png\" alt=\"Building a Faster Ruby Garbage Collector\"></a></p> \n<p>We have also measured the CPU percentage spent in the GC for these variants, this time with actual production traffic. We warmed up the runtimes first, then explicitly shut off allocations in the longlife heap once it was unlikely that any genuinely long-lived AST nodes would be generated. The results are below.</p> \n<p><a href=\"http://3.bp.blogspot.com/--BuN8WupGZc/TXFedNofclI/AAAAAAAAAA4/B4YWL_FQGpw/s1600/gc-percentage.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/building_a_fasterrubygarbagecollector96.thumb.1280.1280.png\" alt=\"Building a Faster Ruby Garbage Collector\"></a></p> \n<h2>Lessons and future trends</h2> \n<p>With Kiji, the garbage collector now consumes a smaller portion of the CPU for Ruby processes, and more importantly, enables additional improvements. As we identify additional objects to move to the longlife heap, we can further decrease the overall CPU usage. The theoretical floor is about 5% of total CPU time spent in the GC. We will be posting updates with new results.</p> \n<h2>References</h2> \n<p>A major source of inspiration was the patch by Narihiro Nakamura (available <a href=\"http://github.com/authorNari/patch_bag/blob/ff245102f66566c8f8a2aba67215493f1d064414/ruby/gc_partial_longlife_r23386.patch\">here</a>). Narihiro’s patch is a proof-of-concept; it handles few AST nodes. It is also written as a patch for MRI 1.9, and we needed to cross-port it to REE, which is a derivative of MRI 1.8.7. We substantially extended Narihiro’s work to include algorithmic boundary set calculation and stop the ordinary mark from leaving the ordinary heap. We also ensured our solution integrated well with REE’s strategy for avoiding copy-on-write of objects in forked processes in the mark phase. These changes delivered significant gains.</p> \n<h2>Try it!</h2> \n<p>We have released the <a href=\"https://github.com/twitter/rubyenterpriseedition187-248/\">Kiji REE branch</a> on GitHub, and hope that a future version will be suitable for merging into REE itself. In our case, switching to Kiji brought a 13% increase in the number of requests twitter.com can serve per second.</p> \n<h2>Acknowledgements</h2> \n<p>The following engineers at Twitter contributed to the REE improvements: Rob Benson, Brandon Mitchell, Attila Szegedi, and Evan Weaver.</p> \n<p>— Attila (<a href=\"https://twitter.com/intent/user?screen_name=asz\">@asz</a>)</p>",
"date": "2011-03-04T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2011/building-a-faster-ruby-garbage-collector",
"domain": "engineering"
},
{
"title": "Hack Week",
"body": "<p><a href=\"http://1.bp.blogspot.com/_CwxOkZF2NIo/TMIHnfMrT3I/AAAAAAAAABk/Jzbq-lrqRFQ/s1600/WastelandRomanceKickoff.jpg\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/hack_week95.thumb.1280.1280.png\" alt=\"Hack Week\"></a></p> \n<p>Here at Twitter, we make things. Over the last five weeks, we’ve launched the new Twitter and made significant changes to the technology behind Twitter.com, deployed a new backend for search, and refined the algorithm for trending topics to make them more real-time.</p> \n<p>To keep with the spirit of driving innovation in engineering, we’ll be holding our first Hack Week starting today (Oct 22) and running through next Friday (Oct 29). In this light, we’ll all be building things that are separate from our normal work and not part of our day-to-day jobs. Of course, we’ll keep an eye out for whales.</p> \n<p>There aren’t many rules – basically we’ll work in small teams and share our projects with the company at the end of the week. What will happen with each project will be determined once it’s complete. Some may ship immediately, others may be added to the roadmap and built out in the future, and the remainder may serve as creative inspiration.</p> \n<p>If you have an idea for one of our teams, send a tweet to <a href=\"https://twitter.com/intent/user?screen_name=hackweek\">@hackweek</a>. We’re always looking for feedback.</p>",
"date": "2010-10-22T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2010/hack-week",
"domain": "engineering"
},
{
"title": "Twitter&#39;s New Search Architecture",
"body": "<p>If we have done a good job then most of you shouldn’t have noticed that we launched a new backend for search on <a href=\"http://twitter.com/\">twitter.com</a> during the last few weeks! One of our main goals, but also biggest challenges, was a smooth switch from the old architecture to the new one, without any downtime or inconsistencies in search results. Read on to find out what we changed and why.</p> \n<p>Twitter’s real-time search engine was, until very recently, based on the technology that Summize originally developed. This is quite amazing, considering the explosive growth that Twitter has experienced since the Summize acquisition. However, scaling the old MySQL-based system had become increasingly challenging.</p> \n<h2>The new technology</h2> \n<p>About 6 months ago, we decided to develop a new, modern search architecture that is based on a highly efficient inverted index instead of a relational database. Since we love Open Source here at Twitter we chose <a href=\"http://lucene.apache.org/java/docs/index.html\">Lucene</a>, a search engine library written in Java, as a starting point.</p> \n<p>Our demands on the new system are immense: With over 1,000 TPS (Tweets/sec) and 12,000 QPS (queries/sec) = over 1 billion queries per day (!) we already put a very high load on our machines. As we want the new system to last for several years, the goal was to support at least an order of magnitude more load.</p> \n<p>Twitter is real-time, so our search engine must be too. In addition to these scalability requirements, we also need to support extremely low indexing latencies (the time it takes between when a Tweet is tweeted and when it becomes searchable) of less than 10 seconds. Since the indexer is only one part of the pipeline a Tweet has to make it through, we needed the indexer itself to have a sub-second latency. Yes, we do like challenges here at Twitter! (btw, if you do too: <a href=\"http://twitter.com/jointheflock\">@JoinTheFlock</a>!)</p> \n<h2>Modified Lucene</h2> \n<p>Lucene is great, but in its current form it has several shortcomings for real-time search. That’s why we rewrote big parts of the core in-memory data structures, especially the posting lists, while still supporting Lucene’s standard APIs. This allows us to use Lucene’s search layer almost unmodified. Some of the highlights of our changes include:</p> \n<ul>\n <li>significantly improved garbage collection performance</li> \n <li>lock-free data structures and algorithms</li> \n <li>posting lists, that are traversable in reverse order</li> \n <li>efficient early query termination</li> \n</ul>\n<p>We believe that the architecture behind these changes involves several interesting topics that pertain to software engineering in general (not only search). We hope to continue to share more on these improvements.</p> \n<p>And, before you ask, we’re planning on contributing all these changes back to Lucene; some of which have already made it into Lucene’s trunk and its new realtime branch.</p> \n<h2>Benefits</h2> \n<p>Now that the system is up and running, we are very excited about the results. We estimate that we’re only using about 5% of the available backend resources, which means we have a lot of headroom. Our new indexer could also index roughly 50 times more Tweets per second than we currently get! 
And the new system runs extremely smoothly, without any major problems or instabilities (knock on wood).</p> \n<p>But you might wonder: Fine, it’s faster, and you guys can scale it longer, but will there be any benefits for the users? The answer is definitely yes! The first difference you might notice is the bigger index, which is now twice as long — without making searches any slower. And, maybe most importantly, the new system is extremely versatile and extensible, which will allow us to build cool new features faster and better. Stay tuned!</p> \n<p>The engineers who implemented the search engine are: <a href=\"http://twitter.com/michibusch\">Michael Busch</a>, <a href=\"http://twitter.com/krishnagade\">Krishna Gade</a>, <a href=\"http://twitter.com/JugglingPumba\">Mike Hayes</a>, <a href=\"http://twitter.com/akhune\">Abhi Khune</a>, <a href=\"http://twitter.com/larsonite\">Brian Larson</a>, <a href=\"http://twitter.com/plok\">Patrick Lok</a>, <a href=\"http://twitter.com/sam\">Samuel Luckenbill</a>, <a href=\"http://twitter.com/pbrane\">Jake Mannix</a>, <a href=\"http://twitter.com/jreichhold\">Jonathan Reichhold</a>.</p>",
"date": "2010-10-06T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2010/twitters-new-search-architecture",
"domain": "engineering"
},
{
"title": "Tool Legit",
"body": "<p>Hi, I’m <a href=\"http://www.twitter.com/stirman\">@stirman</a>, and I’m a tool.</p> \n<p>Well, I build tools, along with <a href=\"http://twitter.com/jacobthornton\">@jacobthornton</a>, <a href=\"http://twitter.com/gbuyitjames\">@gbuyitjames</a> and <a href=\"http://twitter.com/sm\">@sm</a>, the Internal Tools team here at Twitter.</p> \n<h2>Build or buy?</h2> \n<p>To build or not to build internal tools is usually a debated topic, especially amongst startups. Investing in internal projects has to be weighed against investing in external-facing features for your product, although at some point the former investment shows greater external returns than the latter. Twitter has made it a priority to invest in internal tools since the early days, and with the growth of the product and the company, our tools have become a necessity.</p> \n<p>I often hear from friends in the industry about internal tools being a night and weekend additional project for engineers that are already backlogged with “real” work. We have decided to make building tools our “real” work. This decision means we have time to build solid applications, spend the necessary time to make them look great and ensure that they work well.</p> \n<h2>Noble goals</h2> \n<p>Our team’s mission is to increase productivity and transparency throughout the company. We increase productivity by streamlining processes and automating tasks. We increase transparency by building tools and frameworks that allow employees to discover, and be notified of, relevant information in real time. Many companies use the term “transparency” when discussing their company culture, but very few put the right pieces in place to ensure that a transparent environment can be established without exposing too much information. Twitter invests heavily in my team so that we can build the infrastructure to ensure a healthy balance.</p> \n<h2>Example apps</h2> \n<p>We have built tools that track and manage milestones for individual teams, manage code change requests, provide an easy A/B testing framework for twitter.com, create internal short links, get approval for offer letters for new candidates, automate git repository creation, help conduct fun performance reviews and many more. We release a new tool about once every other week. We release a first version as early as possible, and then iterate quickly after observing usage and gathering feedback.</p> \n<p>Also, with the help of <a href=\"http://twitter.com/mdo\">@mdo</a>, we have put together a internal blueprint site that not only contains a style guide for new apps, but also hosts shared stylesheets, javascript libraries and code samples, like our internal user authentication system, to make spinning up a new tool as simple as possible.</p> \n<p>We put a lot of effort into ensuring our tools are easy to use and making them look great. We have fun with it. Here’s a screenshot of a recent app that tracks who’s on call for various response roles at any given time.</p> \n<p><a href=\"http://1.bp.blogspot.com/_BDCdWZhyt4A/TKUTkmYT4kI/AAAAAAAAAB4/gaLJQAmAls8/s1600/eng_blog_carousel.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/tool_legit95.thumb.1280.1280.png\" alt=\"Tool Legit\"></a></p> \n<h2>Time to play</h2> \n<p>We also have fun learning new technologies. Here’s a screenshot of a real-time Space Invaders Twitter sentiment analysis visualization that is part of a status board displayed on flat screens around the office. 
<a href=\"http://twitter.com/jacobthornton\">@jacobthornton</a> wanted to learn more about node.js for some upcoming projects and he built “Space Tweets” to do just that! If you’re interested in the code, get it on <a href=\"http://github.com/jacobthornton/space-tweet\">github</a>.</p> \n<p></p> \n<h2>Giving back</h2> \n<p>While we’re talking about open source, we would like to mention how much our team values frameworks like <a href=\"http://rubyonrails.org/\">Ruby on Rails</a>, <a href=\"http://mootools.net/\">MooTools</a> and their respective communities, all of which are very important to our internal development efforts and in which you’ll find us actively participating by submitting patches, debating issues, etc. We are proactively working towards open sourcing some of our own tools in the near future, so keep an eye on this blog.</p> \n<h2>Join us</h2> \n<p>Does this stuff interest you? Are you a tool? Hello? Is this thing on? Is anyone listening? (If you are still here, you passed the test! <a href=\"http://twitter.com/job.html?jvi=oSbdVfwV,Job\">Apply here</a> to join our team or hit me up at <a href=\"http://www.twitter.com/stirman\">@stirman</a>!)</p>",
"date": "2010-09-30T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2010/tool-legit",
"domain": "engineering"
},
{
"title": "The Tech Behind the New Twitter.com",
"body": "<p>The <a href=\"http://engineering/2010/09/better-twitter.html\">Twitter.com redesign</a> presented an opportunity to make bold changes to the underlying technology of the website. With this in mind, we began implementing a new architecture almost entirely in JavaScript. We put special emphasis on ease of development, extensibility, and performance. Building the application on the client forced us to come up with unique solutions to bring our product to life, a few of which we’d like to highlight in this overview.</p> \n<h2>API Client</h2> \n<p>One of the most important architectural changes is that Twitter.com is now a client of our own API. It fetches data from the same endpoints that the mobile site, our apps for iPhone, iPad, Android, and every third-party application use. This shift allowed us to allocate more resources to the API team, generating over 40 patches. In the initial page load and every call from the client, all data is now fetched from a highly optimized JSON fragment cache.</p> \n<h2>The Javascript API</h2> \n<p>We built a JavaScript library to access Twitter’s REST API for <a href=\"http://dev.twitter.com/anywhere\">@anywhere</a> which provided a good starting point for development on this project. The JavaScript API provides API fetching and smart client-side caching, both in-memory and using localStorage, allowing us to minimize the number of network requests made while using Twitter.com. For instance, timeline fetches include associated user data for each Tweet. The resulting user objects are proactively cached, so viewing a profile does not require unnecessary fetches of user data.</p> \n<p>Another feature of the JavaScript API is that it provides event notifications before and after each API call. This allows components to register interest and respond immediately with appropriate changes to the UI, while letting independent components remain decoupled, even when relying on access to the same data.</p> \n<h2>Page Management</h2> \n<p>One of the goals with this project was to make page navigation easier and faster. Building on the web’s traditional analogy of interlinked documents, our application uses a page routing system that maintains a strong relationship between a URL and its content. This allows us to provide a rich web application that behaves like a traditional web site. Doing so demanded that we develop a rich routing model on the client. To do so we developed a routing system to switch between stateful pages, driven by the URL hash. As the user navigates, the application caches the visited pages in memory. Although the information on those pages can quickly become stale, we’ve alleviated much of this complexity by making pages subscribe to events from the JavaScript API and keep themselves in sync with the overall application state.</p> \n<h2>The Rendering Stack</h2> \n<p>In order to support crawlers and users without JavaScript, we needed a rendering system that runs on both server and client. To meet this need, we’ve built our rendering stack around <a href=\"http://mustache.github.com/\">Mustache</a>, and developed a view object system that generates HTML fragments from API objects. We’ve also extended Mustache to support internationalized string substitution.</p> \n<p>Much attention was given to optimizing performance in the DOM. For example, we’ve implemented event delegation across the board, which has enabled a low memory profile without worrying about event attachment. 
Most of our UI is made out of reusable components, so we’ve centralized event handling to a few key root nodes. We also minimize repaints by building full HTML structures before they are inserted and attach relevant data in the HTML rendering step, rather than through DOM manipulation.</p> \n<h2>Inline Media</h2> \n<p>One important product feature was embedding third-party content directly on the website whenever tweet links to one of our content partners. For many of these partners, such as <a href=\"http://www.kiva.org/\">Kiva</a> and <a href=\"http://vimeo.com/\">Vimeo</a>, we rely on the <a href=\"http://www.oembed.com/\">oEmbed</a> standard, making a simple JSON-P request to the content provider’s domain and embeds content found in the response. For other media partners, like <a href=\"http://twitpic.com/\">TwitPic</a> and <a href=\"http://www.youtube.com/\">YouTube</a>, we rely on known embed resources that can be predicted from the URL, which reduces network requests and results in a speedier experience.</p> \n<h2>Open Source</h2> \n<p>Twitter has always embraced open-source technology, and the new web client continues in this tradition. We used <a href=\"http://jquery.com/\">jQuery</a>, <a href=\"http://mustache.github.com/\">Mustache</a>, <a href=\"http://labjs.com/\">LABjs</a>, <a href=\"http://www.modernizr.com/\">Modernizr</a>, and numerous other open-source scripts and jQuery plugins. We owe a debt of gratitude to the authors of these libraries and many others in the JavaScript community for their awesome efforts in writing open-source JavaScript. We hope that, through continuing innovations in front-end development here at Twitter, we’ll be able to give back to the open-source community with some of our own technology.</p> \n<h2>Conclusions</h2> \n<p>With <a href=\"http://twitter.com/newtwitter\">#NewTwitter</a>, we’ve officially adopted JavaScript as a core technology in our organization. This project prompted our first internal JavaScript summit, which represents an ongoing effort to exchange knowledge, refine our craft and discover new ways of developing for the web. We’re very excited about the doors this architectural shift will open for us as we continue to invest more deeply in rich client experiences. If you’re passionate about JavaScript, application architecture, and Twitter, now is a very exciting time to <a href=\"http://twitter.com/jointheflock\">@JoinTheFlock</a>!</p> \n<p>This application was engineered in four months by seven core engineers: <a href=\"http://twitter.com/mracus\">Marcus Phillips</a>, <a href=\"http://twitter.com/bs\">Britt Selvitelle</a>, <a href=\"http://twitter.com/hoverbird\">Patrick Ewing</a>, <a href=\"http://twitter.com/bcherry\">Ben Cherry</a>, <a href=\"http://twitter.com/ded\">Dustin Diaz</a>, <a href=\"http://twitter.com/dsa\">Russ d’Sa</a>, and <a href=\"http://twitter.com/esbie\">Sarah Brown</a>, with numerous contributions from around the company.</p>",
"date": "2010-09-20T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2010/the-tech-behind-the-new-twittercom",
"domain": "engineering"
},
{
"title": "My Awesome Summer Internship at Twitter",
"body": "<p>On my second day at Twitter, I was writing documentation for the systems I was going to work on (to understand them better), and I realized that there was a method in the service’s API that should be exposed but wasn’t. I pointed this out to my engineering mentor, Steve Jenson (<a href=\"https://twitter.com/intent/user?screen_name=stevej\">@stevej</a>). I expected him to ignore me, or promise to fix it later. Instead, he said, “Oh, you’re right. What are you waiting for? Go ahead and fix it.” After 4 hours, about 8 lines of code, and a code review with Steve, I committed my first code at Twitter.</p> \n<p>My name is Siddarth Chandrasekaran (<a href=\"https://twitter.com/intent/user?screen_name=sidd\">@sidd</a>). I’m a rising junior at Harvard studying Computer Science and Philosophy, and I just spent the last 10 weeks at Twitter as an intern with the Infrastructure team.</p> \n<p>When I started, I had very little real-world experience — I’d never coded professionally before – so, I was really excited and really nervous. I spent the first couple of weeks understanding the existing code base (and being very excited that I sat literally three cubicles away from Jason Goldman! (<a href=\"https://twitter.com/intent/user?screen_name=goldman\">@goldman</a>)). I remember my first “teatime” (Twitter’s weekly Friday afternoon company all-hands), when Evan Williams (<a href=\"https://twitter.com/intent/user?screen_name=ev\">@ev</a>) broke into song in the middle of his presentation, dramatically launching the karaoke session that followed teatime.</p> \n<p>Over the next few weeks, I worked on a threshold monitoring system: a Scala service that facilitates defining basic “rules” (thresholds for various metrics), and monitors these values using a timeseries database. The goal was to allow engineers to be able to easily define and monitor their own thresholds. I was extremely lucky to have the opportunity to build such a critical piece of infrastructure, with abundant guidance from Ian Ownbey (<a href=\"https://twitter.com/intent/user?screen_name=iano\">@iano</a>). Writing an entire service from scratch was scary, but as a result, I also learned a lot more than I expected. It was perfect: I was working independently, but could turn to my co-workers for help anytime.</p> \n<p>There are several things that I’ve loved during my time at Twitter:</p> \n<p>On my third day at work, I got to see the President of Russia.</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/my_awesome_summerinternshipattwitter95.thumb.1280.1280.png\" alt=\"My Awesome Summer Internship at Twitter\"></p> \n<p>A few weeks later, Kanye West (<a href=\"https://twitter.com/intent/user?screen_name=kanyewest\">@kanyewest</a>) “dropped by” for lunch.</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/my_awesome_summerinternshipattwitter96.thumb.1280.1280.png\" alt=\"My Awesome Summer Internship at Twitter\"></p> \n<p>I was in the amazing “Class of Twitter HQ” recruiting video.</p> \n<p></p> \n<p>Every day at Twitter has given me something to be very excited about: the snack bars, the delicious lunches, teatime, random rockband sessions, the opportunity to work on some really cool stuff with very smart people, and most importantly, being part of a company that is caring and honest. 
My co-workers have artful, creative, daring, and ingenious approaches to the hard engineering problems that Twitter faces, and the company supports them by providing a culture of trust and openness. As an intern, it has been an overwhelmingly positive experience to be part of such a culture. Needless to say, I will miss Twitter very dearly, and I’m very thankful for this opportunity.</p> \n<p>What are you waiting for? Join The Flock! (<a href=\"https://twitter.com/intent/user?screen_name=jointheflock\">@jointheflock</a>)</p> \n<p>—Siddarth (<a href=\"https://twitter.com/intent/user?screen_name=sidd\">@sidd</a>)</p>",
"date": "2010-08-27T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2010/my-awesome-summer-internship-at-twitter",
"domain": "engineering"
},
{
"title": "Twitter &amp; Performance: An update",
"body": "<p>On Monday, a fault in the database that stores Twitter user records caused problems on both Twitter.com and our API. The short, non-technical explanation is that a mistake led to some problems that we were able to fix without losing any data.</p> \n<p>While we were able to resolve these issues by Tuesday morning, we want to talk about what happened and use this an opportunity to discuss the recent progress we’ve made in improving Twitter’s performance and availability. We recently covered these topics in a pair of June posts <a href=\"http://engineering.twitter.com/2010/06/perfect-stormof-whales.html\">here</a> and on our <a href=\"http://engineering/2010/06/whats-happening-with-twitter.html\">company blog</a>).</p> \n<h2>Riding a rocket</h2> \n<p>Making sure Twitter is a stable platform and a reliable service is our number one priority. The bulk of our engineering efforts are currently focused on this effort, and we have moved resources from other important projects to focus on the issue.</p> \n<p>As we said last month, keeping pace with record growth in Twitter’s user base and activity presents some unique and complex engineering challenges. We frequently compare the tasks of scaling, maintaining, and tweaking Twitter to building a rocket in mid-flight.</p> \n<p>During the World Cup, Twitter <a href=\"http://engineering/2010/07/2010-world-cup-global-conversation.html\">set records</a> <a href=\"http://engineering/2010/06/another-big-record-part-deux.html\">for</a> <a href=\"http://engineering/2010/06/big-goals-big-game-big-records.html\">usage</a>. While the event was happening, our operations and infrastructure engineers worked to improve the performance and stability of the service. We have made more than 50 optimizations and improvements to the platform, including:</p> \n<ul>\n <li>Doubling the capacity of our internal network;</li> \n <li>Improving the monitoring of our internal network;</li> \n <li>Rebalancing the traffic on our internal network to redistribute the load;</li> \n <li>Doubling the throughput to the database that stores tweets;</li> \n <li>Making a number of improvements to the way we use memcache, improving the speed of Twitter while reducing internal network traffic; and,</li> \n <li>Improving page caching of the front and profile pages, reducing page load time by 80 percent for some of our most popular pages.</li> \n</ul>\n<h2>So what happened Monday?</h2> \n<p>While we’re continuously improving the performance, stability and scalability of our infrastructure and core services, there are still times when we run into problems unrelated to Twitter’s capacity. That’s what happened this week.</p> \n<p>On Monday, our users database, where we store millions of user records, got hung up running a long-running query; as a result, most of the table became locked. The locked users table manifested itself in many ways: users were unable to sign-up, sign in, update their profile or background images, and responses from the API were malformed, rendering the response unusable to many of the API clients. In the end, this affected most of the Twitter ecosystem: our mobile, desktop, and web-based clients, the Twitter support and help system, and Twitter.com.</p> \n<p>To remedy the locked table, we force-restarted the database server in recovery mode, a process that took more than 12 hours (the database covers records for more than 125 million users — that’s a lot of records). During the recovery, the users table and related tables remained unavailable. 
Unfortunately, even after the recovery process completed, the table remained in an unusable state. Finally, yesterday morning we replaced the partially-locked user db with a copy that was fully available (in the parlance of database admins everywhere, we promoted a slave to master), fixing the database and all of the related issues.</p> \n<p>We have taken steps to ensure we can more quickly detect and respond to similar issues in the future. For example, we are prepared to more quickly promote a slave db to a master db, and we put additional monitoring in place to catch errant queries like the one that caused Monday’s incidents.</p> \n<h2>Long-term solutions</h2> \n<p>As we said last month, we are working on long-term solutions to make Twitter more reliable (news that we are moving into our own data center this fall, <a href=\"http://engineering.twitter.com/2010/07/room-to-grow-twitter-data-center.html\">which we announced this afternoon</a>, is just one example). This will take time, and while there has been short-term pain, our capacity has improved over the past month.</p> \n<p>Finally, despite the rapid growth of our company, we’re still a relatively small crew maintaining a comparatively large (rocket) ship. We’re actively looking for engineering talent, with more than 20 openings currently. If you’re interested in learning more about the problems we’re solving or “<a href=\"http://twitter.com/jointheflock\">joining the flock</a>,” check out our <a href=\"http://twitter.com/positions.html\">jobs page</a>.</p>",
"date": "2010-07-21T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2010/twitter-performance-an-update",
"domain": "engineering"
},
{
"title": "Room to grow: a Twitter data center",
"body": "<p>Later this year, Twitter is moving our technical operations infrastructure into a new, custom-built data center in the Salt Lake City area. We’re excited about the move for several reasons.</p> \n<p>First, Twitter’s user base has continued to grow steadily in 2010, with over 300,000 people a day signing up for new accounts on an average day. Keeping pace with these users and their Twitter activity presents some unique and complex engineering challenges (as John Adams, our lead engineer for application services, <a href=\"http://www.youtube.com/watch?v=_7KdeUIvlvw\">noted in a speech</a> last month at the O’Reilly Velocity conference). Having dedicated data centers will give us more capacity to accommodate this growth in users and activity on Twitter.</p> \n<p>Second, Twitter will have full control over network and systems configuration, with a much larger footprint in a building designed specifically around our unique power and cooling needs. Twitter will be able to define and manage to a finer grained SLA on the service as we are managing and monitoring at all layers. The data center will house a mixed-vendor environment for servers running open source OS and applications.</p> \n<p>Importantly, having our own data center will give us the flexibility to more quickly make adjustments as our infrastructure needs change.</p> \n<p>Finally, Twitter’s custom data center is built for high availability and redundancy in our network and systems infrastructure. This first Twitter managed data center is being designed with a multi-homed network solution for greater reliability and capacity. We will continue to work with NTT America to operate our current footprint, and plan to bring additional Twitter managed data centers online over the next 24 months.</p>",
"date": "2010-07-21T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2010/room-to-grow-a-twitter-data-center",
"domain": "engineering"
},
{
"title": "Murder: Fast datacenter code deploys using BitTorrent",
"body": "<p>Twitter has thousands of servers. What makes having boatloads of servers particularly annoying though is that we need to quickly get multiple iterations of code and binaries onto all of them on a regular basis. We used to have a git-based deploy system where we’d just instruct our front-ends to download the latest code from our main git machine and serve that. Unfortunately, once we got past a few hundred servers, things got ugly. We recognized that this problem was not unlike many of the scaling problems we’ve had and dealt with in the past though—we were suffering the symptoms of a centralized system.</p> \n<h2>Slow deploys</h2> \n<p>By sitting beside a particularly vocal Release Engineer, I received first-hand experience of the frustration caused by slow deploys. We needed a way to dramatically speed things up. I thought of some quick hacks to get this fixed: maybe replicate the git repo or maybe shard it so everyone isn’t hitting the same thing at once. Most of these quasi-centralized solutions will still require re-replicating or re-sharding again in the near future though (especially at our growth).</p> \n<p>It was time for something completely different, something decentralized, something more like…<a href=\"http://bittorrent.com/\">BitTorrent</a>…running inside of our datacenter to quickly copy files around. Using the file-sharing protocol, we launched a side-project called Murder and after a few days (and especially nights) of nervous full-site tinkering, it turned a 40 minute deploy process into one that lasted just 12 seconds!</p> \n<h2>To the rescue</h2> \n<p>Murder (which by the way is the name for a <a href=\"http://en.wikipedia.org/wiki/List_of_collective_nouns_for_birds\">flock of crows</a>) is a combination of scripts written in Python and Ruby to easily deploy large binaries throughout your company’s datacenter(s). It takes advantage of the fact that the environment in a datacenter is somewhat different from regular internet connections: low-latency access to servers, high bandwidth, no NAT/Firewall issues, no ISP traffic shaping, only trusted peers, etc. This let us come up with a list of optimizations on top of BitTornado to make BitTorrent not only reasonable, but also effective on our internal network.</p> \n<p>Since at the time we used Capistrano for signaling our servers to perform tasks, Murder also includes a Capistrano deploy strategy to make it easy for existing users of Capistrano to convert their file distribution to be decentralized. The final component is the work Matt Freels (@<a href=\"http://twitter.com/mf\">mf</a>) did in bundling everything into an easy to install ruby gem. This further helped get Murder to be usable for more deploy tasks at Twitter.</p> \n<h2>Where to get it</h2> \n<p>Murder, like many internal Twitter systems, is fully open sourced for your contributions and usage at: <a href=\"http://github.com/lg/murder\">http://github.com/lg/murder</a>. I recently did a talk (see video below) at <a href=\"http://2010.cusec.net/\">CUSEC 2010</a> in Montreal, Canada which explains many of the internals. If you have questions for how to use it, feel free to contact <a href=\"http://twitter.com/lg\">me</a> or <a href=\"http://twitter.com/mf\">Matt</a> on Twitter.</p> \n<p>We’re always looking for talented Systems and Infrastructure engineers to help grow and scale our website. Murder is one of the many projects that really highlights how thinking about decentralized and distributed systems can make huge improvements. 
If Murder or these kinds of engineering challenges interest you, please visit our <a href=\"http://twitter.com/job.html?jvi=oAPbVfwf\">jobs page</a> and apply. We’ve got loads of similar projects waiting for staffing. Thanks!</p> \n<p></p> \n<p><a href=\"http://vimeo.com/11280885\">Twitter - Murder Bittorrent Deploy System</a> from <a href=\"http://vimeo.com/user3690378\">Larry Gadea</a> on <a href=\"http://vimeo.com\">Vimeo</a>.</p>",
"date": "2010-07-15T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2010/murder-fast-datacenter-code-deploys-using-bittorrent",
"domain": "engineering"
},
{
"title": "Cassandra at Twitter Today",
"body": "<p>In the past year, we’ve been working with the <a href=\"http://cassandra.apache.org/\">Apache Cassandra</a> open source distributed database. Much of our work there has been out in the open, since we’re <a href=\"http://twitter.com/about/opensource\">big proponents of open source software</a>. Unfortunately, lately we’ve been less involved in the community because of <a href=\"http://engineering.twitter.com/2010/06/perfect-stormof-whales.html\">more pressing concerns</a> and have created some misunderstandings.</p> \n<p>We’re using Cassandra in production for a bunch of things at Twitter. A few examples: Our geo team uses it to store and query their database of places of interest. The research team uses it to store the results of data mining done over our entire user base. Those results then feed into things like <a href=\"https://twitter.com/intent/user?screen_name=toptweets\">@toptweets</a> and local trends. Our analytics, operations and infrastructure teams are working on a system that uses cassandra for large-scale real time analytics for use both internally and externally.</p> \n<p>For now, we’re not working on using Cassandra as a store for Tweets. This is a change in strategy. Instead we’re going to continue to maintain our existing Mysql-based storage. We believe that this isn’t the time to make large scale migration to a new technology. We will focus our Cassandra work on new projects that we wouldn’t be able to ship without a large-scale data store.</p> \n<p>We’re investing in Cassandra every day. It’ll be with us for a long time and our usage of it will only grow.</p>",
"date": "2010-07-10T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2010/cassandra-at-twitter-today",
"domain": "engineering"
},
{
"title": "A Perfect Storm.....of Whales",
"body": "<p>Since Saturday, Twitter has experienced several incidences of poor site performance and a high number of errors due to one of our internal sub-networks being over-capacity. We’re working hard to address the core issues causing these problems—more on that below—but in the interests of the open exchange of information, wanted to pull back the curtain and give you deeper insight into what happened and how we’re working to address this week’s poor site performance.<br><br></p> \n<h2>What happened?</h2> \n<p>In brief, we made three mistakes: <br> * We put two critical, fast-growing, high-bandwith components on the same segment of our internal network. <br> * Our internal network wasn’t appropriately being monitored. <br> * Our internal network was temporarily misconfigured. <br><br></p> \n<h2>What we’re doing to fix it</h2> \n<p>* We’ve doubled the capacity of our internal network. <br> * We’re improving the monitoring of our internal network. <br> * We’re rebalancing the traffic on our internal network to redistribute the load. <br><br></p> \n<h2>Onward</h2> \n<p>For much of 2009, Twitter’s biggest challenge was coping with our unprecedented growth (a challenge we happily still face). Our engineering team spent much of 2009 redesigning Twitter’s runtime for scale, and our operations team worked to improve our monitoring and capacity planning so we can quickly identify and find solutions for problems as they occur. Those efforts were well spent; every day, more people use Twitter, yet we serve fewer whales. But as this week’s issues show, there is always room for improvement: we must apply the same diligence &amp; care in the design, planning, and monitoring of our internal network. <br><br> Based on our experiences this week, we’re working with our hosting partner to deliver improvements on all three fronts. By bringing the monitoring of our internal network in line with the rest of the systems at Twitter, we’ll be able to grow our capacity well ahead of user growth. Furthermore, by doubling our internal network capacity and rebalancing load across the internal network, we’re better prepared to serve today’s tweets and beyond. <br><br> As more people turn to Twitter to see what’s happening in the world (or in the World Cup), you may still see the whale when there are unprecedented spikes in traffic. For instance, during the <a href=\"http://twitter.com/worldcup\">World Cup</a> tournament—and particularly during big, closely-watched matches (such as tomorrow’s match between England and the U.S.A.)—we anticipate a significant surge in activity on Twitter. While we are making every effort to prepare for that surge, the whale may surface.<br><br> Finally, as we think about new ways to communicate with you about Twitter’s performance and availability status, continue reading <a href=\"http://status.twitter.com\">http://status.twitter.com</a>, <a href=\"http://dev.twitter.com/status\">http://dev.twitter.com/status</a>, and following <a href=\"https://twitter.com/intent/user?screen_name=twitterapi\">@twitterapi</a> for the latest updates. <br><br> Thanks for your continued patience and enthusiasm. <br><br> —<a href=\"https://twitter.com/intent/user?screen_name=jeanpaul\">@jeanpaul</a></p>",
"date": "2010-06-11T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2010/a-perfect-stormof-whales",
"domain": "engineering"
},
{
"title": "Announcing Snowflake",
"body": "<p>A while back we <a href=\"http://groups.google.com/group/twitter-development-talk/browse_thread/thread/5152a34a8ae6ccb6/1edb5cd6002f6499\">announced on our API developers list</a> that we would change the way we generate unique ID numbers for tweets.</p> \n<p>While we’re not quite ready to make this change, we’ve been hard at work on <a href=\"http://github.com/twitter/snowflake\">Snowflake</a> which is the internal service to generate these ids. To give everyone a chance to familiarize themselves with the techniques we’re employing and how it’ll affect anyone building on top of the Twitter platform we are open sourcing the Snowflake code base today.</p> \n<p>Before I go further, let me provide some context.</p> \n<h2>The Problem</h2> \n<p>We currently use MySQL to store most of our online data. In the beginning, the data was in one small database instance which in turn became one large database instance and eventually many large database clusters. For various reasons, the details of which merit a whole blog post, we’re working to replace many of these systems with <a href=\"http://cassandra.apache.org/\">the Cassandra distributed database</a> or horizontally sharded MySQL (using <a href=\"http://github.com/twitter/gizzard\">gizzard</a>).</p> \n<p>Unlike MySQL, Cassandra has no built-in way of generating unique ids – nor should it, since at the scale where Cassandra becomes interesting, it would be difficult to provide a one-size-fits-all solution for ids. Same goes for sharded MySQL.</p> \n<p>Our requirements for this system were pretty simple, yet demanding:</p> \n<p>We needed something that could generate tens of thousands of ids per second in a highly available manner. This naturally led us to choose an uncoordinated approach.</p> \n<p>These ids need to be <em>roughly sortable</em>, meaning that if tweets A and B are posted around the same time, they should have ids in close proximity to one another since this is how we and most Twitter clients sort tweets.[1]</p> \n<p>Additionally, these numbers have to fit into 64 bits. We’ve been through the painful process of growing the number of bits used to store tweet ids <a href=\"http://www.twitpocalypse.com/\">before</a>. It’s unsurprisingly hard to do when you have over <a href=\"http://social.venturebeat.com/2010/04/14/twitter-applications/\">100,000 different codebases involved</a>.</p> \n<h2>Options</h2> \n<p>We considered a number of approaches: MySQL-based ticket servers (<a href=\"http://code.flickr.com/blog/2010/02/08/ticket-servers-distributed-unique-primary-keys-on-the-cheap/\">like flickr uses</a>), but those didn’t give us the ordering guarantees we needed without building some sort of re-syncing routine. We also considered various UUIDs, but all the schemes we could find required 128 bits. After that we looked at Zookeeper sequential nodes, but were unable to get the performance characteristics we needed and we feared that the coordinated approach would lower our availability for no real payoff.</p> \n<h2>Solution</h2> \n<p>To generate the roughly-sorted 64 bit ids in an uncoordinated manner, we settled on a composition of: timestamp, worker number and sequence number.</p> \n<p>Sequence numbers are per-thread and worker numbers are chosen at startup via zookeeper (though that’s overridable via a config file).</p> \n<p>We encourage you to peruse and play with the code: you’ll find it on <a href=\"https://github.com/twitter/snowflake\">github</a>. 
Please remember, however, that it is currently alpha-quality software that we aren’t yet running in production and is very likely to change.</p> \n<h2>Feedback</h2> \n<p>If you find bugs, please report them on github. If you are having trouble understanding something, come ask in the <a href=\"https://twitter.com/hashtag/twinfra\">#twinfra</a> IRC channel on freenode. If you find anything that you think may be a security problem, please email security@twitter.com (and cc myself: ryan@twitter.com).</p> \n<p>[1] In mathematical terms, although the tweets will no longer be sorted, they will be <a href=\"http://ci.nii.ac.jp/naid/110002673489/\">k-sorted</a>. We’re aiming to keep our k below 1 second, meaning that tweets posted within a second of one another will be within a second of one another in the id space too.</p>",
"date": "2010-06-01T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2010/announcing-snowflake",
"domain": "engineering"
},
{
"title": "Tracing traffic through our stack",
"body": "<p>The <a href=\"https://twitter.com/intent/user?screen_name=twitterapi\">@twitterapi</a> has had two authentication mechanisms for quite a while now: <a href=\"http://en.wikipedia.org/wiki/Basic_access_authentication\">HTTP Basic Authentication</a> and <a href=\"http://oauth.net/\">OAuth</a>. Basic authentication has gotten us so far! You can even use curl from the command line to interact with our API and simply pass a username and a password as a -u command line parameter when calling statuses/update to tweet, for example. However, as times have changed, so have our requirements around authentication — developers will need to take action. <a href=\"http://www.countdowntooauth.com\">Basic Auth support is going away on June 30, 2010</a>. <a href=\"http://oauth.net/about/\">OAuth has always been part of Twitter’s blood</a>, and soon, we’re going to be using it exclusively. OAuth has many benefits for end users (e.g. protection of their passwords and fine grained control over applications), but what does it mean for Twitter on the engineering front? Quite a lot.</p> \n<p>Our authentication stack, right now, for basic auth, looks as so:</p> \n<ul>\n <li>decode the Authorization header that comes in via the HTTP request;</li> \n <li>check any rate limits that apply for the user or the IP address that request came from (a memcache hit);</li> \n <li>see if the authorization header is in memcache - and if it is, use it to find the user in cache and verify that the password is correct. If neither the header is in cache, nor the user is in cache, nor the password is correct (in case the user has changed his or her password), then keep going;</li> \n <li>pull the user out of storage;</li> \n <li>verify the user hasn’t been locked out of the system; and</li> \n <li>verify the user’s credentials.</li> \n</ul>\n<p>Our stack then also logs a lot of information to scribe about that user and login to help us counter harmful activities (whether malicious or simply buggy) — but, the one thing that we don’t have any visibility into, when using basic authentication, is what application is doing all this.</p> \n<p>To verify an OAuth-signed request, we go through a lot more intensive (both computationally and on our storage systems):</p> \n<ul>\n <li>decode the Authorization header;</li> \n <li>validate that the oauth_nonce and the oauth_timestamp pair that were passed in are not present in memcache — if so, then this may be a relay attack, and deny the user access;</li> \n <li>use the oauth_consumer_key and the oauth_token from the header, look up both the Twitter application and the user’s access token object from cache and fallback to the database if necessary. If, for some reason, neither can be retrieved, then something has gone wrong and proactively deny access;</li> \n <li>with the application and the access token, verify the oauth_signature. If it doesn’t match, then reject the request; and</li> \n <li>check any rate limits that may apply for the user at this stage</li> \n</ul>\n<p>Of course, for all the reject paths up top, we log information — that’s invaluable data for us to turn over to our Trust &amp; Safety team. If the user manages to authenticate, however, then we too have a wealth of information! We can, at this point, for every authenticated call, tie an user and an application to a specific action on our platform.</p> \n<p>For us, and the entire Twitter ecosystem, its really important to be able to identify, and get visibility into, our users’ traffic. 
We want to be able to help developers if their software is malfunctioning, and we want to be able to make educated guesses as to whether traffic is malicious or not. And, if everything is functioning normally, then we can use this data to help us better provision and plan for growth, and deliver better reliability. But, if all applications are simply using usernames and passwords as their identifiers, then we have no way to distinguish who is sending what traffic on behalf of which users.</p> \n<p>Phase one of our plan is to remove basic authentication for calls that require authentication — those calls will migrate to a three-legged OAuth scheme. After that, we’ll start migrating all calls to at least begin to use a two-legged OAuth scheme. We also have OAuth 2 in the works. Start firing up <a href=\"http://dev.twitter.com\">dev.twitter.com</a> and creating Twitter applications!</p> \n<p>As always, the <a href=\"https://twitter.com/intent/user?screen_name=twitterapi\">@twitterapi</a> team is here to help out. Just make sure to join the <a href=\"http://groups.google.com/group/twitter-development-talk\">Twitter Development Talk</a> group to ask questions, follow <a href=\"https://twitter.com/intent/user?screen_name=twitterapi\">@twitterapi</a> for announcements, and skim through our docs on dev.twitter.com to help you through this transition.</p> \n<p>—<a href=\"https://twitter.com/intent/user?screen_name=raffi\">@raffi</a></p>",
"date": "2010-05-12T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2010/tracing-traffic-through-our-stack",
"domain": "engineering"
},
{
"title": "Introducing FlockDB",
"body": "<p>Twitter stores many graphs of relationships between people: who you’re following, who’s following you, who you receive phone notifications from, and so on.</p> \n<p>Some of the features of these graphs have been challenging to store in scalable ways as we’ve grown. For example, instead of requiring each friendship to be requested and confirmed, you can build one-way relationships by just following other people. There’s also no limit to how many people are allowed to follow you, so some people have millions of followers (like <a href=\"https://twitter.com/intent/user?screen_name=aplusk\">@aplusk</a>), while others have only a few.</p> \n<p>To deliver a tweet, we need to be able to look up someone’s followers and page through them rapidly. But we also need to handle heavy write traffic, as followers are added or removed, or spammers are caught and put on ice. And for some operations, like delivering a <a href=\"https://twitter.com/intent/user?screen_name=mention\">@mention</a>, we need to do set arithmetic like “who’s following both of these users?” These features are difficult to implement in a traditional relational database.</p> \n<h2>A valiant effort</h2> \n<p>We went through several storage layers in the early days, including abusive use of relational tables and key-value storage of denormalized lists. They were either good at handling write operations or good at paging through giant result sets, but never good at both.</p> \n<p>A little over a year ago, we could see that we needed to try something new. Our goals were:</p> \n<ul>\n <li>Write the simplest possible thing that could work.</li> \n <li>Use off-the-shelf MySQL as the storage engine, because we understand its behavior — in normal use as well as under extreme load and unusual failure conditions. Give it enough memory to keep everything in cache.</li> \n <li>Allow for horizontal partitioning so we can add more database hardware as the corpus grows.</li> \n <li>Allow write operations to arrive out of order or be processed more than once. (Allow failures to result in redundant work rather than lost work.)</li> \n</ul>\n<p>FlockDB was the result. We finished migrating to it about 9 months ago and never looked back.</p> \n<h2>A valiant-er effort</h2> \n<p>FlockDB is a database that stores graph data, but it isn’t a database optimized for graph-traversal operations. Instead, it’s optimized for very large <a href=\"http://en.wikipedia.org/wiki/Adjacency_list\">adjacency</a></p> \n<p>lists, fast reads and writes, and page-able set arithmetic queries.</p> \n<p>It stores graphs as sets of edges between nodes identified by 64-bit integers. For a social graph, these node IDs will be user IDs, but in a graph storing “favorite” tweets, the destination may be a tweet ID. Each edge is also marked with a 64-bit position, used for sorting. (Twitter puts a timestamp here for the “following” graph, so that your follower list is displayed latest-first.)</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/schema95.thumb.1280.1280.png\" alt=\"schema\"></p> \n<p>When an edge is “deleted”, the row isn’t actually deleted from MySQL; it’s just marked as being in the deleted state, which has the effect of moving the primary key (a compound key of the source ID, state, and position). Similarly, users who delete their account can have their edges put into an archived state, allowing them to be restored later (but only for a limited time, according to our terms of service). 
We keep only a compound primary key and a secondary index for each row, and answer all queries from a single index. This kind of schema optimization allows MySQL to shine and gives us predictable performance.</p> \n<p>A complex query like “What’s the intersection of people I follow and people who are following President Obama?” can be answered quickly by decomposing it into single-user queries (“Who is following President Obama?”). Data is partitioned by node, so these queries can each be answered by a single partition, using an indexed range query. Similarly, paging through long result sets is done by using the position field as a cursor, rather than using <code>LIMIT/OFFSET</code>, so any page of results for a query is indexed and is equally fast.</p> \n<p>Write operations are <a href=\"http://en.wikipedia.org/wiki/Idempotence\">idempotent</a> and <a href=\"http://en.wikipedia.org/wiki/Commutative\">commutative</a>, based on the time they enter the system. We can process operations out of order and end up with the same result, so we can paper over temporary network and hardware failures, or even replay lost data from minutes or hours ago. This was especially helpful during the initial roll-out.</p> \n<p>Commutative writes also simplify the process of bringing up new partitions. A new partition can receive write traffic immediately, and receive a dump of data from the old partitions slowly in the background. Once the dump is over, the partition is immediately “live” and ready to receive reads.</p> \n<p>The app servers (affectionately called “flapps”) are written in Scala, are stateless, and are horizontally scalable. We can add more as query load increases, independent of the databases. They expose a very small thrift API to clients, though we’ve written <a href=\"http://github.com/twitter/flockdb-client\">a Ruby client</a> with a much richer interface.</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/it_s_in_the_cloud96.thumb.1280.1280.png\" alt=\"it's in the cloud\"></p> \n<p>We use <a href=\"http://github.com/twitter/gizzard\">the Gizzard library</a> to handle the partitioning layer. A forwarding layer maps ranges of source IDs to physical databases, and replication is handled by building a tree of such tables under the same forwarding address. Write operations are acknowledged after being journalled locally, so that disruptions in database availability or performance are decoupled from website response times.</p> \n<p>Each edge is actually stored twice: once in the “forward” direction (indexed and partitioned by the source ID) and once in the “backward” direction (indexed and partitioned by the destination ID). That way a query like “Who follows me?” is just as efficient as “Who do I follow?”, and the answer to each query can be found entirely on a single partition.</p> \n<p>The end result is a cluster of commodity servers that we can expand as needed. Over the winter, we added 50% database capacity without anyone noticing. 
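</p> \n<p>To make the query model concrete, here’s a rough Python sketch of cursor-based paging and of decomposing a set-arithmetic question into single-partition queries. The <code>db.execute</code> helper, table name, and state encoding are hypothetical stand-ins for what the flapps actually do over thrift:</p> \n
<pre><code>MAX_POSITION = 2 ** 63 - 1  # positions are 64-bit sort keys (timestamps here)\n\ndef followers_page(db, user_id, cursor=MAX_POSITION, count=100):\n    # One indexed range query, answerable by a single partition: walk the\n    # 'backward' edges (destination = user_id) newest-first from the cursor.\n    rows = db.execute(\n        'SELECT source_id, position FROM edges_backward'\n        ' WHERE destination_id = %s AND state = 0 AND position &lt; %s'\n        ' ORDER BY position DESC LIMIT %s', (user_id, cursor, count))\n    next_cursor = rows[-1][1] if rows else None  # resume point; no OFFSET scan\n    return [source_id for source_id, _ in rows], next_cursor\n\ndef common_followers(db, a, b):\n    # 'Who is following both a and b?' decomposes into two single-user\n    # queries whose results are intersected.\n    ids_a, _ = followers_page(db, a, count=10000)\n    ids_b, _ = followers_page(db, b, count=10000)\n    return set(ids_a) &amp; set(ids_b)</code></pre> \n
<p>Every page of either query is served from an index, so the cost stays flat no matter how deep you page.</p> \n<p>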
We currently store over 13 billion edges and sustain peak traffic of 20k writes/second and 100k reads/second.</p> \n<h2>Lessons learned</h2> \n<p>Some helpful patterns fell out of our experience, even though they weren’t goals originally:</p> \n<ul>\n <li> <h2>Use aggressive timeouts to cut off the long tail.</h2> You can’t ever shake out all the unfairness in the system, so some requests will take an unreasonably long time to finish — way over the 99.9th percentile. If there are multiple stateless app servers, you can just cut a client loose when it has passed a “reasonable” amount of time, and let it try its luck with a different app server.</li> \n <li> <h2>Make every case an error case.</h2> Or, to put it another way, use the same code path for errors as you use in normal operation. Don’t create rarely-tested modules that only kick in during emergencies, when you’re least likely to feel like trying new things. We queue all write operations locally (using <a href=\"http://github.com/robey/kestrel\">Kestrel</a> as a library), and any that fail are thrown into a separate error queue. This error queue is periodically flushed back into the write queue, so that retries use the same code path as the initial attempt.</li> \n <li> <h2>Do nothing automatically at first.</h2> <p>Provide lots of gauges and levers, and automate with scripts once patterns emerge. FlockDB measures the latency distribution of each query type across each service (MySQL, Kestrel, Thrift) so we can tune timeouts, and reports counts of each operation so we can see when a client library suddenly doubles its query load (or we need to add more hardware). Write operations that cycle through the error queue too many times are dumped into a log for manual inspection. If it turns out to be a bug, we can fix it, and re-inject the job. If it’s a client error, we have a good bug report.</p> </li> \n</ul>\n<h2>Check it out</h2> \n<p>The source is in github: <a href=\"http://github.com/twitter/flockdb\">http://github.com/twitter/flockdb</a></p> \n<p>In particular, check out the demo to get a feel for the kind of data that can be stored and what you can do with it:</p> \n<p><a href=\"http://github.com/twitter/flockdb/blob/master/doc/demo.markdown\">http://github.com/twitter/flockdb/blob/master/doc/demo.markdown</a></p> \n<p>Talk to us on IRC, in <a href=\"https://twitter.com/hashtag/twinfra\">#twinfra</a> (irc.freenode.net), or join the mailing list:</p> \n<p><a href=\"http://groups.google.com/group/flockdb\">http://groups.google.com/group/flockdb</a></p> \n<p>—<a href=\"https://twitter.com/intent/user?screen_name=robey\">@robey</a>, <a href=\"https://twitter.com/intent/user?screen_name=nk\">@nk</a>, <a href=\"https://twitter.com/intent/user?screen_name=asdf\">@asdf</a>, <a href=\"https://twitter.com/intent/user?screen_name=jkalucki\">@jkalucki</a></p>",
"date": "2010-05-03T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2010/introducing-flockdb",
"domain": "engineering"
},
{
"title": "Memcached SPOF Mystery",
"body": "<p>At Twitter, we use <a href=\"http://en.wikipedia.org/wiki/Memcache\">memcached</a> to speed up page loads and alleviate database load. We have many memcached hosts. To make our system robust, our memcached clients use <a href=\"http://en.wikipedia.org/wiki/Consistent_hashing\">consistent hashing</a> and enable the <a href=\"http://blog.evanweaver.com/files/doc/fauna/memcached/classes/Memcached.html\">auto_eject_hosts</a> option. With this many hosts and this kind of configuration, one would assume that it won’t be noticeable if one memcached host goes down, right? Unfortunately, our system will have elevated <a href=\"http://help.twitter.com/entries/83978-i-keep-getting-robots-and-or-whales\">robots</a> whenever a memcached host dies or is taken out of the system. The system does not recover on its own unless the memcached host is brought back. Essentially, every memcached host is a single point of failure.</p> \n<p><a href=\"http://4.bp.blogspot.com/_dLa0WBH_rKs/S83d89lJIHI/AAAAAAAAAAs/qPkHfQ1PwKk/s1600/memcached_SPOF.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/memcached_spof_mystery95.thumb.1280.1280.png\" alt=\"Memcached SPOF Mystery\"></a></p> \n<p>This is what we observed when a memcached host crashed recently. Web 502/503s spiked and recovered, and web/api 500 errors occur at a sustained elevated rate.</p> \n<p>Why is this happening? At first, we thought the elevated robots were caused by remapping the keys on the dead host. After all, reloading from databases can be expensive. When other memcached hosts have more data to read from databases than they can handle, they may throw exceptions. But that should only happen if some memcached hosts are near their capacity limit. Whereas the elevated robots can happen even during off-peak hours. There must be something else.</p> \n<p>A closer look at the source of those exceptions surprised us. It turns out those exceptions are not from the requests sent to the other healthy memcached hosts but from the requests sent to the dead host! Why do the clients keep sending requests to the dead host?</p> \n<p>This is related to the “auto_eject_hosts” option. The purpose of this option is to let the client temporarily eject dead hosts from the pool. A host is marked as dead if the client has a certain number of consecutive failures with the host. The dead server will be retried after retry timeout. Due to general unpredictable stuff such as network flux, hardware failures, timeouts due to other jobs in the boxes, requests sent to healthy memcached hosts can fail sporadically. The retry timeout is thus set to a very low value to avoid remapping a large number of keys unnecessarily.</p> \n<p>When a memcached host is really dead, however, this frequent retry is undesirable. The client has to establish connection to the dead host again, and it usually gets one of the following exceptions: Memcached::ATimeoutOccurred, Memcached::UnknownReadFailure, Memcached::SystemError (“Connection refused”), or Memcached::ServerIsMarkedDead. Unfortunately, a client does not share the “the dead host is still dead” information with other clients, so all clients will retry the dead host and get those exceptions at very high frequency.</p> \n<p>The problem is not difficult to fix once we get better understanding of the problem. Simply retrying a memcached request once or twice on those exceptions usually works. 
<a href=\"http://blog.evanweaver.com/files/doc/fauna/memcached/classes/Memcached/Error.html\">Here</a> is the list of all the memcached runtime exceptions. Ideally, memcached client should have some nice build-in mechanisms (e.g. exponential backoff) to retry some of the exceptions, and optionally log information about what happened. The memcached client shouldn’t transparently swallow all exceptions, which would cause users to lose all visibilities into what’s going on.</p> \n<p>After we deployed the fix, we don’t see elevated robots any more when a memcached host dies. The memcached SPOF mystery solved!</p> \n<p>P.S. Two option parameters “exception_retry_limit” and “exceptions_to_retry” have been added to <a href=\"http://blog.evanweaver.com/files/doc/fauna/memcached/classes/Memcached.html\">memcached</a>.</p> \n<p>—<a href=\"https://twitter.com/intent/user?screen_name=wanliyang\">@wanliyang</a></p>",
"date": "2010-04-20T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2010/memcached-spof-mystery",
"domain": "engineering"
},
{
"title": "Hadoop at Twitter",
"body": "<p>My name is <a href=\"http://www.twitter.com/kevinweil\">Kevin Weil</a> and I’m a member of the analytics team at Twitter. We’re collectively responsible for Twitter’s data warehousing, for building out an analysis platform that lets us easily and efficiently run large calculations over the Twitter dataset, and ultimately for turning that data into something actionable that helps the business. We’re fortunate to work with great people from teams across the organization for the latter. The former two are largely on our plate though, and as a result we use <a href=\"http://hadoop.apache.org/\">Hadoop</a>, <a href=\"http://hadoop.apache.org/pig\">Pig</a>, and <a href=\"http://hadoop.apache.org/hbase\">HBase</a> heavily. Today we’re excited to open source some of the core code we rely on for our data analysis.</p> \n<h2>TL;DR</h2> \n<p>We’re releasing a whole bunch of code that we use with Hadoop and Pig specifically around LZO and Protocol Buffers. Use it, fork it, improve upon it. <a href=\"http://github.com/kevinweil/elephant-bird\">http://github.com/kevinweil/elephant-bird</a></p> \n<h2>What’s Hadoop? Pig, HBase?</h2> \n<p>Hadoop is a distributed computing framework with two main components: a distributed file system and a map-reduce implementation. It is a top-level <a href=\"http://www.apache.org/\">Apache</a> project, and as such it is fully open source and has a vibrant community behind it.</p> \n<p>Imagine you have a cluster of 100 computers. Hadoop’s distributed file system makes it so you can put data “into Hadoop” and pretend that all the hard drives on your machines have coalesced into one gigantic drive. Under the hood, it breaks each file you give it into 64- or 128-MB chunks called blocks and sends them to different machines in the cluster, replicating each block three times along the way. Replication ensures that one or even two of your 100 computers can fail simultaneously, and you’ll never lose data. In fact, Hadoop will even realize that two machines have failed and will begin to re-replicate the data, so your application code never has to care about it!</p> \n<p>The second main component of Hadoop is its map-reduce framework, which provides a simple way to break analyses over large sets of data into small chunks which can be done in parallel across your 100 machines. You can read more about it <a href=\"http://hadoop.apache.org/common/docs/current/mapred_tutorial.html\">here</a>; it’s quite generic, capable of handling everything from basic analytics through <a href=\"http://video.google.com/videoplay?docid=741403180270990805#\">map-tile generation for Google Maps</a>! Google has a proprietary system which Hadoop itself is modeled after; Hadoop is used at <a href=\"http://wiki.apache.org/hadoop/PoweredBy\">many large companies</a> including Yahoo!, Facebook, and Twitter. We’re happy users of <a href=\"http://www.cloudera.com/hadoop/\">Cloudera’s free Hadoop distribution</a>.</p> \n<p><a href=\"http://hadoop.apache.org/pig\">Pig</a> is a dataflow language built on top of Hadoop to simplify and speed up common analysis tasks. Instead of writing map-reduce jobs in Java, you write in a higher-level language called <a href=\"http://wiki.apache.org/pig/PigLatin/\">PigLatin</a>, and a query compiler turns your statements into an ordered sequence of map-reduce jobs. 
Pig enables complex map-reduce job flows to be written in a few easy steps.</p> \n<p><a href=\"http://hadoop.apache.org/hbase\">HBase</a> is a distributed, column-oriented data store built on top of Hadoop and modeled after Google’s <a href=\"http://labs.google.com/papers/bigtable.html\">BigTable</a>. It allows for structured data storage combined with low-latency data serving.</p> \n<h2>How does Twitter Use Hadoop?</h2> \n<p>Twitter has large data storage and processing requirements, and thus we have worked to implement a set of optimized data storage and workflow solutions within Hadoop. In particular, we store all of our data LZO compressed, because the LZO compression turns out to strike a very good balance between compression ratio and speed for use in Hadoop. Hadoop jobs are generally IO-bound, and typical compression algorithms like gzip or bzip2 are so computationally intensive that jobs quickly become CPU-bound. LZO in contrast was built for speed, so you get a 4-5x compression ratio while leaving the CPU available to do real work. For more discussion of LZO, complete with performance comparisons, see <a href=\"http://www.cloudera.com/blog/2009/06/parallel-lzo-splittable-compression-for-hadoop/\">this Cloudera blog post</a> we did a while back.</p> \n
<p>We also make heavy use of Google’s <a href=\"http://code.google.com/p/protobuf\">protocol buffers</a> for efficient, extensible, backward-compatible data storage. Hadoop does not mandate any particular format on disk, and common formats like CSV are</p> \n<ul>\n <li>space-inefficient: an integer like 2930533523 takes 10 bytes in ASCII.</li> \n <li>untyped: is 2930533523 an int, a long, or a string?</li> \n <li>not robust to versioning changes: adding a new field, or removing an old one, requires you to change your code</li> \n <li>not hierarchical: you cannot store any nested structure</li> \n</ul>\n<p>Other solutions like JSON fail fewer of these tests, but protocol buffers retain one key advantage: code generation. You write a quick description of your data structure, and the protobuf library will generate code for working with that data structure in the language of your choice. Because Google designed protobufs for data storage, the serialized format is efficient; integers, for example, are <a href=\"http://code.google.com/apis/protocolbuffers/docs/encoding.html\">variable-length or zigzag encoded</a>.</p> \n<h2>So…</h2> \n
<p>The code we are releasing makes it easy to work with LZO-compressed data of all sorts — JSON, CSV, TSV, line-oriented, and especially protocol buffer-encoded — in Hadoop, Pig, and HBase. It also includes a framework for automatically generating all of this code for protobufs given the protobuf IDL. That is, not only will <code>protoc</code> generate the standard protobuf code for you, it will now generate Hadoop and Pig layers on top of that which you can immediately plug in to start analyzing data written with these formats. Having code automatically generated from a simple data structure definition has helped us move very quickly and make fewer mistakes in our analysis infrastructure at Twitter. You can even hook in and add your own code generators from within the framework. Please do, and submit back!
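</p> \n<p>As a concrete illustration of why protobuf integers are cheap to store, here’s the zigzag-plus-varint encoding in a few lines of Python (a sketch of the encoding itself, not code from this release):</p> \n
<pre><code>def zigzag(n):\n    # Map signed to unsigned so small negatives stay small:\n    # 0, -1, 1, -2, 2, ... become 0, 1, 2, 3, 4, ...\n    return (n &lt;&lt; 1) ^ (n &gt;&gt; 63)\n\ndef varint(n):\n    # Emit 7 bits per byte, setting the high bit while more bytes follow.\n    out = []\n    while True:\n        byte = n &amp; 0x7F\n        n &gt;&gt;= 7\n        out.append(byte | 0x80 if n else byte)\n        if not n:\n            return bytes(bytearray(out))</code></pre> \n
<p>varint(zigzag(-3)) is a single byte, and an integer like 2930533523 fits in five bytes — versus the 10 ASCII bytes it costs in a CSV file.</p> \n<p>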
Much more documentation is available at the github page, <a href=\"http://www.github.com/kevinweil/elephant-bird\">http://www.github.com/kevinweil/elephant-bird</a>.</p> \n<p>Thanks to <a href=\"http://www.twitter.com/squarecog\">Dmitriy Ryaboy</a>, <a href=\"http://www.twitter.com/chuangl4\">Chuang Liu</a>, and <a href=\"http://www.twitter.com/floleibert\">Florian Leibert</a> for help developing this framework. We welcome contributions, forks, and pull requests. If working on stuff like this every day sounds interesting, <a href=\"http://twitter.com/positions.html\">we’re hiring</a>!</p> \n<p>—<a href=\"https://twitter.com/intent/user?screen_name=kevinweil\">@kevinweil</a></p>",
"date": "2010-04-08T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2010/hadoop-at-twitter",
"domain": "engineering"
},
{
"title": "Introducing Gizzard, a framework for creating distributed datastores",
"body": "<h2>An introduction to sharding</h2> \n<p>Many modern web sites need fast access to an amount of information so large that it cannot be efficiently stored on a single computer. A good way to deal with this problem is to “shard” that information; that is, store it across multiple computers instead of on just one.</p> \n<p>Sharding strategies often involve two techniques: partitioning and replication. With <em>partitioning</em>, the data is divided into small chunks and stored across many computers. Each of these chunks is small enough that the computer that stores it can efficiently manipulate and query the data. With the other technique of <em>replication</em>, multiple copies of the data are stored across several machines. Since each copy runs on its own machine and can respond to queries, the system can efficiently respond to tons of queries for the same data by adding more copies. Replication also makes the system resilient to failure because if any one copy is broken or corrupt, the system can use another copy for the same task.</p> \n<p>The problem is: sharding is difficult. Determining smart partitioning schemes for particular kinds of data requires a lot of thought. And even more difficult is ensuring that all of the copies of the data are <em>consistent</em> despite unreliable communication and occasional computer failures. Recently, a lot of open-source distributed databases have emerged to help solve this problem. Unfortunately, as of the time of writing, most of the available open-source projects are either too immature or too limited to deal with the variety of problems that exist on the web. These new databases are hugely promising but for now it is sometimes more practical to build a custom solution.</p> \n<h2>What is a sharding framework?</h2> \n<p>Twitter has built several custom distributed data-stores. Many of these solutions have a lot in common, prompting us to extract the commonalities so that they would be more easily maintainable and reusable. Thus, we have extracted Gizzard, a Scala framework that makes it easy to create custom fault-tolerant, distributed databases.</p> \n<p>Gizzard is a <em>framework</em> in that it offers a basic template for solving a certain class of problem. This template is not perfect for everyone’s needs but is useful for a wide variety of data storage problems. At a high level, Gizzard is a middleware networking service that manages partitioning data across arbitrary backend datastores (e.g., SQL databases, Lucene, etc.). The partitioning rules are stored in a forwarding table that maps key ranges to partitions. Each partition manages its own replication through a declarative replication tree. Gizzard supports “migrations” (for example, elastically adding machines to the cluster) and gracefully handles failures. The system is made <em>eventually consistent</em> by requiring that all write-operations are idempotent AND commutative and as operations fail (because of, e.g., a network partition) they are retried at a later time.</p> \n<p>A very simple sample use of Gizzard is <a href=\"http://github.com/nkallen/Rowz\">Rowz</a>, a distributed key-value store. 
To get up-and-running with Gizzard quickly, clone Rowz and start customizing!</p> \n<p>But first, let’s examine how Gizzard works in more detail.</p> \n<h2>How does it work?</h2> \n<h2>Gizzard is middleware</h2> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/alt_text95.thumb.1280.1280.png\" alt=\"Alt text\" title=\"\"></p> \n<p>Gizzard operates as a <em>middleware</em> networking service. It sits “in the middle” between clients (typically, web front-ends like PHP and Ruby on Rails applications) and the many partitions and replicas of data. Sitting in the middle, all data querying and manipulation flow through Gizzard. Gizzard instances are stateless, so you can run as many of them as necessary to sustain throughput or manage TCP connection limits. Gizzard, in part because it runs on the JVM, is quite efficient. One of Twitter’s Gizzard applications (FlockDB, our distributed graph database) can serve 10,000 queries per second per commodity machine. But your mileage may vary.</p> \n<h2>Gizzard supports any data storage backend</h2> \n<p>Gizzard is designed to replicate data across any network-available data storage service. This could be a relational database, Lucene, Redis, or anything you can imagine. As a general rule, Gizzard requires that all write operations be idempotent AND commutative (see the section on Fault Tolerance and Migrations), so this places some constraints on how you may use the back-end store. In particular, Gizzard does not guarantee that write operations are applied in order. It is therefore imperative that the system be designed to reach a consistent state regardless of the order in which writes are applied.</p> \n<h2>Gizzard handles partitioning through a forwarding table</h2> \n<p>Gizzard handles partitioning (i.e., dividing exclusive ranges of data across many hosts) by mapping <em>ranges</em> of data to particular shards. These mappings are stored in a forwarding table that specifies the lower bound of a numerical range and the shard that data in that range belongs to.</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/alt_text96.thumb.1280.1280.png\" alt=\"Alt text\" title=\"\"></p> \n<p>To be precise, you provide Gizzard a custom “hashing” function that, given a key for your data (and this key can be application specific), produces a number that belongs to one of the ranges in the forwarding table. These functions are programmable so you can optimize for locality or balance depending on your needs.</p> \n<p>This tabular approach differs from the “consistent hashing” technique used in many other distributed data-stores. It allows for heterogeneously sized partitions so that you can easily manage <em>hotspots</em>, segments of data that are extremely popular. In fact, Gizzard does allow you to implement completely custom forwarding strategies like consistent hashing, but this isn’t the recommended approach.</p> 
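<p>To make the forwarding table concrete, here is a minimal Scala sketch of the idea (hypothetical names and shapes; Gizzard’s real tables are richer than this):</p> \n<pre>import scala.collection.immutable.TreeMap\n\n// Lower bounds of hash ranges mapped to shard identifiers. A lookup finds\n// the greatest lower bound that does not exceed the hashed key.\nclass ForwardingTable(entries: TreeMap[Long, String]) {\n  def shardFor(hashedKey: Long): String =\n    entries.takeWhile { case (lowerBound, _) =&gt; lowerBound &lt;= hashedKey }.last._2\n}\n\nval table = new ForwardingTable(TreeMap(\n  0L -&gt; \"shard-a\",    // keys hashing into [0, 1000) live on shard-a\n  1000L -&gt; \"shard-b\", // [1000, 2000)\n  2000L -&gt; \"shard-c\"  // [2000 and up)\n))\nassert(table.shardFor(1500L) == \"shard-b\")</pre> \n<p>Because the table is explicit rather than derived from a hash ring, a “hot” range can be split or reassigned without touching its neighbors.</p> 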
<h2>Gizzard handles replication through a replication tree</h2> \n<p>Each shard referenced in the forwarding table can be either a physical shard or a logical shard. A physical shard is a reference to a particular data storage back-end, such as a SQL database. In contrast, a <em>logical shard</em> is just a tree of other shards, where each <em>branch</em> in the tree represents some logical transformation on the data, and each <em>node</em> is a data-storage back-end. These logical transformations at the branches are usually rules about how to propagate read and write operations to the children of that branch. For example, here is a two-level replication tree. Note that this represents just ONE partition (as referenced in the forwarding table):</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/alt_text97.thumb.1280.1280.png\" alt=\"Alt text\" title=\"\"></p> \n<p>The “Replicate” branches in the figure are simple strategies to repeat write operations to all children and to balance reads across the children according to health and a weighting function. You can create custom branching/logical shards for your particular data storage needs, such as to add additional transaction/coordination primitives or quorum strategies. But Gizzard ships with a few standard strategies of broad utility such as Replicating, Write-Only, Read-Only, and Blocked (allowing neither reads nor writes). The utility of some of the more obscure shard types is discussed in the section on <code>Migrations</code>.</p> \n<p>The exact nature of the replication topologies can vary per partition. This means you can have a higher replication level for a “hotter” partition and a lower replication level for a “cooler” one. This makes the system highly configurable. For instance, you can specify that the back-ends mirror one another in a primary-secondary-tertiary-etc. configuration for simplicity. Alternatively, for better fault tolerance (but higher complexity) you can “stripe” partitions across machines so that no machine is a mirror of any other.</p> \n<h2>Gizzard is fault-tolerant</h2> \n<p>Fault-tolerance is one of the biggest concerns of distributed systems. Because such systems involve many computers, there is some likelihood that one (or many) are malfunctioning at any moment. Gizzard is designed to avoid any single points of failure. If a certain replica in a partition has crashed, Gizzard routes requests to the remaining healthy replicas, bearing in mind the weighting function. If all replicas in a partition are unavailable, Gizzard will be unable to serve read requests for that shard, but all other shards will be unaffected. Writes to an unavailable shard are buffered until the shard again becomes available.</p> \n<p>In fact, if any number of replicas in a shard are unavailable, Gizzard will try to write to all healthy replicas as quickly as possible and buffer the writes to the unavailable shard, to try again later when the unhealthy shard returns to life. The basic strategy is that all writes are materialized to a durable, transactional journal. Writes are then performed asynchronously (but with manageably low latency) to all replicas in a shard. If a shard is unavailable, the write operation goes into an error queue and is retried later.</p> \n<p>In order to achieve “eventual consistency”, this “retry later” strategy requires that your write operations are idempotent AND commutative. This is because a retry later strategy can apply operations out-of-order (as, for instance, when newer jobs are applied before older failed jobs are retried). In most cases this is an easy requirement. A demonstration of commutative, idempotent writes is given in the Gizzard demo app, <a href=\"http://github.com/nkallen/Rowz\">Rowz</a>.</p> 
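<p>As a rough sketch of that write path (our names, not Gizzard’s actual classes), the replicating strategy plus error queue might look like this in Scala:</p> \n<pre>case class Write(key: Long, payload: String)\n\ntrait Replica { def apply(w: Write): Boolean } // true on success\n\n// Writes go to every healthy replica; failures are buffered and retried.\n// Retries may land out of order, which is why Gizzard requires writes to\n// be idempotent and commutative.\nclass ReplicatingShard(replicas: Seq[Replica]) {\n  private val errorQueue = scala.collection.mutable.Queue.empty[(Replica, Write)]\n\n  def write(w: Write): Unit =\n    replicas.foreach { r =&gt;\n      if (!r(w)) errorQueue.enqueue((r, w)) // buffer for a later retry\n    }\n\n  def retryFailed(): Unit =\n    errorQueue.dequeueAll(_ =&gt; true).foreach { case (r, w) =&gt;\n      if (!r(w)) errorQueue.enqueue((r, w))\n    }\n}</pre> 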
<h2>Winged migrations</h2> \n<p>It’s sometimes convenient to copy or move shard data from one computer to another. You might do this to balance load across more or fewer machines, or to deal with hardware failures. It’s worth explaining one aspect of how migrations work, if only to illustrate some of the more obscure logical shard types. When migrating from <code>Datastore A</code> to <code>Datastore A'</code>, a Replicating shard is set up between them but a WriteOnly shard is placed in front of <code>Datastore A'</code>. Then data is copied from the old shard to the new shard. The WriteOnly shard ensures that while the new shard is bootstrapping, no data is read from it (because it has an incomplete picture of the corpus).</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/alt_text98.thumb.1280.1280.png\" alt=\"Alt text\" title=\"\"></p> \n<p>Because writes will happen out of order (new writes occur before older ones and some writes may happen twice), all writes must be idempotent and commutative to ensure data consistency.</p> \n<h2>How to use Gizzard?</h2> \n<p>The source code to Gizzard is <a href=\"http://github.com/twitter/gizzard\">available on GitHub</a>. A sample application that uses Gizzard, called Rowz, <a href=\"http://github.com/nkallen/Rowz\">is also available</a>. The best way to get started with Gizzard is to clone Rowz and customize.</p> \n<h2>Installation</h2> \n<h2>Maven</h2> \n<pre>&lt;dependency&gt;\n  &lt;groupId&gt;com.twitter&lt;/groupId&gt;\n  &lt;artifactId&gt;gizzard&lt;/artifactId&gt;\n  &lt;version&gt;1.0&lt;/version&gt;\n&lt;/dependency&gt;</pre> \n<h2>Ivy</h2> \n<pre>&lt;dependency org=\"com.twitter\" name=\"gizzard\" rev=\"1.0\"/&gt;</pre> \n<h2>Reporting problems</h2> \n<p>The Github issue tracker is <a href=\"http://github.com/twitter/gizzard/issues\">here</a>.</p> \n<h2>Contributors</h2> \n<ul>\n <li>Robey Pointer</li> \n <li>Nick Kallen</li> \n <li>Ed Ceaser</li> \n <li>John Kalucki</li> \n</ul>\n<p>If you’d like to learn more about the technologies that power Twitter, join us at <a href=\"http://chirp.twitter.com/\">Chirp</a>, the Twitter Developer Conference, where you will party, <a href=\"http://chirp.twitter.com/hack_day.html\">hack</a>, chat with and <a href=\"http://chirp.twitter.com/speakers.html\">learn</a> from Twitter Engineers and other developers!</p> \n<p>—<a href=\"https://twitter.com/intent/user?screen_name=nk\">@nk</a>, <a href=\"https://twitter.com/intent/user?screen_name=robey\">@robey</a>, <a href=\"https://twitter.com/intent/user?screen_name=asdf\">@asdf</a>, <a href=\"https://twitter.com/intent/user?screen_name=jkalucki\">@jkalucki</a></p>",
"date": "2010-04-06T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2010/introducing-gizzard-a-framework-for-creating-distributed-datastores",
"domain": "engineering"
},
{
"title": "Timeboxing",
"body": "<p>When you build a service that talks to external components, you have to worry about the amount of time that a network call will make. The standard technique for protecting yourself is to use timeouts.</p> \n<p>Most network libraries let you set timeouts to protect yourself but for local computation there are few tools to help you.</p> \n<p>Timeboxing is the name we’ve given to a technique for setting timeouts on local computation. The name is borrowed from the <a href=\"http://en.wikipedia.org/wiki/Timeboxing\">handy organizational technique</a>.</p> \n<p>Let’s say you have a method that can take an unbounded amount of time to complete. Normally it’s fast but sometimes it’s horribly slow. If you want to ensure that the work doesn’t take too long, you can box the amount of time it will be allowed to take before it’s aborted.</p> \n<p>One implementation we use for this in Scala is built on the <a href=\"http://java.sun.com/javase/6/docs/api/java/util/concurrent/Future.html\">Futures</a> Java concurrency feature. Futures allow you to compute in a separate thread while using your current thread for whatever else you’re doing. When you need the results of the Future, you call its <code>get()</code> method which blocks until the computation is complete. The trick we use is that you don’t need to do other work in the meantime, you can call <code>get()</code> immediately with a timeout value.</p> \n<p>Here’s an example (in Scala):</p> \n<pre>import java.util.concurrent.{Executors, Future, TimeUnit}\nval defaultValue = \"Not Found\"\nval executor = Executors.newFixedThreadPool(10)\nval future: Future[String] = executor.submit(new Callable[String]() {\n def call(): String = {\n // There's a small chance that this will take longer than you're willing to wait.\n sometimesSlow()\n }\n})\ntry {\n future.get(100L, TimeUnit.MILLISECONDS)\n} catch {\n case e: TimeoutException =&gt; defaultValue\n}</pre> \n<p>If returning a default value isn’t appropriate, you can report an error to the user or handle it in some other custom fashion. It entirely depends on the task. We measure and manage these slow computations with <a href=\"http://github.com/robey/ostrich\">Ostrich</a>, our performance statistics library. Code that frequently times out is a candidate for algorithmic improvement.</p> \n<p>Even though we’ve described it as a technique for protecting you from runaway local computation, timeboxing can also help with network calls that don’t support timeouts such as <a href=\"http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6450279\">DNS resolution</a>.</p> \n<p>Timeboxing is just one of many techniques we use to keep things humming along here at Twitter.</p> \n<p>—<a href=\"https://twitter.com/intent/user?screen_name=stevej\">@stevej</a></p>",
"date": "2010-04-01T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2010/timeboxing",
"domain": "engineering"
},
{
"title": "Unicorn Power",
"body": "<p>Have you found yourself waiting in line at a supermarket and the guy in front decides to pay by check? One person can slow everyone down. We had a similar problem at Twitter.</p> \n<p>Every server has a fixed number of workers (cashiers) that handle incoming requests. During peak hours, we may get more simultaneous requests than available workers. We respond by putting those requests in a queue.</p> \n<p>This is unnoticeable to users when the queue is short and we handle requests quickly, but large systems have outliers. Every so often a request will take unusually long, and everyone waiting behind that request suffers. Worse, if an individual worker’s line gets too long, we have to drop requests. You may be presented with an adorable whale just because you landed in the wrong queue at the wrong time.</p> \n<p>A solution is to stop being a supermarket and start being <a href=\"http://en.wikipedia.org/wiki/Fry%27s_Electronics\">Fry’s.</a> When you checkout at Fry’s you wait in one long line. In front are thirty cash registers handling one person at a time. When a cashier finishes with a customer, they turn on a light above the register to signal they’re ready to handle the next one. It’s counterintuitive, but one long line can be more efficient than many short lines.</p> \n<p>For a long Time, twitter.com ran on top of Apache and Mongrel, using mod_proxy_balancer. As in the supermarket, Apache would “push” requests to waiting mongrels for processing, using the ‘bybusyness’ method. Mongrels which had the least number of requests queued received the latest request. Unfortunately, when an incoming request was queued, the balancer process had no way of knowing how far along in each task the mongrels were. Apache would send requests randomly to servers when they were equally loaded, placing fast requests behind slow ones, increasing the latency of each request.</p> \n<p>In November we started testing <a href=\"http://tomayko.com/writings/unicorn-is-unix\">Unicorn</a>, an exciting new Ruby server that takes the Mongrel core and adds Fry’s “pull” model. Mobile.twitter.com was our first app to run Unicorn behind Apache, and in January <a href=\"https://twitter.com/intent/user?screen_name=netik\">@netik</a> ran the gauntlet to integrate Unicorn into the main Twitter.com infrastructure.</p> \n<p>During initial tests, we predicted we would maintain CPU usage and only cut request latency 10-15%. Unicorn surprised us by dropping request latency 30% and significantly lowering CPU usage.</p> \n<p>For automatic recovery and monitoring, we’d relied on monit for mongrel. Monit couldn’t introspect the memory and CPU within the Unicorn process tree, so we developed a new monitoring script, called Stormcloud, to kill Unicorns when they ran out of control. Monit would still monitor the master Unicorn process, but Stormcloud would watch over the Unicorns.</p> \n<p>With monit, child death during request processing (due to process resource limits or abnormal termination) would cause that request and all requests queued in the mongrel to send 500 “robot” errors until the mongrel had been restarted by monit. Unicorn’s pull model prevents additional errors from firing as the remaining children can still process the incoming workload.</p> \n<p>Since each Unicorn child is only working on one request at a time, a single error is thrown, allowing us to isolate a failure to an individual request (or sequence of requests). 
<p>For a long time, twitter.com ran on top of Apache and Mongrel, using mod_proxy_balancer. As in the supermarket, Apache would “push” requests to waiting mongrels for processing, using the ‘bybusyness’ method. Mongrels with the fewest requests queued received the next request. Unfortunately, when an incoming request was queued, the balancer process had no way of knowing how far along in each task the mongrels were. Apache would send requests randomly to servers when they were equally loaded, placing fast requests behind slow ones, increasing the latency of each request.</p> \n<p>In November we started testing <a href=\"http://tomayko.com/writings/unicorn-is-unix\">Unicorn</a>, an exciting new Ruby server that takes the Mongrel core and adds Fry’s “pull” model. Mobile.twitter.com was our first app to run Unicorn behind Apache, and in January <a href=\"https://twitter.com/intent/user?screen_name=netik\">@netik</a> ran the gauntlet to integrate Unicorn into the main Twitter.com infrastructure.</p> \n<p>During initial tests, we predicted we would maintain CPU usage and only cut request latency 10-15%. Unicorn surprised us by dropping request latency 30% and significantly lowering CPU usage.</p> \n<p>For automatic recovery and monitoring, we’d relied on monit for mongrel. Monit couldn’t introspect the memory and CPU within the Unicorn process tree, so we developed a new monitoring script, called Stormcloud, to kill Unicorns when they ran out of control. Monit would still monitor the master Unicorn process, but Stormcloud would watch over the Unicorns.</p> \n<p>With monit, child death during request processing (due to process resource limits or abnormal termination) would cause that request and all requests queued in the mongrel to fail with 500 “robot” errors until the mongrel had been restarted by monit. Unicorn’s pull model prevents additional errors from firing, as the remaining children can still process the incoming workload.</p> \n<p>Since each Unicorn child is only working on one request at a time, a single error is thrown, allowing us to isolate a failure to an individual request (or sequence of requests). Recovery is fast, as Unicorn immediately restarts children that have died, unlike monit, which would wait until the next check cycle.</p> \n<p>We also took advantage of Unicorn’s zero-downtime deployment by writing a deploy script that would transition Twitter onto new versions of our code base while still accepting requests, ensuring that the new code base was verified and running before switching onto it. It’s a bit like changing the tires on your car while still driving, and it works beautifully.</p> \n<p>Stay tuned for the implications.</p> \n<p>—<a href=\"https://twitter.com/intent/user?screen_name=sandofsky\">@sandofsky</a>, <a href=\"https://twitter.com/intent/user?screen_name=netik\">@netik</a></p>",
"date": "2010-03-30T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2010/unicorn-power",
"domain": "engineering"
},
{
"title": "Link: Cassandra at Twitter",
"body": "<p>Storage Team Lead <a href=\"http://twitter.com/rk\">Ryan King</a> recently spoke to MyNoSQL about how we’re using <a href=\"http://nosql.mypopescu.com/post/407159447/cassandra-twitter-an-interview-with-ryan-king\">Cassandra at Twitter.</a> In the interview Ryan talks about the reasons for the switch and how we plan to migrate tweets from MySQL to Cassandra.</p> \n<p>—<a href=\"https://twitter.com/intent/user?screen_name=jeanpaul\">@jeanpaul</a></p>",
"date": "2010-02-23T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2010/link-cassandra-at-twitter",
"domain": "engineering"
},
{
"title": "The Anatomy of a Whale",
"body": "<p>Sometimes it’s really hard to figure out what’s causing problems in a web site like Twitter. But over time we have learned some techniques that help us to solve the variety of problems that occur in our complex web site.</p> \n<p>A few weeks ago, we noticed something unusual: over 100 visitors to Twitter per second saw what is popularly known as “the fail whale”. Normally these whales are rare; 100 per second was cause for alarm. Although even 100 per second is a very small fraction of our overall traffic, it still means that a lot of users had a bad experience when visiting the site. So we mobilized a team to find out the cause of the problem.</p> \n<h2>What Causes Whales?</h2> \n<p>What is the thing that has come to be known as “the fail whale”? It is a visual representation of the HTTP “503: Service Unavailable” error. It means that Twitter does not have enough capacity to serve all of its users. To be precise, we show this error message when a request would wait for more than a few seconds before resources become available to process it. So rather than make users wait forever, we “throw away” their requests by displaying an error.</p> \n<p>This can sometimes happen because too many users try to use Twitter at once and we don’t have enough computers to handle all of their requests. But much more likely is that some component part of Twitter suddenly breaks and starts slowing down.</p> \n<p>Discovering the root cause can be very difficult because Whales are an indirect symptom of a root cause that can be one of many components. In other words, the only concrete fact that we knew at the time was that there was some problem, somewhere. We set out to uncover exactly where in the Twitter requests’ lifecycle things were breaking down.</p> \n<p>Debugging performance issues is really hard. But it’s not hard due to a lack of data; in fact, the difficulty arises because there is too much data. We measure dozens of metrics per individual site request, which, when multiplied by the overall site traffic, is a massive amount of information about Twitter’s performance at any given moment. Investigating performance problems in this world is more of an art than a science. It’s easy to confuse causes with symptoms and even the data recording software itself is untrustworthy.</p> \n<p>In the analysis below we used a simple strategy that involves proceeding from the most aggregate measures of system as a whole and at each step getting more fine grained, looking at smaller and smaller parts.</p> \n<h2>How is a Web Page Built?</h2> \n<p>Composing a web page for Twitter request often involves two phases. First data is gathered from remote sources called “network services”. For example, on the Twitter homepage your tweets are displayed as well as how many followers you have. These data are pulled respectively from our tweet caches and our social graph database, which keeps track of who follows whom on Twitter. The second phase of the page composition process assembles all this data in an attractive way for the user. We call the first phase the IO phase and the second the CPU phase. 
In order to discover which phase was causing problems, we checked data that records how much time was spent in each phase when composing Twitter’s web pages.</p> \n<p><a href=\"http://4.bp.blogspot.com/_JOkC2SWCoSc/S3H6iZ-LYTI/AAAAAAAAAPg/0FhPjMPwPFU/s1600-h/cpulatency.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/the_anatomy_of_awhale95.thumb.1280.1280.png\" alt=\"The Anatomy of a Whale\"></a></p> \n<p>The green line in this graph represents the time spent in the IO phase and the blue line represents the CPU phase. This graph represents about 1 day of data. You can see that the relationships change over the course of the day. During non-peak traffic, CPU time is the dominant portion of our request, with our network services responding relatively quickly. However, during peak traffic, IO latency almost doubles and becomes the primary component of total request latency.</p> \n<h2>Understanding Performance Degradation</h2> \n<p>There are two possible interpretations for this ratio changing over the course of the day. One possibility is that the way people use Twitter during one part of the day differs from other parts of the day. The other possibility is that some network service degrades in performance as a function of use. In an ideal world, each network service would have equal performance for equal queries; but in the worst case, the same queries actually get slower as you run more simultaneously. Checking various metrics confirmed that users use Twitter the same way during different parts of the day. So we hypothesized that the problem was a network service degrading under load. We were still unsure; in any good investigation one must constantly remain skeptical. But we decided that we had enough information to transition from this more general analysis of the system into something more specific, so we looked into IO latency data.</p> \n<p><a href=\"http://4.bp.blogspot.com/_JOkC2SWCoSc/S3H7Nx2lx3I/AAAAAAAAAPw/4aodTJ1foqQ/s1600-h/networkservice.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/the_anatomy_of_awhale96.thumb.1280.1280.png\" alt=\"The Anatomy of a Whale\"></a></p> \n<p>This graph represents the total amount of time waiting for our network services to deliver data. Since the amount of traffic we get changes over the course of the day, we expect any total to vary proportionally. But this graph is actually traffic independent; that is, we divide the measured latency by the amount of traffic at any given time. If any traffic-independent total latency changes over the course of the day, we know the corresponding network service is degrading with traffic. You can see that the purple line in this graph (which represents Memcached) degrades dramatically as traffic increases during peak hours. Furthermore, because it is at the top of the graph it is also the biggest proportion of time waiting for network services. So this correlates with the previous graph and we now have a stronger hypothesis: Memcached performance degrades dramatically during the course of the day, which leads to slower response times, which leads to whales.</p> \n<p>This sort of behavior is consistent with insufficient resource capacity. When a service with limited resources, such as Memcached, is taxed to its limits, requests begin contending with each other for Memcached’s computing time. 
For example, if Memcached can only handle 10 requests at a time but gets 11 requests at once, the 11th request needs to wait in line to be served.</p> \n<h2>Focus on the Biggest Contributor to the Problem</h2> \n<p>If we can add sufficient Memcached capacity to reduce this sort of resource contention, we could increase the throughput of Twitter.com substantially. If you look at the above graph, you can infer that this optimization could increase Twitter performance by 50%.</p> \n<p>There are two ways to add capacity. We could do this by adding more computers (memcached servers). But we can also change the software that talks to Memcached to be as efficient with its requests as possible. Ideally we do both.</p> \n<p>We decided to first pursue how we query Memcached to see if there was any easy way to optimize that by reducing the overall number of queries. But there are many types of queries to Memcached, and some may take longer than others. We want to spend our time wisely and focus on optimizing the queries that are most expensive in aggregate.</p> \n<p>We sampled a live process to record some statistics on which queries take the longest. The following is each type of Memcached query and how long each takes on average:</p> \n<pre>get 0.003s\nget_multi 0.008s\nadd 0.003s\ndelete 0.003s\nset 0.003s\nincr 0.003s\nprepend 0.002s</pre> \n<p>You can see that <code>get_multi</code> is a little more expensive than the rest but everything else is the same. But that doesn’t mean it’s the source of the problem. We also need to know what proportion of all requests each type of query accounts for.</p> \n<pre>get 71.44%\nget_multi 8.98%\nset 8.69%\ndelete 5.26%\nincr 3.71%\nadd 1.62%\nprepend 0.30%</pre> \n<p>If you multiply average latency by the percentage of requests, you get a measure of the total contribution to slowness. Here, we found that <code>gets</code> were the biggest contributor to slowness. So, we wanted to see if we could reduce the number of gets.</p> 
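<p>To make that arithmetic concrete, here is the calculation in a few lines of Scala, using the numbers above:</p> \n<pre>// contribution to slowness = average latency * share of all queries\nval latency = Map(\"get\" -&gt; 0.003, \"get_multi\" -&gt; 0.008, \"set\" -&gt; 0.003,\n                  \"delete\" -&gt; 0.003, \"incr\" -&gt; 0.003, \"add\" -&gt; 0.003,\n                  \"prepend\" -&gt; 0.002)\nval share = Map(\"get\" -&gt; 0.7144, \"get_multi\" -&gt; 0.0898, \"set\" -&gt; 0.0869,\n                \"delete\" -&gt; 0.0526, \"incr\" -&gt; 0.0371, \"add\" -&gt; 0.0162,\n                \"prepend\" -&gt; 0.0030)\n\nlatency.map { case (q, l) =&gt; (q, l * share(q)) }.toSeq.sortBy(-_._2).foreach {\n  case (q, c) =&gt; println(f\"$q%-10s $c%.5f\")\n}\n// get comes out around 0.00214, roughly three times get_multi's 0.00072</pre> 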
<h2>Tracing Program Flow</h2> \n<p>Since we make Memcached queries from all over the Twitter software, it was initially unclear where to start looking for optimization opportunities. Our first step was to begin collecting stack traces, which are logs that represent what the program is doing at any given moment in time. We instrumented one of our computers to sample a small percentage of <code>get</code> Memcached calls and record what sorts of things caused them.</p> \n<p>Unfortunately, we collected a huge amount of data and it was hard to understand. Following our precedent of using visualizations in order to gain insight into large sets of data, we took some inspiration from the Google perf-tools project and wrote <a href=\"http://github.com/eaceaser/ruby-call-graph\">a small program</a> that generated a cloud graph of the various paths through our code that were resulting in Memcached gets. Here is a simplified picture:</p> \n<p><a href=\"http://3.bp.blogspot.com/_JOkC2SWCoSc/S3H7OVuGPGI/AAAAAAAAAP4/OzGaUK-AND0/s1600-h/callgraph.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/the_anatomy_of_awhale97.thumb.1280.1280.png\" alt=\"The Anatomy of a Whale\"></a></p> \n<p>Each circle represents one component/function. The size of the circle represents how big a proportion of Memcached <code>get</code> queries come from that function. The lines between the circles show which function caused the other function to occur. The biggest circle is <code>check_api_rate_limit</code> but it is caused mostly by <code>authenticate_user</code> and <code>attempt_basic_auth</code>. In fact, <code>attempt_basic_auth</code> is the main opportunity for enhancement. It helps us compute who is requesting a given web page so we can serve personalized (and private) information to just the right people.</p> \n<p>Any Memcached optimizations that we can make here would have a large effect on the overall performance of Twitter. By counting the number of actual <code>get</code> queries made per request, we found that, on average, a single call to <code>attempt_basic_auth</code> was making 17 calls. The next question is: can any of them be removed?</p> \n<p>To figure this out we need to look very closely at all of the queries. Here is a “history” of the most popular web page that calls <code>attempt_basic_auth</code>. This is the API request for <code><a href=\"http://twitter.com/statuses/friends_timeline.format\" target=\"_blank\" rel=\"nofollow\">http://twitter.com/statuses/friends_timeline.format</a></code>, the most popular page on Twitter!</p> \n<pre>get([\"User:auth:missionhipster\", # maps screen name to user id\nget([\"User:15460619\", # gets user object given user id (used to match passwords)\nget([\"limit:count:login_attempts:...\", # prevents dictionary attacks\nset([\"limit:count:login_attempts:...\", # unnecessary in most cases, bug\nset([\"limit:timestamp:login_attempts:...\", # unnecessary in most cases, bug\nget([\"limit:timestamp:login_attempts:...\",\nget([\"limit:count:login_attempts:...\", # can be memoized\nget([\"limit:count:login_attempts:...\", # can also be memoized\nget([\"user:basicauth:...\", # an optimization to avoid calling bcrypt\nget([\"limit:count:api:...\", # global API rate limit\nset([\"limit:count:api:...\", # unnecessary in most cases, bug\nset([\"limit:timestamp:api:...\", # unnecessary in most cases, bug\nget([\"limit:timestamp:api:...\",\nget([\"limit:count:api:...\", # can be memoized from previous query\nget([\"home_timeline:15460619\", # determine which tweets to display\nget([\"favorites_timeline:15460619\", # determine which tweets are favorited\nget_multi([[\"Status:fragment:json:7964736693\", # load, in parallel, all of the tweets we're gonna display.</pre> \n<p>Note that all of the “limit:” queries above come from <code>attempt_basic_auth</code>. We noticed a few other (relatively minor) unnecessary queries as well. From this data, it seems we can eliminate seven out of seventeen total Memcached calls — a 42% improvement for the most popular page on Twitter.</p> \n<p>At this point, we need to write some code to make these bad queries go away. Some of them we cache (so we don’t make the exact same query twice), some are just bugs and are easy to fix. Some we might try to parallelize (do more than one query at the same time). But this 42% optimization (especially if combined with new hardware) has the potential to eliminate the performance degradation of our Memcached cluster and also make most page loads that much faster. It is possible we could see a (substantially) greater than 50% increase in the capacity of Twitter with these optimizations.</p> 
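<p>The caching piece of that fix is simple request-local memoization; a rough Scala sketch (hypothetical names, not our actual code):</p> \n<pre>// Hypothetical per-request memo: the first lookup for a key hits Memcached;\n// repeat lookups within the same request are answered from local memory.\nclass RequestLocalCache(backend: String =&gt; Option[String]) {\n  private val memo = scala.collection.mutable.Map.empty[String, Option[String]]\n  def get(key: String): Option[String] = memo.getOrElseUpdate(key, backend(key))\n}\n// e.g. the two identical \"limit:count:login_attempts:...\" gets above become one.</pre> 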
<p>This story presents a couple of the fundamental principles that we use to debug the performance problems that lead to whales. First, always proceed from the general to the specific. Here, we progressed from looking first at I/O and CPU timings to finally focusing on the specific Memcached queries that caused the issue. And second, live by the data, but don’t trust it. Despite the promise of a 50% gain that the data implies, it’s unlikely we’ll see any performance gain anywhere near that. Even so, it’ll hopefully be substantial.</p> \n<p>— <a href=\"https://twitter.com/intent/user?screen_name=asdf\">@asdf</a> and <a href=\"https://twitter.com/intent/user?screen_name=nk\">@nk</a></p>",
"date": "2010-02-10T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2010/the-anatomy-of-a-whale",
"domain": "engineering"
},
{
"title": "Links: Relational Algebra and Scala DI",
"body": "<p>Wondering what other technical concerns we face here at Twitter?</p> \n<p>Infrastructure engineer <a href=\"http://twitter.com/nk\">Nick Kallen</a> has a pair of posts on his personal blog: one about the <a href=\"http://magicscalingsprinkles.wordpress.com/2010/01/28/why-i-wrote-arel/\">new relational algebra system</a> behind the Rails 3 ORM, and one about <a href=\"http://magicscalingsprinkles.wordpress.com/2010/02/08/why-i-love-everything-you-hate-about-java/\">dependency injection in Scala</a>.</p> \n<p>—<a href=\"https://twitter.com/intent/user?screen_name=evan\">@evan</a></p>",
"date": "2010-02-09T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2010/links-relational-algebra-and-scala-di",
"domain": "engineering"
},
{
"title": "Introducing the Open Source Twitter Text libraries",
"body": "<p>Over time Tweets have acquired a language all their own. Some of these have been around a long time (like @username at the beginning of a Tweet) and some of these are relatively recent (such as lists) but all of them make the language of Tweets unique. Extracting these Tweet-specific components from a Tweet is relatively simple for the majority of Tweets, but like most text parsing issues the devil is in the details.</p> \n<p>We’ve extracted the code we use to handle Tweet-specific elements and released it as an open source library. This first version is available in Ruby and Java but in the Twitter spirit of openness we’ve also released a conformance test suite so any other implementations can verify they meet the same standards.</p> \n<h2>Tweet-specific Language</h2> \n<p>It all started with the @reply … and then it got complicated. Twitter users started the use of @username at the beginning of a Tweet to indicate a reply, but you’re not here to read about history. In order to talk about the new Twitter Text libraries one needs to understand the Tweet-specific elements we’re interested in. Much of this will be a review of what you already know but a shared vocabulary will help later on. While the Tweet-specific language is always expanding the current elements consist of:</p> \n<ul>\n <li>@reply This is a Tweet which begins with @username. This is distinct from the presence of @username elsewhere in the Tweet (more on that in a moment). An @reply Tweet is considered directly addressed to the @username and only some of your followers will see the Tweets (notably, those who follow both you and the @username).</li> \n <li><a href=\"https://twitter.com/intent/user?screen_name=mention\">@mention</a> This is a Tweet which contains one or more @usernames anywhere in the Tweet. Technically an @reply is a type of <a href=\"https://twitter.com/intent/user?screen_name=mention\">@mention</a>, which is important from a parsing perspective. An <a href=\"https://twitter.com/intent/user?screen_name=mention\">@mention</a> Tweets will be delivered to all of your followers regardless of is the follow the <a href=\"https://twitter.com/intent/user?screen_name=mentioned\">@mentioned</a> user or not.</li> \n <li>@username/list-name Twitter lists are referenced using the syntax @username/list-name where the list-name portion has to meet some specific rules.</li> \n <li>#hashtag As long has there has been a way to search Tweets* people have been adding information to make the easy to find. The #hashtag syntax has become the standard for attaching a succinct tag to Tweets.</li> \n <li>URLs While URLs are not Tweet-specific they are an important part of Tweets and require some special handling. There is a vast array of services based on the URLs in Tweets. In addition to services that extract the URLs most people expect URLs to be automatically converted to links when viewing a Tweet.</li> \n</ul>\n<h2>Twitter Text Libraries</h2> \n<p>For this first version of the Twitter Text libraries we’ve released both Ruby and Java versions. We certainly expect more languages in the future and we’re looking forward to the patches and feedback we’ll get on these first versions.</p> \n<p>For each library we’ve provided functions for extracting the various Tweet-specific elements. Displaying Tweets in HTML is a very common use case so we’ve also included HTML auto-linking functions. 
The individual language interfaces differ so they can feel as natural as possible for each individual language.</p> \n<h2>Ruby Library</h2> \n<p>The Ruby library is available as a <a href=\"http://gemcutter.org/gems/twitter-text\">gem via gemcutter</a> or the source code can be <a href=\"http://github.com/mzsanford/twitter-text-rb\">found on github</a>. You can also peruse the <a href=\"http://mzsanford.github.com/twitter-text-rb/doc/index.html\">rdoc hosted on github</a>. The Ruby library is provided as a set of Ruby modules so they can be included in your own classes and modules. The rdoc is a more complete reference but for a quick taste check out this short example:</p> \n<pre>class MyClass\n include Twitter::Extractor\nend\n\nusernames = MyClass.new.extract_mentioned_screen_names(\"Mentioning @twitter and @jack\")\n# usernames = [\"twitter\", \"jack\"]</pre> \n<p>The interface makes this all seem quite simple but there are some very complicated edge cases. I’ll talk more about that in the next section, Conformance Testing.</p> \n<h2>Java Library</h2> \n<p>The source code for the Java library can be <a href=\"http://github.com/mzsanford/twitter-text-java\">found on github</a>. The library provides an ant file for building the twitter-text.jar file. You can also peruse the <a href=\"http://mzsanford.github.com/twitter-text-java/docs/api/index.html\">javadocs hosted on github</a>. The Java library provides Extractor and Autolink classes that provide object-oriented methods for extraction and auto-linking. The javadoc is a more complete reference but for a quick taste check out this short example:</p> \n<pre>import java.util.List;\nimport com.twitter.Extractor;\n\npublic class Check {\n public static void main(String[] args) {\n List&lt;String&gt; names;\n Extractor extractor = new Extractor();\n\n names = extractor.extractMentionedScreennames(\"Mentioning @twitter and @jack\");\n for (String name : names) {\n System.out.println(\"Mentioned @\" + name);\n }\n }\n}</pre> \n<p>The library makes this all seem quite simple but there are some very complicated edge cases.</p> \n<h2>Conformance Testing</h2> \n<p>While working on the Ruby and Java version of the Twitter Text libraries it became pretty clear that porting tests to each language individually wasn’t going to be sustainable. To help keep things in sync we created the <a href=\"http://github.com/mzsanford/twitter-text-conformance\">Twitter Text Conformance project</a>. This project provides some simple yaml files that define the expected before and after states for testing. The per-language implementation of these tests can vary along with the per-language interface, making it intuitive for programmers in any language.</p> \n<p>The basic extraction and auto-link test cases are easy to understand but the edge cases abound. Many of the largest complications come from handling Tweets written in Japanese and other languages that don’t use spaces. We also try to be lenient with the allowed URL characters, which creates some more headaches.</p> \n<p>— <a href=\"https://twitter.com/intent/user?screen_name=mzsanford\">@mzsanford</a></p>",
"date": "2010-02-04T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2010/introducing-the-open-source-twitter-text-libraries",
"domain": "engineering"
},
{
"title": "WOEIDs in Twitter&#39;s Trends",
"body": "<p>How do you represent a “place”? That’s what we were wondering when we were putting together our API for <a href=\"http://engineering/2010/01/now-trending-local-trends.html\">Trends on Twitter</a>. We needed a way to represent a place in a permanent and language-independent way - we didn’t want to be caught in using an identifier that may change over time, and we didn’t want to be caught in internationalization issues. Where we landed was on using Yahoo!’s Where On Earth IDs, or WOEIDs, as our representation.</p> \n<p><a href=\"http://developer.yahoo.com/geo/geoplanet/guide/concepts.html\">WOEID</a>s are 32-bit identifiers that are “unique and non-repetitive” — if a location is assigned a WOEID, the WOEID assigned is never changed, and that particular WOEID is never given to another location. In addition, a WOEID has certain properties such as an implied hierarchy (a particular city is in a particular state, for example), and a taxonomy of categories (there is a “language” to categorize something as a “town” or a “suburb”). Finally, there is a standard endpoint to query to get more information about a place. A program that wanted to get data about San Francisco, CA would first determine that it has a WOEID of 2487956, and with that query <a href=\"http://where.yahooapis.com/v1/place/2487956\" target=\"_blank\" rel=\"nofollow\">http://where.yahooapis.com/v1/place/2487956</a>.</p> \n<p>What this all means is that our Trends API is now interoperable with anybody else who is building a system on top of WOEIDs — you could easily mash-up <a href=\"http://www.flickr.com/places/\">Flickr’s places</a> with our Trend locations, for example. If you want to give something like that a try, check out the <a href=\"http://apiwiki.twitter.com/Twitter-REST-API-Method%3A-trends-available\">trends/available</a> endpoint, as that will let you know what WOEIDs we are exposing trending information for. With those WOEIDs, you can then hit up <a href=\"http://apiwiki.twitter.com/Twitter-REST-API-Method%3A-trends-location\">trends/location</a> to get the actual trending data.</p> \n<p>There are two niceties to the API: pass in a latitude/longitude when querying trends/available, and remember that there is a hierarchy of WOEIDs. If you pass in a lat and a long parameter to trends/available, then all the locations that are returned are sorted by their <a href=\"http://en.wikipedia.org/wiki/Haversine_formula\">haversine distance</a> from the coordinate passed in. Application developers can use this to help get trends from “around where you are”.</p> \n<p>And second, like I mentioned above, WOEIDs form a hierarchy (that’s mostly correct). 
Here is the hierarchy of the locations that we have as of today:</p> \n<pre>1 (\"Terra\")\n |---- 23424775 (\"Canada\" - Country)\n |---- 23424803 (\"Ireland\" - Country)\n |---- 23424975 (\"United Kingdom\" - Country)\n | \\---- 24554868 (\"England\" - State)\n | \\---- 23416974 (\"Greater London\" - County)\n | \\---- 44418 (\"London\" - Town)\n |---- 23424900 (\"Mexico\" - Country)\n |---- 23424768 (\"Brazil\" - Country)\n | \\---- 2344868 (\"Sao Paulo\" - State)\n | \\---- 12582314 (\"São Paulo\" - County)\n | \\---- 455827 (\"Sao Paulo\" - Town)\n \\---- 23424977 (\"United States\" - Country)\n |---- 2347572 (\"Illinois\" - State)\n | \\---- 12588093 (\"Cook\" - County)\n | \\---- 2379574 (\"Chicago\" - Town)\n |---- 2347567 (\"District of Columbia\" - State)\n | \\---- 12587802 (\"District of Columbia\" - County)\n | \\---- 2514815 (\"Washington\" - Town)\n |---- 2347606 (\"Washington\" - State)\n | \\---- 12590456 (\"King\" - County)\n | \\---- 2490383 (\"Seattle\" - Town)\n |---- 2347579 (\"Maryland\" - State)\n | \\---- 12588679 (\"Baltimore City\" - County)\n | \\---- 2358820 (\"Baltimore\" - Town)\n |---- 2347563 (\"California\" - State)\n | |---- 12587707 (\"San Francisco\" - County)\n | | \\---- 2487956 (\"San Francisco\" - Town)\n | \\---- 12587688 (\"Los Angeles\" - County)\n | \\---- 2442047 (\"Los Angeles\" - Town)\n |---- 2347580 (\"Massachusetts\" - State)\n | \\---- 12588712 (\"Suffolk\" - County)\n | \\---- 2367105 (\"Boston\" - Town)\n |---- 2347591 (\"New York\" - State)\n | \\---- 2459115 (\"New York\" - Town)\n |---- 2347569 (\"Georgia\" - State)\n | \\---- 12587929 (\"Fulton\" - County)\n | \\---- 2357024 (\"Atlanta\" - Town)\n |---- 2347602 (\"Texas\" - State)\n | |---- 12590226 (\"Tarrant\" - County)\n | | \\---- 2406080 (\"Fort Worth\" - Town)\n | |---- 12590107 (\"Harris\" - County)\n | | \\---- 2424766 (\"Houston\" - Town)\n | |---- 12590063 (\"Dallas\" - County)\n | | \\---- 2388929 (\"Dallas\" - Town)\n | \\---- 12590021 (\"Bexar\" - County)\n | \\---- 2487796 (\"San Antonio\" - Town)\n \\---- 2347597 (\"Pennsylvania\" - State)\n \\---- 12589778 (\"Philadelphia\" - County)\n \\---- 2471217 (\"Philadelphia\" - Town)\n</pre> \n<p>Right now, even though we don’t expose this information using trends/available, you could ask for any of those WOEIDs, and we’ll choose the nearest trend location (or locations!) that we have data for.</p> \n<p>And, of course, we have a few more things in the pipeline…</p> \n<p>—<a href=\"https://twitter.com/intent/user?screen_name=raffi\">@raffi</a></p>",
"date": "2010-02-04T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2010/woeids-in-twitters-trends",
"domain": "engineering"
},
{
"title": "Hello World",
"body": "<p>Welcome! I’m Ben, and I’m an engineer at Twitter. We’ve started this blog to show some of the cool things we’re creating and tough problems we’re solving.</p> \n<p>As a fun way to kick things off, I ran <a href=\"http://vis.cs.ucdavis.edu/%7Eogawa/codeswarm/\">Code Swarm</a> over a few essential production apps. Icons represent developers, and particles represent files added or modified. It doesn’t cover prototypes or contributions to open source, so it isn’t exactly scientific, but it still goes to show Twitter’s explosive growth mirrored in engineering.</p> \n<p></p> \n<p>(The forked version of Code Swarm that adds rounded avatars and particle trails is <a href=\"http://github.com/sandofsky/code_swarm\">available on github</a>.)</p>",
"date": "2010-02-03T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2010/hello-world-0",
"domain": "engineering"
},
{
"title": "Mesos 0.15 and Authentication Support",
"body": "<p>With the latest Mesos <a href=\"http://mesos.apache.org/downloads/\">0.15.0 release</a>, we are pleased to report that we’ve added initial authentication support for frameworks (see <a href=\"https://issues.apache.org/jira/browse/MESOS-418\">MESOS-418</a>) connecting to Mesos. In a nutshell, this feature allows only authenticated frameworks to register with Mesos and launch tasks. Authentication is important as it prevents rogue frameworks from causing problems that may impact the usage of resources within a Mesos cluster.</p> \n<h5>How it works</h5> \n<p>Mesos uses the excellent <a href=\"http://asg.web.cmu.edu/sasl/\">Cyrus SASL library</a> to provide authentication. SASL is a very flexible authentication framework that allows two endpoints to authenticate with each other and also has support for various authentication mechanisms (e.g., ANONYMOUS, PLAIN, CRAM-MD5, GSSAPI).</p> \n<p>In this release, Mesos uses SASL with the CRAM-MD5 authentication mechanism. The process for enabling authentication begins with the creation of an authentication credential that is unique to the framework. This credential constitutes a <strong>principal</strong> and <strong>secret</strong> pair, where <strong>principal</strong> refers to the identity of the framework. Note that the ‘principal’ is different from the framework <strong>user</strong> (the Unix user which executors run as) or the resource <strong>role</strong> (role that has reserved certain resources on a slave). These credentials should be shipped to the Mesos master machines, as well as given access to the framework, meaning that both the framework and the Mesos masters should be started with these credentials.</p> \n<p>Once authentication is enabled, Mesos masters only allow authenticated frameworks to register. Authentication for frameworks is performed under the hood by the new scheduler driver.</p> \n<p>For specific instructions on how to do this please read the <a href=\"//mesos.apache.org/documentation/latest/upgrades/\">upgrade instructions</a>.</p> \n<h5>Looking ahead</h5> \n<p><strong>Adding authentication support for slaves</strong></p> \n<p>Similar to adding authentication support to frameworks, it would be great to add authentication support to the slaves. Currently any node in the network can run a Mesos slave process and register with the Mesos master. Requiring slaves to authenticate with the master before registration would prevent rogue slaves from causing problems like DDoSing the master or getting access to users tasks in the cluster.</p> \n<p><strong>Integrating with Kerberos</strong></p> \n<p>Currently the authentication support via shared secrets between frameworks and masters is basic to benefit usability. To improve upon this basic approach, a more powerful solution would be to integrate with an industry standard authentication service like <a href=\"http://en.wikipedia.org/wiki/Kerberos_(protocol)\">Kerberos</a>. A nice thing about SASL and one of the reasons we picked it is because of its support for integration with GSSAPI/Kerberos. We plan to leverage this support to integrate Kerberos with Mesos.</p> \n<p><strong>Data encryption</strong></p> \n<p>Authentication is only part of the puzzle when it comes to deploying and running applications securely in the cloud. Another crucial component is data encryption. Currently all the messages that flow through the Mesos cluster are un-encrypted making it possible for intruders to intercept and potentially control your task. 
<h5>Looking ahead</h5> \n<p><strong>Adding authentication support for slaves</strong></p> \n<p>Similar to adding authentication support to frameworks, it would be great to add authentication support to the slaves. Currently any node in the network can run a Mesos slave process and register with the Mesos master. Requiring slaves to authenticate with the master before registration would prevent rogue slaves from causing problems like DDoSing the master or getting access to users’ tasks in the cluster.</p> \n<p><strong>Integrating with Kerberos</strong></p> \n<p>Currently the authentication support via shared secrets between frameworks and masters is deliberately basic, in favor of usability. To improve upon this basic approach, a more powerful solution would be to integrate with an industry standard authentication service like <a href=\"http://en.wikipedia.org/wiki/Kerberos_(protocol)\">Kerberos</a>. A nice thing about SASL, and one of the reasons we picked it, is its support for integration with GSSAPI/Kerberos. We plan to leverage this support to integrate Kerberos with Mesos.</p> \n<p><strong>Data encryption</strong></p> \n<p>Authentication is only part of the puzzle when it comes to deploying and running applications securely in the cloud. Another crucial component is data encryption. Currently all the messages that flow through the Mesos cluster are unencrypted, making it possible for intruders to intercept and potentially control your task. We plan to add encryption support by adding SSL support to <a href=\"https://github.com/3rdparty/libprocess\">libprocess</a>, the low-level communication library that Mesos uses, which is responsible for all network communication between Mesos components.</p> \n<p><strong>Authorization</strong></p> \n<p>We are also investigating authorizing principals to grant them access to only a specific set of operations, like launching tasks or using resources. In fact, you could imagine a world where an authenticated ‘principal’ would be authorized to act on behalf of a subset of ‘user’s and ‘role’s for launching tasks and accepting resources, respectively. This authorization information could be stored in a directory service like LDAP.</p> \n<h5>Thanks</h5> \n<p>While a lot of people contributed to this feature, we would like to give special thanks to <a href=\"https://twitter.com/ilimugur\">Ilim Igur</a>, our <a href=\"http://www.google-melange.com/gsoc/homepage/google/gsoc2013\">Google Summer of Code intern</a> who started this project and contributed to the initial design and implementation.</p> \n<p>If you are as excited as we are about this feature, please go ahead and play with the <a href=\"http://mesos.apache.org\">latest release</a> and let us know what you think. You can get in touch with us via <a href=\"https://twitter.com/intent/user?screen_name=ApacheMesos\">@ApacheMesos</a> or via <a href=\"//mesos.apache.org/community/\">mailing lists and IRC</a>.</p>",
"date": "2014-01-09T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2014/mesos-015-and-authentication-support",
"domain": "engineering"
},
{
"title": "Introducing Twitter Data Grants",
"body": "<p>Today we’re introducing a pilot project we’re calling <a href=\"https://engineering.twitter.com/research/data-grants\">Twitter Data Grants</a>, through which we’ll give a handful of research institutions access to our public and historical data.</p> \n<p>With more than 500 million Tweets a day, Twitter has an expansive set of data from which we can glean insights and learn about a variety of topics, from health-related information such as when and<a href=\"http://releases.jhu.edu/2013/01/24/using-twitter-to-track-the-flu/\"> where the flu may hit</a> to global events like <a href=\"https://engineering/2014/everybody-everywhere-ringing-in-2014\">ringing in the new year</a>. To date, it has been challenging for researchers outside the company who are tackling big questions to collaborate with us to access our public, historical data. Our Data Grants program aims to change that by connecting research institutions and academics with the data they need.</p> \n<blockquote class=\"g-quote g-tweetable\"> \n <p>Submit a proposal for consideration to our Twitter Data Grants pilot program by March 15.</p> \n</blockquote> \n<p>If you’d like to participate, submit a proposal <a href=\"https://engineering.twitter.com/research/data-grants\">here</a> no later than March 15th. For this initial pilot, we’ll select a small number of proposals to receive free datasets. We can do this thanks to Gnip, one of our <a href=\"https://dev.twitter.com/programs/twitter-certified-products/gnip\">certified data reseller partners</a>. They are working with us to give selected institutions free and easy access to Twitter datasets. In addition to the data, we will also be offering opportunities for the selected institutions to collaborate with Twitter engineers and researchers.</p> \n<p>We encourage those of you at research institutions using Twitter data to send in your best proposals. To get updates and stay in touch with the program: visit research.twitter.com, make sure to follow <a href=\"https://twitter.com/intent/user?screen_name=TwitterEng\">@TwitterEng</a>, or email data-grants@twitter.com with questions.</p>",
"date": "2014-02-05T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2014/introducing-twitter-data-grants",
"domain": "engineering"
},
{
"title": "Netty at Twitter with Finagle",
"body": "<p><a href=\"https://engineering/2011/finagle-a-protocol-agnostic-rpc-system\">Finagle</a> is our fault tolerant, protocol-agnostic RPC framework built atop <a href=\"http://netty.io/\">Netty</a>. Twitter’s core services are built on Finagle, from backends serving user profile information, Tweets, and timelines to front end API endpoints handling HTTP requests.</p> \n<p>Part of <a href=\"https://engineering/2013/new-tweets-per-second-record-and-how\">scaling Twitter</a> was the shift from a monolithic Ruby on Rails application to a service-oriented architecture. In order to build out this new architecture we needed a performant, fault tolerant, protocol-agnostic, asynchronous RPC framework. Within a service-oriented architecture, services spend most of their time waiting for responses from other upstream services. Using an asynchronous library allows services to concurrently process requests and take full advantage of the hardware. While Finagle could have been built directly on top of NIO, Netty had already solved many of the problems we would have encountered as well as provided a clean and clear API.</p> \n<p>Twitter is built atop several open source protocols: primarily HTTP, Thrift, Memcached, MySQL, and Redis. Our network stack would need to be flexible enough that it could speak any of these protocols and extensible enough that we could easily add more. Netty isn’t tied to any particular protocol. Adding to it is as simple as creating the appropriate event handlers. This extensibility has lead to many community driven protocol implementations including, <a href=\"https://engineering/2013/cocoaspdy-spdy-for-ios-os-x\">SPDY</a>, PostrgreSQL, WebSockets, IRC, and AWS.</p> \n<p>Netty’s connection management and protocol agnosticism provided an excellent base from which Finagle could be built. However we had a few other requirements Netty couldn’t satisfy out of the box as those requirements are more “high-level”. Clients need to connect to and load balance across a cluster of servers. All services need to export metrics (request rates, latencies, etc) that provide valuable insight for debugging service behavior. With a service-oriented architecture a single request may go through dozens of services making debugging performance issues nearly impossible without a tracing framework. Finagle was built to solve these problems. In the end Finagle relies on Netty for IO multiplexing providing a <strong>transaction-oriented</strong> framework on top of Netty’s <strong>connection-oriented</strong> model.</p> \n<h5>How Finagle Works</h5> \n<p>Finagle emphasizes modularity by stacking independent components together. Each component can be swapped in or out depending on the provided configuration. For example, tracers all implement the same interface. Thus, a tracer can be created to send tracing data to a local file, hold it in memory and expose a read endpoint, or write out to the network.</p> \n<p>At the bottom of a Finagle stack is a Transport. A Transport represents a stream of objects that can be asynchronously read from and written to. Transports are implemented as Netty ChannelHandlers and inserted into the end of a ChannelPipeline. Finagle’s ChannelHandlerTransport manages Netty interest ops to propagate back pressure. When Finagle indicates that the service is ready to read, Netty reads data off the wire and runs it through the ChannelPipeline where they’re interpreted by a codec then sent to the Finagle Transport. 
From there, Finagle sends the message through its own stack.</p> \n<p>For client connections, Finagle maintains a pool of transports across which it balances load. Depending on the semantics of the provided connection pool Finagle either requests a new connection from Netty or re-uses an existing one if it’s idle. When a new connection is requested, a Netty ChannelPipeline is created based on the client’s codec. Extra ChannelHandlers are added to the ChannelPipeline for stats, logging, and SSL. The connection is then handed to a channel transport which Finagle can write to and read from. If all connections are busy requests will be queued according to configurable queueing policies.</p> \n<p>On the server side Netty manages the codec, stats, timeouts, and logging via a provided ChannelPipelineFactory. The last ChannelHandler in a server’s ChannelPipeline is a Finagle bridge. The bridge will watch for new incoming connections and create a new Transport for each one. The Transport wraps the new channel before it’s handed to a server implementation. Messages are then read out of the ChannelPipeline and sent to the implemented server instance.</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/netty_at_twitterwithfinagle95.thumb.1280.1280.png\" width=\"665\" height=\"270\" alt=\"Netty at Twitter with Finagle\"></p> \n<pre>1) Finagle Client which is layered on top of the Finagle Transport. This Transport abstracts Netty away from the user<br>2) The actual ChannelPipeline of Netty that contains all the ChannelHandler implementations that do the actual work<br>3) Finagle Server which is created for each connection and provided a transport to read from and write to<br>4) ChannelHandlers implementing protocol encoding/decoding logic, connection level stats, SSL handling.</pre> \n<h5>Bridging Netty and Finagle</h5> \n<p>Finagle clients use ChannelConnector to bridge Finagle and Netty. ChannelConnector is a function that takes a SocketAddress and returns a Future Transport. When a new connection is requested of Netty, Finagle uses a ChannelConnector to request a new Channel and create a new Transport with that Channel. The connection is established asynchronously, fulfilling the Future with the new Transport on success or a failure if the connection cannot be established. A Finagle client can then dispatch requests over the Transport.</p> \n<p>Finagle servers bind to an interface and port via a Listener. When a new connection is established, the Listener creates a Transport and passes it to a provided function. From there, the Transport is handed to a Dispatcher which dispatches requests from the Transport to the Service according to a given policy.</p> \n<h5>Finagle’s Abstraction</h5> \n<p>Finagle’s core concept is a simple function (functional programming is the key here) from Request to Future of Response.</p> \n<pre class=\"brush:scala;first-line:1;\">type Service[Req, Rep] = Req =&gt; Future[Rep]\n</pre> \n<p>A future is a container used to hold the result of an asynchronous operation such as a network RPC, timeout, or disk I/O operation. A future is either empty—the result is not yet available; succeeded—the producer has completed and has populated the future with the result of the operation; or failed—the producer failed, and the future contains the resulting exception.</p> \n<p>This simplicity allows for very powerful composition. Service is a symmetric API representing both the client and the server. Servers implement the service interface. 
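</p> \n<p>To see how futures compose, here is a short, self-contained sketch using com.twitter.util.Future; fetchUser, fetchTimeline, and the User and Tweet types are hypothetical:</p> \n<pre class=\"brush:scala;first-line:1;\">// Sequence two asynchronous calls, then recover from failure with a fallback.\nval user: Future[User] = fetchUser(id)\nval tweets: Future[Seq[Tweet]] =\n user flatMap { u =&gt; fetchTimeline(u) } // runs only after user succeeds\nval safe: Future[Seq[Tweet]] =\n tweets handle { case _: Exception =&gt; Seq.empty } // fallback value on failure\n</pre> \n<p>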
The server can be used concretely for testing or Finagle can expose it on a network interface. Clients are provided an implemented service that is either virtual or a concrete representation of a remote server.</p> \n<p>For example, we can create a simple HTTP server by implementing a service that takes an HttpReq and returns a Future[HttpRep] representing an eventual response:</p> \n<pre class=\"brush:scala;first-line:1;\">val s: Service[HttpReq, HttpRep] = new Service[HttpReq, HttpRep] { \n def apply(req: HttpReq): Future[HttpRep] =\n Future.value(HttpRep(Status.OK, req.body))\n}\nHttp.serve(\":80\", s)\n</pre> \n<p>A client is then provided with a symmetric representation of that service:</p> \n<pre class=\"brush:scala;first-line:1;\">val client: Service[HttpReq, HttpRep] = Http.newService(\"twitter.com:80\")\nval f: Future[HttpRep] = client(HttpReq(\"/\"))\nf map { rep =&gt; transformResponse(rep) }\n</pre> \n<p>This example exposes the server on port 80 of all interfaces and consumes from twitter.com port 80. However, we can also choose not to expose the server and instead use it directly:</p> \n<pre class=\"brush:scala;first-line:1;\">server(HttpReq(\"/\")) map { rep =&gt; transformResponse(rep) }\n</pre> \n<p>Here the client code behaves the same way but doesn’t require a network connection. This makes testing clients and servers very simple and straightforward.</p> \n<p>Clients and servers provide application-specific functionality. However, there is a need for application-agnostic functionality as well. Timeouts, authentication, and statistics are a few examples. Filters provide an abstraction for implementing application-agnostic functionality.</p> \n<p>A filter receives a request and the service with which it is composed.</p> \n<pre class=\"brush:scala;first-line:1;\">type Filter[Req, Rep] = (Req, Service[Req, Rep]) =&gt; Future[Rep]\n</pre> \n<p>Filters can be chained together before being applied to a service.</p> \n<pre class=\"brush:scala;first-line:1;\">recordHandletime andThen\ntraceRequest andThen\ncollectJvmStats andThen\nmyService\n</pre> \n<p>This allows for clean abstractions of logic and good separation of concerns. Internally, Finagle heavily uses filters. Filters enhance modularity and reusability. They’ve proved valuable for testing as they can be unit tested in isolation with minimal mocking.</p> \n<p>Filters can also modify both the data and type of requests and responses. The figure below shows a request making its way through a filter chain into a service and back out.</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/netty_at_twitterwithfinagle96.thumb.1280.1280.png\" width=\"523\" height=\"123\" alt=\"Netty at Twitter with Finagle\"></p> \n<p>We might use type modification for implementing authentication.</p> \n<pre class=\"brush:scala;first-line:1;\">val auth: Filter[HttpReq, AuthHttpReq, HttpRes, HttpRes] =\n{ (req, svc) =&gt; authReq(req) flatMap { authReq =&gt; svc(authReq) } }\n\nval authedService: Service[AuthHttpReq, HttpRes] = ...\nval service: Service[HttpReq, HttpRes] =\nauth andThen authedService\n</pre> \n<p>Here we have a service that requires an AuthHttpReq. To satisfy the requirement, a filter is created that can receive an HttpReq and authenticate it. The filter is then composed with the service, yielding a new service that can take an HttpReq and produce an HttpRes. This allows us to test the authenticating filter in isolation from the service.</p>
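 \n<p>A minimal sketch of such an isolated test, reusing the simplified types above with a hypothetical in-memory stub:</p> \n<pre class=\"brush:scala;first-line:1;\">// Compose the auth filter with a stub service; no network is involved.\nval stub = Service.mk[AuthHttpReq, HttpRes] { req =&gt;\n Future.value(HttpRes(Status.OK)) // hypothetical canned response\n}\nval authed: Service[HttpReq, HttpRes] = auth andThen stub\nauthed(HttpReq(\"/\")) // succeeds only if authentication succeeds\n</pre>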
 \n<h5>Failure Management</h5> \n<p>At scale, failure becomes common rather than exceptional; hardware fails, networks become congested, network links fail. Libraries capable of extremely high throughput and extremely low latency are meaningless if the systems they run on or communicate with fail. To that end, Finagle is set up to manage failures in a principled way. It trades some throughput and latency for better failure management.</p> \n<p>Finagle can balance load across a cluster of hosts implicitly using latency as a heuristic. A Finagle client locally tracks load on every host it knows about. It does so by counting the number of outstanding requests being dispatched to a single host. Given that, Finagle will dispatch new requests to hosts with the lowest load and, implicitly, the lowest latency.</p> \n<p>Failed requests will cause Finagle to close the connection to the failing host and remove it from the load balancer. In the background, Finagle will continuously try to reconnect. The host will be re-added to the load balancer only after Finagle can re-establish a connection. Service owners are then free to shut down individual hosts without negatively impacting downstream clients. Clients also keep per-connection health heuristics and remove the connection if it’s deemed unhealthy.</p> \n<h5>Composing Services</h5> \n<p>Finagle’s service-as-a-function philosophy allows for simple but expressive code. For example, a user’s request for their home timeline touches several services. The core of these are the authentication service, timeline service, and Tweet service. These relationships can be expressed succinctly.</p> \n<pre class=\"brush:scala;first-line:1;\">val timelineSvc = Thrift.newIface[TimelineService](...) // #1 \nval tweetSvc = Thrift.newIface[TweetService](...)\nval authSvc = Thrift.newIface[AuthService](...)\n \nval authFilter = Filter.mk[Req, AuthReq, Res, Res] { (req, svc) =&gt; // #2 \n authSvc.authenticate(req) flatMap svc(_)\n}\n \nval apiService = Service.mk[AuthReq, Res] { req =&gt; // #3 \n timelineSvc(req.userId) flatMap { tl =&gt;\n val tweets = tl map tweetSvc.getById(_)\n Future.collect(tweets) map tweetsToJson(_)\n }\n}\nHttp.serve(\":80\", authFilter andThen apiService) // #4\n \n// #1 Create a client for each service\n// #2 Create new Filter to authenticate incoming requests\n// #3 Create a service to convert an authenticated timeline request to a json response \n// #4 Start a new HTTP server on port 80 using the authenticating filter and our service\n</pre> \n<p>Here we create clients for the timeline service, Tweet service, and authentication service. A filter is created for authenticating raw requests. Finally, our service is implemented, combined with the auth filter and exposed on port 80.</p> \n<p>When a request is received, the auth filter will attempt to authenticate it. A failure will be returned immediately without ever affecting the core service. Upon successful authentication, the AuthReq will be sent to the API service. The service will use the attached userId to look up the user’s timeline via the timeline service. A list of tweet ids is returned and then iterated over. Each id is then used to request the associated tweet. Finally, the list of Tweet requests is collected and converted into a JSON response.</p> \n<p>As you can see, the flow of data is defined and we leave the concurrency to Finagle. We don’t have to manage thread pools or worry about race conditions. 
The code is clear and safe.</p> \n<h5>Conclusion</h5> \n<p>We’ve been working closely with the Netty committers to improve on parts of Netty that both Finagle and the <a href=\"https://engineering/2013/netty-4-at-twitter-reduced-gc-overhead\">wider community can benefit from</a>. Recently the internal structure of Finagle has been updated to be more modular, paving the way for an upgrade to Netty 4.</p> \n<p>Finagle has yielded excellent results. We’ve managed to dramatically increase the amount of traffic we can serve while reducing latencies and hardware requirements. For example, after moving our API endpoints from the Ruby stack onto Finagle, we saw p99 latencies drop from hundreds of milliseconds to tens. Our new stack has enabled us to reach new records in throughput and as of this writing our <a href=\"https://engineering/2013/new-tweets-per-second-record-and-how\">record tweets per second is 143,199</a>.</p> \n<p>Finagle was born out of a need to set Twitter up to scale out to the entire globe at a time when we were struggling with site stability for our users. Using Netty as a base, we could quickly design and build Finagle to manage our scaling challenges. Finagle and Netty handle every request Twitter sees.</p> \n<h5>Thanks</h5> \n<p>This post will also appear as a case study in the&nbsp;<a href=\"http://www.manning.com/maurer/\">Netty in Action</a> book by Norman Maurer.</p> \n<p><a href=\"http://plosworkshop.org/2013/preprint/eriksen.pdf\">Your Server as a Function</a> by Marius Eriksen provides more insight into Finagle’s philosophy.</p> \n<p>Many thanks to <a href=\"https://twitter.com/intent/user?screen_name=trustin\" target=\"_blank\">Trustin Lee</a>&nbsp;and <a href=\"https://twitter.com/intent/user?screen_name=normanmaurer\">Norman Maurer</a> for their work on Netty. Thanks to <a href=\"https://twitter.com/intent/user?screen_name=marius\" target=\"_blank\">Marius Eriksen</a>, <a href=\"https://twitter.com/intent/user?screen_name=evanm\" target=\"_blank\">Evan Meagher</a>, <a href=\"https://twitter.com/intent/user?screen_name=mnnakamura\" target=\"_blank\">Moses Nakamura</a>, <a href=\"https://twitter.com/intent/user?screen_name=SteveGury\" target=\"_blank\">Steve Gury</a>, <a href=\"https://twitter.com/intent/user?screen_name=rubeydoo\" target=\"_blank\">Ruben Oanta</a>, <a href=\"https://twitter.com/intent/user?screen_name=bmdhacks\" target=\"_blank\">Brian Degenhardt</a>&nbsp;for their insights.</p>",
"date": "2014-02-13T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2014/netty-at-twitter-with-finagle",
"domain": "engineering"
},
{
"title": "How To: Objective C Initializer Patterns",
"body": "<p>Initializer patterns are an important part of good Objective-C, but these best practices are often overlooked. It’s the sort of thing that doesn’t cause problems most of the time, but the problems that arise are often difficult to anticipate. By being more rigorous and conforming to some best practices, we can save ourselves a lot of trouble. In this article, we’ll cover initialization topics in-depth, with examples to demonstrate how things can go wrong.</p> \n<p><strong>Part 1</strong>: Designated and Secondary Initializers<br><strong>Part 2</strong>:&nbsp;Case Studies<br><strong>Part 3</strong>:&nbsp;Designated and Secondary Initializer Cheat Sheet<br><strong>Part 4</strong>:&nbsp;- initWithCoder:, + new, and - awakeFromNib<br><strong><br></strong></p> \n<p class=\"align-center\"><strong>Part 1: Designated and Secondary Initializers</strong></p> \n<p><strong>Designated initializers</strong> define how we structure our initializers when subclassing; they are the “canonical initializer” for your class. A designated initializer does not define what initializer you should use when creating an object, like this common example:</p> \n<pre class=\"brush:css;first-line:1;\">[[UIView alloc] initWithFrame:CGRectZero];\n</pre> \n<p><span>It’s not necessary to call the designated initializer in the above case, although it won’t do any harm. If you are conforming to best practices, It is valid to call any designated initializer in the superclass chain, and the designated initializer for every class in the hierarchy is guaranteed to be called. For example:</span></p> \n<p></p> \n<pre class=\"brush:css;first-line:1;\">[[UIView alloc] init];\n</pre> \n<p></p> \n<p></p> \n<p>is guaranteed to call&nbsp;[NSObject init]&nbsp;and [UIView initWithFrame:], in that order. The order is guaranteed to be reliable regardless of which designated initializer in the superclass chain you call, and will <strong>always go from furthest ancestor to furthest descendant</strong>.</p> \n<p>When subclassing, you have three valid choices: you may choose to reuse your superclass’s designated initializer, to create your own designated initializer, or to not create any initializers (relying on your superclass’s).</p> \n<p>If you override your superclass’s designated initializer, your work is done. You can feel safe knowing that this initializer will be called.</p> \n<p>If you choose to create a new designated initializer for your subclass, you must do two things. First, create a new initializer, and document it as the new designated initializer in your header file. Second, you must override your superclass’s designated initializer and call the new one. Here’s an example for a UIView subclass:</p> \n<pre class=\"brush:css;first-line:1;\">// Designated initializer\n- (instancetype)initWithFoo:(TwitterFoo *)foo \n{\n if (self = [super initWithFrame:CGRectZero]) {\n _foo = foo;\n // initializer logic\n }\n return self;\n}\n\n- (instancetype)initWithFrame:(CGRect)rect\n{\n return [self initWithFoo:nil];\n}\n</pre> \n<p>Apple doesn’t mention much about it in the documentation, but all Apple framework classes provide valuable guarantees due to their consistency with these patterns. In the above example, if we did not override our superclass’s designated initializer to call the new one, we would break the guarantee which makes calling any designated initializer in the hierarchy reliable. 
For example, if we removed our initWithFrame: override,</p> \n<pre class=\"brush:css;first-line:1;\">[[TwitterFooView alloc] init];\n</pre> \n<p>could not be relied upon to call our designated initializer, initWithFoo:. The initialization would end with initWithFrame:.</p> \n<p>Finally, not all initializers are designated initializers. Additional initializers are referred to as convenience or secondary initializers. There is one rule here you’ll want to follow: Always call the designated initializer (or another secondary initializer) on <strong>self</strong> instead of super.</p> \n<p><strong>Example 1:</strong></p> \n<pre class=\"brush:css;first-line:1;\">@interface TwitterFooView : UIView\n\n@end\n\n@implementation TwitterFooView\n\n// Designated Initializer\n- (instancetype)initWithFoo:(TwitterFoo *)foo\n{\n\tif (self = [super initWithFrame:CGRectZero]) {\n\t\t_foo = foo;\n\t\t// do the majority of initializing things\n\t}\n\treturn self;\n}\n\n// Super override\n- (instancetype)initWithFrame:(CGRect)rect\n{\n\treturn [self initWithFoo:nil];\n}\n\n// Instance secondary initializer\n- (instancetype)initWithBar:(TwitterBar *)bar \n{\n\tif (self = [self initWithFoo:nil]) {\n\t\t_bar = bar;\n\t\t// bar-specific initializing things\n\t}\n\treturn self;\n}\n\n// Class secondary initializer\n+ (instancetype)fooViewWithBaz:(TwitterBaz *)baz\n{\n\tTwitterFooView *fooView = [[TwitterFooView alloc] initWithFoo:nil];\n\tif (fooView) {\n\t\t// baz-specific initialization\n\t}\n\treturn fooView;\n}\n\n@end</pre> \n<p><br>Again, the key takeaway from this example is that in both - initWithBar: and + fooViewWithBaz:, we call - initWithFoo:, the designated initializer, on self. There’s one more rule to follow to preserve deterministic designated initializer behavior: When writing initializers, don’t call designated initializers beyond your direct superclass. This can break the order of designated initializer execution. For an example of how this can go wrong, see Part 2, Case 2.</p> \n<p class=\"align-center\"><strong>Part 2: Case Studies</strong></p> \n<p>Now that we’ve covered the rules and guarantees relating to designated and secondary initializers, let’s prove these assertions using some concrete examples.</p> \n<p><strong>Case 1: Designated Initializer Ordering</strong></p> \n<p>Based on the code from Example 1, let’s prove the following assertion: Calling any designated initializer in the superclass chain is valid, and designated initializers are guaranteed to be executed in order from furthest ancestor ([NSObject init]) to furthest descendant ([TwitterFooView initWithFoo:]). 
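</p> \n<p>A quick way to observe this ordering (a hypothetical sketch: NSLog added to Example 1’s designated initializer) is to log from the initializer body, which only runs after the superclass chain has finished:</p> \n<pre class=\"brush:css;first-line:1;\">// Sketch: add logging to Example 1 to watch initialization order.\n- (instancetype)initWithFoo:(TwitterFoo *)foo\n{\n if (self = [super initWithFrame:CGRectZero]) {\n // [NSObject init]-level and [UIView initWithFrame:] work is done by now.\n NSLog(@\"TwitterFooView initWithFoo: runs last\");\n _foo = foo;\n }\n return self;\n}\n</pre> \n<p>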
In the following three diagrams, we’ll show the order of initializer execution when calling each designated initializer in the hierarchy: initWithFoo:, initWithFrame:, and init.</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/how_to_objectivecinitializerpatterns95.thumb.1280.1280.png\" width=\"801\" height=\"409\" alt=\"How To: Objective C Initializer Patterns\" class=\"align-center\"></p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/how_to_objectivecinitializerpatterns96.thumb.1280.1280.png\" width=\"815\" height=\"531\" alt=\"How To: Objective C Initializer Patterns\" class=\"align-center\"></p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/how_to_objectivecinitializerpatterns97.thumb.1280.1280.png\" width=\"797\" height=\"567\" alt=\"How To: Objective C Initializer Patterns\" class=\"align-center\"></p> \n<p><strong>Case 2: Example bug in UIViewController subclass</strong></p> \n<p>In the following example, we will analyze what can happen when we violate the following rule: When writing initializers, don’t call designated initializers beyond your immediate superclass. In context, this means we shouldn’t override or call [NSObject init] from a UIViewController subclass.</p> \n<p>Let’s say we start with a class TwitterGenericViewController, and incorrectly override [NSObject init]:</p> \n<pre class=\"brush:css;first-line:1;\">@interface TwitterGenericViewController : UIViewController\n@end\n\n@implementation TwitterGenericViewController\n\n// Incorrect\n- (instancetype)init\n{\n if (self = [super init]) {\n _textView = [[UITextView alloc] init];\n _textView.delegate = self;\n }\n return self;\n}\n\n@end\n</pre> \n<p>If we instantiate this object using [[TwitterGenericViewController alloc] init], this will work fine. However, if we use [[TwitterGenericViewController alloc] initWithNibName:nil bundle:nil], which should be perfectly valid, this initializer method will never be called. Let’s look at the order of execution for this failure case:</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/how_to_objectivecinitializerpatterns98.thumb.1280.1280.png\" width=\"921\" height=\"345\" alt=\"How To: Objective C Initializer Patterns\" class=\"align-center\"></p> \n<p>Things begin to break even further when subclasses are introduced below this incorrect - init implementation. 
Consider the following subclass of TwitterGenericViewController, which correctly overrides initWithNibName:bundle:</p> \n<pre class=\"brush:css;first-line:1;\">@interface TwitterViewController : TwitterGenericViewController\n@end\n\n@implementation TwitterViewController\n\n- (instancetype)initWithNibName:(NSString *)nibNameOrNil bundle:(NSBundle *)nibBundleOrNil\n{\n if (self = [super initWithNibName:nibNameOrNil bundle:nibBundleOrNil]) {\n _textView = [[UITextView alloc] init];\n _textView.delegate = self;\n }\n return self;\n}\n\n@end\n</pre> \n<p>Now, we have failure no matter which initializer we choose.</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/how_to_objectivecinitializerpatterns99.thumb.1280.1280.png\" width=\"819\" height=\"498\" alt=\"How To: Objective C Initializer Patterns\" class=\"align-center\"></p> \n<p>In this case, there is a failure because TwitterGenericViewController’s initializer is never called.</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/how_to_objectivecinitializerpatterns100.thumb.1280.1280.png\" width=\"817\" height=\"600\" alt=\"How To: Objective C Initializer Patterns\" class=\"align-center\"></p> \n<p>In this case, all initializers are called, but in the wrong order. TwitterViewController will populate _textView and set its delegate, and then TwitterGenericViewController (the superclass) will initialize, overriding the _textView configuration. That’s backwards! We always want subclasses to initialize after superclasses, so we can override state properly.</p> \n<p class=\"align-center\"><strong>Part 3: Designated and Secondary Initializer Cheat Sheet</strong></p> \n<p><strong>When creating an object</strong></p> \n<p>Calling any designated initializer in the superclass chain is valid when creating an object. You can rely on all designated initializers being called in order from furthest ancestor to furthest descendant.</p> \n<p><strong>When subclassing</strong><br><em>Option 1: Override immediate superclass’s designated initializer</em></p> \n<p>Be sure to call immediate super’s designated initializer first<br>Only override your immediate superclass’s designated initializer</p> \n<p><em>Option 2: Create a new designated initializer</em></p> \n<p>Be sure to call your immediate superclass’s designated initializer first<br>Define a new designated initializer<br>Document new designated initializer as such in your header<br>Separately, override immediate superclass’s designated initializer and call the new designated initializer on self</p> \n<p><em>Option 3: Don’t create any initializers</em></p> \n<p>It is valid to omit any initializer definition and rely on your superclass’s<br>In this case, you ‘inherit’ your superclass’s designated initializer as it applies to the last rule in Option 1</p> \n<p><strong>Additionally, you may define class or instance secondary initializers</strong></p> \n<p class=\"align-left\">Secondary initializers must always call self, and either call the designated initializer or another secondary initializer.<br>Secondary initializers may be class or instance methods (see Example 1)</p> \n<p class=\"align-center\"><strong>Part 4: - initWithCoder:, + new, and - awakeFromNib</strong></p> \n<p><strong>+ new</strong></p> \n<p>The documentation describes [NSObject new] as “a combination of alloc and init”. 
There’s nothing wrong with using the method for initialization, since we’ve established that calling any designated initializer in the hierarchy is valid, and all designated initializers will be called in order. However, when contrasted with [[NSObject alloc] init], + new is used less often, and is therefore less familiar. Developers using Xcode’s global search for strings like “MyObject alloc” may unknowingly overlook uses of [MyObject new].</p> \n<p><strong>- initWithCoder:</strong></p> \n<p>Reading&nbsp;<a href=\"https://developer.apple.com/library/mac/documentation/cocoa/Conceptual/Archiving/Articles/codingobjects.html#//apple_ref/doc/uid/20000948-BCIHBJDE\">Apple’s documentation</a> on object initialization when using NSCoding is a helpful first step.&nbsp;</p> \n<p>There are two key things to remember when implementing initWithCoder:. First, if your superclass conforms to NSCoding, you should call [super initWithCoder:coder] instead of [super (designated initializer)]. Second, if your superclass does not conform to NSCoding, treat initWithCoder: as a secondary initializer and call [self (designated initializer)]; here’s why.</p> \n<p>There’s a problem with the example provided in the documentation for initWithCoder:, specifically the call to [super (designated initializer)]. If you’re a direct subclass of NSObject, calling [super (designated initializer)] won’t call your [self (designated initializer)]. If you’re not a direct subclass of NSObject, and one of your ancestors implements a new designated initializer, calling [super (designated initializer)] WILL call your [self (designated initializer)]. This means that Apple’s suggestion to call super in initWithCoder encourages non-deterministic initialization behavior, and is not consistent with the solid foundations laid by the designated initializer pattern. Therefore, my recommendation is that you should treat initWithCoder: as a secondary initializer, and call [self (designated initializer)], not [super (designated initializer)], if your superclass does not conform to NSCoding.</p>
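 \n<p>A minimal sketch of that recommendation for a direct NSObject subclass (the class, its designated initializer, and the decode key are hypothetical):</p> \n<pre class=\"brush:css;first-line:1;\">// The superclass (NSObject) does not conform to NSCoding, so treat\n// initWithCoder: as a secondary initializer and route it through the\n// designated initializer on self.\n- (instancetype)initWithCoder:(NSCoder *)coder\n{\n if (self = [self initWithFoo:nil]) {\n _bar = [coder decodeObjectForKey:@\"bar\"]; // hypothetical key\n }\n return self;\n}\n</pre>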
 \n<p><strong>-</strong> <strong>awakeFromNib</strong></p> \n<p>The documentation for - awakeFromNib is straightforward:</p> \n<ul>\n <li><a href=\"https://developer.apple.com/library/mac/documentation/cocoa/reference/applicationkit/Protocols/NSNibAwaking_Protocol/Reference/Reference.html\">NSNibAwaking Protocol Reference</a></li> \n</ul>\n<p>The key point here is that Interface Builder outlets will not be set while the designated initializer chain is called. awakeFromNib happens afterwards, and IBOutlets will be set at that point.</p> \n<p><strong>Documentation</strong></p> \n<p>NSCell has four designated initializers for different configurations. This is an interesting exception to the standard single designated initializer pattern, so it’s worth checking out:</p> \n<ul>\n <li><a href=\"https://developer.apple.com/library/mac/documentation/Cocoa/Reference/ApplicationKit/Classes/NSCell_Class/Reference/NSCell.html#//apple_ref/doc/uid/TP40004017\">NSCell Class Reference</a></li> \n</ul>\n<p>Documentation relating to initialization:</p> \n<ul>\n <li><a href=\"https://developer.apple.com/library/mac/documentation/general/Conceptual/CocoaEncyclopedia/Initialization/Initialization.html\">Object Initialization</a></li> \n <li><a href=\"https://developer.apple.com/library/ios/Documentation/General/Conceptual/DevPedia-CocoaCore/MultipleInitializers.html\">Multiple Initializers</a></li> \n <li><a href=\"https://developer.apple.com/library/ios/documentation/cocoa/conceptual/ProgrammingWithObjectiveC/EncapsulatingData/EncapsulatingData.html\">Encapsulating Data</a></li> \n <li><a href=\"https://developer.apple.com/library/ios/documentation/cocoa/Conceptual/Archiving/Articles/codingobjects.html#//apple_ref/doc/uid/20000948-BCIHBJDE\">Encoding and Decoding Objects</a></li> \n</ul>",
"date": "2014-02-27T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2014/how-to-objective-c-initializer-patterns",
"domain": "engineering"
},
{
"title": "__attribute__ directives in Objective-C",
"body": "<p><span>In this post, we’ll examine what __attribute__ directives are and how they can be used in development. The goal is to establish a value to using __attribute__ directives in any codebase and to provide a starting point with some directives that anyone can start using right away.</span></p> \n<ol>\n <li>What are __attribute__ directives?</li> \n <li>When should I use an __attribute__ directive?</li> \n <li>Recognizing the dangers of misusing an __attribute__ directive</li> \n <li>Core __attribute__ directives</li> \n <li>ARC __attribute__ directives</li> \n <li>More __attribute__ directives</li> \n <li>in, out and inout</li> \n <li>__attribute__ directives as a tool</li> \n <li><span>__attribute__ resources</span></li> \n</ol>\n<p><strong>What are __attribute__ directives?</strong></p> \n<p><span>The __attribute__ directive is used to decorate a code declaration in C, C++ and Objective-C programming languages. This gives the declared code additional attributes that would help the compiler incorporate optimizations or elicit useful warnings to the consumer of that code.</span></p> \n<p>Better said, __attribute__ directives provide context. The value of providing context to code cannot be overstated. Developers have provided context by way of explicit declarations and comments since before the advent of the integrated circuit, but the value of providing context that can be evaluated by a compiler gives us a whole new level of control. By explicitly providing the confines of how an API behaves to the compiler, a programmer can gain some tangible benefits. The directives can be used to enforce compliance with how other programmers consume that API. In other cases, __attribute__ directives can help the compiler to optimize - sometimes to large performance gains.</p> \n<p>As Mattt Thompson cogently put it in <a href=\"http://nshipster.com/__attribute__/\">a blog post</a>: “Context is king when it comes to compiler optimizations. By providing constraints on how to interpret your code, [you’ll increase] the chance that the generated code is as efficient as possible. Meet your compiler half-way, and you’ll always be rewarded… [It] isn’t just for the compiler either: The next person to see the code will appreciate the extra context, too. So go the extra mile for the benefit of your collaborator, successor, or just 2-years-from-now you.” Which leads nicely into wise words from Sir Paul McCartney, “and in the end, the love you take is equal to the love you make.”</p> \n<p><strong>When should I use an __attribute__ directive?</strong><br>Whenever you have an opportunity to provide additional context to a code declaration (variable, argument, function, method, class, etc), you should. Providing context to code benefits both the compiler and the reader of the API, whether that’s another programmer or yourself at a future point in time.</p> \n<p>Now let’s be practical for a moment too. There are dozens of __attribute__ directives and knowing every single one of them for every single compiler on every single architecture is just not a reasonable return on investment. Rather, let’s focus on a core set of commonly useful __attribute__ directives any developer can take advantage of.</p> \n<p><strong>Recognizing the dangers of misusing an __attribute__ directive</strong><br>Just as poorly written comments and documentation can have consequences, providing the wrong __attribute__ can have consequences. 
In fact, since an __attribute__ affects code compilation, affixing the wrong __attribute__ to code can actually result in a bug that could be incredibly difficult to debug.</p> \n<p>Let’s take a look at an example of where an __attribute__ directive can be misused. Let’s suppose I have an enum that I use pretty often and frequently want a string version for it, whether it’s for populating a JSON structure or just for logging. I create a simple function to help me convert that enum into an NSString.</p> \n<pre class=\"brush:cpp;first-line:1;\">// Header declarations\n\ntypedef NS_ENUM(char, XPL802_11Protocol) {\n XPL802_11ProtocolA = 'a',\n XPL802_11ProtocolB = 'b',\n XPL802_11ProtocolG = 'g',\n XPL802_11ProtocolN = 'n'\n};\n\nFOUNDATION_EXPORT NSString *XPL802_11ProtocolToString(XPL802_11Protocol protocol);\n\n// Implementation\n\nNSString *XPL802_11ProtocolToString(XPL802_11Protocol protocol)\n{\n switch(protocol) {\n case XPL802_11ProtocolA:\n return @\"802.11a\";\n case XPL802_11ProtocolB:\n return @\"802.11b\";\n case XPL802_11ProtocolG:\n return @\"802.11g\";\n case XPL802_11ProtocolN:\n return @\"802.11n\";\n default:\n break;\n }\n return nil;\n}\n</pre> \n<p>So I have my great little converting function and I end up using it a lot in my code. I notice that my return values are constant NSString references and are always the same based on the protocol that is provided as a parameter to my function. Aha! A prime candidate for a const __attribute__ directive. So I just update my header’s function declaration like so:</p> \n<pre class=\"brush:cpp;first-line:1;\">FOUNDATION_EXPORT NSString *XPL802_11ProtocolToString(XPL802_11Protocol protocol) __attribute__((const));\n</pre> \n<p><span>And voilà! I have just provided context to any consumer of this function such that they know that the return value is completely based on the provided parameter and won’t change over the course of the process’ life. This change would also provide a performance boost, depending on how often the function is called, since the compiler now knows that it doesn’t actually have to re-execute this function if it already has cached the return value.</span></p> \n<p>Now, let’s say one day I notice the enum value is the character of the protocol and I decide to be clever and change my implementation to something like this:</p> \n<pre class=\"brush:cpp;first-line:1;\">NSString *XPL802_11ProtocolToString(XPL802_11Protocol protocol)\n{\n switch(protocol) {\n case XPL802_11ProtocolA:\n case XPL802_11ProtocolB:\n case XPL802_11ProtocolG:\n case XPL802_11ProtocolN:\n return [NSString stringWithFormat:@\"802.11%c\", protocol];\n default:\n break;\n }\n return nil;\n}\n</pre> \n<p>Now since I failed to remove the const attribute, I have just introduced a massive bug that could easily crash my app. Why? Well, the key difference is that the return value of my function is no longer a constant reference. When we were returning hard coded strings before, the compiler stored those strings as persistent memory and those NSStrings effectively had a retain count of infinite. Now that we dynamically generate the string based on the protocol’s char value, we are creating a new string every time - and that means memory that changes. The reference returned in one call to the function won’t actually be the same reference as a subsequent identical call. 
The problem will rear its head when the compiler optimizes subsequent calls to that function to just immediately access what the compiler considers the known return value, which would be the reference returned by the original call. Sadly, that reference has likely been deallocated and potentially reallocated by some other memory allocation by now. This will lead to our application crashing on either a bad memory access or an invalid method call on the object that occupies that reference’s memory. The worst of this is that the optimization that would cause this crash will only happen in builds that are highly optimized. Since debug builds often have optimizations turned down, you can run your app in a debugger forever and never reproduce it, making this bug, like most __attribute__ based bugs, very hard to figure out and fix.</p> \n<p>This bug effectively boils down to treating a function that returns transient memory as const. The same goes for functions or methods that take transient memory as a parameter. The rule is easy to remember: to use the const __attribute__ directive, any function returning a pointer must return a constant reference, and absolutely no const function can have a pointer (including an Objective-C object) as a parameter.</p> \n<p>Now this example is merely a precaution for using __attribute__ directives and shouldn’t deter you from using them in your code. If you stick to __attribute__ directives you understand and pay attention to how they are used, you’ll be able to steer clear of these bugs and harness the power __attribute__ directives were meant to provide. Just remember, when in doubt, don’t attribute, because providing the wrong context is worse than providing no context.</p> \n<p>To point you in the right direction, below is a compiled list of useful attributes that should be more than enough to improve any developer’s tool belt.</p> \n<p><strong>Core __attribute__ directives</strong><br><strong><em>__attribute__((availability(…))), NS_AVAILABLE and NS_DEPRECATED</em></strong></p> \n<p><span>Indicate the availability of an API on the platform</span></p> \n<p class=\"indent-30\">NS_AVAILABLE: Apple macro for attributing an API as available in a given OS release. NS_AVAILABLE_IOS(available_os_version)</p> \n<p class=\"indent-30\">NS_DEPRECATED: Apple macro for attributing an API as deprecated in a given OS release.</p> \n<p class=\"indent-30\"><span>NS_DEPRECATED_IOS(available_os_version, deprecated_os_version)</span></p> \n<ul>\n <li>Use NS_AVAILABLE and NS_DEPRECATED macros to hide the complexity of this attribute.</li> \n <li>Commonly used when deprecating one API and adding a new API as a replacement.</li> \n <li>When creating a new API for backwards compatibility, immediately attribute the API as deprecated and include a comment on the API that should be used for current OS targets.</li> \n <li>These directives are tied to the operating system version and cannot be used for framework versioning. 
For those instances, use __attribute__((deprecated(…))).</li> \n</ul> \n<pre class=\"brush:cpp;first-line:1;\">FOUNDATION_EXPORT NSString * const MyClassNotification NS_AVAILABLE_IOS(3_0);\nFOUNDATION_EXPORT NSString * const MyClassNotificationOldKey NS_DEPRECATED_IOS(3_0, 7_0);\nFOUNDATION_EXPORT NSString * const MyClassNotificationNewKey NS_AVAILABLE_IOS(7_0);\n\nNS_AVAILABLE_IOS(3_0)\n@interface MyClass : NSObject\n- (void)oldMethod NS_DEPRECATED_IOS(3_0, 6_0);\n- (void)newMethod:(out NSError * __autoreleasing *)outError NS_AVAILABLE_IOS(6_0);\n@end\n</pre> \n<p><em><strong>__attribute__((deprecated(…))) and __attribute__((unavailable(…)))</strong></em><br>Indicates that an API is deprecated/unavailable.</p> \n<p>__attribute__((deprecated(optional_message)))<br>__attribute__((unavailable(optional_message)))</p> \n<p>In case you don’t want to use the availability attribute for deprecation.</p> \n<pre class=\"brush:cpp;first-line:1;\">- (void)deprecatedMethod __attribute__((deprecated));\n- (void)deprecatedMethodWithMessage __attribute__((deprecated(\"this method was deprecated in MyApp.app version 5.0.2, use newMethod instead\")));\n\n- (void)unavailableMethod __attribute__((unavailable));\n- (void)unavailableMethodWithMessage __attribute__((unavailable(\"this method was removed from MyApp.app version 5.3.0, use newMethod instead\")));\n</pre> \n<p><em><strong>__attribute__((format(…))) and NS_FORMAT_FUNCTION</strong></em></p> \n<p>Indicates that a function/method contains a format string with format arguments.</p> \n<p>__attribute__((format(format_type, format_string_index, first_format_argument_index)))<br>format_type: one of printf, scanf, strftime, strfmon or __NSString__</p> \n<p><strong>Reminder</strong>: argument indexes when specified in an attribute are 1-based. When it comes to Objective-C methods, remember they are just C functions whose first 2 arguments are the id self argument and the SEL _cmd argument.</p> \n<ul>\n <li>NS_FORMAT_FUNCTION: macro provided by Apple for __NSString__ type format string</li> \n <li>Use the NS_FORMAT_FUNCTION macro when attributing a method/function with an Objective-C string for formatting.</li> \n</ul> \n<pre class=\"brush:cpp;first-line:1;\">FOUNDATION_EXPORT void NSLog(NSString *format, ...) NS_FORMAT_FUNCTION(1,2);\nvoid MyLog(MyLogLevel lvl, const char *format, ...) __attribute__((format(printf, 2, 3)));\n// ...\n- (void)appendFormat:(NSString *)format, ... NS_FORMAT_FUNCTION(3, 4);\n</pre> \n<p><em><strong>__attribute__((sentinel(…))) and NS_REQUIRES_NIL_TERMINATION</strong></em></p> \n<p>Indicates that a function/method requires a nil (NULL) argument, usually used as a delimiter. You can only use this attribute with variadic functions/methods.</p> \n<p>__attribute__((sentinel(index)))</p> \n<p>index: the index offset from the last argument in the variadic list of arguments.</p> \n<p>__attribute__((sentinel)) is equivalent to __attribute__((sentinel(0)))</p> \n<p>You’ll almost always want to use the NS_REQUIRES_NIL_TERMINATION macro</p> \n<pre class=\"brush:cpp;first-line:1;\">// Example 1\n@interface NSArray\n+ (instancetype)arrayWithObjects:(id)firstObject, ... 
NS_REQUIRES_NIL_TERMINATION;\n@end\n \n// Example 2 - of course you'd never do this...\nNSArray *CreateArrayWithObjectsWithLastArgumentIndicatingIfArrayIsMutable(...) __attribute__((sentinel(1)));\n \nvoid foo(id object1, id object2)\n{\n NSArray *weirdArray = CreateArrayWithObjectsWithLastArgumentIndicatingIfArrayIsMutable(object1, object2, nil, YES);\n NSAssert([weirdArray respondsToSelector:@selector(addObject:)], @\"expected a mutable array\");\n // ...\n}\n</pre> \n<p><strong><em>__attribute__((const)) and __attribute__((pure))</em><br></strong>__attribute__((const)) is used to indicate that the function/method results are entirely dependent on the provided arguments and the function/method does not mutate state.<br>__attribute__((pure)) is almost the same as its const counterpart, except that the function/method can also take global/static variables into account.</p> \n<ul>\n <li>Though adding the const or pure attribute to an Objective-C method is not as useful to the compiler due to the dynamic runtime, it is still VERY useful to a programmer reading an interface.</li> \n <li>It is recommended that all singleton instance accessors use the const attribute.</li> \n <li>The optimization upside of accurately using this attribute can be an enormous win. If you have an Objective-C class method that is frequently used and is const or pure, consider converting it into a C function to reap some serious benefits.</li> \n <li><span>On the flipside, using this attribute incorrectly can lead to a nearly impossible to locate bug as actually seeing the redundant use of the function removed by the compiler requires looking at the assembly! Oh, and this type of bug will rarely show in a debug build since only highly optimized builds will have the bug.</span></li> \n</ul> \n<pre class=\"brush:cpp;first-line:1;\">// Example 1: Singleton\n\n@interface MySingleton : NSObject\n+ (MySingleton *)sharedInstance __attribute__((const));\n@end\n\n// Example 2: Function overhead optimization\n\n// Get the description of a specified error number\nconst char *StringForErrNo(int errorNumber) __attribute__((const)); // strerror(errorNumber)\n\n// Get the description of the global errno error number\nconst char *StringForGlobalErrNo(void) __attribute__((pure)); // strerror(errno)\n\nvoid DoStuffWithGlobalErrNo()\n{\n NSLog(@\"%@ %s\", [NSString stringWithUTF8String:StringForGlobalErrNo()], StringForGlobalErrNo());\n printf(\"%s\\n\", StringForGlobalErrNo());\n printf(\"%zu\\n\", strlen(StringForGlobalErrNo()));\n}\n\n// will compile as something more like this:\n\nvoid DoStuffWithGlobalErrNo()\n{\n const char *__error = StringForGlobalErrNo();\n NSLog(@\"%@ %s\", [NSString stringWithUTF8String:__error], __error);\n printf(\"%s\\n\", __error);\n printf(\"%zu\\n\", strlen(__error));\n}\n\n// which effectively eliminates both 1) the overhead of the function call and 2) the internal execution cost of the function\n\n// Example 3: Function execution cost optimization\n\nint nthFibonacci(int n) __attribute__((const)); // naive implementation to get the nth fibonacci number without any caching\n\nvoid TestFibonacci()\n{\n time_t start = time(NULL);\n int result1 = nthFibonacci(1000); // execution time of D\n time_t dur1 = time(NULL) - start; // some large duration D\n int result2 = 
nthFibonacci(1000); // execution time of ~0; the compiler reuses the first result\n time_t dur2 = time(NULL) - start; // same as dur1, duration D\n int result3 = nthFibonacci(999); // execution time of ~D\n time_t dur3 = time(NULL) - start; // duration of 2*D\n\n// The __attribute__((const)) directive can effectively eliminate a redundant call to an expensive operation...nice!\n}\n</pre> \n<p><strong><em>__attribute__((objc_requires_super)) and NS_REQUIRES_SUPER</em><br></strong>Indicates that the decorated method must call the super version of its implementation if overridden.</p> \n<ul>\n <li>Existed in LLVM for Xcode 4.5 but had bugs and wasn’t exposed with NS_REQUIRES_SUPER until Xcode 5.0</li> \n <li><span>When creating a class that is expressly purposed to be a base class that is subclassed, any method that is supposed to be overridden but needs to have the super implementation called needs to use this macro.</span></li> \n <li><span>This attribute can be a large codebase win by contextualizing what methods are necessary for a base class to work. Widely adopt this attribute and your codebase will benefit.</span></li> \n</ul> \n<pre class=\"brush:cpp;first-line:1;\">@interface MyBaseClass : NSObject\n- (void)handleStateTransition NS_REQUIRES_SUPER;\n@end\n\n// ...\n\n@interface MyConcreteClass : MyBaseClass\n@end\n\n@implementation MyConcreteClass\n\n- (void)handleStateTransition\n{\n [super handleStateTransition]; // required; the compiler warns if this call is omitted\n // subclass-specific logic\n}\n\n@end</pre> \n<p><strong>ARC __attribute__ directives<br><em>__attribute__((objc_precise_lifetime)) and NS_VALID_UNTIL_END_OF_SCOPE</em><br></strong></p> \n<p>Indicates that the given variable should be considered valid for the duration of its scope.</p> \n<ul>\n <li>Though this will rarely come up, it can be a big help combating a head-scratching crash that only appears in release builds (since the optimization likely doesn’t happen in your debug build)</li> \n</ul> \n<pre class=\"brush:cpp;first-line:1;\">- (void)foo\n{\n NS_VALID_UNTIL_END_OF_SCOPE MyObject *obj = [[MyObject alloc] init];\n NSValue *value = [NSValue valueWithPointer:(__bridge void *)obj];\n\n // do stuff\n\n MyObject *objAgain = (__bridge MyObject *)[value pointerValue];\n NSLog(@\"%@\", objAgain);\n}\n\n/* in ARC, without NS_VALID_UNTIL_END_OF_SCOPE, the compiler will optimize and after the obj pointer is used to create the NSValue the compiler will have no knowledge of the encapsulated use of the object in the NSValue. ARC will release obj and this NSLog line will crash with EXC_BAD_ACCESS because the reference retrieved from the NSValue and stored in objAgain will now be pointing to the deallocated reference. */\n</pre> \n<p><strong><em>__attribute__((ns_returns_retained)) and NS_RETURNS_RETAINED</em><br></strong>Indicates to ARC that the method returns a +1 retain count.<br>Per Apple: only use this attribute for exceptional circumstances. Use the Objective-C naming convention of prefixing your method with alloc, new, copy, or mutableCopy to achieve the same result without an attribute.</p> \n<ul>\n <li>ARC will follow a naming convention and this directive for how to manage the returned value’s retain count. 
If the implementation is non-ARC, it is up to the implementation to adhere to the rule such that when an ARC file consumes the API the contract is adhered to.</li> \n <li><span>Honestly, you should just use the Apple-recommended method prefixes and reserve this attribute for cases where you create an object with a +1 retain count from a function.</span></li> \n</ul> \n<pre class=\"brush:cpp;first-line:1;\">NSString *CreateNewStringWithFormat(NSString *format, ...) NS_FORMAT_FUNCTION(1, 2) NS_RETURNS_RETAINED;\n</pre> \n<p><strong><em>__attribute__((ns_returns_not_retained)) and NS_RETURNS_NOT_RETAINED</em></strong></p> \n<p>Indicates to ARC that the method returns a +0 retain count. This is the default behavior of all methods and functions in Objective-C.<br>Per Apple: only use this attribute for exceptional circumstances. Use the Objective-C naming convention of NOT prefixing your method with alloc, new, copy, or mutableCopy to achieve the same result without an attribute.</p> \n<ul>\n <li>ARC will follow a naming convention and this directive for how to manage the returned value’s retain count. If the implementation is non-ARC, it is up to the implementation to adhere to the rule such that when an ARC file consumes the API the contract is adhered to.</li> \n <li>The only use for this attribute would be if you prefixed a method with alloc, new, copy, or mutableCopy but you didn’t want a +1 retain count - which is just nonsense. You should never need to use this attribute as it is implied on any method that doesn’t have a +1 keyword prefix.</li> \n</ul> \n<pre class=\"brush:cpp;first-line:1;\">- (NSString *)newUnretainedStringWithFormat:(NSString *)format, ... NS_FORMAT_FUNCTION(3, 4) NS_RETURNS_NOT_RETAINED;\n</pre> \n<p><em><strong>__attribute__((objc_returns_inner_pointer)) and NS_RETURNS_INNER_POINTER<br></strong></em>Indicates that the method will return a pointer that is only valid for the lifetime of the owner. This will prevent ARC from preemptively releasing an object when the internal pointer is still in use.</p> \n<ul>\n <li>This is actually a very important attribute that few developers use well, but really should. If a method returns a non-Objective-C reference, then ARC doesn’t know that the returned value is a reference that belongs to the owning object and will go away if the owning object goes away. Without this attribute, after the final use of an object ARC will release it. This can result in a crash if the inner pointer is referenced after the last use of the object since it could have been deallocated. 
Ordering lines of code is not enough either, since the compiler could easily reorder execution as a way to optimize.</li> \n</ul> \n<pre class=\"brush:cpp;first-line:1;\">@interface NSMutableData : NSData\n- (void *)mutableBytes NS_RETURNS_INNER_POINTER;\n@end\n\n\nvoid Foo(void)\n{\n NSMutableData *buffer = [[NSMutableData alloc] initWithLength:8];\n char *cBuffer = buffer.mutableBytes;\n memcpy(cBuffer, \"1234567\", 8); // crash if NS_RETURNS_INNER_POINTER doesn't decorate the mutableBytes method\n printf(\"%s\\n\", cBuffer);\n (void)buffer; // this will not save us from a crash if the mutableBytes method isn't decorated with NS_RETURNS_INNER_POINTER\n}\n</pre> \n<p><strong><em>__attribute__((ns_consumes_self)) and NS_REPLACES_RECEIVER</em><br></strong>Indicates that the provided method can replace the receiver with a different object.<br>Presumes a +0 retain count (which can be overridden with NS_RETURNS_RETAINED, but if you do that you really need to be asking yourself “what the heck am I doing?”).</p> \n<ul>\n <li>By default, all methods prefixed with init are treated as if this attribute were decorating them.</li> \n <li>ARC makes this behavior really easy to implement. Non-ARC implementers of the same behavior still need the decoration but also have to pay closer attention to how they are managing memory in the implementation. (awakeAfterUsingCoder: is a regular source of memory leaks in non-ARC code bases)</li> \n</ul> \n<pre class=\"brush:cpp;first-line:1;\">@interface NSObject (NSCoderMethods)\n- (id)awakeAfterUsingCoder:(NSCoder *)coder NS_REPLACES_RECEIVER;\n@end\n</pre> \n<p><strong><em>__attribute__((objc_arc_weak_reference_unavailable)) and NS_AUTOMATED_REFCOUNT_WEAK_UNAVAILABLE</em><br></strong>Indicates that the decorated class does not support weak referencing.</p> \n<ul>\n <li>Mac OS X Examples: NSATSTypesetter, NSColorSpace, NSFont, NSMenuView, NSParagraphStyle, NSSimpleHorizontalTypesetter, NSTextView, NSFontManager, NSFontPanel, NSImage, NSTableCellView, NSViewController, NSWindow, and NSWindowController.</li> \n <li>iOS and Mac OS X Examples: NSHashTable, NSMapTable, or NSPointerArray</li> \n</ul> \n<pre class=\"brush:cpp;first-line:1;\">NS_AUTOMATED_REFCOUNT_WEAK_UNAVAILABLE\n@interface NSHashTable : NSObject &lt;/*Protocols*/&gt;\n//...\n@end\n</pre> \n<p><strong><em>NS_AUTOMATED_REFCOUNT_UNAVAILABLE</em><br></strong>Indicates that the decorated API is unavailable in ARC.</p> \n<ul>\n <li>Can also use OBJC_ARC_UNAVAILABLE</li> \n</ul> \n<pre class=\"brush:cpp;first-line:1;\">- (oneway void)release NS_AUTOMATED_REFCOUNT_UNAVAILABLE;\n</pre> \n<p><strong>More __attribute__ directives<br></strong>__attribute__((objc_root_class)) and NS_ROOT_CLASS<br>__attribute__((constructor(…))) and __attribute__((destructor(…)))<br>__attribute__((format_arg(…))) and NS_FORMAT_ARGUMENT<br>__attribute__((nonnull(…)))<br>__attribute__((returns_nonnull))<br>__attribute__((noreturn))<br>__attribute__((used))<br>__attribute__((unused))<br>__attribute__((warn_unused_result))<br>__attribute__((error(…))) and __attribute__((warning(…)))</p> 
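\n<p>Most of these follow the same decoration pattern seen throughout this post. As a brief sketch (the declarations below are hypothetical, but the attributes themselves are standard GCC/Clang), here is how a few of them read in a header:</p> \n<pre class=\"brush:cpp;first-line:1;\">// Sketch: hypothetical declarations decorated with some of the directives above.\nvoid HandleFatalError(void) __attribute__((noreturn)); // never returns to the caller\nvoid LogContext(const char *ctx) __attribute__((nonnull(1))); // ctx must not be NULL\nNSString *NormalizedTag(NSString *tag) __attribute__((warn_unused_result)); // warns if the result is ignored\n</pre> 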
\n<p><strong>in, out and inout</strong></p> \n<p>While we’re on the topic of providing context to code, we should take the briefest of moments to bring up the Objective-C keywords in, out and inout. These little keywords are used to annotate Objective-C method arguments, providing context on whether the parameter is for input, output or both. These keywords came about with distributed objects, along with oneway, byref, and bycopy; but in the spirit of providing context to programmers, they can bridge the gap between a consumer of an API presuming how an argument will behave and knowing how it will behave. Consider using inout or out the next time you return a value via an argument, and consumers of your API will appreciate it.</p> \n<p><strong>in</strong><br>Indicates that the given argument is used only for input. This is the default behavior for non-pointers and Objective-C objects.</p> \n<ul>\n <li>Use this keyword for methods that accept a pointer to a primitive that is read but never modified. Not a common case.</li> \n</ul> \n<pre class=\"brush:cpp;first-line:1;\">- (void)configureWithRect:(in CGRect *)rect\n{\n if (rect) {\n _configRect = *rect;\n }\n [self _innerConfigure];\n}\n</pre> \n<p><strong>out<br></strong>Indicates that the given argument is used just for output. This is never the default behavior.</p> \n<ul>\n <li>This keyword doesn’t make sense to apply to non-pointers, which are always in arguments.</li> \n <li><span>Use this keyword for methods that return a value via an argument but don’t read that argument.</span></li> \n</ul> \n<pre class=\"brush:cpp;first-line:1;\">- (void)configure:(out NSError **)error\n{\n NSError *theError = [self _configure];\n if (error) {\n *error = theError;\n }\n}\n</pre> \n<p></p> \n<p><strong>inout<br></strong>Indicates that the given argument is used for both input and output. This is the default behavior for pointers, except for Objective-C objects (which default to in).</p> \n<ul>\n <li>This keyword doesn’t make sense to apply to non-pointers, which are always in arguments.</li> \n <li>Use this to provide context when it may not be apparent how the pointer behaves.</li> \n <li>Always use this to provide context when a method has numerous pointer arguments and at least one is in or out. Basically, when there are multiple pointer arguments and they are not all inout, every pointer argument should specify its behavior.</li> \n</ul> \n<pre class=\"brush:cpp;first-line:1;\">- (void)configureRect:(inout CGRect *)rect\n{\n if (rect) {\n if (CGRectIsNull(*rect)) { // where rect acts as an \"in\" argument\n *rect = CGRectMake(_x, _y, _w, _h); // where rect acts as an \"out\" argument\n }\n }\n}\n</pre> \n<p></p> \n<p><strong>__attribute__ directives as a tool</strong><br>With such a valuable tool available to the C languages, any team can benefit by using these __attribute__ directives to give context in their code. At Twitter, with a very large code base and many engineers developing on it daily, every bit of context that can be provided helps in maintaining a quality code base for reuse. Adding __attribute__ directives to your toolbelt of code comments and good naming conventions will give you a robust toolset for providing indispensable context to your code base. Don’t shy away from adding __attribute__ directives to your next project. 
Use them, evangelize them and everyone will benefit.</p> \n<p><strong>__attribute__ resources</strong><br><a href=\"http://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Attribute-Syntax.html\">GCC __attribute__ documentation</a><br><a href=\"http://clang.llvm.org/docs/AttributeReference.html\">Clang __attribute__ reference</a><br><a href=\"http://clang.llvm.org/docs/LanguageExtensions.html#objective-c-retaining-behavior-attributes\">Clang Objective-C ARC Attributes</a><br><a href=\"http://nshipster.com/__attribute__/\">NSHipster’s __attribute__ blog post</a></p> \n<p></p>",
"date": "2014-03-10T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2014/attribute-directives-in-objective-c",
"domain": "engineering"
},
{
"title": "Greater privacy for your Twitter emails with TLS",
"body": "<p>Protecting users’ privacy is a never-ending process, and we are committed to keeping our users’ information safe. Since mid-January, we have been protecting your emails from Twitter using <a href=\"http://en.wikipedia.org/wiki/Transport_Layer_Security\" target=\"_blank\">TLS</a> in the form of <a href=\"http://en.wikipedia.org/wiki/STARTTLS\">StartTLS</a>. StartTLS encrypts emails as they transit between sender and receiver and is designed to prevent snooping. It also ensures that emails you receive from Twitter haven’t been read by other parties on the way to your inbox if your email provider supports TLS.</p> \n<p>We’re using StartTLS in addition to other email security protocols we’ve previously enabled like <a href=\"http://www.dkim.org/\" target=\"_blank\">DKIM</a> and <a href=\"http://www.dmarc.org/\" target=\"_blank\">DMARC</a>, which prevent spoofing and email forgeries by ensuring emails claiming to be from Twitter were indeed sent by us. These email security protocols are part of our commitment to continuous improvement in privacy protections and complement improvements like our securing of web traffic with <a href=\"https://engineering/2013/forward-secrecy-at-twitter-0\" target=\"_blank\">forward secrecy</a>&nbsp;and <a href=\"https://engineering/2011/making-twitter-more-secure-https\" target=\"_blank\">always-on HTTPS</a>.</p> \n<p>While we’ve enabled StartTLS for SMTP, that’s not enough to guarantee delivery over TLS. TLS encryption only works if both the sender and receiver of emails support it. We commend those email providers like Gmail &amp; AOL Mail that have turned on TLS and we ask all other providers that haven’t yet to prioritize it. Together, we can protect the privacy of every user.</p>",
"date": "2014-03-12T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2014/greater-privacy-for-your-twitter-emails-with-tls",
"domain": "engineering"
},
{
"title": "Manhattan, our real-time, multi-tenant distributed database for Twitter scale",
"body": "<p>As Twitter has grown into a global platform for public self-expression and conversation, our storage requirements have grown too. Over the last few years, we found ourselves in need of a storage system that could serve millions of queries per second, with extremely low latency in a real-time environment. Availability and speed of the system became the utmost important factor. Not only did it need to be fast; it needed to be scalable across several regions around the world.</p> \n<p>Over the years, we have used and made significant contributions to many open source databases. But we found that the real-time nature of Twitter demanded lower latency than the existing open source products were offering. We were spending far too much time firefighting production systems to meet the performance expectations of our various products, and standing up new storage capacity for a use case involved too much manual work and process. Our experience developing and operating production storage at Twitter’s scale made it clear that the situation was simply not sustainable. So we began to scope out and build Twitter’s next generation distributed database, which we call <strong>Manhattan</strong>. We needed it to take into account our existing needs, as well as put us in a position to leapfrog what exists today.</p> \n<p></p> \n<p><strong>Our holistic view into storage systems at Twitter</strong><br>Different databases today have many capabilities, but through our experience we identified a few requirements that would enable us to grow the way we wanted while covering the majority of use cases and addressing our real-world concerns, such as correctness, operability, visibility, performance and customer support. Our requirements were to build for:</p> \n<ul>\n <li><strong>Reliability</strong>: Twitter services need a durable datastore with predictable performance that they can trust through failures, slowdowns, expansions, hotspots, or anything else we throw at it.</li> \n <li><strong>Availability</strong>: Most of our use cases strongly favor availability over consistency, so an always-on eventually consistent database was a must.</li> \n <li><strong>Extensibility</strong>: The technology we built had to be able to grow as our requirements change, so we had to have a solid, modular foundation on which to build everything from new storage engines to strong consistency. Additionally, a schemaless key-value data model fit most customers’ needs and allowed room to add structure later.</li> \n <li><strong>Operability</strong>: As clusters grow from hundreds to thousands of nodes, the simplest operations can become a pain and a time sink for operators. In order to scale efficiently in manpower, we had to make it easy to operate from day one. With every new feature we think about operational complexity and the ease of diagnosing issues.</li> \n <li><strong>Low latency</strong>: As a real-time service, Twitter’s products require consistent low latency, so we had to make the proper tradeoffs to guarantee low latent performance.</li> \n <li><strong>Real-world scalability</strong>: Scaling challenges are ubiquitous in distributed systems. 
Twitter needs a database that can scale not just to a certain point, but can continue to grow to new heights in every metric — cluster size, requests per second, data size, geographically, and with number of tenants — without sacrificing cost effectiveness or ease of operations.</li> \n <li><strong>Developer productivity</strong>: Developers in the company should be able to store whatever they need to build their services, with a self service platform that doesn’t require intervention from a storage engineer, on a system that in their view “just works”.</li> \n</ul>\n<blockquote class=\"g-quote g-tweetable\"> \n <p>Developers should be able to store whatever they need on a system that just works.</p> \n</blockquote> \n<p><strong>Reliability at scale</strong><br>When we started building Manhattan, we already had many large storage clusters at Twitter, so we understood the challenges that come from running a system at scale, which informed what kinds of properties we wanted to encourage and avoid in a new system.</p> \n<p>A reliable storage system is one that can be trusted to perform well under all states of operation, and that kind of predictable performance is difficult to achieve. In a predictable system, worst-case performance is crucial; average performance not so much. In a well-implemented, correctly provisioned system, average performance is very rarely a cause of concern. But throughout the company we look at metrics like p999 and p9999 latencies, so we care how slow the 0.01% slowest requests to the system are. We have to design and provision for worst-case throughput. For example, it is irrelevant that steady-state performance is acceptable if a periodic bulk job degrades performance for an hour every day.</p> \n<p>Because of this priority to be predictable, we had to plan for good performance during any potential issue or failure mode. The customer is not interested in our implementation details or excuses; either our service works for them and for Twitter or it does not. Even if we have to make an unfavorable trade-off to protect against a very unlikely issue, we must remember that rare events are no longer rare at scale.</p> \n<p>With scale come not only large numbers of machines, requests and large amounts of data, but also factors of human scale in the increasing number of people who both use and support the system. We manage this by focusing on a number of concerns:</p> \n<ul>\n <li>if a customer causes a problem, the problem should be limited to that customer and not spread to others</li> \n <li>it should be simple, both for us and for the customer, to tell if an issue originates in the storage system or their client</li> \n <li>for potential issues, we must minimize the time to recovery once the problem has been detected and diagnosed</li> \n <li>we must be aware of how various failure modes will manifest for the customer</li> \n <li>an operator should not need deep, comprehensive knowledge of the storage system to complete regular tasks or diagnose and mitigate most issues</li> \n</ul>\n<p>And finally, we built Manhattan with the experience that when operating at scale, complexity is one of your biggest enemies. Ultimately, simple and working trumps fancy and broken. 
We prefer something simple that works reliably and consistently and provides good visibility, over something fancy and ultra-optimal in theory that in practice doesn’t work well, provides poor visibility or operability, or violates other core requirements.</p> \n<p><strong>Building a storage system</strong><br>When building our next generation storage system, we decided to break down the system into layers so it would be modular enough to provide a solid base that we can build on top of, and allow us to incrementally roll out features without major changes.</p> \n<p>We designed with the following goals in mind:</p> \n<ul>\n <li>Keep the <strong>core</strong> lean and simple</li> \n <li>Bring value <strong>sooner</strong> rather than later (focus on the incremental)</li> \n <li>Multi-Tenancy, Quality of Service (QoS) and Self-Service are <strong>first-class citizens</strong></li> \n <li>Focus on <strong>predictability</strong></li> \n <li>Storage as a <strong>service</strong>, not just technology</li> \n</ul>\n<p><strong>Layers</strong><br>We have separated Manhattan into four layers: interfaces, storage services, storage engines and the core.</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/manhattan_our_real-timemulti-tenantdistributeddatabasefortwitter.thumb.1280.1280.png\" width=\"600\" height=\"373\" alt=\"Manhattan, our real-time, multi-tenant distributed database for Twitter scale\" class=\"align-center\"></p> \n<p><strong>Core</strong><br>The core is the most critical aspect of the storage system: it is highly stable and robust. It handles failure, eventual consistency, routing, topology management, intra- and inter-datacenter replication, and conflict resolution. Within the core of the system, crucial pieces of architecture are completely pluggable so we can iterate quickly on designs and improvements, as well as unit test effectively.</p> \n<p>Operators are able to alter the topology at any time for adding or removing capacity, and our visibility and strong coordination for topology management are critical. We store our topology information in Zookeeper because of its strong coordination capabilities and because it is a managed component in our infrastructure at Twitter, though Zookeeper is not in the critical path for reads or writes. We also put a lot of effort into making sure we have extreme visibility into the core at all times with an extensive set of <a href=\"https://github.com/twitter/ostrich\">Ostrich</a> metrics across all hosts for correctness and performance.</p> \n<p><strong>Consistency model</strong><br>Many of Twitter’s applications fit very well into the eventually consistent model. We favor high availability over consistency in almost all use cases, so it was natural to build Manhattan as an eventually consistent system at its core. However, there will always be applications that require strong consistency for their data, so building such a system was a high priority for adopting more customers. Strong consistency is an opt-in model and developers must be aware of the trade-offs. In a strongly consistent system, one will typically have a form of mastership for a range of partitions. We have many use cases at Twitter where having a hiccup of a few seconds of unavailability is simply not acceptable (due to electing new masters in the event of failures). 
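To make the opt-in model concrete, here is a hypothetical client-side sketch (not Manhattan’s actual API) of the kind of check-and-set loop that the Strong Consistency service described below enables:</p> \n<pre class=\"brush:scala;first-line:1;\">// Hypothetical sketch -- the names and signatures here are illustrative,\n// not Manhattan's actual client API.\ntrait StrongStore {\n def strongRead(key: String): Long // strongly consistent (quorum) read\n def checkAndSet(key: String, expected: Long, next: Long): Boolean // atomic CAS\n}\n\n// A CAS loop retries until no concurrent writer has raced our update.\ndef increment(store: StrongStore, key: String): Long = {\n var current = store.strongRead(key)\n while (!store.checkAndSet(key, expected = current, next = current + 1)) {\n current = store.strongRead(key)\n }\n current + 1\n}\n</pre> \n<p>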
We provide good defaults for developers and help them understand the trade-offs between both models.</p> \n<p><strong>Achieving consistency</strong><br>To achieve consistency in an eventually consistent system, you need a mechanism we call <strong>replica reconciliation</strong>: an incremental, always-running process that reconciles data across replicas. It helps in the face of bitrot, software bugs, missed writes (nodes going down for long periods of time) and network partitions between datacenters. In addition to replica reconciliation, there are two other mechanisms we use as an optimization to achieve faster convergence: read-repair, which allows frequently accessed data to converge faster due to the rate at which it is read, and hinted-handoff, which is a secondary delivery mechanism for failed writes due to a node flapping or being offline for a period of time.</p> \n<p><strong>Storage engines</strong><br>One of the lowest levels of a storage system is how data is stored on disk and the data structures kept in memory. To reduce the complexity and risk of managing multiple codebases for multiple storage engines, we made the decision to have our initial storage engines be designed in-house, with the flexibility of plugging in external storage engines in the future if needed.</p> \n<p>This gives us the benefit of focusing on features we find the most necessary and the control to review which changes go in and which do not. We currently have three storage engines:</p> \n<ol>\n <li>seadb, our read-only file format for batch-processed data from Hadoop</li> \n <li>sstable, our log-structured merge-tree based format for heavy write workloads</li> \n <li>btree, our btree-based format for heavy read, light write workloads</li> \n</ol>\n<p>All of our storage engines support block-based compression.</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/manhattan_our_real-timemulti-tenantdistributeddatabasefortwitter.thumb.1280.1280.png\" width=\"600\" height=\"410\" alt=\"Manhattan, our real-time, multi-tenant distributed database for Twitter scale\" class=\"align-center\"></p> \n<p><strong>Storage services</strong><br>We have created additional services that sit on top of the core of Manhattan that allow us to enable more robust features that developers might come to expect from traditional databases. Some examples are:</p> \n<ol>\n <li><strong>Batch Hadoop importing:&nbsp;</strong><span>One of the original use cases of Manhattan was as an efficient serving layer on top of data generated in Hadoop. We built an importing pipeline that allows customers to generate their datasets in a simple format in HDFS and specify that location in a self-service interface. Our watchers automatically pick up new datasets and convert them in HDFS into seadb files, so they can then be imported into the cluster for fast serving from SSDs or memory. We focused on making this importing pipeline streamlined and easy so developers can iterate quickly on their evolving datasets. One lesson we learned from our customers was that they tend to produce large, multi-terabyte datasets where each subsequent version typically changes less than 10-20% of their data. 
We baked in an optimization to reduce network bandwidth by producing binary diffs that can be applied when we download this data to replicas, substantially reducing the overall import time across datacenters.</span></li> \n <li><strong>Strong Consistency service:</strong>&nbsp;<span>The Strong Consistency service allows customers to have strong consistency when doing certain sets of operations. We use a consensus algorithm paired with a replicated log to guarantee that events reach all replicas in order. This enables us to do operations like Check-And-Set (CAS), strong read, and strong write. We support two modes today called LOCAL_CAS and GLOBAL_CAS. Global CAS enables developers to do strongly consistent operations across a quorum of our datacenters, whereas a Local CAS operation is coordinated only within the datacenter in which it was issued. Both operations have different tradeoffs when it comes to latency and data modeling for the application.</span></li> \n <li><strong>Timeseries Counters service</strong>:&nbsp;<span>We developed a very specific service to handle high-volume timeseries counters in Manhattan. The customer who drove this requirement was our Observability infrastructure team, which needed a system that could handle millions of increments per second. At this level of scale, our engineers went through the exercise of coming up with an agreed-upon set of design tradeoffs over things like durability concerns, the delay before increments needed to be visible to our alerting system, and what kind of subsecond traffic patterns we could tolerate from the customer. The result was a thin, efficient counting layer on top of a specially optimized Manhattan cluster that greatly reduced our requirements and increased reliability over the previous system.</span></li> \n</ol>\n<p><span><strong>Interfaces</strong><br>The interface layer is how a customer interacts with our storage system. Currently we expose a key/value interface to our customers, and we are working on additional interfaces such as a graph-based interface to interact with edges.</span></p> \n<p><strong>Tooling</strong><br>With the easy operability of our clusters as a requirement, we had to put a lot of thought into how to best design our tools for day-to-day operations. We wanted complex operations to be handled by the system as much as possible, and allow commands with high-level semantics to abstract away the details of implementation from the operator. We started with tools that allow us to change the entire topology of the system simply by editing a file with host groups and weights, and do common operations like restarting all nodes with a single command. When even that early tooling started to become too cumbersome, we built an automated agent that accepts simple commands as goals for the state of the cluster, and is able to stack, combine, and execute directives safely and efficiently with no further attention from an operator.</p> \n<p><strong>Storage as a service</strong><br>A common theme that we saw with existing databases was that they were designed to be set up and administered for a specific set of use-cases. 
With Twitter’s rapid growth in new internal services, we realized that this wouldn’t be efficient for our business.</p> \n<p>Our solution is <strong>storage as a service.</strong> We’ve provided a major productivity improvement for our engineers and operational teams by building a fully self-service storage system that puts engineers in control.</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/manhattan_our_real-timemulti-tenantdistributeddatabasefortwitter.thumb.1280.1280.png\" width=\"760\" height=\"421\" alt=\"Manhattan, our real-time, multi-tenant distributed database for Twitter scale\" class=\"align-center\"></p> \n<p>Engineers can provision what their application needs (storage size, queries per second, etc.) and start using storage in seconds without having to wait for hardware to be installed or for schemas to be set up. Customers within the company run in a multi-tenant environment that our operational teams manage for them. Managing self-service and multi-tenant clusters imposes certain challenges, so we treat this service layer as a first-class feature: we provide customers with visibility into their data and workloads, we have built-in quota enforcement and rate-limiting so engineers get alerted when they go over their defined thresholds, and all our information is fed directly to our Capacity and Fleet Management teams for analysis and reporting.</p> \n<p>By making it easier for engineers to launch new features, we saw a rise in experimentation and a proliferation of new use-cases. To better handle these, we developed internal APIs to expose this data for cost analysis, which allows us to determine what use cases are costing the business the most, as well as which ones aren’t being used as often.</p> \n<p><strong>Focus on the customer</strong><br>Even though our customers are our fellow Twitter employees, we are still providing a service, and they are still our customers. We must provide support, be on call, isolate the actions of one application from another, and consider the customer experience in everything we do. Most developers are familiar with the need for adequate documentation of their services, but every change or addition to our storage system requires careful consideration. A feature that should be seamlessly integrated into self-service has different requirements from one that needs intervention by operators. When a customer has a problem, we must make sure to design the service so that we can quickly and correctly identify the root cause, including issues and emergent behaviors that can arise from the many different clients and applications through which engineers access the database. We’ve had a lot of success building Manhattan from the ground up as a service and not just a piece of technology.</p> \n<p><strong>Multi-Tenancy and QoS (Quality of Service)</strong><br>Supporting multi-tenancy — allowing many different applications to share the same resources — was a key requirement from the beginning. In previous systems we managed at Twitter, we were building out clusters for every feature. This was increasing operator burden, wasting resources, and keeping customers from rolling out new features quickly.</p> \n<p>As mentioned above, allowing multiple customers to use the same cluster increases the challenge of running our systems. 
We now must think about isolation, management of resources, capacity modeling with multiple customers, rate limiting, QoS, quotas, and more.</p> \n<p>In addition to giving customers the visibility they need to be good citizens, we designed our own rate limiting service to enforce customers’ usage of resources and quotas. We monitor and, if needed, throttle resource usage across many metrics to ensure no one application can affect others on the system. Rate limiting happens not at a coarse grain but at a subsecond level and with tolerance for the kinds of spikes that happen with real-world usage. We had to consider not just automatic enforcement, but what controls should be available manually to operators to help us recover from issues, and how we can mitigate negative effects to all customers, including the ones going over their capacity.</p> \n<p>We built the APIs needed to extract the data for every customer and send it to our Capacity teams, who work to ensure we have resources always ready and available for customers who have small to medium requirements (by Twitter standards), so that those engineers can get started without additional help from us. Integrating all of this directly into self-service allows customers to launch new features on our large multi-tenant clusters faster, and allows us to absorb traffic spikes much more easily since most customers don’t use all of their resources at all times.</p> \n<p><strong>Looking ahead</strong><br>We still have a lot of work ahead of us. The challenges are increasing and the number of features being launched internally on Manhattan is growing at a rapid pace. Pushing ourselves harder to be better and smarter is what drives us on the Core Storage team. We take pride in our values: what can we do to make Twitter better, and how do we make our customers more successful? We plan to release a white paper outlining even more technical detail on Manhattan and what we’ve learned after running over two years in production, so stay tuned!</p> \n<p><strong>Acknowledgments</strong><br>We want to give a special thank you to <a href=\"https://twitter.com/armondbigian\">Armond Bigian</a> for believing in the team along the way and championing us to make the best storage system possible for Twitter. 
The following people made Manhattan possible:&nbsp;<a href=\"https://twitter.com/scode\">Peter Schuller</a>, <a href=\"https://twitter.com/lenn0x\">Chris Goffinet</a>, <a href=\"https://twitter.com/bx\">Boaz Avital</a>, <a href=\"https://twitter.com/fencerspang\">Spencer Fang</a>, <a href=\"https://twitter.com/yxu\">Ying Xu</a>, <a href=\"https://twitter.com/kunzie\">Kunal Naik</a>, <a href=\"https://twitter.com/yaleiwang\">Yalei Wang</a>, <a href=\"https://twitter.com/itsmedannychen\">Danny Chen</a>, <a href=\"https://twitter.com/padauk9\">Melvin Wang</a>, <a href=\"https://twitter.com/zncoder\">Bin Zhang</a>, <a href=\"https://twitter.com/pdbearman\">Peter Beaman</a>, <a href=\"https://twitter.com/sreecha\">Sree Kuchibhotla</a>,&nbsp;<a href=\"https://twitter.com/oskhan\">Osama Khan</a>, <a href=\"https://twitter.com/yeyangever\">Victor Yang Ye</a>, <a href=\"https://twitter.com/ekuber\">Esteban Kuber</a>, <a href=\"https://twitter.com/bingol\">Tugrul Bingol</a>, <a href=\"https://twitter.com/ylin30\">Yi Lin</a>, <a href=\"https://twitter.com/deng_liu\">Deng Liu</a>, <a href=\"https://twitter.com/tybulut\">Tyler Serdar Bulut</a>, <a href=\"https://twitter.com/argv0\">Andy Gross</a>, <a href=\"https://twitter.com/anthonyjasta\">Anthony Asta</a>, <a href=\"https://twitter.com/e_hoogendoorn\">Evert Hoogendoorn</a>,&nbsp;<a href=\"https://twitter.com/el_eff_el\">Lin Lee</a>, <a href=\"https://twitter.com/alexpeake\">Alex Peake</a>, <a href=\"https://twitter.com/thinkingfish\">Yao Yue</a>, <a href=\"https://twitter.com/hyungoos\">Hyun Kang</a>, <a href=\"https://twitter.com/xiangxin72\">Xin Xiang</a>, <a href=\"https://twitter.com/98tango\">Sumeet Lahorani</a>, <a href=\"https://twitter.com/rachitar\">Rachit Arora</a>, <a href=\"https://twitter.com/sagar\">Sagar Vemuri</a>, <a href=\"https://twitter.com/petchw\">Petch Wannissorn</a>, <a href=\"https://twitter.com/mahakp\">Mahak Patidar</a>, <a href=\"https://twitter.com/ajitverma\">Ajit Verma</a>, <a href=\"https://twitter.com/swang54\">Sean Wang</a>, <a href=\"https://twitter.com/drekhi\">Dipinder Rekhi</a>, <a href=\"https://twitter.com/kothasatish\">Satish Kotha</a>,&nbsp;<a href=\"https://twitter.com/jharjono\">Johan Harjono</a>,&nbsp;<a href=\"https://twitter.com/RH6341\">Alex Young</a>, <a href=\"https://twitter.com/kevin_dliu\">Kevin Donghua Liu</a>,&nbsp;<a href=\"https://twitter.com/marsanfra\">Pascal Borghino</a>, <a href=\"https://twitter.com/istvanmarko\">Istvan Marko</a>, <a href=\"https://twitter.com/andresplazar\">Andres Plaza</a>, <a href=\"https://twitter.com/_ravisharma_\">Ravi Sharma</a>, <a href=\"https://twitter.com/thevlad\">Vladimir Vassiliouk</a>, <a href=\"https://twitter.com/lnzju\">Ning Li</a>, <a href=\"https://twitter.com/liangg326\">Liang Guo</a>, <a href=\"https://twitter.com/inaamrana\">Inaam Rana</a>.</p> \n<blockquote class=\"g-quote g-tweetable\"> \n <p>If you’d like to work on Manhattan and enjoy tackling hard problems in distributed storage, apply to the Core Storage Team at jobs.twitter.com!</p> \n</blockquote> \n<p>&nbsp;</p> \n<p></p>",
"date": "2014-04-02T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2014/manhattan-our-real-time-multi-tenant-distributed-database-for-twitter-scale",
"domain": "engineering"
},
{
"title": "Scalding 0.9: Get it while it’s hot!",
"body": "<p>It’s been just over <a href=\"https://engineering/2012/scalding\">two years since we open sourced Scalding</a> and today we are very excited to release the 0.9 version. Scalding at Twitter powers everything from internal and external facing dashboards, to custom relevance and ad targeting algorithms, including many graph algorithms such as PageRank, approximate user cosine similarity and many more.</p> \n<p>There have been a wide breadth of new features added to Scalding since the last release:</p> \n<p><strong>Joins</strong><br>An area of particular activity and impact has been around joins. The Fields API already had an API to do left and right joins over multiple streams, but with 0.9 we bring this functionality to the Typed-API. In 0.9, joins followed by reductions followed by more joins are automatically planned as single map reduce jobs, potentially reducing the number of steps in your pipelines.</p> \n<pre class=\"brush:scala;first-line:1;\"> case class UserName(id: Long, handle: String)\n case class UserFavs(byUser: Long, favs: List[Long])\n case class UserTweets(byUser: Long, tweets: List[Long])\n \n def users: TypedSource[UserName]\n def favs: TypedSource[UserFavs]\n def tweets: TypedSource[UserTweets]\n \n def output: TypedSink[(UserName, UserFavs, UserTweets)]\n \n // Do a three-way join in one map-reduce step, with type safety\n users.groupBy(_.id)\n .join(favs.groupBy(_.byUser))\n .join(tweets.groupBy(_.byUser))\n .map { case (uid, ((user, favs), tweets)) =&gt;\n (user, favs, tweets)\n } \n .write(output)\n</pre> \n<p>This includes custom co-grouping, not just left and right joins. To handle skewed data there is a new count-min-sketch based algorithm to solve the curse of the last reducer, and a critical bug-fix for skewed joins in the Fields API.</p> \n<p><strong>Input/output</strong><br>In addition to joins, we’ve added support for new input/output formats:</p> \n<ul>\n <li>Parquet Format is a columnar storage format which we <a href=\"https://engineering/2013/announcing-parquet-10-columnar-storage-for-hadoop\">open sourced</a> in collaboration with Cloudera. Parquet can dramatically accelerate map-reduce jobs that read only a subset of the columns in an dataset, and can similarly reduce storage cost with more efficiently serialization.</li> \n <li><a href=\"http://avro.apache.org/\">Avro</a> is an Apache project to standardize serialization with self-describing IDLs. Ebay contributed the <a href=\"https://github.com/twitter/scalding/tree/develop/scalding-avro\">scalding-avro</a> module to make it easy to work with Apache Avro serialized data.</li> \n <li>TemplateTap support eases partitioned writes of data, where the output path depends on the value of the data.</li> \n</ul>\n<p><strong>Hadoop counters</strong><br>We’re also adding support for incrementing Hadoop counters inside map and reduce functions. For cases where you need to share a medium sized data file across all your tasks, support for Hadoop’s distributed cache was added in this release cycle.</p> \n<p><strong>Typed API</strong><br>The typed API saw many improvements. When doing data-cubing, partial aggregation should happen before key expansion and sumByLocalKeys enables this. The type-system enforces constraints on sorting and joining that previously would have caused run-time exceptions. When reducing a data-set to a single value, a ValuePipe is returned. 
Just as a TypedPipe is analogous to a program that produces a distributed list, a ValuePipe is like a program that produces a single value, with which we might want to filter or transform some TypedPipe.</p> \n<p><strong>Matrix API</strong><br>When it comes to linear algebra, Scalding 0.9 introduced a new <a href=\"https://github.com/twitter/scalding/blob/develop/scalding-core/src/main/scala/com/twitter/scalding/mathematics/Matrix2.scala#L31\">Matrix API</a> which will replace the former one in our next major release. Due to the associative nature of matrix multiplication, we can choose to compute (AB)C or A(BC). One of those orders might create a much smaller intermediate product than the other. The new API includes a dynamic programming optimization of the order of multiplication chains of matrices to minimize realized size, along with several other optimizations. We have seen some considerable speedups of matrix operations with this API. In addition to the new optimizing API, we added some functions to efficiently compute all-pair inner-products (A A^T) using <a href=\"https://engineering/2012/dimension-independent-similarity-computation-disco\">DISCO</a> and <a href=\"http://arxiv.org/pdf/1304.1467.pdf\">DIMSUM</a>. These algorithms excel for cases of vectors highly skewed in their support, which is to say most vectors have few non-zero elements, but some are almost completely dense.</p> \n<p><strong>Upgrading and Acknowledgements</strong><br>Some APIs were deprecated, some were removed entirely, and some added more constraints. We have some <a href=\"https://github.com/twitter/scalding/wiki/Upgrading-to-0.9.0\">sed rules</a> to aid in porting. All changes fixed significant warts. For instance, in the Fields API sum takes a type parameter, and works for any Semigroup or Monoid. Several changes improve the design to aid in using Scalding more as a library and less as a framework.</p> \n<p>This latest release is our biggest to date, spanning over 800 commits from <a href=\"https://github.com/twitter/scalding/graphs/contributors\">57 contributors</a>. It is available today in <a href=\"http://search.maven.org/#search%7Cga%7C1%7Cscalding\">maven central</a>. We hope Scalding is as useful to you as it is for us and the growing <a href=\"https://github.com/twitter/scalding/wiki/Powered-By\">community</a>. Follow us <a href=\"https://twitter.com/intent/user?screen_name=scalding\">@scalding</a>, join us on IRC (<a href=\"https://twitter.com/hashtag/scalding\">#scalding</a>) or via the <a href=\"https://groups.google.com/forum/#!forum/cascading-user\">mailing list</a>.</p>",
"date": "2014-04-03T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2014/scalding-09-get-it-while-it-s-hot",
"domain": "engineering"
},
{
"title": "Twitter #DataGrants selections",
"body": "<p>In February, we <a href=\"https://engineering/2014/introducing-twitter-data-grants\">introduced</a> the Twitter <a href=\"https://twitter.com/hashtag/DataGrants\">#DataGrants</a> pilot program, with the goal of giving a handful of research institutions access to Twitter’s public and historical data. We are thrilled with the response from the research community — we received more than 1,300 proposals from more than 60 different countries, with more than half of the proposals coming from outside the U.S.</p> \n<p>After reviewing all of the proposals, we’ve selected six institutions, spanning four continents, to receive free datasets in order to move forward with their research.</p> \n<ul>\n <li><a href=\"http://hms.harvard.edu/\">Harvard Medical School</a>&nbsp;/ <a href=\"http://www.childrenshospital.org/\">Boston Children’s Hospital</a> (US): Foodborne Gastrointestinal Illness Surveillance using Twitter Data</li> \n <li><a href=\"http://www.nict.go.jp/en/about/\">NICT</a> (Japan): Disaster Information Analysis System</li> \n <li><a href=\"http://www.utwente.nl/en/\">University of Twente</a> (Netherlands): The Diffusion And Effectiveness of Cancer Early Detection Campaigns on Twitter</li> \n <li><a href=\"http://ucsd.edu/\">UCSD</a> (US): Do happy people take happy images? Measuring happiness of cities</li> \n <li><a href=\"http://www.uow.edu.au/\">University of Wollongong</a> (Australia): Using GeoSocial Intelligence to Model Urban Flooding in Jakarta, Indonesia</li> \n <li><a href=\"http://www.uel.ac.uk/\">University of East London</a> (UK): Exploring the relationship between Tweets and Sports Team Performance</li> \n</ul>\n<p>Thank you to everyone who took part in this pilot. As we welcome Gnip to Twitter, we look forward to expanding the Twitter <a href=\"https://twitter.com/hashtag/DataGrants\">#DataGrants</a> program and helping even more institutions and academics access Twitter data in the future. Finally, we’d also like to thank <a href=\"https://twitter.com/mjgillis\">Mark Gillis</a>, <a href=\"https://twitter.com/cra\">Chris Aniszczyk</a>&nbsp;and <a href=\"https://twitter.com/eigenvariable\">Jeff Sarnat</a>&nbsp;for their passion in helping create this program.</p>",
"date": "2014-04-17T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2014/twitter-datagrants-selections",
"domain": "engineering"
},
{
"title": "Using Twitter to measure earthquake impact in almost real time",
"body": "<p>At Twitter, we know that Tweets can sometimes travel as fast as an earthquake. We were curious to know just how accurate such a correlation might be, so we <a href=\"http://stanford.edu/~rezab/papers/eqtweets.pdf\">collaborated with Stanford researchers</a> to model how Tweets can help create more accurate ShakeMaps, which provide near-real-time maps of ground motion and shaking intensity following significant earthquakes.</p> \n<p>These maps are used by federal, state and local organizations, both public and private, for post-earthquake response and recovery, public and scientific information, as well as for preparedness exercises and disaster planning.</p> \n<p></p>\n<div class=\"video video-youtube\">\n <iframe width=\"100%\" src=\"https://www.youtube.com/embed/0UFsJhYBxzY\" frameborder=\"0\" webkitallowfullscreen mozallowfullscreen allowfullscreen></iframe>",
"date": "2014-05-02T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2014/using-twitter-to-measure-earthquake-impact-in-almost-real-time",
"domain": "engineering"
},
{
"title": "TSAR, a TimeSeries AggregatoR",
"body": "<p>Twitter is a global real-time communications platform that processes many billions of events every day. Aggregating these events in real time presents a massive challenge of scale. Classic time-series applications include site traffic, service health, and user engagement monitoring; these are increasingly complemented by a range of<a href=\"https://analytics.twitter.com/about\">&nbsp;analytics products</a> and features such as Tweet activity, Followers, and Twitter Cards that surface aggregated time-series data directly to end users, publishers, and advertisers. Services that power such features need to be resilient enough to ensure a consistent user experience, flexible enough to accommodate a rapidly changing product roadmap, and able to scale to keep up with Twitter’s ever growing user base.</p> \n<p>Our experience demonstrates that truly robust real-time aggregation services are hard to build; that scaling and evolving them gracefully is even harder; and moreover, that many time-series applications call for essentially the same architecture, with only slight variations in the data model. Solving this broad class of problems at Twitter has been a multi-year effort. A previous <a href=\"https://engineering/2013/streaming-mapreduce-with-summingbird\">post</a> introduced <a href=\"https://github.com/twitter/summingbird\">Summingbird</a>, a high-level abstraction library for generalized distributed computation, which provides an elegant descriptive framework for complex aggregation problems.</p> \n<p>In this post, we’ll describe how we built a flexible, reusable, end-to-end service architecture on top of Summingbird, called TSAR (the TimeSeries AggregatoR). We’ll explore the motivations and design choices behind TSAR and illustrate how it solves a particular time-series problem: counting Tweet impressions.</p> \n<p><strong>The Tweet impressions problem in TSAR</strong></p> \n<p>Let’s suppose we want to annotate every Tweet with an impression count - that is, a count representing the total number of views of that Tweet, updated in real time. This innocent little feature conceals a massive problem of scale. Although “just” 500 million Tweets are created each day, these Tweets are then viewed tens of billions of times. Counting so many events in real time is already a nontrivial problem, but to harden our service into one that’s fit for production we need to answer questions like:</p> \n<ul>\n <li>What happens if the service is interrupted? Can we retrieve lost data?</li> \n <li>How do we coordinate our data schema and keep all of its representations consistent? In this example, we want to store Tweet impressions in several different ways: as log data (for use by downstream analytics pipelines); in a key/value data store (for low latency and high availability persistent data); in a cache (for quick access); and, in certain cases, in a relational database (for internal research and data-quality monitoring).</li> \n <li>How can we ensure the schema of our data is flexible to change and can gracefully propagate to each of its representations without disrupting the running service? For example, the product team might want to count promoted impressions (paid for by an advertiser) and earned impressions (impressions of a retweet of a promoted Tweet) separately. 
Or perhaps we want to segment impressions by country, or restrict to impressions just in the user’s home country… Such requirements tend to drift in unforeseeable ways, even after the service is first deployed.</li> \n <li>How do we update or repair historical data in a way that is relatively painless? In this case, we need to backfill a portion of the time-series history.</li> \n</ul>\n<p>Most important:</p> \n<ul>\n <li>How do we avoid having to solve all of these problems again the next time we are faced with a similar application?</li> \n</ul>\n<p>TSAR addresses these problems by following these essential design principles:</p> \n<ul>\n <li><strong>Hybrid computation</strong>. Process every event twice — in real time, and then again (at a later time) in a batch job. The double processing is orchestrated using Summingbird. This hybrid model confers all the advantages of batch (stability, reproducibility) and streaming (recency) computation.</li> \n <li><strong>Separation of event production from event aggregation</strong>. The first processing stage extracts events from source data; in this example, TSAR parses Tweet impression events out of log files deposited by web and mobile clients. The second processing stage buckets and aggregates events. While the “event production” stage differs from application to application, TSAR standardizes and manages the “aggregation” stage.</li> \n <li><strong>Unified data schema</strong>. The data schema for a TSAR service is specified in a datastore-independent way. TSAR maps the schema onto diverse datastores and transforms the data as necessary when the schema evolves.</li> \n <li><strong>Integrated service toolkit</strong>. TSAR integrates with other essential services that provide data processing, data warehousing, query capability, observability, and alerting, automatically configuring and orchestrating its components.</li> \n</ul>\n<p><strong>Let’s write some code!</strong></p> \n<p>Production requirements continually change at Twitter, based on user feedback, experimentation, and customer surveys. Our experience has shown us that keeping up with them is often a demanding process that involves changes at many levels of the stack. Let us walk through a lifecycle of the impression counts product to illustrate the power of the TSAR framework in seamlessly evolving with the product.</p> \n<p>Here is a minimal example of a TSAR service that counts Tweet impressions and persists the computed aggregates in <a href=\"https://engineering/2014/manhattan-our-real-time-multi-tenant-distributed-database-for-twitter-scale\">Manhattan</a> (Twitter’s in-house key-value storage system):</p> \n<pre class=\"brush:scala;first-line:1;\">aggregate {\n onKeys( \n (TweetId)\n ) produce (\n Count\n ) sinkTo (Manhattan)\n } fromProducer { \n ClientEventSource(\"client_events\")\n .filter { event =&gt; isImpressionEvent(event) }\n .map { event =&gt;\n val impr = ImpressionAttributes(event.tweetId)\n (event.timestamp, impr)\n }\n }\n\n</pre> \n<p>The TSAR job is broken into several code sections:</p> \n<ul>\n <li>The <em>onKeys</em>&nbsp;section declares one or more aggregation templates — the dimensions along which we’re aggregating. In this example, it’s just Tweet ID for now.</li> \n <li>The <em>produce</em>&nbsp;section tells TSAR which metrics to compute. 
Here we’re producing a count of the total number of impressions for each Tweet.</li> \n <li><em>sinkTo(Manhattan)</em> tells TSAR to send data to the Manhattan key/value datastore.</li> \n <li>Finally, the <em>fromProducer</em> block specifies preprocessing logic for turning raw events into impressions, in the language of Summingbird. TSAR then takes over and performs the heavy lifting of bucketing and aggregating these events (although under the covers, this step is implemented in Summingbird too).</li> \n</ul>\n<p><strong>Seamless schema evolution</strong><br>We now wish to change our product to break down impressions by the client application (e.g., Twitter for iPhone, Android, etc.) that was used to view the Tweet. This requires us to evolve our job logic to aggregate along an additional dimension. TSAR simplifies this schema evolution:</p> \n<pre class=\"brush:scala;first-line:1;\">aggregate {\n onKeys(\n (TweetId),\n (TweetId, ClientApplicationId) // new aggregation dimension\n ) produce (\n Count\n ) sinkTo (Manhattan)\n}\n</pre> \n<p><strong>Backfill tooling</strong><br>Going forward, the impression counts product will now break down data by client application as well. However, data generated by prior iterations of the job does not reflect our new aggregation dimension. TSAR makes backfilling old data as simple as running one backfill command:</p> \n<pre class=\"brush:scala;first-line:1;\">$ tsar backfill --start=&lt;start&gt; --end=&lt;end&gt;\n</pre> \n<p><span>The backfill then runs in parallel to the production job. Backfills are useful to repair bugs in the aggregated data within a certain time range, or simply to fill in old data in parallel to a production job that is computing present data.</span></p> \n<p><span><strong>Simplify aggregating data on different time granularities</strong><br>Our impression counts TSAR job has been computing daily aggregates so far, but now we wish to compute all-time aggregates. TSAR uses a custom configuration file format, where you can add or remove aggregation granularities with a single-line change:</span></p> \n<p></p> \n<pre class=\"brush:scala;first-line:1;\">Output(sink = Sink.Manhattan, width = 1 * Day)\nOutput(sink = Sink.Manhattan, width = Alltime) // new aggregation granularity\n</pre> \n<p></p> \n<p><span>The user specifies whether he/she wants minutely, hourly, daily or alltime aggregates, and TSAR handles the rest. The computational boilerplate of event aggregation (copying each event into various time buckets) is abstracted away.</span></p> \n<p><span><strong>Automatic metric computation</strong><br>For the next version of the product, we can even compute the number of distinct users who have viewed each Tweet, in addition to the total impression count - that is, we can compute a new metric. Normally, this would require changing the job’s aggregation logic. However, TSAR abstracts away the details of metric computation from the user:</span></p> \n<pre class=\"brush:scala;first-line:1;\">aggregate {\n onKeys(\n (TweetId),\n (TweetId, ClientApplicationId)\n ) produce (\n Count,\n Unique(UserId) // new metric\n ) sinkTo (Manhattan)\n}\n</pre> \n<p><span>TSAR provides a built-in set of core metrics that the user can specify via configuration options (such as count, sum, unique count, standard deviation, ranking, variance, max, min). 
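Because TSAR’s aggregation is built on Summingbird, each of these metrics ultimately boils down to an associative operation. As a hypothetical sketch (expressed in Algebird terms, not TSAR’s actual extension API), a “max” metric is just a Semigroup over the attribute being tracked:</span></p> \n<pre class=\"brush:scala;first-line:1;\">import com.twitter.algebird.Semigroup\n\n// Hypothetical sketch: a max-latency metric as an associative merge.\n// Associativity is what lets the aggregation run in any grouping or order.\ncase class MaxLatency(millis: Long)\n\nimplicit val maxLatencySemigroup: Semigroup[MaxLatency] =\n Semigroup.from { (a, b) =&gt; MaxLatency(math.max(a.millis, b.millis)) }\n</pre> \n<p><span>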
However, if a user wishes to aggregate using a new metric (say exponential backoff) that TSAR does not yet support, the user can easily add it.</span></p> \n<p><span><strong>Automatic support for multiple sinks</strong><br>Additionally, we can export aggregated data to new output sinks like MySQL to allow for easy exploration. This is also a one-line configuration change:</span></p> \n<pre class=\"brush:scala;first-line:1;\">Output(sink = Sink.Manhattan, width = 1 * Day)\nOutput(sink = Sink.Manhattan, width = Alltime)\nOutput(sink = Sink.MySQL, width = Alltime) // new sink\n</pre> \n<p>TSAR infers and defines the key-value pair data models and relational database schema descriptions automatically via a job-specific configuration file. TSAR automates Twitter best practices using a general-purpose reusable aggregation framework. Note that TSAR is not tied to any specific sink. Sinks can easily be added to TSAR by the user, and TSAR will transparently begin persisting aggregated data to these sinks.</p> \n<p><strong>Operational simplicity</strong><br>TSAR provides the user with an end-to-end service infrastructure that you can deploy with a single command:</p> \n<pre class=\"brush:scala;first-line:1;\">$ tsar deploy\n</pre> \n<p>Without TSAR, in addition to writing the business logic of the impression counts job, one would have to build infrastructure to deploy the Hadoop and Storm jobs, build a query service that combines the results of the two pipelines, and deploy a process to load data into Manhattan and MySQL. A production pipeline also requires monitoring and alerting around its various components, along with checks for data quality.</p> \n<p>In our experience, the operational burden of building an entire analytics pipeline from scratch and managing the data flow is quite cumbersome. These parts of the infrastructure look very similar from pipeline to pipeline. We noticed common patterns among the pipelines we built before TSAR and abstracted them away from the user into a managed framework.</p> \n<p>A bird’s eye view of the TSAR pipeline looks like:</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/tsar_a_timeseriesaggregator95.thumb.1280.1280.png\" width=\"700\" height=\"641\" alt=\"TSAR, a TimeSeries AggregatoR\" class=\"align-center\"></p> \n<p>Now let’s bring the various components of our updated TSAR service together. You will see that the updated TSAR service looks almost exactly like the original. However, the data produced by this version of the TSAR service aggregates along additional event dimensions and along more time granularities and writes to an additional data store. The TSAR toolkit and service infrastructure simplify the operational aspects of this evolution as well. 
The final TSAR service fits into three small files:</p> \n<p><strong>ImpressionCounts: Thrift schema</strong></p> \n<pre class=\"brush:scala;first-line:1;\">enum Client\n{\n iPhone = 0,\n Android = 1,\n ...\n}\n\nstruct ImpressionAttributes\n{\n 1: optional Client client,\n 2: optional i64 user_id,\n 3: optional i64 tweet_id\n}\n</pre> \n<p><strong>ImpressionCounts: TSAR service</strong></p> \n<pre class=\"brush:scala;first-line:1;\">object ImpressionJob extends TsarJob[ImpressionAttributes] {\n aggregate {\n onKeys(\n (TweetId),\n (TweetId, ClientApplicationId)\n ) produce (\n Count,\n Unique(UserId)\n ) sinkTo (Manhattan, MySQL)\n } fromProducer {\n ClientEventSource(\"client_events\")\n .filter { event =&gt; isImpressionEvent(event) }\n .map { event =&gt;\n val impr = ImpressionAttributes(\n event.client, event.userId, event.tweetId\n )\n (event.timestamp, impr)\n }\n }\n}\n</pre> \n<p><strong>ImpressionCounts: Configuration file</strong></p> \n<pre class=\"brush:python;first-line:1;\">Config(\n base = Base(\n user = \"platform-intelligence\",\n name = \"impression-counts\",\n origin = \"2014-01-01 00:00:00 UTC\",\n primaryReducers = 1024,\n outputs = [\n Output(sink = Sink.Hdfs, width = 1 * Day),\n Output(sink = Sink.Manhattan, width = 1 * Day),\n Output(sink = Sink.Manhattan, width = Alltime),\n Output(sink = Sink.MySQL, width = Alltime)\n ],\n storm = Storm(\n topologyWorkers = 10,\n ttlSeconds = 4.days,\n ),\n ),\n)\n</pre> \n<p>The information contained in these three files (Thrift, Scala class, configuration) is all that the user needs to specify in order to deploy a fully functional service. TSAR fills in the blanks:</p> \n<ul>\n <li>How does one represent the aggregated data?</li> \n <li>How does one represent the schema?</li> \n <li>How does one actually perform the aggregation (computationally)?</li> \n <li>Where are the underlying services (Hadoop, Storm, MySQL, Manhattan, …) located, and how does one connect to them?</li> \n</ul>\n<blockquote class=\"g-quote g-tweetable\"> \n <p>The end-to-end management of the data pipeline is TSAR’s key feature. The user concentrates on the business logic.</p> \n</blockquote> \n<p><strong>Looking ahead</strong><br>While we have been running TSAR in production for more than a year, it is still a work in progress. The challenges are increasing and the number of features launching internally on TSAR is growing at a rapid pace. Pushing ourselves harder to be better and smarter is what drives us on the Platform Intelligence team. 
We wish to grow our business in a way that makes us proud and do what we can to make Twitter better and our customers more successful.</p> \n<p><strong>Acknowledgments</strong></p> \n<p><span>Among the many people who have contributed to TSAR (far too many to list here), I especially want to thank <a href=\"https://twitter.com/asiegel\">Aaron Siegel</a>, <a href=\"https://twitter.com/rlotun\">Reza Lotun</a>, <a href=\"https://twitter.com/rk\">Ryan King</a>, <a href=\"https://twitter.com/dloft\">Dave Loftesness</a>, <a href=\"https://twitter.com/squarecog\">Dmitriy Ryaboy</a>, <a href=\"https://twitter.com/helicalspiral\">Andrew Nguyen</a>, <a href=\"https://twitter.com/eigenvariable\">Jeff Sarnat</a>, <a href=\"https://twitter.com/chee1bot\">John Chee</a>, <a href=\"https://twitter.com/econlon\">Eric Conlon</a>, <a href=\"https://twitter.com/allenschen\">Allen Chen</a>, <a href=\"https://twitter.com/GabrielG439\">Gabriel Gonzalez</a>, <a href=\"https://twitter.com/alialzabarah\">Ali Alzabarah</a>, <a href=\"https://twitter.com/ConcreteVitamin\">Zongheng Yang</a>, <a href=\"https://twitter.com/zhilanT\">Zhilan Zweiger</a>, <a href=\"https://twitter.com/klingerf\">Kevin Lingerfelt</a>, <a href=\"https://twitter.com/leftparen\">Justin Chen</a>, <a href=\"https://twitter.com/chanian\">Ian Chan</a>, <a href=\"https://twitter.com/jialiehu\">Jialie Hu</a>, <a href=\"https://twitter.com/hansmire\">Max Hansmire</a>, <a href=\"https://twitter.com/dmarwick\">David Marwick</a>, <a href=\"https://twitter.com/posco\">Oscar Boykin</a>, <a href=\"https://twitter.com/sritchie\">Sam Ritchie</a>, <a href=\"https://twitter.com/0x138\">Ian O’Connell</a> … And special thanks to <a href=\"https://twitter.com/raffi\">Raffi Krikorian</a> for conjuring the Platform Intelligence team into existence and believing that anything is possible.</span></p> \n<p><span>If this sounds exciting to you and you’re interested in <a href=\"https://about.twitter.com/careers/positions?jvi=oUCqXfwD,Job\">joining the Platform Intelligence team</a> to work on Tsar, we’d love to hear from you! </span></p> \n<p></p>",
"date": "2014-06-27T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2014/tsar-a-timeseries-aggregator",
"domain": "engineering"
},
{
"title": "#ScalingTwitter in Dublin and the UK",
"body": "<p>As Twitter continues to expand internationally, our engineering teams are growing around the world. We recently held two <a href=\"https://twitter.com/hashtag/ScalingTwitter\">#ScalingTwitter</a> tech talks in our Dublin and London engineering hubs to highlight the work of these teams, the unique challenges of operating at scale and how Twitter is addressing these issues.</p> \n<p>Over 100 guests attended these events that featured a series of lightning tech talks from some of our most senior engineers across the company. <a href=\"https://twitter.com/onesnowclimber\">Sharon Ly</a>, <a href=\"https://twitter.com/luby\">Lucy Cunningham</a> and <a href=\"https://twitter.com/ekuber\">Esteban Kuber</a> showcased the technology stack required to process our traffic volume while <a href=\"https://twitter.com/YoSuperG\">K.G. Nesbit</a> discussed the evolution of the Twitter network. In addition, <a href=\"https://twitter.com/andyhume\">Andy Hume</a>, <a href=\"https://twitter.com/harrykantas\">Harry Kantas</a> and <a href=\"https://twitter.com/LukeSzemis\">Lukasz Szemis</a> presented on using tooling to measure and manage both infrastructure and software at scale.</p> \n<p><strong>London Tweets:</strong><br></p>\n<div class=\"g-tweet\">\n <blockquote class=\"twitter-tweet\">\n <a href=\"https://twitter.com/EleonoreMayola/status/494984311716974594\">https://twitter.com/EleonoreMayola/status/494984311716974594</a>\n </blockquote>",
"date": "2014-08-08T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2014/scalingtwitter-in-dublin-and-the-uk",
"domain": "engineering"
},
{
"title": "Fighting spam with BotMaker",
"body": "<p>Spam on Twitter is different from traditional spam primarily because of two aspects of our platform: Twitter exposes developer APIs to make it easy to interact with the platform and real-time content is fundamental to our user’s experience.</p> \n<p>These constraints mean that spammers know (almost) everything Twitter’s anti-spam systems know through the APIs, and anti-spam systems must avoid adding latency to user-visible operations. These operating conditions are a stark contrast to the constraints placed upon more traditional systems, like email, where data is private and adding latency of tens of seconds goes unnoticed.</p> \n<p>So, to fight spam on Twitter, we built BotMaker, a system that we designed and implemented from the ground up that forms a solid foundation for our principled defense against unsolicited content. The system handles billions of events every day in production, and we have seen a 40% reduction in key spam metrics since launching BotMaker.</p> \n<p>In this post we introduce BotMaker and discuss our overall architecture. All of the examples in this post are used to illustrate the use of BotMaker, not actual rules running in production.</p> \n<p><span><strong>BotMaker architecture</strong></span></p> \n<p><strong>Goals, challenges and BotMaker overview</strong><br>The goal of any anti-spam system is to reduce spam that the user sees while having nearly zero false positives. Three key principles guided our design of Botmaker:</p> \n<ul>\n <li><em>Prevent spam content from being created</em>. By making it as hard as possible to create spam, we reduce the amount of spam the user sees.</li> \n <li><em>Reduce the amount of time spam is visible on Twitter.</em> For the spam content that does get through, we try to clean it up as soon as possible.</li> \n <li><em>Reduce the reaction time to new spam attacks.</em> Spam evolves constantly. Spammers respond to the system defenses and the cycle never stops. In order to be effective, we have to be able to collect data, and evaluate and deploy rules and models quickly.</li> \n</ul>\n<p>BotMaker achieves these goals by receiving events from Twitter’s distributed systems, inspecting the data according to a set of rules, and then acting accordingly. BotMaker rules, or bots as they are known internally, are decomposed into two parts: conditions for deciding whether or not to act on an event, and actions that dictate what the caller should do with this particular event. 
For example, a simple rule for denying any Tweets that contain a spam URL would be:</p> \n<p></p> \n<pre class=\"brush:bash;first-line:1;\">Condition:\nHasSpamUrl(GetUrls(tweetText))\n\nAction:\nDeny()\n</pre> \n<p>The net effect of this rule is that BotMaker will deny any Tweets that match this condition.</p> \n<p>The main challenges in supporting this type of system are evaluating rules with low enough latency that they can run on the write path for Twitter’s main features (i.e., Tweets, Retweets, favorites, follows and messages), supporting computationally intensive machine-learning-based rules, and providing Twitter engineers with the ability to modify and create new rules instantaneously.</p> \n<p>For the remainder of this blog post, we discuss how we solve these challenges.</p> \n<p><span><strong>When we run BotMaker</strong></span></p> \n<p><span><strong><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/fighting_spam_withbotmaker95.thumb.1280.1280.png\" width=\"438\" height=\"379\" alt=\"Fighting spam with BotMaker \" class=\"align-center\"></strong></span></p> \n<p>The ideal spam defense would detect spam at the time of creation, but in practice this is difficult due to the latency requirements of Twitter. We have a combination of systems (see figure above) that detect spam at various stages.</p> \n<ol>\n <li>Real time (Scarecrow): Scarecrow detects spam in real time and prevents spam content from getting into the system, and it must run with low latency. Being in the synchronous path of all actions enables Scarecrow to deny writes and to challenge suspicious actions with countermeasures like captchas.</li> \n <li>Near real time (Sniper): For the spam that gets through Scarecrow’s real time checks, Sniper continuously classifies users and content off the write path. Some machine learning models cannot be evaluated in real time due to the nature of the features that they depend on. These models get evaluated in Sniper. Since Sniper is asynchronous, we can also afford to look up features that have high latency.</li> \n <li>Periodic jobs: Models that have to look at user behavior over extended periods of time and extract features from massive amounts of data can be run periodically in offline jobs since latency is not a constraint. While we do use offline jobs for models that need data over a large time window, doing all spam detection by periodically running offline jobs is neither scalable nor effective.</li> \n</ol>\n<p><span><strong>The BotMaker rule language</strong></span></p> \n<p>In addition to when BotMaker runs, we have put considerable time into designing an intuitive and powerful interface for guiding how BotMaker runs. Specifically: our BotMaker language is type safe, all data structures are immutable, all functions are pure except for a few well-marked functions for storing data atomically, and our runtime supports common functional programming idioms. Some of the language highlights include:</p> \n<ul>\n <li>Human-readable syntax.</li> \n <li>Functions that can be combined to compose complex derived functions.</li> \n <li>New rules can be added without any code changes or recompilation.</li> \n <li>Edits to production rules get deployed in seconds.</li> \n</ul>\n<p><span><strong><span>Sample bot</span></strong></span></p> \n<p><span>Here is a bot that demonstrates some of the above features. 
Let’s say we want to get all users that are receiving blocks due to mentions that they have posted in the last 24 hours.<br>Here is what the rule would look like:</span></p> \n<p>Condition:</p> \n<pre class=\"brush:bash;first-line:1;\"> Count(\n Intersection(\n UsersBlocking(spammerId, 1day),\n UsersMentionedBy(spammerId, 1day)\n )\n ) &gt;= 1\n</pre> \n<p>Actions:</p> \n<pre class=\"brush:bash;first-line:1;\"> Record(spammerId)\n</pre> \n<p>UsersBlocking and UsersMentionedBy are functions that return lists of users; the bot intersects these lists and counts the result. If the count is at least one, the user is recorded for analysis.</p> \n<p><span><strong>Impact and lessons learned</strong></span></p> \n<p>The figure below shows the amount of spam we saw on Twitter before enabling spam checks on the write path for Twitter events. This graph spans 30 days with time on the x-axis and spam volume on the y-axis. After turning on spam checking on the write paths, we saw a 55% drop in spam on the system as a direct result of preventing spam content from being written.</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/fighting_spam_withbotmaker96.thumb.1280.1280.png\" width=\"700\" height=\"311\" alt=\"Fighting spam with BotMaker \" class=\"align-center\"></p> \n<p>BotMaker has also helped us reduce our response time to spam attacks significantly. Before BotMaker, it took hours or days to make a code change, test and deploy, whereas using BotMaker it takes minutes to react. This faster reaction time has dramatically improved developer and operational efficiency, and it has allowed us to rapidly iterate and refine our rules and models, thus reducing the amount of spam on Twitter.</p> \n<p>Once we launched BotMaker and started using it to fight spam, we saw a 40% reduction in a metric that we use to track spam.</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/fighting_spam_withbotmaker97.thumb.1280.1280.png\" width=\"700\" height=\"202\" alt=\"Fighting spam with BotMaker \" class=\"align-center\"></p> \n<p><span><strong>Conclusion</strong></span><br>BotMaker has ushered in a new era of fighting spam at Twitter. With BotMaker, Twitter engineers now have the ability to quickly create new models and rules that can prevent spam before it even enters the system. We designed BotMaker to handle the stringent latency requirements of Twitter’s real-time products, while still supporting more computationally intensive spam rules.</p> \n<p>BotMaker is already being used in production at Twitter as our main spam-fighting engine. Because of the success we have had handling the massive load of events, and the ease of writing new rules that hit production systems immediately, other groups at Twitter have started using BotMaker for non-spam purposes. BotMaker acts as a fundamental interposition layer in our distributed system. Moving forward, the principles learned from BotMaker can help guide the design and implementation of systems responsible for managing, maintaining and protecting the distributed systems of today and the future.</p> \n<p></p>",
"date": "2014-08-20T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2014/fighting-spam-with-botmaker",
"domain": "engineering"
},
{
"title": "Outreach Program for Women and GSoC 2014 results",
"body": "<p>We had the opportunity to participate in the <a href=\"http://code.google.com/soc/\">Google Summer of Code</a> (GSoC) for the third time and would like to share the resulting open source activities.</p> \n<p>While many GSoC participating organizations focus on a single ecosystem, we have a <a href=\"http://twitter.github.io/\">variety of projects</a> that span multiple programming languages and communities. And for the first time, we participated in <a href=\"http://gnome.org/opw/\">Outreach Program for Women</a> (OPW) — an organization that focuses on helping women get involved in open source.</p> \n<p></p> \n<blockquote class=\"g-quote\"> \n <p>In total, we worked on nine successful projects with nine amazing students over the summer.</p> \n</blockquote> \n<h4><span>Outreach Program for Women projects</span></h4> \n<p><strong>Apache Mesos CLI improvements</strong><br><a href=\"https://twitter.com/ijimene\">Isabel Jimenez</a> worked with her mentor <a href=\"https://twitter.com/benh\">Ben Hindman</a> to add new functionality to the Mesos CLI interface. You can read about the work via her <a href=\"http://blog.isabeljimenez.com/foss-apache-mesos-and-me\">blog posts over the summer</a> and review <a href=\"https://reviews.apache.org/users/ijimenez/\">commits</a> associated with the project.</p> \n<p><strong>Apache Mesos slave unregistration support</strong><br>Alexandra Sava worked with mentor <a href=\"https://twitter.com/bmahler\">Ben Mahler</a> to add the ability to unregister a slave and have it drain all tasks instead of leaving tasks running underneath it. Check out <a href=\"https://reviews.apache.org/users/alexandra.sava/\">ReviewBoard</a> to look at the commits that have been merged already, and take a look at her <a href=\"http://alexsatech.wordpress.com/2014/08/21/opw-pencils-down/\">blog posts</a> that summarize her experience.</p> \n<h4>Summer of Code projects</h4> \n<p><strong>Use zero-copy read path in <a href=\"https://twitter.com/intent/user?screen_name=ApacheParquet\">@ApacheParquet</a></strong><br>Sunyu Duan worked with mentors <a href=\"https://twitter.com/J_\">Julien Le Dem</a> and <a href=\"https://twitter.com/gerashegalov\">Gera Shegalov</a> on improving performance in Parquet by using the <a href=\"https://github.com/Parquet/parquet-mr/issues/287\">new ByteBuffer based APIs in Hadoop</a>. As a result of their efforts, performance has improved up to 40% based on initial testing and the work will make its way into the next Parquet release.</p> \n<p><strong>A pluggable algorithm to choose next EventLoop in Netty</strong><br><a href=\"https://twitter.com/jakobbuchgraber\">Jakob Buchgraber</a> worked with mentor <a href=\"https://twitter.com/normanmaurer\">Norman Maurer</a> to add pluggable algorithm support to Netty’s event loop (see <a href=\"https://github.com/netty/netty/pull/2470\">pull request</a>).</p> \n<p></p>\n<div class=\"g-tweet\">\n <blockquote class=\"twitter-tweet\">\n <a href=\"https://twitter.com/jakobbuchgraber/status/501561485626449920\">https://twitter.com/jakobbuchgraber/status/501561485626449920</a>\n </blockquote>",
"date": "2014-08-25T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2014/outreach-program-for-women-and-gsoc-2014-results",
"domain": "engineering"
},
{
"title": "Bringing more design principles to security",
"body": "<p>To date, much of the web and mobile security focus has been on security bugs such as cross-site-scripting and SQL injection. Due to the number of those issues and the fact that the number of bugs in general increases in proportion to the number of lines of code, it’s clear that if we hope to address software security problems as a community, we also need to invest in designing software securely to eliminate entire classes of bugs.</p> \n<p>To that end, we are participating in the founding of the IEEE Center for Secure Design, which was announced today, and contributed to the Center’s in-depth report on “Avoiding the top ten software security design flaws.” We hope it serves as a useful resource to help software professionals as well as the community at large build more secure systems. We’ve been using these secure design principles in some form at Twitter, and with their codification by the IEEE, we’ll be further leveraging them in our own internal documentation and processes.</p> \n<p>As we continue to scale the mobile and web services that we provide, it will be increasingly important to continue taking a holistic, proactive approach to designing secure software to protect our users.</p> \n<blockquote class=\"g-quote\"> \n <p>Our participation in the IEEE Center for Secure Design is one way we are glad to contribute back to the community while furthering our own approach to secure software design.</p> \n</blockquote> \n<p>&nbsp;To learn more about the IEEE Center for Secure Design and download the report, visit <a href=\"http://cybersecurity.ieee.org\">cybersecurity.ieee.org</a>.</p>",
"date": "2014-08-27T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2014/bringing-more-design-principles-to-security",
"domain": "engineering"
},
{
"title": "Push our limits - reliability testing at Twitter",
"body": "<p>At Twitter, we strive to prepare for sustained traffic as well as spikes - some of which we can plan for, some of which comes at unexpected times or in unexpected ways.&nbsp;To help us prepare for these varied types of traffic, we continuously run tests against our infrastructure to ensure it remains a scalable and highly available system.</p> \n<p>Our Site Reliability Engineering (SRE) team has created a framework to perform different types of load and stress tests. We test different stages of a service life cycle in different environments (e.g., a release candidate service in a staging environment). These tests help us anticipate how our services will handle traffic spikes and ensure we are ready for <a href=\"https://engineering/2013/new-tweets-per-second-record-and-how\">such events</a>.</p> \n<p>Additionally, these tests help us to be more confident that the loosely coupled distributed services that power Twitter’s products are highly available and responsive at all times and under any circumstance.</p> \n<p>As part of our deploy process before releasing a new version of a service, we run a load test to check and validate the performance regressions of the service to estimate how many requests a single instance can handle.</p> \n<p>While load testing a service in a staging environment is a good release practice, it does not provide insight into how the overall system behaves when it’s overloaded. Services under load fail due to a variety of causes including GC pressure, thread safety violations and system bottlenecks (CPU, network).</p> \n<p>Below are the typical steps we follow to evaluate a service’s performance.</p> \n<p><span><strong>Performance evaluation</strong></span></p> \n<p>We evaluate performance in several ways for different purposes; these might be broadly categorized:</p> \n<p><strong>In staging</strong></p> \n<ul>\n <li>Load testing: Performing load tests against few instances of a service in non-production environment to identify a new service’s performance baseline or compare a specific build’s performance to the existing baseline for that service.</li> \n <li>Tap compare: Sending production requests to instances of a service in both production and staging environments and comparing the results for correctness and evaluating performance characteristics.</li> \n <li>Dark traffic testing: Sending production traffic to a new service to monitor its health and performance characteristics. In this case, the response(s) won’t be sent to the requester(s).</li> \n</ul>\n<p><strong>In production</strong></p> \n<ul>\n <li>Canarying: Sending small percentage of production traffic to some number of instances in a cluster which are running a different build (newer in most cases). The goal is to measure the performance characteristics and compare the results to the existing/older versions. 
Assuming the performance is in an acceptable range, the new version will be pushed to the rest of the cluster.</li> \n <li>Stress testing: Sending traffic (with specific flags) to the production site to simulate unexpected load spikes or expected organic growth.</li> \n</ul>\n<p>In this blog, we are primarily focusing on our stress testing framework, challenges, lessons learned, and future work.</p> \n<p><strong>Framework</strong></p> \n<p>We usually don’t face the typical performance testing problems such as <a href=\"https://engineering/2013/observability-at-twitter\">collecting services’ metrics</a>, <a href=\"https://engineering/2013/mesos-graduates-from-apache-incubation\">allocating resources to generate the load</a> or <a href=\"https://engineering/2012/building-and-profiling-high-performance-systems-with-iago\">implementing a load generator</a>. Obviously, any part of a system at scale can be impacted, but some parts are more resilient and some require more testing. Even though we still attend to the items mentioned above, in this blog and for this type of work we focus on system complexity and scalability. As part of our reliability testing, we generate distributed multi-datacenter load to analyze the impact and determine the bottlenecks.</p> \n<p>Our stress-test framework is written in Scala and leverages <a href=\"https://engineering/2012/building-and-profiling-high-performance-systems-with-iago\">Iago</a> to create load generators that run on <a href=\"https://engineering/2013/mesos-graduates-from-apache-incubation\">Mesos</a>. Its load generators send requests to the Twitter APIs to simulate Tweet creation, message creation, timeline reading and other types of traffic. We simulate patterns from past events such as New Year’s Eve, the Super Bowl, the Grammys, the State of the Union, NBA Finals, etc.</p> \n<p>The framework is flexible and integrated with the core services of Twitter infrastructure. We can easily launch jobs that are capable of generating large traffic spikes or a high volume of sustained traffic with minor configuration changes. The configuration file defines the required computational resources, transaction rate, transaction logs, the test module to use, and the targeted service. Figure 1 below shows an example of a launcher configuration file:</p> \n<pre class=\"brush:java;first-line:1;\">new ParrotLauncherConfig {\n\n // Aurora Specification\n role = \"limittesting\"\n jobName = \"limittesting\"\n serverDiskInMb = 4096\n feederDiskInMb = 4096\n mesosServerNumCpus = 4.0\n mesosFeederNumCpus = 4.0\n mesosServerRamInMb = 4096\n mesosFeederRamInMb = 4096\n numInstances = 2\n\n // Target systems address\n victimClusterType = \"sdzk\"\n victims = \"/service/web\"\n\n // Test specification\n hadoopConfig = \"/etc/hadoop/conf\"\n log = \"hdfs://hadoop-nn/limittesting/logs.txt\"\n duration = 10\n timeUnit = \"MINUTES\"\n requestRate = 1000\n \n // Testing Module\n imports = \"import com.twitter.loadtest.Eagle\"\n responseType = \"HttpResponse\"\n loadTest = \"new EagleService(service.get)\"\n}\n</pre> \n<p>The framework starts by launching a job in Aurora, the job scheduler for Mesos. It registers itself in ZooKeeper, and publishes into the Twitter observability stack (Viz) and distributed tracing system (Zipkin). This seamless integration lets us monitor the test execution. We can measure test resource usage, transaction volume, transaction time, etc. 
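</p> \n<p>The shape of the generated load matters as much as its volume. As a rough illustration (a hypothetical Scala sketch with made-up numbers, not one of our real test profiles), a run mixing sustained event-level traffic with a short spike might look like:</p> \n<pre class=\"brush:scala;first-line:1;\">// Hypothetical request-rate profile: a steady warm-up, a two-second spike\n// (e.g. users sharing a moment), then sustained event-level traffic.\n// All numbers are illustrative only.\ndef requestRate(secondsIntoTest: Int): Int =\n if (secondsIntoTest &lt; 60) 1000 // warm-up at steady state\n else if (secondsIntoTest &lt; 62) 20000 // two-second spike\n else 3000 // sustained event-level traffic\n</pre> \n<p>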
If we want to increase the traffic volume, we only need a few clicks to change a variable.</p> \n<p><strong><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/push_our_limits_-reliabilitytestingattwitter95.thumb.1280.1280.png\" width=\"629\" height=\"292\" alt=\"Push our limits - reliability testing at Twitter\" class=\"align-center\"></strong></p> \n<p></p> \n<p class=\"align-center\"><em>Figure 2: Load generated during a stress test </em></p> \n<p></p> \n<p><strong>Challenges</strong></p> \n<p>Comparing the performance characteristics of test runs is complicated. As we continuously integrate and deliver changes across all services, it gets harder to identify a baseline to compare against. The test’s environment changes many times between test runs due to many factors such as new service builds and releases. The inconsistency in test environments makes it difficult to determine the change that introduced the bottlenecks.</p> \n<p>If a regression is identified, we study what could contribute to it, including but not limited to how services behave under upstream and downstream failures and changes in traffic patterns. In some cases, detecting the root cause can be challenging. The anomaly we detect might not be the root cause, but rather a side effect of upstream or downstream issues. Finding the root cause among thousands of changes across many services is a time consuming process and might require lots of experiments and analysis.</p> \n<p>Generating the test traffic against a single or multiple data centers requires careful planning and test case design. Many factors need to be taken into consideration (like cache hit ratio). A cache miss for a tweet read can trigger a cache fill which in turn triggers multiple backend read requests to fill the data. Because a cache miss is much more expensive than a cache hit, the generated test traffic must respect these factors to get accurate test results that match production traffic patterns.</p> \n<p>Since our platform is real-time, we expect to observe extra surges of traffic at any point. The two most frequent kinds of load we have seen are: heavy traffic during a special event for a few minutes or hours, and spikes that happen in a second or two when <a href=\"https://engineering/2012/election-night-2012\">users</a> <a href=\"https://engineering/2013/new-tweets-per-second-record-and-how\">share</a> a <a href=\"https://engineering/2014/the-reach-and-impact-of-oscars-2014-tweets\">moment</a>. Simulating the spikes that last for a few seconds while monitoring the infrastructure to detect anomalies in real time is a complicated problem, and we are actively working on improving our approach.</p> \n<p><strong>Lessons learned</strong></p> \n<p>Our initial focus was on overloading the entire Tweet creation path to find limits in specific internal services, verify capacity plans, and understand the overall behavior under stress. We expected to identify weaknesses in the stack, adjust capacity and implement safety checks to protect minor services from upstream problems. However, we quickly learned this approach wasn’t comprehensive. Many of our services have unique architectures that make load testing complicated. 
We had to prioritize our efforts, review the main call paths, and then design and cover the major scenarios.</p> \n<p>An example is our <a href=\"https://engineering/2011/spiderduck-twitters-real-time-url-fetcher\">internal web crawler</a> service that assigns site crawling tasks to a specific set of machines in a cluster. The service does this for performance reasons since those machines have a higher probability of having an already-established connection to the target site. This same service replicates its already-crawled sites to the other data centers to save computing resources and outbound internet bandwidth.</p> \n<p>Accounting for all of these steps complicated the collection of links, the choice of their types and their distribution during test modeling. The distribution of links among load generators throughout the test was a problem because these were real production websites.</p> \n<p>In response to those challenges, we designed a system that distributes links across all our load generators in a way that guarantees no more than N links of any website are crawled per second across the cluster. We had to specify the link types and distribution carefully. We might have overwhelmed the internal systems if most of the links were invalid, spammy or contained malware. Additionally, we could have overwhelmed the external systems if all links were for a single website. The stack’s overall behavior changes as the percentage of each category changes. We had to find the right balance to design a test that covered all possible scenarios. These custom procedures repeat every time we model new tests.</p> \n<p>We started our testing methodology by focusing on specific site features such as the Tweet write and read paths. Our first approach was to simulate a high volume of sustained Tweet creation and reads. Due to the real-time nature of our platform, the variation of spikes, and the types of traffic we observe, we continuously expand our framework to cover additional features such as Retweets, favorited Tweets, conversations, etc. The variety of our features (Tweets, Tweets with media, searches, Discover, timeline views, etc.) requires diversity in our approach in order to ensure the results of our test simulations are complete and accurate.</p> \n<p>Twitter’s internal services have mechanisms to protect themselves and their downstreams. For example, a service will start doing in-process caching to avoid overwhelming the distributed caches, or will raise special exceptions to trigger upstream retry/backoff logic. This complicates test execution because the cache shields downstream services from the load. In fact, when the in-process cache kicks in, the service’s overall latency decreases since it no longer requires a round trip to the distributed caches. We had to work around such defense mechanisms by creating multiple test models around a single test scenario. One test verifies the in-process cache has kicked in; another test simulates the service’s behavior without the in-process cache. This process required changes across the stack to pass and respect special testing flags. After going through that process, we learned to design and architect services with reliability testing in mind to simplify and speed up future test modeling and design.</p> \n<p><strong>Future work</strong></p> \n<p>Since Twitter is growing rapidly and our services are changing continuously, our strategies and frameworks must evolve as well. We continue to improve and in some cases redesign our performance testing strategy and frameworks. 
We are automating the modeling, design, and execution of our stress tests, and making the stress-testing framework context-aware&nbsp;so it’s self-driven and capable of targeting a specific data center or N data centers. If this sounds interesting to you, we could use your help — <a href=\"https://about.twitter.com/careers\">join the flock</a>!</p> \n<p><em>Special thanks to <a href=\"https://twitter.com/alialzabarah\">Ali Alzabarah</a>&nbsp;for leading the efforts to improve and expand our stress-test framework and his hard work and dedication to this blog.</em></p> \n<p><em>Many thanks to a number of folks across <a href=\"https://twitter.com/intent/user?screen_name=twittereng\">@twittereng</a>, specifically —&nbsp;<a href=\"https://twitter.com/HiveTheory\">James Waldrop</a>, <a href=\"https://twitter.com/WamBamBoozle\">Tom Howland</a>, <a href=\"https://twitter.com/stevesalevan\">Steven Salevan</a>,&nbsp;<a href=\"https://twitter.com/squarecog\">Dmitriy Ryaboy</a>&nbsp;and <a href=\"https://twitter.com/razb0x\">Niranjan Baiju</a>.&nbsp;In addition, thanks to <a href=\"https://twitter.com/Yasumoto\">Joseph Smith</a>, <a href=\"https://twitter.com/mleinart\">Michael Leinartas</a>, <a href=\"https://twitter.com/lahosken\">Larry Hosken</a>, <a href=\"https://twitter.com/sdean\">Stephanie Dean</a>&nbsp;and <a href=\"https://twitter.com/davebarr\">David Barr</a> for their contributions to this blog post.<br></em></p> \n<p></p>",
"date": "2014-09-02T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2014/push-our-limits-reliability-testing-at-twitter",
"domain": "engineering"
},
{
"title": "All-pairs similarity via DIMSUM",
"body": "<p>We are often interested in finding users, hashtags and ads that are very similar to one another, so they may be recommended and shown to users and advertisers. To do this, we must consider many pairs of items, and evaluate how “similar” they are to one another.</p> \n<p>We call this the “all-pairs similarity” problem, sometimes known as a “similarity join.” We have developed a new efficient algorithm to solve the similarity join called “Dimension Independent Matrix Square using MapReduce,” or DIMSUM for short, which made one of our most expensive computations 40% more efficient.</p> \n<p><strong>Introduction</strong></p> \n<p>To describe the problem we’re trying to solve more formally, when given a dataset of sparse vector data, the all-pairs similarity problem is to find all similar vector pairs according to a similarity function such as <a href=\"https://en.wikipedia.org/wiki/Cosine_similarity\">cosine similarity</a>, and a given similarity score threshold.</p> \n<p>Not all pairs of items are similar to one another, and yet a naive algorithm will spend computational effort to consider even those pairs of items that are not very similar. The brute force approach of considering all pairs of items quickly breaks, since its computational effort scales quadratically.</p> \n<p>For example, for a million vectors, it is not feasible to check all roughly trillion pairs to see if they’re above the similarity threshold. Having said that, there exist clever sampling techniques to focus the computational effort on only those pairs that are above the similarity threshold, thereby making the problem feasible. We’ve developed the DIMSUM sampling scheme to focus the computational effort on only those pairs that are highly similar, thus making the problem feasible.</p> \n<p>In November 2012, we <a href=\"https://engineering/2012/dimension-independent-similarity-computation-disco\">reported the DISCO algorithm</a> to solve the similarity join problem using MapReduce. More recently, we have started using a new version called DIMSUMv2, and the purpose of this blog post is to report experiments and contributions of the new algorithm to two open-source projects. We have contributed DIMSUMv2 to the <a href=\"https://spark.apache.org/\">Spark</a> and <a href=\"http://www.cascading.org/projects/scalding/\">Scalding</a> open-source projects.</p> \n<p><strong>The algorithm</strong></p> \n<p>First, let’s lay down some notation: we’re looking for all pairs of similar columns in an m x n matrix whose entries are denoted a_ij, with the i’th row denoted r_i and the j’th column denoted c_j. There is an oversampling parameter labeled ɣ that should be set to 4 log(n)/s to get provably correct results (with high probability), where s is the similarity threshold.</p> \n<p>The algorithm is stated with a <a href=\"https://en.wikipedia.org/wiki/MapReduce\">Map and Reduce</a>, with proofs of correctness and efficiency in published papers [1] [2]. The reducer is simply the summation reducer. The mapper is more interesting, and is also the heart of the scheme. 
As an exercise, you should try to see why, in expectation, the map-reduce below outputs cosine similarities.</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/all-pairs_similarityviadimsum95.thumb.1280.1280.png\" width=\"700\" height=\"394\" alt=\"All-pairs similarity via DIMSUM\" class=\"align-center\"></p> \n<p>The mapper above is more computationally efficient than the one presented in [1], in that it tosses fewer coins. Nonetheless, its proof of correctness is the same as Theorem 1 mentioned in [1]. It is also more general than the algorithm presented in [2] since it can handle real-valued vectors, as opposed to only {0,1}-valued vectors. Lastly, this version of DIMSUM is suited to handle rows that may be skewed and have many nonzeros.</p> \n<p><strong>Experiments</strong></p> \n<p>We run DIMSUM daily on a production-scale ads dataset. Upon replacing the traditional cosine similarity computation in late June, we observed a 40% improvement in several performance measures, plotted below.</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/all-pairs_similarityviadimsum96.thumb.1280.1280.png\" width=\"594\" height=\"290\" alt=\"All-pairs similarity via DIMSUM\" class=\"align-center\"></p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/all-pairs_similarityviadimsum97.thumb.1280.1280.png\" width=\"219\" height=\"147\" alt=\"All-pairs similarity via DIMSUM\" class=\"align-center\"></p> \n<p><strong>Open source code</strong></p> \n<p>We have contributed an implementation of DIMSUM to two open source projects: Scalding and Spark.</p> \n<p>Scalding github pull-request: <a href=\"https://github.com/twitter/scalding/pull/833\" target=\"_blank\" rel=\"nofollow\">https://github.com/twitter/scalding/pull/833</a><br>Spark github pull-request: <a href=\"https://github.com/apache/spark/pull/336\" target=\"_blank\" rel=\"nofollow\">https://github.com/apache/spark/pull/336</a></p> \n<p><strong>Collaborators</strong></p> \n<p>Thanks to Kevin Lin, Oscar Boykin, Ashish Goel and Gunnar Carlsson.</p> \n<p><strong>References</strong></p> \n<p>[1] Bosagh-Zadeh, Reza and Carlsson, Gunnar (2013), Dimension Independent Matrix Square using MapReduce, arXiv:1304.1467</p> \n<p>[2] Bosagh-Zadeh, Reza and Goel, Ashish (2012), Dimension Independent Similarity Computation, arXiv:1206.2082</p> \n<p></p>",
"date": "2014-08-29T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2014/all-pairs-similarity-via-dimsum",
"domain": "engineering"
},
{
"title": "Hello Pants build",
"body": "<p>As codebases grow, they become increasingly difficult to work with. Builds get ever slower and existing tooling doesn’t scale. One solution is to keep splitting the code into more and more independent repositories — but you end up with hundreds of free-floating codebases with hard-to-manage dependencies. This makes it hard to discover, navigate and share code, which can affect developer productivity.</p> \n<p>Another solution is to have a single large, unified codebase. We’ve found that this promotes better engineering team cohesion and collaboration, which results in greater productivity and happiness. But tooling for such structured codebases has been lacking. That’s why we developed <a href=\"http://pantsbuild.github.io/\">Pants</a>, an open source build system written in Python.</p> \n<p></p>\n<div class=\"g-tweet\">\n <blockquote class=\"twitter-tweet\">\n <a href=\"https://twitter.com/TwitterOSS/status/465934496475271168\">https://twitter.com/TwitterOSS/status/465934496475271168</a>\n </blockquote>",
"date": "2014-09-16T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2014/hello-pants-build",
"domain": "engineering"
},
{
"title": "Celebrating over a year of @FlightJS",
"body": "<p>Over a year ago, we <a href=\"https://engineering/2013/introducing-flight-a-web-application-framework\">open-sourced FlightJS</a>, a lightweight JavaScript framework for structuring web applications which was designed based on our experience scaling Twitter front-end projects.</p> \n<p>Since then, we have seen an <a href=\"https://github.com/flightjs/flight/blob/master/ADOPTERS.md\">independent community</a> grow with over 45 <a href=\"https://github.com/flightjs/flight/graphs/contributors\">contributors</a> who created new projects like a Yeoman <a href=\"https://github.com/flightjs/generator-flight\">generator</a> and an easier way to <a href=\"http://flight-components.nodejitsu.com/\">find new components</a>. FlightJS was also <a href=\"https://github.com/flightjs/flight/blob/master/ADOPTERS.md\">adopted</a> by other companies like <a href=\"http://nerds.airbnb.com/redesigning-search\">@Airbnb</a> and <a href=\"http://blog.gumroad.com/post/47471270406/gumroad-meets-flight\">@gumroad</a>.</p> \n<h5>Easier to get started</h5> \n<p>Over the pear year, we made it easier to get started with FlightJS due to the generator. Assuming you have NPM installed, simply start with this command to install the generator:</p> \n<pre>npm install -g generator-flight</pre> \n<p>Once that is done, you can bootstrap your FlightJS application using Yeoman and a variety of <a href=\"https://github.com/flightjs/generator-flight/#all-generators-and-their-output\">generator commands</a>:</p> \n<pre>yo flight hello-world-flight<br>yo flight:component hello-word</pre> \n<p>This will scaffold your application file structure, install the necessary library code and configure your test setup. It’s as simple as that.</p> \n<h5>The withChildComponents mixin</h5> \n<p>Grown out of TweetDeck’s codebase, <a href=\"https://github.com/flightjs/flight-with-child-components\">withChildComponents</a> is a mixin for FlightJS that we recently open sourced. It offers a way to nest components, automatically managing component lifecycles to avoid memory leaks and hard-to-maintain code. The mixin binds two or more components’ lifecycles in a parent-child relationship using events, matching the FlightJS philosophy that no component should ever have a reference to any other. A child component’s lifecycle is bound to its parent’s, and a series of child components can be joined in this way to form a tree.</p> \n<p>To host children, a component should use the <em>attachChild</em> method exposed by this mixin. The <em>attachChild</em> method, using the FlightJS late-binding mixin method, initializes the child up so that it will teardown when the parent tears down. An event, signifying that the parent is tearing down, is is passed by <em>attachChild</em> to the child component as its <em>teardownOn</em> attribute. By default, this <em>childTeardownEvent</em> is unique to the parent but can be overwritten to group components or define other teardown behaviors.</p> \n<p>Before the parent tears down, the <em>childTeardownEvent</em> is triggered. Child components, listening for this event, will teardown.</p> \n<p>If a child is hosting its own children, it will then trigger its own <em>childTeardownEvent</em> so that any children it attached will teardown. 
The teardown order is therefore equivalent to a depth-first traversal of the component tree.</p> \n<p>For example, a Tweet composition pane might be structured like this:</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/celebrating_overayearofflightjs95.thumb.1280.1280.png\" width=\"592\" height=\"399\" alt=\"Celebrating over a year of @FlightJS\"></p> \n<p>The darker blue components mix in <em>withChildComponents</em>, and use <em>attachChild</em> to attach a new component and bind the child’s lifecycle to their own using their <em>childTeardownEvent</em>.</p> \n<p>The Compose component might look something like this. It attaches the ComposeBox when it initializes, and invokes teardown when it detects a Tweet being sent.</p> \n<pre class=\"brush:jscript;first-line:1;\">function Compose() {\n this.after('initialize', function () {\n this.attachChild(ComposeBox, '.compose-box', {\n initialText: 'Some initial text...'\n });\n\n this.on('sendTweet', this.teardown);\n });\n}\n</pre> \n<p>The ComposeBox also attaches its children during initialization, as well as attaching some behaviour to its own teardown using advice.</p> \n<pre class=\"brush:jscript;first-line:1;\">function ComposeBox() {\n this.after('initialize', function () {\n this.attachChild(TweetButton, ...);\n this.attachChild(CharacterCount, ...);\n });\n\n this.before('teardown', function () {\n this.select('composeTextareaSelector')\n .attr('disabled', true);\n });\n}\n</pre> \n<p>If the Compose pane were torn down – perhaps because a Tweet was sent – the first event to fire would be <em>childTeardownEvent-1</em>, which would cause the ComposeBox and AccountPicker components to tear down. The ComposeBox would fire <em>childTeardownEvent-2</em>, causing the TweetButton and CharacterCount to tear down.</p> \n<p>Of course, if the ComposeBox was torn down on its own, only the TweetButton and the CharacterCount components would tear down with it – you can tear down only part of a component tree if you need to.</p> \n<h5>TweetDeck and the withChildComponents mixin</h5> \n<p>TweetDeck uses <em>withChildComponents</em> to tie logical groups of UI components together into pages or screens. For example, our login UI has a top level component named Startflow that nests components to look after the login form. When a user successfully logs in, the Startflow component tears down, and that brings the login forms with it.</p> \n<p>This centralises the logic for a successful login, and changes to this flow can be made without looking at all files concerned with login. We also don’t have to worry about removing DOM nodes and forgetting to rip out the component too!</p> \n<p>The <em>withChildComponents</em> mixin helps TweetDeck manage nested components in a simple way, abstracting away memory management and a good deal of complexity.</p> \n<h5>Feedback encouraged</h5> \n<p>FlightJS is an ongoing and evolving project. We’re planning to make more of the utilities and mixins used on the Twitter website and in TweetDeck available over time and look forward to your contributions and comments. If you’re interested in learning more, check out the FlightJS <a href=\"https://groups.google.com/forum/?fromgroups#!forum/twitter-flight\">mailing list</a>, <a href=\"http://webchat.freenode.net/?channels=flightjs\">#flightjs</a> on Freenode and follow <a href=\"https://twitter.com/flightjs\">@flightjs</a> for updates.</p> \n<p></p>",
"date": "2014-09-19T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2014/celebrating-over-a-year-of-flightjs",
"domain": "engineering"
},
{
"title": "Investing in MIT’s new Laboratory for Social Machines",
"body": "<p>Today, <a href=\"https://twitter.com/intent/user?screen_name=MIT\">@MIT</a> <a href=\"http://newsoffice.mit.edu/2014/twitter-funds-mit-media-lab-program-1001\">announced</a> the creation of the Laboratory for Social Machines, funded by a five-year, $10 million commitment from Twitter. Through <a href=\"https://twitter.com/intent/user?screen_name=Gnip\">@Gnip</a>, MIT will also have access to our full stream of public Tweets and complete corpus of historical public Tweets, starting with Jack Dorsey’s first Tweet in 2006.</p> \n<p>This is an exciting step for all of us at Twitter as we continue to develop new ways to support the research community. Building on the success of the <a href=\"https://engineering/2014/introducing-twitter-data-grants\">Twitter Data Grants</a> program, which attracted more than 1,300 applications, we remain committed to making public Twitter data available to researchers, instructors and students. We’ve already seen Twitter data being used in everything from <a href=\"http://www.healthmap.org/en/\">epidemiology</a> to <a href=\"http://petajakarta.org/banjir/en/\">natural disaster response</a>.</p> \n<p>The Laboratory for Social Machines anticipates using Twitter data to investigate the rapidly changing and intersecting worlds of news, government and collective action. The hope is that their research team will be able to understand how movements are started by better understanding how information spreads on Twitter.</p> \n<p>We look forward to the innovative research generated by the Laboratory for Social Machines and to further increasing Twitter’s footprint in the research community. For more information about the Laboratory for Social Machines, please visit <a href=\"http://socialmachines.media.mit.edu/\">socialmachines.media.mit.edu</a>.</p>",
"date": "2014-10-01T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2014/investing-in-mit-s-new-laboratory-for-social-machines",
"domain": "engineering"
},
{
"title": "Breakout detection in the wild",
"body": "<p>Nowadays, BigData is leveraged in every sphere of business: decision making for new products, gauging user engagement, making recommendations for products, health care, data center efficiency and more.</p> \n<p>A common form of BigData is time series data. With the progressively decreasing costs of collecting and mining large data sets, it’s become increasingly common that companies – including Twitter – collect millions of metrics on a daily basis [<a href=\"http://radar.oreilly.com/2013/09/how-twitter-monitors-millions-of-time-series.html\">1</a>, <a href=\"http://strataconf.com/strata2014/public/schedule/detail/32431\">2</a>, <a href=\"http://velocityconf.com/velocity2013/public/schedule/detail/28177\">3</a>].</p> \n<p>Exogenic and/or endogenic factors often give rise to breakouts in a time series. Breakouts can potentially have ramifications on the user experience and/or on a business’ bottom line. For example, in the context of cloud infrastructure, breakouts in time series data of system metrics – that may happen due to a hardware issues – could impact availability and performance of a service.</p> \n<p>Given the real-time nature of Twitter, and that high performance is key for delivering the best experience to our users, early detection of breakouts is of paramount importance. Breakout detection has also been used to detect change in user engagement during popular live events such as the Oscars, Super Bowl and World Cup.</p> \n<p>A breakout is typically characterized by two steady states and an intermediate transition period. Broadly speaking, breakouts have two flavors:</p> \n<ol>\n <li>Mean shift: A sudden jump in the time series corresponds to a mean shift. A sudden jump in CPU utilization from 40% to 60% would exemplify a mean shift.</li> \n <li>Ramp up: A gradual increase in the value of the metric from one steady state to another constitutes a ramp up. A gradual increase in CPU utilization from 40% to 60% would exemplify a ramp up.</li> \n</ol>\n<p>The figure below illustrates multiple mean shifts in real data.</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/breakout_detectioninthewild95.thumb.1280.1280.png\" width=\"700\" height=\"536\" alt=\"Breakout detection in the wild\" class=\"align-center\"></p> \n<p>Given the ever-growing number of metrics being collected, it’s imperative to automatically detect breakouts. Although a large body of research already exists on breakout detection, existing techniques are not suitable for detecting breakouts in cloud data. This can be ascribed to the fact that existing techniques are not robust in the presence of anomalies (which are not uncommon in cloud data).</p> \n<p>Today, we’re excited to announce the release of BreakoutDetection, an <a href=\"http://www.r-project.org/\">open-source R package</a> that makes breakout detection simple and fast. With its release, we hope that the community can benefit from the package as we have at Twitter and improve it over time.</p> \n<p>Our main motivation behind creating the package has been to develop a technique to detect breakouts which are robust, from a statistical standpoint, in the presence of anomalies. The BreakoutDetection package can be used in wide variety of contexts. 
For example, detecting a breakout in user engagement after an A/B test, detecting <a href=\"http://wiki.cbr.washington.edu/qerm/index.php/Behavioral_Change_Point_Analysis\">behavioral change</a>, or tackling problems in econometrics, financial engineering, and the political and social sciences.</p> \n<p><strong>How the package works</strong><br>The underlying algorithm – referred to as E-Divisive with Medians (EDM) – employs energy statistics to detect divergence in mean. Note that EDM can also be used to detect a change in distribution in a given time series. EDM uses <a href=\"http://www.wiley.com/WileyCDA/WileyTitle/productCd-0470129905.html\">robust statistical metrics</a>, viz., the median, and estimates the statistical significance of a breakout through a permutation test.</p> \n<p>In addition, EDM is non-parametric. This is important since the distribution of production data seldom (if at all) follows the commonly assumed normal distribution or any other widely accepted model. Our experience has been that time series often contain more than one breakout. To this end, the package can also be used to detect multiple breakouts in a given time series.</p> \n<p><strong>How to get started</strong><br>Install the R package using the following commands on the R console:</p> \n<pre class=\"brush:sql;first-line:1;\">install.packages(\"devtools\")\ndevtools::install_github(\"twitter/BreakoutDetection\")\nlibrary(BreakoutDetection)\n</pre> \n<p>The function breakout is called to detect one or more statistically significant breakouts in the input time series. The documentation of the function breakout, which can be seen by using the following command, details its input arguments and output.</p> \n<pre class=\"brush:sql;first-line:1;\">help(breakout)\n</pre> \n<p><strong>A simple example</strong><br>To get started, we recommend using the example dataset which comes with the package. Execute the following commands:</p> \n<pre class=\"brush:sql;first-line:1;\">data(Scribe)\nres = breakout(Scribe, min.size=24, method='multi', beta=.001, degree=1, plot=TRUE)\nres$plot\n</pre> \n<p>The above yields the following plot:</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/breakout_detectioninthewild96.thumb.1280.1280.png\" width=\"700\" height=\"320\" alt=\"Breakout detection in the wild\" class=\"align-center\"></p> \n<p>From the above plot, we observe that the input time series experiences a breakout and also has quite a few anomalies. The two red vertical lines denote the locations of the breakouts detected by the EDM algorithm. Unlike the existing approaches mentioned earlier, EDM is robust in the presence of anomalies. The change in mean in the time series can be better viewed with the following annotated plot:</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/breakout_detectioninthewild97.thumb.1280.1280.png\" width=\"700\" height=\"320\" alt=\"Breakout detection in the wild\" class=\"align-center\"></p> \n<p>The horizontal lines in the annotated plot above correspond to the approximate (i.e., filtering out the effect of anomalies) mean for each window.</p> \n<p><strong>Acknowledgements</strong></p> \n<p>We thank <a href=\"https://twitter.com/jtsiamis\">James Tsiamis</a>&nbsp;and <a href=\"https://twitter.com/scott_wong\">Scott Wong</a> for their support, and <a href=\"https://twitter.com/SlickRames\">Nicholas James</a>, the primary researcher behind this work.</p> \n<p></p>",
"date": "2014-10-24T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2014/breakout-detection-in-the-wild",
"domain": "engineering"
},
{
"title": "Building a complete Tweet index",
"body": "<p>Today, we are pleased to announce that Twitter now indexes every public Tweet since 2006.</p> \n<p>Since that <a href=\"https://twitter.com/jack/status/20\">first simple Tweet</a> over eight years ago, hundreds of billions of Tweets have captured everyday human experiences and major historical events. Our search engine excelled at surfacing breaking news and events in real time, and our search index infrastructure reflected this strong emphasis on recency. But our long-standing goal has been to let people search through every Tweet ever published.</p> \n<p>This new infrastructure enables many use cases, providing comprehensive results for entire TV and sports seasons, conferences (<a href=\"https://twitter.com/search?f=realtime&amp;q=%23tedglobal%20until%3A2013-06-15\">#TEDGlobal</a>), industry discussions (<a href=\"https://twitter.com/search?f=realtime&amp;q=%23mobilepayments%20until%3A2014-11-10\">#MobilePayments</a>), places, businesses and long-lived hashtag conversations across topics, such as <a href=\"https://twitter.com/search?f=realtime&amp;q=%23japanearthquake%20until%3A2011-03-11_14%3A15%3A00_PST\">#JapanEarthquake</a>, <a href=\"https://twitter.com/search?f=realtime&amp;q=%23Election2012%20until%3A2012-11-07\">#Election2012</a>, <a href=\"https://twitter.com/search?f=realtime&amp;q=%23scotlanddecides%20until%3A2014-09-17\">#ScotlandDecides</a>, <a href=\"https://twitter.com/search?f=realtime&amp;q=%23HongKong%20until%3A2014-09-28_15%3A49%3A07_PST\">#HongKong</a>, <a href=\"https://twitter.com/search?f=realtime&amp;q=%23ferguson%20until%3A2014-08-19_05%3A15%3A10_PST\">#Ferguson</a> and many more. This change will be rolling out to users over the next few days.</p> \n<p>In this post, we describe how we built a search service that efficiently indexes roughly half a trillion documents and serves queries with an average latency of under 100ms.</p> \n<p>The most important factors in our design were:</p> \n<ul>\n <li><strong>Modularity</strong>: Twitter already had a <a href=\"https://engineering.twitter.com/research/publication/earlybird-real-time-search-at-twitter\">real-time index</a> (an inverted index containing about a week’s worth of recent Tweets). We shared source code and tests between the two indices where possible, which created a cleaner system in less time.</li> \n <li><strong>Scalability</strong>: The full index is more than 100 times larger than our real-time index and grows by several billion Tweets a week. Our fixed-size real-time index clusters are non-trivial to expand; adding capacity requires re-partitioning and significant operational overhead. We needed a system that expands in place gracefully.</li> \n <li><strong>Cost effectiveness</strong>: Our real-time index is fully stored in RAM for low latency and fast updates. However, using the same RAM technology for the full index would have been prohibitively expensive.</li> \n <li><strong>Simple interface</strong>: Partitioning is unavoidable at this scale. But we wanted a simple interface that hides the underlying partitions so that internal clients can treat the cluster as a single endpoint.</li> \n <li>I<strong>ncremental development:</strong> The goal of “indexing every Tweet” was not achieved in one quarter. The full index builds on previous foundational projects. In 2012, we built a <a href=\"https://engineering/2013/now-showing-older-tweets-in-search-results\">small historical index</a> of approximately two billion top Tweets, developing an offline data aggregation and preprocessing pipeline. 
In 2013, we expanded that index by an order of magnitude, evaluating and tuning SSD performance. In 2014, we built the full index with a multi-tier architecture, focusing on scalability and operability.</li> \n</ul>\n<p><strong>Overview</strong><br>The system consists of four main parts: a batched data aggregation and preprocess pipeline; an inverted index builder; Earlybird shards; and Earlybird roots. Read on for a high-level overview of each component.</p> \n<p><strong>Batched data aggregation and preprocessing</strong><br>The ingestion pipeline for our real-time index processes individual Tweets one at a time. In contrast, the full index uses a batch processing pipeline, where each batch is a day of Tweets. We wanted our offline batch processing jobs to share as much code as possible with our real-time ingestion pipeline, while still remaining efficient.</p> \n<p>To do this, we packaged the relevant real-time ingestion code into Pig User-Defined Functions so that we could reuse it in Pig jobs (soon, moving to Scalding), and created a pipeline of Hadoop jobs to aggregate data and preprocess Tweets on Hadoop. The pipeline is shown in this diagram:</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/building_a_completetweetindex95.thumb.1280.1280.png\" width=\"666\" height=\"312\" alt=\"Building a complete Tweet index \" class=\"align-center\"><span>The daily data aggregation and preprocess pipeline consists of these components:</span></p> \n<ul>\n <li>Engagement aggregator: Counts the number of engagements for each Tweet in a given day. These engagement counts are used later as an input in scoring each Tweet.</li> \n <li>Aggregation: Joins multiple data sources together based on Tweet ID.</li> \n <li>Ingestion: Performs different types of preprocessing — language identification, tokenization, text feature extraction, URL resolution and more.</li> \n <li>Scorer: Computes a score based on features extracted during Ingestion. For the smaller historical indices, this score determined which Tweets were selected into the index.</li> \n <li>Partitioner: Divides the data into smaller chunks through our hashing algorithm. The final output is stored into HDFS.</li> \n</ul>\n<p>This pipeline was designed to run against a single day of Tweets. We set up the pipeline to run every day to process data incrementally. This setup had two main benefits. It allowed us to incrementally update the index with new data without having to fully rebuild too frequently. And because processing for each day is set up to be fully independent, the pipeline could be massively parallelized on Hadoop. This allowed us to efficiently rebuild the full index periodically (e.g. to add new indexed fields or change tokenization).</p> \n<p><strong>Inverted index building</strong><br>The daily data aggregation and preprocess job outputs one record per Tweet. That output is already tokenized, but not yet inverted.
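(For readers unfamiliar with the term, “inverting” turns per-Tweet token lists into a per-term map of posting lists. As a hedged illustration only, in Python and with an assumed input shape rather than our production code, the core transformation is:)</p> \n<pre class=\"brush:python;first-line:1;\"># Illustrative sketch: build an inverted index from tokenized Tweets.\nfrom collections import defaultdict\n\ndef invert(tokenized_tweets):\n    \"\"\"tokenized_tweets yields (tweet_id, [token, ...]) pairs.\"\"\"\n    postings = defaultdict(list)   # term -> list of Tweet IDs\n    for tweet_id, tokens in sorted(tokenized_tweets):\n        for term in set(tokens):   # one posting per term per Tweet\n            postings[term].append(tweet_id)\n    return postings\n\n# invert([(1, [\"new\", \"years\"]), (2, [\"happy\", \"new\"])])\n# -> {\"new\": [1, 2], \"years\": [1], \"happy\": [2]}\n</pre> \n<p>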
So our next step was to set up single-threaded, stateless inverted index builders that run on <a href=\"https://engineering/2013/mesos-graduates-from-apache-incubation\">Mesos</a>.</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/building_a_completetweetindex96.thumb.1280.1280.png\" width=\"661\" height=\"385\" alt=\"Building a complete Tweet index \" class=\"align-center\"></p> \n<p>The inverted index builder consists of the following components:</p> \n<ul>\n <li><strong>Segment partitioner:</strong> Groups multiple batches of preprocessed daily Tweet data from the same partition into bundles. We call these bundles “segments.”</li> \n <li><strong>Segment indexer:</strong> Inverts each Tweet in a segment, builds an inverted index and stores the inverted index into HDFS.</li> \n</ul>\n<p>The beauty of these inverted index builders is that they are very simple. They are single-threaded and stateless, and these small builders can be massively parallelized on Mesos (we have launched well over a thousand parallel builders in some cases). These inverted index builders can coordinate with each other by placing locks on ZooKeeper, which ensures that two builders don’t build the same segment. Using this approach, we rebuilt inverted indices for nearly half a trillion Tweets in only about two days (fun fact: our bottleneck is actually the Hadoop namenode).</p> \n<p><strong>Earlybird shards</strong><br>The inverted index builders produced hundreds of inverted index segments. These segments were then distributed to machines called <a href=\"https://engineering.twitter.com/research/publication/earlybird-real-time-search-at-twitter\">Earlybirds</a>. Since each Earlybird machine could only serve a small portion of the full Tweet corpus, we had to introduce sharding.</p> \n<p>In the past, we distributed segments across different hosts using a hash function. This works well with our real-time index, which remains a constant size over time. However, our full index clusters needed to grow continuously.</p> \n<p>With simple hash partitioning, expanding clusters in place involves a non-trivial amount of operational work – data needs to be shuffled around as the number of hash partitions increases. Instead, we created a two-dimensional sharding scheme to distribute index segments onto serving Earlybirds. With this two-dimensional sharding, we can expand our cluster without modifying existing hosts in the cluster:</p> \n<ul>\n <li>Temporal sharding: The Tweet corpus was first divided into multiple time tiers.</li> \n <li>Hash partitioning: Within each time tier, data was divided into partitions based on a hash function.</li> \n <li>Earlybird: Within each hash partition, data was further divided into chunks called Segments. Segments were grouped together based on how many could fit on each Earlybird machine.</li> \n <li>Replicas: Each Earlybird machine is replicated to increase serving capacity and resilience.</li> \n</ul>\n<p>The sharding is shown in this diagram:</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/building_a_completetweetindex97.thumb.1280.1280.png\" width=\"682\" height=\"369\" alt=\"Building a complete Tweet index \" class=\"align-center\"></p> \n<p>This setup makes cluster expansion simple:</p> \n<ul>\n <li>To grow data capacity over time, we will add time tiers. Existing time tiers will remain unchanged.
This allows us to expand the cluster in place.</li> \n <li>To grow serving capacity (QPS) over time, we can add more replicas.</li> \n</ul>\n<p>This setup allowed us to avoid adding hash partitions, which would have required shuffling data around – non-trivial to do without taking the cluster offline.</p> \n<p>A larger number of Earlybird machines per cluster translates to more operational overhead. We reduced cluster size by:</p> \n<ul>\n <li>Packing more segments onto each Earlybird (reducing hash partition count).</li> \n <li>Increasing the amount of QPS each Earlybird could serve (reducing replicas).</li> \n</ul>\n<p>In order to pack more segments onto each Earlybird, we needed to find a different storage medium. RAM was too expensive. Even worse, our ability to plug large amounts of RAM into each machine would have been physically limited by the number of DIMM slots per machine. SSDs were significantly less expensive ($/terabyte) than RAM. SSDs also provided much higher read/write performance compared to regular spindle disks.</p> \n<p>However, SSDs were still orders of magnitude slower than RAM. When we switched from RAM to SSD, our Earlybird QPS capacity took a major hit. To increase serving capacity, we made multiple optimizations such as tuning kernel parameters to optimize SSD performance, packing multiple DocValues fields together to reduce SSD random access, loading frequently accessed fields directly in-process and more. These optimizations are not covered in detail in this blog post.</p> \n<p><strong>Earlybird roots</strong><br>This two-dimensional sharding addressed cluster scaling and expansion. However, we did not want API clients to have to scatter-gather across the hash partitions and time tiers in order to serve a single query. To keep the client API simple, we introduced roots to abstract away the internal details of tiering and partitioning in the full index.</p> \n<p>The roots perform a two-level scatter-gather as shown in the diagram below, merging search results and term statistics histograms. This results in a simple API, and it appears to our clients that they are hitting a single endpoint. In addition, this two-level merging setup allows us to perform additional optimizations, such as avoiding forwarding requests to time tiers not relevant to the search query.</p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/building_a_completetweetindex98.thumb.1280.1280.png\" width=\"671\" height=\"364\" alt=\"Building a complete Tweet index \" class=\"align-center\"></p> \n<p><strong>Looking ahead</strong><br>For now, complete results from the full index will appear in the “All” tab of search results on the Twitter web client and Twitter for iOS &amp; Twitter for Android apps. Over time, you’ll see more Tweets from this index appearing in the “Top” tab of search results and in new product experiences powered by this index. <a href=\"https://twitter.com/search-advanced\">Try it out</a>: you can search for the first Tweets about <a href=\"https://twitter.com/search?f=realtime&amp;q=New%20Years%20until%3A2007-01-03%20since%3A2006-12-30&amp;src=typd\">New Years</a> between Dec. 30, 2006 and Jan. 2, 2007.</p> \n<p>The full index is a major infrastructure investment and part of ongoing improvements to the search and discovery experience on Twitter. There is still more exciting work ahead, such as optimizations for smart caching.
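(For the curious, the two-level merge the roots perform can be sketched as below: a toy, synchronous model in Python, where the tier metadata, the partition objects and the score-based merge policy are all assumptions made for illustration.)</p> \n<pre class=\"brush:python;first-line:1;\"># Toy model of a root: prune time tiers outside the query window,\n# fan out to each tier's hash partitions, then merge ranked hits.\ndef root_search(query, time_range, tiers):\n    \"\"\"tiers: [{\"range\": (start, end), \"partitions\": [...]}, ...]\"\"\"\n    hits = []\n    for tier in tiers:\n        lo, hi = tier[\"range\"]\n        if hi &lt; time_range[0] or lo &gt; time_range[1]:\n            continue                 # tier can't contain matching Tweets\n        for partition in tier[\"partitions\"]:\n            hits.extend(partition.search(query))   # second-level scatter\n    return sorted(hits, key=lambda h: h[\"score\"], reverse=True)\n</pre> \n<p>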
If this project sounds interesting to you, we could use your help – <a href=\"https://about.twitter.com/careers\">join the flock</a>!</p> \n<p><strong>Acknowledgments</strong><br>The full index project described in this post was led by <a href=\"https://twitter.com/yz\">Yi Zhuang</a> and <a href=\"https://twitter.com/pasha407\">Paul Burstein</a>. However, it builds on multiple years of related work. Many thanks to the team members that made this project possible.</p> \n<p><strong>Contributors</strong>: <a href=\"https://twitter.com/ForrestHBennett\">Forrest Bennett</a>, <a href=\"https://twitter.com/SteveBezek\">Steve Bezek</a>, <a href=\"https://twitter.com/pasha407\">Paul Burstein</a>, <a href=\"https://twitter.com/michibusch\">Michael Busch</a>, <a href=\"https://twitter.com/cdudte\">Chris Dudte</a>, <a href=\"https://twitter.com/keita_f\">Keita Fujii</a>, <a href=\"https://twitter.com/ativilambit\">Vibhav Garg</a>, <a href=\"https://twitter.com/Mi_Mo\">Mila Hardt</a>, <a href=\"https://twitter.com/jmh\">Justin Hurley</a>, <a href=\"https://twitter.com/therealahook\">Aaron Hook</a>, <a href=\"https://twitter.com/jmpspn\">Nik Johnson</a>, <a href=\"https://twitter.com/larsonite\">Brian Larson</a>, <a href=\"https://twitter.com/aaronlewism\">Aaron Lewis</a>, <a href=\"https://twitter.com/zhenghuali\">Zhenghua Li</a>, <a href=\"https://twitter.com/plok\">Patrick Lok</a>, <a href=\"https://twitter.com/sam\">Sam Luckenbill,</a> <a href=\"https://twitter.com/gilad\">Gilad Mishne</a>, <a href=\"https://twitter.com/ysaraf\">Yatharth Saraf,</a> <a href=\"https://twitter.com/eecraft\">Jayarama Shenoy</a>, <a href=\"https://twitter.com/tjps636\">Thomas Snider</a>, <a href=\"https://twitter.com/tiehx\">Haixin Tie</a>, <a href=\"https://twitter.com/O_e_bert\">Owen Vallis</a>, <a href=\"https://twitter.com/jane63090871\">Jane Wang</a>, <a href=\"https://twitter.com/javasoze\">John Wang</a>, <a href=\"https://twitter.com/wonlay\">Lei Wang</a>, <a href=\"https://twitter.com/wangtian\">Tian Wang</a>, <a href=\"https://twitter.com/bryce_yan\">Bryce Yan</a>, <a href=\"https://twitter.com/JimYoull\">Jim Youll</a>, <a href=\"https://twitter.com/zengmin88\">Min Zeng</a>, <a href=\"https://twitter.com/kevinzhao\">Kevin Zhao</a>, <a href=\"https://twitter.com/yz\">Yi Zhuang</a></p>",
"date": "2014-11-18T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2014/building-a-complete-tweet-index",
"domain": "engineering"
},
{
"title": "Introducing practical and robust anomaly detection in a time series",
"body": "<p>Both last year and this year, we saw a spike in the number of photos uploaded to Twitter on Christmas Eve, Christmas and New Year’s Eve (in other words, an anomaly occurred in the corresponding time series). Today, we’re announcing <a href=\"https://github.com/twitter/AnomalyDetection\">AnomalyDetection</a>, our open-source R package that automatically detects anomalies like these in big data in a practical and robust way.</p> \n<p class=\"align-left\"><a href=\"https://g.twimg.com/blog/blog/image/Anomaly_ChristmasEve_2014.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/introducing_practicalandrobustanomalydetectioninatimeseries95.thumb.1280.1280.png\" width=\"700\" height=\"304\" alt=\"Introducing practical and robust anomaly detection in a time series\"></a><em>Time series from Christmas Eve 2014</em></p> \n<p class=\"align-left\"><em><a href=\"https://g.twimg.com/blog/blog/image/figure_2_Photos_ChristmasEve_2013.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/introducing_practicalandrobustanomalydetectioninatimeseries96.thumb.1280.1280.png\" width=\"700\" height=\"304\" alt=\"Introducing practical and robust anomaly detection in a time series\"></a><em><em><em><em>Time series from Christmas Eve 2013</em></em></em></em></em></p> \n<p class=\"align-left\"></p> \n<p><em></em>Early detection of anomalies plays a key role in ensuring high-fidelity data is available to our own product teams and those of our data partners. This package helps us monitor spikes in user engagement on the platform surrounding holidays, major sporting events or during breaking news. Beyond surges in social engagement, exogenic factors – such as bots or spammers – may cause an anomaly in number of favorites or followers. The package can be used to find such bots or spam, as well as detect anomalies in system metrics after a new software release. We’re open-sourcing AnomalyDetection because we’d like the public community to evolve the package and learn from it as we have.</p> \n<p><a href=\"https://engineering/2014/breakout-detection-in-the-wild\">Recently</a>, we open-sourced <a href=\"https://github.com/twitter/BreakoutDetection\">BreakoutDetection</a>, a complementary R package for automatic detection of one or more breakouts in time series. While anomalies are point-in-time anomalous data points, breakouts are characterized by a ramp up from one steady state to another.</p> \n<p>Despite prior research in anomaly detection [1], these techniques are not applicable in the context of social network data because of its inherent seasonal and trend components. Also, as pointed out by Chandola et al. [2], anomalies are contextual in nature and hence, techniques developed for anomaly detection in one domain can rarely be used ‘as is’ in another domain.</p> \n<p>Broadly, an anomaly can be characterized in the following ways:</p> \n<ol>\n <li class=\"align-left\"><strong>Global/Local:</strong> At Twitter, we observe distinct seasonal patterns in most of the time series we monitor in production. Furthermore, we monitor multiple modes in a given time period. The seasonal nature can be ascribed to a multitude of reasons such as different user behavior across different geographies. Additionally, over longer periods of time, we observe an underlying trend. This can be explained, in part, by organic growth. 
As the figure below shows, global anomalies typically extend above or below expected seasonality and are therefore not subject to seasonality and underlying trend. On the other hand, local anomalies, or anomalies which occur inside seasonal patterns, are masked and thus are much more difficult to detect in a robust fashion.&nbsp;<a href=\"https://g.twimg.com/blog/blog/image/figure_localglobal_anomalies.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/introducing_practicalandrobustanomalydetectioninatimeseries97.thumb.1280.1280.png\" width=\"700\" height=\"186\" alt=\"Introducing practical and robust anomaly detection in a time series\" class=\"align-center\"></a><em>Illustrates positive/negative, global/local anomalies detected in real data</em></li> \n <li class=\"align-left\"><strong>Positive/Negative:</strong> An anomaly can be positive or negative. An example of a positive anomaly is a point-in-time increase in the number of Tweets during the Super Bowl. An example of a negative anomaly is a point-in-time decrease in QPS (queries per second). Robust detection of positive anomalies serves a key role in efficient capacity planning. Detection of negative anomalies helps discover potential hardware and data collection issues.</li> \n</ol>\n<p><strong>How does the package work?</strong><br>The primary algorithm, Seasonal Hybrid ESD (S-H-ESD), builds upon the Generalized ESD test [3] for detecting anomalies. S-H-ESD can be used to detect both global and local anomalies. This is achieved by employing time series decomposition and using <a href=\"http://www.wiley.com/WileyCDA/WileyTitle/productCd-0470129905.html\">robust statistical metrics</a>, viz., median together with ESD. In addition, for long time series such as 6 months of minutely data, the algorithm employs piecewise approximation. This is rooted in the fact that trend extraction in the presence of anomalies is non-trivial for anomaly detection [4].</p> \n<p>The figure below shows large global anomalies present in the raw data and the local (intra-day) anomalies that S-H-ESD exposes in the residual component via our statistically robust decomposition technique.<a href=\"https://g.twimg.com/blog/blog/image/figure_raw_residual_global_local.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/introducing_practicalandrobustanomalydetectioninatimeseries98.thumb.1280.1280.png\" width=\"700\" height=\"177\" alt=\"Introducing practical and robust anomaly detection in a time series\" class=\"align-left\"></a></p> \n<p>Besides time series, the package can also be used to detect anomalies in a vector of numerical values. We have found this very useful, as the corresponding timestamps are often not available. The package provides rich visualization support. The user can specify the direction of anomalies, the window of interest (such as last day, last hour) and enable or disable piecewise approximation. Additionally, the x- and y-axes are annotated in a way to assist with visual data analysis.</p> \n<p><strong>Getting started</strong><br>To begin, install the R package using the commands below on the R console:</p> \n<pre class=\"brush:bash;first-line:1;\">install.packages(\"devtools\")\ndevtools::install_github(\"twitter/AnomalyDetection\")\nlibrary(AnomalyDetection)\n</pre> \n<p>The function AnomalyDetectionTs is used to discover statistically meaningful anomalies in the input time series.
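(To give a feel for the statistics underneath, here is a hedged illustration of the robust generalized ESD core, written in Python rather than the package’s R; it is a simplification for intuition, not the package’s implementation:)</p> \n<pre class=\"brush:python;first-line:1;\"># Illustrative robust ESD pass: median/MAD in place of mean/sd.\nimport numpy as np\nfrom scipy.stats import t as t_dist\n\ndef robust_esd(x, max_anoms=0.02, alpha=0.05):\n    x = np.asarray(x, dtype=float)\n    n = len(x)\n    idx = np.arange(n)\n    anomalies = []\n    for i in range(1, max(1, int(max_anoms * n)) + 1):\n        med = np.median(x)\n        mad = 1.4826 * np.median(np.abs(x - med))  # robust sigma estimate\n        if mad == 0:\n            break\n        scores = np.abs(x - med) / mad\n        j = int(np.argmax(scores))\n        # Rosner's critical value for the i-th test statistic\n        p = 1 - alpha / (2 * (n - i + 1))\n        tv = t_dist.ppf(p, n - i - 1)\n        lam = tv * (n - i) / np.sqrt((n - i - 1 + tv ** 2) * (n - i + 1))\n        if scores[j] &gt; lam:\n            anomalies.append(int(idx[j]))\n        x, idx = np.delete(x, j), np.delete(idx, j)\n    return sorted(anomalies)\n</pre> \n<p>In the package itself, this kind of test is applied to the residual that remains after a robust seasonal decomposition, which is what lets it surface the local, intra-day anomalies discussed above.</p> \n<p>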
The documentation of AnomalyDetectionTs details the function’s input arguments and output; it can be viewed using the command below.</p> \n<pre class=\"brush:bash;first-line:1;\">help(AnomalyDetectionTs)\n</pre> \n<p><strong>An example</strong><br>We recommend starting with the example dataset that comes with the package. Execute the following commands:</p> \n<pre class=\"brush:bash;first-line:1;\">data(raw_data)\nres = AnomalyDetectionTs(raw_data, max_anoms=0.02, direction='both', plot=TRUE)\nres$plot\n</pre> \n<p>This yields the following plot:</p> \n<p><a href=\"https://g.twimg.com/blog/blog/image/figure_5_.91anomalies.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/introducing_practicalandrobustanomalydetectioninatimeseries99.thumb.1280.1280.png\" width=\"700\" height=\"266\" alt=\"Introducing practical and robust anomaly detection in a time series\" class=\"align-left\"></a></p> \n<p>From the plot, we can tell that the input time series experiences both positive and negative anomalies. Furthermore, many of the anomalies in the time series are local anomalies within the bounds of the time series’ seasonality.</p> \n<p>Therefore, these anomalies can’t be detected using the traditional methods. The anomalies detected using the proposed technique are annotated on the plot. If the timestamps for the plot above were not available, anomaly detection could instead be carried out using the AnomalyDetectionVec function. Specifically, you can use the following command:</p> \n<pre class=\"brush:bash;first-line:1;\">AnomalyDetectionVec(raw_data[,2], max_anoms=0.02, period=1440, direction='both', only_last=FALSE, plot=TRUE)\n</pre> \n<p>Often, anomaly detection is carried out on a periodic basis. For instance, you may be interested in determining whether there were any anomalies yesterday. To this end, we support an only_last flag, which subsets the detected anomalies to those that occurred during the last day or last hour. The following command</p> \n<pre class=\"brush:bash;first-line:1;\">res = AnomalyDetectionTs(raw_data, max_anoms=0.02, direction='both', only_last=\"day\", plot=TRUE)\nres$plot\n</pre> \n<p>yields the following plot:</p> \n<p><a href=\"https://g.twimg.com/blog/blog/image/figure_6_1.74_anomalies_1.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/introducing_practicalandrobustanomalydetectioninatimeseries100.thumb.1280.1280.png\" width=\"700\" height=\"266\" alt=\"Introducing practical and robust anomaly detection in a time series\" class=\"align-left\"></a></p> \n<p>From the above plot, we observe that only the anomalies that occurred during the last day have been annotated. Additionally, the prior six days are included to expose the seasonal nature of the time series, but they are put in the background as the window of primary interest is the last day.</p> \n<p>Anomaly detection for long duration time series can be carried out by setting the longterm argument to T.
An example plot corresponding to this (for a different data set) is shown below:<a href=\"https://g.twimg.com/blog/blog/image/figure_longterm.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/introducing_practicalandrobustanomalydetectioninatimeseries101.thumb.1280.1280.png\" width=\"700\" height=\"225\" alt=\"Introducing practical and robust anomaly detection in a time series\" class=\"align-left\"></a></p> \n<p><strong>Acknowledgements</strong></p> \n<p>Our thanks to <a href=\"https://twitter.com/jtsiamis\">James Tsiamis</a> and <a href=\"https://twitter.com/scott_wonger\">Scott Wong</a> for their assistance, and Owen Vallis (<a href=\"https://twitter.com/intent/user?screen_name=OwenVallis\">@OwenVallis</a>) and Jordan Hochenbaum (<a href=\"https://twitter.com/intent/user?screen_name=jnatanh\">@jnatanh</a>) for this research.</p> \n<p><strong>References</strong></p> \n<p>[1] Charu C. Aggarwal. “<em>Outlier analysis</em>”. Springer, 2013.</p> \n<p>[2] Varun Chandola, Arindam Banerjee, and Vipin Kumar. “<em>Anomaly detection: A survey</em>”. ACM Computing Surveys, 41(3):15:1–15:58, July 2009.</p> \n<p>[3] Rosner, B., (May 1983), “<em>Percentage Points for a Generalized ESD Many-Outlier Procedure</em>”, Technometrics, 25(2), pp. 165-172.</p> \n<p>[4] Vallis, O., Hochenbaum, J. and Kejariwal, A., (2014) “<em>A Novel Technique for Long-Term Anomaly Detection in the Cloud</em>”, 6th USENIX Workshop on Hot Topics in Cloud Computing, Philadelphia, PA.</p>",
"date": "2015-01-06T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2015/introducing-practical-and-robust-anomaly-detection-in-a-time-series",
"domain": "engineering"
},
{
"title": "All about Apache Aurora",
"body": "<p>Today we’re excited to see the <a href=\"http://aurora.incubator.apache.org/\">Apache Aurora</a> community announce the <a href=\"http://aurora.incubator.apache.org/blog/aurora-0-7-0-incubating-released/\">0.7.0 release</a>. Since we began development on this project, Aurora has become a critical part of how we run services at Twitter. Now a fledgling open source project, Aurora is actively developed by a community of developers running it in production.</p> \n<p>This is Aurora’s third release since joining the <a href=\"http://incubator.apache.org/\">Apache Incubator</a>. New features include beta integration with Docker – which allows developers to deploy their pre-packaged applications as lightweight containers – as well as full support for a new command-line client that makes it even simpler to deploy services via Aurora. A full <a href=\"https://git-wip-us.apache.org/repos/asf?p=incubator-aurora.git&amp;f=CHANGELOG&amp;hb=0.7.0-rc3\">changelog</a> is available documenting more updates.</p> \n<p></p>\n<div class=\"g-tweet\">\n <blockquote class=\"twitter-tweet\">\n <a href=\"https://twitter.com/ApacheAurora/status/565914721165918208\">https://twitter.com/ApacheAurora/status/565914721165918208</a>\n </blockquote>",
"date": "2015-02-12T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2015/all-about-apache-aurora",
"domain": "engineering"
},
{
"title": "Handling five billion sessions a day – in real time",
"body": "<p>Since we first <a href=\"http://www.crashlytics.com/blog/launching-answers-by-crashlytics/\">released Answers</a> seven months ago, we’ve been thrilled by tremendous adoption from the mobile community. We now see about <em>five billion sessions per day</em>, and growing. Hundreds of millions of devices send millions of events <em>every second</em> to the Answers endpoint. During the time that it took you to read to here, the Answers back-end will have received and processed about 10,000,000 analytics events.</p> \n<p>The challenge for us is to use this information to provide app developers with reliable, real-time and actionable insights into their mobile apps.</p> \n<p>At a high level, we guide our architectural decisions on the principles of decoupled components, asynchronous communication and graceful service degradation in response to catastrophic failures. We make use of the Lambda Architecture to combine data integrity with real-time data updates.</p> \n<p>In practice, we need to design a system that receives events, archives them, performs offline and real-time computations, and merges the results of those computations into coherent information. All of this needs to happen at the scale of millions events per second.</p> \n<p>Let’s start with our first challenge: receiving and handling these events.</p> \n<p><strong>Event reception</strong><br>When designing our device-server communication, our goals were: reducing impact on battery and network usage; ensuring data reliability; and getting the data over as close to real time as possible. To reduce impact on the device, we send analytics events in batches and compress them before sending. To ensure that valuable data always gets to our servers, devices retry failed data transfers after a randomized back-off and up to a disk size limit on the device. To get events over to the servers as soon as possible, there are several triggers that cause the device to attempt a transfer: a time trigger that fires every few minutes when the app is foregrounded, a number of events trigger and an app going into background trigger.</p> \n<p>This communication protocol results in devices sending us hundreds of thousands of compressed payloads every second. Each of these payloads may contain tens of events. To handle this load reliably and in a way that permits for easy linear scaling, we wanted to make the service that accepts the events be dead simple.</p> \n<p><a href=\"https://g.twimg.com/blog/blog/image/answers_architecture_screenshot1_0.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/handling_five_billionsessionsadayinrealtime95.thumb.1280.1280.png\" width=\"555\" height=\"241\" alt=\"Handling five billion sessions a day – in real time \"></a></p> \n<p>This service is written in GOLANG, fronted by Amazon Elastic Load Balancer (ELB), and simply enqueues every payload that it receives into a durable <a href=\"http://kafka.apache.org\">Kafka queue</a>.</p> \n<p><strong>Archival</strong><br>Because Kafka writes the messages it receives to disk and supports keeping multiple copies of each message, it is a durable store. Thus, once the information is in it we know that we can tolerate downstream delays or failures by processing, or reprocessing, the messages later. However, Kafka is not the permanent source of truth for our historic data — at the incoming rate of information that we see, we’d need hundreds of boxes to store all of the data just for a few days. 
So we configure our Kafka cluster to retain information for a few hours (enough time for us to respond to any unexpected, major failures) and get the data to our permanent store, Amazon Simple Storage Service (Amazon S3), as soon as possible.</p> \n<p>We extensively utilize <a href=\"https://storm.apache.org\">Storm</a> for real-time data processing, and the first relevant topology is one that reads the information from Kafka and writes it to Amazon S3.</p> \n<p><a href=\"https://g.twimg.com/blog/blog/image/answers_architecture_screenshot2_0.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/handling_five_billionsessionsadayinrealtime96.thumb.1280.1280.png\" width=\"555\" height=\"241\" alt=\"Handling five billion sessions a day – in real time \"></a></p> \n<p><strong>Batch computation</strong><br>Once the data is in Amazon S3, we’ve set ourselves up to compute anything that our data will allow us to via Amazon Elastic MapReduce (Amazon EMR). This includes batch jobs for all of the data that our customers see in their dashboards, as well as experimental jobs as we work on new features.</p> \n<p><a href=\"https://g.twimg.com/blog/blog/image/answers_architecture_screenshot3_0.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/handling_five_billionsessionsadayinrealtime97.thumb.1280.1280.png\" width=\"555\" height=\"246\" alt=\"Handling five billion sessions a day – in real time \"></a></p> \n<p>We write our MapReduce jobs in <a href=\"http://www.cascading.org\">Cascading</a> and run them via Amazon EMR. Amazon EMR reads the data that we’ve archived in Amazon S3 as input and writes the results back out to Amazon S3 once processing is complete. We detect the jobs’ completion via a scheduler topology running in Storm and pump the output from Amazon S3 into a <a href=\"http://cassandra.apache.org\">Cassandra</a> cluster in order to make it available for sub-second API querying.</p> \n<p><strong>Speed computation</strong><br>What we have described so far is a durable and fault-tolerant framework for performing analytic computations. There is one glaring problem, however — it’s not real time. Some computations run hourly, while others require a full day of data as input. The computation times range from minutes to hours, as does the time it takes to get the output from Amazon S3 to a serving layer. Thus, at best, our data would always be a few hours behind, and wouldn’t meet our goals of being <strong>real time</strong> and <strong>actionable</strong>.</p> \n<p>To address this, in parallel to archiving the data as it comes in, we perform stream computations on it.</p> \n<p><a href=\"https://g.twimg.com/blog/blog/image/answers_architecture_screenshot4.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/handling_five_billionsessionsadayinrealtime98.thumb.1280.1280.png\" width=\"555\" height=\"231\" alt=\"Handling five billion sessions a day – in real time \"></a></p> \n<p>An independent Storm topology consumes the same Kafka topic as our archival topology and performs the same computations that our MapReduce jobs do, but in real time.
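(As a toy stand-in for such a streaming computation, and purely as an illustration with an assumed event shape, consider a per-app, per-minute session counter:)</p> \n<pre class=\"brush:python;first-line:1;\"># Toy speed-layer computation: count sessions per app per minute\n# from a stream of (app_id, unix_ts) events, e.g. decoded from Kafka.\nfrom collections import Counter\n\ndef count_sessions(events):\n    counts = Counter()\n    for app_id, ts in events:\n        minute = ts - (ts % 60)\n        counts[(app_id, minute)] += 1   # flushed to Cassandra in practice\n    return counts\n</pre> \n<p>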
The outputs of these computations are written to a different, independent Cassandra cluster for real-time querying.</p> \n<p>To compensate for the fact that we have less time, and potentially fewer resources, in the speed layer than the batch, we use probabilistic algorithms like <a href=\"http://en.wikipedia.org/wiki/Bloom_filter\">Bloom Filters</a> and <a href=\"http://en.wikipedia.org/wiki/HyperLogLog\">HyperLogLog</a> (as well as a few home-grown ones). These algorithms enable us to make order-of-magnitude gains in space and time complexity over their brute force alternatives, at the price of a negligible loss of accuracy.</p> \n<p><strong>Fitting it together</strong><br>So now that we have two independently produced sets of data (batch and speed), how do we combine them to present a single coherent answer?</p> \n<p><a href=\"https://g.twimg.com/blog/blog/image/answers_architecture_screenshot5.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/handling_five_billionsessionsadayinrealtime99.thumb.1280.1280.png\" width=\"555\" height=\"333\" alt=\"Handling five billion sessions a day – in real time \"></a></p> \n<p>We combine them with logic in our API that utilizes each data set under specific conditions.</p> \n<p>Because batch computations are repeatable and more fault-tolerant than speed, our APIs always favor batch-produced data. So, for example, if our API receives a request for data for a thirty-day, time-series DAU graph, it will first request the full range from the batch-serving Cassandra cluster. If this is a historic query, all of the data will be satisfied there. However, in the more likely case that the query includes the current day, the query will be satisfied mostly by batch-produced data, and just the most recent day or two will be satisfied by speed data.</p> \n<p><strong>Handling of failure scenarios</strong><br>Let’s go over a few different failure scenarios and see how this architecture allows us to gracefully degrade instead of go down or lose data when faced with them.</p> \n<p>We already discussed the on-device retry-after-back-off strategy. The retry ensures that data eventually gets to our servers in the presence of client-side network unavailability, or brief server outages on the back-end. A randomized back-off ensures that devices don’t overwhelm (DDoS) our servers after a brief network outage in a single region or a brief period of unavailability of our back-end servers.</p> \n<p>What happens if our speed (real-time) processing layer goes down? Our on-call engineers will get paged and address the problem. Since the input to the speed processing layer is a durable Kafka cluster, no data will have been lost and once the speed layer is back and functioning, it will catch up on the data that it should have processed during its downtime.</p> \n<p>Since the speed layer is completely decoupled from the batch layer, batch layer processing will go on unimpacted. Thus the only impact is delay in real-time updates to data points for the duration of the speed layer outage.</p> \n<p>What happens if there are issues or severe delays in the batch layer? Our APIs will seamlessly query for more data from the speed layer. A time-series query that may have previously received one day of data from the speed layer will now query it for two or three days of data. Since the speed layer is completely decoupled from the batch layer, speed layer processing will go on unimpacted.
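(The batch-favoring merge that makes this failover invisible to clients can be sketched as follows; the day-keyed stores are hypothetical stand-ins for the two Cassandra clusters:)</p> \n<pre class=\"brush:python;first-line:1;\"># Illustrative merge: prefer batch results, top up from speed.\nfrom datetime import date, timedelta\n\ndef query_dau(start, end, batch_store, speed_store):\n    \"\"\"batch_store/speed_store: hypothetical {date: count} lookups.\"\"\"\n    out = {}\n    day = start\n    while day &lt;= end:\n        value = batch_store.get(day)      # favor batch-produced data\n        if value is None:                 # batch not ready or delayed\n            value = speed_store.get(day)  # fall back to the speed layer\n        out[day] = value\n        day += timedelta(days=1)\n    return out\n\n# e.g. query_dau(date(2015, 1, 19), date(2015, 2, 17), {}, {})\n</pre> \n<p>In the degraded mode just described, the fallback branch simply covers a few more days.</p> \n<p>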
At the same time, our on-call engineers will get paged and address the batch layer issues. Once the batch layer is back up, it will catch up on delayed data processing, and our APIs will once again seamlessly utilize the batch-produced data that is now available.</p> \n<p>Our back-end architecture consists of four major components: event reception, event archival, speed computation, and batch computation. Durable queues between each of these components ensure that an outage in one of the components does not spill over to others and that we can later recover from the outage. Query logic in our APIs allows us to degrade gracefully, and then recover seamlessly, when one of the computation layers is delayed or down and then comes back up.</p> \n<p>Our goal for Answers is to create a dashboard that makes understanding your user base dead simple so you can spend your time building amazing experiences, not digging through data. <a href=\"http://answers.io/?utm_source=twitter_eng_blog&amp;utm_medium=twitter_blog&amp;utm_campaign=answers_5B_sessions_2.17.2015&amp;utm_content=inline_cta\">Learn more about Answers here</a> and get started today.</p> \n<p>Big thanks to the Answers team for all their efforts in making this architecture a reality. Also to Nathan Marz for his <a href=\"http://manning.com/marz/\">Big Data book</a>.</p> \n<p><strong>Contributors</strong><br><a href=\"https://twitter.com/ajorgensen\">Andrew Jorgensen</a>, <a href=\"https://twitter.com/bswift\">Brian Swift</a>, <a href=\"https://twitter.com/brianhatfield\">Brian Hatfield</a>,&nbsp;<a href=\"https://twitter.com/mikefurtak\">Michael Furtak</a>, <a href=\"https://twitter.com/marknic\">Mark Pirri</a>, <a href=\"https://twitter.com/CoryDolphin\">Cory Dolphin</a>, <a href=\"https://twitter.com/rothbutter\">Jamie Rothfeder</a>, <a href=\"https://twitter.com/jeffseibert\">Jeff Seibert</a>, <a href=\"https://twitter.com/sirstarry\">Justin Starry</a>, <a href=\"https://twitter.com/krob\">Kevin Robinson</a>, <a href=\"https://twitter.com/Kris10rht\">Kristen Johnson</a>, <a href=\"https://twitter.com/marcrichards\">Marc Richards</a>, <a href=\"https://twitter.com/patrickwmcgee\">Patrick McGee</a>, <a href=\"https://twitter.com/richparet\">Rich Paret</a>, <a href=\"https://twitter.com/wayne\">Wayne Chang</a>.</p>",
"date": "2015-02-17T08:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2015/handling-five-billion-sessions-a-day-in-real-time",
"domain": "engineering"
},
{
"title": "Another look at MySQL at Twitter and incubating Mysos",
"body": "<p>While we’re at the <a href=\"http://www.percona.com/live/mysql-conference-2015/\">Percona Live MySQL Conference</a>, we’d like to discuss updates on how Twitter uses MySQL, as well as share our plans to open source Mysos, a new MySQL on Apache Mesos framework.</p> \n<p></p>\n<div class=\"g-tweet\">\n <blockquote class=\"twitter-tweet\">\n <a href=\"https://twitter.com/TwitterDBA/status/588748702216359937\">https://twitter.com/TwitterDBA/status/588748702216359937</a>\n </blockquote>",
"date": "2015-04-16T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2015/another-look-at-mysql-at-twitter-and-incubating-mysos",
"domain": "engineering"
},
{
"title": "Building a new trends experience",
"body": "<p>We recently launched a <a href=\"https://engineering/2015/updating-trends-on-mobile\">new trends experience</a> on mobile and web. Certain users will now see additional context with their trends: a short description and at times an accompanying image. While building this new product experience, we also switched the entire trends backend system to a brand new platform. This is the largest engineering undertaking of the trends system since 2008.</p> \n<p>Every second on Twitter, thousands of Tweets are sent. Since 2008, <a href=\"https://engineering/2008/twitter-trends-tip\">trends</a> have been the one-stop window into the Twitter universe, providing users with a great tool to keep up with <a href=\"https://2014.twitter.com/moment/mh17\">breaking news</a>, <a href=\"https://2014.twitter.com/moment/oscars\">entertainment</a> and <a href=\"https://2014.twitter.com/moment/usa-ferguson\">social movements</a>.</p> \n<p>In last week’s release, we built new mechanisms that enrich trends with additional information to simplify their consumption. In particular, we introduced an online news clustering service that detects and groups breaking news on Twitter. We also replaced the legacy system for detecting text terms that are irregular in volume with a generic, distributed system for identifying anomalies in categorized streams of content. The new system can detect trending terms, URLs, users, videos, images or any application-specific entity of interest.</p> \n<p><strong>Adding context to trends</strong></p> \n<p>Until recently, trends have only been presented as a simple list of phrases or hashtags, occasionally leaving our users puzzled as to why something like “Alps,” “Top Gear” or “East Village” is trending. Learning more required clicking on the trend and going through related Tweets. With this update, Twitter now algorithmically provides trend context for certain tailored trends users. Not only does this improve usability, but according to our experimental data, it also motivates users to engage even more.</p> \n<p>Context for the new trends experience may include the following pieces of data:</p> \n<ul>\n <li><strong>A description:</strong> Short text descriptions are presented to explain the meaning of a trend. Currently, they are mainly retrieved from popular articles shared on Twitter.</li> \n <li><strong>An image:</strong> Surfacing a relevant image provides our users with direct visual knowledge. Trend images can come either from popular images or news articles shared on Twitter.</li> \n <li><strong>Twitter-specific information:</strong> The volume of Tweets and Retweets of a trend, or the time in which it started trending, helps users understand the magnitude of conversation surrounding the topic.</li> \n</ul>\n<p>When users click on a trend, Tweets that provided the image or description source are shown at the top of the corresponding timeline. This gives tribute to the original source of context and ensures a consistent browsing experience for the user. The trend context is recomputed and refreshed every few minutes, providing users with a summary of the current status of affairs.</p> \n<p><strong>News clustering system</strong><br>Surfacing high quality, relevant and real-time contextual information can be challenging, given the massive amount of Tweets we process. Currently, one major source for descriptions and images are news URLs shared on Twitter.</p> \n<p>The news clustering system detects breaking news in real time, based on engagement signals from users. 
The system groups similar news stories into clusters along different news verticals and surfaces real-time conversations and content like top images and Tweets for each cluster. The following diagram provides details about components of this system.</p> \n<p><a href=\"https://g.twimg.com/blog/blog/image/News_Clustering_System.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/building_a_new_trendsexperience95.thumb.1280.1280.png\" width=\"680\" height=\"475\" alt=\"Building a new trends experience\"></a></p> \n<p>The news clusterer consumes streams of Tweets containing URLs from a set of news domains. Twitter’s <a href=\"https://engineering/2011/spiderduck-twitters-real-time-url-fetcher\">URL crawler</a> is then used to fetch the content from the links embedded in each Tweet. The clusterer then builds feature vectors out of crawled article content. Using these feature vectors, an online clustering algorithm is employed to cluster related stories together. The resulting clusters are ranked using various criteria including recency and size. Each cluster maintains a ranked list of metadata, such as top images, top Tweets, top articles and keywords. The ranked clusters are persisted to <a href=\"https://engineering/2014/manhattan-our-real-time-multi-tenant-distributed-database-for-twitter-scale\">Manhattan</a> periodically for serving purposes.</p> \n<p>The news clustering service polls for updated clusters from Manhattan. An inverted index is built from the keywords computed by the clusterer, mapping each keyword to its corresponding clusters. Given a query, this service returns matching clusters, or the top-n ranked clusters available, along with their metadata, which is used to add context to trends.</p> \n<p><strong>Trends computation</strong></p> \n<p>In addition to making trends more self-explanatory, we have also replaced the entire underlying trends computation system itself with a new real-time distributed solution. Since the old system ran on a single JVM, it could only process a small window of Tweets at a time. It also lacked a stable recovery mechanism. The new system is built on a scalable, distributed Storm architecture for streaming MapReduce. It maintains state in both memcached and Manhattan. Tweet input passes through a durable <a href=\"http://kafka.apache.org/\">Kafka</a> queue for proper recovery on restarts.</p> \n<p>The new system consists of two major components. The first component is trends detection. It is built on top of <a href=\"https://engineering/2013/streaming-mapreduce-with-summingbird\">Summingbird</a> and is responsible for processing Firehose data, detecting anomalies and surfacing <a href=\"https://engineering/2010/trend-or-not-trend\">trends candidates</a>. The other component is trends postprocessing, which selects the best trends and decorates them with relevant context data.</p> \n<p><strong>Trends detection</strong><br>Based on Firehose Tweets input, a trends detection job computes trending entities in domains related to languages, geo locations and interesting topics.
As shown in the diagram below, it has the following main phases:</p> \n<ol>\n <li>Data preparation</li> \n <li>Entity, domain, attribute extraction and aggregation</li> \n <li>Scoring and ranking</li> \n</ol>\n<p><a href=\"https://g.twimg.com/blog/blog/image/Trends_Detection.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/building_a_new_trendsexperience96.thumb.1280.1280.png\" width=\"700\" height=\"629\" alt=\"Building a new trends experience\"></a>Data preparation includes filtering and throttling. Basic filtering removes replies, Tweets with low text quality, and Tweets containing sensitive content. Anti-spam filtering takes advantage of the real-time spam signal available from <a href=\"https://engineering/2014/fighting-spam-with-botmaker\">BotMaker</a>. Throttling removes similar Tweets and ensures that the contribution to a trend from any single user is limited.</p> \n<p>After filtering and throttling, the trending algorithm is where the decision of which domain-entity pairs are trending is made. For this, domain-entity pairs are extracted along with related metadata, and then aggregated into counter objects. Additional pair attributes like entity co-occurrence and top URLs are collected and persisted separately; these are later used for scoring and post-processing. The scorer computes a score for each entity-domain pair based on the main counter objects and their associated attribute counters. The tracker then ranks these pairs and saves the top-ranked results with scores onto Manhattan. These results are trends candidates ready for postprocessing and human evaluation.</p> \n<p><strong>Trends postprocessor</strong></p> \n<p>The trends postprocessor has the following main functionalities:</p> \n<ol>\n <li>Periodically retrieves trends generated by the trends detection job</li> \n <li>Performs trends backfill and folding</li> \n <li>Collects context metadata including description, image and Tweet volume</li> \n</ol>\n<p>The following diagram shows how the postprocessor works:</p> \n<p><a href=\"https://g.twimg.com/blog/blog/image/Trends_Postprocess.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/building_a_new_trendsexperience97.thumb.1280.1280.png\" width=\"700\" height=\"422\" alt=\"Building a new trends experience\"></a></p> \n<p>The scanner periodically loads all available domains and initiates the post-processing operation for each domain.</p> \n<p>Depending on the granularity, a domain may be expanded to form a proper sequence in ascending order of scope. For example, a city level domain [San Francisco] will be expanded to List[San Francisco, California, USA, en] that contains the full domain hierarchy, with language en as the most general one.</p> \n<p>For domains without sufficient organic trending entities, a backfill process is used to supplement them with data from their ancestors’ domains, after domain expansion.</p> \n<p>The folding process is responsible for combining textually different, semantically similar trending entities into a single cluster and selecting the best representative to display to the end user.</p> \n<p>The metadata fetcher retrieves data from multiple sources, including the <a href=\"https://engineering/2011/engineering-behind-twitter%E2%80%99s-new-search-experience\">search-blender</a> and news clustering service described earlier, to decorate each trending entity with context information.
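(Stepping back to the domain expansion described above, a toy version is sketched below; the parent map is hypothetical data, not our ontology:)</p> \n<pre class=\"brush:python;first-line:1;\"># Hypothetical domain hierarchy used for backfill expansion.\nPARENT = {\"San Francisco\": \"California\", \"California\": \"USA\", \"USA\": \"en\"}\n\ndef expand(domain):\n    chain = [domain]\n    while chain[-1] in PARENT:\n        chain.append(PARENT[chain[-1]])\n    return chain\n\n# expand(\"San Francisco\") -> [\"San Francisco\", \"California\", \"USA\", \"en\"]\n</pre> \n<p>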
These decorated entities are then persisted in batch for the trends serving layer to pick up.</p> \n<p>Looking ahead, we are working hard to improve the quality of trends in multiple ways. Stay tuned!</p> \n<p><strong>Acknowledgement</strong><br>The following people contributed to these updates: <a href=\"https://twitter.com/alexcebrian\">Alex Cebrian</a>, <a href=\"https://twitter.com/amits\">Amit Shukla</a>, <a href=\"https://twitter.com/larsonite\">Brian Larson</a>, <a href=\"https://twitter.com/changsmi\">Chang Su</a>, <a href=\"https://twitter.com/twdumi\">Dumitru Daniliuc</a>, <a href=\"https://twitter.com/drjorts\">Eric Rosenberg</a>, <a href=\"https://twitter.com/faresende\">Fabio Resende</a>, <a href=\"https://twitter.com/fei\">Fei Ma</a>, <a href=\"https://twitter.com/gabor\">Gabor Cselle</a>, <a href=\"https://twitter.com/gilad\">Gilad Mishne</a>, <a href=\"https://twitter.com/han_jinqiang\">Jay Han</a>, <a href=\"https://twitter.com/jerrymarino\">Jerry Marino</a>, <a href=\"https://twitter.com/jessm\">Jess Myra</a>, <a href=\"https://twitter.com/jingweiwu\">Jingwei Wu</a>, <a href=\"https://twitter.com/jinsong_lin\">Jinsong Lin</a>, <a href=\"https://twitter.com/jtrobec\">Justin Trobec</a>, <a href=\"https://twitter.com/kehli\">Keh-Li Sheng</a>, <a href=\"https://twitter.com/kevinzhao\">Kevin Zhao</a>, <a href=\"https://twitter.com/krismerrill\">Kris Merrill</a>, <a href=\"https://twitter.com/maerdot\">Maer Melo</a>, <a href=\"https://twitter.com/mikecvet\">Mike Cvet</a>, <a href=\"https://twitter.com/nipoon\">Nipoon Malhotra</a>, <a href=\"https://twitter.com/rchengyue\">Royce Cheng-Yue</a>, <a href=\"https://twitter.com/snikolov\">Stanislav Nikolov</a>, <a href=\"https://twitter.com/suchitagarwal\">Suchit Agarwal</a>, <a href=\"https://twitter.com/tall\">Tal Stramer</a>, <a href=\"https://twitter.com/tjack\">Todd Jackson</a>, <a href=\"https://twitter.com/vskarich\">Veljko Skarich</a>, <a href=\"https://twitter.com/venukasturi\">Venu Kasturi</a>, <a href=\"https://twitter.com/zhenghuali\">Zhenghua Li</a>, <a href=\"https://twitter.com/lingguang1997\">Zijiao Liu</a>.</p>",
"date": "2015-04-24T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2015/building-a-new-trends-experience",
"domain": "engineering"
},
{
"title": "Graduating Apache Parquet",
"body": "<p>ASF, the <a href=\"http://www.apache.org/\">Apache Software Foundation</a>, recently announced the graduation of <a href=\"http://parquet.apache.org\">Apache Parquet</a>, a columnar storage format for the Apache Hadoop ecosystem. At Twitter, we’re excited to be a founding member of the project.</p> \n<p></p>\n<div class=\"g-tweet\">\n <blockquote class=\"twitter-tweet\">\n <a href=\"https://twitter.com/TheASF/status/592644433813884929\">https://twitter.com/TheASF/status/592644433813884929</a>\n </blockquote>",
"date": "2015-05-21T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2015/graduating-apache-parquet",
"domain": "engineering"
},
{
"title": "Flying faster with Twitter Heron",
"body": "<p>We process billions of events on Twitter every day. As you might guess, analyzing these events in real time presents a massive challenge. Our main system for such analysis has been <a href=\"http://storm.apache.org/\">Storm</a>, a distributed stream computation system <a href=\"https://engineering/2011/storm-coming-more-details-and-plans-release\">we’ve open-sourced</a>. But as the scale and diversity of Twitter data has increased, our requirements have evolved. So we’ve designed a new system, <a href=\"http://dl.acm.org/citation.cfm?id=2742788\">Heron</a> — a real-time analytics platform that is fully API-compatible with Storm. We introduced it yesterday at <a href=\"http://www.sigmod2015.org/\">SIGMOD 2015</a>.</p> \n<p><strong>Our rationale and approach</strong></p> \n<p>A real-time streaming system demands certain systemic qualities to analyze data at a large scale. Among other things, it needs to: process of billions of events per minute; have sub-second latency and predictable behavior at scale; in failure scenarios, have high data accuracy, resiliency under temporary traffic spikes and pipeline congestions; be easy to debug; and simple to deploy in a shared infrastructure.</p> \n<p>To meet these needs we considered several options, including: extending Storm; using an alternative open source system; developing a brand new one. Because several of our requirements demanded changing the core architecture of Storm, extending it would have required longer development cycles. Other open source streaming processing frameworks didn’t perfectly fit our needs with respect to scale, throughput and latency. And these systems aren’t compatible with the Storm API – and adapting a new API would require rewriting several topologies and modifying higher level abstractions, leading to a lengthy migration process. So we decided to build a new system that meets the requirements above and is backward-compatible with the Storm API.</p> \n<p><strong>The highlights of Heron</strong></p> \n<p>When developing Heron, our main goals were to increase performance predictability, improve developer productivity and ease manageability. We made strategic decisions about how to architect the various components of the system to operate at Twitter scale.</p> \n<p>The overall architecture for Heron is shown here in Figure 1 and Figure 2. Users employ the Storm API to create and submit topologies to a scheduler. The scheduler runs each topology as a job consisting of several containers. One of the containers runs the topology master, responsible for managing the topology. The remaining containers each run a stream manager responsible for data routing, a metrics manager that collects and reports various metrics and a number of processes called Heron instances which run the user-defined spout/bolt code. These containers are allocated and scheduled by scheduler based on resource availability across the nodes in the cluster. The metadata for the topology, such as physical plan and execution details, are kept in Zookeeper.</p> \n<p><a href=\"https://g.twimg.com/blog/blog/image/blog-figure-1.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/flying_faster_withtwitterheron95.thumb.1280.1280.png\" width=\"400\" height=\"294\" alt=\"Flying faster with Twitter Heron\"></a><em>Figure 1. 
Heron Architecture</em></p> \n<p><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/flying_faster_withtwitterheron96.thumb.1280.1280.png\" width=\"428\" height=\"312\" alt=\"Flying faster with Twitter Heron\"><em>Figure 2. Topology Architecture</em></p> \n<p>Specifically, Heron includes these features:</p> \n<p><strong>Off-the-shelf scheduler:</strong> By abstracting out the scheduling component, we’ve made it easy to deploy on a shared infrastructure running various scheduling frameworks like Mesos, YARN, or a custom environment.</p> \n<p><strong>Handling spikes and congestion:</strong> Heron has a back pressure mechanism that dynamically adjusts the rate of data flow in a topology during execution, without compromising data accuracy. This is particularly useful under traffic spikes and pipeline congestions.</p> \n<p><strong>Easy debugging:</strong> Every task runs in process-level isolation, which makes it easy to understand its behavior, performance and profile. Furthermore, the sophisticated UI of Heron topologies, shown in Figure 3 below, enables quick and efficient troubleshooting of issues.</p> \n<p><a href=\"https://g.twimg.com/blog/blog/image/blog-figure-3a.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/flying_faster_withtwitterheron97.thumb.1280.1280.png\" width=\"511\" height=\"188\" alt=\"Flying faster with Twitter Heron\"></a></p> \n<p><a href=\"https://g.twimg.com/blog/blog/image/blog-figure-3b.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/flying_faster_withtwitterheron98.thumb.1280.1280.png\" width=\"557\" height=\"180\" alt=\"Flying faster with Twitter Heron\"></a><em>Figure 3. Heron UI showing logical plan, physical plan and status of a topology</em></p> \n<p><strong>Compatibility with Storm</strong>: Heron provides full backward compatibility with Storm, so we can preserve our investments in that system. No code changes are required to run existing Storm topologies in Heron, allowing for easy migration.</p> \n<p><strong>Scalability and latency</strong>: Heron is able to handle large-scale topologies with high throughput and low latency requirements. Furthermore, the system can handle a large number of topologies.</p> \n<p><strong>Heron performance</strong><br>We compared the performance of Heron with Twitter’s production version of Storm, which was forked from an open source version in October 2013, using a word count topology. This topology counts the distinct words in a stream generated from a set of 150,000 words.</p> \n<p><a href=\"https://g.twimg.com/blog/blog/image/blog-figure-4.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/flying_faster_withtwitterheron99.thumb.1280.1280.png\" width=\"354\" height=\"321\" alt=\"Flying faster with Twitter Heron\"></a></p> \n<p><em>Figure 4. Throughput with acks enabled</em></p> \n<p><em><a href=\"https://g.twimg.com/blog/blog/image/blog-figure-5.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/flying_faster_withtwitterheron100.thumb.1280.1280.png\" width=\"354\" height=\"331\" alt=\"Flying faster with Twitter Heron\"></a></em><em>Figure 5. Latency with acks enabled</em></p> \n<p>As shown in Figure 4, the topology throughput increases linearly for both Storm and Heron. However for Heron, the throughput is 10–14x higher than that of Storm in all experiments.
Similarly, the end-to-end latency, shown in Figure 5, increases far more gradually for Heron than it does for Storm. Heron latency is 5–15x lower than Storm’s latency.</p> \n<p>Beyond this, we have run topologies which scale to hundreds of machines, many of which handle sources that generate millions of events per second, without issues. Also with Heron, numerous topologies that aggregate data every second are able to achieve sub-second latencies. In these cases, Heron is able to achieve this with less resource consumption than Storm.</p> \n<p><strong>Heron at Twitter</strong></p> \n<p>At Twitter, Heron is used as our primary streaming system, running hundreds of development and production topologies. Since Heron is efficient in terms of resource usage, after migrating all Twitter’s topologies to it we’ve seen an overall 3x reduction in hardware, resulting in a significant improvement in our infrastructure efficiency.</p> \n<p><strong>What’s next?</strong></p> \n<p>We would like to collaborate and share lessons learned with the Storm community as well as other real-time stream processing system communities in order to further develop these programs. Our first step towards doing this was sharing our <a href=\"http://dl.acm.org/citation.cfm?id=2742788\">research paper on Heron</a> at SIGMOD 2015. In this paper, you’ll find more details about our motivations for designing Heron, the system’s features and performance, and how we’re using it on Twitter.</p> \n<p><strong>Acknowledgements</strong></p> \n<p>Heron would not have been possible without the work of <a href=\"https://twitter.com/sanjeevrk\">Sanjeev Kulkarni</a>, <a href=\"https://twitter.com/Louis_Fumaosong\">Maosong Fu</a>, <a href=\"https://twitter.com/challenger_nik\">Nikunj Bhagat</a>, <a href=\"https://twitter.com/saileshmittal\">Sailesh Mittal</a>, <a href=\"https://twitter.com/vikkyrk\">Vikas R. Kedigehalli</a>, Siddarth Taneja (<a href=\"https://twitter.com/intent/user?screen_name=staneja\">@staneja</a>), <a href=\"https://twitter.com/zhilant\">Zhilan Zweiger</a>, <a href=\"https://twitter.com/cckellogg\">Christopher Kellogg</a>, Mengdie Hu (<a href=\"https://twitter.com/intent/user?screen_name=MengdieH\">@MengdieH</a>) and <a href=\"https://twitter.com/msb5014\">Michael Barry</a>.</p> \n<p>We would also like to thank the <a href=\"http://storm.apache.org/\">Storm community</a> for teaching us numerous lessons and for moving the state of distributed real-time processing systems forward.</p> \n<p><strong>References</strong></p> \n<p>[1] <a href=\"http://dl.acm.org/citation.cfm?id=2742788\">Twitter Heron: Streaming at Scale</a>, Proceedings of ACM SIGMOD Conference, Melbourne, Australia, June 2015</p> \n<p>[2] <a href=\"http://dl.acm.org/citation.cfm?id=2595641\">Storm@Twitter</a>, Proceedings of ACM SIGMOD Conference, Snowbird, Utah, June 2014</p>",
"date": "2015-06-02T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2015/flying-faster-with-twitter-heron",
"domain": "engineering"
},
{
"title": "Twitter at @MesosCon 2015",
"body": "<p>Once again, we’re pleased to sponsor and participate in <a href=\"http://mesoscon.org\" target=\"_blank\">#MesosCon</a>. As heavy users of both Mesos and <a href=\"http://mesoscon.org\" target=\"_blank\">Apache Aurora</a> to power our cloud infrastructure, we’re excited to be part of this growing community event.</p> \n<p>The conference, organized by the <a href=\"http://mesos.apache.org\" target=\"_blank\">Apache Mesos</a> community, features talks on the popular open source cluster management software and its ecosystem of software for running distributed applications at the scale of tens of thousands of servers.</p> \n<p><strong>Conference highlights</strong><br>This year’s <a href=\"https://twitter.com/hashtag/MesosCon\">#MesosCon</a> will be significantly larger than last year and features simultaneous tracks including beginner talks, the Mesos core, frameworks, and operations. We have a stellar lineup of invited keynote speakers including Adrian Cockcroft (<a href=\"https://twitter.com/adrianco\" target=\"_blank\">@adrianco</a>, Battery Ventures), Neha Narula (<a href=\"https://twitter.com/neha\" target=\"_blank\">@neha</a>, MIT), Peter Bailis (<a href=\"https://twitter.com/pbailis\" target=\"_blank\">@pbailis</a>, UC Berkeley), and Benjamin Hindman (<a href=\"https://twitter.com/benh\" target=\"_blank\">@benh</a>, Mesosphere).</p> \n<p>We’re also pleased that Twitter will have a strong presence. We’ll be sharing our latest work as well as best practices from the last four-plus years of using Apache Mesos and Apache Aurora. And if you’re interested in learning more about engineering opportunities, stop by our booth.</p> \n<blockquote class=\"g-quote\"> \n <p>There’s a pre-conference hackathon that several of us Twitter folks will be attending. We’re also hosting a <a href=\"https://twitter.com/hashtag/MesosSocial\">#MesosSocial</a> in our Seattle office on Wednesday, August 19 to kick off the conference. You can follow <a href=\"https://twitter.com/intent/user?screen_name=TwitterOSS\">@TwitterOSS</a> for updates when we announce more details next week. 
\n<p><strong>Twitter speakers</strong></p> \n<p>The <a href=\"http://sched.co/31l1\" target=\"_blank\">New Mesos HTTP API</a> - <a href=\"https://twitter.com/vinodkone\" target=\"_blank\">Vinod Kone</a>, Twitter, Isabel Jimenez (<a href=\"https://twitter.com/ijimene\" target=\"_blank\">@ijimene</a>), Mesosphere<br>This session will provide a comprehensive walkthrough of recent advancements with the Mesos API, explaining the design rationale and highlighting specific improvements that simplify writing frameworks to Mesos.</p> \n<p><a href=\"http://sched.co/36Uq\" target=\"_blank\">Twitter’s Production Scale: Mesos and Aurora Operations</a> - <a href=\"https://twitter.com/Yasumoto\" target=\"_blank\">Joe Smith</a>, Twitter<br>This talk will offer an operations perspective on the management of a Mesos + Aurora cluster, and cover many of the cluster management best practices that have evolved here from real-world production experience.</p> \n<p><a href=\"http://sched.co/35Cq\" target=\"_blank\">Supporting Stateful Services on Mesos using Persistence Primitives</a> - <a href=\"https://twitter.com/jie_yu\" target=\"_blank\">Jie Yu</a>, Twitter, and Michael Park, Mesosphere<br>This talk will cover the persistence primitives recently built into Mesos, which provide native support for running stateful services like Cassandra and MySQL in Mesos. The goal of persistence primitives is to allow a framework to have assured access to its state even after task failover or slave restart.</p> \n<p><a href=\"http://sched.co/36Uv\" target=\"_blank\">Apache Cotton MySQL on Mesos</a> - <a href=\"https://twitter.com/xujyan\" target=\"_blank\">Yan Xu</a>, Twitter<br>Cotton is a framework for launching and managing MySQL clusters within a Mesos cluster. <a href=\"https://engineering/2015/another-look-at-mysql-at-twitter-and-incubating-mysos\" target=\"_blank\">Recently open-sourced by Twitter as Mysos</a> and later renamed, Cotton dramatically simplifies the management of MySQL instances and is one of the first frameworks that leverages Mesos’ persistent resources API. We’ll share our experience using this framework. It’s our hope that this is helpful to other Mesos framework developers, especially those wanting to leverage Mesos’ persistent resources API.</p> \n<p><a href=\"http://sched.co/35Ct\" target=\"_blank\">Tactical Mesos: How Internet-Scale Ad Bidding Works on Mesos/Aurora</a> - <a href=\"https://twitter.com/dmontauk\" target=\"_blank\">Dobromir Montauk</a>, TellApart<br>Dobromir will present TellApart’s full stack in detail, which includes Mesos/Aurora, ZK service discovery, Finagle-Mux RPC, and a Lambda architecture with Voldemort as the serving layer.</p> \n<p><a href=\"http://sched.co/35D4\" target=\"_blank\">Scaling a Highly-Available Scheduler Using the Mesos Replicated Log: Pitfalls and Lessons Learned</a> - <a href=\"https://twitter.com/kts\" target=\"_blank\">Kevin Sweeney</a>, Twitter<br>This talk will give you tools for writing a framework scheduler for a large-scale Mesos cluster using Apache Aurora as a case study. 
It will also explore the tools the Aurora scheduler has used to meet these challenges, including Apache Thrift for schema management.</p> \n<p><a href=\"http://sched.co/35Cu\" target=\"_blank\">Simplifying Maintenance with Mesos</a> - <a href=\"https://twitter.com/bmahler\" target=\"_blank\">Benjamin Mahler</a>, Twitter<br>Today, individual frameworks are responsible for maintenance, which poses challenges when running multiple frameworks (e.g. services, storage, batch compute). We’ll explore a current proposal for adding maintenance primitives in Mesos to address these concerns, enabling tooling for automated maintenance.</p> \n<p><a href=\"http://sched.co/35D1\" target=\"_blank\">Generalizing Software Deployment - The Many Meanings of “Update”</a> - <a href=\"https://twitter.com/wfarner\" target=\"_blank\">Bill Farner</a>, Twitter<br>Bill will present the evolution of how Apache Aurora managed deployments and describe some of the challenges imposed by wide variance in requirements. This talk will also share how deployments on Aurora currently run major services at Twitter.</p> \n<p><a href=\"http://sched.co/3Dfo\" target=\"_blank\">Per Container Network Monitoring and Isolation in Mesos</a> - <a href=\"https://twitter.com/jie_yu\" target=\"_blank\">Jie Yu</a>, Twitter<br>This talk will discuss the per container network monitoring and isolation feature introduced in Mesos 0.21.0. We’ll show you the implications of this approach and lessons we learned during the deployment and use of this feature.</p> \n<p><strong>Join us!</strong><br>Good news: there’s still time to <a href=\"http://events.linuxfoundation.org/events/mesoscon/attend/register\" target=\"_blank\">register for #MesosCon</a> and join us in Seattle on August 20-21.</p> \n<p>There’s a <a href=\"https://www.eventbrite.com/e/mesoscon-2015-pre-conference-hackathon-tickets-17752101012\" target=\"_blank\">pre-conference hackathon</a> that several of us Twitter folks will be attending. We’re also hosting a <a href=\"https://www.eventbrite.com/e/mesos-social-tickets-17959933645\" target=\"_blank\">#MesosSocial</a> in our Seattle office on Wednesday, August 19 to kick off the conference. You can follow <a href=\"http://twitter.com/twitteross\" target=\"_blank\">@TwitterOSS</a> for updates when we announce more details next week. See you at <a href=\"https://twitter.com/hashtag/MesosCon\">#MesosCon</a>!</p>",
"date": "2015-07-31T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2015/twitter-at-mesoscon-2015",
"domain": "engineering"
},
{
"title": "Diffy: Testing services without writing tests",
"body": "<p>Today, we’re excited to release <a href=\"http://www.github.com/twitter/diffy\">Diffy</a>, an open-source tool that automatically catches bugs in <a href=\"https://thrift.apache.org/\">Apache Thrift</a> and HTTP-based services. It needs minimal setup and is able to catch bugs without requiring developers to write many tests.</p> \n<p>Service-oriented architectures like our platform see a large number of services evolve at a very fast pace. As new features are added with each commit, existing code is inevitably modified daily – and the developer may wonder if they might have broken something. Unit tests offer some confidence, but writing good tests can take more time than writing the code itself. What’s more, unit tests offer coverage for tightly-scoped small segments of code, but don’t address the aggregate behavior of a system composed of multiple code segments.</p> \n<p><a href=\"https://g.twimg.com/blog/blog/image/Diffy_1.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/diffy_testing_serviceswithoutwritingtests95.thumb.1280.1280.png\" width=\"534\" height=\"553\" alt=\"Diffy: Testing services without writing tests\"></a></p> \n<p class=\"align-left\"><em>Each independent code path requires its own test.</em></p> \n<p class=\"align-left\">As the complexity of a system grows, it very quickly becomes impossible to get adequate coverage using hand-written tests, and there’s a need for more advanced automated techniques that require minimal effort from developers. Diffy is one such approach we use.</p> \n<p class=\"align-left\"><strong>What is Diffy?</strong><br>Diffy finds potential bugs in your service by running instances of your new and old code side by side. It behaves as a proxy and multicasts whatever requests it receives to each of the running instances. It then compares the responses, and reports any regressions that surface from these comparisons.</p> \n<p>The premise for Diffy is that if two implementations of the service return “similar” responses for a sufficiently large and diverse set of requests, then the two implementations can be treated as equivalent and the newer implementation is regression-free.</p> \n<p>We use the language “similar” instead of “same” because responses may be prone to a good deal of noise that can make some parts of the response data structure non-deterministic. For example:</p> \n<ul>\n <li>Server-generated timestamps embedded in the response</li> \n <li>Use of random generators in the code</li> \n <li>Race conditions in live data served by downstream services</li> \n</ul>\n<p>All of these create a strong need for noise to be automatically eliminated. Noisy results are useless for developers, because trying to manually distinguish real regressions from noise is like looking for a needle in a haystack. 
Diffy’s novel noise cancellation technique distinguishes it from other comparison-based regression analysis tools.</p> \n<p><strong>How Diffy works</strong><br>Diffy acts as a proxy which accepts requests drawn from any source you provide and multicasts each of these requests to three different service instances:</p> \n<ol>\n <li>A candidate instance running your new code</li> \n <li>A primary instance running your last known-good code</li> \n <li>A secondary instance running the same known-good code as the primary instance</li> \n</ol>\n<p>Here’s a diagram illustrating how Diffy operates:</p> \n<p><a href=\"https://g.twimg.com/blog/blog/image/Diffy_2.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/diffy_testing_serviceswithoutwritingtests96.thumb.1280.1280.png\" width=\"682\" height=\"382\" alt=\"Diffy: Testing services without writing tests\"></a></p> \n<p>As Diffy receives a request, it sends the same request to candidate, primary and secondary instances. When those services send responses back, Diffy compares these responses and looks for two things:</p> \n<ol>\n <li>Raw differences observed between the candidate and primary instances.</li> \n <li>Non-deterministic noise observed between the primary and secondary instances. Since both of these instances are running known-good code, we would ideally expect responses to be identical. For most real services, however, we observe that some parts of the responses end up being different and exhibit nondeterministic behavior.</li> \n</ol>\n<p>These differences may not show up consistently on a per-request basis. Imagine a random boolean embedded in the response. There is a 50% chance that the boolean will be the same across primary and secondary and a 50% chance that candidate will have a different value than primary. This means that 25% of the requests will trigger a false error and result in noise (the 50% of requests where primary and secondary happen to agree, multiplied by the 50% chance that candidate differs from primary). For this reason, Diffy looks at the aggregate frequency of each type of error across all the requests it has seen to date. Diffy measures how often primary and secondary disagree with each other versus how often primary and candidate disagree with each other. If these measurements are roughly the same, then it determines that there is nothing wrong and that the error can be ignored.</p> \n
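<p>To make this aggregate check concrete, here is a minimal sketch (in Java for illustration; Diffy itself is written in Scala, and this is not its actual API) of the bookkeeping such noise filtering implies. The field names and tolerance parameter are ours, not Diffy’s:</p> \n<pre>import java.util.HashMap;\nimport java.util.Map;\n\npublic class NoiseFilter {\n  // Per-field counters: {observations, primary-vs-secondary diffs, primary-vs-candidate diffs}.\n  private final Map&lt;String, int[]&gt; stats = new HashMap&lt;&gt;();\n\n  // Record one request's comparison results for one response field.\n  public void record(String field, boolean secondaryDiffers, boolean candidateDiffers) {\n    int[] s = stats.computeIfAbsent(field, k -&gt; new int[3]);\n    s[0]++;\n    if (secondaryDiffers) s[1]++;\n    if (candidateDiffers) s[2]++;\n  }\n\n  // A field is a likely regression only if candidate disagrees with primary\n  // significantly more often than the known-good noise floor (primary vs secondary).\n  public boolean isLikelyRegression(String field, double tolerance) {\n    int[] s = stats.get(field);\n    if (s == null || s[0] == 0) return false;\n    double noiseRate = s[1] / (double) s[0];\n    double candidateRate = s[2] / (double) s[0];\n    return candidateRate &gt; noiseRate + tolerance;\n  }\n}</pre> \n<p>For the random boolean above, both rates hover around 50%, so the field is dismissed as noise; a genuine regression instead drives the candidate rate well above the noise floor.</p> \n<p><strong>Getting started</strong><br>Here’s how you can start using Diffy to compare three instances of your service:<br>1. Deploy your old code to localhost:9990. This is your primary.<br>2. Deploy your old code to localhost:9991. This is your secondary.<br>3. Deploy your new code to localhost:9992. This is your candidate.<br>4. Build your diffy jar from the <a href=\"http://github.com/twitter/diffy\">code</a> using the “./sbt assembly” command.<br>5. Run the Diffy jar with the following command from the diffy directory:<br>java -jar ./target/scala-2.11/diffy-server.jar \\<br>-candidate=\"localhost:9992\" \\<br>-master.primary=\"localhost:9990\" \\<br>-master.secondary=\"localhost:9991\" \\<br>-service.protocol=\"http\" \\<br>-serviceName=\"My Service\" \\<br>-proxy.port=:31900 \\<br>-admin.port=:31159 \\<br>-http.port=:31149 \\<br>-rootUrl='localhost:31149'<br>6. Send a few test requests to your Diffy instance:<br>curl localhost:31900/your_application_route<br>7. Watch the differences show up in your browser at localhost:31149. 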
You should see something like this:</p> \n<p><a href=\"https://g.twimg.com/blog/blog/image/Diffy_4.jpg\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/diffy_testing_serviceswithoutwritingtests97.thumb.1280.1280.png\" width=\"1219\" height=\"1074\" alt=\"Diffy: Testing services without writing tests\"></a></p> \n<p>8. You can also see the full request that triggered the behavior and the full responses from primary and candidate:</p> \n<p><a href=\"https://g.twimg.com/blog/blog/image/Diffy_3.jpg\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/diffy_testing_serviceswithoutwritingtests98.thumb.1280.1280.png\" width=\"1219\" height=\"1074\" alt=\"Diffy: Testing services without writing tests\"></a></p> \n<p>Visit the <a href=\"http://github.com/twitter/diffy\">GitHub repo</a> for more detailed instructions and examples.</p> \n<p>As engineers we all want to focus on building and shipping products quickly. Diffy enables us to do that by keeping track of potential bugs for us. We hope you can gain from the project just as we have, and help us to continue improving it over time.</p>",
"date": "2015-09-03T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2015/diffy-testing-services-without-writing-tests",
"domain": "engineering"
},
{
"title": "Improving communication across distributed teams",
"body": "<p>As a global company, Twitter has offices in numerous locations around the world. Optimizing collaboration and improving productivity between employees across offices is a challenge faced by many companies and something we’ve spent a great deal of time on. For our part, we’ve learned a few lessons about what it takes to make distributed engineering offices successful. Over the next few months, we’ll share what we’ve learned. We’ll also periodically offer workshops about how to do distributed engineering well. The first of these will be on <a href=\"https://generalassemb.ly/education/hiring-done-right-distributed-engineering/san-francisco/15378\">Sept. 24 at General Assembly in San Francisco</a> (shameless plug!).</p> \n<p>In this post, we’ll describe the value, modes, and importance of communications between employees in different offices. We’ll share some data about how we do things, describe some of the common pitfalls that we identified, and how we’ve improved efficiency by addressing them.</p> \n<p>In a <a href=\"http://techcrunch.com/2015/07/09/an-interview-with-alex-roetter-twitters-head-of-engineering/\">TechCrunch</a> interview, Alex Roetter, our SVP of Engineering, recently said:</p> \n<p class=\"indent-60\">“We try to build teams that thrive in an environment with a clear direction, but the details are unspecified. Teams are able to try a bunch of things and fail fast, and hold themselves accountable and measure themselves rigorously. We find that smaller teams are faster…we try to build them in that sort of environment.”</p> \n<p>This kind of rapid iteration with small teams that Alex describes requires clear and constant communication between engineers, product managers, designers, researchers, and other functions. When everyone is in the same location, this is a lot easier. “Water-cooler” conversations or chats at offsites can lead to novel ideas, or foster cross-team collaboration. When employees work remotely, we have to do more work to ensure that they are able to have the kind of ad hoc conversations necessary to allow for innovation and ensure that different groups stay in sync.</p> \n<p>Building teams that are as geographically co-located as possible is one easy way to address some of this. In a future post, our New York Engineering Operations Manager Valerie Kuo (<a href=\"https://twitter.com/intent/user?screen_name=vyk\">@vyk</a>) will talk about the ways in which Twitter organizes teams against various constraints, and how we deal with them.</p> \n<p>In a modern-day technology company like Twitter, how often do meetings span locations? When I first started working here about a year ago, I was really curious about this – but it turns out relevant data are hard to find. 
To understand how pervasive cross-office collaboration really was, I analyzed data on every meeting held at Twitter between May 1, 2015 and July 25, 2015.&nbsp;</p> \n<p class=\"align-center\"><span><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/improving_communicationacrossdistributedteams95.thumb.1280.1280.png\" width=\"1400\" height=\"1156\" alt=\"Improving communication across distributed teams\"></span></p> \n<p class=\"align-center\"><span><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/improving_communicationacrossdistributedteams96.thumb.1280.1280.png\" width=\"1400\" height=\"842\" alt=\"Improving communication across distributed teams\"></span></p> \n<p class=\"align-center\"><span><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/improving_communicationacrossdistributedteams97.thumb.1280.1280.png\" width=\"1400\" height=\"1077\" alt=\"Improving communication across distributed teams\"></span></p> \n<p><span><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/improving_communicationacrossdistributedteams98.thumb.1280.1280.png\" width=\"1400\" height=\"1015\" alt=\"Improving communication across distributed teams\"></span></p> \n<p>There are a number of interesting conclusions:</p> \n<ul>\n <li>Over 20% of all meetings company-wide involve more than one city.</li> \n <li>Nearly 40% of meetings originating in our San Francisco HQ involve more than one city.</li> \n <li>Some cities work very closely together due to shared resources, strategies, and market needs. For instance, Tokyo, Singapore, Seoul, and Sydney interact with each other far more than they do with San Francisco.</li> \n <li>Despite the fact that most meetings are held in one city, meetings with multiple cities generally have more participants, so much so that <strong>50% of meeting time at Twitter is spent in multi-city meetings</strong>.</li> \n <li>If we could increase the efficiency of multi-city meetings by 10%, <strong>we would save about 5 weeks of aggregate work every day</strong> (at a 40-hour work week, that’s roughly 200 person-hours daily).</li> \n <li>These data only include meetings that take place in conference rooms; since ad hoc video conferences between co-workers are excluded, we are almost certainly underestimating the scale of remote collaboration.</li> \n</ul>\n<p>Given how critical interoffice communication is to a smooth operation, it’s clear that having a videoconferencing (VC) infrastructure is key to productivity. Recent upgrades to our VC systems have made this kind of collaboration substantially easier. The rest of this post will describe where we were a year ago, how we evaluated systems, and how changes we’ve made improve the working life of our employees.</p> \n<p><strong>How we started</strong></p> \n<p>A year ago, most Twitter HQ conference rooms were not outfitted with VC equipment. In distributed offices, penetration was higher, though by no means widespread. At that time, two different types of VC systems were deployed:</p> \n<ul>\n <li><strong>Full VC rooms</strong>: These had enterprise-level VC systems including a hardware codec, table mics, built-in speakers and a touch screen controller. These rooms were fairly easy to use but relatively expensive to build.&nbsp;</li> \n <li><strong>VC-light rooms</strong>: These were outfitted with an LCD TV, a USB camera, and a USB speaker/mic pod on the table. 
Calls were driven by the user’s laptop, connected via cables to the room equipment. These rooms were very cost efficient, but also required the user to dedicate their laptop to the call.</li> \n</ul>\n<p>We primarily used a SaaS service that provides video conferencing from a number of devices and allows interconnectivity to a variety of established VC systems. To use this system:</p> \n<ul>\n <li>The host or meeting organizer was responsible for creating a meeting in the app.</li> \n <li>The host then had to paste dial-in information into the meeting invite.</li> \n <li>In Full-VC rooms, users entered a 12-digit number into the touch panel controller.</li> \n <li>In VC-light rooms, users connected their laptops to USB connectors, clicked on the invite, logged in to the service, then edited their audio and video sources.</li> \n <li>Users could also use mobile apps or the website, or dial in using a phone number.</li> \n</ul>\n<p>Although we used this system extensively, there were some drawbacks: it regularly took over 5 minutes to initiate a VC connection. For a 30-minute meeting, this was over 15% of the meeting’s duration! And the quality of videoconferencing was poor because of several factors, including poor egress network bandwidth from our old New York engineering office, packet loss on the network, and the app’s prioritization of video quality over audio quality.</p> \n<p>After noting these issues, I worked with key members of our IT team to make things better. We proposed these requirements to fix multi-city meetings:</p> \n<ol>\n <li>80%+ of rooms should be integrated with VC equipment in all offices.</li> \n <li>Our calendaring system (Google Calendar) should be seamlessly integrated with video conferencing; i.e., VC equipment in each room should know the room’s schedule.</li> \n <li>One-button entry into a video conference upon entering the meeting room.</li> \n <li>Support for third-party phone dial-in.</li> \n <li>Ability to join calls without a Google account.</li> \n <li>Ability to support a large number of dial-ins.</li> \n</ol>\n<p>After a number of pilots, our IT team decided to go with a company-wide deployment of Google Hangouts plus Chrome for Meetings (CfM) boxes in conference rooms.</p> \n<p>Deployment of Google Hangouts and CfM in a company that already uses Google Apps, as we do, makes adoption much easier — Google Calendar integration is built into the hardware. CfM boxes are not complicated, and can be connected to an LCD screen or integrated into full room systems. Our standard Hangouts room has an LCD screen mounted on the wall and Google’s standard mic/speaker pod and camera. A local HDMI connection with auto-input switching is included to enable low-friction presentations for local meetings.</p> \n<p>Currently, CfM does not support 20+ person rooms. Since support for these rooms is important for larger meetings, we worked with AV vendors to develop a way to integrate higher-fidelity cameras and microphones typically found in these larger rooms. We’ve made it a standard across our offices, installing it in over 100 rooms globally. We will describe our approach in greater detail in a future post.</p> \n<p>At this point, we have over 570 rooms set up with CfM, and most VC meetings are conducted using Hangouts and CfM. We’ve easily saved 3-5 minutes per meeting in VC setup time, and people are notably happier. Unfortunately, CfM still does not support (4), (5), and (6), and so we have maintained a few rooms to be compatible with our legacy setup. 
We hope that these issues will be addressed soon.</p> \n<p class=\"align-center\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/improving_communicationacrossdistributedteams99.thumb.1280.1280.png\" width=\"671\" height=\"504\" alt=\"Improving communication across distributed teams\"></p> \n<p class=\"align-center\"><em>A Twitter conference room. Photo courtesy Alex Stillings.</em></p> \n<p>Video conferencing is one of the most important pieces of internal infrastructure at Twitter, and CfM does its job well. While this setup is massive strides ahead of our initial deployment, I still fly to San Francisco once a month for in-person meetings, because the best VC available at a reasonable price doesn’t come close to substituting for in-person conversations. In larger meetings, being remote still feels a bit too much like watching a meeting on TV. And we often have to mute mics when we’re not speaking, and there are periodic blips in A/V quality.</p> \n<p>The more we reduce the cost and pain of remote collaboration, the more efficiently we – or any company – can run. We’re already a highly distributed company that takes advantage of a great deal of cross-office collaboration. Further improvements in VC technology will really unlock our ability to efficiently interact with the best employees – current and future – wherever they might be.</p>",
"date": "2015-09-04T07:00:00.000Z",
"url": "https://engineering/engineering/en_us/a/2015/improving-communication-across-distributed-teams",
"domain": "engineering"
},
{
"title": "Hadoop filesystem at Twitter",
"body": "<p>Twitter runs multiple large Hadoop clusters that are among the biggest in the world. Hadoop is at the core of our data platform and provides vast storage for analytics of user actions on Twitter. In this post, we will highlight our contributions to ViewFs, the client-side Hadoop filesystem view, and its versatile usage here.</p> \n<p>ViewFs makes the interaction with our HDFS infrastructure as simple as a single namespace spanning all datacenters and clusters. <a href=\"https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/Federation.html\">HDFS Federation</a> helps with scaling the filesystem to our needs for number of files and directories while NameNode High Availability helps with reliability within a namespace. These features combined add significant complexity to managing and using our several large Hadoop clusters with varying versions. ViewFs removes the need for us to remember complicated URLs by using simple paths. Configuring ViewFs itself is a complex task at our scale. Thus, we run <em>TwitterViewFs</em>, a ViewFs extension we developed, that dynamically generates a new configuration so we have a simple holistic filesystem view.</p> \n<p><strong>Hadoop at Twitter: scalability and interoperability</strong><br>Our Hadoop filesystems host over 300PB of data on tens of thousands of servers. We scale HDFS by <a href=\"http://www.slideshare.net/gerashegalov/t-235pvijaya-renuv2\">federating</a> multiple namespaces. This approach allows us to sustain a high HDFS object count (inodes and blocks) without resorting to a single large Java heap size that would suffer from long <a href=\"https://issues.apache.org/jira/browse/HADOOP-9618\">GC pauses</a> and the inability to use <a href=\"https://docs.oracle.com/javase/7/docs/technotes/guides/vm/performance-enhancements-7.html#compressedOop\">compressed oops</a>. While this approach is great for scaling, it is not easy for us to use because each member namespace in the federation has its own URI. We use ViewFs to provide an illusion of a single namespace within a single cluster. As seen in Figure 1, under the main logical URI we create a ViewFs mount table with links to the appropriate mount point namespaces for paths beginning with <em>/user, /tmp, </em>and<em> /logs</em>, correspondingly.</p> \n<p><a href=\"https://g.twimg.com/blog/blog/image/fig1.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/hadoop_filesystemattwitter95.thumb.1280.1280.png\" width=\"318\" height=\"166\" alt=\"Hadoop filesystem at Twitter\"></a></p> \n<p>The configuration of the view depicted in Figure 1 translates to a lengthy configuration of a <a href=\"http://hadoop.apache.org/docs/r2.7.0/hadoop-project-dist/hadoop-hdfs/ViewFs.html\">mount table</a> named <strong>clusterA</strong>. Logically, you can think of this as a set of symbolic links. We abbreviate such links simply as <em>/logs-&gt;hdfs://logNameSpace/logs</em>. <a href=\"http://www.slideshare.net/gerashegalov/t-235pvijaya-renuv2\">Here</a> you can find more details about our <a href=\"https://issues.apache.org/jira/browse/HADOOP-9985\"><em>TwitterViewFs</em></a> extension to ViewFs that handles both hdfs:// and viewfs:// URI’s on the client side to onboard hundreds of Hadoop 1 applications without code changes.</p> \n<p>Twitter’s Hadoop client and server nodes store configurations of all clusters. At Twitter, we don’t invoke the <em>hadoop</em> command directly. 
Instead, we use a multiversion wrapper <em>hadoop</em> that dispatches to different hadoop installs based on a mapping from the configuration directory to the appropriate version. We store the configuration of cluster <em>C</em> in the datacenter <em>DC</em> abbreviated as <em>C@DC</em> in a local directory <em>/etc/hadoop/hadoop-conf-C-DC</em>, and we symlink the main configuration directory for the given node as <em>/etc/hadoop/conf</em>.</p> \n<p>Consider a DistCp from <em>source</em> to <em>destination</em>. Given a Hadoop 2 destination cluster (which is very common during migration), the source cluster has to be referenced via read-only Hftp regardless of the version of the source cluster. In the case of a Hadoop 1 source, Hftp is used because the Hadoop 1 client is not wire-compatible with Hadoop 2. In the case of a Hadoop 2 source, Hftp is used as there is no single HDFS URI because of federation. Moreover, with DistCp we have to use the destination cluster configuration to submit the job. However, the destination configuration does not contain information about HA and federation on the source side. Our previous solution implementing a <a href=\"http://www.slideshare.net/gerashegalov/t-235pvijaya-renuv2\">series of redirects</a> to the right NameNode is insufficient to cover all scenarios encountered in production, so we merge all cluster configurations on the client side to generate one valid configuration for HDFS HA and ViewFs for all Twitter datacenters as described in the next section.</p> \n<p><strong>User-friendly paths instead of long URIs</strong></p> \n<p>We developed user-friendly paths instead of long URIs and enabled native access to HDFS. This removes the overwhelming number of different URIs and greatly increases the availability of the data. When we use multi-cluster applications, we have to cope with the full URIs that sometimes have a long authority part represented by a NameNode CNAME. Furthermore, if the cluster mix includes both Hadoop 1 and Hadoop 2, which are not wire-compatible, we unfortunately have to remember which cluster to address via the interoperable Hftp filesystem URI. The volume of questions around this area on our internal Twitter employee mailing lists, chat channels and office hours motivated us to solve this URI problem for good on the Hadoop 2 side. We realized that since we already present multiple namespaces as a single view within a cluster, we should do the same across all clusters within a datacenter, or even across all datacenters. The idea is that a path <em>/path/file</em> at the cluster <em>C1</em> in the datacenter <em>DC1</em> should be mounted by the ViewFs in each cluster as <em>/DC1/C1/path/file</em> as shown in Figure 3. This way we will never have to specify a full URI, nor remember whether Hftp is needed because we can transparently link via Hftp within ViewFs.</p> \n<p><a href=\"https://g.twimg.com/blog/blog/image/fig3.png\"><img src=\"https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter/archive/hadoop_filesystemattwitter96.thumb.1280.1280.png\" width=\"284\" height=\"214\" alt=\"Hadoop filesystem at Twitter\"></a></p> \n
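<p>In code, the global view means any cluster in any datacenter is reachable through a plain path on the same filesystem object. A minimal sketch follows; the datacenter, cluster and file names are illustrative:</p> \n<pre>import org.apache.hadoop.conf.Configuration;\nimport org.apache.hadoop.fs.FileSystem;\nimport org.apache.hadoop.fs.Path;\n\npublic class GlobalViewExample {\n  public static void main(String[] args) throws Exception {\n    // One client, one namespace: /DC/cluster/... paths span every datacenter.\n    FileSystem fs = FileSystem.get(new Configuration());\n    fs.exists(new Path(\"/dc1/c1/path/file\"));  // cluster C1 in datacenter DC1\n    fs.exists(new Path(\"/dc2/c1/path/file\"));  // its counterpart in DC2, possibly linked via Hftp\n  }\n}</pre> \n<p>With our growing number of clusters and number of namespaces per cluster, it would be very cumbersome if we had to maintain additional mount table entries in each cluster configuration manually, as it turns into an O(n<sup>2</sup>) configuration problem. In other words, if we change the configuration of just one cluster we need to touch all n cluster configurations just for ViewFs. 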
We also need to handle the HDFS client configuration for nameservices because otherwise mount point URIs cannot be resolved by the <em>DFSClient</em>.</p> \n<p>It’s quite common that we have the same logical cluster in multiple datacenters for load balancing and availability: <em>C1@DC1, C1@DC2</em>, etc. Thus, we decided to add some more features to <em>TwitterViewFs</em>. Instead of populating the configurations administratively, our code adds the configuration keys needed for the global cross-datacenter view automatically at runtime, during filesystem initialization. This allows us to change existing namespaces in one cluster, or add more clusters without touching the configuration of the other clusters. By default our filesystem scans the glob <em>file:/etc/hadoop/hadoop-conf-</em>*.</p> \n<p>The following steps construct the <em>TwitterViewFs</em> namespace. When the Hadoop client is started with a specific <em>C-DC</em> cluster configuration directory, the following keys are added from all other <em>C’-DC’</em> directories during the <em>TwitterViewFs</em> initialization:</p> \n<ol>\n <li>If there is a ViewFs mount point link like <em>/path-&gt;hdfs://nameservice/path in C’-DC’</em>, then we will add a link <em>/DC’/C’/path-&gt;hdfs://nameservice/path</em>. For the Figure 1 example above, we would add to all cluster configurations:&nbsp;<em>/dc/a/user=hdfs://dc-A-user-ns/user</em></li> \n <li><span>Similarly, for consistency, we duplicate all conventional links <em>/path-&gt;hdfs://nameservice/path</em> for <em>C-DC</em> as <em>/DC/C/path-&gt;hdfs://nameservice/path</em>. This allows us to use the same notation regardless of whether we work with the default <em>C-DC</em> cluster or a remote cluster.</span></li> \n <li>We can easily detect whether the configuration <em>C’-DC’</em> that we are about to merge dynamically is a legacy Hadoop 1 cluster. For Hadoop 1, the key fs.defaultFS points to an <em>hdfs://</em> URI, whereas for Hadoop 2, it points to a<em> viewfs://</em> URI. Our Hadoop 1 clusters consist of a single namespace/NameNode, so we can transparently substitute the hftp scheme for the hdfs scheme and simply add the link:&nbsp;<em><span>/DC/C’/-&gt