ror, scala, jetty, erlang, thrift, mongrel, comet server, my-sql, memcached, varnish, kestrel(mq), starling, gizzard, cassandra, hadoop, vertica, munin, nagios, awstats
- Scaling Twitter: Making Twitter 10000 Percent Faster
- Scaling Twitter
- Blaine Cook on Scaling Twitter - YouTube
- How Twitter Stores 250 Million Tweets a Day Using MySQL (Korean)
- Twitter on Scala
- DataSift Architecture: Realtime Datamining at 120,000 Tweets Per Second
- Improving Running Components at Twitter
- QCon London 2009: Upgrading Twitter without service disruptions
php (with hiphop compiler), thrift, java(tomcat, jetty, minor), epoll, erlang, tornado, nodejs, my-sql, memcached, hadoop, hbase, hive, scribe(-hdfs), bigpipe, varnish, haystack, cassandra
- "Building for a Billion Users" - YouTube
- Scaling Out
- Facebook Architecture - Stack Overflow
- Facebook Architecture for 600M users
- Facebook Chat
- Scaling the Messages Application Back End
centos, scientific linux, apache, nginx, (move out of)php, scala(selection), ruby, thrift, my-sql, redis, hbase, memcached, gearman, kafka, kestrel, finagle, varnish, ha-proxy, func, capistrano, puppet, jenkins
aws(ec2, s3), ubuntu, cloudfront, python, pylons, paste, tornado, thrift, comet server, memcached, haproxy, nginx
python, django, tornado, node.js, rabbitmq, nginx, haproxy, varnish, memcached, membase, redis, my-sql, mrjob, hadoop(elastic map reduce)
- Pinterest Architecture Update - 18 Million Visitors, 10x Growth,12 Employees, 410 TB of Data
- Polyglot persistence at Pinterest: Redis, Membase, MySQL • myNoSQL
aws(s3, ebs), cloudfront, ubuntu, django(high-cpu extra-large), gunicorn, fabric, gearman, pyapns, twisted, postgre-sql(quadruple extra-large), mdadm(software raid with ebs), repmgr, pgbouncer, redis, memcached, node2dm, munin, pingdom, pagerduty, sentry
- Instagram Architecture Update: What’s new with Instagram?
- What Powers Instagram: Hundreds of Instances, Dozens of Technologies - Instagram Engineering (Korean)
- Tracking Slow Requests with Dogslow
aws(ec2, s3, elb), tornado, scribe, mrjob, node-readability, haproxy, tornado, gae, mapreduce, django(appengine), google-cloud-storage, memcache, redis
aws(ec2, s3, ebs, rds, dynamodb, sdb, sqs, sns, emr, elb, eip, vpc, direct-connect, iam), java(tomcat), mongodb, my-sql, cassandra, hadoop, zookeeper, evcache, asgard, groovy, grails, zuul, priam (and more netflix open source projects)
- Architectural Patterns for High Availability - Netflix
- The Netflix Tech Blog
- 3 shades of latency: How Netflix built a data architecture around timeliness — Tech News and Analysis
linux(2.6), nginx, uwsgi, aws(s3), dotcloud, mysql, redis, celery
ubuntu(12.04), aws(ec2, s3, elb), nginx, werkzeug, flask, postgre-sql, pgpool, memcached, gevent, celery, rabbitmq, fabric, boto, exceptional, flask-exceptional
rabbitmq, celery, phash
ubuntu(12.04), nhn ncloud, django, apache, mod_wsgi, ms-sql, memcached, fabric, south, wand, rsync, py-bcrypt, python-gcm, apns
gae, "For services with fewer than ten million users, Google App Engine is sufficient."
INFRA, PLATFORMS, FRAMEWORKS
- Amazon Web Services, Cloud Computing: Compute, Storage, Database
- Elastic Load Balancing
- Amazon Elastic Compute Cloud(Amazon EC2)
- ELB, Elastic Load Balancing
- Amazon Simple Storage Service (Amazon S3)
- Amazon RDS, Cloud Relational Database Service: MySQL, Oracle, SQL Server
- Amazon Route 53
- Amazon Elastic MapReduce(Amazon EMR)
- Scientific Linux
- nginx: HTTP and reverse proxy server, as well as a mail proxy server
- Werkzeug: WSGI utility library for Python.
- unbit/uwsgi · GitHub: uWSGI application server container
- dotCloud: Deploy, manage and scale any web app.
- Gunicorn: Python WSGI HTTP Server for UNIX.
- geeknam/python-gcm · GitHub: Python client for Google Cloud Messaging for Android (GCM).
- pylons: project that develops web application framework technology in Python, rather than focusing on a single web framework.
- Erlang: programming language used to build massively scalable soft real-time systems with requirements on high availability.
- gevent: coroutine-based Python networking library that uses greenlet to provide a high-level synchronous API on top of the libevent event loop.
- memcached: in-memory key-value store for small chunks of arbitrary data (strings, objects).
- RabbitMQ: Robust messaging for applications.
- Celery: asynchronous task queue/job queue based on distributed message passing.
- Flask: microframework for Python based on Werkzeug.
- HBase: Hadoop database, a distributed, scalable, big data store.
- Varnish: caching HTTP reverse proxy.
- HAProxy: High Performance TCP/HTTP Load Balancer.
- gearman: job queue system, used for long-running, fire-and-forget work.
- Kafka: distributed publish-subscribe messaging system
- robey/kestrel · GitHub: simple, distributed message queue system.
- twitter/finagle · GitHub: fault tolerant, protocol-agnostic RPC system.
- defunkt/starling · GitHub: light weight server for reliable distributed message passing
- tornado: Python web framework and asynchronous networking library.
- Func: secure, scriptable remote control framework.
- evan/mongrel · GitHub: small fast HTTP library and server that runs Rails, Camping, Nitro and Iowa apps.
- jetty: Web server and javax.servlet container, plus support for SPDY, Web Sockets, OSGi, JMX, JNDI, JASPI, AJP and many other integrations.
- pyapns: universal Apple Push Notification Service (APNS) provider.
- node2dm: sending push notifications to Google's C2DM push notification server.
- Groovy: dynamic language for the Java platform
- Grails: full stack, web application framework for the JVM
- PostgreSQL: most advanced open source database.
- repmgr: open source tool suite that helps DBAs and system administrators manage a cluster of PostgreSQL databases.
- pgpool Wiki: middleware that works between PostgreSQL servers and a PostgreSQL database client.
- PgBouncer: lightweight connection pooler for PostgreSQL.
- SQLAlchemy: Python SQL toolkit and Object Relational Mapper that gives application developers the full power and flexibility of SQL.
- South: intelligent schema and data migrations for Django projects.
- twitter/gizzard · GitHub: flexible sharding framework for creating eventually-consistent distributed datastores
- cassandra: used for high velocity writes, and lower velocity reads
- hadoop: process unstructured and large datasets, hundreds of billions of rows.
- vertica: used for analytics and large aggregations and joins so they don't have to write MapReduce jobs. (twitter)
- mrjob: Run MapReduce jobs on Hadoop or Amazon Web Services
- Apache Solr: popular, blazing fast open source enterprise search platform from the Apache Lucene project
- fatcache: Memcache on SSD.
- google-cloud-storage: Store, access and manage your data on Google’s storage infrastructure. Take advantage of the scale and efficiency we have built over the years.
- haystack: Facebook photo Infrastructure.
- Netflix/EVCache: distributed in-memory data store for the cloud.
- GAE, Google App Engine: Lets you run web applications on Google's infrastructure. App Engine applications are easy to build, easy to maintain, and easy to scale as your traffic and data storage needs grow. With App Engine, there are no servers to maintain
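
Several entries above (memcached, EVCache, fatcache) are key-value caches that usually sit in front of a slower database. As a rough illustration of that read path, here is a minimal cache-aside sketch. It uses no real client library: `TTLCache` and `cached_fetch` are hypothetical names, and the dict-backed store only stands in for a memcached-style server.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Hypothetical in-process stand-in for a memcached-style key-value store."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # key -> (expiry time, value)
        self._data: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped lazily, on read.
            del self._data[key]
            return None
        return value

    def set(self, key: str, value: str, ttl: float) -> None:
        self._data[key] = (self._clock() + ttl, value)


def cached_fetch(
    cache: TTLCache,
    key: str,
    load: Callable[[str], str],
    ttl: float = 60.0,
) -> str:
    """Cache-aside read: try the cache first, fall back to the slow source."""
    value = cache.get(key)
    if value is None:
        value = load(key)           # e.g. a database query (assumed, not shown)
        cache.set(key, value, ttl)  # populate the cache for later readers
    return value
```

With a real deployment the `TTLCache` object would be replaced by a network client, but the control flow in `cached_fetch` stays the same: a miss costs one load plus one write, and later reads within the TTL skip the database.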
DEPLOY, MONITORING, UTILITIES
- Fabric: library and command-line tool for streamlining the use of SSH for application deployment or systems administration tasks.
- boto/boto · GitHub: python interface to aws
- capistrano/capistrano · GitHub: Remote multi-server automation tool.
- puppetlabs/puppet · GitHub: Server automation framework and application.
- Exceptional: tracks errors in web apps, reports them in real time, and gathers the info you need to fix them fast.
- jzempel/flask-exceptional · GitHub: Exceptional extension for Flask.
- rsync: utility that provides fast incremental file transfer.
- Nagios: The Industry Standard in IT Infrastructure Monitoring
- Munin: networked resource monitoring tool that can help analyze resource trends
- AWStats: powerful and featureful tool that generates advanced web, streaming, ftp or mail server statistics, graphically
- pingdom: Website monitoring. Monitor your server and network uptime and performance for free.
- sentry: realtime error logging and aggregation platform
- pagerduty: SaaS IT on-call schedule management, alerting and incident tracking.
- scribe: server for aggregating log data that's streamed in realtime from clients. It is designed to be scalable and reliable
- Sphinx: tool that makes it easy to create intelligent and beautiful documentation
- pHash.org: perceptual hash library
- dahlia/wand · GitHub: The ctypes-based simple ImageMagick binding for Python.
- py-bcrypt: strong password hashing for Python.
- mdadm: manage MD devices aka Linux Software RAID.
- twitter/snowflake · GitHub: network service for generating unique ID numbers at high scale with some simple guarantees
- node-readability: Server side readability with node.js
- Netflix/asgard: Web interface for application deployments and cloud management in Amazon Web Services (AWS)
- Netflix/Priam: Co-Process for backup/recovery, Token Management, and Centralized Configuration management for Cassandra.
- Netflix/zuul: edge service that provides dynamic routing, monitoring, resiliency, security, and more.
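
The snowflake entry above describes a service that hands out unique IDs at high scale. The sketch below shows one way such IDs can be composed, assuming the commonly described layout of a millisecond timestamp, a 10-bit worker number, and a 12-bit per-millisecond sequence. It is not Twitter's code: `IdGenerator` is a hypothetical name, the epoch constant is an assumption, and the single worker field is a simplification.

```python
import threading
import time
from typing import Callable

EPOCH_MS = 1288834974657   # custom epoch in ms (assumed value for this sketch)
WORKER_BITS = 10           # assumed width of the worker field
SEQUENCE_BITS = 12         # assumed width of the per-millisecond counter
MAX_WORKER = (1 << WORKER_BITS) - 1
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


def _now_ms() -> int:
    return int(time.time() * 1000)


class IdGenerator:
    """Hypothetical Snowflake-style generator: timestamp | worker | sequence."""

    def __init__(self, worker_id: int, clock: Callable[[], int] = _now_ms) -> None:
        if not 0 <= worker_id <= MAX_WORKER:
            raise ValueError("worker_id out of range")
        self._worker_id = worker_id
        self._clock = clock
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self) -> int:
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                # Refuse to issue IDs that could collide with earlier ones.
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Counter exhausted for this millisecond: wait for the next.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                | (self._worker_id << SEQUENCE_BITS)
                | self._sequence
            )
```

Because the timestamp occupies the high bits, IDs from one worker sort roughly by creation time, and distinct worker numbers keep generators on different machines from colliding without any coordination between them.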