faressoft/scalability.md

## scalability.md

      
    Raw
  

              scalability.md
            
          
    Instead of writing a long article I wrote this summary to give a good overview about software systems scalability by listing the main terms and methodologies that widely used to build stable systems that are able to scale and handle the increasing amont of users and data.
Index


Scalability
Terms System Quality Attributes
Failover
Load Balancer
MapReduce
Caching
PreProcessing
Database Denormalization And NoSQL
Partitioning

Vertical Partitioning
Horizontal Partitioning
Implementation
Drawbacks


Redis
MySQL
HAProxy

Scalability


Vertical scaling:

Increase resources of a specific node.
Like adding additioanl memory or disk space.
Easier but limited.


Horizontal scaling:

Increase the number of nodes.


Terms System Quality Attributes


Fault Tolerance:

Setup or configuration that prevents a computer or a network device from failing in the event of an unexpected problem or an error.
How to design for fault tolerance:

Power Failure.
Data loss: run backup.
Device or Computer failure: failover.
Unauthorized access: setup firewall.
Overload: load balancers.


Availability:

Uptime percentage.
The percentage of time the system is operational uptime/totalTime * 100.


Reliability:

The probability that the system is operational for a certain unit of time.
The ability of a system to perform its required functions for a specified time.


Scalability:

The capability of a system to handle a growing amount of work.


Partition Tolerance:

Cluster continues to function even if there is a partition (communications break) between two nodes.
Both nodes are up, but can’t communicate.


Consistency:

The data are same accross the cluster.


Failover


The capability of a system to automatically switch to a backup server when the main server is failed.
If the main server fails, a secondary server will take its place automatically.
When we design such system we need a fast way to switch to the secondary server like using Virtual IP Address (VIP).
Some helpful tools to design a failover system:

Heartbeat
Corosync and Pacemaker
Keepalived


Load Balancer


Distrubute the load among cloned servers that have the same code and access to the same data.
DNS, the basic load balancing mechanisms, via the A records.
HAProxy is used to load balance TCP (layer 4) or HTTP (layer 7) protocol.
Nginx has several load balancing mechanisms:

RoundRobin

Distrubute the requests in round robin fashion among the servers.
If the response time of different requests are almost equal it a good option.
We can assign weight for each server in the load balancer if some servers have higher resrouces than others.


LeastConnected

Next request is assigned to the server with the least number of active connections.
If the response time of different requests are vary alot, it a good option.


IPHash

If we have statefull servers (Sticky Sessions).
A hash-function is used to determine what server should be selected for the next request (based on the client’s IP address).


MapReduce


To process big data in parallel on multiple nodes in a scallable way.
Consists of two steps:

Map:

Takes some data and emit <key, value> pair.


Reduce:

Takes a key and a set of associated values.
Reduces them in some way, emitting a new key and value.
The result might be fed back into the reduce program for more reducing.


Caching


An in-memory key-value pairing.
It first checks the cache, if doesn't contain the data, lookup the data in the data store.
Might cache a query and its results.

PreProcessing


PreRendering the pages.
Queue jobs to be done to update some parts of the website.

Database Denormalization And NoSQL


Adding redundant informations into the database speed up the reads.
Designed to scale better.
Databases without the need of joins or NoSQL database that doesn't have joins.

Partitioning


Splitting the data across multiple instances.
Ensure you have a way to figure out which data is on which machine.
Is a generic term for splitting data across multiple instances.

Vertical Partitioning:


Schema-wise.
Splitting columns into multiple tables with fewer columns, like normalization.
Partitioning by object type or feature.
Example: D1: profiles, D2: users, D3: books.
Drawbacks: If one of these tables gets very large, it might need different partitioning scheme.

Horizontal Partitioning:


Row-wise.
Splitting rows into multiple tables.
Called sharding.
Methods:

Hash Partitioning:

Partitioning by mapping keys into instances using a hash function like CRC32 and mod of the number of instances.
$instance = hash($key) % $instancesCount.
Drawbacks:

The number of instances are fixed.


Consistent Hashing:

Easier to add more instances.


Directory Partitioning:

Partitioning by maintaining a lookup table for where the data can be found.
Easy to add additioanl servers.
Drawbacks:

Lookup table can be a single point of failure.
Constantly accessing the table impacts the performance.


Implementation

Partitioning can be the responsibility of different parts of a software stack.

ClientSide

The clients directly select the right node where to write or read a given key.
Like Predis for PHP.


Proxy

Clients send requests to a proxy.
Forward the request to the right instance.
Sends the replies back to the client.
Like Twemproxy for redis, developed at Twitter.
Pipelining

Proxying multiple client connections onto one or few server connections.
Batch requests from different clients and send them as a single message to save time.


ServerSide

Query routing.
Send your query to a random instance.
The instance will forward the query to the right instance.
Or the client gets redirected to the right node like Redis Cluster.


Drawbacks


Operations involving multiple keys are usually not supported.
Transactions involving multiple keys can not be used.
Complex to take a backup.
Complex to add/remove instances.

Redis


Vertical Partitioning

Splitting keyspaces into multiple instances.


Horizontal Partitioning

Client Sharding

Use Pre-Sharding to add and remove instances.


Redis Cluster

Mix between query routing and client side partitioning.
Supports rebalancing the data with the ability to add/ and remove instances at runtime.
Launched on April 2015.


Twemproxy

Use Pre-Sharding to add and remove instances.
It is not a single point of failure since you can start multiple proxies and instruct your clients to connect to the first that accepts the connection.


Master-Slave Replication.

MySQL


Master-Slave Replication (Load).

Load-balancer between the slaves.
Read/Write master.
Read-only slaves.
Pros:

Speeds up reads.


Cons:

Doesn't speed up the writes.
Replication lag.


Master-Master Cluster (Load).

Load-balancer between two instances.
Each instance acts as a slave for the other instnace.
Set auto_increment_increment=1 on the first instance.
Set auto_increment_increment=2 on the second instance.
Pros:

Speeds up writes and reads.


HAProxy


Configuration process involves 3 major sources of parameters:

The arguments from the command-line, which always take precedence.
The global section, which sets process-wide parameters.
The proxies sections which can take form of

defaults section sets default parameters for all other sections following its declaration.
frontend describes a set of listening sockets accepting client connections.
backend describes a set of servers to which the proxy will connect to forward incoming connections.
listen describes a complete proxy with its frontend and backend.


Proxy modes:

tcp (layer 4).
http (layer 7).


global
    maxconn 3000

defaults
    mode tcp

frontend main
    bind *:80
    default_backend app

backend app
    balance roundrobin
    server app1 127.0.0.1:5001 check
    server app2 127.0.0.1:5002 check
    server app3 127.0.0.1:5003 check