System Design

System design is about — Architecture + Data + Applications

Architecture means how you put the different functional blocks of a system together and make them work seamlessly with each other, after taking into account all the points where a sub-system can fail or stop working.


Data means what the input is, what the data processing blocks are, how to store petabytes of data and, most importantly, how to process it and produce the desired output.


Applications means how the different applications attached to that large back-end system will use the data once it has been processed and the output is ready.


Why System Design?

  • Reliability: It’s measured in terms of robustness and the failure-and-effects analysis of the sub-systems of the larger system.
  • Scalability: It’s measured in terms of resource allocation, utilization, serviceability and availability.
  • Availability: It’s measured by the percentage of time in a given period that a system is available to perform its assigned task without any failure(s).
  • Efficiency: It’s measured in terms of reusability of components, simplicity and productivity.

And all of it boils down to three things — Cost, Speed and Failures.

Components

API

Read API -

  • The Read API in system design is responsible for retrieving data from the system’s database and returning it to the requesting client.
    • Accepting input
    • Retrieving data
    • Formatting data
    • Returning data
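
As a rough sketch of these responsibilities, a minimal read endpoint in Flask could look like the following (the route, data, and names are made up for illustration):

from flask import Flask, jsonify

app = Flask(__name__)

# In-memory stand-in for a real database lookup
ARTICLES = {1: {'id': 1, 'title': 'Hello'}, 2: {'id': 2, 'title': 'World'}}

@app.route('/articles/<int:article_id>', methods=['GET'])
def get_article(article_id):
    article = ARTICLES.get(article_id)   # accept input and retrieve the data
    if article is None:
        return jsonify({'error': 'Not found'}), 404
    return jsonify(article)              # format and return the data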

Write API -

  • A Write API in system design refers to an API that allows clients to modify or update data in the system. It is often used in systems that require a level of data persistence or require data to be updated in real-time.
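
A matching write endpoint might look like this minimal sketch (again, the route and field names are illustrative, and a real system would persist to a database rather than a dictionary):

from flask import Flask, request, jsonify

app = Flask(__name__)
ARTICLES = {}

@app.route('/articles', methods=['POST'])
def create_article():
    data = request.get_json()
    if not data or 'title' not in data:
        return jsonify({'error': 'title is required'}), 400   # validate the input
    article_id = len(ARTICLES) + 1
    ARTICLES[article_id] = {'id': article_id, 'title': data['title']}  # persist the write
    return jsonify(ARTICLES[article_id]), 201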

Search API

  • The Search API is a crucial component of many systems, as it allows users to find information quickly and efficiently.
  • When designing a search API, it’s important to consider the types of queries that users will be performing, as well as the data structures and algorithms that will be used to retrieve results.
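
As a toy illustration, a search endpoint can be sketched like this; it simply scans an in-memory list, whereas a real system would delegate to an inverted index or a search engine (see the full text search service below):

from flask import Flask, request, jsonify

app = Flask(__name__)
DOCUMENTS = [{'id': 1, 'text': 'system design basics'},
             {'id': 2, 'text': 'designing a search api'}]

@app.route('/search', methods=['GET'])
def search():
    query = request.args.get('q', '').lower()
    results = [d for d in DOCUMENTS if query and query in d['text'].lower()]
    return jsonify(results)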

Services

User Info Service

  • Creating new user accounts
  • Authenticating users and validating their credentials
  • Retrieving user profile information (e.g. name, email address, profile picture, etc.)
  • Updating user profile information
  • Deleting user accounts
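
A bare-bones, in-memory sketch of this interface (method and field names are illustrative; a real service would hash passwords and persist users in a database):

import uuid

class UserInfoService:
    def __init__(self):
        self.users = {}

    def create_user(self, name, email, password_hash):
        user_id = str(uuid.uuid4())
        self.users[user_id] = {'name': name, 'email': email, 'password_hash': password_hash}
        return user_id

    def authenticate(self, user_id, password_hash):
        user = self.users.get(user_id)
        return user is not None and user['password_hash'] == password_hash

    def get_profile(self, user_id):
        return self.users.get(user_id)

    def update_profile(self, user_id, **fields):
        self.users[user_id].update(fields)

    def delete_user(self, user_id):
        self.users.pop(user_id, None)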

User Graph Service

  • A user graph service is a component in a system design that manages and maintains the relationships between users in a network. It is responsible for storing and updating information about user connections, such as friend lists, followers, or other types of relationships.
  • This service is commonly used in social networks, online marketplaces, and other systems that require social connections between users.
    • We define a User class using SQLAlchemy to represent a user in the system.
    • The User class has a friends relationship that points to other users who are friends of this user.
    • We also define a friendships table that stores the many-to-many relationship between users. This table is used to implement the friends relationship of the User class.
    • The API provides several endpoints for managing users and their connections.
      • The GET /users endpoint returns a list of all users in the system.
      • The GET /users/{id} endpoint returns information about a specific user, including their name and friend list.
      • The POST /users endpoint creates a new user in the system.
      • Finally, the POST /users/{id}/friends endpoint adds a new friend to a user's friend list.
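
The data model described above can be sketched in SQLAlchemy (1.4+-style) roughly as follows; the association-table and column names are illustrative, and the API endpoints would be built on top of it:

from sqlalchemy import Column, Integer, String, Table, ForeignKey, create_engine
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()

# Many-to-many friendships table between users
friendships = Table(
    'friendships', Base.metadata,
    Column('user_id', Integer, ForeignKey('users.id'), primary_key=True),
    Column('friend_id', Integer, ForeignKey('users.id'), primary_key=True),
)

class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    # Self-referential many-to-many: a user's friends are other User rows
    friends = relationship(
        'User',
        secondary=friendships,
        primaryjoin=id == friendships.c.user_id,
        secondaryjoin=id == friendships.c.friend_id,
    )

engine = create_engine('sqlite:///:memory:')
Base.metadata.create_all(engine)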

Timeline Service

  • A timeline service in system design is responsible for managing and presenting a chronological list of events or activities.
  • Implementation of a timeline service:
    • Data model: Define the data model for the timeline. This includes the events or activities to be displayed, along with any associated metadata such as timestamps and user IDs.
    • Database schema: Create a database schema to store the data model. This can be a relational database such as MySQL or PostgreSQL, or a NoSQL database such as MongoDB or Cassandra.
    • API endpoints: Define the API endpoints for the timeline service. This includes endpoints for creating, updating, and deleting events, as well as endpoints for retrieving the events in chronological order.
    • Event processing: Define the logic for processing events. This includes verifying user permissions, validating input data, and performing any necessary transformations on the data.
    • Presentation: Define the logic for presenting the events to users. This includes sorting the events in chronological order, applying any filters or search queries, and formatting the data for display.
  • Methods
    • get_timeline()
    • create_event()
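
A minimal in-memory sketch of these two methods (a real service would back them with the database schema and event processing described above):

import time

class TimelineService:
    def __init__(self):
        self.events = []   # in a real system this would be a database table

    def create_event(self, user_id, payload):
        event = {'user_id': user_id, 'payload': payload, 'timestamp': time.time()}
        self.events.append(event)
        return event

    def get_timeline(self, user_id, limit=20):
        user_events = [e for e in self.events if e['user_id'] == user_id]
        # newest first, as a timeline is usually presented
        return sorted(user_events, key=lambda e: e['timestamp'], reverse=True)[:limit]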

Notification Service

  • A notification service in system design is a mechanism that allows a system to send notifications to users or other systems about events or changes in the system.
  • It is often used in systems that require real-time updates or event-driven architectures. In order to implement a notification service, you will need to define the message format, the delivery mechanism, and the logic for handling incoming notifications.
    • We can define an API endpoint for sending notifications with a channel and message payload.
    • When a POST request is made to /notifications, we publish the message to the Redis Pub/Sub system using the specified channel.
    • We also define a function for handling incoming notifications. This function simply prints the channel and message payload to the console, but in a real system, you would likely use this function to trigger some kind of action or update in the system.
    • Finally, we subscribe to the ‘notifications’ channel using the Redis pubsub() method and start listening for messages. When a message is received, the handle_notification() function is called with the channel and message payload as arguments.
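
A compressed sketch of that flow using Flask and Redis Pub/Sub (endpoint and channel names are illustrative):

import redis
from flask import Flask, request, jsonify

app = Flask(__name__)
redis_conn = redis.Redis()

@app.route('/notifications', methods=['POST'])
def send_notification():
    data = request.get_json()
    # publish the message payload to the requested Pub/Sub channel
    redis_conn.publish(data['channel'], data['message'])
    return jsonify({'status': 'sent'})

def handle_notification(channel, message):
    # in a real system this would trigger an action or update in the system
    print(f'{channel}: {message}')

# In a background worker or separate process:
# pubsub = redis_conn.pubsub()
# pubsub.subscribe('notifications')
# for msg in pubsub.listen():
#     if msg['type'] == 'message':
#         handle_notification(msg['channel'], msg['data'])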

Full text search service

  • A full text search service in system design is a mechanism that allows a system to search for and retrieve documents or other data based on keyword or phrase searches.
  • It is often used in systems that have large amounts of unstructured or semi-structured data, such as text documents or web pages. In order to implement a full text search service, you will need to define the data model and indexing strategy, the search API, and the logic for querying and retrieving results.
    • We define a data model for a document object with an id, title, and content. We also define a function for indexing documents in the Elasticsearch search engine.
    • When a GET request is made to /search?q=, we use the Elasticsearch search API to search for documents containing the specified query in either the title or content fields.
    • We then construct a list of Document objects from the search results and return them as a JSON response.
    • Finally, we index some example documents in the Elasticsearch index when the application starts up.
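
An abbreviated sketch of that flow, assuming an Elasticsearch instance at localhost:9200 and 8.x-style Python client calls (index and field names are illustrative):

from elasticsearch import Elasticsearch
from flask import Flask, request, jsonify

app = Flask(__name__)
es = Elasticsearch('http://localhost:9200')

def index_document(doc_id, title, content):
    es.index(index='documents', id=doc_id, document={'title': title, 'content': content})

@app.route('/search', methods=['GET'])
def search():
    query = request.args.get('q', '')
    # match the query against both the title and content fields
    resp = es.search(index='documents', query={
        'multi_match': {'query': query, 'fields': ['title', 'content']}
    })
    hits = [{**hit['_source'], 'id': hit['_id']} for hit in resp['hits']['hits']]
    return jsonify(hits)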

Caching Service

  • Caching is a technique used to store frequently accessed data in a temporary storage area so that it can be retrieved faster when required. It is commonly used in system design to improve performance and reduce latency.
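
A minimal read-through caching sketch using Redis: check the cache first, fall back to the slower source on a miss, then populate the cache with a TTL. The load_product_from_db() helper is hypothetical and stands in for an expensive database query:

import json
import redis

cache = redis.Redis()

def get_product(product_id, ttl_seconds=300):
    key = f'product:{product_id}'
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)                        # cache hit
    product = load_product_from_db(product_id)           # hypothetical slow database call
    cache.set(key, json.dumps(product), ex=ttl_seconds)  # cache the result with a TTL
    return product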

Fan Out Service

  • It’s the process of distributing requests or messages from a single source to multiple destinations.
  • This is a common pattern in distributed systems, where a service or application needs to send information to multiple downstream systems.
  • One way to implement a fan out service is to use a message queue. The service would receive requests or messages from the source system, and then publish those messages to a message queue. Each downstream system would then subscribe to the message queue and receive a copy of the message.
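
A small sketch of fan-out using Redis Pub/Sub, where one publish reaches every subscriber; the channel name is illustrative, and a durable queue such as RabbitMQ or Kafka is common in practice:

import redis

redis_conn = redis.Redis()

def fan_out(message):
    # every downstream system subscribed to this channel receives a copy
    redis_conn.publish('orders.events', message)

# Each downstream service runs something like:
# pubsub = redis.Redis().pubsub()
# pubsub.subscribe('orders.events')
# for msg in pubsub.listen():
#     if msg['type'] == 'message':
#         process(msg['data'])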

Components

Memory Cache

A memory cache is a system component that stores frequently accessed data in memory for fast access. It can be used to improve the performance of a system by reducing the number of expensive database queries or other resource-intensive operations.

  • Storing and retrieving data from memory
  • Expiring or invalidating cached data after a certain amount of time or under certain conditions
  • Maintaining cache consistency and preventing data corruption
  • Managing cache capacity to avoid running out of memory
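
A tiny in-process sketch of such a cache, with TTL-based expiry and a simple capacity bound (eviction policy and thread safety are intentionally left out):

import time
from collections import OrderedDict

class MemoryCache:
    def __init__(self, capacity=1000, ttl=60):
        self.capacity = capacity
        self.ttl = ttl
        self.store = OrderedDict()

    def set(self, key, value):
        if len(self.store) >= self.capacity:
            self.store.popitem(last=False)     # evict the oldest entry to stay within capacity
        self.store[key] = (value, time.time() + self.ttl)

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at < time.time():
            del self.store[key]                # expired: invalidate the entry
            return None
        return value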

DNS (Domain Name System)

DNS (Domain Name System) is a system used to translate human-readable domain names into IP addresses. It acts as a directory for the internet, allowing users to easily access websites and other internet services without needing to remember IP addresses.

In a system, a DNS server would be responsible for managing the following tasks:

  • Resolving domain names into IP addresses
  • Caching resolved IP addresses to improve performance
  • Load balancing between multiple IP addresses for a given domain name
  • Handling domain name registration and management
import socket
import time
class DNSResolver:
    def __init__(self, ttl=60):
        self.cache = {}
        self.ttl = ttl
        
    def resolve(self, domain):
        if domain in self.cache:
            ip, expiration = self.cache[domain]
            if expiration > time.time():
                return ip
        
        ips = socket.getaddrinfo(domain, None)
        if not ips:
            return None
        
        ip = ips[0][4][0]
        self.cache[domain] = (ip, time.time() + self.ttl)
        return ip
        
    def clear_cache(self):
        self.cache.clear()

Real-time Message Ingestion

Real-time message ingestion is the process of capturing and processing messages as they are generated in real-time. This is a common requirement for many systems that handle real-time data, such as chat applications, social media platforms, and IoT systems.

  • Capturing messages as they are generated
  • Storing messages in a way that allows for efficient retrieval and querying
  • Processing messages in real-time to perform actions such as notifications or real-time analytics
  • Scaling to handle high message volume and ensure high availability

Publisher

import redis

redis_conn = redis.Redis()

while True:
    message = input('Enter a message: ')
    redis_conn.publish('chat', message)

Subscriber

import redis

redis_conn = redis.Redis()

pubsub = redis_conn.pubsub()
pubsub.subscribe('chat')

while True:
    message = pubsub.get_message()
    if message and message['type'] == 'message':
        print('Received: {0}'.format(message['data']))

Load Balancer

A load balancer is a system component that distributes incoming network traffic across multiple servers to improve performance, reliability, and availability.

In a system, a load balancer would be responsible for managing the following tasks:

  • Monitoring server health and availability
  • Distributing traffic across available servers
  • Managing session persistence (i.e. ensuring that subsequent requests from a client are routed to the same server)
  • Handling SSL termination (i.e. decrypting encrypted traffic and forwarding it to the appropriate server)
import random
class LoadBalancer:
    def __init__(self, servers):
        self.servers = servers
        self.num_servers = len(servers)
        self.server_statuses = [True] * self.num_servers
        
    def choose_server(self):
        available_servers = [i for i in range(self.num_servers) if self.server_statuses[i]]
        if not available_servers:
            return None
        return random.choice(available_servers)
        
    def handle_request(self, request):
        server_index = self.choose_server()
        if server_index is None:
            raise Exception("No available servers")
        server = self.servers[server_index]
        return server.handle_request(request)
        
    def set_server_status(self, server_index, status):
        self.server_statuses[server_index] = status        

Content Delivery Network (CDN)

A Content Delivery Network (CDN) is a distributed network of servers that helps to deliver content faster and more efficiently to end-users by caching content closer to the user’s location.

A CDN can be used in a system design to improve website performance and reliability by reducing latency, improving website availability and scalability, and reducing bandwidth costs.

Implementation of CDN using Amazon CloudFront, one of the popular CDN services available:

  • Sign up for an AWS account and create a new CloudFront distribution:
aws cloudfront create-distribution \
    --origin-domain-name example.com \
    --default-root-object index.html \
    --enabled
  • Set up your origin server(s) by creating an Amazon S3 bucket or an EC2 instance as your origin server.
  • Create a new DNS record for your distribution using your preferred DNS provider (e.g., Amazon Route 53) and point it to your CloudFront distribution.
  • Test your CDN by accessing your website using the new DNS record. You should see improved performance and faster load times.


Job Server

A job server in system design is responsible for managing and executing background jobs or tasks asynchronously. It can be used in various applications such as web scraping, data processing, or batch processing.

The implementation of a job server can be done using the following steps:

  • Job queue: Define a job queue to store the jobs to be executed. This can be a message queue such as RabbitMQ, a database table, or a simple in-memory data structure such as a list.
  • Job producer: Define a job producer to add jobs to the job queue. This can be a user interface, a REST API endpoint, or any other application logic that generates jobs.
  • Job consumer: Define a job consumer to fetch jobs from the job queue and execute them. This can be a worker process that runs continuously in the background, or a scheduler that runs periodically to check for new jobs.
  • Job executor: Define the logic for executing the jobs. This includes processing input data, performing the necessary calculations or operations, and generating output data.
from flask import Flask, request, jsonify
from queue import Queue
import threading

app = Flask(__name__)
job_queue = Queue()
def worker():
    while True:
        job = job_queue.get()
        # execute the job logic here
        print(f"Executing job {job['id']} with data {job['data']}")
        job_queue.task_done()
@app.route('/job', methods=['POST'])
def create_job():
    # get the request data
    data = request.get_json()
    # validate the data
    if 'data' not in data:
        return jsonify({'error': 'Invalid data'}), 400
    # add the job to the job queue
    job_id = job_queue.qsize() + 1
    job = {'id': job_id, 'data': data['data']}
    job_queue.put(job)
    # return the ID of the created job
    return jsonify({'id': job_id}), 201
if __name__ == '__main__':
    # start the worker threads
    for i in range(4):
        t = threading.Thread(target=worker)
        t.daemon = True
        t.start()
    # start the Flask app
    app.run(debug=True)
  • In this implementation, the job server provides a single endpoint create_job() that accepts job data as input and adds it to the job queue.
  • The worker() function runs continuously in the background, fetching jobs from the queue and executing them. To start the worker threads, the __name__ == '__main__' block creates four worker threads and starts them in daemon mode. This ensures that the threads will exit automatically when the main thread terminates.
  • When a job is added to the job queue, it is assigned an ID and a dictionary object with the job data. The worker thread fetches the job from the queue, extracts the job data, and executes the job logic.
  • After processing the job data, the worker thread calls task_done() to signal that the job has been completed.

Workers

Workers in system design refer to background processes that perform tasks asynchronously and independently of the main system flow. They are often used to perform tasks that are time-consuming or resource-intensive, such as processing large files, sending emails, or performing data analysis.

App servers

An application server in system design is a server that provides a platform for running and managing applications, often in a distributed or scalable environment.

It typically includes features such as load balancing, caching, and session management, and is often used to deploy web applications or APIs.

from flask import Flask, jsonify
import os
app = Flask(__name__)
# Define an API endpoint
@app.route('/api', methods=['GET'])
def hello():
    return jsonify({'message': 'Hello, world!'})
if __name__ == '__main__':
    # Get the port number from the environment variable, or use a default value
    port = int(os.environ.get('PORT', 5000))
    # Run the application using Flask's built-in development server (use Gunicorn in production)
    app.run(host='0.0.0.0', port=port)

In this implementation, we define a Flask app with a single API endpoint that returns a JSON response, and run it on a specified port with Flask's built-in development server. We get the port number from an environment variable, which allows us to easily configure the port when deploying the app. In production, the app would typically be served by the Gunicorn WSGI HTTP server instead:

gunicorn app:app

This command tells Gunicorn to run the app module (which contains our Flask app) and the app variable (which is the name of our Flask app). Gunicorn will start up multiple worker processes to handle incoming requests, and will handle load balancing and other features automatically.

Queues

Queues are an important part of many system designs, allowing you to process tasks and messages in a distributed or asynchronous manner.

import queue
import time
import threading

# Define a function that will process items from the queue
def process_queue(q):
    while True:
        item = q.get()
        print(f'Processing item {item}...')
        time.sleep(1)
        q.task_done()
# Create a queue and start the worker threads
q = queue.Queue()
for i in range(5):
    t = threading.Thread(target=process_queue, args=(q,))
    t.daemon = True
    t.start()
# Add some items to the queue
for i in range(10):
    q.put(i)
# Wait for the queue to be processed
q.join()
print('All items processed')

Object storage

Object storage is a method of storing and retrieving unstructured data in the form of objects. It is often used to store large amounts of data, such as media files, and can be accessed over a network using APIs.

import hashlib
import os

class ObjectStore:
    def __init__(self, storage_directory):
        self.storage_directory = storage_directory
    def put(self, data):
        """
        Store an object in the object store
        """
        # generate a unique filename for the object
        filename = hashlib.md5(data).hexdigest()
        # write the object data to a file
        with open(os.path.join(self.storage_directory, filename), 'wb') as f:
            f.write(data)
        return filename
    def get(self, filename):
        """
        Retrieve an object from the object store
        """
        # read the object data from a file
        with open(os.path.join(self.storage_directory, filename), 'rb') as f:
            data = f.read()
        return data
# example usage
object_store = ObjectStore('/path/to/storage/directory')
data = b'This is some test data'
filename = object_store.put(data)
retrieved_data = object_store.get(filename)
print(retrieved_data)

Authentication

Authentication is the process of verifying the identity of a user, typically by requiring them to provide credentials such as a username and password.

from flask import Flask, request, jsonify

app = Flask(__name__)
# mock user data for demonstration purposes
users = {
    "alice": {
        "password": "password123",
        "email": "alice@example.com"
    },
    "bob": {
        "password": "password456",
        "email": "bob@example.com"
    }
}
# authentication endpoint
@app.route('/authenticate', methods=['POST'])
def authenticate():
    username = request.json.get('username')
    password = request.json.get('password')
    if username in users and password == users[username]['password']:
        # authentication successful
        return jsonify({'success': True, 'email': users[username]['email']})
    else:
        # authentication failed
        return jsonify({'success': False, 'message': 'Invalid username or password'}), 401
# example usage
response = app.test_client().post('/authenticate', json={'username': 'alice', 'password': 'password123'})
print(response.get_json())

Batch processing

Batch processing is a method of processing a large number of input items as a single batch, typically for efficiency reasons. It is often used in systems that need to process large amounts of data, such as data pipelines or ETL (Extract, Transform, Load) processes.

import multiprocessing

def process_batch(batch):
    """
    Process a single batch of input items
    """
    # process the input items in the batch
    result = []
    for item in batch:
        result.append(item.upper())
    return result
def batch_process(input_data, batch_size=10):
    """
    Process the input data in batches
    """
    # split the input data into batches
    batches = [input_data[i:i+batch_size] for i in range(0, len(input_data), batch_size)]
    # process each batch using multiprocessing
    with multiprocessing.Pool() as pool:
        results = pool.map(process_batch, batches)
    # flatten the results and return them
    return [item for batch_result in results for item in batch_result]
# example usage
input_data = ['apple', 'banana', 'cherry', 'date', 'elderberry', 'fig', 'grape', 'honeydew', 'kiwi', 'lemon']
output_data = batch_process(input_data, batch_size=3)
print(output_data)

Cloud storage

Cloud storage is a method of storing data on a remote server or a network of servers that are accessible over the internet. It is commonly used in systems that require scalable, flexible, and reliable storage solutions. Here is how cloud storage can be used from Python with the boto3 library to interact with Amazon S3:

import boto3
# create an S3 client
s3 = boto3.client('s3')
# create a new bucket
bucket_name = 'my-bucket'
s3.create_bucket(Bucket=bucket_name)
# upload a file to the bucket
file_name = 'example.txt'
with open(file_name, 'rb') as f:
    s3.upload_fileobj(f, bucket_name, file_name)
# download a file from the bucket
with open('downloaded.txt', 'wb') as f:
    s3.download_fileobj(bucket_name, file_name, f)
# list the contents of the bucket
response = s3.list_objects_v2(Bucket=bucket_name)
for item in response['Contents']:
    print(item['Key'])

Stream processing

Stream processing is a critical component of many modern system designs, allowing applications to process large volumes of data in real-time. In stream processing, data is processed as it is generated, allowing for faster analysis and insights.

Implementation of stream processing using Python and Apache Kafka:

  • Set up Apache Kafka using the Confluent platform, which provides a Python client for interacting with Kafka:
from confluent_kafka import Producer, Consumer
conf = {
    'bootstrap.servers': 'my-kafka-broker:9092',
    'client.id': 'my-client-id'
}
producer = Producer(conf)
consumer = Consumer({
    'bootstrap.servers': 'my-kafka-broker:9092',
    'group.id': 'my-consumer-group',
    'auto.offset.reset': 'earliest'
})
  • Create a Kafka topic to store the incoming stream of data:
producer.produce('my-topic', key='key', value='value')
producer.flush()
  • Set up a Kafka consumer to process the incoming stream of data:
consumer.subscribe(['my-topic'])
while True:
    msg = consumer.poll(1.0)
    if msg is None:
        continue
    if msg.error():
        print(f"Consumer error: {msg.error()}")
        continue
    print(f"Received message: {msg.value().decode('utf-8')}")
  • Process the incoming stream of data using Python’s data processing libraries like pandas, numpy, or scikit-learn:
import pandas as pd
df = pd.read_json(msg.value())
# perform data processing on the DataFrame
  • Write the processed data to an output stream or database:
import sqlite3
conn = sqlite3.connect('my-db')
c = conn.cursor()
# create table if it doesn't exist
c.execute('''CREATE TABLE IF NOT EXISTS my_table
             (column1 text, column2 text, column3 text)''')
# insert data into table
c.execute(f"INSERT INTO my_table VALUES (?, ?, ?)", (val1, val2, val3))
conn.commit()
# close database connection
conn.close()

In summary, stream processing using Python and Apache Kafka is a powerful tool for system design, allowing for the real-time processing of large volumes of data. The above code shows how to set up a Kafka producer to send data to a topic, a consumer to read data from the topic, process the data using data processing libraries, and write the results to an output stream or database.

Master-Slave

Master-Slave is a common pattern used in system design to improve the scalability, performance, and availability of applications. In this pattern, there is one master node that handles write requests and one or more slave nodes that handle read requests.

Here’s an implementation of the Master-Slave pattern using Python and Redis:

  • Set up Redis using the redis-py library:
import redis
# Connect to Redis as the master node
master = redis.Redis(host='my-master-redis-host', port=6379)
# Connect to Redis as the slave node(s)
slave1 = redis.Redis(host='my-slave-redis-host-1', port=6379)
slave2 = redis.Redis(host='my-slave-redis-host-2', port=6379)
  • Write data to the master node using the set() method:
master.set('my-key', 'my-value')
  • Read data from the slave node(s) using the get() method:
# read from slave1
value = slave1.get('my-key')
# if value is None, try reading from slave2
if value is None:
    value = slave2.get('my-key')
  • Configure Redis to replicate data from the master to the slave nodes:
# Make each slave replicate from the master (this issues the SLAVEOF/REPLICAOF command);
# the master itself needs no replication configuration.
slave1.slaveof('my-master-redis-host', 6379)
slave2.slaveof('my-master-redis-host', 6379)

In summary, implementing the Master-Slave pattern using Python and Redis is a powerful tool for system design, allowing for the improved scalability, performance, and availability of applications. The above code shows how to set up Redis as a master-slave replication system, write data to the master node, and read data from the slave node(s).

Auth Servers

Authentication servers are a critical component of many modern system designs, allowing applications to authenticate users and manage access control. In this pattern, there is a separate server dedicated to handling authentication requests and managing user credentials.

  • Set up a Flask server to handle authentication requests:
from flask import Flask, request, jsonify
app = Flask(__name__)
@app.route('/login', methods=['POST'])
def login():
    # Authenticate the user
    username = request.json.get('username')
    password = request.json.get('password')
    
    # Check the username and password against the database
    if authenticate_user(username, password):
        # If the user is authenticated, generate a token
        token = generate_token(username)
        return jsonify({'token': token})
    else:
        return jsonify({'error': 'Invalid username or password'})
  • Set up a user database to store user credentials:
import sqlite3
def create_user_table():
    conn = sqlite3.connect('users.db')
    c = conn.cursor()
    c.execute('''CREATE TABLE IF NOT EXISTS users
                 (username text, password text)''')
    conn.commit()
    conn.close()
def add_user(username, password):
    conn = sqlite3.connect('users.db')
    c = conn.cursor()
    c.execute(f"INSERT INTO users VALUES (?, ?)", (username, password))
    conn.commit()
    conn.close()
def authenticate_user(username, password):
    conn = sqlite3.connect('users.db')
    c = conn.cursor()
    c.execute(f"SELECT * FROM users WHERE username = '{username}' AND password = '{password}'")
    result = c.fetchone()
    conn.close()
    return result is not None
  • Set up a token generation method to generate and validate authentication tokens:
import jwt
JWT_SECRET_KEY = 'my-secret-key'
def generate_token(username):
    payload = {'username': username}
    return jwt.encode(payload, JWT_SECRET_KEY, algorithm='HS256')
def validate_token(token):
    try:
        decoded_payload = jwt.decode(token, JWT_SECRET_KEY, algorithms=['HS256'])
        return decoded_payload.get('username')
    except jwt.exceptions.DecodeError:
        return None
  • Set up an access control mechanism to check user authentication:
from functools import wraps
def auth_required(f):
    @wraps(f)
    def decorated_function(*args, **kwargs):
        token = request.headers.get('Authorization')
        if token is None:
            return jsonify({'error': 'Authentication required'}), 401
        username = validate_token(token)
        if username is None:
            return jsonify({'error': 'Invalid token'}), 401
        return f(*args, **kwargs)
    return decorated_function
  • Using @auth_required decorator
@app.route('/protected')
@auth_required
def protected_resource():
    # This code will only run if the user is authenticated
    return jsonify({'message': 'Access granted'})

In summary, implementing an authentication server using Python and Flask is a powerful tool for system design, allowing for the authentication and access control of users in a scalable and secure manner. The above code shows how to set up a Flask server to handle authentication requests, set up a user database to store user credentials, generate and validate authentication tokens, and set up an access control mechanism to check user authentication.

SQL Read replicas

SQL Read replicas are an important tool for system design that can help improve the performance, scalability, and availability of databases. A read replica is a copy of a database that can be used to offload read requests from the primary database, reducing its load and improving its performance.

  • Create an AWS account and configure the AWS credentials using the Boto3 library:
import boto3
session = boto3.Session(
    aws_access_key_id='YOUR_ACCESS_KEY',
    aws_secret_access_key='YOUR_SECRET_KEY',
    region_name='YOUR_REGION'
)
  • Connect to the primary database using the SQL Alchemy library:
from sqlalchemy import create_engine
primary_db_uri = 'postgresql://user:password@primary-db-host:port/primary_db_name'
primary_engine = create_engine(primary_db_uri)
  • Create a read replica using the AWS RDS service and the Boto3 library:
client = session.client('rds')
response = client.create_db_instance_read_replica(
    DBInstanceIdentifier='my-read-replica',
    SourceDBInstanceIdentifier='my-primary-db',
    DBInstanceClass='db.t2.micro',
    AvailabilityZone='us-west-2a',
    PubliclyAccessible=False,
    Tags=[
        {
            'Key': 'Name',
            'Value': 'my-read-replica'
        }
    ]
)
read_replica_endpoint = response['DBInstance']['Endpoint']['Address']
  • Connect to the read replica database using the SQL Alchemy library:
read_replica_uri = f"postgresql://user:password@{read_replica_endpoint}:port/read_replica_db_name"
read_replica_engine = create_engine(read_replica_uri)
  • Offload read requests to the read replica using a SQLAlchemy session bound to the replica engine:
from sqlalchemy import text
from sqlalchemy.orm import sessionmaker
Session = sessionmaker(bind=read_replica_engine)
# create a session using the read replica engine
session = Session()
# perform read requests using the session
result = session.execute(text('SELECT * FROM my_table'))

In summary, implementing SQL Read replicas using Python and AWS is a straightforward process. The above code shows how to create a read replica database using the Boto3 library, connect to it using the SQL Alchemy library, and offload read requests to it using a session. This can help improve the performance, scalability, and availability of databases in system design.

Load and Load Estimations

Load: Load refers to the amount of work or resource utilization that a system is handling at a given time. It can be measured in terms of the number of requests per second, CPU utilization, memory usage, disk I/O, network traffic, or any other relevant metric.

Load Estimation: Load estimation is the process of predicting the future load on a system. This involves analyzing the patterns of resource utilization, user behavior, and business requirements to determine the expected workload. Load estimation helps engineers to design a system that can handle the anticipated load and avoid overloading or underutilization.

To estimate the performance of the system you have designed, there are many metrics; here we will cover the three most important —

  • Throughput — It’s measured by the number of jobs processed per second.
  • Response time — It’s measured by the time between sending a request and getting a response.
  • Latency — It’s measured by the time a request spends waiting in the queue before it is completed.
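
For a quick sense of load estimation in practice, here is a back-of-envelope calculation; all of the numbers are made up for illustration:

# Rough load estimation from assumed usage figures
daily_active_users = 10_000_000
requests_per_user_per_day = 20
seconds_per_day = 86_400

average_rps = daily_active_users * requests_per_user_per_day / seconds_per_day
peak_rps = average_rps * 3   # assume peak traffic is roughly 3x the average

print(f"average: {average_rps:.0f} req/s, peak: {peak_rps:.0f} req/s")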

CPU utilization: CPU utilization is a measure of how much of the CPU’s processing capacity is being used. It can be calculated as the ratio of the time the CPU spends executing instructions to the total time available. This metric can be obtained by measuring the number of CPU cycles used over a specific interval of time. On Unix-based systems, this information can be obtained using the “top” or “vmstat” commands, while on Windows systems, it can be obtained using the Task Manager or Performance Monitor.

Memory usage: Memory usage is a measure of how much of the system’s memory is being used. This metric can be obtained by measuring the amount of physical memory (RAM) being used and the amount of virtual memory being swapped to disk. On Unix-based systems, this information can be obtained using the “free” or “vmstat” commands, while on Windows systems, it can be obtained using the Task Manager or Performance Monitor.

Disk I/O: Disk I/O is a measure of how much data is being read from and written to the disk. This metric can be obtained by measuring the number of disk reads and writes and the amount of data being transferred. On Unix-based systems, this information can be obtained using the “iostat” command, while on Windows systems, it can be obtained using the Performance Monitor.

Network traffic: Network traffic is a measure of how much data is being transmitted over the network. This metric can be obtained by measuring the amount of data being sent and received, the number of network packets, and the network bandwidth utilization. On Unix-based systems, this information can be obtained using the “ifconfig” or “netstat” commands, while on Windows systems, it can be obtained using the Performance Monitor or the “netstat” command.

Database

There are three types of Databases —

Relational Databases — It consists of tables (data organized in rows and columns). It uses joins to fetch data from multiple tables. The database management system is responsible for enforcing the schema and ensuring data consistency and integrity.

Non Relational Databases ( NoSQL databases) — It consists of keys that are mapped to the values. It’s easier to shard a NoSQL database and works without formatting the data. The data is typically stored in a document-oriented, key-value, or column-family format, and the database management system is responsible for handling the storage and retrieval of data based on the data model.

Graph Databases — It consists of a graph (vertices and edges), and the queries traverse the graph in order to provide the results. The graph data is stored as nodes and edges in a graph structure, and the database management system is responsible for indexing and optimizing the graph data for efficient querying and retrieval. Graph databases use graph algorithms and data structures, such as index-free adjacency and property graphs, to efficiently store and manage the graph data.

Indexes

Indexes are very useful as they help speed up query execution and enable faster retrieval of data.
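
As a small illustration (table and column names are made up), creating an index lets the database jump straight to matching rows instead of scanning the whole table:

import sqlite3

conn = sqlite3.connect('example.db')
c = conn.cursor()
c.execute("CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, email TEXT)")
# Without an index, a lookup by email scans every row; with one, it becomes a fast index seek.
c.execute("CREATE INDEX IF NOT EXISTS idx_users_email ON users (email)")
conn.commit()
c.execute("SELECT id FROM users WHERE email = ?", ('alice@example.com',))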

Data warehousing

Row oriented Storage

Most transactional databases use row oriented storage. The database is partitioned horizontally and with this approach writes are performed easily as compared to reads. Examples — PostgreSQL, MySQL and SQL Server. In row-oriented storage, data is stored as a series of rows, where each row represents a single record or tuple in the database.

Column oriented storage

The database is partitioned vertically and with this approach reads are performed easily as compared to writes. Examples — Redshift, BigQuery and Snowflake. In column-oriented storage, data is stored as a series of columns, where each column represents a single attribute or field in the database. Each column contains all of the values for a particular attribute, across all records in the database. Column-oriented storage is optimized for fast retrieval of large amounts of data for a specific attribute, and it is a good choice for OLAP (Online Analytical Processing) systems that need to perform aggregate calculations and data analysis.

Serialization

It’s the process of translating objects into a format that can be stored in a file or memory buffer and transmitted across the network link and can be reconstructed later. In serialization the objects are converted and stored within the computer memory into linear sequence of bytes which can be sent to another process/machine/file etc.

Serialization refers to the process of converting an object or data structure into a sequence of bytes that can be stored or transmitted over a network. The process of serialization involves converting the object’s data into a binary or text format that can be written to disk or transmitted over a network. This enables the object to be easily stored, transmitted, and retrieved later.

One common way to serialize and deserialize data in a system is to use JSON (JavaScript Object Notation). JSON is a lightweight, text-based format that is easy to read and write for humans and machines alike.
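
A minimal example of JSON serialization and deserialization in Python:

import json

# Serialize a Python object to a JSON string (e.g. to store on disk or send over the network)
user = {'id': 42, 'name': 'Alice', 'tags': ['admin', 'beta']}
payload = json.dumps(user)

# Deserialize it back into a Python object on the receiving side
restored = json.loads(payload)
print(restored['name'])   # Alice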

Replication

It is the process of creating replicas to store multiple copies of the same data on different machines (which may or may not be distributed geographically).

Master-slave replication:

In this type of replication, there is a designated master node that accepts all write operations, and one or more slave nodes that receive updates from the master node and serve read operations. This type of replication provides good performance for read operations, since the load can be distributed across multiple slave nodes, but it has a single point of failure in the master node.

Peer-to-peer replication:

In this type of replication, all nodes are equal, and each node can accept both read and write operations. In this type of replication, there are no single points of failure, but the complexity of conflict resolution is increased, since multiple nodes may try to update the same data at the same time.

Multi-master replication:

This is similar to peer-to-peer replication, but it involves multiple master nodes that can accept write operations, and a mechanism for resolving conflicts that may arise between nodes.

Partitioning/Sharding

In layman’s words, sharding is a database partitioning technique that separates large and complex databases into smaller, faster and distributed databases for higher-throughput operations. Why sharding? To make databases —

  • Faster and improve performance
  • Smaller and manageable
  • Reduce the transactional cost
  • Distributed and scale well
  • Speed up Query response time
  • Increases reliability and mitigates the after-effects of outages

Horizontal Sharding

Divides the database table’s rows into multiple different tables where each part has the same schema and columns but different rows in order to create unique and independent partitions.

Vertical Sharding

Divides the database’s columns into new distinct tables such that each vertically partitioned part is independent of all the others and has distinct rows and columns.

Partitioning can be performed at the database level, or at the application level. When partitioning is performed at the database level, the database management system (DBMS) takes care of the partitioning and data distribution automatically. When partitioning is performed at the application level, the application must handle the partitioning and data distribution manually.
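
A minimal sketch of application-level horizontal sharding, routing each user's rows to one of several database shards by hashing the user ID; the shard connection strings are placeholders:

import hashlib

SHARDS = [
    'postgresql://shard0-host/app',
    'postgresql://shard1-host/app',
    'postgresql://shard2-host/app',
]

def shard_for(user_id):
    # hash the shard key and map it onto one of the shards
    digest = hashlib.md5(user_id.encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

print(shard_for('user-123'))   # all rows for user-123 live on this shard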

Transaction

It’s a unit of program execution that accesses and updates data items; the DBMS should preserve the integrity of that data by ensuring the ACID properties.

ACID means —

Atomicity —

It means either all the operations of a transaction are properly reflected in the database, or none of them are. Implementation of an atomic transaction in PostgreSQL:

BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 123;
UPDATE accounts SET balance = balance + 100 WHERE id = 456;
COMMIT;

In this implementation, we begin a transaction using the BEGIN statement, and then update two rows in the accounts table. If either of these updates fails, the entire transaction will be rolled back using the ROLLBACK statement.

Consistency-

It means a transaction should take the database from one consistent state to another, so that all constraints and invariants defined on the data continue to hold. Implementation of a consistent transaction in MySQL:

START TRANSACTION;
INSERT INTO orders (order_id, customer_id, order_total)
VALUES (1234, 5678, 100.00);
UPDATE customers SET balance = balance - 100.00 WHERE customer_id = 5678;
COMMIT;

In this implementation, we begin a transaction using the START TRANSACTION statement, and then insert a new order into the orders table. We also update the customers table to deduct the order total from the customer's balance. These changes satisfy the constraints defined by the database schema.

Isolation —

It means that, in situations where multiple transactions are executing, each transaction should be unaware of the other executing transactions and should be isolated from them.

Durability-

It means that after the transaction is complete, the changes made to the database should persist even in case of system failures.

  • Transactions are atomic, meaning that they are indivisible and irreducible. This means that if a transaction is in progress, it will either be completed in full, or it will be rolled back to its original state, without leaving any intermediate states or partial changes.
  • Transactions are also isolated, meaning that the operations within a transaction are isolated from other transactions and from any external changes to the database. This ensures that transactions are executed in a predictable and consistent manner, even if other transactions are executing concurrently.
  • Transactions also have the property of durability, meaning that once a transaction has been committed, its changes are permanent and will survive any failures or other errors that may occur.
  • The transaction manager is responsible for coordinating the execution of transactions, enforcing the atomic, isolated, and durable properties of transactions, and providing a mechanism for managing any errors or failures that may occur during the execution of a transaction.

Performance

Caching, indexing, and proxies are techniques used to improve the performance of a system. Caching involves storing frequently-used data in memory for quick access, indexing involves creating an index of data to improve search performance, and proxies are used to forward requests to other servers or to cache responses.

Database sharding, CAP theorem, and database schema design are all related to the management of databases. Database sharding involves dividing a database into smaller, more manageable pieces, the CAP theorem states that a distributed system cannot simultaneously guarantee all three of consistency, availability, and partition tolerance, and database schema design involves creating a logical structure for storing data in a database.

Concurrency, API, components, OOP, and abstraction are all related to the design and implementation of software. Concurrency involves executing multiple tasks simultaneously; API stands for Application Programming Interface, which is the way for different systems to communicate with each other; components are the building blocks of any system; OOP stands for Object Oriented Programming; and abstraction is the process of hiding complexity and showing only the necessary details.

Estimation and planning, and performance are related to the management and evaluation of software systems. Estimation and planning involve determining the resources and time required to complete a project, while performance evaluation involves measuring and optimizing the performance of a system.

Map Reduce, patterns, and microservices are all related to the design and implementation of distributed systems. MapReduce is a programming model for processing large data sets, patterns are reusable solutions to common problems in software development, and microservices are a type of software architecture that involves building a system as a collection of small, independent services.

Fault and Failure in Distributed Systems: Fault refers to a defect or error in a system, while failure refers to the inability of a system to perform its intended function. In distributed systems, faults and failures can occur due to network partitions, node crashes, and other reasons. The Raft consensus algorithm is a widely used algorithm for achieving fault tolerance in distributed systems.

Consistent Hashing: Consistent Hashing is a method of distributing data across multiple nodes in a way that minimizes remapping when nodes are added or removed.
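
A minimal sketch of a consistent-hash ring, without virtual nodes, to show the idea: keys and nodes are hashed onto the same ring, and each key is owned by the first node clockwise from it.

import bisect
import hashlib

class ConsistentHashRing:
    def __init__(self, nodes):
        self.ring = sorted((self._hash(n), n) for n in nodes)
        self.keys = [h for h, _ in self.ring]

    def _hash(self, value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def get_node(self, key):
        # walk clockwise around the ring to the first node at or after the key's hash
        idx = bisect.bisect(self.keys, self._hash(key)) % len(self.ring)
        return self.ring[idx][1]

ring = ConsistentHashRing(['cache-a', 'cache-b', 'cache-c'])
print(ring.get_node('user:42'))   # adding or removing a node only remaps nearby keys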

Glossary

1. Object-Relational Mapping (ORM)

It is a programming technique for converting data between incompatible type systems using object-oriented programming languages. It creates a "virtual object database" that can be used from within the programming language.

In the context of web application development, an ORM is often used to facilitate the interaction between your application and your database. It provides a way to create, retrieve, update, and delete records in your database using the object-oriented paradigm in your language of choice.

In essence, ORMs allow developers to interact with their databases like they would with SQL (structured query language), but without having to write SQL and instead using the object-oriented language they're more comfortable with.

For example, SQLAlchemy is an ORM for Python, Hibernate is an ORM for Java, and Active Record is an ORM for Ruby. These libraries all provide a way for developers to interact with their database in the respective language.
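
A tiny illustration of the ORM idea with SQLAlchemy (1.4+-style; the table and class are made up for the example):

from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, Session

Base = declarative_base()

class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    name = Column(String)

engine = create_engine('sqlite:///:memory:')
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(User(name='alice'))                              # an INSERT, written as object creation
    session.commit()
    user = session.query(User).filter_by(name='alice').first()   # a SELECT, written via the ORM
    print(user.id, user.name)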

2. SSL Termination

SSL termination (or SSL offloading) is a term used to describe the process of handling SSL encryption and decryption at the edge of a network, typically on a load balancer or reverse proxy server, instead of on the application server itself.

  • A client makes an HTTPS request to a server.
  • The request reaches the SSL terminator (usually a load balancer or proxy), which holds the SSL certificate for the domain.
  • The SSL terminator decrypts the request and forwards it as a plaintext HTTP request to the application server.
  • The application server processes the request and sends a plaintext HTTP response back to the SSL terminator.
  • The SSL terminator encrypts the response and sends it back to the client.

3. boto3

Boto3 is the Amazon Web Services (AWS) Software Development Kit (SDK) for Python. It allows developers to write software that makes use of AWS services like Amazon S3, Amazon EC2, and others. With Boto3, you can create, configure, and manage AWS services using Python code. It provides a high-level object-oriented API as well as low-level direct service access.
