@xvnpw
Last active October 30, 2024 12:39
create design document with Fabric

DESIGN DOCUMENT

BUSINESS POSTURE

The business priorities and goals for AI Nutrition-Pro are to enhance the efficiency and personalization of diet creation for dietitians by integrating with existing meal planner applications. The key objectives are to provide a seamless and secure backend API service that can reproduce the personal style of nutrition specialists using Large Language Models (LLMs).

Most important business risks include:

  1. Ensuring data privacy and protection, especially concerning Personally Identifiable Information (PII) and personal health data.
  2. Maintaining high availability and reliability of the API service.
  3. Ensuring scalability to handle multiple tenants and large volumes of data.

SECURITY POSTURE

Existing Security Controls

  • Use of AWS Cloud Services: Leveraging AWS cloud services with built-in security features for data storage and processing.
  • Secure Integration with ChatGPT 3.5: Integration with ChatGPT 3.5 secured using OpenAI's recommended practices and API key management.
  • API Access Security: API access secured using OAuth 2.0 for enhanced security and easier key management.

Accepted Risks

  • Dependency on Third-party Services: Dependency on third-party services like OpenAI for LLM functionality, which may introduce latency or service outages.

Recommended Security Controls

  • Encryption: Implement encryption for data at rest and in transit, including internal communications between components.
  • Logging and Monitoring: Establish a comprehensive logging and monitoring system to detect and respond to security incidents.
  • Regular Security Audits: Conduct regular security assessments, vulnerability scans, and penetration testing.
  • Data Anonymization: Apply data anonymization techniques to reduce the risk of exposing PII.

Security Requirements

  • Regulatory Compliance: The system must comply with data protection regulations such as GDPR and HIPAA.
  • Role-Based Access Control (RBAC): Implement RBAC to restrict access to sensitive data based on user roles.
  • Tenant Data Isolation: Enforce strict data segregation to ensure client data remains isolated and secure.
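
The RBAC and tenant isolation requirements above can be illustrated with a small authorization check. This is a minimal sketch, not the project's implementation: the role names, permission strings, and the `Principal` type are hypothetical, since the design document does not define a concrete role model.

```python
from dataclasses import dataclass

# Hypothetical role-to-permission mapping; the design document does not list roles.
ROLE_PERMISSIONS = {
    "administrator": {"tenant:onboard", "tenant:configure", "billing:read"},
    "tenant_app": {"samples:write", "content:generate"},
}


@dataclass(frozen=True)
class Principal:
    subject: str    # who is calling (assumed to come from a validated token)
    role: str       # role assigned to the caller
    tenant_id: str  # tenant the caller belongs to


def is_allowed(principal: Principal, permission: str, resource_tenant_id: str) -> bool:
    """Return True only if the role grants the permission and the tenant matches."""
    granted = ROLE_PERMISSIONS.get(principal.role, set())
    if permission not in granted:
        return False
    # Assumption: administrators work across tenants; all other roles are
    # confined to resources owned by their own tenant.
    if principal.role == "administrator":
        return True
    return principal.tenant_id == resource_tenant_id
```

A call such as `is_allowed(Principal("dietmaster", "tenant_app", "t1"), "content:generate", "t2")` returns `False`, because the role grants the permission but the resource belongs to another tenant.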

DESIGN

C4 MODEL DIAGRAMS

Context Diagram

graph TD
    subgraph External Entities
        MealPlannerApps[Meal Planner Applications]
        Administrator
    end

    subgraph AI_NutritionPro_System
        APIGateway[API Gateway]
        APIService[API Service]
        WebControlPlane[Web Control Plane]
        APIDatabase[API Database]
        ControlPlaneDatabase[Control Plane Database]
    end

    MealPlannerApps -->|Authenticate & Request| APIGateway
    APIGateway --> APIService
    APIService -->|Store & Retrieve| APIDatabase
    APIService -->|Request Content| ChatGPT35[ChatGPT 3.5]
    Administrator -->|Manage| WebControlPlane
    WebControlPlane --> ControlPlaneDatabase

| Name | Type | Description | Responsibilities | Security Controls |
|---|---|---|---|---|
| Meal Planner Applications | External System | Applications integrating with AI Nutrition-Pro | Send content samples and content generation requests | OAuth 2.0, TLS |
| Administrator | External Actor | System administrator managing configurations and client onboarding | Manage onboarding, configurations, billing | MFA, Secure Access |
| API Gateway | System Component | Entry point for API requests | Authenticate clients, rate limiting, request validation | OAuth 2.0, rate limiting, TLS |
| API Service | System Component | Backend service providing AI Nutrition-Pro functionality | Process requests, interface with LLM, business logic | Input validation, logging, RBAC |
| API Database | Data Store | Stores dietitians' content samples and LLM request/response data | Secure data storage and retrieval | Encryption at rest, access control |
| Web Control Plane | System Component | Management interface for administrators | Client onboarding, configurations, billing | Authentication, access control |
| Control Plane Database | Data Store | Stores control plane data, tenant information, and billing details | Secure storage of admin and billing data | Encryption at rest, access control |
| ChatGPT 3.5 | External Service | OpenAI's LLM API used for content generation | Generate personalized content | API key management, TLS |

Container Diagram

graph TD
    subgraph Networking
        LoadBalancer[Application Load Balancer]
    end

    subgraph ApplicationLayer
        APIGateway[API Gateway]
        APIServiceCluster[API Service Cluster]
        WebControlPlane[Web Control Plane]
    end

    subgraph DataLayer
        APIDatabase[API Database]
        ControlPlaneDatabase[Control Plane Database]
    end

    LoadBalancer --> APIGateway
    APIGateway --> APIServiceCluster
    APIServiceCluster --> APIDatabase
    APIServiceCluster -->|API Requests| ChatGPT35[ChatGPT 3.5]
    WebControlPlane --> ControlPlaneDatabase
    Administrator --> WebControlPlane

| Name | Type | Description | Responsibilities | Security Controls |
|---|---|---|---|---|
| Application Load Balancer | Container | Distributes incoming traffic to the API Gateway | Load balancing, SSL/TLS termination | TLS, security groups |
| API Gateway | Container | Manages API requests | Authentication, rate limiting, request validation | OAuth 2.0, rate limiting, TLS |
| API Service Cluster | Container | Scalable instances of the API service | Process API requests, business logic, LLM interaction | RBAC, input validation, logging |
| Web Control Plane | Container | Web interface for administrators | Manage clients, configurations, billing | Authentication, access control |
| API Database | Container | Stores application data | Data storage and retrieval | Encryption at rest, access control |
| Control Plane Database | Container | Stores administrative data | Store configurations, tenant info, billing data | Encryption at rest, access control |
| ChatGPT 3.5 | External API | OpenAI's LLM API for content generation | Generate personalized content | API key management, TLS |

Deployment Diagram

graph TD
    subgraph AWS_Cloud
        subgraph VPC
            subgraph PublicSubnet
                LoadBalancer
            end
            subgraph PrivateSubnet
                APIGatewayCluster[API Gateway Cluster]
                APIServiceCluster
                WebControlPlane
            end
            subgraph DatabaseSubnet
                APIDatabase[RDS for API Data]
                ControlPlaneDatabase[RDS for Control Plane Data]
            end
        end
    end

    LoadBalancer --> APIGatewayCluster
    APIGatewayCluster --> APIServiceCluster
    APIServiceCluster --> APIDatabase
    APIServiceCluster --> ChatGPT35[ChatGPT 3.5]
    WebControlPlane --> ControlPlaneDatabase
    Administrator -->|Secure VPN / MFA| WebControlPlane
    MealPlannerApps -->|HTTPS Requests| LoadBalancer

| Name | Type | Description | Responsibilities | Security Controls |
|---|---|---|---|---|
| VPC | Network | Virtual Private Cloud for network isolation | Network segmentation, security | Network ACLs, security groups |
| LoadBalancer | Service | Distributes traffic to API Gateway cluster | Load balancing, TLS termination | TLS, security groups |
| API Gateway Cluster | Service | Cluster of API Gateway instances | Authenticate and authorize requests | OAuth 2.0, rate limiting |
| API Service Cluster | Service | Cluster of API service instances | Process requests, interact with LLM and databases | Autoscaling, monitoring |
| Web Control Plane | Service | Web interface for administrators | Manage configurations, clients, billing | Authentication, access control |
| APIDatabase | Service | Relational Database Service for API data | Store dietitians' samples, LLM requests/responses | Encryption at rest, multi-AZ deployment |
| ControlPlaneDatabase | Service | RDS for control plane data | Store admin configurations, tenant info, billing data | Encryption at rest, multi-AZ deployment |
| ChatGPT 3.5 | External Service | OpenAI's LLM service | Generate personalized content | API key management, TLS |
| Meal Planner Applications | External Client | Clients requesting content generation | Send API requests | OAuth 2.0, TLS |
| Administrator | Role | System administrator | Manage system configurations and onboarding | MFA, secure access (VPN or bastion host) |

Improvements and Enhancements

  • Authentication Upgrade: Switched from API keys to OAuth 2.0 for client authentication to enhance security and facilitate easier key management and token rotation.
  • Data Encryption at Rest: Enabled encryption for all databases to protect sensitive data like PII and personal health information.
  • Internal TLS Encryption: Implemented TLS encryption for all internal communications between services to secure data in transit within the system.
  • Tenant Data Isolation: Enforced strict data segregation using tenant-specific schemas and access controls to ensure client data remains isolated.
  • Caching Mechanisms: Introduced caching for LLM responses to reduce dependency on external services and improve response times.
  • Autoscaling: Configured autoscaling for the API Service Cluster to handle increased load and ensure high availability.
  • Resilience Enhancements: Deployed services across multiple Availability Zones (multi-AZ) for fault tolerance and high availability.
  • Monitoring and Logging: Established comprehensive monitoring, logging, and alerting using AWS CloudWatch and integrated with a Security Information and Event Management (SIEM) system.
  • Security Audits: Scheduled regular security audits and vulnerability assessments to proactively identify and mitigate risks.
  • Administrator Security: Implemented Multi-Factor Authentication (MFA) and secure access methods (VPN or bastion host) for administrators.
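
The caching enhancement listed above could look roughly like the following. This is a sketch under stated assumptions: the `LLMResponseCache` class, the `generate_with_cache` helper, and the `call_llm` callable are hypothetical names, the cache is in-process only, and the document does not say which cache store or expiry policy was actually chosen.

```python
import hashlib
import time
from typing import Callable, Dict, Optional, Tuple


class LLMResponseCache:
    """In-memory cache with a time-to-live, keyed by tenant and prompt."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, str]] = {}

    @staticmethod
    def _key(tenant_id: str, prompt: str) -> str:
        # The tenant id is part of the key so that one tenant can never be
        # served content cached for another tenant.
        return hashlib.sha256(f"{tenant_id}\x00{prompt}".encode("utf-8")).hexdigest()

    def get(self, tenant_id: str, prompt: str) -> Optional[str]:
        key = self._key(tenant_id, prompt)
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # drop stale entries lazily on read
            return None
        return value

    def put(self, tenant_id: str, prompt: str, response: str) -> None:
        self._entries[self._key(tenant_id, prompt)] = (self._clock() + self._ttl, response)


def generate_with_cache(cache: LLMResponseCache, tenant_id: str, prompt: str,
                        call_llm: Callable[[str], str]) -> str:
    """Return a cached response if one is fresh, otherwise call the LLM and store the result."""
    cached = cache.get(tenant_id, prompt)
    if cached is not None:
        return cached
    response = call_llm(prompt)
    cache.put(tenant_id, prompt, response)
    return response
```

Because cached responses may contain PII or health data, a production cache would need the same encryption and access controls as the databases; that aspect is outside this sketch.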

RISK ASSESSMENT

  • Critical Business Processes: Protecting sensitive data (PII, health information), ensuring reliable content generation, and maintaining service availability for clients.
  • Data Sensitivity: High sensitivity due to handling of PII and personal health information, requiring strict compliance with GDPR, HIPAA, and other regulations.

Mitigation Strategies

  • Compliance: Ensure adherence to GDPR, HIPAA, and other relevant regulations through stringent data handling policies.
  • Security Controls: Implement robust security measures including encryption, RBAC, regular audits, and incident response plans.
  • Availability: Use AWS features like autoscaling, load balancing, and multi-AZ deployments to maintain high availability.
  • Fallback Mechanisms: Develop contingency plans for external service outages, such as alternative content generation methods or notifying clients of service delays.
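
One way to read the fallback strategy above is "retry a few times, then degrade". The sketch below assumes hypothetical `call_llm` and `fallback` callables and treats any exception as a transient outage, which is a simplification; a real client would distinguish retryable errors from permanent ones.

```python
import time
from typing import Callable


def generate_with_fallback(
    call_llm: Callable[[str], str],
    prompt: str,
    fallback: Callable[[str], str],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Try the external LLM with exponential backoff, then use the fallback."""
    for attempt in range(attempts):
        try:
            return call_llm(prompt)
        except Exception:
            # Assumption: every failure is retryable. Wait 0.5s, 1s, 2s, ...
            # between attempts, but do not wait after the final attempt.
            if attempt < attempts - 1:
                sleep(base_delay * (2 ** attempt))
    # All attempts failed: return alternative content or a service-delay notice.
    return fallback(prompt)
```

The `fallback` callable is where "alternative content generation methods or notifying clients of service delays" would plug in, for example by returning a message that the request has been queued.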

QUESTIONS & ASSUMPTIONS

Questions

  • Regulatory Scope: Which specific regulations (GDPR, HIPAA, others) apply based on the geographic locations of our users and clients?
  • Volume Expectations: What are the projected volumes of data and API requests to plan for scalability and capacity?
  • SLAs with OpenAI: Are there Service Level Agreements (SLAs) with OpenAI for ChatGPT 3.5 to guarantee availability and performance?

Assumptions

  • Client Compliance: Meal Planner Applications will obtain necessary user consents and comply with data protection regulations.
  • LLM Availability: OpenAI's ChatGPT 3.5 service will remain available and reliable for content generation needs.
  • AWS Resources: AWS services can be scaled according to demand to support growth.

CONCLUSION

The refined architecture for AI Nutrition-Pro addresses the key concerns raised in the design review by enhancing security, scalability, and maintainability. Upgrading to OAuth 2.0 improves authentication security, while encrypting data at rest and in transit protects sensitive information. Implementing autoscaling and caching mechanisms improves performance and resilience. Regular security audits and robust monitoring ensure proactive risk management.

Key Enhancements:

  • Security: Strengthened authentication, data encryption, tenant isolation, and administrator access controls.
  • Scalability: Autoscaling and load balancing support high availability and performance under increased load.
  • Resilience: Multi-AZ deployments and caching reduce dependency on external services and enhance fault tolerance.
  • Maintainability: Containerization, modular components, and infrastructure-as-code practices support future growth and feature additions.
  • Compliance: Adhering to regulations like GDPR and HIPAA ensures legal compliance and builds client trust.

By implementing these improvements, AI Nutrition-Pro is well-positioned to meet business goals while safeguarding sensitive data and providing a reliable service to clients.


Note: This design document aims to provide a comprehensive and secure architecture for AI Nutrition-Pro, incorporating feedback from the design review to enhance overall system quality.

BUSINESS POSTURE

The business priorities and goals for AI Nutrition-Pro are to enhance the efficiency and personalization of diet creation for dietitians by integrating with existing meal planner applications. The key objectives are to provide a seamless and secure backend API service that can reproduce the personal style of nutrition specialists using LLMs.

Most important business risks include:

  1. Ensuring data privacy and protection, especially concerning PII and personal health data.
  2. Maintaining high availability and reliability of the API service.
  3. Ensuring scalability to handle multiple tenants and large volumes of data.

SECURITY POSTURE

Existing Security Controls:

  • security control: Use of AWS cloud services with built-in security features for data storage and processing.
  • security control: Integration with ChatGPT 3.5 will be secured using OpenAI's recommended practices.
  • security control: API access will be secured using API keys and OAuth2.

Accepted Risks:

  • accepted risk: Dependency on third-party services like OpenAI for LLM functionality, which may introduce latency or service outages.

Recommended Security Controls:

  • Implement encryption for data at rest and in transit.
  • Establish a comprehensive logging and monitoring system to detect and respond to security incidents.
  • Conduct regular security audits and vulnerability assessments.

Security Requirements:

  • The system must comply with data protection regulations such as GDPR and HIPAA.
  • Role-based access control should be implemented to restrict access to sensitive data.
  • Data anonymization techniques should be applied to reduce the risk of exposing PII.

DESIGN

C4 CONTEXT

graph TB
    DietMasterPro[DietMaster Pro] -->|Integrates with| AI_NutritionPro[AI Nutrition-Pro]
    NutritionistPro[Nutritionist Pro] -->|Integrates with| AI_NutritionPro
    AI_NutritionPro -->|Requests content from| ChatGPT35[ChatGPT 3.5]

| Name | Type | Description | Responsibilities | Security Controls |
|---|---|---|---|---|
| DietMaster Pro | External App | Meal planner application integrating with AI Nutrition-Pro | Sends dietitian samples and requests content | API key, OAuth2 |
| Nutritionist Pro | External App | Meal planner application integrating with AI Nutrition-Pro | Sends dietitian samples and requests content | API key, OAuth2 |
| AI Nutrition-Pro | System | Backend API service for generating personalized content | Processes requests, interfaces with ChatGPT 3.5 | Encryption, access control, logging |
| ChatGPT 3.5 | External API | LLM API by OpenAI used for content generation | Generates personalized content | OpenAI security practices, API key |

C4 CONTAINER

graph TB
    subgraph AI_NutritionPro
        APIService[API Service] --> ProcessingService[Processing Service]
        ProcessingService --> DataStore[Data Store]
        ProcessingService --> OpenAIConnector[OpenAI Connector]
    end
    OpenAIConnector --> ChatGPT35

| Name | Type | Description | Responsibilities | Security Controls |
|---|---|---|---|---|
| API Service | Container | Handles incoming requests and responses | Validates requests, manages authentication | API gateway, rate limiting |
| Processing Service | Container | Processes data and interfaces with other services | Applies business logic, manages data flow | Data validation, logging |
| Data Store | Container | Stores samples and generated content | Data storage and retrieval | Encryption, access control |
| OpenAI Connector | Container | Manages communication with ChatGPT 3.5 | Handles API requests and responses | API key, OpenAI security practices |

C4 DEPLOYMENT

graph TB
    subgraph AWS_Cloud
        EC2Instance[EC2 Instance] --> APIService
        EC2Instance --> ProcessingService
        RDSInstance[RDS Instance] --> DataStore
    end
    OpenAICloud[OpenAI Cloud] --> ChatGPT35

| Name | Type | Description | Responsibilities | Security Controls |
|---|---|---|---|---|
| EC2 Instance | Node | AWS compute resource for running services | Hosts API and Processing services | Security groups, IAM roles |
| RDS Instance | Node | AWS database service for data storage | Hosts Data Store | Encryption, VPC isolation |
| OpenAI Cloud | External | OpenAI's infrastructure for ChatGPT 3.5 | Hosts ChatGPT 3.5 API | Managed by OpenAI |

RISK ASSESSMENT

  • What are critical business processes we are trying to protect? The critical business processes include secure data handling, content generation, and maintaining service availability for meal planner integrations.

  • What data are we trying to protect, and what is their sensitivity? We are trying to protect PII and personal health data of customers, which are highly sensitive and subject to data protection regulations.

QUESTIONS & ASSUMPTIONS

Questions:

  • What specific data protection regulations (e.g., HIPAA, GDPR) apply to AI Nutrition-Pro?
  • What are the expected volumes of data and requests, and how will scalability be managed?

Assumptions:

  • It is assumed that the meal planner applications will handle user consent and data collection in compliance with relevant regulations.
  • It is assumed that OpenAI's ChatGPT 3.5 will remain available and reliable for content generation tasks.

AI Nutrition-Pro

Business background

Dietitians use online applications called meal planners to create meals and diets and to calculate calories. Different professionals create diets in different ways, which gives each one's work a personal style. LLMs can reproduce this personal style of writing based on samples of already created content. Meal planners can use LLMs to speed up diet creation for dietitians.

Project Overview

AI Nutrition-Pro will be a backend API application that can integrate with any meal planner application for dietitians. It will reproduce the personal style of a nutrition specialist based on samples.

Dietitians will not use the application directly; they will reach it through their meal planner applications. No user interface will be exposed to dietitians. Integration will go through the meal planner applications' backends.

Direct clients of AI Nutrition-Pro will be applications like DietMaster Pro, Nutritionist Pro, or others. Those clients will send content samples to AI Nutrition-Pro, and AI Nutrition-Pro will generate the requested type of content based on them. AI Nutrition-Pro will use an LLM to generate the requested content.

Core Features

  • multi-tenant API application, where a tenant is a client application like DietMaster Pro, Nutritionist Pro, or others.
  • each tenant can contain many dietitians.
  • each dietitian can have multiple customers.
  • the application will be deployed into the AWS cloud and use cloud-based services to store and process data.
  • the application will store and process PII that might contain personal health data of customers.
  • ChatGPT 3.5 will be used as the LLM.

High level connection view

flowchart TB
    DietMaster-Pro --> AI-Nutrition-Pro
    Nutritionist-Pro --> AI-Nutrition-Pro
    subgraph AWS
    AI-Nutrition-Pro
    end
    subgraph OpenAI
    ChatGPT-3.5
    end
    AI-Nutrition-Pro --> ChatGPT-3.5

1. Architecture Clarity and Component Design

The architecture diagram provides a clear overview of the AI Nutrition-Pro application, depicting both internal components and external systems. The internal components include:

  • API Gateway: Responsible for client authentication, input filtering, and rate limiting.
  • Web Control Plane: Manages client onboarding, configurations, and billing data.
  • Control Plane Database: Stores data related to the control plane, tenants, and billing.
  • API Application: Provides the core AI Nutrition-Pro functionality via API.
  • API Database: Stores dietitians' content samples and LLM request/response data.

External systems interacting with the application are:

  • Meal Planner Application: Connects to AI Nutrition-Pro for AI content generation.
  • ChatGPT-3.5: Used for generating content based on provided samples.

The roles and responsibilities of each component are well-defined, and the interactions between them are logical:

  • The Meal Planner Application communicates with the API Gateway for content generation requests.
  • The API Gateway forwards requests to the API Application after authentication and rate limiting.
  • The API Application processes requests, interacts with ChatGPT-3.5, and accesses the API Database.
  • The Web Control Plane allows administrators to manage configurations and client onboarding, interacting with the Control Plane Database.

Potential areas for improvement:

  • The diagram does not clearly illustrate the interaction between the Administrator and the internal systems.
  • There may be redundancy between authentication in the API Gateway and any additional authentication mechanisms within the API Application.

2. External System Integrations

Integrations:

  • Meal Planner Applications: External clients connecting via HTTPS/REST for AI content generation.
  • ChatGPT-3.5: An external LLM service used for generating diet-related content.

Security, performance, and reliability considerations:

  • Security: Authentication is handled via API keys, and communication is encrypted with TLS.
  • Performance: Reliance on ChatGPT-3.5 could introduce latency; no mention of handling rate limits or service unavailability.
  • Reliability: The system's ability to handle multiple external clients depends on the scalability of the API Gateway and backend services.

Recommendations:

  • Implement caching mechanisms to reduce calls to ChatGPT-3.5 and improve response times.
  • Use asynchronous processing or queuing to handle high volumes of requests without overloading the system.
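
The queuing recommendation above can be sketched with a bounded worker pool, so that a burst of requests never opens more than a fixed number of concurrent calls to the LLM. The function name `process_requests` and the async `call_llm` callable are hypothetical; a deployed system would more likely use a durable queue service than an in-process `asyncio.Queue`.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional


async def process_requests(
    prompts: List[str],
    call_llm: Callable[[str], Awaitable[str]],
    max_concurrency: int = 2,
) -> List[Optional[str]]:
    """Drain a queue of prompts with at most `max_concurrency` calls in flight."""
    queue: asyncio.Queue = asyncio.Queue()
    for index, prompt in enumerate(prompts):
        queue.put_nowait((index, prompt))

    results: List[Optional[str]] = [None] * len(prompts)

    async def worker() -> None:
        while True:
            try:
                index, prompt = queue.get_nowait()
            except asyncio.QueueEmpty:
                return  # nothing left to do, so this worker exits
            # Results are stored by index so output order matches input order.
            results[index] = await call_llm(prompt)

    # The number of workers is the concurrency limit.
    await asyncio.gather(*(worker() for _ in range(max_concurrency)))
    return results
```

Choosing `max_concurrency` below the external provider's rate limit keeps the system from overloading either itself or the upstream API.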

3. Security Architecture

Current security mechanisms:

  • Authentication: Each Meal Planner application uses an individual API key.
  • Authorization: API Gateway utilizes ACL rules to allow or deny actions.
  • Encryption: All network traffic between clients and the API Gateway is encrypted using TLS.

Potential weaknesses:

  • API keys can be compromised, and there are no mechanisms for rotation or expiration.
  • No mention of security measures for communication between internal components.
  • Data at rest encryption is not specified for the databases.

Improvements:

  • Adopt token-based authentication (e.g., OAuth 2.0) for enhanced security and easier key management.
  • Implement mutual TLS or internal API keys for secure communication between internal services.
  • Enable encryption at rest for databases to protect sensitive data.
  • Establish a security monitoring and incident response plan to detect and handle breaches.
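
The review recommends moving to OAuth 2.0, which is not shown here. As a narrower illustration of the rotation and expiration weakness noted above, the following sketch shows keys that are stored only as hashes, expire after a fixed lifetime, and can be rotated with a grace period. The `ApiKeyStore` class and its method names are hypothetical.

```python
import hashlib
import hmac
import secrets
import time
from typing import Callable, Dict, List, Tuple


class ApiKeyStore:
    """Issues expiring API keys and keeps only their SHA-256 digests."""

    def __init__(self, lifetime_seconds: float, clock: Callable[[], float] = time.time):
        self._lifetime = lifetime_seconds
        self._clock = clock
        # client_id -> list of (key digest, expiry timestamp)
        self._keys: Dict[str, List[Tuple[str, float]]] = {}

    def issue(self, client_id: str) -> str:
        key = secrets.token_urlsafe(32)
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        self._keys.setdefault(client_id, []).append((digest, self._clock() + self._lifetime))
        return key  # the plaintext key is returned once and never stored

    def rotate(self, client_id: str, grace_seconds: float) -> str:
        """Issue a new key; existing keys stay valid only for the grace period."""
        cutoff = self._clock() + grace_seconds
        self._keys[client_id] = [
            (digest, min(expires_at, cutoff))
            for digest, expires_at in self._keys.get(client_id, [])
        ]
        return self.issue(client_id)

    def verify(self, client_id: str, presented_key: str) -> bool:
        digest = hashlib.sha256(presented_key.encode("utf-8")).hexdigest()
        now = self._clock()
        for stored, expires_at in self._keys.get(client_id, []):
            # compare_digest avoids leaking match position through timing.
            if now < expires_at and hmac.compare_digest(stored, digest):
                return True
        return False
```

The grace period lets a meal planner application deploy the new key before the old one stops working.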

4. Performance, Scalability, and Resilience

Performance and scalability:

  • Rate Limiting: Implemented at the API Gateway to control client request rates.
  • Containerized Deployments: Using AWS Elastic Container Service allows for horizontal scaling of services.
  • Database Interactions: No specific strategies mentioned for scaling databases.
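
The review does not say which algorithm the API Gateway uses for rate limiting. A token bucket is one common choice, sketched below with one bucket per client; the class names and parameters are illustrative assumptions, not the gateway's actual configuration.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Allows bursts up to `capacity` and a sustained `rate_per_second`."""

    def __init__(self, rate_per_second: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic):
        self._rate = rate_per_second
        self._capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)  # start full so an initial burst is allowed
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self._clock()
        # Refill in proportion to elapsed time, never above capacity.
        self._tokens = min(self._capacity, self._tokens + (now - self._last) * self._rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


class PerClientLimiter:
    """Keeps an independent bucket for each client (for example, each tenant)."""

    def __init__(self, rate_per_second: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic):
        self._args = (rate_per_second, capacity, clock)
        self._buckets: Dict[str, TokenBucket] = {}

    def allow(self, client_id: str) -> bool:
        bucket = self._buckets.get(client_id)
        if bucket is None:
            bucket = self._buckets[client_id] = TokenBucket(*self._args)
        return bucket.allow()
```

Keeping a bucket per client means one noisy meal planner application cannot exhaust the quota of another.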

Potential bottlenecks:

  • ChatGPT-3.5 API: Dependency on an external service could limit performance if not managed properly.
  • Databases: May become a bottleneck under high load without proper scaling and optimization.

Resilience and fault tolerance:

  • Lack of details on failover mechanisms or redundancy for critical components like databases and the API Gateway.
  • No mention of handling service degradation or failures in external dependencies.

Recommendations:

  • Implement autoscaling for containerized services based on load metrics.
  • Use read replicas and database sharding to enhance database performance and availability.
  • Incorporate caching layers to reduce load and latency.
  • Design for graceful degradation when external services fail, possibly with fallback content or messages.

5. Data Management and Storage Security

Data handling and storage:

  • API Database: Stores dietitians' content samples, LLM requests, and responses.
  • Control Plane Database: Contains control plane data, tenant information, and billing details.

Security considerations:

  • No explicit mention of data encryption at rest or in backups.
  • Potential risk of data leaks if client data is not properly segregated.

Data flow optimization:

  • All data exchanges occur over TLS, securing data in transit.
  • The efficiency of data flow between components is not addressed.

Improvements:

  • Implement database encryption to secure data at rest.
  • Enforce strict access controls and auditing on database operations.
  • Use tenant-specific schemas or databases to ensure data segregation and client isolation.
  • Regularly update and maintain database indexes for optimized performance.
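
The "tenant-specific schemas or databases" suggestion above can be illustrated with one database per tenant. In this sketch, in-memory SQLite stands in for the RDS instances, and the `TenantStores` class and the `samples` table layout are hypothetical.

```python
import sqlite3
from typing import Dict, List


class TenantStores:
    """Keeps one isolated database per tenant so queries cannot cross tenants."""

    def __init__(self) -> None:
        self._connections: Dict[str, sqlite3.Connection] = {}

    def _connection(self, tenant_id: str) -> sqlite3.Connection:
        conn = self._connections.get(tenant_id)
        if conn is None:
            # Each ":memory:" connection is a separate, private database.
            conn = sqlite3.connect(":memory:")
            conn.execute(
                "CREATE TABLE samples (dietitian_id TEXT NOT NULL, content TEXT NOT NULL)"
            )
            self._connections[tenant_id] = conn
        return conn

    def add_sample(self, tenant_id: str, dietitian_id: str, content: str) -> None:
        conn = self._connection(tenant_id)
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO samples (dietitian_id, content) VALUES (?, ?)",
                (dietitian_id, content),
            )

    def list_samples(self, tenant_id: str, dietitian_id: str) -> List[str]:
        rows = self._connection(tenant_id).execute(
            "SELECT content FROM samples WHERE dietitian_id = ? ORDER BY rowid",
            (dietitian_id,),
        ).fetchall()
        return [row[0] for row in rows]
```

With this layout, a missing tenant filter in a query cannot leak another tenant's rows, because those rows are not in the same database. The trade-off is more databases to provision, migrate, and back up.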

6. Maintainability, Flexibility, and Future Growth

Maintainability:

  • Containerization: Promotes modularity and ease of deployment.
  • Modularity: Separation of concerns between components aids in maintenance.

Flexibility for new clients and features:

  • The architecture supports onboarding new Meal Planner applications via the Web Control Plane.
  • The use of APIs facilitates integration but lacks detail on extensibility for new features.

Future-proofing strategies:

  • Current design may need adjustments to accommodate technological advancements or significant growth.

Recommendations:

  • Employ infrastructure-as-code tools for consistent and repeatable deployments.
  • Implement continuous integration and continuous deployment (CI/CD) pipelines for rapid iteration.
  • Design APIs with versioning to support future feature enhancements without breaking existing clients.
  • Plan for scalable infrastructure components to handle increased demand.
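
The API versioning recommendation above could be realized with a version prefix in the URL path, as in the sketch below. The route layout (`/v1/generate`), the handler names, and the request fields are assumptions made for illustration; the review does not prescribe a versioning scheme.

```python
from typing import Callable, Dict

# Registry of handlers keyed by API version (hypothetical layout).
HANDLERS: Dict[str, Callable[[dict], dict]] = {}


def version(name: str):
    """Decorator that registers a handler under an API version such as 'v1'."""
    def register(fn: Callable[[dict], dict]) -> Callable[[dict], dict]:
        HANDLERS[name] = fn
        return fn
    return register


@version("v1")
def generate_v1(request: dict) -> dict:
    return {"content": f"Plan for {request['dietitian_id']}"}


@version("v2")
def generate_v2(request: dict) -> dict:
    # v2 adds an optional output format without changing the v1 contract.
    return {
        "content": f"Plan for {request['dietitian_id']}",
        "format": request.get("format", "text"),
    }


def dispatch(path: str, request: dict) -> dict:
    """Route '/<version>/generate' to the handler registered for that version."""
    parts = path.strip("/").split("/")
    handler = HANDLERS.get(parts[0])
    if handler is None or parts[1:] != ["generate"]:
        raise LookupError(f"unsupported route: {path}")
    return handler(request)
```

Existing clients keep calling `/v1/generate` unchanged while new clients opt in to `/v2/generate`.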

7. Potential Risks and Areas for Improvement

Risks and limitations:

  • Third-party Dependencies: Heavy reliance on ChatGPT-3.5 could be a single point of failure.
  • Security Vulnerabilities: Potential weaknesses in authentication and lack of data encryption at rest.
  • Performance Bottlenecks: Database scalability and external API rate limits may hinder performance.

Actionable recommendations:

  • Develop a contingency plan for ChatGPT-3.5 downtime, such as alternative services or offline modes.
  • Enhance security by adopting more robust authentication protocols and encrypting data at rest.
  • Monitor and optimize database performance, and consider scalable database solutions.
  • Conduct regular security assessments and performance testing to identify and mitigate issues early.

8. Document Readability

Inconsistencies and vocabulary:

  • The term "dietitian' content samples" should be corrected to "dietitians' content samples" (plural possessive).
  • Inconsistent use of component names (e.g., "API Application" vs. "Backend API").
  • The roles of the Administrator and their interactions are not clearly depicted.

Suggestions for rewrite:

  • Standardize terminology for components and roles throughout the document.
  • Provide more detailed explanations of how clients are onboarded and managed.
  • Include the Administrator's interactions in the architecture diagram for clarity.
  • Ensure all acronyms and technical terms are defined or explained.

Conclusion

The AI Nutrition-Pro application's architecture demonstrates a well-structured approach with clearly defined components and interactions. The use of containerization and cloud services contributes to scalability and maintainability. Security measures like API key authentication and TLS encryption are in place.

Strengths:

  • Clear Separation of Concerns: Components have well-defined responsibilities.
  • Scalability Potential: Containerization allows for horizontal scaling.
  • Security Foundations: Initial security measures are established.

Critical areas for improvement:

  • Security Enhancements: Strengthen authentication methods, implement data encryption at rest, and enforce strict access controls.
  • Dependency Management: Mitigate risks associated with reliance on external services like ChatGPT-3.5 by implementing caching and fallback options.
  • Data Management: Improve data segregation and optimize data flows to enhance security and efficiency.
  • Documentation Clarity: Resolve inconsistencies and provide detailed explanations where needed to improve understanding.

Addressing these areas will significantly enhance the application's security posture, performance, and ability to scale with future growth and technological advancements.
