@miguelgr
Created December 14, 2023 12:29
ecgs service

Electrocardiogram Service

Requirements

Functional Requirements:

  • API endpoint to receive the ECGs for processing.

  • API endpoint to return the insights associated with an ECG, designed so that further analytics can be added later.

    Initial insight: the number of zero crossings in the signal.
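A minimal sketch of the zero-crossing insight follows. The function name `count_zero_crossings` is hypothetical, and the treatment of exact zeros (they are skipped, so a crossing is counted only between two non-zero samples of opposite sign) is an assumption, since the requirements do not say how zeros should be handled.

```python
from typing import Sequence


def count_zero_crossings(signal: Sequence[float]) -> int:
    """Count sign changes between consecutive non-zero samples."""
    crossings = 0
    previous_sign = 0  # 0 means "no non-zero sample seen yet"
    for sample in signal:
        if sample == 0:
            continue  # assumption: an exact zero is not a crossing by itself
        sign = 1 if sample > 0 else -1
        if previous_sign != 0 and sign != previous_sign:
            crossings += 1
        previous_sign = sign
    return crossings
```

For example, `count_zero_crossings([1, -1, 1])` counts two crossings, and `count_zero_crossings([1, 0, -1])` counts one.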

Non Functional Requirements:

  • Highly Performant: Expect hundreds of users, with a peak rate of 10 requests per minute.

  • Highly Scalable: The volume of data will be substantial enough that processing a signal can take minutes.

  • Secure: Integrate securely with third parties/external clients.

System Design Solution

The system is composed of a load balancer in front of the service nodes, which run the API and communicate with a message queue (broker) so that ECG data is processed asynchronously on the worker nodes.

Since the API and Worker (data processing) services are decoupled, the system should be able to scale vertically and horizontally under high-volume circumstances. Adding more powerful machines with more CPUs and/or more nodes during traffic peaks increases the processing throughput.

API

Functionality

  • Implement OAuth2 authentication through JWTs.

  • Implement an endpoint for ECG creation and internal updates.

    Once the worker is done processing, it updates the ECG insights data through the API.

  • Implement endpoint for ECGs insights.
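To illustrate the JWT part of the authentication bullet, here is a standard-library sketch of signing and verifying a token. Everything specific in it is an assumption: the design does not name a signing algorithm (HS256 is used here), the helper names `sign_token` and `verify_token` are hypothetical, and the OAuth2 token issuance flow is not shown. A real service would more likely rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # JWTs drop base64 padding, so restore it before decoding.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _signature(signing_input: str, secret: bytes) -> bytes:
    return hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()


def sign_token(claims: dict, secret: bytes) -> str:
    """Build an HS256-signed token (assumed algorithm)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    return f"{signing_input}.{_b64url_encode(_signature(signing_input, secret))}"


def verify_token(token: str, secret: bytes, now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if the signature and expiry are valid, else None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        if header.get("alg") != "HS256":
            return None  # never trust the algorithm named by the token
        expected = _signature(f"{header_b64}.{payload_b64}", secret)
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return None
        claims = json.loads(_b64url_decode(payload_b64))
    except (ValueError, TypeError, AttributeError):
        return None  # malformed token
    if not isinstance(claims, dict):
        return None
    current = time.time() if now is None else now
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= current:
        return None  # this sketch requires an unexpired "exp" claim
    return claims
```

An endpoint would call `verify_token` on the bearer token from the `Authorization` header and reject the request when it returns `None`.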

High Performance

If maximum concurrency is required, the system handles requests concurrently using Python's asyncio and an ASGI server. This configuration allows high performance, since new requests can be accepted while others are still being processed.

For a simple solution and an expected rate of 10 requests/minute, I would avoid introducing asyncio (async creep) and use a threaded WSGI server.
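As a rough illustration of the threaded WSGI option, the sketch below uses only the standard library. The `/ecgs` path, the `202 Accepted` response, and port 8000 are hypothetical choices, and `wsgiref` is a reference server rather than a production one, so a real deployment would likely use a dedicated WSGI server instead.

```python
import json
from socketserver import ThreadingMixIn
from wsgiref.simple_server import WSGIServer, make_server


class ThreadingWSGIServer(ThreadingMixIn, WSGIServer):
    """WSGI server that handles each request in its own thread."""

    daemon_threads = True


def app(environ, start_response):
    """Hypothetical endpoint: acknowledge an ECG upload, 404 otherwise."""
    if environ["REQUEST_METHOD"] == "POST" and environ["PATH_INFO"] == "/ecgs":
        status, body = "202 Accepted", {"status": "queued"}
    else:
        status, body = "404 Not Found", {"error": "not found"}
    payload = json.dumps(body).encode("utf-8")
    start_response(status, [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(payload))),
    ])
    return [payload]


def serve(host: str = "127.0.0.1", port: int = 8000) -> None:
    """Run the app until interrupted (blocks the calling thread)."""
    with make_server(host, port, app, server_class=ThreadingWSGIServer) as server:
        server.serve_forever()
```

Calling `serve()` starts the server. Because the handler is a plain synchronous function, no async code leaks into the rest of the service.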

Scalability

The system scales horizontally and independently if required. A load balancer in front of the API services could distribute the load using sticky round-robin.

Nginx works well as a reverse proxy and load balancer.
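The idea behind sticky round-robin can be sketched in a few lines: new clients are assigned to backends in rotation, and a client that has already been seen keeps its backend. This is only an illustration of the policy, not how Nginx implements it. The class name `StickyRoundRobin` and the backend names are hypothetical.

```python
import itertools
from typing import Dict, Iterable


class StickyRoundRobin:
    """Assign new clients in rotation; returning clients keep their backend."""

    def __init__(self, backends: Iterable[str]) -> None:
        backend_list = list(backends)
        if not backend_list:
            raise ValueError("at least one backend is required")
        self._rotation = itertools.cycle(backend_list)
        self._assigned: Dict[str, str] = {}

    def pick(self, client_id: str) -> str:
        # Only clients without an assignment advance the rotation.
        if client_id not in self._assigned:
            self._assigned[client_id] = next(self._rotation)
        return self._assigned[client_id]
```

With backends `["api-1", "api-2"]`, clients `a`, `b`, and `c` are assigned `api-1`, `api-2`, and `api-1` in that order, and any later request from `a` goes to `api-1` again.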

Security

The API implements OAuth2 authentication, which makes it easy for third parties to integrate. All traffic should go over HTTPS so that data is encrypted in transit at the transport layer (TLS).

Worker

Functionality

  • Read processing tasks from a queue.

  • Process ECGs data.

  • Update ECGs insights through API calls.
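The three responsibilities above can be sketched as a single loop. In this sketch an in-memory `queue.Queue` stands in for the real broker and a callback stands in for the API call; the task shape (`ecg_id`, `signal`), the `STOP` sentinel, and the names `run_worker` and `publish_insights` are all hypothetical.

```python
import queue
from typing import Callable, Dict, Sequence

STOP = None  # sentinel that tells the worker to shut down


def count_zero_crossings(signal: Sequence[float]) -> int:
    """Count sign changes between consecutive non-zero samples."""
    signs = [1 if sample > 0 else -1 for sample in signal if sample != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def run_worker(
    tasks: "queue.Queue",
    publish_insights: Callable[[str, Dict[str, int]], None],
) -> int:
    """Consume tasks until the sentinel arrives; return how many were processed."""
    processed = 0
    while True:
        task = tasks.get()  # blocks until a task is available
        if task is STOP:
            break
        insights = {"zero_crossings": count_zero_crossings(task["signal"])}
        # In the real service this would be an authenticated API call.
        publish_insights(task["ecg_id"], insights)
        processed += 1
    return processed
```

Keeping the publish step behind a callback means the loop can be exercised without a running API or broker.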

High Performance

Since data processing is, so far, CPU-intensive, we won't benefit from threads (the GIL becomes the bottleneck) or from async.

The service creates a pool of processes to handle tasks in parallel. The pool size typically defaults to the number of physical cores available.
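A process pool along these lines could look like the sketch below, using `concurrent.futures.ProcessPoolExecutor`. The function names are hypothetical. Note one assumption: `os.cpu_count()` reports logical CPUs, which can be higher than the number of physical cores, so the default here is only an approximation of "physical cores".

```python
import os
from concurrent.futures import ProcessPoolExecutor
from typing import Iterable, List, Optional, Sequence


def count_zero_crossings(signal: Sequence[float]) -> int:
    """CPU-bound task: count sign changes between non-zero samples."""
    signs = [1 if sample > 0 else -1 for sample in signal if sample != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def default_pool_size() -> int:
    # os.cpu_count() counts logical CPUs and may return None.
    return os.cpu_count() or 1


def process_batch(
    signals: Iterable[Sequence[float]],
    max_workers: Optional[int] = None,
) -> List[int]:
    """Process signals in parallel worker processes, preserving input order."""
    workers = max_workers or default_pool_size()
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(count_zero_crossings, signals))
```

On platforms that start child processes with the spawn method, `process_batch` should be called from code guarded by `if __name__ == "__main__":` so that children can import the module safely.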

Scalability

This design scales well, either by adding more resources to the worker machines or by adding more worker machines.

Considerations

Consider an OLAP database, such as ClickHouse, to efficiently store and analyze streaming data, especially since healthcare systems generate time-series data from patient monitors, wearable devices, and other health monitoring equipment.

Consider using asyncio with FastAPI and uvicorn for maximum throughput.

Consider using a robust message queue system to ingest data. For example, Kafka or RabbitMQ can stream data into ClickHouse.
