- Container Runtime: Are there special requirements?
- Which runtimes would work on AWS (ECS, EKS, Beanstalk, EC2)? Which one is preferred?
- Network
- Persistent Storage (why, how much)
- Session stickiness
- Logging (ELK-Stack possible?)
- Secrets/Credentials management
- Telemetry/Metrics
  - Prometheus/Grafana required, or are other tools possible (CloudWatch, Datadog, ...)?
- AWS Managed Elasticsearch compatibility
  - Min/Max Version
- AWS Aurora Postgres compatibility
  - Min/Max Version
  - Supported?
- RDS Postgres compatibility
  - Min/Max Version
- AWS ElastiCache compatibility
  - Min/Max Version
- Container Runtime requirements (same as above).
- autoscaling, load testing
- What are the bottlenecks (CPU, RAM, Network, DB)?
- Postgres: read-only replica supported for Public API to allow scaling read performance?
- Re-Sync ES from main data storage Postgres
- Restore Postgres from Snapshot
- Prod ⇨ Dev Data Sync
- ES minor version update
- ES major version update
- (Aurora) Postgres minor version update
- (Aurora) Postgres major version update
- LivingDocs version update during live traffic / work hours?
- Reproducible Build Process
- Do we always need external help when doing releases?
- How would we allow LivingDocs to access our infrastructure (SSM/IAM or SSH)?
- What are the boundaries for a shared operation mode? Who is allowed to do what? Who is responsible if something fails?
- What are the SLAs for the self-hosted and SaaS variants?
- Response times and on-call times?
- Disaster recovery? MTTR, MTBF, MTTF?
- Would Multi-Region support make the Editor and the Backend more resilient?
- Deployment (blue/green, canary for testing)
- How do Customizations complicate stuff (build-process, deployment, updates, upgrades, migrations, multi-stage)?
- As Redis is an optional component, what happens once it becomes unavailable?
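Several of the operational questions above (snapshot restore, read replicas for the Public API, Postgres major version updates) map to standard AWS CLI calls. A sketch, assuming RDS Postgres and hypothetical instance/snapshot identifiers:

```shell
# All identifiers below are placeholders -- adjust to the actual environment.

# Restore a Postgres instance from an automated or manual snapshot:
aws rds restore-db-instance-from-db-snapshot \
  --db-instance-identifier livingdocs-pg-restored \
  --db-snapshot-identifier livingdocs-pg-snapshot-example

# Add a read-only replica so the Public API can scale read performance:
aws rds create-db-instance-read-replica \
  --db-instance-identifier livingdocs-pg-replica-1 \
  --source-db-instance-identifier livingdocs-pg

# Major version upgrade (minor versions can be handled automatically
# via --auto-minor-version-upgrade on the instance):
aws rds modify-db-instance \
  --db-instance-identifier livingdocs-pg \
  --engine-version 11.6 \
  --allow-major-version-upgrade \
  --apply-immediately
```

For Aurora clusters the equivalent calls are `restore-db-cluster-from-snapshot` and `modify-db-cluster`; either way, downtime behaviour during `--apply-immediately` should be tested before relying on it during work hours.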
As provided by Gabriel Hase
- 2 Postgres Hosts with Master-Slave Replication (2 x 160USD)
- 32GB memory, 320GB ssd disk
- 1-year-all-upfront ~ 37%
- 3-year-all-upfront ~ 59%
- 3 - 5 Elasticsearch Hosts for Documents, Images and Publications (amount depending on indexed document size)
- 32GB memory, 8CPUs, 640GB ssd disk (min 300GB, 1GB/s throughput)
It's recommended to use dedicated master nodes (min. 3) and data nodes. Dynamically adding/removing nodes is possible.
- 1-year-all-upfront ~ 35%
- 3-year-all-upfront ~ 52%
- 4 Workers for Applications (4 x 160USD)
- 16GB memory, 320GB ssd disk
- 1-year-all-upfront ~ 35%
- 3-year-all-upfront ~ 54%
Hardware requirements uncertain.
Instance | Count | CPU | MEM | Costs Per Hour | Costs Per Month | Total Costs Per Month |
---|---|---|---|---|---|---|
db.r5.xlarge | 2 | 4 | 32 | 0.70 | 504 | 1008 |
r5.xlarge.elasticsearch | 3 | 4 | 32 | 0.448 | 322 | 967 |
t2.small.elasticsearch | 3 | 1 | 2 | 0.042 | 30 | 90 |
ECS/EC2 m5.xlarge | 4 | 4 | 16 | 0.230 | 165 | 662 |
** Costs per {Hour,Month} in US$ for a single instance
- Database is way too overprovisioned: @welt.de we managed around 1,000 to 1,500 write queries per second (Piwik, unsampled, AMP & WWW traffic) using a single `db.r4.xlarge`
- Elasticsearch is also overprovisioned: @welt.de we used 3 x `r5.large.elasticsearch` for the production API; the editors should require significantly less
- The worker nodes are also way too overprovisioned: @welt.de we used an ECS cluster with 10 x `m5.xlarge` for everything (Backends, Frontends, Feeds, Public API, ...)
Todo (Costs):
- LoadBalancing
- Data Transfer
- Images
- Video
- Storage
- ES
- RDS
- EBS
- MAM: Image cropping + Supported Image formats (webp, png, jpg...)
- Personnel Costs
- Building+Deploying Software
- AWS Admin
- Database Admin: Automation, Backup, Restore, Tuning, Monitoring