mcchandu / rds-proxy.tf
Created May 31, 2023 05:15 — forked from MikeZ77/rds-proxy.tf
RDS Proxy
resource "aws_db_proxy_default_target_group" "rds_proxy_target_group" {
db_proxy_name = aws_db_proxy.db_proxy.name
connection_pool_config {
connection_borrow_timeout = 120
max_connections_percent = 100
}
}
resource "aws_db_proxy_target" "rds_proxy_target" {
mcchandu / helm-prometheus-grafana.md
Created September 10, 2022 14:03 — forked from Warns/helm-prometheus-grafana.md
Install Prometheus and Grafana on a Kubernetes cluster, along with a data source and persistent storage.

Installing Prometheus and Grafana

Add and update repos to fetch charts

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
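
To confirm the charts are now visible locally, you can optionally search the freshly updated repos:

helm search repo prometheus-community
helm search repo grafana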

Create monitoring namespace
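
The commands below are a minimal sketch of this and the remaining steps, assuming the stock prometheus and grafana charts and a simple persistence flag; the exact values and data-source wiring in the original notes may differ:

kubectl create namespace monitoring
helm install prometheus prometheus-community/prometheus --namespace monitoring
helm install grafana grafana/grafana --namespace monitoring --set persistence.enabled=true

Grafana can then be pointed at the in-cluster Prometheus service as its data source.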

How to Set Kubernetes Resource Requests and Limits - A Saga to Improve Cluster Stability and Efficiency

A mystery

So, it all started on September 1st, right after our cluster upgrade from 1.11 to 1.12. Starting almost the very next day, we began to see kubelet alerts reported by Datadog. On some days we would get a few (3-5) of them; on other days we would get more than 10 in a single day. The alert monitor is based on the Datadog check kubernetes.kubelet.check, and it is triggered whenever the kubelet process is down on a node.

We know the kubelet plays an important role in Kubernetes scheduling. Without it running properly, a node is effectively removed from the functional cluster, and the more nodes with a problematic kubelet, the more the cluster degrades. Now, imagine waking up to