Rafael Jesus rafaeljesus

## FB-PE-InterviewTips.md

      
Created February 28, 2020 22:27, forked from ameenkhan07/FB-PE-InterviewTips.md

Facebook Production Engineering Interview

What to Expect and Tips

• 45-minute systems interview focused on responding to real-world problems with an unhealthy service, such as a web server or database. The interview starts with high-level troubleshooting of a likely scenario, then digs deeper to find the cause and some possible solutions. The goal is to probe your knowledge of systems at scale and under load, so keep in mind the challenges of the Facebook environment.
• Focus on things such as tooling, memory management and the Unix process lifecycle.

Systems

More specifically, Linux troubleshooting and debugging. Understanding things like memory, I/O, CPU, the shell, etc. would be pretty helpful. Knowing how to actually write a Unix shell would also be a good idea. What tools might you use to debug something? On another note, this interview will likely push the boundaries of what you know (and how to implement it).
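
As a rough illustration of what "writing a Unix shell" involves, the sketch below shows the fork/exec/wait cycle that sits at the core of the process lifecycle. It is a minimal example under stated assumptions, not something taken from the interview material: the names `run_command` and `repl` are hypothetical, and pipes, redirection, job control and builtins such as `cd` are deliberately left out.

```python
import os
import shlex


def run_command(line: str) -> int:
    """Run one command line the way a minimal shell would: fork, exec, wait."""
    argv = shlex.split(line)
    if not argv:
        return 0  # empty input: nothing to run
    pid = os.fork()
    if pid == 0:
        # Child: replace this process image with the requested program.
        try:
            os.execvp(argv[0], argv)
        except OSError:
            os._exit(127)  # conventional "command not found" status
    # Parent: block until the child exits, then decode its wait status.
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        return 128 + os.WTERMSIG(status)  # shell convention for signal deaths
    return os.WEXITSTATUS(status)


def repl() -> None:
    """Read-eval loop; 'exit' or EOF ends the session."""
    while True:
        try:
            line = input("$ ")
        except EOFError:
            break
        if line.strip() == "exit":
            break
        print(f"[status {run_command(line)}]")
```

Calling `repl()` starts an interactive prompt. If the parent never called `waitpid`, exited children would linger as zombies, which is the kind of lifecycle detail this interview tends to probe.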

Design/Architecture

Interview is all about taking an ambiguous question of how you might build a system and letting

  
## platform-teams.txt
# Platform Teams

- A developer who is familiar with one system can easily navigate the next, with minimum frustration.
- Imagine walking in a new city without maps; distributed systems are far harder to reason about than cities (Venice excluded, of course).
- Help to lighten the cognitive load for developers (the amount of working memory a developer needs to understand and build upon an existing system).
- Very often, reducing cognitive load means both reducing the number of ways a system can be built and making sure there is a single, consistent way to deploy, observe and monitor it.
- Common optimization tasks look like:
* creating new services
* testing services
* deploying changes to services safely

## tf-maps-lookups.tf
# variables.tf
variable "env" {
  description = "env: stg or prod"
}

variable "image_name" {
  type        = "map"
  description = "Image for container."
  default     = {
    dev  = "occollector:latest"

## haproxy_rate_limiting.md

      
Created July 4, 2019 21:16, forked from procrastinatio/haproxy_rate_limiting.md

Rate limiting with HAProxy

Introduction

HAProxy is primarily a load balancer and proxy for TCP and HTTP, but it can also act as a traffic
regulator. It can protect against DDoS and service abuse by maintaining a wide variety
of statistics (IP, URL, cookie); when abuse is detected, actions such as denying the request or redirecting it to another backend
can be taken ([haproxy ddos config], [haproxy ddos]).
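
The idea of keeping per-client statistics and acting once a threshold is crossed can be sketched in a few lines of Python. This is only an illustration of the concept, not HAProxy configuration and not how HAProxy implements it internally; the class `RateTracker`, its parameters and the sliding-window approach are all assumptions made for the example.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class RateTracker:
    """Track accepted request timestamps per key (e.g. client IP) in a sliding window."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window = window_seconds
        self.hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        """Return True if the request fits the limit, False if it should be rejected."""
        now = time.monotonic() if now is None else now
        timestamps = self.hits[key]
        # Forget requests that have fallen out of the window.
        while timestamps and now - timestamps[0] >= self.window:
            timestamps.popleft()
        if len(timestamps) >= self.limit:
            return False  # over the limit: a proxy would deny or redirect here
        timestamps.append(now)
        return True
```

For example, `RateTracker(limit=100, window_seconds=10)` would let each key through at most 100 times in any 10-second window. The key could just as well be a URL or a cookie value, matching the kinds of statistics mentioned above.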

  
## prom_slo_affected_alert.yaml
rules:
  - alert: PrometheusSLOAffected
    annotations:
      summary: 'Prometheus SLO is affected'
      description: 'Prometheus SLO has been affected during the last 15m {{humanize $value}}%'
      runbook_path: platform/prometheus-slo-affected.md
    expr: prometheus:composed_slo5m < 0.999
    for: 15m
    labels:
      severity: page

## prom_slo_viol_alert.yaml
rules:
- alert: PrometheusSLOViolation
  annotations:
    description: 'Prometheus SLO is violated {{humanize $value}}%'
    runbook_path: platform/prometheus-slo-violation.md
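  # hour() and day_of_week() are evaluated in UTC (0 = Sunday); as written this matches hours 10-17, Monday to Thursday.
  expr: prometheus:composed_slo4w < 0.999 and (hour() > 9 < 18 and day_of_week() > 0 < 5)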
  labels:
    severity: page
    team: platform
    slack: platform-alerts
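
The time gate in the expression above is easy to misread, so here is a small Python sketch of how it appears to evaluate. It assumes the chained comparisons act as successive filters (`hour() > 9`, then `< 18`) and that both functions work in UTC with Sunday as day 0; the function name `in_paging_window` is hypothetical.

```python
from datetime import datetime, timezone


def in_paging_window(ts: datetime) -> bool:
    """Mirror `hour() > 9 < 18 and day_of_week() > 0 < 5` for a timezone-aware timestamp."""
    ts = ts.astimezone(timezone.utc)
    hour = ts.hour                         # 0-23, in UTC
    day_of_week = (ts.weekday() + 1) % 7   # convert Monday=0 to Sunday=0
    return 9 < hour < 18 and 0 < day_of_week < 5
```

Under these assumptions the window is 10:00 to 17:59 UTC, Monday to Thursday. If Friday or the 09:00 hour were meant to be included, the bounds would need `>=` and `< 6` respectively.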

## kube_prom_comp.yaml
- record: prometheus_infrastructure_team:up:max_avg_over_time5m
  expr: >
    max (
      avg_over_time (
        up{job="prometheus-infra"}[5m]
      )
    )
- record: prometheus_infrastructure_team:up:avg_over_time4w
  expr: >
    avg_over_time (

## prom_comp_slo.yaml
- record: prometheus:composed_slo4w
  expr: >
    (
      prometheus_product_team:up:max_avg_over_time4w +
      prometheus_infrastructure_team:up:max_avg_over_time4w
    ) / 2
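
To make the arithmetic behind this rule concrete, the sketch below averages two availability ratios the same way and works out how much downtime a 0.999 target leaves over four weeks. The function names and the sample values are assumptions chosen for illustration; only the averaging and the 0.999 threshold come from the rules in this listing.

```python
def composed_slo(product_availability: float, infra_availability: float) -> float:
    """Average two availability ratios, as the recording rule does."""
    return (product_availability + infra_availability) / 2


def downtime_budget_minutes(target: float, window_days: int = 28) -> float:
    """Minutes of full unavailability that a target allows over the window."""
    return (1 - target) * window_days * 24 * 60


# Hypothetical sample: one side fully available, the other at 99.7%.
sample = composed_slo(1.0, 0.997)
alert_would_fire = sample < 0.999
```

With a 0.999 target the four-week budget comes to roughly 40 minutes. Because the two sides are averaged, an outage on one side alone consumes the composed budget at half the rate.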

## monzo-alertmanager-receiver.yaml
receivers:
###################################################
## Slack Receivers
- name: slack-code-owners
  slack_configs:
  - channel: '#{{- template "slack.monzo.code_owner_channel" . -}}'
    send_resolved: true
    title: '{{ template "slack.monzo.title" . }}'
    icon_emoji: '{{ template "slack.monzo.icon_emoji" . }}'
    color: '{{ template "slack.monzo.color" . }}'

## Gopkg.toml
[[constraint]]
  name = "go.opencensus.io"
  version = "0.17.0"