The DevOps/CloudOps Engineer is primarily responsible for ensuring that cloud-based infrastructure is designed and deployed securely. Responsibilities include delivering solutions that satisfy functional and user requirements; developing, maintaining, and troubleshooting cloud-based services and network security systems; preventing misuse and malicious behavior; outlining constraints and restrictions within security policy; and scripting and documenting processes.
This engineering group at Rooms To Go uses "serverless" infrastructure where possible. Its primary responsibility is architecting, implementing, maintaining, and monitoring AWS services. Wherever we can, we use (pay for) existing services instead of re-solving "solved problems".
- Primary: Route53, CloudFront, WAF, API Gateway, ALB, ECS (Fargate), Lambda, S3, DynamoDB
- Secondary: ASG, EC2, ECR, EBS, SNS, SQS, SES, CloudWatch, Athena, Secrets Manager, RDS (Oracle)
- Strong understanding of security/permissions: Roles, Policy Documents, Security Groups, ACLs
- Redundancy, Fault Tolerance and response planning, Backups, Disaster Recovery
- GitHub Enterprise, MongoDB Atlas, Slack, GetGuru, Jira, Sentry.io, Contentful, Algolia
- Linux (Debian/Ubuntu, CentOS/Amazon Linux 2), macOS, Jenkins, OpenVPN, SSH
- Scripting
- Development of (or finding/repurposing) tools to make Operations reliable, repeatable, efficient, and silo-free
- Terraform
- Development (organized into modules) of 5 separate 100% Infrastructure as Code AWS accounts (envs)
- Bash
- Provisioning and Interrogating via the CLI then scripting it for posterity
- aws-cli
- Maintenance or R&D done via the CLI is easily documented and scripted
- git cli
- merge, cherry-pick, rebase -i, add -i, diagnose and fix mistakes using .git/logs
- Node.js & jq
- In a Node shop you should be able to understand the code
- Deep dive into the software stack and troubleshoot alongside developers
- Python & boto3
- Bash is not suited for everything
- AWS Lambda with SNS or CloudWatch Events
- Many needs that used to be addressed with cron jobs or daemons can better be done with Lambda
- Docker
- Containerizing tools ensures readiness in an emergency (no "I can't find the script" or "my environment is broken")
- vim/emacs, ssh, tmux/screen, etc.
- You have to be able to get work done on a remote machine
- systemd, cron, fstab, parted/mkfs, iptables, port forwarding, tunneling
- CPU/Mem/IO profiling, packet inspection, log inspection, dependency resolution
- TCP, ICMP, HTTP, DNS, TLS
- Ability to perform problem solving in a complex, demanding environment by drawing on a pool of technical experience, business understanding, and good judgment
- Resourceful, creative, innovative, results driven, and adaptable
- Senior-level skills in scripting, automation, Infrastructure as Code, and AWS CLI
- Solid experience with the AWS services listed under "Responsibilities"
- AWS/Azure certification, or other Information Security certifications
- 3 years of security experience
- 8 years of IT experience (server/desktop hardware & software, networking, storage, disaster recovery, backup/restore)
- 3 years of cloud engineering, security, and/or cloud IT experience (SaaS, PaaS, IaaS across AWS, Azure, or Google Cloud)
- A degree in Computer Science, Computer Engineering, Electrical Engineering, or MIS (or equivalent experience)