@johnbuhay
Last active September 18, 2023 12:31
{
"meta": {
"theme": "onepage"
},
"basics": {
"name": "John Buhay",
"label": "Staff Site Reliability Engineer",
"image": "https://avatars.githubusercontent.com/u/6697003?v=4",
"summary": "Staff Site Reliability Engineer with 16 years of experience implementing Information Technology solutions for organizations in the real-estate and health care sectors. Nine years of experience operating and deploying diverse workloads on Kubernetes, with a focus on Cloud Native Computing Foundation (CNCF) technologies. Committed to orchestrating detailed planning and harnessing technology as a strategic driver to optimize costs, while simultaneously championing business continuity, leading technology transformation, and facilitating modernization efforts.",
"website": "https://johnbuhay.github.io",
"email": "buhaydev@gmail.com",
"location": {
"city": "Brooklyn",
"countryCode": "US",
"region": "New York"
},
"profiles": [
{
"url": "https://github.com/johnbuhay",
"username": "johnbuhay",
"network": "github"
},
{
"network": "LinkedIn",
"username": "johnbuhay",
"url": "https://www.linkedin.com/in/johnbuhay"
}
]
},
"awards": [
{
"title": "May '22 Forever Award recipient",
"date": "2022-05-01",
"awarder": "WeWork Technology",
"summary": "WeWork Tech awards employees with an NFT who excel in contribution and make significant impact on teams, projects and product. This is a peer nominated award reviewed and approved by a committee."
}
],
"education": [
{
"institution": "GopherCon Denver",
"area": "Golang",
"studyType": "Conference",
"startDate": "2018",
"endDate": "2018"
},
{
"institution": "KubeCon + CloudNativeCon North America",
"area": "Open Source Software",
"studyType": "Conference and booth (2018)",
"startDate": "2017",
"endDate": "2019"
},
{
"institution": "AnsibleFest NYC",
"area": "DevOps",
"studyType": "Conference",
"startDate": "2016",
"endDate": "2016"
},
{
"institution": "Scrum Aliance",
"area": "Scrum Master",
"studyType": "Certified",
"startDate": "2014",
"endDate": "2014"
},
{
"institution": "New York University SCPS, New York, NY",
"area": "Java Programming",
"studyType": "Certificate",
"startDate": "2009",
"endDate": "2009"
},
{
"institution": "Fordham University, Bronx, NY",
"area": "Computer Science",
"studyType": "Bachelor of Science",
"startDate": "2003",
"endDate": "2007"
}
],
"work": [
{
"company": "WeWork Inc",
"position": "Senior Site Reliability Engineer / Developer Platform Team",
"startDate": "2019-04",
"endDate": "2023-08",
"summary": "Achieved various technical and operational goals including designing and executing the Developer Platform, establishing and maintaining business continuity procedures, enhancing site reliability for both internal and external web applications, and managing the adoption, integration, migration, or retirement of software components, whether they were developed in-house or acquired externally through company partnerships or acquisitions.",
"highlights": [
"Coached the platform team to migrate developer tools and client applications from the roll-your-own clusters to the new version of the platform, using AWS Elastic Kubernetes Service (EKS).",
"Designed, implemented and maintained the business continuity plan (BCP) contributing to the success of annual SOX audits, 3 Tender Offers, and ultimately IPO via SPAC in 2021",
"Created and implemented personalized plans to unify ten products, each with diverse system architectures and technology stacks, under a single API gateway. This initiative encompassed more than 1000 services and jobs, resulting in significant enhancements in reliability, performance, and cost-effectiveness.",
"Recovered an at-risk project for GDPR compliance by spiking on Temporal and delivering 2 environments in 6 weeks",
"Managed features and drove platform adoption, leading to improved site availability cost reduction from other SaaS providers, realizing an average 70% cost reduction.",
"Adopted a critical yet orphaned system that provides configuration to 100% of workloads",
"Reduced the time to deliver a new app from days to minutes by supporting a self-service model for developers after building tools that could create and scaffold several different project types, while maintaining and promoting best practices for WeWork’s security and compliance requirements.",
"Conducted regular analysis for system availability, performance optimization, security risk and incident response review",
"Saved millions year over year planning and executing cross team solutions for cost and performance",
"Investigated and reported areas of technical improvements in low level detail",
"Rescued or replaced at-risk projects for the Technology organization",
"Developed organization wide standardization on building, deploying, and maintaining services on our kubernetes platform"
]
},
{
"company": "WeWork",
"position": "Site Reliability Engineer / Service Infrastructure Team",
"startDate": "2018-04",
"endDate": "2019-04",
"summary": "Helped pivot team and role responsibilities to a SRE focused structure, interviews 50 candidates and helped grow team to 40 people across Tel Aviv, New York City, and San Francisco locations",
"highlights": [
"Identified and proposed solutions to 13 reliability issues across several platform functions, including observability, api gateway and CI/CD pipelines.",
"Consolidated providers, services, and strategies, while migrating products and teams to the new platform saving 18M/year.",
"Developed a self-service platform, tools, pipelines, processes, and policies that became the standard for all teams.",
"Provided subject matter expertise to other teams on platform migrations or optimized greenfield project deliveries",
"Initiated the Observability team by hiring, training and supporting the team in the logging and metrics infrastructure who ultimately took over and improved the stack and provided the alternative to Datadog and NewRelic on the platform",
"Created text and video materials for training, documenting and on-boarding a wide level of executives and engineers to the WeWork platform",
"Implemented automated guardrails to enhance service reliability and mitigate common outage scenarios",
"Drove platform adoption leading to savings from other SaaS providers seeing an average 70% cost reduction (Heroku ~$3M/year, NewRelic $1.8M/year saved)",
"On-boarded new team members to CNCF OSS used to provide the ever expanding feature set of the platform",
"Helped pivot team and role responsibilities to a SRE focused structure, interviewed 50 candidates and helped grow team to 40 people across Tel Aviv, New York City, and San Francisco locations",
"Migrated workloads from our version 1 of the internal platform to version 5",
"Identified and proposed solutions to 13 reliability issues across several platform functions including observability, traffic gateway and CI/CD pipelines",
"Helped propose and plan the execution of delivering a unified api gateway for the entire department using the nginx-ingress versus multiple hops through various proxies",
"Contributed to the design and goals of proprietary software for developers to use and achieve regulation compliance reducing time to delivery from weeks to minutes"
]
},
{
"company": "WeWork",
"position": "DevOps Engineer / DevOps Team",
"startDate": "2016-08",
"endDate": "2018-04",
"summary": "Hired to accelerate the delivery and adoption of Kubernetes for the applications that were mixed between Heroku and various AWS services. Introduced DevOps culture along with the new technologies that enabled the platform for the growing team of engineers.",
"highlights": [
"Using Terraform and Ansible, provided and maintained 5 Kubernetes clusters as a platform for WeWork's most critical applications such as Identity and Physical Access",
"Re-envisioned the features and capabilities of the platform and supported adoption to more teams and workloads, doubled the number of services provided by the platform",
"Identified 3 high severity vulnerabilities in WeWork's stack and proposed solutions for immediate remediation",
"Collaborated on an internal cli tool to decrease the level of entry for hundreds of developers to manage their own services on Kubernetes",
"Designed and supported continuous deployment pipelines in Jenkins for 20 early adopter services; eventually migrated all pipelines to CircleCi for development teams to maintain their own workflows and deprecated the use of Jenkins",
"Trained team members in any/all areas of the platform and prepared them for On-Call responsibilities",
"Conducted regular analysis for system availability, performance optimization, and incident response review."
]
},
{
"company": "WeightWatchers International",
"position": "DevOps Engineer / Automation Team",
"startDate": "2014-05",
"endDate": "2016-08",
"summary": "Transitioned the company's entire technology stack from a waterfall, 2-time per year, scheduled monolith release into a microservice architecture spanning across multiple cloud providers and datacenter, while adopting an agile-driven release cycle.",
"highlights": [
"Deployed the first production Kubernetes cluster for WeightWatchers in August 2015",
"Collaborated in a team of 5 on the development and maintenance of 6 Kubernetes clusters relied upon by both customers (2M daily active subscribers) and about 100 developers.",
"Created a python command line tool that allowed 50 developers to create and manage their own applications.",
"Contributed to the implementation of automated deployments in k8s and service registration in our custom ingress solution",
"Using Ansible and Groovy, developed scripts to configure Jenkins and Artifactory who create build, test, and deploy jobs from code",
"Developed several Jenkins CI/CD pipelines for multiple teams with different disciplines including NodeJS and Scala",
"Built an automated system for Datadog integration with various team's containerized applications to automatically gather metrics, monitor performance, and alert surpassed thresholds to groups and individuals",
"Managed hundreds of linux servers in Rackspace using Ansible",
"Migrated services from Rackspace to AWS using Ansible in 2016"
]
},
{
"company": "WeightWatchers",
"position": "Engineering Lead for Franchise Marketing and Microsite Projects / Engineering Team",
"startDate": "2014-03",
"endDate": "2014-05",
"summary": "Oversaw the deployment of time-sensitive content for the Franchise Marketing and Microsite mobile sites."
},
{
"company": "WeightWatchers",
"position": "Software Engineer / Engineering Team",
"startDate": "2011-01",
"endDate": "2014-03",
"summary": "Responsible for troubleshooting, implementing and validating hot fixes for production issues"
},
{
"company": "WeightWatchers",
"position": "Project Coordinator, eCommerce and CMS Application Lead / Content Operations Team",
"endDate": "2011-01",
"startDate": "2007-09",
"summary": "Managed projects involving multiple teams for feature development, marketing campaigns, eCommerce sales, advertising and content maintenance for US and 10 international subscription and non-subscription sites including 3 multilingual sites."
}
]
}