@jmnavarro
Last active December 16, 2021 10:53

Who are we?

At urbanData Analytics, domain experts and engineers team up to provide advisory services and develop software and data products for the Real Estate industry, one of the few big sectors still on its way to being fully digitalized.

From a technical standpoint, our stack goes all the way from our big data and machine learning engines to our web and Excel (believe it or not, we are really proud of this one!) frontends. You can learn more about how we work by watching this talk (or this other one in Spanish).

We are a relatively small, remote-friendly company based out of Madrid. Recently, we became part of Alantra Group. As a consequence, we are continuously growing our technical and consulting teams so we can reach a few ripe markets already on our radar sooner rather than later, while still running our business with the same energy and independence as usual.

What are we looking for?

An Infrastructure Engineer (aka SRE, Systems Engineer, DevOps Engineer, etc.) for our team, to work on cloud big data infrastructure that captures, ingests, processes and delivers huge amounts of data.

We are looking for someone experienced enough to not only do great engineering work but also help and train other developers to do better DevOps. Because, you know, DevOps is not a role.

And following that spirit, we want someone willing to contribute on the development side in order to break down the silos between the dev and ops worlds.

For reference only, this is what our current stack looks like:

  • Google Cloud Platform: Kubernetes, big data services, etc.

  • On-premises: we have some in-house servers that need to be managed old-school (Linux sysadmin) for now.

  • Jenkins for deployments and some automation (we plan to migrate to another technology).

  • Apache Airflow to schedule and monitor data workflows.

  • Python: our lingua franca, used company-wide.

  • Scala: we are moving some code to Scala in order to improve our type safety. We also like functional programming, but without going overboard.

  • PostgreSQL and PostGIS: we do some advanced GIS work and use PostGIS intensively.

  • Other storage engines: MySQL, Elasticsearch, Redis and Google Storage also host our data.

  • Message brokers: RabbitMQ, Google Pub/Sub... because we cannot swallow everything at once!

Are you a good fit?

It is hard to answer a question like this here, but here is a list of challenges you should find yourself excited about:

  • We are definitely building new and exciting products, but we already have other products out there. They pay our bills and need to be taken care of. Even though the code base and infrastructure are not too old, we have our share of technical debt that needs to be dealt with: understand the reasons behind every workaround, accept the trade-offs made, evolve iteratively, and strike a good balance between throwing old stuff away and improving it (starting from scratch is not always an option).

  • Whenever it is actually possible to start from scratch, we try not to be blinded by every trend and buzzword. We need to understand the benefits behind relevant technologies and tools, and choose wisely.

  • Not only do we have big data in our data lake, but also huge tables in MariaDB, PostgreSQL and PostGIS (500M-row tables do not scare us away!). We don't need a pro DBA, but we want you to be open to diving into that space.

  • We are transitioning our backend from a big monolith to a more flexible set of independent (macro) services. It is indeed a challenge, but also a great opportunity to improve and optimize.

  • We are investing in taking the way we deploy code and data to the next level by improving the reliability, observability and simplicity of our current pipelines and environments.

  • We consider data a first-class citizen. Besides "code-deploy" pipelines, we also have "data-deploy" pipelines, because we need to deploy our datasets with the same confidence and control as we do with code.

  • Most of our infrastructure sits on Google Cloud. However, we are far from using it to its full potential. We need to find cost-effective ways of squeezing the most out of it by defining a proper platform governance strategy.

  • Not all of our low-level infra is properly specified and versioned, so we need to move to some kind of IaC (Infrastructure as Code) practice: Terraform, Ansible... you name it.

  • Every day we acquire and store more and more data, so scaling up our data lake and the different databases on our cloud platform at a fast pace is key for the business.
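
To give a flavor of what a "data-deploy" with a rollback strategy can mean in practice, here is a minimal Python sketch. It is only an illustration, not our actual tooling: the registry class, version names and checksum policy are all hypothetical. The idea is that a dataset version is published only if it matches an expected checksum, and previous versions are kept around so a rollback is a pointer flip rather than a re-upload.

```python
# Hypothetical sketch of a "data-deploy" step with rollback support.
import hashlib


def checksum(payload: bytes) -> str:
    """SHA-256 fingerprint used to validate a dataset before deploying it."""
    return hashlib.sha256(payload).hexdigest()


class DatasetRegistry:
    """In-memory stand-in for a versioned dataset store (e.g. a bucket)."""

    def __init__(self):
        self.versions = {}   # version -> payload, kept for rollbacks
        self.current = None  # version currently served to consumers

    def deploy(self, version: str, payload: bytes, expected_sha256: str):
        # Precondition: refuse to publish data that fails validation.
        if checksum(payload) != expected_sha256:
            raise ValueError(f"checksum mismatch for {version}, refusing to deploy")
        self.versions[version] = payload
        self.current = version

    def rollback(self, version: str):
        # Rolling back only flips the served pointer to a known-good version.
        if version not in self.versions:
            raise KeyError(f"unknown version {version}")
        self.current = version


registry = DatasetRegistry()
data_v1 = b"id,price\n1,100\n"
registry.deploy("v1", data_v1, checksum(data_v1))
data_v2 = b"id,price\n1,105\n2,98\n"
registry.deploy("v2", data_v2, checksum(data_v2))
registry.rollback("v1")   # something looked wrong downstream
print(registry.current)   # -> v1
```

The same shape applies whether the store is a Google Storage bucket or a database schema: validate before publishing, and keep the previous version reachable.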

Great challenges make great fun. Do you like it so far?

What will your day-to-day experience be like?

  • You will develop pipelines and automations: code and data deploys, QA processes, versioning, rollback strategies, etc. You will integrate quality checks in your pipelines: test input data with preconditions, output data with postconditions and test your code with unit-testing best practices. Such pipelines and automations are the foundations of our data and application services:

    • Google-Storage-based data lake.

    • Machine Learning models.

    • Backups (we deal with unstructured, semi-structured and structured data, updated at different paces).

    • PostgreSQL, MariaDB and Elasticsearch databases.

    • API services.

    • Web applications and plugins.

  • You will develop our monitoring & alerting infrastructure almost from scratch: set the foundations of unified logging, telemetry, alerting and observability for all projects.

  • You will work with the Data (science and engineering) team in order to understand their needs and set up environments for them: Jupyter notebooks, machine learning training, checking pipelines, ETLs, etc.

  • You will help improve our development, staging and production environments and associated plumbing.

  • You will motivate and educate team members on DevOps coding standards and best practices.
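
The quality checks mentioned above (preconditions on input data, postconditions on output data) could look roughly like this in Python. The dataset, field names and rules are made up for illustration; the point is the shape: validate before the pipeline step, validate after it.

```python
# Hypothetical sketch of precondition/postcondition checks around a
# pipeline step; the rows and validation rules are illustrative only.
def check_preconditions(rows):
    """Validate input data before the pipeline step runs."""
    assert rows, "input must not be empty"
    for row in rows:
        assert row.get("price") is not None, f"missing price: {row}"
        assert row["price"] > 0, f"non-positive price: {row}"


def deduplicate(rows):
    """The pipeline step itself: keep the first row seen for each id."""
    seen, out = set(), []
    for row in rows:
        if row["id"] not in seen:
            seen.add(row["id"])
            out.append(row)
    return out


def check_postconditions(inputs, outputs):
    """Validate output data after the pipeline step ran."""
    assert len(outputs) <= len(inputs), "output grew unexpectedly"
    ids = [r["id"] for r in outputs]
    assert len(ids) == len(set(ids)), "duplicate ids survived"


raw = [
    {"id": 1, "price": 100.0},
    {"id": 2, "price": 95.5},
    {"id": 1, "price": 100.0},  # duplicate listing
]
check_preconditions(raw)
clean = deduplicate(raw)
check_postconditions(raw, clean)
print(len(clean))  # -> 2
```

In a real pipeline these assertions would live in dedicated validation tasks so that a failing check halts the deploy and raises an alert instead of silently shipping bad data.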

What will the first months look like?

It is always hard to make predictions in such a changing environment as ours, but let us try:

After the first month...

  • You know what we sell as a company.

  • You have had the chance to meet most of your teammates, and you know the basics to already start enjoying working at uDA.

  • You understand our technical stack at a high level.

  • You have set up our data development environments, as well as frontend and backend ones.

  • You know our main git repos, and where to commit each piece of code. Same with our Slack channels and where to post each message.

  • You understand our code and data development cycles.

  • You have tackled your first concrete, simple tasks, and deployed to our staging and production environments.

  • You feel confident enough to go for bigger and more complex challenges.

After your first three months...

  • You feel at home. Your day-to-day runs smoothly and you have built a strong relationship with your manager and closest teammates.

  • You are an active part of on-going projects within the company.

  • You understand what our current pain points are in terms of infrastructure and are starting to lay out a plan to overcome them.

  • You have made specific contributions to improve the way engineers and data scientists interact with the infrastructure.

After your first nine months...

  • You feel ready to help onboard new teammates on cloud infrastructure and DevOps best practices.

  • You are able to break down business goals into a workable DAG of the tasks required to achieve them: identifying implementation dependencies and critical paths.

  • Our architecture and pipelines have improved a lot: more resilient, better monitored, more scalable... thanks to your contributions and decisions!

  • Our data and code deploys run smoothly and are properly monitored, and alerts are raised when data or code does not meet the expected quality.

  • You have spoken at at least one conference about our experience and stack.

  • You are part of most, if not all, of the technical conversations across the company.

What can you expect from us?

  • A welcoming bunch of teammates, excited to work with you.

  • Positive + inclusive + respectful working environment.

  • A key position across the most important teams of the company.

  • A focused, eat-our-own-food approach to product development.

  • Competitive salary + bonus.

  • Comfortable office in Torre Europa, Paseo de la Castellana (Madrid). Free home-made lunch every day, served by Jbfood.

  • 100% remote position. The whole engineering team works from home, and not because of COVID.

  • PluralSight account for learning during your working hours.

  • Personal budget for conferences, books, events and training.

I'm all in! What should I do?

Send your resume, LinkedIn or Git(hub|Lab) profile, or anything you think can help us get to know you better to jm@urbandataanalytics.com, along with a brief explanation of why you want to join us!

If you like what you read, but are still not sure if this is the right position for you, please reach out too. Should you have questions or should you want to find out more about other openings we have, we will be more than happy to have a chat with you and talk it over.

Come on! Don't miss the train! 🚂
