
@rumverse
Last active November 2, 2016 12:22
  • Save rumverse/187800d36ce2b746b19a40e262766ab0 to your computer and use it in GitHub Desktop.
Data Engineer, Big Data

The Data Engineer is a software engineer who will be the principal builder of big data solutions. He or she will develop, maintain, test, and evaluate big data systems of various sizes, and is expected to participate in the design of big data solutions because of the experience he or she brings with Hadoop and related technologies. The primary focus is building large-scale data processing systems; the candidate should be an expert in (or have advanced knowledge of) data warehousing and be able to work with RDBMS, graph, search-engine, and NoSQL database technologies.

The role requires solid experience in software engineering, particularly with high-traffic systems or well-structured, well-designed database-driven systems. The candidate should have experience with object-oriented design, coding, and testing patterns, as well as with engineering (commercial or open-source) software platforms and large-scale data infrastructures. He or she should also be able to architect highly scalable distributed systems (at least in theory) using different open-source tools, understand how algorithms work, have experience building high-performance algorithms, and appreciate algorithmic complexity (Big O).

The candidate should expect the challenge of dealing with petabytes or even exabytes of data on a daily basis. A big data engineer understands how to apply technologies to solve big data problems and to develop innovative big data solutions.

The candidate should have experience with, or be able to get up to speed quickly on, the following technologies: Python, Go, Java, and Linux. The ability to become productive in R and C/C++ is a plus.

Responsibilities:

- Implement complex big data projects with a focus on ETL (collecting, parsing, managing, analyzing, and visualizing large sets of data)
- Turn information into insights using various tools and platforms.
- Decide on the required hardware and software design and act on those decisions.
- Develop prototypes and proofs of concept for the selected solutions.
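
To illustrate the kind of ETL work described above, a minimal extract-transform-load sketch in Python might look like the following. All names, the CSV layout, and the aggregation are hypothetical and chosen purely for illustration; a production pipeline would run against a distributed platform rather than in-memory strings.

```python
import csv
import io
import json

# Hypothetical raw input: CSV records of page-view events (illustrative only).
RAW_CSV = """user_id,page,duration_ms
1,/home,1200
2,/pricing,overflow
1,/docs,3400
"""

def extract(raw):
    """Parse CSV text into a list of row dicts (the 'collect/parse' step)."""
    return list(csv.DictReader(io.StringIO(raw)))

def transform(rows):
    """Clean and aggregate: drop malformed durations, sum time per user."""
    totals = {}
    for row in rows:
        try:
            duration = int(row["duration_ms"])
        except ValueError:
            continue  # skip unparseable records instead of failing the batch
        totals[row["user_id"]] = totals.get(row["user_id"], 0) + duration
    return totals

def load(totals):
    """Serialize aggregates for a downstream store (here, just JSON text)."""
    return json.dumps(totals, sort_keys=True)

print(load(transform(extract(RAW_CSV))))  # → {"1": 4600}
```

The point of the sketch is the separation of concerns: each stage can be tested, scaled, and swapped independently, which is what robust ETL workflow design comes down to at any data volume.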

Additional Qualifications:

- Enjoys the challenge of solving complex problems that have no textbook solutions;
- Understands the challenges of a start-up company;
- Enjoys product building more than client services;
- Is proficient in designing efficient and robust ETL workflows;
- Is experienced and comfortable working in a hybrid cloud-computing environment;
- Can comprehend technical specifications and business requirements accurately;
- Can write technical requirements adequately and habitually provides implementation documentation;
- Experience fine-tuning Hadoop is highly desirable;
- BS/MS degree in software engineering, math/sciences, or computer science, or equivalent experience

Soft skills:

- Collaborative, able to express ideas through excellent oral and written communication skills;
- Can easily grasp team and individual objectives;
- A team player;
- Leadership, independence, expectation-management, and project-management skills are desired;
- Curious about data science and technology;
- Always helpful and mindful of the bottom line

This is a technical job. Substantial expertise in software development and programming is a MUST. Experience gained through academic work, research, and intellectual curiosity will also be considered.
