We are seeking a hands-on senior data engineer to help us build out and manage our data infrastructure, which must operate reliably at scale with a high degree of automation in setup and maintenance. The role involves setting up and managing the data infrastructure as well as building and optimizing key ETL pipelines on both batch and streaming data. The ability to work with teams from product, engineering, BI/analytics and data science is essential. The individual will take ownership of data model design and data quality; automation and the use of data science to manage and improve data quality would be valued. The individual will also play an active role in ensuring that data governance policies and tooling are implemented and adhered to.
The individual will also need to manage multiple stakeholders at an executive level and make well-informed architectural choices when required. A high degree of empathy is required for the needs of the downstream consumers of the data artefacts produced by the data engineering team (software engineers, data scientists, business intelligence analysts, etc.), and the individual needs to be able to produce transparent and easily navigable data pipelines. The individual should value consistently producing high-quality metadata to support discoverability and consistency of calculation and interpretation.
Candidates should have broad experience across the following systems and languages:
- Apache Kafka
- Apache Flink
- Apache Airflow
- Cloud data warehouses such as Redshift or BigQuery
- Python and Java