We are looking for a senior data engineer to join our product and engineering team. Our main office is in Paris, but we are a highly distributed team and are open to people working remotely in Europe and the Americas.
**What you will do**
- Leveraging your experience in building and maintaining complex data pipelines, you will drive the development of our analytics platform currently built on AWS Firehose/Kafka, S3, Athena, and Airflow / EMR / PySpark / TimescaleDB.
We are looking for someone who is eager to:
- Collaborate with other developers to ship new features
- Be in charge of the overall architecture of data pipelines
- Ensure that we have the right tests and structure in place so that we can move quickly without breaking everything
- Share their knowledge of data engineering principles and best practices with the team
- Keep learning new technologies and be on the look-out for new ideas that we should try out
**What we are looking for**
- A Spark expert
- Experience with complex data pipelines and orchestration in the Cloud
- Quality-oriented mindset: testing, code reviews, code quality, etc.
- Awareness of performance considerations
- A passion for simple, maintainable and readable code that balances pragmatism and performance
**How do we build our products?**
- We process millions of events every day. We are building our analytics platform on Kinesis Firehose / Kafka / S3, with Airflow / PySpark and TimescaleDB, to give everyone inside and outside the company an easy-to-use platform for querying and graphing events.
- Most of our front-end applications rely on Angular or React and we also build native mobile SDKs for Android and iOS for our clients to embed in their apps.
- Our back-end applications use Feathers or NestJS for building REST and GraphQL APIs. We try to keep our services small and lean and use AWS Lambda/Serverless for background jobs. We leverage PostgreSQL and DynamoDB as our main databases.
- We rely on a lot of AWS/GCP services (Beanstalk, Lambda, CloudWatch, S3, etc.) for building, deploying, serving, monitoring and scaling our services. We use GitLab for code, issues, and CI, and believe in full automation of our deployment stack with infrastructure-as-code (CloudFormation/Terraform) for everything.
**Our vision as a team**
- We are building a product and engineering team that is strongly committed to a high level of quality in our products and code. We believe that automation is the key to consistently achieving that along with velocity of development, joy, and pride in what we deliver.
- At Didomi we are organized into feature teams and work with 2-week sprints. We do our best to avoid pointless meetings. The majority of the engineering team works remotely from all over the world; the only hard requirement is a 4-hour overlap with CET working hours.
- We rely on automated tests of all sorts (unit, integration, linters, you-name-it!) and continuous integration/delivery to build flexible applications that are able to evolve without breaking. We trust that it enables engineers to focus on the quality of their code and iterate fast without fear of breaking stuff. And when we break stuff, we fix it and learn from our mistakes.
**Hiring process**
- An intro call with HR
- An intro call with an Engineering Manager
- A code challenge to build a simple Spark application. This is used as the basis of discussion for the next step. You can find our challenge at https://github.com/didomi/challenges/tree/master/data. We also accept suitable open-source projects in place of the challenge.
- A 1h code review session and architecture discussion with 3-4 Didomi engineers
- A set of 1:1 30-minute calls with the CTO, engineers, and a product manager
- For the architecture discussion, we ask you to sketch an architecture (think of event streams, databases, query engines, etc.) and discuss options and trade-offs as we would on a normal day at Didomi.
- We understand you already have a job, obligations (and maybe a personal life!), so we'll work with you to make sure the process doesn't take up too much of your time while still providing a good basis for a very concrete discussion.