Data Engineer - remote

Platform.sh

Platform.sh is happy to announce that we are seeking a Data Engineer to join as a founding member of the Data and Analytics department. The mission of the Data and Analytics department is twofold: first, to make Platform.sh smarter by enabling a deeper understanding of our customers and our internal operations; second, to make our customers smarter by enabling a deeper understanding of their applications and their customers.

The Data Engineer will play a founding role on this team and as such will have a voice in designing much of the underlying infrastructure of our data platform. Experience designing and implementing ETL/ELT workflows is a must. Candidates should be very comfortable with SQL and Python and with complementary tooling such as Prefect and DBT, and should be familiar with GCP and AWS solutions in this problem domain.
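
For illustration only, a minimal ELT flow built on Prefect's task and flow decorators might look like the sketch below (Prefect 2.x API). Every name and value is hypothetical, and in practice the in-warehouse transformation step would be handled by DBT rather than Python:

    # Minimal ELT sketch using Prefect's task/flow decorators (Prefect 2.x API).
    # Every name and value here is illustrative, not a Platform.sh internal.
    from prefect import flow, task

    @task(retries=3)
    def extract_orders() -> list[dict]:
        # In practice this would pull from a product backend or a third-party
        # API, e.g. via a Singer tap.
        return [{"order_id": 1, "amount_cents": 1250}]

    @task
    def load_to_warehouse(rows: list[dict]) -> None:
        # In practice this would land raw rows in a warehouse such as BigQuery;
        # DBT would then handle the in-warehouse transformation (the "T" in ELT).
        print(f"loading {len(rows)} rows")

    @flow
    def nightly_elt() -> None:
        load_to_warehouse(extract_orders())

    if __name__ == "__main__":
        nightly_elt()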

You’ll get to work remotely with some of the most interesting and intelligent people in the world, learning about technologies and techniques that will challenge you and leave you smarter in the evening than you were in the morning.

Responsibilities:

  • Design and implement ETL/ELT processes
  • Gather requirements from business stakeholders (product, marketing, finance, customer success) and translate them into a multi-cloud data environment that supports the business
  • Build batch and real-time data pipelines to ingest and process data from sources such as product backends and third-party platforms (see the extraction sketch after this list)
  • Work closely with other engineering teams across Platform.sh to ensure alignment of methodologies and best practices
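
As a concrete, hedged sketch of the batch side of that work, the snippet below pages through a hypothetical third-party REST API and lands the raw records as newline-delimited JSON, ready for a warehouse load. The endpoint, its paging parameters, and the file layout are all assumptions made for illustration:

    # Hypothetical batch extraction: page through a third-party REST API and
    # land the raw records as newline-delimited JSON for a warehouse load.
    # The endpoint and its paging parameters are assumptions for illustration.
    import json

    import requests

    BASE_URL = "https://api.example.com/v1/events"  # hypothetical endpoint

    def extract_pages(page_size: int = 100):
        """Yield records one page at a time until the API returns an empty page."""
        page = 1
        while True:
            resp = requests.get(
                BASE_URL,
                params={"page": page, "per_page": page_size},
                timeout=30,
            )
            resp.raise_for_status()
            batch = resp.json()
            if not batch:
                break
            yield from batch
            page += 1

    if __name__ == "__main__":
        with open("events.ndjson", "w") as fh:
            for record in extract_pages():
                fh.write(json.dumps(record) + "\n")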

Qualifications:

  • The ideal candidate will have experience with Singer taps, Prefect, BigQuery, and DBT.
  • 2+ years' experience programming in Python in a production environment
  • Hands-on experience with relational databases (MySQL and PostgreSQL) and a firm grasp of different data modeling techniques
  • Familiarity with creating ETL/ELT processes for large-scale data warehouses, including querying APIs for data
  • Familiarity with GCP and AWS and their data-related service offerings (BigQuery, Redshift, Athena, etc.)
  • Knowledge of GCP PubSub or event-based messaging solutions like Kafka (a minimal subscriber sketch follows this list)
  • An ability to weigh the tradeoffs of different approaches to a given problem is a core competency of the role: our platform spans multiple public cloud vendors, so colocating all data in one place will not always be possible
  • Familiarity with Apache Airflow, Prefect, and/or other tools in the scheduling and orchestration domain is a must
  • Familiarity with Go will be very useful, as many of our internal event-generating platform systems are written in Go
  • Prior experience with Git is required, and experience with CI/CD is a plus. Platform.sh is a modern web hosting platform with extensive CI/CD capabilities as part of our offering. We enable developers to create development environments as easily as creating a Git branch, and we will be using our own tooling to build out the data platform wherever possible.
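
For the PubSub item above, a minimal streaming subscriber using the google-cloud-pubsub client looks roughly like the following sketch; the project and subscription IDs are placeholders, not real resources:

    # Minimal GCP PubSub streaming subscriber using the google-cloud-pubsub
    # client; "my-project" and "my-subscription" are placeholders.
    from google.cloud import pubsub_v1

    def main() -> None:
        subscriber = pubsub_v1.SubscriberClient()
        subscription_path = subscriber.subscription_path("my-project", "my-subscription")

        def callback(message: pubsub_v1.subscriber.message.Message) -> None:
            # A real pipeline would parse the event and hand it to a processing
            # or loading step before acknowledging.
            print(f"received: {message.data!r}")
            message.ack()

        streaming_pull_future = subscriber.subscribe(subscription_path, callback=callback)
        try:
            streaming_pull_future.result()  # block and process messages indefinitely
        except KeyboardInterrupt:
            streaming_pull_future.cancel()

    if __name__ == "__main__":
        main()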

We are a completely distributed company, so a clear and concise written communication style is required for success in the role and at the company. The cover letter accompanying your application will be the first test of this.