Lead Data Engineer - remote

Podchaser

Posted 3 years ago

Who We Are

Podchaser is the world’s most comprehensive podcast database — collecting, enhancing, and distributing podcast insights to power discovery for listeners, podcasters, and brands.

Job Overview

We are looking for an experienced lead data engineer to join our growing data team. You will be responsible for developing next-generation data pipelines that power Podchaser's data platform. You can work from anywhere in the world.

You will expand and optimize our data and data pipeline architecture, as well as our data flow and collection. The ideal candidate is an experienced data pipeline builder and data wrangler who enjoys optimizing data systems and building them from the ground up.

You’ll help plan and lead the next generation of products and data initiatives through the optimization and redesign of our data architecture.

To succeed in this position, you should be able to create and maintain data pipelines, have strong analytical skills, and be able to combine data from different sources. If you are detail-oriented and have excellent organizational skills and extensive experience in this field, we’d like to hear from you.

Your Day-to-Day

Provide technical leadership in data engineering, data lake, and data warehouse design
Provide leadership to the team on best practices and architecture in big data systems
Collaborate with the data, product, and engineering teams to identify key priorities
Create and maintain optimal data pipeline architecture
Assemble complex data sets
Prepare data for prescriptive and predictive modeling
Combine raw information from different sources
Explore ways to enhance data quality and reliability
Identify, design, and implement internal process improvements, such as automating manual processes, optimizing data delivery, and re-designing infrastructure for greater scalability
Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources
Build analytics tools that utilize the data pipeline to provide actionable insights
Create data tools that help analytics and data science team members build and optimize our product into an innovative industry leader

You Are Perfect for This Role if You Have…

At least 5 years in a data engineering or similar role
Experience engineering scalable software using big data technologies (e.g., Hadoop, Spark, Hive, Elasticsearch, Cassandra)
Experience with data pipeline tools (e.g., Airflow, Kafka)
Experience ingesting, transforming, and working with big data. We have datasets consisting of hundreds of millions of data points
Strong written and verbal communication skills
Experience designing large-scale, evolving data warehouses, including storage layout, data partitioning, and schema evolution
Experience designing and implementing end-to-end pipeline architectures
Technical expertise with data models, data mining, and segmentation techniques
Advanced working SQL knowledge and experience working with relational databases
Experience building and optimizing data pipelines, architectures, and data sets
Experience with the following software/tools: AWS data platform services (with a strong grasp of their strengths and weaknesses), relational (SQL) and NoSQL databases, and Elasticsearch
Being a podcast listener or having experience with podcasts would help but is not required

Perks of Working at Podchaser

Fully distributed team - We have been remote since the beginning, so we already have the processes and systems in place to keep us remote-friendly
Flexible schedule - You know when you work best, so you can set the schedule that suits you
A positive, collaborative, and diverse culture
Potential for stock options
Unlimited vacation/PTO

This role is a full-time, fully remote position with a salary range of $120,000 USD to $150,000 USD.
