Lead Data Engineer - remote

Podchaser
Posted 2 years ago
We Work Remotely
Who We Are
Podchaser is the world’s most comprehensive podcast database — collecting, enhancing, and distributing podcast insights to power discovery for listeners, podcasters, and brands.

Job Overview
We are looking for an experienced lead data engineer to join our growing data team. You will be responsible for developing next-generation data pipelines that power Podchaser's data platform. You can work from anywhere in the world.

You will expand and optimize our data pipeline architecture, as well as our data flow and collection. The ideal candidate is an experienced data pipeline builder and data wrangler who enjoys optimizing data systems and building them from the ground up.

You’ll help plan and lead the next generation of products and data initiatives through the optimization and redesign of our data architecture.

To succeed in this position, you should be able to create and maintain data pipelines, have strong analytical skills, and be able to combine data from different sources. If you are detail-oriented, with excellent organizational skills and extensive experience in this field, we’d like to hear from you.

Your Day-to-Day
  • Provide technical leadership in data engineering, data lake, and data warehouse design
  • Provide leadership to the team on best practices and architecture in big data systems
  • Collaborate with the data, product, and engineering teams to identify key priorities
  • Create and maintain optimal data pipeline architecture
  • Assemble complex data sets
  • Prepare data for prescriptive and predictive modeling
  • Combine raw information from different sources
  • Explore ways to enhance data quality and reliability
  • Identify, design, and implement internal process improvements, such as automating manual processes, optimizing data delivery, and re-designing infrastructure for greater scalability
  • Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources
  • Build analytics tools that utilize the data pipeline to provide actionable insights
  • Create data tools that help analytics and data science team members build and optimize our product into an innovative industry leader

You Are Perfect for This Role if You Have…
  • At least 5 years in a data engineering or similar role
  • Experience engineering scalable software using big data technologies (e.g., Hadoop, Spark, Hive, Elasticsearch, Cassandra)
  • Experience with data pipeline tools (e.g., Airflow, Kafka)
  • Experience ingesting, transforming, and working with big data. We have datasets consisting of hundreds of millions of data points
  • Strong written and verbal communication skills
  • Experience designing large-scale, evolving data warehouses, including storage layout, data partitioning, and schema evolution
  • Experience designing and implementing end-to-end pipeline architectures
  • Technical expertise with data models, data mining, and segmentation techniques
  • Advanced working SQL knowledge and experience working with relational databases
  • Experience building and optimizing data pipelines, architectures, and data sets
  • Experience with the following software/tools:
      • Strong grasp of AWS data platform services and their strengths/weaknesses
      • Relational SQL and NoSQL databases
      • Elasticsearch
  • Being a podcast listener or having experience with podcasts would help but is not required

Perks of Working at Podchaser
  • Fully distributed team - We have been remote since the beginning, so we already have processes and systems in place to ensure we are remote-friendly
  • Flexible schedule - You know when you work best, so you can set the schedule that suits you
  • A positive, collaborative, and diverse culture
  • Potential for stock options
  • Unlimited vacation/PTO


This role is a full-time, fully remote position with a salary range of $120,000 USD to $150,000 USD.