Responsibilities
- Implement a large-scale data warehouse on AWS
- Implement high-performance data pipelines that scale to process petabytes of data daily
- Design, implement and maintain ETL processes
- Implement Directed Acyclic Graphs (DAGs) in Apache Airflow to programmatically author, schedule, and monitor workflows (see the DAG sketch after this list)
- Design and build REST APIs using the Python Flask framework (see the Flask sketch after this list)
- Work with data scientists to productionize machine learning algorithms for real-time fraud detection
- Work with data analysts to automate and optimize reporting and BI infrastructure
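
For context on the Airflow responsibility above, here is a minimal DAG sketch, assuming Airflow 2.x; the DAG name, schedule, and task bodies are illustrative placeholders, not an actual workflow used by the team.

```python
# Minimal Airflow 2.x DAG sketch; names and logic are hypothetical.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    # Placeholder extract step (hypothetical).
    print("extracting source data")


def load():
    # Placeholder load step (hypothetical).
    print("loading into the warehouse")


with DAG(
    dag_id="daily_etl",  # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)

    # Run extract before load.
    extract_task >> load_task
```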
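
Similarly, a minimal Flask sketch of the kind of REST endpoint implied above; the route, payload fields, and scoring logic are purely hypothetical.

```python
# Minimal Flask REST endpoint sketch; route and payload shape are assumptions.
from flask import Flask, jsonify, request

app = Flask(__name__)


@app.route("/scores", methods=["POST"])
def score_transaction():
    # Accept a JSON transaction and return a dummy fraud score.
    payload = request.get_json(force=True)
    score = 0.0 if payload.get("amount", 0) < 1000 else 0.5  # placeholder logic
    return jsonify({"transaction_id": payload.get("id"), "fraud_score": score})


if __name__ == "__main__":
    app.run(debug=True)
```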
Requirements
- At least 5 years of experience as a data engineer or back-end developer
- Proficient in Java and Python programming
- Proficient in Apache Spark and Apache Airflow (see the Spark sketch after this list)
- Proficient in writing and optimizing SQL statements
- Proficient with AWS and/or cloud computing
- Experienced with AWS data engineering services such as Athena, Redshift, SageMaker, Kinesis, etc.
- Experienced with SQL and NoSQL databases such as DynamoDB, RDS Aurora, MySQL, Elasticsearch, Solr, etc.
- Experienced with BI tools such as Tableau, Amazon QuickSight, etc.
- Experienced in using monitoring tools and instrumentation to ensure optimal platform and application performance
- Experienced in both streaming and batch data processing
- Knowledge of machine learning concepts will be an advantage
- Knowledge of Scala will be an advantage
- Prior experience working with cross-functional data and tech teams will be an advantage
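
As a rough illustration of the Spark proficiency expected, here is a minimal PySpark batch job sketch; the S3 paths, column names, and aggregation are assumptions, not part of the actual pipelines.

```python
# Minimal PySpark batch job sketch; paths and columns are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily_aggregation").getOrCreate()

# Read raw events from S3 (hypothetical path), aggregate per user, write back.
events = spark.read.parquet("s3://example-bucket/events/dt=2024-01-01/")
daily_totals = (
    events.groupBy("user_id")
          .agg(
              F.sum("amount").alias("total_amount"),
              F.count(F.lit(1)).alias("event_count"),
          )
)
daily_totals.write.mode("overwrite").parquet(
    "s3://example-bucket/aggregates/dt=2024-01-01/"
)

spark.stop()
```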