Data Architect - Remote

JBS Custom Software Solutions

Job Description

JBS is looking for a senior-level Data Architect to help build out an enterprise data lake and design and implement the related ETL processes. The role requires knowledge of a variety of data warehouse and data lake architectures and experience with many of the AWS data services and tools. Technologies include AWS EMR/Apache Spark, Python, AWS Lambda, AWS Step Functions, and AWS RDS (PostgreSQL). The role also requires experience with dimensional modeling and OLAP design principles, along with knowledge of the operational aspects of such systems: fault tolerance, performance, efficient data partitioning, security, and data governance. As a senior-level architect, you will work closely with the implementation team and product owners.

Key Responsibilities

  • Translate requirements to appropriate data models and designs
  • Architect ETL pipelines using EMR for transformations and Step Functions for workflow coordination (see the sketch after this list)
  • Develop and enhance existing code for performance, scalability, and fault tolerance
  • Work with product owners and support implementation teams in translating requirements and designs
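
For illustration, a minimal PySpark sketch of the kind of EMR transform this pipeline work involves; the S3 paths, table name, and columns are hypothetical placeholders, not details from the posting.

```python
# Minimal PySpark ETL sketch (EMR-style); all paths and columns are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# Extract: read raw events from the data lake (hypothetical location).
raw = spark.read.json("s3://example-lake/raw/orders/")

# Transform: normalize types and derive a partition-friendly date column.
orders = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("order_date", F.to_date("order_ts"))
       .filter(F.col("amount") > 0)
)

# Load: write partitioned Parquet for downstream OLAP queries.
(orders.write
       .mode("overwrite")
       .partitionBy("order_date")
       .parquet("s3://example-lake/curated/orders/"))
```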

Basic Qualifications

  • 5-10 years of experience architecting and building data warehouse ETL solutions, including:
    • Dimensional modeling/OLAP design
    • SQL
    • Analytical query development (aggregates, window functions, statistical functions; see the SparkSQL sketch after this list)
  • Knowledge of operational aspects of data warehouse architecture including:
    • Designing for fault tolerance
    • Designing for performance/scale
    • Understanding of data governance issues
    • Understanding of security controls and data access
  • Minimum of 5 years working with SQL
  • Minimum of 3-5 years working with Apache Spark (PySpark, DataFrame API, SparkSQL)
  • Minimum of 3-5 years programming in Python
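
As a concrete illustration of the analytical query development above, a short SparkSQL sketch combining an aggregate with window functions; the sales table and its columns are invented for this example.

```python
# SparkSQL sketch of an analytical query with window functions;
# the "sales" table and its columns are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("analytics-demo").getOrCreate()
spark.read.parquet("s3://example-lake/curated/sales/").createOrReplaceTempView("sales")

# Running total per customer, plus each row's share of its day's total.
result = spark.sql("""
    SELECT
        customer_id,
        order_date,
        amount,
        SUM(amount) OVER (
            PARTITION BY customer_id ORDER BY order_date
        ) AS running_total,
        amount / SUM(amount) OVER (PARTITION BY order_date) AS share_of_day
    FROM sales
""")
result.show()
```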

Preferred Qualifications

  • AWS Redshift
  • Experience with typical data lake file formats (Parquet, Avro, ORC, etc.)
  • Knowledge/Experience with AWS data lake-centric services (Glue, Lake Formation, S3, EMR, etc.)
  • Experience setting up workflows with AWS Step Functions and AWS Lambda (see the sketch after this list)
  • Experience with typical Python data libraries (Pandas, NumPy, etc.)
  • Experience building ETL and data-centric solutions in Python
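
A minimal sketch of the Step Functions/Lambda wiring mentioned above: a Lambda handler that starts one execution of an ETL state machine. The environment variable, state machine, and payload are hypothetical; only standard boto3 calls are used.

```python
# Hypothetical Lambda handler that kicks off a Step Functions ETL workflow.
import json
import os

import boto3

sfn = boto3.client("stepfunctions")

def handler(event, context):
    # Pass the run date through so downstream EMR steps can use it.
    response = sfn.start_execution(
        stateMachineArn=os.environ["STATE_MACHINE_ARN"],  # hypothetical env var
        input=json.dumps({"run_date": event.get("run_date", "latest")}),
    )
    return {"executionArn": response["executionArn"]}
```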

Benefits

  • Competitive base salary
  • Paid overtime
  • Generous PTO policy, company holidays
  • 401k with company match
  • Health, dental, life, and long-term disability (LTD) insurance