Data Engineer - remote

Posted 3 years ago

Data Engineer

San Francisco (CA), Erlangen (Germany)

Our client is a young crypto startup from Silicon Valley.

The company is currently in stealth mode, but it is backed by a16z, Coinbase Ventures, and many others. It aims to help economies transition to cryptocurrencies.

We are looking for the best talent in the industry to join them.

Among the openings is a Data Engineer position. Remote work is possible.

Key Responsibilities:

  • Design data pipelines that scale to handle large data ingest. This includes figuring out ways to store, process, and load the data with robust features for filtering, pre-processing, post-processing, de-duplication, and
  • Build and refine custom data labeling services that directly influence the quality of our iris recognition
  • Work closely with other stakeholders (data contributors + consumers) to incorporate their data usage needs on a variety of tasks and

Ideal candidate:

  • Enjoys working as part of a fast-moving team, where perfectionism can sometimes be at odds with (but sometimes directly required for) pragmatism.
  • Owns problems end-to-end and is willing to pick up whatever context is needed to get the job done.
  • Has a desire to dig into problems across the stack, whether networking issues, performance bottlenecks, memory leaks, or simply reading unfamiliar code to figure out where potential issues might
  • Has a strong belief in the crucial need for high-quality data in producing state-of-the-art machine learning systems and is highly motivated to design workflows that effectively meet the associated challenges.
  • Cares about code quality and enjoys building tools that are easy to use and extensible.

Tools:

Here’s a sampling of services currently running (and planned) in production:

  • Languages (Python / Go)
  • Data Orchestration (Airflow / Dagster)
  • Infrastructure, Storage, and Processing (AWS)
  • Labeling Services for ML Models (MTurk, Flask, Streamlit, Docker)
  • Pipeline Monitoring (Datadog)