Scala/AWS/Spark Backend Engineer - remote

Narrative IO
Posted 3 years ago

What You Will Do


We currently have a marketplace that connects buyers and sellers of adtech data, and we are building on this success to make it possible to transact any other kind of data.


As a result, here are the kinds of projects you will likely work on in the foreseeable future:

  • Make our core systems generic/agnostic to the actual data types that are being ingested, transacted and delivered. For instance, we are building Iceberg-backed data puddles and schema/validation tools, and making the transaction process more generic.
  • Improve the monitoring and reliability of our systems as the amount of data being handled keeps increasing (over 500GB per day).
  • Extract abstractions, modularize the code and improve the tests.
  • Enhance the API we expose to partners and increasingly rely on it for internal use, in order to "eat our own dogfood".

Technical Stack


In a nutshell, our technical stack looks like:

  • Frontend: Vue.js, Sass, Pug and functional JavaScript
  • Backend: Big Data, Scala, AWS, Spark, cats, cats-effect, http4s and doobie
  • Ops: EC2, Fargate, Lambda, Terraform, EMR, DynamoDB, S3, RDS, Step Functions, Jenkins, and Datadog



The Ideal Candidate


We are not looking for a 100% fit on all the technology buzzwords, but we are looking for someone with strong personal and technical skills who is eager to pick up new technologies as necessary. We are obviously going to expect much more from a senior candidate than we would from a junior one.


The ideal candidate should:

  • Have experience in a typed functional language such as Scala, F# or Haskell, or significant experience in their non-functional equivalents (Java, C#) with an interest in Scala and Functional Programming.
  • Have experience working with non-trivial quantities of data. As of this writing, our ingestion pipelines are handling on the order of 500GB of .snappy.parquet files per day. Prior work with Spark would be ideal, but experience with similar MapReduce-based technologies would also be helpful.
  • Have experience operating in a cloud environment like Amazon Web Services, Google Compute Engine, or similar.
  • Be able to work across all aspects of backend systems, from application code to SQL to systems administration.
  • Not be afraid of contributing to the entire stack (from the UI to DevOps) when the need arises.
  • Have strong experience using a version management system and a continuous integration (CI) development process. We use Git/GitHub for version management and GitLab for our CI pipelines.
  • Have the ability to lead the creation of architectural and design documents, collect requirements as well as feedback from the development and product teams, and evaluate new technologies as needed.
  • Be able to transform product designs into coherent, working and robust code solutions.
  • Communicate potential technical issues to relevant teams and adapt to changing requirements.
  • Be able to interface with technical and non-technical team members in order to bring business ideas to fruition.
  • Be mindful of the compromises that need to be made to stay responsive on the business side while keeping the systems manageable in the long run.
  • Live/work within +/- 3 hours of EST