Data Engineer - remote

Bellroy
Posted 3 years ago
We Work Remotely
IN A NUTSHELL

Bellroy’s Data team is searching for a Data Engineer to help us make better decisions by getting our (many) data pipelines flowing smoothly and sensibly into a well-architected data platform.
We need your help to build, improve and maintain the infrastructure required to stream and enrich data from a variety of internal and external sources. Together, we will enhance our cloud-based data platform to get the right info, infer useful things and make better decisions. Then test, and keep improving. When the data platform is fully functional, you’ll lend your sharp logic to improve internal processes, automate systems, and optimise, well, everything. And down the track, we’d love you to get involved with some Bayesian analysis, machine learning and other interesting data-related projects.

At some companies, we observe a familiar and depressing pattern: the most technically excellent developers hit a ceiling beyond which they can’t progress unless they start taking on direct “reports” and becoming “managers”. Given no other options to progress their careers, they launch themselves down this path. As they progress, they spend less and less time doing the thing that they love (crafting excellent code and data flows) while they learn a completely different craft, management, and spend more and more time dealing with people and their problems. At Bellroy we love the people who want to make that transition (and the last person who got this job just did), but we don’t think that it should be the only path to career progression, and we make sure that we have a technical stream that allows people to keep getting better at engineering. This is a role in that technical stream.

If you get excited by the idea of providing the right data to inform great decisions, and you want that data to be accessible, understandable and trustworthy, then this could be the job for you. If you bring your experience, smarts and detail-oriented brain to help us, we’ll offer a world-class team to learn from, the tools you need to do your thing, and the support you need to flourish.


IF YOU WERE HERE IN THE LAST FEW WEEKS YOU MIGHT HAVE:

  • Reviewed our overall data systems (supported by very competent sysadmins) to make sure everything was in order and to look for larger-scale improvements
  • Built out a handful of new pipelines to bring more of our core business data into our data platform within our Google Cloud Platform and AWS environments
  • Chased a handful of data validation alerts raised by our pipelines, and taken the time to get to the root cause of each of them, then either delegated the fix to someone appropriate or fixed it yourself
  • Worked outside the data team, with our developers, flexing your database and query optimisation skills to decide whether to fix a performance issue they’re having at the database level, or insist that the fix should be in the code (and that’s fun; they’re an excellent bunch)
  • Provided an ad-hoc analysis (working with our analysts) to someone who requested it, integrating a one-off data source
  • Talked with our Data Manager about some of our mid-term plans, and how we’ll support them with data


THESE ARE SOME QUALITIES YOU MUST POSSESS:

  • At least three years’ experience in data-related roles
  • Advanced working knowledge of SQL and experience in ETL and streaming using a workflow management tool such as Apache NiFi (our chosen tool)
  • Experience with building and optimising data pipelines
  • Experience with collecting data from a variety of sources including APIs (good APIs, bad APIs, and ugly APIs)
  • Strong analytical skills and an ability to perform root cause analysis
  • Training in at least one of Computer Science, Statistics, Informatics, Information Systems or another relevant quantitative field (or demonstrable skill in one of those areas and the story of how you built that skill without formal training)
  • Very high precision – you need to know how to verify that your work is correct (even when dealing with unreliable data, where uncertainty may be inherent)
  • Bonus points for more relevant experience, such as with programming languages used in our projects (e.g., Ruby on Rails, Python, R, Haskell, TypeScript/JavaScript), PostgreSQL, AWS Aurora, Google BigQuery, Pub/Sub, project management and machine learning

LOCATION AND HOURS

This is a full-time role, based either in our Fitzroy office or anywhere in the world. That is up to you, but please note that working Melbourne daytime hours is an expected part of the role.


HOW TO APPLY


Click the apply button and you'll be redirected to our application page. You'll need your resume and a cover letter, and you'll need to complete an exercise.