Senior DevOps Engineer - remote

Metabase

Posted 2 years ago

If this role seems interesting, irrespective of your location or identities, please reach out.

Even if you don't think you meet all of the criteria but still are interested in the job, please apply. Nobody checks every box, and we're looking for someone excited to join our team. We'd love to hear from you.

Metabase is the easiest way for people to get insights from their data, from tiny startups who get up and running quickly to major corporations with tens of thousands of users. That's why people love us.

We bring data tools with the elegance and simplicity of consumer products to the crufty world of enterprise business intelligence. We provide an opinionated open source starting point for how companies should measure, analyze and share their data.

Tens of thousands of companies use Metabase every day to answer questions about their data. While we seek to become the de facto self-managed open source analytics software for organizations everywhere, many customers want to use Metabase without worrying about the operational details of self-hosting. That’s why we recently launched Metabase Cloud. We’re looking for operations engineers to help build out and run this new and quickly growing hosted product.

You will:

Own and operate our application stack and AWS infrastructure to orchestrate and manage our hosted customer instances of Metabase
Debug runtime issues across the different levels of our application stack, AWS, and the Metabase product
Build out and improve our observability infrastructure
Develop and build our internal tooling and automation to manage the lifecycle of a hosted Metabase installation, from purchase to deployment, zero-downtime upgrades, and general operational health
Continuously improve our automated deployments and testing

We're looking for someone who:

Is thoughtful and careful
Compulsively automates everything
Is able to make solid technical judgements and back them up articulately
Has strong network security and application security skills
Can write high-quality, readable code in a modern language (e.g. Python, Go)
Has strong Kubernetes experience in production
Has strong experience with IaC and Terraform
Has experience with monitoring tools (e.g. Grafana, Prometheus, Cortex, CloudWatch)
Has experience building and operating production infrastructure on AWS, and is familiar with EKS, ECS, RDS and IAM

Projects you could work on:

Add support for multiple regions
Improve our RDS sharding strategy for our multi-tenant platform
Build out our observability and monitoring to unify uptime, system, JVM, and application-level metrics
Collaborate with core application developers on changes to improve our application metrics, deployment speeds and CI integration
Improve the automation of our systems and processes to achieve compliance certifications such as SOC 2 and meet regulations such as GDPR

We're a global, fully distributed team (50% outside of the US, from Thailand to Hawaii) that gets things done asynchronously, with plenty of uninterrupted time, supporting each other to do the best work of our careers.

We're relentlessly user-focused and believe in building long-term value, not short-term hacks. And we just raised a $30M Series B to take our approach to the next level for years to come.
