Senior Site Reliability Engineer (Remote)

Hypothesis.is

Posted 4 years ago

Location: Remote. Candidates must be located between UTC-6 and UTC+2 time zones.

Summary

Hypothesis is seeking a Site Reliability Engineer to join our product delivery team and lead our work to build efficient, reliable, secure, and scalable infrastructure and code. This role combines the activities of development and site reliability engineering to ensure Hypothesis technologies and services support our vision of a world where annotation is as common as comments, but more useful and engaging. Join us as we extend what the web can do.

About the team

Hypothesis is a small, mission-driven startup with financial backing from leading investors who share our vision. We are ramping up to respond to high demand from stakeholders across the education marketplace. We work with educators, schools and publishers to bring new, innovative capabilities and workflows over digital content.

We are a diverse, supportive, highly collaborative, 100% remote team of technologists, educators, and business people working together to bring new capabilities to the web. We code in the open (our repos are public and liberally licensed) and help drive standards for annotation. Though we operate remotely, we are a close-knit team that communicates via Slack, video chat, GitHub, and Google Docs across 8 time zones.

About the role

Reporting to the Director of Engineering, the Site Reliability Engineer leads the work to build, document and maintain efficient, reliable, scalable, secure and easy-to-use operations including deployment, QA and production environments, and monitoring.

Infrastructure:
- Provision and administer infrastructure (hosts, cloud services, monitoring tools, etc.) for highly reliable and scalable web applications and data stores
- Document our operations systems so that the whole team can understand and operate them
- Oversee deployment of Hypothesis application servers
Automation:
- Build automated tooling to configure and maintain our systems and services
- Guide the team in the best way to use configuration management to grow and administer our services
Performance, reliability, security, and scaling:
- Identify and solve performance, reliability, security, and scaling issues in our stack
- Stress test our stack to find cracks in the system and help us scale
- Perform audits for security vulnerabilities at regular intervals and enact the practices set forth in our Infosec policy

Skills and experience you possess

You have experience in software development, site reliability, and backend/infrastructure engineering for an organization experiencing fast-paced growth.
You are knowledgeable in configuration management with a framework such as Ansible or Terraform.
You understand the ins and outs of AWS, Linux, and PostgreSQL well enough to teach others how to use them, and can comfortably operate all of them from the CLI.
You are proficient with a programming language like Python or Ruby, and with shell scripting.
You are familiar with security best practices and have helped to audit for and remediate security vulnerabilities in infrastructure.
Your documentation and verbal communication skills are excellent, and you’re able to collaborate and rally support with people on and off your team.
You are inclined to automate, but can discern when automation isn’t the best solution and present alternatives.
You’ve worked with continuous integration and deployment systems, and have ideas about how to build and improve them.
You strongly believe in the importance of security, and enjoy the idea of partnering with engineers to ensure the integrity of our customers’ data.
You have experience with remote work and understand the importance of good time management, self-motivation, and self-discipline as a remote worker.

About you

You are someone who loves problem solving. You value simplicity over complexity. You take great satisfaction in helping others be more successful and productive and wouldn’t think to move on without documenting your work so 6-months-from-now you (or anybody else for that matter) can drop back in and understand it. We are interested in someone who wants to help everyone around them better understand how to operate software at scale and who is eager to take on the responsibilities outlined for this role.

You will be successful at Hypothesis if you:

Love learning new things,
Are unafraid to ask questions,
Are committed to improving both as a technologist and a human being,
Are tenacious, self-directed, and highly motivated,
Enjoy helping others around you grow as developers and be successful,
Communicate clearly and effectively (this is especially important in a remote organization), and
Approach your work with a mindset that allows for growth and change.

What’s next

Does this sound interesting? Drop us a line to tell us what about this role intrigues you and why you think you would be great for Hypothesis. Resumes are helpful, but so are examples of your recent work. We can’t wait to hear from you!
