Location
We accept applications from candidates based in the UK, Spain and the Czech Republic.
The Role
As a Site Reliability Engineer, you will be part of our DevOps team, helping to deliver and scale a platform that developers love to use. Your main task will be to raise our service level by improving our infrastructure, alerting, backup and disaster recovery.
As a passionate technologist, you’ll be able to contribute to discussions across the stack to ensure the best outcome.
Join us and bring your passion!
Key Responsibilities
- Ensure that we ship software that meets our security and uptime SLAs
- Create a disaster recovery plan and execute it when needed
- Introduce additional monitoring and alerting for the services and infrastructure
- Improve security for internal and external facing services
- Automate work including infrastructure needs
- Keep services running, and get them back up quickly when a failure occurs
- Help diagnose infrastructure-related issues
- Maintain documentation for recurring issues / tasks
Required non-technical skills
- Excellent communication skills, both verbal and written
- Team player
- Willingness to learn
Required experience
- Advanced experience operating large-scale production systems
- Significant experience with one of the main cloud providers (AWS, Azure, GCP)
- Hands-on experience with Kubernetes and Docker in production
- Hands-on experience with monitoring, logging and alerting (Datadog, Splunk, Elasticsearch, Logstash, Grafana, etc.)
- Backup and disaster recovery in the cloud
- Networking (DNS, load balancers, etc.)
- Infrastructure automation using Terraform or alternatives
- Cloud security experience (networking rules, access management, rate limiting, etc.)
- Unix / Linux shell
Nice to have
- Azure cloud experience
- Kafka experience
- Helm
- Ansible
- Source control (Git)
Benefits
- Work from home anywhere in the UK and EU (may be required to travel occasionally)
- 2 annual team meet-ups in EU destinations (normally beaches and mountains)
- Generous stock options commensurate with the opportunity
- 37 days holiday (including all public holidays in your region)
- 2 additional paid days off a year for volunteering work
- Budget to choose your own hardware and office set-up
- Training and personal development budget
- Regular socials with paid food/drink/games allowance