closed vacancy Site Reliability Engineer - Remote

Numbrs
Posted 4 years ago
Stack Overflow

Numbrs is reshaping the future of the workplace. We are a fully remote company where every employee is free to live and work wherever they want.

Join our dedicated technology team that builds massively scalable systems, designs low-latency architecture solutions and leverages machine learning technology to turn financial data into action. Want to push the limits of personal finance management? Join Numbrs.

Responsibilities

You are responsible for managing large-scale microservice-based environments with thousands of pods deployed under Kubernetes. As part of the SRE team, you stay on the cutting edge by looking into new technologies and figuring out new ways of doing things. You run chaos engineering projects to test and improve the resiliency of our platform, support our developers in continuously delivering software through deployment pipelines with Spinnaker, and work with the Data Science, Architecture, and Security teams to discuss technical solutions in Product and Technology.

Key Qualifications

  • a Bachelor's or higher degree in a technical field of study, or equivalent practical experience
  • a minimum of 5 years of experience deploying, monitoring and troubleshooting large-scale distributed systems
  • solid background in DevOps engineering
  • understanding of cloud-based infrastructures, such as AWS or GCP
  • strong experience with Kubernetes, Istio, and big data technologies such as Kafka, Spark, and Cassandra
  • quick to learn and fast to adapt to changing environments
  • excellent troubleshooting and creative problem-solving abilities

Ideally, candidates will also have

  • experience with open-source tools for network and security monitoring and management on Linux/Unix platforms
  • experience with security

Location: Home office from your domicile