Infrastructure / Site Reliability Engineer - remote

Recombee
Posted 2 years ago
We Work Remotely
Recombee is a high-tech startup that delivers real-time personalized recommendations for thousands of sites and apps all around the world. We compute the recommendations in our private cloud, using hundreds of servers in multiple geographic regions. To provide the highest-quality service, we need to process really BIG data and make sure that our service is highly available and scalable. This brings up a lot of interesting challenges that we would like you to help us with.

Our Infrastructure Stack
We would like you to be able to help us with any of the following parts of our tech stack. You do not need experience with everything or knowledge of any specific technology. We want someone who is happy to learn and can bring new ideas. We are always open to trying something new :)
  • Hundreds of servers in four clusters around the world (Europe, Australia and two in North America)
  • We depend on a lot of distributed services, such as Kafka, Elasticsearch and Aerospike, and we need a hand handling them all
  • PostgreSQL and ClickHouse databases, and we would like to experiment with more
  • We currently use Mesos and plan to migrate to Kubernetes
  • Docker and dockerd for our container management
  • Puppet for configuration management
  • Monitoring and logging tools like Grafana, Kibana and Prometheus
  • We rely heavily on Git and GitOps to automate everything we can

What we are looking for:
  • Good Linux/Unix knowledge
  • Understanding of networking
  • Some experience with distributed systems
  • Git

What we can offer:
  • A high-end work laptop, monitor and other work equipment you want
  • Fully remote position - we will equip your home office
  • Skilled and experienced technical team with knowledge to share
  • 5 weeks of vacation
  • 5 sick days