Fully Remote Senior or Lead SRE Engineer - Platform9

Posted 2 years ago  • Mountain View, CA

Position: Senior or Lead DevOps Engineer

Salary: $160K - $195K (dependent on work location & level of experience)

HQ: Mountain View, CA but this is a REMOTE opportunity - anywhere in the U.S.

Perks: Attractive Equity + Flexible PTO + Medical, Dental, & Vision Benefits + 401(k) plan + more

Platform9 enables freedom in cloud computing by empowering enterprises to consume infrastructure of their choice using open-source, cloud-native technologies. Platform9 makes it easy for enterprises to operate and scale private, edge, and hybrid clouds with its industry-first SaaS management plane for Kubernetes.

The Role and Your Impact:

We are looking for a Staff-level SRE who has experience building automated, robust and highly scalable operating environments for software. The ideal candidate will bring experience and technical leadership to our DevOps team, as well as significant experience with Grafana and Prometheus. At Platform9 you will have the opportunity to work on tools and infrastructure that will help us deploy our software quickly, securely and repeatably.

You can be part of a team that's building a platform for engineers like you to manage large private clouds. This gives you the opportunity to work on the core product as well as the infrastructure behind it.

What You Will Be Doing:

  • Own the design and development of complex tools and infrastructure
  • Automate management of Kubernetes applications and clusters
  • Monitor systems with Prometheus and Grafana
  • Develop tools and services, primarily in Python
  • Develop tests and CI/CD infrastructure for any code you write
  • Mentor junior engineers

We look for these qualities in a candidate:

  • 6+ years of SRE / DevOps experience
  • Experience providing technical leadership for projects
  • A passion for documentation and automation. We have a strong culture of replacing manual processes with repeatable automation
  • Must be proficient in writing maintainable, testable code
  • Experience administering and debugging Linux-based systems
  • Experience with Prometheus (or Grafana) and monitoring Kubernetes services
  • Experience in building production-grade infrastructure services, tools and automation
  • A positive attitude and an ability to work with multiple competing demands on your time

Bonus If You Have:

  • Familiarity with Automation: Helm or Terraform
  • Knowledge of multiple programming languages such as Python and Go
  • Configuration Management using Salt (preferred), Ansible, Puppet or Chef
  • Familiarity with OpenStack or VMware
  • Experience with cloud computing infrastructure and automation
  • Best practices with disaster recovery: backup and recovery, snapshots, always-up systems
  • Comfortable with industry-standard practices related to security
  • Network troubleshooting skills
  • Experience with MySQL or other databases