Site Reliability Engineers at UKG Technology and Innovation are hybrid software/systems engineers whose breadth of knowledge encompasses all aspects of service delivery. They develop software solutions to enhance, harden and support our service delivery processes. This can include building and managing CI/CD deployment pipelines, automated testing, capacity planning, performance analysis, monitoring, alerting, chaos engineering and auto-remediation.
Primary/Essential Duties and Key Responsibilities:
- Be part of the Development Experience SRE team, whose focus is on enhancing the developer experience and maintaining a set of production applications.
- Act as a liaison between teams outside of SaaS
- Document application configurations
- Communicate information across the organization
- Engage in and improve the whole lifecycle of services—from inception and design, through deployment, operation and refinement.
- Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning and launch reviews.
- Maintain services once they are live by measuring and monitoring availability, latency and overall system health.
- Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity.
- Practice sustainable incident response and blameless postmortems.
Required Qualifications:
- Degree in engineering or a related technical discipline, or equivalent work experience
- Experience with cloud-based applications
- Experience with containerization technologies
- Experience with Microsoft and Linux technologies
- Experience with VMware or other virtual server software
- Experience working with languages such as Python, PowerShell, JavaScript, C++, or Java
- Experience in configuration and maintenance of applications such as web servers, load balancers, relational databases, storage systems and messaging systems
- Experience with MongoDB, MySQL, Elasticsearch, RabbitMQ, and similar data stores and message brokers
- Experience with operating systems and TCP/IP network fundamentals
- Experience learning software, frameworks and APIs
- Ability and willingness to work evenings or nights on occasion
- Ability to lead and contribute to projects
- Experience with distributed system design and architecture
- Experience building and managing CI/CD pipelines
- Experience with public or private cloud platforms (e.g. GCP, Kubernetes, or OpenStack)
- Experience with production-level monitoring and alerting using tools like Prometheus, Grafana, Datadog, etc.
Preferred Qualifications:
- BS in Computer Science, Information Technology or a related field of study
Check out how we give our employees the chance to work on whatever project they want for 48 hours! https://youtu.be/2Aw55CP1IO8
Typical Interview Process:
- If your application is selected, a Talent Acquisition team member will reach out to schedule a phone screen.
- If selected to move forward, you will complete a HackerRank Coding Assessment.
- If you pass, you will move forward either to a technical phone call for additional screening or directly to an onsite interview.
- Offer stage.