Site stats you will improve:
- 728+ Nvidia P100/T4 GPUs
- 32k+ physical cores across 24 carrier hotels, with 6 Tbps capacity
- 10k+ concurrent live video broadcasts
- 400k+ concurrent live video streams
- 26B+ weekly web requests
- 95% of web requests completed in 59ms-72ms
- 2M database queries per minute, average response 3.5ms
- Redis clusters serving 300k+ commands/sec
What you will do:
- Performance analysis to identify sources of instability using data from APM and distributed telemetry tools
- Analyze complex systems to identify operational surprises and minimize downtime
- Software engineering and patching to incrementally improve performance, scalability, and reliability
- Infrastructure modifications in both a data center metal environment with advanced routing/switching and in the public cloud
- Predictive failure analysis and disaster planning
- Author new tools and automation to streamline the DevOps pipeline
- Collaborate with Frontend/Backend engineering, QA, DevSecOps, and Data teams
- Database and key-value store administration and configuration with a focus on uptime and performance
- Incident response and postmortem reports
What you bring:
- STEM degree and relevant experience as a Site Reliability Engineer
- Exceptional problem-solving skills
- High proficiency in at least one language such as C, C++, Java, Python, or Go
- High proficiency in a Unix/Linux environment, with excellent knowledge of internals (e.g., filesystems, system calls)
- Networking knowledge (e.g., routing, switching, TCP stack) for both metal and cloud (VPC, Security Groups) environments
- Experience in database administration and configuration
- Experience with DevOps tools such as Ansible, Docker, and Kubernetes
- On-call availability to respond to monitoring and alerting for core website functions as needed
- Experience in growing data center teams (nice to have)
What you will receive:
- A strong team of A-players
- A robust engineering culture
- Opportunity to make an impact on a highly popular product
- Freedom to bring ideas to the table and make technical decisions
- Support and guidance from a highly professional and knowledgeable team
- Flexible working environment
Recruiting Process
We value a sense of urgency and aspire to build a smooth and transparent recruiting process. These are the stages of our recruiting process:
We reserve the right to add selection stages to the process depending on each candidate's specific skills.
Perks & Benefits:
- Health & life insurance with dental and vision plans, 100% employer-sponsored for employees & dependents
- 401k matching
- Paid holidays, vacation and sick days
- Corporate Udemy account and professional development assistance