Sr. Software Engineer, Machine Learning Infrastructure - remote
The Opportunity
By pairing engineers with leaders in our education, science, and justice and opportunity teams, we can bring technology to the table in new ways to help drive solutions. We are uniquely positioned to design, build, and scale software systems to help educators, scientists, and policy experts better address the myriad challenges they face. Our technology team is already helping bring personalized learning tools to teachers and schools across the country and supporting scientists around the world as they develop a comprehensive reference atlas of all cells in the human body.
The Infrastructure organization builds shared tools and platforms for use across all of the Chan Zuckerberg Initiative. Members of the shared infrastructure engineering team have an impact on all of CZI's initiatives by enabling the technology solutions used by other engineering teams at CZI to scale. A person in this role will build these technology solutions and help cultivate a culture of shared best practices and knowledge around engineering focused on machine learning and data science.
You will
- Design and build Machine Learning tools, libraries, and platform solutions for use across 5+ engineering teams in their Data Science and Machine Learning efforts
- Work with project teams across CZI to build complete-lifecycle machine learning and AI-based projects that automate predictive models and deploy active machine learning-based services for production user-facing applications
- Act as both a peer and mentor to project teams to help with the full machine learning lifecycle from concept to production
- Drive the design and implementation of underlying components that are critical to successfully delivering production-scale solutions, including scalable logging and data storage implementations
- Participate in architecting and building our own team’s Operational Analytics platform, which underpins our team’s efforts in providing an efficient and scalable set of core and data infrastructure platform components for the organization
- Analyze and improve efficiency, stability, security, and data privacy of CZI data science and engineering efforts
- Ensure that the algorithms and techniques used generate accurate results
- Evangelize best practices and educate teams across CZI on them, with a focus on alerting and monitoring for machine learning and data science projects as well as operational visibility
You have
- BS, MS, or PhD degree in Computer Science, Data Science, or a related technical discipline or equivalent experience
- 5+ years of relevant coding experience
- 3+ years of directly relevant experience with Machine Learning, Artificial Intelligence, or Deep Learning projects and/or systems, including actively partnering with Data Scientists on projects and solutions
- 3+ years of experience using data platforms and libraries such as Kafka, Spark, Spark Streaming, Kinesis, Pandas, DataFrames, Delta Lake, Iceberg, HBase, Cassandra, or DynamoDB
- Experience with Data Warehouses and/or Data Lakes based on ORC, Parquet, Hive, or other columnar store formats
- Proven ability with a systems language such as C, C++, C#, Go, Java, or Scala
- Demonstrated ability with a scripting language such as Python, PHP, or Ruby
- Proficiency with Amazon Web Services (AWS), Google Cloud Platform (GCP), or Microsoft Azure
- Experience with SQL and the ability to troubleshoot complex SQL queries a plus
- Experience with containers, cloud container schedulers, and continuous deployment systems a plus
CZI believes that vaccines are one of the most powerful tools to fight COVID-19 and save lives. This belief aligns with our mission and work to cure, manage, and prevent disease. Proof of completed COVID-19 vaccination will be required for all applicants and employees to come onsite to a CZI facility. CZI will consider exceptions to this policy for medical or religious reasons on an individualized basis.