Senior Data Scientist/Engineer - remote

Posted 3 years ago

To be eligible, candidates for this position must be located in the United States and able to work for any company.

The Position:

SteepRock is seeking a Senior Data Scientist/Engineer who can create and manage the company’s large-scale classification systems from proof of concept to production. The role requires developing scripts, notebooks or applications (using Python/Java/Scala and associated libraries or frameworks: Scikit-Learn, NumPy, TensorFlow, PyTorch, Spark, Spark MLlib, H2O.ai, etc.) that collect, aggregate and clean large data sets to perform analysis, clustering, entity resolution and structured information extraction. The successful candidate will leverage machine learning/NLP algorithms to implement highly scalable and accurate systems for text classification and named entity recognition. The ideal candidate will have demonstrable industry experience implementing and delivering data classification and optimization systems in a production environment.

Preferred Qualifications:

  • Machine Learning Experience: 5+ years
  • M.S. or Ph.D. in: Machine Learning, Applied Statistics, Applied Mathematics, Computer Science

Experience:

  • Building machine learning models that address entity resolution, natural language processing, record linkage or deduplication
  • Using relevant technologies including Python/Spark, SQL/NoSQL, cloud computing (GCP/AWS), Scikit-Learn, NumPy, TensorFlow, PyTorch and other relevant machine learning tools and/or techniques

Responsibilities:

  • Develop, implement, and deploy custom data pipelines powering exceptional machine learning entity extraction systems over big data from prototyping to production level.
  • Perform feature engineering and optimization to improve the performance of machine learning models.
  • Develop highly scalable classifiers and tools leveraging machine learning, data regression and lexicon-based/rules-based models.
  • Suggest, collect and synthesize requirements and create effective feature and technology roadmaps; estimate timelines for implementation.
  • Code deliverables in tandem with the engineering team; troubleshoot problems.
  • Apply standard machine learning methods to best exploit modern parallel environments (e.g., distributed clusters, multicore SMP, and GPU).
  • Develop and track metrics to support continuous improvement of the models.
  • Ensure quality objectives are always met to the highest possible standard.
  • Innovate, bringing new ways to leverage existing data, integrate new tools and deliver proven techniques to enhance system performance.

About This Position:

This is a full-time, salaried position with benefits (medical, office expenses, IRA plan) and direct incentive plans. In response to COVID-19, for the time being, all interviewing will be done via video conference.

About SteepRock Inc.

SteepRock provides software and services to pharmaceutical, biotech and medical device companies globally. We leverage specialized industry knowledge and abilities to build products and technology that help healthcare leaders make better decisions related to patient care.