Senior Data Scientist - Big Data - remote

Socure
Posted 4 years ago
Stack Overflow

Founded in 2012, Socure is the leader in high-assurance digital identity verification technology. Named to Forbes’ 2019 AI 50 list as one of America’s most promising AI companies and a recent winner of API World’s Best Data API, Socure’s technology applies artificial intelligence and machine learning techniques with trusted intelligence from email, address, phone, IP, social media, and the broader Internet to verify identities in real time. Socure’s customers include three of the top five U.S. banks, seven of the top 10 U.S. card issuers, and the majority of leading digital banks, lenders, and insurers across the U.S. Socure is funded by some of the world's best investors and entrepreneurs, including Scale Venture Partners, Commerce Ventures, Work-Bench, Santander InnoVentures, and Two Sigma Ventures.

At Socure, the only way we can further our mission of becoming the single trusted source of identity verification and eliminating identity fraud is by building the best team on the planet. This is where you come in!

We are currently looking for a Senior Data Scientist for our Compliance Products DS R&D team. This is a remote role that can be based anywhere in the USA.

The Socure Compliance Products DS R&D team is responsible for developing entity-resolution improvements, building data-processing pipelines, evaluating the performance of new data sources, and providing analytical support to the Socure compliance and regulatory product suite, which includes a highly acclaimed Know-Your-Customer (KYC) product.

What You'll Do:

    • Develop machine learning, data mining, statistical, and graph-based algorithms designed to analyze massive data sets.
    • Analyze large data sets to develop multiple custom models and algorithms that drive innovative identity-verification solutions.
    • Understand and resolve computational limitations related to parallelizing algorithm application and data processing.
    • Provide analytic support to the compliance-product teams.
    • Develop improved models and perform A/B analysis of production data.
    • Report on project status to senior management.
    • Work well in a fast-paced cross-functional environment.


What You'll Bring:

    • Ph.D. (preferred) or M.Sc. in a relevant technical field, or equivalent work experience.
    • A minimum of 3 years of experience working in a similar role.
    • Experience in developing data-driven algorithms in information retrieval, relevance, or machine learning and working with distributed systems.
    • Familiarity with UNIX systems, Java or Scala, Python or R, and SQL.
    • Familiarity with Spark, common ML libraries, and the AWS ecosystem, including EMR and S3.
    • Experience with data mining, unsupervised machine learning algorithms, and statistical tools and their underlying theory.
    • Additionally, experience with Neo4j, Elasticsearch, and Airflow (or equivalents) is a big plus!