Social Scientist for Text and NLP
Giant Oak builds software to make the world a safer place. Giant Oak Search Technology (GOST®) makes screening easy by targeting the right kinds of information to help our customers combat fraud, detect crime, and enhance security. GOST® is the fastest, most reliable negative media search tool on the market because we see data differently. We look behind the numbers to see individuals and communities. And we strive to do our part to make the world a better, freer, and more secure place.
There are petabytes of text data in thousands of sources across hundreds of languages that could potentially be useful for any screening and vetting challenge. The Social Scientist for Text and NLP is responsible for picking sources and choosing methods to bring the right data to bear and prioritize them appropriately for GOST®. Which data are most helpful for customers? How do we leverage them? How do we integrate new data sources with existing sources? These are all crucial challenges that the Social Scientist for Text and NLP must solve.
The Social Scientist for Text and NLP will work closely with the Chief Scientist and the data engineering team to design and implement innovative approaches to extract meaning from web-scale unstructured text. She or he will design protocols for prioritizing sources, evaluating data quality, clustering results, and inferring features of different texts. He or she will also contribute to building out core analytics and approaches to customizing these for each customer. She or he is responsible to the Chief Scientist for the design and continual improvement of text data for GOST®.
- Ensure that GOST® draws on the best data sources to support its goals and missions by designing and conducting empirical tests of data source quality.
- In collaboration with the data engineering team, ensure that GOST® algorithms leverage unstructured data efficiently and consistently.
- Ensure that GOST® deploys the most appropriate natural language processing algorithms for each task.
- Work closely with data engineering to ensure that new insights, algorithms, and data sources are reflected in production systems.
- Provide subject matter expertise on natural language processing techniques and technologies throughout Giant Oak.
- Exceptional expertise in natural language processing techniques, methods, and corpora, including both structured and unstructured data.
- Exceptional ability to identify and execute empirical tests of ideas.
- Outstanding communication skills and a proven ability to convey technical findings.
- Bachelor’s degree in a quantitative subject or equivalent experience.
- Eligible for a U.S. security clearance.
- Advanced degree in a quantitative subject (PhD in economics, statistics, engineering, computer science, or a similar field; MS in data science, statistics, etc.) with a focus on applications of machine learning to empirical questions.
- Deep subject-matter expertise on text modeling.
- Experience with technologies including Python, SQL, AWS/Azure, and Spark.
- Cover Letter
- Job Market Paper
- Letters of Reference
- Code Sample or GitHub Link