[Corpora-List] Summer Intern/Information Extraction and Integration

Ang Sun asun at cs.nyu.edu
Sat Feb 9 17:27:37 CET 2013

Description: http://ch.tbe.taleo.net/CH06/ats/careers/requisition.jsp?org=INTELIUSCORP&cws=39&rid=230

inome is gathering the world’s information and making it people-centric. The inome graph connects billions of entities (people, organizations, and addresses) to encode the information-genome of each individual. inome Research is looking for graduate interns to help develop its next generation of Information Extraction and Information Integration technologies. Interns for summer 2013 will explore novel strategies for extracting intelligence from both unstructured and semi-structured text and for integrating that intelligence with the inome graph. Sample projects include extracting events, trends, and users' interests from unstructured text; extracting attributes of people from publicly available sources; and linking extracted entities to the entity nodes in the inome graph. This is likely to be innovative work, and we expect the summer internship to lead to both product impact and a research paper in a top conference. The internship will be at our headquarters in Bellevue, WA, and offers competitive compensation.

inome Research develops cutting-edge systems to standardize, create, and link intelligence to power inome's industry-leading people information-genome platform. Team members have published papers in top research conferences such as NIPS, ACL, VLDB, CIKM, and SIGIR, given invited talks, organized workshops, and turned algorithms into deployed systems.


Responsibilities:

Build and/or extend systems to do high-precision IE from structured, semi-structured, and unstructured information sources

Build and/or extend systems to do high-precision linkage from a variety of information sources

Design and implement algorithms for evaluating the performance of IE and linkage

Required Skills:

Graduate student working on a Ph.D. in Natural Language Processing, Data Mining, Computational Social Science, or a related field

Experience in one or more of the following areas: named entity extraction, relation extraction, within-document and cross-document coreference, graph-based information extraction, and information fusion/integration

Self-motivated and creative, with independent research skills

Desired Skills:

Strong hands-on skills in Java

Experience with large-scale machine learning

Experience with Hadoop

Familiarity with graph-based machine learning toolkits such as GraphLab

Experience with crowdsourcing/Mechanical Turk

Experience with NLP toolkits

Experience with supervised/semi-supervised/unsupervised information extraction

Experience with graph-based NLP

Contact: Please apply online. For faster consideration, please send your CV to Ang Sun, asun at inome.com
