Scientific knowledge is one of the greatest assets of humankind. This knowledge is recorded and disseminated in scientific publications, and the body of scientific literature is growing at an enormous rate. Automatic methods for processing and cataloguing that information are necessary to help scientists navigate this vast amount of information, and to facilitate automated reasoning, discovery and decision making over that data.
Structured information can be extracted at different levels of granularity. Previous and ongoing work has focused on bibliographic information (segmentation and linking of referenced literature; Wick et al., 2013), keyword extraction and categorization (e.g., which tasks, materials and processes are central to a publication; Augenstein et al., 2017), and cataloguing research findings. Scientific discoveries can often be represented as pairwise relationships, e.g., protein-protein (Mallory et al., 2016), drug-drug (Segura-Bedmar et al., 2013), and chemical-disease (Li et al., 2016) interactions, or as more complicated networks such as action graphs describing scientific procedures (e.g., synthesis recipes in material sciences; Mysore et al., 2017). Information extracted with such methods can be enriched with time-stamps and other meta-information, such as indicators of uncertainty or limitations of the discovered facts (Zhou et al., 2015). Structured representations, such as knowledge graphs, summarize information from a variety of sources in a convenient and machine-readable format. Graph representations that link the information of a large body of publications can reveal patterns and lead to the discovery of new information that would not be apparent from the analysis of just one publication. This kind of aggregation can lead to new scientific insights (Kim et al., 2017), and it can also help to detect trends (Prabhakaran et al., 2016) or find experts for a particular scientific area (Neshati et al., 2014).
While various workshops have focused separately on several aspects -- extraction of information from scientific articles, building and using knowledge graphs, the analysis of bibliographical information, graph algorithms for text analysis -- the proposed workshop focuses on processing scientific articles and creating structured repositories such as knowledge graphs for finding new information and making scientific discoveries. The aim of this workshop is to identify the necessary representations for facilitating automated reasoning over scientific information, and to bring together experts in natural language processing and information extraction with scientists from other domains (e.g. material sciences, biomedical research) who want to leverage the vast amount of information stored in scientific publications.
We invite submissions on (but not limited to) the following topics:
- Information extraction from scientific publications
  - identification of concepts in scientific articles (in various domains)
  - extraction of relations from scientific articles (in various domains), including n-ary relations with n>2, “negative relations”
  - large scale information extraction, clustering and detection of trends in scientific fields
  - targeted information extraction for completing knowledge graphs
  - updating knowledge graphs (adding new information, removing erroneous facts, or possibly having explicit links for incorrect statements)
- Finding patterns and mining new information in knowledge graphs
  - automatic generation and ranking of scientific hypotheses
  - aggregation and extraction of human-understandable scientific rules and generalities
  - extraction of script-knowledge and scientific procedures
  - detection of (inferred or explicitly stated) causality
  - automated reasoning over repositories of extracted information
- Using extracted structured knowledge
  - visualization of knowledge in particular domains
  - tools for interacting with users
  - querying knowledge graphs/knowledge repositories
  - evaluation of extracted knowledge
Workshop papers due: Wednesday, February 27, 2019
Notification of acceptance: Wednesday, March 27, 2019
Camera-ready papers due (firm deadline): Friday, April 5, 2019
Workshop date: Thursday or Friday, June 6 or 7, 2019
Vivi Nastase -- University of Heidelberg
Benjamin Roth -- University of Munich
Laura Dietz -- University of New Hampshire
Andrew McCallum -- University of Massachusetts Amherst
--
Dr. Vivi Nastase
Institut für Computerlinguistik, INF 325 (109), 69120 Heidelberg
Tel: +49 (0)6221 54-3591