[Corpora-List] Submissions for Research Topic "Text Analytics and Machine Learning to support the Interpretation of Genetic Variation"

Antonio Jimeno antonio.jimeno at gmail.com
Tue Oct 12 05:17:05 CEST 2021

Sorry for multiple postings


The decreasing cost of genomic sequencing is making the analysis of genetic variation feasible on a global scale, with several potential benefits for understanding the origin of human disease and for patient-specific treatment. It is therefore of great interest to include the analysis of genetic variation in clinical pipelines.

Information about genetic variation is available from high-throughput studies and is typically published in databases or curated from the scientific literature. Information from high-throughput methods needs to be interpreted, and this interpretation usually relies on curated databases to identify potential disease-causing variants. These curated databases in turn rely on the manual transfer of information from the scientific literature. Because it is manual, this process is slow and cannot keep up with the pace at which new information is published in the scholarly literature.

As the scientific literature contains relevant information about genetic variation and the interpretation of genetic variants, it is pivotal to develop new automated processes to extract meaningful data from scientific papers. Text-mining techniques are effective at processing this data automatically and can make curation more efficient and faster, while also providing relevant information to clinical pipelines that might benefit from the interpretation of genetic variants. There have been advances in the automatic identification of genetic variants in the scientific literature, but the interpretation of these variants, e.g. how they are linked to disease mechanisms, still requires further research and development. Furthermore, supporting such interpretation is likely to require additional information from existing structured data, so combining multiple sources of data might support the discovery of the functions of specific variants.
Moreover, the reuse of linked/combined data sets will significantly support further predictive analytics targeted towards patient care.

Submissions:

As such, this Research Topic aims at collecting novel scientific content in the context of, but not limited to, the following aspects:

- Novel methods to extract genetic variation from scientific literature;
- Novel methods to link/ground variation mention to database identifiers or a nomenclature (e.g., HGVS);
- Extraction of information relevant to the relation between genetic variation and disease and/or phenotypes from scientific literature;
- Development of new data sets or corpora to be used by text mining methods;
- Combination of methods, which includes text mining for the interpretation of genetic variants;
- Methodologies that reuse extracted genetic variation information from the literature to predict disease phenotypes and protein properties;
- Novel methods to combine information extracted from the text about genes, variants, diseases, and proteins with information in existing databases or high-throughput assays.

Please visit the following URL if you are interested in submitting an article:
https://www.frontiersin.org/research-topics/23716/text-analytics-and-machine-learning-to-support-the-interpretation-of-genetic-variation

Topic Organizers:
* Antonio José Jimeno Yepes (RMIT University, Melbourne, Australia)
* Christopher Baker (University of New Brunswick, Saint John, Canada)
* Maximilian Haeussler (University of California, Santa Cruz, United States)
* Philippe Thomas (German Research Center for Artificial Intelligence, Germany)

Deadlines:
* Abstract submission open
* 14th January 2022 - manuscript
