[Corpora-List] SPECIALIST Lexicon for COVIDSearch

Ken Litkowski ken at clres.com
Tue Mar 31 21:27:26 CEST 2020

CORPORA recently made a call for participation in COVIDSearch, using a set of COVID-related articles, the COVID-19 Open Research Dataset (CORD-19). Medical articles are difficult for natural language processing, particularly because of multiword expressions (MWEs), such as "cyclic adenosine monophosphate response element binding protein".

The SPECIALIST Lexicon <https://lsg3.nlm.nih.gov/Specialist/Home/index.html> helps immensely with these problems, but it is not designed as a dictionary. I have converted SPECIALIST into a form suited to NLP tasks, an alphabetic UMLS Specialist Lexicon <https://www.clres.com/specialist/DIMAP%20Specialist%20Lexicon.htm>. That page describes how I converted the entirety of SPECIALIST, but I think the data (which I use for uploading into a dictionary) may also be useful in its raw form. I've made it available as the Raw UMLS Specialist <https://www.clres.com/elec_dictionaries.php#rumls>, where the data can then be downloaded (as a 14 MB zip file). While this data is not designed for CORD-19, it might be useful.

I welcome any questions, comments, or suggestions, and I hope this helps in dealing with the COVID-19 problems.


--
Ken Litkowski                    TEL.: 301-482-0237
CL Research                      EMAIL: ken at clres.com
9208 Gue Road                    Home Page: http://www.clres.com
Damascus, MD 20872-1025 USA      Blog: http://www.clres.com/blog
