[Corpora-List] Final CFP: BioCreative-IV CHEMDNER Task: Chemical compound and drug name recognition

Martin Krallinger mkrallinger at cnio.es
Mon Aug 19 15:44:48 CEST 2013



BioCreative IV CHEMDNER Task: Chemical compound and drug name recognition task

Task webpage: http://www.biocreative.org/tasks/biocreative-iv/chemdner/


We are pleased to announce that the CHEMDNER task development data set has been released.


There is increasing interest in facilitating more efficient access to information on chemical compounds and drugs (chemical entities) described in repositories of scientific articles and abstracts.

The goal of this task is to promote the implementation of systems that are able to detect mentions of chemical compounds and drugs, in particular those chemical entity mentions that can subsequently be linked to a chemical structure, rather than macromolecules such as genes and proteins, which have already been addressed in previous BioCreative efforts.

We invite participants to submit predictions for the following CHEMDNER sub-tasks:

a) Given a set of documents, return for each document a ranked list of the chemical entities it describes [Chemical document indexing (CDI) sub-task].

b) For a given document, provide the start and end indices of all the chemical entities mentioned in it [Chemical entity mention (CEM) recognition sub-task].
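To make the two kinds of output concrete, here is a minimal sketch in Python. It is not the official submission format, which this announcement does not specify. The record layout (Mention), the exclusive end offset, the toy lexicon lookup and the frequency-based ranking are all assumptions made for illustration only.

```python
from collections import Counter
from typing import NamedTuple


class Mention(NamedTuple):
    """One chemical entity mention (hypothetical record layout)."""
    doc_id: str
    start: int  # offset of the first character of the mention
    end: int    # offset one past the last character (exclusive; an assumption)
    text: str


def find_mentions(doc_id, text, lexicon):
    """Toy CEM step: exact, case-sensitive lookup of lexicon terms."""
    mentions = []
    for term in lexicon:
        pos = text.find(term)
        while pos != -1:
            mentions.append(Mention(doc_id, pos, pos + len(term), term))
            pos = text.find(term, pos + 1)
    # Report mentions in document order.
    return sorted(mentions, key=lambda m: (m.start, m.end))


def rank_entities(mentions):
    """Toy CDI step: unique mention strings, most frequent first."""
    counts = Counter(m.text for m in mentions)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked]


abstract = "Effects of aspirin and ibuprofen: aspirin reduced fever faster."
mentions = find_mentions("doc1", abstract, ["aspirin", "ibuprofen"])
ranking = rank_entities(mentions)

for m in mentions:
    print(m.doc_id, m.start, m.end, m.text)
print(ranking)
```

A real system would replace the lexicon lookup with a trained recognizer and would rank entities by a confidence score rather than by raw frequency; the point here is only the shape of the two outputs, character offsets per mention for CEM and an ordered list per document for CDI.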

For these two sub-tasks, the organizers have released a training set and a development set, each consisting of 3500 PubMed abstracts manually and exhaustively annotated by chemical domain experts following carefully developed annotation guidelines.


*Training set: 3500 manually annotated PubMed abstracts for chemical entity mentions.

*Development set: 3500 manually annotated PubMed abstracts for chemical entity mentions.

*Test set: will consist of 3000 manually annotated PubMed abstracts for chemical entity mentions.

*Evaluation scripts: scripts that can be used to evaluate performance of automated systems.

*Annotation guidelines: detailed document describing the manual annotation criteria used.

*Frequently asked questions (FAQ): list of questions posed by registered teams.

*Useful resource collection: compendium of useful software and lexical resources/data for the task.
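As a rough idea of what scoring a system against the manual annotations can look like, the sketch below computes exact-match precision, recall and F1 over mention spans. This is not the official evaluation script, and the announcement does not state which metrics that script reports; the metric choice, the function name and the (doc_id, start, end) tuple layout are assumptions.

```python
def precision_recall_f1(gold, predicted):
    """Exact-match scores over sets of (doc_id, start, end) tuples."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical gold annotations and system predictions for one abstract.
gold = [("doc1", 11, 18), ("doc1", 23, 32), ("doc1", 34, 41)]
predicted = [("doc1", 11, 18), ("doc1", 23, 32), ("doc1", 50, 55)]

precision, recall, f1 = precision_recall_f1(gold, predicted)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

Participants should rely on the released evaluation scripts for actual scoring; a local check like this is only useful for quick sanity tests during development.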

Based on the performance of the participating systems, a selected number of teams will be invited to submit a manuscript for a special journal issue on the CHEMDNER task.


*25th June: sample data collection, detailed task description, annotation and evaluation script

*31st July: training data collection, annotations and updated guidelines

*16th August: development data annotations

*3rd September: test set release

*12th September: test set predictions due

*17th September: invite teams for workshop presentation talks

*19th September: CHEMDNER workshop proceedings paper due (2-4 pages)

*7th-9th October: BioCreative IV workshop: National Institutes of Health (NIH), in Washington DC

(see http://www.biocreative.org/events/biocreative-iv/workshop/)


* Martin Krallinger, Spanish National Cancer Research Center (CNIO)

* Obdulia Rabal, University of Navarra, Spain

* Julen Oyarzabal, University of Navarra, Spain

* Alfonso Valencia, Spanish National Cancer Research Center (CNIO)


For further information, please contact the task organizer at

mkrallinger at cnio.es
