[Corpora-List] Features for named entity relation

Alejandro Molina alemol at gmail.com
Fri Sep 18 23:36:07 CEST 2015

Which features are useful for relating named entities?

I want to relate scientific names to their corresponding common names using machine learning.

For instance, given the fragment:

"Acer pseudoplatanus, the sycamore or sycamore maple, is a species of maple native to Central Europe and Southwestern Asia, from France eastwards to Ukraine, and south in mountains to northern Spain, northern Turkey and the Caucasus, but cultivated and naturalized elsewhere."

I plan for the annotation to look something like this:

Ent_1 Acer pseudoplatanus (0, 19) SCIE_NAME
Ent_2 sycamore (25, 33) COMM_NAME
Ent_3 sycamore maple (37, 51) COMM_NAME
Ent_4 Central Europe (85, 99) LOCATION
...
Rel__1 Alias Arg1:Ent_1 Arg2:Ent_2
Rel__2 Alias Arg1:Ent_1 Arg2:Ent_3
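To make the intended annotation concrete, here is a minimal Python sketch of how the entities and relations could be held in memory and how the character offsets could be checked against the text. The class names (Entity, Relation) and the helper surface() are hypothetical, chosen only for illustration, and only the first part of the fragment is included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    id: str
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive
    label: str


@dataclass(frozen=True)
class Relation:
    id: str
    label: str
    arg1: str  # id of the first entity
    arg2: str  # id of the second entity


# Beginning of the example fragment (enough to cover Ent_1..Ent_4).
TEXT = (
    "Acer pseudoplatanus, the sycamore or sycamore maple, "
    "is a species of maple native to Central Europe and Southwestern Asia"
)

ENTITIES = [
    Entity("Ent_1", 0, 19, "SCIE_NAME"),
    Entity("Ent_2", 25, 33, "COMM_NAME"),
    Entity("Ent_3", 37, 51, "COMM_NAME"),
    Entity("Ent_4", 85, 99, "LOCATION"),
]

RELATIONS = [
    Relation("Rel__1", "Alias", "Ent_1", "Ent_2"),
    Relation("Rel__2", "Alias", "Ent_1", "Ent_3"),
]


def surface(entity, text=TEXT):
    """Return the text span that an entity's offsets point at."""
    return text[entity.start:entity.end]


if __name__ == "__main__":
    for ent in ENTITIES:
        print(ent.id, repr(surface(ent)), ent.label)
```

Running the script prints each entity id next to the span its offsets select, which is a quick way to catch off-by-one errors in hand-made annotations.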

So far, I have solved the detection of scientific names, but I am unsure how to deal with common names. Should I split the problem into two parts (find, then relate) or try to solve it all at once? Do I need to annotate anything else?
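To illustrate what the two-part option (find, then relate) might look like, here is a rough Python sketch that assumes both entity types have already been found and turns every (SCIE_NAME, COMM_NAME) pair into one classification example. The function names and the three features shown are hypothetical placeholders, not a recommendation; they only show the shape of the data a relation classifier would receive.

```python
# Beginning of the example fragment (enough to cover the name entities).
TEXT = (
    "Acer pseudoplatanus, the sycamore or sycamore maple, "
    "is a species of maple native to Central Europe and Southwestern Asia"
)

# Entity id -> (start, end, label), as in the annotation above.
ENTITIES = {
    "Ent_1": (0, 19, "SCIE_NAME"),
    "Ent_2": (25, 33, "COMM_NAME"),
    "Ent_3": (37, 51, "COMM_NAME"),
    "Ent_4": (85, 99, "LOCATION"),
}

# Gold Alias relations as (scientific id, common id) pairs.
GOLD_ALIAS = {("Ent_1", "Ent_2"), ("Ent_1", "Ent_3")}


def pair_features(text, sci, com):
    """Illustrative (hypothetical) features for one candidate pair."""
    s_start, s_end, _ = sci
    c_start, c_end, _ = com
    # Text lying between the two spans, whichever comes first.
    if s_end <= c_start:
        between = text[s_end:c_start]
    else:
        between = text[c_end:s_start]
    return {
        "char_distance": len(between),
        "common_follows_scientific": s_end <= c_start,
        "between_text": between.strip().lower(),
    }


def training_examples(text, entities, gold):
    """Pair every scientific name with every common name and label the pair."""
    examples = []
    for sid, sci in entities.items():
        if sci[2] != "SCIE_NAME":
            continue
        for cid, com in entities.items():
            if com[2] != "COMM_NAME":
                continue
            label = (sid, cid) in gold
            examples.append((sid, cid, pair_features(text, sci, com), label))
    return examples


if __name__ == "__main__":
    for example in training_examples(TEXT, ENTITIES, GOLD_ALIAS):
        print(example)
```

In this formulation the relation step is a binary decision per candidate pair, so it depends on the common names having been detected first.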

