[Corpora-List] last call: survey of interest in a potential new catalog of annotation resources

Mark Finlayson markaf at fiu.edu
Thu Jan 5 18:51:37 CET 2017


Dear Colleagues,

This is the last call for community feedback and expressions of interest regarding a proposal for a new type of annotation resource catalog. The idea is described briefly in the appended text, and was proposed, discussed, and endorsed at an NSF-funded workshop <http://uat2015.cs.fiu.edu/> held in March 2015 (described in detail in this technical report <https://dspace.mit.edu/handle/1721.1/105270>).

If you have ever done linguistic annotation and have a moment, I would very much appreciate it if you would fill out this very short survey (3 questions):

http://online.cs.fiu.edu/survey/index.php?sid=74491&lang=en

Responses may remain anonymous, if you wish. If you have any questions, please get in touch.

Best regards,

Mark


> It is our observation that, despite several ongoing cataloging
> efforts, researchers still struggle to find, understand, and evaluate
> the annotation resources (e.g., software tools, schemas, and corpora)
> available to support their particular annotation tasks. We propose to
> develop a new catalog that addresses this problem by adding important
> new metadata on top of existing catalogs (e.g., those maintained by
> LDC, ELRA, and OLAC, with whom we will partner) and by including many
> resources that are currently uncatalogued.
>
> The key feature of the catalog is new, formal metadata that allows
> precise and accurate search for resources related to linguistic
> annotation. The metadata will also enable
> enhanced evaluation of the relevance of resources for a particular
> annotation task. This metadata is not currently captured anywhere,
> except insofar as it is described informally in academic publications.
> The metadata will also enable automatic inference as to what resources
> are relevant to which linguistic phenomena and annotation tasks. On
> the front-end the catalog will be a website: it will list linguistic
> annotation tools, annotation schema and guidelines, and corpora. On
> the back-end, it will capture formal relationships between these items
> in specially designed metadata schemes and description languages,
> which will enable sophisticated search by linguistic phenomenon,
> software capability, and semantic interoperability.
>
> The catalog will also incorporate a number of useful features:
>
> * *Wizard Interface for Novices*, which will guide novice users (such
> as undergraduates, early-stage graduate students, or researchers new
> to the field) to the tools, schemes, data, and best practices most
> appropriate for their task.
> * *Free-Text Reviews*, which, like those on modern online shopping
> sites, will allow researchers to help others evaluate resources.
> * *Archive of Abandoned Resources*, providing a
> download-site-of-last-resort for resources that have lost their homes.
> * *Streamlined Intake and Archiving*, which will allow researchers
> to quickly and easily add their newly developed resources to the
> catalog, making it easier for other researchers to find and cite
> them, and helping them comply with the data archiving requirements
> of funding agencies (e.g., the NSF Data Management Plan).
>
> This idea was proposed and endorsed by a group of 23 researchers in
> linguistic annotation who gathered at an NSF-funded workshop
> <http://uat2015.cs.fiu.edu/> held in March 2015 (described in detail
> in this technical report
> <https://dspace.mit.edu/handle/1721.1/105270>). We will soon be
> seeking NSF funding to support the implementation of the idea.
>

_________________
Mark A. Finlayson
Assistant Professor, FIU SCIS
11200 SW 8th Street, ECS Room 362, Miami, FL 33199
+1.305.348.7988 (office); +1.617.515.0708 (mobile); markaf at fiu.edu
