[Corpora-List] Timebanks

Amir Zeldes Amir.Zeldes at georgetown.edu
Tue Nov 9 02:22:13 CET 2021

Hi Andy,

There is also the English GUM corpus, which AMALGUM below is based on – that corpus currently has ~150K tokens in 12 genres with manually annotated date and time expressions which cover many of the same things as TIMEX (but excluding some non-specific/generic time expressions). Both GUM and AMALGUM also cover any mention of a time entity, even if it cannot be resolved to a specific time (e.g. “a week”, “minutes”).

You can query things such as date/year expressions resolving to 2014: https://corpling.uis.georgetown.edu/annis/#_q=ZGF0ZV93aGVuPS8yMDE0Liov&_c=R1VN&cl=5&cr=5&s=0&l=10&o=random

Or times in the afternoon:

https://corpling.uis.georgetown.edu/annis/#_q=dGltZV93aGVuPS8oMi58MVsyLTldKTouLi4qLw&_c=R1VN&cl=5&cr=5&s=0&l=10&o=random

Or date ranges, with notBefore/notAfter limits, such as “not before 1900”:

https://corpling.uis.georgetown.edu/annis/#_q=ZGF0ZV9ub3RCZWZvcmU9LzE5Li4v&_c=R1VN&cl=5&cr=5&s=0&l=10&o=random
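
In case it is useful to see what these links actually search for: the `_q` and `_c` parameters in the URL fragment appear to be Base64-encoded (the query and the corpus name, respectively). The following is a minimal Python sketch for decoding them, assuming that encoding holds for other ANNIS links as well; the helper name `decode_annis_fragment` is made up for illustration and is not part of ANNIS.

```python
import base64
from urllib.parse import parse_qs


def decode_annis_fragment(url: str) -> dict:
    """Decode the Base64 `_q` (query) and `_c` (corpus) parameters of an ANNIS link.

    Hypothetical helper: assumes both parameters are Base64 text in the URL fragment.
    """
    fragment = url.split("#", 1)[1]
    params = parse_qs(fragment)
    decoded = {}
    for key in ("_q", "_c"):
        if key in params:
            raw = params[key][0]
            # The links omit Base64 padding, so restore it before decoding.
            padded = raw + "=" * (-len(raw) % 4)
            decoded[key] = base64.urlsafe_b64decode(padded).decode("utf-8")
    return decoded


BASE = "https://corpling.uis.georgetown.edu/annis/#_q="
TAIL = "&_c=R1VN&cl=5&cr=5&s=0&l=10&o=random"

for encoded_query in (
    "ZGF0ZV93aGVuPS8yMDE0Liov",
    "dGltZV93aGVuPS8oMi58MVsyLTldKTouLi4qLw",
    "ZGF0ZV9ub3RCZWZvcmU9LzE5Li4v",
):
    print(decode_annis_fragment(BASE + encoded_query + TAIL))
```

Decoded this way, the three links correspond to the queries `date_when=/2014.*/`, `time_when=/(2.|1[2-9]):...*/` and `date_notBefore=/19../`, all run against the corpus `GUM`.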

The data is available in multiple formats, including conllu, where time expressions appear in the last column, as in this example: https://github.com/amir-zeldes/gum/blob/master/dep/GUM_news_ie9.conllu#L149-L153
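
For anyone who wants to pull that last column out programmatically, here is a rough Python sketch. It relies only on the general CoNLL-U layout (ten tab-separated columns, with the last one, MISC, holding `|`-separated `Key=Value` pairs or `_`). The sample token lines and the key `ExampleKey` are placeholders invented for illustration; they do not reproduce GUM's actual annotation keys, which are best checked in the file linked above.

```python
def misc_annotations(conllu_text: str):
    """Yield (token id, form, MISC dict) for every token with a non-empty MISC column."""
    for line in conllu_text.splitlines():
        # Skip blank sentence separators and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # CoNLL-U token lines have exactly 10 columns; "_" marks an empty MISC.
        if len(cols) != 10 or cols[9] == "_":
            continue
        pairs = dict(item.split("=", 1) for item in cols[9].split("|") if "=" in item)
        yield cols[0], cols[1], pairs


# Invented two-token sample; the MISC key/value is a placeholder, not GUM's real scheme.
SAMPLE = "\n".join([
    "# text = In 2014",
    "\t".join(["1", "In", "in", "ADP", "IN", "_", "2", "case", "_", "_"]),
    "\t".join(["2", "2014", "2014", "NUM", "CD", "_", "0", "root", "_",
               "ExampleKey=example-value"]),
])

for token_id, form, misc in misc_annotations(SAMPLE):
    print(token_id, form, misc)
```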

Hope that helps,


From: corpora-bounces at uib.no <corpora-bounces at uib.no> On Behalf Of Luke Gessler
Sent: Monday, November 8, 2021 7:13 PM
To: Andy Lücking <luecking at em.uni-frankfurt.de>
Cc: corpora at uib.no
Subject: Re: [Corpora-List] Timebanks

Dear Andy,

AMALGUM is a 4M token English corpus with machine-annotated time expressions. I'm not familiar with TIMEX3, but you may find it useful regardless. You can find it here: https://github.com/gucorpling/amalgum



On Mon, Nov 8, 2021 at 12:06 PM Andy Lücking <luecking at em.uni-frankfurt.de> wrote:

Dear colleagues,

We are looking for resources with temporal annotation, in particular TIMEX3 expressions. We know of the English timebank, but there should be more corpora around (all links to the Spanish and French timebanks I discovered are broken).

Thanks for any hints!



_______________________________________________
UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
Corpora mailing list
Corpora at uib.no
https://mailman.uib.no/listinfo/corpora
