[Corpora-List] Timebanks

Andy Lücking luecking at em.uni-frankfurt.de
Tue Nov 9 07:48:17 CET 2021


Hi Marc, hi Andre, hi Luke, hi Amir,

Many thanks for your responses! They are much appreciated and help a lot!

Best,

Andy

Quoting Amir Zeldes <Amir.Zeldes at georgetown.edu>:


> Hi Andy,
>
>
>
> There is also the English GUM corpus, which AMALGUM below is based
> on – that corpus currently has ~150K tokens in 12 genres with
> manually annotated date and time expressions which cover many of the
> same things as TIMEX (but excluding some non-specific/generic time
> expressions). Both GUM and AMALGUM also cover any mention of a time
> entity, even if it cannot be resolved to a specific time (e.g. “a
> week”, “minutes”).
>
>
>
> You can query things such as date/year expressions resolving to
> 2014:
> https://corpling.uis.georgetown.edu/annis/#_q=ZGF0ZV93aGVuPS8yMDE0Liov&_c=R1VN&cl=5&cr=5&s=0&l=10&o=random
>
>
>
> Or times in the afternoon:
>
>
>
> https://corpling.uis.georgetown.edu/annis/#_q=dGltZV93aGVuPS8oMi58MVsyLTldKTouLi4qLw&_c=R1VN&cl=5&cr=5&s=0&l=10&o=random
>
>
>
> Or date ranges, with notBefore/notAfter limits, such as “not before 1900”:
>
>
>
> https://corpling.uis.georgetown.edu/annis/#_q=ZGF0ZV9ub3RCZWZvcmU9LzE5Li4v&_c=R1VN&cl=5&cr=5&s=0&l=10&o=random
>
>
>
> The data is available in multiple formats, including conllu, where
> time expressions appear in the last column, as shown here:
> https://github.com/amir-zeldes/gum/blob/master/dep/GUM_news_ie9.conllu#L149-L153
>
>
>
> Hope that helps,
>
> Amir
>
>
>
>
>
> From: corpora-bounces at uib.no <corpora-bounces at uib.no> On Behalf Of
> Luke Gessler
> Sent: Monday, November 8, 2021 7:13 PM
> To: Andy Lücking <luecking at em.uni-frankfurt.de>
> Cc: corpora at uib.no
> Subject: Re: [Corpora-List] Timebanks
>
>
>
> Dear Andy,
>
>
>
> AMALGUM is a 4M token English corpus with machine-annotated time
> expressions. I'm not familiar with TIMEX3, but you may find it
> useful regardless. You can find it here:
> https://github.com/gucorpling/amalgum
>
>
>
> Regards,
>
> Luke
>
>
>
> On Mon, Nov 8, 2021 at 12:06 PM Andy Lücking
> <luecking at em.uni-frankfurt.de> wrote:
>
>
> Dear colleagues,
>
> We are looking for resources with temporal annotation, in particular
> TIMEX3 expressions. We know of the English timebank, but there should
> be more corpora around (all links to the Spanish and French timebanks
> I discovered are broken).
>
> Thanks for any hints!
>
> Best,
>
> Andy
>
>
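
The ANNIS links quoted above carry the search in the URL fragment: the `_q` field appears to be the query and `_c` the corpus name, both base64-encoded. A minimal Python sketch for reading them back follows. The function name `decode_annis_link` is made up for illustration, and the code assumes the URL-safe base64 alphabet with trailing `=` padding omitted, which holds for the three links in this thread but is not confirmed against the ANNIS documentation.

```python
import base64
from urllib.parse import parse_qs, urlsplit


def decode_annis_link(url: str) -> dict:
    """Decode the base64 fields of an ANNIS link (hypothetical helper)."""
    fragment = urlsplit(url).fragment  # everything after '#'
    params = parse_qs(fragment)

    def b64(value: str) -> str:
        # Assumption: padding was stripped, so restore it before decoding.
        padded = value + "=" * (-len(value) % 4)
        return base64.urlsafe_b64decode(padded).decode("utf-8")

    return {
        "query": b64(params["_q"][0]),
        "corpus": b64(params["_c"][0]),
    }


if __name__ == "__main__":
    link = (
        "https://corpling.uis.georgetown.edu/annis/"
        "#_q=ZGF0ZV93aGVuPS8yMDE0Liov&_c=R1VN&cl=5&cr=5&s=0&l=10&o=random"
    )
    print(decode_annis_link(link))
```

For the first link this yields the query `date_when=/2014.*/` on corpus `GUM`; the other two decode to `time_when=/(2.|1[2-9]):...*/` and `date_notBefore=/19../`.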


