[Corpora-List] English idiom dataset

Guy Emerson gete2 at cam.ac.uk
Sun Nov 19 15:23:10 CET 2017


For idiomatic compound nouns:

Reddy, McCarthy, & Manandhar (2011) "An Empirical Study on Compositionality in Compound Nouns"

http://www.anthology.aclweb.org/I/I11/I11-1024.pdf

2017-11-19 11:00 GMT+00:00 <corpora-request at uib.no>:


> Date: Sat, 18 Nov 2017 14:42:18 +0100
> From: Alexander Osherenko <osherenko at gmx.de>
> Subject: Re: [Corpora-List] English idiom dataset
> To: Maria Pia di Buono <mariapia.dibuono at fer.hr>
> Cc: "Corpora at uib.no" <corpora at uib.no>
>
> Maria,
>
> before you collect comprehensive data, you might choose the languages you
> are working on and a field of interest -- you can't study all idioms. Be
> aware: you can't trust your own intuitions, since you will find many
> astonishing things about the correct meaning of idioms you thought you knew. ;-)
>
> In my research, I was working on emotional verbs and phrases. As references,
> I chose Cambridge idioms (ISBN-13: 978-0521677691), Cambridge Phrasal
> Verbs (ISBN-13: 978-0521677707), The free dictionary
> https://idioms.thefreedictionary.com/ and Thesaurus
> http://www.thesaurus.com/browse. These dictionaries are comprehensively
> annotated, and you can work with them. They also include usage examples
> that you can use when composing your corpus.
>
> Translation software can help. It does not matter if some translations
> are incorrect; they still give you an idea of where to start. You can use
> Wikipedia: find an idiom in one language and then switch to the corresponding
> page in another language. You can also use Google Translator to help you
> find a translation. Otherwise, use dictionaries in your target languages
> (they might have proper translations of your idioms).
>
> In any case, the research you are doing is quite hard.
>
> HIH, Alexander
>
> --
> Alexander Osherenko, Dr. rer. nat.
> Senior HCI architect
>
> Founder and R&D
> Socioware Development <http://www.socioware.de/osherenko_page.html>
>
> Humboldt Innovation
> Humboldt-Universität zu Berlin
>
> Profile: ResearchGate
> <https://www.researchgate.net/profile/Alexander_Osherenko>
> Channel: LinkedIn
> <https://www.linkedin.com/pub/alexander-osherenko/1/30a/a74>
> Channel: Google+ <https://plus.google.com/105305790720313348886>, Google
> Scholar <https://scholar.google.com/citations?user=q_0QJBoAAAAJ&hl=en>
> Channel: Youtube <https://www.youtube.com/user/MrOsherenko>
> Channel: Twitter <https://twitter.com/mrosherenko>
>
> Social interaction, globalization and computer-aided analysis
> <https://www.researchgate.net/publication/281644865_Social_Interaction_Globalization_and_Computer-Aided_Analysis_A_Practical_Guide_to_Developing_Social_Simulation>
> at Springer
>
> 2017-11-17 23:31 GMT+01:00 Jelena Mitrovic <jecovit at gmail.com>:
>
> > Hello, Maria,
> >
> > You might find outcomes of the PARSEME COST Action useful
> >
> > https://typo.uni-konstanz.de/parseme/
> >
> > Kind regards
> > Jelena
> >
> > On 17 November 2017 at 10:07, Maria Pia di Buono <mariapia.dibuono at fer.hr> wrote:
> >
> >> Hi all,
> >>
> >> I'm working on a cross-lingual classification system for idioms and I was
> >> wondering if there are some available resources for English (I'm sure there
> >> are, but I was not able to find them).
> >> At first glance, I was looking for the VNC-Tokens Dataset by Fazly and
> >> Stevenson (2006). I know that this dataset includes only constructions with
> >> a verb and a noun in its direct object position, so I'd probably need
> >> other, more comprehensive resources.
> >>
> >> Do you have any suggestions?
> >>
> >> Thank you.
> >>
> >> Best,
> >> Maria Pia
> >>
> >>
> >>
> >> Maria Pia di Buono
> >> --
> >> Text Analysis and Knowledge Engineering Lab <http://takelab.fer.hr>
> >> Faculty of Electrical Engineering and Computing
> >> University of Zagreb, Croatia
> >> mail: mariapia.dibuono at fer.hr
> >> web: http://takelab.fer.hr/maria-pia-di-buono/
> >>
> >>
> >> _______________________________________________
> >> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> >> Corpora mailing list
> >> Corpora at uib.no
> >> https://mailman.uib.no/listinfo/corpora
> >>
> >>
> >
> >
> > --
> > Jelena Mitrović, PhDc
> > Wissenschaftliche Mitarbeiterin
> >
> > Lehrstuhl für Informatik
> > Digital Libraries and Web Information Systems
> > Universität Passau / ITZ / Raum 108
> > Innstr. 43
> > 94032 Passau
> > +49 851 509 3395
> >
> > jelena.mitrovic at uni-passau.de
> > www.uni-passau.de
> >
>
> End of Corpora Digest, Vol 125, Issue 25
> ****************************************
>


