[Corpora-List] Transliteration/Romanization tool for Modern Greek

Isabella Chiari isabella.chiari at uniroma1.it
Wed Feb 27 17:06:28 CET 2013

Dear Corpora list members, on behalf of a colleague I am asking for your help in finding a transliteration/romanization tool for Modern Greek texts. Is there anything available, either free or for purchase? Thank you in advance for your help, Isabella
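
For illustration only, here is a minimal sketch of what a rule-based romanization of Modern Greek could look like. It is not an existing tool and not a full implementation of any standard such as ISO 843 / ELOT 743. The letter table, the single word-initial "μπ" rule, and the function names `strip_accents` and `romanize` are all assumptions made for this example; a usable tool would need many more context rules (e.g. for αυ, ευ, ου, γγ, ντ).

```python
import unicodedata

# Hypothetical, simplified letter table (an assumption for this sketch,
# not a complete or standard-conformant mapping).
GREEK_TO_LATIN = {
    "α": "a", "β": "v", "γ": "g", "δ": "d", "ε": "e", "ζ": "z",
    "η": "i", "θ": "th", "ι": "i", "κ": "k", "λ": "l", "μ": "m",
    "ν": "n", "ξ": "x", "ο": "o", "π": "p", "ρ": "r", "σ": "s",
    "ς": "s", "τ": "t", "υ": "y", "φ": "f", "χ": "ch", "ψ": "ps",
    "ω": "o",
}


def strip_accents(text):
    """Remove combining marks (tonos, dialytika) after NFD decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def romanize(text):
    """Romanize Greek text letter by letter; output is lowercased."""
    plain = strip_accents(text.lower())
    out = []
    i = 0
    while i < len(plain):
        at_word_start = i == 0 or not plain[i - 1].isalpha()
        # Assumed rule: word-initial "μπ" is rendered as "b".
        if at_word_start and plain.startswith("μπ", i):
            out.append("b")
            i += 2
            continue
        # Characters outside the table (Latin letters, digits,
        # punctuation) pass through unchanged.
        out.append(GREEK_TO_LATIN.get(plain[i], plain[i]))
        i += 1
    return "".join(out)


if __name__ == "__main__":
    for word in ["γαλάζιο", "μπλε"]:
        print(word, "->", romanize(word))
```

The sketch drops accents and case, so it is only suitable for rough, one-way romanization, not for reversible transliteration.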

On 25 Feb 2013, at 16:23, Gill Philip <g.philip.polidoro at gmail.com> wrote:

> Although it has its critics and its weak points, a pretty good point of reference is Berlin & Kay 1969. Their listing of colour words actually refers to existence in languages: if a language has a "blue" colour term, then it already has black, white, red, green & yellow: no language (in their study) can have, e.g. "pink" if it doesn't already have "blue".
> Anyway, as a rough guide, their order is (Berlin and Kay 1969: 4)
> white & black
> red
> yellow & green
> blue
> brown
> pink / purple / grey / orange
> When I looked at colour words in English and Italian, I got these figures (freq. per million)
> ENGLISH (Bank of English, circa 2003)
> white (316) & black (294)
> red (182)
> green (139), brown (136), blue (122)
> grey (63)
> yellow (51)
> pink (37) & purple (15)
> orange (35)
> ITALIAN (CORIS, circa 2003)
> White (Bianco, 308)
> Red (Rosso, 267) and Black (Nero, 265)
> Green (Verde, 176)
> Blue (=143: Azzurro, 85 plus Blu, 58)
> Pink (Rosa, 90), Yellow (Giallo, 82), Grey (Grigio, 63)
> Purple (Viola, 22)
> Brown (Marrone, 13)
> Orange (Arancione, 9)
> They're not an exact match with B&K's sequencing, but you can see the basic principle at work. Black, white and red are clearly more common than the other colours; blue and green are similar in frequency; pink & purple form another group. I should mention, though, that this is a fairly crude measure, and not based on POS-tagged data. There are problems with homographs, e.g. "orange" is also the fruit in English (but not in Italian); Brown is a surname in English (and was the name of the then Chancellor, subsequently Prime Minister, so cropped up disproportionately in the data).
> This data comes from my long-forgotten PhD dissertation "Collocation and Connotation": I believe it's still hanging around on the web somewhere.
> hope this helps,
> Gill
> On 25 February 2013 14:31, H.A.E Viethen <H.A.E.Viethen at uvt.nl> wrote:
> Hi,
> we are looking for a way to estimate the relative frequency of colour
> terms in different languages, in particular Greek and Dutch. So for
> example, we'd like to know how frequent the term 'rood' (red) is in
> Dutch compared to the term 'roze' (pink), or how the frequencies of
> the terms 'ble' and 'galázio' compare in Greek.
> We only need ballpark figures, the kind of thing one might estimate
> with hit counts in web searches, although having slightly more
> reliable numbers than that would be nice. In any case, many Greek
> colour terms are derived from common nouns for objects in the natural
> environment and usually even spelled the same. This makes it difficult
> to distinguish the use of a word as a colour term from its use as a
> common noun.
> Does anyone know of a resource (paper, website, anything) that might
> readily list relative frequencies for colour terms in Greek and Dutch?
> Alternatively, can anyone point us to a POS-tagged corpus of Greek or
> Dutch which would be suitable for counting the use of colour terms?
> Many thanks,
> Jette Viethen
> Tilburg University
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
> --
> *********************************
> Dr. Gill Philip
> Università degli Studi di Macerata
> Dipartimento di Scienze della Formazione, dei Beni Culturali, e del Turismo
> Piazzale L. Bertelli
> Contrada Vallebona
> 62100 Macerata
> Italy
