[Corpora-List] Compilation of language resources for French - ELRA

Yannick Versley versley at sfs.uni-tuebingen.de
Tue Apr 12 12:56:49 CEST 2011

Dear Valérie,

The "free" price point is especially interesting for master's students who want to do actual research but may have to pay for the materials out of their own pocket. For these people, a price of 200-500 EUR is definitely out of reach (some would balk at 50 EUR, which I'd understand if you need to combine data from multiple resources to carry out your research), and labeling such prices "at media cost" will not change this.

I do think that LDC and ELRA play an important role in the ecosystem around language resources, but I am also sure that the most effective way to ensure the widest possible use of a resource in academic research is to make it available free of cost and under a liberal license, as has been done with the Lefff. I understand that this is not always possible, but I applaud the people behind the Lefff (and similar resources) for making it possible.

Best wishes,
Yannick Versley

On Tue, Apr 12, 2011 at 12:08 PM, Valérie Mapelli <mapelli at elda.org> wrote:

> Dear Corpora readers,
> Recently, Ineta Sejane circulated a message listing a number of French
> language resources.
> With the aim of contributing to this list, we have identified
> some language resources with a French component, available in the ELRA
> Catalogue, which are either free or at media cost for research purposes.
> These are distributed as follows:
> Written Corpora:
> W0003 CRATER corpus<http://catalog.elra.info/product_info.php?cPath=42_43&products_id=84>
> W0004 ECI/MCI (European Corpus Initiative/Multilingual Corpus I)<http://catalog.elra.info/product_info.php?cPath=42_43&products_id=85>
> W0013 TSNLP (Test Suites for NLP Testing)<http://catalog.elra.info/product_info.php?cPath=42_43&products_id=51>
> W0015 Text corpus of "Le Monde"<http://catalog.elra.info/product_info.php?cPath=42_43&products_id=438>
> W0017 MULTEXT JOC Corpus<http://catalog.elra.info/product_info.php?cPath=42_43&products_id=534>
> W0018 ARCADE/ROMANSEVAL corpus<http://catalog.elra.info/product_info.php?cPath=42_43&products_id=535>
> W0023 MLCC Multilingual and Parallel Corpora<http://catalog.elra.info/product_info.php?cPath=42_43&products_id=764>
> W0025-01 A "scientific" corpus of modern French ("La Recherche"
> magazine) - Raw data<http://catalog.elra.info/product_info.php?cPath=42_43&products_id=594>
> W0025-02 A "scientific" corpus of modern French ("La Recherche"
> magazine) - Complete version<http://catalog.elra.info/product_info.php?cPath=42_43&products_id=595>
> W0032 Modern French Corpus including Anaphors Tagging<http://catalog.elra.info/product_info.php?cPath=42_43&products_id=634>
> W0033 CRATER 2 Corpus<http://catalog.elra.info/product_info.php?cPath=42_43&products_id=636>
> W0036-01 "Le Monde Diplomatique" Text corpus in French - archives
> 1980-1998<http://catalog.elra.info/product_info.php?cPath=42_43&products_id=7>
> W0036-02 "Le Monde Diplomatique" Text corpus in French - archives from
> 1999 <http://catalog.elra.info/product_info.php?cPath=42_43&products_id=9>
> Lexicons:
> L0010 MULTEXT Lexicons<http://catalog.elra.info/product_info.php?products_id=29>
> M0020 EuroWordNet French<http://catalog.elra.info/product_info.php?products_id=550>
> Speech LRs:
> S0006 BREF-80<http://catalog.elra.info/product_info.php?products_id=36>
> S0007 BREF-POLYGLOT<http://catalog.elra.info/product_info.php?products_id=37>
> S0021 M2VTS Speaker Verification Database<http://catalog.elra.info/product_info.php?products_id=758>
> S0033 BDBRUIT<http://catalog.elra.info/product_info.php?products_id=80>
> S0060 MULTEXT Prosodic database<http://catalog.elra.info/product_info.php?products_id=530>
> S0088 Twin database - TWINDB1<http://catalog.elra.info/product_info.php?products_id=579>
> S0163 ILPho phonetic lexicon<http://catalog.elra.info/product_info.php?products_id=760>
> S0238 MIST Multi-lingual Interoperability in Speech Technology database<http://catalog.elra.info/product_info.php?products_id=988>
> S0241 ESTER Corpus<http://catalog.elra.info/product_info.php?products_id=999>
> S0305 EPAC Corpus: orthographic transcriptions<http://catalog.elra.info/product_info.php?products_id=1119>
> Evaluation Packages:
> E0008 The CLEF Test Suite for the CLEF 2000-2003 Campaigns -
> Evaluation Package<http://catalog.elra.info/product_info.php?products_id=888>
> E0018 ARCADE II Evaluation Package<http://catalog.elra.info/product_info.php?products_id=992>
> E0019 CESART Evaluation Package<http://catalog.elra.info/product_info.php?products_id=993>
> E0020 CESTA Evaluation Package<http://catalog.elra.info/product_info.php?products_id=994>
> E0021 ESTER Evaluation Package<http://catalog.elra.info/product_info.php?products_id=995>
> E0022 EQueR Evaluation Package<http://catalog.elra.info/product_info.php?products_id=996>
> E0023 EvaSy Evaluation Package<http://catalog.elra.info/product_info.php?products_id=997>
> E0024 MEDIA Evaluation Package<http://catalog.elra.info/product_info.php?products_id=998>
> E0034 EASy Evaluation Package<http://catalog.elra.info/product_info.php?products_id=1112>
> E0036 CLEF AdHoc-News Test Suites (2004-2008) - Evaluation Package<http://catalog.elra.info/product_info.php?products_id=1127>
> E0038 CLEF Question Answering Test Suites (2003-2008) - Evaluation
> Package <http://catalog.elra.info/product_info.php?products_id=1129>
> W0029 Amaryllis Corpus - Evaluation Package<http://catalog.elra.info/product_info.php?cPath=42_43&products_id=626>
> Other French language resources, and resources for many other languages, are
> available to both research and commercial communities in our catalogue, which
> you may visit at:
> http://catalogue.elra.info
> Best regards,
> Valérie Mapelli