[Corpora-List] Spanish and Latin-American Spanish parallel corpora

Krallinger.Martin mkrallinger at cnio.es
Wed Sep 12 11:17:07 CEST 2018


Dear Nicola,

if you do not mind that the corpus is domain-specific, SCIELO (http://www.scielo.org/php/index.php?lang=es) might be a suitable resource.

It hosts medical literature with titles and abstracts in English and Spanish. For the Spanish version, there are collections from multiple countries, such as Argentina, Spain, Mexico, Colombia, Chile, and Cuba.

This resource has also been used for several shared tasks and for corpus/resource construction, e.g.:

http://temu.bsc.es/mespen/
Villegas, M., et al. (2018). The MeSpEN Resource for English-Spanish Medical Machine Translation and Terminologies: Census of Parallel Corpora, Glossaries and Term Translations. In LREC MultilingualBIO: Multilingual Biomedical Text Processing. ELRA.

or

Neves, M. L., et al. (2016). The Scielo Corpus: a Parallel Corpus of Scientific Publications for Biomedicine. In LREC.

or the SCIELO collection used for the biomedical track of the second language technologies hackathon (info in Spanish):

http://www.hackathonplantl.es/recursos-y-herramientas-disponibles

Best regards,

Martin

Important: Please note my new email: martin.krallinger at bsc.es

============================
Martin Krallinger, Dr.
--------------------------------------------------------------------
Head of Biological Text Mining Unit
Structural Biology and BioComputing Programme
Spanish National Cancer Research Centre (CNIO)
--------------------------------------------------------------------
Oficina Técnica General (OTG) del Plan TL en el área de Biomedicina de la Secretaria de Estado de Telecomunicaciones y para la Sociedad de la Información
============================

________________________________
From: corpora-bounces at uib.no [corpora-bounces at uib.no] on behalf of Eugenio Martínez Cámara [emcamara at decsai.ugr.es]
Sent: Wednesday, September 12, 2018 9:36 AM
To: Nicola Bertoldi
Cc: corpora at uib.no
Subject: Re: [Corpora-List] Spanish and Latin-American Spanish parallel corpora

Hi Nicola,

I recommend looking at the latest corpora/datasets from the TASS workshops, which cover different varieties of Spanish.

URL: http://www.sepln.org/workshops/tass/2018/

Kind regards, Eugenio.

On 2018-07-30 14:26, Nicola Bertoldi wrote:

Dear all,

I am looking for parallel corpora between English and Spanish dialects: Spanish of Spain (es-ES), Mexican Spanish (es-MX), and so on.

I would also be interested in parallel corpora between Spanish dialects (e.g. es-ES vs es-MX).

Any suggestions on where to find such resources are very welcome.

best, Nicola


---
Eugenio Martínez Cámara
Postdoctoral Researcher in Natural Language Proc.
Research group SCI2S <http://sci2s.ugr.es/>
Computer Science and Artificial Intelligence department
Universidad de Granada



