[Corpora-List] Corpora Digest, Vol 88, Issue 7

Miguel-Angel Benitez-Castro mabenitez at ugr.es
Tue Oct 7 14:15:03 CEST 2014


Hi everyone,

Please excuse my ignorance, but does anyone know of an accessible corpus of research articles from soft and hard academic disciplines that is tagged for move structure? I would like to examine how a particular linguistic feature is used in discussion moves of RAs in several popular 'soft' and 'hard' academic disciplines.

Many thanks in advance for your help

Dr Miguel-Angel Benitez-Castro
University of Granada, Spain

On 2014-10-07 12:00, corpora-request at uib.no wrote:
> Today's Topics:
>
> 1. Job opening for Senior Language Technologist, Oxford
> University Press (WHITELOCK, Pete)
> 2. Job opening for Language Technologist, Oxford University
> Press (WHITELOCK, Pete)
> 3. Re: Bilingual Dictionary from Comparable Corpora (Alexandr
> Rosen)
> 4. Call for participation: 2nd DTA- & CLARIN-D-Conference and
> Workshop on Text Corpora in Infrastructures, November 17th/18th,
> 2014 (Susanne Haaf)
> 5. Re: Bilingual Dictionary from Comparable Corpora (Serge
> Sharoff)
> 6. Re: Bilingual Dictionary from Comparable Corpora
> (Krishnamurthy, Ramesh)
> 7. CfP: From data to evidence in English language research: Big
> data, rich data, uncharted data (Tanja Säily)
> 8. Reminder: Ph.D. position in Computational Linguistics at
> Stockholm University (Mats Wirén)
> 9. Doktorandanställning i data- och systemvetenskap inom NIASC,
> Ref.nr SU FV-2615-14, deadline 15 okt 2014 (Hercules Dalianis)
> 10. NAACL 2015 System Demonstrations -- First Call (Matthew
> Gerber)
> 11. Re: Bilingual Dictionary from Comparable Corpora
> (inguna.skadina at lumii.lv)
> 12. Call for papers - Special issue on Medical Information
> Retrieval (Lorraine Goeuriot)
> 13. Question about expressions of intermediate meanings
> (Carita Paradis)
>
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 6 Oct 2014 10:14:37 +0000
> From: "WHITELOCK, Pete" <pete.whitelock at oup.com>
> Subject: [Corpora-List] Job opening for Senior Language Technologist,
> Oxford University Press
> To: "corpora at uib.no" <corpora at uib.no>
>
>
> About Us
>
> Oxford University Press is a department of the University of Oxford,
> which furthers the University's objective of excellence in research,
> scholarship, and education by publishing worldwide.
>
> Global Academic publishes books, journals and digital resources for
> the research, professional and higher education markets.
>
> About the Role
>
> The Dictionaries Division publishes the flagship online products
> Oxford English Dictionary (OED) and Oxforddictionaries.com, leads
> innovation in digital lexical publishing and licensing working with
> the world's largest technology and information providers, and is
> launching new initiatives including the Oxford Global Languages
> programme which will develop digital lexical resources with
> communities across a wide range of languages.
> As we grow our dictionary business in line with new digital
> opportunities for language resources, we are looking for a versatile
> programmer with interest and experience in NLP, computational
> linguistics, or machine learning. The successful candidate will join
> our language technology team developing the next generation of
> language resources.
> As a Senior Language Technologist working in collaboration with our
> editorial, online, and business development teams, you will join a
> growing language technology team supporting our core business in
> lexical content technology. You will be creative, analytical, and
> enthusiastic, able to apply innovative approaches in linguistic
> analysis.
>
> Responsibilities will include:
> * Defining and directing projects involving internal and external
> partners
> * Directing and supporting the team by providing mentorship, advice
> and technical assistance to all members of the Language Technology
> Group
> * Identifying and developing tools and techniques to improve and
> enrich media-independent delivery of language content, while ensuring
> optimum quality and consistency
> * Providing expertise in modelling language content
> * Defining and implementing processes for the standardisation,
> integration and enhancement of diverse projects involving language
> content and technology.
> * Defining processes for the development of enhanced digital content
> and application functionality in liaison with external partners and
> licensees.
> * Providing strategic support to the Head of Language Technology
>
> About You
>
> * A degree (or equivalent) in computer science, computational
> linguistics or similar
> * Proven experience with semantic technologies and RDF
> * Proven experience in XML, XSLT and related technologies
> * Proven commercial experience in the field of NLP
> * Proven experience of Perl, Python, Java or similar
> * Strong interpersonal and communication skills
> * Strong interest in expanding your knowledge and learning new skills
>
>
> Expertise in any of the following areas would be an advantage:
> * Knowledge representation and reasoning, statistical language
> processing, machine learning, data mining, text analytics
> * Creation of high-quality textual corpora
> * Experience in working with multilingual linguistic resources
> * Familiarity with a non-European language such as Arabic, Chinese,
> or Japanese
>
> To apply for this job, visit:
>
> http://ukjobs.oup.com/Exp/Vacancy.aspx?VacancyId=57232
>
>
> For more jobs in Dictionaries visit
> www.oxforddictionaries.com/words/oxfordlanguages
>
>
>
>
> Pete Whitelock, PhD
> Principal Language Engineer, Technology
> Academic Dictionaries
> Oxford University Press
>
>
>
>
> ------------------------------
>
> Message: 2
> Date: Mon, 6 Oct 2014 10:14:38 +0000
> From: "WHITELOCK, Pete" <pete.whitelock at oup.com>
> Subject: [Corpora-List] Job opening for Language Technologist, Oxford
> University Press
> To: "corpora at uib.no" <corpora at uib.no>
>
>
> About Us
>
> Oxford University Press is a department of the University of Oxford,
> which furthers the University's objective of excellence in research,
> scholarship, and education by publishing worldwide.
>
> Global Academic publishes books, journals and digital resources for
> the research, professional and higher education markets.
>
> About the Role
>
> The Dictionaries Division publishes the flagship online products
> Oxford English Dictionary (OED) and Oxforddictionaries.com, leads
> innovation in digital lexical publishing and licensing working with
> the world's largest technology and information providers, and is
> launching new initiatives including the Oxford Global Languages
> programme which will develop digital lexical resources with
> communities across a wide range of languages.
> As we grow our dictionary business in line with new digital
> opportunities for language resources, we are looking for a versatile
> programmer with interest in NLP, computational linguistics, or
> machine learning. The successful candidate will join our language
> technology team developing the next generation of language resources.
> As part of a growing Language Technology Group (LTG) working in
> collaboration with our editorial, online, and business development
> teams, you will be supporting our core business in lexical content
> technology. You will be creative, analytical, and enthusiastic, able
> to apply innovative approaches in linguistic analysis.
>
> Responsibilities will include:
> * Defining and contributing to language technology projects
> * Defining, guiding, implementing, and documenting lexical data
> conversion processes in close cooperation with other Dictionary
> groups
>
> * Identifying and developing tools and techniques to improve and
> enrich media-independent delivery of language content, while ensuring
> optimum quality and consistency.
> * Developing models for language content considering optimisation of
> content reusability
> * Defining and implementing processes for the standardisation,
> integration and enhancement of diverse projects involving language
> content and technology
> * Defining and implementing processes for the development of enhanced
> digital content and application functionality in liaison with
> external partners and licensees
>
> About You
> * A degree (or equivalent) in computer science, computational
> linguistics or similar
> * Excellent programming skills and proven programming experience
> * Familiarity with XML, XSLT and related technologies
> * Some knowledge in semantic technologies, RDF, NLP, and corpora
> * Good interpersonal and communication skills
> * Strong interest in expanding your knowledge and learning new skills
> * Ideally you will have experience in working with multilingual
> linguistic resources
> * Familiarity with a non-European language such as Arabic, Chinese,
> or Japanese would be an advantage
> To apply for this job, visit:
>
> http://ukjobs.oup.com/Exp/Vacancy.aspx?VacancyId=57227
>
>
> For more jobs in Dictionaries visit
> www.oxforddictionaries.com/words/oxfordlanguages
>
>
>
> Pete Whitelock, PhD
> Principal Language Engineer, Technology
> Academic Dictionaries
> Oxford University Press
>
>
>
>
> ------------------------------
>
> Message: 3
> Date: Mon, 6 Oct 2014 13:08:32 +0200
> From: Alexandr Rosen <alexandr.rosen at gmail.com>
> Subject: Re: [Corpora-List] Bilingual Dictionary from Comparable
> Corpora
> To: corpora at uib.no
>
> Hi,
>
> There's also http://code.google.com/p/berkeleyaligner/, "a word
> alignment software package that implements recent innovations in
> unsupervised word alignment". Uploaded Sep 28, 2009.
>
> HTH
>
> Alexandr
>
>
>> Message: 6
>> Date: Sun, 5 Oct 2014 21:36:45 -0400
>> From: Philipp Koehn <pkoehn at inf.ed.ac.uk>
>> Subject: Re: [Corpora-List] Bilingual Dictionary from Comparable
>> Corpora
>> To: javid dadashkarimi <javiddadashkarimi at gmail.com>
>> Cc: gate-users-request at lists.sourceforge.net, "corpora at uib.no"
>> <corpora at uib.no>
>>
>> Hi,
>>
>> Moses does facilitate the construction of translation models
>> from comparable data - but there has been some recent
>> research on the topic that is a good starting point for
>> developing such a tool:
>>
>> http://www.aclweb.org/anthology/P14-1064.pdf
>>
>> http://www.cs.jhu.edu/~anni/papers/irvineCCB_Hallucinating_CoNLL_14.pdf
>>
>> -phi
>
>
>
>
> ------------------------------
>
> Message: 4
> Date: Mon, 06 Oct 2014 13:56:29 +0200
> From: Susanne Haaf <haaf at bbaw.de>
> Subject: [Corpora-List] Call for participation: 2nd DTA- &
> CLARIN-D-Conference and Workshop on Text Corpora in Infrastructures,
> November 17th/18th, 2014
> To: corpora at uib.no
>
> Dear members,
>
> we would like to invite you to the second joint DTA- &
> CLARIN-D-Conference on "Text Corpora in Infrastructures for the
> Humanities and Social Sciences" and the associated second
> CLARIN-D/WP5 workshop on CLARIN-D's Language Resources and Services.
>
> The events will take place on November 17th/18th, 2014, at the
> Berlin-Brandenburg Academy of Sciences and Humanities, Jägerstr. 22/23,
> Berlin (Germany), Einsteinsaal.
>
> **********************************
>
> Title: 2. DTA-/CLARIN-D-Konferenz und CLARIN-D-Workshop: Textkorpora in
> Infrastrukturen für die Geistes- und Sozialwissenschaften
>
> Conference/Workshop language: German
>
> Contact: Deutsches Textarchiv (http://www.deutschestextarchiv.de),
> dta at bbaw.de
>
> Please register until: November 2nd, 2014. Participation is free of
> charge.
>
> For further information see:
> http://www.deutschestextarchiv.de/veranstaltungen/DTAClarinDConf2014
>
> **********************************
>
> Description:
> The second joint DTA and CLARIN-D conference addresses the
> significance, benefits and possibilities for reuse of "Text Corpora in
> Infrastructures for the Humanities and Social Sciences".
>
> In two overarching thematic blocks, researchers from various
> disciplines in the humanities and social sciences present, on the one
> hand, current corpus-driven research questions and, on the other,
> various ways of accessing and analysing text corpora.
>
> The conference is accompanied by a CLARIN-D workshop on Work Package 5,
> "Language Resources and Services". This workshop is devoted to new
> developments in the CLARIN-D joint project concerning the construction,
> provision and analysis of CLARIN-compatible language resources.
>
> **********************************
>
> Best regards,
> Susanne Haaf.
>
> --
> Susanne Haaf, M.A.
> Berlin-Brandenburgische Akademie der Wissenschaften
> Deutsches Textarchiv & CLARIN-D
>
> Jägerstr. 22/23, 10117 Berlin
> haaf at bbaw.de, +49 (0)30 2037 0523
> www.deutschestextarchiv.de, www.clarin-d.de
>
>
>
> ------------------------------
>
> Message: 5
> Date: Mon, 06 Oct 2014 13:06:01 +0100
> From: Serge Sharoff <s.sharoff at leeds.ac.uk>
> Subject: Re: [Corpora-List] Bilingual Dictionary from Comparable
> Corpora
> To: Alexandr Rosen <alexandr.rosen at gmail.com>, "corpora at uib.no"
> <corpora at uib.no>
>
> Dear all,
>
> there is a book overviewing more recent developments in this field:
> http://www.springer.com/computer/ai/book/978-3-642-20127-1
>
> The overview chapter from this book is freely available from:
>
> http://www.springer.com/cda/content/document/cda_downloaddocument/9783642201271-c1.pdf?SGWID=0-0-45-1442068-p174109864
>
> Best wishes,
> Serge
>
> On 06/10/14 12:08, Alexandr Rosen wrote:
>> Hi,
>>
>> There's also http://code.google.com/p/berkeleyaligner/, "a word
>> alignment software package that implements recent innovations in
>> unsupervised word alignment". Uploaded Sep 28, 2009.
>>
>> HTH
>>
>> Alexandr
>>
>>
>>> Message: 6
>>> Date: Sun, 5 Oct 2014 21:36:45 -0400
>>> From: Philipp Koehn <pkoehn at inf.ed.ac.uk>
>>> Subject: Re: [Corpora-List] Bilingual Dictionary from Comparable
>>> Corpora
>>> To: javid dadashkarimi <javiddadashkarimi at gmail.com>
>>> Cc: gate-users-request at lists.sourceforge.net, "corpora at uib.no"
>>> <corpora at uib.no>
>>>
>>> Hi,
>>>
>>> Moses does facilitate the construction of translation models
>>> from comparable data - but there has been some recent
>>> research on the topic that is a good starting point for
>>> developing such a tool:
>>>
>>> http://www.aclweb.org/anthology/P14-1064.pdf
>>>
>>> http://www.cs.jhu.edu/~anni/papers/irvineCCB_Hallucinating_CoNLL_14.pdf
>>>
>>> -phi
>>
>>
>> _______________________________________________
>> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
>> Corpora mailing list
>> Corpora at uib.no
>> http://mailman.uib.no/listinfo/corpora
>>
>
>
>
> ------------------------------
>
> Message: 6
> Date: Mon, 6 Oct 2014 13:26:42 +0000
> From: "Krishnamurthy, Ramesh" <r.krishnamurthy at aston.ac.uk>
> Subject: Re: [Corpora-List] Bilingual Dictionary from Comparable
> Corpora
> To: javid dadashkarimi <javiddadashkarimi at gmail.com>
> Cc: "corpora at uib.no" <corpora at uib.no>
>
> Hi Javid
> yes, i am familiar with parallel corpora and comparable corpora. :)
> ...but for me, a 'dictionary' means something very different to 'an
> aligning tool
> for comparable corpora'.... :)
> best
> ramesh
> ________________________________
> From: javid dadashkarimi [javiddadashkarimi at gmail.com]
> Sent: 06 October 2014 10:09
> To: Krishnamurthy, Ramesh
> Cc: Jörg Tiedemann; corpora at uib.no
> Subject: Re: [Corpora-List] Bilingual Dictionary from Comparable
> Corpora
>
> Hi Ramesh,
> Excuse me if I did not explain carefully.
> In Statistical Machine Translation or Cross-lingual Information
> Retrieval (CLIR), parallel corpora (sentence-aligned corpora) and
> comparable corpora (document-aligned corpora whose documents are not
> as precisely translations of each other as in parallel corpora, but
> are on the same topic) are useful resources for translating queries
> between languages. Indeed, these tasks extract words in the target
> language that are translations of a source-language word, each with a
> different probability. So we have a comparable corpus in which each
> source-language document is on the same topic as some
> target-language documents:
> (D0s -> Dt1, Dt2, .., Dtk), (D1s -> D't1, D't2, .., D'tk), .., (Dms -> D"t1, D"t2, .., D"tk).
> Best,
> Javid
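
As a rough illustration of the document-aligned setup described in this message, where each source-language document is grouped with several target-language documents on the same topic, the following Python sketch estimates a naive probabilistic dictionary from document-level co-occurrence counts. It is not how Moses, GIZA++ or the ACCURAT toolkit work; the function name `cooccurrence_dictionary`, the input layout and the toy German/English data are assumptions made only for this example.

```python
from collections import Counter, defaultdict


def cooccurrence_dictionary(aligned_docs):
    """Estimate p(target_word | source_word) from document-level co-occurrence.

    aligned_docs is a list of (source_tokens, target_docs) pairs, where
    target_docs is a list of token lists on the same topic as the source
    document. This layout is a hypothetical one chosen for the sketch.
    """
    counts = defaultdict(Counter)
    for source_tokens, target_docs in aligned_docs:
        # Collect every distinct target word seen in this aligned group.
        target_types = set()
        for doc in target_docs:
            target_types.update(doc)
        # Each distinct source word co-occurs once with each target word.
        for s in set(source_tokens):
            for t in target_types:
                counts[s][t] += 1

    # Normalise the counts per source word so they sum to 1.
    table = {}
    for s, target_counts in counts.items():
        total = sum(target_counts.values())
        table[s] = {t: n / total for t, n in target_counts.items()}
    return table


# Toy comparable corpus: two source documents, each with same-topic targets.
corpus = [
    (["katze", "schläft"], [["cat", "sleeps"], ["cat", "naps"]]),
    (["katze", "frisst"], [["cat", "eats"]]),
]
table = cooccurrence_dictionary(corpus)
best = max(table["katze"], key=table["katze"].get)
print(best, round(table["katze"][best], 2))
```

Because raw co-occurrence favours words that are frequent everywhere, an estimate like this would normally need an association measure or a seed lexicon before it is useful in practice.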
>
>
> On Mon, Oct 6, 2014 at 1:44 AM, Krishnamurthy, Ramesh
> <r.krishnamurthy at aston.ac.uk> wrote:
> hi javid
>
> i think you and i have different ideas about what a 'dictionary' is.
> :)
>
> i think perhaps you just want to find 'word/phrase-equivalents' in
> comparable corpora in
> different languages?
>
> i don't know enough about computational linguistics, but i *suspect*
> that both SketchEngine and Tshwanelex are for 'fuller' dictionaries,
> eg with collocational, grammatical, semantic, phraseological info,
> etc
> for each entry.... but they can probably be used with a bilingual
> lookup
> (eg Wordnet) to link items in the comparable corpora...?
>
> best
> ramesh
>
>
>
> ________________________________
> From: Jörg Tiedemann [Jorg.Tiedemann at lingfil.uu.se]
> Sent: 06 October 2014 09:02
> To: javid dadashkarimi
> Cc: Krishnamurthy, Ramesh; corpora at uib.no
> Subject: Re: [Corpora-List] Bilingual Dictionary from Comparable
> Corpora
>
>
> Maybe you want to have a look at alignment tools for comparable
> corpora such as:
> - http://www.accurat-project.eu
> - http://yalign.machinalis.com
>
> I haven't used these tools myself but I would be interested to hear
> if they work for you.
>
> Good luck!
> Jörg
>
>
> **********************************************************************************
> Jörg Tiedemann
> jorg.tiedemann at lingfil.uu.se
> http://stp.lingfil.uu.se/~joerg/
> Dep. of Linguistics and Philology, Uppsala University
> Box 635, SE-751 26 Uppsala/SWEDEN
> tel: +46 (0)18 - 471 1412
> fax: +46 (0)18 - 471 1094
>
>
>
> On Oct 5, 2014, at 7:00 PM, javid dadashkarimi wrote:
>
> Dear Ramesh,
> I only want to extract dictionary within an aligned bilingual corpus.
> I know that Moses can do it for parallel and sentence-level aligned
> corpus, but are the tools like SketchEngine or Tshwanelex extracting
> such a knowledge?
> Best,
> Javid
>
> On Sun, Oct 5, 2014 at 7:23 PM, Krishnamurthy, Ramesh
> <r.krishnamurthy at aston.ac.uk> wrote:
> hi javid
> not sure quite what you want,
> but i'd suggest contacting the
> people at SketchEngine
> http://www.sketchengine.co.uk/
> and Tshwanelex
> http://tshwanedje.com/tshwanelex/
> best
> ramesh
> -------------
> Date: Sat, 4 Oct 2014 15:11:02 +0330
> From: javid dadashkarimi <javiddadashkarimi at gmail.com>
> Subject: [Corpora-List] Bilingual Dictionary from Comparable Corpora
> To: corpora at uib.no, gate-users-request at lists.sourceforge.net
>
> Hi,
> Is there any tool for extracting probabilistic bilingual dictionary
> for a
> bilingual comparable corpora? Does Moses support such a task?
> Best,
> Javid
>
>
>
>
> ------------------------------
>
> Message: 7
> Date: Mon, 6 Oct 2014 16:38:55 +0300
> From: Tanja Säily <tanja.saily at helsinki.fi>
> Subject: [Corpora-List] CfP: From data to evidence in English
> language
> research: Big data, rich data, uncharted data
> To: corpora <corpora at uib.no>,
> HISTORICAL-CORPORA at listserv.manchester.ac.uk
>
> From data to evidence in English language research: Big data, rich
> data, uncharted data
>
> ***Conference in Helsinki, Finland, 19-22 October 2015***
>
> To diversify the discussion of data explosion in the humanities, the
> Research Unit for Variation, Contacts and Change in English (VARIENG)
> is organising an academic conference that addresses the use of new
> data sources, historical and modern, in English language research. We
> are particularly interested in papers discussing the advantages and
> disadvantages of the following three kinds of data:
>
> Big data
>
> In recent years, mega-corpora and other large text collections have
> become increasingly available to linguists. These databases open new
> opportunities for linguistic research, but they may be problematic in
> terms of representativeness and contextualisation, and the sheer
> amount of data may also pose practical problems. We welcome papers
> drawing on big data, including large corpora representing different
> genres and varieties (e.g. COCA, GloWbE), databases (e.g. EEBO, ECCO)
> and corpora created by web crawling (e.g. EnTenTen, UKWaC).
>
> Rich data
>
> Rich data contains more than just the texts, including
> representations of spacing, graphical elements, choice of typeface,
> prosody, or gestures. This is further supplemented by analytic and
> descriptive metadata linked to either entire texts or individual
> textual elements. The benefit of rich data is that it can provide new
> kinds of evidence about pragmatic, sociolinguistic and even syntactic
> aspects of linguistic events. Yet the creation and use of rich data
> bring great challenges. We invite papers on the representation,
> query, analysis, and visualisation of data consisting of more than
> linear text.
>
> Uncharted data
>
> Uncharted data comprises material which has not yet been
> systematically mapped, surveyed or investigated. We wish to draw
> attention to texts and language varieties which are marginally
> represented in current corpora, to data sources that exist on the
> internet or in manuscript form alone, and material compiled for
> purposes other than linguistic research. We welcome papers discussing
> the innovative research prospects offered by new and previously
> unused or even unidentified material for the study of English in
> various contexts ranging from communities and networks to social
> groups and individuals.
>
> Abstracts are invited by 15 February 2015 for 30-minute presentations
> including discussion as well as for posters and corpus and software
> demonstrations.
>
> The following invited speakers have confirmed their participation:
>
> Professor Mark Davies (Brigham Young University)
> Professor Tony McEnery (Lancaster University)
> Professor Päivi Pahta (University of Tampere)
> Dr Jane Winters (Institute of Historical Research, University of
> London)
>
> The conference forms part of the programme celebrating the 375th
> anniversary of the University of Helsinki in 2015 and will be held in
> the Main Building of the University.
>
> More information on the conference will be available on the
> conference home page at: http://www.helsinki.fi/varieng/d2e/. Please
> address any queries to: d2e-conference at helsinki.fi.
>
> --
> Tanja Säily
> MA, Postgraduate Student
> Research Unit for Variation, Contacts and Change in English (VARIENG)
> http://www.helsinki.fi/varieng/people/varieng_saily.html
>
>
>
>
> ------------------------------
>
> Message: 8
> Date: Mon, 6 Oct 2014 14:32:04 +0000
> From: Mats Wirén <mats.wiren at ling.su.se>
> Subject: [Corpora-List] Reminder: Ph.D. position in Computational
> Linguistics at Stockholm University
> To: "nodali at helsinki.fi" <nodali at helsinki.fi>, "corpora at uib.no"
> <corpora at uib.no>, "elsnet-list at elsnet.org" <elsnet-list at elsnet.org>
>
> The Department of Linguistics at Stockholm University is seeking a
> Ph.D. student for a funded position in Computational Linguistics
> starting in January 2015. All subfields of Computational Linguistics
> where we have active research are relevant, but we especially welcome
> applicants interested in one of the following topics:
>
> -- Computational models of linguistic typology, for example, using
> massively parallel corpora
>
> -- Computational models of first-language acquisition
>
> -- User-generated content, for example, innovation and variation in
> social media, or analysis of medical health records
>
> The announcement can be found at
>
> http://www.ling.su.se/english/about-us/vacancies/phd-student-position-in-computational-linguistics
>
> To find out more about research in Computational Linguistics at the
> department, see
>
> http://www.ling.su.se/english/research/research-areas/research-in-computational-linguistics
>
> The deadline for applications is October 15, 2014.
>
> ------------------------------
>
> Message: 9
> Date: Mon, 6 Oct 2014 16:49:20 +0200 (CEST)
> From: Hercules Dalianis <hercules at dsv.su.se>
> Subject: [Corpora-List] Doktorandanställning i data- och
> systemvetenskap inom NIASC, Ref.nr SU FV-2615-14, deadline 15 okt
> 2014
> To: corpora at uib.no, nodali at helsinki.fi
>
>
>
> PhD student position in Computer and Systems Sciences
> PhD student position in Computer and Systems Sciences at the
> Department of Computer and Systems Sciences, Stockholm University,
> reference number SU FV-2615-14. Deadline for applications:
> October 15, 2014.
>
> Announcement in Swedish:
> http://www.su.se/om-oss/lediga-anställningar/platser-i-forskarutb/doktorandanställning-i-data-och-systemvetenskap
> Announcement in English:
> http://www.su.se/english/about/vacancies/phd-studies/phd-student-position-in-computer-and-systems-sciences
>
> The position is announced within the framework of the Nordic Center of
> Excellence in Health-Related e-Sciences. The research area is clinical
> text mining, which uses language technology methods developed
> specifically for electronic patient records. One purpose is the
> detection of cancer symptoms in patient record text, specifically for
> cervical cancer and prostate cancer, primarily in Swedish and
> secondarily in Danish. One goal is to find early symptoms of cervical
> cancer. The doctoral training will take place in collaboration with
> CBS, Technical University of Denmark (DTU) in Lyngby, Copenhagen,
> where the PhD student is also expected to spend time.
>
> Best regards
>
> Hercules
>
>
>
> ___________________________________________________________________________
> Dr. Hercules Dalianis, Professor
> Department of Computer and Systems Sciences
> ph: +46 8 674 75 47 DSV/Stockholm University
> mobile ph: +46 70 568 13 59 P.O. Box 7003
> fax: +46 8 703 90 25 164 07 Kista
> email: hercules at dsv.su.se Stockholm, Sweden
> www: http://www.dsv.su.se/~hercules/
>
> ___________________________________________________________________________
>
> ------------------------------
>
> Message: 10
> Date: Mon, 6 Oct 2014 14:13:56 -0400
> From: Matthew Gerber <gerber.matthew at gmail.com>
> Subject: [Corpora-List] NAACL 2015 System Demonstrations -- First
> Call
> To: corpora <Corpora at uib.no>
>
> NAACL HLT 2015 CALL FOR DEMONSTRATIONS
>
> http://naacl.org/naacl-hlt-2015/call-for-demos.html
>
> The NAACL 2015 Program Committee invites proposals for Demonstrations
> to be displayed at NAACL HLT 2015. We encourage submissions from early
> prototype demonstrations to mature production-ready systems; of
> particular interest are demos that can be openly used by NAACL
> attendees from the conference website before and during the
> conference. All accepted demos will be published in a companion volume
> of the conference proceedings, and will be presented during a demo
> session (with an optional poster).
>
> Areas of Interest
>
> Areas of interest include, but are not limited to, the following
> types of systems:
>
> - End-to-end systems:
>   - Mobile applications of language technologies
>   - Text- or speech-based information access or dialogue systems
>   - Machine translation systems for consumer or industry applications
>   - Question answering applications
>   - Information extraction
>   - Systems for meeting capture and analysis
>   - NLP and speech technologies to support accessibility and
>     assistive devices
> - Systems aiding research and development:
>   - Software architectures and reusable components for use in NLP
>   - Tools for data visualization
>   - Software tools for system evaluation or error analysis
>   - Interfaces and resources to support linguistic annotation
>   - Toolkits for machine learning, NLP, or data mining
> - Systems supporting learning or education:
>   - Visual interactive aids for students
>   - Tutorial agents to support real-time feedback for learning
>   - Instructional aids for topics in computational linguistics
>   - Systems to score or critique textual student responses
>   - Systems to mine textual or behavioral data for educational
>     purposes
>
> Format for Submission
>
> Please use the main NAACL paper style (http://naacl.org/naacl-pubs)
> and submission guidelines. At a minimum, demo proposals should include
> the following:
>
> - A brief description of the technical content to be demonstrated.
> - A "script outline" of the demo presentation, including accompanying
>   narrative, and either a Web address for accessing the demo or visual
>   aids (e.g., screenshots, snapshots, or diagrams).
>
> As the reviewing will be blind, proposals must not include the
> authors' names and affiliations. Furthermore, self-references that
> reveal the author's identity, e.g., "We previously showed (Smith,
> 1991) ..." must be avoided. Instead, use citations such as "Smith
> previously showed (Smith, 1991) ...". In addition, please do not post
> your proposals on the web until after the review process is complete.
>
> The entire proposal must not be more than four pages. We will reject
> without review any papers that do not follow the official style
> guidelines, anonymity conditions and page limits. Also, please note
> that no hardware or software will be provided by the local organizers.
>
> Submissions Procedure
>
> Proposals must be submitted electronically by February 13, 2015 using
> submission software available at
> https://www.softconf.com/naacl2015/demos.
>
> Important Dates
>
> All times are 11:59pm PST on the deadline day
>
> - Submission deadline: February 13, 2015
> - Notification of acceptance: March 15, 2015
> - Submission of camera ready copies: March 30, 2015
>
> Further Details
>
> Submissions will be evaluated on the basis of their relevance to
> computational linguistics, innovation, scientific contribution,
> presentation, as well as potential logistical constraints. Accepted
> submissions will be allocated four pages in the Companion Volume to
> the Proceedings of the Conference. Further details on the date, time,
> and format of the demonstration session(s) will be determined and
> provided at a later date.
>
> Please send any inquiries to the demonstration co-chairs:
>
> - Matthew Gerber, University of Virginia (msg8u at virginia.edu)
> - Catherine Havasi, Luminoso & MIT Media Lab (havasi at media.mit.edu)
> - Finley Lacatusu, Language Computer Corporation
>   (finley at languagecomputer.com)
>
> Program Committee
>
> - Zeljko Agic
> - Omar Alonso
> - Tyler Baldwin
> - Georgeta Bordea
> - Kevin Cohen
> - Montse Cuadros
> - Thierry Declerck
> - Karthik Dinakar
> - Mark Dras
> - Michele Filannino
> - Matthew Gerber
> - Marek Krawczyk
> - Brigitte Krenn
> - Finley Lacatusu
> - Changsong Liu
> - Clare Llewellyn
> - Marie-Jean Meurs
> - Tsuyoshi Okita
> - Arzucan Ozgur
> - Stelios Piperidis
> - Zahar Prasov
> - Kirk Roberts
> - Melissa Roemmele
> - Masoud Rouhizadeh
> - Irene Russo
> - Le Sun
> - Maarten van Gompel
> - Marc Vilain
> - Liang-Chih Yu
> - Guodong Zhou
>
> ------------------------------
>
> Message: 11
> Date: Tue, 07 Oct 2014 09:48:47 +0300
> From: inguna.skadina at lumii.lv
> Subject: Re: [Corpora-List] Bilingual Dictionary from Comparable
> Corpora
> To: Inguna Skadiņa <inguna at latnet.lv>
> Cc: corpora at uib.no, gate-users-request at lists.sourceforge.net
>
> Dear Javid,
>
>
> The ACCURAT toolkit (http://accurat-project.eu/) lets you identify
> semi-parallel sentences in comparable corpora and extract a
> dictionary/translation table from them (with the support of GIZA++).
>
> I hope you will find it useful.
>
> Best wishes,
> Inguna Skadiņa
>
>> Quoting javid dadashkarimi <javiddadashkarimi at gmail.com>:
>>
>>> Hi,
>>> Is there any tool for extracting a probabilistic bilingual dictionary
>>> from a bilingual comparable corpus? Does Moses support such a task?
>>> Best,
>>> Javid
>>>
>>
>
> ------------------------------
>
> Message: 12
> Date: Tue, 7 Oct 2014 10:48:49 +0200
> From: Lorraine Goeuriot <lorraine.goeuriot at gmail.com>
> Subject: [Corpora-List] Call for papers - Special issue on Medical
> Information Retrieval
> To: lorraine.goeuriot at imag.fr
>
> Call for Papers - Information Retrieval Special Issue on Medical
> Information Retrieval
> ====================================================================
>
> Editors: Dr Lorraine Goeuriot, Dr Gareth J.F. Jones, Dr Liadh Kelly,
> Pr Henning Mueller, Pr Justin Zobel
>
> Submission deadline: December 15
>
> --------------------------------
>
> Medical information search refers to methodologies and technologies that
> seek to improve access to medical information archives via a process of
> information retrieval (IR). Such information is now potentially
> accessible from many sources including the general web, social media,
> journal articles, and hospital records.
>
> Medical information is of interest to a wide variety of users, including
> patients and their families, researchers, general practitioners and
> clinicians, and practitioners with specific expertise such as
> radiologists.
>
> Despite the popularity of the medical domain for users of search engines,
> and current interest in this topic within the information retrieval
> research community, development of search and access technologies remains
> particularly challenging. One of the central issues in medical
> information search is the diversity of the users of these services. In
> particular, they will have varying categories of information needs,
> varying levels of medical knowledge, and varying language skills. In
> addition, the format, reliability, and quality of biomedical and medical
> information vary greatly. A single health record can contain clinical
> notes, technical pathology data, images, and patient-contributed
> histories, and may be linked by a physician to research papers. The
> importance of health and medical topics and their impact on people's
> everyday lives makes the need for retrieval of accurate and reliable
> information especially important. Determining the likely reliability of
> available information is challenging.
>
> Finally, as with information retrieval in general, the evaluation of
> medical search tools is vital and challenging. For example, there are no
> established or standardized baselines or evaluation metrics, and limited
> availability of test collections.
>
> We encourage participation from researchers in all fields related to
> medical information search including mainstream information retrieval,
> but also natural language processing, multilingual text processing, and
> medical image analysis.
>
> Topics of interest include but are not limited to:
> - Users and information needs
> - Semantics and NLP for medical IR
> - Reliability and trust in medical IR
> - Personalised search
> - Evaluation of medical IR
> - Multilingual issues in medical IR
> - Multimedia technologies in medical IR
> - The role of social media in medical IR
>
> --------------------------------
>
> Paper Submissions:
> Papers should be appropriate for journal publication. Submissions should
> follow the guidelines set out by the Information Retrieval journal (under
> the section "Instructions for authors").
> Submissions to the Special Issue are received via the
> http://www.editorialmanager.com/inrt/ website, by choosing the
> "S.I. : Medical Information Retrieval" option.
>
> All papers submitted to the special issue will be reviewed by at least 3
> reviewers. Papers will appear online on the Springer journal website soon
> after they are accepted.
>
> --------------------------------
>
> Important Dates:
> - December 15, 2014: Deadline for paper submission (midnight Pacific
> Daylight Time)
> - March 1, 2015: Notification to authors
> - May 15, 2015: Camera-ready papers due
>
> --------------------------------
>
> Contact:
> Dr Lorraine Goeuriot (lorraine.goeuriot at imag.fr) and Dr Liadh Kelly
> (liadh.kelly at scss.tcd.ie)
>
> ------------------------------
>
> Message: 13
> Date: Tue, 7 Oct 2014 09:19:04 +0000
> From: Carita Paradis <carita.paradis at englund.lu.se>
> Subject: [Corpora-List] Question about expressions of intermediate
> meanings
> To: "corpora at uib.no" <corpora at uib.no>
>
> Dear list members,
>
> A fair amount of research has been carried out on expressions of
> binary opposites in language, but, as far as I know, very little has
> been carried out on expressions of intermediate meanings, i.e.
> form-meaning pairings related to the in-between zone, irrespective of
> whether this zone is a point/boundary or a range. I am curious to know
> if I have missed something important that might have been done on
> intermediates in English or any other language. I would be most
> grateful for pointers.
>
> With best wishes,
>
> Carita Paradis
>
>
>
> Professor Carita Paradis, PhD
> Centre for Languages and Literature
> Lund University
> Box 201,
> SE-221 00 Lund
>
> http://www.sol.lu.se/person/CaritaParadis
>
>
> ----------------------------------------------------------------------
> Send Corpora mailing list submissions to
> corpora at uib.no
>
> To subscribe or unsubscribe via the World Wide Web, visit
> http://mailman.uib.no/listinfo/corpora
> or, via email, send a message with subject or body 'help' to
> corpora-request at uib.no
>
> You can reach the person managing the list at
> corpora-owner at uib.no
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Corpora digest..."
>
>
> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>
>
> End of Corpora Digest, Vol 88, Issue 7
> **************************************


