[Corpora-List] Corpora Digest, Vol 58, Issue 27

Solomon Getachew sgetachew92 at yahoo.com
Wed Apr 25 16:27:32 CEST 2012


Dear Sir, how can I obtain Amharic corpora, and how can I work with these corpora using Python code?
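For the Python half of the question, here is a minimal sketch rather than a definitive method. It assumes the corpus is already available as a UTF-8 plain-text file, and it only counts word frequencies. The function names (tokenize, word_frequencies, load_corpus) are hypothetical and chosen for illustration. Tokens are split on whitespace and on the Ethiopic punctuation marks (U+1360 to U+1368, which include the Ethiopic wordspace and full stop). Real Amharic text may need more careful normalization.

```python
import re
from collections import Counter

# Split on whitespace and on Ethiopic punctuation (U+1360..U+1368),
# which covers the Ethiopic wordspace and full stop. This is an
# assumption about what counts as a token boundary, not a standard.
_SPLIT = re.compile(r"[\s\u1360-\u1368]+")


def tokenize(text):
    """Return the non-empty tokens of one string."""
    return [tok for tok in _SPLIT.split(text) if tok]


def word_frequencies(lines):
    """Count tokens over any iterable of strings (e.g. an open file)."""
    counts = Counter()
    for line in lines:
        counts.update(tokenize(line))
    return counts


def load_corpus(path):
    """Read a UTF-8 text file (hypothetical path) and count its tokens."""
    with open(path, encoding="utf-8") as fh:
        return word_frequencies(fh)
```

With a file of your own, load_corpus("my_corpus.txt").most_common(20) would list the twenty most frequent tokens; the file name here is only a placeholder.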

--- On Sun, 4/22/12, corpora-request at uib.no <corpora-request at uib.no> wrote:

From: corpora-request at uib.no <corpora-request at uib.no>
Subject: Corpora Digest, Vol 58, Issue 27
To: corpora at uib.no
Date: Sunday, April 22, 2012, 10:00 AM

Today's Topics:

   1. Re: (no subject) (Christophe Servan)
   2. CADS International Conference, 13-14 September 2012 (Gabrielatos, Costas)
   3. Assistant Professor CL/NLP at ILLC, Amsterdam (Khalil Simaan)
   4. !! SHORT PAPER DEADLINE EXTENSION !! SEMANTIC TRACK OF ACL 2012 SP-SEM-MRL Workshop (Yuval Marton)
   5. CFP: Text Summarization of the Future - Workshop at SEPLN 2012 (Spain) (Horacio Saggion)

----------------------------------------------------------------------

Message: 1
Date: Sat, 21 Apr 2012 12:57:44 +0200
From: Christophe Servan <christophe.servan at gmail.com>
Subject: Re: [Corpora-List] (no subject)
To: corpora at uib.no

On 18/04/2012 12:30, Eirini LS wrote:
> Dear Corpora members,
> My question may be very simple and I am very sorry for bothering you;
> but I have heard about morphological analyzers made on the basis of
> xfst modules (so called Xerox Finite State Platform), as I have seen
> the Book of Finite State Morphology with a software distributed by
> PARC can be used for this purposes, but at the same time fst - can be
> considered as a tool using its own language. So, Is it possible
> to integrate transducers created in this program in the other program
> made for example in C or C++? and if so, does it need additional
> permission or license?
> Thank you in advance,
>
> Irina
>
>
>
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
Dear Irina,

My answer comes quite late, but you may want to try OpenFst, made by former creators of AT&T's FSM: http://www.openfst.org

It is open source and written in C++.

Cheers,

Christophe


------------------------------

Message: 2
Date: Fri, 20 Apr 2012 05:14:55 -0700 (PDT)
From: "Gabrielatos, Costas" <c.gabrielatos at lancaster.ac.uk>
Subject: [Corpora-List] CADS International Conference, 13-14 September 2012
To: "corpora at uib.no" <corpora at uib.no>

SECOND CALL FOR PAPERS

CADS International Conference
Corpus-assisted Discourse Studies: More than the sum of Discourse Analysis and computing?

University of Bologna, 13-14 September 2012
Conference website: http://www3.lingue.unibo.it/blog/clb/?p=287

Featured speakers include: Michael Hoey (Liverpool), Paul Baker (Lancaster), Tony McEnery (Lancaster), Ramesh Krishnamurthy (Aston), Costas Gabrielatos (Lancaster), Alan Partington (Bologna)

The term corpus-assisted discourse studies (CADS) was coined ten years ago. Although such research dates back at least to Biber (1988) and Stubbs (1996), it was in those days still possible to lament:

In comparison with the impressive strides corpus linguistics has made in the fields of lexicography, grammatical description, register studies etc, it has had relatively little to say in describing features of discourse, particularly of interaction, that is, the rhetorical aspects of texts.

This is clearly no longer the case. In these ten years CADS has come of age with major projects under its belt on, among others, the reporting of immigration, reporting the Iraq conflict, White House press relations and perceptions of the EU. Language topics studied include evaluation, discourse organisation, facework/politeness, metaphor, irony, stylistics, diachronic linguistics, and many more.

But new questions have arisen. Is CADS a coherent discipline? What are its methods? What are the overall objectives of CADS research(ers)? Has its focus altered over the years and is it likely to alter in the future? And, of course: is it more than just the sum of discourse analysis and computing? If so, what is its added value?

We invite speakers to share their own experiences of using corpus techniques to shed light on discourse and to debate these fundamental questions. Talks will be 20 minutes with 10 minutes for questions.

Abstracts

Please send abstracts to: catharina.solano2 at unibo.it
Abstracts should be no more than 500 words (including references) and should specify five keywords. The number of conference places is limited to 40. Please supply the abstract by e-mail without a name, together with a separate document giving name and affiliation. Address the e-mail subject as "CADS conference". Abstracts will be sent for anonymous refereeing.

Scientific committee

Alan Partington (Bologna)
Anna Marchi (Lancaster)
Costas Gabrielatos (Lancaster)
Jane Johnson (Bologna)
Charlotte Taylor (Portsmouth)
Alison Duguid (Siena)
John Morley (Siena)
Federica Ferrari (Bologna)

Important dates

Deadline for abstract submission: May 7th 2012.
Notification of acceptance / non acceptance: May 20th 2012.
Registration begins & programme published: May 22nd 2012.

For further information please contact Anna Marchi (anna.marchi at unibo.it)


------------------------------

Message: 3
Date: Sat, 21 Apr 2012 20:03:39 +0200
From: Khalil Simaan <k.simaan at uva.nl>
Subject: [Corpora-List] Assistant Professor CL/NLP at ILLC, Amsterdam
To: "corpora at uib.no" <corpora at uib.no>

A non-text attachment was scrubbed...
Name: not available
Type: text/html
Size: 6871 bytes
Desc: not available
URL: <https://mailman.uib.no/public/corpora/attachments/20120421/1f96c687/attachment.txt>

------------------------------

Message: 4
Date: Sat, 21 Apr 2012 18:43:34 -0400
From: Yuval Marton <yuvalmarton at gmail.com>
Subject: [Corpora-List] !! SHORT PAPER DEADLINE EXTENSION !! SEMANTIC TRACK OF ACL 2012 SP-SEM-MRL Workshop
To: Corpora Mailing List <corpora at uib.no>

!! SHORT PAPER DEADLINE EXTENSION !! SEMANTIC TRACK OF ACL 2012 SP-SEM-MRL Workshop
==================================================================================

Due to multiple requests, the semantic track short paper deadline of the ACL 2012 SP-SEM-MRL workshop is now extended to Saturday, April 28, 11:59pm PST (UTC/GMT -8 hours).

Authors who wish to take advantage of this extension for new submissions are requested to submit an abstract draft by Tuesday (April 23), to help us assign reviewers in this tight schedule. The abstract should be extended to a short paper format by April 28. All semantic processing track short paper submissions, both previously submitted short papers and newly submitted abstracts, may be updated and resubmitted online until the new extended deadline (April 28).

Note: The syntactic parsing track short paper deadline has NOT changed (it is tomorrow, April 22). This extension is due to overlap with the *SEM notification deadline and other events, and is intended to encourage submissions to the newly introduced track on semantic processing of MRL, so that we can have broader coverage of this important emerging topic. No additional extensions will be given under any circumstances.

For CFP and other details, go to https://sites.google.com/site/spsemmrl2012
Submission: in PDF format via the START system: https://www.softconf.com/acl2012/sp-sem-mrl-2012

On behalf of the Organizing Committee,
ACL 2012 Joint Workshop on Statistical Parsing and Semantic Processing of Morphologically Rich Languages (SP-Sem-MRL 2012)

------------------------------

Message: 5
Date: Sun, 22 Apr 2012 09:29:15 +0200
From: Horacio Saggion <horacio.saggion at upf.edu>
Subject: [Corpora-List] CFP: Text Summarization of the Future - Workshop at SEPLN 2012 (Spain)
To: corpora at uib.no

------------------------------------------------------------------------------------------------------------------------------------

1st Workshop on Automatic Text Summarization of the Future

*** Satellite workshop to SEPLN 2012 (Castellón, Spain)***

http://www.taln.upf.edu/pages/sepln_ws_2012/

------------------------------------------------------------------------------------------------------------------------------------ ABOUT THE WORKSHOP: ------------------------------------------------------------------------------------------------------------------------------------

Due to the great proliferation of online documents and information, it has become necessary to develop automatic tools capable of filtering out redundant and irrelevant information and presenting the most important information efficiently and effectively. This is the goal of Automatic Summarization, which aims at producing a concise document that keeps the essential information of a document or set of documents.

Research into Automatic Summarization began in the 50s with the purpose of summarizing scientific texts. However, interest in this type of document decreased, while interest in news article summarization grew. Recently, new challenges have appeared in this research area. In the context of the Internet, not only is information being constantly updated, but there is also a lack of quality control over what is published on the Web. Social networks, blogs, reviews, etc. are non-traditional texts of an informal nature, and they therefore constitute a big challenge for the new generation of summaries.

High quality documentation such as technical/scientific articles and patents has not received in past years all the attention that it deserves. However, given the explosion of technical documentation available on the Web and in intranets, scientific and research and development institutions face a true scientific information deluge. Therefore, summarization should be a key instrument not only for reducing information content in this field but also for measuring information relevance in context, providing users with adequate answers in context.

Another challenge for automatic summarization is the generation of abstracts, where it is necessary to take into consideration natural language generation techniques and be able to adapt them from one domain to another. In addition to these, efforts are needed to produce summaries in languages other than English and in multiple languages.

Therefore, the main goal of the 1st Workshop on Automatic Text Summarization of the Future is to bring together researchers working on Automatic Summarization, encouraging research into little explored areas such as new textual genres as well as old, forgotten ones, or summarization in languages other than English (for instance, Spanish).

------------------------------------------------------------------------------------------------------------------------------------
IMPORTANT DATES:
------------------------------------------------------------------------------------------------------------------------------------

Papers submission deadline: 15 June 2012
Notification of decisions to authors: 15 July 2012
Workshop date: 7th September 2012
Camera-ready: 20 July 2012

------------------------------------------------------------------------------------------------------------------------------------
SUBMISSIONS:
------------------------------------------------------------------------------------------------------------------------------------

We will accept full paper contributions for the workshop. These papers should be written in English, with a maximum length of 8 pages, including references. The submission guidelines can be found on the following page: http://www.sepln.org/?page_id=358

Reviewing for the papers will be blind: reviewers will not be presented with the identity of paper authors. Authors should avoid writing anything that makes their identity obvious in the text. Submissions should be original, and in particular should not have been formally published prior to submission for the workshop.

Accepted papers will be published in the Workshop proceedings, with ISBN. We are negotiating a journal special issue for the best submitted papers. More to be announced.

The submission site for the workshop will be announced with the second call for papers and will be available from the workshop Web site at http://www.taln.upf.edu/pages/sepln_ws_2012/.

-------------------------------------------------------------------------------------------------------------------------------------
TOPICS OF INTEREST:
-------------------------------------------------------------------------------------------------------------------------------------

Researchers are encouraged to submit papers on topics including, but not restricted to, the following:

-    Multi-document summarization
-    Summarization for new textual genres (blogs, microblogs, social networks, etc.)
-    Abstractive summarization
-    Multilingual/crosslingual summarization
-    Development of resources, corpora, tools, etc. for summary generation
-    Summarization for facilitating information access
-    Applications of Summarization and Demos
-    Summarization for technical and/or scientific documents
-    Intrinsic and/or Extrinsic Evaluation of Summaries

-------------------------------------------------------------------------------------------------------------------------------------
ORGANIZERS:
-------------------------------------------------------------------------------------------------------------------------------------

Horacio Saggion  --  Universitat Pompeu Fabra, horacio.saggion at upf.edu
Elena Lloret  --  Universidad de Alicante, elloret at dlsi.ua.es
Manuel Palomar  --  Universidad de Alicante, mpalomar at dlsi.ua.es

-------------------------------------------------------------------------------------------------------------------------------------
PROGRAM COMMITTEE:
-------------------------------------------------------------------------------------------------------------------------------------

Laura Alonso (Universidad Nacional de Córdoba, Argentina)
Ahmet Aker (University of Sheffield, UK)
Ester Boldrini (Universidad de Alicante, Spain)
Hakan Ceylan (University of North Texas, USA)
Iria da Cunha (Universitat Pompeu Fabra, Spain)
Alberto Díaz (Universidad Complutense de Madrid, Spain)
Maria Fuentes (Universitat Politècnica de Catalunya, Spain)
Robert Gaizauskas (University of Sheffield, UK)
George Giannakopoulos (University of Trento, Italy)
Nicolas Hernandez (Université de Nantes, France)
Leila Kosseim (Concordia University, Canada)
Guy Lapalme (Universite de Montreal, Canada)
Jean-Luc Minel (Université Paris X, France)
Paloma Moreda (Universidad de Alicante, Spain)
Rafael Muñoz (Universidad de Alicante, Spain)
Ani Nenkova (University of Pennsylvania, USA)
Thiago Pardo (Universidade de São Paulo, Brazil)
Laura Plaza (Universidad Complutense de Madrid, Spain)
Horacio Rodriguez (Universitat Politècnica de Catalunya, Spain)
Jorge Vivaldi (Universitat Pompeu Fabra, Spain)
René Witte (Concordia University, Canada)
Dina Wonsever (Universidad de la República, Uruguay)

----------------------------------------------------------------------
Send Corpora mailing list submissions to
    corpora at uib.no

To subscribe or unsubscribe via the World Wide Web, visit
    http://mailman.uib.no/listinfo/corpora
or, via email, send a message with subject or body 'help' to
    corpora-request at uib.no

You can reach the person managing the list at
    corpora-owner at uib.no

When replying, please edit your Subject line so it is more specific than "Re: Contents of Corpora digest..."


End of Corpora Digest, Vol 58, Issue 27
***************************************


