[Corpora-List] Segmenting dialogue corpora (Yorick Wilks)

Maria Georgescul Maria.Georgescul at eti.unige.ch
Mon Oct 27 14:29:05 CET 2008


Hello,

We ran experiments with discriminative and generative machine learning techniques for automatically structuring text into linear, non-overlapping thematic episodes. In particular, we investigated topic segmentation performance on multi-party dialogues using the ICSI data. Here are a few references describing the results we obtained:

- M. Georgescul, A. Clark and S. Armstrong, "An Analysis of Quantitative Aspects in the Evaluation of Thematic Segmentation Algorithms", The 7th SIGdial Workshop on Discourse and Dialogue, Sydney, 144-152, 2006.

- M. Georgescul, A. Clark and S. Armstrong, "Word Distributions for Thematic Segmentation in a Support Vector Machine Approach", The 10th Conference on Computational Natural Language Learning (CoNLL-X), 101-109, New York, USA, 2006.

- M. Georgescul, A. Clark and S. Armstrong, "Exploiting Structural Meeting-Specific Features for Topic Segmentation", Actes de la 14ème Conférence sur le Traitement Automatique des Langues Naturelles, Association pour le Traitement Automatique des Langues, Toulouse, France, 2007.

- M. Georgescul, A. Clark and S. Armstrong, "A Comparative Study of Mixture Models for Automatic Topic Segmentation of Multiparty Dialogues", The 3rd International Joint Conference on Natural Language Processing (IJCNLP), Hyderabad, India, January 7-12, 2008.

Best regards,
Maria Georgescul
--
ISSCO/TIM, ETI
University of Geneva

-------- Original Message --------
Subject: Corpora Digest, Vol 16, Issue 23
Date: Sat, 25 Oct 2008 15:00:15 +0200
From: corpora-request at uib.no
Reply-To: corpora at uib.no
To: corpora at uib.no

Today's Topics:

1. Re: Free POS tagger (Niels Ott)

2. Re: Segmenting dialogue corpora (Yorick Wilks)

3. Re: Segmenting dialogue corpora (John Niekrasz)

4. Re: Segmenting dialogue corpora (John Niekrasz)

----------------------------------------------------------------------

Message: 1
Date: Fri, 24 Oct 2008 17:01:41 +0200
From: Niels Ott <nott at sfs.uni-tuebingen.de>
Subject: Re: [Corpora-List] Free POS tagger
Cc: corpora at uib.no

Hi,

Niels Ott wrote:
> I know that there are some models available, but the person asking was
> interested in POS tagging English, and I, at least, can't find
> an OpenNLP model for English tagging at the download site.
> http://opennlp.sourceforge.net/models/english/

As it turned out, the models are located in the parser directory: http://opennlp.sourceforge.net/models/english/parser/

Sorry for any inconvenience I may have created.

Best,

Niels

--
Niels Ott
Computational Linguist (B.A.)
http://www.drni.de/niels/

------------------------------

Message: 2
Date: Fri, 24 Oct 2008 16:40:15 +0100
From: Yorick Wilks <Yorick at dcs.shef.ac.uk>
Subject: Re: [Corpora-List] Segmenting dialogue corpora
To: CORPORA List <corpora at uib.no>


> Does anyone out there have experience or recommendations on attempts
> to segment dialogue corpora into "tiles" (by methods like Marti
> Hearst's) into topic-coherent segments. We have not found applying
> her (prose) methods very productive for the dialogue corpora we have

and I would be glad to hear of any positive experiences of researchers in doing this.

Yorick Wilks

Sheffield University


> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>

------------------------------

Message: 3
Date: Fri, 24 Oct 2008 22:03:23 +0100
From: "John Niekrasz" <john.niekrasz at gmail.com>
Subject: Re: [Corpora-List] Segmenting dialogue corpora
To: "Yorick Wilks" <Yorick at dcs.shef.ac.uk>
Cc: CORPORA List <corpora at uib.no>

Maybe try Michel Galley's LCSeg software.

John Niekrasz
Edinburgh University

On Fri, Oct 24, 2008 at 4:40 PM, Yorick Wilks <Yorick at dcs.shef.ac.uk> wrote:
>> Does anyone out there have experience or recommendations on attempts
>> to segment dialogue corpora into "tiles" (by methods like Marti
>> Hearst's) into topic-coherent segments. We have not found applying
>> her (prose) methods very productive for the dialogue corpora we have
> and I would be glad to hear of any positive experiences of
> researchers in doing this.
> Yorick Wilks
> Sheffield University

------------------------------

Message: 4
Date: Fri, 24 Oct 2008 22:08:47 +0100
From: "John Niekrasz" <john.niekrasz at gmail.com>
Subject: Re: [Corpora-List] Segmenting dialogue corpora
To: "Yorick Wilks" <Yorick at dcs.shef.ac.uk>
Cc: CORPORA List <corpora at uib.no>

The relevant publication describing it is:

Michel Galley, Kathleen McKeown, Eric Fosler-Lussier, Hongyan Jing (2003). Discourse Segmentation of Multi-Party Conversation. In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics (ACL 2003). July 21-26, 2003. Sapporo, Japan.

http://www-nlp.stanford.edu/~mgalley/papers/mtgseg.pdf

John

On Fri, Oct 24, 2008 at 10:03 PM, John Niekrasz <john.niekrasz at gmail.com> wrote:
> Maybe try Michel Galley's LCSeg software.
>
> John Niekrasz
> Edinburgh University
>
> On Fri, Oct 24, 2008 at 4:40 PM, Yorick Wilks <Yorick at dcs.shef.ac.uk> wrote:
>>> Does anyone out there have experience or recommendations on attempts
>>> to segment dialogue corpora into "tiles" (by methods like Marti
>>> Hearst's) into topic-coherent segments. We have not found applying
>>> her (prose) methods very productive for the dialogue corpora we have
>> and I would be glad to hear of any positive experiences of
>> researchers in doing this.
>> Yorick Wilks
>> Sheffield University


End of Corpora Digest, Vol 16, Issue 23
***************************************


