[Corpora-List] Corpora with prosodic tagging

Kristine Yu krisyu at linguist.umass.edu
Wed Apr 20 14:46:08 CEST 2016

Dear Eitan,

There's a nice list (a little old, but still good) that was compiled by Tyler Schnoebelen here:


The Switchboard corpus (English) linked there has annotations for "nuclear" accent, although you'll have to check to see if that means what you are referring to by nuclear accent.

The Boston News corpus linked there (English) is, I think, just ToBI-labeled. You could extract "nuclear" accent from the .TON files by finding the last pitch accent (marked with a *, e.g. "H*") before either a phrase accent (ending in - with no %, e.g. "H-") or a combined phrase accent and intonational phrase boundary tone (containing a - and ending in %, e.g. "L-L%"), and then time-aligning those accents to the words. This assumes that I'm thinking about nuclear accent the same way you are.
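To make that concrete, here is a rough Python sketch of the idea. It assumes the tone labels and word boundaries have already been read into memory as time-sorted tuples; the function names and the tuple layout are made up for illustration, and it does not parse the actual .TON files. It also ignores uncertain or non-standard labels, so treat it as a starting point rather than a finished tool.

```python
def is_pitch_accent(label):
    # Pitch accents carry a star, e.g. "H*", "L+H*", "!H*".
    return "*" in label


def is_phrase_end(label):
    # Phrase accent alone ("H-") or phrase accent plus boundary tone ("L-L%").
    return label.endswith("-") or label.endswith("%")


def nuclear_accents(tones):
    """Return the last pitch accent before each phrase end.

    `tones` is a hypothetical list of (time, label) pairs sorted by time.
    """
    result = []
    last_accent = None
    for time, label in tones:
        if is_pitch_accent(label):
            last_accent = (time, label)
        elif is_phrase_end(label):
            if last_accent is not None:
                result.append(last_accent)
            last_accent = None  # start a new phrase
    return result


def word_at(time, words):
    """Return the first word whose interval contains `time`, else None.

    `words` is a hypothetical list of (start, end, text) triples.
    """
    for start, end, text in words:
        if start <= time <= end:
            return text
    return None


# Toy data for "He ate the BREAD" (times in seconds, invented).
words = [(0.0, 0.2, "he"), (0.2, 0.5, "ate"), (0.5, 0.6, "the"), (0.6, 1.0, "bread")]
tones = [(0.35, "H*"), (0.8, "H*"), (1.0, "L-L%")]

nuclear_words = [word_at(t, words) for t, _ in nuclear_accents(tones)]
print(nuclear_words)
```

With the toy data, the only accent kept is the second "H*", which falls inside "bread"; the earlier accent on "ate" is dropped because it is not the last one before the phrase end.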

Cheers, Kristine

On Tue, Apr 19, 2016 at 9:20 AM, Eitan Grossman <eitan.grossman at mail.huji.ac.il> wrote:

> Dear all,
> Does anyone know of any spoken corpus in any language whatsoever that is
> tagged for prosody?
> Specifically, I'm interested in corpora in which nuclear stress is marked
> on each intonation unit, even if in a simple way, like 'He ate the BREAD'.
> Thanks in advance,
> Eitan
> Eitan Grossman
> Lecturer, Department of Linguistics/School of Language Sciences
> Hebrew University of Jerusalem
> Tel: +972 2 588 3809
> Fax: +972 2 588 1224
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora

--
Kristine Yu
UMass Amherst, Department of Linguistics
Integrative Learning Center
650 North Pleasant Street
Amherst, MA 01003
