[Corpora-List] Query regarding phoneme extraction

Arne Köhn koehn at informatik.uni-hamburg.de
Tue Apr 11 10:51:16 CEST 2017

Hi Varun,

Varun Jain writes:

> I was wondering if you knew a tool or a software that can extract
> phonemes from an audio file for the french language? Otherwise, do
> you know if someone has done some work related to it for the french
> language or what database they might have used for it?

In case you also have the corresponding text: have you had a look at MAUS[0]? It can create a segmentation based on audio and text. There is also a web interface[1]. French seems to be supported.

We use MAUS to perform sub-word alignments for the Spoken Wikipedia Corpus[2]. Note that it works best on a more-or-less per-sentence basis, i.e. you might have to split your audio and text into smaller chunks. (These alignments are not yet included in the distributed SWC.)
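
For the text side of that chunking, a minimal sketch in Python could look like the one below. It is not part of MAUS and not what we use for the SWC; the function name split_into_sentences and the punctuation-based rule are assumptions made for illustration. A real French text would need better handling of abbreviations, numbers and the like, and the audio has to be cut at matching positions separately.

```python
import re

# Assumption: a sentence ends at '.', '!', '?' or '…' followed by whitespace.
# This is a naive rule and will mis-split abbreviations such as "M. Dupont".
_SENTENCE_END = re.compile(r"(?<=[.!?…])\s+")


def split_into_sentences(text):
    """Return the non-empty sentences of `text`, with whitespace normalised."""
    # Collapse newlines and repeated spaces so each chunk is a single line.
    normalised = " ".join(text.split())
    return [s for s in _SENTENCE_END.split(normalised) if s]


if __name__ == "__main__":
    sample = "Bonjour tout le monde. Comment allez-vous ?\nTrès bien, merci !"
    for index, sentence in enumerate(split_into_sentences(sample)):
        # Each sentence would go into its own text file, next to the
        # matching audio chunk, before being passed to the aligner.
        print(index, sentence)
```

Each resulting sentence can then be paired with the corresponding stretch of audio and aligned on its own.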



[0]: http://www.bas.uni-muenchen.de/Bas/BasMAUS.html

[1]: https://clarin.phonetik.uni-muenchen.de/BASWebServices/index.html#/services

[2]: http://islrn.org/resources/684-927-624-257-3/

