[Corpora-List] Question concerning audio file search

Adam Kilgarriff adam at lexmasterclass.com
Wed Dec 20 22:26:00 CET 2006


Not directly an answer, but do you know http://podzinger.com? This
astonishing website has vast quantities of podcasts, automatically
transcribed and text-searchable. (Just the day before, I had been
confidently declaring that this level of transcription was beyond the state
of the art.) I encountered it because John Milton, in Hong Kong, has
integrated it into his English Language Teaching tools so students can hear
a word or phrase they are learning.


-----Original Message-----
From: owner-corpora at lists.uib.no [mailto:owner-corpora at lists.uib.no] On
Behalf Of Briony Williams
Sent: 20 December 2006 15:18
To: CORPORA at uib.no
Subject: [Corpora-List] Question concerning audio file search

sato hiroaki wrote:

> I've just made a software tool for using DVD movies as a multimedia


Following the publicising of this useful-looking tool, I wonder whether any
members of the Corpora list could help me with a related task, concerning
audio rather than video files. I'm asking on behalf of someone else.

Is there an existing Windows-based software application that will do the
following? (preferably free of charge, and without a large memory footprint)

1) Given: several very long .wav sound files (possibly an hour long), with
associated text transcript files (.trs files) as produced using the
"Transcriber" software application.

2) User input: User types one or more words to search for in the
transcription files.

3) Output: Software returns each chunk of text that contains the search
string (where "chunk" could be a phrase, sentence, paragraph, topic, or
larger, depending on the granularity of the transcription files).

4) User input: User selects one of the search results.

5) Output: Software plays back the portion of the (large) sound file
corresponding to the chunk selected by the user.
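
For anyone wanting to prototype steps 1) to 5) before a ready-made tool
turns up, the workflow could be sketched roughly as below in Python, using
only the standard library. This is an illustration under assumptions, not an
existing application: it assumes the .trs files are XML in which each <Turn>
element carries startTime and endTime attributes in seconds, it treats one
<Turn> as one "chunk", and the function names (load_chunks, search,
extract_clip) are made up for the example.

```python
import wave
import xml.etree.ElementTree as ET


def load_chunks(trs_path):
    """Return (start_sec, end_sec, text) for each <Turn> in a .trs file.

    Assumption: every <Turn> has startTime/endTime attributes in seconds.
    """
    chunks = []
    for turn in ET.parse(trs_path).getroot().iter("Turn"):
        # Join all text inside the turn and collapse runs of whitespace.
        text = " ".join("".join(turn.itertext()).split())
        start = float(turn.get("startTime"))
        end = float(turn.get("endTime"))
        chunks.append((start, end, text))
    return chunks


def search(chunks, query):
    """Case-insensitive substring search over the chunk texts."""
    needle = query.lower()
    return [c for c in chunks if needle in c[2].lower()]


def extract_clip(wav_path, start, end, out_path):
    """Copy the frames between start and end (seconds) to a new .wav file."""
    with wave.open(wav_path, "rb") as src:
        rate = src.getframerate()
        first = min(int(start * rate), src.getnframes())
        src.setpos(first)
        frames = src.readframes(max(0, int(end * rate) - first))
        with wave.open(out_path, "wb") as dst:
            dst.setparams(src.getparams())
            dst.writeframes(frames)
```

A caller would list the results of search(load_chunks("talk.trs"), "corpus"),
let the user pick one, and pass its start and end times to extract_clip (the
file names here are placeholders). Because only the selected frames are read,
the hour-long file never has to be loaded into memory in full. On Windows the
resulting clip could then be played with the standard winsound module, e.g.
winsound.PlaySound("clip.wav", winsound.SND_FILENAME).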

I can think of a way to do this using Cygwin and Edinburgh Speech Tools -
does anyone know of an existing solution using a Windows graphical
interface? My contact seems to prefer a point-and-click interface if possible.

Thanks in advance for any responses.

Best regards

Briony Williams


Arweinydd Tîm Technoleg Lleferydd / Speech Technology Team Leader
Uned Technolegau Iaith / Language Technologies Unit
Canolfan Bedwyr / Canolfan Bedwyr
Prifysgol Cymru / University of Wales
Bangor / Bangor
Gwynedd LL57 2EN, UK / Gwynedd LL57 2EN, UK

E-Bost / E-Mail : b.williams at bangor.ac.uk
Gwe (Cymraeg) : http://www.bangor.ac.uk/ar/cb/technolegau_iaith.php.cy
Web (English) : http://www.bangor.ac.uk/ar/cb/technolegau_iaith.php.en
Ffôn / Tel : +44 (0) 1506 200862
Rhithfro / Blog : http://murmur.bangor.ac.uk

Gall y neges e-bost hon, ac unrhyw atodiadau a anfonwyd gyda hi,
gynnwys deunydd cyfrinachol ac wedi eu bwriadu i'w defnyddio'n unig
gan y sawl y cawsant eu cyfeirio ato (atynt). Os ydych wedi derbyn y
neges e-bost hon trwy gamgymeriad, rhowch wybod i'r anfonwr ar
unwaith a dilëwch y neges. Os na fwriadwyd anfon y neges atoch chi,
rhaid i chi beidio defnyddio, cadw neu ddatgelu unrhyw wybodaeth a
gynhwysir ynddi. Mae unrhyw farn neu safbwynt yn eiddo i'r sawl a'i
hanfonodd yn unig ac nid yw o anghenraid yn cynrychioli barn
Prifysgol Cymru, Bangor. Nid yw Prifysgol Cymru, Bangor yn gwarantu
bod y neges e-bost hon neu unrhyw atodiadau yn rhydd rhag firysau neu
100% yn ddiogel. Oni bai fod hyn wedi ei ddatgan yn uniongyrchol yn
nhestun yr e-bost, nid bwriad y neges e-bost hon yw ffurfio contract
rhwymol - mae rhestr o lofnodwyr awdurdodedig ar gael o Swyddfa
Cyllid Prifysgol Cymru, Bangor. www.bangor.ac.uk

This email and any attachments may contain confidential material and
is solely for the use of the intended recipient(s). If you have
received this email in error, please notify the sender immediately
and delete this email. If you are not the intended recipient(s), you
must not use, retain or disclose any information contained in this
email. Any views or opinions are solely those of the sender and do
not necessarily represent those of the University of Wales, Bangor.
The University of Wales, Bangor does not guarantee that this email or
any attachments are free from viruses or 100% secure. Unless
expressly stated in the body of the text of the email, this email is
not intended to form a binding contract - a list of authorised
signatories is available from the University of Wales, Bangor Finance
Office. www.bangor.ac.uk
