[Corpora-List] Corpora of comic strips/books

Ryan North ryan at cs.toronto.edu
Thu Dec 21 18:55:00 CET 2006


Axel,

I'm actually the guy who built Oh No Robot, and if you'd like a dump of the
transcription data, I'd be happy to provide it! There's some meta-data of scene
descriptions included in the transcriptions, but it's not in all of them, and
depends on the transcriber's idea of what's going on, so there'll be some noise.

Cheers,
Ryan

----- Original Message -----
From: "Chris Callison-Burch" <callison-burch at ed.ac.uk>
To: "Axel Herold" <aherold at informatik.hu-berlin.de>
Cc: <corpora at uib.no>
Sent: Wednesday, December 20, 2006 5:00 PM
Subject: Re: [Corpora-List] Corpora of comic strips/books


Dear Axel,

There is a web site called "Oh No Robot" (http://www.ohnorobot.com/)
which provides search services for web comics. They use
"crowdsourcing" to have users transcribe the comics. They've got
50,000 transcribed strips from 600 series at the moment.

Yours,
Chris Callison-Burch

Quoting Axel Herold <aherold at informatik.hu-berlin.de>:


> Dear all,
>
> I'm planning to write my master's thesis on the nature of the dialog data
> from comic strips/books that might be seen somewhere in between spoken and
> written language. Are any of you aware of corpora containing comic texts
> (any language, though I'll focus on German)? Ideally the data should
> indicate speaker--utterance(s) relations.
>
> So far, I've seen a description of a comic corpus of Bosnian, Croatian and
> Serbian comic series that was tailor-made for a special survey on a Slavic
> deictic system and marked up accordingly:
> http://tusnelda.sfb.uni-tuebingen.de/TUSNELDA/b8/comics/comicheader.html
>
> Any further pointers would be most welcome.
>
> Best regards, Axel Herold.
>
> --
> [...] er mißtraute den Worten, die sich euphonisch und rhythmisch fügten,
> mit dem behaglichen Schnurren, das den Leser hypnotisiert,
> nachdem der Schriftsteller als erster ihm zum Opfer gefallen ist.
> (Cortázar: Rayuela)