[Corpora-List] Syntactically parsed Tagalog corpus

Eric Atwell csc6ea at leeds.ac.uk
Sat Jan 7 13:35:59 CET 2012


Sebastian,

A quick Google search for "Tagalog corpus" shows research where Tagalog web-pages have been compiled into a corpus; alternatively, you could do this yourself, using a web-crawler corpus-builder such as WebBootCat, available via the SketchEngine website (see previous posts on this to CORPORA). But I would be very surprised if you found a Tagalog treebank: parsing/treebanking is a big undertaking, so I'm sure that if anyone had done this they would have published the results on the WWW.

But you should ask yourself: do you really need a parsed corpus to answer your research question(s)? A parsed corpus is most often used to train a machine-learning NLP system, but I don't think you're doing machine learning, are you?

What do psycholinguists do when studying sentence processing in other non-European languages? I guess few languages outside the "top 20" have treebanks available for psycholinguistic study, so is there an alternative approach used by others in your field?

good luck

Eric Atwell, Senior Lecturer, Language Processing research group,

I-AIBS Institute for Artificial Intelligence and Biological Systems

School of Computing, Faculty of Engineering, UNIVERSITY OF LEEDS

Leeds LS2 9JT, England. TEL: 0113-3435430 FAX: 0113-3435468

WWW: http://www.comp.leeds.ac.uk/eric

http://www.comp.leeds.ac.uk/nlp

On Sat, 7 Jan 2012, Sebastian Sauppe wrote:


> Dear CORPORA list members,
>
> I’m a PhD student at the Max Planck Institute for Psycholinguistics in
> Nijmegen (The Netherlands) and my PhD project is on sentence processing
> in Tagalog, an Austronesian language of the Philippines.
>
> In order to prepare my psycholinguistic experiments, I would like to
> conduct some corpus linguistic analyses. I have been looking for a
> syntactically tagged/parsed Tagalog corpus for a while now;
> unfortunately, I was not able to find one.
>
> Does any of you know of a syntactically parsed Tagalog corpus that I
> could use for my analyses? If you don’t know of such a corpus, do you
> perhaps know of any experts who could help me find a syntactically
> parsed Tagalog corpus?
>
> Thank you very much for your help in advance.
>
> Best regards,
> Sebastian Sauppe
>
> --
> Sebastian Sauppe, M.A.
> Language and Cognition Department
> &
> International Max Planck Research School for Language Sciences
> Max Planck Institute for Psycholinguistics
> Wundtlaan 1
> 6525 XD Nijmegen
> The Netherlands
>
> Homepage: http://www.mpi.nl/people/sauppe-sebastian
> Telephone: +31-24-3521561



