If you’re interested in tools covering data types beyond POS-tagged concordances, in particular syntactically annotated data and complex user-defined annotation types, you may want to check out ANNIS:
http://corpus-tools.org/annis/
We also offer some richly annotated corpora via an ANNIS server at Georgetown University, some of which are in languages you mentioned below, so you can see some of what the system can do here:
https://corpling.uis.georgetown.edu/annis-corpora/
We also serve corpora with flat annotations, including in languages on your list, through a CQPweb interface here:
https://corpling.uis.georgetown.edu/cqp/
Hope this helps,
Amir
------------
Dr. Amir Zeldes
Asst. Prof. of Computational Linguistics
Department of Linguistics
Georgetown University
1437 37th St. NW
Washington, DC 20057
http://corpling.uis.georgetown.edu/amir
From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf Of Irina Temnikova
Sent: Wednesday, July 18, 2018 2:55 PM
To: corpora at uib.no
Subject: [Corpora-List] Interesting Corpus analysis tools & specific corpora
Hi all!
I am trying to update a group of (not computational) linguists about the currently _accessible corpora_ and working _corpus analysis tools_.
I am aware of the most famous tools and multilingual/English corpora.
*I would be extremely thankful if somebody could point me towards the following:*
1. I am interested in any corpus analysis tools that are usable by linguists and are **different** from the usual concordances, keyword/term extractors, and collocations, i.e. different from:
AntConc, WordSmith Tools, SketchEngine (although it is amazingly great! :) ), and LIWC. Please no NLTK -- it is too complex for my audience ;).
It would be nice if the tools offer some syntactic analysis, for example.
*It would be better if the tools could be used with the user’s own corpora*, and if they are easy to use.
2. I am interested in corpora with texts in the following languages (especially learners’ corpora, social media corpora, parallel corpora):
Italian - especially medieval historical
Norwegian
Swedish
French, specifically social media (e.g. tweets), dialogues between foreigners
Spanish tourism
Modern Greek
Swahili
Afrikaans
Thank you very much in advance!
Irina Temnikova
--
Irina P. Temnikova, B.A., M.A., Ph.D.
Lecturer & Computational Linguistics Researcher
Sofia University (past Qatar Computing Research Institute & Bulgarian Academy of Sciences)
https://scholar.google.bg/citations?user=7BcpifAAAAAJ&hl=en