[Corpora-List] Interesting Corpus analysis tools & specific corpora

Adam Ek adam.ek at ling.su.se
Thu Jul 19 16:38:28 CEST 2018

You might find Språkbanken (https://spraakbanken.gu.se/korp/) useful. The tool contains various Swedish corpora and a sophisticated search interface. The site also has an English version, which can be accessed from the top right corner.


________________________________
From: corpora-bounces at uib.no <corpora-bounces at uib.no> on behalf of Irina Temnikova <irina.temnikova at gmail.com>
Sent: Wednesday, July 18, 2018 8:55:04 PM
To: corpora at uib.no
Subject: [Corpora-List] Interesting Corpus analysis tools & specific corpora

Hi all!

I am trying to update a group of (non-computational) linguists on the currently _accessible corpora_ and working _corpus analysis tools_.

I am aware of the most famous tools and multilingual/English corpora.

*I would be extremely thankful if somebody could point me towards the following:*

1. I am interested in any corpus analysis tools that are usable by linguists and are **different** from the usual concordancers, keyword/term extractors, and collocation tools, i.e. different from:

AntConc, WordSmith Tools, Sketch Engine (although it is amazingly great! :) ), and LIWC. Please no NLTK either: it is too complex for my audience ;).

It would be nice if the tools offer some syntactic analysis, for example.

*It would be better if the tools could be used with the user’s own corpora*, and if they are easy to use.

2. I am interested in corpora with texts in the following languages (especially learners’ corpora, social media corpora, parallel corpora):

Italian, especially medieval historical

French, specifically social media (e.g. tweets) and dialogues between foreigners

Spanish tourism

Modern Greek



Thank you very much in advance!

Irina Temnikova

-- Irina P. Temnikova, B.A., M.A., Ph.D.

Lecturer & Computational Linguistics Researcher

Sofia University (past Qatar Computing Research Institute & Bulgarian Academy of Sciences)

