[Corpora-List] 2nd CfP: "Contrastive Corpus Methodology for Language Modeling and Analysis"

Anke Lüdeling anke.luedeling at hu-berlin.de
Wed Aug 19 06:47:59 CEST 2020

Dear colleagues,

I would like to draw your attention to the following 2nd CfP which might be interesting to some of you. Please feel free to distribute. Apologies for cross-posting.

Best, Anke


We invite submissions for a Workshop on

"Contrastive Corpus Methodology for Language Modeling and Analysis"

The workshop will take place as part of the 43. Jahrestagung der Deutschen Gesellschaft für Sprachwissenschaft (43rd Annual Meeting of the German Linguistic Society) in Freiburg, February 24 to 26, 2021.

*Workshop Description: *

As we try to understand and empirically investigate language, a wide range of methods are at our disposal and many decisions are to be made. One of the first decisions in the process is the amount of data we include in our corpora or the depth of annotation. We often sacrifice more data for deeper, manually obtained linguistic annotation and prefer a richer and more explicit description of language over shallow but larger data sets. This is sometimes perceived as not so much the outcome of a conscious decision in favor of analytical depth, but as a compromise we are forced to make due to restricted temporal, human, and financial resources.

However, research indicates that when dealing with language, gathering more data does not necessarily result in more insight. In fact, statistical analysis might not always yield better results just because it yields different ones for bigger data. In addition, when we attempt to combine established methods of statistical analysis with complex and adequate linguistic models in corpora, we still encounter limitations of sample size, and thus we can easily find ourselves lost in front of the tool cabinet. In the same way that theoretical linguistics has been engaged in an ongoing and productive debate around the virtues of varying syntactic models, such as constraint vs. phrase structure grammars (and combinations thereof), corpus linguistics requires an informed discussion about the virtues and limitations of different models for each linguistic phenomenon. With adequate and meaningful models, fewer data may yield more satisfactory results than larger datasets that only provide shallow linguistic annotation.

How can we understand the limitations of the tools that we have at our disposal and develop new models, methods, measures, or frameworks that fit the linguistic needs of our analyses? And what is the influence of the theoretical model on the analysis, and how does it affect our results? This workshop encourages discussions of methods dealing with small and mid-sized corpora as a resource for linguistic analysis rooted in in-depth theoretical modelling. It addresses linguists working empirically on all linguistic levels, corpus and computational linguists, as well as statisticians.

Contributions to the workshop may cover, but are not limited to, the following topics:

* new and/or comparative methods for data analysis within and beyond statistical frameworks
* effects of different data sizes and data partitions for linguistic analyses
* a contrastive perspective of modelling decisions and results
* the influence of linguistic models in data modelling decisions and means of analysis
* recent trends in linguistic data analysis

*Organizers *

Martin Klotz, Anna Shadrova, Anke Lüdeling

(all from HU Berlin Corpus Linguistics)

*Invited Speaker *

We are happy to welcome Wander Lowie <https://www.rug.nl/staff/w.m.lowie/> from the University of Groningen as our invited speaker. Prof. Lowie has contributed a wide range of research to the areas of L2 acquisition, variability, dynamic systems and usage-based dynamics, and quantitative modeling in linguistic domains as diverse as phonology, lexical acquisition, language assessment, and conceptual representations.

*Format *

Authors should submit one-page abstracts (including references) in a 12-point font (e.g. Times New Roman) to


References should be formatted according to the APA guidelines. Talks will be given 30- or 60-minute slots including discussion, depending on the program. Please specify your preferred length in your submission. The workshop language is English for both abstracts and talks. According to DGfS regulations, speakers can only present a paper in one workshop.

*Important dates*

submission of abstracts: 15.09.2020
notification of acceptance: 01.10.2020
workshop: 24.–26.02.2021

Find all information on the conference here: https://dgfs2021.uni-freiburg.de/

Due to the SARS-CoV-2 pandemic the format of the conference might be subject to change.
