[Corpora-List] GermEval 2020 Task 1 on the Prediction of Intellectual Ability and Personality Traits from Text: 1st Call for Participation

Zeerak Waseem z.w.butt at sheffield.ac.uk
Wed Dec 4 22:58:39 CET 2019


Anecdotally from my own experiences in Denmark and conversations with god knows how many racialised people around Europe (best estimation n ~= 50): the concerns and reasons for concern described (and reasons for strong language) are likely to map to a European context.

The racial inequalities of the societies play out in the same way (I know Europeans hate admitting to similarities with the US but they are highly similar) with marginalisation of groups of people based on heritage (the specific group(s) depend on the country in question), (forced) ghettoisation, White flight, schools and public infrastructure being poorly maintained (anecdotally: I was told multiple times along the way (directly and indirectly) that I should probably not seek an academic career and I’ve rarely had a conversation with a racialised person who grew up in the global economic north who didn’t have similar experiences), etc. Of course there are also dissimilar systems of thinking between cultures and having multiple cultural backgrounds could influence scores resulting from processing patterns measured in IQ tests. In short: based on my own experience, what I saw growing up, and others’ experiences, I would be highly surprised if European investigations of correlations between race and IQ would show dissimilarity with the US investigations. And these inequities would likely be built into the systems relying on data such as IQ.

So yeah, seconding that the strong language may be entirely appropriate (and frankly Europeans should stop thinking that the processes of racial discrimination here are wildly different than in the US).

Zeerak


> On 4 Dec 2019, at 22:10, Yannick Versley <yversley at gmail.com> wrote:
>
> Even understanding the German educational system and culture, I'd say that this task should light up the "irresponsible" light in the mind of any person who is (i) reasonably clear-thinking and (ii) familiar with the problems that responsible/ethical AI wants to warn us about. By overselling models and tasks that necessarily (i) lead to a poor fit overall and (ii) are prone to pick up cultural/ethnic background (to pick a less US-centric term than "race") in addition to any informative features, we're lending legitimacy to the use of similar tools that are used as a pseudo-scientific mantle to disguise (essentially) the automation of racial/ethnic/cultural discrimination and biases.
> People (yes, most people) desperately want automated computer decisions that work the same way that biased humans do them but with the aura of objectivity. And as the technological experts it's our duty to call bullshit on that and criticise the flawed tools as well as the processes that lead to their acceptance and/or use in production. Rather than capitalizing on the desire for pseudo-science and becoming a helping party in the deception.
>
> Best wishes,
> Yannick
>
>> On Wed, Dec 4, 2019 at 9:24 PM Laura Dietz <dietz at cs.unh.edu> wrote:
>> I think it is unfair to call this task out as "irresponsible" without understanding the German educational system and culture. I understand that Americans have a knee-jerk response, and it is always good to caution experimental setups. However, I would have hoped for a more measured response.
>>
>> Laura Dietz
>>
>>
>>> On 12/4/19 2:00 PM, Emily M. Bender wrote:
>>> Thank you, Jacob, for this reply. This task seems irresponsible/poorly conceived to me. Before designing such a task, I think it is imperative to consider its use cases: When and why would we want to predict IQ scores or high school grades from text? Given the high potential for any such system to learn preexisting biases (themselves the result of structural discrimination in society), what are the likely impacts, especially on already marginalized populations?
>>>
>>> Emily
>>>
>>>> On Wed, Dec 4, 2019 at 10:34 AM Jacob Eisenstein <jacobe at gmail.com> wrote:
>>>> As a community, we should think carefully about whether it is appropriate to work with IQ test results as data, and what the applications of this research might be.
>>>>
>>>> In the United States, there is considerable evidence that IQ tests are racially biased. In the past, courts have excluded IQ tests from educational placement in California for precisely this reason. I wonder if there is research on this topic in the German context.
>>>>
>>>> It is not difficult to imagine that the outcome of this shared task would be a set of technologies that encode spurious correlations between estimates of intelligence and the linguistic features of specific racial groups. If such a system were trained on data that already contains biases, there is a risk that this bias would be not only entrenched but amplified. And even if the IQ test statistics are not themselves biased, an NLP system that predicts IQ from text could introduce bias, if there is an unmeasured confound that is statistically associated with both IQ and race.
>>>>
>>>> I hope that these issues will receive serious consideration from the organizers and participants in the task.
>>>>
>>>> Jacob Eisenstein
>>>>
>>>>> On Wed, Dec 4, 2019 at 8:27 AM Dirk Johannßen <johannssen at informatik.uni-hamburg.de> wrote:
>>>>> GermEval 2020 Task 1 on the Prediction of Intellectual Ability and Personality Traits from Text
>>>>>
>>>>> 1st Call for Participation
>>>>> We invite interested parties from academia and industry to participate in this shared task. Further information can be found here: https://www.inf.uni-hamburg.de/en/inst/ab/lt/resources/data/germeval-2020-psychopred.html .
>>>>>
>>>>> The validity of high school grades as a predictor of academic success is controversial. Researchers have found indications that linguistic features such as function words used in a prospective student's writing perform better in predicting academic success (Pennebaker et al., 2014).
>>>>>
>>>>> During an aptitude test, participants are asked to write freely associated texts to provided questions and images. Trained psychologists can predict behavior, long-term development, and subsequent success from those expressions. Paired with an IQ test and provided high school grades, prediction of intellectual ability from a text can be investigated. Such an approach would extend the sole text classification and could reveal insightful psychological traits.
>>>>>
>>>>> Operant motives are unconscious intrinsic desires that can be measured by implicit or operant methods, such as those employed by the Operant Motive Test (OMT) or the Motive Index (MIX). During the OMT and MIX, participants are asked to write freely associated texts to provided questions and images. Trained psychologists label these textual answers with one of five motives and corresponding levels. The identified motives allow psychologists to predict behavior, long-term development, and subsequent success. For our task, we provide extensive amounts of textual data from both the OMT and MIX, paired with IQ and high school grades (MIX) and labels (OMT).
>>>>>
>>>>> With this task, we aim to foster research within this context. This task focuses on classifying German psychological text data to predict the IQ and high school grades of college applicants, as well as performing speaker identification from the same image descriptions.
>>>>>
>>>>>
>>>>> Tasks
>>>>> This shared task consists of two subtasks, described below. Participants are free to participate in either one of them or both.
>>>>>
>>>>> - Subtask 1: Prediction of Intellectual Ability. The task is to predict measures of intellectual ability solely based on text. For this, z-standardized high school grades and IQ scores of college applicants are summed and globally ranked. The goal of this subtask is to reproduce their ranking; systems are evaluated by the Pearson correlation coefficient between system and gold ranking.
>>>>>
>>>>> For the final results, participants of this shared task will be provided with a MIX_text only and are asked to reproduce the ranking of each student relative to all students in a collection (i.e. within the test set).
>>>>>
>>>>> The data is delivered in two files, one containing participant data, the other containing sample data, each being connected by a student ID. The rank in the sample data reflects the averaged performance relative to all instances within the collection (i.e. within train / test / dev), which is to be reproduced for the task.
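
[Illustrative sketch, not part of the original call] The Subtask 1 target and metric described above can be sketched roughly as follows. This is a minimal example under stated assumptions: the function names, the toy numbers, the use of the population standard deviation, the convention that a higher value is better for both inputs, and the tie-breaking rule are all hypothetical and are not taken from the organizers' data or scoring scripts.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Z-standardize a list of numbers (population standard deviation, an assumption)."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]


def ranks(scores):
    """Rank 1 = highest score; ties are broken by input order (an assumption)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def gold_ranking(grades, iq):
    """Sum the z-standardized grades and IQ scores, then rank the sums."""
    combined = [g + q for g, q in zip(z_scores(grades), z_scores(iq))]
    return ranks(combined)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Toy data (hypothetical): four students, higher is better in both columns.
toy_grades = [12.0, 9.0, 14.0, 7.0]
toy_iq = [110, 95, 120, 100]

gold = gold_ranking(toy_grades, toy_iq)
system = [1, 3, 2, 4]  # a hypothetical system output
score = pearson(system, gold)
```

Because the metric is computed on ranks within one collection, a system only has to order students relative to each other; it never has to output an absolute grade or IQ value.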
>>>>>
>>>>> - Subtask 2: Classification of the Operant Motive Test (OMT). Operant motives are unconscious intrinsic desires that can be measured by implicit or operant methods, such as the Operant Motive Test (OMT) (Kuhl and Scheffer, 1999). During the OMT, participants are asked to write freely associated texts to provided questions and images. An exemplary illustration can be found in the Data area. Trained psychologists label these textual answers with one of four motives. The identified motives allow psychologists to predict behavior, long-term development, and subsequent success.
>>>>>
>>>>> For this shared task, participants will be provided with an OMT_text and are asked to predict the motive and level of each instance. The success will be measured with the macro-averaged F1-score.
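
[Illustrative sketch, not part of the original call] The macro-averaged F1-score named above gives every class the same weight, however rare it is. A minimal sketch follows, assuming that the label set is the union of gold and predicted labels and that a class with no true positives scores 0; the label strings "m1", "m2", "m3" are hypothetical placeholders, not the actual OMT motive/level labels.

```python
def macro_f1(gold, predicted):
    """Unweighted mean of per-class F1 over all labels seen in gold or predicted."""
    labels = sorted(set(gold) | set(predicted))
    f1_scores = []
    for label in labels:
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        denominator = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never matches.
        f1_scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy labels (hypothetical placeholders for motive/level classes).
toy_gold = ["m1", "m1", "m2", "m3"]
toy_pred = ["m1", "m2", "m2", "m3"]
toy_score = macro_f1(toy_gold, toy_pred)
```

In this toy case the per-class F1 values are 2/3, 2/3 and 1, so the macro average is 7/9. A single error on a rare class moves the score as much as an error on a frequent one.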
>>>>>
>>>>>
>>>>> Data
>>>>> Since 2011, the private university of applied sciences NORDAKADEMIE has conducted an aptitude test for college applicants, in which participants state their high school performance and take an IQ test and a psychometric test called the Motive Index (MIX). The MIX measures so-called implicit or operant motives by having participants answer questions about images like the one displayed below, such as "who is the main person and what is important for that person?" and "what is that person feeling?". Furthermore, participants answer the question of what motivated them to apply to the NORDAKADEMIE.
>>>>>
>>>>> The data consists of a unique ID per entry, one ID per participant, the applicants' major and high school grades, and IQ scores, with one textual expression attached to each entry. High school grades and IQ scores are z-standardized for privacy protection. In total there are 2,595 participants, who produced 77,850 unique MIX answers. The shortest textual answers consist of 3 words, the longest of 42, and on average there are roughly 15 words per textual answer with a standard deviation of 8 words.
>>>>>
>>>>> The available data set has been collected and hand-labeled by researchers of the University of Trier. More than 14,600 volunteers participated in answering questions to 15 provided images. The pairwise annotator intraclass correlation was r = .85 on the Winter scale (Winter, 1994). The length of the answers ranges from 4 to 79 words with a mean length of 22 words and a standard deviation of roughly 12 words.
>>>>>
>>>>> Submissions for the validation set via the Codalab page are accepted and published on a leaderboard from January 1st. From May 1st, we will start the final evaluation phase of the task by providing the gold labels of the validation set, which can be used as additional training data. Additionally, the test set samples will be provided, for which we accept submissions until June 1st.
>>>>>
>>>>> More information can be found on the task's webpage: https://www.inf.uni-hamburg.de/en/inst/ab/lt/resources/data/germeval-2020-psychopred.html
>>>>>
>>>>>
>>>>> Important Dates
>>>>> - 01-Dec-2019: Release of trial data and systems
>>>>> - 01-Jan-2020: Release of training data (train + validation)
>>>>> - 08-May-2020: Release of test data
>>>>> - 01-Jun-2020: Final submission of test results
>>>>> - 03-Jun-2020: Submission of description paper
>>>>> - 04-11-Jun-2020: Peer reviewing: participants are expected to review other participants' system descriptions
>>>>> - 12-Jun-2020: Notification of acceptance and reviewer feedback
>>>>> - 18-Jun-2020: Camera-ready deadline for system description papers
>>>>> - 23-Jun-2020: Workshop in Zurich, Switzerland at the KONVENS 2020 and SwissText joint conference
>>>>>
>>>>> The shared task will be accompanied by a pre-conference workshop of the Conference on Natural Language Processing ("Konferenz zur Verarbeitung natürlicher Sprache", KONVENS) hosted on June 23, 2020, at Zürich (https://swisstext-and-konvens-2020.org/).
>>>>>
>>>>>
>>>>> Workshop Proceedings
>>>>> Description papers will appear in online workshop proceedings. Participants who submit a description paper will be asked to register at the workshop and present their system as a poster or in an oral presentation (depending on the number of submissions).
>>>>>
>>>>>
>>>>> Organizers
>>>>> The shared task is organized by Dirk Johannßen, Chris Biemann, Steffen Remus and Timo Baumann from the Language Technology group of the University of Hamburg (https://www.inf.uni-hamburg.de/en/inst/ab/lt/home.html), as well as David Scheffer from the NORDAKADEMIE Elmshorn, Nicola Baumann from the Universität Trier and Gudula Ritz from Impart GmbH (Germany).
>>>>>
>>>>>
>>>>> GermEval
>>>>> GermEval is a series of shared task evaluation campaigns that focus on Natural Language Processing for the German language. GermEval has been conducted four times since 2014 in co-location with KONVENS/GSCL conferences. For an overview of the currently conducted tasks, visit https://swisstext-and-konvens-2020.org/shared-tasks/.
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Dirk Johannßen
>>>>> Universität Hamburg
>>>>> Department of Informatics
>>>>> Language Technology Group (LT)
>>>>> Vogt-Kölln-Straße 30
>>>>> 22527 Hamburg
>>>>>
>>>>> Room: F-412
>>>>>
>>>>> johannssen at informatik.uni-hamburg.de
>>>>> http://lt.informatik.uni-hamburg.de
>>>>> http://www.uni-hamburg.de
>>>>>
>>>>> _______________________________________________
>>>>> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
>>>>> Corpora mailing list
>>>>> Corpora at uib.no
>>>>> https://mailman.uib.no/listinfo/corpora
>>>
>>>
>>> --
>>> Emily M. Bender (she/her)
>>> Howard and Frances Nostrand Endowed Professor
>>> Department of Linguistics
>>> Faculty Director, CLMS
>>> University of Washington
>>> Twitter: @emilymbender
>>>
>>>