In the United States, there is considerable evidence that IQ tests are racially biased. In the past, courts have excluded IQ tests from educational placement in California for precisely this reason. I wonder if there is research on this topic in the German context.
It is not difficult to imagine that the outcome of this shared task would be a set of technologies that encode spurious correlations between estimates of intelligence and the linguistic features of specific racial groups. If such a system were trained on data that already contains biases, there is a risk that this bias would be not only entrenched but amplified. And even if the IQ test statistics are not themselves biased, an NLP system that predicts IQ from text could introduce bias, if there is an unmeasured confound that is statistically associated with both IQ and race.
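The confounding concern can be made concrete with a small simulation. This is only a toy sketch under stated assumptions: the variables `group`, `confound`, `score`, and `feature` and all their distributions are invented for illustration and are not taken from the task data. The text feature depends only on group membership, the score depends only on the unmeasured confound, and yet a regression of score on the feature finds a clear slope.

```python
import random


def simulate(n=5000, seed=0):
    """Draw (group, score, feature) triples from a toy model (all assumptions)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        group = rng.randint(0, 1)             # hypothetical group membership
        confound = group + rng.gauss(0, 1)    # unmeasured; associated with group
        score = confound + rng.gauss(0, 1)    # test score: driven by the confound only
        feature = group + rng.gauss(0, 0.5)   # text marker: driven by group only
        rows.append((group, score, feature))
    return rows


def ols_slope(xs, ys):
    """Slope of the least-squares line predicting ys from xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


rows = simulate()

# Pooled over both groups, the text feature appears to "predict" the score.
overall = ols_slope([f for _, _, f in rows], [s for _, s, _ in rows])

# Within each group, the feature carries almost no information about the score.
within = [
    ols_slope(
        [f for g, _, f in rows if g == k],
        [s for g, s, _ in rows if g == k],
    )
    for k in (0, 1)
]

print("pooled slope:", round(overall, 2))
print("within-group slopes:", [round(w, 2) for w in within])
```

In this toy setup, a predictor trained on the pooled data would assign different scores to two writers with identical test results purely because of the group-linked text marker.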
I hope that these issues will receive serious consideration from the organizers and participants in the task.
On Wed, Dec 4, 2019 at 8:27 AM Dirk Johannßen <johannssen at informatik.uni-hamburg.de> wrote:
> *GermEval 2020 Task 1 on the Prediction of Intellectual Ability and
> Personality Traits from Text*
> *1st Call for Participation*
> We invite interested parties from academia and industry to participate in
> this shared task. Further information can be found here:
> The validity of high school grades as a predictor of academic success is
> controversial. Researchers have found indications that linguistic features
> such as function words used in a prospective student's writing perform
> better in predicting academic success (Pennebaker et al., 2014).
> During an aptitude test, participants are asked to write freely associated
> texts to provided questions and images. Trained psychologists can predict
> behavior, long-term development, and subsequent success from those
> expressions. Paired with an IQ test and provided high school grades,
> prediction of intellectual ability from a text can be investigated. Such an
> approach would go beyond plain text classification and could reveal
> insightful psychological traits.
> Operant motives are unconscious intrinsic desires that can be measured by
> implicit or operant methods, such as the Operant Motive Test (OMT) or the
> Motive Index (MIX). During the OMT and MIX, participants are asked
> to write freely associated texts to provided questions and images. Trained
> psychologists label these textual answers with one of five motives and
> corresponding levels. The identified motives allow psychologists to predict
> behavior, long-term development, and subsequent success. For our task, we
> provide extensive amounts of textual data from both the OMT and MIX,
> paired with IQ and high school grades (MIX) and labels (OMT).
> With this task, we aim to foster research within this context. This task
> focuses on classifying German psychological text data for predicting
> the IQ and high school grades of college applicants as well as performing
> speaker identification by the same image descriptions.
> This shared task consists of two subtasks, described below. Participants
> are free to participate in either one of them or both.
> *- Subtask 1*: Prediction of Intellectual Ability. The task is to predict
> measures of intellectual ability solely based on text. For this,
> z-standardized high school grades and IQ scores of college applicants are
> summed and globally ranked. The goal of this subtask is to reproduce their
> ranking; systems are evaluated by the Pearson correlation coefficient
> between system and gold ranking.
> For the final results, participants of this shared task will be provided
> with an MIX_text only and are asked to reproduce the ranking of each
> student relative to all students in a collection (i.e. within the test set).
> The data is delivered in two files, one containing participant data, the
> other containing sample data, each being connected by a student ID. The
> rank in the sample data reflects the averaged performance relative to all
> instances within the collection (i.e. within train / test / dev), which is
> to be reproduced for the task.
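The target construction and evaluation described for Subtask 1 can be sketched as follows. This is a minimal illustration, not the organizers' scoring script: the toy grade and IQ values are made up, the helper names are hypothetical, and it assumes that higher values are better, that rank 1 is the top student, and that ties are broken by input order.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def ranks(values):
    """Rank 1 = highest value; ties broken by input order (an assumption)."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


# Made-up values for four students (not from the task data).
grades = [12, 9, 14, 7]
iq = [110, 95, 120, 100]

# Sum the z-standardized measures, then rank within the collection.
combined = [g + q for g, q in zip(z_scores(grades), z_scores(iq))]
gold_ranking = ranks(combined)

# A hypothetical system output, scored against the gold ranking.
system_ranking = [1, 3, 2, 4]
print(gold_ranking, pearson(gold_ranking, system_ranking))
```

Because the correlation is computed on ranks, this measure behaves like a Spearman rank correlation on the underlying combined scores.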
> *- Subtask 2*: Classification of the Operant Motive Test (OMT). Operant
> motives are unconscious intrinsic desires that can be measured by implicit
> or operant methods, such as the Operant Motive Test (OMT)(Kuhl and
> Scheffer, 1999). During the OMT, participants are asked to write freely
> associated texts to provided questions and images. An exemplary
> illustration can be found in the Data area. Trained psychologists label
> these textual answers with one of four motives. The identified motives
> allow psychologists to predict behavior, long-term development, and
> subsequent success.
> For this shared task, participants will be provided with an OMT_text and
> are asked to predict the motive and level of each instance. The success
> will be measured with the macro-averaged F1-score.
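For reference, the macro-averaged F1-score named above can be computed as in the sketch below. The label strings are hypothetical placeholders rather than the task's actual label set, and the choice to average over every label that occurs in either the gold or the predicted labels is an assumption; the official scorer may handle absent classes differently.

```python
def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all labels seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels for four answers (not from the task data).
gold = ["A", "A", "L", "M"]
pred = ["A", "L", "L", "M"]
print(macro_f1(gold, pred))
```

Macro averaging weights every class equally, so rare motive and level combinations count as much as frequent ones.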
> Since 2011, the private university of applied sciences NORDAKADEMIE
> has conducted an aptitude test for college applicants, in which
> participants state their high school performance and take an IQ test and a
> psychometric test called the Motive Index (MIX). The MIX measures so-called
> implicit or operant motives by having participants answer questions about
> images like the one displayed below, such as "who is the main person and
> what is important for that person?" and "what is that person feeling?".
> Furthermore, participants answer the question of what motivated them to
> apply to the NORDAKADEMIE.
> The data consists of a unique ID per entry, one ID per participant, the
> applicants' major, high school grades, and IQ scores, with one textual
> expression attached to each entry. High school grades and IQ scores
> are z-standardized for privacy protection. In total there are 2,595
> participants, who produced 77,850 unique MIX answers. The shortest textual
> answers consist of 3 words, the longest of 42 and on average there are
> roughly 15 words per textual answer with a standard deviation of 8 words.
> The available data set has been collected and hand-labeled by researchers
> of the University of Trier. More than 14,600 volunteers participated in
> answering questions to 15 provided images. The pairwise annotator
> intraclass correlation was r = .85 on the Winter scale (Winter, 1994). The
> length of the answers ranges from 4 to 79 words with a mean length of 22
> words and a standard deviation of roughly 12 words.
> Submissions for the validation set via the Codalab page are accepted and
> published on a leaderboard from January 1st. From May 1st, we will start
> the final evaluation phase of the task by providing the gold labels of the
> validation set, which can be used as additional training data.
> Additionally, the test set samples will be provided, for which we accept
> submissions until June 1st.
> More information can be found on the task's webpage:
> *Important Dates*
> - 01-Dec-2019: Release of trial data and systems
> - 01-Jan-2020: Release of training data (train + validation)
> - 08-May-2020: Release of test data
> - 01-Jun-2020: Final submission of test results
> - 03-Jun-2020: Submission of description paper
> - 04-11-Jun-2020: Peer reviewing: participants are expected to review
> other participants' system descriptions
> - 12-Jun-2020: Notification of acceptance and reviewer feedback
> - 18-Jun-2020: Camera-ready deadline for system description papers
> - 23-Jun-2020: Workshop in Zurich, Switzerland at the KONVENS 2020 and
> SwissText joint conference
> The shared task will be accompanied by a pre-conference workshop of the
> Conference on Natural Language Processing ("Konferenz zur Verarbeitung
> natürlicher Sprache", KONVENS) hosted on June 23, 2020, at Zürich (
> *Workshop Proceedings*
> Description papers will appear in online workshop proceedings.
> Participants who submit a description paper will be asked to register at
> the workshop and present their system as a poster or in an oral
> presentation (depending on the number of submissions).
> The shared task is organized by Dirk Johannßen, Chris Biemann, Steffen
> Remus and Timo Baumann from the Language Technology group of the University
> of Hamburg (https://www.inf.uni-hamburg.de/en/inst/ab/lt/home.html), as
> well as David Scheffer from the NORDAKADEMIE Elmshorn, Nicola Baumann from
> the Universität Trier and Gudula Ritz from Impart GmbH (Germany).
> GermEval is a series of shared task evaluation campaigns that focus on
> Natural Language Processing for the German language. GermEval has been
> conducted four times since 2014 in co-location with KONVENS/GSCL
> conferences. For an overview of the currently conducted tasks, visit
> Dirk Johannßen
> Universität Hamburg
> Department of Informatics
> Language Technology Group (LT)
> Vogt-Kölln-Straße 30
> 22527 Hamburg
> Room: F-412
> johannssen at informatik.uni-hamburg.de