[Corpora-List] GermEval 2020 Task 1 on the Prediction of Intellectual Ability and Personality Traits from Text: 1st Call for Participation

Jacob Eisenstein jacobe at gmail.com
Thu Dec 5 01:44:51 CET 2019



>
> If you're serious about societal bias in AI, there is an increasing
> number of events that you can get involved in and help make a real
> contribution to address this complex issue. Simply spewing opinions
> and moral incense into mailboxes seems to be much less beneficial.
>

Hi Detmar, I think that you'll find that Zeerak and Emily have been actively involved in organizing such events. My own contributions are limited to spewing opinions into mailboxes :)

On that note, I think it's perfectly reasonable to respectfully question our colleagues about the ethical implications of research projects that are promoted on this list. I hope that the organizers do not view being asked to address these issues as being "ridiculed" or "shamed." That was certainly not my intent in raising this discussion.

Substantively, I do not agree that concerns about intelligence testing are restricted to the anecdotal or second-hand. The history of intelligence testing is particularly problematic (and not just in the US), and so particular care is necessary. In fact, an enormous amount has already been written on this topic, and it seems fair to ask whether and how the organizers of the shared task plan to engage with this literature. I am also curious about the question that Emily posed: what is the use case that the organizers foresee, and do they plan to test for and address potential biases in this use case?

best, -Jacob


> Best regards,
> Detmar
>
>
> On Wed, Dec 04, 2019 at 10:58:39PM +0100, Zeerak Waseem wrote:
> > Anecdotally from my own experiences in Denmark and conversations with
> god knows how many racialised people around Europe (best estimation n ~=
> 50): the concerns and reasons for concern described (and reasons for strong
> language) are likely to map to a European context.
> >
> > The racial inequalities of the societies play out in the same way (I
> know Europeans hate admitting to similarities with the US but they are
> highly similar) with marginalisation of groups of people based on heritage
> (the specific group(s) depend on the country in question), (forced)
> ghettoisation, White flight, schools and public infrastructure being poorly
> maintained (anecdotally: I was told multiple times along the way (directly
> and indirectly) that I should probably not seek an academic career and I’ve
> rarely had a conversation with a racialised person who grew up in the
> global economic north who didn’t have similar experiences), etc. Of course
> there are also dissimilar systems of thinking between cultures and having
> multiple cultural backgrounds could influence scores resulting from
> processing patterns measured in IQ tests.
> > In short: based on my own experience, what I saw growing up, and others’
> experiences, I would be highly surprised if European investigations of
> correlations between race and IQ would show dissimilarity with the US
> investigations. And these inequities would likely be built into the systems
> relying on data such as IQ.
> >
> > So yeah, seconding that the strong language may be entirely appropriate
> (and frankly Europeans should stop thinking that the processes of racial
> discrimination here are wildly different than in the US).
> >
> > Zeerak
> >
> > > On 4 Dec 2019, at 22:10, Yannick Versley <yversley at gmail.com> wrote:
> > >
> > > Even understanding the German educational system and culture, I'd say
> that this task should light up the "irresponsible" light in the mind of any
> person who is (i) reasonably clear-thinking and (ii) familiar with the
> problems that responsible/ethical AI wants to warn us about. By overselling
> models and tasks that necessarily (i) lead to a poor fit overall and (ii)
> are prone to pick up cultural/ethnic background (to pick a less
> US-centric term than "race") in addition to any informative features, we're
> lending legitimacy to the use of similar tools that are used as a
> pseudo-scientific mantle to disguise (essentially) the automation of
> racial/ethnic/cultural discrimination and biases.
> > > People (yes, most people) desperately want automated computer
> decisions that work the same way that biased humans do them but with the
> aura of objectivity. And as the technological experts it's our duty to call
> bullshit on that and criticise the flawed tools as well as the processes
> that lead to their acceptance and/or use in production. Rather than
> capitalizing on the desire for pseudo-science and becoming a helping party
> in the deception.
> > >
> > > Best wishes,
> > > Yannick
> > >
> > >> On Wed, Dec 4, 2019 at 9:24 PM Laura Dietz <dietz at cs.unh.edu> wrote:
> > >> I think it is unfair to call this task out as "irresponsible" without
> understanding the German educational system and culture. I understand that
> Americans have a knee-jerk response, and it is always good to caution
> experimental setups. However, I would have hoped for a more measured
> response.
> > >>
> > >> Laura Dietz
> > >>
> > >>
> > >>> On 12/4/19 2:00 PM, Emily M. Bender wrote:
> > >>> Thank you, Jacob, for this reply. This task seems
> irresponsible/poorly conceived to me. Before designing such a task, I think
> it is imperative to consider its use cases: When and why would we want to
> predict IQ scores or high school grades from text? Given the high potential
> for any such system to learn preexisting biases (themselves the result of
> structural discrimination in society), what are the likely impacts,
> especially on already marginalized populations?
> > >>>
> > >>> Emily
> > >>>
> > >>>> On Wed, Dec 4, 2019 at 10:34 AM Jacob Eisenstein <jacobe at gmail.com>
> wrote:
> > >>>> As a community, we should think carefully about whether it is
> appropriate to work with IQ test results as data, and what the applications
> of this research might be.
> > >>>>
> > >>>> In the United States, there is considerable evidence that IQ tests
> are racially biased. In the past, courts have excluded IQ tests from
> educational placement in California for precisely this reason. I wonder if
> there is research on this topic in the German context.
> > >>>>
> > >>>> It is not difficult to imagine that the outcome of this shared task
> would be a set of technologies that encode spurious correlations between
> estimates of intelligence and the linguistic features of specific racial
> groups. If such a system were trained on data that already contains biases,
> there is a risk that this bias would be not only entrenched but amplified.
> And even if the IQ test statistics are not themselves biased, an NLP system
> that predicts IQ from text could introduce bias, if there is an unmeasured
> confound that is statistically associated with both IQ and race.
> > >>>>
> > >>>> I hope that these issues will receive serious consideration from
> the organizers and participants in the task.
> > >>>>
> > >>>> Jacob Eisenstein
> > >>>>
> > >>>>> On Wed, Dec 4, 2019 at 8:27 AM Dirk Johannßen <
> johannssen at informatik.uni-hamburg.de> wrote:
> > >>>>> GermEval 2020 Task 1 on the Prediction of Intellectual Ability and
> Personality Traits from Text
> > >>>>>
> > >>>>> 1st Call for Participation
> > >>>>> We invite interested parties from academia and industry to
> participate in this shared task. Further information can be found here:
> https://www.inf.uni-hamburg.de/en/inst/ab/lt/resources/data/germeval-2020-psychopred.html
> .
> > >>>>>
> > >>>>> The validity of high school grades as a predictor
> of academic success is controversial. Researchers have found indications
> that linguistic features such as function words used in a prospective
> student's writing perform better in predicting academic success (Pennebaker
> et al., 2014).
> > >>>>>
> > >>>>> During an aptitude test, participants are asked to write freely
> associated texts to provided questions and images. Trained psychologists
> can predict behavior, long-term development, and subsequent success from
> those expressions. Paired with an IQ test and provided high school grades,
> prediction of intellectual ability from a text can be investigated. Such an
> approach would extend the sole text classification and could reveal
> insightful psychological traits.
> > >>>>>
> > >>>>> Operant motives are unconscious intrinsic desires that can be
> measured by implicit or operant methods, such as those employed by the
> Operant Motive Test (OMT) or the Motive Index (MIX). During the OMT and MIX,
> participants are asked to write freely associated texts to provided
> questions and images. Trained psychologists label these textual answers
> with one of five motives and corresponding levels. The identified motives
> allow psychologists to predict behavior, long-term development, and
> subsequent success. For our task, we provide extensive amounts of textual
> data from both the OMT and MIX, paired with IQ and high school grades
> (MIX) and labels (OMT).
> > >>>>>
> > >>>>> With this task, we aim to foster research within this context.
> This task is focusing on classifying German psychological text data for
> predicting the IQ and high school grades of college applicants as well as
> performing speaker identification by the same image descriptions.
> > >>>>>
> > >>>>>
> > >>>>> Tasks
> > >>>>> This shared task consists of two subtasks, described below.
> Participants are free to participate in either one of them or both.
> > >>>>>
> > >>>>> - Subtask 1: Prediction of Intellectual Ability. The task is to
> predict measures of intellectual ability based solely on text. For this,
> z-standardized high school grades and IQ scores of college applicants are
> summed and globally ranked. The goal of this subtask is to reproduce their
> ranking; systems are evaluated by the Pearson correlation coefficient
> between system and gold ranking.
> > >>>>>
> > >>>>> For the final results, participants of this shared task will be
> provided with an MIX_text only and are asked to reproduce the ranking of
> each student relative to all students in a collection (i.e. within the
> test set).
> > >>>>>
> > >>>>> The data is delivered in two files, one containing participant
> data, the other containing sample data, each being connected by a student
> ID. The rank in the sample data reflects the averaged performance relative
> to all instances within the collection (i.e. within train / test / dev),
> which is to be reproduced for the task.
> > >>>>>
> > >>>>> - Subtask 2: Classification of the Operant Motive Test (OMT).
> Operant motives are unconscious intrinsic desires that can be measured by
> implicit or operant methods, such as the Operant Motive Test (OMT)(Kuhl and
> Scheffer, 1999). During the OMT, participants are asked to write freely
> associated texts to provided questions and images. An exemplary
> illustration can be found in the Data area. Trained psychologists label
> these textual answers with one of four motives. The identified motives
> allow psychologists to predict behavior, long-term development, and
> subsequent success.
> > >>>>>
> > >>>>> For this shared task, participants will be provided with an
> OMT_text and are asked to predict the motive and level of each instance.
> The success will be measured with the macro-averaged F1-score.
> > >>>>>
> > >>>>>
> > >>>>> Data
> > >>>>> Since 2011, the private university of applied sciences
> NORDAKADEMIE has conducted a college aptitude test for applicants, in which
> participants state their high school performance and take an IQ test and a
> psychometric test called the Motive Index (MIX). The MIX measures
> so-called implicit or operant motives by having participants answer
> questions about images like the one displayed below, such as "who is the
> main person and what is important for that person?" and "what is that
> person feeling?". Furthermore, participants answer the question of
> what motivated them to apply to the NORDAKADEMIE.
> > >>>>>
> > >>>>> The data consists of a unique ID per entry, one ID per
> participant, of the applicants' major and high school grades as well as IQ
> scores with one textual expression attached to each entry. High school
> grades and IQ scores are z-standardized for privacy protection. In total
> there are 2,595 participants, who produced 77,850 unique MIX answers. The
> shortest textual answers consist of 3 words, the longest of 42 and on
> average there are roughly 15 words per textual answer with a standard
> deviation of 8 words.
> > >>>>>
> > >>>>> The available data set has been collected and hand-labeled by
> researchers of the University of Trier. More than 14,600
> volunteers participated in answering questions to 15 provided images. The
> pairwise annotator intraclass correlation was r = .85 on the Winter scale
> (Winter, 1994). The length of the answers ranges from 4 to 79 words with a
> mean length of 22 words and a standard deviation of roughly 12 words.
> > >>>>>
> > >>>>> Submissions for the validation set via the Codalab page are
> accepted and published on a leaderboard from January 1st. From May 1st, we
> will start the final evaluation phase of the task by providing the gold
> labels of the validation set, which can be used as additional training
> data. Additionally, the test set samples will be provided, for which we
> accept submissions until June 1st.
> > >>>>>
> > >>>>> More information can be found on the task's webpage:
> https://www.inf.uni-hamburg.de/en/inst/ab/lt/resources/data/germeval-2020-psychopred.html
> > >>>>>
> > >>>>>
> > >>>>> Important Dates
> > >>>>> - 01-Dec-2019: Release of trial data and systems
> > >>>>> - 01-Jan-2020: Release of training data (train + validation)
> > >>>>> - 08-May-2020: Release of test data
> > >>>>> - 01-Jun-2020: Final submission of test results
> > >>>>> - 03-Jun-2020: Submission of description paper
> > >>>>> - 04-11-Jun-2020: Peer reviewing: participants are expected to
> review other participants' system descriptions
> > >>>>> - 12-Jun-2020: Notification of acceptance and reviewer feedback
> > >>>>> - 18-Jun-2020: Camera-ready deadline for system description papers
> > >>>>> - 23-Jun-2020: Workshop in Zurich, Switzerland at the KONVENS 2020
> and SwissText joint conference
> > >>>>>
> > >>>>> The shared task will be accompanied by a pre-conference workshop
> of the Conference on Natural Language Processing ("Konferenz zur
> Verarbeitung natürlicher Sprache", KONVENS) hosted on June 23, 2020, at
> Zürich (https://swisstext-and-konvens-2020.org/).
> > >>>>>
> > >>>>>
> > >>>>> Workshop Proceedings
> > >>>>> Description papers will appear in online workshop proceedings.
> Participants who submit a description paper will be asked to register at
> the workshop and present their system as a poster or in an oral
> presentation (depending on the number of submissions).
> > >>>>>
> > >>>>>
> > >>>>> Organizers
> > >>>>> The shared task is organized by Dirk Johannßen, Chris Biemann,
> Steffen Remus and Timo Baumann from the Language Technology group of the
> University of Hamburg (
> https://www.inf.uni-hamburg.de/en/inst/ab/lt/home.html), as well as David
> Scheffer from the NORDAKADEMIE Elmshorn, Nicola Baumann from the
> Universität Trier and Gudula Ritz from Impart GmbH (Germany).
> > >>>>>
> > >>>>>
> > >>>>> GermEval
> > >>>>> GermEval is a series of shared task evaluation campaigns that
> focus on Natural Language Processing for the German language. GermEval has
> been conducted four times since 2014 in co-location with KONVENS/GSCL
> conferences. For an overview of the currently conducted tasks, visit
> https://swisstext-and-konvens-2020.org/shared-tasks/.
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> --
> > >>>>> Dirk Johannßen
> > >>>>> Universität Hamburg
> > >>>>> Department of Informatics
> > >>>>> Language Technology Group (LT)
> > >>>>> Vogt-Kölln-Straße 30
> > >>>>> 22527 Hamburg
> > >>>>>
> > >>>>> Room: F-412
> > >>>>>
> > >>>>> johannssen at informatik.uni-hamburg.de
> > >>>>> http://lt.informatik.uni-hamburg.de
> > >>>>> http://www.uni-hamburg.de
> > >>>>>
> > >>>>> _______________________________________________
> > >>>>> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> > >>>>> Corpora mailing list
> > >>>>> Corpora at uib.no
> > >>>>> https://mailman.uib.no/listinfo/corpora
> > >>>
> > >>>
> > >>> --
> > >>> Emily M. Bender (she/her)
> > >>> Howard and Frances Nostrand Endowed Professor
> > >>> Department of Linguistics
> > >>> Faculty Director, CLMS
> > >>> University of Washington
> > >>> Twitter: @emilymbender
> > >>>
> > >>>
> > >>
>
>
> --
> Prof. Dr. Detmar Meurers http://purl.org/dm
> Dept. of Linguistics - LEAD Graduate School - SFB 833 - Univ. Tübingen
>
>
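For readers following the evaluation setup in the call for participation quoted above, here is a minimal sketch of the two metrics it names: Subtask 1 sums z-standardized grades and IQ scores, ranks the result, and compares system and gold rankings with the Pearson correlation coefficient; Subtask 2 uses the macro-averaged F1-score. This is not the organizers' scoring script. The function names, the toy numbers, the label names, the "higher value is better" direction, and the tie-breaking by input order are all assumptions made for illustration.

```python
import math
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def ranks(scores):
    """Rank 1 = highest score. Ties are broken by input order (an assumption)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    out = [0] * len(scores)
    for rank, i in enumerate(order, start=1):
        out[i] = rank
    return out


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def macro_f1(gold, predicted):
    """Unweighted mean of per-label F1 over all labels seen in gold or predicted."""
    f1s = []
    for label in sorted(set(gold) | set(predicted)):
        tp = sum(g == label and p == label for g, p in zip(gold, predicted))
        fp = sum(g != label and p == label for g, p in zip(gold, predicted))
        fn = sum(g == label and p != label for g, p in zip(gold, predicted))
        denom = 2 * tp + fp + fn
        f1s.append(2 * tp / denom if denom else 0.0)
    return mean(f1s)


# Toy data, invented for illustration; both columns treated as "higher is better".
grades = [1.0, 2.0, 3.0, 4.0]
iq = [100, 110, 90, 120]

# Subtask 1 style gold ranking: sum of z-scores, then rank.
combined = [g + q for g, q in zip(z_scores(grades), z_scores(iq))]
gold_rank = ranks(combined)

system_rank = [4, 3, 2, 1]  # hypothetical system output
print(gold_rank, pearson(system_rank, gold_rank))

# Subtask 2 style scoring with hypothetical motive labels.
print(macro_f1(["A", "A", "M", "L"], ["A", "M", "M", "L"]))
```

Because the score is a correlation over ranks, it rewards reproducing the ordering of applicants rather than their absolute values, which is why the concerns raised in this thread about what the ordering encodes apply directly to the metric.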