[Corpora-List] GermEval 2020 Task 1 on the Prediction of Intellectual Ability and Personality Traits from Text: 1st Call for Participation

Nihal Yağmur Aydın nyagmuraydin at gmail.com
Thu Dec 5 03:06:12 CET 2019


Hello,

After joining German academia last year, I started researching the education system there. It is quite true that immigrants are at a clear disadvantage compared with natives when it comes to the selection of students for schools (starting from childhood). This disadvantage is mostly attributed to misclassification based on language use (such as biases tied to prior knowledge of German), as many external and internal organizations have pointed out. This is well documented worldwide. A Google search turns up many articles on it, so there is no reason to deny a known fact.

Moreover, my own experience at a German university shows that people cared only about my personal language use when deciding to terminate my contract. Even though the language in question was English, language use was the main factor in their judgment of my capabilities as a researcher. So I would say there is a strong sensitivity to one's personal language use, even though it has nothing to do with one's potential to advance in, for instance, computer science topics. I think that relying on such personal/subjective measures casts doubt on the education system. As for the psychological aspects, one's psychological state can depend heavily on external conditions, so if you repeated the test with the same people, you might not see the same result. The data tends to change, and it is therefore not easy to infer anything from test results.

Therefore, I agree that linguistic features should not be used to decide important things, such as inferring IQ level from language or correlating the two in some way.

Moreover, psychology has many approaches, so it is hard for psychologists to reach agreement on "any label". This makes the task difficult to evaluate, but it also makes it interesting.

I really wonder about the reliability of the evaluation, as it does not seem universal enough. Anyway, I think that the questions raised by some experienced researchers deserve consideration rather than being labeled "irresponsible".

Best Regards, Nihal.

On Thursday, 5 December 2019, Jacob Eisenstein <jacobe at gmail.com> wrote:
>> If you're serious about societal bias in AI, there is an increasing
>> number of events that you can get involved in and help make a real
>> contribution to address this complex issue. Simply spewing opinions
>> and moral incense into mailboxes seems to be much less beneficial.
>
>
> Hi Detmar, I think that you'll find that Zeerak and Emily have been actively involved in organizing such events. My own contributions are limited to spewing opinions into mailboxes :)
>
> On that note, I think it's perfectly reasonable to respectfully question our colleagues about the ethical implications of research projects that are promoted on this list. I hope that the organizers do not view being asked to address these issues as being "ridiculed" or "shamed." That was certainly not my intent in raising this discussion.
>
> Substantively, I do not agree that concerns about intelligence testing are restricted to the anecdotal or second-hand. The history of intelligence testing is particularly problematic (and not just in the US), and so particular care is necessary. In fact, an enormous amount has already been written on this topic, and it seems fair to ask whether and how the organizers of the shared task plan to engage with this literature. I am also curious about the question that Emily posed: what is the use case that the organizers foresee, and do they plan to test for and address potential biases in this use case?
>
> best,
> -Jacob
>
>>
>> Best regards,
>> Detmar
>>
>>
>> On Wed, Dec 04, 2019 at 10:58:39PM +0100, Zeerak Waseem wrote:
>> > Anecdotally from my own experiences in Denmark and conversations with god knows how many racialised people around Europe (best estimation n ~= 50): the concerns and reasons for concern described (and reasons for strong language) are likely to map to a European context.
>> >
>> > The racial inequalities of the societies play out in the same way (I know Europeans hate admitting to similarities with the US but they are highly similar) with marginalisation of groups of people based on heritage (the specific group(s) depend on the country in question), (forced) ghettoisation, White flight, schools and public infrastructure being poorly maintained (anecdotally: I was told multiple times along the way (directly and indirectly) that I should probably not seek an academic career and I’ve rarely had a conversation with a racialised person who grew up in the global economic north who didn’t have similar experiences), etc. Of course there are also dissimilar systems of thinking between cultures and having multiple cultural backgrounds could influence scores resulting from processing patterns measured in IQ tests.
>> > In short: based on my own experience, what I saw growing up, and others’ experiences, I would be highly surprised if European investigations of correlations between race and IQ showed dissimilarity with the US investigations. And these inequities would likely be built into the systems relying on data such as IQ.
>> >
>> > So yeah, seconding that the strong language may be entirely appropriate (and frankly Europeans should stop thinking that the processes of racial discrimination here are wildly different than in the US).
>> >
>> > Zeerak
>> >
>> > > On 4 Dec 2019, at 22:10, Yannick Versley <yversley at gmail.com> wrote:
>> > >
>> > > Even understanding the German educational system and culture, I'd say that this task should light up the "irresponsible" light in the mind of any person who is (i) reasonably clear-thinking and (ii) familiar with the problems that responsible/ethical AI wants to warn us about. By overselling models and tasks that necessarily (i) lead to a poor fit overall and (ii) are prone to pick up cultural/ethnic background (to pick a less US-centric term than "race") in addition to any informative features, we're lending legitimacy to the use of similar tools that are used as a pseudo-scientific mantle to disguise (essentially) the automation of racial/ethnic/cultural discrimination and biases.
>> > > People (yes, most people) desperately want automated computer decisions that work the same way that biased humans make them but with the aura of objectivity. And as the technological experts it's our duty to call bullshit on that and criticise the flawed tools as well as the processes that lead to their acceptance and/or use in production, rather than capitalizing on the desire for pseudo-science and becoming a helping party in the deception.
>> > >
>> > > Best wishes,
>> > > Yannick
>> > >
>> > >> On Wed, Dec 4, 2019 at 9:24 PM Laura Dietz <dietz at cs.unh.edu> wrote:
>> > >> I think it is unfair to call this task out as "irresponsible" without understanding the German educational system and culture. I understand that Americans have a knee-jerk response, and it is always good to caution experimental setups. However, I would have hoped for a more measured response.
>> > >>
>> > >> Laura Dietz
>> > >>
>> > >>
>> > >>> On 12/4/19 2:00 PM, Emily M. Bender wrote:
>> > >>> Thank you, Jacob, for this reply. This task seems irresponsible/poorly conceived to me. Before designing such a task, I think it is imperative to consider its use cases: When and why would we want to predict IQ scores or high school grades from text? Given the high potential for any such system to learn preexisting biases (themselves the result of structural discrimination in society), what are the likely impacts, especially on already marginalized populations?
>> > >>>
>> > >>> Emily
>> > >>>
>> > >>>> On Wed, Dec 4, 2019 at 10:34 AM Jacob Eisenstein <jacobe at gmail.com> wrote:
>> > >>>> As a community, we should think carefully about whether it is appropriate to work with IQ test results as data, and what the applications of this research might be.
>> > >>>>
>> > >>>> In the United States, there is considerable evidence that IQ tests are racially biased. In the past, courts have excluded IQ tests from educational placement in California for precisely this reason. I wonder if there is research on this topic in the German context.
>> > >>>>
>> > >>>> It is not difficult to imagine that the outcome of this shared task would be a set of technologies that encode spurious correlations between estimates of intelligence and the linguistic features of specific racial groups. If such a system were trained on data that already contains biases, there is a risk that this bias would be not only entrenched but amplified. And even if the IQ test statistics are not themselves biased, an NLP system that predicts IQ from text could introduce bias, if there is an unmeasured confound that is statistically associated with both IQ and race.
>> > >>>>
>> > >>>> I hope that these issues will receive serious consideration from the organizers and participants in the task.
>> > >>>>
>> > >>>> Jacob Eisenstein
>> > >>>>
>> > >>>>> On Wed, Dec 4, 2019 at 8:27 AM Dirk Johannßen <johannssen at informatik.uni-hamburg.de> wrote:
>> > >>>>> GermEval 2020 Task 1 on the Prediction of Intellectual Ability and Personality Traits from Text
>> > >>>>>
>> > >>>>> 1st Call for Participation
>> > >>>>> We invite interested parties from academia and industry to participate in this shared task. Further information can be found here: https://www.inf.uni-hamburg.de/en/inst/ab/lt/resources/data/germeval-2020-psychopred.html .
>> > >>>>>
>> > >>>>> The validity of high school grades as a predictor of academic success is controversial. Researchers have found indications that linguistic features such as function words used in a prospective student's writing perform better in predicting academic success (Pennebaker et al., 2014).
>> > >>>>>
>> > >>>>> During an aptitude test, participants are asked to write freely associated texts in response to provided questions and images. Trained psychologists can predict behavior, long-term development, and subsequent success from those expressions. Paired with an IQ test and provided high school grades, prediction of intellectual ability from a text can be investigated. Such an approach would go beyond plain text classification and could reveal insightful psychological traits.
>> > >>>>>
>> > >>>>> Operant motives are unconscious intrinsic desires that can be measured by implicit or operant methods, such as the Operant Motive Test (OMT) or the Motive Index (MIX). During the OMT and MIX, participants are asked to write freely associated texts in response to provided questions and images. Trained psychologists label these textual answers with one of five motives and corresponding levels. The identified motives allow psychologists to predict behavior, long-term development, and subsequent success. For our task, we provide extensive amounts of textual data from both the OMT and the MIX, paired with IQ and high school grades (MIX) and labels (OMT).
>> > >>>>>
>> > >>>>> With this task, we aim to foster research within this context. This task focuses on classifying German psychological text data for predicting the IQ and high school grades of college applicants as well as performing speaker identification from the same image descriptions.
>> > >>>>>
>> > >>>>>
>> > >>>>> Tasks
>> > >>>>> This shared task consists of two subtasks, described below. Participants are free to participate in either one of them or both.
>> > >>>>>
>> > >>>>> - Subtask 1: Prediction of Intellectual Ability. The task is to predict measures of intellectual ability solely based on text. For this, z-standardized high school grades and IQ scores of college applicants are summed and globally ranked. The goal of this subtask is to reproduce their ranking; systems are evaluated by the Pearson correlation coefficient between system and gold ranking.
>> > >>>>>
>> > >>>>> For the final results, participants of this shared task will be provided with an MIX_text only and are asked to reproduce the ranking of each student relative to all students in a collection (i.e. within the test set).
>> > >>>>>
>> > >>>>> The data is delivered in two files, one containing participant data, the other containing sample data, each being connected by a student ID. The rank in the sample data reflects the averaged performance relative to all instances within the collection (i.e. within train / test / dev), which is to be reproduced for the task.
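
A minimal sketch of the Subtask 1 scoring as the call describes it: z-standardize grades and IQ, sum them, rank the sum, and compare a system ranking with the gold ranking by Pearson correlation. The function names and the toy numbers are hypothetical, and two details are assumptions not stated in the call: that both inputs are oriented so that higher means better, and that ties are broken by input order. This is not the organizers' evaluation script.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def ranks(scores):
    """Rank 1 goes to the highest score; ties keep input order (assumption)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0] * len(scores)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result


def gold_ranking(grades, iq):
    """Sum the z-standardized grades and IQ scores, then rank the sums."""
    combined = [g + q for g, q in zip(z_scores(grades), z_scores(iq))]
    return ranks(combined)


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy data, assumed to be oriented so that higher is better.
gold = gold_ranking([1.0, 2.0, 3.0, 4.0], [90, 100, 110, 120])
system = [4, 3, 1, 2]  # a made-up system ranking for the same four students
score = pearson(system, gold)
```

Because both inputs to `pearson` are rankings here, the resulting value behaves like a rank correlation between the system output and the gold order.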
>> > >>>>>
>> > >>>>> - Subtask 2: Classification of the Operant Motive Test (OMT). Operant motives are unconscious intrinsic desires that can be measured by implicit or operant methods, such as the Operant Motive Test (OMT) (Kuhl and Scheffer, 1999). During the OMT, participants are asked to write freely associated texts in response to provided questions and images. An exemplary illustration can be found in the Data area. Trained psychologists label these textual answers with one of four motives. The identified motives allow psychologists to predict behavior, long-term development, and subsequent success.
>> > >>>>>
>> > >>>>> For this shared task, participants will be provided with an OMT_text and are asked to predict the motive and level of each instance. Performance will be measured with the macro-averaged F1-score.
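
A minimal sketch of the macro-averaged F1-score named above: F1 is computed per label and the per-label scores are averaged without weighting by label frequency. The function name `macro_f1` and the placeholder labels are hypothetical, and averaging over the union of gold and predicted labels is an assumption; the call does not specify the label format or the exact averaging set.

```python
def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 over all labels seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        pairs = list(zip(gold, pred))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the label never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Placeholder labels standing in for combined motive/level classes.
gold_labels = ["A", "A", "B", "C"]
pred_labels = ["A", "B", "B", "C"]
result = macro_f1(gold_labels, pred_labels)
```

Macro averaging gives rare classes the same weight as frequent ones, so a system that ignores infrequent motive/level combinations is penalized more than it would be under accuracy.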
>> > >>>>>
>> > >>>>>
>> > >>>>> Data
>> > >>>>> Since 2011, the private university of applied sciences NORDAKADEMIE has conducted an aptitude test for college applicants, in which participants state their high school performance and take an IQ test and a psychometric test called the Motive Index (MIX). The MIX measures so-called implicit or operant motives by having participants answer questions about images like the one displayed below, such as "who is the main person and what is important for that person?" and "what is that person feeling". Furthermore, those participants answer the question of what motivated them to apply to the NORDAKADEMIE.
>> > >>>>>
>> > >>>>> The data consists of a unique ID per entry, one ID per participant, the applicants' major and high school grades as well as IQ scores, with one textual expression attached to each entry. High school grades and IQ scores are z-standardized for privacy protection. In total there are 2,595 participants, who produced 77,850 unique MIX answers. The shortest textual answers consist of 3 words, the longest of 42, and on average there are roughly 15 words per textual answer with a standard deviation of 8 words.
>> > >>>>>
>> > >>>>> The available data set has been collected and hand-labeled by researchers of the University of Trier. More than 14,600 volunteers participated in answering questions to 15 provided images. The pairwise annotator intraclass correlation was r = .85 on the Winter scale (Winter, 1994). The length of the answers ranges from 4 to 79 words with a mean length of 22 words and a standard deviation of roughly 12 words.
>> > >>>>>
>> > >>>>> Submissions for the validation set via the Codalab page are accepted and published on a leaderboard from January 1st. From May 1st, we will start the final evaluation phase of the task by providing the gold labels of the validation set, which can be used as additional training data. Additionally, the test set samples will be provided, for which we accept submissions until June 1st.
>> > >>>>>
>> > >>>>> More information can be found on the task's webpage: https://www.inf.uni-hamburg.de/en/inst/ab/lt/resources/data/germeval-2020-psychopred.html
>> > >>>>>
>> > >>>>>
>> > >>>>> Important Dates
>> > >>>>> - 01-Dec-2019: Release of trial data and systems
>> > >>>>> - 01-Jan-2020: Release of training data (train + validation)
>> > >>>>> - 08-May-2020: Release of test data
>> > >>>>> - 01-Jun-2020: Final submission of test results
>> > >>>>> - 03-Jun-2020: Submission of description paper
>> > >>>>> - 04-11-Jun-2020: Peer reviewing: participants are expected to review other participants' system descriptions
>> > >>>>> - 12-Jun-2020: Notification of acceptance and reviewer feedback
>> > >>>>> - 18-Jun-2020: Camera-ready deadline for system description papers
>> > >>>>> - 23-Jun-2020: Workshop in Zurich, Switzerland at the KONVENS 2020 and SwissText joint conference
>> > >>>>>
>> > >>>>> The shared task will be accompanied by a pre-conference workshop of the Conference on Natural Language Processing ("Konferenz zur Verarbeitung natürlicher Sprache", KONVENS) hosted on June 23, 2020, at Zürich (https://swisstext-and-konvens-2020.org/).
>> > >>>>>
>> > >>>>>
>> > >>>>> Workshop Proceedings
>> > >>>>> Description papers will appear in online workshop proceedings. Participants who submit a description paper will be asked to register at the workshop and present their system as a poster or in an oral presentation (depending on the number of submissions).
>> > >>>>>
>> > >>>>>
>> > >>>>> Organizers
>> > >>>>> The shared task is organized by Dirk Johannßen, Chris Biemann, Steffen Remus and Timo Baumann from the Language Technology group of the University of Hamburg (https://www.inf.uni-hamburg.de/en/inst/ab/lt/home.html), as well as David Scheffer from the NORDAKADEMIE Elmshorn, Nicola Baumann from the Universität Trier and Gudula Ritz from the Impart GmbH (Germany).
>> > >>>>>
>> > >>>>>
>> > >>>>> GermEval
>> > >>>>> GermEval is a series of shared task evaluation campaigns that focus on Natural Language Processing for the German language. GermEval has been conducted four times since 2014 in co-location with KONVENS/GSCL conferences. For an overview of the currently conducted tasks, visit https://swisstext-and-konvens-2020.org/shared-tasks/.
>> > >>>>>
>> > >>>>>
>> > >>>>>
>> > >>>>> --
>> > >>>>> Dirk Johannßen
>> > >>>>> Universität Hamburg
>> > >>>>> Department of Informatics
>> > >>>>> Language Technology Group (LT)
>> > >>>>> Vogt-Kölln-Straße 30
>> > >>>>> 22527 Hamburg
>> > >>>>>
>> > >>>>> Room: F-412
>> > >>>>>
>> > >>>>> johannssen at informatik.uni-hamburg.de
>> > >>>>> http://lt.informatik.uni-hamburg.de
>> > >>>>> http://www.uni-hamburg.de
>> > >>>>>
>> > >>>>> _______________________________________________
>> > >>>>> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
>> > >>>>> Corpora mailing list
>> > >>>>> Corpora at uib.no
>> > >>>>> https://mailman.uib.no/listinfo/corpora
>> > >>>
>> > >>>
>> > >>> --
>> > >>> Emily M. Bender (she/her)
>> > >>> Howard and Frances Nostrand Endowed Professor
>> > >>> Department of Linguistics
>> > >>> Faculty Director, CLMS
>> > >>> University of Washington
>> > >>> Twitter: @emilymbender
>> > >>>
>> > >>>
>> > >>
>>
>>
>>
>> --
>> Prof. Dr. Detmar Meurers http://purl.org/dm
>> Dept. of Linguistics - LEAD Graduate School - SFB 833 - Univ. Tübingen
>>
>

--

Best Regards, Nihal Yağmur Aydın


