On Thu, Dec 5, 2019 at 8:06 AM Vidas Daudaravičius <vidas.daudaravicius at vtex.lt> wrote:
> Dear shared task organizers,
>
> The discussion is timely and important. My highest concerns are:
> - Participants in any shared task need to decide whether to take part in
> or to skip the shared task. The announcement of the shared task leaves
> us with too many unexplained ethical and organizational questions:
> 2,595 out of 14,600 participants were selected. How were they selected?
> Does the selection produce bias? Probably yes. Do the organizers have
> permission from the parents of high school students to collect data?
> - And yes, we are afraid of being ranked as in the novel 1984. This raises
> many more concerns than the Native Language Identification shared task. It
> is good to have discussions/proposals in advance for shared tasks that
> might have ethical issues.
> Explanations of the transparency, privacy, and ethics issues would help
> participants and other interested researchers not to be so emotional and
> critical.
>
> All the best with organizing the shared task,
>
> Vidas Daudaravicius
>
>
> On 04/12/2019 15:08, Dirk Johannßen wrote:
> > The data consists of a unique ID per entry, one ID per participant, of
> > the applicants' major and high school grades as well as IQ scores with
> > one textual expression attached to each entry. High school grades and
> > IQ scores are z-standardized for privacy protection. In total there
> > are 2,595 participants, who produced 77,850 unique MIX answers. The
> > shortest textual answers consist of 3 words, the longest of 42 and on
> > average there are roughly 15 words per textual answer with a standard
> > deviation of 8 words.
> >
> > The available data set has been collected and hand-labeled by
> > researchers of the University of Trier. More than 14,600 volunteers
> > participated in answering questions to 15 provided images. The
> > pairwise annotator intraclass correlation was r = .85 on the Winter
> > scale (Winter, 1994). The length of the answers ranges from 4 to 79
> > words with a mean length of 22 words and a standard deviation of
> > roughly 12 words.
-- لا أعرف مظلوما تواطأ الناس علي هضمه ولا زهدوا في إنصافه كالحقيقة.....محمد الغزالي "No victim has ever been more repressed and alienated than the truth"
Emad Soliman Nawfal
Indiana University, Bloomington