[Corpora-List] SemEval discussion at NAACL 2019

Jelena Mitrovic jecovit at gmail.com
Mon Jun 17 13:27:28 CEST 2019

Dear Ted,

Thank you for starting this discussion and reminding us all about the issues that were raised at SemEval.

This was my first-time participation in SemEval, and I did indeed think that only the top scoring systems should be described in papers. Ours was No. 10, so I thought, OK, we barely made it :) I was also very surprised to hear that some of the top-scoring teams in some tasks did not write a paper at all, but I understand that some students do not wish to go into academia and see no point in learning how to write an academic paper.

This is certainly an important discussion and I do hope that others join in.

Best wishes, Jelena

On Mon, 17 Jun 2019 at 13:03, Ted Pedersen <tpederse at d.umn.edu> wrote:

> Here are some very interesting followup thoughts from Laura Dietz.
> ------------------------
> My student participated in SemEval this year -- however traditionally
> SemEval is not my community. I did however participate in similar
> evaluations, CLEF, TAC KBP, and I am organizing a task at TREC.
> At TREC and TAC, the leaderboard is only revealed at the workshop. TREC
> organizers purposefully decided to not have a live leaderboard.
> Participating teams are required to submit a workshop paper (no page
> limit) before they know their rank. This has a nice side effect that you
> get more system descriptions and a deeper analysis on the performance of
> the system --- not in comparison to the leaderboard.
> Regarding anonymous teams: It is more likely that these are individual
> grad students that were messing with the data, but were too shy to raise
> their hand. My own student would not have submitted a paper had I not
> strongly encouraged him. Sadly he could not travel by himself, but was
> represented by another student in my lab. I try to teach them the
> importance of **community** in research community, but it's sometimes
> difficult to get students to jump.
> At the last TREC workshop, I had one participant who was at the far end
> of the leaderboard. It took some convincing from my side and
> confirmation that we want to hear about all participating systems, not
> just the top performers. Another anecdote is about a team who were
> last the previous year and mid-range last year. They had the right
> approach, but ruined their performance with some "stupid" mistakes.
> At the workshop we helped the team "debug" their system (wrong tokenizer
> & only binary predictions --- for a ranking task). It turns out their
> approach can outperform the best team by 200% (!!!)
> I explain to my participants that the shared task is to figure out together
> what works and what doesn't. We can always learn from a system, no
> matter if it's a high or low performer. Sometimes it takes combining
> a set of ideas to really make progress in a domain.
> Cheers,
> Laura
> ---
> Ted Pedersen
> http://www.d.umn.edu/~tpederse
> On Sun, Jun 16, 2019 at 9:50 AM Ted Pedersen <tpederse at d.umn.edu> wrote:
> >
> > Greetings all,
> >
> > I posted this to various SemEval lists and Twitter, but was also
> > encouraged to send it here (to Corpora). Apologies if you've seen this
> > before!
> >
> > -----------------
> >
> > The SemEval workshop took place during the last two days of NAACL 2019
> > in Minneapolis, and included quite a bit of discussion both days about
> > the future of SemEval. I enjoyed this conversation (and participated
> > in it), so wanted to try and share some of what I think was said.
> >
> > A few general concerns were raised about SemEval - one of them is that
> > many teams participate without then going on to submit papers
> > describing their systems. Related to this is that there are also
> > participants who never even really identify themselves to the task
> > organizers, and in effect remain anonymous throughout the event. In
> > both cases the problem is that in the end SemEval aspires to be an
> > academic event where participants describe what they have done in a
> > form that can be easily shared with other participants (and papers are
> > a good way to do that).
> >
> > My own informal estimate is that maybe half of the participating teams
> > submit a paper, and then half of those go on to attend the workshop
> > and present a poster. So if you see a task with 20 teams, perhaps 10
> > of them submit a paper and maybe 5 present a poster. SemEval is
> > totally ok with teams that submit a paper but do not attend the
> > workshop to present a poster. That has long been the case, and this
> > was confirmed again in Minneapolis. The goal then is to get more
> > participating teams to submit papers. There was considerable
> > discussion on the related issues of why don't more teams submit
> > papers, and how can we encourage (or require) the submission of more
> > papers?
> >
> > One point made is that SemEval participants are sometimes new to our
> > community and so don't have a clear idea of what a "system description
> > paper" should consist of, and so might not submit papers because they
> > believe it will be too difficult or time consuming, or they just don't
> > know what to do and fear immediate rejection. There was considerable
> > support for the idea of providing a paper template that would help new
> > authors know what is expected.
> >
> > It was also observed that when teams have disappointing results (not
> > top ranked) they might feel like a paper isn't really necessary or
> > might even be a bad idea. This tied into a larger discussion about the
> > reality that some (many?) participants in SemEval tasks focus on their
> > overall ranking and less on understanding the problem that they are
> > working on. There was discussion at various points about how to get
> > away from the obsession with the leaderboard, and to focus more on
> > understanding the problem that is being presented by the task. A
> > carefully done analysis of a system that doesn't perform terrifically
> > well can shed important light on a problem, while simply describing a
> > model and hyperparameter settings that might lead to high scores may
> > not be too useful in understanding that same problem.
> >
> > One idea was for each task to award a "best analysis paper" and
> > potentially award the authors of that paper an oral presentation
> > during the workshop. Typically nearly all presentations at SemEval are
> > posters, and so the oral slots are somewhat coveted and are often (but
> > not always) awarded to the team with the highest rank. Shifting the
> > focus of prizes and presentations away from the leaderboard might tend
> > to encourage more participants to carry out such analysis and submit
> > papers.
> >
> > That said, a carefully done analysis paper can be fairly time
> > consuming to create and may require more pages than the typical 4 page
> > limit. It was suggested that we be more flexible with page limits, so
> > that teams could submit fairly minimal descriptions, or go into more
> > depth on their systems and analysis. A related idea was to allow
> > analysis papers to be submitted to the SemEval year X+1 workshop based
> > on system participation in year X. This might be a good option to
> > provide since SemEval timelines tend to be pretty tight as it stands.
> >
> > Papers sometimes tend to focus more on the horse race or bake off (and
> > so analysis is limited to reporting a rank or score in the task).
> > However, if scores or rankings were not released until after papers
> > were submitted then this could certainly change the nature of such
> > papers. In addition, a submitted paper could be made a requirement for
> > appearing on the leaderboard.
> >
> > There is of course a trade off between increasing participation and
> > increasing the number of papers submitted. If papers are made into
> > requirements then some teams won't participate. There is perhaps a
> > larger question for SemEval to consider, and that is how to increase
> > the number of papers without driving away too many participants.
> >
> > Another observation that was made was that some teams never identify
> > themselves and so participate in the task but are never really
> > involved beyond being on the leaderboard. These could of course be
> > shadow accounts created by teams who are already participating (to get
> > past submission limits?), or they could be accounts created by teams
> > who may only want to identify themselves if they end up ranking
> > highly. Should anonymous teams be allowed to participate? I don't know
> > that there was a clear answer to that question. While anonymous
> > participation could be a means to game the system in some way, it
> > might also be something done by those who are participating contrary
> > to the wishes of an advisor or employer. If teams are reluctant to
> > identify themselves for fear of being associated with a "bad" score,
> > perhaps it could be possible for teams to remove scores from the
> > leaderboard.
> >
> > To summarize, I got the sense that there is some interest in both
> > increasing the number of papers submitted to SemEval, and also in
> > making it clear that there is more to the event than the leaderboard.
> > I think there were some great ideas discussed, and I fear I have done
> > a somewhat imperfect job of trying to convey those here, but I don't
> > want to let the perfect be the enemy of the good enough, so I'm going
> > to go ahead and send this around and hope that others who have ideas
> > will join in the conversation in some way.
> >
> > Cordially,
> > Ted
> >
> > PS Emily Bender pointed out that the following paper overlaps with some of
> > the issues mentioned in my summary. I'd strongly encourage all SemEval
> > organizers and participants to read through this; it is very much on target
> > and presents some nice ideas about how to think about shared tasks.
> >
> > https://aclweb.org/anthology/papers/W/W17/W17-1608/
> >
> > ---
> > Ted Pedersen
> > http://www.d.umn.edu/~tpederse

-- 
Dr. Jelena Mitrović
Postdoctoral Research Fellow
Fakultät für Informatik und Mathematik
Universität Passau / ITZ / Raum 114
Innstr. 43
94032 Passau
+49 851 509 3395

jelena.mitrovic at uni-passau.de
www.uni-passau.de