[Corpora-List] Corpora Digest, Vol 153, Issue 15

Michael Pace-Sigge michael.pace-sigge at uef.fi
Tue Mar 10 15:59:26 CET 2020


Hi Shunji,

I guess it will need the time-honoured method of manual transcription, as for all other projects before this one (in particular to deal with overlap, neologisms and non-lexical vocalisations). Speech-to-text software can, however, be useful as an assisting tool. I found that Google's tool deals pretty well with non-native speaker accents, but that does not mean it works as well with NZ English.

vSpeech-to-text API by Google • https://www.google.com/intl/en/chrome/demos/speech.html ü Needs Google Chrome browser

Are you planning to create an updated version of the Wellington Corpus, if I may ask?

M.


> Message: 3
> Date: Tue, 10 Mar 2020 10:46:31 +0900
> From: Shunji Yamazaki <shunji.yamazaki at gmail.com>
> Subject: [Corpora-List] Speech transcribing software
> To: corpora-request at uib.no, corpora at uib.no
>
> Dear fellow corpus linguists,
>
> Can anyone recommend to me software which can accurately transcribe
> speech? I am planning to collect New Zealand English spoken corpora by
> using an IC recorder which records interviewees' speech. I would like to
> use the software to transcribe that spoken data into written form. The
> software does need to be able to deal with the New Zealand accent, but does
> not necessarily have to be free.
>
> Thanks in advance,
>
> Shunji

Michael Pace-Sigge mtlps at me.com Liverpool; Joensuu


> On 10 Mar 2020, at 16:16, "corpora-request at uib.no" <corpora-request at uib.no> wrote:
>
> Today's Topics:
>
> 1. 3rd CFP: The Fifth Arabic NLP Workshop / Shared Task
> Collocated With COLING 2020 (Wajdi Zaghouani)
> 2. 2nd CfP: Disinformation, Hoaxes and Propaganda within Online
> Social Networks and Media (Elsevier - Online Social Networks and
> Media Journal special issue) (Carol Scarton)
> 3. Speech transcribing software (Shunji Yamazaki)
> 4. [1st CfP] SDP at EMNLP 2020: 1st Workshop on Scholarly Document
> Processing and Shared Tasks (SDP 2020) (Muthu Kumar Chandrasekaran)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Tue, 10 Mar 2020 14:12:29 +0300
> From: Wajdi Zaghouani <wajdiz at gmail.com>
> Subject: [Corpora-List] 3rd CFP: The Fifth Arabic NLP Workshop /
> Shared Task Collocated With COLING 2020
> To: Wajdi Zaghouani <wzaghouani at hbku.edu.qa>
>
> (apologies for cross-posting)
>
>
> ==== Call for Papers ====
>
>
>
> The 5th Arabic Natural Language Processing Workshop/Shared Task (WANLP-5
> 2020 <https://sites.google.com/view/wanlp-2020>) will be a full day event
> taking place on September 13, 2020 in Barcelona, Spain. The workshop is
> collocated with COLING 2020 <https://coling2020.org/>.
>
>
> Workshop URL: https://sites.google.com/view/wanlp-2020
>
>
>
> We invite submissions on topics that include, but are not limited to, the
> following:
>
> - Basic core technologies: morphological analysis, disambiguation,
> tokenization, POS tagging, named entity detection, chunking, parsing,
> semantic role labeling, sentiment analysis, Arabic dialect modeling, etc.
>
> - Applications: machine translation, speech recognition, speech
> synthesis, optical character recognition, pedagogy, assistive technologies,
> social media, etc.
>
> - Resources: dictionaries, annotated data, corpus, etc.
>
>
>
> Submissions may include work in progress as well as finished work.
> Submissions must have a clear focus on specific issues pertaining to the
> Arabic language whether it is standard Arabic, dialectal, or mixed. Papers
> on other languages sharing problems faced by Arabic NLP researchers such as
> Semitic languages or languages using Arabic script are welcome.
> Additionally, papers on efforts using Arabic resources but targeting other
> languages are also welcome. Descriptions of commercial systems are welcome,
> but authors should be willing to discuss the details of their work.
>
>
>
> *Shared Task*
>
> Associated with the workshop will be a shared task on Arabic dialect
> identification <https://sites.google.com/view/nadi-shared-task>. This
> shared task targets province-level dialects, and as such will be the first
> to focus on naturally-occurring fine-grained dialect at the sub-country
> level.
>
> Shared Task URL: https://sites.google.com/view/nadi-shared-task
>
> *Important Dates*
>
> - May 20, 2020: Workshop Paper Due Date
> - Jun 24, 2020: Notification of Acceptance
> - Jul 11, 2020: Camera-ready Papers Due
> - Sep 13: Workshop Date
>
>
>
> *Submission Details*
>
> Submissions are expected to be up to 8 pages long plus any number of pages
> for references. Final versions of long papers will be given one additional
> page of content (up to 9 pages) so that reviewers' comments can be taken
> into account. Submissions will be made via Softconf.
>
>
>
> *Submission Link*: https://www.softconf.com/coling2020/WANLP2020/
>
>
>
> *WANLP 2019 Organizing Committee*
>
> *General Chair:* Imed Zitouni
>
> *Ex-General Chair Advisor:* Wassim El Hajj
>
> *Program Chairs:* Muhammad Abdul-Mageed, Houda Bouamor, Fethi Bougares,
> Mahmoud El-Haj
>
> *Publication Chair:* Nadi Tomeh
>
> *Publicity Chair:* Wajdi Zaghouani
>
> *Shared Tasks:* Muhammad Abdul-Mageed, Chiyu Zhang, Nizar Habash and Houda
> Bouamor
>
>
>
> *Advisory Committee:* Muhammad Abdul-Mageed, Ahmed Ali, Hend Alkhalifa,
> Houda Bouamor, Fethi Bougares, Khalid Choukri, Kareem Darwish, Mona Diab,
> Mahmoud El-Haj, Samhaa El-Beltagy, Wassim El Hajj, Nizar Habash, Lamia
> Hadrich Belguith, Hazem Hajj, Walid Magdy, Khaled Shaalan, Kamel Smaili,
> Nadi Tomeh, Wajdi Zaghouani, Imed Zitouni
>
>
>
> For questions or comments regarding WANLP-5 you may contact Wajdi
> Zaghouani: wzaghouani at hbku.edu.qa
>
> ----
>
> *Wajdi Zaghouani, Ph.D.*
>
> *Assistant Professor*
> College of Humanities and Social Sciences
>
> P.O. Box 34110 | Education City | Doha, Qatar
> tel: +974 4454 5601 | mob: +974 33454992
>
> wzaghouani at hbku.edu.qa| Office A141, LAS Building
>
> ------------------------------
>
> Message: 2
> Date: Tue, 10 Mar 2020 11:47:56 +0000
> From: Carol Scarton <carol.scarton at gmail.com>
> Subject: [Corpora-List] 2nd CfP: Disinformation, Hoaxes and Propaganda
> within Online Social Networks and Media (Elsevier - Online Social
> Networks and Media Journal special issue)
> To: corpora at uib.no, ce-pln <Ce-pln at grupos.ufrgs.br>
>
> CALL FOR PAPERS
>
> Elsevier - Online Social Networks and Media Journal
> http://www.journals.elsevier.com/online-social-networks-and-media/
> Special Issue on Disinformation, Hoaxes and Propaganda within Online Social
> Networks and Media
> Submission Deadline: March 20, 2020
>
> **********************************************************************************************************************************************************************
> Manuscripts can be submitted continuously until the deadline. Once a paper
> is submitted, the review process will start immediately. Accepted papers
> will be published continuously in the journal (in the first issue available
> as soon as the paper is accepted). All accepted papers will be listed
> together in an online virtual special issue published in the journal
> website.
> **********************************************************************************************************************************************************************
>
> Online Social Networks and Media natively convey information quickly
> and diffusely. They are 'optimised' for posting and sharing catchy and
> sensationalist news. Problematic messages may range from biased information
> aiming to influence communities and agendas to deliberate lies meant to
> mislead users. Whatever the strategy adopted for spreading false news (such
> as the support of automated accounts and the presence of trolls to inflame
> crowds), it would not be effective if there were no audience willing to
> believe it. The quest for belonging to a community and for reassuring
> answers, and the adherence to one's own viewpoint: these are key factors in
> people's contribution to the success of disinformation diffusion. That is
> why the battle against disinformation must be fought at both the
> technological and the sociological levels.
>
> This special issue seeks high-quality scientific articles (both theoretical
> and experimental) on using Online Social Networks and Media (OSNEM) data
> for the analysis of hoaxes, propaganda, and disinformation fabrication and
> spread on social media, automatic techniques to be embedded in OSNEM
> platforms to block/prevent their diffusion, and countermeasures to dissuade
> people from believing/diffusing them.
>
> Areas of interest include, but are not limited to, the design and
> implementation of methodologies and techniques to detect disinformation
> and/or raise the users' awareness of the threats represented by
> disinformation, including:
>
> - Modelling and analysis techniques to study/predict the dynamics of the
> spread of disinformation;
> - Text mining, graph mining, network and behavioural analyses to detect
> disinformation;
> - Reputation systems to support the detection - or mitigate the effects -
> of disinformation;
> - Disinformation strategies;
> - Understanding and guiding the societal reaction in the presence of
> disinformation;
> - Supervised/unsupervised approaches to let accounts' automation degree
> emerge from the crowd;
> - Computational fact-checking;
> - Detection of information polarization in online communities;
> - Definition and evaluation of novel metrics to verify news veracity;
> - Domain-free approaches to fight disinformation (i.e., context independent
> w.r.t. accounts, news, reviews, posts, tweets, etc.);
> - Interplay between OSNEM social network structures and
> diffusion/prevention of disinformation;
> - Behavioural models behind disinformation diffusion/prevention obtained
> from large-scale OSNEM data.
> - Data-driven approaches, supported by publicly available datasets, are
> more than welcome.
>
> Guest Editors
> Yelena Mejova, ISI Foundation, Turin, Italy
> Marinella Petrocchi, IIT-CNR, Italy
> Carolina Scarton, University of Sheffield, UK
>
>
> *** Instructions for submission ***
> Manuscripts must not have been previously published nor be currently under
> review by other journals or conferences. Papers previously published in
> conference proceedings are eligible for submission if the submitted
> manuscript is a substantial revision and extension of the conference
> version. In this case, authors should indicate the previous publication(s)
> in the cover letter and are also required to submit their published
> conference article(s) and a summary document explaining the enhancements
> made in the journal version.
>
> The submission website for this journal is located at
> https://www.editorialmanager.com/osnem/default.aspx.
>
> Please select "VSI:Disinformation" when you reach the "Article Type" step
> in the submission process. To ensure that all manuscripts are correctly
> identified, for consideration by the special issue, the authors should
> indicate in the cover letter that the manuscript has been submitted for the
> special issue on Disinformation, Hoaxes and Propaganda within Online Social
> Networks and Media.
>
> For further information, please contact the guest editors at
> yelena.mejova at gmail.com
> marinella.petrocchi at iit.cnr.it
> c.scarton at sheffield.ac.uk
>
> --
> *Carolina Scarton*
> Academic Fellow
> Department of Computer Science
> University of Sheffield
> http://staffwww.dcs.shef.ac.uk/people/C.Scarton/
>
> ------------------------------
>
> Message: 4
> Date: Thu, 5 Mar 2020 00:12:54 -0800
> From: Muthu Kumar Chandrasekaran <cmkumar087 at gmail.com>
> Subject: [Corpora-List] [1st CfP] SDP at EMNLP 2020: 1st Workshop on
> Scholarly Document Processing and Shared Tasks (SDP 2020)
> To: corpora at uib.no
> Cc: "Jie Tang \(THU\)" <jietang at tsinghua.edu.cn>, "Mayr-Schlegel,
> Philipp" <Philipp.Mayr-Schlegel at gesis.org>, Bonnie Webber
> <bonnie at inf.ed.ac.uk>, Dominika <d.tkaczyk at gmail.com>, Michal
> Shmueli-Scheuer <shmueli at il.ibm.com>, David Konopnicki
> <davidko at il.ibm.com>, "Robert M. Patton" <pattonrm at ornl.gov>, Dasha
> Herrmannova <d.herrmannova at gmail.com>, Peter Knoth
> <petr.knoth at open.ac.uk>, Eduard Hovy <ehovy at andrew.cmu.edu>, Guy
> Feigenblat <GUYF at il.ibm.com>, Alex Wade <awade at chanzuckerberg.com>,
> "Giles, Clyde Lee" <clg20 at psu.edu>, Ed Fox <fox at vt.edu>, Kuansan Wang
> <Kuansan.Wang at microsoft.com>, Anita de Waard <a.dewaard at elsevier.com>
>
> Call for papers:
> You are invited to participate in the 1st Workshop on Scholarly Document
> Processing (SDP 2020) to be held in conjunction with the 2020 Conference in
> Empirical Methods in Natural Language Processing (EMNLP 2020) on November
> 11 or 12 in Punta Cana, Dominican Republic.
>
> The workshop will consist of a research track and a shared task track. The
> shared task track includes the 6th edition of the CL-SciSumm shared task
> (https://github.com/WING-NUS/scisumm-corpus) and two new summarization tasks
> -- CL-LaySumm and LongSumm -- geared towards easier access to scientific
> methods and results.
>
>
> https://twitter.com/SDProc/status/1235405786068602880
>
> The tentative submission deadline is July 15, 2020.
>
> SDP is a continuation of the BIRNDL
> (https://philippmayr.github.io/BIRNDL-WS/) and WOSP
> (https://wosp.core.ac.uk/) workshop series.
>
> Workshop Date and Venue: November 11/12, Punta Cana, Dominican Republic
>
> Website: https://ornlcda.github.io/SDProc/
>
> ** Introduction **
>
> In addition to the long-standing challenge faced by scholars of keeping up
> with the growing literature in their own and related fields, they must now
> compete with malign pseudo-science and dis-information in informing public
> policy and behavior. This has stimulated workshops and research focused on
> enhancing search, retrieval, summarization, and analysis of scholarly
> documents. However, the general research community on scholarly document
> processing remains fragmented, and efforts towards natural language
> understanding of scholarly text, which is central to vastly improving all
> of these downstream applications, are not widespread.
>
> To address these gaps, we, the organizers of BIRNDL and WOSP workshops,
> propose the first Workshop on Scholarly Document Processing. We seek to
> reach out to the broader NLP and AI/ML community to pool the distributed
> efforts to improve scholarly document understanding and enable intelligent
> access to the published research. The goal of SDP is two-fold: to increase
> collaboration between communities interested in leveraging knowledge stored
> in scientific literature and data and to establish SDP as the
> single-focused primary venue for the field.
>
> We seek to appeal to the mainstream NLP and ML community working on SDP
> tasks -- which are NLP tasks -- to publish at SDP as we seek to establish SDP
> as the integrated premier venue. We have established a steering committee
> to help us turn SDP into a conference in the forthcoming years.
>
> ** Topics of Interest **
>
> We invite submissions from all communities interested in natural language
> processing, information retrieval, and data mining problems in scientific
> documents; and in processing scientific documents for easier access to
> various audiences. The topics of interest include, but are not limited to:
>
>
> - Information extraction, text mining and parsing scholarly literature
> - Reproducibility and peer review
> - Lay Summarization (i.e., summaries created for non-experts) of
> individual and collections of scholarly documents
> - Discourse modeling and argument mining
> - Summarization and question-answering for scholarly documents
> - Semantic and network-based indexing, search and navigation in structured
> text
> - Graph analysis/mining including citation and co-authorship networks
> - Analysing and mining of citation contexts for document understanding and
> retrieval
> - New scholarly language resources and evaluation
> - Connecting and interlinking publications, data, tweets, blogs or their
> parts
> - Disambiguation, metadata extraction, enrichment, and data quality
> assurance for scholarly documents
> - Bibliometrics, scientometrics, and altmetrics approaches and applications
> - Other aspects of scientific workflows including open access/science, and
> research assessment
> - Infrastructures for accessing scientific publications and/or research
> data
>
>
> ** The 6th Computational Linguistics Scientific Document Summarization
> Shared Task (CL-SciSumm 2020) **
>
> (Organisers: Muthu Kumar Chandrasekaran)
>
> CL-SciSumm is the first medium-scale shared task on scientific document
> summarization, with over 500 annotated documents. Last year's CL-SciSumm
> shared task introduced large scale training datasets, both annotated from
> ScisummNet and auto-annotated. For the task, systems were provided with a
> Reference Paper (RP) and 10 or more Citing Papers (CPs) that all contain
> citations to the RP, which they used to summarise the RP. The summaries were
> evaluated against abstracts and human-written summaries using ROUGE.
>
> The task is defined as follows:
>
> - Given: A topic consisting of a Reference Paper (RP) and Citing Papers
> (CPs) that all contain citations to the RP. In each CP, the text spans
> (i.e., citances) have been identified that pertain to a particular citation
> to the RP.
> - Task 1A: For each citance, identify the spans of text (cited text spans)
> in the RP that most accurately reflect the citance. These are of the
> granularity of a sentence fragment, a full sentence, or several consecutive
> sentences (no more than 5).
> - Task 1B: For each cited text span, identify what facet of the paper it
> belongs to, from a predefined set of facets.
> - Task 2 (optional bonus task): Finally, generate a structured summary of
> the RP from the cited text spans of the RP. The length of the summary
> should not exceed 250 words.
>
> This year, CL-SciSumm '20 will have two new tracks: LaySumm and LongSumm.
>
> ** CL-LaySumm 2020: The 1st Computational Linguistics Lay Summary Challenge
> Shared Task **
>
> (Organisers: Anita De Waard, Ed Hovy)
>
> To ensure and increase the relevance of science for all of society and not
> just a small group of niche practitioners, researchers have been
> increasingly tasked by funders and publishers to outline the scope of their
> research for the general public by writing a summary for a lay audience, or
> lay summary. The LaySumm summarization task considers automating this
> responsibility, by enabling systems to automatically generate lay
> summaries. A lay summary explains, succinctly and without using technical
> jargon, what the overall scope, goal and potential impact of a scientific
> paper is.
>
> The corpus for this task will comprise full-text papers with lay summaries,
> in a variety of domains, and from a number of journals. Elsevier will make
> available a collection of Lay Summaries from a multidisciplinary collection of
> journals, as well as the abstracts and full text of these journals.
>
> The task is defined as follows:
>
> - Given: A full-text paper, its Abstract, and a Lay Summary of a given
> paper
> - Task: For each paper, generate a Lay Summary of the specified length
>
>
> Evaluation
>
> The Lay Summary Task will be scored by using several ROUGE metrics to
> compare the system output and the gold standard Lay Summary. As a follow-up
> to the intrinsic evaluation, we will crowdsource a number of automatically
> generated lay summaries to a panel of judges and a lay audience. Details of
> the crowdsourcing evaluation will be announced with the sharing of the
> final test corpus on July 1st.
>
> All nominated entries will be invited to publish a paper in Open Access
> (Author-Payment Charges will be waived) in a selected Elsevier publication.
> Authors will be asked to provide an automatically generated lay summary of
> their paper, together with their contribution.
>
> ** LongSumm 2020: Shared Task on Generating Long Summaries for Scientific
> Documents **
>
> (Organisers: Michal Shmueli-Scheuer, Guy Feigenblat)
>
> Most of the work on scientific document summarization focuses on generating
> relatively short summaries (250 words or less). While such a length
> constraint can be sufficient for summarizing news articles, it is far from
> sufficient for summarizing scientific work. In fact, such a short summary
> resembles an abstract more than a summary that aims to cover all the
> salient information conveyed in a given text. Writing such summaries
> requires expertise and a deep understanding of a scientific domain, as can
> be found in some researchers' blogs.
>
> The LongSumm task opted to leverage blogs created by researchers in the NLP
> and Machine learning communities and use these summaries as reference
> summaries to compare the submissions against.
>
> The corpus for this task includes a training set that consists of 1705
> extractive summaries and around 700 abstractive summaries of NLP and
> Machine Learning scientific papers. These are drawn from papers based on
> video talks from associated conferences (Lev et al. 2019 TalkSumm
> <https://arxiv.org/abs/1906.01351>) and from blogs created by NLP and ML
> researchers. In addition, we create a test set of abstractive summaries.
> Each submission is judged against one reference summary (gold summary) on
> ROUGE and should not exceed 600 words.
>
> ** Submission Information **
>
> Authors are invited to submit full and short papers with unpublished,
> original work. Submissions will be subject to a double-blind peer review
> process. Accepted papers will be presented by the authors at the workshop
> either as a talk or a poster. All accepted papers will be published in the
> workshop proceedings.
>
> The submissions should be in PDF format and anonymized for review.
>
> All submissions must be written in English and follow the EMNLP 2020
> formatting requirements <https://2020.emnlp.org/call-for-papers>. EMNLP
> will make it available soon.
>
> Long paper submissions: up to 8 pages of content, plus unlimited references.
>
> Short paper submissions: up to 4 pages of content, plus unlimited
> references.
>
> Final versions of accepted papers will be allowed 1 additional page of
> content so that reviewer comments can be taken into account.
>
>
> Submission Website: Submission is electronic, using the Softconf START
> conference management system. EMNLP will make it available soon.
>
> Shared Task registration: Participants of all shared tasks need to register
> here
> <https://docs.google.com/forms/d/e/1FAIpQLScfHzByrog-k299qBuCp3SbPWcb905_kmOWMvHpDH57VLpVrg/viewform>
> before March 31st, 2020
>
> ** Important Dates **
>
> Research track:
>
> Submission deadline - July 15, 2020
>
> Notification of Acceptance - August 17, 2020
>
> Camera-ready submission due - August 31, 2020
>
> Workshop - November 11 or 12, 2020
>
> Shared task track:
>
> Training set release - Feb 15, 2020
>
> Deadline for registration and short systems description - March 31, 2020
>
> Test set release (Blind) - July 1, 2020
>
> System runs due - August 1, 2020
>
> Preliminary system reports due - August 16, 2020
>
> Camera-ready submission due - August 31, 2020
>
> Workshop - November 11 or 12, 2020
>
> The dates are at this stage indicative only and can change.
>
> ** Keynote Speakers **
>
>
> 1. Kuansan Wang <https://www.microsoft.com/en-us/research/people/kuansanw/>,
> Managing Director, Microsoft Research Outreach Academic Services
> 2. The second keynote speaker will be announced shortly
>
>
> ** Journal Extension **
>
> In the past, the accepted authors were invited to submit an extended
> version of their work to a special issue of a selected journal. The
> organizers are currently in the process of identifying appropriate journals
> to host a similar special issue this year. Relevant updates including
> topics and requirements for this special issue will be shared on the
> workshop website in due time.
>
> ** Organizing Committee **
>
> Muthu Kumar Chandrasekaran, Amazon, Seattle, USA
>
> Anita de Waard, Elsevier, USA
>
> Guy Feigenblat, IBM Research AI, Haifa Research Lab, Israel
>
> Dayne Freitag, SRI International, San Diego, USA
>
> Tirthankar Ghosal, Indian Institute of Technology Patna, India
>
> Drahomira Herrmannova, Oak Ridge National Laboratory, USA
>
> Eduard Hovy, Research Professor, LTI, Carnegie Mellon University, USA
>
> Petr Knoth, Open University, UK
>
> David Konopnicki, IBM Research AI, Haifa Research Lab, Israel
>
> Philipp Mayr, GESIS - Leibniz Institute for the Social Sciences, Germany
>
> Robert M. Patton, Oak Ridge National Laboratory, USA
>
> Michal Shmueli-Scheuer, IBM Research AI, Haifa Research Lab, Israel
>
> Dominika Tkaczyk, Crossref, UK
>
> ** Steering Committee **
>
> Edward Fox, Professor, Department of Computer Science and Director, Digital
> Library Research Laboratory, Virginia Tech
>
> C. Lee Giles, David Reese Professor, College of Information Sciences and
> Technology, Pennsylvania State University
>
> Min-Yen Kan, Associate Professor, School of Computing, National University
> of Singapore
>
> Dragomir Radev, A. Bartlett Giamatti Professor of Computer Science, Yale
> University
>
> Jie Tang, Professor and Associate Chair of the Department of Computer
> Science and Technology, Tsinghua University
>
> Alex Wade, Group Technical Program Manager, Chan Zuckerberg Initiative
>
> Kuansan Wang, Managing Director, Microsoft Research Outreach Academic
> Services
>
> Bonnie Webber, Professor, School of Informatics, University of Edinburgh
>
> ** Programme Committee **
>
>
> 1. Akiko Aizawa, National Institute of Informatics, Japan
> 2. Colin Batchelor, Cambridge, UK
> 3. Joeran Beel, Trinity College Dublin, Ireland
> 4. Katarina Boland, GESIS, Germany
> 5. Guillaume Cabanac, University of Toulouse, France
> 6. Cornelia Caragea, University of Illinois at Chicago, US
> 7. Zeljko Carevic, GESIS, Germany
> 8. Tanmoy Chakraborty, IIIT Delhi, India
> 9. Richard Eckart de Castilho, TU Darmstadt, Germany
> 10. Helena Deus <https://www.linkedin.com/in/helenadeus>, Elsevier Labs
> 11. Daniel Duma, University of Edinburgh, UK
> 12. Ed A. Fox, Virginia Tech, USA
> 13. Norbert Fuhr, University of Duisburg, Germany
> 14. C. Lee Giles, Penn State University, USA
> 15. Bela Gipp, University of Wuppertal, Germany
> 16. Goran Glavas, University of Mannheim, Germany
> 17. Hannaneh Hajishirzi, University of Washington, USA
> 18. Monica Ihli, University of Tennessee Knoxville, USA
> 19. Ameni Kacem, iCOVER, France
> 20. Roman Kern, Graz University of Technology, Austria
> 21. Atsushi Keyaki, Denso IT Laboratory Inc, Tokyo, Japan
> 22. Martin Klein, Los Alamos National Laboratory, USA
> 23. Ilia Kuznetsov, Techn. Univ. Darmstadt, Germany
> 24. Birger Larsen, Aalborg University, Denmark
> 25. Anne Lauscher, University of Mannheim, Germany
> 26. Paolo Manghi, National Research Council of Italy, Italy
> 27. Bruno Martins, University of Lisbon, Portugal
> 28. Norman Meuschke, University of Wuppertal, Germany
> 29. Diego Molla-Aliod, Macquarie University, Australia
> 30. Preslav Nakov, Qatar Computing Research Inst., Qatar
> 31. Federico Nanni, University of Mannheim, Germany
> 32. Jumana Nassour, Ben-Gurion University, Israel
> 33. Paco Nathan, Derwen Inc., US
> 34. Manabu Okumura, Tokyo Institute of Technology, Japan
> 35. Francesco Osborne, Open University, UK
> 36. Arzucan Ozgur, Bogazici University, Turkey
> 37. Sujit Pal, Elsevier Labs
> 38. Rajesh Piryani, South Asian University, India
> 39. Silvio Peroni, University of Bologna, Italy
> 40. Animesh Prasad, Amazon, Cambridge, UK
> 41. Horacio Saggion, Universitat Pompeu Fabra, Spain
> 42. Angelo Antonio Salatino, The Open University, UK
> 43. Philipp Schaer, TH Cologne, Germany
> 44. Vivek Kumar Singh, Banaras Hindu University, India
> 45. Kazunari Sugiyama, Kyoto University, Japan
> 46. Saeed Ul Hassan, IT University, Pakistan
> 47. Lucy Vanderwende, University of Washington, USA
> 48. Stephen Wan, CSIRO, Australia
> 49. Bonnie Webber, University of Edinburgh, UK
> 50. Ivana Williams, Chan Zuckerberg Initiative, USA
> 51. Dietmar Wolfram, University of Wisconsin-Milwaukee, USA
> 52. Jian Wu, Old Dominion University, USA
>
>
> More details available on the workshop website:
>
> https://ornlcda.github.io/SDProc/
>
> With kind regards,
>
> SDP 2020 organizing committee
>
> Muthu Kumar Chandrasekaran
> Research Scientist II
> Amazon, Day 1
> 2121 7th Ave,
> Seattle, WA 98121
> LinkedIn <https://www.linkedin.com/in/muthukumarc87/> | Google Scholar
> Profile <https://scholar.google.com/citations?user=TNXPTz0AAAAJ&hl=en>
>
>
> End of Corpora Digest, Vol 153, Issue 15
> ****************************************


