[Corpora-List] Annotation tool for two documents in parallel -- please Help!

Jonathan Reeve jon.reeve at gmail.com
Fri May 5 19:36:57 CEST 2017


Hi Irina,

The simplest solution is probably just to use a text editor with a scroll-lock function (e.g., vim or Emacs). But if you want to know what is missing in one document and present in the other, it sounds like you want something like vimdiff or ediff (Emacs). You can run either of those with a third buffer, scroll-locked to the first two, that serves as your annotation buffer.
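For illustration, here is a minimal command-line sketch of that comparison; the file names and contents below are made up, and vimdiff gives you the same view interactively:

```shell
# Two toy documents, one a rough "transformation" of the other.
printf 'line one\nline two\nline three\n' > doc_a.txt
printf 'line one\nline three\nline four\n' > doc_b.txt

# Side-by-side comparison: '<' marks lines only in doc_a,
# '>' marks lines only in doc_b.
diff -y doc_a.txt doc_b.txt || true

# Interactive equivalent, with a third scroll-locked annotation buffer:
#   vimdiff doc_a.txt doc_b.txt
#   :botright vnew notes.txt
#   :set scrollbind
```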

Best,

Jonathan

corpora-request at uib.no writes:


> Today's Topics:
>
> 1. Re: Annotation tool for two documents in parallel -- please
> Help! (Roman Klinger)
> 2. Postdoc Position on Robots Learning Semantic Description of
> Objects from Web resources, INRIA Sophia Antipolis, France
> (Elena.CABRIO at unice.fr)
> 3. Re: Annotation tool for two documents in parallel -- please
> Help! (Pavel)
> 4. CFP: Special Issue - Terminology 24(1), 2018 - Second call
> for paper (Thierry Hamon)
> 5. Job: Scientific System Developer for Natural Language
> Processing applications, TU Darmstadt (Johannes Daxenberger)
> 6. Job opening at the University of Cambridge: Teaching
> Associate in Computational Linguistics (Anna Korhonen)
> 7. Re: Annotation tool for two documents in parallel -- please
> Help! (Dominique Brunato)
> 8. [CFP] 14th ACS/IEEE International Conference on Computer
> Systems and Applications AICCSA 2017 (NLP Track) (Yannick Parmentier)
> 9. RANLP 2017 - Deadline EXTENDED to 19 May 2017 (Ivelina Nikolova)
> 10. First Call for Participation : EUROLAN SUMMER SCHOOL (Nancy Ide)
> 11. Call for Participation: TAC 2017 Adverse Drug Reaction
> Extraction from Drug Labels (Kirk Roberts)
> 12. Re: WordNet ignores function words ... (Albretch Mueller)
> 13. Assistant/Associate professorships in Machine Learning and
> Data Science (Zeljko Agic)
> 14. Re: WordNet ignores function words ... (John F Sowa)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Tue, 2 May 2017 22:42:23 +0200
> From: Roman Klinger <roman-mailinglists at klinger.xyz>
> Subject: Re: [Corpora-List] Annotation tool for two documents in
> parallel -- please Help!
> To: corpora at uib.no
>
> Hi Irina,
>
> On 02.05.17 18:50, Irina Temnikova wrote:
>> we are looking for an annotation tool with which we can annotate two
>> documents in parallel.
>> Both documents must be visible at the same time, so the annotators can
>> compare them, and both must be open for annotation at the same time.
>> Let's say that one document is a transformation of the other, and we
>> want to
>> know what is missing in one and appearing in the other and vice versa.
>> Our annotators are linguists, with not much expertise in Computer
>> Science, so the tool should be easy to use.
>> We don't think GATE or Brat support this.
>
>
> You could consider building something from scratch with
> http://annotatorjs.org/. If what you want to annotate is supported by
> the plugin, it is straightforward. However, I have never tried to add
> functionality myself.
>
> Best,
>
> Roman
>
>
>
>
>
> ------------------------------
>
> Message: 2
> Date: Wed, 3 May 2017 09:48:09 +0200
> From: Elena.CABRIO at unice.fr
> Subject: [Corpora-List] Postdoc Position on Robots Learning Semantic
> Description of Objects from Web resources, INRIA Sophia Antipolis,
> France
> To: LN at cines.fr
>
> Postdoctoral Position on Robots Learning Semantic Description of Objects from Web resources - ALOOF (Autonomous Learning of the Meaning of Objects) - CHIST-ERA European project
>
> Context:
> The Wimmics team at INRIA is a partner of the ALOOF (Autonomous Learning of the Meaning of Objects) CHIST-ERA European project, whose goal is to enable robots and autonomous systems working with and for humans to exploit the vast amount of knowledge on the Web in order to learn about previously unseen objects involved in human activities, and to use this knowledge when acting in the real world. More precisely, the project scenario consists of an open-ended domestic setting where robots have to find objects.
> In this context, we propose a framework based on "machine reading" to extract formally encoded knowledge from unstructured text. Given the robot domestic setting scenario, we focus on extracting the following information about objects: i) the type of the object, ii) where it is typically located (Basile et al. EKAW 2016), and iii) common semantic frames involving the object (Basile et al. ECAI demo paper, Shah et al. AnSWeR).
> Our approach combines linguistic and semantic analysis of natural language with entity linking and formal reasoning to create a knowledge base of common sense knowledge. The extracted knowledge is represented as RDF triples. We leverage curated resources (e.g. DBpedia, BabelNet) and learn from the unstructured Web when a knowledge gap occurs.
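As an illustration only (not the project's actual code; the identifiers and namespaces below are hypothetical), the triple representation described above can be sketched in plain Python:

```python
# Each extracted fact is a (subject, predicate, object) triple,
# here stored as plain tuples rather than via an RDF library.
triples = {
    ("dbpedia:Cup", "aloof:typicalLocation", "dbpedia:Kitchen"),
    ("dbpedia:Cup", "rdf:type", "aloof:DomesticObject"),
}

def typical_location(obj, kb):
    """Return known locations of obj, or None to signal a knowledge gap."""
    locs = [o for (s, p, o) in kb if s == obj and p == "aloof:typicalLocation"]
    return locs or None

print(typical_location("dbpedia:Cup", triples))   # known fact
print(typical_location("dbpedia:Fork", triples))  # knowledge gap: learn from the Web
```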
>
> Job description:
> We are looking for a Postdoctoral researcher with a background in Knowledge Representation and Reasoning (in particular Semantic Web and Linked Data) to join the Inria WIMMICS team (http://wimmics.inria.fr).
> The goal of this postdoctoral position is to manage the knowledge resources created within the project. In particular, the following research tasks will be addressed:
> - design the representation framework for the acquired knowledge
> - extend the existing resources both in size and functionalities
> - publish the resources as Linked Data
>
> Profile:
> Mandatory requirements for applicants:
> 1. PhD in Computer Science or related field;
> 2. Experience in Knowledge Representation and Reasoning, Semantic Web, and in a related field (NLP, Artificial Intelligence, Machine Learning...);
> 3. Hands-on programming experience;
> 4. Self-motivated, goal-oriented and willing to work in an international team;
> 5. Fluent English is mandatory.
>
> Optional:
> 1. Good control of scripting tools (bash, Unix/Linux tools) and of web languages;
> 2. Experience with automation of NLP processing chains
>
> Job duration: 12 months
> Deadline: open until filled
> Working environment: the postdoc will be employed at Inria Sophia Antipolis, France, in the Wimmics team
> Salary: Gross salary per month according to the level of diploma and the experience in the domain: 2500-2800 € / month (corresponding to 2100-2300 € net salary / month)
> Contact email: "Elena Cabrio"<elena.cabrio at inria.fr> ; "Fabien Gandon"<fabien.gandon at inria.fr>
>
> -------------- next part --------------
> A non-text attachment was scrubbed...
> Name: not available
> Type: text/html
> Size: 3413 bytes
> Desc: not available
> URL: <https://mailman.uib.no/public/corpora/attachments/20170503/ec2c56ee/attachment.txt>
>
> ------------------------------
>
> Message: 3
> Date: Wed, 3 May 2017 10:34:38 +0200
> From: Pavel <pavel.vondricka at ff.cuni.cz>
> Subject: Re: [Corpora-List] Annotation tool for two documents in
> parallel -- please Help!
> To: Irina Temnikova <irina.temnikova at gmail.com>
> Cc: corpora at uib.no
>
> Hello,
>
> Depending on the complexity of the annotation: InterText is not meant for annotation, but for automatic/manual alignment of parallel texts. However, it also allows easy editing of the text contents. It is not really convenient for annotation, as you would have to enter your markup manually, but it is highly customizable (in terms of the view). It could probably be extended for easier and more serious annotation with more or less programming effort, depending on your demands (at least the IT-Editor version).
>
> Best regards,
> Pavel
>
>> 2. 5. 2017 v 18:50, Irina Temnikova <irina.temnikova at gmail.com>:
>>
>> Dear all,
>>
>> we are looking for an annotation tool with which we can annotate two documents in parallel.
>> Both documents must be visible at the same time, so the annotators can compare them, and both must be open for annotation at the same time.
>> Let's say that one document is a transformation of the other, and we want to
>> know what is missing in one and appearing in the other and vice versa.
>> Our annotators are linguists, with not much expertise in Computer Science, so the tool should be easy to use.
>> We don't think GATE or Brat support this.
>>
>> Thank you very much in advance,
>>
>> Irina
>>
>> Irina P. Temnikova, B.A., M.A., Ph.D.
>> Postdoctoral Researcher
>>
>> Arabic Language Technologies Research Group
>>
>> Qatar Computing Research Institute
>>
>> Hamad Bin Khalifa university (HBKU)
>>
>> The Research and Development Complex (RDC)
>>
>> P.O. Box 5825
>>
>> Doha, Qatar
>>
>> Mob: +974 33320188
>>
>> Tel: +974 ...
>>
>> www.qcri.qa
>>
>> ------------------------------- -------------------------------- ---------------------------------
>> If you want to build a ship, don't drum up the men to gather wood, divide the work and give orders. Instead, teach them to yearn for the vast and endless sea. (Antoine de Saint-Exupery)
>> _______________________________________________
>> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
>> Corpora mailing list
>> Corpora at uib.no
>> http://mailman.uib.no/listinfo/corpora
>
> -------------- next part --------------
> A non-text attachment was scrubbed...
> Name: smime.p7s
> Type: application/pkcs7-signature
> Size: 3593 bytes
> Desc: not available
> URL: <https://mailman.uib.no/public/corpora/attachments/20170503/b29b4cea/attachment-0001.p7s>
>
> ------------------------------
>
> Message: 4
> Date: Wed, 03 May 2017 10:51:43 +0200
> From: Thierry Hamon <hamon at limsi.fr>
> Subject: [Corpora-List] CFP: Special Issue - Terminology 24(1), 2018 -
> Second call for paper
> To: corpora at uib.no
>
>
> Call for papers: Special Issue - Terminology 24(1), 2018
>
> Computational Terminology and Filtering of Terminological Information
>
> https://perso.limsi.fr/hamon/Terminology2018/
>
> Computational Terminology covers an increasingly important aspect in
> Natural Language Processing areas such as text mining, information
> retrieval, information extraction, summarisation, textual entailment,
> document management systems, question-answering systems, ontology
> building, machine translation, etc. Terminological information is
> paramount for knowledge mining from texts for scientific discovery and
> competitive intelligence.
>
> Thanks to many years of research work, Computational Terminology has
> gained in strength and maturity. New requirements emerge from the
> current use of terminological approaches in many domains. Thus,
> scientific needs in fast growing domains (such as biomedicine,
> chemistry and ecology) and the overwhelming amount of textual data
> published daily demand that terminology be acquired and managed
> systematically and automatically, while in well-established domains
> (such as law, economy, banking and music) the demand is for
> fine-grained analyses of documents for knowledge description and
> acquisition. Moreover, capturing new concepts leads to the acquisition
> and management of new knowledge.
>
> The aim of this special issue is to present and describe research work
> dedicated to extraction and filtering of terminological information
> with computational methods. In that context, the addressed topics are
> more particularly dedicated, but not limited, to:
>
> - robustness and portability of methods for filtering extracted terms:
> this aspect makes it possible to apply term extraction and filtering
> methods developed in one given context to other contexts (corpora,
> domains, languages, etc.) and to share research expertise among them;
>
> - word embedding approaches for terminology acquisition: this aspect
> addresses the use and adaptation of word embeddings in the context
> of specialized domains for the acquisition of terms and relations;
>
> - transfer of methodologies from one language to another, especially
> when the transfer is concerned with less-resourced languages or
> domains;
>
> - new needs of users: this aspect addresses the design, creation and
> adaptation of existing methods and research experience to users'
> needs that have not yet been addressed by existing research;
>
> - consideration of user expertise, which is becoming a new issue in
> terminological work: specialized domains contain notions and terms
> that are often not understandable to non-experts or laymen (such as
> patients in the medical area, or bank clients in the banking and
> economy areas). This aspect, although related to specialized areas,
> provides a direct link between specialized languages and general
> language. It concerns the challenge of using methods and resources,
> often designed for expert needs, to satisfy non-expert needs;
>
> - monolingual and multilingual resources: this aspect opens the
> possibility of developing cross-lingual and multilingual
> applications, and requires specific corpora, robust methods and
> tools whose design and evaluation are challenging issues;
>
> - re-utilization and adaptation of terminologies in various NLP
> applications: because terminologies are a necessary component of
> any NLP system dealing with domain-specific literature, their use in
> the corresponding NLP applications is essential. Re-utilization and
> adaptation of terminologies is a challenging research direction,
> especially when terminologies are to be used in new domains or
> applications;
>
> - systematic terminology management and the updating of domain-specific
> dictionaries and thesauri, which are important for maintaining
> existing terminological resources. These aspects become crucial
> because the amount of existing terminological resources is
> constantly increasing and because their long-term and efficient use
> depends on their maintenance and updating, while their
> re-acquisition is costly and often non-reproducible.
>
> Submissions are open to different approaches, theoretical
> frameworks and applications. We encourage authors to submit
> research work related to various aspects of computational terminology,
> such as those mentioned in this call.
>
>
> Deadlines
> - First call for submissions: March 1st, 2017
> - Submission deadline: June 1st, 2017
> - First acceptance notification: August 1st, 2017
> - Modified version: September 1st, 2017
> - Final acceptance notification: October 1st, 2017
> - Final version ready: November 1st, 2017
>
> Submission Guidelines
>
> Articles should not exceed 9,000 words (excluding references). More
> information on formatting requirements can be found on the web page
> (submission guidelines). English is preferred
> (80% of the contents), but submissions in French, Spanish or German
> will be considered. Each issue of Terminology contains up to six or
> seven articles.
>
> Papers should be submitted to EasyChair at the following address:
>
> https://easychair.org/conferences/?conf=terminosi2018
>
> Program Committee
>
> Galia Angelova, Bulgarian Academy of Sciences, Bulgaria
> Svetla Boytcheva, Bulgarian Academy of Sciences, Bulgaria
> Béatrice Daille, University of Nantes, France
> Louise Deléger, INRA, France
> Yoshihiko Hayashi, Waseda University, Japan
> Olga Kanishcheva, Kharkiv Polytechnic Institute, Ukraine
> Veronique Malaise, Elsevier BV, the Netherlands
> Fleur Mougin, University Bordeaux, France
> Agnieszka Mykowiecka, IPIPAN, Poland
> Rogelio Nazar, University Pompeu Fabra, Spain
> Goran Nenadic, University of Manchester, UK
> Fabio Rinaldi, University of Zurich, Switzerland
> Selja Seppälä, University of Florida, USA
> Takehito Utsuro, University of Tsukuba, Japan
> Jorge Vivaldi Palatresi, University Pompeu Fabra, Spain
>
> TBC
>
> Guest editors
>
> Patrick Drouin, Observatoire de linguistique Sens-Texte, Université de Montréal
> Natalia Grabar, CNRS UMR 8163 STL, Université Lille 1&3,
> Thierry Hamon, LIMSI-CNRS, Université Paris-Saclay & Université Paris 13, Sorbonne Paris Cité, France
> Kyo Kageura, Library and Information Science Laboratory, University of Tokyo
> Koichi Takeuchi, Graduate School of Natural Science and Technology, Okayama University
>
>
> --
> Thierry Hamon E-mail : hamon at limsi.fr
> LIMSI-CNRS Tel: +33 1 69 85 80 39
> Institut Galilée - Université Paris 13 Tel: +33 1 49 40 35 53
> URL: http://perso.limsi.fr/hamon/
>
>
>
> ------------------------------
>
> Message: 5
> Date: Wed, 3 May 2017 09:56:02 +0000
> From: Johannes Daxenberger
> <daxenberger at ukp.informatik.tu-darmstadt.de>
> Subject: [Corpora-List] Job: Scientific System Developer for Natural
> Language Processing applications, TU Darmstadt
> To: "CORPORA at UIB.NO" <CORPORA at UIB.NO>
>
> The Ubiquitous Knowledge Processing (UKP) Lab at the Department of Computer Science, Technische Universität (TU) Darmstadt, Germany has an opening for a
>
> Scientific System Developer
> (PostDoc- or PhD-level; time-limited project position until April 2020)
>
> to strengthen the group's profile in the area of Argument Mining, Machine Learning and Big Data Analysis. The UKP Lab is a research group comprising over 30 team members who work on various aspects of Natural Language Processing (NLP), of which Argument Mining is one of the rapidly developing focus areas in collaboration with industrial partners.
>
> We ask for applications from candidates in Computer Science, preferably with expertise in research and development projects and strong communication skills in English and German. The successful applicant will work on projects including research activities in the area of Argument Mining (e.g. automatic evidence detection, decision support, large-scale web mining on heterogeneous sources, and data management), and development activities to create new products or industrial product prototypes. Prior work in the above areas is a definite advantage. Ideally, candidates should have demonstrable experience in designing and implementing complex (NLP) systems in Java and Python as well as experience in information retrieval, large-scale data processing and machine learning. Experience with continuous system integration and testing and distributed/cluster computing is a strong plus. Combining fundamental NLP research with industrial applications from different application domains will be highly encouraged.
>
> UKP's wide cooperation network, both within its own research community and with partners from industry, provides an excellent environment for the position to be filled. The Department of Computer Science of TU Darmstadt is regularly ranked among the top departments in rankings of German universities. Its unique and recently established Research Training Group "Adaptive Information Processing of Heterogeneous Content" (AIPHES), funded by the DFG, emphasizes NLP, text mining, machine learning, as well as scalable infrastructures for the assessment and aggregation of knowledge. UKP Lab is a highly dynamic research group committed to high-quality research results, technologies of the highest industrial standards, a cooperative work style and close interaction of team members working on common goals.
>
> Applications should include a detailed CV, a motivation letter and an outline of previous working or research experience (if available).
>
> Applications from women are particularly encouraged. All other things being equal, candidates with disabilities will be given preference. Please send the application to: jobs(a-t)ukp.informatik.tu-darmstadt.de by 31.05.2017. The position is open until filled. Later applications may be considered if the position is still open.
>
> Questions about the position can be directed to: daxenberger(at)ukp.informatik.tu-darmstadt.de; phone: [+49] (0)6151 16-25297
> We look forward to receiving your application!


