Text mining tools and technologies have a long history in the repository world, where they have been applied successfully for a variety of purposes. These range from pragmatic aims, such as enabling document search and browse facilities, linking related documents, identifying copies or facilitating the deposit process, to support tools for academic research. The latter category includes supporting research on the basis of a large body of documents, facilitating access to and reuse of existing work, and connecting the formal academic world with areas such as traditional and social media. JISC has funded a number of projects and initiatives in both areas, notably NaCTeM and the ResDis programme. Research areas as diverse as biology, chemistry, sociology and criminology have seen effective use made of text mining technologies.
However, uptake of these tools, and hence their impact, has been uneven. Several obstacles to development and deployment are frequently cited, including the maturity, complexity and, in some instances, cost of software packages, as well as a scarcity of relevant technical skills. Text mining methods and tools can be fragile and complex, requiring significant set-up time and effort. Projects making use of text mining may also face legal obstacles, such as copyright and intellectual property considerations. The benefit to be gained from deploying text mining tools, whether in institutional repositories or as research tools in their own right, may be difficult to predict without a costly pilot project.
This workshop is intended to bring together contributions from practitioners and researchers in fields connected to text mining and analysis. Authors are invited to submit original, unpublished research papers. As this is a workshop, both work in progress and completed work are welcome.
This event will take place during the OR-2012 pre-conference workshop session (9th-10th July 2012).
Topics of interest include, but are not limited to:
1. Discipline-specific research involving text mining: bioinformatics, chemistry, the social sciences, etc.
2. Techniques in text mining: sentiment analysis/subjectivity analysis, opinion mining, affect analysis, metaphor analysis, etc.
3. Legal aspects of text mining/analysis.
4. Current developments in text mining.
5. Metadata extraction from document text, including formal and informal metadata: document indexing, document classification, and evaluation of metadata quality.
6. Text mining for document categorisation or summarisation.
7. Text mining over the social web: community detection, timelines, etc.
8. Evaluation of text mining tools, open-source or commercial: case studies and findings.
Types of contribution
The following contributions are sought:
1. Full papers (6-8 pages)
2. Extended abstracts for oral presentation, posters or software demos (1-2 pages)
Papers/extended abstracts should be prepared in either Word or LaTeX using the Springer LNCS format (http://www.springer.com/computer/lncs?SGWID=0-164-6-793341-0). Files should be submitted by email to Emma Tonkin.
15-May-2012 Title/Abstract submission (optional)
25-May-2012 Full paper/Extended abstract submission
8-June-2012 Decisions announced
25-June-2012 Submission of final papers
9/10-July-2012 Workshop
All accepted contributions will be published in the workshop proceedings. Authors of selected contributions will be invited to submit an extended and revised version for formal publication; to this end, a call for chapters will be launched following the workshop.
Workshop chairs
Paul Walk - Innovation Support Centre, UKOLN, University of Bath, UK
Emma Tonkin - Innovation Support Centre, UKOLN, University of Bath, UK
Torsten Reimer - JISC