We are pleased to announce that the British Academic Written English (BAWE) corpus is now available to all researchers via the Oxford Text Archive (resource number 2539, http://ota.ahds.ac.uk/headers/2539.xml). The corpus and end-of-award report are also available to ESRC award holders via UKDA-store (http://store.data-archive.ac.uk).
The 6.5-million-word corpus was developed with ESRC funding as part of the project entitled 'An investigation of genres of assessed writing in British Higher Education' (RES-000-23-0800). It contains 2761 files of proficient student writing, fairly evenly distributed across four levels of study and four broad disciplinary areas (Arts and Humanities, Social Sciences, Life Sciences, and Physical Sciences). The files are available in three formats: XML UTF-8, XML ASCII and plain text. A corpus manual explaining the encoding conventions is included as part of the deposit.
There are no restrictions on access to the corpus for research purposes, and we welcome such use, which should be acknowledged as follows: "The data in this study come from the British Academic Written English (BAWE) corpus, which was developed at the Universities of Warwick, Reading and Oxford Brookes under the directorship of Hilary Nesi and Sheena Gardner (formerly of the Centre for Applied Linguistics [previously called CELTE], Warwick), Paul Thompson (Department of Applied Linguistics, Reading) and Paul Wickens (Westminster Institute of Education, Oxford Brookes), with funding from the ESRC (RES-000-23-0800)."
Please inform us of output in the form of dissertations, theses, presentations or publications arising from analysis of the corpus, so that we can mention it when reporting to our sponsors.