Centre for Applied Linguistics

CAL

BAWE (British Academic Written English) and BAWE Plus Collections


Overview of BAWE

The British Academic Written English (BAWE) corpus was created between 2004 and 2007 through a project entitled 'An investigation of genres of assessed writing in British Higher Education'. The project was funded by the Economic and Social Research Council (Project number RES-000-23-0800) and was a collaboration between the Universities of Warwick, Reading and Oxford Brookes.

The BAWE corpus contains 2761 pieces of proficient assessed student writing, ranging in length from about 500 words to about 5000 words. Holdings are fairly evenly distributed across four broad disciplinary areas (Arts and Humanities, Social Sciences, Life Sciences and Physical Sciences) and across four levels of study (undergraduate and taught masters level). Thirty-five disciplines are represented.

The assignments have been annotated using a system devised in accordance with the TEI guidelines. The header of each file includes factual information such as the writer's gender and year of birth, and also contains some research findings from the initial team, such as genre family. A DTD file, tei_bawe.dtd, must be kept in the same folder as the corpus files. The holdings are described in an Excel spreadsheet, 'BAWE.xls'. The transcription and mark-up conventions are described in the BAWE manual, which is in PDF format.
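As a rough illustration of how the header information might be read programmatically, the following is a minimal Python sketch. It assumes the header sits in a TEI-style `teiHeader` element with no XML namespace; the sample document, its element names and its `n` attributes are hypothetical and are not taken from the BAWE files, so the manual and tei_bawe.dtd remain the authority on the real structure. The sketch does not validate against the DTD.

```python
import xml.etree.ElementTree as ET


def header_fields(xml_text):
    """Return (tag, attributes, text) for each leaf element in the header.

    Assumes a TEI-style <teiHeader> element without a namespace.
    """
    root = ET.fromstring(xml_text)
    header = root.find(".//teiHeader")
    if header is None:
        return []
    fields = []
    for elem in header.iter():
        text = (elem.text or "").strip()
        # Keep only elements that carry text and have no child elements.
        if text and len(elem) == 0:
            fields.append((elem.tag, dict(elem.attrib), text))
    return fields


# Hypothetical miniature file, for illustration only.
sample = (
    "<TEI.2><teiHeader><profileDesc>"
    '<p n="gender">f</p><p n="year of birth">1985</p>'
    "</profileDesc></teiHeader>"
    "<text><body><p>Essay text.</p></body></text></TEI.2>"
)

for tag, attrs, text in header_fields(sample):
    print(tag, attrs, text)
```

Because the function collects every text-bearing leaf element rather than looking up fixed names, it can be used to explore a header before committing to particular fields.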

The corpus is available free of charge to non-commercial researchers who agree to the conditions of use and who register with the Oxford Text Archive. It can be accessed through the Oxford Text Archive (http://ota.ahds.ac.uk) as resource number 2539. The resource includes text files, a spreadsheet with contextual information, and a corpus manual.

For more information about the BAWE corpus, please email baweplus@warwick.ac.uk

 

Overview of BAWE Plus

BAWE Plus is a collection of resources for research into academic written English in the UK in the twenty-first century. The following are its main components:

 

Supplementary BAWE data

 
BAWE PDF files
 
The Centre for Applied Linguistics holds PDF files of the assignments which make up the BAWE corpus. These may be useful to researchers who wish to examine assignments in their original layout.
 
The BAWE Pilot Corpus
 
A pilot for the British Academic Written English (BAWE) corpus was created in 2001, with support from the University of Warwick Teaching Development Fund. The pilot corpus contains about one million words of text, in the form of 500 student assignments ranging from 1,000 to 5,000 words. The collection is held at the Centre for Applied Linguistics and is not part of the BAWE corpus submitted to the OTA.
 
Details of the pilot corpus are reported in: Nesi, H., Sharpling, G. & Ganobcsik-Williams, L. (2004) "Student papers across the curriculum: Designing and developing a corpus of British student writing". Computers and Composition Volume 21, Issue 4, pp 401-503.
 
Tutor interviews
 
As part of the 2004-2007 BAWE project, tutors from contributing departments were interviewed in order to find out more about departmental practice. Notes and transcripts from these interviews are held at the Centre for Applied Linguistics.

 

The WELT Pilot Corpus

This is a collection of written answers at grade B and above from the Warwick English Language Test.

 

 

Other Academic English Resources

The Centre for Applied Linguistics also holds a collection of British Academic Spoken English (a corpus and associated resources). See BASE Plus.

 

We welcome proposals from potential doctoral students and other researchers interested in working with these resources.

To contact us, please email baweplus@warwick.ac.uk

 


 

 
