University of Warwick

    Digitisation project: The Spanish Civil War

    This project has now been completed - www.warwick.ac.uk/go/scw

    Work is now underway on a major new project to digitise internationally significant archives relating to the Spanish Civil War.

    The project will result in over 10,000 pages of archive material being made available online free of charge. Transcriptions will be available for every item, allowing researchers to search through the mass of material for key words or phrases.

    It is anticipated that the project will go live in May 2012.

     

    What is being digitised?

    The archive collection of the Trades Union Congress includes 45 files on different aspects of the conflict. The files contain correspondence, minutes, reports, memoranda and propaganda material produced by members of the British and Spanish governments; political groups; international, British and Spanish trade unions; pressure groups, aid organisations, and other interested parties.

    We are also digitising a small number of publications from the collections of the Trotskyists Henry Sara and Hugo Dewar. These include examples of bulletins (in English and Spanish) produced by the Partido Obrero de Unificación Marxista (POUM).

     

    What does the project involve (and why will it take so long)?

    Actually scanning the documents is only a small part of the work needed in order to produce a fully searchable, easily navigable database. The project requires us to:

    • Individually number every piece of paper, so that each item has its own individual reference number or code.
    • Digitally scan each item - the scanner that we are using is capable of copying documents up to A2 size and photographs the documents from above, helping to minimise any damage to the archive material.
    • Produce metadata ("data about data") for each item and image. We are effectively creating a catalogue entry for every individual item in the 45 files - recording what it is, who produced it (and, if it is a letter, who received it), and when it was produced. We are also recording technical data about each digital image (jpg).
    • Produce a transcription of every item - the online database will search these transcriptions for the key words that researchers enter. We are using optical character recognition (OCR) software to "read" the images and convert them into text files. Unfortunately, OCR does not recognise handwriting and can struggle to read 1930s printed or typescript text accurately. We therefore have to check and correct every OCR text file to ensure that it is an accurate transcription of the original document.
    • Attempt to identify copyright owners, and contact them to request permission to publish. Under UK copyright law, copyright in a "literary work" does not expire until 70 years after the creator's death. Many of the archives being digitised are therefore still in copyright. We are asking members of the public to help identify current copyright owners (for example, descendants of individual authors). Lists of the individuals and organisations who wrote or issued the documents in the TUC files are now online. Please contact us if you think that you may be the current copyright owner.
    • Upload the images, transcriptions and metadata on to digital collection management software, which will then be customised to allow the documents to be searched / browsed in as straightforward a way as possible. An earlier example of a digitisation project - the Marandet Collection of 18th and 19th century French Plays - can be seen on the website of the Department of French Studies.
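
The steps above can be pictured as one catalogue record per item, with the corrected transcription driving the keyword search. The following is only a minimal sketch of that idea, not the project's actual software: the `ItemRecord` class, its field names, the `search` function and the sample data are all hypothetical and invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ItemRecord:
    """One catalogue entry per scanned item (field names are hypothetical)."""
    reference: str                    # the item's individual reference number
    description: str                  # what the item is
    creator: str                      # who produced it
    date: str                         # when it was produced
    transcription: str                # OCR text after manual correction
    recipient: Optional[str] = None   # recorded only when the item is a letter
    image_file: Optional[str] = None  # name of the scanned image, if recorded


def search(records: List[ItemRecord], phrase: str) -> List[str]:
    """Return the references of items whose transcription contains the phrase.

    The match is a plain case-insensitive substring test; a real search
    system would probably do more (an assumption, not a description).
    """
    needle = phrase.casefold()
    return [r.reference for r in records if needle in r.transcription.casefold()]


# Invented sample records, for illustration only.
records = [
    ItemRecord("EXAMPLE/1", "Letter", "Sender A", "1937",
               "Appeal for medical aid to Spain", recipient="Recipient B"),
    ItemRecord("EXAMPLE/2", "Report", "Committee C", "1938",
               "Report on food ships and relief funds"),
]

print(search(records, "medical aid"))  # ['EXAMPLE/1']
```

Because the search only sees the transcription text, an uncorrected OCR error in a key word would make the item impossible to find by that word, which is why every text file has to be checked by hand.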

     

    Are all your holdings on the Spanish Civil War being scanned?

    No, it will still be necessary to visit us to see significant collections such as the archives of the International Transport Workers' Federation and the papers of Wilfrid Roberts. A subject guide on sources for the Spanish Civil War at the Modern Records Centre is available elsewhere on our website.

     Illustration from POUM journal [MSS.206/3/5/8/4]

    Contact Us
    Modern Records Centre, University Library, University of Warwick, Coventry, CV4 7AL, UK
    Telephone: +44 (0)24 7652 4219
    Email: archives@warwick.ac.uk
    Page contact: Archives Last revised: Fri 4 May 2012