
Preserving research data

When you have spent a great deal of time and effort collecting your research data, it is natural to want to ensure that it is preserved and stored securely for as long as necessary. There is a range of activities associated with data preservation that you will need to undertake to prepare any data for deposit.

Preserving and sharing

Research data is a valuable resource and in many cases can have multiple uses after the end of the original project. Sharing data after a project completes can:

  • encourage further research branching from the original project
  • lead to new collaborations
  • encourage transparency and the improvement of research practice
  • reduce the cost of further data collection
  • increase your profile, as data is a research output in its own right

Research funders now have requirements on how long research data should be stored after the end of the project and many explicitly require data arising from their funding to be shared openly, where possible.

When not to share

  • If your data has financial value or is the basis for potentially valuable patents that could be exploited by the University, it may be unwise to share it.
  • If the data contains sensitive, personal information about human subjects, sharing it, even with other researchers, may violate the Data Protection Act, ethics codes, or your own written consent forms. However, there might be ways of anonymising the data to make it shareable.

Please note that if you think you cannot share your data, some funders may require a statement justifying the restriction as part of their application process. All research data you share must conform to the University’s Research Code of Practice. Guidelines and support on ethics are available from Research and Impact Services, and the UK Data Service has a very useful guide on data anonymisation.

Retention of your data

Many funders have requirements on when you need to deposit your data once a project has finished, as well as on how long the data needs to be retained. The Digital Curation Centre (DCC) has provided a summary of the major funders’ requirements in this area. If your funder is not on this list, please use the Sherpa Juliet service to check whether your funder has any requirements for research data.

The University of Warwick’s Research Data Policy is designed to allow all Warwick researchers to be compliant with all known funder requirements by default and states:

“9. Data must be retained intact in an appropriate format and storage facility, normally for a period of at least 10 years from the date of any publication which is based upon it. Where specific regulations with regard to data retention apply, e.g., from funders, these regulations should prevail, particularly where the required retention period is longer than the University requires.”
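As a rough illustration of the default in the clause above, the sketch below works out the earliest date on which the minimum retention period would end. The function name, the `funder_years` parameter, and the choice to take the longer of the University and funder periods are assumptions made for this example; they are not part of the policy, and the policy text and any funder regulations remain the authority.

```python
from datetime import date

# Default minimum from the policy clause: "normally ... at least 10 years"
DEFAULT_RETENTION_YEARS = 10


def earliest_retention_end(publication_date, funder_years=None):
    """Return the date on which the minimum retention period ends.

    publication_date: date of the publication based on the data.
    funder_years: hypothetical parameter for a funder's retention
        requirement in years; assumed here to apply only when it is
        longer than the University default.
    """
    years = DEFAULT_RETENTION_YEARS
    if funder_years is not None and funder_years > years:
        years = funder_years
    try:
        return publication_date.replace(year=publication_date.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap target year;
        # move to 1 March so the period is never shortened.
        return date(publication_date.year + years, 3, 1)


# Example with a made-up publication date and a made-up funder period
print(earliest_retention_end(date(2015, 6, 1)))
print(earliest_retention_end(date(2015, 6, 1), funder_years=20))
```

Treat the result as a lower bound: the clause says "at least", so data can be kept longer where that is appropriate.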

Contact us by email: researchdata at warwick dot ac dot uk if you need any further advice.

Selecting data for long term curation

When preserving data for the long term, it is tempting to want to store everything. However, not all research data is suitable for long-term preservation. As a general guideline, you should only store the data that underpins research publications.

The DCC has created a comprehensive guide to appraising and selecting data for preservation which outlines the key issues in this area. The DCC’s guide can be supplemented by the:

Choosing a data archive for your data

Once you’ve decided what data you need to keep in the long term, you will need to choose where to deposit this data to make it available. You may already have an idea from looking at sources of additional data during your data management planning stage; if not, use the flowchart below to help you decide.

[Figure: data archive flowchart]

Licensing your research data

Licensing allows you to specify what people can and cannot do with your data and allows both you and the users of your data to be clear on what the rights situation is in relation to your data.

It is best to have an idea of which license you plan to use before you start collecting the data. This will allow you to clarify the legal position and ownership of the data early on and can help in planning consent requests from research participants. If you are subject to funder requirements or collaborating with other researchers, discuss your plans with them early on as well.

Available licenses

There are a number of different licensing schemes that you can use for research data depending on what the data is and how it is formatted. Please note: some data services/centres have specific recommended licenses for data you deposit with them that you must use.

  • Creative Commons licenses are a range of open licenses allowing you to mix and match the level of rights you are retaining and are commonly used for research publications and other research outputs
  • Open Data Commons offers two licenses that are often used for research data
  • The Open Government Licence is used for public sector databases and data. You may be using data subject to this license, so you should be aware of it
  • GNU General Public License 3.0 (GPLv3) is the license most often recommended for open source software releases

Jisc has published a practical guide on licensing open data and the DCC has published a guide on how to license research data.

Data access statements

Data access statements, or data availability statements, are used in publications to indicate where and how readers can access the data that supports the research paper. These statements are a requirement for many research funders and have been a particular requirement for anyone with RCUK funding since April 2013 (section 3.3 ii).

The aim of the data statement is to promote the discoverability of research data that has been created in the life of a project, but the data itself does not have to be publicly accessible. If you have any concerns over whether you should publish your data openly or not, please contact us via email: researchdata at warwick dot ac dot uk.

What to include

Depending on whether your data is openly available, one of the following options will apply:

  • If data are openly available, the name(s) of the data repositories should be provided, as well as any persistent identifiers or accession numbers for the dataset. The guidelines on data citation may help.
  • If there are justifiable legal or ethical reasons why your data cannot be made openly available, these should be included in the data access statement. In this case, the data access statement must direct users to a permanent record that describes any access constraints or conditions that must be satisfied for access to be granted. You can create a record of this type in the University’s Publication service.
  • If you did not collect the research data yourself but instead used existing data obtained from another source, you should cite the source following the guidelines for data citation.

Please note that a simple direction to interested parties to contact the author would not normally be considered sufficient.
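The three options above can be sketched as a small helper that assembles a statement from the details you have. This is only an illustration: the function name, its parameters, and the sentence templates are assumptions made for this example, not wording approved by the University or any funder, and the repository name and identifier in the usage lines are placeholders.

```python
def data_access_statement(repository=None, identifier=None,
                          restriction=None, record=None, source=None):
    """Return a one-sentence data access statement (illustrative wording only).

    repository, identifier: where openly available data can be found.
    restriction, record: why data is restricted, and the permanent record
        describing the access conditions.
    source: citation for existing data obtained from another source.
    """
    if repository and identifier:
        return (f"The data supporting this paper are openly available from "
                f"{repository} at {identifier}.")
    if restriction and record:
        return (f"The data supporting this paper cannot be made openly "
                f"available because {restriction}. Access conditions are "
                f"described at {record}.")
    if source:
        return f"This paper uses existing data obtained from {source}."
    # A bare "contact the author" is not normally sufficient, so refuse
    # to produce a statement when none of the three options is covered.
    raise ValueError("Provide a repository and identifier, a restriction "
                     "and record, or a source citation.")


# Placeholder values, not a real repository or identifier
print(data_access_statement(repository="Example Data Repository",
                            identifier="doi:10.0000/example"))
```

Raising an error when no details are supplied mirrors the note above: the statement has to point somewhere specific.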

The Public Library of Science has guidance on data access statements, and the University of Bath has a detailed guide.