Guidance for Researchers: Collecting, Storing and Documenting Data

This guide aims to support academics in many aspects of their research through workshops, information about publishing trends, and useful tools and resources.

This page covers three aspects of managing research data: collecting, storing and documenting it.

Creating Data

Choosing file formats

You can ensure your research remains accessible by taking steps to make your data compatible with different software and systems. For sharing or long-term storage, it is best practice to select file formats that are:

  • open, documented standards, or formats with publicly available technical specifications, rather than proprietary
  • commonly used by your research community
  • able to support extracting and discovering data, rather than simply displaying it
  • able to preserve the data without compression
  • shareable

The UK Data Service offers guidance on formats for long-term storage as well as recommended formats for different data types.
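
As a small illustration of the points above, the sketch below saves tabular data as CSV, a plain-text format with a publicly documented structure, using only the Python standard library. The function name write_csv, the column names and the sample values are hypothetical and only serve the example; it is one possible approach, not a recommendation from the UK Data Service.

```python
import csv
import io


def write_csv(rows, fieldnames, stream):
    """Write a list of dictionaries to `stream` as comma-separated text."""
    writer = csv.DictWriter(stream, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical example data (not from any real study).
rows = [
    {"sample_id": "S001", "temperature_c": 21.5},
    {"sample_id": "S002", "temperature_c": 19.8},
]

# Write to an in-memory buffer so the example leaves no files behind.
buffer = io.StringIO()
write_csv(rows, ["sample_id", "temperature_c"], buffer)
csv_text = buffer.getvalue()
```

To write to disk instead, the same function can be given a file opened with open(path, "w", newline="", encoding="utf-8").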

Organising Data

Creating an organising system:

Effective organisation of your folders and files will make it easier to locate and track your research. Consider where you will keep your work on the networked drive, whether you would prefer a deep or shallow hierarchy, and how you will manage ongoing and completed work. Build in time to review and manage your folders on a regular basis.
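
One way to keep a folder structure consistent across projects is to create it with a short script. The sketch below sets up a shallow hierarchy with Python's pathlib module. The subfolder names and the function create_project_folders are assumptions made for illustration; adapt them to your own research practice.

```python
from pathlib import Path

# Hypothetical shallow layout; rename the folders to suit your project.
SUBFOLDERS = ["data_raw", "data_processed", "documentation", "outputs", "archive"]


def create_project_folders(root, project_name, subfolders=SUBFOLDERS):
    """Create one project folder under `root` containing each subfolder.

    Folders that already exist are left untouched, so the function is
    safe to run again when reviewing your structure.
    """
    project_dir = Path(root) / project_name
    for name in subfolders:
        (project_dir / name).mkdir(parents=True, exist_ok=True)
    return project_dir
```

For example, create_project_folders("path/to/networked/drive", "P1234_example") would create the five subfolders under a single project folder. Completed work could then be moved into the archive folder during a regular review.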

Creating file names:

A consistent naming convention contributes to effective file management and documentation and facilitates appropriate citation. File names may include researchers' names, project numbers, experiment numbers, version numbers, or whatever best suits your research practice.

The University of Edinburgh offers full practical guidance on naming conventions for records management.
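
A naming convention is easier to follow when file names are generated rather than typed by hand. The sketch below builds a name from a project number, researcher name, experiment number and version number. The element order, the underscore separator, the zero-padding and the function name build_file_name are all assumptions made for illustration, not a convention prescribed by this guide or by the University of Edinburgh.

```python
import re


def build_file_name(project, researcher, experiment, version, extension):
    """Join the naming elements into one file name.

    Spaces and punctuation inside an element become hyphens, and the
    experiment and version numbers are zero-padded so files sort in order.
    """
    parts = [project, researcher, f"exp{experiment:03d}", f"v{version:02d}"]
    cleaned = [re.sub(r"[^A-Za-z0-9]+", "-", part).strip("-") for part in parts]
    return "_".join(cleaned) + "." + extension.lstrip(".")


# Hypothetical values for illustration.
example_name = build_file_name("P1234", "Smith J", 7, 3, "csv")
# example_name is "P1234_Smith-J_exp007_v03.csv"
```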

Metadata

Where possible, include an additional file containing metadata (data about data) in the same folder as your data. This metadata will allow you to add context to your data so that you and others can understand it at any point in time. This file should:

  • Be written as a plain text file
  • Be called: README
  • Include general information - title, authors, date of collection
  • Provide an overview - short description of the data each file contains and the date it was created
  • State how the data can be shared - detail any licences or restrictions placed on the data
  • Describe methodological information - how the data was collected, generated and processed
  • Include data-specific information - a variable list (including definitions) for tabular data, units of measurement, definitions for codes or symbols used to record missing data

Cornell University offers a guide to best practice and a 'readme' template.
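
The items listed above can be assembled into a plain text README with a short script. The sketch below is a minimal example: the section headings, the function name build_readme and all sample values are hypothetical, and it does not reproduce the Cornell template.

```python
def build_readme(title, authors, collection_date, files, licence, methods, variables):
    """Return README text covering the items listed in this guide.

    `files` maps a file name to a (description, date created) pair.
    `variables` maps a variable name to its definition, including units
    and any codes used for missing data.
    """
    lines = [
        "GENERAL INFORMATION",
        f"Title: {title}",
        f"Authors: {', '.join(authors)}",
        f"Date of collection: {collection_date}",
        "",
        "FILE OVERVIEW",
    ]
    for name, (description, created) in files.items():
        lines.append(f"{name}: {description} (created {created})")
    lines += [
        "",
        "SHARING AND ACCESS",
        f"Licence or restrictions: {licence}",
        "",
        "METHODOLOGICAL INFORMATION",
        methods,
        "",
        "DATA-SPECIFIC INFORMATION",
    ]
    for name, definition in variables.items():
        lines.append(f"{name}: {definition}")
    return "\n".join(lines) + "\n"


# Hypothetical values for illustration only.
readme_text = build_readme(
    title="Example temperature survey",
    authors=["J. Smith", "A. Jones"],
    collection_date="2024-05-17",
    files={"P1234_Smith-J_exp007_v03.csv": ("Hourly readings", "2024-05-20")},
    licence="CC BY 4.0",
    methods="Readings were logged hourly and exported without further processing.",
    variables={"temperature_c": "Air temperature in degrees Celsius; -999 marks missing data"},
)
```

The returned text can then be saved as a file called README in the same folder as the data, for example with pathlib.Path("README").write_text(readme_text, encoding="utf-8").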