Research Data Services

Organize, Document, Save, and Backup your Data!

Keeping your data accessible for the long term is a challenge because technology changes quickly. Open file formats have a history of wide adoption and backward compatibility and are less likely to become obsolete. Proprietary formats, such as those tied to the Microsoft Office suite, may become unreadable if the software that opens them changes or is discontinued.

Choose open formats:

  • Non-proprietary, open, documented standards (e.g., .tif, .txt, .csv, .pdf)
  • Used commonly in your research community
  • Encoded with standard characters (e.g., ASCII, UTF-8)
  • .txt over .docx
  • .csv over .xlsx
  • .tif over .jpg
Top Tip: 

README.txt files are a recommended method to document your project and add context when clarity is needed. 

Place a README.txt file at the top level of your research project folder to explain the purpose of the research, a summary of the project, names and contact information for project researchers, the general organization of your files, and copyright and licensing information.
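As an illustration only, the short Python sketch below creates a README.txt skeleton with one empty section for each item listed above. The section headings come from that list; the function names (`readme_skeleton`, `write_readme`) and the layout are assumptions made for this example, not a required template.

```python
from pathlib import Path

# Headings follow the items named in the guide; the layout is an assumption.
README_SECTIONS = [
    "Purpose of the research",
    "Summary",
    "Researchers and contact information",
    "Organization of files",
    "Copyright and licensing",
]


def readme_skeleton(project_title: str) -> str:
    """Return plain-text README content with one empty section per heading."""
    lines = [project_title, "=" * len(project_title), ""]
    for heading in README_SECTIONS:
        lines += [heading + ":", "", ""]
    return "\n".join(lines)


def write_readme(project_dir: Path, project_title: str) -> Path:
    """Write README.txt at the top level of project_dir.

    An existing README.txt is left untouched so notes are never overwritten.
    """
    project_dir.mkdir(parents=True, exist_ok=True)
    target = project_dir / "README.txt"
    if not target.exists():
        target.write_text(readme_skeleton(project_title), encoding="utf-8")
    return target
```

Calling `write_readme(Path("my_project"), "My Project")` would create the folder if needed and leave a blank README.txt ready to fill in. Plain UTF-8 text is used so the file stays readable without special software.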

Great organization is your best asset for data management. The most important part of organization is having a system and using it consistently. You may choose to organize your data by the following, or use them in combination:
1. Create a system
          a) By project
          b) By data
          c) By analysis type
          d) By research
          e) By site or data source
2. Work with collaborators
3. Use file version control
Version control is a way to track revisions of a data set or a process. If your research involves more than one person, it is essential. Record every change to a file, no matter how small. Track changes through your file naming convention and log files, or with version control software. File sharing software can also be used to track versions.
Top Tip: Organize 
  • Use folders - group files within folders so information on a particular topic is located in one place 
  • Adhere to existing procedures - check for established approaches in your team or department which you can adopt 
  • Name folders appropriately - name folders after the areas of work to which they relate and not after individual researchers or students. This avoids confusion in shared workspaces if a member of staff leaves, and makes the file system easier to navigate for new people joining the workspace 
  • Be consistent - when developing a naming scheme for your folders, it is important that once you have decided on a method, you stick to it. If you can, try to agree on a naming scheme from the outset of your research project
  • Structure folders hierarchically - start with a limited number of folders for the broader topics, and then create more specific folders within these

Credits: University of Cambridge Data Management Guide

Data documentation explains how data were created or digitised, what data mean, what their content and structure are and any data manipulations that may have taken place.

Document throughout your research process. 

  • Document any data processing and analysis
  • Take notes!
  • Include written, electronic, and recorded notes
  • Create documentation within your organization at project and folder levels
    • Create a README.txt file
  • Use descriptive names within your documentation

File Naming - Best practice is to use descriptive names that reflect the content of the file. Be consistent: use the same format for all of the files in a project, including data set files and zip or tar files. Some suggested attributes to include:

  • unique identifier or project name/acronym
  • PI
  • location/spatial coordinates
  • year of study
  • data type
  • version number
  • file type
Top Tip: File Naming 

Use meaningful names that are consistent, descriptive, and short.

A great way to meaningfully name files is to include the project, instrument, and the year, month, and day in the file name.

Example: 
Don't: File12935.xls
Do: Project_Instrument_location_YYYYMMDD.csv
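
The pattern above can be generated rather than typed by hand, which helps keep names consistent. The Python sketch below is one possible approach, not a prescribed tool: the function name `build_filename`, its parameters, and the choice to strip spaces and punctuation from each part are all assumptions made for this example.

```python
import re
from datetime import date


def build_filename(project: str, instrument: str, location: str,
                   collected: date, extension: str = "csv") -> str:
    """Build a name of the form Project_Instrument_location_YYYYMMDD.ext."""
    # Keep only letters, digits, and hyphens so underscores stay reserved
    # as the separator between parts (an assumption of this sketch).
    cleaned = [re.sub(r"[^A-Za-z0-9-]+", "", part)
               for part in (project, instrument, location)]
    if not all(cleaned):
        raise ValueError("project, instrument, and location must not be empty")
    stamp = collected.strftime("%Y%m%d")  # year, month, day with zero padding
    return "_".join(cleaned + [stamp]) + "." + extension.lstrip(".")


# Hypothetical project, instrument, and location values:
print(build_filename("SoilSurvey", "pH Meter", "Plot 7", date(2024, 3, 5)))
# prints SoilSurvey_pHMeter_Plot7_20240305.csv
```

Writing the date as YYYYMMDD means files sort chronologically when listed by name.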

Take Notes 

You can improve your documentation by improving your note taking. Good notes are:

  • Legible
  • Clear and concise
  • Complete, including all the relevant information
  • Understandable to others with similar training

Storage and Backup

Storage and backup are separate elements of data management that complement each other. 

Storage is for your working files that you access regularly. If you lose storage, you'll lose the current versions of your data.

Backup is the regular process of copying your data. You may not need a backup until you lose your data, but when that happens it can save your research.

A good combination of storage and backup supports strong data management.

Top Tip: File Backup 

A good rule to follow when managing your data is the rule of 3. Keep three copies of your data:

  • Two copies onsite
  • One copy offsite

Example
1. Laptop
2. External hard drive
3. Cloud storage
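
As a rough illustration of the rule of 3, the Python sketch below copies a working file into two backup folders and compares checksums to confirm that each copy matches the original. The function names (`sha256_of`, `back_up`) and the folder paths in the usage note are hypothetical; it also assumes the offsite location is reachable as an ordinary folder, for example one synced by a cloud storage client.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source: Path, backup_dirs: list[Path]) -> list[Path]:
    """Copy source into each backup folder and verify every copy."""
    expected = sha256_of(source)
    copies = []
    for folder in backup_dirs:
        folder.mkdir(parents=True, exist_ok=True)
        target = folder / source.name
        shutil.copy2(source, target)  # copy2 also keeps file timestamps
        if sha256_of(target) != expected:
            raise OSError(f"Checksum mismatch for {target}")
        copies.append(target)
    return copies
```

With the working file on a laptop, a call such as `back_up(Path("data/results.csv"), [Path("/Volumes/ExternalDrive/backup"), Path.home() / "CloudSync" / "backup"])` would produce the second onsite copy and the offsite copy. Run it on a regular schedule, since a backup is only as current as its last run.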