Part 2 - Establishing File Naming Conventions

2a. Relevance:

? Why assign naming rules to files and folders?

Consistent file naming can help you to clearly organise all the information for your projects.

At the beginning of a project, it’s helpful to determine an FNC (File Naming Convention), or “file naming schema,” for each type of information that you will gather, create, or store to work with. An FNC is the pattern that you establish and follow for naming the files associated with a project. Without an FNC, you might end up with a mysterious list of file names that looks something like this:

Documents Cartoon

"Documents" by xkcd is licensed under CC-BY-NC.

File naming conventions help you stay organised so you can quickly identify your files. In a collaborative setting that requires file-sharing, a clear FNC will help others more easily navigate your work.

⚠️ It is essential to establish a convention before you begin creating files or collecting data, and to adjust it as needed during the research while documenting any changes accordingly. Having a clear FNC at the start of a project prevents a backlog of unorganised content.

In this module we’ll share general recommendations for file naming that apply across academic disciplines. There’s no single “right” way to name your files, but be aware that some fields of study have their own conventions for naming particular types of files. For example, biologists may adhere to standardised 4-letter abbreviations for species names.

2b. Rules of Thumb for File Naming:

? How do I develop an FNC (file naming convention) for my files?

Read the basic rules of thumb to learn more about recommended file naming practices. Afterwards, you’ll be asked to analyse several examples.

#1 Be consistent

  • At the start of a project, choose a naming pattern for all files of the same type. Then stick with it!

#2 Be descriptive

  • An FNC should include elements or attributes that best describe the file. For example, which keywords would you and collaborators use to search for the file? Which elements make the file recognizable and distinct from other files? Because every project is different, the same naming elements won’t apply to every project. When you develop an FNC, you must choose descriptors that are most relevant to the specific file(s) and/or project. Here are a few examples of file descriptors:

    • Project title

    • Conditions (lab instrument or set-up used, specimen tested, temperature, variable being measured, etc.)

    • Type or purpose of document (e.g. data, progress report, questionnaire, interview).

    • If working collaboratively, the initials of the person who ran the test

  • “Final” is NOT a descriptive name.

  • Use abbreviations that are commonly understood by collaborators, and document these abbreviations.

Describing data in a series

  • If you are working with data in a series, file names should indicate what makes each separate piece of data unique or should indicate where each piece of data falls within the collection sequence. Otherwise, it may be more difficult to retrace your research process.

For example:

  • Run of the experiment (also called the accession number, trial number, or recording ID). Add “leading zeros” so that all file names in a series stay the same length. This keeps the files in numerical order when they are sorted by name. For example:

    • In an experiment involving 500 trials, the data will be numbered like this: 001, 015, 066, 488

    • In an experiment involving 2000 trials, the data will be numbered like this: 0001, 0015, 0066, 0488, 1356

  • The date the file was created or the date the data were collected. The generally recommended format for dates is YYYYMMDD, which lists files in chronological order when they are sorted by name.
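If you generate file names with a script, the leading zeros and the date format can be produced automatically. The snippet below is a minimal Python sketch of that idea; the function names `trial_label` and `date_stamp` and the example date are made up for illustration.

```python
from datetime import date


def trial_label(trial: int, total_trials: int) -> str:
    """Zero-pad a trial number to the width of the largest trial number."""
    width = len(str(total_trials))
    return str(trial).zfill(width)


def date_stamp(d: date) -> str:
    """Format a date as YYYYMMDD."""
    return d.strftime("%Y%m%d")


print(trial_label(15, 500))           # 015
print(trial_label(66, 2000))          # 0066
print(date_stamp(date(2024, 3, 7)))   # 20240307 (hypothetical date)

# Padded labels stay in numeric order even when sorted as plain text.
labels = [trial_label(n, 500) for n in (488, 15, 66, 1)]
print(sorted(labels))                 # ['001', '015', '066', '488']
```

Without the padding, a plain-text sort would place "15" before "2", which is why the width is taken from the total number of trials.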

Check your understanding: Let’s pause to process these first two rules of thumb with two file naming scenarios.

Practice A:
A team of researchers is doing a comparative study titled “Project Vis.” It involves field observations of fish in the Netherlands at different locations and times. They establish this FNC for the datasets they'll collect in the field:

[Project name]_[location of data collection]_[date collected]_[initials of the researcher].file type
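To see what such a convention could look like in practice, here is a small Python sketch that builds a file name from these elements and checks whether an existing name follows the pattern. It assumes underscores separate the elements and that the date uses the YYYYMMDD format; the location "Delft", the initials "JS", the date, and both function names are hypothetical values chosen only for illustration.

```python
import re
from datetime import date


def build_name(project: str, location: str, collected: date,
               initials: str, ext: str) -> str:
    """Join the naming elements with underscores (assumed separator)."""
    return f"{project}_{location}_{collected:%Y%m%d}_{initials}.{ext}"


# One possible pattern for this convention: letters and digits for the
# project and location, an 8-digit date, 2-3 capital initials, an extension.
PATTERN = re.compile(r"[A-Za-z0-9]+_[A-Za-z0-9]+_\d{8}_[A-Z]{2,3}\.[a-z0-9]+")


def follows_convention(file_name: str) -> bool:
    """Return True if the whole file name matches the pattern."""
    return PATTERN.fullmatch(file_name) is not None


name = build_name("ProjectVis", "Delft", date(2024, 6, 12), "JS", "csv")
print(name)                                       # ProjectVis_Delft_20240612_JS.csv
print(follows_convention(name))                   # True
print(follows_convention("fish data final.csv"))  # False
```

A check like `follows_convention` can be run over a whole folder to spot files that drifted away from the agreed pattern.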

Practice B:
Practice applying the first two file naming rules of thumb to a research scenario.

#3 Keep file names short

  • Try to limit names to 32 characters or fewer. Shorter names are easier to scan and identify at a glance. For a visual:

32CharactersLooksExactlyLikeThis.ext

  • Try to shorten file names as much as possible by using abbreviations or acronyms (but make sure others in your field understand them).
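If you want to check name lengths for a batch of files, a short script can do the counting. This Python sketch assumes the 32-character guideline applies to the name without its extension, as in the visual example above; `MAX_STEM_LENGTH` and both function names are illustrative.

```python
from pathlib import Path

MAX_STEM_LENGTH = 32  # guideline from the text, not a technical limit


def stem_length(file_name: str) -> int:
    """Count the characters in the name, excluding the extension."""
    return len(Path(file_name).stem)


def is_short_enough(file_name: str, limit: int = MAX_STEM_LENGTH) -> bool:
    """Return True if the name (without extension) is within the limit."""
    return stem_length(file_name) <= limit


print(stem_length("32CharactersLooksExactlyLikeThis.ext"))      # 32
print(is_short_enough("32CharactersLooksExactlyLikeThis.ext"))  # True
```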

#4 Avoid spaces

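  • Some operating systems and command-line programs don’t handle spaces in file names well. For example, a command-line program may read a name containing a space as two separate names. Spaces can also cause problems when you’re transferring files between systems.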

  • Recommended: use underscores _ or hyphens - instead of spaces.
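Existing names that contain spaces can be converted with a few lines of code. The Python sketch below replaces each run of spaces with a single underscore; the function name `replace_spaces` and the example file name are made up for illustration.

```python
import re


def replace_spaces(file_name: str, separator: str = "_") -> str:
    """Replace each run of whitespace with one separator character."""
    return re.sub(r"\s+", separator, file_name.strip())


print(replace_spaces("field notes  site A.txt"))           # field_notes_site_A.txt
print(replace_spaces("field notes site A.txt", "-"))       # field-notes-site-A.txt
```

The function only returns the new name. Renaming the actual files is a separate step, which is best tried on a copy of the folder first.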

#5 Avoid “weird” characters

  • Use periods only at the end of the file name, right before the extension (the short suffix, usually 2 to 4 characters, that indicates the file format).

  • Some characters carry meanings within a computer programming environment, so it’s better to avoid them. These characters can be confusing for machines:

.&, *%#;()!@$^~'{}[]?<>
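A quick way to spot these characters in existing names is to scan each name against the list. The Python sketch below uses the characters listed above and treats a period as a problem only when it appears before the final extension; the function name `flagged_characters` and the example names are illustrative.

```python
from pathlib import Path

# Characters from the list above. The period is handled separately,
# because one period before the extension is expected.
FLAGGED = set("&, *%#;()!@$^~'{}[]?<>")


def flagged_characters(file_name: str) -> list[str]:
    """Return the sorted problem characters found in the name."""
    stem = Path(file_name).stem  # name without the final extension
    found = {ch for ch in stem if ch in FLAGGED}
    if "." in stem:              # extra period inside the name itself
        found.add(".")
    return sorted(found)


print(flagged_characters("report(draft)#2.txt"))   # ['#', '(', ')']
print(flagged_characters("data.v2.final.csv"))     # ['.']
print(flagged_characters("ProjectVis_001.csv"))    # []
```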

#6 Versioning matters: Track versions of your work

Why versioning Matters

"Why versioning matters" by TU Delft Library - Education Support is licensed under CC BY 4.0

Sometimes you may need to keep track of multiple versions or revisions of the same file.

Without a clear system for versioning, you might forget which version is which (see graphic). This could lead to unfortunate mistakes. For example, for a big class assignment you might submit "Reallyfinal.doc", when you meant to turn in "Finalfinal.doc".

This kind of confusion can be avoided by including version information in the file name. This will help you and collaborators to track the evolution of a document more clearly over time.

Here are standard recommendations to indicate version:

  • For significant changes, use whole numbers: V1, V2

  • For minor changes, use decimals: V1.1, V1.2
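The numbering scheme above can also be applied by a script. The following Python sketch assumes the version is written as a suffix such as `_V1` or `_V1.2` at the end of the file name (before the extension); that placement, the function name `bump_version`, and the example names are assumptions made for illustration.

```python
import re

# Matches a trailing "_V<major>" or "_V<major>.<minor>" suffix.
VERSION = re.compile(r"_V(\d+)(?:\.(\d+))?$")


def bump_version(stem: str, major: bool = False) -> str:
    """Return the name with its version raised by one minor or major step."""
    match = VERSION.search(stem)
    if match is None:
        return f"{stem}_V1"          # no version yet: start at V1
    base = stem[:match.start()]
    major_no = int(match.group(1))
    minor_no = int(match.group(2) or 0)
    if major:
        return f"{base}_V{major_no + 1}"          # significant change
    return f"{base}_V{major_no}.{minor_no + 1}"   # minor change


print(bump_version("thesis_draft"))                    # thesis_draft_V1
print(bump_version("thesis_draft_V1"))                 # thesis_draft_V1.1
print(bump_version("thesis_draft_V1.2", major=True))   # thesis_draft_V2
```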

Please note: Storage systems such as OneDrive and version control systems such as Git (used by platforms like GitHub) have versioning built in. If you’re working in these systems, it’s NOT advisable to use numbered versions in file names.

2c. Process the rules of thumb for file naming:

You just learned basic rules of thumb for developing an FNC. Now, take a few minutes to practice applying basic file naming guidelines to three new scenarios.

FNC Scenario #1:

FNC Scenario #2:

FNC Scenario #3:

2d. Practical applications: how do I keep track of the FNC I’ve chosen?

Alice in WL

"README Alice in Wonderland" adapted from original image by John Tenniel - John Tenniel, Public Domain, https://commons.wikimedia.org/w/index.php?curid=629633

To keep track of the FNCs they’ve established, researchers create a piece of documentation called a “README” file. A README is generally a plain-text (.txt) file saved in the same folder as the dataset(s) it describes. The README.txt acts like a short guide to your FNC: it explains and documents the schema that was used to name those specific files.

We plan to develop a separate mini-module about documentation strategies; it will go into more depth with step-by-step instructions about how to create README files. Would you like to learn more about README documentation in the meantime? We suggest visiting Harvard University’s research data management site, which offers README templates and guides.
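As a rough illustration of what a README for an FNC might contain, the Python sketch below assembles a short plain-text description of a naming pattern. The layout, the function names `readme_text` and `save_readme`, and the example abbreviation are all assumptions made for illustration, not an official template.

```python
from pathlib import Path


def readme_text(pattern: str, elements: dict[str, str],
                abbreviations: dict[str, str]) -> str:
    """Build a short plain-text description of a file naming convention."""
    lines = ["FILE NAMING CONVENTION", "", f"Pattern: {pattern}", "", "Elements:"]
    lines += [f"  {name}: {meaning}" for name, meaning in elements.items()]
    lines += ["", "Abbreviations:"]
    lines += [f"  {short} = {full}" for short, full in abbreviations.items()]
    return "\n".join(lines) + "\n"


def save_readme(folder: str, text: str) -> Path:
    """Write the text to README.txt inside the given dataset folder."""
    target = Path(folder) / "README.txt"
    target.write_text(text, encoding="utf-8")
    return target


text = readme_text(
    pattern="[Project name]_[location]_[date collected]_[initials].csv",
    elements={
        "Project name": "short project title",
        "location": "where the data were collected",
        "date collected": "YYYYMMDD",
        "initials": "researcher who collected the data",
    },
    abbreviations={"DLF": "Delft (hypothetical location code)"},
)
print(text)
```

Calling `save_readme("path/to/dataset", text)` would place the file next to the data it describes; the folder path here is a placeholder.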