data

Home dataset_description.json participants_* motion\

Minimal data standards

Motivation

This page is intended as a blueprint for the data organization of all movement sensor-based projects of the Neurogeriatrics research group. It should help to streamline analysis pipelines and keep projects organized for research partners.

Folder Structure

The general folder structure for any dataset is depicted below:

.
├── rawdata/
|   ├── dataset_description.json
│   ├── participants_overview.tsv
│   ├── participants_overview.json
│   ├── sub-<label>
│   |   ├── motion
│   ├── sub-<label>
│   └── ...
├── sourcedata/
├── scripts/

where:

Subject-specific Folders

Then, for each subject, the data are organized in a folder sub-<label>, with data from different recording modalities, e.g. EEG, motion, etc., organized in a separate folder:

.
├── sourcedata/
├── rawdata/
    ├── sub-<label>
        ├── eeg/
        ├── motion/
        ├── ...
        └── sub-<label>_scans.tsv

Within each modality-specific folder, the data files are organized as follows:

.
├── sourcedata/
├── rawdata/
    ├── sub-<label>
        ├── eeg/
        ├── motion/
            ├── sub-<label>_task-<label>_tracksys-<label>_channels.tsv
            ├── sub-<label>_task-<label>_tracksys-<label>_motion.tsv
            └── ...
        ├── ...
        └── sub-<label>_scans.tsv

The data are stored in text-based format, and are TAB delimited. In this way, data can be read into MATLAB, Python, or even Excel. However, if the files become too large, the user can change the file format e.g. .npy, but keeping the same tabular structure.

File name structure

A filename consists of a chain of entities, or key-value pairs, a suffix and an extension. Two prominent examples of entities are subject and session.

For a data file that was collected in a given session from a given subject, the filename MUST begin with the string sub-<label>_ses-<label>. If the session level is omitted in the folder structure, the filename MUST begin with the string sub-<label>, without ses-<label>.

Note that sub-<label> corresponds to the subject entity because it has the sub- “key” and<label> “value”, where <label> would in a real data file correspond to a unique identifier of that subject, such as 01. The same holds for the session entity with its ses- key and its <label> value.

If one subject completes multiple task within the same session, the key task must be introduced in the file name (for example *task-StraightWalk*). If one task is repeated multiple times, this can be indicated using the run key (for example *task-StraightWalk_run-01*). For each repetition per subject (sessions, tasks, runs) the user should give detailed information about these in a according *.json in the top level directory.

The full template would read as follows ([] means optional):
sub-<label>[_ses-<label>][_task-<label>][_run-<index>][_tracksys-<label>]_motion.*

Example:
sub-COKI10043_ses-T1_task-DualTaskMM_run-02_tracksys-IMU1_motion.tsv

NOTE: sub and task key have to be included in every filename.

About

This page is maintained by Robbin Johannes Romijnders and Julius Welzel in collaboration with Clint Hansen, Johanna Geritz and Walter Maetzler.