The PHOENIX filesystem
This guide describes the PHOENIX filesystem itself, not any particular software implementation that produces a PHOENIX filesystem.
GENERAL and PROTECTED folders
At the root level of the PHOENIX filesystem there are two subdirectories, GENERAL and
PROTECTED:
PHOENIX
├── GENERAL
└── PROTECTED
Data that are not encrypted at rest are stored under the GENERAL folder. Data that
are encrypted at rest are stored under the PROTECTED folder.
Note
Data types that are typically encrypted at rest include GPS, onsite interviews, and voice recordings.
Study folders
Under the GENERAL and PROTECTED folders are STUDY folders:
PHOENIX
├── GENERAL
│   └── STUDY_A
└── PROTECTED
    └── STUDY_A
Note
Study folder names should contain only letters, numbers, and underscores [A-Za-z0-9_].
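The naming rule can be checked mechanically. The following is a minimal sketch, not part of any PHOENIX implementation: the function name `is_valid_name` is hypothetical, and rejecting the empty string is an assumption, since the rule does not address it.

```python
import re

# Letters, digits, and underscores only. Requiring at least one character
# is an assumption; the naming rule itself does not mention empty names.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def is_valid_name(name: str) -> bool:
    """Return True if `name` uses only the characters [A-Za-z0-9_]."""
    return _NAME_PATTERN.fullmatch(name) is not None
```

The same check can be reused for subject, data type, and product names, which follow the same rule.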
Study folder permissions
Each STUDY folder is assigned the default permissions rwx------. Individual user
permissions are then added using POSIX.1e access control lists. To add read (ls) and
execute (cd) permissions on the STUDY_A folder for user jdoe, issue
the following command:
setfacl -m u:jdoe:rx /PHOENIX/GENERAL/STUDY_A /PHOENIX/PROTECTED/STUDY_A
Warning
Many, but not all, filesystems support POSIX.1e access control lists. For example, some versions of PANASAS do not support them at all, while other filesystems, such as NFSv4, may require different tools or a syntax that differs from the one shown above.
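When access is granted from a script, the same command can be assembled programmatically. This is a hedged sketch: `setfacl_command` is a hypothetical helper, the `/PHOENIX` root is taken from the example command, and the function only builds the argument list. Actually running it requires `setfacl` and a filesystem that supports POSIX.1e access control lists.

```python
from pathlib import PurePosixPath


def setfacl_command(user, perms, study, root="/PHOENIX"):
    """Build a setfacl argument list covering both copies of a STUDY folder.

    Hypothetical helper: it returns the command and leaves execution to the
    caller, for example subprocess.run(cmd, check=True).
    """
    base = PurePosixPath(root)
    targets = [str(base / branch / study) for branch in ("GENERAL", "PROTECTED")]
    return ["setfacl", "-m", f"u:{user}:{perms}", *targets]


cmd = setfacl_command("jdoe", "rx", "STUDY_A")
print(" ".join(cmd))
# setfacl -m u:jdoe:rx /PHOENIX/GENERAL/STUDY_A /PHOENIX/PROTECTED/STUDY_A
```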
Subject folders
Within each STUDY folder are individual SUBJECT folders. Subject names should be
unique across PHOENIX:
PHOENIX
├── GENERAL
│   └── STUDY_A
│       └── SUBJECT_1
└── PROTECTED
    └── STUDY_A
        └── SUBJECT_1
Warning
While subject names should be unique across PHOENIX, this is not enforced by Lochness in any way.
Note
Subject names should contain only letters, numbers, and underscores [A-Za-z0-9_].
Data type folders
Within each SUBJECT folder, there is a folder for each DATA TYPE:
PHOENIX
├── GENERAL
│   └── STUDY_A
│       └── SUBJECT_1
│           └── DATA_TYPE
└── PROTECTED
    └── STUDY_A
        └── SUBJECT_1
            └── DATA_TYPE
Some example DATA TYPE names include actigraphy, mri, phone, and
surveys.
Note
Data type names should contain only letters, numbers, and underscores [A-Za-z0-9_].
Raw and processed folders
Within each DATA TYPE folder, there are folders for raw and processed
data:
PHOENIX
├── GENERAL
│   └── STUDY_A
│       └── SUBJECT_1
│           └── DATA_TYPE
│               ├── raw
│               └── processed
└── PROTECTED
    └── STUDY_A
        └── SUBJECT_1
            └── DATA_TYPE
                ├── raw
                └── processed
raw
The raw folders are the bedrock of the PHOENIX filesystem. These folders are
typically populated by data aggregation software. The user designated to run the
data aggregation software should be the only user with write permissions on
these folders. All other users should be granted read-only permissions.
processed
The processed folders are assigned the permissions rwxrwxrwt (octal mode 1777), which
allow any user who has been granted access to the parent STUDY folder to write files.
Because these folders have the sticky bit set, only the owner of a file (or the owner
of the folder) may rename or delete it.
Note
Folders must be named raw and processed in lowercase letters.
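The two folders and their permissions could be created along the following lines. This is a sketch under stated assumptions: `create_stage_folders` and `mode_string` are hypothetical helpers, and the mode chosen for raw (rwxr-xr-x) is one assumed way to make the owner the only writer. The processed folder receives the sticky-bit mode, octal 1777, which `ls` displays as rwxrwxrwt.

```python
import os
import stat
from pathlib import Path


def create_stage_folders(data_type_dir):
    """Create raw and processed folders under a DATA TYPE folder.

    Assumption: raw is set to rwxr-xr-x so that only its owner (the data
    aggregation user) can write. processed is set to mode 1777 (sticky bit).
    """
    data_type_dir = Path(data_type_dir)
    raw = data_type_dir / "raw"
    processed = data_type_dir / "processed"
    for folder in (raw, processed):
        folder.mkdir(parents=True, exist_ok=True)
    # chmod explicitly: a mode passed to mkdir is filtered by the umask.
    os.chmod(raw, 0o755)
    os.chmod(processed, 0o1777)
    return raw, processed


def mode_string(path):
    """Return the ls-style permission string for a path."""
    return stat.filemode(os.stat(path).st_mode)
```

Calling `mode_string` on the processed folder afterwards is a quick way to confirm that the sticky bit was actually applied on the target filesystem.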
Product folders (optional)
Within each raw folder, there may be a folder for each data-capturing PRODUCT.
This allows multiple products that capture the same type of data to be
clearly differentiated:
PHOENIX
├── GENERAL
│   └── STUDY_A
│       └── SUBJECT_1
│           └── DATA_TYPE
│               └── raw
│                   └── PRODUCT
│
└── PROTECTED
    └── STUDY_A
        └── SUBJECT_1
            └── DATA_TYPE
                └── raw
                    └── PRODUCT
Some product names include Actiwatch2 and GENEActiv.
Note
Product names should contain only letters, numbers, and underscores [A-Za-z0-9_].
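The full hierarchy down to the optional product level can be expressed as a small path builder. This is an illustrative sketch only: `raw_folder` is a hypothetical helper, `/PHOENIX` is an assumed root location, and pairing the actigraphy data type with the GENEActiv product is an example choice rather than something PHOENIX prescribes.

```python
from pathlib import PurePosixPath


def raw_folder(root, study, subject, data_type, product=None, protected=False):
    """Build the raw folder path for one subject and data type.

    Hypothetical helper; the optional PRODUCT level is appended only when given.
    """
    branch = "PROTECTED" if protected else "GENERAL"
    path = PurePosixPath(root) / branch / study / subject / data_type / "raw"
    return path / product if product else path


print(raw_folder("/PHOENIX", "STUDY_A", "SUBJECT_1", "actigraphy", product="GENEActiv"))
# /PHOENIX/GENERAL/STUDY_A/SUBJECT_1/actigraphy/raw/GENEActiv
```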
Raw file integrity
While a raw file is being downloaded from a data source, its contents are stored
in a hidden file.
End users should ignore these hidden files. A hidden file is renamed to a
visible file name only after the download is considered successful. If the
file contents can be verified using a checksum, a byte count, or some other means,
the file is verified before it is made visible.
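The hidden-file pattern can be sketched as follows. The details here are assumptions for illustration, not a description of any particular aggregator: the hidden name is the final name prefixed with a dot, verification uses SHA-256 when a digest is supplied, and `save_verified` is a hypothetical function name.

```python
import hashlib
import os
from pathlib import Path


def save_verified(content, dest, expected_sha256=None):
    """Write bytes to a hidden file, verify them, then make the file visible.

    Assumptions: the hidden name is "." + the final name, and the checksum,
    when one is available, is a SHA-256 hex digest.
    """
    dest = Path(dest)
    hidden = dest.with_name("." + dest.name)
    hidden.write_bytes(content)
    if expected_sha256 is not None:
        actual = hashlib.sha256(hidden.read_bytes()).hexdigest()
        if actual != expected_sha256:
            hidden.unlink()  # discard the bad download; nothing becomes visible
            raise ValueError(f"checksum mismatch for {dest.name}")
    # Rename within the same directory, so readers never see a partial file.
    os.replace(hidden, dest)
    return dest
```

Because the rename happens only after verification, a visible file in a raw folder can be treated as complete, while anything still carrying the hidden name is either in progress or abandoned.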
Raw file naming convention
As a general rule, files keep their original names or are assigned a name provided by the originating data source. Where the originating data source does not provide a file name, an appropriately descriptive file name is generated automatically.
Metadata files
In PHOENIX, all data for a subject are downloaded and organized under unique SUBJECT
folders. To accomplish this, the data aggregation software must understand how to query
for data belonging to the SUBJECT within each data source. This is achieved using
metadata files. Each STUDY folder must contain a metadata file:
PHOENIX
└── GENERAL
    └── STUDY_A
        └── STUDY_A_metadata.csv
The metadata file should be named with the study name followed by a _metadata.csv
suffix. The data aggregator is largely driven by these PHOENIX metadata files. The
minimal contents of a metadata file should look like this:
Active,Subject ID,Consent Date
1,SUBJECT_1,2019-01-01
For convenience, here is the same file rendered as a table:
| Active | Subject ID | Consent Date |
|---|---|---|
| 1 | SUBJECT_1 | 2019-01-01 |
You must add additional columns to this file for each supported data source that you wish to pull data from.
See also
You can read much more about the supported data sources on the data sources page.
As a brief example, here is what a metadata file looks like when we
add a Beiwe column:
| Active | Subject ID | Consent Date | Beiwe |
|---|---|---|---|
| 1 | SUBJECT_1 | 2019-01-01 | beiwe.example:5e2311:abcde |
This instructs the data aggregator that SUBJECT_1 should have data in the
Beiwe instance beiwe.example, under the study 5e2311, under the patient
abcde.
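To make the column format concrete, here is a minimal parsing sketch. It is not the aggregator's actual code: `read_beiwe_targets` is a hypothetical function, the CSV text is the table above written out as a file, and two points are assumptions: that an Active value of 1 marks a subject to be processed, and that a Beiwe cell holds exactly three colon-separated fields.

```python
import csv
import io

# The metadata table above, written out as CSV text.
METADATA = """\
Active,Subject ID,Consent Date,Beiwe
1,SUBJECT_1,2019-01-01,beiwe.example:5e2311:abcde
"""


def read_beiwe_targets(text):
    """Map each active subject to its (instance, study, patient) triple.

    Assumptions: Active == "1" means the subject should be processed, and
    each Beiwe cell has the form instance:study:patient.
    """
    targets = {}
    for row in csv.DictReader(io.StringIO(text)):
        if row["Active"].strip() != "1" or not row.get("Beiwe"):
            continue
        instance, study, patient = row["Beiwe"].strip().split(":")
        targets[row["Subject ID"]] = (instance, study, patient)
    return targets


print(read_beiwe_targets(METADATA))
# {'SUBJECT_1': ('beiwe.example', '5e2311', 'abcde')}
```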