The PHOENIX filesystem
This guide describes the PHOENIX filesystem itself, not any particular software implementation that produces one.
GENERAL and PROTECTED folders
At the root level of the PHOENIX filesystem, there are two subdirectories, GENERAL and PROTECTED:
PHOENIX
├── GENERAL
└── PROTECTED
Data that are not encrypted at rest are stored under the GENERAL folder. Data that are encrypted at rest are stored under the PROTECTED folder.
Note
Types of data that are encrypted at rest tend to include GPS, onsite interviews, and voice recordings.
Study folders
Under the GENERAL and PROTECTED folders are STUDY folders:
PHOENIX
├── GENERAL
│   └── STUDY_A
└── PROTECTED
    └── STUDY_A
Note
Study folder names should contain only letters, numbers, and underscores [A-Za-z0-9_].
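The naming rule above can be checked mechanically before a folder is created. The following is a minimal sketch in Python; the helper name `is_valid_name` is hypothetical (it is not part of any PHOENIX tooling), and rejecting empty names is an assumption of this sketch.

```python
import re

# Letters, digits, and underscores only, per the note above.
# Requiring at least one character is an assumption of this sketch.
_NAME_RE = re.compile(r"[A-Za-z0-9_]+")


def is_valid_name(name: str) -> bool:
    """Return True if `name` consists only of characters in [A-Za-z0-9_]."""
    return _NAME_RE.fullmatch(name) is not None


print(is_valid_name("STUDY_A"))
print(is_valid_name("STUDY-A"))
```

Using `fullmatch` rather than `match` ensures that a trailing space, hyphen, or newline causes the name to be rejected.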
Study folder permissions
Each STUDY folder is assigned the default permissions rwx------. Individual user permissions are then added using POSIX.1e access control lists. To add read (ls) and execute (cd) permissions on the STUDY_A folder for user jdoe, you would issue the following command:
setfacl -m u:jdoe:rx /PHOENIX/GENERAL/STUDY_A /PHOENIX/PROTECTED/STUDY_A
Warning
Many, but not all, filesystems support POSIX.1e access control lists. For example, some versions of PANASAS do not support them at all, while other filesystems, such as NFSv4, may require different tools and/or a syntax that differs from the one shown above.
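When access has to be granted for many users or studies, the command above can be generated rather than typed by hand. The sketch below only builds the argument list and does not run it; the function name `setfacl_command` is hypothetical, and the `/PHOENIX` root is taken from the example command above.

```python
import shlex
from typing import List

PHOENIX_ROOT = "/PHOENIX"  # root path as used in the example command above


def setfacl_command(user: str, study: str, perms: str = "rx") -> List[str]:
    """Build (but do not run) a setfacl call covering both copies of a study folder."""
    targets = [
        f"{PHOENIX_ROOT}/{branch}/{study}" for branch in ("GENERAL", "PROTECTED")
    ]
    return ["setfacl", "-m", f"u:{user}:{perms}", *targets]


cmd = setfacl_command("jdoe", "STUDY_A")
print(shlex.join(cmd))
```

On a filesystem that supports POSIX.1e ACLs, the resulting list could be passed to `subprocess.run(cmd, check=True)`.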
Subject folders
Within each STUDY folder are individual SUBJECT folders. Subject names should be unique across PHOENIX:
PHOENIX
├── GENERAL
│   └── STUDY_A
│       └── SUBJECT_1
└── PROTECTED
    └── STUDY_A
        └── SUBJECT_1
Warning
While subject names should be unique across PHOENIX, this is not enforced by Lochness in any way.
Note
Subject names should contain only letters, numbers, and underscores [A-Za-z0-9_].
Data type folders
Within each SUBJECT folder, there are folders for each DATA TYPE:
PHOENIX
├── GENERAL
│   └── STUDY_A
│       └── SUBJECT_1
│           └── DATA_TYPE
└── PROTECTED
    └── STUDY_A
        └── SUBJECT_1
            └── DATA_TYPE
Some example DATA TYPE names include actigraphy, mri, phone, and surveys.
Note
Data type names should contain only letters, numbers, and underscores [A-Za-z0-9_].
Raw and processed folders
Within each DATA TYPE folder, there are folders for raw and processed data:
PHOENIX
├── GENERAL
│   └── STUDY_A
│       └── SUBJECT_1
│           └── DATA_TYPE
│               ├── raw
│               └── processed
└── PROTECTED
    └── STUDY_A
        └── SUBJECT_1
            └── DATA_TYPE
                ├── raw
                └── processed
raw
The raw folders are the bedrock of the PHOENIX filesystem. These folders are typically populated by data aggregation software. The user designated to run the data aggregation software should be the only user with write permissions on these folders. All other users should be granted read-only permissions.
processed
The processed folders are assigned the permissions rwxrwxrwt, which allows any user who has been granted access to the parent STUDY folder to write files. Because these folders have the sticky bit set, users cannot delete or rename files they do not own; whether another user can edit a file depends on that file's own permissions.
Note
Folders must be named raw and processed in lowercase letters.
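To make the layout and the sticky-bit mode concrete, here is a minimal Python sketch that creates the raw and processed folders for one subject and data type under a temporary root. The function name `make_data_type_folders` is hypothetical, the temporary directory stands in for a real PHOENIX root, and the ownership and read-only settings for raw are deliberately left out.

```python
import os
import stat
import tempfile
from pathlib import Path

# World-writable with the sticky bit set (octal 1777, shown by ls as rwxrwxrwt).
PROCESSED_MODE = stat.S_ISVTX | 0o777


def make_data_type_folders(root: Path, study: str, subject: str, data_type: str) -> list:
    """Create raw/ and processed/ under GENERAL and PROTECTED; return the processed paths."""
    processed_dirs = []
    for branch in ("GENERAL", "PROTECTED"):
        base = root / branch / study / subject / data_type
        (base / "raw").mkdir(parents=True, exist_ok=True)
        processed = base / "processed"
        processed.mkdir(parents=True, exist_ok=True)
        # chmod explicitly, because mkdir modes are filtered by the umask.
        os.chmod(processed, PROCESSED_MODE)
        processed_dirs.append(processed)
    return processed_dirs


with tempfile.TemporaryDirectory() as tmp:
    dirs = make_data_type_folders(Path(tmp), "STUDY_A", "SUBJECT_1", "actigraphy")
    modes = [stat.filemode(d.stat().st_mode) for d in dirs]
print(modes)
```

The explicit `os.chmod` call matters: a mode passed to `mkdir` alone would be reduced by the process umask and could silently drop the write bits.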
Product folders (optional)
Within each raw folder, there may be folders for each data-capturing PRODUCT. This allows multiple data-capturing products that happen to capture the same type of data to be clearly differentiated:
PHOENIX
├── GENERAL
│   └── STUDY_A
│       └── SUBJECT_1
│           └── DATA_TYPE
│               └── raw
│                   └── PRODUCT
└── PROTECTED
    └── STUDY_A
        └── SUBJECT_1
            └── DATA_TYPE
                └── raw
                    └── PRODUCT
Some example product names include Actiwatch2 and GENEActiv.
Note
Product names should contain only letters, numbers, and underscores [A-Za-z0-9_].
Raw file integrity
As raw files are being downloaded from each data source, their contents are stored in hidden files. These hidden files should be ignored by end users. A file is renamed to a visible file name only after it is considered to have downloaded successfully. If the file contents can be verified using a checksum, a byte count, or some other means, the file is verified before it is made visible.
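The download-then-reveal pattern described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the behavior of any particular aggregator: the function name `save_verified` is hypothetical, the hidden file is assumed to be the final name with a leading dot, SHA-256 is assumed as the checksum, and deleting a file that fails verification is a choice made for this sketch.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def save_verified(dest: Path, content: bytes, expected_sha256: str) -> Path:
    """Write to a hidden file, verify its SHA-256, then rename it into view."""
    hidden = dest.with_name("." + dest.name)  # hidden naming scheme is an assumption
    hidden.write_bytes(content)
    actual = hashlib.sha256(hidden.read_bytes()).hexdigest()
    if actual != expected_sha256:
        hidden.unlink()  # discarding a bad download is a choice of this sketch
        raise ValueError(f"checksum mismatch for {dest.name}")
    os.replace(hidden, dest)  # atomic when both names are on the same filesystem
    return dest


with tempfile.TemporaryDirectory() as tmp:
    payload = b"example payload"
    target = Path(tmp) / "data.csv"
    save_verified(target, payload, hashlib.sha256(payload).hexdigest())
    visible = sorted(p.name for p in Path(tmp).iterdir())
print(visible)
```

Because the rename happens only after verification, a reader listing the folder sees either no file or a complete, verified one, never a partial download under the final name.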
Raw file naming convention
As a general rule, files either preserve their original names or are assigned a name provided by the originating data source. In instances where the originating data source does not provide a file name, an appropriately descriptive file name is generated automatically.
Metadata files
In PHOENIX, all data for a subject are downloaded and organized under unique SUBJECT folders. To accomplish this, the data aggregation software must understand how to query for data belonging to the SUBJECT within each data source. This is achieved using metadata files. Each STUDY folder must contain a metadata file:
PHOENIX
└── GENERAL
    └── STUDY_A
        └── STUDY_A_metadata.csv
The metadata file should be named with the study name followed by a _metadata.csv suffix. The data aggregator is largely driven by these PHOENIX metadata files. The minimal contents of a metadata file should look like this:
Active,Subject ID,Consent
1,SUBJECT_1,2019-01-01
For convenience, here’s the same file rendered as a table
Active | Subject ID | Consent |
---|---|---|
1 | SUBJECT_1 | 2019-01-01 |
You must add a column to this file for each supported data source from which you wish to pull data.
See also
You can read much more about the supported data sources on the data sources page.
For the sake of brevity, let’s see what a metadata file looks like when we add a Beiwe column:
Active | Subject ID | Consent | Beiwe |
---|---|---|---|
1 | SUBJECT_1 | 2019-01-01 | beiwe.example:5e2311:abcde |
This instructs the data aggregator that SUBJECT_1 should have data in the Beiwe instance beiwe.example, under the study 5e2311, under the patient abcde.
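To illustrate how such a row could be interpreted, the sketch below reads the table above as CSV and splits the Beiwe value into its three colon-separated parts. The function name `beiwe_targets` is hypothetical, and the sketch assumes exactly one `instance:study:patient` triple per cell and that a subject is active only when the Active column is `1`; a real aggregator may accept more than this.

```python
import csv
import io

# Same content as the table above, written as CSV text.
METADATA = """Active,Subject ID,Consent,Beiwe
1,SUBJECT_1,2019-01-01,beiwe.example:5e2311:abcde
"""


def beiwe_targets(metadata_text: str) -> dict:
    """Map each active subject to its (instance, study, patient) Beiwe triple."""
    result = {}
    for row in csv.DictReader(io.StringIO(metadata_text)):
        if row["Active"].strip() != "1":
            continue  # treating only "1" as active is an assumption of this sketch
        instance, study, patient = row["Beiwe"].strip().split(":")
        result[row["Subject ID"].strip()] = (instance, study, patient)
    return result


targets = beiwe_targets(METADATA)
print(targets)
```

The three-way unpacking raises a `ValueError` if a cell does not contain exactly two colons, which surfaces malformed metadata early instead of querying the wrong study or patient.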