

Child Mind Institute Healthy Brain Network (HBN)

From the About page of the HBN website:

An ongoing initiative focused on creating and sharing a biobank comprised of data from 10,000 New York City area children and adolescents (ages 5-21). The Healthy Brain Network has adopted a community-referred recruitment model. Specifically, study advertisements seek the participation of families who have concerns about one or more psychiatric symptoms in their child. The Healthy Brain Network Biobank houses data about psychiatric, behavioral, cognitive, and lifestyle (e.g., fitness, diet) phenotypes, as well as multimodal brain imaging, electroencephalography, digital voice and video recordings, genetics, and actigraphy. Beyond accelerating transdiagnostic research, we discuss the potential of the Healthy Brain Network Biobank to advance related areas, such as biophysical modeling, voice and speech analysis, natural viewing fMRI and EEG, and methods optimization.

Steps to produce this study's data dictionaries

Note: Some of the following COINS access instructions were copied from the HBN Phenotypic Data Access webpage.

  1. Go to the COINS Data Exchange website.
  2. Log in using your COINS user ID and password. If you do not have an account, select the Get Account option.
  3. From the main screen of the COINS Data Exchange, click on Study Information.
  4. Click on the drop-down for Select a study and choose CMI_HBN.
  5. Under Study Docs, download all_data_dicts_Aug_2018.zip into the HBN/ subfolder without renaming the ZIP file.
  6. Unzip the ZIP file in place. The HBN/ subfolder hierarchy should now look like this:

    HBN/
    ├── all_data_dicts_Aug_2018.zip
    ├── Data Dictionaries/
    ├── dictionary.py
    └── README.md

  7. Install the required Python 3 libraries (pandas and openpyxl) with the following line of code:

    python3 -m pip install --user pandas openpyxl

  8. Run the following line of code within the HBN/ subfolder:

    python3 dictionary.py
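Steps 5 and 6 can also be done from Python, which makes it easy to confirm the folder layout before running dictionary.py. The following is a minimal sketch, not part of this repository: the helper names `unzip_in_place` and `missing_entries` are hypothetical, and the list of expected entries is simply copied from the hierarchy shown in step 6.

```python
import zipfile
from pathlib import Path


def unzip_in_place(zip_path: Path) -> list:
    """Extract zip_path into its own parent folder and return the member names."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(zip_path.parent)
        return archive.namelist()


def missing_entries(hbn_dir: Path) -> list:
    """Return the expected entries (per step 6) that are absent from hbn_dir."""
    expected = [
        "all_data_dicts_Aug_2018.zip",
        "Data Dictionaries",
        "dictionary.py",
        "README.md",
    ]
    return [name for name in expected if not (hbn_dir / name).exists()]


if __name__ == "__main__":
    hbn = Path("HBN")  # assumption: run from the folder that contains HBN/
    archive = hbn / "all_data_dicts_Aug_2018.zip"
    if archive.exists():
        unzip_in_place(archive)
    print("Missing entries:", missing_entries(hbn) or "none")
```

If the script reports no missing entries, the folder matches the layout expected by step 8.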

Notes about this data dictionary

  1. Some questionnaires (listed here) have an entry on the last line that states "Continue to"; this entry is ignored when creating the corresponding .json file for that questionnaire.
  2. On some questionnaires the ShortName is entered as Variable Name, on others as Variable.
  3. Similarly, the Description header, which contains the description for every question on the questionnaire, was entered as Question, Question, or Item.
  4. Some questionnaires used Value Labels instead of Value Label to describe the range of possible values.
  5. Many of the levels from different questionnaires needed adjustments, which were provided as notes in the corresponding .xlsx file. The code has many different if statements to handle the correct behaviour for each questionnaire.
  6. The SWAN questionnaire is provided twice with the same data, under the names SWAN.xlsx and SWAN .xlsx (with a space before the extension).
  7. Some of the levels on SCARED_P and SCARED_SR are defined as values greater than or equal to (>=) a specific threshold.
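The header inconsistencies in notes 1 to 4 can be handled with a small alias table. The sketch below is illustrative only and is not the logic in dictionary.py: `HEADER_ALIASES`, `canonical_header`, and `normalize_rows` are hypothetical names, the choice of "Value Label" as the canonical spelling is an assumption, and the sample rows are made up. It uses plain dictionaries instead of pandas so that it runs on its own.

```python
# Hypothetical mapping from the header variants in the notes above to one name.
HEADER_ALIASES = {
    "Variable Name": "ShortName",
    "Variable": "ShortName",
    "Question": "Description",
    "Item": "Description",
    "Value Labels": "Value Label",
}


def canonical_header(header: str) -> str:
    """Strip stray whitespace and map a known variant to its canonical name."""
    cleaned = header.strip()
    return HEADER_ALIASES.get(cleaned, cleaned)


def normalize_rows(rows: list) -> list:
    """Rename the headers of every row and drop a trailing "Continue to" entry."""
    normalized = [
        {canonical_header(key): value for key, value in row.items()}
        for row in rows
    ]
    # Assumption: the "Continue to" text may sit in any cell of the last row.
    if normalized and any(
        str(value).strip().startswith("Continue to")
        for value in normalized[-1].values()
    ):
        normalized.pop()  # navigation text, not a questionnaire variable
    return normalized


if __name__ == "__main__":
    # Made-up rows for illustration; real rows come from the .xlsx files.
    sample = [
        {"Variable Name": "EXAMPLE_01", "Question": "Example item", "Value Labels": "0=No, 1=Yes"},
        {"Variable Name": "Continue to next form", "Question": "", "Value Labels": ""},
    ]
    print(normalize_rows(sample))
```

Keeping the variants in one table means a newly discovered spelling only needs one extra entry rather than another if statement.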