Setting Up A Pipeline Configuration
Overview
This section explains how to create a new pipeline or edit an existing one. If you wish to use one of the pre-configured pipelines that come packaged with C-PAC, you can view the currently available library of pipelines here.
There are two ways of setting up or editing a pipeline configuration for C-PAC:
Using the pipeline configuration interface in the C-PAC GUI
Using a text editor (useful for remote servers where using the C-PAC GUI is impossible or impractical)
Definitions
- Workflow
A workflow accomplishes a particular processing task (e.g. functional preprocessing, scrubbing, nuisance correction). Each workflow can be turned on or off in the pipeline configuration. Sometimes a workflow can be set to both on and off, allowing for pipelines to branch.
- Pipeline
A pipeline is a combination of workflows.
- Strategy
A strategy is a set of preprocessing options. Specifically, a strategy is defined by nuisance corrections and scrubbing settings. Strategies can branch depending on which of these workflows are turned on or off and how they are configured.
- Derivative
Derivatives are the results of processing a participant’s raw data (e.g., connectivity measures).
- Atlas
An atlas provides a guide to the location of anatomical features in a coordinate space.
—[1] p. 55
An atlas provides a defined coordinate space, a template for aligning images, and labels for regions of interest.
Currently C-PAC only supports atlases in NIfTI format.
CIFTI is a popular atlas format that C-PAC does not yet support. C-PAC is an open-source project, so tips and contributions are welcome. The resources below might help enable CIFTI atlas support in a future version of C-PAC.
CIFTI Resources
Etzel, J. 2014 March. NIfTI, CIFTI, GIFTI in the HCP and Workbench: a primer.
Habeeb, H. 2018 December. Convert from CIFTI2 (HCP) to 3D array.
Mejia, M. 2015 August. A layman's guide to working with CIFTI files.
- Template
C-PAC images include the following templates:

| path in container | source |
|---|---|
| /ants_template/oasis | |
| /cpac_templates | |
| /ndmg_atlases | |
| /opt/dcan-tools/pipeline/global/templates | |
| /usr/share/data/fsl-mni152-templates | fsl-mni152-templates – MNI152 stereotaxic brain templates for FSL |
| /usr/share/fsl/5.0/data/atlases | fsl-atlases – FSL’s MNI152 standard space stereotaxic brain atlases http://fcon_1000.projects.nitrc.org/indi/cpac_resources.tar.gz |
| /usr/share/fsl/5.0/data/standard | http://fcon_1000.projects.nitrc.org/indi/cpac_resources.tar.gz |
Design A Pipeline
Note
The C-PAC pipeline configuration was changed to a nested format with import capabilities in v1.8.0.
With this change, the following configuration keys are deprecated:
crashLogDirectory
output_tree
TR
fdCalc
reGenerateOutputs
runMedianAngleCorrection
slice_timing_pattern
targetAngleDeg
runSymbolicLinks
configFileTwomm
ref_mask_2mm
template_skull_for_anat_2mm
surface_reconstruction
See Regressors specification to manually update any of the following keys:
nComponents
nuisanceBandpassFreq
numRemovePrecedingFrames
numRemoveSubsequentFrames
runFrequencyFiltering
runFristonModel
runMotionSpike
spikeThreshold
smoothing_order
already_skullstripped
roiTSOutputs
Mappings for all other C-PAC 1.7 keys can be found here.
C-PAC offers a graphical interface you can use to quickly and easily modify the default pipeline or create your own from scratch: https://fcp-indi.github.io/C-PAC_GUI/
Currently the GUI creates a C-PAC v1.6.0 pipeline configuration file. This syntax persisted through v1.7.2 but is deprecated with the release of v1.8.0.
If given a pipeline file in the older syntax, C-PAC v1.8 will attempt to convert the pipeline configuration file to the new syntax, saving the converted file in your output directory.
An update to the GUI to create v1.8.0 syntax configuration files is underway.
The newer (v1.8) syntax will not work with older versions of C-PAC.
See Using a Text Editor for configuring a custom pipeline without the GUI.
Once you save the pipeline configuration YAML file, you can provide it to the C-PAC Docker container like so:
docker run -i --rm \
-v /Users/You/local_bids_data:/bids_dataset \
-v /Users/You/some_folder:/outputs \
-v /tmp:/tmp \
-v /Users/You/Documents:/configs \
-v /Users/You/resources:/resources \
fcpindi/c-pac:latest /bids_dataset /outputs participant --pipeline-file /configs/pipeline_config.yml
Or you can provide it to the C-PAC Singularity container like so:
singularity run \
-B /Users/You/some_folder:/outputs \
-B /tmp:/tmp \
-B /Users/You/Documents:/configs \
fcpindi_c-pac_latest-{date}-{hash value}.img s3://fcp-indi/data/Projects/ADHD200/RawDataBIDS /outputs participant --pipeline-file /configs/pipeline_config.yml
Reporting errors and getting help
Please report errors on the C-PAC GitHub issue tracker. Please use Neurostars for help using C-PAC and this application.
Using a Text Editor
If you want to base a pipeline on another pipeline configuration YAML file, you can specify
FROM: /path/to/pipeline.yml
in your pipeline configuration file. You can use the name of a preconfigured pipeline instead of a filepath if you want to base a configuration file on a preconfigured pipeline. If FROM is not specified, the pipeline will be based on the default pipeline.
C-PAC will include all expected keys from the pipeline file specified in FROM (or the default pipeline if none is specified). Any keys specified in a pipeline configuration file will take precedence over the same key in the FROM base configuration, but all omitted keys will retain their values from the FROM base configuration.
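The precedence rule above can be pictured as a recursive merge of nested dictionaries. The following is only a minimal sketch of that idea, not C-PAC's own implementation: the function name merge_config and both example dictionaries are hypothetical, and how C-PAC treats non-dictionary values such as lists is an assumption here (the override simply replaces the base value).

```python
from copy import deepcopy


def merge_config(base: dict, overrides: dict) -> dict:
    """Return a copy of `base` with `overrides` applied key by key."""
    merged = deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are nested sections: merge them recursively
            merged[key] = merge_config(merged[key], value)
        else:
            # Leaf value (or mismatched types): the override wins outright
            merged[key] = deepcopy(value)
    return merged


# Hypothetical base configuration (stands in for the file named in FROM)
base = {
    "pipeline_setup": {
        "pipeline_name": "cpac-default-pipeline",
        "output_directory": {
            "path": "/outputs/output",
            "write_func_outputs": False,
        },
    }
}

# Hypothetical user configuration that overrides a single nested key
user = {"pipeline_setup": {"output_directory": {"write_func_outputs": True}}}

effective = merge_config(base, user)
print(effective["pipeline_setup"]["pipeline_name"])
print(effective["pipeline_setup"]["output_directory"])
```

In this sketch the omitted keys (pipeline_name and path) keep their base values, while the one key present in the user configuration replaces its counterpart.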
From a terminal, you can quickly generate a default pipeline configuration YAML file template in the directory you are in:
cpac utils pipe-config new-template
You can then edit the file as needed. For values that you want to leave at the default, you can either leave the key as-is or remove the key, and C-PAC will automatically use the value from the default pipeline configuration (or from the pipeline specified in FROM).
If you want to run the analysis from a terminal:
cpac run --pipe-config {path to pipeline config} {path to data config}
Pipeline configuration files, like the data settings and data configuration files discussed in the data configuration builder section, are stored as YAML files. Each of the parameters used by C-PAC to assemble your pipeline is specified as a nested key-value pair, so a pipeline configuration YAML file has multiple lines of the form key: value, like so:
pipeline_setup:
  # Name for this pipeline configuration - useful for identification.
  # This string will be sanitized and used in filepaths
  pipeline_name: cpac-default-pipeline
  output_directory:
    # Directory where C-PAC should write out processed data, logs, and crash reports.
    # - If running in a container (Singularity/Docker), you can simply set this to an arbitrary
    #   name like '/outputs', and then map (-B/-v) your desired output directory to that label.
    # - If running outside a container, this should be a full path to a directory.
    path: /outputs/output
    # (Optional) Path to a BIDS-Derivatives directory that already has outputs.
    # - This option is intended to ingress already-existing resources from an output
    #   directory without writing new outputs back into the same directory.
    # - If provided, C-PAC will ingress the already-computed outputs from this directory and
    #   continue the pipeline from where they leave off.
    # - If left as 'None', C-PAC will ingress any already-computed outputs from the
    #   output directory you provide above in 'path' instead, the default behavior.
    source_outputs_dir: None
    # Set to True to make C-PAC ingress the outputs from the primary output directory if they
    # exist, even if a source_outputs_dir is provided
    # - Setting to False will pull from source_outputs_dir every time, over-writing any
    #   calculated outputs in the main output directory
    # - C-PAC will still pull from source_outputs_dir if the main output directory is
    #   empty, however
    pull_source_once: True
    # Include extra versions and intermediate steps of functional preprocessing in the output directory.
    write_func_outputs: False
    # Include extra outputs in the output directory that may be of interest when more information is needed.
    write_debugging_outputs: False
    # Output directory format and structure.
    # Options: default, ndmg
    output_tree: "default"
    # Quality control outputs
    quality_control:
      # Generate quality control pages containing preprocessing and derivative outputs.
      generate_quality_control_images: True
      # Generate eXtensible Connectivity Pipeline-style quality control files
      generate_xcpqc_files: False
An example of a pipeline configuration YAML file can be found here. Tables explaining the keys and their potential values can be found on the individual pages for each of the outputs C-PAC is capable of producing. All pipeline setup configuration files should have the keys in the Output Settings table defined.
String values can include the simplest form of POSIX parameter expansion, ${parameter} (see https://pubs.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html#tag_02_06_02). Two special variables are included for these types of parameters:
- resolution_for_anat will be populated with the value set in registration_workflows['anatomical_registration']['resolution_for_anat'].
- func_resolution will
  - be populated with the value set in registration_workflows['functional_registration']['func_registration_to_template']['output_resolution']['func_preproc_outputs'] if funcreg is in the value’s key,
  - be populated with the value set in registration_workflows['functional_registration']['func_registration_to_template']['output_resolution']['func_derivative_outputs'] if deriv is in the value’s key, or
  - raise an exception if neither funcreg nor deriv is in the value’s key.
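The substitution rules above can be sketched with Python's standard string.Template, which understands the ${parameter} form. This is an illustration of the described behavior under stated assumptions, not C-PAC's actual code: the function name expand, the example keys, the template paths, and the resolution values are all hypothetical.

```python
from string import Template

# Hypothetical excerpt of a loaded pipeline configuration
config = {
    "registration_workflows": {
        "anatomical_registration": {"resolution_for_anat": "2mm"},
        "functional_registration": {
            "func_registration_to_template": {
                "output_resolution": {
                    "func_preproc_outputs": "3mm",
                    "func_derivative_outputs": "4mm",
                }
            }
        },
    }
}


def expand(key: str, value: str, config: dict) -> str:
    """Fill ${resolution_for_anat} and ${func_resolution} in `value`."""
    reg = config["registration_workflows"]
    variables = {
        "resolution_for_anat": reg["anatomical_registration"]["resolution_for_anat"]
    }
    if "${func_resolution}" in value:
        out_res = reg["functional_registration"][
            "func_registration_to_template"]["output_resolution"]
        # The key that holds the string decides which resolution applies
        if "funcreg" in key:
            variables["func_resolution"] = out_res["func_preproc_outputs"]
        elif "deriv" in key:
            variables["func_resolution"] = out_res["func_derivative_outputs"]
        else:
            raise ValueError(
                f"cannot resolve func_resolution for key {key!r}: "
                "expected 'funcreg' or 'deriv' in the key"
            )
    # safe_substitute leaves any other ${...} placeholders untouched
    return Template(value).safe_substitute(variables)


print(expand("T1w_brain_template", "/templates/brain_${resolution_for_anat}.nii.gz", config))
print(expand("T1w_brain_template_funcreg", "/templates/brain_${func_resolution}.nii.gz", config))
print(expand("T1w_template_deriv", "/templates/head_${func_resolution}.nii.gz", config))
```

With the hypothetical values above, the same placeholder resolves to a different resolution depending on whether the key contains funcreg or deriv, and a key containing neither raises an error instead of guessing.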
If FROM is defined (see above), any undefined keys will be inferred from the pipeline configuration specified; otherwise, any undefined keys will be inferred from the default pipeline.
Why a list?
You may notice as you learn about the settings for various outputs that many of the values for C-PAC’s configurable settings are stored in lists (i.e., multiple values are separated by commas and surrounded by square brackets). Such lists containing Ons and Offs (for True and False, respectively) allow you to toggle on multiple options at the same time, and branch a pipeline into two different analysis strategies. See the developer documentation for more information about how lists are used in C-PAC.
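To see why a list with both an on and an off value produces a branch, it can help to enumerate the combinations. The sketch below is purely illustrative and does not reflect how C-PAC builds its workflows internally: the setting names and the helper enumerate_strategies are hypothetical.

```python
from itertools import product

# Hypothetical settings; each key maps to a list of on/off values
settings = {
    "workflow_a": [True],         # on in every strategy
    "workflow_b": [True, False],  # both on and off: the pipeline branches here
    "workflow_c": [False],        # off in every strategy
}


def enumerate_strategies(settings: dict) -> list:
    """Return one dict per combination of the listed values."""
    names = list(settings)
    combos = product(*(settings[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


strategies = enumerate_strategies(settings)
for strategy in strategies:
    print(strategy)
```

Only workflow_b lists two values, so this example yields two strategies that differ in that one workflow. Each further setting that lists both values would double the count.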
Configurable Settings
Data Management and Environment Settings
Pre- and post-processing
Derivatives
Seed-based Correlation Analysis (SCA) and Dual Regression - Analyze the connectivity between brain regions.
Voxel-mirrored Homotopic Connectivity (VMHC) - Investigate connectivity between hemispheres.
Amplitude of Low Frequency Fluctuations (ALFF) and fractional ALFF (fALFF) - Measure the power of slow fluctuations in brain activity.
Regional Homogeneity (ReHo) - Measure the similarity of activity patterns across neighboring voxels.
Network Centrality - Analyze the structure of functional networks.
References