Setting Up A Pipeline Configuration

Overview

This section explains how to create a new pipeline or edit an existing one. If you wish to use one of the pre-configured pipelines that come packaged with C-PAC, you can view the currently available library of pipelines here.

There are two ways of setting up or editing a pipeline configuration for C-PAC:

  • Using the pipeline configuration interface in the C-PAC GUI

  • Using a text editor (useful for remote servers where using the C-PAC GUI is not possible or impractical)

Definitions

Design A Pipeline

Note

The C-PAC pipeline configuration was changed to a nested format with import capabilities in v1.8.0.

With this change, the following configuration keys are deprecated:

  • crashLogDirectory

  • output_tree

  • TR

  • fdCalc

  • reGenerateOutputs

  • runMedianAngleCorrection

  • slice_timing_pattern

  • targetAngleDeg

  • runSymbolicLinks

  • configFileTwomm

  • ref_mask_2mm

  • template_skull_for_anat_2mm

  • surface_reconstruction

See Regressors specification to manually update any of the following keys:

  • nComponents

  • nuisanceBandpassFreq

  • numRemovePrecedingFrames

  • numRemoveSubsequentFrames

  • runFrequencyFiltering

  • runFristonModel

  • runMotionSpike

  • spikeThreshold

  • smoothing_order

  • already_skullstripped

  • roiTSOutputs

Mappings for all other C-PAC 1.7 keys can be found here.

C-PAC offers a graphical interface you can use to quickly and easily modify the default pipeline or create your own from scratch: https://fcp-indi.github.io/C-PAC_GUI/

Currently the GUI creates a C-PAC v1.6.0 pipeline configuration file. This syntax persisted through v1.7.2 but is deprecated with the release of v1.8.0.

If given a pipeline file in the older syntax, C-PAC v1.8 will attempt to convert the pipeline configuration file to the new syntax, saving the converted file in your output directory.

An update to the GUI to create v1.8.0 syntax configuration files is underway.

The newer (v1.8) syntax will not work with older versions of C-PAC.

See Using a Text Editor for configuring a custom pipeline without the GUI.

[Image: C-PAC GUI home screen (gui_home1.png)]

Once you save the pipeline configuration YAML file, you can provide it to the C-PAC Docker container like so:

docker run -i --rm \
        -v /Users/You/local_bids_data:/bids_dataset \
        -v /Users/You/some_folder:/outputs \
        -v /tmp:/tmp \
        -v /Users/You/Documents:/configs \
        -v /Users/You/resources:/resources \
        fcpindi/c-pac:latest /bids_dataset /outputs participant --pipeline_file /configs/pipeline_config.yml

Or you can provide it to the C-PAC Singularity container like so:

singularity run \
        -B /Users/You/some_folder:/outputs \
        -B /tmp:/tmp \
        -B /Users/You/Documents:/configs \
        fcpindi_c-pac_latest-{date}-{hash value}.img s3://fcp-indi/data/Projects/ADHD200/RawDataBIDS /outputs participant --pipeline_file /configs/pipeline_config.yml

Reporting errors and getting help

Please report errors on the C-PAC GitHub issue tracker. For help using C-PAC and this application, please use the C-PAC Google Group.

Using a Text Editor

If you want to base a pipeline on another pipeline configuration YAML file, you can specify

FROM: /path/to/pipeline.yml

in your pipeline configuration file. You can use the name of a preconfigured pipeline instead of a filepath if you want to base a configuration file on a preconfigured pipeline. If FROM is not specified, the pipeline will be based on the default pipeline.

C-PAC will include all expected keys from the pipeline file specified in FROM (or the default pipeline if none is specified). Any keys specified in a pipeline configuration file will take precedence over the same key in the FROM base configuration, but all omitted keys will retain their values from the FROM base configuration.
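The precedence rule described above behaves like a recursive dictionary merge. The following is a minimal sketch of that idea, not C-PAC's actual implementation: merge_config is a hypothetical helper, the pipeline name "my-pipeline" is invented for illustration, and the keys are borrowed from the pipeline_setup example in this section.

```python
from copy import deepcopy


def merge_config(base, overrides):
    """Return a copy of `base` with `overrides` applied recursively.

    Hypothetical helper for illustration; not part of C-PAC's API.
    """
    merged = deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are nested sections: merge them key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            # A key set in the new file wins over the FROM base.
            merged[key] = deepcopy(value)
    return merged


# Stand-in for the configuration named in FROM.
base_config = {
    "pipeline_setup": {
        "pipeline_name": "cpac-default-pipeline",
        "output_directory": {
            "path": "/output",
            "write_debugging_outputs": False,
        },
    }
}

# Keys set in the new pipeline file; everything omitted is inherited.
child_config = {
    "pipeline_setup": {
        "pipeline_name": "my-pipeline",  # hypothetical name
        "output_directory": {"write_debugging_outputs": True},
    }
}

merged = merge_config(base_config, child_config)
```

In this sketch, merged keeps path: /output from the base while pipeline_name and write_debugging_outputs take the values from the new file, and base_config itself is left unmodified.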

From the terminal, you can quickly generate a default pipeline configuration YAML file template in your current directory:

cpac utils pipe_config new_template

You can then edit the file as needed. For values that you want to leave at the default, you can either leave the key as-is or remove the key, and C-PAC will automatically use the value from the default pipeline configuration (or from the pipeline specified in FROM).

If you want to run the analysis from the terminal:

cpac run --pipe_config {path to pipeline config} {path to data config}

Pipeline configuration files, like the data settings and data configuration files discussed in the data configuration builder section, are stored as YAML files. Each of the parameters C-PAC uses to assemble your pipeline is specified as a nested key-value pair, so a pipeline configuration YAML file contains multiple lines of the form key: value, like so:

pipeline_setup:

  # Name for this pipeline configuration - useful for identification.
  pipeline_name: cpac-default-pipeline

  output_directory:

    # Directory where C-PAC should write out processed data, logs, and crash reports.
    # - If running in a container (Singularity/Docker), you can simply set this to an arbitrary
    #   name like '/output', and then map (-B/-v) your desired output directory to that label.
    # - If running outside a container, this should be a full path to a directory.
    path: /output

    # (Optional) Path to a BIDS-Derivatives directory that already has outputs.
    #   - This option is intended to ingress already-existing resources from an output
    #     directory without writing new outputs back into the same directory.
    #   - If provided, C-PAC will ingress the already-computed outputs from this directory and
    #     continue the pipeline from where they leave off.
    #   - If left as 'None', C-PAC will ingress any already-computed outputs from the
    #     output directory you provide above in 'path' instead, the default behavior.
    source_outputs_dir: None

    # Set to True to make C-PAC ingress the outputs from the primary output directory if they
    # exist, even if a source_outputs_dir is provided
    #   - Setting to False will pull from source_outputs_dir every time, over-writing any
    #     calculated outputs in the main output directory
    #   - C-PAC will still pull from source_outputs_dir if the main output directory is
    #     empty, however
    pull_source_once: True

    # Include extra versions and intermediate steps of functional preprocessing in the output directory.
    write_func_outputs: False

    # Include extra outputs in the output directory that may be of interest when more information is needed.
    write_debugging_outputs: False

    # Output directory format and structure.
    # Options: default, ndmg
    output_tree: "default"

    # Quality control outputs
    quality_control:
      # Generate quality control pages containing preprocessing and derivative outputs.
      generate_quality_control_images: True

      # Generate eXtensible Connectivity Pipeline-style quality control files
      generate_xcpqc_files: False

An example of a pipeline configuration YAML file can be found here. Tables explaining the keys and their potential values can be found on the individual pages for each of the outputs C-PAC is capable of producing. All pipeline setup configuration files should have the keys in the Output Settings table defined.

String values can include the simplest form of POSIX parameter expansion, ${parameter} (see https://pubs.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html#tag_02_06_02). Two special variables are available for these parameters:

  • resolution_for_anat will be populated with the value set in registration_workflows['anatomical_registration']['resolution_for_anat'].

  • func_resolution will

    • be populated with the value set in registration_workflows['functional_registration']['func_registration_to_template']['output_resolution']['func_preproc_outputs'] if funcreg is in the value’s key,

    • be populated with the value set in registration_workflows['functional_registration']['func_registration_to_template']['output_resolution']['func_derivative_outputs'] if deriv is in the value’s key, or

    • raise an exception if neither funcreg nor deriv is in the value’s key.
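The substitution rules above can be sketched with Python's standard string.Template class. This is an illustration under stated assumptions rather than C-PAC's own code: expand_value is a hypothetical helper, the key names template_funcreg and template_deriv and all file paths are invented, and the resolution values are arbitrary.

```python
from string import Template


def expand_value(key, value, config):
    """Expand ${resolution_for_anat} and ${func_resolution} in a string value.

    Hypothetical helper for illustration; not part of C-PAC's API.
    """
    reg = config["registration_workflows"]
    mapping = {
        "resolution_for_anat": reg["anatomical_registration"]["resolution_for_anat"]
    }
    if "${func_resolution}" in value:
        out_res = reg["functional_registration"][
            "func_registration_to_template"]["output_resolution"]
        # The source of func_resolution depends on the name of the value's key.
        if "funcreg" in key:
            mapping["func_resolution"] = out_res["func_preproc_outputs"]
        elif "deriv" in key:
            mapping["func_resolution"] = out_res["func_derivative_outputs"]
        else:
            raise ValueError(
                f"'{key}' uses func_resolution but names neither funcreg nor deriv")
    # safe_substitute leaves any other $-expressions untouched.
    return Template(value).safe_substitute(mapping)


# Arbitrary example values; the nesting follows the keys named above.
config = {
    "registration_workflows": {
        "anatomical_registration": {"resolution_for_anat": "2mm"},
        "functional_registration": {
            "func_registration_to_template": {
                "output_resolution": {
                    "func_preproc_outputs": "3mm",
                    "func_derivative_outputs": "4mm",
                }
            }
        },
    }
}

anat = expand_value("template", "/tpl/T1_${resolution_for_anat}.nii.gz", config)
func = expand_value("template_funcreg", "/tpl/T1_${func_resolution}.nii.gz", config)
deriv = expand_value("template_deriv", "/tpl/T1_${func_resolution}.nii.gz", config)
```

With these example values, the same ${func_resolution} placeholder expands to 3mm under a key containing funcreg and to 4mm under a key containing deriv.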

If FROM is defined (see above), any undefined keys will be inferred from the pipeline configuration specified; otherwise, any undefined keys will be inferred from the default pipeline.

Why a list?

You may notice as you learn about the settings for various outputs that many of the values for C-PAC's configurable settings are stored in lists (i.e., multiple values are separated by commas and surrounded by square brackets). Such lists of Ons and Offs (for True and False, respectively) allow you to toggle on multiple options at the same time and to branch a pipeline into two different analysis strategies. See the developer documentation for more information about how lists are used in C-PAC.
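To make the branching idea concrete, the sketch below enumerates every combination of listed values with itertools.product. It only illustrates why a two-element list yields two strategies; it is not how C-PAC builds its workflows, and enumerate_forks, step_a, and step_b are hypothetical names.

```python
from itertools import product


def enumerate_forks(options):
    """Return one settings dict per combination of the listed values.

    Hypothetical helper for illustration; not part of C-PAC's API.
    """
    keys = list(options)
    # Treat a bare value as a one-element list so it never creates a fork.
    value_lists = [v if isinstance(v, list) else [v] for v in options.values()]
    return [dict(zip(keys, combo)) for combo in product(*value_lists)]


settings = {
    "step_a": [True, False],  # hypothetical step run both On and Off: forks
    "step_b": [True],         # hypothetical step always On: no fork
}

forks = enumerate_forks(settings)
```

Here forks holds two strategies, one with step_a on and one with it off, and step_b is on in both. Adding a second two-element list would double the count again.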

Configurable Settings

Data Management and Environment Settings

Pre- and post-processing

Derivatives

Reference

[1] Poldrack, R. A., Mumford, J. A., and Nichols, T. E. 2011 August. Handbook of Functional MRI Data Analysis. New York: Cambridge University Press.