Pipelines yaml files

There are a number of *.yml files located in config/pipelines/. These configure the flow of plate purposes through a Pipeline. Limber automatically loads all .yml files within this directory into PipelineList. Filenames, and the grouping of pipelines within files, have no functional relevance; they exist purely for organizational reasons.

Loading of yaml files is handled by ConfigLoader::PipelinesLoader which loads all files, detects potential duplicates, and populates the PipelineList.
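
The loading behaviour described above can be sketched in a few lines of Ruby. This is a minimal illustration only, not the actual ConfigLoader::PipelinesLoader: the load_pipelines helper is hypothetical, it returns a plain Hash rather than a PipelineList, and it only catches a name that is repeated across files.

```ruby
require 'yaml'

# Hypothetical helper (an assumption, not part of Limber).
# Merges the top-level keys of every .yml file in `dir` into one Hash and
# raises if a pipeline name has already been defined by an earlier file.
def load_pipelines(dir)
  Dir.glob(File.join(dir, '*.yml')).sort.each_with_object({}) do |path, pipelines|
    # An empty file parses to nil, so fall back to an empty Hash.
    config = YAML.safe_load(File.read(path), aliases: true) || {}
    config.each do |name, options|
      if pipelines.key?(name)
        raise ArgumentError, "Duplicate pipeline '#{name}' in #{File.basename(path)}"
      end

      pipelines[name] = options
    end
  end
end
```

Calling load_pipelines('config/pipelines') would then return a Hash keyed by pipeline name, or fail as soon as a name is seen a second time.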

TIP It is suggested that you create a new file for each new 'pipeline'. In most cases this file will actually contain a handful of internal 'pipelines' reflecting branches, or different stages of the process.

An example file

This is an example yaml file configuring a WGS (whole genome sequencing) pipeline.

---
WGS: # Top of the pipeline (Library Prep)
  filters:
    request_type_key:
      - limber_wgs
      - limber_lcmb
      - limber_rnaa
    library_type: Standard
  library_pass: LB Lib PCR-XP
  relationships:
    LB Cherrypick: LB Shear
    LB Shear: LB Post Shear
    LB Post Shear: LB End Prep
    LB End Prep: LB Lib PCR
    LB Lib PCR: LB Lib PCR-XP
WGS MX: # Bottom of the pipeline (Pooling and normalization)
  filters:
    request_type_key:
      - limber_multiplexing
  relationships:
    LB Lib PCR-XP: LB Lib Pool
    LB Lib Pool: LB Lib Pool Norm

The rest of the document describes the structure of this file, and what each of the keys do.

Top level

Each file is a .yml file located in config/pipelines and contains the configuration for one or more pipelines.

The top level structure consists of a series of keys, each uniquely identifying a pipeline. Keys need to be unique across all pipelines, not just those within the same file. Limber will detect duplicate keys and raise an exception on boot.

The key is used to set Pipeline#name, which is exposed in the pipelines overview page and may be shown to the user in future.

The values in turn describe each Pipeline. The valid options are detailed in Pipeline below.

Pipeline

Each pipeline configures a name, high-level behaviour and a list of relationships. As discussed above, the key is a unique value, which gets used to set the pipeline's name.

@see Pipeline for the Ruby objects generated by this configuration.

WGS: # Top of the pipeline (Library Prep)
  filters:
    request_type_key:
      - limber_wgs
      - limber_lcmb
      - limber_rnaa
    library_type: Standard
  library_pass: LB Lib PCR-XP
  relationships:
    LB Cherrypick: LB Shear
    LB Shear: LB Post Shear
    LB Post Shear: LB End Prep
    LB End Prep: LB Lib PCR
    LB Lib PCR: LB Lib PCR-XP

The other keys are detailed below.

pipeline_group

This groups several Limber pipelines together that are part of the same real world pipeline.

For instance, 'Heron-384 Tailed A V2' and 'Heron-384 Tailed B V2' - the split here is purely for technical reasons, to allow branching. In reality, they are both part of the Heron pipeline.

Another example is when there are separate Limber pipelines for sequential stages. For instance, 'pWGS-384' (the library prep part) and 'pWGS-384 MX' (the multiplexing part). In reality, these are both part of the same pipeline, so they both have the pipeline group 'pWGS-384'.

The pipeline group is used in the 'Work in progress' pages and the 'Pipelines overview' page.
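
As a rough illustration of how a shared pipeline_group ties separate Limber pipelines together, the sketch below groups pipeline names by that key. The configuration is deliberately stripped down (filters and relationships are omitted), and the group name 'Heron' is an assumption made for the example.

```ruby
require 'yaml'

# Illustrative configuration only. The 'Heron' group name is assumed.
config = YAML.safe_load(<<~YAML)
  pWGS-384:
    pipeline_group: pWGS-384
  pWGS-384 MX:
    pipeline_group: pWGS-384
  Heron-384 Tailed A V2:
    pipeline_group: Heron
  Heron-384 Tailed B V2:
    pipeline_group: Heron
YAML

# Collect the Limber pipeline names under each real world pipeline group.
groups = config.group_by { |_name, options| options['pipeline_group'] }
               .transform_values { |pairs| pairs.map(&:first) }

groups['pWGS-384'] # => ["pWGS-384", "pWGS-384 MX"]
```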

filters

Filters are the way in which a pipeline works out if it is in progress. They consist of a series of keys and their acceptable values. Keys should be attributes on the request (eg. library_type), whereas values are either an array of acceptable values or a single acceptable value.

filters:
  request_type_key:
    - limber_wgs
    - limber_lcmb
    - limber_rnaa
  library_type: Standard

This indicates that the pipeline can be used for requests with a request type of 'limber_wgs', 'limber_lcmb' or 'limber_rnaa', and a library type of 'Standard'.

The most common keys to filter on are request_type_key and library_type.

All filters must be fulfilled for a pipeline to be considered valid.
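
The matching rule can be sketched as follows. This is a minimal illustration under stated assumptions: filters_match? is a hypothetical helper, and the request is represented as a plain Hash of attribute names to values rather than a real request object.

```ruby
# Hypothetical matcher (an assumption, not Limber's implementation).
# Every filter key must be satisfied; each filter value may be a single
# acceptable value or an array of acceptable values.
def filters_match?(filters, request)
  filters.all? do |attribute, accepted|
    # Array() wraps a single value so both forms are handled the same way.
    Array(accepted).include?(request[attribute])
  end
end

filters = {
  'request_type_key' => %w[limber_wgs limber_lcmb limber_rnaa],
  'library_type' => 'Standard'
}

request = { 'request_type_key' => 'limber_wgs', 'library_type' => 'Standard' }
filters_match?(filters, request) # => true
```

A request with a matching request type but a different library type would be rejected, because all filters have to pass.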

For branching pipelines with identical filters, you are strongly encouraged to use yaml anchors to share the filter between pipelines. See the relationships section below for more details, and an example.

library_pass

library_pass indicates the plate purposes for which the LIMS should suggest the 'Charge and Pass Libraries' option. The values should be strings matching purpose names specified in config/purposes/*.yml.

It can be a string if library pass should be suggested at a single step:

library_pass: LB Lib PCR-XP

Or an array, if there are multiple points at which a library can be passed:

library_pass:
  - LB Cap Lib PCR-XP
  - LB Cap Lib Pool
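
Because library_pass accepts either form, code that consumes it has to treat a string and an array alike. The sketch below shows one way to do that; library_pass? is a hypothetical helper and the pipeline is represented as a plain Hash, both assumptions made for illustration.

```ruby
# Hypothetical check (an assumption, not Limber's implementation).
# Array() turns a single string into a one-element array and nil into an
# empty array, so all three cases share one code path.
def library_pass?(pipeline_config, purpose_name)
  Array(pipeline_config['library_pass']).include?(purpose_name)
end

single   = { 'library_pass' => 'LB Lib PCR-XP' }
multiple = { 'library_pass' => ['LB Cap Lib PCR-XP', 'LB Cap Lib Pool'] }
absent   = {}

library_pass?(single, 'LB Lib PCR-XP')     # => true
library_pass?(multiple, 'LB Cap Lib Pool') # => true
library_pass?(absent, 'LB Lib Pool')       # => false
```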

TIP library_pass usually occurs on the last plate of the pipeline, immediately prior to multiplexing and normalization. This is the point at which the pipeline transitions from the library creation request (eg. limber_wgs) to the multiplexing request (eg. limber_multiplexing). You'll see this reflected in the example above, with the 'WGS' and 'WGS MX' pipelines.

This split ensures that customers can request re-pools of existing libraries, without incurring further charges for library creation.

It is common, although not necessary, to specify both library_creation and multiplexing sections of a pipeline in the same file.

library_pass is not specified for the final tube in the WGS MX pipeline because:

The behaviour is already handled by passing the tube itself.

Multiplexing is not charged for, and rarely failed, so an explicit step is unnecessary and confusing.

relationships

The relationships key is a hash representing transitions from parent labware to child labware. Both keys and values are strings matching purpose names specified in config/purposes/*.yml.

relationships:
  LB Cherrypick: LB Shear
  LB Shear: LB Post Shear
  LB Post Shear: LB End Prep
  LB End Prep: LB Lib PCR
  LB Lib PCR: LB Lib PCR-XP

The above shows a transition from 'LB Cherrypick' to 'LB Shear', 'LB Shear' to 'LB Post Shear' and so on.
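
Since each key maps a parent purpose to its child, the full route through a pipeline can be recovered by following the hash from a starting purpose. The purpose_chain helper below is a hypothetical sketch of that walk, not part of Limber.

```ruby
# Hypothetical helper (an assumption, not Limber's implementation).
# Follows parent => child links until a purpose has no child.
def purpose_chain(relationships, start)
  chain = [start]
  while (child = relationships[chain.last])
    break if chain.include?(child) # guard against an accidental cycle

    chain << child
  end
  chain
end

relationships = {
  'LB Cherrypick' => 'LB Shear',
  'LB Shear' => 'LB Post Shear',
  'LB Post Shear' => 'LB End Prep',
  'LB End Prep' => 'LB Lib PCR',
  'LB Lib PCR' => 'LB Lib PCR-XP'
}

purpose_chain(relationships, 'LB Cherrypick')
# => ["LB Cherrypick", "LB Shear", "LB Post Shear", "LB End Prep", "LB Lib PCR", "LB Lib PCR-XP"]
```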

TIP In most Limber pipelines, the final multiplex library tube is created upfront by the limber_multiplexing request. This allows the SSRs to access the sequencing requests easily prior to the completion of library creation, allowing for the addition or removal of requests. A side effect of this is that any Limber pipelines using the standard limber_multiplexing request share the final tube purpose, 'LB Lib Pool Norm'. This is defined in: final_tube

It should be noted that because the above structure is a hash, it is not possible to reflect a branching pipeline. Instead, each branch of the pipeline can be represented by a separate pipeline within the same file.

For example, the Heron pipeline has A and B forks, representing the PCR 1 and PCR 2 routes.

TIP Note the use of &heron_filters and *heron_filters in the example below. This allows a filter to be shared between two branches of the pipeline. You are *strongly* encouraged to use this approach when dealing with branched pipelines with identical filters. In the past there have been several occasions where failure to follow this pattern has resulted in a library type only getting added to one branch of the pipeline by mistake.

---
Heron-384 A: # Heron 384-well pipeline specific to PCR 1 plate
  filters: &heron_filters
    request_type_key: limber_heron
    library_type: PCR amplicon ligated adapters 384
  library_pass: LHR-384 Lib PCR
  relationships:
    LHR-384 RT: LHR-384 PCR 1
    LHR-384 PCR 1: LHR-384 cDNA
    LHR-384 cDNA: LHR-384 XP
    LHR-384 XP: LHR-384 End Prep
    LHR-384 End Prep: LHR-384 AL Lib
    LHR-384 AL Lib: LHR-384 Lib PCR
Heron-384 B: # Heron 384-well pipeline specific to PCR 2 plate (uses above relationships after cDNA plate)
  filters: *heron_filters
  relationships:
    LHR-384 RT: LHR-384 PCR 2
    LHR-384 PCR 2: LHR-384 cDNA
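
To see the effect of the anchor and alias, the sketch below parses a trimmed copy of the Heron example with Ruby's standard YAML library and confirms that both branches end up with the same filters. It is an illustration only: relationships are omitted, and how Limber itself parses the files is not shown here. Note that YAML.safe_load rejects aliases unless aliases: true is passed.

```ruby
require 'yaml'

# Trimmed copy of the Heron example, keeping only the shared filters.
yaml = <<~YAML
  Heron-384 A:
    filters: &heron_filters
      request_type_key: limber_heron
      library_type: PCR amplicon ligated adapters 384
  Heron-384 B:
    filters: *heron_filters
YAML

config = YAML.safe_load(yaml, aliases: true)

filters_a = config['Heron-384 A']['filters']
filters_b = config['Heron-384 B']['filters']

filters_a == filters_b    # => true
filters_b['library_type'] # => "PCR amplicon ligated adapters 384"
```

Adding a library type to the anchored filter therefore changes both branches at once, which is the failure mode the tip above is intended to prevent.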