Pipelines yaml files
There are a number of *.yml files located in config/pipelines/. These
configure the flow of plate purposes through a Pipeline. Limber automatically
loads all .yml files within this directory into PipelineList.
Filenames, and the grouping of pipelines within files, have no functional
relevance, and are intended for organizational reasons.
Loading of yaml files is handled by ConfigLoader::PipelinesLoader which loads all files, detects potential duplicates, and populates the PipelineList.
TIP It is suggested that you create a new file for each new 'pipeline'. In most cases this file will actually contain a handful of internal 'pipelines' reflecting branches, or different stages of the process.
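The loading behaviour described above can be pictured with a short Ruby sketch. This is not the code of ConfigLoader::PipelinesLoader; the method name load_pipelines and the error message are assumptions made purely for illustration. It reads every .yml file in a directory, merges the top-level keys into one hash, and raises if a key appears in more than one file.

```ruby
require 'yaml'

# Hypothetical helper, not Limber's real loader: merges every .yml file in
# +directory+ into a single hash and refuses keys defined in more than one file.
def load_pipelines(directory)
  Dir.glob(File.join(directory, '*.yml')).sort.each_with_object({}) do |path, pipelines|
    # aliases: true permits YAML anchors (&name) and aliases (*name)
    config = YAML.safe_load(File.read(path), aliases: true) || {}
    duplicates = pipelines.keys & config.keys
    unless duplicates.empty?
      raise ArgumentError, "Duplicate pipeline keys in #{path}: #{duplicates.join(', ')}"
    end

    pipelines.merge!(config)
  end
end
```

Calling load_pipelines('config/pipelines') would then return one hash keyed by pipeline name, regardless of how the pipelines are spread across files.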
An example file
This is an example yaml file configuring a WGS (whole genome sequencing) pipeline.
---
WGS: # Top of the pipeline (Library Prep)
  filters:
    request_type_key:
      - limber_wgs
      - limber_lcmb
      - limber_rnaa
    library_type: Standard
  library_pass: LB Lib PCR-XP
  relationships:
    LB Cherrypick: LB Shear
    LB Shear: LB Post Shear
    LB Post Shear: LB End Prep
    LB End Prep: LB Lib PCR
    LB Lib PCR: LB Lib PCR-XP
WGS MX: # Bottom of the pipeline (Pooling and normalization)
  filters:
    request_type_key:
      - limber_multiplexing
  relationships:
    LB Lib PCR-XP: LB Lib Pool
    LB Lib Pool: LB Lib Pool Norm
The rest of the document describes the structure of this file, and what each of the keys do.
Top level
Each file is a .yml file located in config/pipelines and contains the
configuration for one or more pipelines.
The top level structure consists of a series of keys, each uniquely identifying a pipeline. Keys need to be unique across all pipelines, not just those within the same file. Limber will detect duplicate keys, and will raise an exception on boot.
The key is used to set the Pipeline#name. This is exposed in the pipelines overview page, and may get shown to the user in future.
The values in turn are used to describe each Pipeline. The valid options are detailed in Pipeline below.
Pipeline
Each pipeline configures a name, high-level behaviour and a list of relationships. As discussed above, the key is a unique value, which gets used to set the pipeline's name.
@see Pipeline for the Ruby objects generated by this configuration.
WGS: # Top of the pipeline (Library Prep)
  filters:
    request_type_key:
      - limber_wgs
      - limber_lcmb
      - limber_rnaa
    library_type: Standard
  library_pass: LB Lib PCR-XP
  relationships:
    LB Cherrypick: LB Shear
    LB Shear: LB Post Shear
    LB Post Shear: LB End Prep
    LB End Prep: LB Lib PCR
    LB Lib PCR: LB Lib PCR-XP
The other keys are detailed below.
pipeline_group
This groups several Limber pipelines together that are part of the same real world pipeline.
For instance, 'Heron-384 Tailed A V2' and 'Heron-384 Tailed B V2' - the split here is purely for technical reasons, to allow branching. In reality, they are both part of the Heron pipeline.
Another example is when there are separate Limber pipelines for sequential stages. For instance, 'pWGS-384' (the library prep part) and 'pWGS-384 MX' (the multiplexing part). In reality, these are both part of the same pipeline, so they both have the pipeline group 'pWGS-384'.
The pipeline group is used in the 'Work in progress' pages and the 'Pipelines overview' page.
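As a rough illustration of how a shared group ties pipelines together, the Ruby sketch below parses a trimmed config and collects pipeline names by their pipeline_group value. The config is cut down to the one key (filters and relationships are omitted), and pipelines_by_group is a hypothetical helper, not part of Limber. It also assumes every pipeline sets pipeline_group explicitly.

```ruby
require 'yaml'

# Trimmed, illustrative config: only pipeline_group is shown.
PWGS_CONFIG = YAML.safe_load(<<~YAML)
  pWGS-384:
    pipeline_group: pWGS-384
  pWGS-384 MX:
    pipeline_group: pWGS-384
YAML

# Hypothetical helper: maps each pipeline group to the pipeline names in it.
def pipelines_by_group(config)
  config.keys.group_by { |name| config[name]['pipeline_group'] }
end

pipelines_by_group(PWGS_CONFIG).each do |group, names|
  puts "#{group}: #{names.join(', ')}"
end
```

Both Limber pipelines end up listed under the single group 'pWGS-384', which is the kind of grouping the overview pages rely on.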
filters
Filters are the way in which a pipeline works out if it is in progress. They consist of a series of keys and their acceptable values. Keys should be attributes on the request (eg. library_type), whereas values are either an array of acceptable values or a single acceptable value.
filters:
  request_type_key:
    - limber_wgs
    - limber_lcmb
    - limber_rnaa
  library_type: Standard
This indicates that the pipeline can be used for requests with a request type of 'limber_wgs', 'limber_lcmb' or 'limber_rnaa', and a library type of 'Standard'.
The most common keys to filter on are request_type_key (the request type, as used in the example above) and library_type.
All filters must be fulfilled for a pipeline to be considered valid.
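The matching rule can be sketched in a few lines of Ruby. This is an illustration under stated assumptions rather than Limber's implementation: filters_match? is a hypothetical method, and the request is represented by a plain Hash instead of a real request object.

```ruby
# Hypothetical matcher: +request+ is a plain Hash standing in for a request.
# Every filter key must match for the pipeline to be considered valid.
def filters_match?(filters, request)
  filters.all? do |attribute, accepted|
    # Array() wraps a single value and leaves an array unchanged,
    # so both forms of filter value are handled the same way.
    Array(accepted).include?(request[attribute])
  end
end

WGS_FILTERS = {
  'request_type_key' => %w[limber_wgs limber_lcmb limber_rnaa],
  'library_type' => 'Standard'
}.freeze

matching = { 'request_type_key' => 'limber_lcmb', 'library_type' => 'Standard' }
puts filters_match?(WGS_FILTERS, matching)
```

A request with an accepted request type but a different library type would not match, because every filter has to be satisfied.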
For branching pipelines with identical filters, you are strongly encouraged to use yaml anchors to share the filter between pipelines. See the relationships section below for more details, and an example.
library_pass
library_pass indicates the plate purposes for which the LIMS should suggest the
'Charge and Pass Libraries' option. The values should be strings matching purpose names specified in config/purposes/*.yml.
It can be a string if library pass should be suggested at a single step:
library_pass: LB Lib PCR-XP
Or an array, if there are multiple points at which a library can be passed:
library_pass:
  - LB Cap Lib PCR-XP
  - LB Cap Lib Pool
TIP library_pass usually occurs on the last plate of the pipeline, immediately prior to multiplexing and normalization. This is the point at which the pipeline transitions from the library creation request (eg. limber_wgs) to the multiplexing request (eg. limber_multiplexing). You'll see this reflected in the example above, with the 'WGS' and 'WGS MX' pipelines.
This split ensures that customers can request re-pools of existing libraries, without incurring further charges for library creation.
It is common, although not necessary, to specify both library_creation and multiplexing sections of a pipeline in the same file.
library_pass is not specified for the final tube in the WGS MX pipeline because:
- The behaviour is already handled by passing the tube itself
- Multiplexing is not charged for, and rarely failed, so an explicit step is unnecessary and confusing.
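Because library_pass may be a string, an array, or absent altogether, code that consumes it needs to treat all three uniformly. The Ruby sketch below shows one way to do that; library_pass? is a hypothetical helper and the hashes are hand-written stand-ins for parsed pipeline configuration.

```ruby
# Hypothetical helper: true when the given purpose is listed under library_pass.
# Array(nil) is [], so pipelines without the key never suggest the option.
def library_pass?(pipeline_config, purpose_name)
  Array(pipeline_config['library_pass']).include?(purpose_name)
end

single   = { 'library_pass' => 'LB Lib PCR-XP' }
multiple = { 'library_pass' => ['LB Cap Lib PCR-XP', 'LB Cap Lib Pool'] }
absent   = {} # e.g. a multiplexing pipeline with no library_pass key

puts library_pass?(single, 'LB Lib PCR-XP')
puts library_pass?(multiple, 'LB Cap Lib Pool')
puts library_pass?(absent, 'LB Lib Pool Norm')
```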
relationships
The relationships key holds a hash representing transitions from parent labware to
child labware. Both keys and values are strings matching purpose names specified
in config/purposes/*.yml.
relationships:
  LB Cherrypick: LB Shear
  LB Shear: LB Post Shear
  LB Post Shear: LB End Prep
  LB End Prep: LB Lib PCR
  LB Lib PCR: LB Lib PCR-XP
The above shows a transition from 'LB Cherrypick' to 'LB Shear', 'LB Shear' to 'LB Post Shear' and so on.
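To make the parent-to-child reading concrete, the following Ruby sketch walks a relationships hash from a starting purpose and returns the ordered chain of purposes. purpose_chain is a hypothetical helper written for this illustration, and the hash is a hand-copied version of the relationships shown above.

```ruby
WGS_RELATIONSHIPS = {
  'LB Cherrypick' => 'LB Shear',
  'LB Shear' => 'LB Post Shear',
  'LB Post Shear' => 'LB End Prep',
  'LB End Prep' => 'LB Lib PCR',
  'LB Lib PCR' => 'LB Lib PCR-XP'
}.freeze

# Hypothetical helper: follows parent => child links starting at +start+.
def purpose_chain(relationships, start)
  chain = [start]
  # Stop when the last purpose has no child, or when a purpose would
  # repeat (a guard against accidental loops in the configuration).
  while (child = relationships[chain.last]) && !chain.include?(child)
    chain << child
  end
  chain
end

puts purpose_chain(WGS_RELATIONSHIPS, 'LB Cherrypick').join(' -> ')
```

Each parent maps to exactly one child, so the walk always yields a single linear path.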
TIP In most Limber pipelines, the final multiplex library tube is created upfront by the limber_multiplexing request. This allows the SSRs to access the sequencing requests easily prior to the completion of library creation, allowing for the addition or removal of requests. A side effect of this is that any Limber pipelines using the standard limber_multiplexing request share the final tube purpose, 'LB Lib Pool Norm'. This is defined in: final_tube
It should be noted that because the above structure is a hash, it is not possible to reflect a branching pipeline. Instead, each branch of the pipeline can be represented by a separate pipeline within the same file.
For example, the Heron pipeline has A and B forks, representing the PCR 1 and PCR 2 routes.
TIP Note the use of &heron_filters and *heron_filters in the example below. This allows a filter to be shared between two branches of the pipeline. You are *strongly* encouraged to use this approach when dealing with branched pipelines with identical filters. In the past there have been several occasions where failure to follow this pattern has resulted in a library type only getting added to one branch of the pipeline by mistake.
---
Heron-384 A: # Heron 384-well pipeline specific to PCR 1 plate
  filters: &heron_filters
    request_type_key: limber_heron
    library_type: PCR amplicon ligated adapters 384
  library_pass: LHR-384 Lib PCR
  relationships:
    LHR-384 RT: LHR-384 PCR 1
    LHR-384 PCR 1: LHR-384 cDNA
    LHR-384 cDNA: LHR-384 XP
    LHR-384 XP: LHR-384 End Prep
    LHR-384 End Prep: LHR-384 AL Lib
    LHR-384 AL Lib: LHR-384 Lib PCR
Heron-384 B: # Heron 384-well pipeline specific to PCR 2 plate (uses above relationships after cDNA plate)
  filters: *heron_filters
  relationships:
    LHR-384 RT: LHR-384 PCR 2
    LHR-384 PCR 2: LHR-384 cDNA
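The effect of the anchor and alias can be checked with a small Ruby sketch. It parses a trimmed copy of the Heron example (relationships and library_pass are left out) and confirms that both branches end up with identical filters. The constant name HERON_CONFIG is an assumption for this illustration. Note that YAML.safe_load rejects aliases unless aliases: true is passed.

```ruby
require 'yaml'

# Trimmed copy of the example above: only the filters are kept.
HERON_CONFIG = YAML.safe_load(<<~YAML, aliases: true)
  Heron-384 A:
    filters: &heron_filters
      request_type_key: limber_heron
      library_type: PCR amplicon ligated adapters 384
  Heron-384 B:
    filters: *heron_filters
YAML

filters_a = HERON_CONFIG['Heron-384 A']['filters']
filters_b = HERON_CONFIG['Heron-384 B']['filters']

# The alias expands to the anchored mapping, so both branches agree.
puts filters_a == filters_b
puts filters_b['library_type']
```

Because the filter is written once, adding a library type under the anchor updates both branches together.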