Module: MiGA::Dataset::Base

Included in:
Result
Defined in:
lib/miga/dataset/base.rb

Constant Summary

@@RESULT_DIRS =

Directories containing the results from dataset-specific tasks

{
  # Preprocessing
  raw_reads: '01.raw_reads',
  trimmed_reads: '02.trimmed_reads',
  read_quality: '03.read_quality',
  trimmed_fasta: '04.trimmed_fasta',
  assembly: '05.assembly',
  cds: '06.cds',
  # Annotation
  essential_genes: '07.annotation/01.function/01.essential',
  mytaxa: '07.annotation/02.taxonomy/01.mytaxa',
  mytaxa_scan: '07.annotation/03.qa/02.mytaxa_scan',
  # Distances (for single-species datasets)
  taxonomy: '09.distances/05.taxonomy',
  distances: '09.distances',
  # Post-QC
  ssu: '07.annotation/01.function/02.ssu',
  stats: '90.stats'
}
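
As a rough illustration of how such a mapping can be used, the sketch below resolves a task name to a directory path. It works on a local copy of a few entries rather than on the MiGA module itself; the `result_dir` helper and the base path are hypothetical and not part of the documented API.

```ruby
# Local copy of a few entries from the hash above (not the MiGA module itself).
RESULT_DIRS = {
  raw_reads: '01.raw_reads',
  assembly: '05.assembly',
  essential_genes: '07.annotation/01.function/01.essential',
  stats: '90.stats'
}.freeze

# Hypothetical helper: joins a base data directory with the task's subdirectory.
def result_dir(base, task)
  sub = RESULT_DIRS.fetch(task) { raise ArgumentError, "Unknown task: #{task}" }
  File.join(base, sub)
end

puts result_dir('/tmp/project/data', :assembly)
# prints /tmp/project/data/05.assembly
```

Using `fetch` with a block makes an unknown task fail loudly instead of silently producing a path that ends in the base directory.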
@@KNOWN_TYPES =

Supported dataset types

{
  genome: {
    description: 'The genome from an isolate', multi: false
  },
  scgenome: {
    description: 'A Single-cell Amplified Genome (SAG)', multi: false
  },
  popgenome: {
    description: 'A Metagenome-Assembled Genome (MAG)', multi: false
  },
  metagenome: {
    description: 'A metagenome (excluding viromes)', multi: true
  },
  virome: {
    description: 'A viral metagenome', multi: true
  }
}
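
The `multi` flag is what separates single-organism from multi-organism types. A minimal sketch of reading it, using a local copy of the hash and a hypothetical `multi_status` helper (returning `nil` for unknown types is an assumption made here for illustration):

```ruby
# Local copy of the hash above (not the MiGA module itself).
KNOWN_TYPES = {
  genome: { description: 'The genome from an isolate', multi: false },
  scgenome: { description: 'A Single-cell Amplified Genome (SAG)', multi: false },
  popgenome: { description: 'A Metagenome-Assembled Genome (MAG)', multi: false },
  metagenome: { description: 'A metagenome (excluding viromes)', multi: true },
  virome: { description: 'A viral metagenome', multi: true }
}.freeze

# Hypothetical helper: true or false for known types, nil for unknown ones.
def multi_status(type)
  info = KNOWN_TYPES[type]
  info && info[:multi]
end

multi_types = KNOWN_TYPES.select { |_, v| v[:multi] }.keys
puts multi_types.inspect
# prints [:metagenome, :virome]
```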
@@PREPROCESSING_TASKS =

Tasks to be executed before project-wide tasks

[
  :raw_reads, :trimmed_reads, :read_quality, :trimmed_fasta,
  :assembly, :cds, :essential_genes, :mytaxa, :mytaxa_scan,
  :taxonomy, :distances, :ssu, :stats
]
@@EXCLUDE_NOREF_TASKS =

Tasks to be excluded from query datasets.

[:mytaxa_scan, :taxonomy]
@@_EXCLUDE_NOREF_TASKS_H =
Hash[@@EXCLUDE_NOREF_TASKS.map { |i| [i, true] }]
@@ONLY_NONMULTI_TASKS =

Tasks to be executed only in datasets that are not multi-organism. These tasks are ignored for multi-organism datasets or for unknown types.

[:mytaxa_scan, :taxonomy, :distances]
@@_ONLY_NONMULTI_TASKS_H =
Hash[@@ONLY_NONMULTI_TASKS.map { |i| [i, true] }]
@@ONLY_MULTI_TASKS =

Tasks to be executed only in datasets that are multi-organism. These tasks are ignored for single-organism datasets or for unknown types.

[:mytaxa]
@@_ONLY_MULTI_TASKS_H =
Hash[@@ONLY_MULTI_TASKS.map { |i| [i, true] }]
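
The three exclusion lists above can be read together as a filter over the preprocessing tasks. The sketch below is one possible interpretation of the descriptions, not MiGA's actual implementation: `applicable_tasks` is a hypothetical helper, `ref` stands for whether the dataset is a reference (non-query) dataset, and `multi` is `true`, `false`, or `nil` for an unknown type.

```ruby
# Local copies of the task lists above (not the MiGA module itself).
PREPROCESSING_TASKS = [
  :raw_reads, :trimmed_reads, :read_quality, :trimmed_fasta,
  :assembly, :cds, :essential_genes, :mytaxa, :mytaxa_scan,
  :taxonomy, :distances, :ssu, :stats
].freeze
EXCLUDE_NOREF_TASKS = [:mytaxa_scan, :taxonomy].freeze
ONLY_NONMULTI_TASKS = [:mytaxa_scan, :taxonomy, :distances].freeze
ONLY_MULTI_TASKS = [:mytaxa].freeze

# Hypothetical helper: tasks that remain after applying the three rules.
# multi: true, false, or nil (unknown type); ref: true for reference datasets.
def applicable_tasks(multi:, ref:)
  PREPROCESSING_TASKS.reject do |task|
    (!ref && EXCLUDE_NOREF_TASKS.include?(task)) ||
      (multi != false && ONLY_NONMULTI_TASKS.include?(task)) ||
      (multi != true && ONLY_MULTI_TASKS.include?(task))
  end
end

# A reference isolate genome skips only :mytaxa under this reading.
puts (PREPROCESSING_TASKS - applicable_tasks(multi: false, ref: true)).inspect
# prints [:mytaxa]
```

The comparisons against `false` and `true` (rather than plain truthiness) are what make an unknown type (`nil`) skip both the multi-only and the non-multi-only tasks, as the descriptions state.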
@@OPTIONS =

Options supported by datasets

{
  db_project: {
    desc: 'Project to use as database', type: String
  },
  dist_req: {
    desc: 'Run distances against these datasets', type: Array, default: []
  }
}
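
Each option carries a `type` and, optionally, a `default`. A minimal sketch of how such a specification could be applied when reading an option follows; the `option_value` helper and the `stored` hash are hypothetical, and the type check is an assumption about how `type` might be enforced rather than documented behavior.

```ruby
# Local copy of the hash above (not the MiGA module itself).
OPTIONS = {
  db_project: { desc: 'Project to use as database', type: String },
  dist_req: {
    desc: 'Run distances against these datasets', type: Array, default: []
  }
}.freeze

# Hypothetical helper: returns the stored value or the declared default,
# and rejects values that do not match the declared type.
def option_value(stored, key)
  spec = OPTIONS.fetch(key) { raise ArgumentError, "Unsupported option: #{key}" }
  value = stored.key?(key) ? stored[key] : spec[:default]
  unless value.nil? || value.is_a?(spec[:type])
    raise TypeError, "#{key} must be a #{spec[:type]}"
  end
  value
end

puts option_value({}, :dist_req).inspect
# prints []
puts option_value({ db_project: 'refseq' }, :db_project)
# prints refseq
```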