Class: MiGA::Dataset

Inherits:
MiGA
  • Object
Includes:
Hooks, Result
Defined in:
lib/miga/dataset/base.rb,
lib/miga/dataset.rb

Overview

Dataset representation in MiGA.

Defined Under Namespace

Modules: Base, Hooks, Result

Constant Summary

Constants included from MiGA

CITATION, VERSION, VERSION_DATE, VERSION_NAME

Instance Attribute Summary

Class Method Summary

Instance Method Summary

Methods included from Hooks

#default_hooks, #hook__pull_result_hooks, #hook_clear_run_counts, #hook_run_cmd

Methods included from Common::Hooks

#add_hook, #default_hooks, #hook_run_lambda, #hooks, #pull_hook

Methods included from Result

#cleanup_distances!, #done_preprocessing?, #first_preprocessing, #ignore_task?, #next_preprocessing, #profile_advance, #result_base, #result_status, #results_status, #why_ignore

Methods included from Common::WithResult

#add_result, #each_result, #get_result, #result, #result_dirs, #results

Methods inherited from MiGA

CITATION, DEBUG, DEBUG_OFF, DEBUG_ON, DEBUG_TRACE_OFF, DEBUG_TRACE_ON, FULL_VERSION, LONG_VERSION, VERSION, VERSION_DATE, initialized?, #result_files_exist?

Methods included from Common::Path

#root_path, #script_path

Methods included from Common::Format

#clean_fasta_file, #seqs_length, #tabulate

Constructor Details

#initialize(project, name, is_ref = true, metadata = {}) ⇒ Dataset

Create a MiGA::Dataset object in project (a MiGA::Project) with a uniquely identifying name. is_ref indicates whether the dataset is to be treated as reference (true, default) or query (false). Pass any additional metadata as a Hash.



# File 'lib/miga/dataset.rb', line 47

def initialize(project, name, is_ref = true, metadata = {})
  unless name.miga_name?
    raise 'Invalid name, please use only alphanumerics and underscores: ' +
      name.to_s
  end
  @project = project
  @name = name
  @metadata = nil
  metadata[:ref] = is_ref
  @metadata_future = [
    File.expand_path("metadata/#{name}.json", project.path),
    metadata
  ]
  save unless File.exist? @metadata_future[0]
end
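
The guard and the is_ref handling in the constructor can be sketched without MiGA installed. This is a minimal illustration, not the library's code: NAME_PATTERN, valid_dataset_name? and build_dataset_metadata are hypothetical names, and the pattern is only inferred from the error message ("alphanumerics and underscores"), so String#miga_name? may differ in detail.

```ruby
# Hypothetical stand-in for the name check; inferred from the error
# message, not copied from MiGA.
NAME_PATTERN = /\A[A-Za-z0-9_]+\z/

def valid_dataset_name?(name)
  name.is_a?(String) && NAME_PATTERN.match?(name)
end

# Mirrors the constructor's two steps: reject invalid names, then record
# the reference/query flag under the :ref key of the metadata Hash.
def build_dataset_metadata(name, is_ref = true, metadata = {})
  unless valid_dataset_name?(name)
    raise ArgumentError,
          "Invalid name, please use only alphanumerics and underscores: #{name}"
  end
  metadata.merge(ref: is_ref)
end

reference = build_dataset_metadata('E_coli_K12')
query = build_dataset_metadata('sample_01', false, { user: 'lab' })
```

Here reference carries ref: true by default, while query keeps the extra user entry and carries ref: false.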

Instance Attribute Details

#name ⇒ Object (readonly)

Datasets are uniquely identified by name in a project.



# File 'lib/miga/dataset.rb', line 40

def name
  @name
end

#project ⇒ Object (readonly)

MiGA::Project that contains the dataset.



# File 'lib/miga/dataset.rb', line 36

def project
  @project
end

Class Method Details

.exist?(project, name) ⇒ Boolean

Does the project already have a dataset with that name?

Returns:

  • (Boolean)


# File 'lib/miga/dataset.rb', line 20

def exist?(project, name)
  not project.dataset_names_hash[name].nil?
end

.INFO_FIELDS ⇒ Object

Standard fields of metadata for datasets.



# File 'lib/miga/dataset.rb', line 26

def INFO_FIELDS
  %w(name created updated type ref user description comments)
end

.KNOWN_TYPES ⇒ Object



# File 'lib/miga/dataset/base.rb', line 9

def KNOWN_TYPES ; @@KNOWN_TYPES ; end

.PREPROCESSING_TASKS ⇒ Object



# File 'lib/miga/dataset/base.rb', line 10

def PREPROCESSING_TASKS ; @@PREPROCESSING_TASKS ; end

.RESULT_DIRS ⇒ Object



# File 'lib/miga/dataset/base.rb', line 8

def RESULT_DIRS ; @@RESULT_DIRS ; end

Instance Method Details

#activate! ⇒ Object

Activate a dataset. This removes the :inactive flag.



# File 'lib/miga/dataset.rb', line 104

def activate!
  self.metadata[:inactive] = nil
  self.metadata.save
  pull_hook :on_activate
end

#closest_relatives(how_many = 1, ref_project = false) ⇒ Object

Returns an Array of how_many duples (Arrays) sorted by AAI:

  • 0: A String with the name(s) of the reference dataset.

  • 1: A Float with the AAI.

This function is currently supported only for query datasets when ref_project is false (default), and only for reference datasets when ref_project is true. It returns nil if this analysis is not supported.



# File 'lib/miga/dataset.rb', line 153

def closest_relatives(how_many = 1, ref_project = false)
  return nil if (is_ref? != ref_project) or is_multi?
  r = result(ref_project ? :taxonomy : :distances)
  return nil if r.nil?
  db = SQLite3::Database.new(r.file_path :aai_db)
  db.execute(
    'SELECT seq2, aai FROM aai WHERE seq2 != ? ' \
    'GROUP BY seq2 ORDER BY aai DESC LIMIT ?', [name, how_many])
end
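
The selection performed by the SQL query can be illustrated in plain Ruby, without SQLite. This is only a sketch under stated assumptions: AAI_ROWS is hypothetical data standing in for the aai table (one row per seq2, so the GROUP BY step is a no-op), and top_relatives is an illustrative helper, not part of MiGA.

```ruby
# Hypothetical rows standing in for the aai table: [seq2, aai].
AAI_ROWS = [
  ['ref_A', 71.2],
  ['ref_B', 95.4],
  ['query_1', 100.0],
  ['ref_C', 88.9]
].freeze

# Same selection as the query: drop the dataset itself, sort by AAI in
# descending order, and keep the first how_many pairs.
def top_relatives(rows, own_name, how_many = 1)
  rows.reject { |seq2, _aai| seq2 == own_name }
      .sort_by { |_seq2, aai| -aai }
      .first(how_many)
end

closest = top_relatives(AAI_ROWS, 'query_1', 2)
```

Each element of closest is a duple of the form described above: the name at index 0 and the AAI at index 1.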

#inactivate! ⇒ Object

Inactivate a dataset. This halts automated processing by the daemon.



# File 'lib/miga/dataset.rb', line 96

def inactivate!
  self.metadata[:inactive] = true
  self.metadata.save
  pull_hook :on_inactivate
end

#info ⇒ Object

Get standard metadata values for the dataset as an Array.



# File 'lib/miga/dataset.rb', line 112

def info
  MiGA::Dataset.INFO_FIELDS.map do |k|
    (k == 'name') ? self.name : metadata[k.to_sym]
  end
end
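
The mapping from INFO_FIELDS to values can be tried on a plain Hash. In this sketch, SKETCH_INFO_FIELDS copies the field list shown under .INFO_FIELDS, while info_row and the sample metadata are hypothetical and stand in for the dataset and its MiGA::Metadata.

```ruby
# Field list as documented under .INFO_FIELDS.
SKETCH_INFO_FIELDS = %w(name created updated type ref user description comments).freeze

# Illustrative helper: the name comes from the dataset itself, every
# other field is looked up in the metadata under its Symbol key.
def info_row(name, metadata)
  SKETCH_INFO_FIELDS.map { |k| k == 'name' ? name : metadata[k.to_sym] }
end

row = info_row('sample_01', { type: :genome, ref: true })
```

Fields missing from the metadata come back as nil, so the Array always has one entry per field, in the order of the field list.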

#is_active? ⇒ Boolean

Is this dataset active?

Returns:

  • (Boolean)


# File 'lib/miga/dataset.rb', line 142

def is_active?
  metadata[:inactive].nil? or !metadata[:inactive]
end
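
The :inactive flag has three relevant states: absent, false, and true. The following sketch reproduces the check on a plain Hash; active_flag? is a hypothetical helper used only for illustration.

```ruby
# A dataset counts as active unless :inactive is set to a truthy value.
def active_flag?(metadata)
  metadata[:inactive].nil? || !metadata[:inactive]
end

never_flagged = active_flag?({})                  # flag absent
reactivated   = active_flag?({ inactive: nil })   # as left by activate!
switched_off  = active_flag?({ inactive: true })  # as left by inactivate!
```

Only the last case reports the dataset as inactive.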

#is_multi? ⇒ Boolean

Is this dataset known to be multi-organism?

Returns:

  • (Boolean)


# File 'lib/miga/dataset.rb', line 128

def is_multi?
  return false if metadata[:type].nil? or @@KNOWN_TYPES[type].nil?
  @@KNOWN_TYPES[type][:multi]
end

#is_nonmulti? ⇒ Boolean

Is this dataset known to be single-organism?

Returns:

  • (Boolean)


# File 'lib/miga/dataset.rb', line 135

def is_nonmulti?
  return false if metadata[:type].nil? or @@KNOWN_TYPES[type].nil?
  !@@KNOWN_TYPES[type][:multi]
end
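
Note that is_multi? and is_nonmulti? are not complements: both return false when the type is missing or unknown. A minimal sketch of that three-state logic follows. SKETCH_TYPES is a hypothetical table; the real contents of @@KNOWN_TYPES are defined in lib/miga/dataset/base.rb and are not shown on this page.

```ruby
# Hypothetical type table, for illustration only.
SKETCH_TYPES = {
  genome: { multi: false },
  metagenome: { multi: true }
}.freeze

def multi_type?(type)
  return false if type.nil? || SKETCH_TYPES[type].nil?
  SKETCH_TYPES[type][:multi]
end

def nonmulti_type?(type)
  return false if type.nil? || SKETCH_TYPES[type].nil?
  !SKETCH_TYPES[type][:multi]
end

# Both predicates are false for an unset or unrecognized type.
undetermined = !multi_type?(nil) && !nonmulti_type?(nil)
```

Callers that need to distinguish "single-organism" from "unknown" should therefore test is_nonmulti? directly rather than negating is_multi?.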

#is_query? ⇒ Boolean

Is this dataset a query (non-reference)?

Returns:

  • (Boolean)


# File 'lib/miga/dataset.rb', line 124

def is_query? ; !metadata[:ref] ; end

#is_ref? ⇒ Boolean

Is this dataset a reference?

Returns:

  • (Boolean)


# File 'lib/miga/dataset.rb', line 120

def is_ref? ; !!metadata[:ref] ; end

#metadata ⇒ Object

MiGA::Metadata with information about the dataset.



# File 'lib/miga/dataset.rb', line 65

def metadata
  if @metadata.nil?
    @metadata = MiGA::Metadata.new(*@metadata_future)
    pull_hook :on_load
  end
  @metadata
end
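
The accessor builds its object on first use and returns the cached copy afterwards. The pattern can be sketched in isolation as follows. LazyRecord and its instance variables are hypothetical names, and a plain Hash stands in for MiGA::Metadata.

```ruby
# Illustrative class showing deferred construction with caching.
class LazyRecord
  attr_reader :loads

  def initialize(path, defaults = {})
    @data = nil                 # nothing is built yet
    @future = [path, defaults]  # arguments kept for the first access
    @loads = 0
  end

  # Build the record on first call; later calls reuse the cached object.
  def data
    if @data.nil?
      path, defaults = @future
      @data = defaults.merge(path: path)
      @loads += 1               # stands in for the on_load hook
    end
    @data
  end
end

record = LazyRecord.new('metadata/sample_01.json', { ref: true })
record.data
record.data
```

After both calls the counter is still 1, which is why a hook tied to loading fires once per object rather than once per access.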

#remove! ⇒ Object

Delete the dataset with all its contents (including results) and return nil.



# File 'lib/miga/dataset.rb', line 88

def remove!
  self.results.each{ |r| r.remove! }
  self.metadata.remove!
  pull_hook :on_remove
end

#save ⇒ Object

Save any changes you’ve made in the dataset.



# File 'lib/miga/dataset.rb', line 75

def save
  MiGA.DEBUG "Dataset.metadata: #{metadata.data}"
  metadata.save
  pull_hook :on_save
end

#type ⇒ Object

Get the type of the dataset as a Symbol.



# File 'lib/miga/dataset.rb', line 83

def type ; metadata[:type] ; end