Class: RGFA::Line

Inherits:
Object
Defined in:
lib/rgfa/line.rb

Overview

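Generic representation of a line of a GFA file. This class is not meant to be instantiated directly; its subclasses define the different record types.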

Direct Known Subclasses

Comment, Containment, Header, Link, Path, Segment

Defined Under Namespace

Classes: Comment, Containment, CustomOptfieldNameError, DuplicatedOptfieldNameError, FieldnameError, Header, Link, Path, PredefinedOptfieldTypeError, RequiredFieldMissingError, Segment, TagMissingError, UnknownDatatype, UnknownRecordTypeError

Constant Summary

SEPARATOR = "\t"

    Separator in the string representation of RGFA lines

RECORD_TYPES = [ :H, :S, :L, :C, :P ]

    List of allowed record_type values

RECORD_TYPE_LABELS = {
  :H => "header",
  :S => "segment",
  :L => "link",
  :C => "containment",
  :P => "path",
}

    Full names of the record types

OPTFIELD_DATATYPE = [:A, :i, :f, :Z, :J, :H, :B]

    Symbols representing the datatypes of optional fields

REQFIELD_DATATYPE = [:lbl, :orn, :lbs, :seq, :pos, :cig, :cgs]

    Symbols representing the datatypes of required fields

FIELD_DATATYPE = OPTFIELD_DATATYPE + REQFIELD_DATATYPE

    Symbols representing all valid datatypes

DELAYED_PARSING_DATATYPES = [:cig, :cgs, :lbs, :H, :J, :B]

    List of datatypes which are parsed only on access; all others are parsed when read.

DIRECTION = [:from, :to]

    Direction of a segment for links/containments

ORIENTATION = [:+, :-]

    Orientation of segments in paths/links/containments
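As an illustration of how these constants relate to each other, here is a minimal sketch that copies a few of them into a local module (the rgfa library itself is not loaded). The module name `MiniConstants` and the helpers `label_for` and `valid_datatype?` are hypothetical and are not part of RGFA::Line.

```ruby
# Local copies of constants documented above; the rgfa library is not loaded.
module MiniConstants
  RECORD_TYPE_LABELS = {
    :H => "header", :S => "segment", :L => "link",
    :C => "containment", :P => "path",
  }
  OPTFIELD_DATATYPE = [:A, :i, :f, :Z, :J, :H, :B]
  REQFIELD_DATATYPE = [:lbl, :orn, :lbs, :seq, :pos, :cig, :cgs]
  FIELD_DATATYPE = OPTFIELD_DATATYPE + REQFIELD_DATATYPE

  # Hypothetical helper: full name of a record type, or nil if unknown.
  def self.label_for(record_type)
    RECORD_TYPE_LABELS[record_type.to_sym]
  end

  # Hypothetical helper: is the symbol a datatype valid for any field?
  def self.valid_datatype?(datatype)
    FIELD_DATATYPE.include?(datatype)
  end
end

MiniConstants.label_for("S")          # "segment"
MiniConstants.valid_datatype?(:cig)   # true  (a required field datatype)
MiniConstants.valid_datatype?(:q)     # false (not a known datatype)
```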

Constructor Details

#initialize(data, validate: 2, virtual: false) ⇒ RGFA::Line

Constants defined by subclasses

Subclasses of RGFA::Line must define the following constants:

  • RECORD_TYPE [RGFA::Line::RECORD_TYPES]

  • REQFIELDS [Array<Symbol>] required fields

  • PREDEFINED_OPTFIELDS [Array<Symbol>] predefined optional fields

  • DATATYPE [Hash{Symbol=>Symbol}] datatypes for the required fields and the predefined optional fields

Validation levels

The default is 2. At this level, if the content of a field is changed, the user is responsible for calling #validate_field!, if necessary.

  • 0: no validation

  • 1: the number of required fields must be correct; optional fields cannot be duplicated; custom optional field names must be correct; predefined optional fields must have the correct type; only some fields are validated on initialization or first-time access to the field content

  • 2: 1 + all fields are validated on initialization or first-time access to the field content

  • 3: 2 + all fields are validated on initialization and record-specific validations are run (e.g. compare segment LN tag and sequence length)

  • 4: 3 + all fields are validated on writing to string

  • 5: 4 + all fields are validated by get and set methods
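The levels are cumulative, so each one can be read as a threshold. The following sketch summarizes the thresholds that appear in the source listings on this page (`@validate >= 2` in #get, `>= 3` in #initialize, `>= 4` in #field_to_s, `>= 5` in #get). The helper `enabled_checks` and the check names are hypothetical labels chosen for this example; the checks of level 1 are not modeled.

```ruby
# Hypothetical helper: list the checks switched on at a validation level,
# following the thresholds visible in the source listings on this page.
def enabled_checks(validate)
  checks = []
  checks << :string_validation_on_first_access if validate >= 2
  checks << :record_specific_on_initialize     if validate >= 3
  checks << :fields_on_write_to_string         if validate >= 4
  checks << :fields_on_get_and_set             if validate >= 5
  checks
end

enabled_checks(0)  # no checks from this list
enabled_checks(2)  # only the first check (the default level)
enabled_checks(5)  # all four checks
```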

Parameters:

  • data (Array<String>)

    the content of the line; if an array of strings, this is interpreted as the split content of a GFA file line; note: a hash is also allowed, but this is for internal usage and shall be considered private

  • validate (Integer) (defaults to: 2)

    see the paragraph Validation levels

  • virtual (Boolean) (defaults to: false)

    mark the line as virtual, i.e. not yet found in the GFA file; e.g. a link is allowed to refer to a segment which has not been created yet; in this case a segment marked as virtual is created, which is replaced by a non-virtual segment when the segment line is later found

Raises:



# File 'lib/rgfa/line.rb', line 101

def initialize(data, validate: 2, virtual: false)
  unless self.class.const_defined?(:"RECORD_TYPE")
    raise RuntimeError, "This class shall not be directly instantiated"
  end
  @validate = validate
  @virtual = virtual
  @datatype = {}
  @data = {}
  if data.kind_of?(Hash)
    @data.merge!(data)
  else
    # normal initialization, data is an array of strings
    initialize_required_fields(data)
    initialize_optional_fields(data)
    validate_record_type_specific_info! if @validate >= 3
  end
end
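The guard at the top of the constructor can be illustrated with a small self-contained sketch. The classes `MiniLine` and `MiniSegment` and the field names `:name` and `:sequence` are hypothetical stand-ins, not the actual RGFA classes; only the `const_defined?(:RECORD_TYPE)` check is taken from the listing above.

```ruby
# Hypothetical stand-in for RGFA::Line: refuses direct instantiation.
class MiniLine
  attr_reader :data

  def initialize(data)
    # Same guard as in the listing above: only subclasses define RECORD_TYPE.
    unless self.class.const_defined?(:RECORD_TYPE)
      raise RuntimeError, "This class shall not be directly instantiated"
    end
    # Pair each required field name with the corresponding string.
    @data = self.class::REQFIELDS.zip(data).to_h
  end
end

# Hypothetical subclass defining constants that a subclass must provide.
class MiniSegment < MiniLine
  RECORD_TYPE = :S
  REQFIELDS = [:name, :sequence]
end

seg = MiniSegment.new(["s1", "ACGT"])   # works: the subclass has RECORD_TYPE

direct_error =
  begin
    MiniLine.new(["s1", "ACGT"])        # the base class has no RECORD_TYPE
    nil
  rescue RuntimeError => e
    e.message
  end
```

In this sketch `seg.data` maps `:name` to `"s1"` and `:sequence` to `"ACGT"`, while `direct_error` holds the message raised by the guard.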

Dynamic Method Handling

This class handles dynamic methods through the method_missing method

#method_missing(m, *args, &block) ⇒ Object

Methods are dynamically created for non-existing but valid optional field names. Methods for predefined optional fields and required fields are created dynamically for each subclass; methods for existing optional fields are created on instance initialization.


- (Object) <fieldname>(parse=true)

The parsed content of a field. See also #get.

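Parameters:

  • parse (Boolean) (defaults to: true) if false, return the string representation of the field instead of the parsed content
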
Returns:

  • (String, Hash, Array, Integer, Float) the parsed content of the field

  • (nil) if the field does not exist, but is a valid optional field name


- (Object) <fieldname>!(parse=true)

The parsed content of a field, raising an exception if not available. See also #get!.

Returns:

  • (String, Hash, Array, Integer, Float) the parsed content of the field

Raises:

  • (RGFA::Line::TagMissingError) if the field does not exist


- (self) <fieldname>=(value)

Sets the value of a required or optional field, or creates a new optional field if the fieldname is non-existing but valid. See also #set, #set_datatype.

Parameters:

  • value (String|Hash|Array|Integer|Float) value to set




# File 'lib/rgfa/line.rb', line 406

def method_missing(m, *args, &block)
  field_name, operation, state = split_method_name(m)
  if ((operation == :get or operation == :get!) and args.size > 1) or
     (operation == :set and args.size != 1)
    raise ArgumentError, "wrong number of arguments"
  end
  case state
  when :invalid
    super
  when :existing
    case operation
    when :get
      if args[0] == false
        field_to_s(field_name)
      else
        get(field_name)
      end
    when :get!
      if args[0] == false
        field_to_s!(field_name)
      else
        get!(field_name)
      end
    when :set
      set_existing_field(field_name, args[0])
      return nil
    end
  when :valid
    case operation
    when :get
      return nil
    when :get!
      raise RGFA::Line::TagMissingError,
        "No value defined for tag #{field_name}"
    when :set
      set(field_name, args[0])
      return nil
    end
  end
end
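The getter, bang-getter and setter dispatch described above can be sketched with a small stand-alone class. Everything here is hypothetical: `MiniFields`, its simplified `split_method_name`, and the assumption that a valid optional field name consists of a letter followed by a letter or digit. The sketch also uses `respond_to_missing?`, the usual Ruby companion of `method_missing`, rather than redefining `respond_to?` as the class documented here does.

```ruby
# Hypothetical stand-in illustrating <fieldname>, <fieldname>! and <fieldname>=.
class MiniFields
  class TagMissingError < StandardError; end

  def initialize(data = {})
    @data = data
  end

  # Split :LN, :LN! or :LN= into a field name and an operation.
  def split_method_name(m)
    name = m.to_s
    case name[-1]
    when "!" then [name[0..-2].to_sym, :get!]
    when "=" then [name[0..-2].to_sym, :set]
    else [m.to_sym, :get]
    end
  end

  # Assumption: a valid tag name is a letter followed by a letter or digit.
  def valid_name?(field_name)
    field_name.to_s.match?(/\A[A-Za-z][A-Za-z0-9]\z/)
  end

  def method_missing(m, *args, &block)
    field_name, operation = split_method_name(m)
    return super unless @data.key?(field_name) || valid_name?(field_name)
    case operation
    when :get then @data[field_name]          # nil if valid but not set
    when :get!
      @data.fetch(field_name) do
        raise TagMissingError, "No value defined for tag #{field_name}"
      end
    when :set
      @data[field_name] = args[0]
      nil
    end
  end

  def respond_to_missing?(m, include_private = false)
    field_name, = split_method_name(m)
    @data.key?(field_name) || valid_name?(field_name) || super
  end
end

line = MiniFields.new({ LN: 1000 })
line.xx = "note"   # creates a new optional field
line.LN            # existing field
line.ab            # valid name without a value: nil
```

Calling `line.ab!` raises `MiniFields::TagMissingError`, and a name that is neither existing nor valid (for example `line.zzz`) falls through to `super` and raises `NoMethodError`.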

Class Method Details

.subclass(record_type) ⇒ Class

Select a subclass based on the record type

Returns:

  • (Class)

    a subclass of RGFA::Line

Raises:

  • (RGFA::Line::UnknownRecordTypeError)

    if the record type is unknown


# File 'lib/rgfa/line.rb', line 122

def self.subclass(record_type)
  case record_type.to_sym
  when :H then RGFA::Line::Header
  when :S then RGFA::Line::Segment
  when :L then RGFA::Line::Link
  when :C then RGFA::Line::Containment
  when :P then RGFA::Line::Path
  when :"#" then RGFA::Line::Comment
  else
    raise RGFA::Line::UnknownRecordTypeError,
      "Record type unknown: '#{record_type}'"
  end
end
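A typical use of such a lookup is to pick the class from the first tab-separated field of a line. The sketch below shows the idea with hypothetical stand-ins (`MiniRecords`, its `Struct` classes and the `parse` helper); it uses a hash lookup instead of the `case` statement of the listing above and does not load the rgfa library.

```ruby
# Hypothetical stand-ins for the record classes; the rgfa library is not loaded.
module MiniRecords
  class UnknownRecordTypeError < StandardError; end

  Header  = Struct.new(:fields)
  Segment = Struct.new(:fields)
  Link    = Struct.new(:fields)

  CLASSES = { :H => Header, :S => Segment, :L => Link }

  # Select a class based on the record type (String or Symbol).
  def self.subclass(record_type)
    CLASSES.fetch(record_type.to_sym) do
      raise UnknownRecordTypeError, "Record type unknown: '#{record_type}'"
    end
  end

  # Hypothetical helper: split a line on tabs and build the matching record.
  def self.parse(string)
    record_type, *fields = string.chomp.split("\t")
    subclass(record_type).new(fields)
  end
end

segment = MiniRecords.parse("S\ts1\tACGT\tLN:i:4")
```

Here `segment` is a `MiniRecords::Segment` whose `fields` are the three strings after the record type; an unknown record type such as `"X"` raises `MiniRecords::UnknownRecordTypeError`.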

Instance Method Details

#==(o) ⇒ Boolean

Equivalence check

Returns:

  • (Boolean)

    Does the line have the same record type, contain the same optional fields, and do all required and optional fields contain the same field values?

See Also:

  • RGFA::Line::Link#==


# File 'lib/rgfa/line.rb', line 464

def ==(o)
  return self.to_sym == o.to_sym if o.kind_of?(Symbol)
  return false if (o.record_type != self.record_type)
  return false if o.data.keys.sort != data.keys.sort
  o.data.each do |k, v|
    if @data[k] != o.data[k]
      if field_to_s(k) != o.field_to_s(k)
        return false
      end
    end
  end
  return true
end

#clone ⇒ RGFA::Line

Deep copy of a RGFA::Line instance.

Returns:

  • (RGFA::Line)


# File 'lib/rgfa/line.rb', line 158

def clone
  data_cpy = {}
  @data.each_pair do |k, v|
    if field_datatype(k) == :J
      data_cpy[k] = JSON.parse(v.to_json)
    elsif v.kind_of?(Array) or v.kind_of?(String)
      data_cpy[k] = v.clone
    else
      data_cpy[k] = v
    end
  end
  cpy = self.class.new(data_cpy, validate: @validate, virtual: @virtual)
  cpy.instance_variable_set("@datatype", @datatype.clone)
  return cpy
end
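The JSON round trip used for :J fields is a simple way to obtain a deep copy of nested hashes and arrays. The following sketch, which uses only the Ruby standard library and made-up data, contrasts it with a shallow `clone`. Note that a JSON round trip turns symbol keys into strings, so the example uses string keys.

```ruby
require "json"

original = { "reads" => ["r1", "r2"], "depth" => { "mean" => 12.5 } }

shallow = original.clone                 # nested objects are shared
deep    = JSON.parse(original.to_json)   # nested objects are rebuilt

# Mutating a nested array of the original affects the shallow copy only.
original["reads"] << "r3"

shallow["reads"]   # includes "r3", since the array is shared
deep["reads"]      # still ["r1", "r2"]
```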

#delete(fieldname) ⇒ Object?

Remove an optional field from the line, if it exists; do nothing if it does not.

Parameters:

  • fieldname (Symbol)

    the tag name of the optfield to remove

Returns:

  • (Object, nil)

    the deleted value or nil, if the field was not defined



# File 'lib/rgfa/line.rb', line 225

def delete(fieldname)
  if optional_fieldnames.include?(fieldname)
    @datatype.delete(fieldname)
    return @data.delete(fieldname)
  else
    return nil
  end
end

#field_to_s(fieldname, optfield: false) ⇒ String

Compute the string representation of a field.

Parameters:

  • fieldname (Symbol)

    the tag name of the field

  • optfield (Boolean) (defaults to: false)

    if true, return the tagname:datatype:value representation

Returns:

  • (String)

    the string representation

Raises:

  • (RGFA::Line::TagMissingError)

    if the field is not defined


# File 'lib/rgfa/line.rb', line 257

def field_to_s(fieldname, optfield: false)
  field = @data[fieldname]
  raise RGFA::Line::TagMissingError,
    "No value defined for tag #{fieldname}" if field.nil?
  t = field_or_default_datatype(fieldname, field)
  if !field.kind_of?(String)
    field = field.to_gfa_field(datatype: t)
  end
  field.validate_gfa_field!(t, fieldname) if @validate >= 4
  return optfield ? field.to_gfa_optfield(fieldname, datatype: t) : field
end

#fieldnames ⇒ Array<Symbol>

Returns fields defined for this instance.

Returns:

  • (Array<Symbol>)

    fields defined for this instance


# File 'lib/rgfa/line.rb', line 142

def fieldnames
  @data.keys
end

#get(fieldname, frozen: false) ⇒ Object?

Get the value of a field

Parameters:

  • fieldname (Symbol)

    name of the field

  • frozen (Boolean) (defaults to: false)

    return a frozen value; this guarantees that a validation will not be necessary on output if the field value has not been changed using #set

Returns:

  • (Object, nil)

    value of the field or nil if field is not defined



# File 'lib/rgfa/line.rb', line 336

def get(fieldname, frozen: false)
  v = @data[fieldname]
  if v.kind_of?(String)
    t = field_datatype(fieldname)
    if t != :Z and t != :seq
      # value was not parsed or was set to a string by the user
      return (@data[fieldname] = v.parse_gfa_field(datatype: t,
                                                   validate_strings:
                                                     @validate >= 2))
    else
      v.validate_gfa_field!(t, fieldname) if (@validate >= 5)
    end
  elsif !v.nil?
    if (@validate >= 5)
      t = field_datatype(fieldname)
      v.validate_gfa_field!(t, fieldname)
    end
  end
  return v
end
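The listing above parses a string value on first access and stores the parsed result back into the data hash, so later calls return the cached object. A minimal sketch of this parse-on-access pattern follows; `LazyFields` is a hypothetical class that only handles the :i and :f datatypes and performs no validation.

```ruby
# Hypothetical stand-in: values stay strings until they are first accessed.
class LazyFields
  attr_reader :data

  def initialize(data, datatypes)
    @data = data            # field name => raw string or parsed value
    @datatypes = datatypes  # field name => datatype symbol
  end

  def get(fieldname)
    v = @data[fieldname]
    return v unless v.kind_of?(String)   # nil or already parsed
    case @datatypes[fieldname]
    when :i then @data[fieldname] = Integer(v)   # parse and cache
    when :f then @data[fieldname] = Float(v)     # parse and cache
    else v   # :Z and other datatypes are left as strings in this sketch
    end
  end
end

fields = LazyFields.new({ LN: "1000", ab: "text" }, { LN: :i, ab: :Z })
before = fields.data[:LN]   # raw string, not parsed yet
parsed = fields.get(:LN)    # parsed to an Integer and cached
after  = fields.data[:LN]   # the cached Integer
```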

#get!(fieldname) ⇒ Object?

Value of a field, raising an exception if it is not defined

Parameters:

  • fieldname (Symbol)

    name of the field

Returns:

  • (Object, nil)

    value of the field

Raises:

  • (RGFA::Line::TagMissingError)

    if the field is not defined


# File 'lib/rgfa/line.rb', line 361

def get!(fieldname)
  v = get(fieldname)
  raise RGFA::Line::TagMissingError,
    "No value defined for tag #{fieldname}" if v.nil?
  return v
end

#get_datatype(fieldname) ⇒ RGFA::Line::FIELD_DATATYPE

Returns a symbol, which specifies the datatype of a field

Parameters:

  • fieldname (Symbol)

    the tag name of the field

Returns:

  • (RGFA::Line::FIELD_DATATYPE)

    a symbol which specifies the datatype of the field


# File 'lib/rgfa/line.rb', line 273

def get_datatype(fieldname)
  field_or_default_datatype(fieldname, @data[fieldname])
end

#optional_fieldnames ⇒ Array<Symbol>

Returns the names of the optional fields.

Returns:

  • (Array<Symbol>)

    names of the optional fields


# File 'lib/rgfa/line.rb', line 152

def optional_fieldnames
  (@data.keys - self.class::REQFIELDS)
end

#real!(real_line) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Make a virtual line real. This is called when a line which is expected, and for which a virtual line has been created, is finally found. So the line is converted into a real line, by merging in the line information from the found line.

Parameters:



# File 'lib/rgfa/line.rb', line 190

def real!(real_line)
  @virtual = false
  real_line.data.each_pair do |k, v|
    @data[k] = v
  end
end

#record_type ⇒ Symbol

Returns record type code.

Returns:

  • (Symbol)

    record type code



# File 'lib/rgfa/line.rb', line 137

def record_type
  self.class::RECORD_TYPE
end

#required_fieldnames ⇒ Array<Symbol>

Returns the names of the required fields.

Returns:

  • (Array<Symbol>)

    names of the required fields


# File 'lib/rgfa/line.rb', line 147

def required_fieldnames
  self.class::REQFIELDS
end

#respond_to?(m, include_all = false) ⇒ Boolean

Redefines respond_to? to correctly handle dynamic methods.

Returns:

  • (Boolean)

See Also:



# File 'lib/rgfa/line.rb', line 449

def respond_to?(m, include_all=false)
  super || (split_method_name(m)[2] != :invalid)
end

#set(fieldname, value) ⇒ Object

Set the value of a field.

If a datatype for a new custom optional field is not set, the default for the value assigned to the field will be used (e.g. J for Hashes, i for Integer, etc).

Parameters:

  • fieldname (Symbol)

    the name of the field to set (required field, predefined optional field (uppercase) or custom optional field name (lowercase))

Returns:

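Raises:

  • (RGFA::Line::FieldnameError)

    if the field name is not an existing or predefined field or a valid custom optional field name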


# File 'lib/rgfa/line.rb', line 311

def set(fieldname, value)
  if @data.has_key?(fieldname) or predefined_optional_fieldname?(fieldname)
    return set_existing_field(fieldname, value)
  elsif (@validate == 0) or valid_custom_optional_fieldname?(fieldname)
    define_field_methods(fieldname)
    if !@datatype[fieldname].nil?
      return set_existing_field(fieldname, value)
    elsif !value.nil?
      @datatype[fieldname] = value.default_gfa_datatype
      return @data[fieldname] = value
    end
  else
    raise RGFA::Line::FieldnameError,
      "#{fieldname} is not an existing or predefined field or a "+
      "valid custom optional field"
  end
end
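The default-datatype rule for new custom optional fields can be sketched as follows. The text above only states the mapping for Hash (J) and Integer (i); the Float and fallback branches, the helper name `default_datatype` and the field names are assumptions made for this example, and the real library calls `value.default_gfa_datatype` instead.

```ruby
# Hypothetical helper: choose a datatype symbol from the Ruby class of a value.
def default_datatype(value)
  case value
  when Integer then :i   # stated in the text above
  when Hash    then :J   # stated in the text above
  when Float   then :f   # assumption
  else              :Z   # assumption: anything else is treated as a string
  end
end

datatypes = {}
data = {}
{ xx: 10, yy: { "a" => 1 }, zz: "note" }.each do |name, value|
  # Keep a datatype that was set explicitly, otherwise infer one.
  datatypes[name] ||= default_datatype(value)
  data[name] = value
end
```

After the loop, `datatypes` maps `:xx` to `:i`, `:yy` to `:J` and `:zz` to `:Z`.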

#set_datatype(fieldname, datatype) ⇒ RGFA::Line::FIELD_DATATYPE

Set the datatype of a field.

If an existing field datatype is changed, its content may become invalid (call #validate_field! if necessary).

If the method is used for a required field or a predefined field, the line will use the specified datatype instead of the predefined one, resulting in a potentially invalid line.

Parameters:

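Returns:

  • (RGFA::Line::FIELD_DATATYPE)

    the datatype which was set

Raises:

  • (RGFA::Line::UnknownDatatype)

    if the datatype is not included in OPTFIELD_DATATYPE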


# File 'lib/rgfa/line.rb', line 292

def set_datatype(fieldname, datatype)
  unless OPTFIELD_DATATYPE.include?(datatype)
    raise RGFA::Line::UnknownDatatype, "Unknown datatype: #{datatype}"
  end
  @datatype[fieldname] = datatype
end

#tags ⇒ Array<[Symbol, Symbol, Object]>

Returns the optional fields as an array of [fieldname, datatype, value] arrays.

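Returns:

  • (Array<[Symbol, Symbol, Object]>)

    the optional fields as an array of [fieldname, datatype, value] arrays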


# File 'lib/rgfa/line.rb', line 213

def tags
  retval = []
  optional_fieldnames.each do |of|
    retval << [of, get_datatype(of), get(of)]
  end
  return retval
end

#to_a ⇒ Array<String>

Returns an array of string representations of the fields.

Returns:

  • (Array<String>)

    an array of string representations of the fields



# File 'lib/rgfa/line.rb', line 203

def to_a
  a = [record_type]
  required_fieldnames.each {|fn| a << field_to_s(fn, optfield: false)}
  optional_fieldnames.each {|fn| a << field_to_s(fn, optfield: true)}
  return a
end

#to_rgfa_line(validate: nil) ⇒ Object

Returns self.

Parameters:

  • validate (Boolean) (defaults to: nil)

    ignored (compatibility reasons)

Returns:

  • self



# File 'lib/rgfa/line.rb', line 455

def to_rgfa_line(validate: nil)
  self
end

#to_s ⇒ String

Returns a string representation of self.

Returns:

  • (String)

    a string representation of self



# File 'lib/rgfa/line.rb', line 198

def to_s
  to_a.join(SEPARATOR)
end
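To show how #to_a and #to_s combine, here is a minimal stand-alone sketch of the serialization: the record type comes first, then the required fields as plain strings, then the optional fields in tagname:datatype:value form, all joined by a tab. The helpers `optfield_to_s` and `line_to_a` and the example field values are hypothetical.

```ruby
# Hypothetical helper: tagname:datatype:value representation of a field.
def optfield_to_s(name, datatype, value)
  "#{name}:#{datatype}:#{value}"
end

# Hypothetical helper: array of string representations of all fields.
def line_to_a(record_type, required, optional)
  a = [record_type.to_s]
  required.each { |value| a << value.to_s }
  optional.each do |name, (datatype, value)|
    a << optfield_to_s(name, datatype, value)
  end
  a
end

separator = "\t"   # same value as the SEPARATOR constant
fields = line_to_a(:S, ["s1", "ACGT"], { LN: [:i, 4], RC: [:i, 120] })
text = fields.join(separator)
```

In this sketch `text` is the single tab-separated line `S`, `s1`, `ACGT`, `LN:i:4`, `RC:i:120`.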

#validate! ⇒ void

This method returns an undefined value.

Validate the RGFA::Line instance

Raises:



# File 'lib/rgfa/line.rb', line 481

def validate!
  fieldnames.each {|fieldname| validate_field!(fieldname) }
  validate_record_type_specific_info!
end

#validate_field!(fieldname) ⇒ void

This method returns an undefined value.

Raises an error if the content of the field does not correspond to the field type

Parameters:

  • fieldname (Symbol)

    the tag name of the field to validate

Raises:



# File 'lib/rgfa/line.rb', line 241

def validate_field!(fieldname)
  v = @data[fieldname]
  t = field_or_default_datatype(fieldname, v)
  v.validate_gfa_field!(t, fieldname)
  return nil
end

#virtual? ⇒ Boolean

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Is the line virtual?

Is this RGFA::Line a virtual line representation (i.e. a placeholder for an expected line which has not been encountered yet)?

Returns:

  • (Boolean)


# File 'lib/rgfa/line.rb', line 180

def virtual?
  @virtual
end