Class: Cheripic::Contig

Inherits:
Object
Defined in:
lib/cheripic/contig.rb

Overview

A contig object from an assembly that stores the positions of homozygous, heterozygous, and hemi-variants.

Constructor Details

#initialize(fasta) ⇒ Contig

Creates a Contig object from a FASTA entry.

Parameters:

  • fasta (Bio::FastaFormat)

    an individual FASTA entry from the input assembly file



# File 'lib/cheripic/contig.rb', line 30

def initialize(fasta)
  @id = fasta.entry_id
  @length = fasta.length
  @hm_pos = {}
  @ht_pos = {}
  @hemi_pos = {}
  @mean_depth = nil
  @sd_depth = nil
end
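
As a minimal sketch of what the constructor sets up, the snippet below mirrors the documented `initialize` in a cut-down class and passes it a stand-in for `Bio::FastaFormat`. `FastaStub` and `ContigSketch` are hypothetical names used only for illustration; the real class expects a BioRuby entry that responds to `entry_id` and `length`, and the positions and frequencies are made-up values.

```ruby
# Hypothetical stand-in for Bio::FastaFormat: it provides only the two
# methods that the documented constructor calls.
FastaStub = Struct.new(:entry_id, :length)

# Cut-down copy of the documented constructor (illustration only).
class ContigSketch
  attr_accessor :hm_pos, :ht_pos, :hemi_pos, :mean_depth, :sd_depth
  attr_reader :id, :length

  def initialize(fasta)
    @id = fasta.entry_id
    @length = fasta.length
    @hm_pos = {}
    @ht_pos = {}
    @hemi_pos = {}
    @mean_depth = nil
    @sd_depth = nil
  end
end

contig = ContigSketch.new(FastaStub.new('contig_1', 1200))
contig.hm_pos[150] = 1.0   # position => allele frequency (made-up values)
contig.ht_pos[480] = 0.5
```

The variant hashes start empty and are filled by the caller, while the depth statistics stay `nil` until they are assigned.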

Instance Attribute Details

#hemi_pos ⇒ Hash

Returns a hash of hemi-variant positions as keys and allele frequency as values.

Returns:

  • (Hash)

    a hash of hemi-variant positions as keys and allele frequency as values



# File 'lib/cheripic/contig.rb', line 23

class Contig

  attr_accessor :hm_pos, :ht_pos, :hemi_pos, :mean_depth, :sd_depth
  attr_reader :id, :length

  # Creates a Contig object from a FASTA entry
  # @param fasta [Bio::FastaFormat] an individual FASTA entry from the input assembly file
  def initialize(fasta)
    @id = fasta.entry_id
    @length = fasta.length
    @hm_pos = {}
    @ht_pos = {}
    @hemi_pos = {}
    @mean_depth = nil
    @sd_depth = nil
  end

  # Number of homozygous variants identified in the contig
  # @return [Integer]
  def hm_num
    self.hm_pos.length
  end

  # Number of heterozygous variants identified in the contig
  # @return [Integer]
  def ht_num
    self.ht_pos.length
  end

  # Homozygosity enrichment score calculated using
  # hm_num and ht_num of the contig object
  # @return [Float]
  def hme_score
    hmes_adjust = Options.hmes_adjust
    if self.hm_num == 0 and self.ht_num == 0
      0.0
    else
      (self.hm_num + hmes_adjust) / (self.ht_num + hmes_adjust)
    end
  end

  # Number of hemi-variants identified in the contig
  # @return [Integer]
  def hemi_num
    self.hemi_pos.length
  end

  # Mean of bulk frequency ratios (bfr) calculated using
  # the bfr values of all hemi_pos of the contig
  # @return [Float]
  def bfr_score
    if self.hemi_pos.values.empty?
      0.0
    else
      geom_mean(self.hemi_pos.values)
    end
  end

  # Calculates the arithmetic mean of an array of numbers
  # (despite the method name; the geometric-mean code below is commented out)
  # @param array [Array] an array of bfr values from hemi_snp
  # @return [Float] mean value as float
  def geom_mean(array)
    return array[0].to_f if array.length == 1
    array.reduce(:+) / array.size.to_f
    # sum = 0.0
    # array.each{ |v| sum += Math.log(v.to_f) }
    # sum /= array.size
    # Math.exp sum
  end

end

#hm_pos ⇒ Hash

Returns a hash of homozygous variant positions as keys and allele frequency as values.

Returns:

  • (Hash)

    a hash of homozygous variant positions as keys and allele frequency as values




#ht_pos ⇒ Hash

Returns a hash of heterozygous variant positions as keys and allele frequency as values.

Returns:

  • (Hash)

    a hash of heterozygous variant positions as keys and allele frequency as values




#id ⇒ String (readonly)

Returns id of the contig in assembly taken from fasta file.

Returns:

  • (String)

    id of the contig in assembly taken from fasta file




#length ⇒ Integer (readonly)

Returns length of contig in bases.

Returns:

  • (Integer)

    length of contig in bases




#mean_depth ⇒ Object

Returns the value of attribute mean_depth.



# File 'lib/cheripic/contig.rb', line 25

def mean_depth
  @mean_depth
end

#sd_depth ⇒ Object

Returns the value of attribute sd_depth.



# File 'lib/cheripic/contig.rb', line 25

def sd_depth
  @sd_depth
end

Instance Method Details

#bfr_score ⇒ Float

Mean of bulk frequency ratios (bfr), calculated from the bfr values of all hemi_pos of the contig. Returns 0.0 when the contig has no hemi-variants.

Returns:

  • (Float)


# File 'lib/cheripic/contig.rb', line 73

def bfr_score
  if self.hemi_pos.values.empty?
    0.0
  else
    geom_mean(self.hemi_pos.values)
  end
end

#geom_mean(array) ⇒ Float

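Calculates the arithmetic mean of an array of numbers. Despite the method name, the geometric-mean calculation is commented out in the source.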

Parameters:

  • array (Array)

    an array of bfr values from hemi_snp

Returns:

  • (Float)

    mean value as float



# File 'lib/cheripic/contig.rb', line 84

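def geom_mean(array)
  # a single value is its own mean
  return array[0].to_f if array.length == 1
  # arithmetic mean; to_f forces floating-point division
  array.reduce(:+) / array.size.to_f
  # geometric-mean implementation, currently disabled:
  # sum = 0.0
  # array.each{ |v| sum += Math.log(v.to_f) }
  # sum /= array.size
  # Math.exp sum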
end
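
To show how the active code differs from the commented-out lines, here is a small sketch that computes both means side by side. The helper names `arithmetic_mean` and `geometric_mean` are hypothetical and not part of the library, and the input values are made up; the geometric version follows the disabled lines and assumes all values are positive.

```ruby
# Arithmetic mean, as the active line in geom_mean computes it.
def arithmetic_mean(values)
  values.reduce(:+) / values.size.to_f
end

# Geometric mean, following the commented-out lines in geom_mean.
# Assumes every value is positive, because it takes logarithms.
def geometric_mean(values)
  log_sum = values.reduce(0.0) { |sum, v| sum + Math.log(v.to_f) }
  Math.exp(log_sum / values.size)
end

bfrs = [1.0, 4.0, 16.0]        # made-up bfr values
arith = arithmetic_mean(bfrs)  # (1 + 4 + 16) / 3
geom = geometric_mean(bfrs)    # cube root of 1 * 4 * 16
```

For skewed ratios like these, the arithmetic mean is pulled toward the largest value, while the geometric mean stays closer to the typical ratio.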

#hemi_num ⇒ Integer

Number of hemi-variants identified in the contig

Returns:

  • (Integer)


# File 'lib/cheripic/contig.rb', line 66

def hemi_num
  self.hemi_pos.length
end

#hm_num ⇒ Integer

Number of homozygous variants identified in the contig

Returns:

  • (Integer)


# File 'lib/cheripic/contig.rb', line 42

def hm_num
  self.hm_pos.length
end

#hme_score ⇒ Float

Homozygosity enrichment score, calculated as (hm_num + hmes_adjust) / (ht_num + hmes_adjust). Returns 0.0 when the contig has neither homozygous nor heterozygous variants.

Returns:

  • (Float)


# File 'lib/cheripic/contig.rb', line 55

def hme_score
  hmes_adjust = Options.hmes_adjust
  if self.hm_num == 0 and self.ht_num == 0
    0.0
  else
    (self.hm_num + hmes_adjust) / (self.ht_num + hmes_adjust)
  end
end
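
The ratio can be worked through with a standalone sketch. `hme_score_sketch` is a hypothetical helper, and the `adjust` argument stands in for `Options.hmes_adjust`; the value 0.5 used here is an assumption chosen for illustration, not a documented default.

```ruby
# Sketch of the documented ratio; `adjust` stands in for Options.hmes_adjust.
def hme_score_sketch(hm_num, ht_num, adjust)
  # no variants of either kind: the score is defined as 0.0
  return 0.0 if hm_num == 0 && ht_num == 0
  (hm_num + adjust) / (ht_num + adjust)
end

enriched = hme_score_sketch(5, 0, 0.5)  # (5 + 0.5) / (0 + 0.5)
balanced = hme_score_sketch(3, 3, 0.5)  # (3 + 0.5) / (3 + 0.5)
empty = hme_score_sketch(0, 0, 0.5)     # guarded case
```

The adjustment keeps the denominator non-zero when a contig has homozygous variants only. Note that if the adjustment were an Integer, Ruby would perform integer division and truncate the result, so a Float adjustment is needed for the method to return a Float as documented.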

#ht_num ⇒ Integer

Number of heterozygous variants identified in the contig

Returns:

  • (Integer)


# File 'lib/cheripic/contig.rb', line 48

def ht_num
  self.ht_pos.length
end