Module: HashDiff

Defined in:
lib/hashdiff/patch.rb,
lib/hashdiff/lcs.rb,
lib/hashdiff/diff.rb,
lib/hashdiff/util.rb,
lib/hashdiff/version.rb,
lib/hashdiff/linear_compare_array.rb

Overview

This module provides methods to diff two hashes and to patch and unpatch a hash.

Constant Summary

VERSION =
'0.3.9'.freeze

Class Method Summary

Class Method Details

.best_diff(obj1, obj2, options = {}) {|path, value1, value2| ... } ⇒ Array

Computes the diff of two objects several times with different similarity values and returns the smallest change set.

HashDiff.best_diff is useful when comparing two objects that contain similar hashes inside arrays.

Examples:

a = {'x' => [{'a' => 1, 'c' => 3, 'e' => 5}, {'y' => 3}]}
b = {'x' => [{'a' => 1, 'b' => 2, 'e' => 5}] }
diff = HashDiff.best_diff(a, b)
diff.should == [['-', 'x[0].c', 3], ['+', 'x[0].b', 2], ['-', 'x[1].y', 3], ['-', 'x[1]', {}]]

Parameters:

  • obj1 (Array, Hash)
  • obj2 (Array, Hash)
  • options (Hash) (defaults to: {})

    the options to use when comparing

    • :strict (Boolean) [true] whether numeric values will be compared on type as well as value. Set to false to allow comparing Integer, Float, BigDecimal to each other

    • :delimiter (String) ['.'] the delimiter used when returning nested key references

    • :numeric_tolerance (Numeric) [0] should be a positive numeric value. By default, numeric values are compared exactly; with the :numeric_tolerance option, two numeric values are reported as different only when their difference is greater than the given value.

    • :strip (Boolean) [false] whether or not to call #strip on strings before comparing

    • :array_path (Boolean) [false] whether to return path references for nested values as an array; this can be used for patch compatibility with non-string keys.

    • :use_lcs (Boolean) [true] whether to use an implementation of the longest common subsequence algorithm for comparing arrays; it produces better diffs but is slower.

Yields:

  • (path, value1, value2)

    An optional block used to compare each value instead of the default #==. If the block returns a value other than true or false, the other specified comparison options are used for the comparison.

Returns:

  • (Array)

    an array of changes, e.g. [['+', 'a.b', '45'], ['-', 'a.c', '5'], ['~', 'a.x', '45', '63']]

Since:

  • 0.0.1



# File 'lib/hashdiff/diff.rb', line 30

def self.best_diff(obj1, obj2, options = {}, &block)
  options[:comparison] = block if block_given?

  opts = { similarity: 0.3 }.merge!(options)
  diffs1 = diff(obj1, obj2, opts)
  count1 = count_diff diffs1

  opts = { similarity: 0.5 }.merge!(options)
  diffs2 = diff(obj1, obj2, opts)
  count2 = count_diff diffs2

  opts = { similarity: 0.8 }.merge!(options)
  diffs3 = diff(obj1, obj2, opts)
  count3 = count_diff diffs3

  count, diffs = count1 < count2 ? [count1, diffs1] : [count2, diffs2]
  count < count3 ? diffs : diffs3
end
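
The selection at the end of best_diff can be read as a fold over the three candidate change sets. The sketch below is illustrative only: pick_smallest is a hypothetical helper, the sample change sets are made up, and the cost block stands in for count_diff, whose definition is not shown on this page (using the number of changes as the cost is an assumption).

```ruby
# Hypothetical helper, not part of HashDiff. `candidates` are change sets
# ordered by ascending similarity (0.3, 0.5, 0.8 in best_diff).
# The block stands in for count_diff, which is not shown on this page.
def pick_smallest(candidates, &cost)
  candidates.reduce do |best, candidate|
    # Same comparison as best_diff: keep the earlier set only if strictly cheaper.
    cost.call(best) < cost.call(candidate) ? best : candidate
  end
end

# Made-up change sets, for illustration only.
diffs1 = [['-', 'x[0]', { 'a' => 1 }], ['+', 'x[0]', { 'a' => 2 }]]
diffs2 = [['~', 'x[0].a', 1, 2], ['-', 'x[1]', {}]]
diffs3 = [['~', 'x[0].a', 1, 2], ['-', 'x[1].y', 3], ['-', 'x[1]', {}]]

chosen = pick_smallest([diffs1, diffs2, diffs3]) { |changes| changes.size }
```

Because the comparison is strict, a tie goes to the change set computed with the higher similarity value; here diffs1 and diffs2 cost the same, so diffs2 is chosen.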

.diff(obj1, obj2, options = {}) {|path, value1, value2| ... } ⇒ Array

Compute the diff of two hashes or arrays

Examples:

a = {"a" => 1, "b" => {"b1" => 1, "b2" => 2}}
b = {"a" => 1, "b" => {}}

diff = HashDiff.diff(a, b)
diff.should == [['-', 'b.b1', 1], ['-', 'b.b2', 2]]

Parameters:

  • obj1 (Array, Hash)
  • obj2 (Array, Hash)
  • options (Hash) (defaults to: {})

    the options to use when comparing

    • :strict (Boolean) [true] whether numeric values will be compared on type as well as value. Set to false to allow comparing Integer, Float, BigDecimal to each other

    • :similarity (Numeric) [0.8] should be in the range (0, 1]. Meaningful only when arrays contain similar hashes. See best_diff.

    • :delimiter (String) ['.'] the delimiter used when returning nested key references

    • :numeric_tolerance (Numeric) [0] should be a positive numeric value. By default, numeric values are compared exactly; with the :numeric_tolerance option, two numeric values are reported as different only when their difference is greater than the given value.

    • :strip (Boolean) [false] whether or not to call #strip on strings before comparing

    • :array_path (Boolean) [false] whether to return path references for nested values as an array; this can be used for patch compatibility with non-string keys.

    • :use_lcs (Boolean) [true] whether to use an implementation of the longest common subsequence algorithm for comparing arrays; it produces better diffs but is slower.
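
The effect of :numeric_tolerance and :strip on a single pair of values can be sketched as follows. This is a minimal illustration based only on the option descriptions above; values_match? is a hypothetical helper, not the library's internal comparison, and it ignores :strict and custom comparison blocks.

```ruby
# Hypothetical helper, not part of HashDiff. It models only the documented
# behaviour of :numeric_tolerance and :strip for one pair of leaf values.
def values_match?(value1, value2, numeric_tolerance: 0, strip: false)
  if value1.is_a?(Numeric) && value2.is_a?(Numeric)
    # A change is reported only when the difference exceeds the tolerance.
    return (value1 - value2).abs <= numeric_tolerance
  end

  if strip && value1.is_a?(String) && value2.is_a?(String)
    # Surrounding whitespace is ignored when :strip is set.
    return value1.strip == value2.strip
  end

  value1 == value2
end

within_tolerance  = values_match?(1.0, 1.05, numeric_tolerance: 0.1)
outside_tolerance = values_match?(1, 2, numeric_tolerance: 0.1)
stripped_match    = values_match?(' a ', 'a', strip: true)
unstripped_match  = values_match?(' a ', 'a')
```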

Yields:

  • (path, value1, value2)

    An optional block used to compare each value instead of the default #==. If the block returns a value other than true or false, the other specified comparison options are used for the comparison.

Returns:

  • (Array)

    an array of changes, e.g. [['+', 'a.b', '45'], ['-', 'a.c', '5'], ['~', 'a.x', '45', '63']]

Since:

  • 0.0.1



# File 'lib/hashdiff/diff.rb', line 76

def self.diff(obj1, obj2, options = {}, &block)
  opts = {
    prefix: '',
    similarity: 0.8,
    delimiter: '.',
    strict: true,
    strip: false,
    numeric_tolerance: 0,
    array_path: false,
    use_lcs: true
  }.merge!(options)

  opts[:prefix] = [] if opts[:array_path] && opts[:prefix] == ''

  opts[:comparison] = block if block_given?

  # prefer to compare with provided block
  result = custom_compare(opts[:comparison], opts[:prefix], obj1, obj2)
  return result if result

  return [] if obj1.nil? && obj2.nil?

  return [['~', opts[:prefix], nil, obj2]] if obj1.nil?

  return [['~', opts[:prefix], obj1, nil]] if obj2.nil?

  return [['~', opts[:prefix], obj1, obj2]] unless comparable?(obj1, obj2, opts[:strict])

  result = []
  if obj1.is_a?(Array) && opts[:use_lcs]
    changeset = diff_array_lcs(obj1, obj2, opts) do |lcs|
      # use a's index for similarity
      lcs.each do |pair|
        prefix = prefix_append_array_index(opts[:prefix], pair[0], opts)
        result.concat(diff(obj1[pair[0]], obj2[pair[1]], opts.merge(prefix: prefix)))
      end
    end

    changeset.each do |change|
      change_key = prefix_append_array_index(opts[:prefix], change[1], opts)
      if change[0] == '-'
        result << ['-', change_key, change[2]]
      elsif change[0] == '+'
        result << ['+', change_key, change[2]]
      end
    end
  elsif obj1.is_a?(Array) && !opts[:use_lcs]
    result.concat(LinearCompareArray.call(obj1, obj2, opts))
  elsif obj1.is_a?(Hash)
    obj1_keys = obj1.keys
    obj2_keys = obj2.keys

    deleted_keys = (obj1_keys - obj2_keys).sort_by(&:to_s)
    common_keys = (obj1_keys & obj2_keys).sort_by(&:to_s)
    added_keys = (obj2_keys - obj1_keys).sort_by(&:to_s)

    # add deleted properties
    deleted_keys.each do |k|
      change_key = prefix_append_key(opts[:prefix], k, opts)
      custom_result = custom_compare(opts[:comparison], change_key, obj1[k], nil)

      if custom_result
        result.concat(custom_result)
      else
        result << ['-', change_key, obj1[k]]
      end
    end

    # recursive comparison for common keys
    common_keys.each do |k|
      prefix = prefix_append_key(opts[:prefix], k, opts)
      result.concat(diff(obj1[k], obj2[k], opts.merge(prefix: prefix)))
    end

    # added properties
    added_keys.each do |k|
      change_key = prefix_append_key(opts[:prefix], k, opts)
      next if obj1.key?(k)

      custom_result = custom_compare(opts[:comparison], change_key, nil, obj2[k])

      if custom_result
        result.concat(custom_result)
      else
        result << ['+', change_key, obj2[k]]
      end
    end
  else
    return [] if compare_values(obj1, obj2, opts)

    return [['~', opts[:prefix], obj1, obj2]]
  end

  result
end
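
The Hash branch above visits deleted keys first, then common keys, then added keys, each group sorted by the key's string form. A simplified stand-alone sketch of that ordering follows. hash_only_diff is a hypothetical helper that handles only nested hashes and plain values; it omits arrays, custom comparison blocks, and all options, and it always uses '.' as the delimiter.

```ruby
# Hypothetical helper, not part of HashDiff: a hash-only stand-in for the
# Hash branch of diff, without options, arrays, or comparison blocks.
def hash_only_diff(old_hash, new_hash, prefix = '')
  path = ->(key) { prefix.empty? ? key.to_s : "#{prefix}.#{key}" }

  deleted = (old_hash.keys - new_hash.keys).sort_by(&:to_s)
  common  = (old_hash.keys & new_hash.keys).sort_by(&:to_s)
  added   = (new_hash.keys - old_hash.keys).sort_by(&:to_s)

  # Deleted keys come first.
  changes = deleted.map { |key| ['-', path.call(key), old_hash[key]] }

  # Common keys: recurse into nested hashes, otherwise compare with ==.
  common.each do |key|
    old_value = old_hash[key]
    new_value = new_hash[key]
    if old_value.is_a?(Hash) && new_value.is_a?(Hash)
      changes.concat(hash_only_diff(old_value, new_value, path.call(key)))
    elsif old_value != new_value
      changes << ['~', path.call(key), old_value, new_value]
    end
  end

  # Added keys come last.
  changes.concat(added.map { |key| ['+', path.call(key), new_hash[key]] })
end

a = { 'a' => 1, 'b' => { 'b1' => 1, 'b2' => 2 }, 'c' => 3 }
b = { 'a' => 1, 'b' => {}, 'c' => 4, 'd' => 5 }
changes = hash_only_diff(a, b)
# changes == [['-', 'b.b1', 1], ['-', 'b.b2', 2], ['~', 'c', 3, 4], ['+', 'd', 5]]
```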

.patch!(obj, changes, options = {}) ⇒ Object

Apply patch to object

Parameters:

  • obj (Hash, Array)

    the object to be patched, can be an Array or a Hash

  • changes (Array)

    e.g. [['+', 'a.b', '45'], ['-', 'a.c', '5'], ['~', 'a.x', '45', '63']]

  • options (Hash) (defaults to: {})

    supports the following keys:

    • :delimiter (String) ['.'] delimiter string for representing nested keys in changes array

Returns:

  • the object after patch

Since:

  • 0.0.1



# File 'lib/hashdiff/patch.rb', line 17

def self.patch!(obj, changes, options = {})
  delimiter = options[:delimiter] || '.'

  changes.each do |change|
    parts = change[1]
    parts = decode_property_path(parts, delimiter) unless parts.is_a?(Array)

    last_part = parts.last

    parent_node = node(obj, parts[0, parts.size - 1])

    if change[0] == '+'
      if parent_node.is_a?(Array)
        parent_node.insert(last_part, change[2])
      else
        parent_node[last_part] = change[2]
      end
    elsif change[0] == '-'
      if parent_node.is_a?(Array)
        parent_node.delete_at(last_part)
      else
        parent_node.delete(last_part)
      end
    elsif change[0] == '~'
      parent_node[last_part] = change[3]
    end
  end

  obj
end
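
The three branches of patch! can be tried in isolation with the sketch below. apply_changes! is a hypothetical helper that mirrors the branch logic above but accepts only paths that are already arrays (the form produced with array_path: true), so it does not need decode_property_path or node, which are not shown on this page.

```ruby
# Hypothetical helper, not part of HashDiff. Mirrors the '+', '-', '~'
# branches of patch! for changes whose paths are arrays of keys/indices.
def apply_changes!(obj, changes)
  changes.each do |op, path, *values|
    *parent_path, last = path
    # Walk down to the container that holds the element being changed.
    parent = parent_path.reduce(obj) { |node, part| node[part] }

    case op
    when '+'
      if parent.is_a?(Array)
        parent.insert(last, values[0])
      else
        parent[last] = values[0]
      end
    when '-'
      if parent.is_a?(Array)
        parent.delete_at(last)
      else
        parent.delete(last)
      end
    when '~'
      parent[last] = values[1] # values is [old_value, new_value]
    end
  end

  obj
end

doc = { 'a' => { 'c' => 5, 'x' => 45 }, 'list' => [1, 3] }
changes = [
  ['+', ['a', 'b'], 45],
  ['-', ['a', 'c'], 5],
  ['~', ['a', 'x'], 45, 63],
  ['+', ['list', 1], 2]
]
patched = apply_changes!(doc, changes)
# patched == { 'a' => { 'x' => 63, 'b' => 45 }, 'list' => [1, 2, 3] }
```

Like patch!, the helper mutates its argument and returns it.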

.prefix_append_array_index(prefix, array_index, opts) ⇒ Object



# File 'lib/hashdiff/util.rb', line 136

def self.prefix_append_array_index(prefix, array_index, opts)
  if opts[:array_path]
    prefix + [array_index]
  else
    "#{prefix}[#{array_index}]"
  end
end

.prefix_append_key(prefix, key, opts) ⇒ Object



# File 'lib/hashdiff/util.rb', line 128

def self.prefix_append_key(prefix, key, opts)
  if opts[:array_path]
    prefix + [key]
  else
    prefix.empty? ? key.to_s : "#{prefix}#{opts[:delimiter]}#{key}"
  end
end
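
The two prefix helpers build either a string path or an array path, depending on opts[:array_path]. The sketch below copies both method bodies into a hypothetical stand-alone module, PathDemo, so the two modes can be compared side by side without loading the library.

```ruby
# Hypothetical module for illustration; the method bodies mirror the
# listings of prefix_append_key and prefix_append_array_index above.
module PathDemo
  def self.prefix_append_key(prefix, key, opts)
    if opts[:array_path]
      prefix + [key]
    else
      prefix.empty? ? key.to_s : "#{prefix}#{opts[:delimiter]}#{key}"
    end
  end

  def self.prefix_append_array_index(prefix, array_index, opts)
    if opts[:array_path]
      prefix + [array_index]
    else
      "#{prefix}[#{array_index}]"
    end
  end
end

string_opts = { array_path: false, delimiter: '.' }
array_opts  = { array_path: true, delimiter: '.' }

# String mode starts from '' and produces a delimited string.
string_path = PathDemo.prefix_append_key('', 'x', string_opts)
string_path = PathDemo.prefix_append_array_index(string_path, 0, string_opts)
string_path = PathDemo.prefix_append_key(string_path, :c, string_opts)
# string_path == "x[0].c"

# Array mode starts from [] and keeps each key with its original type.
array_path = PathDemo.prefix_append_key([], 'x', array_opts)
array_path = PathDemo.prefix_append_array_index(array_path, 0, array_opts)
array_path = PathDemo.prefix_append_key(array_path, :c, array_opts)
# array_path == ["x", 0, :c]
```

In string mode the symbol key :c is flattened to "c", while array mode preserves it, which is why array paths suit hashes with non-string keys.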

.unpatch!(obj, changes, options = {}) ⇒ Object

Unpatch an object

Parameters:

  • obj (Hash, Array)

    the object to be unpatched, can be an Array or a Hash

  • changes (Array)

    e.g. [['+', 'a.b', '45'], ['-', 'a.c', '5'], ['~', 'a.x', '45', '63']]

  • options (Hash) (defaults to: {})

    supports the following keys:

    • :delimiter (String) ['.'] delimiter string for representing nested keys in changes array

Returns:

  • the object after unpatch

Since:

  • 0.0.1



# File 'lib/hashdiff/patch.rb', line 58

def self.unpatch!(obj, changes, options = {})
  delimiter = options[:delimiter] || '.'

  changes.reverse_each do |change|
    parts = change[1]
    parts = decode_property_path(parts, delimiter) unless parts.is_a?(Array)

    last_part = parts.last

    parent_node = node(obj, parts[0, parts.size - 1])

    if change[0] == '+'
      if parent_node.is_a?(Array)
        parent_node.delete_at(last_part)
      else
        parent_node.delete(last_part)
      end
    elsif change[0] == '-'
      if parent_node.is_a?(Array)
        parent_node.insert(last_part, change[2])
      else
        parent_node[last_part] = change[2]
      end
    elsif change[0] == '~'
      parent_node[last_part] = change[2]
    end
  end

  obj
end
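
unpatch! walks the changes in reverse and inverts each one: an addition is removed, a removal is restored, and a modification is set back to its old value. The sketch below shows that inversion with a hypothetical helper, revert_changes!, which accepts only array paths so that it does not depend on decode_property_path or node.

```ruby
# Hypothetical helper, not part of HashDiff. Mirrors the inverted branches
# of unpatch! for changes whose paths are arrays of keys/indices.
def revert_changes!(obj, changes)
  # Reverse order matters: later array indices were computed after
  # earlier changes had been applied, so they must be undone first.
  changes.reverse_each do |op, path, *values|
    *parent_path, last = path
    parent = parent_path.reduce(obj) { |node, part| node[part] }

    case op
    when '+' # undo an addition by removing the element
      if parent.is_a?(Array)
        parent.delete_at(last)
      else
        parent.delete(last)
      end
    when '-' # undo a removal by putting the old value back
      if parent.is_a?(Array)
        parent.insert(last, values[0])
      else
        parent[last] = values[0]
      end
    when '~' # undo a modification by restoring the old value
      parent[last] = values[0]
    end
  end

  obj
end

patched = { 'a' => { 'x' => 63, 'b' => 45 }, 'list' => [1, 2, 3] }
changes = [
  ['+', ['a', 'b'], 45],
  ['-', ['a', 'c'], 5],
  ['~', ['a', 'x'], 45, 63],
  ['+', ['list', 1], 2]
]
original = revert_changes!(patched, changes)
# original == { 'a' => { 'x' => 45, 'c' => 5 }, 'list' => [1, 3] }
```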