Class: PatienceDiff::SequenceMatcher

Inherits:
Object
  • Object
show all
Defined in:
lib/patience_diff/sequence_matcher.rb

Overview

Matches indexed data (generally text) using the Patience diff algorithm.

Defined Under Namespace

Classes: Card

Instance Attribute Summary collapse

Instance Method Summary collapse

Constructor Details

#initialize(opts = {}) ⇒ SequenceMatcher

Options:

* :context: number of lines of context to use when grouping


10
11
12
# File 'lib/patience_diff/sequence_matcher.rb', line 10

def initialize(opts = {})
  @context = opts[:context] || 3
end

Instance Attribute Details

#contextObject

Returns the value of attribute context.



4
5
6
# File 'lib/patience_diff/sequence_matcher.rb', line 4

def context
  @context
end

Instance Method Details

#diff_opcodes(a, b) ⇒ Object

Generate a diff of a and b, and return an array of opcodes describing that diff. Each opcode represents a range in a and b that is either equal, only in a, or only in b. Opcodes are 5-tuples, in the format:

0: code
   A symbol indicating the diff operation. Can be :equal, :delete, or :insert.
1: a_start
   Index in a where the range begins
2: a_end
   Index in a where the range ends.
3: b_start
   Index in b where the range begins
4: b_end
   Index in b where the range ends.

For :equal, (a_end - a_start) == (b_end - b_start). For :delete, a_start == a_end. For :insert, b_start == b_end.



80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
# File 'lib/patience_diff/sequence_matcher.rb', line 80

def diff_opcodes(a, b)
  sequences = collapse_matches(match(a, b))
  sequences << [a.length, b.length, 0]
  
  a_pos = b_pos = 0
  opcodes = []
  sequences.each do |(i, j, len)|
    if a_pos < i
      opcodes << [:delete, a_pos, i-1, b_pos, b_pos]
    end
    if b_pos < j
      opcodes << [:insert, a_pos, a_pos, b_pos, j-1]
    end
    if len > 0
      opcodes << [:equal, i, i+len-1, j, j+len-1]
    end
    a_pos = i+len
    b_pos = j+len
  end
  opcodes
end

#grouped_opcodes(a, b) ⇒ Object

Generate a diff of a and b using #diff_opcodes, and split the opcode into groups whenever an :equal range is encountered that is longer than @context * 2. Returns an array of arrays of 5-tuples as described for #diff_opcodes.



17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
# File 'lib/patience_diff/sequence_matcher.rb', line 17

def grouped_opcodes(a, b)
  groups = []
  last_group = []
  diff_opcodes(a, b).each do |opcode|
    if opcode[0] == :equal
      if @context.zero?
        groups << last_group
        last_group = []
        next
      end
      
      code, a_start, a_end, b_start, b_end = *opcode
      
      if (a_start.zero? and b_start.zero?) or (a_end == a.length-1 and b_end == b.length-1)
        threshold = @context
      else
        threshold = @context * 2
      end
      
      if (b_end - b_start + 1) > threshold
        unless last_group.empty?
          last_group << [
            code,
            a_start,
            a_start + @context - 1,
            b_start,
            b_start + @context - 1
          ]
          groups << last_group
          last_group = []
        end
        opcode = [
          code, 
          a_end - @context + 1,
          a_end,
          b_end - @context + 1,
          b_end
        ]
      end
    end
    last_group << opcode
  end
  groups << last_group unless last_group.one? and last_group.first[0] == :equal
  groups
end