Module: Amp::Diffs::BinaryDiff
- Defined in:
- lib/amp/encoding/binary_diff.rb
Overview
Methods for producing a binary diff file for 2 input strings. Direct port from pure/bdiff.py in the mercurial source.
Class Method Summary collapse
-
.bdiff(str1, str2) ⇒ String
Produces a binary diff file from the input strings, str1 and str2.
-
.blocks(str1, str2) ⇒ [Hash]
Breaks the 2 input strings into blocks that match each other.
- .blocks_as_array(str1, str2) ⇒ Object
Class Method Details
.bdiff(str1, str2) ⇒ String
Produces a binary diff file from the input strings, str1 and str2. Works by getting the list of matching blocks (using SequenceMatcher) and filling in the gaps between them. Basically.
19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 |
# File 'lib/amp/encoding/binary_diff.rb', line 19 def bdiff(str1, str2) # break 'em up into lines a = [] str1.each_line {|l| a << l} b = [] str2.each_line {|l| b << l} if a.nil? || a.empty? s = b.join return [0,0,s.size].pack("NNN") + s end bin = [] byte_offsets = [0] a.each {|line| byte_offsets << (byte_offsets.last + line.size) } # Get all the sections of a and b that actually match each other. matched_blocks = SequenceMatcher.new(a, b).get_matching_blocks la = lb = 0 matched_blocks.each do |block| am, bm, size = block[:start_a], block[:start_b], block[:length] # At this point, a[la..am-1] does NOT equal b[lb..bm-1]. We know this # because the block we just got is a *matching* block. It tells us where # the two strings are the *same*. So a[la..am-1] is the source text, and # b[lb..bm-1] is the destination text of our diff. Thus, we say "replace # the text in a from (la..am) with the following text we got from b". # Since the array byte_offsets[] contains the actual byte offsets of each line, # our diff is stored as [start_a_text_to_replace, end_a_text_to_replace, # size_of_replacement_text, replacement_text]. s = b[lb .. (bm-1)].join unless lb == bm && lb == 0 s = "" if lb == bm && lb == 0 bin << [byte_offsets[la], byte_offsets[am], s.size].pack("NNN") + s if am > la || s.any? la = am + size lb = bm + size end bin.join end |
.blocks(str1, str2) ⇒ [Hash]
Breaks the 2 input strings into blocks that match each other. Uses SequenceMatcher and just manipulates the output a little.
65 66 67 68 69 70 71 72 73 74 |
# File 'lib/amp/encoding/binary_diff.rb', line 65 def blocks(str1, str2) an = str1.split_lines_better bn = str2.split_lines_better matches = Diffs::SequenceMatcher.new(an, bn).get_matching_blocks matches.map do |match| {:start_a => match[:start_a], :end_a => match[:start_a] + match[:length], :start_b => match[:start_b], :end_b => match[:start_b] + match[:length] } end end |
.blocks_as_array(str1, str2) ⇒ Object
77 78 79 |
# File 'lib/amp/encoding/binary_diff.rb', line 77 def self.blocks_as_array(str1, str2) blocks(str1,str2).map {|h| [h[:start_a], h[:end_a], h[:start_b], h[:end_b]]} end |