Class: Doppelganger::NodeAnalysis
- Inherits: Object (Object, Doppelganger::NodeAnalysis)
- Defined in: lib/doppelganger/node_analysis.rb
Overview
This class handles the comparison of Ruby nodes.
It uses various iterators to compare all the different block-like nodes in your code base and find similar or duplicate nodes.
Instance Attribute Summary
- #sexp_blocks ⇒ Object
  Returns the value of attribute sexp_blocks.
Instance Method Summary
- #diff(threshold, progress_bar = nil) ⇒ Object
  Finds block-like nodes that differ from another node by the threshold or less, but are not duplicates.
- #duplicates ⇒ Object
  Finds blocks of code that are exact duplicates, node for node.
- #duplication? ⇒ Boolean
  Returns true if the code base contains any duplicates.
- #initialize(sexp_blocks) ⇒ NodeAnalysis (constructor)
  Returns a new instance of NodeAnalysis.
- #percent_diff(percentage, progress_bar = nil) ⇒ Object
  Finds block-like nodes that differ by a given threshold percentage or less, but are not duplicates.
Constructor Details
#initialize(sexp_blocks) ⇒ NodeAnalysis
Returns a new instance of NodeAnalysis.
# File 'lib/doppelganger/node_analysis.rb', line 10
def initialize(sexp_blocks)
  @sexp_blocks = sexp_blocks
end
Instance Attribute Details
#sexp_blocks ⇒ Object
Returns the value of attribute sexp_blocks.
# File 'lib/doppelganger/node_analysis.rb', line 8
def sexp_blocks
  @sexp_blocks
end
Instance Method Details
#diff(threshold, progress_bar = nil) ⇒ Object
Finds block-like nodes that differ from another node by the threshold or less, but are not duplicates.
# File 'lib/doppelganger/node_analysis.rb', line 37
def diff(threshold, progress_bar = nil)
  diff_nodes = []
  @compared_node_pairs = []
  stepwise_sblocks(progress_bar) do |block_node_1, block_node_2|
    if threshold >= Diff::LCS.diff(block_node_1.flat_body_array, block_node_2.flat_body_array).size
      diff_nodes << [block_node_1, block_node_2]
    end
    @compared_node_pairs << [block_node_1, block_node_2]
  end
  @compared_node_pairs = []
  cleanup_descendant_duplicate_matches(diff_nodes)
end
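The threshold check in #diff compares the number of change sets (hunks) between two flattened bodies against the threshold. The following is a minimal sketch of that idea, not the library's implementation: `simple_hunks` is a hypothetical, hand-rolled stand-in for `Diff::LCS.diff`, `within_threshold?` is an illustrative name, and the symbol arrays only imitate what `flat_body_array` might return. Unlike the documented method, this sketch does not exclude exact duplicates.

```ruby
# Hypothetical stand-in for Diff::LCS.diff: returns an array of hunks, where
# each hunk is a run of adjacent removals ("-") and additions ("+").
def simple_hunks(a, b)
  # lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
  lcs = Array.new(a.size + 1) { Array.new(b.size + 1, 0) }
  (a.size - 1).downto(0) do |i|
    (b.size - 1).downto(0) do |j|
      lcs[i][j] = if a[i] == b[j]
                    lcs[i + 1][j + 1] + 1
                  else
                    [lcs[i + 1][j], lcs[i][j + 1]].max
                  end
    end
  end

  hunks = []
  current = []
  i = j = 0
  while i < a.size || j < b.size
    if i < a.size && j < b.size && a[i] == b[j]
      # A matching element closes the hunk in progress.
      hunks << current unless current.empty?
      current = []
      i += 1
      j += 1
    elsif j == b.size || (i < a.size && lcs[i + 1][j] >= lcs[i][j + 1])
      current << ["-", a[i]]
      i += 1
    else
      current << ["+", b[j]]
      j += 1
    end
  end
  hunks << current unless current.empty?
  hunks
end

# Assumed reading of the check in #diff: a pair qualifies when the number of
# hunks is at most the threshold.
def within_threshold?(a, b, threshold)
  threshold >= simple_hunks(a, b).size
end

a = %i[if call foo return]
b = %i[if call bar return]      # one changed region
c = %i[unless call bar return]  # two changed regions

puts simple_hunks(a, b).size     # 1
puts within_threshold?(a, b, 1)  # true
puts within_threshold?(a, c, 1)  # false
```

Counting hunks rather than individual changes means that one long rewritten region weighs the same as a single-token edit, which is worth keeping in mind when choosing a threshold.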
#duplicates ⇒ Object
Finds blocks of code that are exact duplicates, node for node. All duplicate blocks are grouped together.
# File 'lib/doppelganger/node_analysis.rb', line 21
def duplicates
  block_nodes = @sexp_blocks.map{ |sblock| sblock.body.remove_literals }
  (@sexp_blocks.inject([]) do |duplicate_blocks, sblock|
    node_body = sblock.body.remove_literals
    if block_nodes.duplicates?(node_body)
      if duplicate_blocks.map{|sb| sb.first.body.remove_literals}.include?(node_body)
        duplicate_blocks.find{|sb| sb.first.body.remove_literals == node_body } << sblock
      else
        duplicate_blocks << [sblock]
      end
    end
    duplicate_blocks
  end).compact.uniq
end
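The grouping performed by #duplicates can be illustrated with plain Ruby data. This is a sketch under stated assumptions, not the library's code: blocks are modelled as hashes with a nested-array body, and `strip_literals` is a hypothetical helper that guesses at what `remove_literals` does (blanking the values of literal nodes so that bodies differing only in literals compare equal). `duplicate_groups` is likewise an illustrative name.

```ruby
# Assumed behaviour of remove_literals: drop the value of :lit and :str
# nodes, keeping only the node type.
def strip_literals(node)
  return node unless node.is_a?(Array)
  return [node.first] if %i[lit str].include?(node.first)

  node.map { |child| strip_literals(child) }
end

# Group blocks whose literal-free bodies are identical and keep only the
# groups that contain more than one block.
def duplicate_groups(blocks)
  blocks
    .group_by { |block| strip_literals(block[:body]) }
    .values
    .select { |group| group.size > 1 }
end

blocks = [
  { name: :double,    body: [:call, [:lvar, :w], :*, [:lit, 2]] },
  { name: :quadruple, body: [:call, [:lvar, :w], :*, [:lit, 4]] },
  { name: :label,     body: [:call, [:lvar, :w], :to_s] }
]

groups = duplicate_groups(blocks)
groups.each { |group| puts group.map { |block| block[:name] }.inspect }
# prints [:double, :quadruple]
```

Using `group_by` gives the same "all duplicate blocks are grouped together" shape in a single pass, whereas the original builds the groups incrementally with `inject`.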
#duplication? ⇒ Boolean
Returns true if the code base contains any duplicates.
# File 'lib/doppelganger/node_analysis.rb', line 15
def duplication?
  not duplicates.empty?
end
#percent_diff(percentage, progress_bar = nil) ⇒ Object
Finds block-like nodes that differ by a given threshold percentage or less, but are not duplicates.
# File 'lib/doppelganger/node_analysis.rb', line 51
def percent_diff(percentage, progress_bar = nil)
  # To calculate the percentage we can do this in one of two ways: we can compare
  # total differences (the diff set flattened) over the total nodes (the flattened bodies added),
  # or we can compare the number of change sets (the size of the diff) over the average number of nodes
  # in the two methods.
  # Not sure which is best but I've gone with the former for now.
  diff_nodes = []
  @compared_node_pairs = []
  stepwise_sblocks(progress_bar) do |block_node_1, block_node_2|
    total_nodes = block_node_1.flat_body_array.size + block_node_2.flat_body_array.size
    diff_size = Diff::LCS.diff(block_node_1.flat_body_array, block_node_2.flat_body_array).flatten.size
    if percentage >= (diff_size.to_f/total_nodes.to_f * 100)
      diff_nodes << [block_node_1, block_node_2]
    end
    @compared_node_pairs << [block_node_1, block_node_2]
  end
  @compared_node_pairs = []
  cleanup_descendant_duplicate_matches(diff_nodes)
end
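The two ways of computing the percentage described in the source comment can be compared with a small arithmetic sketch. The method names below are hypothetical, and the inputs are plain counts rather than real nodes: `change_count` stands for the size of the flattened diff, `hunk_count` for the number of change sets, and `size_a` and `size_b` for the sizes of the two flattened bodies.

```ruby
# The approach the source uses: total differences over total nodes.
def percent_by_total(change_count, size_a, size_b)
  change_count.to_f / (size_a + size_b) * 100
end

# The alternative mentioned in the comment: number of change sets over the
# average number of nodes in the two bodies.
def percent_by_average(hunk_count, size_a, size_b)
  hunk_count.to_f / ((size_a + size_b) / 2.0) * 100
end

# Assumed reading of the check in #percent_diff: a pair qualifies when its
# percentage is at most the requested one.
def similar?(change_count, size_a, size_b, percentage)
  percentage >= percent_by_total(change_count, size_a, size_b)
end

# Three individual changes across bodies of 5 and 7 nodes.
puts percent_by_total(3, 5, 7)  # 25.0
puts similar?(3, 5, 7, 30)      # true
puts similar?(3, 5, 7, 20)      # false
```

If the same three changes form a single change set, `percent_by_average(1, 5, 7)` yields a noticeably lower figure than 25.0, so a percentage tuned for one formula would not carry over to the other.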