Class: Wikipedia::VandalismDetection::Features::MarkupImpact

Inherits:
Base
  • Object
Defined in:
lib/wikipedia/vandalism_detection/features/markup_impact.rb

Overview

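This feature computes the impact of an edit on the amount of markup in the text, expressed as the ratio of the markup count in the old revision to the combined markup count of the old and new revisions. Values below 0.5 mean the edit added markup, values above 0.5 mean it removed markup, and 0.5 is returned when neither revision contains any markup.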

Instance Method Summary

Methods inherited from Base

#count

Instance Method Details

#calculate(edit) ⇒ Object



# File 'lib/wikipedia/vandalism_detection/features/markup_impact.rb', line 11

def calculate(edit)
  super

  old_text = edit.old_revision.text
  new_text = edit.new_revision.text
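  # Alternation over every entry of the markup word list
  regex = /(#{WordLists::MARKUP.join('|')})/

  old_markup_count = old_text.scan(regex).count.to_f
  new_markup_count = new_text.scan(regex).count.to_f

  # Neutral value when neither revision has markup (avoids 0.0 / 0.0, which is NaN)
  no_terms_in_both = (old_markup_count == 0 && new_markup_count == 0)
  # Share of the old revision in the combined markup count
  no_terms_in_both ? 0.5 : (old_markup_count / (old_markup_count + new_markup_count))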
end
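
The ratio can be tried outside the library with a minimal standalone sketch. The names `SAMPLE_MARKUP` and `markup_impact` are hypothetical and the token list is a made-up stand-in for `WordLists::MARKUP`, whose contents are not shown here. The sketch also uses `Regexp.union`, which escapes literal strings, whereas the method above interpolates the list entries into the regex unescaped.

```ruby
# Hypothetical markup tokens, for illustration only;
# the real list is WordLists::MARKUP and may differ.
SAMPLE_MARKUP = ['[[', '{{', "'''"].freeze

# Returns old / (old + new) over markup token counts,
# or 0.5 when neither text contains any token.
def markup_impact(old_text, new_text, tokens = SAMPLE_MARKUP)
  regex = Regexp.union(tokens) # escapes each literal token
  old_count = old_text.scan(regex).count.to_f
  new_count = new_text.scan(regex).count.to_f
  return 0.5 if old_count.zero? && new_count.zero?

  old_count / (old_count + new_count)
end

# One token before the edit, three after: 1 / (1 + 3)
puts markup_impact('See [[Ruby]].', 'See [[Ruby]], {{cite}} and [[YARD]].') # prints 0.25
```

Under these assumptions, an edit that triples the markup yields 0.25, an edit that leaves the markup count unchanged yields 0.5, and an edit that strips all markup yields 1.0.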