Class: Wikipedia::VandalismDetection::Features::CommentMarkupFrequency

Inherits:
Base
  • Object
Defined in:
lib/wikipedia/vandalism_detection/features/comment_markup_frequency.rb

Overview

This feature computes the frequency of markup words in the comment of the edit’s new revision.

Instance Method Summary

Methods inherited from Base

#count

Instance Method Details

#calculate(edit) ⇒ Object

Returns the ratio of markup matches to whitespace-separated words in the new revision’s comment (a fraction, not a value scaled to 100). Returns 0.0 if the comment contains no words.



# File 'lib/wikipedia/vandalism_detection/features/comment_markup_frequency.rb', line 13

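def calculate(edit)
  super

  # Total number of whitespace-separated words in the edit comment.
  comment = edit.new_revision.comment
  all_words_count = comment.split.count

  # Alternation of all markup patterns; every match in the comment is counted.
  regex = /(#{WordLists::MARKUP.join('|')})/
  markup_words_count = comment.scan(regex).count

  # Guard against division by zero for empty or whitespace-only comments.
  all_words_count > 0 ? markup_words_count.to_f / all_words_count : 0.0
end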
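
The calculation above can be tried in isolation with a small sketch. The constant `SAMPLE_MARKUP` and the method `markup_frequency` are hypothetical names introduced only for illustration; the actual contents of `WordLists::MARKUP` are not shown on this page, so the patterns below are assumed stand-ins.

```ruby
# Hypothetical stand-in for WordLists::MARKUP (assumed to hold regex source
# fragments, since the original joins the entries with '|').
SAMPLE_MARKUP = ['\[\[', '\]\]', '\{\{', '\}\}', "'''"].freeze

# Ratio of markup matches to whitespace-separated words in a comment string.
def markup_frequency(comment, patterns = SAMPLE_MARKUP)
  words_count = comment.split.count
  return 0.0 if words_count.zero?

  regex = /(#{patterns.join('|')})/
  comment.scan(regex).count.to_f / words_count
end

puts markup_frequency('fixed [[link]] typo')
puts markup_frequency('')
```

Because every regex match is counted while words are split on whitespace, a single word such as `[[a]]` contributes two matches, so under these assumed patterns the value can exceed 1.0.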