Class: Wikipedia::VandalismDetection::Features::UpperCaseWordsRatio

Inherits:
Base
  • Object
Defined in:
lib/wikipedia/vandalism_detection/features/upper_case_words_ratio.rb

Overview

This feature computes the ratio of uppercase words to all words in the text inserted by the edit’s new revision.

Instance Method Summary

Methods inherited from Base

#count

Instance Method Details

#calculate(edit) ⇒ Object



# File 'lib/wikipedia/vandalism_detection/features/upper_case_words_ratio.rb', line 13

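def calculate(edit)
  super

  # Drop inserted tokens that contain no ASCII letters, then join the rest.
  inserted_alpha_text = edit.inserted_words.delete_if{ |w| w.gsub(/[^A-Za-z]/, '').empty? }.join("\n")
  # Clean the text, strip everything except word characters and whitespace, and split into words.
  words = Text.new(inserted_alpha_text).clean.gsub(/[^\w\s]/, '').split

  return 0.0 if words.empty?

  # A word counts as uppercase when upcasing leaves it unchanged.
  uppercase_words_count = words.reduce(0) do |count, word|
    count += 1 if word == word.upcase
    count
  end

  # Add-one smoothed ratio of uppercase words to all words.
  (1.0 + uppercase_words_count) / (1.0 + words.count)
end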
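
The arithmetic of the method above can be tried out in isolation with a minimal sketch. The helper name upper_case_words_ratio is hypothetical and not part of the library; it takes a plain array of strings in place of an edit and leaves out the Text#clean step, whose behavior is not shown on this page. It also uses reject rather than delete_if, so the input array is not modified. Because of the add-one smoothing, any non-empty input yields a value above zero: a single lowercase word gives 0.5.

```ruby
# Hypothetical standalone helper, not part of the library.
# Mirrors the arithmetic of #calculate but omits the Text#clean step.
def upper_case_words_ratio(inserted_words)
  # Keep only tokens containing at least one ASCII letter (non-mutating).
  alpha_words = inserted_words.reject { |w| w.gsub(/[^A-Za-z]/, '').empty? }

  # Strip everything except word characters and whitespace, then split.
  words = alpha_words.join("\n").gsub(/[^\w\s]/, '').split

  return 0.0 if words.empty?

  # A word counts as uppercase when upcasing leaves it unchanged.
  uppercase_count = words.count { |w| w == w.upcase }

  # Add-one smoothed ratio of uppercase words to all words.
  (1.0 + uppercase_count) / (1.0 + words.count)
end

upper_case_words_ratio(%w[THIS is SPAM !!!]) # => 0.75
upper_case_words_ratio(%w[hello])            # => 0.5
upper_case_words_ratio([])                   # => 0.0
```

In the first call the token "!!!" is dropped because it contains no letters, leaving three words of which two are uppercase, so the result is (1 + 2) / (1 + 3).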