Class: Wikipedia::VandalismDetection::Features::NonAlphanumericRatio

Inherits:
Base
  • Object
Defined in:
lib/wikipedia/vandalism_detection/features/non_alphanumeric_ratio.rb

Overview

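This feature computes the ratio of non-alphanumeric characters to all non-whitespace characters in the text inserted by the edit's new revision. Both counts are incremented by one before dividing, and empty inserted text yields 0.0.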

Instance Method Summary

Methods inherited from Base

#count

Instance Method Details

#calculate(edit) ⇒ Object



# File 'lib/wikipedia/vandalism_detection/features/non_alphanumeric_ratio.rb', line 10

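def calculate(edit)
  super

  text = edit.inserted_text
  # No inserted text: report a ratio of zero.
  return 0.0 if text.empty?

  # Characters that are neither ASCII letters, ASCII digits, nor whitespace.
  non_alpha_count = text.scan(/[^a-zA-Z0-9\s]/).size
  # All non-whitespace characters.
  all_letters_count = text.scan(/[^\s]/).size

  # Add one to both counts before dividing.
  (1.0 + non_alpha_count) / (1.0 + all_letters_count)
end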
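
To make the arithmetic concrete, the following is a minimal standalone sketch of the same computation applied to plain strings. The helper name non_alphanumeric_ratio is hypothetical and not part of the gem; it assumes the inserted text is already available as a String, so it skips the edit object and the call to super.

```ruby
# Hypothetical helper for illustration only; not part of the gem's API.
# Mirrors the arithmetic of #calculate on a plain String.
def non_alphanumeric_ratio(text)
  return 0.0 if text.empty?

  non_alnum = text.scan(/[^a-zA-Z0-9\s]/).size # punctuation, symbols, non-ASCII letters
  visible   = text.scan(/\S/).size             # every non-whitespace character

  (1.0 + non_alnum) / (1.0 + visible)
end

puts non_alphanumeric_ratio('')                  # 0.0
puts non_alphanumeric_ratio('abc')               # 0.25  => (1 + 0) / (1 + 3)
puts non_alphanumeric_ratio('Hello!!!').round(3) # 0.444 => (1 + 3) / (1 + 8)
puts non_alphanumeric_ratio('!!!')               # 1.0   => (1 + 3) / (1 + 3)
```

Two consequences of the formula are worth noting. Because the character class is ASCII-only, accented or non-Latin letters are counted as non-alphanumeric. And because of the added ones, purely alphanumeric text never reaches 0.0, while non-empty text made up only of whitespace yields 1.0.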