Class: Wikipedia::VandalismDetection::Features::InsertedCharacterDistribution

Inherits:
Base
  • Object
Includes:
Algorithms
Defined in:
lib/wikipedia/vandalism_detection/features/inserted_character_distribution.rb

Overview

This feature computes the Kullback-Leibler divergence of the inserted text’s character distribution relative to the character distribution of the old revision’s text. The smaller the divergence, the more similar the two distributions are, and vice versa: a large divergence means the inserted text uses characters in proportions quite unlike those of the old revision.
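To make the measure concrete, here is a minimal sketch of a character-level Kullback-Leibler divergence in plain Ruby (2.7+ for `tally`). It is not the gem's `Algorithms` implementation: the helper names `character_distribution` and `kl_divergence`, the `epsilon` floor for characters missing from the reference text, and the choice of the inserted text as P and the old revision as Q are all assumptions made for illustration.

```ruby
# Hypothetical helpers for illustration; not part of the gem.

# Relative character frequencies of a string, as a Hash of char => probability.
def character_distribution(text)
  total = text.length.to_f
  text.each_char.tally.transform_values { |count| count / total }
end

# D_KL(P || Q) = sum over characters c of P(c) * log(P(c) / Q(c)),
# where P is built from p_text and Q from q_text.
# Characters that never occur in q_text get a small floor probability
# (an assumption made here so the logarithm stays finite).
def kl_divergence(p_text, q_text, epsilon: 1e-9)
  return 0.0 if p_text.empty?

  p_dist = character_distribution(p_text)
  q_dist = character_distribution(q_text)

  p_dist.sum do |char, p_prob|
    q_prob = q_dist.fetch(char, epsilon)
    p_prob * Math.log(p_prob / q_prob)
  end
end

old_text = "abab"
puts kl_divergence("abab", old_text) # identical distributions give 0.0
puts kl_divergence("aaaa", old_text) # log(2), about 0.693
puts kl_divergence("zzzz", old_text) # much larger: "z" never occurs in old_text
```

In this sketch, inserted text made up of characters the old revision rarely or never uses yields a large value, which is the kind of signal the feature is described as capturing.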

Instance Method Summary

Methods inherited from Base

#count

Instance Method Details

#calculate(edit) ⇒ Object



# File 'lib/wikipedia/vandalism_detection/features/inserted_character_distribution.rb', line 14

def calculate(edit)
  super

  # Compare the character distributions of the old revision's text and the inserted text.
  kullback_leibler_divergence(edit.old_revision.text, edit.inserted_text)
end