Class: Wikipedia::VandalismDetection::Features::RemovedCharacterDistribution

Inherits:
Base
  • Object
Includes:
Algorithms
Defined in:
lib/wikipedia/vandalism_detection/features/removed_character_distribution.rb

Overview

This feature computes the Kullback-Leibler divergence of the removed text’s character distribution relative to the character distribution of the new revision’s text. The smaller the divergence, the more similar the two distributions are; the larger the divergence, the more they differ.
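As a rough illustration of the measure described above, the following minimal sketch computes a character-level Kullback-Leibler divergence, D(P || Q) = sum over characters c of P(c) * log(P(c) / Q(c)). The helper names `character_distribution` and `kl_divergence`, the `epsilon` smoothing for characters missing from Q, the empty-text shortcut, and the choice of the removed text as P are all assumptions made for this example; the `kullback_leibler_divergence` method from `Algorithms` may differ in any of these respects.

```ruby
# Hypothetical helper (not part of the gem): maps each character of
# `text` to its relative frequency.
def character_distribution(text)
  counts = text.each_char.each_with_object(Hash.new(0)) { |char, acc| acc[char] += 1 }
  total = text.length.to_f
  counts.transform_values { |count| count / total }
end

# Hypothetical sketch of D(P || Q), where P is the character distribution
# of text_p and Q that of text_q. Characters of P that never occur in Q
# are given probability `epsilon` (an assumption made here so that the
# logarithm stays finite). Empty input yields 0.0 by convention in this
# sketch.
def kl_divergence(text_p, text_q, epsilon: 1e-10)
  return 0.0 if text_p.empty? || text_q.empty?

  p_dist = character_distribution(text_p)
  q_dist = character_distribution(text_q)

  p_dist.sum do |char, p_c|
    q_c = q_dist.fetch(char, epsilon)
    p_c * Math.log(p_c / q_c)
  end
end

# Example inputs: removed text as P, new revision text as Q
# (this direction is assumed for illustration).
removed_text = "zzzz!!!!"
new_text     = "The quick brown fox jumps over the lazy dog."
divergence   = kl_divergence(removed_text, new_text)
```

With identical texts the sketch returns 0.0, and the value grows as the removed text's characters become rarer in the new revision. Note that the divergence is not symmetric, so swapping the two arguments generally changes the result.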

Instance Method Summary

Methods inherited from Base

#count

Instance Method Details

#calculate(edit) ⇒ Object



# File 'lib/wikipedia/vandalism_detection/features/removed_character_distribution.rb', line 14

def calculate(edit)
  super

  kullback_leibler_divergence(edit.new_revision.text, edit.removed_text)
end