Class: Wikipedia::VandalismDetection::FeatureCalculator

Inherits:
Object
Defined in:
lib/wikipedia/vandalism_detection/feature_calculator.rb

Overview

This class provides methods for calculating the feature set of an edit. The features to calculate are listed in the config/config.yml file under the ‘features:’ root attribute, for example:

features:
  - anonymity
  - character sequence
  - ...
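
As a rough illustration of how such a list can be read, the following sketch parses an inline YAML string with Ruby's standard library. The inline text, the variable names, and the use of ArgumentError are assumptions made for the example; the gem resolves its configuration through Wikipedia::VandalismDetection.configuration and raises FeaturesNotConfiguredError.

```ruby
require 'yaml'

# Hypothetical config contents; the real file may hold other keys as well.
config_text = <<~YAML
  features:
    - anonymity
    - character sequence
YAML

config = YAML.safe_load(config_text)
features = config.fetch('features', [])

# Stand-in for the constructor's guard: a missing or empty list is an error.
# (ArgumentError is used here only because the gem's error class is not loaded.)
raise ArgumentError, 'no features configured' if features.nil? || features.empty?

puts features.inspect
```

Feature names are plain strings, so a name containing spaces such as "character sequence" needs no quoting in YAML.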


Constructor Details

#initialize ⇒ FeatureCalculator

Returns a new instance of FeatureCalculator.



# File 'lib/wikipedia/vandalism_detection/feature_calculator.rb', line 22

def initialize
  @features = Wikipedia::VandalismDetection.configuration.features
  # blank? is already true for both nil and empty collections.
  raise FeaturesNotConfiguredError if @features.blank?
  @feature_classes = build_feature_classes @features
end
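
The body of build_feature_classes is not shown on this page. One plausible way to resolve configured names to classes is sketched below, purely as an assumption: the DemoFeatures module, its two classes, and the demo_feature_class helper are hypothetical stand-ins and not the gem's implementation.

```ruby
# Hypothetical namespace with two stand-in feature classes.
module DemoFeatures
  class Anonymity; end
  class CharacterSequence; end
end

# Turn "character sequence" into "CharacterSequence" and look the constant up
# inside the demo namespace. Raises NameError for an unknown name.
def demo_feature_class(name)
  class_name = name.split(/[\s_-]+/).map(&:capitalize).join
  DemoFeatures.const_get(class_name)
end

classes = ['anonymity', 'character sequence'].map { |name| demo_feature_class(name) }
puts classes.inspect
```

With a scheme like this, a misspelled feature name fails at construction time rather than when the first edit is processed.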

Instance Method Details

#calculate_feature_for(edit, feature_name) ⇒ Object

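Returns the calculated numeric feature value for the given edit and the feature with the given name. If the edit's wikitext cannot be extracted, a message is printed to standard error and Features::MISSING_VALUE is returned instead.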

Raises:

  • (ArgumentError) if edit is not an Edit or feature_name is not a String


# File 'lib/wikipedia/vandalism_detection/feature_calculator.rb', line 49

def calculate_feature_for(edit, feature_name)
  raise ArgumentError, "First parameter has to be an Edit." unless edit.is_a? Edit
  raise ArgumentError, "Second parameter has to be a feature name String (e.g. 'anonymity')." unless \
    feature_name.is_a? String

  value = Features::MISSING_VALUE

  begin
    feature = feature_class_from_name(feature_name)
    value = feature.calculate(edit)
  rescue WikitextExtractionError
    # Report the failure on stderr; the MISSING_VALUE default is returned below.
    $stderr.puts "Edit (#{edit.old_revision.id}, #{edit.new_revision.id}) could not be parsed " \
                 "by the WikitextExtractor and will be discarded."
  end

  value
end
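
The fallback behaviour of this method can be tried without the gem. The sketch below is a minimal stand-in: DemoExtractionError, DEMO_MISSING_VALUE, DemoEdit, both feature classes, and demo_calculate are hypothetical names chosen for the example, and the '?' placeholder is an assumption rather than the gem's actual Features::MISSING_VALUE.

```ruby
# Stand-ins for the gem's types; every name below is hypothetical.
class DemoExtractionError < StandardError; end
DEMO_MISSING_VALUE = '?'
DemoEdit = Struct.new(:old_id, :new_id)

# A feature whose calculation always fails, as if the wikitext were unparsable.
class DemoBrokenFeature
  def calculate(_edit)
    raise DemoExtractionError, 'wikitext could not be extracted'
  end
end

# A feature that always succeeds.
class DemoAnonymity
  def calculate(_edit)
    1.0
  end
end

# Return the feature value, or the placeholder when extraction fails.
def demo_calculate(feature, edit)
  feature.calculate(edit)
rescue DemoExtractionError
  $stderr.puts "Edit (#{edit.old_id}, #{edit.new_id}) could not be parsed and will be discarded."
  DEMO_MISSING_VALUE
end

edit = DemoEdit.new(1, 2)
ok_value = demo_calculate(DemoAnonymity.new, edit)
failed_value = demo_calculate(DemoBrokenFeature.new, edit)
puts [ok_value, failed_value].inspect
```

Callers therefore receive either a number or the placeholder, and should check for the placeholder before doing arithmetic on the result.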

#calculate_features_for(edit) ⇒ Object

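Calculates the configured features for the given edit and returns an array of the computed values, in the order in which the features are configured. A feature whose wikitext extraction fails contributes Features::MISSING_VALUE at its position.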

Raises:

  • (ArgumentError) if edit is not an Edit


# File 'lib/wikipedia/vandalism_detection/feature_calculator.rb', line 30

def calculate_features_for(edit)
  raise ArgumentError, "Input has to be an Edit." unless edit.is_a? Edit

  features = @feature_classes.map do |feature|
    begin
      feature.calculate(edit)
    rescue WikitextExtractionError
      # Report the failure on stderr and fall back to the missing-value marker.
      $stderr.puts "Edit (#{edit.old_revision.id}, #{edit.new_revision.id}) could not be parsed " \
                   "by the WikitextExtractor and will be discarded."

      Features::MISSING_VALUE
    end
  end

  features
end
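
Because the rescue sits inside the map block, one failing feature does not shorten the result: the array keeps one slot per configured feature. The following sketch illustrates this with lambdas in place of feature classes; DemoParseError, the hash-based edit, and the '?' placeholder are assumptions made for the example.

```ruby
# Hypothetical error type standing in for WikitextExtractionError.
class DemoParseError < StandardError; end

missing = '?' # assumed placeholder, not necessarily the gem's MISSING_VALUE

# Three stand-in feature calculations; the second one always fails.
feature_calculations = [
  ->(edit) { edit[:anonymous] ? 1.0 : 0.0 },
  ->(_edit) { raise DemoParseError },
  ->(edit) { edit[:comment].length.to_f }
]

edit = { anonymous: true, comment: 'typo' }

# Rescue per feature so that a failure only affects its own slot.
values = feature_calculations.map do |calculation|
  begin
    calculation.call(edit)
  rescue DemoParseError
    missing
  end
end

puts values.inspect
```

Keeping the positions stable matters when the vector is later matched against the feature names or written out as one row of a data set.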

#used_features ⇒ Object

Returns the feature names as defined in conf/config.yml under ‘features:’.



# File 'lib/wikipedia/vandalism_detection/feature_calculator.rb', line 69

def used_features
  @features
end
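
Since the values array follows the configured order, the names from used_features can be paired with it to label a feature vector. A small sketch, in which both arrays are made-up sample data standing in for the return values of used_features and calculate_features_for:

```ruby
names = ['anonymity', 'character sequence'] # sample of what used_features might return
values = [1.0, 0.25]                         # sample of what calculate_features_for might return

# Pair each name with the value at the same position.
labelled = names.zip(values).to_h

labelled.each { |name, value| puts "#{name}: #{value}" }
```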