Class: Wikipedia::VandalismDetection::Features::RemovedEmoticonsFrequency
- Inherits:
- Base
  - Object
    - Base
      - Wikipedia::VandalismDetection::Features::RemovedEmoticonsFrequency
- Defined in:
- lib/wikipedia/vandalism_detection/features/removed_emoticons_frequency.rb
Overview
This feature computes the frequency of emoticon words in the removed text.
Instance Method Summary
- #calculate(edit) ⇒ Object
Returns the frequency of emoticon words in the removed text.
Methods inherited from Base
Instance Method Details
#calculate(edit) ⇒ Object
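Returns the frequency of emoticon words in the removed text, computed as the number of emoticon matches divided by the number of whitespace-separated words. Returns 0.0 if the cleaned removed text contains no words.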
# File 'lib/wikipedia/vandalism_detection/features/removed_emoticons_frequency.rb', line 13
def calculate(edit)
  super

  removed_text = edit.removed_text
  regex = /(^|\s)(#{WordLists::EMOTICONS.join('|')})(?=\s|$|\Z|[\.,!?]\s|[\.!?]\Z)/
  emoticons_count = removed_text.scan(regex).flatten.reject { |c| c.size < 2 }.count
  total_count = removed_text.split.count

  (total_count > 0) ? (emoticons_count.to_f) / (total_count.to_f) : 0.0
end
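The counting logic can be tried outside the gem with a small standalone sketch. The `EMOTICONS` list and the `emoticon_frequency` helper below are hypothetical stand-ins, not part of the library: the real `WordLists::EMOTICONS` is not shown on this page, and the sketch assumes plain emoticon strings that need escaping before being joined into a pattern.

```ruby
# Hypothetical stand-in for WordLists::EMOTICONS (the real list is not shown here).
EMOTICONS = [':)', ':-)', ':(', ';)', ':D'].freeze

# Assumption: entries are plain strings, so escape them to match ")" etc. literally.
EMOTICON_PATTERN = EMOTICONS.map { |e| Regexp.escape(e) }.join('|')

# Same shape as the regex in #calculate: an emoticon preceded by start of line
# or whitespace, and followed by whitespace, end of text, or punctuation.
EMOTICON_REGEX = /(^|\s)(#{EMOTICON_PATTERN})(?=\s|$|\Z|[\.,!?]\s|[\.!?]\Z)/

# Hypothetical helper mirroring #calculate, but taking a plain string.
def emoticon_frequency(text)
  # scan yields [prefix, emoticon] pairs; after flattening, strings shorter
  # than two characters (the "" or single-whitespace prefix) are dropped.
  emoticons_count = text.scan(EMOTICON_REGEX).flatten.reject { |c| c.size < 2 }.count
  total_count = text.split.count

  total_count > 0 ? emoticons_count.to_f / total_count : 0.0
end

puts emoticon_frequency('nice work :) thanks :D') # => 0.4 (2 emoticons / 5 words)
puts emoticon_frequency('')                       # => 0.0 (no words at all)
```

Note that the result is a ratio between 0.0 and 1.0 rather than a value scaled to 100, and that the `reject { |c| c.size < 2 }` step would also discard any single-character emoticon, should the real word list contain one.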