Class: Wikipedia::VandalismDetection::Features::Compressibility
- Defined in:
- lib/wikipedia/vandalism_detection/features/compressibility.rb
Overview
This feature describes the compressibility of the inserted text, expressed as a ratio computed from its uncompressed and compressed sizes.
Instance Method Summary
- #calculate(edit) ⇒ Object
Calculates the compressibility ratio of the inserted text.
Methods inherited from Base
Instance Method Details
#calculate(edit) ⇒ Object
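Calculates the compressibility ratio of the inserted text, defined as uncompressed size / (compressed size + uncompressed size). Values above 0.5 mean the text compresses to less than its original size and can therefore indicate nonsense text such as ‘AAAAAAAAAAAAAAAAAAAhhhhhhhhhhhhhhhh!’. Empty inserted text yields 0.5.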
# File 'lib/wikipedia/vandalism_detection/features/compressibility.rb', line 15

def calculate(edit)
  super

  inserted_text = edit.inserted_text
  uncompressed_size = inserted_text.bytesize.to_f
  compressed_size = Zlib::Deflate.deflate(inserted_text).bytesize.to_f

  inserted_text.empty? ? 0.5 : (uncompressed_size / (compressed_size + uncompressed_size))
end
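To see how the ratio behaves, the same arithmetic can be tried outside the feature class. The following is a minimal sketch, not part of the library: the helper name `compressibility_ratio` is hypothetical and takes a plain string in place of an edit object, but it applies the same formula as `#calculate` above using the standard `Zlib::Deflate.deflate` call.

```ruby
require 'zlib'

# Hypothetical standalone helper (not part of the library): mirrors the
# formula used in #calculate, but takes a String instead of an edit.
def compressibility_ratio(text)
  # Empty text is treated as neutral, as in the original method.
  return 0.5 if text.empty?

  uncompressed_size = text.bytesize.to_f
  compressed_size = Zlib::Deflate.deflate(text).bytesize.to_f

  # Above 0.5 when compressed_size < uncompressed_size (compresses well),
  # below 0.5 when the compressed form is larger than the input.
  uncompressed_size / (compressed_size + uncompressed_size)
end

repetitive = 'A' * 100   # highly repetitive, compresses well
short_text = 'abc'       # too short to compress; zlib overhead dominates

puts compressibility_ratio(repetitive)
puts compressibility_ratio(short_text)
puts compressibility_ratio('')
```

Note that very short inputs fall below 0.5 because the zlib header and checksum make the compressed form larger than the input, so a low value does not by itself say much about short insertions.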