Class: String

Inherits:
Object
Defined in:
lib/arachni/ruby/string.rb

Overview

Overloads the String class.

Author:

Instance Method Summary

Instance Method Details

#binary? ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/arachni/ruby/string.rb', line 157

def binary?
    # Stolen from YAML.
    encoding == Encoding::ASCII_8BIT ||
        index("\x00") ||
        count("\x00-\x7F", "^ -~\t\r\n").fdiv(length) > 0.3
end
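
A minimal usage sketch of the heuristic above. The method body is copied from the listing so the example runs without Arachni; the sample strings are made up for illustration. Note that the `index` branch returns an Integer rather than `true`, so the result is best treated as truthy/falsy.

```ruby
class String
  # Copied from the listing above so this sketch runs standalone.
  def binary?
    encoding == Encoding::ASCII_8BIT ||
      index("\x00") ||
      count("\x00-\x7F", "^ -~\t\r\n").fdiv(length) > 0.3
  end
end

text     = "Plain ASCII text\n"  # printable characters and whitespace only
with_nul = "abc\x00def"          # contains a NUL byte
raw      = "\xFF\xD8\xFF".b      # String#b returns an ASCII-8BIT copy
control  = "\x01\x02\x03abc"     # 3 of 6 characters are control characters

p !!text.binary?      # => false
p !!with_nul.binary?  # => true (the raw return value is the NUL index, 3)
p !!raw.binary?       # => true
p !!control.binary?   # => true (0.5 exceeds the 0.3 threshold)
```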

#diff_ratio(other) ⇒ Float

Calculates the difference ratio (at a word level) between `self` and `other`.

Parameters:

  • other (String)

Returns:

  • (Float)

    `0.0` (identical strings) to `1.0` (completely different)



# File 'lib/arachni/ruby/string.rb', line 99

def diff_ratio( other )
    return 0.0 if self == other
    return 1.0 if empty? || other.empty?

    s_words = self.words( true )
    o_words = other.words( true )

    common = (s_words & o_words).size.to_f
    union  = (s_words | o_words).size.to_f

    (union - common) / union
end
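
A worked example of the ratio, as a sketch. `#words` and `#diff_ratio` are copied from the listings in this document so the code runs standalone; the sample sentences are invented. Because the method uses array set operations (`&` and `|`), it compares unique words only and ignores their order and repetition.

```ruby
class String
  # Copied from the listings in this document so the sketch runs standalone.
  def words(strict = false)
    splits = split(/\b/)
    splits.reject! { |w| !(w =~ /\w/) } if strict
    splits
  end

  def diff_ratio(other)
    return 0.0 if self == other
    return 1.0 if empty? || other.empty?

    s_words = words(true)
    o_words = other.words(true)

    common = (s_words & o_words).size.to_f
    union  = (s_words | o_words).size.to_f

    (union - common) / union
  end
end

a = 'the quick brown fox'
b = 'the quick red fox'

# 3 shared words, 5 unique words in the union: (5 - 3) / 5
p a.diff_ratio(b)           # => 0.4
p a.diff_ratio(a)           # => 0.0
p a.diff_ratio('')          # => 1.0
# Order and repetition do not count as differences.
p 'a a b'.diff_ratio('b a') # => 0.0
```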

#longest_word ⇒ String

Returns the longest word.

Returns:

  • (String)

    Longest word.


# File 'lib/arachni/ruby/string.rb', line 132

def longest_word
    words( true ).sort_by { |w| w.size }.last
end

#persistent_hash ⇒ Integer

Returns an integer with the following property:

If `str1 == str2` then `str1.persistent_hash == str2.persistent_hash`.

It serves the same purpose as Ruby’s `#hash` method, but does not use a random seed per Ruby process, which makes it suitable for use in distributed systems.

Returns:

  • (Integer)

    An integer with the following property:

    If `str1 == str2` then `str1.persistent_hash == str2.persistent_hash`.

    It serves the same purpose as Ruby’s `#hash` method, but does not use a random seed per Ruby process, which makes it suitable for use in distributed systems.



# File 'lib/arachni/ruby/string.rb', line 144

def persistent_hash
    Zlib.crc32 self
end
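
A small sketch of the property described above. It assumes the `zlib` standard library has been loaded, which the listing itself does not show; the sample URL is hypothetical. Since CRC32 is a 32-bit checksum, distinct strings can collide, so the guarantee only runs in one direction.

```ruby
require 'zlib'

class String
  # Copied from the listing above so this sketch runs standalone.
  def persistent_hash
    Zlib.crc32 self
  end
end

a = 'http://example.com/?id=1'
b = a.dup

# Equal strings always produce the same value, in any Ruby process.
p a.persistent_hash == b.persistent_hash  # => true

# The value is an unsigned 32-bit integer.
p a.persistent_hash.between?(0, 2**32 - 1)  # => true
```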

#rdiff(other) ⇒ String

Gets the reverse diff between `self` and `other` on a word level, i.e. the parts of `self` that also appear in `other`.

str = <<END
This is the first test.
Not really sure what else to put here...
END

str2 = <<END
This is the second test.
Not really sure what else to put here...
Boo-Yah!
END

str.rdiff( str2 )
# => "This is the  test.\nNot really sure what else to put here...\n"
# (two spaces remain where "first" was removed)

Parameters:

  • other (String)

Returns:

  • (String)



# File 'lib/arachni/ruby/string.rb', line 83

def rdiff( other )
    return self if self == other

    # get the words of the first text in an array
    s_words = words

    # get what hasn't changed (the rdiff, so to speak) as a string
    (s_words - (s_words - other.words)).join
end

#recode ⇒ Object



# File 'lib/arachni/ruby/string.rb', line 153

def recode
    dup.recode!
end

#recode! ⇒ Object



# File 'lib/arachni/ruby/string.rb', line 148

def recode!
    encode!( 'utf-8', invalid: :replace, undef: :replace )
    self
end
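
Neither `#recode` nor `#recode!` carries a description, so the following is a sketch of what the listings do rather than documented behaviour: `#recode!` transcodes the receiver to UTF-8 in place, replacing bytes it cannot convert, and `#recode` does the same on a copy. Both methods are copied here so the example runs standalone; the sample strings are invented.

```ruby
class String
  # Copied from the listings above so this sketch runs standalone.
  def recode!
    encode!('utf-8', invalid: :replace, undef: :replace)
    self
  end

  def recode
    dup.recode!
  end
end

# "café" stored as ISO-8859-1 bytes (0xE9 is "é" in Latin-1).
latin1 = "caf\xE9".dup.force_encoding('ISO-8859-1')
utf8   = latin1.recode

p utf8 == "caf\u00E9"  # => true
p utf8.encoding        # => #<Encoding:UTF-8>
p latin1.encoding      # => #<Encoding:ISO-8859-1> (the receiver is untouched)

# A raw byte with no UTF-8 equivalent becomes U+FFFD, the replacement character.
raw = "\xFF".b
p raw.recode == "\uFFFD"  # => true
```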

#scan_in_groups(regexp) ⇒ Hash

Returns the captures of the first match, keyed by capture name.

Parameters:

  • regexp (Regexp)

    Regular expression with named captures.

Returns:

  • (Hash)

    Grouped matches.

Raises:

  • (ArgumentError)

    If `regexp` does not contain any named captures.


# File 'lib/arachni/ruby/string.rb', line 21

def scan_in_groups( regexp )
    raise ArgumentError, 'Regexp does not contain any names.' if regexp.names.empty?
    return {} if !(matches = scan( regexp ).first)

    Hash[regexp.names.zip( matches )].reject { |_, v| v.empty? }
end
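
A usage sketch, with the method copied from the listing so it runs standalone. The URL pattern and capture names (`scheme`, `host`, `path`) are hypothetical and chosen only for illustration. Only the first match is considered, and captures that matched an empty string are dropped from the result.

```ruby
class String
  # Copied from the listing above so this sketch runs standalone.
  def scan_in_groups(regexp)
    raise ArgumentError, 'Regexp does not contain any names.' if regexp.names.empty?
    return {} if !(matches = scan(regexp).first)

    Hash[regexp.names.zip(matches)].reject { |_, v| v.empty? }
  end
end

# Hypothetical pattern; every named group always takes part in a match.
url = %r{(?<scheme>\w+)://(?<host>[^/]+)(?<path>.*)}

p 'http://example.com/index.php'.scan_in_groups(url)
# => {"scheme"=>"http", "host"=>"example.com", "path"=>"/index.php"}

# The empty "path" capture is rejected.
p 'http://example.com'.scan_in_groups(url).keys  # => ["scheme", "host"]

# No match yields an empty Hash.
p 'no url here'.scan_in_groups(url).empty?  # => true

begin
  'abc'.scan_in_groups(/b/)
rescue ArgumentError => e
  puts e.message  # prints "Regexp does not contain any names."
end
```

One caveat that follows from the listing: `String#scan` returns `nil` for a group that did not take part in the match (for example an optional group), and `v.empty?` would then raise `NoMethodError`. The pattern above avoids optional groups for that reason.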

#shortest_word ⇒ String

Returns the shortest word.

Returns:

  • (String)

    Shortest word.


# File 'lib/arachni/ruby/string.rb', line 126

def shortest_word
    words( true ).sort_by { |w| w.size }.first
end
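
A short sketch of `#shortest_word` together with its counterpart `#longest_word`. The methods are copied from the listings in this document so the code runs standalone; the sample sentence is invented. Both rely on `words( true )`, so punctuation and whitespace never count as words, and both return `nil` when the string contains no words. Ruby does not guarantee that `sort_by` is stable, so which word wins a tie in length is unspecified.

```ruby
class String
  # Copied from the listings in this document so the sketch runs standalone.
  def words(strict = false)
    splits = split(/\b/)
    splits.reject! { |w| !(w =~ /\w/) } if strict
    splits
  end

  def longest_word
    words(true).sort_by { |w| w.size }.last
  end

  def shortest_word
    words(true).sort_by { |w| w.size }.first
  end
end

s = 'A considerably longer sentence.'

p s.longest_word     # => "considerably"
p s.shortest_word    # => "A"
p ''.longest_word    # => nil
p '...'.shortest_word # => nil
```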

#sub_in_groups(regexp, substitutions) ⇒ String

Returns an updated copy of `self`.

Parameters:

  • regexp (Regexp)

    Regular expression with named captures.

  • substitutions (Hash)

    Hash (with capture names as keys) with which to replace the `regexp` matches.

Returns:

  • (String)

    Updated copy of `self`, or `nil` if `regexp` does not match.



# File 'lib/arachni/ruby/string.rb', line 36

def sub_in_groups( regexp, substitutions )
    dup.sub_in_groups!( regexp, substitutions )
end

#sub_in_groups!(regexp, updates) ⇒ String

Returns the updated `self`.

Parameters:

  • regexp (Regexp)

    Regular expression with named captures.

  • updates (Hash)

    Hash (with capture names as keys) with which to replace the `regexp` matches.

Returns:

  • (String)

    Updated `self`, or `nil` if `regexp` does not match.



# File 'lib/arachni/ruby/string.rb', line 48

def sub_in_groups!( regexp, updates )
    return if !(match = regexp.match( self ))

    # updates.reject! { |k| !(match.offset( k ) rescue nil) }

    keys_in_order = updates.keys.sort_by { |k| match.offset( k ) }.reverse
    keys_in_order.each do |k|
        offsets_for_group = match.offset( k )
        self[offsets_for_group.first...offsets_for_group.last] = updates[k]
    end

    self
end
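
A usage sketch for `#sub_in_groups!` and `#sub_in_groups`, with both methods copied from the listings so the code runs standalone. The query string and capture names (`user`, `id`) are hypothetical. The listing replaces captures from the last offset to the first, so an earlier replacement of a different length cannot shift the offsets of the captures still to be processed.

```ruby
class String
  # Copied from the listings above so this sketch runs standalone.
  def sub_in_groups!(regexp, updates)
    return if !(match = regexp.match(self))

    keys_in_order = updates.keys.sort_by { |k| match.offset(k) }.reverse
    keys_in_order.each do |k|
      offsets_for_group = match.offset(k)
      self[offsets_for_group.first...offsets_for_group.last] = updates[k]
    end

    self
  end

  def sub_in_groups(regexp, substitutions)
    dup.sub_in_groups!(regexp, substitutions)
  end
end

regexp = /user=(?<user>\w+)&id=(?<id>\d+)/
query  = 'user=alice&id=42'.dup

# Non-destructive variant: the receiver keeps its value.
p query.sub_in_groups(regexp, { 'user' => 'bob', 'id' => '7' })
# => "user=bob&id=7"
p query  # => "user=alice&id=42"

# Destructive variant: the receiver itself is modified and returned.
query.sub_in_groups!(regexp, { 'id' => '1000' })
p query  # => "user=alice&id=1000"

# Both variants return nil when the regexp does not match.
p 'nothing to see'.sub_in_groups(regexp, { 'id' => '1' })  # => nil
```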

#words(strict = false) ⇒ Array<String>

Returns the words in `self`.

Parameters:

  • strict (Bool) (defaults to: false)

    Include only words, no boundary characters (like spaces, etc.).

Returns:

  • (Array<String>)



# File 'lib/arachni/ruby/string.rb', line 118

def words( strict = false )
    splits = split( /\b/ )
    splits.reject! { |w| !(w =~ /\w/) } if strict
    splits
end
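
A brief sketch of the two modes, with the method copied from the listing so it runs standalone; the sample string is invented. In the default mode the boundary segments are kept, so joining the result reproduces the original string. In strict mode only segments containing at least one word character survive.

```ruby
class String
  # Copied from the listing above so this sketch runs standalone.
  def words(strict = false)
    splits = split(/\b/)
    splits.reject! { |w| !(w =~ /\w/) } if strict
    splits
  end
end

s = 'Hello, world!'

p s.words            # => ["Hello", ", ", "world", "!"]
p s.words(true)      # => ["Hello", "world"]
p s.words.join == s  # => true
```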