Class: String
- Defined in:
- lib/rbot/irc.rb,
lib/rbot/botuser.rb,
lib/rbot/ircsocket.rb,
lib/rbot/core/utils/extends.rb
Overview
Extensions to the String class
TODO make riphtml() just call ircify_html() with stronger purify options.
Instance Method Summary
-
#get_html_title ⇒ Object
This method tries to find an HTML title in the string, and returns it if found.
-
#has_irc_glob? ⇒ Boolean
This method checks if the receiver contains IRC glob characters.
-
#irc_downcase(casemap = 'rfc1459') ⇒ Object
This method returns a string which is the downcased version of the receiver, according to the given casemap.
-
#irc_downcase!(casemap = 'rfc1459') ⇒ Object
Same as irc_downcase, except that the receiver is altered in place.
-
#irc_send_penalty ⇒ Object
Calculate the penalty which will be assigned to this message by the IRCd.
-
#irc_upcase(casemap = 'rfc1459') ⇒ Object
Returns the upcased version of the receiver, according to the given casemap.
-
#irc_upcase!(casemap = 'rfc1459') ⇒ Object
In-place upcasing.
-
#ircify_html(opts = {}) ⇒ Object
This method will return a purified version of the receiver, with all HTML stripped off and some of it converted to IRC formatting.
-
#ircify_html!(opts = {}) ⇒ Object
Same as ircify_html, but modifies the receiver in place.
-
#ircify_html_title ⇒ Object
This method returns the IRC-formatted version of an HTML title found in the string.
-
#riphtml ⇒ Object
This method will strip all HTML crud from the receiver.
-
#to_irc_auth_command ⇒ Object
Returns an Irc::Bot::Auth::Command from the receiver.
-
#to_irc_casemap ⇒ Object
This method returns the Irc::Casemap whose name is the receiver.
-
#to_irc_channel(opts = {}) ⇒ Object
Converts the receiver into an Irc::Channel object.
-
#to_irc_channel_topic ⇒ Object
Returns an Irc::Channel::Topic with self as text.
-
#to_irc_netmask(opts = {}) ⇒ Object
Converts the receiver into an Irc::Netmask object.
-
#to_irc_regexp ⇒ Object
This method is used to convert the receiver into a Regular Expression that matches according to the IRC glob syntax.
-
#to_irc_user(opts = {}) ⇒ Object
Converts the receiver into an Irc::User object.
-
#wrap_nonempty(pre, post, opts = {}) ⇒ Object
This method is used to wrap a nonempty String by adding the prefix and postfix.
Instance Method Details
#get_html_title ⇒ Object
This method tries to find an HTML title in the string, and returns it if found
    # File 'lib/rbot/core/utils/extends.rb', line 338
    def get_html_title
      if defined? ::Hpricot
        Hpricot(self).at("title").inner_html
      else
        return unless Irc::Utils::TITLE_REGEX.match(self)
        $1
      end
    end
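The regex fallback branch can be tried outside rbot. The following is a minimal sketch, assuming a simplified title pattern; the helper name html_title and its pattern are hypothetical and are not rbot's Irc::Utils::TITLE_REGEX.

```ruby
# Hypothetical stand-in for the non-Hpricot branch of String#get_html_title.
# The pattern below is an assumption for illustration, not rbot's TITLE_REGEX.
def html_title(html)
  # String#[] with a regexp and a group index returns the capture,
  # or nil when the pattern does not match.
  html[/<title[^>]*>(.*?)<\/title>/im, 1]
end

puts html_title("<html><head><title>rbot docs</title></head></html>")
p html_title("<p>no title here</p>")
```

As in the original method, a missing title yields nil rather than an empty string.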
#has_irc_glob? ⇒ Boolean
This method checks if the receiver contains IRC glob characters
IRC has a very primitive concept of globs: a * stands for “any number of arbitrary characters”, and a ? stands for “one and exactly one arbitrary character”. These characters can be escaped by prefixing them with a backslash (\).
A known limitation of this glob syntax is that there is no way to escape the escape character itself, so it’s not possible to build a glob pattern where the escape character precedes a glob.
    # File 'lib/rbot/irc.rb', line 332
    def has_irc_glob?
      self =~ /^[*?]|[^\\][*?]/
    end
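To see how the pattern treats escaped and unescaped glob characters, here is a small sketch that reuses the same regular expression in a standalone helper. The name irc_glob? is hypothetical; rbot exposes this check as String#has_irc_glob?.

```ruby
# Hypothetical standalone helper using the same pattern as has_irc_glob?.
def irc_glob?(str)
  # Either a glob character at the start of a line, or a glob character
  # that is not preceded by a backslash.
  str.match?(/^[*?]|[^\\][*?]/)
end

puts irc_glob?("*!*@example.com")   # unescaped * at the start
puts irc_glob?("nick?")             # unescaped ? after a normal character
puts irc_glob?('nick\*')            # the * is escaped
puts irc_glob?("plain")             # no glob characters at all
```

The first two calls report a glob, the last two do not.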
#irc_downcase(casemap = 'rfc1459') ⇒ Object
This method returns a string which is the downcased version of the receiver, according to the given casemap
    # File 'lib/rbot/irc.rb', line 289
    def irc_downcase(casemap='rfc1459')
      cmap = casemap.to_irc_casemap
      self.tr(cmap.upper, cmap.lower)
    end
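The effect of a casemap-driven tr call can be illustrated without rbot. This is a sketch under the assumption that the 'rfc1459' casemap folds the range A through ^ onto a through ~ (so [ ] \ ^ become { } | ~); the helper name rfc1459_downcase is hypothetical.

```ruby
# Hypothetical helper mirroring what irc_downcase does for 'rfc1459'.
# Assumption: upper range is A..^ and lower range is a..~.
def rfc1459_downcase(str)
  str.tr("A-^", "a-~")
end

puts rfc1459_downcase("Nick[Away]")   # brackets are folded to braces
puts "Nick[Away]".downcase            # plain downcase leaves brackets alone
```

This is why nicknames that differ only in [ versus { compare as equal under this casemap.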
#irc_downcase!(casemap = 'rfc1459') ⇒ Object
Same as irc_downcase, except that the receiver is altered in place
See also the discussion about irc_downcase
    # File 'lib/rbot/irc.rb', line 298
    def irc_downcase!(casemap='rfc1459')
      cmap = casemap.to_irc_casemap
      self.tr!(cmap.upper, cmap.lower)
    end
#irc_send_penalty ⇒ Object
Calculate the penalty which will be assigned to this message by the IRCd
    # File 'lib/rbot/ircsocket.rb', line 14
    def irc_send_penalty
      # According to eggdrop, the initial penalty is
      penalty = 1 + self.size/100
      # on everything but UnderNET where it's
      # penalty = 2 + self.size/120

      cmd, pars = self.split($;,2)
      debug "cmd: #{cmd}, pars: #{pars.inspect}"
      case cmd.to_sym
      when :KICK
        chan, nick, msg = pars.split
        chan = chan.split(',')
        nick = nick.split(',')
        penalty += nick.size
        penalty *= chan.size
      when :MODE
        chan, modes, argument = pars.split
        extra = 0
        if modes
          extra = 1
          if argument
            extra += modes.split(/\+|-/).size
          else
            extra += 3 * modes.split(/\+|-/).size
          end
        end
        if argument
          extra += 2 * argument.split.size
        end
        penalty += extra * chan.split.size
      when :TOPIC
        penalty += 1
        penalty += 2 unless pars.split.size < 2
      when :PRIVMSG, :NOTICE
        dests = pars.split($;,2).first
        penalty += dests.split(',').size
      when :WHO
        args = pars.split
        if args.length > 0
          penalty += args.inject(0){ |sum,x| sum += ((x.length > 4) ? 3 : 5) }
        else
          penalty += 10
        end
      when :PART
        penalty += 4
      when :AWAY, :JOIN, :VERSION, :TIME, :TRACE, :WHOIS, :DNS
        penalty += 2
      when :INVITE, :NICK
        penalty += 3
      when :ISON
        penalty += 1
      else # Unknown messages
        penalty += 1
      end
      if penalty > 99
        debug "Wow, more than 99 secs of penalty!"
        penalty = 99
      end
      if penalty < 2
        debug "Wow, less than 2 secs of penalty!"
        penalty = 2
      end
      debug "penalty: #{penalty}"
      return penalty
    end
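The arithmetic is easier to follow on a reduced version. The sketch below is not rbot's method: send_penalty_sketch is a hypothetical helper that keeps only the base cost, the PRIVMSG/NOTICE and PART branches, the default branch and the final clamping.

```ruby
# Hypothetical, reduced version of the penalty calculation: only PRIVMSG,
# NOTICE and PART are special-cased; everything else costs 1 extra second.
def send_penalty_sketch(line)
  penalty = 1 + line.size / 100          # base cost grows with message length
  cmd, pars = line.split(' ', 2)
  case cmd
  when 'PRIVMSG', 'NOTICE'
    dests = pars.split(' ', 2).first     # comma-separated targets
    penalty += dests.split(',').size     # one extra second per target
  when 'PART'
    penalty += 4
  else
    penalty += 1
  end
  penalty.clamp(2, 99)                   # same bounds as the original method
end

puts send_penalty_sketch("PRIVMSG #a,#b :hello")   # 1 base + 2 targets
puts send_penalty_sketch("PART #a")                # 1 base + 4
puts send_penalty_sketch("PING irc.example.net")   # 1 base + 1 (unknown)
```

A short message to two targets therefore costs 3 seconds, while a message of 100 characters or more starts from a base of 2.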
#irc_upcase(casemap = 'rfc1459') ⇒ Object
Returns the upcased version of the receiver, according to the given casemap
See also the discussion about irc_downcase
    # File 'lib/rbot/irc.rb', line 307
    def irc_upcase(casemap='rfc1459')
      cmap = casemap.to_irc_casemap
      self.tr(cmap.lower, cmap.upper)
    end
#irc_upcase!(casemap = 'rfc1459') ⇒ Object
In-place upcasing
See also the discussion about irc_downcase
    # File 'lib/rbot/irc.rb', line 316
    def irc_upcase!(casemap='rfc1459')
      cmap = casemap.to_irc_casemap
      self.tr!(cmap.lower, cmap.upper)
    end
#ircify_html(opts = {}) ⇒ Object
This method will return a purified version of the receiver, with all HTML stripped off and some of it converted to IRC formatting
    # File 'lib/rbot/core/utils/extends.rb', line 214
    def ircify_html(opts={})
      txt = self.dup

      # remove scripts
      txt.gsub!(/<script(?:\s+[^>]*)?>.*?<\/script>/im, "")

      # remove styles
      txt.gsub!(/<style(?:\s+[^>]*)?>.*?<\/style>/im, "")

      # bold and strong -> bold
      txt.gsub!(/<\/?(?:b|strong)(?:\s+[^>]*)?>/im, "#{Bold}")

      # italic, emphasis and underline -> underline
      txt.gsub!(/<\/?(?:i|em|u)(?:\s+[^>]*)?>/im, "#{Underline}")

      ## This would be a nice addition, but the results are horrible
      ## Maybe make it configurable?
      # txt.gsub!(/<\/?a( [^>]*)?>/, "#{Reverse}")
      case val = opts[:a_href]
      when Reverse, Bold, Underline
        txt.gsub!(/<(?:\/a\s*|a (?:[^>]*\s+)?href\s*=\s*(?:[^>]*\s*)?)>/, val)
      when :link_out
        # Not good for nested links, but the best we can do without something like hpricot
        txt.gsub!(/<a (?:[^>]*\s+)?href\s*=\s*(?:([^"'>][^\s>]*)\s+|"((?:[^"]|\\")*)"|'((?:[^']|\\')*)')(?:[^>]*\s+)?>(.*?)<\/a>/) { |match|
          debug match
          debug [$1, $2, $3, $4].inspect
          link = $1 || $2 || $3
          str = $4
          str + ": " + link
        }
      else
        warning "unknown :a_href option #{val} passed to ircify_html" if val
      end

      # If opts[:img] is defined, it should be a String. Each image
      # will be replaced by the string itself, replacing occurrences of
      # %{alt} %{dimensions} and %{src} with the alt text, image dimensions
      # and URL
      if val = opts[:img]
        if val.kind_of? String
          txt.gsub!(/<img\s+(.*?)\s*\/?>/) do |imgtag|
            attrs = Hash.new
            imgtag.scan(/([[:alpha:]]+)\s*=\s*(['"])?(.*?)\2/) do |key, quote, value|
              k = key.downcase.intern rescue 'junk'
              attrs[k] = value
            end
            attrs[:alt] ||= attrs[:title]
            attrs[:width] ||= '...'
            attrs[:height] ||= '...'
            attrs[:dimensions] ||= "#{attrs[:width]}x#{attrs[:height]}"
            val % attrs
          end
        else
          warning ":img option is not a string"
        end
      end

      # Paragraph and br tags are converted to whitespace
      txt.gsub!(/<\/?(p|br)(?:\s+[^>]*)?\s*\/?\s*>/i, ' ')
      txt.gsub!("\n", ' ')
      txt.gsub!("\r", ' ')

      # Superscripts and subscripts are turned into ^{...} and _{...}
      # where the {} are omitted for single characters
      txt.gsub!(/<sup>(.*?)<\/sup>/, '^{\1}')
      txt.gsub!(/<sub>(.*?)<\/sub>/, '_{\1}')
      # The caret is escaped so it matches a literal ^ rather than start-of-line
      txt.gsub!(/(\^|_)\{(.)\}/, '\1\2')

      # List items are converted to *). We don't have special support for
      # nested or ordered lists.
      txt.gsub!(/<li>/, ' *) ')

      # All other tags are just removed
      txt.gsub!(/<[^>]+>/, '')

      # Convert HTML entities. We do it now to be able to handle stuff
      # such as &nbsp;
      txt = Utils.decode_html_entities(txt)

      # Keep unbreakable spaces or convert them to plain spaces?
      case val = opts[:nbsp]
      when :space, ' '
        txt.gsub!([160].pack('U'), ' ')
      else
        warning "unknown :nbsp option #{val} passed to ircify_html" if val
      end

      # Remove double formatting options, since they only waste bytes
      txt.gsub!(/#{Bold}(\s*)#{Bold}/, '\1')
      txt.gsub!(/#{Underline}(\s*)#{Underline}/, '\1')

      # Simplify whitespace that appears on both sides of a formatting option
      txt.gsub!(/\s+(#{Bold}|#{Underline})\s+/, ' \1')
      txt.sub!(/\s+(#{Bold}|#{Underline})\z/, '\1')
      txt.sub!(/\A(#{Bold}|#{Underline})\s+/, '\1')

      # And finally whitespace is squeezed
      txt.gsub!(/\s+/, ' ')
      txt.strip!

      if opts[:limit] && txt.size > opts[:limit]
        txt = txt.slice(0, opts[:limit]) + "#{Reverse}...#{Reverse}"
      end

      # Decode entities and strip whitespace
      return txt
    end
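The core of the pipeline (format tags to control codes, block tags to spaces, remaining tags dropped, whitespace squeezed) can be tried in isolation. The sketch below is a heavily reduced approximation, not rbot's method: ircify_sketch is a hypothetical name, and it assumes that bold is "\x02" and underline is "\x1F", the control codes commonly used by IRC clients. Options, entities, links and images are left out.

```ruby
# Hypothetical, much-reduced sketch of the ircify_html pipeline.
# Assumption: bold is "\x02" and underline is "\x1F".
def ircify_sketch(html)
  bold = "\x02"
  underline = "\x1F"
  txt = html.dup
  txt.gsub!(/<\/?(?:b|strong)(?:\s+[^>]*)?>/im, bold)      # b, strong -> bold
  txt.gsub!(/<\/?(?:i|em|u)(?:\s+[^>]*)?>/im, underline)   # i, em, u -> underline
  txt.gsub!(/<\/?(?:p|br)(?:\s+[^>]*)?\s*\/?\s*>/i, ' ')   # p, br -> space
  txt.gsub!(/<[^>]+>/, '')                                  # drop all other tags
  txt.gsub(/\s+/, ' ').strip                                # squeeze whitespace
end

p ircify_sketch("<p>Hello <b>world</b>,<br/>this is <em>rbot</em></p>")
```

The order matters: formatting tags must be converted before the catch-all tag removal, otherwise they would simply be dropped.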
#ircify_html!(opts = {}) ⇒ Object
Same as ircify_html, but modifies the receiver in place. Returns nil if the receiver was left unchanged
    # File 'lib/rbot/core/utils/extends.rb', line 324
    def ircify_html!(opts={})
      old_hash = self.hash
      replace self.ircify_html(opts)
      return self unless self.hash == old_hash
    end
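The hash comparison gives the method the usual Ruby bang-method contract: the receiver is returned when it changed, nil otherwise. A minimal sketch of the same pattern, using a hypothetical squeeze_spaces! helper in place of the HTML conversion:

```ruby
# Hypothetical helper showing the "return self only if changed" pattern.
def squeeze_spaces!(str)
  old_hash = str.hash                      # String#hash depends on content
  str.replace(str.gsub(/\s+/, ' '))        # overwrite the receiver in place
  str unless str.hash == old_hash          # nil when nothing changed
end

changed = "a   b".dup
p squeeze_spaces!(changed)       # the modified string
p squeeze_spaces!("a b".dup)     # nil, nothing to squeeze
```

Callers can therefore use the return value as a cheap "did anything change?" test.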
#ircify_html_title ⇒ Object
This method returns the IRC-formatted version of an HTML title found in the string
    # File 'lib/rbot/core/utils/extends.rb', line 349
    def ircify_html_title
      self.get_html_title.ircify_html rescue nil
    end
#riphtml ⇒ Object
This method will strip all HTML crud from the receiver
    # File 'lib/rbot/core/utils/extends.rb', line 332
    def riphtml
      self.gsub(/<[^>]+>/, '').gsub(/&amp;/,'&').gsub(/&quot;/,'"').gsub(/&lt;/,'<').gsub(/&gt;/,'>').gsub(/&ellip;/,'...').gsub(/&apos;/, "'").gsub("\n",'')
    end
#to_irc_auth_command ⇒ Object
Returns an Irc::Bot::Auth::Command from the receiver
    # File 'lib/rbot/botuser.rb', line 119
    def to_irc_auth_command
      Irc::Bot::Auth::Command.new(self)
    end
#to_irc_casemap ⇒ Object
This method returns the Irc::Casemap whose name is the receiver
    # File 'lib/rbot/irc.rb', line 275
    def to_irc_casemap
      begin
        Irc::Casemap.get(self)
      rescue
        # raise TypeError, "Unkown Irc::Casemap #{self.inspect}"
        error "Unkown Irc::Casemap #{self.inspect} requested, defaulting to rfc1459"
        Irc::Casemap.get('rfc1459')
      end
    end
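The lookup-with-fallback behaviour can be sketched with a plain Hash. Everything here is an assumption for illustration: casemap_ranges and its table are hypothetical and do not reflect how Irc::Casemap stores its data.

```ruby
# Hypothetical registry standing in for Irc::Casemap.
def casemap_ranges(name)
  table = {
    'rfc1459' => ['A-^', 'a-~'],
    'ascii'   => ['A-Z', 'a-z']
  }
  # Hash#fetch with a block mirrors the rescue-and-default behaviour:
  # unknown names fall back to rfc1459 instead of raising.
  table.fetch(name) do
    warn "Unknown casemap #{name.inspect} requested, defaulting to rfc1459"
    table['rfc1459']
  end
end

p casemap_ranges('ascii')
p casemap_ranges('bogus')   # warns on stderr, then returns the rfc1459 ranges
```

Falling back instead of raising keeps the bot running when a server advertises a casemap it does not know.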
#to_irc_channel(opts = {}) ⇒ Object
Converts the receiver into an Irc::Channel object, passing opts on to the constructor
    # File 'lib/rbot/irc.rb', line 1513
    def to_irc_channel(opts={})
      Irc::Channel.new(self, opts)
    end
#to_irc_channel_topic ⇒ Object
Returns an Irc::Channel::Topic with self as text
    # File 'lib/rbot/irc.rb', line 1318
    def to_irc_channel_topic
      Irc::Channel::Topic.new(self)
    end
#to_irc_netmask(opts = {}) ⇒ Object
Converts the receiver into an Irc::Netmask object, passing opts on to the constructor
    # File 'lib/rbot/irc.rb', line 915
    def to_irc_netmask(opts={})
      Irc::Netmask.new(self, opts)
    end
#to_irc_regexp ⇒ Object
This method is used to convert the receiver into a Regular Expression that matches according to the IRC glob syntax
    # File 'lib/rbot/irc.rb', line 339
    def to_irc_regexp
      regmask = Regexp.escape(self)
      regmask.gsub!(/(\\\\)?\\[*?]/) { |m|
        case m
        when /\\(\\[*?])/
          $1
        when /\\\*/
          '.*'
        when /\\\?/
          '.'
        else
          raise "Unexpected match #{m} when converting #{self}"
        end
      }
      Regexp.new("^#{regmask}$")
    end
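The conversion works in two passes: escape everything, then turn the escaped glob characters back into regexp wildcards unless the user had escaped them. The sketch below follows the same logic in a standalone function; the name irc_glob_to_regexp is hypothetical.

```ruby
# Hypothetical standalone version of the glob-to-regexp conversion.
def irc_glob_to_regexp(glob)
  escaped = Regexp.escape(glob)            # * becomes \*, ? becomes \?
  pattern = escaped.gsub(/(\\\\)?\\[*?]/) do |m|
    case m
    when /\\(\\[*?])/ then $1              # user-escaped glob: keep it literal
    when /\\\*/       then '.*'            # * matches any run of characters
    when /\\\?/       then '.'             # ? matches exactly one character
    end
  end
  Regexp.new("^#{pattern}$")
end

mask = irc_glob_to_regexp("*!*@*.example.com")
puts mask.match?("nick!user@host.example.com")
puts irc_glob_to_regexp("a?c").match?("abbc")
puts irc_glob_to_regexp('a\*c').match?("a*c")
```

Escaping first means that regexp metacharacters such as the dots in a hostname are matched literally.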
#to_irc_user(opts = {}) ⇒ Object
Converts the receiver into an Irc::User object, passing opts on to the constructor
    # File 'lib/rbot/irc.rb', line 1108
    def to_irc_user(opts={})
      Irc::User.new(self, opts)
    end
#wrap_nonempty(pre, post, opts = {}) ⇒ Object
This method is used to wrap a nonempty String by adding the prefix and postfix
    # File 'lib/rbot/core/utils/extends.rb', line 355
    def wrap_nonempty(pre, post, opts={})
      if self.empty?
        String.new
      else
        "#{pre}#{self}#{post}"
      end
    end
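A typical use is decorating optional text without leaving stray brackets behind when the text is empty. The sketch below uses a hypothetical standalone function, wrap_nonempty_sketch, with the same behaviour as the method above.

```ruby
# Hypothetical standalone equivalent of String#wrap_nonempty.
def wrap_nonempty_sketch(str, pre, post)
  str.empty? ? String.new : "#{pre}#{str}#{post}"
end

p wrap_nonempty_sketch("away", " (", ")")   # wrapped, since the text is nonempty
p wrap_nonempty_sketch("", " (", ")")       # stays empty, no stray " ()"
```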