Module: Stringex::StringExtensions::PublicInstanceMethods

Defined in: lib/stringex/string_extensions.rb,
lib/stringex/unidecoder.rb

Overview

These methods are all included into the String class.

Instance Method Summary

#collapse(character = " ") ⇒ Object

Removes specified character from the beginning and/or end of the string and then performs String#squeeze(character), condensing runs of the character within the string.
#convert_accented_html_entities ⇒ Object

Converts HTML entities into the respective non-accented letters.
#convert_miscellaneous_characters(options = {}) ⇒ Object

Converts various common plaintext characters to a more URI-friendly representation.
#convert_miscellaneous_html_entities ⇒ Object

Converts HTML entities (taken from common Textile/RedCloth formattings) into plain text formats.
#convert_smart_punctuation ⇒ Object

Converts MS Word ‘smart punctuation’ to ASCII.
#convert_unreadable_control_characters ⇒ Object
#convert_vulgar_fractions ⇒ Object

Converts vulgar fractions from supported HTML entities and Unicode to plain text formats.
#limit(limit = nil, truncate_words = true, whitespace_replacement_token = "-") ⇒ Object

Returns the string limited in size to the value of limit.
#remove_formatting(options = {}) ⇒ Object

Performs multiple text manipulations.
#replace_whitespace(replacement = " ") ⇒ Object

Replace runs of whitespace in string.
#strip_html_tags(leave_whitespace = false) ⇒ Object

Removes HTML tags from text.
#to_ascii ⇒ Object

Returns string with its UTF-8 characters transliterated to ASCII ones.
#to_html(lite_mode = false) ⇒ Object

Returns the string converted (via Textile/RedCloth) to HTML format or self [with a friendly warning] if RedCloth is not available.
#to_url(options = {}) ⇒ Object

Create a URI-friendly representation of the string.
#whole_word_limit(limit, whitespace_replacement_token = "-") ⇒ Object

Instance Method Details

#collapse(character = " ") ⇒ `Object`

Removes specified character from the beginning and/or end of the string and then performs String#squeeze(character), condensing runs of the character within the string.

Note: This method has been superseded by ActiveSupport’s squish method.



# File 'lib/stringex/string_extensions.rb', line 19

def collapse(character = " ")
  sub(/^#{character}*/, "").sub(/#{character}*$/, "").squeeze(character)
end
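
To make the behavior concrete, here is a minimal sketch that copies the logic shown above into a plain method so it runs without the gem. The name collapse_sketch is hypothetical and exists only for this illustration; with Stringex loaded you would call collapse on the string itself.

```ruby
# Standalone copy of the collapse logic shown above.
# `collapse_sketch` is a hypothetical helper name, not part of Stringex.
def collapse_sketch(string, character = " ")
  string
    .sub(/^#{character}*/, "")   # drop a leading run of the character
    .sub(/#{character}*$/, "")   # drop a trailing run of the character
    .squeeze(character)          # condense interior runs to one character
end

puts collapse_sketch("  too   many  spaces ")  # prints "too many spaces"
puts collapse_sketch("--a---b--", "-")         # prints "a-b"
```

Because the character is interpolated into a regular expression without escaping, this sketch (like the source it mirrors) is best suited to plain characters such as a space or a hyphen.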

#convert_accented_html_entities ⇒ `Object`

Converts HTML entities into the respective non-accented letters. Examples:

"&aacute;".convert_accented_html_entities # => "a"
"&ccedil;".convert_accented_html_entities # => "c"
"&egrave;".convert_accented_html_entities # => "e"
"&icirc;".convert_accented_html_entities # => "i"
"&oslash;".convert_accented_html_entities # => "o"
"&uuml;".convert_accented_html_entities # => "u"

Note: This does not do any conversion of Unicode/ASCII accented-characters. For that functionality please use to_ascii.

# File 'lib/stringex/string_extensions.rb', line 34

def convert_accented_html_entities
  stringex_convert do
    cleanup_accented_html_entities!
  end
end

#convert_miscellaneous_characters(options = {}) ⇒ `Object`

Converts various common plaintext characters to a more URI-friendly representation. Examples:

"foo & bar".convert_miscellaneous_characters # => "foo and bar"
"Chanel #9".convert_miscellaneous_characters # => "Chanel number nine"
"user@host".convert_miscellaneous_characters # => "user at host"
"google.com".convert_miscellaneous_characters # => "google dot com"
"$10".convert_miscellaneous_characters # => "10 dollars"
"*69".convert_miscellaneous_characters # => "star 69"
"100%".convert_miscellaneous_characters # => "100 percent"
"windows/mac/linux".convert_miscellaneous_characters # => "windows slash mac slash linux"

It allows localization of conversions so you can use it to convert characters into your own language. Example:

I18n.backend.store_translations :de, { stringex: { characters: { and: "und" } } }
I18n.locale = :de
"ich & dich".convert_miscellaneous_characters # => "ich und dich"

Note: Because this method will convert any & symbols to the string “and”, you should run any methods which convert HTML entities (convert_accented_html_entities and convert_miscellaneous_html_entities) before running this method.

# File 'lib/stringex/string_extensions.rb', line 62

def convert_miscellaneous_characters(options = {})
  stringex_convert(options) do
    normalize_currency!
    translate! :ellipses, :currencies, :abbreviations, :characters, :apostrophes
    cleanup_characters!
  end
end
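
The ordering note above can be illustrated with a small sketch. The two helper methods below are hypothetical, heavily simplified stand-ins written for this example; they are not the library's implementation and cover only a handful of entities. They show why converting ampersands first corrupts any HTML entities still present in the text.

```ruby
# Hypothetical stand-in for an entity-conversion step: handles only a few
# accent suffixes. Not the Stringex implementation.
def entities_to_letters(text)
  text.gsub(/&([a-zA-Z])(?:acute|grave|circ|uml|cedil|slash);/, '\1')
end

# Hypothetical stand-in for the "&" => "and" conversion.
def ampersands_to_words(text)
  text.gsub(/\s*&\s*/, " and ")
end

text = "caf&eacute; & bar"

# Entities first, then ampersands: the entity is resolved before "&" is touched.
puts ampersands_to_words(entities_to_letters(text))  # prints "cafe and bar"

# Ampersands first: the "&" inside "&eacute;" is rewritten and the entity is lost.
puts entities_to_letters(ampersands_to_words(text))  # prints "caf and eacute; and bar"
```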

#convert_miscellaneous_html_entities ⇒ `Object`

Converts HTML entities (taken from common Textile/RedCloth formattings) into plain text formats.

Note: This isn’t an attempt at complete conversion of HTML entities, just those most likely to be generated by Textile.

# File 'lib/stringex/string_extensions.rb', line 74

def convert_miscellaneous_html_entities
  stringex_convert do
    translate! :html_entities
    cleanup_html_entities!
  end
end

#convert_smart_punctuation ⇒ `Object`

Converts MS Word ‘smart punctuation’ to ASCII.

# File 'lib/stringex/string_extensions.rb', line 83

def convert_smart_punctuation
  stringex_convert do
    cleanup_smart_punctuation!
  end
end

#convert_unreadable_control_characters ⇒ `Object`

# File 'lib/stringex/string_extensions.rb', line 96

def convert_unreadable_control_characters
  stringex_convert do
    translate! :unreadable_control_characters
  end
end

#convert_vulgar_fractions ⇒ `Object`

Converts vulgar fractions from supported HTML entities and Unicode to plain text formats.

# File 'lib/stringex/string_extensions.rb', line 90

def convert_vulgar_fractions
  stringex_convert do
    translate! :vulgar_fractions
  end
end

#limit(limit = nil, truncate_words = true, whitespace_replacement_token = "-") ⇒ `Object`

Returns the string limited in size to the value of limit.

# File 'lib/stringex/string_extensions.rb', line 103

def limit(limit = nil, truncate_words = true, whitespace_replacement_token = "-")
  if limit.nil?
    self
  else
    truncate_words == false ? self.whole_word_limit(limit, whitespace_replacement_token) : self[0...limit]
  end
end

#remove_formatting(options = {}) ⇒ `Object`

Performs multiple text manipulations. Essentially a shortcut for typing them all. View source below to see which methods are run.

# File 'lib/stringex/string_extensions.rb', line 130

def remove_formatting(options = {})
  strip_html_tags.
    convert_smart_punctuation.
    convert_accented_html_entities.
    convert_vulgar_fractions.
    convert_unreadable_control_characters.
    convert_miscellaneous_html_entities.
    convert_miscellaneous_characters(options).
    to_ascii.
    # NOTE: String#to_ascii may convert some Unicode characters to ascii we'd already transliterated
    # so we need to do it again just to be safe
    convert_miscellaneous_characters(options).
    collapse
end

#replace_whitespace(replacement = " ") ⇒ `Object`

Replace runs of whitespace in string. Defaults to a single space but any replacement string may be specified as an argument. Examples:

"Foo       bar".replace_whitespace # => "Foo bar"
"Foo       bar".replace_whitespace("-") # => "Foo-bar"



# File 'lib/stringex/string_extensions.rb', line 150

def replace_whitespace(replacement = " ")
  gsub(/\s+/, replacement)
end

#strip_html_tags(leave_whitespace = false) ⇒ `Object`

Removes HTML tags from text. NOTE: This code is simplified from Tobias Luettke’s regular expression in Typo.

# File 'lib/stringex/string_extensions.rb', line 156

def strip_html_tags(leave_whitespace = false)
  string = stringex_convert do
    strip_html_tags!
  end
  leave_whitespace ? string : string.replace_whitespace(' ')
end

#to_ascii ⇒ `Object`

Returns string with its UTF-8 characters transliterated to ASCII ones. Example:

"⠋⠗⠁⠝⠉⠑".to_ascii #=> "france"



# File 'lib/stringex/unidecoder.rb', line 77

def to_ascii
  Stringex::Unidecoder.decode(self)
end

#to_html(lite_mode = false) ⇒ `Object`

Returns the string converted (via Textile/RedCloth) to HTML format or self [with a friendly warning] if RedCloth is not available.

Passing a truthy lite_mode argument (for example :lite) will cause RedCloth to not wrap the HTML in a container P element, which is useful behavior for generating header element text, etc. This is roughly equivalent to ActionView’s textilize_without_paragraph except that it makes RedCloth do all the work instead of just gsubbing the return from RedCloth.

# File 'lib/stringex/string_extensions.rb', line 171

def to_html(lite_mode = false)
  if defined?(RedCloth)
    if lite_mode
      RedCloth.new(self, [:lite_mode]).to_html
    else
      if self =~ /<pre>/
        RedCloth.new(self).to_html.tr("\t", "")
      else
        RedCloth.new(self).to_html.tr("\t", "").gsub(/\n\n/, "")
      end
    end
  else
    warn "String#to_html was called without RedCloth being successfully required"
    self
  end
end

#to_url(options = {}) ⇒ `Object`

Create a URI-friendly representation of the string. This is used internally by acts_as_url but can be called manually in order to generate a URI-friendly version of any string.

# File 'lib/stringex/string_extensions.rb', line 191

def to_url(options = {})
  return self if options[:exclude] && options[:exclude].include?(self)
  options = stringex_default_options.merge(options)
  whitespace_replacement_token = options[:replace_whitespace_with]
  dummy = remove_formatting(options).
            replace_whitespace(whitespace_replacement_token).
            collapse(whitespace_replacement_token).
            limit(options[:limit], options[:truncate_words], whitespace_replacement_token)
  dummy.downcase! unless options[:force_downcase] == false
  dummy
end
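
The final stages of the chain above (whitespace replacement, collapsing, limiting, and downcasing) can be sketched in plain Ruby. This is an illustration under stated assumptions, not the library's code: slug_tail_sketch and its keyword arguments are hypothetical, the remove_formatting step is omitted entirely, and only the default truncate_words behavior of limit is modeled.

```ruby
# Hypothetical helper mirroring the tail of the to_url chain shown above.
# It assumes the input has already been reduced to plain ASCII text.
def slug_tail_sketch(text, token: "-", limit: nil, force_downcase: true)
  slug = text.gsub(/\s+/, token)                  # like replace_whitespace(token)
  slug = slug.sub(/^#{token}*/, "")               # like collapse(token): trim leading run,
             .sub(/#{token}*$/, "")               # trim trailing run,
             .squeeze(token)                      # and condense interior runs
  slug = slug[0...limit] unless limit.nil?        # like limit with default truncate_words
  force_downcase ? slug.downcase : slug
end

puts slug_tail_sketch("  Hello   World  ")             # prints "hello-world"
puts slug_tail_sketch("Hello World Again", limit: 8)   # prints "hello-wo"
```

Note that whitespace is replaced before collapsing, so leading and trailing whitespace becomes tokens that the collapse step then strips.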

#whole_word_limit(limit, whitespace_replacement_token = "-") ⇒ `Object`

# File 'lib/stringex/string_extensions.rb', line 111

def whole_word_limit(limit, whitespace_replacement_token = "-")
  whole_words = []
  words = self.split(whitespace_replacement_token)

  words.each do |word|
    if word.size > limit
      break
    else
      whole_words << word
      limit -= (word.size + 1)
    end
  end

  whole_words.join(whitespace_replacement_token)
end
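
Since this method has no description, a worked example may help. The sketch below copies the loop shown above into a standalone method (whole_word_limit_sketch is a hypothetical name used only here) and compares it with plain slicing, which is what limit does when truncate_words is left at its default.

```ruby
# Standalone copy of the whole_word_limit loop shown above.
# `whole_word_limit_sketch` is a hypothetical name, not part of Stringex.
def whole_word_limit_sketch(string, limit, token = "-")
  whole_words = []
  string.split(token).each do |word|
    break if word.size > limit      # next word no longer fits: stop
    whole_words << word
    limit -= (word.size + 1)        # account for the word plus one separator
  end
  whole_words.join(token)
end

slug = "the-quick-brown-fox"

puts whole_word_limit_sketch(slug, 10)  # prints "the-quick"
puts whole_word_limit_sketch(slug, 8)   # prints "the"
puts slug[0...10]                       # prints "the-quick-" (plain slicing cuts mid-slug)
```

If the first word is already longer than the limit, the loop breaks immediately and the result is an empty string.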
