Class: HtmlPrettify

Inherits:
String
  • Object
Defined in:
lib/html_prettify.rb

Overview

Heavily based on github.com/vmg/redcarpet/blob/master/ext/redcarpet/html_smartypants.c and github.com/jmcnevin/rubypants/blob/master/lib/rubypants/core.rb. 99% of the code here is by Jeremy McNevin.

This source file is available under the BSD/MIT license as well as the standard GPL.

Constructor Details

#initialize(string, options = [2], entities = {}) ⇒ HtmlPrettify

Create a new HtmlPrettify instance (the class is derived from RubyPants) with the text in string.

Allowed elements in the options array:

0

do nothing

1

enable all, using only em-dash shortcuts

2

enable all, using old school en- and em-dash shortcuts (default)

3

enable all, using inverted old school en- and em-dash shortcuts

-1

stupefy (translate HTML entities to their ASCII counterparts)

If you don’t like any of these defaults, you can pass symbols to change RubyPants’ behavior:

:quotes

quotes

:backticks

backtick quotes (“double” only)

:allbackticks

backtick quotes (“double” and ‘single’)

:dashes

dashes

:oldschool

old school dashes

:inverted

inverted old school dashes

:ellipses

ellipses

:convertquotes

convert &quot; entities to "

:stupefy

translate RubyPants HTML entities to their ASCII counterparts.
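
Going by the #to_html listing further down this page, the numeric presets are checked before any symbols, so a preset wins when both are passed. The sketch below restates that resolution order as a standalone function. The name resolve_flags and the returned hash are hypothetical and exist only for illustration; HtmlPrettify keeps these values in local variables instead.

```ruby
# Hypothetical helper, not part of HtmlPrettify: turns an options value into
# the flags that the #to_html listing works with.
def resolve_flags(options)
  options = [*options] # same normalization the constructor applies
  all_on = { quotes: true, backticks: true, ellipses: true, stupefy: false }

  # Numeric presets are tested first, in this order.
  return {} if options.include?(0)
  return all_on.merge(dashes: :normal) if options.include?(1)
  return all_on.merge(dashes: :oldschool) if options.include?(2)
  return all_on.merge(dashes: :inverted) if options.include?(3)
  return { stupefy: true } if options.include?(-1)

  # Only without a preset are the individual symbols consulted.
  dashes = nil
  dashes = :normal if options.include?(:dashes)
  dashes = :oldschool if options.include?(:oldschool)
  dashes = :inverted if options.include?(:inverted)

  {
    quotes: options.include?(:quotes),
    backticks: options.include?(:allbackticks) ? :both : options.include?(:backticks),
    dashes: dashes,
    ellipses: options.include?(:ellipses),
    stupefy: options.include?(:stupefy)
  }
end
```

Under this reading, resolve_flags([1, :stupefy]) leaves stupefy off because preset 1 is matched first, and later dash symbols override earlier ones (:inverted beats :oldschool, which beats :dashes).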

In addition, you can customize the HTML entities that will be injected by passing in a hash for the final argument. The defaults for these entities are as follows:

:single_left_quote

‘

:double_left_quote

“

:single_right_quote

’

:double_right_quote

”

:em_dash

—

:en_dash

–

:ellipsis

…

:html_quote

"
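
The constructor merges the caller's hash over the defaults with Hash#update, so only the keys you pass are replaced. A minimal sketch of that merge follows. The constant SKETCH_DEFAULT_ENTITIES and the helper merged_entities are hypothetical, and the default values are simply the characters shown in the list above; the class's own default_entities method is not shown on this page and may spell them differently.

```ruby
# Illustrative defaults copied from the list above. The real values come from
# HtmlPrettify's own default_entities method, which is not shown here and may
# use HTML entities rather than literal characters.
SKETCH_DEFAULT_ENTITIES = {
  single_left_quote: "\u2018",
  double_left_quote: "\u201C",
  single_right_quote: "\u2019",
  double_right_quote: "\u201D",
  em_dash: "\u2014",
  en_dash: "\u2013",
  ellipsis: "\u2026",
  html_quote: '"'
}.freeze

# Hypothetical helper: overlay caller-supplied entities on a copy of the defaults.
def merged_entities(overrides = {})
  SKETCH_DEFAULT_ENTITIES.dup.update(overrides)
end

# Override two keys; every other key keeps its default.
custom = merged_entities(em_dash: "&mdash;", ellipsis: "...")
```

Keys that are not in the defaults are kept by update as well, so a misspelled key is silently added rather than rejected.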



# File 'lib/html_prettify.rb', line 55

def initialize(string, options = [2], entities = {})
  super string

  @options = [*options]
  @entities = default_entities.update(entities)
end
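
The splat in [*options] means a single value and an array are both accepted. The sketch below isolates that behavior in a stand-in class; OptionsSketch is hypothetical and copies only the option handling, not the entity handling or any transformation.

```ruby
# Stand-in class (hypothetical, not HtmlPrettify itself) that reproduces only
# the constructor's option normalization.
class OptionsSketch < String
  attr_reader :options

  def initialize(string, options = [2])
    super(string)        # the instance is itself the string
    @options = [*options] # wrap a bare value, leave an array as it is
  end
end

OptionsSketch.new("text").options                      # default preset
OptionsSketch.new("text", :quotes).options             # bare symbol is wrapped
OptionsSketch.new("text", [:quotes, :dashes]).options  # array passes through
OptionsSketch.new("text", nil).options                 # nil becomes an empty array
```

An empty options array matches none of the presets, so in the #to_html listing it would fall through to the symbol branch with every flag off.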

Class Method Details

.render(html) ⇒ Object

Convenience wrapper: builds an instance from html with the default options ([2]) and returns the result of #to_html.



# File 'lib/html_prettify.rb', line 13

def self.render(html)
  new(html).to_html
end

Instance Method Details

#to_html ⇒ Object

Apply SmartyPants transformations.



# File 'lib/html_prettify.rb', line 63

def to_html
  do_quotes = do_backticks = do_dashes = do_ellipses = do_stupefy = nil

  if @options.include?(0)
    # Do nothing.
    return self
  elsif @options.include?(1)
    # Do everything, turn all options on.
    do_quotes = do_backticks = do_ellipses = true
    do_dashes = :normal
  elsif @options.include?(2)
    # Do everything, turn all options on, use old school dash shorthand.
    do_quotes = do_backticks = do_ellipses = true
    do_dashes = :oldschool
  elsif @options.include?(3)
    # Do everything, turn all options on, use inverted old school
    # dash shorthand.
    do_quotes = do_backticks = do_ellipses = true
    do_dashes = :inverted
  elsif @options.include?(-1)
    do_stupefy = true
  else
    do_quotes = @options.include?(:quotes)
    do_backticks = @options.include?(:backticks)
    do_backticks = :both if @options.include?(:allbackticks)
    do_dashes = :normal if @options.include?(:dashes)
    do_dashes = :oldschool if @options.include?(:oldschool)
    do_dashes = :inverted if @options.include?(:inverted)
    do_ellipses = @options.include?(:ellipses)
    do_stupefy = @options.include?(:stupefy)
  end

  # Parse the HTML
  tokens = tokenize

  # Keep track of when we're inside <pre>, <code>, <kbd>, <script> or
  # <math> tags, whose contents must not be transformed.
  in_pre = false

  # The result is accumulated here.
  result = +""

  # This is a cheat, used to get some context for one-character
  # tokens that consist of just a quote char. What we do is remember
  # the last character of the previous text token, to use as context
  # to curl single-character quote tokens correctly.
  prev_token_last_char = nil

  tokens.each do |token|
    if token.first == :tag
      result << token[1]
      if token[1] =~ %r{<(/?)(?:pre|code|kbd|script|math)[\s>]}
        in_pre = ($1 != "/") # Opening or closing tag?
      end
    else
      t = token[1]

      # Remember last char of this token before processing
      # (nil when the token is empty).
      last_char = t[-1]

      unless in_pre
        t.gsub!("&#39;", "'")
        t.gsub!("&quot;", '"')

        case do_dashes
        when :normal then t = educate_dashes t
        when :oldschool then t = educate_dashes_oldschool t
        when :inverted then t = educate_dashes_inverted t
        end

        t = educate_ellipses t if do_ellipses

        t = educate_fractions t

        # Note: backticks need to be processed before quotes.
        if do_backticks
          t = educate_backticks t
          t = educate_single_backticks t if do_backticks == :both
        end

        if do_quotes
          if t == "'"
            # Special case: single-character ' token
            if prev_token_last_char =~ /\S/
              t = entity(:single_right_quote)
            else
              t = entity(:single_left_quote)
            end
          elsif t == '"'
            # Special case: single-character " token
            if prev_token_last_char =~ /\S/
              t = entity(:double_right_quote)
            else
              t = entity(:double_left_quote)
            end
          else
            # Normal case:
            t = educate_quotes t
          end
        end

        t = stupefy_entities t if do_stupefy
      end

      prev_token_last_char = last_char
      result << t
    end
  end

  # Done
  result
end
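
The two pieces of state in the loop above, in_pre and prev_token_last_char, can be seen in isolation in the following sketch. It assumes the tokenizer yields [:tag, string] and [:text, string] pairs, as the listing suggests. The function curl_lone_quotes and the constant PROTECTED_TAG are hypothetical, and the sketch handles only the single-character " token, not the general educate_quotes case.

```ruby
# Same pattern as in the listing: matches opening and closing protected tags.
PROTECTED_TAG = %r{<(/?)(?:pre|code|kbd|script|math)[\s>]}

# Hypothetical reduction of the token loop: curls only text tokens that
# consist of a single double quote, using the previous text token as context.
def curl_lone_quotes(tokens)
  in_pre = false
  prev_last_char = nil
  result = +""

  tokens.each do |kind, text|
    if kind == :tag
      # An opening tag turns protection on, a closing tag turns it off.
      in_pre = (Regexp.last_match(1) != "/") if text =~ PROTECTED_TAG
      result << text
    else
      out = text
      if !in_pre && text == '"'
        # Non-whitespace before the quote means it closes; otherwise it opens.
        out = prev_last_char.to_s.match?(/\S/) ? "\u201D" : "\u201C"
      end
      prev_last_char = text[-1] # context comes from the unprocessed token
      result << out
    end
  end

  result
end

tokens = [
  [:text, "He said "], [:tag, "<em>"], [:text, '"'], [:tag, "</em>"],
  [:text, "hi"], [:tag, "<em>"], [:text, '"'], [:tag, "</em>"]
]
curled = curl_lone_quotes(tokens)
```

Since in_pre is a single boolean, a closing </code> nested inside a <pre> would switch protection off before the </pre> is reached; the listing above appears to share this property.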