Module: FriendlyFormat

Defined in:
lib/friendly_format.rb,
lib/friendly_format/version.rb,
lib/friendly_format/set_common.rb,
lib/friendly_format/set_strict.rb,
lib/friendly_format/adapter/hpricot_adapter.rb,
lib/friendly_format/adapter/nokogiri_adapter.rb

Overview

2008-05-09 godfat

Defined Under Namespace

Modules: HpricotAdapter, NokogiriAdapter

Classes: SetCommon, SetStrict

Constant Summary

VERSION = '0.7.0'

Class Attribute Details

.adapter ⇒ Object



# File 'lib/friendly_format.rb', line 14

def adapter
  @adapter ||= begin
                 HpricotAdapter
               rescue LoadError
                 begin
                   NokogiriAdapter
                 rescue LoadError
                   LibxmlAdapter
                 end
               end
end

Class Method Details

.attrs2str(attrs) ⇒ Object



# File 'lib/friendly_format.rb', line 250

def attrs2str attrs
  attrs.sort.inject(''){ |i, (k, v)| i + " #{k}=\"#{v}\"" }
end

.escape_ltgt(text) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



# File 'lib/friendly_format.rb', line 216

def escape_ltgt text
  text.gsub('<', '&lt;').gsub('>', '&gt;')
end

.escape_ltgt_inside_pre(html, allowed_tags) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

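Escapes < and > inside pre blocks when pre is one of the allowed tags; otherwise the input is returned unchanged. Open question: perhaps everything inside code should be escaped instead of pre?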



# File 'lib/friendly_format.rb', line 137

def escape_ltgt_inside_pre html, allowed_tags
  return html unless allowed_tags.member?('pre')
  # don't bother nested pre, because we escape all tags in pre
  html = html + '</pre>' unless html =~ %r{</pre>}i
  html.gsub(%r{<pre>(.*)</pre>}mi){
    # stop escaping for '>' because drupal's url filter would make &gt; into url...
    # is there any other way to get matched group?
    "<pre>#{escape_ltgt($1)}</pre>"
  }
end
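
To make the behaviour of the listing above easier to try out, here is a minimal standalone sketch. It copies the two helpers as top-level methods (an assumption made for illustration; in the library they are module methods of FriendlyFormat) and runs them on a closed block, an unterminated block, and a call where pre is not allowed.

```ruby
require 'set'

# Standalone copies of the listed helpers, for experimentation only.
def escape_ltgt(text)
  text.gsub('<', '&lt;').gsub('>', '&gt;')
end

def escape_ltgt_inside_pre(html, allowed_tags)
  return html unless allowed_tags.member?('pre')
  # Close an unterminated <pre> so the regexp below can match it.
  html += '</pre>' unless html =~ %r{</pre>}i
  html.gsub(%r{<pre>(.*)</pre>}mi) { "<pre>#{escape_ltgt($1)}</pre>" }
end

allowed  = Set['pre']
closed   = escape_ltgt_inside_pre('<pre><b>x</b></pre>', allowed)
unclosed = escape_ltgt_inside_pre('<pre><i>y', allowed)
skipped  = escape_ltgt_inside_pre('<pre><b>x</b></pre>', Set.new)

puts closed    # <pre>&lt;b&gt;x&lt;/b&gt;</pre>
puts unclosed  # <pre>&lt;i&gt;y</pre>
puts skipped   # <pre><b>x</b></pre>
```

Because the pattern is greedy, input with several pre blocks is matched from the first opening tag to the last closing tag, so markup between the blocks is escaped as well.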

.force_encoding(output, input) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

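Forces the encoding of output to match the encoding of input on Ruby 1.9 and later. On older Rubies, where String does not respond to force_encoding, output is returned unchanged.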



# File 'lib/friendly_format.rb', line 222

def force_encoding output, input
  if output.respond_to?(:force_encoding)
    output.force_encoding(input.encoding)
  else
    output
  end
end
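
As an illustration of what the listing above does on a modern Ruby, the following sketch copies the method as a top-level method (an assumption for illustration) and retags a binary string with the encoding of a UTF-8 input. The sample strings are made up.

```ruby
# Standalone copy of the listed helper, for experimentation only.
def force_encoding(output, input)
  if output.respond_to?(:force_encoding)
    output.force_encoding(input.encoding)
  else
    output
  end
end

input  = "caf\u00e9"       # UTF-8 string
output = "caf\xC3\xA9".b   # same bytes, tagged as binary (ASCII-8BIT)
result = force_encoding(output, input)

puts result.encoding       # UTF-8
```

Note that String#force_encoding only changes the encoding tag and mutates the receiver in place; no bytes are converted.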

.format_article(html, *args) ⇒ Object

Formats an entire article, escaping every tag that is not explicitly allowed. Pass the allowed tags as String, Symbol, or Set arguments; any other argument type raises a TypeError. By default no tags are allowed, so all tags are escaped. The input is parsed with the configured adapter, Hpricot by default.



# File 'lib/friendly_format.rb', line 32

def format_article html, *args
  return html if html.strip == ''

  FriendlyFormat.force_encoding(
    FriendlyFormat.format_article_entrance(html,
      args.inject(Set.new){ |allowed_tags, arg|
        case arg
          when String; allowed_tags << arg
          when Symbol; allowed_tags << arg.to_s
          when Set;    allowed_tags += Set.new(arg.map{|a|a.to_s})
          else; raise(TypeError.new("expected String|Symbol|Set, got #{arg.class}"))
        end
        allowed_tags
      }),
    html)
end
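
The argument handling in the listing above can be tried in isolation. The sketch below extracts it into a helper called normalize_allowed_tags, a hypothetical name that does not exist in the library, where the logic is inline in format_article.

```ruby
require 'set'

# Hypothetical helper: folds String, Symbol, and Set arguments into one
# Set of tag-name strings, mirroring the inject block in format_article.
def normalize_allowed_tags(*args)
  args.inject(Set.new) do |allowed_tags, arg|
    case arg
    when String then allowed_tags << arg
    when Symbol then allowed_tags << arg.to_s
    when Set    then allowed_tags += Set.new(arg.map(&:to_s))
    else raise TypeError, "expected String|Symbol|Set, got #{arg.class}"
    end
    allowed_tags
  end
end

tags = normalize_allowed_tags(:a, 'pre', Set[:b, :i])
p tags.to_a  # ["a", "pre", "b", "i"]
```

Under this reading, a call such as format_article(html, :a, 'pre', Set[:b, :i]) allows the four tags a, pre, b, and i, and escapes everything else.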

.format_article_entrance(html, allowed_tags = Set.new) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

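Entry point of the recursion: escapes the contents of pre blocks, parses the result with the adapter, and passes the parsed tree to format_article_rec.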



# File 'lib/friendly_format.rb', line 173

def format_article_entrance html, allowed_tags = Set.new
  format_article_rec(
    adapter.parse(escape_ltgt_inside_pre(html, allowed_tags)),
    allowed_tags)
end

.format_article_rec(elem, allowed_tags = Set.new, tag_name = nil) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

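Recursive step. Walks the children of elem: text nodes get URL linking and newline conversion (only URL linking inside pre, only newline conversion inside a); allowed elements are emitted as real tags with on* attributes removed; all other elements are emitted with their angle brackets escaped.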



# File 'lib/friendly_format.rb', line 181

def format_article_rec(elem, allowed_tags = Set.new, tag_name = nil)

  elem.children.map{ |e|
    if e.text?
      result = e.to_html
      case tag_name
        when 'pre'; format_url(    result)
        when   'a'; format_newline(result)
        else      ; format_newline(format_url(result))
      end

    elsif e.elem?
      if allowed_tags.member?(e.name)
        if adapter.empty?(e)
          node_tag_single(e)
        else
          node_tag_normal(e) +
          format_article_rec(e, allowed_tags, e.name) +
          "</#{e.name}>"
        end
      else
        node_tag_escape(e) +
        if adapter.empty?(e)
          ''
        else
          format_article_rec(e, allowed_tags) +
          "&lt;/#{e.name}&gt;"
        end
      end

    end
  }.join
end
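
The allow-or-escape recursion in the listing above can be sketched without a real parser. The Node struct and the render method below are hypothetical stand-ins introduced only for this illustration; the sketch leaves out attributes, empty elements, and the URL and newline formatting of text nodes.

```ruby
require 'set'

# Hypothetical stand-in for a parsed node; the real adapters return
# Hpricot or Nokogiri nodes instead.
Node = Struct.new(:name, :text, :children) do
  def text?
    !text.nil?
  end
end

# Simplified walk: allowed elements become real tags, others are escaped.
def render(node, allowed)
  node.children.map { |child|
    if child.text?
      child.text
    elsif allowed.member?(child.name)
      "<#{child.name}>#{render(child, allowed)}</#{child.name}>"
    else
      "&lt;#{child.name}&gt;#{render(child, allowed)}&lt;/#{child.name}&gt;"
    end
  }.join
end

tree = Node.new('root', nil, [
  Node.new('b', nil, [Node.new(nil, 'bold', [])]),
  Node.new('script', nil, [Node.new(nil, 'x()', [])])
])
out = render(tree, Set['b'])
puts out  # <b>bold</b>&lt;script&gt;x()&lt;/script&gt;
```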

.format_autolink(html, attrs = {}) ⇒ Object

Automatically wraps text that starts with an http, ftp, mailto, or similar protocol in an “a href” tag. The input is parsed with Hpricot, and a simplified regexp translated from Drupal locates the targets. See format_url.



# File 'lib/friendly_format.rb', line 53

def format_autolink html, attrs = {}
  return html if html.strip == ''

  FriendlyFormat.force_encoding(
    FriendlyFormat.format_autolink_rec(
      FriendlyFormat.adapter.parse(html), attrs),
    html)
end

.format_autolink_rec(elem, attrs = {}) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



# File 'lib/friendly_format.rb', line 149

def format_autolink_rec elem, attrs = {}
  elem.children.map{ |e|
    if e.text?
      format_url(e.content, attrs)

    elsif e.elem?
      if adapter.empty?(e)
        adapter.to_xhtml(e)
      else
        node_tag_normal(e) +
        format_autolink_rec(e, attrs) +
        "</#{e.name}>"
      end

    else
      e

    end

  }.join
end

.format_autolink_regexp(text, attrs = {}) ⇒ Object

Translated from drupal-6.2/modules/filter/filter.module. Same as format_autolink, but uses only regexps and does not use Hpricot.



# File 'lib/friendly_format.rb', line 65

def format_autolink_regexp text, attrs = {}
  attrs = attrs.map{ |k,v| " #{k}=\"#{v}\""}.join
  # Match absolute URLs.
  " #{text}".gsub(%r{(<p>|<li>|<br\s*/?>|[ \n\r\t\(])((http://|https://|ftp://|mailto:|smb://|afp://|file://|gopher://|news://|ssl://|sslv2://|sslv3://|tls://|tcp://|udp://)([a-zA-Z0-9@:%_+*~#?&=.,/;-]*[a-zA-Z0-9@:%_+*~#&=/;-]))([.,?!]*?)(?=(</p>|</li>|<br\s*/?>|[ \n\r\t\)])?)}i){ |match|
    match = [match, $1, $2, $3, $4, $5]
    match[2] = match[2] # escape something here
    caption = FriendlyFormat.trim match[2]
    # match[2] = sanitize match[2]
    match[1]+'<a href="'+match[2]+'" title="'+match[2]+"\"#{attrs}>"+
      caption+'</a>'+match[5]

  # Match e-mail addresses.
  }.gsub(%r{(<p>|<li>|<br\s*/?>|[ \n\r\t\(])([A-Za-z0-9._-]+@[A-Za-z0-9._+-]+\.[A-Za-z]{2,4})([.,?!]*?)(?=(</p>|</li>|<br\s*/?>|[ \n\r\t\)]))}i, '\1<a href="mailto:\2">\2</a>\3').

  # Match www domains/addresses.
  gsub(%r{(<p>|<li>|[ \n\r\t\(])(www\.[a-zA-Z0-9@:%_+*~#?&=.,/;-]*[a-zA-Z0-9@:%_+~#\&=/;-])([.,?!]*?)(?=(</p>|</li>|<br\s*/?>|[ \n\r\t\)]))}i){ |match|
    match = [match, $1, $2, $3, $4, $5]
    match[2] = match[2] # escape something here
    caption = FriendlyFormat.trim match[2]
    # match[2] = sanitize match[2]
    match[1]+'<a href="http://'+match[2]+'" title="http://'+match[2]+"\"#{attrs}>"+
      caption+'</a>'+match[3]
  }[1..-1]
end

.format_newline(text) ⇒ Object

Converts newline characters (\n, \r\n, and \r) to <br /> followed by \n.



# File 'lib/friendly_format.rb', line 91

def format_newline text
  # windows: \r\n
  # mac os 9: \r
  text.gsub("\r\n", "\n").tr("\r", "\n").gsub("\n", "<br />\n")
end
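
A quick way to see the normalisation in the listing above is to run a standalone copy on text that mixes all three line-ending styles. The method is copied here as a top-level method, which is an assumption for illustration, and the sample text is made up.

```ruby
# Standalone copy of the listed method, for experimentation only.
def format_newline(text)
  # Normalise Windows (\r\n) and classic Mac (\r) endings to \n first.
  text.gsub("\r\n", "\n").tr("\r", "\n").gsub("\n", "<br />\n")
end

mixed  = "a\r\nb\rc\nd"
result = format_newline(mixed)
p result  # "a<br />\nb<br />\nc<br />\nd"
```

Replacing \r\n before \r matters: doing it the other way round would turn every Windows line ending into two line breaks.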

.format_url(text, attrs = {}) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Same as format_autolink_regexp, but simplified: it cannot process text that mixes HTML and plain text. Used by format_autolink.



# File 'lib/friendly_format.rb', line 118

def format_url text, attrs = {}
  # translated from drupal-6.2/modules/filter/filter.module
  # Match absolute URLs.
  text.gsub(
  %r{((http://|https://|ftp://|mailto:|smb://|afp://|file://|gopher://|news://|ssl://|sslv2://|sslv3://|tls://|tcp://|udp://|www\.)([a-zA-Z0-9@:%_+*~#?&=.,/;-]*[a-zA-Z0-9@:%_+*~#&=/;-]))([.,?!]*?)}i){ |match|
    url = $1 # is there any other way to get this variable?
    caption = trim(url)
    html_attrs = attrs.map{ |k,v| " #{k}=\"#{v}\""}.join

    # Match www domains/addresses.
    url = "http://#{url}" unless url =~ %r{^http://}
    "<a href=\"#{url}\" title=\"#{url}\"#{html_attrs}>#{caption}</a>"
  # Match e-mail addresses.
  }.gsub( %r{([A-Za-z0-9._-]+@[A-Za-z0-9._+-]+\.[A-Za-z]{2,4})([.,?!]*?)}i,
          '<a href="mailto:\1">\1</a>')
end
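
The core idea of the listing above can be sketched in a reduced form. The names URL_PATTERN and link_urls are hypothetical, only http://, https://, and www. prefixes are recognised, captions are not shortened, and e-mail addresses are ignored. Unlike the listing, this sketch adds the http:// prefix only to www. addresses.

```ruby
# Reduced illustration, not the library's implementation.
URL_PATTERN = %r{((?:https?://|www\.)[a-zA-Z0-9@:%_+*~#?&=.,/;-]*[a-zA-Z0-9@:%_+*~#&=/;-])}i

def link_urls(text, attrs = {})
  html_attrs = attrs.map { |k, v| " #{k}=\"#{v}\"" }.join
  text.gsub(URL_PATTERN) do
    url  = $1
    # Bare www. addresses need a scheme to form a usable href.
    href = url.downcase.start_with?('www.') ? "http://#{url}" : url
    "<a href=\"#{href}\" title=\"#{href}\"#{html_attrs}>#{url}</a>"
  end
end

linked = link_urls('see www.example.com.', 'rel' => 'nofollow')
puts linked
# see <a href="http://www.example.com" title="http://www.example.com" rel="nofollow">www.example.com</a>.
```

The second character class omits ., ,, and ?, so trailing sentence punctuation stays outside the link, as it does in the original pattern.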

.node_attrs(node) ⇒ Object



# File 'lib/friendly_format.rb', line 242

def node_attrs node
  attrs2str(node.attributes)
end

.node_attrs_reject_js(node) ⇒ Object



# File 'lib/friendly_format.rb', line 246

def node_attrs_reject_js node
  attrs2str(node.attributes.reject{ |k, v| k =~ /\Aon/ })
end
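
The attribute helpers are short enough to try on a plain Hash. This sketch assumes node.attributes behaves like a Hash of string keys and values, which may not hold for every adapter, and reject_js is a hypothetical name for the filtering step inside node_attrs_reject_js.

```ruby
# Standalone copy of attrs2str, for experimentation only.
def attrs2str(attrs)
  attrs.sort.inject('') { |i, (k, v)| i + " #{k}=\"#{v}\"" }
end

# Hypothetical helper: drops attributes whose name starts with "on".
def reject_js(attrs)
  attrs.reject { |k, _v| k =~ /\Aon/ }
end

attrs = { 'onclick' => 'evil()', 'href' => '/a', 'class' => 'b' }
all   = attrs2str(attrs)
safe  = attrs2str(reject_js(attrs))

puts all   #  class="b" href="/a" onclick="evil()"
puts safe  #  class="b" href="/a"
```

The filter looks only at attribute names, so values are passed through untouched; attributes are emitted in sorted order with a leading space before each.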

.node_tag_escape(node) ⇒ Object



# File 'lib/friendly_format.rb', line 238

def node_tag_escape node
  "&lt;#{node.name}#{node_attrs(node)}&gt;"
end

.node_tag_normal(node) ⇒ Object



# File 'lib/friendly_format.rb', line 234

def node_tag_normal node
  "<#{node.name}#{node_attrs_reject_js(node)}>"
end

.node_tag_single(node) ⇒ Object



# File 'lib/friendly_format.rb', line 230

def node_tag_single node
  "<#{node.name}#{node_attrs_reject_js(node)} />"
end

.trim(text, length = 75) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

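Shortens text longer than length characters to exactly length characters, the last three being '...'. Text of three characters or fewer is replaced by '...'. Open question: extract it to the public API?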



# File 'lib/friendly_format.rb', line 103

def trim text, length = 75
  # Use +3 for '...' string length.
  if text.size <= 3
    '...'
  elsif text.size > length
    "#{text[0...length-3]}..."
  else
    text
  end
end
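
The three branches of the listing above can be exercised with a standalone copy. The method is copied here as a top-level method, which is an assumption for illustration, and the sample inputs are made up.

```ruby
# Standalone copy of the listed method, for experimentation only.
def trim(text, length = 75)
  if text.size <= 3
    '...'
  elsif text.size > length
    # Keep length - 3 characters so the result is exactly length long.
    "#{text[0...length - 3]}..."
  else
    text
  end
end

long  = trim('a' * 80)
short = trim('http://example.com')
tiny  = trim('abc')

puts long.size  # 75
puts short      # http://example.com
puts tiny       # ...
```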