Module: FriendlyFormat
Defined in:
- lib/friendly_format.rb
- lib/friendly_format/version.rb
- lib/friendly_format/set_common.rb
- lib/friendly_format/set_strict.rb
- lib/friendly_format/adapter/hpricot_adapter.rb
- lib/friendly_format/adapter/nokogiri_adapter.rb
Overview
2008-05-09 godfat
Defined Under Namespace
Modules: HpricotAdapter, NokogiriAdapter
Classes: SetCommon, SetStrict
Constant Summary
- VERSION = '0.7.3'
Class Attribute Summary
- .adapter ⇒ Object
Class Method Summary
- .attrs2str(attrs) ⇒ Object
- .escape_ltgt(text) ⇒ Object private
- .escape_ltgt_inside_pre(html, allowed_tags) ⇒ Object private
  Perhaps everything inside code, rather than pre, should be escaped?
- .force_encoding(output, input) ⇒ Object private
  Forces the output encoding to match the input's (for Ruby 1.9).
- .format_article(html, *args) ⇒ Object
  Formats an entire article, keeping only the allowed tags passed to it.
- .format_article_entrance(html, allowed_tags = Set.new) ⇒ Object private
  Recursion entrance.
- .format_article_rec(elem, allowed_tags = Set.new, tag_name = nil) ⇒ Object private
  Recursion.
- .format_autolink(html, attrs = {}) ⇒ Object
  Automatically adds an “a href” tag to text that starts with an http/ftp/mailto/etc. protocol.
- .format_autolink_rec(elem, attrs = {}) ⇒ Object private
- .format_autolink_regexp(text, attrs = {}) ⇒ Object
  Translated from drupal-6.2/modules/filter/filter.module. Same as format_autolink, but uses only regexps rather than Hpricot.
- .format_newline(text) ⇒ Object
  Converts newline character(s) to <br />.
- .format_url(text, attrs = {}) ⇒ Object private
  Same as format_autolink_regexp, but simplified; it cannot process text that mixes HTML and plain text.
- .node_attrs(node) ⇒ Object
- .node_attrs_reject_js(node) ⇒ Object
- .node_tag_escape(node) ⇒ Object
- .node_tag_normal(node) ⇒ Object
- .node_tag_single(node) ⇒ Object
- .trim(text, length = 75) ⇒ Object private
  Extract it to public?
Class Attribute Details
.adapter ⇒ Object
    # File 'lib/friendly_format.rb', line 13
    def adapter
      @adapter ||= begin
                     NokogiriAdapter
                   rescue LoadError
                     HpricotAdapter
                   end
    end
Class Method Details
.attrs2str(attrs) ⇒ Object
    # File 'lib/friendly_format.rb', line 248
    def attrs2str attrs
      # TODO: no need to convert to hash for nokogiri
      Hash[attrs].sort.inject(''){ |i, (k, v)| i + " #{k}=\"#{v}\"" }
    end
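As a rough illustration of what attrs2str produces, the sketch below copies its one-line body into a standalone method (no FriendlyFormat or parser required). The attribute names and values are made up for the example.

```ruby
# Standalone copy of the attrs2str logic shown in the listing above.
# Accepts a Hash or an array of [key, value] pairs.
def attrs2str(attrs)
  # Sort by attribute name, then append each pair as ` key="value"`.
  Hash[attrs].sort.inject('') { |acc, (k, v)| acc + " #{k}=\"#{v}\"" }
end

puts attrs2str('href' => 'http://example.com', 'class' => 'link')
```

Because the pairs are sorted first, the output order depends on the attribute names rather than on insertion order, and every attribute carries a leading space so the result can be appended directly after a tag name.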
.escape_ltgt(text) ⇒ Object
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
    # File 'lib/friendly_format.rb', line 211
    def escape_ltgt text
      text.gsub('<', '&lt;').gsub('>', '&gt;')
    end
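A minimal standalone sketch of the escaping step, assuming the replacement strings are the HTML entities `&lt;` and `&gt;` (the only reading under which the method escapes anything):

```ruby
# Standalone sketch of escape_ltgt; the entity replacements are an assumption
# based on the method's name and purpose.
def escape_ltgt(text)
  text.gsub('<', '&lt;').gsub('>', '&gt;')
end

puts escape_ltgt('<b>bold</b>')
```

Only angle brackets are touched; ampersands and quotes pass through unchanged.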
.escape_ltgt_inside_pre(html, allowed_tags) ⇒ Object
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
Perhaps everything inside code, rather than pre, should be escaped?
    # File 'lib/friendly_format.rb', line 132
    def escape_ltgt_inside_pre html, allowed_tags
      return html unless allowed_tags.member?('pre')
      # don't bother nested pre, because we escape all tags in pre
      html = html + '</pre>' unless html =~ %r{</pre>}i
      html.gsub(%r{<pre>(.*)</pre>}mi){
        # stop escaping for '>' because drupal's url filter would make &gt; into url...
        # is there any other way to get matched group?
        "<pre>#{escape_ltgt($1)}</pre>"
      }
    end
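The behaviour described above can be sketched without a parser. The version below is a standalone approximation: it assumes the stripped parameter is `allowed_tags` (as in the method signature) and that escape_ltgt emits `&lt;` and `&gt;`.

```ruby
require 'set'

# Assumed behaviour of escape_ltgt: angle brackets become HTML entities.
def escape_ltgt(text)
  text.gsub('<', '&lt;').gsub('>', '&gt;')
end

# Escape everything between <pre> and </pre>, but only when 'pre' is allowed.
def escape_ltgt_inside_pre(html, allowed_tags)
  return html unless allowed_tags.member?('pre')
  # An unclosed <pre> is closed at the end of the input.
  html += '</pre>' unless html =~ %r{</pre>}i
  html.gsub(%r{<pre>(.*)</pre>}mi) { "<pre>#{escape_ltgt($1)}</pre>" }
end

puts escape_ltgt_inside_pre('<pre><b>x</b></pre>', Set['pre'])
puts escape_ltgt_inside_pre('<pre><b>x</b></pre>', Set.new)
```

The pattern is greedy, so with several pre blocks everything from the first `<pre>` to the last `</pre>` is treated as one span, which matches the "don't bother nested pre" comment in the listing.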
.force_encoding(output, input) ⇒ Object
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
Forces the output encoding to match the input's (for Ruby 1.9).
    # File 'lib/friendly_format.rb', line 217
    def force_encoding output, input
      if output.respond_to?(:force_encoding)
        output.force_encoding(input.encoding)
      else
        output
      end
    end
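To see what this guard does on a Ruby with string encodings, here is a small standalone sketch. The binary-tagged `output` string is a contrived stand-in for parser output whose encoding tag was lost.

```ruby
# Standalone copy of force_encoding from the listing above.
def force_encoding(output, input)
  if output.respond_to?(:force_encoding)
    output.force_encoding(input.encoding)
  else
    output # Ruby 1.8 strings carry no encoding
  end
end

input  = "caf\u00e9" # UTF-8 string
output = input.b     # same bytes, tagged ASCII-8BIT (contrived for the example)

puts output.encoding
puts force_encoding(output, input).encoding
```

Note that `String#force_encoding` only relabels the bytes; it does not transcode them.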
.format_article(html, *args) ⇒ Object
Formats an entire article. Pass the allowed tags as String, Symbol, or Set arguments. By default no tags are allowed, so every tag is escaped. The input is parsed with the configured adapter (NokogiriAdapter, or HpricotAdapter if Nokogiri cannot be loaded).
    # File 'lib/friendly_format.rb', line 27
    def format_article html, *args
      return html if html.strip == ''

      FriendlyFormat.force_encoding(
        FriendlyFormat.format_article_entrance(html,
          args.inject(Set.new){ |allowed_tags, arg|
            case arg
              when String; allowed_tags << arg
              when Symbol; allowed_tags << arg.to_s
              when Set;    allowed_tags += Set.new(arg.map{|a|a.to_s})
              else; raise(TypeError.new("expected String|Symbol|Set, got #{arg.class}"))
            end
          }), html)
    end
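The argument handling is the part of format_article that can be shown in isolation. The sketch below extracts it into a helper; `normalize_allowed_tags` is a hypothetical name used only for this example and is not part of FriendlyFormat.

```ruby
require 'set'

# Hypothetical helper (not in the library): folds String, Symbol, and Set
# arguments into one Set of tag-name strings, as format_article appears to do.
def normalize_allowed_tags(*args)
  args.inject(Set.new) { |allowed_tags, arg|
    case arg
    when String then allowed_tags << arg
    when Symbol then allowed_tags << arg.to_s
    when Set    then allowed_tags + Set.new(arg.map { |a| a.to_s })
    else raise TypeError, "expected String|Symbol|Set, got #{arg.class}"
    end
  }
end

p normalize_allowed_tags(:a, 'pre', Set[:b, :i]).to_a.sort
```

Under this reading, a call such as `FriendlyFormat.format_article(html, :a, 'pre', Set[:b, :i])` would allow the tags a, pre, b, and i, and any other argument type raises TypeError.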
.format_article_entrance(html, allowed_tags = Set.new) ⇒ Object
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
recursion entrance
    # File 'lib/friendly_format.rb', line 168
    def format_article_entrance html, allowed_tags = Set.new
      format_article_rec(
        adapter.parse(escape_ltgt_inside_pre(html, allowed_tags)),
        allowed_tags)
    end
.format_article_rec(elem, allowed_tags = Set.new, tag_name = nil) ⇒ Object
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
recursion
    # File 'lib/friendly_format.rb', line 176
    def format_article_rec(elem, allowed_tags = Set.new, tag_name = nil)
      elem.children.map{ |e|
        if e.text?
          result = e.to_html
          case tag_name
            when 'pre'; format_url(result)
            when 'a'  ; format_newline(result)
            else      ; format_newline(format_url(result))
          end

        elsif e.elem?
          if allowed_tags.member?(e.name)
            if adapter.empty?(e)
              node_tag_single(e)
            else
              node_tag_normal(e) +
              format_article_rec(e, allowed_tags, e.name) +
              "</#{e.name}>"
            end
          else
            node_tag_escape(e) +
            if adapter.empty?(e)
              ''
            else
              format_article_rec(e, allowed_tags) +
              "&lt;/#{e.name}&gt;"
            end
          end
        end
      }.join
    end
.format_autolink(html, attrs = {}) ⇒ Object
Automatically adds an “a href” tag to text that starts with an http/ftp/mailto/etc. protocol. The HTML is parsed with the adapter, and a regexp translated from Drupal locates the link targets. It uses the simplified regexp; see format_url.
    # File 'lib/friendly_format.rb', line 48
    def format_autolink html, attrs = {}
      return html if html.strip == ''

      FriendlyFormat.force_encoding(
        FriendlyFormat.format_autolink_rec(
          FriendlyFormat.adapter.parse(html), attrs), html)
    end
.format_autolink_rec(elem, attrs = {}) ⇒ Object
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
    # File 'lib/friendly_format.rb', line 144
    def format_autolink_rec elem, attrs = {}
      elem.children.map{ |e|
        if e.text?
          format_url(e.content, attrs)

        elsif e.elem?
          if adapter.empty?(e)
            adapter.to_xhtml(e)
          else
            node_tag_normal(e) +
            format_autolink_rec(e, attrs) +
            "</#{e.name}>"
          end

        else
          e
        end
      }.join
    end
.format_autolink_regexp(text, attrs = {}) ⇒ Object
Translated from drupal-6.2/modules/filter/filter.module. Same as format_autolink, but uses only regexps rather than Hpricot.
    # File 'lib/friendly_format.rb', line 60
    def format_autolink_regexp text, attrs = {}
      attrs = attrs.map{ |k,v| " #{k}=\"#{v}\""}.join
      # Match absolute URLs.
      " #{text}".gsub(%r{(<p>|<li>|<br\s*/?>|[ \n\r\t\(])((http://|https://|ftp://|mailto:|smb://|afp://|file://|gopher://|news://|ssl://|sslv2://|sslv3://|tls://|tcp://|udp://)([a-zA-Z0-9@:%_+*~#?&=.,/;-]*[a-zA-Z0-9@:%_+*~#&=/;-]))([.,?!]*?)(?=(</p>|</li>|<br\s*/?>|[ \n\r\t\)])?)}i){ |match|
        match = [match, $1, $2, $3, $4, $5]
        match[2] = match[2] # escape something here
        caption = FriendlyFormat.trim match[2]
        # match[2] = sanitize match[2]
        match[1]+'<a href="'+match[2]+'" title="'+match[2]+"\"#{attrs}>"+
          caption+'</a>'+match[5]

      # Match e-mail addresses.
      }.gsub(%r{(<p>|<li>|<br\s*/?>|[ \n\r\t\(])([A-Za-z0-9._-]+@[A-Za-z0-9._+-]+\.[A-Za-z]{2,4})([.,?!]*?)(?=(</p>|</li>|<br\s*/?>|[ \n\r\t\)]))}i,
        '\1<a href="mailto:\2">\2</a>\3').

      # Match www domains/addresses.
      gsub(%r{(<p>|<li>|[ \n\r\t\(])(www\.[a-zA-Z0-9@:%_+*~#?&=.,/;-]*[a-zA-Z0-9@:%_+~#\&=/;-])([.,?!]*?)(?=(</p>|</li>|<br\s*/?>|[ \n\r\t\)]))}i){ |match|
        match = [match, $1, $2, $3, $4, $5]
        match[2] = match[2] # escape something here
        caption = FriendlyFormat.trim match[2]
        # match[2] = sanitize match[2]
        match[1]+'<a href="http://'+match[2]+'" title="http://'+match[2]+"\"#{attrs}>"+
          caption+'</a>'+match[3]
      }[1..-1]
    end
.format_newline(text) ⇒ Object
Converts newline character(s) to <br />.
    # File 'lib/friendly_format.rb', line 86
    def format_newline text
      # windows: \r\n
      # mac os 9: \r
      text.gsub("\r\n", "\n").tr("\r", "\n").gsub("\n", "<br />\n")
    end
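The three-step normalization is easy to try on its own. This standalone copy feeds it one string containing Windows, classic Mac OS, and Unix line endings:

```ruby
# Standalone copy of format_newline from the listing above.
def format_newline(text)
  # 1. \r\n -> \n   2. lone \r -> \n   3. \n -> <br />\n
  text.gsub("\r\n", "\n").tr("\r", "\n").gsub("\n", "<br />\n")
end

p format_newline("a\r\nb\rc\nd")
```

Collapsing `\r\n` before translating lone `\r` matters: doing it in the other order would turn each Windows line ending into two breaks.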
.format_url(text, attrs = {}) ⇒ Object
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
Same as format_autolink_regexp, but simplified; it cannot process text that mixes HTML and plain text. Used in format_autolink.
    # File 'lib/friendly_format.rb', line 113
    def format_url text, attrs = {}
      # translated from drupal-6.2/modules/filter/filter.module
      # Match absolute URLs.
      text.gsub(
    %r{((http://|https://|ftp://|mailto:|smb://|afp://|file://|gopher://|news://|ssl://|sslv2://|sslv3://|tls://|tcp://|udp://|www\.)([a-zA-Z0-9@:%_+*~#?&=.,/;-]*[a-zA-Z0-9@:%_+*~#&=/;-]))([.,?!]*?)}i){ |match|
        url = $1 # is there any other way to get this variable?
        caption = trim(url)
        html_attrs = attrs.map{ |k,v| " #{k}=\"#{v}\""}.join

        # Match www domains/addresses.
        url = "http://#{url}" unless url =~ %r{^http://}
        "<a href=\"#{url}\" title=\"#{url}\"#{html_attrs}>#{caption}</a>"
      # Match e-mail addresses.
      }.gsub(
        %r{([A-Za-z0-9._-]+@[A-Za-z0-9._+-]+\.[A-Za-z]{2,4})([.,?!]*?)}i,
        '<a href="mailto:\1">\1</a>')
    end
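The following standalone sketch reproduces format_url together with trim so the linking behaviour can be tried on plain text. The local variable name `caption` is an assumption (the identifier is missing from the extracted listing), and the sample inputs are invented.

```ruby
# Standalone copy of trim, needed for the link caption.
def trim(text, length = 75)
  if text.size <= 3
    '...'
  elsif text.size > length
    "#{text[0...length - 3]}..."
  else
    text
  end
end

URL_RE = %r{((http://|https://|ftp://|mailto:|smb://|afp://|file://|gopher://|news://|ssl://|sslv2://|sslv3://|tls://|tcp://|udp://|www\.)([a-zA-Z0-9@:%_+*~#?&=.,/;-]*[a-zA-Z0-9@:%_+*~#&=/;-]))([.,?!]*?)}i
MAIL_RE = %r{([A-Za-z0-9._-]+@[A-Za-z0-9._+-]+\.[A-Za-z]{2,4})([.,?!]*?)}i

# Sketch of format_url; `caption` is an assumed variable name.
def format_url(text, attrs = {})
  html_attrs = attrs.map { |k, v| " #{k}=\"#{v}\"" }.join
  text.gsub(URL_RE) {
    url = $1
    caption = trim(url)
    url = "http://#{url}" unless url =~ %r{^http://}
    "<a href=\"#{url}\" title=\"#{url}\"#{html_attrs}>#{caption}</a>"
  }.gsub(MAIL_RE, '<a href="mailto:\1">\1</a>')
end

puts format_url('see http://example.com now')
puts format_url('visit www.example.com', 'rel' => 'nofollow')
puts format_url('mail bob@example.com')
```

The caption keeps the text as typed (for instance `www.example.com`), while the href and title receive the `http://` prefix when the match does not already start with it.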
.node_attrs(node) ⇒ Object
    # File 'lib/friendly_format.rb', line 237
    def node_attrs node
      attrs2str(node.attributes)
    end
.node_attrs_reject_js(node) ⇒ Object
    # File 'lib/friendly_format.rb', line 241
    def node_attrs_reject_js node
      # TODO: no need to convert to hash for nokogiri
      attrs2str(Hash[node.attributes].reject{ |k, v|
        k =~ /\Aon/ || v.to_s =~ /\Ajavascript/
      })
    end
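The filtering rule can be exercised on a plain Hash instead of a parsed node. `reject_js_attrs` below is a hypothetical helper written for this example; it applies the same two patterns as the listing.

```ruby
# Hypothetical helper (not in the library): drops attributes whose name starts
# with "on" or whose value starts with "javascript".
def reject_js_attrs(attrs)
  Hash[attrs].reject { |k, v| k =~ /\Aon/ || v.to_s =~ /\Ajavascript/ }
end

attrs = {
  'href'    => 'javascript:alert(1)',
  'onclick' => 'steal()',
  'title'   => 'ok'
}
p reject_js_attrs(attrs)
```

As written, both patterns are case-sensitive and anchored at the start of the string, so this sketch would not catch variants such as an uppercase attribute name or a value with leading whitespace.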
.node_tag_escape(node) ⇒ Object
    # File 'lib/friendly_format.rb', line 233
    def node_tag_escape node
      "&lt;#{node.name}#{node_attrs(node)}&gt;"
    end
.node_tag_normal(node) ⇒ Object
    # File 'lib/friendly_format.rb', line 229
    def node_tag_normal node
      "<#{node.name}#{node_attrs_reject_js(node)}>"
    end
.node_tag_single(node) ⇒ Object
    # File 'lib/friendly_format.rb', line 225
    def node_tag_single node
      "<#{node.name}#{node_attrs_reject_js(node)} />"
    end
.trim(text, length = 75) ⇒ Object
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
extract it to public?
    # File 'lib/friendly_format.rb', line 98
    def trim text, length = 75
      # Use +3 for '...' string length.
      if text.size <= 3
        '...'
      elsif text.size > length
        "#{text[0...length-3]}..."
      else
        text
      end
    end
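A standalone copy of trim makes its three branches easy to observe. The sample URL and the length of 20 are chosen only for the example.

```ruby
# Standalone copy of trim from the listing above.
def trim(text, length = 75)
  if text.size <= 3
    '...'                          # very short input is replaced outright
  elsif text.size > length
    "#{text[0...length - 3]}..."   # cut so the result is exactly `length` long
  else
    text
  end
end

p trim('http://example.com/a/long/path', 20)
p trim('abcd')
p trim('ab')
```

The ellipsis counts toward the limit, so a trimmed result is never longer than `length`. As the listing is written, input of three characters or fewer comes back as `'...'` rather than unchanged.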