Module: Typeset
Defined in:
  lib/typeset.rb,
  lib/typeset/quotes.rb,
  lib/typeset/spaces.rb,
  lib/typeset/hyphenate.rb,
  lib/typeset/ligatures.rb,
  lib/typeset/small_caps.rb,
  lib/typeset/punctuation.rb,
  lib/typeset/hanging_punctuation.rb
Overview
Contains all of our typesetting-related class methods. Mix this module into a class, or just call `Typeset.typeset` directly.
Defined Under Namespace
Modules: HangingPunctuation
Constant Summary
- DefaultMethods =
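The default typesetting methods and their configuration. Each entry is a [method_name, use_text_nodes] pair: when the flag is true, the method is applied to each HTML text node separately; when false, it receives the whole HTML string. Add new methods here in whatever order makes sense.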
[ [:quotes, true], [:hanging_punctuation, true], [:spaces, true], [:small_caps, true], [:ligatures, false], [:punctuation, false], [:hyphenate, true] ]
- DefaultOptions =
{ :disable => [], :language => "en_us" }
- Ligatures =
Map of raw text sequences to unicode ligatures
{ 'ffi' => 'ﬃ', 'ffl' => 'ﬄ', 'fi' => 'ﬁ', 'fl' => 'ﬂ', 'st' => 'ﬆ', 'ff' => 'ﬀ', 'ue' => 'ᵫ' }
- DefaultLigatures =
List of ligatures to process by default
%w{ffi ffl fi fl ff}
Class Method Summary
- .apply_to_text_nodes(html, &func) ⇒ Object
  Parse an HTML fragment with Nokogiri and apply a function to all of the descendant text nodes.
- .hanging_punctuation(text, options) ⇒ Object
  Add push/pull spans for hanging punctuation to text.
- .hyphenate(text, options) ⇒ Object
  Hyphenate text, inserting soft hyphenation markers.
- .ligatures(text, options) ⇒ Object
  Find and replace sequences of text with their unicode ligature equivalents.
- .punctuation(text, options) ⇒ Object
  Make dashes, ellipses, and start/end punctuation a little prettier.
- .quotes(text, options) ⇒ Object
  A poor-man’s Smarty Pants implementation.
- .small_caps(text, options) ⇒ Object
  Identify likely acronyms, and wrap them in a ‘small-caps’ span.
- .spaces(text, options) ⇒ Object
  Replace wide (normal) spaces around math operators with thin spaces (U+2009).
- .typeset(html, options = Typeset::DefaultOptions) ⇒ Object
  The main entry point for Typeset.
Class Method Details
.apply_to_text_nodes(html, &func) ⇒ Object
Parse an HTML fragment with Nokogiri and apply a function to all of the descendant text nodes
    # File 'lib/typeset.rb', line 15
    def self.apply_to_text_nodes(html, &func)
      doc = Nokogiri::HTML("<div id='rtypeset_internal'>#{html}</div>", nil, "UTF-8", Nokogiri::XML::ParseOptions::NOENT)
      doc.search('//text()').each do |node|
        old_content = node.content
        new_content = func.call(node.content.strip)
        if old_content =~ /^(\s+)/
          new_content = " #{new_content}"
        end
        if old_content =~ /(\s+)$/
          new_content = "#{new_content} "
        end
        node.replace(new_content)
      end
      content = doc.css("#rtypeset_internal")[0].children.map { |child| child.to_html }
      return content.join("")
    end
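The whitespace handling in the listing above can be looked at in isolation: each text node is stripped before the function runs, and a single space is put back on either side where whitespace was present. A minimal sketch without Nokogiri follows; `transform_preserving_edges` is a hypothetical helper name, and it uses whole-string anchors (`\A`, `\z`) rather than the line anchors in the listing.

```ruby
# Hypothetical helper: mirrors only the strip / re-pad logic, not the HTML parsing.
def transform_preserving_edges(content)
  # The callback only ever sees the stripped text.
  transformed = yield(content.strip)
  # Any run of leading or trailing whitespace collapses to one space.
  transformed = " #{transformed}" if content =~ /\A\s+/
  transformed = "#{transformed} " if content =~ /\s+\z/
  transformed
end

padded = transform_preserving_edges("  hi \n") { |s| s.upcase }
puts padded.inspect
```

Under this sketch, runs of whitespace at the edges of a node are not preserved exactly; they are reduced to one space each.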
.hanging_punctuation(text, options) ⇒ Object
Add push/pull spans for hanging punctuation to text.
    # File 'lib/typeset/hanging_punctuation.rb', line 27
    def self.hanging_punctuation(text, options)
      return text if text.length < 2
      aligns = "CcOoYTAVvWwY".split('')
      words = text.split(/\s+/)
      words.each_with_index do |word, i|
        [[aligns, false],
         [HangingPunctuation::SingleWidth, 'single'],
         [HangingPunctuation::DoubleWidth, 'double']].each do |pair|
          pair[0].each do |signal|
            if word[0] == signal
              words[i] = "#{HangingPunctuation.pull(pair[1], signal)}#{word.slice(1,word.length)}"
              if not words[i-1].nil?
                words[i-1] = "#{words[i-1]}#{HangingPunctuation.push(pair[1] ? pair[1] : signal)}"
              end
            end
          end
        end
      end
      return words.join(" ")
    end
.hyphenate(text, options) ⇒ Object
Hyphenate text, inserting soft hyphenation markers.
    # File 'lib/typeset/hyphenate.rb', line 8
    def self.hyphenate(text, options)
      options[:language] ||= 'en_us'
      hyphen = Text::Hyphen.new(:language => options[:language], :left => 0, :right => 0)
      text = hyphen.visualise(text, "\u00AD")
      return text
    end
.ligatures(text, options) ⇒ Object
Find and replace sequences of text with their unicode ligature equivalents.
    # File 'lib/typeset/ligatures.rb', line 21
    def self.ligatures(text, options)
      options[:ligatures] ||= DefaultLigatures

      options[:ligatures].each do |ligature|
        text.gsub!(ligature, Ligatures[ligature])
      end
      return text
    end
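Because the sequences are replaced one after another, their order matters: longer sequences such as 'ffi' need to run before 'fi' or 'ff'. The following is a standalone sketch of that idea, not the library code; `LIGATURE_MAP` and `replace_ligatures` are hypothetical names, and the code points are the standard Unicode ligatures U+FB00 to U+FB04.

```ruby
# Assumed subset of the ligature map, written with escapes so the code points are explicit.
LIGATURE_MAP = {
  'ffi' => "\uFB03",
  'ffl' => "\uFB04",
  'fi'  => "\uFB01",
  'fl'  => "\uFB02",
  'ff'  => "\uFB00"
}.freeze

# Replace each sequence in the given order, returning a new string (non-mutating).
def replace_ligatures(text, sequences = LIGATURE_MAP.keys)
  sequences.reduce(text) { |acc, seq| acc.gsub(seq, LIGATURE_MAP.fetch(seq)) }
end

puts replace_ligatures("office")              # longest-first: 'ffi' is replaced as a unit
puts replace_ligatures("office", %w{fi ffi})  # 'fi' runs first, so 'ffi' no longer matches
```

Unlike the listing above, this sketch uses `gsub` rather than `gsub!`, so the caller's string is left untouched.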
.punctuation(text, options) ⇒ Object
Make dashes, ellipses, and start/end punctuation a little prettier.
    # File 'lib/typeset/punctuation.rb', line 3
    def self.punctuation(text, options)
      # Dashes
      text.gsub!('--', '–')
      text.gsub!(' – ', "\u2009–\u2009")

      # Ellipses
      text.gsub!('...', '…')

      # Non-breaking space for start/end punctuation with spaces.
      start_punc = /([«¿¡\[\(]) /
      if text =~ start_punc
        text.gsub!(start_punc, "#{$1}&nbsp;")
      end
      end_punc = / ([\!\?:;\.,‽»\]\)])/
      if text =~ end_punc
        text.gsub!(end_punc, "&nbsp;#{$1}")
      end
      return text
    end
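One detail of the listing above: the replacement string containing `$1` is interpolated once, before `gsub!` runs, so every replacement reuses the character captured by the preceding `=~` match. A per-match variant can be sketched with the block form of `gsub`. This is an illustrative alternative, not the library's code; `nbsp_around_punctuation` is a hypothetical name, and it emits a literal U+00A0 as the non-breaking space.

```ruby
NBSP = "\u00A0"

# Swap the plain space next to opening/closing punctuation for a non-breaking space.
# The block is evaluated for every match, so $1 is the character of that match.
def nbsp_around_punctuation(text)
  text
    .gsub(/([«¿¡\[(]) /) { "#{$1}#{NBSP}" }
    .gsub(/ ([!?:;.,‽»\])])/) { "#{NBSP}#{$1}" }
end

puts nbsp_around_punctuation("¿ Qué ? « oui »").inspect
```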
.quotes(text, options) ⇒ Object
A poor-man’s Smarty Pants implementation. Converts single & double quotes, tick marks, backticks, and primes into prettier unicode equivalents.
    # File 'lib/typeset/quotes.rb', line 4
    def self.quotes(text, options)
      # Unencode encoded characters, so our regex mess below works
      text.gsub!('&#39;',"\'")
      text.gsub!('&quot;',"\"")

      if text =~ /(\W|^)"(\S+)/
        text.gsub!(/(\W|^)"(\S+)/, "#{$1}\u201c#{$2}") # beginning "
      end
      if text =~ /(\u201c[^"]*)"([^"]*$|[^\u201c"]*\u201c)/
        text.gsub!(/(\u201c[^"]*)"([^"]*$|[^\u201c"]*\u201c)/, "#{$1}\u201d#{$2}") # ending "
      end
      if text =~ /([^0-9])"/
        text.gsub!(/([^0-9])"/, "#{$1}\u201d") # remaining " at end of word
      end
      if text =~ /(\W|^)'(\S)/
        text.gsub!(/(\W|^)'(\S)/, "#{$1}\u2018#{$2}") # beginning '
      end
      if text =~ /([a-z])'([a-z])/i
        text.gsub!(/([a-z])'([a-z])/i, "#{$1}\u2019#{$2}") # conjunction's possession
      end
      if text =~ /((\u2018[^']*)|[a-z])'([^0-9]|$)/i
        text.gsub!(/((\u2018[^']*)|[a-z])'([^0-9]|$)/i, "#{$1}\u2019#{$3}") # ending '
      end
      if text =~ /(\u2018)([0-9]{2}[^\u2019]*)(\u2018([^0-9]|$)|$|\u2019[a-z])/i
        text.gsub!(/(\u2018)([0-9]{2}[^\u2019]*)(\u2018([^0-9]|$)|$|\u2019[a-z])/i, "\u2019#{$2}#{$3}") # abbrev. years like '93
      end
      if text =~ /(\B|^)\u2018(?=([^\u2019]*\u2019\b)*([^\u2019\u2018]*\W[\u2019\u2018]\b|[^\u2019\u2018]*$))/i
        text.gsub!(/(\B|^)\u2018(?=([^\u2019]*\u2019\b)*([^\u2019\u2018]*\W[\u2019\u2018]\b|[^\u2019\u2018]*$))/i, "#{$1}\u2019") # backwards apostrophe
      end

      text.gsub!(/'''/, "\u2034") # triple prime
      text.gsub!(/("|'')/, "\u2033") # double prime
      text.gsub!(/'/, "\u2032")

      # Allow escaped quotes
      text.gsub!('\\\“','\"')
      text.gsub!('\\\”','\"')
      text.gsub!('\\\’','\'')
      text.gsub!('\\\‘','\'')
      return text
    end
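The core idea behind the first steps of the listing above is that a straight quote opens when it follows a non-word character (or the start of the text) and precedes non-whitespace, and closes otherwise. A deliberately reduced sketch of just that rule, covering double quotes only, might look like this; `curl_double_quotes` is a hypothetical name and the sketch ignores apostrophes, primes, and escaped quotes.

```ruby
LEFT_DOUBLE  = "\u201c"
RIGHT_DOUBLE = "\u201d"

# Opening quotes: at the start of the text or after a non-word character,
# and followed by non-whitespace. Everything left over is treated as closing.
def curl_double_quotes(text)
  text
    .gsub(/(\A|\W)"(?=\S)/) { "#{$1}#{LEFT_DOUBLE}" }
    .gsub('"', RIGHT_DOUBLE)
end

puts curl_double_quotes('She said "hello" twice')
```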
.small_caps(text, options) ⇒ Object
Identify likely acronyms, and wrap them in a ‘small-caps’ span.
    # File 'lib/typeset/small_caps.rb', line 3
    def self.small_caps(text, options)
      words = text.split(" ")
      words.each_with_index do |word, i|
        if word =~ /^\W*([[:upper:]][[:upper:]][[:upper:]]+)\W*/
          leading,trailing = word.split($1)
          words[i] = "#{leading}<span class=\"small-caps\">#{$1}</span>#{trailing}"
        end
      end
      return words.map { |x| x.strip }.join(" ")
    end
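The acronym rule in the listing above is: a word counts as a likely acronym when, after any leading punctuation, it starts with three or more uppercase letters. A compact standalone sketch of the same rule, using `sub` with a block instead of `split`, is shown here; `wrap_acronyms` and `ACRONYM_PREFIX` are hypothetical names.

```ruby
# Optional leading non-word characters, then a run of 3+ uppercase letters.
ACRONYM_PREFIX = /\A(\W*)([[:upper:]]{3,})/

# Wrap the uppercase run of each qualifying word in a small-caps span.
# Whatever follows the run (punctuation, "'s") is left in place by sub.
def wrap_acronyms(text)
  text.split(" ").map do |word|
    word.sub(ACRONYM_PREFIX) { "#{$1}<span class=\"small-caps\">#{$2}</span>" }
  end.join(" ")
end

puts wrap_acronyms("The NASA team (ESA), ok")
```

As in the listing, splitting on spaces and re-joining means runs of whitespace between words are reduced to single spaces.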
.spaces(text, options) ⇒ Object
Replace wide (normal) spaces around math operators with thin spaces (U+2009).
    # File 'lib/typeset/spaces.rb', line 3
    def self.spaces(text, options)
      text.gsub!(" / ", "\u2009/\u2009")
      text.gsub!(" × ", "\u2009×\u2009")
      text.gsub!(" % ", "\u2009%\u2009")
      text.gsub!(" + ", "\u2009+\u2009")

      return text
    end
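The four substitutions above follow one pattern, so they can also be expressed as a loop over an operator list. The following is a sketch of that pattern only; `thin_space_operators`, `THIN_SPACE`, and `OPERATORS` are hypothetical names, and the function returns a new string instead of mutating its argument.

```ruby
THIN_SPACE = "\u2009"
OPERATORS  = ['/', '×', '%', '+'].freeze

# Replace the ordinary spaces on both sides of each operator with thin spaces.
def thin_space_operators(text)
  OPERATORS.reduce(text) do |acc, op|
    acc.gsub(" #{op} ", "#{THIN_SPACE}#{op}#{THIN_SPACE}")
  end
end

puts thin_space_operators("1 + 2 / 3").inspect
```

Only operators with a space on both sides are affected, so text such as "and/or" or "50%" passes through unchanged.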
.typeset(html, options = Typeset::DefaultOptions) ⇒ Object
The main entry point for Typeset. Pass in raw HTML or text, along with an optional options hash. Methods named in options[:disable] are skipped.
    # File 'lib/typeset.rb', line 51
    def self.typeset(html, options=Typeset::DefaultOptions)
      methods = Typeset::DefaultMethods.dup
      options[:disable] ||= DefaultOptions[:disable]
      methods.reject! { |method| options[:disable].include?(method[0]) }

      methods.each do |func, use_text_nodes|
        new_html = html
        if use_text_nodes
          new_html = Typeset.apply_to_text_nodes(html) { |content| Typeset.send(func, content, options) }
        else
          new_html = Typeset.send(func, html, options).strip
        end
        html = new_html
      end
      return html
    end
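The dispatch pattern in the listing above (an ordered list of steps, filtered by a :disable option, each step fed the output of the previous one) can be sketched without the library. Everything in this example is hypothetical: `MiniTypeset`, `STEPS`, and the two stand-in steps `:ellipses` and `:shout` exist only to show the control flow.

```ruby
# Hypothetical stand-in for the pipeline; the steps are toy lambdas, not real typesetting.
module MiniTypeset
  STEPS = [
    [:ellipses, ->(text) { text.gsub('...', "\u2026") }],
    [:shout,    ->(text) { text.upcase }]
  ].freeze

  # Run every step that is not listed in options[:disable], in order.
  def self.run(text, options = {})
    disabled = options.fetch(:disable, [])
    STEPS.reject { |name, _step| disabled.include?(name) }
         .reduce(text) { |acc, (_name, step)| step.call(acc) }
  end
end

puts MiniTypeset.run("wait...")
puts MiniTypeset.run("wait...", :disable => [:shout])
```

With the real module, the equivalent call shape would presumably be `Typeset.typeset(html, :disable => [:hyphenate])`, based on the signature and the `options[:disable]` handling shown above.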