Module: Typeset

Defined in: lib/typeset.rb,
lib/typeset/quotes.rb,
lib/typeset/spaces.rb,
lib/typeset/hyphenate.rb,
lib/typeset/ligatures.rb,
lib/typeset/small_caps.rb,
lib/typeset/punctuation.rb,
lib/typeset/hanging_punctuation.rb

Overview

Contains all of our typeset-related class methods. Mix this module into a class, or just call `Typeset.typeset` directly.

Defined Under Namespace

Constant Summary

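DefaultMethods = The default typesetting methods, applied in the order listed. Each entry pairs a method name with a flag: true runs the method on each text node of the parsed HTML, false runs it on the whole HTML string. Add new methods here in whatever order makes sense.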

[
  [:quotes, true],
  [:hanging_punctuation, true],
  [:spaces, true],
  [:small_caps, true],
  [:ligatures, false],
  [:punctuation, false],
  [:hyphenate, true]
]

DefaultOptions =

{
  :disable => [],
  :language => "en_us"
}
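Several methods on this page fill in missing keys with `||=`, which writes into whatever hash was passed, including the shared `DefaultOptions` when no options are given. A minimal sketch of one way to avoid that, assuming a hypothetical helper `resolve_options` and a stand-in constant `DEFAULT_OPTIONS`, neither of which is part of the library:

```ruby
# Stand-in for Typeset::DefaultOptions (values copied from the constant above).
DEFAULT_OPTIONS = { :disable => [], :language => "en_us" }.freeze

# Hypothetical helper: build a fresh options hash per call so that neither
# the defaults nor the caller's hash is modified by later `||=` writes.
def resolve_options(user_options = {})
  DEFAULT_OPTIONS.merge(user_options)
end

opts = resolve_options(:language => "en_gb")
opts[:ligatures] ||= %w{ffi ffl fi fl ff} # lands in the copy only
```

`Hash#merge` returns a new hash, so the frozen defaults stay untouched and caller-supplied keys take precedence.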

Ligatures = Map of raw text sequences to unicode ligatures

{
  'ffi' => 'ﬃ',
  'ffl' => 'ﬄ',
  'fi' => 'ﬁ',
  'fl' => 'ﬂ',
  'st' => 'ﬆ',
  'ff' => 'ﬀ',
  'ue' => 'ᵫ'
}

DefaultLigatures = List of ligatures to process by default

%w{ffi ffl fi fl ff}

Class Method Summary

.apply_to_text_nodes(html, &func) ⇒ Object

Parse an HTML fragment with Nokogiri and apply a function to all of the descendant text nodes.

.hanging_punctuation(text, options) ⇒ Object

Add push/pull spans for hanging punctuation to text.

.hyphenate(text, options) ⇒ Object

Hyphenate text, inserting soft hyphenation markers.

.ligatures(text, options) ⇒ Object

Find and replace sequences of text with their Unicode ligature equivalents.

.punctuation(text, options) ⇒ Object

Make dashes, ellipses, and start/end punctuation a little prettier.

.quotes(text, options) ⇒ Object

A poor man’s Smarty Pants implementation.

.small_caps(text, options) ⇒ Object

Identify likely acronyms, and wrap them in a ‘small-caps’ span.

.spaces(text, options) ⇒ Object

Replace wide (normal) spaces around math operators with thin spaces (U+2009).

.typeset(html, options = Typeset::DefaultOptions) ⇒ Object

The main entry point for Typeset.

Class Method Details

.apply_to_text_nodes(html, &func) ⇒ `Object`

Parse an HTML fragment with Nokogiri and apply a function to all of the descendant text nodes.

# File 'lib/typeset.rb', line 15

def self.apply_to_text_nodes(html, &func)
  doc = Nokogiri::HTML("<div id='rtypeset_internal'>#{html}</div>", nil,"UTF-8",Nokogiri::XML::ParseOptions::NOENT)
  doc.search('//text()').each do |node|
    old_content = node.content
    new_content = func.call(node.content.strip)
    if old_content =~ /^(\s+)/
      new_content = " #{new_content}"
    end
    if old_content =~ /(\s+)$/
      new_content = "#{new_content} "
    end
    node.replace(new_content)
  end
  content = doc.css("#rtypeset_internal")[0].children.map { |child| child.to_html }
  return content.join("")
end
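The whitespace handling in the listing above (strip the node, transform it, then restore a single leading and trailing space) can be looked at in isolation. This is a sketch without the Nokogiri parsing step; the helper name `transform_preserving_edges` is hypothetical, and it uses the string anchors `\A` and `\z` where the listing uses the line anchors `^` and `$`:

```ruby
# Hypothetical helper mirroring the edge-whitespace logic of
# apply_to_text_nodes for a single piece of text.
def transform_preserving_edges(content)
  result = yield(content.strip)
  # Any run of leading or trailing whitespace collapses to one space.
  result = " #{result}" if content =~ /\A\s+/
  result = "#{result} " if content =~ /\s+\z/
  result
end

transform_preserving_edges("  hello world\n") { |s| s.upcase }
# => " HELLO WORLD "
```

Note that runs of whitespace, including newlines, come back as a single space, which is usually harmless in HTML but does change the raw text.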

.hanging_punctuation(text, options) ⇒ `Object`

Add push/pull spans for hanging punctuation to text.

# File 'lib/typeset/hanging_punctuation.rb', line 27

def self.hanging_punctuation(text, options)
  return text if text.length < 2

  aligns = "CcOoYTAVvWwY".split('')
  words = text.split(/\s+/)
  words.each_with_index do |word, i|
    [[aligns, false],
     [HangingPunctuation::SingleWidth, 'single'],
     [HangingPunctuation::DoubleWidth, 'double']].each do |pair|
      pair[0].each do |signal|
        if word[0] == signal
          words[i] = "#{HangingPunctuation.pull(pair[1], signal)}#{word.slice(1,word.length)}"

          if not words[i-1].nil?
            words[i-1] = "#{words[i-1]}#{HangingPunctuation.push(pair[1] ? pair[1] : signal)}"
          end
        end
      end
    end
  end

  return words.join(" ")
end

.hyphenate(text, options) ⇒ `Object`

Hyphenate text, inserting soft hyphenation markers. Specify the language for hyphenation by passing an options hash to your typeset call, e.g.:

Typeset.typeset("do hyphenation on this", {:language => "en_gb"})

# File 'lib/typeset/hyphenate.rb', line 8

def self.hyphenate(text, options)
  options[:language] ||= 'en_us'
  hyphen = Text::Hyphen.new(:language => options[:language], :left => 0, :right => 0)

  text = hyphen.visualise(text, "\u00AD")

  return text
end
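What "inserting soft hyphenation markers" produces can be illustrated without the Text::Hyphen dependency. In this sketch the break positions are supplied by hand and are purely hypothetical; in the library they come from the hyphenation patterns for the chosen language. The helper name `insert_soft_hyphens` is also an assumption:

```ruby
SOFT_HYPHEN = "\u00AD" # invisible unless a line break occurs at that point

# Hypothetical helper: insert a soft hyphen before each character index in
# break_points. Working from the right keeps earlier indices valid.
def insert_soft_hyphens(word, break_points)
  result = word.dup
  break_points.sort.reverse_each { |index| result.insert(index, SOFT_HYPHEN) }
  result
end

insert_soft_hyphens("hyphenation", [2, 6])
# => "hy\u00ADphen\u00ADation"
```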

.ligatures(text, options) ⇒ `Object`

Find and replace sequences of text with their Unicode ligature equivalents. Override the set of ligatures to find by passing in a custom options hash, e.g.:

Typeset.typeset("flue", {:ligatures => ["fl", "ue"]})
# -> returns "ﬂᵫ"

# File 'lib/typeset/ligatures.rb', line 21

def self.ligatures(text, options)
  options[:ligatures] ||= DefaultLigatures

  options[:ligatures].each do |ligature|
    text.gsub!(ligature, Ligatures[ligature])
  end

  return text
end
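Because the replacements run one after another, the order of the list matters: longer sequences such as `ffi` need to come before their substrings such as `fi`. The sketch below shows this with a non-mutating variant; `LIGATURE_MAP` and `ligatures_sketch` are hypothetical names, with the map values copied from the `Ligatures` constant:

```ruby
# Subset of the Ligatures constant, written with escapes for clarity.
LIGATURE_MAP = {
  'ffi' => "\uFB03", 'ffl' => "\uFB04",
  'fi'  => "\uFB01", 'fl'  => "\uFB02", 'ff' => "\uFB00"
}

# Hypothetical non-mutating variant: gsub returns a new string, so the
# caller's text is left alone and frozen strings are accepted.
def ligatures_sketch(text, sequences = %w{ffi ffl fi fl ff})
  sequences.inject(text) { |acc, seq| acc.gsub(seq, LIGATURE_MAP.fetch(seq)) }
end

ligatures_sketch("office floor")        # => "o\uFB03ce \uFB02oor"
ligatures_sketch("office", %w{fi ffi})  # => "of\uFB01ce" (ffi never matches)
```

When passing a custom `:ligatures` list, keeping longer sequences first avoids the second outcome.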

.punctuation(text, options) ⇒ `Object`

Make dashes, ellipses, and start/end punctuation a little prettier.

# File 'lib/typeset/punctuation.rb', line 3

def self.punctuation(text, options)
  # Dashes
  text.gsub!('--', '–')
  text.gsub!(' – ', "\u2009–\u2009")

  # Elipses
  text.gsub!('...', '…')

  # Non-breaking space for start/end punctuation with spaces.
  start_punc = /([«¿¡\[\(]) /
  if text =~ start_punc
    text.gsub!(start_punc, "#{$1}&nbsp;")
  end
  end_punc = / ([\!\?:;\.,‽»\]\)])/
  if text =~ end_punc
    text.gsub!(end_punc,"&nbsp;#{$1}")
  end

  return text
end
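One detail of the listing above is worth noting: the replacement string `"#{$1}&nbsp;"` is interpolated before `gsub!` runs, so every match receives the capture from the preceding `=~` test rather than its own. A sketch of the non-breaking-space step using backreferences instead, under the assumption that each match should keep its own punctuation mark (the helper name `nbsp_punctuation` is hypothetical):

```ruby
# Hypothetical variant of the start/end punctuation step. '\1' in a
# single-quoted replacement refers to the capture of the current match.
def nbsp_punctuation(text)
  text = text.gsub(/([«¿¡\[\(]) /, '\1&nbsp;')
  text.gsub(/ ([!?:;.,‽»\]\)])/, '&nbsp;\1')
end

nbsp_punctuation("¿ Qué ? [ ok ]")
# => "¿&nbsp;Qué&nbsp;? [&nbsp;ok&nbsp;]"
```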

.quotes(text, options) ⇒ `Object`

A poor man’s Smarty Pants implementation. Converts single and double quotes, tick marks, backticks, and primes into prettier Unicode equivalents.

# File 'lib/typeset/quotes.rb', line 4

def self.quotes(text, options)
  # Unencode encoded characters, so our regex mess below works
  text.gsub!('&#39;',"\'")
  text.gsub!('&quot;',"\"")

  if text =~ /(\W|^)"(\S+)/
    text.gsub!(/(\W|^)"(\S+)/, "#{$1}\u201c#{$2}") # beginning "
  end
  if text =~ /(\u201c[^"]*)"([^"]*$|[^\u201c"]*\u201c)/
    text.gsub!(/(\u201c[^"]*)"([^"]*$|[^\u201c"]*\u201c)/, "#{$1}\u201d#{$2}") # ending "
  end
  if text =~ /([^0-9])"/
    text.gsub!(/([^0-9])"/, "#{$1}\u201d") # remaining " at end of word
  end
  if text =~ /(\W|^)'(\S)/
    text.gsub!(/(\W|^)'(\S)/, "#{$1}\u2018#{$2}") # beginning '
  end
  if text =~ /([a-z])'([a-z])/i
    text.gsub!(/([a-z])'([a-z])/i, "#{$1}\u2019#{$2}") # conjunction's possession
  end
  if text =~ /((\u2018[^']*)|[a-z])'([^0-9]|$)/i
    text.gsub!(/((\u2018[^']*)|[a-z])'([^0-9]|$)/i, "#{$1}\u2019#{$3}") # ending '
  end
  if text =~ /(\u2018)([0-9]{2}[^\u2019]*)(\u2018([^0-9]|$)|$|\u2019[a-z])/i
    text.gsub!(/(\u2018)([0-9]{2}[^\u2019]*)(\u2018([^0-9]|$)|$|\u2019[a-z])/i, "\u2019#{$2}#{$3}") # abbrev. years like '93
  end
  if text =~ /(\B|^)\u2018(?=([^\u2019]*\u2019\b)*([^\u2019\u2018]*\W[\u2019\u2018]\b|[^\u2019\u2018]*$))/i
    text.gsub!(/(\B|^)\u2018(?=([^\u2019]*\u2019\b)*([^\u2019\u2018]*\W[\u2019\u2018]\b|[^\u2019\u2018]*$))/i, "#{$1}\u2019") # backwards apostrophe
  end
  text.gsub!(/'''/, "\u2034") # triple prime
  text.gsub!(/("|'')/, "\u2033") # double prime
  text.gsub!(/'/, "\u2032")

  # Allow escaped quotes
  text.gsub!('\\\“','\"')
  text.gsub!('\\\”','\"')
  text.gsub!('\\\’','\'')
  text.gsub!('\\\‘','\'')

  return text
end
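A much-reduced sketch of the first few rules may help when reading the regular expressions above. It covers only opening and closing double quotes and in-word apostrophes, and it uses backreferences (`\\1`) so each match keeps its own captured context. The helper name `smart_quotes_sketch` is hypothetical, and this is not a substitute for the full rule set:

```ruby
# Hypothetical, simplified smart-quote pass (three rules only).
def smart_quotes_sketch(text)
  text = text.gsub(/(\W|^)"(\S)/, "\\1\u201c\\2")       # opening double quote
  text = text.gsub(/"/, "\u201d")                        # remaining double quotes close
  text.gsub(/([a-z])'([a-z])/i, "\\1\u2019\\2")         # apostrophe inside a word
end

smart_quotes_sketch(%q{She said "don't" twice})
# => "She said \u201cdon\u2019t\u201d twice"
```

The order matters here as well: opening quotes are claimed first, so whatever straight double quotes remain can be treated as closing ones.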

.small_caps(text, options) ⇒ `Object`

Identify likely acronyms, and wrap them in a ‘small-caps’ span.

# File 'lib/typeset/small_caps.rb', line 3

def self.small_caps(text, options)
  words = text.split(" ")
  words.each_with_index do |word, i|
    if word =~ /^\W*([[:upper:]][[:upper:]][[:upper:]]+)\W*/
      leading,trailing = word.split($1)
      words[i] = "#{leading}<span class=\"small-caps\">#{$1}</span>#{trailing}"
    end
  end
  return words.map { |x| x.strip }.join(" ")
end
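The acronym rule (three or more consecutive uppercase letters, optionally preceded by punctuation) can be tried out on its own. This sketch uses `sub` with a block rather than `split`; the helper name `small_caps_sketch` is hypothetical, and the span markup is copied from the listing above:

```ruby
# Hypothetical variant of the acronym wrapper. Inside the block, $1 and $2
# refer to the captures of the word currently being processed.
def small_caps_sketch(text)
  wrapped = text.split(" ").map do |word|
    word.sub(/\A(\W*)([[:upper:]]{3,})/) do
      "#{$1}<span class=\"small-caps\">#{$2}</span>"
    end
  end
  wrapped.join(" ")
end

small_caps_sketch("The (NASA) report")
# => "The (<span class=\"small-caps\">NASA</span>) report"
```

Two-letter words such as "UK" are left alone, matching the three-letter minimum in the original pattern.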

.spaces(text, options) ⇒ `Object`

Replace wide (normal) spaces around math operators with thin spaces (U+2009).

# File 'lib/typeset/spaces.rb', line 3

def self.spaces(text, options)
  text.gsub!(" / ", "\u2009/\u2009")
  text.gsub!(" × ", "\u2009×\u2009")
  text.gsub!(" % ", "\u2009%\u2009")
  text.gsub!(" + ", "\u2009+\u2009")

  return text
end
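The four substitutions above share one shape, so they can be folded into a single pattern. A sketch of that idea, assuming the same four operators and a hypothetical helper name `thin_operator_spaces`:

```ruby
THIN_SPACE = "\u2009"

# Hypothetical single-pass variant: match a space, one of the operators,
# and a space, then swap both spaces for thin spaces.
def thin_operator_spaces(text)
  text.gsub(%r{ ([/×%+]) }) { "#{THIN_SPACE}#{$1}#{THIN_SPACE}" }
end

thin_operator_spaces("3 + 4 / 2")
# => "3\u2009+\u20094\u2009/\u20092"
```

Unlike `gsub!`, this returns a new string and leaves the argument unmodified.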

.typeset(html, options = Typeset::DefaultOptions) ⇒ `Object`

The main entry point for Typeset. Pass in raw HTML or text, along with an optional options hash. Methods named in the :disable option are skipped.

# File 'lib/typeset.rb', line 51

def self.typeset(html, options=Typeset::DefaultOptions)
  methods = Typeset::DefaultMethods.dup
  options[:disable] ||= DefaultOptions[:disable]
  methods.reject! { |method| options[:disable].include?(method[0]) }

  methods.each do |func, use_text_nodes|
    new_html = html
    if use_text_nodes
      new_html = Typeset.apply_to_text_nodes(html) { |content| Typeset.send(func, content, options) }
    else
      new_html = Typeset.send(func, html, options).strip
    end
    html = new_html
  end
  return html
end
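The control flow of the listing above is a small pipeline: take the ordered method list, drop anything named in `:disable`, and feed the output of each step into the next. A self-contained sketch of that flow, using two hypothetical stand-in steps (`STEPS`, `ORDER`, and `run_pipeline` are illustrative names, and the Nokogiri text-node handling is left out):

```ruby
# Hypothetical stand-ins for two typesetting steps.
STEPS = {
  :spaces      => lambda { |text, _opts| text.gsub(" + ", "\u2009+\u2009") },
  :punctuation => lambda { |text, _opts| text.gsub("...", "\u2026") }
}
ORDER = [:spaces, :punctuation]

# Run every step that is not disabled, threading the text through in order.
def run_pipeline(text, options = {})
  disabled = options.fetch(:disable, [])
  ORDER.reject { |name| disabled.include?(name) }
       .inject(text) { |acc, name| STEPS.fetch(name).call(acc, options) }
end

run_pipeline("1 + 1...")                        # => "1\u2009+\u20091\u2026"
run_pipeline("1 + 1...", :disable => [:spaces]) # => "1 + 1\u2026"
```

With the library itself, the equivalent of the second call would pass `:disable => [:spaces]` to `Typeset.typeset`.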
