Class: Rocco

Inherits: Object

Object > Rocco

Includes: CommentStyles

Defined in: lib/rocco.rb, lib/rocco/tasks.rb, lib/rocco/version.rb, lib/rocco/comment_styles.rb

Overview

Reopen the Rocco class and add a `make` class method. This is a simple bit of sugar over `Rocco::Task.new`. If you want your Rake task to be named something other than `:rocco`, you can use `Rocco::Task` directly.

Defined Under Namespace

Modules: CommentStyles

Classes: Layout, Task

Constant Summary

MD_BLUECLOTH = defined?(BlueCloth) && Markdown == BlueCloth

VERSION = '1.0.0'

Constants included from CommentStyles

CommentStyles::COMMENT_STYLES, CommentStyles::C_STYLE_COMMENTS

Instance Attribute Summary

#file ⇒ Object readonly

The filename as given to `Rocco.new`.
#options ⇒ Object readonly

The merged options hash.
#sections ⇒ Object readonly

A list of two-tuples representing each section of the source file.
#sources ⇒ Object readonly

A list of all source filenames included in the documentation set.

Class Method Summary

.make(dest = 'docs/', source_files = 'lib/**/*.rb', options = {}) ⇒ Object

Instance Method Summary

#detect_language ⇒ Object
#docblock(docs) ⇒ Object

Take a list of block comments and convert Docblock @annotations to Markdown syntax.
#generate_comment_chars ⇒ Object

Given a file’s language, we should be able to autopopulate the `comment_chars` variables for single-line comments.
#highlight(blocks) ⇒ Object

Take the result of `split` and apply Markdown formatting to comments and syntax highlighting to source code.
#highlight_pygmentize(code) ⇒ Object

We `popen` a read/write pygmentize process in the parent and then fork off a child process to write the input.
#highlight_webservice(code) ⇒ Object

Pygments is not one of those things that’s trivial for a Ruby user to install, so we’ll fall back on a web service to highlight the code if it isn’t available.
#initialize(filename, sources = [], options = {}) ⇒ Rocco constructor

A new instance of Rocco.
#normalize_leading_spaces(sections) ⇒ Object

Normalizes documentation whitespace by checking for leading whitespace, removing it, and then removing the same amount of whitespace from each succeeding line.
#parse(data) ⇒ Object

Parse the raw file data into a list of two-tuples.
#process_markdown(text) ⇒ Object

Convert Markdown to classy HTML.
#pygmentize? ⇒ Boolean

Returns `true` if `pygmentize` is available locally, `false` otherwise.
#read_with_encoding(filename) ⇒ Object

Read the file, in the encoding given by `@options[:encoding]`, into a string encoded in UTF-8.
#split(sections) ⇒ Object

Take the list of paired section two-tuples and split it into two separate lists: one holding the comments with leaders removed and one with the code blocks.
#to_html ⇒ Object

Constructor Details

#initialize(filename, sources = [], options = {}) ⇒ `Rocco`

Returns a new instance of Rocco.

# File 'lib/rocco.rb', line 86

def initialize(filename, sources=[], options={})
  @file       = filename
  @sources    = sources

  defaults =  {
    :language      => 'ruby',
    :comment_chars => '#',
    :template_file => nil,
    :stylesheet    => 'http://jashkenas.github.com/docco/resources/docco.css',
    :encoding      => 'UTF-8'
  }
  @options = defaults.merge(options)

  # When `block` is given, it must read the contents of the file using
  # whatever means necessary and return it as a string. With no `block`,
  # the file is read to retrieve data.
  @data = if block_given? then yield else read_with_encoding(filename) end

  # If we detect a language
  if "text" != detect_language
    # then assign the detected language to `:language`, and look for
    # comment characters based on that language
    @options[:language] = detect_language
    @options[:comment_chars] = generate_comment_chars

  # If we didn't detect a language, but the user provided one, use it
  # to look around for comment characters to override the default.
  elsif @options[:language]
    @options[:comment_chars] = generate_comment_chars

  # If neither is true, then convert the default comment character string
  # into the comment_char syntax (we'll discuss that syntax in detail when
  # we get to `generate_comment_chars()` in a moment).
  else
    @options[:comment_chars] = { :single => @options[:comment_chars], :multi => nil }
  end

  # Turn `:comment_chars` into a regex matching a series of spaces, the
  # `:comment_chars` string, and then an optional space.  We'll use that
  # to detect single-line comments.
  @comment_pattern = Regexp.new("^\\s*#{@options[:comment_chars][:single]}\\s?")

  # `parse()` the file contents stored in `@data`.  Run the result through
  # `split()` and that result through `highlight()` to generate the final
  # section list.
  @sections = highlight(split(parse(@data)))
end
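
The constructor's two core steps, merging user options over defaults and building the single-line comment pattern, can be tried in isolation. This is a minimal sketch, not Rocco's code: `default_options`, `merged_options`, and `comment_pattern` are hypothetical names, and the sketch escapes the comment string with `Regexp.escape`, which the constructor above does not do.

```ruby
# Hypothetical stand-in for the constructor's `defaults` hash (abbreviated).
def default_options
  { :language => 'ruby', :comment_chars => '#', :encoding => 'UTF-8' }
end

# User-supplied keys win; keys the user omits keep their default values.
def merged_options(options = {})
  default_options.merge(options)
end

# Build a pattern matching leading whitespace, the comment string,
# and one optional whitespace character.
def comment_pattern(single)
  Regexp.new("^\\s*#{Regexp.escape(single)}\\s?")
end

opts    = merged_options(:language => 'js', :comment_chars => '//')
pattern = comment_pattern(opts[:comment_chars])

puts opts[:encoding]                       # the default survives the merge
puts "  // hello".sub(pattern, '')         # the comment leader is stripped
puts "var x = 1;".match(pattern).inspect   # a code line does not match
```

Escaping matters for comment strings such as `*` or `(*`, which would otherwise be read as regex syntax.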

Instance Attribute Details

#file ⇒ `Object` (readonly)

The filename as given to `Rocco.new`.



# File 'lib/rocco.rb', line 135

def file
  @file
end

#options ⇒ `Object` (readonly)

The merged options hash.



# File 'lib/rocco.rb', line 138

def options
  @options
end

#sections ⇒ `Object` (readonly)

A list of two-tuples representing each section of the source file. Each item in the list has the form `[docs_html, code_html]`, where both elements are strings containing the documentation and source code HTML, respectively.



# File 'lib/rocco.rb', line 144

def sections
  @sections
end

#sources ⇒ `Object` (readonly)

A list of all source filenames included in the documentation set. Useful for building an index of other files.



# File 'lib/rocco.rb', line 148

def sources
  @sources
end

Class Method Details

.make(dest = 'docs/', source_files = 'lib/**/*.rb', options = {}) ⇒ `Object`



# File 'lib/rocco/tasks.rb', line 54

def self.make(dest='docs/', source_files='lib/**/*.rb', options={})
  Task.new(:rocco, dest, source_files, options)
end

Instance Method Details

#detect_language ⇒ `Object`

# File 'lib/rocco.rb', line 187

def detect_language
  ext = File.extname(@file).slice(1..-1)
  @_language ||=
    if pygmentize?
      %x[pygmentize -N #{@file}].strip.split('+').first
    elsif !COMMENT_STYLES[ext].nil?
      ext
    else
      "text"
    end
end
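
When `pygmentize` is unavailable, detection falls back to the file extension. The following is a minimal sketch of that fallback alone; `known_extensions` is a hypothetical stand-in for the keys of `COMMENT_STYLES`, and `language_from_extension` is an illustrative name.

```ruby
# Hypothetical list standing in for the keys of COMMENT_STYLES.
def known_extensions
  ["rb", "py", "js"]
end

# Return the extension (without the dot) when it is known, else "text".
def language_from_extension(filename)
  ext = File.extname(filename).slice(1..-1)   # nil when there is no extension
  known_extensions.include?(ext) ? ext : "text"
end

puts language_from_extension("lib/rocco.rb")
puts language_from_extension("README")
puts language_from_extension("notes.xyz")
```

A file with no extension yields `nil` from `slice`, which is simply not found in the table, so no special case is needed.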

#docblock(docs) ⇒ `Object`

Take a list of block comments and convert Docblock @annotations to Markdown syntax.

# File 'lib/rocco.rb', line 385

def docblock(docs)
  docs.map do |doc|
    doc.split("\n").map do |line|
      line.match(/^@\w+/) ? line.sub(/^@(\w+)\s+/, '> **\1** ')+"  " : line
    end.join("\n")
  end
end
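
To see what the conversion produces, the same substitution can be run as a free-standing method. This is a sketch for illustration: `docblock_to_markdown` is a hypothetical name, and the sample comment is made up.

```ruby
# Same transformation as `docblock` above, as a free-standing method.
def docblock_to_markdown(docs)
  docs.map do |doc|
    doc.split("\n").map do |line|
      # Lines starting with an @annotation become Markdown blockquote lines;
      # the two trailing spaces ask Markdown for a hard line break.
      line.match(/^@\w+/) ? line.sub(/^@(\w+)\s+/, '> **\1** ') + "  " : line
    end.join("\n")
  end
end

input = ["Adds two numbers.\n@param a the first number\n@return the sum"]
puts docblock_to_markdown(input).first
```

Lines that do not start with `@` pass through unchanged, so ordinary prose in the comment is unaffected.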

#generate_comment_chars ⇒ `Object`

Given a file’s language, we should be able to autopopulate the `comment_chars` variables for single-line comments. If we don’t have comment characters on record for a given language, we’ll use the user-provided `:comment_chars` option (which defaults to `#`).

Comment characters are listed as:

{ :single => "//",
  :multi  => { :start  => "/**",
               :middle => "*",
               :end    => "*/" } }

`:single` denotes the leading character of a single-line comment. `[:multi][:start]` denotes the string that should appear alone on a line of code to begin a block of documentation. `[:multi][:middle]` denotes the leading character of block comment content, and `[:multi][:end]` is the string that ought to appear alone on a line to close a block of documentation. That is:

/**                 [:multi][:start]
 *                  [:multi][:middle]
 ...
 *                  [:multi][:middle]
 */                 [:multi][:end]

If a language only has one type of comment, the missing type should be assigned `nil`.

At the moment, we’re only returning `:single`. Consider this groundwork for block comment parsing.

# File 'lib/rocco.rb', line 230

def generate_comment_chars
  @_commentchar ||=
    if COMMENT_STYLES[@options[:language]]
      COMMENT_STYLES[@options[:language]]
    else
      { :single => @options[:comment_chars], :multi => nil, :heredoc => nil }
    end
end
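
The lookup-with-fallback can be sketched on its own. The table below is a hypothetical, abbreviated stand-in for `COMMENT_STYLES`; its entries use the nested `[:multi][:start]` keys that `highlight` and `parse` read, and `comment_chars_for` is an illustrative name.

```ruby
# Hypothetical, abbreviated style table; the real one lives in CommentStyles.
def comment_styles
  {
    "js" => { :single  => "//",
              :multi   => { :start => "/**", :middle => "*", :end => "*/" },
              :heredoc => nil }
  }
end

# Use the recorded style when the language is known; otherwise wrap the
# user-provided single-line comment string in the same hash shape.
def comment_chars_for(language, fallback = "#")
  comment_styles[language] ||
    { :single => fallback, :multi => nil, :heredoc => nil }
end

p comment_chars_for("js")[:multi][:start]
p comment_chars_for("unknown")
p comment_chars_for("unknown", ";;")[:single]
```

Returning the same hash shape in both branches lets callers test `[:single]` and `[:multi]` for `nil` without caring where the style came from.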

#highlight(blocks) ⇒ `Object`

Take the result of `split` and apply Markdown formatting to comments and syntax highlighting to source code.

# File 'lib/rocco.rb', line 395

def highlight(blocks)
  docs_blocks, code_blocks = blocks

  # Pre-process Docblock @annotations.
  docs_blocks = docblock(docs_blocks) if @options[:docblocks]

  # Combine all docs blocks into a single big markdown document with section
  # dividers and run through the Markdown processor. Then split it back out
  # into separate sections.
  markdown = docs_blocks.join("\n\n##### DIVIDER\n\n")
  docs_html = process_markdown(markdown).split(/\n*<h5>DIVIDER<\/h5>\n*/m)

  # Combine all code blocks into a single big stream with section dividers and
  # run through either `pygmentize(1)` or <http://pygments.appspot.com>
  span, espan = '<span class="c.?">', '</span>'
  if @options[:comment_chars][:single]
    front = @options[:comment_chars][:single]
    divider_input  = "\n\n#{front} DIVIDER\n\n"
    divider_output = Regexp.new(
      [ "\\n*",
        span,
        Regexp.escape(CGI.escapeHTML(front)),
        ' DIVIDER',
        espan,
        "\\n*"
      ].join, Regexp::MULTILINE
    )
  else
    front = @options[:comment_chars][:multi][:start]
    back  = @options[:comment_chars][:multi][:end]
    divider_input  = "\n\n#{front}\nDIVIDER\n#{back}\n\n"
    divider_output = Regexp.new(
      [ "\\n*",
        span, Regexp.escape(CGI.escapeHTML(front)), espan,
        "\\n",
        span, "DIVIDER", espan,
        "\\n",
        span, Regexp.escape(CGI.escapeHTML(back)), espan,
        "\\n*"
      ].join, Regexp::MULTILINE
    )
  end

  code_stream = code_blocks.join(divider_input)

  code_html =
    if pygmentize?
      highlight_pygmentize(code_stream)
    else
      highlight_webservice(code_stream)
    end

  # Do some post-processing on the pygments output to split things back
  # into sections and remove partial `<pre>` blocks.
  code_html = code_html.
    split(divider_output).
    map { |code| code.sub(/\n?<div class="highlight"><pre>/m, '') }.
    map { |code| code.sub(/\n?<\/pre><\/div>\n/m, '') }

  # Lastly, combine the docs and code lists back into a list of two-tuples.
  docs_html.zip(code_html)
end
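
The divider trick used for both docs and code (join every block into one document, process it once, then split on the rendered divider) can be sketched without Markdown or Pygments. In this sketch `fake_markdown` is a deliberately tiny stand-in for a real Markdown processor, and `render_all` is a hypothetical name.

```ruby
# Stand-in for a Markdown processor: it only knows paragraphs and h5 headings.
def fake_markdown(text)
  text.split("\n\n").map do |block|
    block =~ /\A##### (.*)\z/ ? "<h5>#{$1}</h5>" : "<p>#{block}</p>"
  end.join("\n\n")
end

# Join all blocks with a divider, render once, then split on the rendered divider.
def render_all(docs_blocks)
  combined = docs_blocks.join("\n\n##### DIVIDER\n\n")
  fake_markdown(combined).split(/\n*<h5>DIVIDER<\/h5>\n*/m)
end

docs_html = render_all(["First section.", "Second section."])
code_html = ["<span>one</span>", "<span>two</span>"]   # pretend highlighter output

# Pair each docs fragment with its code fragment, as `highlight` does with `zip`.
docs_html.zip(code_html).each { |docs, code| puts "#{docs} | #{code}" }
```

Processing everything in one pass means a single call to the (possibly external) processor instead of one call per section.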

#highlight_pygmentize(code) ⇒ `Object`

We `popen` a read/write pygmentize process in the parent and then fork off a child process to write the input.

# File 'lib/rocco.rb', line 465

def highlight_pygmentize(code)
  code_html = nil
  open("|pygmentize -l #{@options[:language]} -O encoding=utf-8 -f html", 'r+') do |fd|
    pid =
      fork {
        fd.close_read
        fd.write code
        fd.close_write
        exit!
      }
    fd.close_write
    code_html = fd.read
    fd.close_read
    Process.wait(pid)
  end

  code_html
end
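
The fork exists so that writing the input and reading the output can happen concurrently. Where `fork` is unavailable, one alternative is `Open3.capture2` from the standard library, which feeds standard input and collects standard output for you. The following is a sketch of that swapped-in technique, not Rocco's implementation; `pipe_through` is a hypothetical name, and a Ruby one-liner stands in for `pygmentize` so the example runs without Pygments.

```ruby
require 'open3'
require 'rbconfig'

# Pipe `input` through an external command and return its standard output.
def pipe_through(command, input)
  output, status = Open3.capture2(*command, :stdin_data => input)
  raise "command failed: #{command.inspect}" unless status.success?
  output
end

# Stand-in filter: a Ruby one-liner that upcases whatever it reads on stdin.
# With Pygments installed, the command could instead be something like
# ["pygmentize", "-l", "ruby", "-O", "encoding=utf-8", "-f", "html"].
filter = [RbConfig.ruby, "-e", "print STDIN.read.upcase"]
puts pipe_through(filter, "def hello; end")
```

Passing the command as an array also avoids shell interpolation of the language name.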

#highlight_webservice(code) ⇒ `Object`

Pygments is not one of those things that’s trivial for a Ruby user to install, so we’ll fall back on a web service to highlight the code if it isn’t available.

# File 'lib/rocco.rb', line 486

def highlight_webservice(code)
  url = URI.parse 'http://pygments.appspot.com/'
  options = { 'lang' => @options[:language], 'code' => code}
  Net::HTTP.post_form(url, options).body
end

#normalize_leading_spaces(sections) ⇒ `Object`

Normalizes documentation whitespace by checking for leading whitespace, removing it, and then removing the same amount of whitespace from each succeeding line. That is:

def func():
  """
    Comment 1
    Comment 2
  """
  print "omg!"

should yield a comment block of `Comment 1\nComment 2` and code of `def func():\n  print "omg!"`

# File 'lib/rocco.rb', line 355

def normalize_leading_spaces(sections)
  sections.map do |section|
    if section.any? && section[0].any?
      leading_space = section[0][0].match(/^\s+/)
      if leading_space
        section[0] =
          section[0].map{ |line| line.sub(/^#{leading_space.to_s}/, '') }
      end
    end
    section
  end
end
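
The normalization can be checked against the example in the description with a small free-standing sketch. `strip_common_indent` is a hypothetical name; unlike the method above, it escapes the captured indent before building the pattern and returns new arrays instead of modifying the section in place.

```ruby
# Remove the first docs line's leading whitespace from every docs line.
# Code lines are left untouched.
def strip_common_indent(sections)
  sections.map do |docs, code|
    indent = docs.first && docs.first[/^\s+/]   # nil when there is nothing to strip
    docs = docs.map { |line| line.sub(/^#{Regexp.escape(indent)}/, '') } if indent
    [docs, code]
  end
end

sections = [
  [["    Comment 1", "    Comment 2"], ["def func():", "  print \"omg!\""]]
]

docs, code = strip_common_indent(sections).first
puts docs.join("\n")
puts code.join("\n")
```

Only the amount of indentation found on the first line is removed, so deeper indentation on later lines (for example a nested list) is preserved relative to it.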

#parse(data) ⇒ `Object`

Parse the raw file data into a list of two-tuples. Each tuple has the form `[docs, code]` where both elements are arrays containing the raw lines parsed from the input file, comment characters stripped.

# File 'lib/rocco.rb', line 245

def parse(data)
  sections, docs, code, lines = [], [], [], data.split("\n")

  # The first line is ignored if it is a shebang line.  We also ignore the
  # PEP 263 encoding information in python sourcefiles, and the similar ruby
  # 1.9 syntax.
  lines.shift if lines[0] =~ /^\#\!/
  lines.shift if lines[0] =~ /coding[:=]\s*[-\w.]+/ &&
                 [ "python", "rb" ].include?(@options[:language])

  # To detect both block comments and single-line comments, we'll set
  # up a tiny state machine, and loop through each line of the file.
  # This requires an `in_comment_block` boolean, and a few regular
  # expressions for line tests.  We'll do the same for fake heredoc parsing.
  in_comment_block = false
  in_heredoc = false
  single_line_comment, block_comment_start, block_comment_mid, block_comment_end =
    nil, nil, nil, nil
  if not @options[:comment_chars][:single].nil?
    single_line_comment = Regexp.new("^\\s*#{Regexp.escape(@options[:comment_chars][:single])}\\s?")
  end
  if not @options[:comment_chars][:multi].nil?
    block_comment_start = Regexp.new("^\\s*#{Regexp.escape(@options[:comment_chars][:multi][:start])}\\s*$")
    block_comment_end   = Regexp.new("^\\s*#{Regexp.escape(@options[:comment_chars][:multi][:end])}\\s*$")
    block_comment_one_liner = Regexp.new("^\\s*#{Regexp.escape(@options[:comment_chars][:multi][:start])}\\s*(.*?)\\s*#{Regexp.escape(@options[:comment_chars][:multi][:end])}\\s*$")
    block_comment_start_with = Regexp.new("^\\s*#{Regexp.escape(@options[:comment_chars][:multi][:start])}\\s*(.*?)$")
    block_comment_end_with = Regexp.new("\\s*(.*?)\\s*#{Regexp.escape(@options[:comment_chars][:multi][:end])}\\s*$")
    if @options[:comment_chars][:multi][:middle]
      block_comment_mid = Regexp.new("^\\s*#{Regexp.escape(@options[:comment_chars][:multi][:middle])}\\s?")
    end
  end
  if not @options[:comment_chars][:heredoc].nil?
    heredoc_start = Regexp.new("#{Regexp.escape(@options[:comment_chars][:heredoc])}(\\S+)$")
  end
  lines.each do |line|
    # If we're currently in a comment block, check whether the line matches
    # the _end_ of a comment block or the _end_ of a comment block with a
    # comment.
    if in_comment_block
      if block_comment_end && line.match(block_comment_end)
        in_comment_block = false
      elsif block_comment_end_with && line.match(block_comment_end_with)
        in_comment_block = false
        docs << line.match(block_comment_end_with).captures.first.
                      sub(block_comment_mid || '', '')
      else
        docs << line.sub(block_comment_mid || '', '')
      end
    # If we're currently in a heredoc, we're looking for the end of the
    # heredoc, and everything it contains is code.
    elsif in_heredoc
      if line.match(Regexp.new("^#{Regexp.escape(in_heredoc)}$"))
        in_heredoc = false
      end
      code << line
    # Otherwise, check whether the line starts a heredoc. If so, note the end
    # pattern, and the line is code.  Otherwise check whether the line matches
    # the beginning of a block, or a single-line comment all on its lonesome.
    # In either case, if there's code, start a new section.
    else
      if heredoc_start && line.match(heredoc_start)
        in_heredoc = $1
        code << line
      elsif block_comment_one_liner && line.match(block_comment_one_liner)
        if code.any?
          sections << [docs, code]
          docs, code = [], []
        end
        docs << line.match(block_comment_one_liner).captures.first
      elsif block_comment_start && line.match(block_comment_start)
        in_comment_block = true
        if code.any?
          sections << [docs, code]
          docs, code = [], []
        end
      elsif block_comment_start_with && line.match(block_comment_start_with)
        in_comment_block = true
        if code.any?
          sections << [docs, code]
          docs, code = [], []
        end
        docs << line.match(block_comment_start_with).captures.first
      elsif single_line_comment && line.match(single_line_comment)
        if code.any?
          sections << [docs, code]
          docs, code = [], []
        end
        docs << line.sub(single_line_comment || '', '')
      else
        code << line
      end
    end
  end
  sections << [docs, code] if docs.any? || code.any?
  normalize_leading_spaces(sections)
end
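
Stripped of block comments and heredocs, the sectioning rule is: a comment line that follows code starts a new section. The following is a minimal sketch of just that rule for single-line comments; `split_sections` is a hypothetical name, and the sample source is made up.

```ruby
# Group lines into [docs, code] pairs using single-line comments only.
def split_sections(source, comment = "#")
  pattern = Regexp.new("^\\s*#{Regexp.escape(comment)}\\s?")
  sections, docs, code = [], [], []

  source.split("\n").each do |line|
    if line.match(pattern)
      # A comment that follows code closes the current section.
      if code.any?
        sections << [docs, code]
        docs, code = [], []
      end
      docs << line.sub(pattern, '')
    else
      code << line
    end
  end

  # Flush whatever is left once the input ends.
  sections << [docs, code] if docs.any? || code.any?
  sections
end

source = "# Greets.\ndef hi\n  puts 'hi'\nend\n# Parts.\ndef bye\nend"
split_sections(source).each { |docs, code| p [docs, code] }
```

Consecutive comment lines accumulate in the same `docs` array because `code` is still empty when they are seen.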

#process_markdown(text) ⇒ `Object`

Convert Markdown to classy HTML.



# File 'lib/rocco.rb', line 459

def process_markdown(text)
  if MD_BLUECLOTH then Markdown.new(text).to_html else Markdown.new(text, :smart).to_html end
end

#pygmentize? ⇒ `Boolean`

Returns `true` if `pygmentize` is available locally, `false` otherwise.

Returns:

(Boolean)

# File 'lib/rocco.rb', line 173

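def pygmentize?
  # Memoize explicitly: `||=` would re-run `which` whenever the result is false.
  return @_pygmentize unless @_pygmentize.nil?
  pygmentize = `which pygmentize`
  @_pygmentize = !(pygmentize.include?("not found") || pygmentize.empty?)
end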

#read_with_encoding(filename) ⇒ `Object`

Read the file, in the encoding given by `@options[:encoding]`, into a string encoded in UTF-8.

# File 'lib/rocco.rb', line 159

def read_with_encoding filename
  # This works differently in Ruby 1.8 and Ruby 1.9, which are
  # distinguished by checking if `IO#external_encoding` exists.
  if IO.method_defined?("external_encoding")
    File.read(filename, :external_encoding => @options[:encoding],
              :internal_encoding => "UTF-8")
  else
    require 'iconv'
    data = File.read(filename)
    Iconv.conv("UTF-8", @options[:encoding], data)
  end
end
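
On Ruby 1.9 and later only the first branch applies (`iconv` was dropped from the standard library in Ruby 2.0). The transcoding read can be sketched as follows, assuming an ISO-8859-1 input file; `read_as_utf8` is a hypothetical name, and the `"external:internal"` encoding string is an equivalent way to pass the two encodings to `File.read`.

```ruby
require 'tempfile'

# Read `path`, decoding from `source_encoding` and transcoding to UTF-8.
def read_as_utf8(path, source_encoding)
  File.read(path, :encoding => "#{source_encoding}:UTF-8")
end

Tempfile.create("latin1") do |f|
  f.binmode
  f.write("caf\xE9".b)   # "café" in ISO-8859-1: byte 0xE9 is é
  f.flush

  text = read_as_utf8(f.path, "ISO-8859-1")
  puts text.encoding
  puts text == "caf\u00E9"
end
```

Without the source encoding, the lone `0xE9` byte would be read as invalid UTF-8 rather than as `é`.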

#split(sections) ⇒ `Object`

Take the list of paired section two-tuples and split it into two separate lists: one holding the comments with leaders removed and one with the code blocks.

# File 'lib/rocco.rb', line 371

def split(sections)
  docs_blocks, code_blocks = [], []
  sections.each do |docs,code|
    docs_blocks << docs.join("\n")
    code_blocks << code.map do |line|
      tabs = line.match(/^(\t+)/)
      tabs ? line.sub(/^\t+/, '  ' * tabs.captures[0].length) : line
    end.join("\n")
  end
  [docs_blocks, code_blocks]
end
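
Besides separating docs from code, `split` rewrites leading tabs in code lines as two spaces each. The following sketch isolates both steps; `expand_leading_tabs` and `unzip_sections` are hypothetical names, and the sample section is made up.

```ruby
# Replace each leading tab with two spaces; tabs elsewhere are left alone.
def expand_leading_tabs(line)
  tabs = line.match(/^(\t+)/)
  tabs ? line.sub(/^\t+/, '  ' * tabs.captures[0].length) : line
end

# Turn [[docs, code], ...] into [all_docs, all_code], joining lines per section.
def unzip_sections(sections)
  docs_blocks, code_blocks = [], []
  sections.each do |docs, code|
    docs_blocks << docs.join("\n")
    code_blocks << code.map { |line| expand_leading_tabs(line) }.join("\n")
  end
  [docs_blocks, code_blocks]
end

sections = [[["Says hi."], ["def hi", "\tputs 'hi'\t# greet", "end"]]]
docs_blocks, code_blocks = unzip_sections(sections)
p docs_blocks
p code_blocks
```

The two output lists stay index-aligned, which is what lets `highlight` pair them back up with `zip`.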

#to_html ⇒ `Object`



# File 'lib/rocco.rb', line 152

def to_html
  Rocco::Layout.new(self, @options[:stylesheet], @options[:template_file]).render
end
