Class: Rocco

Inherits:
Object
Includes:
CommentStyles
Defined in:
lib/rocco.rb,
lib/rocco/tasks.rb,
lib/rocco/version.rb,
lib/rocco/comment_styles.rb

Overview

Reopen the Rocco class and add a `make` class method. This is a simple bit of sugar over `Rocco::Task.new`. If you want your Rake task to be named something other than `:rocco`, you can use `Rocco::Task` directly.

Defined Under Namespace

Modules: CommentStyles
Classes: Layout, Task

Constant Summary

MD_BLUECLOTH =
defined?(BlueCloth) && Markdown == BlueCloth
VERSION =
'1.0.0'

Constants included from CommentStyles

CommentStyles::COMMENT_STYLES, CommentStyles::C_STYLE_COMMENTS

Constructor Details

#initialize(filename, sources = [], options = {}) ⇒ Rocco

Returns a new instance of Rocco.



# File 'lib/rocco.rb', line 86

def initialize(filename, sources=[], options={})
  @file       = filename
  @sources    = sources

  defaults =  {
    :language      => 'ruby',
    :comment_chars => '#',
    :template_file => nil,
    :stylesheet    => 'http://jashkenas.github.com/docco/resources/docco.css',
    :encoding      => 'UTF-8'
  }
  @options = defaults.merge(options)

  # When `block` is given, it must read the contents of the file using
  # whatever means necessary and return it as a string. With no `block`,
  # the file is read to retrieve data.
  @data = if block_given? then yield else read_with_encoding(filename) end

  # If we detect a language
  if "text" != detect_language
    # then assign the detected language to `:language`, and look for
    # comment characters based on that language
    @options[:language] = detect_language
    @options[:comment_chars] = generate_comment_chars

  # If we didn't detect a language, but the user provided one, use it
  # to look around for comment characters to override the default.
  elsif @options[:language]
    @options[:comment_chars] = generate_comment_chars

  # If neither is true, then convert the default comment character string
  # into the comment_char syntax (we'll discuss that syntax in detail when
  # we get to `generate_comment_chars()` in a moment).
  else
    @options[:comment_chars] = { :single => @options[:comment_chars], :multi => nil }
  end

  # Turn `:comment_chars` into a regex matching a series of spaces, the
  # `:comment_chars` string, and then an optional space.  We'll use that
  # to detect single-line comments.
  @comment_pattern = Regexp.new("^\\s*#{@options[:comment_chars][:single]}\s?")

  # `parse()` the file contents stored in `@data`.  Run the result through
  # `split()` and that result through `highlight()` to generate the final
  # section list.
  @sections = highlight(split(parse(@data)))
end
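
The way user options override the defaults can be sketched in isolation. This is a minimal illustration, not the library's code: `SKETCH_DEFAULTS` and `merge_options` are hypothetical names, and the default values are copied from the constructor above.

```ruby
# Defaults copied from the constructor listing above.
SKETCH_DEFAULTS = {
  :language      => 'ruby',
  :comment_chars => '#',
  :template_file => nil,
  :stylesheet    => 'http://jashkenas.github.com/docco/resources/docco.css',
  :encoding      => 'UTF-8'
}

# Hypothetical helper mirroring `defaults.merge(options)`:
# keys present in `options` win, all other keys keep their default.
def merge_options(options = {})
  SKETCH_DEFAULTS.merge(options)
end

merged = merge_options(:language => 'python', :encoding => 'ISO-8859-1')
puts merged[:language]       # the user-supplied value
puts merged[:comment_chars]  # the untouched default
```

Because `Hash#merge` returns a new hash, the defaults themselves are never modified.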

Instance Attribute Details

#file ⇒ Object (readonly)

The filename as given to `Rocco.new`.



# File 'lib/rocco.rb', line 135

def file
  @file
end

#options ⇒ Object (readonly)

The merged options hash: the defaults overridden by any options passed to the constructor.



# File 'lib/rocco.rb', line 138

def options
  @options
end

#sections ⇒ Object (readonly)

A list of two-tuples representing each section of the source file. Each item in the list has the form `[docs_html, code_html]`, where both elements are strings containing the documentation and source code HTML, respectively.



# File 'lib/rocco.rb', line 144

def sections
  @sections
end

#sources ⇒ Object (readonly)

A list of all source filenames included in the documentation set. Useful for building an index of other files.



# File 'lib/rocco.rb', line 148

def sources
  @sources
end

Class Method Details

.make(dest = 'docs/', source_files = 'lib/**/*.rb', options = {}) ⇒ Object



# File 'lib/rocco/tasks.rb', line 54

def self.make(dest='docs/', source_files='lib/**/*.rb', options={})
  Task.new(:rocco, dest, source_files, options)
end

Instance Method Details

#detect_language ⇒ Object



# File 'lib/rocco.rb', line 187

def detect_language
  ext = File.extname(@file).slice(1..-1)
  @_language ||=
    if pygmentize?
      %x[pygmentize -N #{@file}].strip.split('+').first
    elsif !COMMENT_STYLES[ext].nil?
      ext
    else
      "text"
    end
end
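
The extension-based fallback (the branch taken when `pygmentize` is unavailable) can be sketched on its own. This is an assumption-laden illustration: `SKETCH_COMMENT_STYLES` is a hypothetical two-entry stand-in for the real `COMMENT_STYLES` table, and `detect_by_extension` is not part of Rocco.

```ruby
# Hypothetical, abbreviated table; the real one lives in Rocco::CommentStyles.
SKETCH_COMMENT_STYLES = {
  "rb" => { :single => "#" },
  "js" => { :single => "//" }
}

# Strip the leading dot from the extension and look it up.
# Files without an extension yield nil here, which falls through to "text".
def detect_by_extension(filename)
  ext = File.extname(filename).slice(1..-1)
  SKETCH_COMMENT_STYLES.key?(ext) ? ext : "text"
end

puts detect_by_extension("lib/rocco.rb")
puts detect_by_extension("README")
```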

#docblock(docs) ⇒ Object

Take a list of block comments and convert Docblock @annotations to Markdown syntax.



# File 'lib/rocco.rb', line 385

def docblock(docs)
  docs.map do |doc|
    doc.split("\n").map do |line|
      line.match(/^@\w+/) ? line.sub(/^@(\w+)\s+/, '> **\1** ')+"  " : line
    end.join("\n")
  end
end
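
To see what the conversion produces, the method body can be run standalone on a small comment block. The sample input below is made up for illustration; the function is the one listed above, lifted out of the class.

```ruby
# Same logic as the listing above, as a free-standing function.
def docblock(docs)
  docs.map do |doc|
    doc.split("\n").map do |line|
      # Lines starting with an @annotation become Markdown blockquotes with
      # a bold tag name; two trailing spaces force a Markdown line break.
      line.match(/^@\w+/) ? line.sub(/^@(\w+)\s+/, '> **\1** ') + "  " : line
    end.join("\n")
  end
end

input  = ["Adds two numbers.\n@param a first addend\n@return the sum"]
result = docblock(input).first
puts result
```

Lines that do not start with an annotation pass through unchanged.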

#generate_comment_chars ⇒ Object

Given a file’s language, we should be able to autopopulate the `comment_chars` variables for single-line comments. If we don’t have comment characters on record for a given language, we’ll use the user-provided `:comment_chars` option (which defaults to `#`).

Comment characters are listed as:

{ :single => "//",
  :multi  => { :start  => "/**",
               :middle => "*",
               :end    => "*/" } }

`:single` denotes the leading character of a single-line comment. Within `:multi`, `:start` denotes the string that should appear alone on a line of code to begin a block of documentation, `:middle` denotes the leading character of block comment content, and `:end` is the string that ought to appear alone on a line to close a block of documentation. That is:

/**                 [:multi][:start]
 *                  [:multi][:middle]
 ...
 *                  [:multi][:middle]
 */                 [:multi][:end]

If a language only has one type of comment, the missing type should be assigned `nil`.

At the moment, we’re only returning `:single`. Consider this groundwork for block comment parsing.



# File 'lib/rocco.rb', line 230

def generate_comment_chars
  @_commentchar ||=
    if COMMENT_STYLES[@options[:language]]
      COMMENT_STYLES[@options[:language]]
    else
      { :single => @options[:comment_chars], :multi => nil, :heredoc => nil }
    end
end
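
The lookup-with-fallback can be sketched as follows. `SKETCH_STYLES` is a hypothetical one-entry table standing in for `COMMENT_STYLES`, and `comment_chars_for` is an illustrative name rather than a Rocco method.

```ruby
# Hypothetical stand-in for COMMENT_STYLES, with a single entry.
SKETCH_STYLES = {
  "js" => {
    :single  => "//",
    :multi   => { :start => "/**", :middle => "*", :end => "*/" },
    :heredoc => nil
  }
}

# Return the recorded style for a language, or build one from the
# user-provided single-line comment string when nothing is on record.
def comment_chars_for(language, fallback = "#")
  SKETCH_STYLES[language] || { :single => fallback, :multi => nil, :heredoc => nil }
end

p comment_chars_for("js")[:multi][:start]
p comment_chars_for("cobol", "*")
```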

#highlight(blocks) ⇒ Object

Take the result of `split` and apply Markdown formatting to comments and syntax highlighting to source code.



# File 'lib/rocco.rb', line 395

def highlight(blocks)
  docs_blocks, code_blocks = blocks

  # Pre-process Docblock @annotations.
  docs_blocks = docblock(docs_blocks) if @options[:docblocks]

  # Combine all docs blocks into a single big markdown document with section
  # dividers and run through the Markdown processor. Then split it back out
  # into separate sections.
  markdown = docs_blocks.join("\n\n##### DIVIDER\n\n")
  docs_html = process_markdown(markdown).split(/\n*<h5>DIVIDER<\/h5>\n*/m)

  # Combine all code blocks into a single big stream with section dividers and
  # run through either `pygmentize(1)` or <http://pygments.appspot.com>
  span, espan = '<span class="c.?">', '</span>'
  if @options[:comment_chars][:single]
    front = @options[:comment_chars][:single]
    divider_input  = "\n\n#{front} DIVIDER\n\n"
    divider_output = Regexp.new(
      [ "\\n*",
        span,
        Regexp.escape(CGI.escapeHTML(front)),
        ' DIVIDER',
        espan,
        "\\n*"
      ].join, Regexp::MULTILINE
    )
  else
    front = @options[:comment_chars][:multi][:start]
    back  = @options[:comment_chars][:multi][:end]
    divider_input  = "\n\n#{front}\nDIVIDER\n#{back}\n\n"
    divider_output = Regexp.new(
      [ "\\n*",
        span, Regexp.escape(CGI.escapeHTML(front)), espan,
        "\\n",
        span, "DIVIDER", espan,
        "\\n",
        span, Regexp.escape(CGI.escapeHTML(back)), espan,
        "\\n*"
      ].join, Regexp::MULTILINE
    )
  end

  code_stream = code_blocks.join(divider_input)

  code_html =
    if pygmentize?
      highlight_pygmentize(code_stream)
    else
      highlight_webservice(code_stream)
    end

  # Do some post-processing on the pygments output to split things back
  # into sections and remove partial `<pre>` blocks.
  code_html = code_html.
    split(divider_output).
    map { |code| code.sub(/\n?<div class="highlight"><pre>/m, '') }.
    map { |code| code.sub(/\n?<\/pre><\/div>\n/m, '') }

  # Lastly, combine the docs and code lists back into a list of two-tuples.
  docs_html.zip(code_html)
end
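
The join, process, and split-on-divider trick used for the docs blocks can be sketched without a real Markdown library. In this illustration `fake_markdown` is a deliberately tiny stand-in (an assumption, not the processor Rocco uses): it only turns `#####` headings into `<h5>` elements and wraps everything else in `<p>`.

```ruby
DIVIDER = "\n\n##### DIVIDER\n\n"

# Stand-in for a Markdown processor (an assumption for this sketch).
def fake_markdown(text)
  text.split(/\n{2,}/).map do |para|
    if para =~ /\A#####\s+(.*)\z/
      "<h5>#{$1}</h5>"
    else
      "<p>#{para}</p>"
    end
  end.join("\n\n")
end

docs_blocks = ["First section.", "Second section.", "Third section."]

# One processor call for the whole document, then split it back apart
# on the rendered divider headings.
html      = fake_markdown(docs_blocks.join(DIVIDER))
docs_html = html.split(/\n*<h5>DIVIDER<\/h5>\n*/m)

puts docs_html.length
puts docs_html
```

Processing everything in one pass means one processor invocation per file rather than one per section, at the cost of reserving the divider text.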

#highlight_pygmentize(code) ⇒ Object

We `popen` a read/write pygmentize process in the parent and then fork off a child process to write the input.



# File 'lib/rocco.rb', line 465

def highlight_pygmentize(code)
  code_html = nil
  open("|pygmentize -l #{@options[:language]} -O encoding=utf-8 -f html", 'r+') do |fd|
    pid =
      fork {
        fd.close_read
        fd.write code
        fd.close_write
        exit!
      }
    fd.close_write
    code_html = fd.read
    fd.close_read
    Process.wait(pid)
  end

  code_html
end

#highlight_webservice(code) ⇒ Object

Pygments is not one of those things that’s trivial for a Ruby user to install, so we’ll fall back on a web service to highlight the code if it isn’t available.



# File 'lib/rocco.rb', line 486

def highlight_webservice(code)
  url = URI.parse 'http://pygments.appspot.com/'
  options = { 'lang' => @options[:language], 'code' => code}
  Net::HTTP.post_form(url, options).body
end

#normalize_leading_spaces(sections) ⇒ Object

Normalizes documentation whitespace by checking for leading whitespace, removing it, and then removing the same amount of whitespace from each succeeding line. That is:

def func():
  """
    Comment 1
    Comment 2
  """
  print "omg!"

should yield a comment block of `Comment 1\nComment 2` and code of `def func():\n  print "omg!"`



# File 'lib/rocco.rb', line 355

def normalize_leading_spaces(sections)
  sections.map do |section|
    if section.any? && section[0].any?
      leading_space = section[0][0].match("^\s+")
      if leading_space
        section[0] =
          section[0].map{ |line| line.sub(/^#{leading_space.to_s}/, '') }
      end
    end
    section
  end
end
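
The example from the description can be checked with a standalone variant. This sketch differs slightly from the listing above: it returns new arrays instead of modifying each section in place, and it escapes the detected indentation before building the pattern.

```ruby
# Standalone variant: strip the first doc line's indentation from every
# doc line in the section. Code lines are left alone.
def normalize_leading_spaces(sections)
  sections.map do |docs, code|
    if docs.any?
      leading = docs.first[/^\s+/]
      docs = docs.map { |line| line.sub(/^#{Regexp.escape(leading)}/, '') } if leading
    end
    [docs, code]
  end
end

sections = [
  [["    Comment 1", "    Comment 2"], ['def func():', '  print "omg!"']]
]
normalized = normalize_leading_spaces(sections)
p normalized
```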

#parse(data) ⇒ Object

Parse the raw file data into a list of two-tuples. Each tuple has the form `[docs, code]` where both elements are arrays containing the raw lines parsed from the input file, comment characters stripped.



# File 'lib/rocco.rb', line 245

def parse(data)
  sections, docs, code, lines = [], [], [], data.split("\n")

  # The first line is ignored if it is a shebang line.  We also ignore the
  # PEP 263 encoding information in python sourcefiles, and the similar ruby
  # 1.9 syntax.
  lines.shift if lines[0] =~ /^\#\!/
  lines.shift if lines[0] =~ /coding[:=]\s*[-\w.]+/ &&
                 [ "python", "rb" ].include?(@options[:language])

  # To detect both block comments and single-line comments, we'll set
  # up a tiny state machine, and loop through each line of the file.
  # This requires an `in_comment_block` boolean, and a few regular
  # expressions for line tests.  We'll do the same for fake heredoc parsing.
  in_comment_block = false
  in_heredoc = false
  single_line_comment, block_comment_start, block_comment_mid, block_comment_end =
    nil, nil, nil, nil
  if not @options[:comment_chars][:single].nil?
    single_line_comment = Regexp.new("^\\s*#{Regexp.escape(@options[:comment_chars][:single])}\\s?")
  end
  if not @options[:comment_chars][:multi].nil?
    block_comment_start = Regexp.new("^\\s*#{Regexp.escape(@options[:comment_chars][:multi][:start])}\\s*$")
    block_comment_end   = Regexp.new("^\\s*#{Regexp.escape(@options[:comment_chars][:multi][:end])}\\s*$")
    block_comment_one_liner = Regexp.new("^\\s*#{Regexp.escape(@options[:comment_chars][:multi][:start])}\\s*(.*?)\\s*#{Regexp.escape(@options[:comment_chars][:multi][:end])}\\s*$")
    block_comment_start_with = Regexp.new("^\\s*#{Regexp.escape(@options[:comment_chars][:multi][:start])}\\s*(.*?)$")
    block_comment_end_with = Regexp.new("\\s*(.*?)\\s*#{Regexp.escape(@options[:comment_chars][:multi][:end])}\\s*$")
    if @options[:comment_chars][:multi][:middle]
      block_comment_mid = Regexp.new("^\\s*#{Regexp.escape(@options[:comment_chars][:multi][:middle])}\\s?")
    end
  end
  if not @options[:comment_chars][:heredoc].nil?
    heredoc_start = Regexp.new("#{Regexp.escape(@options[:comment_chars][:heredoc])}(\\S+)$")
  end
  lines.each do |line|
    # If we're currently in a comment block, check whether the line matches
    # the _end_ of a comment block or the _end_ of a comment block with a
    # comment.
    if in_comment_block
      if block_comment_end && line.match(block_comment_end)
        in_comment_block = false
      elsif block_comment_end_with && line.match(block_comment_end_with)
        in_comment_block = false
        docs << line.match(block_comment_end_with).captures.first.
                      sub(block_comment_mid || '', '')
      else
        docs << line.sub(block_comment_mid || '', '')
      end
    # If we're currently in a heredoc, we're looking for the end of the
    # heredoc, and everything it contains is code.
    elsif in_heredoc
      if line.match(Regexp.new("^#{Regexp.escape(in_heredoc)}$"))
        in_heredoc = false
      end
      code << line
    # Otherwise, check whether the line starts a heredoc. If so, note the end
    # pattern, and the line is code.  Otherwise check whether the line matches
    # the beginning of a block, or a single-line comment all on its lonesome.
    # In either case, if there's code, start a new section.
    else
      if heredoc_start && line.match(heredoc_start)
        in_heredoc = $1
        code << line
      elsif block_comment_one_liner && line.match(block_comment_one_liner)
        if code.any?
          sections << [docs, code]
          docs, code = [], []
        end
        docs << line.match(block_comment_one_liner).captures.first
      elsif block_comment_start && line.match(block_comment_start)
        in_comment_block = true
        if code.any?
          sections << [docs, code]
          docs, code = [], []
        end
      elsif block_comment_start_with && line.match(block_comment_start_with)
        in_comment_block = true
        if code.any?
          sections << [docs, code]
          docs, code = [], []
        end
        docs << line.match(block_comment_start_with).captures.first
      elsif single_line_comment && line.match(single_line_comment)
        if code.any?
          sections << [docs, code]
          docs, code = [], []
        end
        docs << line.sub(single_line_comment || '', '')
      else
        code << line
      end
    end
  end
  sections << [docs, code] if docs.any? || code.any?
  normalize_leading_spaces(sections)
end
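
The core of the state machine is easier to follow with block comments and heredocs removed. The following is a reduced sketch that handles single-line comments only; `parse_single_line` is a hypothetical name, and it skips the shebang handling and whitespace normalization that the full method performs.

```ruby
# Reduced sketch: single-line comments only.
def parse_single_line(data, comment = "#")
  pattern = Regexp.new("^\\s*#{Regexp.escape(comment)}\\s?")
  sections, docs, code = [], [], []

  data.split("\n").each do |line|
    if line.match(pattern)
      # A comment that follows code closes the current section.
      if code.any?
        sections << [docs, code]
        docs, code = [], []
      end
      docs << line.sub(pattern, '')
    else
      code << line
    end
  end

  # Flush whatever is left as the final section.
  sections << [docs, code] if docs.any? || code.any?
  sections
end

source = "# Doc one\nx = 1\n# Doc two\ny = 2"
parsed = parse_single_line(source)
p parsed
```

A new section starts only when a comment follows code, so consecutive comment lines accumulate into the same docs array.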

#process_markdown(text) ⇒ Object

Convert Markdown to classy HTML.



# File 'lib/rocco.rb', line 459

def process_markdown(text)
  if MD_BLUECLOTH then Markdown.new(text).to_html else Markdown.new(text, :smart).to_html end
end

#pygmentize? ⇒ Boolean

Returns `true` if `pygmentize` is available locally, `false` otherwise.

Returns:

  • (Boolean)


# File 'lib/rocco.rb', line 173

def pygmentize?
  pygmentize = `which pygmentize`
  @_pygmentize ||= !(pygmentize.include?("not found") || pygmentize.empty?)
end

#read_with_encoding(filename) ⇒ Object

Helper Functions


Read the file, encoded as given by `@options[:encoding]`, into a string encoded in UTF-8.



# File 'lib/rocco.rb', line 159

def read_with_encoding filename
  # This works differently in Ruby 1.8 and Ruby 1.9, which are
  # distinguished by checking if `IO#external_encoding` exists.
  if IO.method_defined?("external_encoding")
    File.read(filename, :external_encoding => @options[:encoding],
              :internal_encoding => "UTF-8")
  else
    require 'iconv'
    data = File.read(filename)
    Iconv.conv("UTF-8", @options[:encoding], data)
  end
end
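
On Ruby 1.9 and later, the transcoding branch can be tried out with a temporary file. This sketch uses the combined `"external:internal"` form of the `:encoding` option rather than the two separate keys in the listing above; `read_as_utf8` is a hypothetical helper name.

```ruby
require 'tempfile'

# Read a file stored in `source_encoding` and transcode it to UTF-8.
def read_as_utf8(filename, source_encoding)
  File.read(filename, :encoding => "#{source_encoding}:UTF-8")
end

# Write the ISO-8859-1 bytes for "café" (0xE9 is e-acute) to a temp file.
tmp = Tempfile.new('rocco-sketch')
tmp.binmode
tmp.write("caf\xE9".b)
tmp.close

text = read_as_utf8(tmp.path, 'ISO-8859-1')
tmp.unlink

puts text.encoding
puts text.bytesize
```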

#split(sections) ⇒ Object

Take the list of paired section two-tuples and split it into two separate lists: one holding the comments with leaders removed and one holding the code blocks.



# File 'lib/rocco.rb', line 371

def split(sections)
  docs_blocks, code_blocks = [], []
  sections.each do |docs,code|
    docs_blocks << docs.join("\n")
    code_blocks << code.map do |line|
      tabs = line.match(/^(\t+)/)
      tabs ? line.sub(/^\t+/, '  ' * tabs.captures[0].length) : line
    end.join("\n")
  end
  [docs_blocks, code_blocks]
end
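
A standalone run shows both effects: docs lines are joined into one string per section, and each leading tab in the code becomes two spaces. The function below is a restructured copy under the hypothetical name `split_sections`, and the input is made up for illustration.

```ruby
# Restructured copy of the listing above, as a free-standing function.
def split_sections(sections)
  docs_blocks, code_blocks = [], []
  sections.each do |docs, code|
    docs_blocks << docs.join("\n")
    expanded = code.map do |line|
      # Replace each leading tab with two spaces; other tabs are untouched.
      tabs = line[/^\t+/]
      tabs ? line.sub(/^\t+/, '  ' * tabs.length) : line
    end
    code_blocks << expanded.join("\n")
  end
  [docs_blocks, code_blocks]
end

docs_out, code_out = split_sections([[["a", "b"], ["\tfoo", "\t\tbar", "baz"]]])
p docs_out
p code_out
```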

#to_html ⇒ Object



# File 'lib/rocco.rb', line 152

def to_html
  Rocco::Layout.new(self, @options[:stylesheet], @options[:template_file]).render
end