Class: Rocco
- Inherits: Object
  - Object
    - Rocco
- Defined in: lib/rocco.rb, lib/rocco/tasks.rb
Overview
Reopen the Rocco class and add a `make` class method. This is a simple bit of sugar over `Rocco::Task.new`. If you want your Rake task to be named something other than `:rocco`, you can use `Rocco::Task` directly.
Defined Under Namespace
Constant Summary
- VERSION = '0.6'
- C_STYLE_COMMENTS =
Given a file’s language, we should be able to autopopulate the `comment_chars` variables for single-line comments. If we don’t have comment characters on record for a given language, we’ll use the user-provided `:comment_chars` option (which defaults to `#`).
Comment characters are listed as:
{ :single => "//", :multi_start => "/**", :multi_middle => "*", :multi_end => "*/" }
`:single` denotes the leading character of a single-line comment. `:multi_start` denotes the string that should appear alone on a line of code to begin a block of documentation. `:multi_middle` denotes the leading character of block comment content, and `:multi_end` is the string that ought to appear alone on a line to close a block of documentation. That is:
/**   [:multi][:start]
 *    [:multi][:middle]
...
 *    [:multi][:middle]
 */   [:multi][:end]
If a language only has one type of comment, the missing type should be assigned `nil`.
At the moment, we’re only returning `:single`. Consider this groundwork for block comment parsing.
{ :single => "//", :multi => { :start => "/**", :middle => "*", :end => "*/" }, :heredoc => nil }
- COMMENT_STYLES =
{
  "bash"          => { :single => "#", :multi => nil },
  "c"             => C_STYLE_COMMENTS,
  "coffee-script" => {
    :single  => "#",
    :multi   => { :start => "###", :middle => nil, :end => "###" },
    :heredoc => nil
  },
  "cpp"           => C_STYLE_COMMENTS,
  "csharp"        => C_STYLE_COMMENTS,
  "css"           => {
    :single  => nil,
    :multi   => { :start => "/**", :middle => "*", :end => "*/" },
    :heredoc => nil
  },
  "html"          => {
    :single  => nil,
    :multi   => { :start => '<!--', :middle => nil, :end => '-->' },
    :heredoc => nil
  },
  "java"          => C_STYLE_COMMENTS,
  "js"            => C_STYLE_COMMENTS,
  "lua"           => { :single => "--", :multi => nil, :heredoc => nil },
  "php"           => C_STYLE_COMMENTS,
  "python"        => {
    :single  => "#",
    :multi   => { :start => '"""', :middle => nil, :end => '"""' },
    :heredoc => nil
  },
  "rb"            => {
    :single  => "#",
    :multi   => { :start => '=begin', :middle => nil, :end => '=end' },
    :heredoc => "<<-"
  },
  "scheme"        => { :single => ";;", :multi => nil, :heredoc => nil },
  "xml"           => {
    :single  => nil,
    :multi   => { :start => '<!--', :middle => nil, :end => '-->' },
    :heredoc => nil
  },
}
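The lookup-with-fallback behavior described for these constants can be sketched as follows. This is a minimal illustration rather than Rocco's own code: `STYLES` is a trimmed, hypothetical stand-in for `COMMENT_STYLES`, and `comment_chars_for` is an invented helper name.

```ruby
# Trimmed, hypothetical stand-in for COMMENT_STYLES (two entries only).
STYLES = {
  "js"  => { :single => "//",
             :multi => { :start => "/**", :middle => "*", :end => "*/" },
             :heredoc => nil },
  "lua" => { :single => "--", :multi => nil, :heredoc => nil }
}.freeze

# Hypothetical helper: return the recorded style for a language, or build a
# single-line-only style from the user-provided fallback character.
def comment_chars_for(language, fallback = "#")
  STYLES.fetch(language) do
    { :single => fallback, :multi => nil, :heredoc => nil }
  end
end

recorded = comment_chars_for("lua")      # style on record for Lua
fallback = comment_chars_for("fortran")  # no record, so the fallback is used
```

A language with no entry gets a style whose `:multi` and `:heredoc` are `nil`, matching the convention that a missing comment type is assigned `nil`.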
Instance Attribute Summary
-
#file ⇒ Object
readonly
The filename as given to `Rocco.new`.
-
#options ⇒ Object
readonly
The merged options hash.
-
#sections ⇒ Object
readonly
A list of two-tuples representing each section of the source file.
-
#sources ⇒ Object
readonly
A list of all source filenames included in the documentation set.
Class Method Summary
Instance Method Summary
-
#detect_language ⇒ Object
If `pygmentize` is available, we can use it to autodetect a file’s language based on its filename.
-
#docblock(docs) ⇒ Object
Take a list of block comments and convert Docblock @annotations to Markdown syntax.
- #generate_comment_chars ⇒ Object
-
#highlight(blocks) ⇒ Object
Take the result of `split` and apply Markdown formatting to comments and syntax highlighting to source code.
-
#highlight_pygmentize(code) ⇒ Object
We `popen` a read/write pygmentize process in the parent and then fork off a child process to write the input.
-
#highlight_webservice(code) ⇒ Object
Pygments is not one of those things that’s trivial for a ruby user to install, so we’ll fall back on a webservice to highlight the code if it isn’t available.
-
#initialize(filename, sources = [], options = {}, &block) ⇒ Rocco
constructor
A new instance of Rocco.
-
#normalize_leading_spaces(sections) ⇒ Object
Normalizes documentation whitespace by checking for leading whitespace, removing it, and then removing the same amount of whitespace from each succeeding line.
-
#parse(data) ⇒ Object
Parse the raw file data into a list of two-tuples.
-
#pygmentize? ⇒ Boolean
Returns `true` if `pygmentize` is available locally, `false` otherwise.
-
#split(sections) ⇒ Object
Take the list of paired section two-tuples and split it into two separate lists: one holding the comments with leaders removed and one with the code blocks.
- #to_html ⇒ Object
Constructor Details
#initialize(filename, sources = [], options = {}, &block) ⇒ Rocco
Returns a new instance of Rocco.
# File 'lib/rocco.rb', line 78
def initialize(filename, sources=[], options={}, &block)
  @file    = filename
  @sources = sources

  # When `block` is given, it must read the contents of the file using
  # whatever means necessary and return it as a string. With no `block`,
  # the file is read to retrieve data.
  @data =
    if block_given?
      yield
    else
      File.read(filename)
    end
  defaults = {
    :language      => 'ruby',
    :comment_chars => '#',
    :template_file => nil
  }
  @options = defaults.merge(options)

  # If we detect a language
  if detect_language() != "text"
    # then assign the detected language to `:language`, and look for
    # comment characters based on that language
    @options[:language] = detect_language()
    @options[:comment_chars] = generate_comment_chars()

  # If we didn't detect a language, but the user provided one, use it
  # to look around for comment characters to override the default.
  elsif @options[:language] != defaults[:language]
    @options[:comment_chars] = generate_comment_chars()

  # If neither is true, then convert the default comment character string
  # into the comment_char syntax (we'll discuss that syntax in detail when
  # we get to `generate_comment_chars()` in a moment.
  else
    @options[:comment_chars] = { :single => @options[:comment_chars], :multi => nil }
  end

  # Turn `:comment_chars` into a regex matching a series of spaces, the
  # `:comment_chars` string, and the an optional space. We'll use that
  # to detect single-line comments.
  @comment_pattern = Regexp.new("^\\s*#{@options[:comment_chars][:single]}\s?")

  # `parse()` the file contents stored in `@data`. Run the result through
  # `split()` and that result through `highlight()` to generate the final
  # section list.
  @sections = highlight(split(parse(@data)))
end
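The three-way decision the constructor makes about `:language` and `:comment_chars` can be sketched in isolation. This is a simplified illustration under stated assumptions: `DEFAULTS` mirrors the defaults shown in the constructor, while `KNOWN_SINGLE` and `resolve_options` are hypothetical names, and the detected language is passed in as an argument instead of being obtained from `pygmentize`.

```ruby
DEFAULTS = { :language => 'ruby', :comment_chars => '#', :template_file => nil }.freeze

# Hypothetical, trimmed table of single-line comment leaders per language.
KNOWN_SINGLE = { "js" => "//", "lua" => "--", "rb" => "#" }.freeze

# Hypothetical helper mirroring the constructor's branches:
# 1. a detected language wins and drives the comment characters;
# 2. otherwise a user-supplied language is used for the lookup;
# 3. otherwise the plain comment string is wrapped in the hash form.
def resolve_options(user_options, detected)
  options = DEFAULTS.merge(user_options)
  single =
    if detected != "text"
      options[:language] = detected
      KNOWN_SINGLE.fetch(detected, options[:comment_chars])
    elsif options[:language] != DEFAULTS[:language]
      KNOWN_SINGLE.fetch(options[:language], options[:comment_chars])
    else
      options[:comment_chars]
    end
  options.merge(:comment_chars => { :single => single, :multi => nil })
end

by_detection = resolve_options({}, "js")
by_user      = resolve_options({ :language => "lua" }, "text")
by_default   = resolve_options({ :comment_chars => ";" }, "text")
```

In every branch `:comment_chars` ends up as a hash with a `:single` key, which is the shape the rest of the class relies on.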
Instance Attribute Details
#file ⇒ Object (readonly)
The filename as given to `Rocco.new`.
# File 'lib/rocco.rb', line 134
def file
  @file
end
#options ⇒ Object (readonly)
The merged options hash.
# File 'lib/rocco.rb', line 137
def options
  @options
end
#sections ⇒ Object (readonly)
A list of two-tuples representing each section of the source file. Each item in the list has the form: `[docs_html, code_html]`, where both elements are strings containing the documentation and source code HTML, respectively.
# File 'lib/rocco.rb', line 143
def sections
  @sections
end
#sources ⇒ Object (readonly)
A list of all source filenames included in the documentation set. Useful for building an index of other files.
# File 'lib/rocco.rb', line 147
def sources
  @sources
end
Class Method Details
Instance Method Details
#detect_language ⇒ Object
If `pygmentize` is available, we can use it to autodetect a file’s language based on its filename. Filenames without extensions, or with extensions that `pygmentize` doesn’t understand, will return `text`. We’ll also return `text` if `pygmentize` isn’t available.
We’ll memoize the result, as we’ll call this a few times.
# File 'lib/rocco.rb', line 170
def detect_language
  @_language ||=
    if pygmentize?
      %x[pygmentize -N #{@file}].strip.split('+').first
    else
      "text"
    end
end
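The string handling in this method can be tried without `pygmentize` installed. The sketch below assumes the command prints a lexer name that may contain a `+` (as in a composite name), and keeps only the leading component. `primary_lexer` is a hypothetical helper, the sample strings are made up rather than captured from a real run, and the fallback to `"text"` on empty output is an addition of this sketch.

```ruby
# Hypothetical helper: reduce a printed lexer name to its leading component,
# so a composite name such as "html+php" becomes "html".
def primary_lexer(pygmentize_output)
  name = pygmentize_output.strip.split('+').first
  # Assumption of this sketch: empty output is treated as plain text.
  name.nil? || name.empty? ? "text" : name
end

composite = primary_lexer("html+php\n")
simple    = primary_lexer("rb\n")
empty     = primary_lexer("")
```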
#docblock(docs) ⇒ Object
Take a list of block comments and convert Docblock @annotations to Markdown syntax.
# File 'lib/rocco.rb', line 418
def docblock(docs)
  docs.map do |doc|
    doc.split("\n").map do |line|
      line.match(/^@\w+/) ? line.sub(/^@(\w+)\s+/, '> **\1** ')+" " : line
    end.join("\n")
  end
end
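To see what the annotation rewrite produces, here is a standalone sketch applied to one sample comment. `docblock_to_markdown` is a hypothetical name, and the two trailing spaces it appends (a Markdown hard line break) are an assumption, since the whitespace in the listing above may have been collapsed.

```ruby
# Rewrite lines that start with an @tag as Markdown blockquotes with the tag
# name in bold. Other lines pass through unchanged.
def docblock_to_markdown(doc)
  doc.split("\n").map do |line|
    # Assumption: two trailing spaces force a Markdown line break.
    line =~ /^@\w+/ ? line.sub(/^@(\w+)\s+/, '> **\1** ') + "  " : line
  end.join("\n")
end

sample = "Adds two numbers.\n@param a the first addend\n@return the sum"
markdown = docblock_to_markdown(sample)
```

With this input, the `@param` and `@return` lines become blockquote lines beginning with `> **param**` and `> **return**`, while the first line is left as is.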
#generate_comment_chars ⇒ Object
# File 'lib/rocco.rb', line 261
def generate_comment_chars
  @_commentchar ||=
    if COMMENT_STYLES[@options[:language]]
      COMMENT_STYLES[@options[:language]]
    else
      { :single => @options[:comment_chars], :multi => nil, :heredoc => nil }
    end
end
#highlight(blocks) ⇒ Object
Take the result of `split` and apply Markdown formatting to comments and syntax highlighting to source code.
# File 'lib/rocco.rb', line 428
def highlight(blocks)
  docs_blocks, code_blocks = blocks

  # Pre-process Docblock @annotations.
  if @options[:docblocks]
    docs_blocks = docblock(docs_blocks)
  end

  # Combine all docs blocks into a single big markdown document with section
  # dividers and run through the Markdown processor. Then split it back out
  # into separate sections.
  markdown = docs_blocks.join("\n\n##### DIVIDER\n\n")
  docs_html = Markdown.new(markdown, :smart).
    to_html.
    split(/\n*<h5>DIVIDER<\/h5>\n*/m)

  # Combine all code blocks into a single big stream with section dividers and
  # run through either `pygmentize(1)` or <http://pygments.appspot.com>
  span, espan = '<span class="c.?">', '</span>'
  if @options[:comment_chars][:single]
    front = @options[:comment_chars][:single]
    divider_input  = "\n\n#{front} DIVIDER\n\n"
    divider_output = Regexp.new(
      [ "\\n*",
        span,
        Regexp.escape(CGI.escapeHTML(front)),
        ' DIVIDER',
        espan,
        "\\n*"
      ].join, Regexp::MULTILINE
    )
  else
    front = @options[:comment_chars][:multi][:start]
    back  = @options[:comment_chars][:multi][:end]
    divider_input  = "\n\n#{front}\nDIVIDER\n#{back}\n\n"
    divider_output = Regexp.new(
      [ "\\n*",
        span, Regexp.escape(CGI.escapeHTML(front)), espan,
        "\\n",
        span, "DIVIDER", espan,
        "\\n",
        span, Regexp.escape(CGI.escapeHTML(back)), espan,
        "\\n*"
      ].join, Regexp::MULTILINE
    )
  end

  code_stream = code_blocks.join(divider_input)

  code_html =
    if pygmentize?
      highlight_pygmentize(code_stream)
    else
      highlight_webservice(code_stream)
    end

  # Do some post-processing on the pygments output to split things back
  # into sections and remove partial `<pre>` blocks.
  code_html = code_html.
    split(divider_output).
    map { |code| code.sub(/\n?<div class="highlight"><pre>/m, '') }.
    map { |code| code.sub(/\n?<\/pre><\/div>\n/m, '') }

  # Lastly, combine the docs and code lists back into a list of two-tuples.
  docs_html.zip(code_html)
end
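The divider technique used here (join every block with a marker, process the whole stream once, then split the output on the rendered marker) can be shown with a toy renderer. `fake_render` below is a hypothetical stand-in for the Markdown processor and handles only paragraphs and `#####` headings. `render_batched` is likewise an invented name.

```ruby
DIVIDER = "\n\n##### DIVIDER\n\n"

# Hypothetical stand-in for a Markdown renderer: wraps paragraphs in <p> and
# turns "##### text" into an <h5> heading. Just enough to show the technique.
def fake_render(markdown)
  markdown.split(/\n{2,}/).map do |para|
    if para.start_with?("##### ")
      "<h5>#{para.sub('##### ', '')}</h5>"
    else
      "<p>#{para}</p>"
    end
  end.join("\n\n")
end

# Render all blocks in one pass, then cut the output back into sections at
# each rendered divider heading.
def render_batched(blocks)
  fake_render(blocks.join(DIVIDER)).split(/\n*<h5>DIVIDER<\/h5>\n*/m)
end

docs = ["First section.", "Second section.", "Third section."]
html = render_batched(docs)
```

The benefit is one renderer invocation for the whole file instead of one per section, at the cost of having to recognize the marker again in the rendered output.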
#highlight_pygmentize(code) ⇒ Object
We `popen` a read/write pygmentize process in the parent and then fork off a child process to write the input.
# File 'lib/rocco.rb', line 497
def highlight_pygmentize(code)
  code_html = nil
  open("|pygmentize -l #{@options[:language]} -O encoding=utf-8 -f html", 'r+') do |fd|
    pid =
      fork {
        fd.close_read
        fd.write code
        fd.close_write
        exit!
      }
    fd.close_write
    code_html = fd.read
    fd.close_read
    Process.wait(pid)
  end

  code_html
end
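The fork in this method exists so that writing the input and reading the output happen concurrently, which avoids blocking when a pipe buffer fills. As a point of comparison only, and not what Rocco does, the same round trip can be sketched with `Open3.capture2` from the standard library. `filter_through` is a hypothetical helper, and the filter command is the running Ruby interpreter upcasing its input, used as a stand-in so the sketch does not need `pygmentize`.

```ruby
require 'open3'
require 'rbconfig'

# Hypothetical helper: send `input` to a command's stdin and return its stdout.
# Open3.capture2 feeds stdin and collects stdout concurrently, so no manual
# fork is needed.
def filter_through(input, *command)
  output, status = Open3.capture2(*command, :stdin_data => input)
  raise "command failed: #{command.join(' ')}" unless status.success?
  output
end

# Stand-in filter: the current Ruby interpreter upcasing whatever it reads.
result = filter_through("def hello; end\n",
                        RbConfig.ruby, "-e", "print STDIN.read.upcase")
```

Passing the command as separate arguments also avoids shell interpolation of values such as the language name.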
#highlight_webservice(code) ⇒ Object
Pygments is not one of those things that’s trivial for a ruby user to install, so we’ll fall back on a webservice to highlight the code if it isn’t available.
# File 'lib/rocco.rb', line 518
def highlight_webservice(code)
  Net::HTTP.post_form(
    URI.parse('http://pygments.appspot.com/'),
    {'lang' => @options[:language], 'code' => code}
  ).body
end
#normalize_leading_spaces(sections) ⇒ Object
Normalizes documentation whitespace by checking for leading whitespace, removing it, and then removing the same amount of whitespace from each succeeding line. That is:
def func():
  """
    Comment 1
    Comment 2
  """
  print "omg!"
should yield a comment block of `Comment 1\nComment 2` and code of `def func():\n  print "omg!"`
# File 'lib/rocco.rb', line 388
def normalize_leading_spaces(sections)
  sections.map do |section|
    if section.any? && section[0].any?
      leading_space = section[0][0].match("^\s+")
      if leading_space
        section[0] =
          section[0].map{ |line| line.sub(/^#{leading_space.to_s}/, '') }
      end
    end
    section
  end
end
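The normalization rule (measure the first line's indentation, then strip that same prefix from every line) can be sketched on a single list of documentation lines. `strip_common_indent` is a hypothetical name, and unlike the method above it works on one list instead of a list of `[docs, code]` sections.

```ruby
# Hypothetical standalone version: remove the first line's leading whitespace
# from every line that starts with that same prefix.
def strip_common_indent(lines)
  return lines if lines.empty?
  indent = lines.first[/\A\s+/]
  return lines unless indent
  lines.map { |line| line.sub(/\A#{Regexp.escape(indent)}/, '') }
end

normalized = strip_common_indent(["    Comment 1", "    Comment 2", "      nested"])
```

Lines indented deeper than the first keep their extra indentation, so nested Markdown such as indented code survives the normalization.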
#parse(data) ⇒ Object
Parse the raw file data into a list of two-tuples. Each tuple has the form `[docs, code]` where both elements are arrays containing the raw lines parsed from the input file, comment characters stripped.
# File 'lib/rocco.rb', line 276
def parse(data)
  sections = []
  docs, code = [], []
  lines = data.split("\n")

  # The first line is ignored if it is a shebang line. We also ignore the
  # PEP 263 encoding information in python sourcefiles, and the similar ruby
  # 1.9 syntax.
  lines.shift if lines[0] =~ /^\#\!/
  lines.shift if lines[0] =~ /coding[:=]\s*[-\w.]+/ &&
                 [ "python", "rb" ].include?(@options[:language])

  # To detect both block comments and single-line comments, we'll set
  # up a tiny state machine, and loop through each line of the file.
  # This requires an `in_comment_block` boolean, and a few regular
  # expressions for line tests. We'll do the same for fake heredoc parsing.
  in_comment_block = false
  in_heredoc = false
  single_line_comment, block_comment_start, block_comment_mid, block_comment_end =
    nil, nil, nil, nil
  if not @options[:comment_chars][:single].nil?
    single_line_comment = Regexp.new("^\\s*#{Regexp.escape(@options[:comment_chars][:single])}\\s?")
  end
  if not @options[:comment_chars][:multi].nil?
    block_comment_start = Regexp.new("^\\s*#{Regexp.escape(@options[:comment_chars][:multi][:start])}\\s*$")
    block_comment_end = Regexp.new("^\\s*#{Regexp.escape(@options[:comment_chars][:multi][:end])}\\s*$")
    block_comment_one_liner = Regexp.new("^\\s*#{Regexp.escape(@options[:comment_chars][:multi][:start])}\\s*(.*?)\\s*#{Regexp.escape(@options[:comment_chars][:multi][:end])}\\s*$")
    block_comment_start_with = Regexp.new("^\\s*#{Regexp.escape(@options[:comment_chars][:multi][:start])}\\s*(.*?)$")
    block_comment_end_with = Regexp.new("\\s*(.*?)\\s*#{Regexp.escape(@options[:comment_chars][:multi][:end])}\\s*$")
    if @options[:comment_chars][:multi][:middle]
      block_comment_mid = Regexp.new("^\\s*#{Regexp.escape(@options[:comment_chars][:multi][:middle])}\\s?")
    end
  end
  if not @options[:comment_chars][:heredoc].nil?
    heredoc_start = Regexp.new("#{Regexp.escape(@options[:comment_chars][:heredoc])}(\\S+)$")
  end
  lines.each do |line|
    # If we're currently in a comment block, check whether the line matches
    # the _end_ of a comment block or the _end_ of a comment block with a
    # comment.
    if in_comment_block
      if block_comment_end && line.match(block_comment_end)
        in_comment_block = false
      elsif block_comment_end_with && line.match(block_comment_end_with)
        in_comment_block = false
        docs << line.match(block_comment_end_with).captures.first.
                  sub(block_comment_mid || '', '')
      else
        docs << line.sub(block_comment_mid || '', '')
      end
    # If we're currently in a heredoc, we're looking for the end of the
    # heredoc, and everything it contains is code.
    elsif in_heredoc
      if line.match(Regexp.new("^#{Regexp.escape(in_heredoc)}$"))
        in_heredoc = false
      end
      code << line
    # Otherwise, check whether the line starts a heredoc. If so, note the end
    # pattern, and the line is code. Otherwise check whether the line matches
    # the beginning of a block, or a single-line comment all on it's lonesome.
    # In either case, if there's code, start a new section.
    else
      if heredoc_start && line.match(heredoc_start)
        in_heredoc = $1
        code << line
      elsif block_comment_one_liner && line.match(block_comment_one_liner)
        if code.any?
          sections << [docs, code]
          docs, code = [], []
        end
        docs << line.match(block_comment_one_liner).captures.first
      elsif block_comment_start && line.match(block_comment_start)
        in_comment_block = true
        if code.any?
          sections << [docs, code]
          docs, code = [], []
        end
      elsif block_comment_start_with && line.match(block_comment_start_with)
        in_comment_block = true
        if code.any?
          sections << [docs, code]
          docs, code = [], []
        end
        docs << line.match(block_comment_start_with).captures.first
      elsif single_line_comment && line.match(single_line_comment)
        if code.any?
          sections << [docs, code]
          docs, code = [], []
        end
        docs << line.sub(single_line_comment || '', '')
      else
        code << line
      end
    end
  end
  sections << [docs, code] if docs.any? || code.any?
  normalize_leading_spaces(sections)
end
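The core sectioning rule (a comment line that follows code closes the current section and opens a new one) is easier to see with block comments and heredocs left out. The sketch below is a deliberately reduced illustration, not the method above: `parse_single_line` is a hypothetical name and it handles single-line comments only.

```ruby
# Hypothetical, simplified parser: single-line comments only, no block
# comments, heredocs, shebang handling, or whitespace normalization.
def parse_single_line(source, comment_char = "#")
  pattern = /^\s*#{Regexp.escape(comment_char)}\s?/
  sections = []
  docs, code = [], []
  source.split("\n").each do |line|
    if line =~ pattern
      # A comment that follows code starts a new section.
      if code.any?
        sections << [docs, code]
        docs, code = [], []
      end
      docs << line.sub(pattern, '')
    else
      code << line
    end
  end
  sections << [docs, code] if docs.any? || code.any?
  sections
end

source = <<~RUBY
  # Greets the caller.
  def hello
    puts "hi"
  end
  # Says goodbye.
  def bye; end
RUBY
sections = parse_single_line(source)
```

Each resulting tuple pairs the comment lines, with the leader stripped, with the code lines that follow them, which is the `[docs, code]` shape described for this method.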
#pygmentize? ⇒ Boolean
Returns `true` if `pygmentize` is available locally, `false` otherwise.
# File 'lib/rocco.rb', line 159
def pygmentize?
  @_pygmentize ||= ENV['PATH'].split(':').
    any? { |dir| File.executable?("#{dir}/pygmentize") }
end
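The PATH scan can be generalized to any command name. The sketch below is a hypothetical variant rather than Rocco's code: `command_available?` is an invented helper, it splits on `File::PATH_SEPARATOR` instead of a literal `:`, and it accepts the search path as a parameter so it can be exercised without touching the real environment.

```ruby
# Hypothetical helper: true if `name` is an executable regular file in any
# directory listed in `path`.
def command_available?(name, path = ENV.fetch('PATH', ''))
  path.split(File::PATH_SEPARATOR).any? do |dir|
    candidate = File.join(dir, name)
    File.file?(candidate) && File.executable?(candidate)
  end
end

has_pygmentize = command_available?("pygmentize")
```

One detail worth knowing when memoizing such a check with `||=`: a `false` result is not cached, so the scan runs again on each call until it returns `true`.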
#split(sections) ⇒ Object
Take the list of paired section two-tuples and split it into two separate lists: one holding the comments with leaders removed and one with the code blocks.
# File 'lib/rocco.rb', line 404
def split(sections)
  docs_blocks, code_blocks = [], []
  sections.each do |docs,code|
    docs_blocks << docs.join("\n")
    code_blocks << code.map do |line|
      tabs = line.match(/^(\t+)/)
      tabs ? line.sub(/^\t+/, ' ' * tabs.captures[0].length) : line
    end.join("\n")
  end
  [docs_blocks, code_blocks]
end
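A standalone sketch of the same reshaping, from a list of `[docs, code]` pairs into one list of doc strings and one list of code strings, is shown below. `split_sections` is a hypothetical name, and the width of two spaces per leading tab is an assumption made for this example.

```ruby
# Hypothetical standalone version: join each section's lines into strings and
# expand leading tabs in code. The tab width is an assumed parameter.
def split_sections(sections, spaces_per_tab = 2)
  docs_blocks = sections.map { |docs, _code| docs.join("\n") }
  code_blocks = sections.map do |_docs, code|
    code.map do |line|
      line.sub(/^\t+/) { |tabs| ' ' * (spaces_per_tab * tabs.length) }
    end.join("\n")
  end
  [docs_blocks, code_blocks]
end

sections = [
  [["Greets."], ["def hello", "\tputs 'hi'", "end"]],
  [["Bye."],    ["def bye; end"]]
]
docs, code = split_sections(sections)
```

The two lists stay index-aligned, which is what lets the highlighted results be zipped back into two-tuples later.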