Class: Asciidoctor::Table::ParserContext
- Inherits:
- Object
  - Object
    - Asciidoctor::Table::ParserContext
- Includes:
- Logging
- Defined in:
- lib/asciidoctor/table.rb
Overview
Methods for managing the parsing of an AsciiDoc table. Instances of this class are primarily responsible for tracking the buffer of a cell as the parser moves through the lines of the table using tail recursion. When a cell boundary is located, the previous cell is closed, an instance of Table::Cell is instantiated, the row is closed if the cell satisfies the column count and, finally, a new buffer is allocated to track the next cell.
Constant Summary
- FORMATS =
A Set of String keys that represent the table formats in AsciiDoc. QUESTION: should we recognize !sv as a valid format value?
['psv', 'csv', 'dsv', 'tsv'].to_set
- DELIMITERS =
A Hash mapping the AsciiDoc table formats to default delimiters
{ 'psv' => ['|', /\|/], 'csv' => [',', /,/], 'dsv' => [':', /:/], 'tsv' => [?\t, /\t/], '!sv' => ['!', /!/], }
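Each DELIMITERS entry pairs a literal delimiter with a Regexp that matches it. The following is a minimal sketch of how such a pair might be looked up and applied; the local `delimiters` hash is a copy of the values above and the sample line is made up for illustration.

```ruby
# Local copy of the mapping shown above (illustration only).
delimiters = {
  'psv' => ['|', /\|/],
  'csv' => [',', /,/],
  'dsv' => [':', /:/],
  'tsv' => ["\t", /\t/],
  '!sv' => ['!', /!/],
}

# Destructure the pair the same way the constructor does.
delimiter, delimiter_rx = delimiters['csv']
line = 'apple,banana' # hypothetical table line

m = delimiter_rx.match(line)
first_cell = m.pre_match             # text before the first delimiter
rest = m.post_match                  # text after the first delimiter
starts = line.start_with?(delimiter) # does the line open with the delimiter?
```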
Instance Attribute Summary
- #buffer ⇒ Object
  The String buffer of the currently open cell.
- #colcount ⇒ Object (readonly)
  Get the expected column count for a row.
- #delimiter ⇒ Object (readonly)
  The cell delimiter for this table.
- #delimiter_re ⇒ Object (readonly)
  The cell delimiter compiled Regexp for this table.
- #format ⇒ Object
  The AsciiDoc table format (psv, dsv, or csv).
- #table ⇒ Object
  The Table currently being parsed.
Instance Method Summary
- #buffer_has_unclosed_quotes?(append = nil) ⇒ Boolean
  Determines whether the buffer has unclosed quotes.
- #cell_closed? ⇒ Boolean
  Checks whether the current cell has been marked as closed.
- #cell_open? ⇒ Boolean
  Checks whether the current cell is still open.
- #close_cell(eol = false) ⇒ Object
  Close the current cell, instantiate a new Table::Cell, add it to the current row and, if the number of expected columns for the current row has been met, close the row and begin a new one.
- #close_open_cell(next_cellspec = {}) ⇒ Object
  If the current cell is open, close it.
- #initialize(reader, table, attributes = {}) ⇒ ParserContext (constructor)
  A new instance of ParserContext.
- #keep_cell_open ⇒ Object
  Marks that the cell should be kept open.
- #mark_cell_closed ⇒ Object
  Marks the cell as closed so that the parser knows to instantiate a new cell instance and add it to the current row.
- #match_delimiter(line) ⇒ Object
  Checks whether the line provided contains the cell delimiter used by this table.
- #push_cellspec(cellspec = {}) ⇒ Object
  Puts a cell spec onto the stack.
- #skip_past_delimiter(pre) ⇒ void
  Skip past the matched delimiter because it’s inside quoted text.
- #skip_past_escaped_delimiter(pre) ⇒ void
  Skip past the matched delimiter because it’s escaped.
- #starts_with_delimiter?(line) ⇒ Boolean
  Checks whether the line provided starts with the cell delimiter used by this table.
- #take_cellspec ⇒ Object
  Takes a cell spec from the stack.
Methods included from Logging
#logger, #message_with_context
Constructor Details
#initialize(reader, table, attributes = {}) ⇒ ParserContext
Returns a new instance of ParserContext.
# File 'lib/asciidoctor/table.rb', line 429
def initialize reader, table, attributes = {}
  @start_cursor_data = (@reader = reader).mark
  @table = table

  if attributes.key? 'format'
    if FORMATS.include?(xsv = attributes['format'])
      if xsv == 'tsv'
        # NOTE tsv is just an alias for csv with a tab separator
        @format = 'csv'
      elsif (@format = xsv) == 'psv' && table.document.nested?
        xsv = '!sv'
      end
    else
      logger.error %(illegal table format: #{xsv}), source_location: reader.cursor_at_prev_line
      @format, xsv = 'psv', (table.document.nested? ? '!sv' : 'psv')
    end
  else
    @format, xsv = 'psv', (table.document.nested? ? '!sv' : 'psv')
  end

  if attributes.key? 'separator'
    if (sep = attributes['separator']).nil_or_empty?
      @delimiter, @delimiter_rx = DELIMITERS[xsv]
    # QUESTION should we support any other escape codes or multiple tabs?
    elsif sep == '\t'
      @delimiter, @delimiter_rx = DELIMITERS['tsv']
    else
      @delimiter, @delimiter_rx = sep, /#{::Regexp.escape sep}/
    end
  else
    @delimiter, @delimiter_rx = DELIMITERS[xsv]
  end

  @colcount = table.columns.empty? ? -1 : table.columns.size
  @buffer = ''
  @cellspecs = []
  @cell_open = false
  @active_rowspans = [0]
  @column_visits = 0
  @current_row = []
  @linenum = -1
end
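The format branch of the constructor can be hard to follow because it resolves two values at once: the parsing format and the key used to look up the default delimiter. The sketch below isolates that branch as a standalone function. The name `resolve_format` and the `nested` flag (standing in for `table.document.nested?`) are assumptions made for illustration; the sketch omits the error logging and the separator attribute handling.

```ruby
# Hypothetical helper: returns [format, delimiter_key] for the given
# attributes, mirroring the format branch of the constructor above.
def resolve_format(attributes, nested)
  default_key = nested ? '!sv' : 'psv'
  return ['psv', default_key] unless attributes.key?('format')

  xsv = attributes['format']
  # Unknown formats fall back to psv (the real code also logs an error).
  return ['psv', default_key] unless %w[psv csv dsv tsv].include?(xsv)

  if xsv == 'tsv'
    ['csv', 'tsv'] # tsv is parsed as csv with a tab delimiter
  elsif xsv == 'psv' && nested
    ['psv', '!sv'] # nested psv tables use ! instead of |
  else
    [xsv, xsv]
  end
end

resolve_format({ 'format' => 'tsv' }, false) # tsv maps to the csv parser
resolve_format({}, true)                     # nested default uses the '!sv' key
```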
Instance Attribute Details
#buffer ⇒ Object
The String buffer of the currently open cell
# File 'lib/asciidoctor/table.rb', line 421
def buffer
  @buffer
end
#colcount ⇒ Object (readonly)
Get the expected column count for a row.
colcount is the number of columns to pull into a row. A value of -1 means the number of columns found in the first line is used as the colcount.
# File 'lib/asciidoctor/table.rb', line 418
def colcount
  @colcount
end
#delimiter ⇒ Object (readonly)
The cell delimiter for this table.
# File 'lib/asciidoctor/table.rb', line 424
def delimiter
  @delimiter
end
#delimiter_re ⇒ Object (readonly)
The cell delimiter compiled Regexp for this table.
# File 'lib/asciidoctor/table.rb', line 427
def delimiter_re
  @delimiter_re
end
#format ⇒ Object
The AsciiDoc table format (psv, dsv, or csv)
# File 'lib/asciidoctor/table.rb', line 411
def format
  @format
end
#table ⇒ Object
The Table currently being parsed
# File 'lib/asciidoctor/table.rb', line 408
def table
  @table
end
Instance Method Details
#buffer_has_unclosed_quotes?(append = nil) ⇒ Boolean
Determines whether the buffer has unclosed quotes. Used for CSV data.
Returns true if the buffer has unclosed quotes, false if it doesn’t or if it isn’t quoted data.
# File 'lib/asciidoctor/table.rb', line 508
def buffer_has_unclosed_quotes? append = nil
  if (record = append ? (@buffer + append).strip : @buffer.strip) == '"'
    true
  elsif record.start_with? '"'
    if ((trailing_quote = record.end_with? '"') && (record.end_with? '""')) || (record.start_with? '""')
      ((record = record.gsub '""', '').start_with? '"') && !(record.end_with? '"')
    else
      !trailing_quote
    end
  else
    false
  end
end
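To see how the quote check treats escaped quote pairs, it may help to run the same logic outside the class. The sketch below restates the method as a standalone function that takes the buffer as an argument; the name `unclosed_quotes?` is made up for this illustration and is not part of the library.

```ruby
# Standalone restatement of the check above (hypothetical name).
def unclosed_quotes?(buffer, append = nil)
  record = (append ? buffer + append : buffer).strip
  if record == '"'
    true
  elsif record.start_with?('"')
    trailing_quote = record.end_with?('"')
    if (trailing_quote && record.end_with?('""')) || record.start_with?('""')
      # Drop escaped quote pairs, then inspect what remains.
      record = record.gsub('""', '')
      record.start_with?('"') && !record.end_with?('"')
    else
      !trailing_quote
    end
  else
    false
  end
end

unclosed_quotes?('"hello')          # opening quote without a closing one
unclosed_quotes?('"say ""hi"""')    # escaped pairs followed by a closing quote
unclosed_quotes?('"a', ',b"')       # appended text supplies the closing quote
```

The optional second argument mirrors the `append` parameter: it lets a caller ask whether the quotes would be balanced if more text were added, without modifying the buffer.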
#cell_closed? ⇒ Boolean
Checks whether the current cell has been marked as closed.
Returns true if the cell is marked as closed, false otherwise.
# File 'lib/asciidoctor/table.rb', line 569
def cell_closed?
  !@cell_open
end
#cell_open? ⇒ Boolean
Checks whether the current cell is still open.
Returns true if the cell is marked as open, false otherwise.
# File 'lib/asciidoctor/table.rb', line 562
def cell_open?
  @cell_open
end
#close_cell(eol = false) ⇒ Object
Close the current cell, instantiate a new Table::Cell, add it to the current row and, if the number of expected columns for the current row has been met, close the row and begin a new one.
Returns nothing.
# File 'lib/asciidoctor/table.rb', line 590
def close_cell(eol = false)
  if @format == 'psv'
    cell_text = @buffer
    @buffer = ''
    if (cellspec = take_cellspec)
      repeat = cellspec.delete('repeatcol') || 1
    else
      logger.error 'table missing leading separator; recovering automatically', source_location: Reader::Cursor.new(*@start_cursor_data)
      cellspec = {}
      repeat = 1
    end
  else
    cell_text = @buffer.strip
    @buffer = ''
    cellspec = nil
    repeat = 1
    if @format == 'csv' && !cell_text.empty? && cell_text.include?('"')
      # this may not be perfect logic, but it hits the 99%
      if cell_text.start_with?('"') && cell_text.end_with?('"')
        # unquote
        if (cell_text = cell_text.slice(1, cell_text.length - 2))
          # trim whitespace and collapse escaped quotes
          cell_text = cell_text.strip.squeeze('"')
        else
          logger.error 'unclosed quote in CSV data; setting cell to empty', source_location: @reader.cursor_at_prev_line
          cell_text = ''
        end
      else
        # collapse escaped quotes
        cell_text = cell_text.squeeze('"')
      end
    end
  end

  1.upto(repeat) do |i|
    # TODO make column resolving an operation
    if @colcount == -1
      @table.columns << (column = Table::Column.new(@table, @table.columns.size + i - 1))
      if cellspec && (cellspec.key? 'colspan') && (extra_cols = cellspec['colspan'].to_i - 1) > 0
        offset = @table.columns.size
        extra_cols.times do |j|
          @table.columns << Table::Column.new(@table, offset + j)
        end
      end
    else
      # QUESTION is this right for cells that span columns?
      unless (column = @table.columns[@current_row.size])
        logger.error 'dropping cell because it exceeds specified number of columns', source_location: @reader.cursor_before_mark
        return
      end
    end

    cell = Table::Cell.new(column, cell_text, cellspec, cursor: @reader.cursor_before_mark)
    @reader.mark
    unless !cell.rowspan || cell.rowspan == 1
      activate_rowspan(cell.rowspan, (cell.colspan || 1))
    end
    @column_visits += (cell.colspan || 1)
    @current_row << cell
    # don't close the row if we're on the first line and the column count has not been set explicitly
    # TODO perhaps the colcount/linenum logic should be in end_of_row? (or a should_end_row? method)
    close_row if end_of_row? && (@colcount != -1 || @linenum > 0 || (eol && i == repeat))
  end
  @cell_open = false
  nil
end
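The CSV branch of close_cell unquotes the cell text and collapses escaped quotes with `String#squeeze`. The sketch below isolates just that step so its behavior can be tried on sample input. The helper name `unquote_csv_cell` is hypothetical, and the sketch returns an empty string where the method above would also log an error.

```ruby
# Hypothetical helper isolating the CSV unquoting step of close_cell.
def unquote_csv_cell(raw)
  text = raw.strip
  return text if text.empty? || !text.include?('"')

  if text.start_with?('"') && text.end_with?('"')
    # slice returns nil when the text is a single quote character,
    # because the requested length is negative.
    inner = text.slice(1, text.length - 2)
    inner ? inner.strip.squeeze('"') : ''
  else
    # Not wrapped in quotes: only collapse escaped quotes.
    text.squeeze('"')
  end
end

unquote_csv_cell(' "hello" ')      # surrounding whitespace and quotes removed
unquote_csv_cell('"say ""hi"""')   # doubled quotes collapse to single quotes
```

Because `squeeze` collapses any run of quote characters to one, a run of four quotes also becomes a single quote, which is consistent with the source comment that the logic is not perfect.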
#close_open_cell(next_cellspec = {}) ⇒ Object
If the current cell is open, close it. In addition, push the cell spec captured from the end of this cell onto the stack for use by the next cell.
Returns nothing.
# File 'lib/asciidoctor/table.rb', line 578
def close_open_cell(next_cellspec = {})
  push_cellspec next_cellspec
  close_cell(true) if cell_open?
  advance
  nil
end
#keep_cell_open ⇒ Object
Marks that the cell should be kept open. Used when the end of the line is reached and the cell may contain additional text.
Returns nothing.
# File 'lib/asciidoctor/table.rb', line 545
def keep_cell_open
  @cell_open = true
  nil
end
#mark_cell_closed ⇒ Object
Marks the cell as closed so that the parser knows to instantiate a new cell instance and add it to the current row.
Returns nothing.
# File 'lib/asciidoctor/table.rb', line 554
def mark_cell_closed
  @cell_open = false
  nil
end
#match_delimiter(line) ⇒ Object
Checks whether the line provided contains the cell delimiter used by this table.
Returns Regexp MatchData if the line contains the delimiter, nil otherwise.
# File 'lib/asciidoctor/table.rb', line 484
def match_delimiter(line)
  @delimiter_rx.match(line)
end
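Since the method delegates to `Regexp#match`, its result is either a MatchData object or nil. A short sketch of what a caller can read from that result, using the default psv pattern and made-up sample lines:

```ruby
delimiter_rx = /\|/ # default psv pattern from DELIMITERS

hit = delimiter_rx.match('left|right')          # MatchData
miss = delimiter_rx.match('no delimiter here')  # nil

before = hit.pre_match  # text preceding the delimiter
after = hit.post_match  # text following the delimiter
no_match = miss.nil?
```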
#push_cellspec(cellspec = {}) ⇒ Object
Puts a cell spec onto the stack. Cell specs precede the delimiter, so a stack is used to carry over the spec to the next cell.
Returns nothing.
# File 'lib/asciidoctor/table.rb', line 535
def push_cellspec(cellspec = {})
  # this shouldn't be nil, but we check anyway
  @cellspecs << (cellspec || {})
  nil
end
#skip_past_delimiter(pre) ⇒ void
This method returns an undefined value.
Skip past the matched delimiter because it’s inside quoted text.
# File 'lib/asciidoctor/table.rb', line 491
def skip_past_delimiter(pre)
  @buffer = %(#{@buffer}#{pre}#{@delimiter})
  nil
end
#skip_past_escaped_delimiter(pre) ⇒ void
This method returns an undefined value.
Skip past the matched delimiter because it’s escaped.
# File 'lib/asciidoctor/table.rb', line 499
def skip_past_escaped_delimiter(pre)
  @buffer = %(#{@buffer}#{pre.chop}#{@delimiter})
  nil
end
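The two skip methods differ only in the call to `chop`, which removes the last character of the text preceding the delimiter (the escape character). The sketch below applies the same string-building expressions as standalone lambdas so the difference is visible; the lambda names and sample inputs are made up for illustration.

```ruby
# Same expressions as the two methods above, written as standalone
# lambdas that take the buffer and delimiter as arguments.
skip_quoted = ->(buffer, pre, delimiter) { %(#{buffer}#{pre}#{delimiter}) }
skip_escaped = ->(buffer, pre, delimiter) { %(#{buffer}#{pre.chop}#{delimiter}) }

# Delimiter inside quoted CSV text: everything is kept as-is.
quoted = skip_quoted.call('"a', 'b', ',')

# Escaped delimiter: the trailing backslash of pre is dropped.
escaped = skip_escaped.call('', 'a\\', '|')
```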
#starts_with_delimiter?(line) ⇒ Boolean
Checks whether the line provided starts with the cell delimiter used by this table.
Returns true if the line starts with the delimiter, false otherwise.
# File 'lib/asciidoctor/table.rb', line 476
def starts_with_delimiter?(line)
  line.start_with? @delimiter
end
#take_cellspec ⇒ Object
Takes a cell spec from the stack. Cell specs precede the delimiter, so a stack is used to carry over the spec from the previous cell to the current cell when the cell is being closed.
Returns the cell spec Hash captured from parsing the previous cell, or nil if no cell spec is available.
# File 'lib/asciidoctor/table.rb', line 527
def take_cellspec
  @cellspecs.shift
end
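Although the descriptions refer to a stack, the code shown pairs `<<` (append) in push_cellspec with `shift` (remove from the front) in take_cellspec, so specs come back in the order they were added. A minimal sketch with a plain Array and made-up spec values:

```ruby
cellspecs = []

cellspecs << { 'colspan' => 2 } # as in push_cellspec
cellspecs << {}                 # spec captured for the following cell

first = cellspecs.shift         # as in take_cellspec: oldest spec first
second = cellspecs.shift
third = cellspecs.shift         # nothing left to take
```

The nil result for an empty Array is what close_cell relies on to detect a missing leading separator in psv data.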