Class: GithubToCanvasQuiz::Parser::Markdown::Helpers::NodeScanner

Inherits:
Object
Defined in:
lib/github_to_canvas_quiz/parser/markdown/helpers/node_scanner.rb

Overview

Loosely based on the Ruby `StringScanner` class. Allows position-based traversal of a `Nokogiri::XML::NodeSet`:

html = '<h1>Hello</h1><h2>World</h2><h3>end.</h3>'
scanner = NodeScanner.new(html)

# scan and return nodes before the first H3
nodes = scanner.scan_before('h3')
nodes.first.content # => 'Hello'
nodes.last.content  # => 'World'
scanner.cursor      # => 2

# scan the current node if it is a H3
h3 = scanner.scan('h3')
h3.content          # => 'end.'
scanner.eof?        # => true
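
For comparison, the sketch below runs the string-based counterparts of these calls on Ruby's standard `StringScanner`. It is only an analogy: `StringScanner` tracks a character position in a string, while this class tracks an index into a node set, and the sample string is made up for illustration.

```ruby
require 'strscan'

# StringScanner walks a string by character position; the node scanner
# applies the same idea to positions in a node set.
scanner = StringScanner.new('Hello World end.')

checked = scanner.check(/Hello/)    # => "Hello" (look ahead only)
pos_after_check = scanner.pos       # => 0

scanned = scanner.scan(/Hello/)     # => "Hello" (consumes the match)
pos_after_scan = scanner.pos        # => 5

upto = scanner.scan_until(/World/)  # => " World" (up to and including the match)
rest = scanner.rest                 # => " end."

scanner.terminate                   # jump to the end of the string
at_end = scanner.eos?               # => true
```

The `check` and `scan` pair shows the same split documented here: `check*` methods look ahead only, while `scan*` methods also move the position.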

Constructor Details

#initialize(nodes) ⇒ NodeScanner

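Creates a new instance from a `Nokogiri::XML::NodeSet` or an HTML string. A string is parsed with `Nokogiri::HTML5.fragment`, and its top-level children become the node set. Any other argument raises `TypeError`.

Parameters:

  • nodes (Nokogiri::XML::NodeSet, String)

    HTML nodes to be scanned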



# File 'lib/github_to_canvas_quiz/parser/markdown/helpers/node_scanner.rb', line 32

def initialize(nodes)
  case nodes
  when Nokogiri::XML::NodeSet
    @node_set = nodes
  when String
    @node_set = Nokogiri::HTML5.fragment(nodes).children
  else
    raise TypeError, "expected a Nokogiri::XML::NodeSet or String, got #{nodes.class.name}"
  end

  @cursor = 0
end

Instance Attribute Details

#cursor ⇒ Object

Returns the value of attribute cursor.



# File 'lib/github_to_canvas_quiz/parser/markdown/helpers/node_scanner.rb', line 25

def cursor
  @cursor
end

#node_set ⇒ Object (readonly)

Returns the value of attribute node_set.



# File 'lib/github_to_canvas_quiz/parser/markdown/helpers/node_scanner.rb', line 24

def node_set
  @node_set
end

Instance Method Details

#check(selector) ⇒ Object

Checks whether the current node matches the selector without updating the cursor. Returns the current node if it matches. Otherwise, or at the end of the node set, returns `nil`.

html = '<h1>Hello</h1><h2>World</h2><h3>end.</h3>'
scanner = NodeScanner.new(html)

h1 = scanner.check('h1')
h1.content     # => 'Hello'
scanner.cursor # => 0


# File 'lib/github_to_canvas_quiz/parser/markdown/helpers/node_scanner.rb', line 150

def check(selector)
  return if eof? || !current.matches?(selector)

  current
end

#check_before(selector) ⇒ Object

Does not update the cursor. Looks ahead, starting at the node after the cursor, for the first node matching the selector.

Returns a `NodeSet` of all nodes from the current cursor position up to, but not including, the found node. Returns `nil` if no node matches.

html = '<h1>Hello</h1><h2>World</h2><h3>end.</h3>'
scanner = NodeScanner.new(html)

nodes = scanner.check_before('h2')
nodes.last.content  # => 'Hello'
scanner.cursor      # => 0


# File 'lib/github_to_canvas_quiz/parser/markdown/helpers/node_scanner.rb', line 192

def check_before(selector)
  scan_cursor = cursor + 1
  while scan_cursor < node_set.length
    # Return the nodes before the match; the cursor is left where it was
    return node_set[cursor..scan_cursor - 1] if node_set[scan_cursor].matches?(selector)

    scan_cursor += 1
  end
end

#check_until(selector) ⇒ Object

Does not update the cursor. Looks ahead, starting at the current node, for the first node matching the selector.

Returns a `NodeSet` of all nodes from the current cursor position up to and including the found node. Returns `nil` if no node matches.

html = '<h1>Hello</h1><h2>World</h2><h3>end.</h3>'
scanner = NodeScanner.new(html)

nodes = scanner.check_until('h2')
nodes.last.content  # => 'World'
scanner.cursor      # => 0


# File 'lib/github_to_canvas_quiz/parser/markdown/helpers/node_scanner.rb', line 168

def check_until(selector)
  scan_cursor = cursor
  while scan_cursor < node_set.length
    if node_set[scan_cursor].matches?(selector)
      found_nodes = node_set[cursor..scan_cursor]
      return found_nodes
    end

    scan_cursor += 1
  end
end

#current ⇒ Object

Returns the node at the current cursor position, or `nil` when the cursor is past the end of the node set.



# File 'lib/github_to_canvas_quiz/parser/markdown/helpers/node_scanner.rb', line 51

def current
  node_set[cursor]
end

#eof? ⇒ Boolean

Returns whether the scanner has reached the end of the `node_set`.

Returns:

  • (Boolean)


# File 'lib/github_to_canvas_quiz/parser/markdown/helpers/node_scanner.rb', line 46

def eof?
  cursor >= node_set.length
end

#scan(selector) ⇒ Object

Checks whether the current node matches the selector. If it does, updates the cursor position to the index after the matched node and returns that node. Otherwise, returns `nil`.

html = '<h1>Hello</h1><h2>World</h2><h3>end.</h3>'
scanner = NodeScanner.new(html)

h1 = scanner.scan('h1')
h1.content     # => 'Hello'
scanner.cursor # => 1


# File 'lib/github_to_canvas_quiz/parser/markdown/helpers/node_scanner.rb', line 65

def scan(selector)
  scanned_node = current
  return unless scanned_node.matches?(selector)

  self.cursor += 1
  scanned_node
end
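
Unlike `#check`, the `#scan` body above does not test `eof?` before calling `matches?`. The sketch below reproduces that behaviour on stand-ins and shows one possible guard. `StubNode`, `GuardedScanner`, and `safe_scan` are hypothetical names that are not part of the library, and a plain `Array` stands in for the node set on the assumption that it also returns `nil` past its end.

```ruby
# Hypothetical stand-in for a node: only `matches?` and `content` are mimicked.
StubNode = Struct.new(:name, :content) do
  def matches?(selector)
    name == selector
  end
end

# Hypothetical stand-in scanner over a plain Array.
class GuardedScanner
  attr_accessor :cursor
  attr_reader :node_set

  def initialize(nodes)
    @node_set = nodes
    @cursor = 0
  end

  def eof?
    cursor >= node_set.length
  end

  def current
    node_set[cursor]
  end

  # Same body as the documented #scan
  def scan(selector)
    scanned_node = current
    return unless scanned_node.matches?(selector)

    self.cursor += 1
    scanned_node
  end

  # Guarded variant (an assumption, not library API): nil at EOF
  def safe_scan(selector)
    return if eof?

    scan(selector)
  end
end

scanner = GuardedScanner.new([StubNode.new('h1', 'Hello')])
first = scanner.scan('h1') # consumes the only node, cursor is now 1

# At EOF, `current` is nil, so calling `matches?` on it raises
raised = begin
  scanner.scan('h1')
  false
rescue NoMethodError
  true
end

guarded = scanner.safe_scan('h1') # nil instead of an exception
```

In calling code, checking `scanner.eof?` before each `scan` achieves the same effect without changing the class.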

#scan_before(selector) ⇒ Object

Scans ahead, starting at the node after the cursor, until a node matching the selector is reached, and updates the cursor position to the index of the matched node.

Returns a `NodeSet` of all nodes from the previous cursor position up to, but not including, the found node. Returns `nil` and leaves the cursor unchanged if no node matches.

html = '<h1>Hello</h1><h2>World</h2><h3>end.</h3>'
scanner = NodeScanner.new(html)

nodes = scanner.scan_before('h2')
nodes.last.content  # => 'Hello'
scanner.cursor      # => 1


# File 'lib/github_to_canvas_quiz/parser/markdown/helpers/node_scanner.rb', line 110

def scan_before(selector)
  scan_cursor = cursor + 1
  while scan_cursor < node_set.length
    if node_set[scan_cursor].matches?(selector)
      found_nodes = node_set[cursor..scan_cursor - 1]
      self.cursor = scan_cursor
      return found_nodes
    end

    scan_cursor += 1
  end
end

#scan_rest ⇒ Object

Scans until the end of the node set, and updates the cursor position to the end. Returns a `NodeSet` of all the nodes from the cursor position to the end.

html = '<h1>Hello</h1><h2>World</h2><h3>end.</h3>'
scanner = NodeScanner.new(html)

nodes = scanner.scan_before('h2')
nodes.last.content  # => 'Hello'
scanner.cursor      # => 1
nodes = scanner.scan_rest
nodes.last.content  # => 'end.'
scanner.cursor      # => 3


# File 'lib/github_to_canvas_quiz/parser/markdown/helpers/node_scanner.rb', line 135

def scan_rest
  found_nodes = node_set[cursor..node_set.length - 1]
  self.cursor = node_set.length
  found_nodes
end
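
`#scan_before` and `#scan_rest` can be combined to split a flat list of nodes into sections that each start with a heading. The following is a minimal sketch of that pattern, not code from the library: `FakeNode` and `MiniScanner` are hypothetical stand-ins that copy the cursor logic shown above onto a plain `Array`, so the example runs without Nokogiri.

```ruby
# Hypothetical stand-in for a node: only `matches?` and `content` are mimicked.
FakeNode = Struct.new(:name, :content) do
  def matches?(selector)
    name == selector
  end
end

# Hypothetical stand-in scanner with the same cursor logic as documented.
class MiniScanner
  attr_accessor :cursor
  attr_reader :node_set

  def initialize(nodes)
    @node_set = nodes
    @cursor = 0
  end

  def eof?
    cursor >= node_set.length
  end

  def scan_before(selector)
    scan_cursor = cursor + 1
    while scan_cursor < node_set.length
      if node_set[scan_cursor].matches?(selector)
        found_nodes = node_set[cursor..scan_cursor - 1]
        self.cursor = scan_cursor
        return found_nodes
      end

      scan_cursor += 1
    end
  end

  def scan_rest
    found_nodes = node_set[cursor..node_set.length - 1]
    self.cursor = node_set.length
    found_nodes
  end
end

nodes = [
  FakeNode.new('h2', 'Question'),
  FakeNode.new('p', 'What is 1 + 1?'),
  FakeNode.new('h2', 'Answer'),
  FakeNode.new('p', '2')
]

scanner = MiniScanner.new(nodes)
sections = []
until scanner.eof?
  # scan_before returns nil once no later h2 exists, so fall back to scan_rest
  sections << (scanner.scan_before('h2') || scanner.scan_rest)
end

section_titles = sections.map { |section| section.first.content }
# => ["Question", "Answer"]
```

Because `scan_before` starts its search at the node after the cursor, the heading under the cursor stays at the front of each section instead of matching itself and producing an empty result.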

#scan_until(selector) ⇒ Object

Scans ahead, starting at the current node, until a node matching the selector is reached, and updates the cursor position to the index after the matched node.

Returns a `NodeSet` of all nodes from the previous cursor position up to and including the found node. Returns `nil` and leaves the cursor unchanged if no node matches.

html = '<h1>Hello</h1><h2>World</h2><h3>end.</h3>'
scanner = NodeScanner.new(html)

nodes = scanner.scan_until('h2')
nodes.last.content  # => 'World'
scanner.cursor      # => 2


# File 'lib/github_to_canvas_quiz/parser/markdown/helpers/node_scanner.rb', line 85

def scan_until(selector)
  scan_cursor = cursor
  while scan_cursor < node_set.length
    if node_set[scan_cursor].matches?(selector)
      found_nodes = node_set[cursor..scan_cursor]
      self.cursor = scan_cursor + 1
      return found_nodes
    end

    scan_cursor += 1
  end
end