Module: RfcReader::Search
Defined in: lib/rfc_reader/search.rb
Class Method Summary
- .fetch_by(term:) ⇒ String
The raw HTML of the search results for the given term.
- .parse(html) ⇒ Hash<String, String>
Parses the title and text file link of each RFC out of the search results HTML.
- .search_by(term:) ⇒ Hash<String, String>
A hash mapping each RFC title to its text file URL.
Class Method Details
.fetch_by(term:) ⇒ String
Returns the raw HTML of the search results for the given term.
    # File 'lib/rfc_reader/search.rb', line 20

    def self.fetch_by(term:)
      ErrorContext.wrap("Fetching RFC search results") do
        Net::HTTP.post_form(RFC_SEARCH_URI, { combo_box: term }).body
      end
    end
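To see what `Net::HTTP.post_form` would put on the wire for a given term, the request can be built without being sent. This is only a sketch: the endpoint below is a placeholder, since the value of `RFC_SEARCH_URI` is not shown on this page, and the `search_uri`, `request`, `form_body`, and `content_type` names are illustrative.

```ruby
require "net/http"
require "uri"

# Placeholder endpoint (assumption): the real RFC_SEARCH_URI is defined elsewhere.
search_uri = URI("https://example.invalid/search")

# Build, but do not send, a POST request carrying the same form field
# that fetch_by passes to Net::HTTP.post_form.
request = Net::HTTP::Post.new(search_uri)
request.set_form_data({ "combo_box" => "ftp" })

form_body = request.body
content_type = request["Content-Type"]

puts form_body    # combo_box=ftp
puts content_type # application/x-www-form-urlencoded
```

The search term travels as the URL-encoded form field `combo_box`, and the response body is what `fetch_by` returns as raw HTML.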
.parse(html) ⇒ Hash<String, String>
Parses the title and text file link of each RFC out of the search results HTML. Example of the HTML fragment that the title and link information is parsed from:
```html
<div class="scrolltable">
  <table class='gridtable'>
    <tr>
      <th>
        <a href='rfc_search_detail.php?sortkey=Number&sorting=DESC&page=25&title=ftp&pubstatus[]=Any&pub_date_type=any'>Number</a>
      </th>
      <th>Files</th>
      <th>Title</th>
      <th>Authors</th>
      <th>
        <a href='rfc_search_detail.php?sortkey=Date&sorting=DESC&page=25&title=ftp&pubstatus[]=Any&pub_date_type=any'>Date</a>
      </th>
      <th>More Info</th>
      <th>Status</th>
    </tr>
    <tr>
      <td>
        <a href="https://www.rfc-editor.org/info/rfc114" target="_blank">RFC 114</a>
      </td>
      <td>
        <a href="https://www.rfc-editor.org/rfc/rfc114.txt" target="_blank">ASCII</a>
        ,
        <a href="https://www.rfc-editor.org/pdfrfc/rfc114.txt.pdf" target="_blank">PDF</a>
        ,
        <a href="https://www.rfc-editor.org/rfc/rfc114.html" target="_blank">HTML</a>
      </td>
      <td class="title"> File Transfer Protocol </td>
      <td> A.K. Bhushan</td>
      <td>April 1971</td>
      <td>
        Updated by
        <a href="https://www.rfc-editor.org/info/rfc133" target="_blank">RFC 133</a>
        ,
        <a href="https://www.rfc-editor.org/info/rfc141" target="_blank">RFC 141</a>
        ,
        <a href="https://www.rfc-editor.org/info/rfc171" target="_blank">RFC 171</a>
        ,
        <a href="https://www.rfc-editor.org/info/rfc172" target="_blank">RFC 172</a>
      </td>
      <td>Unknown</td>
    </tr>
    …
```
    # File 'lib/rfc_reader/search.rb', line 75

    def self.parse(html)
      ErrorContext.wrap("Parsing RFC search results") do
        # NOTE: The first element in the table is just some general search information. See example HTML above.
        Nokogiri::HTML(html)
          .xpath("//div[@class='scrolltable']//table[@class='gridtable']//tr")
          .drop(1)
          .to_h do |tr_node|
            td_nodes = tr_node.elements

            title = td_nodes[2]
              .text
              .strip

            url = td_nodes[1]
              .elements
              .map { _1.attribute("href").text.strip }
              .find { _1.end_with?(".txt") }

            [title, url]
          end
      end
    end
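The row-to-pair step can be tried without Nokogiri by modelling each table row as plain Ruby data. The sketch below is not the library's code: `Row` is a hypothetical stand-in for a parsed `<tr>` node, holding the text of the title cell and the `href` values from the Files cell, populated from the example fragment above.

```ruby
# Hypothetical stand-in for a parsed <tr>: the title cell text and the
# hrefs of the links found in the Files cell.
Row = Struct.new(:title, :hrefs)

rows = [
  Row.new("Title", []), # header row, dropped below
  Row.new(" File Transfer Protocol ", [
    "https://www.rfc-editor.org/rfc/rfc114.txt",
    "https://www.rfc-editor.org/pdfrfc/rfc114.txt.pdf",
    "https://www.rfc-editor.org/rfc/rfc114.html",
  ]),
]

# Skip the header row, then build a title => URL hash, keeping only the
# link whose address ends in ".txt".
result = rows.drop(1).to_h do |row|
  title = row.title.strip
  url = row.hrefs.map(&:strip).find { |href| href.end_with?(".txt") }
  [title, url]
end

puts result.inspect
```

The PDF link ends in `.txt.pdf`, so `end_with?(".txt")` skips it and only the plain text URL is kept. A row without a `.txt` link would map its title to `nil`, because `find` returns `nil` when nothing matches.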
.search_by(term:) ⇒ Hash<String, String>
Returns a hash mapping each RFC title to its text file URL.
    # File 'lib/rfc_reader/search.rb', line 13

    def self.search_by(term:)
      html = fetch_by(term: term)
      parse(html)
    end