Class: NHKore::DictScraper

Inherits:
Scraper
  • Object
Defined in:
lib/nhkore/dict_scraper.rb

Overview

Author:

  • Jonathan Bradley Whited

Since:

  • 0.2.0

Constant Summary

Constants inherited from Scraper

Scraper::DEFAULT_HEADER

Instance Attribute Summary

Attributes inherited from Scraper

#kargs, #max_redirects, #max_retries, #redirect_rule, #str_or_io, #url

Class Method Summary

Instance Method Summary

Methods inherited from Scraper

#fetch_cookie, #html_doc, #join_url, #open, #open_file, #open_url, #read, #reopen, #rss_doc

Constructor Details

#initialize(url, missingno: nil, parse_url: true, **kargs) ⇒ DictScraper

Returns a new instance of DictScraper.

Since:

  • 0.2.0



# File 'lib/nhkore/dict_scraper.rb', line 26

def initialize(url,missingno: nil,parse_url: true,**kargs)
  url = self.class.parse_url(url) if parse_url

  super(url,**kargs)

  @missingno = missingno
end

Instance Attribute Details

#missingno ⇒ Object

Since:

  • 0.2.0



# File 'lib/nhkore/dict_scraper.rb', line 24

def missingno
  @missingno
end

Class Method Details

.parse_url(url, basename: nil) ⇒ Object

Raises:

  • (ParseError) if the URL is empty after stripping

Since:

  • 0.2.0



# File 'lib/nhkore/dict_scraper.rb', line 34

def self.parse_url(url,basename: nil)
  url = Util.strip_web_str(url.to_s)

  raise ParseError,"cannot parse dictionary URL from URL[#{url}]" if url.empty?

  i = url.rindex(%r{[/\\]}) # Can be a URL or a file
  i = i.nil? ? 0 : (i + 1) # If no match found, no path

  basename = File.basename(url[i..],'.*') if basename.nil?
  path = url[0...i]

  return "#{path}#{basename}.out.dic"
end
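
As a rough illustration of the path handling above, the following standalone sketch mirrors the logic of parse_url without depending on the gem. The method name dict_url_for and the example.com URLs are hypothetical. String#strip stands in for Util.strip_web_str, whose exact behavior is not shown here, and ArgumentError stands in for ParseError.

```ruby
# Standalone sketch of the URL rewriting shown in parse_url.
# Assumptions: `dict_url_for` is a hypothetical name, String#strip stands in
# for Util.strip_web_str, and ArgumentError stands in for ParseError.
def dict_url_for(url, basename: nil)
  url = url.to_s.strip

  raise ArgumentError, "cannot parse dictionary URL from URL[#{url}]" if url.empty?

  # Find the last path separator; this handles URLs and file paths alike.
  i = url.rindex(%r{[/\\]})
  i = i.nil? ? 0 : (i + 1) # No separator means there is no directory part.

  # Drop the extension from the last segment unless a basename was given.
  basename = File.basename(url[i..], '.*') if basename.nil?
  path = url[0...i]

  "#{path}#{basename}.out.dic"
end

# Hypothetical inputs, for illustration only.
puts dict_url_for('https://example.com/easy/k1001/k1001.html')
puts dict_url_for('k1001.html')
puts dict_url_for('https://example.com/easy/k1001/k1001.html', basename: 'other')
```

In each case the directory part of the input is preserved, and only the last segment is replaced by the basename plus the .out.dic suffix.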

Instance Method Details

#scrape ⇒ Object

Since:

  • 0.2.0



# File 'lib/nhkore/dict_scraper.rb', line 48

def scrape
  require 'json'

  str = read # Make sure it has all been read.
  str = str.string if str.respond_to?(:string) # For StringIO.

  json = JSON.parse(str)

  return Dict.new if json.nil?

  hash = json['reikai']

  return Dict.new if hash.nil?

  hash = hash['entries']

  return Dict.new if hash.nil?
  return Dict.scrape(hash,missingno: @missingno,url: @url)
end
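
The nil-guarded lookup in #scrape can be tried in isolation with the sketch below. It is not the gem's code: extract_entries is a hypothetical helper, and the payloads are made up. Only the 'reikai' and 'entries' keys come from the source above, and the real shape of each entry is not shown in this page.

```ruby
require 'json'
require 'stringio'

# Sketch of the nil-guarded lookup performed by #scrape.
# Assumptions: `extract_entries` is a hypothetical helper, and it returns nil
# where #scrape would return an empty Dict.
def extract_entries(str_or_io)
  # StringIO responds to #string; a plain String does not.
  str = str_or_io.respond_to?(:string) ? str_or_io.string : str_or_io.to_s

  json = JSON.parse(str)
  return nil if json.nil? # The document was the JSON literal `null`.

  reikai = json['reikai']
  return nil if reikai.nil?

  reikai['entries'] # Also nil when the key is absent.
end

# Made-up payloads, for illustration only.
p extract_entries('{"reikai":{"entries":{"id1":["placeholder"]}}}')
p extract_entries(StringIO.new('{"reikai":{}}'))
p extract_entries('null')
```

Only the first call yields a hash of entries. The other two return nil, which corresponds to the branches where #scrape returns Dict.new.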