Class: Camdict::Client

Inherits:
Object
Defined in:
lib/camdict/client.rb

Overview

The client downloads all the useful data about a word or phrase from the remote Cambridge dictionaries, but does not include extended data. For example, when the word “mind” is searched, all four of its exactly matched entries are downloaded. However, separate entries like “turn of mind” and “open mind” are not included.

Instance Method Summary

Constructor Details

#initialize(dict = nil) ⇒ Client

The default dictionary is english-chinese-simplified. Other possible dict values are british, american-english, business-english, and learner-english.



# File 'lib/camdict/client.rb', line 16

def initialize(dict = nil)
  @dictionary = dict || "english-chinese-simplified"
end
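
The fallback to the default dictionary can be illustrated with a small stand-in class. This is only a sketch: `SketchClient` and `DEFAULT_DICTIONARY` are hypothetical names that mirror the `dict || ...` logic shown above, and the `dictionary` reader is added here for inspection; the real class may not expose one.

```ruby
# Hypothetical stand-in that mirrors the constructor shown above.
# It is not part of Camdict.
class SketchClient
  DEFAULT_DICTIONARY = "english-chinese-simplified"

  # Reader added only so the chosen dictionary can be inspected.
  attr_reader :dictionary

  def initialize(dict = nil)
    # A nil argument falls back to the default dictionary.
    @dictionary = dict || DEFAULT_DICTIONARY
  end
end

default_client = SketchClient.new
british_client = SketchClient.new("british")

puts default_client.dictionary
puts british_client.dictionary
```

With the library itself, the corresponding calls would be `Camdict::Client.new` and `Camdict::Client.new("british")`.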

Instance Method Details

#get_htmldef(url) ⇒ Object

Gets the HTML source of a word's definition from its entry URL.



# File 'lib/camdict/client.rb', line 54

def get_htmldef(url)
  html = Camdict::HTTP::Client.get_html(url)
  di_head(html) + di_body(html)
end

#html_definition(word) ⇒ Object

Gets a word’s HTML definition(s) by searching for it in the web dictionary. The result is an empty array when nothing is found, an array with a single hash element,

[{ word => html definition }],

or an array with many hash elements when the word has multiple entries,

[{ entry_id => html definition }, ...].

Normally, when a word has more than one meaning, its entry ID has the form word_nn. Otherwise it is just the word itself.
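
Because each element of the result is a single-pair hash, a caller may prefer one lookup table keyed by entry ID. The following is a minimal sketch that assumes sample data shaped like the return value described above. The contents of `sample_defs` and the helpers `merge_definitions` and `split_entry_id` are hypothetical and not part of Camdict.

```ruby
# Hypothetical sample shaped like the documented return value.
sample_defs = [
  { "mind_1" => "<div>first entry</div>" },
  { "mind_2" => "<div>second entry</div>" }
]

# Merge the single-pair hashes into one hash keyed by entry ID.
def merge_definitions(html_defs)
  html_defs.reduce({}) { |acc, pair| acc.merge(pair) }
end

# Split an entry ID of the form word_nn into [word, number].
# IDs without a numeric suffix are returned as [id, nil].
def split_entry_id(entry_id)
  match = entry_id.match(/\A(.+)_(\d+)\z/)
  match ? [match[1], match[2].to_i] : [entry_id, nil]
end

by_id = merge_definitions(sample_defs)
by_id.each_key { |id| p split_entry_id(id) }
```

Since the word_nn format is described only as the normal case, `split_entry_id` treats any ID without a numeric suffix as a plain word rather than raising an error.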



# File 'lib/camdict/client.rb', line 29

def html_definition(word)
  html = fetch(word)
  return [] if html.nil?
  html_defs = []
  # some words return their only definition directly, such as aluminium.
  if definition_page? html
    # entry id is just the word when there is only one definition
    html_defs << { word => di_head(html) + di_body(html) }
  else
    # The returned page is either a results page listing all matched and
    # related entries, or a spelling suggestion page when the word is not
    # found. mentry_links() returns all the exactly matched entry links, or
    # an empty array for empty and spelling suggestion pages.
    matched_urls = mentry_links(word, html)
    matched_urls.each do |url|
      html_defs << { entry_id(url) => get_htmldef(url) }
    end
  end
  html_defs
end