Module: GoogleSiteSearch
- Defined in:
  lib/google-site-search.rb,
  lib/google-site-search/result.rb,
  lib/google-site-search/search.rb,
  lib/google-site-search/version.rb,
  lib/google-site-search/url_builder.rb
Overview
A module to help query and parse the Google Site Search API.
Defined Under Namespace
Classes: ParsingError, Result, Search, UrlBuilder
Constant Summary
- GOOGLE_SEARCH_URL = "http://www.google.com"
- DEFAULT_PARAMS = { :client => "google-csbe", :output => "xml_no_dtd", }
- VERSION = "0.0.8"
Class Method Summary
- .caching_key(url) ⇒ Object
  Takes a URL, strips out unneeded query params, and compresses a string representation.
- .paginate(url, search_engine_id) ⇒ Object
  Expects the URL returned by Search#next_results_url or Search#previous_results_url.
- .query(url, result_class = Result) {|search_result| ... } ⇒ Object
  See Search. This is a convenience method for creating and querying.
- .query_multiple(times, url, result_class = Result, &block) ⇒ Object
  See Search. Retrieves up to (times) searches if they are available (i.e. stops when a search has no next_results_url).
- .relative_path(path) ⇒ Object
  Google returns a result link as an absolute URL, but you may want a relative version.
- .request_xml(url) ⇒ Object
  Makes a request to the Google search API and returns the XML response as a string.
- .separate_search_term_from_filters(string) ⇒ Object
  Google’s API returns the full query with the filter options appended.
Class Method Details
.caching_key(url) ⇒ Object
Takes a URL, strips out unneeded query params, and compresses a string representation. The intent is to have a small string to use as a caching key.

    # File 'lib/google-site-search.rb', line 32
    def caching_key url
      params = Rack::Utils.parse_query(URI.parse(url).query)
      # ei = "Passes on an alphanumeric parameter that decodes the originating SERP where user clicked on a related search". Don't fully understand what it does but it makes my caching less effective.
      params.delete("ei")
      key = params.map{|k,v| k.to_s + v.to_s}.sort.join
      key.blank? ? nil : RSmaz.compress(key)
    end
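
The key-building step can be tried without the gem's dependencies. The following is a minimal sketch, not the library's code: it assumes the standard library's URI.decode_www_form in place of Rack::Utils.parse_query (the two differ for repeated keys) and skips the RSmaz compression entirely. The name caching_key_sketch is hypothetical.

```ruby
require "uri"

# Hypothetical stand-in for .caching_key: builds the uncompressed key only.
# Assumption: URI.decode_www_form replaces Rack::Utils.parse_query, and the
# RSmaz compression step is left out.
def caching_key_sketch(url)
  query = URI.parse(url).query
  return nil if query.nil? || query.empty?

  params = URI.decode_www_form(query).to_h
  params.delete("ei") # varies between requests, so it would defeat caching
  key = params.map { |k, v| k.to_s + v.to_s }.sort.join
  key.empty? ? nil : key
end

first  = caching_key_sketch("http://www.google.com/search?q=ruby&cx=abc&ei=XYZ")
second = caching_key_sketch("http://www.google.com/search?cx=abc&ei=123&q=ruby")
puts first == second
```

Because the pairs are sorted and "ei" is dropped, two URLs that differ only in parameter order or in their "ei" value map to the same key.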
.paginate(url, search_engine_id) ⇒ Object
Expects the URL returned by Search#next_results_url or Search#previous_results_url.

    # File 'lib/google-site-search.rb', line 41
    def paginate url, search_engine_id
      raise StandardError, "search_engine_id required" if search_engine_id.blank?
      uri = URI.parse(url.to_s)
      raise StandardError, "url seems to be invalid, parameters expected" if uri.query.blank?
      if uri.relative?
        uri.host = "www.google.com"
        uri.scheme = "http"
      end
      uri.query = uri.query += "&cx=#{search_engine_id}"
      uri.to_s
    end
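
To see what this produces for a relative URL, here is a minimal stand-alone sketch that mirrors the steps above using only the standard library. The helper name paginate_sketch, the sample URL, and the engine id "my-engine-id" are made up for illustration, and plain emptiness checks stand in for ActiveSupport's blank?.

```ruby
require "uri"

# Hypothetical stand-in for .paginate, written without ActiveSupport.
def paginate_sketch(url, search_engine_id)
  raise ArgumentError, "search_engine_id required" if search_engine_id.to_s.empty?

  uri = URI.parse(url.to_s)
  raise ArgumentError, "parameters expected" if uri.query.to_s.empty?

  # A relative URL is completed with the Google host and scheme.
  if uri.relative?
    uri.host = "www.google.com"
    uri.scheme = "http"
  end

  # The search engine id is appended as the cx parameter.
  uri.query = "#{uri.query}&cx=#{search_engine_id}"
  uri.to_s
end

puts paginate_sketch("/search?q=ruby&start=10", "my-engine-id")
```

An absolute URL keeps its own host and scheme and only gains the cx parameter.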
.query(url, result_class = Result) {|search_result| ... } ⇒ Object
See Search. This is a convenience method for creating and querying. This method can accept a block, which can access the resulting search object.

    # File 'lib/google-site-search.rb', line 55
    def query url, result_class = Result, &block
      search_result = Search.new(url, result_class).query
      yield(search_result) if block_given?
      search_result
    end
.query_multiple(times, url, result_class = Result, &block) ⇒ Object
See Search. Retrieves up to (times) searches if they are available (i.e. stops when a search has no next_results_url). This method can accept a block, which can access the resulting search object.

    # File 'lib/google-site-search.rb', line 64
    def query_multiple times, url, result_class = Result, &block
      searchs = [query(url, result_class, &block).query]
      while (times=times-1) > 0
        next_results_url = searchs.last.try(:next_results_url)
        break if next_results_url.blank?
        searchs << search_result = query(url, result_class, &block).query
      end
      searchs
    end
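
The behaviour described above (fetch at most a given number of pages and stop early when there is no next page) can be sketched without any network access. This is an illustration of the description, not the library's implementation: stub_pages is a hypothetical in-memory stand-in for the API, each page is a plain hash, and only the next_results_url key mirrors the documented Search interface. With the real API, each next URL would presumably also need to pass through .paginate to regain its cx parameter.

```ruby
# Hypothetical in-memory "API": maps a URL to the page it would return.
def stub_pages
  {
    "/search?start=0"  => { name: "page 1", next_results_url: "/search?start=10" },
    "/search?start=10" => { name: "page 2", next_results_url: "/search?start=20" },
    "/search?start=20" => { name: "page 3", next_results_url: nil }
  }
end

# Collects up to `times` pages, following next_results_url until it runs out.
def fetch_up_to(times, url)
  pages = []
  while times > 0 && url
    page = stub_pages.fetch(url)
    yield page if block_given? # the block sees every page as it arrives
    pages << page
    url = page[:next_results_url]
    times -= 1
  end
  pages
end

puts fetch_up_to(2, "/search?start=0").map { |page| page[:name] }.inspect
puts fetch_up_to(5, "/search?start=0").length
```

The second call asks for five pages but returns three, because the last stub page has no next_results_url.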
.relative_path(path) ⇒ Object
Google returns a result link as an absolute URL, but you may want a relative version.

    # File 'lib/google-site-search.rb', line 82
    def relative_path path
      uri = URI.parse(path)
      uri.relative? ? path : [uri.path,uri.query].compact.join("?")
    end
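
A short sketch of the conversion, using the same logic as the listing above in a stand-alone method so it runs without the gem. The method name relative_path_sketch and the example.com URLs are made up for illustration.

```ruby
require "uri"

# Stand-alone copy of the logic shown above, for experimentation.
def relative_path_sketch(path)
  uri = URI.parse(path)
  # compact drops a nil query, so no trailing "?" is produced.
  uri.relative? ? path : [uri.path, uri.query].compact.join("?")
end

puts relative_path_sketch("http://www.example.com/docs/page.html?id=3") # /docs/page.html?id=3
puts relative_path_sketch("http://www.example.com/about")               # /about
puts relative_path_sketch("/docs/page.html")                            # /docs/page.html
```

Paths that are already relative are returned unchanged.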
.request_xml(url) ⇒ Object
Makes a request to the Google search API and returns the XML response as a string. Returns nil if the response is not a success.

    # File 'lib/google-site-search.rb', line 75
    def request_xml url
      response = Net::HTTP.get_response(URI.parse(url.to_s))
      response.body if response.is_a?(Net::HTTPSuccess)
    end
.separate_search_term_from_filters(string) ⇒ Object
Google’s API returns the full query with the filter options appended. This method splits it into the search term and the filters so they can be handled separately.

    # File 'lib/google-site-search.rb', line 88
    def separate_search_term_from_filters(string)
      match = /\smore:p.*/.match(string)
      return [string, nil] if match.nil?
      return [match.pre_match.strip, match[0].strip]
    end
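
The split can be tried in isolation. The sketch below copies the regular expression from the listing above into a stand-alone method; the method name separate_sketch and the sample query strings are made up for illustration.

```ruby
# Stand-alone copy of the logic shown above, for experimentation.
def separate_sketch(string)
  # Everything from the first " more:p" onward is treated as filters.
  match = /\smore:p.*/.match(string)
  return [string, nil] if match.nil?

  [match.pre_match.strip, match[0].strip]
end

p separate_sketch("ruby gems more:pagemap:document-type:article")
# ["ruby gems", "more:pagemap:document-type:article"]
p separate_sketch("ruby gems")
# ["ruby gems", nil]
```

A query without filters comes back unchanged, with nil in the filter position.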