Class: Lac::Page
- Inherits: Object
  - Object
  - Lac::Page
- Defined in: lib/lac/page.rb
Instance Attribute Summary
- #html ⇒ Object
  Returns the value of attribute html.
Class Method Summary
- .by_html(html) ⇒ Object
- .by_html_string(html_string) ⇒ Object
- .by_url(url) ⇒ Object
  Builds a Page from a URL. Together with .by_html and .by_html_string, one of the helpers for initialising a Page from any starting point.
- .get_page(url, from_cache = true) ⇒ Object
  Fetches a web page by URL, using a cached copy when one is available.
Instance Method Summary
- #collection_by_selector(selector) ⇒ Object
  Returns an array of Page objects, one per element matching the selector.
- #initialize(html: nil) ⇒ Page
  constructor
  A new instance of Page.
- #try_css(css) ⇒ Object
- #try_css_attr(css, attr) ⇒ Object
  Returns the attr attribute of the first element matching css, or nil if nothing matches.
- #try_css_parent_attr(css, attr) ⇒ Object
Constructor Details
#initialize(html: nil) ⇒ Page
Returns a new instance of Page.
    # File 'lib/lac/page.rb', line 50
    def initialize html: nil
      self.html = html
    end
Instance Attribute Details
#html ⇒ Object
Returns the value of attribute html.
    # File 'lib/lac/page.rb', line 7
    def html
      @html
    end
Class Method Details
.by_html(html) ⇒ Object
    # File 'lib/lac/page.rb', line 64
    def self.by_html(html)
      self.new(html: html)
    end
.by_html_string(html_string) ⇒ Object
    # File 'lib/lac/page.rb', line 60
    def self.by_html_string(html_string)
      self.new(html: Nokogiri::HTML(html_string))
    end
.by_url(url) ⇒ Object
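Builds a Page from a URL: the page is fetched with .get_page and parsed with Nokogiri::HTML. Together with .by_html and .by_html_string, one of the helpers for initialising a Page from any starting point.
    # File 'lib/lac/page.rb', line 56
    def self.by_url(url)
      self.new(html: Nokogiri::HTML(get_page(url)))
    end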
.get_page(url, from_cache = true) ⇒ Object
Fetches the web page at url and returns its body. With from_cache = true (the default), a cached copy is used if one exists; otherwise the page is downloaded and written to the cache. Cache files are stored under cache/ and named after the SHA-256 hex digest of the URL.
    # File 'lib/lac/page.rb', line 36
    def self.get_page(url, from_cache = true)
      url_hash = Digest::SHA256.hexdigest(url)
      filename = "cache/#{url_hash}"
      if from_cache && File.file?(filename)
        result = open(filename).read
        puts "Gotten #{filename} from cache"
      else
        result = open(url).read
        File.write(filename, result)
        puts "Written cache file #{filename}"
      end
      return result
    end
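The caching scheme used by .get_page can be sketched on its own, roughly as follows. This is a minimal illustration, not the library's code: cached_fetch, its cache_dir: and from_cache: keywords, and the block that stands in for the real download are all hypothetical names introduced here. Unlike the listing above, the sketch creates the cache directory itself, since File.write raises if the directory is missing.

```ruby
require "digest"
require "fileutils"
require "tmpdir"

# Hypothetical helper, not part of Lac::Page. Returns the cached body for
# url if one exists; otherwise calls the block to fetch it and stores the result.
def cached_fetch(url, cache_dir:, from_cache: true)
  FileUtils.mkdir_p(cache_dir) # make sure the cache directory exists
  filename = File.join(cache_dir, Digest::SHA256.hexdigest(url))

  return File.read(filename) if from_cache && File.file?(filename)

  body = yield(url) # stand-in for the real download
  File.write(filename, body)
  body
end

Dir.mktmpdir do |dir|
  calls = 0
  fetch = ->(_url) { calls += 1; "<html>hello</html>" }

  first  = cached_fetch("http://example.com/", cache_dir: dir, &fetch)
  second = cached_fetch("http://example.com/", cache_dir: dir, &fetch)

  puts first == second # prints true
  puts calls           # prints 1: the second call was served from the cache
end
```

Passing from_cache: false skips the cache lookup but still rewrites the cache file, which matches the behaviour of the listing above.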
Instance Method Details
#collection_by_selector(selector) ⇒ Object
Returns an array of Page objects, one per element matching the CSS selector. Use it to gather repeated elements from a page.
    # File 'lib/lac/page.rb', line 71
    def collection_by_selector(selector)
      self.html.css(selector).map{|item| Lac::Page.by_html(item)}
    end
#try_css(css) ⇒ Object
    # File 'lib/lac/page.rb', line 27
    def try_css(css)
      self.html.css(css).first
    end
#try_css_attr(css, attr) ⇒ Object
Returns the attr attribute of the first element matching css, or nil if nothing matches.
    # File 'lib/lac/page.rb', line 11
    def try_css_attr(css, attr)
      if element = self.try_css(css)
        element.attr(attr)
      else
        nil
      end
    end
#try_css_parent_attr(css, attr) ⇒ Object
    # File 'lib/lac/page.rb', line 19
    def try_css_parent_attr(css, attr)
      if element = try_css(css)
        element.parent.attr(attr)
      else
        nil
      end
    end
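Both try_css_attr and try_css_parent_attr follow the same pattern: find the first match, then read an attribute, or return nil. A possible way to express that contract with Ruby's safe navigation operator is sketched below. FakeNode and FakePage are hypothetical stand-ins so the example runs without Nokogiri; they are not part of Lac. Note that `&.parent&.attr` also tolerates a missing parent, which the listing above does not.

```ruby
# Stand-in for a parsed element; NOT Nokogiri, just enough to run the sketch.
FakeNode = Struct.new(:attributes, :parent) do
  def attr(name)
    attributes[name]
  end
end

# Hypothetical page whose try_css looks nodes up in a plain hash.
class FakePage
  def initialize(nodes)
    @nodes = nodes
  end

  # Mirrors the contract of Page#try_css: first match or nil.
  def try_css(css)
    @nodes[css]
  end

  # Same contract as Page#try_css_attr, written with safe navigation.
  def try_css_attr(css, attr)
    try_css(css)&.attr(attr)
  end

  # Same contract as Page#try_css_parent_attr, but nil-safe on the parent too.
  def try_css_parent_attr(css, attr)
    try_css(css)&.parent&.attr(attr)
  end
end

list = FakeNode.new({ "id" => "menu" }, nil)
link = FakeNode.new({ "href" => "/about" }, list)
page = FakePage.new({ "a.about" => link })

p page.try_css_attr("a.about", "href")      # prints "/about"
p page.try_css_parent_attr("a.about", "id") # prints "menu"
p page.try_css_attr("a.missing", "href")    # prints nil
```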