Class: GoodNews::Scraper
- Inherits:
- Object (hierarchy: Object > GoodNews::Scraper)
- Defined in:
- lib/good_news/scraper.rb
Constant Summary
- HOMEPAGEURL = "https://www.goodnewsnetwork.org/category/news/"
  A constant that stores the URL of the page the scraper starts from (the site's news category listing).
Class Method Summary
- .get_articles ⇒ Object
  Fetches and stores each topic's articles.
- .get_page(url) ⇒ Object
  Uses open-uri and Nokogiri to fetch and parse the HTML.
- .get_topics ⇒ Object
  Scrapes the topics and stores them.
Class Method Details
.get_articles ⇒ Object
Fetches and stores each topic's articles. Iterates over the Topic objects returned by GoodNews::Topic.all (backed by Topic's @@all class variable array). For each article link found on a topic's page, it instantiates a new Article object, saves the article's web address and title to it, and pushes the Article object into the Topic object's articles attribute (an array).
    # File 'lib/good_news/scraper.rb', line 28
    def self.get_articles
      GoodNews::Topic.all.each do |topic|
        doc = self.get_page(topic.web_addr)
        doc.css("h3.entry-title a").each do |info|
          new_article = GoodNews::Article.new
          new_article.web_addr = info.attribute("href").value
          new_article.title = info.text
          topic.articles.push(new_article)
        end
      end
    end
.get_page(url) ⇒ Object
Uses open-uri and Nokogiri to fetch and parse the HTML. Returns the parsed page as a Nokogiri document (not an array), which can then be searched using CSS selectors.
    # File 'lib/good_news/scraper.rb', line 7
    def self.get_page(url)
      return Nokogiri::HTML(open(url))
    end
.get_topics ⇒ Object
Scrapes the topics and stores them. Calls the class method .get_page with HOMEPAGEURL and saves the result to doc. For each topic link on the page, it instantiates a Topic object and stores the topic name and web address in it, then saves the Topic object in the Topic class variable @@all using the #save method.
    # File 'lib/good_news/scraper.rb', line 14
    def self.get_topics
      doc = self.get_page(HOMEPAGEURL)
      doc.css("ul.td-category a").each do |topic|
        new_topic = GoodNews::Topic.new
        new_topic.name = topic.text
        new_topic.web_addr = topic.attribute("href").value
        new_topic.save
      end
    end
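The scraper relies on Topic exposing name, web_addr, articles, #save, and .all, but the Topic class itself is not documented on this page. The sketch below shows one way such a class could be written, assuming @@all is a plain array; GoodNewsSketch is a hypothetical namespace and the real GoodNews::Topic may differ.

```ruby
# Hypothetical namespace; an assumption about how GoodNews::Topic might look.
module GoodNewsSketch
  class Topic
    attr_accessor :name, :web_addr, :articles

    # Class-level registry of every saved topic.
    @@all = []

    def initialize
      # Each topic starts with an empty list that .get_articles can push into.
      @articles = []
    end

    # Add this topic to the registry and return it.
    def save
      @@all << self
      self
    end

    # Return the registry of saved topics.
    def self.all
      @@all
    end
  end
end

topic = GoodNewsSketch::Topic.new
topic.name = "World"
topic.web_addr = "https://example.com/world"
topic.save

puts GoodNewsSketch::Topic.all.map(&:name).inspect
```

Keeping instantiation and #save as separate steps, as .get_topics does, means a Topic only enters the registry once its name and web address have been set.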