Class: Scraper

Inherits:
Object
Includes:
Contracts::Core
Defined in:
lib/scraper.rb

Overview

Reads The Awl's RSS feed and builds the list of posts (articles) found in it.

Constant Summary

AWL_RSS_URL = 'http://feeds2.feedburner.com/TheAwl'

URL to pull the initial feed

C = Contracts

Shortcut for contracts

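Instance Attribute Summary

#articles ⇒ Object (readonly)
Array of Article objects built by #retrieve_posts.

Instance Method Summary

#retrieve_posts ⇒ Object
Retrieves the posts in the feed and builds an Article for each one.

#subtract_cache ⇒ Object
Removes already-saved articles from the list of articles.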

Instance Attribute Details

#articles ⇒ Object (readonly)

Array of Article objects, populated by #retrieve_posts


# File 'lib/scraper.rb', line 12

def articles
  @articles
end

Instance Method Details

#retrieve_posts ⇒ Object

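Retrieves the posts in the feed, builds an Article from each post's short link (the item GUID), stores them in @articles, and retrieves each article's tags. The return value is the array of retrieve_tags results.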


# File 'lib/scraper.rb', line 21

def retrieve_posts
  # Get posts
  rss = RSS::Parser.parse(AWL_RSS_URL)

  # Grab shortened URLs
  links = rss.items.map(&:guid).map(&:content)

  @articles = []

  links.each do |link|
    @articles << Article.new(link)
  end

  # TODO: Only grab the tags for articles that haven't already been tweeted
  @articles.map(&:retrieve_tags)
end
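
The link-extraction step above can be tried without network access. The following is a minimal sketch, assuming stand-in objects: Guid, Item, and Feed are hypothetical Structs that mimic the shape of a parsed feed (items with a guid that has content), and FeedArticle is a hypothetical substitute for the project's Article class, whose real retrieve_tags is not shown on this page.

```ruby
# Hypothetical stand-ins that mimic the shape of a parsed feed:
# feed.items -> item.guid -> guid.content
Guid = Struct.new(:content)
Item = Struct.new(:guid)
Feed = Struct.new(:items)

# Hypothetical substitute for the project's Article class.
class FeedArticle
  attr_reader :link, :tags

  def initialize(link)
    @link = link
    @tags = []
  end

  # Assumption: the real method looks tags up elsewhere; this one fakes a tag.
  def retrieve_tags
    @tags = ["tag-for-#{link}"]
  end
end

feed = Feed.new([
  Item.new(Guid.new('http://example.com/?p=1')),
  Item.new(Guid.new('http://example.com/?p=2'))
])

# Same chain as in retrieve_posts: collect each item's GUID content.
links = feed.items.map(&:guid).map(&:content)

# Build one article per link, then collect the retrieve_tags results.
articles = links.map { |link| FeedArticle.new(link) }
result = articles.map(&:retrieve_tags)
```

Note that result holds whatever retrieve_tags returns for each article, which mirrors the last expression of retrieve_posts.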

#subtract_cache ⇒ Object

Subtracts saved articles from the list of articles


# File 'lib/scraper.rb', line 40

def subtract_cache
  tracker = Tracker.new
  tracker.read_articles
  @articles.delete_if { |x| tracker.articles.include?(x.link) }
end
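
The filtering step can be illustrated in isolation. This is a sketch under stated assumptions: StubTracker and CachedArticle are hypothetical stand-ins, since the real Tracker class and its storage are not documented on this page. The only behavior assumed is what the method above relies on: read_articles loads the saved links and articles returns them.

```ruby
# Hypothetical stand-in for an article: only the link is needed here.
CachedArticle = Struct.new(:link)

# Hypothetical stand-in for Tracker. Assumption: the real class loads saved
# links from some store; this one takes them from memory.
class StubTracker
  attr_reader :articles

  def initialize(saved_links)
    @saved_links = saved_links
    @articles = []
  end

  def read_articles
    @articles = @saved_links.dup
  end
end

articles = [
  CachedArticle.new('http://example.com/?p=1'),
  CachedArticle.new('http://example.com/?p=2'),
  CachedArticle.new('http://example.com/?p=3')
]

tracker = StubTracker.new(['http://example.com/?p=1', 'http://example.com/?p=2'])
tracker.read_articles

# delete_if mutates the array in place, dropping every article whose link
# is already known to the tracker.
articles.delete_if { |article| tracker.articles.include?(article.link) }

remaining_links = articles.map(&:link)
```

Because delete_if modifies its receiver, the caller's list shrinks directly. Array#reject would be the non-mutating alternative if the original list had to be kept.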