Class: Paperboy::Collector

Inherits: Object

Defined in: lib/paperboy.rb

Overview

`Paperboy::Collector` queries the Chartbeat API's snapshots method and consolidates visitors over the specified timespan. It then writes out bare-bones HTML to be gussied up.

Instance Attribute Summary

#outfile ⇒ Object

Determine if there is an outfile for this instance.

Instance Method Summary

#html ⇒ Object

Get the contents of the HTML file.

#initialize(opts = {}) ⇒ Collector constructor

Initialize a `Paperboy::Collector` instance.

#run(opts = {}) ⇒ Object

Runs the collector according to the parameters set up in `new`.

Constructor Details

#initialize(opts = {}) ⇒ `Collector`

Initialize a `Paperboy::Collector` instance. This script is relatively expensive and is ideally run from cron, perhaps once a day. Unlike stats_combiner, Paperboy uses historical data, so it really doesn't matter when you schedule it, as long as you're grabbing relative timestamps.

The API key and host come from your Chartbeat settings. Start and end times are UNIX timestamps, and Paperboy collects hourly between them. If you omit them, the defaults in the constructor source cover a four-hour window ending one hour before the current time. `:filters` takes the rules built by a `StatsCombiner::Filterer`. To use it, first instantiate a Filterer object with:

e = StatsCombiner::Filterer.new

then add filter rules such as

e.add({
  :prefix => 'tpmdc',
  :title_regex => /\| TPMDC/,
  :modify_title => true
})

Finally, pass `e.filters` to this method.

`img_xpath` and `blurb_xpath` are XPath queries that run on the URL extracted from Chartbeat (after any filters have been applied to it) to populate your email with data that might reside in META tags. Here are some I've found useful. `*_xpath` takes the `content` attribute of whatever HEAD tag is queried.

:img_xpath => '//head/meta[@property="og:image"]',
:blurb_xpath => '//head/meta[@name="description"]'
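To illustrate what these queries select, here is a minimal sketch using REXML from Ruby's standard distribution. The parser Paperboy itself uses is not shown on this page, so REXML is an assumption, as are the sample markup and the helper name `meta_content`. REXML also needs well-formed markup, which real pages often are not.

```ruby
require 'rexml/document'

# Hypothetical page head; a real run would fetch this from the story URL.
PAGE = <<~XML
  <html>
    <head>
      <meta property="og:image" content="http://yourdomain.com/lead.jpg"/>
      <meta name="description" content="A short blurb about the story."/>
    </head>
    <body/>
  </html>
XML

# Return the content attribute of the first node matching xpath, or nil
# when nothing matches.
def meta_content(markup, xpath)
  doc = REXML::Document.new(markup)
  node = REXML::XPath.first(doc, xpath)
  node && node.attributes['content']
end

puts meta_content(PAGE, '//head/meta[@property="og:image"]')
puts meta_content(PAGE, '//head/meta[@name="description"]')
```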

Another option is `:interval`, which sets the spacing, in seconds, of the snapshots taken between `start_time` and `end_time`. The default is 3600 seconds, or one hour.
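As a sketch of how the timestamps fit together, the helpers below compute a midnight-to-midnight window for yesterday in local time and list the snapshot times a given interval would produce. Both helper names are hypothetical. The stepping rule, starting at `start_time` and advancing by `interval` until `end_time`, is an assumption, since the collector's own loop is not shown here.

```ruby
require 'date'

# Local-time window covering the day before `now`, as UNIX timestamps:
# [midnight yesterday, one second before midnight today].
def yesterday_window(now = Time.now)
  yesterday = now.to_date - 1
  start_time = yesterday.to_time.to_i
  end_time = (yesterday + 1).to_time.to_i - 1
  [start_time, end_time]
end

# Timestamps from start_time up to end_time, spaced interval seconds apart.
def snapshot_times(start_time, end_time, interval = 3600)
  start_time.step(end_time, interval).to_a
end

start_time, end_time = yesterday_window
puts snapshot_times(start_time, end_time).length
```

With the timestamps from the usage example (1277784000 to 1277870399) and the default interval, `snapshot_times` yields 24 hourly timestamps.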

Usage example:

p = Paperboy::Collector.new({
  :apikey => 'chartbeat_api_key',
  :host => 'yourdomain.com',
  :start_time => 1277784000,
  :end_time => 1277870399,
  :interval => 3600,
  :filters => e.filters,
  :img_xpath => '//head/meta[@property="og:image"]',
  :blurb_xpath => '//head/meta[@name="description"]'
})

The static file generated by Paperboy will be called "yourdomain.com_paperboy_output.html". Change this with `p.outfile`.

# File 'lib/paperboy.rb', line 75

def initialize(opts = {})
  @opts = {
    :apikey => nil,
    :host => nil,
    :start_time => Time.now.to_i - 18000, #four hour default window
    :end_time => Time.now.to_i - 3600,
    :interval => 3600,
    :filters => nil,
    :img_xpath => nil,
    :blurb_xpath => nil
  }.merge!(opts)
  
  if @opts[:apikey].nil? || @opts[:host].nil?
    raise Paperboy::Error, "No Chartbeat API Key or Host Specified!"
  end
  
  @c = Chartbeat.new :apikey => @opts[:apikey], :host => @opts[:host]
  @outfile = "#{@opts[:host]}_paperboy_output.html"
  
  @stories = []
  @uniq_stories = []
end
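The constructor's option handling can be tried in isolation. The sketch below mirrors the defaults and the nil check from the listing above, without contacting Chartbeat. `build_opts` and `OptionError` are hypothetical stand-ins (the latter for `Paperboy::Error`), and `now` is made a parameter only so the window is easy to inspect.

```ruby
# Hypothetical stand-in for Paperboy::Error.
class OptionError < StandardError; end

# Merge caller options over the defaults, then insist on :apikey and :host.
def build_opts(opts = {}, now = Time.now.to_i)
  merged = {
    :apikey => nil,
    :host => nil,
    :start_time => now - 18000, # five hours before now
    :end_time => now - 3600,    # one hour before now
    :interval => 3600
  }.merge(opts)

  if merged[:apikey].nil? || merged[:host].nil?
    raise OptionError, 'No Chartbeat API Key or Host Specified!'
  end
  merged
end

opts = build_opts({ :apikey => 'chartbeat_api_key', :host => 'yourdomain.com' })
puts opts[:end_time] - opts[:start_time] # length of the default window in seconds

begin
  # A misspelled key such as :api_key leaves :apikey at nil, so this raises.
  build_opts({ :api_key => 'chartbeat_api_key', :host => 'yourdomain.com' })
rescue OptionError => e
  puts e.message
end
```

Because unknown keys are merged in silently, a misspelled option name surfaces only through the nil check, and only for the two required keys.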

Instance Attribute Details

#outfile ⇒ `Object`

Determine if there is an outfile for this instance. If so, get the filename.



# File 'lib/paperboy.rb', line 127

def outfile
  @outfile
end

Instance Method Details

#html ⇒ `Object`

Get the contents of the HTML file, i.e. the final product of the Paperboy run.



# File 'lib/paperboy.rb', line 137

def html
  File.open(@outfile).read
end

#run(opts = {}) ⇒ `Object`

Runs the collector according to the parameters set up in `new`. By default, it generates an HTML file in the current directory with a standard bare-bones structure. There is also an option to pass the data through an ERB template, like so:

p.run :via => 'erb', :template => '/path/to/tmpl.erb'

ERB templates will expect to iterate over a `@stories` array, where each item is a hash of story attributes. See Paperboy::View#erb below for more on templating.

# File 'lib/paperboy.rb', line 107

def run(opts = {})
  @run_opts = {
    :via => 'html',
    :template => nil
  }.merge!(opts)
  
  result = self.collect_stories      
  v = Paperboy::View.new(result,@outfile)
    
  if @run_opts[:via] == 'erb'
    if @run_opts[:template].nil?
      raise Paperboy::Error, "A template file must be specified with the erb option."
    end
    v.erb(@run_opts[:template])
  else
    v.html
  end
end
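As a rough illustration of the ERB option, the sketch below renders a template that iterates over `@stories` using Ruby's standard `erb` library. The story keys `:title` and `:url`, the `StoryPage` class, and the way the template is bound are all assumptions. The attributes Paperboy actually supplies, and how `Paperboy::View#erb` evaluates the template, are not shown on this page.

```ruby
require 'erb'

# Hypothetical template; the :title and :url keys are assumed for illustration.
TEMPLATE_SRC = <<~HTML
  <ul>
  <% @stories.each do |story| %>
    <li><a href="<%= ERB::Util.html_escape(story[:url]) %>"><%= ERB::Util.html_escape(story[:title]) %></a></li>
  <% end %>
  </ul>
HTML

# Minimal object that exposes @stories to the template through its binding.
class StoryPage
  def initialize(stories)
    @stories = stories
  end

  def render(template_source)
    ERB.new(template_source).result(binding)
  end
end

stories = [
  { :title => 'First story', :url => 'http://yourdomain.com/first' },
  { :title => 'Second & last', :url => 'http://yourdomain.com/second' }
]
puts StoryPage.new(stories).render(TEMPLATE_SRC)
```

Escaping with `ERB::Util.html_escape` keeps titles that contain characters such as `&` or `<` from breaking the generated markup.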