Class: Hermaeus::Client

Inherits:
Object
Defined in:
lib/hermaeus/client.rb

Overview

Public: Wraps a reddit client for access to reddit’s API, and provides methods for downloading posts from reddit.

Constant Summary

USER_AGENT =
"Redd/Ruby:Hermaeus:#{Hermaeus::VERSION} (by /u/myrrlyn)"

Instance Method Summary

Constructor Details

#initialize ⇒ Client

Public: Connects the Hermaeus::Client to reddit.



# File 'lib/hermaeus/client.rb', line 18

def initialize
	Config.validate!
	cfg = Config.info[:client]
	@client = Redd.it(cfg.delete(:type).to_sym, *cfg.values, user_agent: USER_AGENT)
	@client.authorize!
	@html_filter = HTMLEntities.new
end

Instance Method Details

#get_fullnames(data, **opts) ⇒ Object

Public: Transforms a list of raw reddit links (“/r/SUB/comments/ID/NAME”) into their reddit fullnames (“t3_ID”).

data - A String Array such as that returned by get_global_listing.

Optional parameters:

regex: A Regular Expression used to match the reddit ID out of a link.

Returns a String Array containing the reddit fullnames harvested from the input list. Input elements that do not match are stripped.



# File 'lib/hermaeus/client.rb', line 65

def get_fullnames data, **opts
	# TODO: Move this regex to the configuration file.
	regex = opts[:regex] || %r(/r/.+/(comments/)?(?<id>[0-9a-z]+)/.+)
	data.map do |item|
		m = item.match regex
		"t3_#{m[:id]}" if m
	end
	.compact # Drop the nils left by links that did not match
end
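
As an illustration of the transformation described above, here is a minimal standalone sketch. It reuses the default pattern from the listing, but `DEFAULT_LINK_REGEX` and `links_to_fullnames` are hypothetical names chosen for this example and are not part of Hermaeus; the sample link is also only an assumption about what the index contains.

```ruby
# Hypothetical stand-in for the default pattern used by get_fullnames.
DEFAULT_LINK_REGEX = %r{/r/.+/(comments/)?(?<id>[0-9a-z]+)/.+}

# Hypothetical helper mirroring get_fullnames outside of the Client class.
def links_to_fullnames(links, regex: DEFAULT_LINK_REGEX)
  links.map do |link|
    m = link.match(regex)
    "t3_#{m[:id]}" if m # nil when the link does not match
  end.compact
end

links = [
  "/r/teslore/comments/56j7pq/weekly_community_thread",
  "not a reddit link",
]
p links_to_fullnames(links) # => ["t3_56j7pq"]
```

Because the leading `.+` is greedy, the pattern picks the last path segment made up only of `[0-9a-z]` characters that is still followed by a slash and more text, so a custom `regex:` may be worth supplying for links with unusual trailing segments.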

#get_global_listing(**opts) ⇒ Object

Public: Scrapes the Compilation full index.

Wraps Client#scrape_index; see it for documentation.



# File 'lib/hermaeus/client.rb', line 29

def get_global_listing **opts
	scrape_index Config.info[:index][:path], opts
end

#get_posts(fullnames, &block) ⇒ Object

Public: Collects posts from reddit.

fullnames - A String Array of reddit fullnames (“tNUM_ID”, following reddit documentation) to query.

Yields a sequence of Hashes, each describing a reddit post.

Returns an Array of the response bodies from the reddit call(s).

Examples

  get_posts(get_fullnames(get_global_listing)) do |post|
    puts post[:selftext] # Prints the Markdown source of each post
  end
  # => returns an array of hashes, each of which includes an array of posts.



# File 'lib/hermaeus/client.rb', line 90

def get_posts fullnames, &block
	ret = []
	# reddit has finite limits on acceptable query sizes. Split the list into
	# manageable portions
	fullnames.each_slice(100) do |chunk|
		# Assemble the list of reddit objects being queried
		query = "/by_id/#{chunk.join(",")}.json"
		response = scrape_posts query, &block
		ret << response.body
	end
	ret
end
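
The batching step can be tried in isolation. The sketch below only builds the query paths and performs no network access; `by_id_queries` and its `batch_size:` parameter are hypothetical names introduced for this example, and the generated fullnames are placeholder data.

```ruby
# Hypothetical helper: builds one /by_id query path per batch of fullnames,
# following the same slicing and joining as get_posts.
def by_id_queries(fullnames, batch_size: 100)
  fullnames.each_slice(batch_size).map do |chunk|
    "/by_id/#{chunk.join(",")}.json"
  end
end

# 250 placeholder fullnames split into batches of 100, 100 and 50.
fullnames = (1..250).map { |n| "t3_#{n.to_s(36)}" }
queries = by_id_queries(fullnames)
puts queries.length           # 3
puts queries.first.count(",") # 99, i.e. 100 fullnames in the first batch
```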

#get_weekly_listing(ids, **opts) ⇒ Object

Public: Scrapes a Weekly Community Thread patch index.

ids - A String Array of reddit post IDs for Weekly Community Threads.

Examples:

  get_weekly_listing(["56j7pq"]) # Targets one Community Thread
  get_weekly_listing(["56j7pq", "55erkr"]) # Targets two Community Threads
  get_weekly_listing(["55erkr"], css: "td:last-child a") # Custom CSS selector

Wraps Client#scrape_index; see it for documentation.



# File 'lib/hermaeus/client.rb', line 44

def get_weekly_listing ids, **opts
	# Prefix bare IDs with "t3_"; keep IDs that already carry the prefix.
	ids.map! do |id|
		id.start_with?("t3_") ? id : "t3_#{id}"
	end
	# TODO: Ensure that this is safe (only query <= 100 IDs at a time), and
	# call the scraper multiple times and reassemble output if necessary.
	query = "/by_id/#{ids.join(",")}"
	scrape_index query, opts
end
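
A minimal sketch of the ID normalization that precedes the query, written so that bare IDs gain the "t3_" prefix and already-prefixed fullnames pass through unchanged. `normalize_ids` is a hypothetical helper for illustration only, not a Hermaeus method, and unlike `map!` it does not modify the caller's array.

```ruby
# Hypothetical helper: returns a new array of "t3_"-prefixed fullnames.
def normalize_ids(ids)
  ids.map { |id| id.start_with?("t3_") ? id : "t3_#{id}" }
end

ids = ["56j7pq", "t3_55erkr"]
fullnames = normalize_ids(ids)
p fullnames                           # => ["t3_56j7pq", "t3_55erkr"]
puts "/by_id/#{fullnames.join(",")}"  # /by_id/t3_56j7pq,t3_55erkr
```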