Class: Caboodle::FeedDetector
- Inherits:
- Object
- Object
- Caboodle::FeedDetector
- Defined in:
- lib/caboodle/scrape.rb
Class Method Summary
- .fetch_feed_url(page_url, only_detect = nil) ⇒ Object
Returns the feed URL for a page URL, for example: blog.dominiek.com/ => blog.dominiek.com/feed/atom.xml. only_detect can force detection of :rss or :atom.
- .get_feed_path(html, only_detect = nil) ⇒ Object
Gets the feed href from an HTML document, for example: …
Class Method Details
.fetch_feed_url(page_url, only_detect = nil) ⇒ Object
Returns the feed URL for a page URL, for example: blog.dominiek.com/ => blog.dominiek.com/feed/atom.xml. only_detect can force detection of :rss or :atom.
# File 'lib/caboodle/scrape.rb', line 64

def self.fetch_feed_url(page_url, only_detect=nil)
  url = URI.parse(page_url)
  host_with_port = url.host
  host_with_port << ":#{url.port}" unless url.port == 80
  res = Weary.get(page_url).perform_sleepily
  feed_url = self.get_feed_path(res.body, only_detect)
  "http://#{host_with_port}/#{feed_url.gsub(/^\//, '')}" unless !feed_url || feed_url =~ /^http:\/\//
end
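The URL-building step of the listing above can be tried without network access. The following is a minimal sketch, not Caboodle code: `FeedUrlSketch.build` is a hypothetical helper that takes the feed path as an argument instead of scraping it from a fetched page. It assumes the page URL carries an explicit `http://` scheme so that `URI.parse` yields a host, and the `feeds.example.com` address is invented for illustration.

```ruby
require 'uri'

# Hypothetical helper (not part of Caboodle): reproduces only the
# URL-building step of fetch_feed_url, with the feed path passed in
# directly instead of being scraped from a fetched page.
module FeedUrlSketch
  def self.build(page_url, feed_path)
    url = URI.parse(page_url)
    # Copy the host so the parsed URI object is not mutated.
    host_with_port = url.host.dup
    host_with_port << ":#{url.port}" unless url.port == 80
    # Mirrors the condition in the listing: nil when no feed was found
    # or when the href is already an absolute http:// URL.
    return nil if !feed_path || feed_path =~ /^http:\/\//
    "http://#{host_with_port}/#{feed_path.gsub(/^\//, '')}"
  end
end

puts FeedUrlSketch.build('http://blog.dominiek.com/', '/feed/atom.xml')
# prints http://blog.dominiek.com/feed/atom.xml
puts FeedUrlSketch.build('http://localhost:3000/', 'feed/rss.xml')
# prints http://localhost:3000/feed/rss.xml
p FeedUrlSketch.build('http://blog.dominiek.com/', 'http://feeds.example.com/atom.xml')
# prints nil
```

The last call shows a consequence of the trailing `unless` condition: an href that is already an absolute `http://` URL produces nil rather than being returned unchanged, so callers may want to handle that case themselves.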
.get_feed_path(html, only_detect = nil) ⇒ Object
Gets the feed href from an HTML document, for example:
… <link href="/feed/atom.xml" rel="alternate" type="application/atom+xml" /> …
=> /feed/atom.xml
only_detect can force detection of :rss or :atom.
# File 'lib/caboodle/scrape.rb', line 83

def self.get_feed_path(html, only_detect=nil)
  unless only_detect && only_detect != :atom
    md ||= /<link.*href=['"]*([^\s'"]+)['"]*.*application\/atom\+xml.*>/.match(html)
    md ||= /<link.*application\/atom\+xml.*href=['"]*([^\s'"]+)['"]*.*>/.match(html)
  end
  unless only_detect && only_detect != :rss
    md ||= /<link.*href=['"]*([^\s'"]+)['"]*.*application\/rss\+xml.*>/.match(html)
    md ||= /<link.*application\/rss\+xml.*href=['"]*([^\s'"]+)['"]*.*>/.match(html)
  end
  md && md[1]
end
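A short sketch of how only_detect changes the result. So that it runs without the caboodle gem, the method body from the listing above is copied into a hypothetical `FeedPathSketch` module; the module name and the sample HTML are assumptions made for illustration only.

```ruby
# Hypothetical stand-in so the example runs without the caboodle gem;
# the method body is copied from the listing above.
module FeedPathSketch
  # Invented sample page that advertises both feed types.
  SAMPLE_HTML = <<~HTML
    <head>
      <link href="/feed/atom.xml" rel="alternate" type="application/atom+xml" />
      <link href="/feed/rss.xml" rel="alternate" type="application/rss+xml" />
    </head>
  HTML

  def self.get_feed_path(html, only_detect = nil)
    unless only_detect && only_detect != :atom
      md ||= /<link.*href=['"]*([^\s'"]+)['"]*.*application\/atom\+xml.*>/.match(html)
      md ||= /<link.*application\/atom\+xml.*href=['"]*([^\s'"]+)['"]*.*>/.match(html)
    end
    unless only_detect && only_detect != :rss
      md ||= /<link.*href=['"]*([^\s'"]+)['"]*.*application\/rss\+xml.*>/.match(html)
      md ||= /<link.*application\/rss\+xml.*href=['"]*([^\s'"]+)['"]*.*>/.match(html)
    end
    md && md[1]
  end
end

# With no restriction, the Atom patterns are tried first.
p FeedPathSketch.get_feed_path(FeedPathSketch::SAMPLE_HTML)        # prints "/feed/atom.xml"
# :rss skips the Atom patterns entirely.
p FeedPathSketch.get_feed_path(FeedPathSketch::SAMPLE_HTML, :rss)  # prints "/feed/rss.xml"
# No matching <link> tag yields nil.
p FeedPathSketch.get_feed_path('<p>no feeds here</p>', :atom)      # prints nil
```

Because `.` does not match a newline in a Ruby regular expression by default, each pattern only matches when the href and the type attribute sit on the same line as the opening `<link`.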