Class: Youtube::SearchResultScraper
- Inherits:
-
Object
- Object
  - Youtube::SearchResultScraper
- Defined in:
- lib/youtube/searchresultscraper.rb
Overview
Introduction
Youtube::SearchResultScraper scrapes video information from a search result page on www.youtube.com.
You can get the results as an array or as XML.
The XML format is the same as that of the YouTube Developer API (www.youtube.com/dev_api_ref?m=youtube.videos.list_by_tag).
Example
require "rubygems"
require "youtube/searchresultscraper"
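keyword = "ruby"  # illustrative value; any UTF-8 encoded keyword
page = 1          # illustrative value; page number of the results
scraper = Youtube::SearchResultScraper.new(keyword, page)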
scraper.open
scraper.scrape
puts scraper.get_xml
More Information
www.ark-web.jp/sandbox/wiki/184.html (Japanese only)
- Author
-
Yuki SHIDA
- Author
-
Konuma Akio
- Version
-
0.0.3
- License
-
MIT license
Constant Summary
- Relevance =
'relevance'
- DateAdded =
'video_date_uploaded'
- ViewCount =
'video_view_count'
- Rating =
'video_avg_rating'
- @@youtube_search_base_url =
"http://www.youtube.com/results?search_query="
Instance Attribute Summary
-
#keyword ⇒ Object
Returns the value of attribute keyword.
-
#page ⇒ Object
Returns the value of attribute page.
-
#sort ⇒ Object
Returns the value of attribute sort.
-
#video_count ⇒ Object
readonly
Returns the value of attribute video_count.
-
#video_from ⇒ Object
readonly
Returns the value of attribute video_from.
-
#video_to ⇒ Object
readonly
Returns the value of attribute video_to.
Instance Method Summary
-
#each ⇒ Object
Iterator for scraped videos.
-
#get_xml ⇒ Object
Returns the video information as XML.
-
#initialize(keyword, page = nil, sort = nil) ⇒ SearchResultScraper
constructor
Creates a Youtube::SearchResultScraper object for the given keyword and page number.
-
#open ⇒ Object
Fetches the search result from YouTube for the specified keyword.
-
#scrape ⇒ Object
Scrapes video information from the search result HTML.
Constructor Details
#initialize(keyword, page = nil, sort = nil) ⇒ SearchResultScraper
Creates a Youtube::SearchResultScraper object for the given keyword and page number.
The number of videos per page cannot be changed; it is always 20.
-
keyword - the keyword to search for on YouTube. It must be UTF-8 encoded.
-
page - the page number of the search results to fetch
-
sort - the sort rule, for example one of the constants Relevance, DateAdded, ViewCount, or Rating
# File 'lib/youtube/searchresultscraper.rb', line 86
def initialize keyword, page=nil, sort=nil
  @keyword = keyword
  @page = page if not page == nil
  @sort = sort if not sort == nil
end
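Because every result page holds 20 videos, the range of results that a given page should cover follows from simple arithmetic. The following is a minimal sketch, assuming pages are numbered from 1 and that a nil page means the first page. `VIDEOS_PER_PAGE` and `expected_range` are hypothetical names and are not part of the library.

```ruby
# Hypothetical helper; not part of Youtube::SearchResultScraper.
VIDEOS_PER_PAGE = 20 # fixed page size stated in the constructor notes

# Returns [first, last] result positions expected on the given page.
# Assumption: page numbers are 1-based and nil is treated as page 1.
def expected_range(page = nil)
  page ||= 1
  raise ArgumentError, "page must be >= 1" if page < 1
  first = (page - 1) * VIDEOS_PER_PAGE + 1
  [first, first + VIDEOS_PER_PAGE - 1]
end

range = expected_range(3)
```

After a scrape, such a range could be compared with video_from and video_to, keeping in mind that the last page may hold fewer than 20 videos.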
Instance Attribute Details
#keyword ⇒ Object
Returns the value of attribute keyword.
# File 'lib/youtube/searchresultscraper.rb', line 62
def keyword
  @keyword
end
#page ⇒ Object
Returns the value of attribute page.
# File 'lib/youtube/searchresultscraper.rb', line 63
def page
  @page
end
#sort ⇒ Object
Returns the value of attribute sort.
# File 'lib/youtube/searchresultscraper.rb', line 64
def sort
  @sort
end
#video_count ⇒ Object (readonly)
Returns the value of attribute video_count.
# File 'lib/youtube/searchresultscraper.rb', line 65
def video_count
  @video_count
end
#video_from ⇒ Object (readonly)
Returns the value of attribute video_from.
# File 'lib/youtube/searchresultscraper.rb', line 66
def video_from
  @video_from
end
#video_to ⇒ Object (readonly)
Returns the value of attribute video_to.
# File 'lib/youtube/searchresultscraper.rb', line 67
def video_to
  @video_to
end
Instance Method Details
#each ⇒ Object
Iterator for scraped videos.
# File 'lib/youtube/searchresultscraper.rb', line 136
def each
  @videos.each do |video|
    yield video
  end
end
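An `each` method of this shape is all that Ruby's `Enumerable` module needs. The sketch below uses a hypothetical stand-in class, `VideoList`, to show what mixing in `Enumerable` would provide. This page does not state that Youtube::SearchResultScraper itself includes `Enumerable`, so treat that part as an assumption.

```ruby
# Hypothetical stand-in that mirrors the shape of the documented #each.
class VideoList
  # Assumption: the real class is not documented as including Enumerable.
  include Enumerable

  def initialize(videos)
    @videos = videos
  end

  # Same pattern as the documented #each: yield every stored video.
  def each
    @videos.each do |video|
      yield video
    end
  end
end

list = VideoList.new(["video-a", "video-b"])
titles = list.map { |v| v.upcase }
```

With only `each` defined, methods such as `map`, `select`, and `first` become available through the mixin.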
#get_xml ⇒ Object
Returns the video information as XML.
# File 'lib/youtube/searchresultscraper.rb', line 143
def get_xml
  xml = "<ut_response status=\"ok\">" +
        "<video_count>" + @video_count.to_s + "</video_count>" +
        "<video_list>\n"
  each do |video|
    xml += video.to_xml
  end
  xml += "</video_list></ut_response>"
end
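The structure of the generated document is easier to see with a concrete value. The following self-contained sketch mirrors the header, per-video fragment, and footer layout of get_xml. `StubVideo` and `build_xml` are hypothetical names, and the two-field `to_xml` is an assumption; the real Youtube::Video presumably emits more elements.

```ruby
# Hypothetical stand-in for Youtube::Video; the real class has more fields.
StubVideo = Struct.new(:id, :title) do
  def to_xml
    "<video><id>#{id}</id><title>#{title}</title></video>\n"
  end
end

# Mirrors the layout of #get_xml: header, one fragment per video, footer.
def build_xml(video_count, videos)
  xml = "<ut_response status=\"ok\">" \
        "<video_count>#{video_count}</video_count>" \
        "<video_list>\n"
  videos.each { |video| xml += video.to_xml }
  xml + "</video_list></ut_response>"
end

xml = build_xml(1, [StubVideo.new("abc123", "Sample")])
```

This stub does not escape special characters such as `&` or `<`; whether the real `to_xml` does is not shown on this page.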
#open ⇒ Object
Fetches the search result from YouTube for the specified keyword.
# File 'lib/youtube/searchresultscraper.rb', line 93
def open
  @url = @@youtube_search_base_url + CGI.escape(@keyword)
  @url += "&page=#{@page}" if not @page == nil
  @url += "&search_sort=#{@sort}" if not @sort == nil
  @html = Kernel.open(@url).read
  replace_document_write_javascript
  @search_result = Hpricot.parse(@html)
end
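The URL construction in this method can be tried on its own without any network access. Below is a minimal sketch that reproduces only the string-building part. `BASE_URL` and `build_search_url` are hypothetical names introduced for illustration, and the fetch and parse steps are left out.

```ruby
require "cgi"

# Same base URL as the @@youtube_search_base_url constant listed above.
BASE_URL = "http://www.youtube.com/results?search_query="

# Hypothetical helper: builds the search URL the way #open does,
# appending page and sort parameters only when they are given.
def build_search_url(keyword, page = nil, sort = nil)
  url = BASE_URL + CGI.escape(keyword)
  url += "&page=#{page}" unless page.nil?
  url += "&search_sort=#{sort}" unless sort.nil?
  url
end

# "video_view_count" is the value of the ViewCount constant.
url = build_search_url("ruby on rails", 2, "video_view_count")
```

`CGI.escape` encodes spaces as `+`, so the keyword above becomes `ruby+on+rails` in the query string.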
#scrape ⇒ Object
Scrapes video information from the search result HTML.
# File 'lib/youtube/searchresultscraper.rb', line 103
def scrape
  @videos = []
  @search_result.search("//div[@class='vEntry']").each do |video_html|
    video = Youtube::Video.new
    video.id = scrape_id(video_html)
    video. = (video_html)
    video.title = scrape_title(video_html)
    video.length_seconds = scrape_length_seconds(video_html)
    video. = (video_html)
    video. = (video_html)
    video.description = scrape_description(video_html)
    video.view_count = scrape_view_count(video_html)
    video.thumbnail_url = scrape_thumbnail_url(video_html)
    video. = (video_html)
    video.upload_time = scrape_upload_time(video_html)
    video.url = scrape_url(video_html)
    check_video video
    @videos << video
  end
  @video_count = scrape_video_count
  @video_from = scrape_video_from
  @video_to = scrape_video_to
  raise "scraping error" if (is_no_result != @videos.empty?)
  @videos
end
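The final `raise` in this method is a consistency check: the page's own "no results" signal and the emptiness of the scraped list must agree. A minimal sketch of that check in isolation follows. `verify_scrape!` is a hypothetical name, and the `no_result` flag stands in for whatever `is_no_result` returns, whose source is not shown on this page.

```ruby
# Hypothetical helper mirroring the final check in #scrape.
# no_result: true when the page itself reports that nothing matched
# (assumed to be what is_no_result returns).
def verify_scrape!(no_result, videos)
  # The two signals must agree: "no results" if and only if the list is empty.
  raise "scraping error" if no_result != videos.empty?
  videos
end

ok_empty = verify_scrape!(true, [])
ok_found = verify_scrape!(false, [:video])
```

A mismatch in either direction (results reported but none scraped, or videos scraped from a page that reports none) most likely means the page markup no longer matches what the scraper expects.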