Module: MMonitor::Spider
- Defined in:
- lib/mmonitor/spider.rb
Overview
Spider: handles HTTP requests.
Class Method Summary
- .get_html(url, params = {}) ⇒ Object
  Fetches HTML.
- .get_json(url, params = {}) ⇒ Object
  Fetches JSON.
- .get_ocr(photo_url) ⇒ Object
  Extracts the text in an image (OCR).
- .number_page(total, limit) ⇒ Object
  Computes the page count.
Class Method Details
.get_html(url, params = {}) ⇒ Object
Fetches the page at url and returns it parsed by Nokogiri. The raw response body is also stored in the global $body.

    # File 'lib/mmonitor/spider.rb', line 12
    def get_html(url, params={})
      body = get(url, params)
      $body = body
      ::Nokogiri::HTML(body)
    end
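`get_html` delegates the request to a `get(url, params)` helper that is not shown on this page. As a rough sketch only, and assuming `params` is meant to be sent as a query string, such a helper could build its request URI with the standard library as below. `build_uri` is a hypothetical name, and the real helper may work differently.

```ruby
require 'uri'

# Hypothetical helper (not part of MMonitor::Spider): merges params into
# the URL's query string, keeping any query parameters already present.
def build_uri(url, params = {})
  uri = URI.parse(url)
  existing = URI.decode_www_form(uri.query.to_s)
  extra = params.map { |key, value| [key.to_s, value.to_s] }
  pairs = existing + extra
  uri.query = pairs.empty? ? nil : URI.encode_www_form(pairs)
  uri
end

puts build_uri('http://example.com/list', page: 2, q: 'ruby gems')
```

Building the URI separately from sending the request keeps the encoding step easy to check without touching the network.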
.get_json(url, params = {}) ⇒ Object
Fetches the response at url and parses it as JSON with Oj. Returns an empty hash if parsing raises an error.

    # File 'lib/mmonitor/spider.rb', line 18
    def get_json(url, params={})
      body = get(url, params)
      ::Oj.load(body) rescue {}
    end
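The `rescue {}` modifier in `get_json` covers only the parsing expression: any `StandardError` raised while parsing is swallowed and an empty hash is returned, while errors raised by `get` itself still propagate. A minimal sketch of that fallback follows. It uses the standard library's `JSON.parse` as a stand-in for `Oj.load` so that it runs without extra gems, and `parse_json_or_empty` is a hypothetical name, not part of MMonitor::Spider.

```ruby
require 'json'

# Hypothetical helper mirroring the fallback in get_json.
# Assumption: JSON.parse stands in for Oj.load.
def parse_json_or_empty(body)
  JSON.parse(body)
rescue StandardError
  {} # malformed or missing body: fall back to an empty hash
end

p parse_json_or_empty('{"id": 1, "tags": ["a", "b"]}')
p parse_json_or_empty('<html>not json</html>')
p parse_json_or_empty(nil)
```

One consequence of this design is that a caller cannot tell an empty JSON object from a failed parse, since both come back as `{}`.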
.get_ocr(photo_url) ⇒ Object
Extracts the text in the image at photo_url using OCR.

    # File 'lib/mmonitor/spider.rb', line 23
    def get_ocr(photo_url)
      image = MiniMagick::Image.open(photo_url)
      # Flatten transparency onto white and convert to grayscale before OCR
      image.combine_options do |c|
        c.background '#FFFFFF'
        c.colorspace 'GRAY'
        c.alpha 'remove'
      end
      image.format 'jpg'
      ocr = RTesseract.new(image.path, processor: 'mini_magick')
      str = ocr.to_s
      image.destroy!
      return str
    end
.number_page(total, limit) ⇒ Object
Computes the number of pages needed to show total items at limit items per page.

    # File 'lib/mmonitor/spider.rb', line 37
    def number_page(total, limit)
      count = total / limit
      count += 1 if total % limit > 0
      count
    end
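As an illustration of the paging arithmetic above, the following minimal sketch restates `number_page` as a standalone method and compares it with the common `(total + limit - 1) / limit` integer idiom. The module wrapper is omitted, `number_page_ceil` is a hypothetical name, and the sample totals are made up. Both versions assume a non-negative integer `total` and a positive integer `limit`.

```ruby
# Standalone copy of the paging logic, for illustration only.
def number_page(total, limit)
  count = total / limit            # full pages (integer division)
  count += 1 if total % limit > 0  # one extra page for any remainder
  count
end

# Hypothetical alternative: ceiling division for non-negative integers.
def number_page_ceil(total, limit)
  (total + limit - 1) / limit
end

[[0, 10], [9, 10], [10, 10], [11, 10], [95, 20]].each do |total, limit|
  puts "#{total} items, #{limit} per page: #{number_page(total, limit)} page(s)"
end
```

Note that `limit` must not be zero, since integer division by zero raises `ZeroDivisionError` in either version.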