Class: ReadabilityJs::Nodo
- Inherits: Nodo::Core
  - Object
  - Nodo::Core
  - ReadabilityJs::Nodo
- Defined in: lib/readability_js/nodo.rb
Class Method Summary
- .is_probably_readerable(html, min_content_length: 140, min_score: 20, visibility_checker: nil) ⇒ Object
  Class-level wrapper around the instance method of the same name, as nodo does not support class methods.
- .parse(html, url: nil, debug: false, max_elems_to_parse: 0, nb_top_candidates: 5, char_threshold: 500, classes_to_preserve: [], keep_classes: false, disable_json_ld: false, serializer: nil, allow_video_regex: nil, link_density_modifier: 0) ⇒ Object
  Class-level wrapper around the instance method of the same name, as nodo does not support class methods.
- .probably_readerable(html) ⇒ Object
  Shorthand that calls .is_probably_readerable(html) with its default options.
Class Method Details
.is_probably_readerable(html, min_content_length: 140, min_score: 20, visibility_checker: nil) ⇒ Object
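Class-level wrapper around the instance method of the same name, as nodo does not support class methods. It strips style tags from the HTML before delegating and re-raises Nodo::JavaScriptError as ReadabilityJs::Error.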
# File 'lib/readability_js/nodo.rb', line 28
def self.is_probably_readerable(html, min_content_length: 140, min_score: 20, visibility_checker: nil)
  begin
    # remove style tags from html, so jsdom does not need to process css and its warnings are not shown
    html = html.gsub(/<style[^>]*>.*?<\/style>/m, '')
    self.new.is_probably_readerable html, min_content_length, min_score, visibility_checker
  rescue ::Nodo::JavaScriptError => e
    raise ReadabilityJs::Error.new "#{e.message}"
  end
end
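The style-stripping step in the wrapper above can be tried on its own. The following is a minimal sketch that relies only on Ruby's standard String#gsub; the helper name `strip_style_tags` and the sample HTML are hypothetical and not part of ReadabilityJs.

```ruby
# Hypothetical helper (not part of ReadabilityJs) that applies the same
# pattern as the wrapper: drop every <style>...</style> element.
def strip_style_tags(html)
  # The /m flag lets "." match newlines, so multi-line style blocks are removed.
  # The lazy ".*?" stops at the first closing tag, so separate blocks are
  # removed one by one instead of swallowing the markup between them.
  html.gsub(/<style[^>]*>.*?<\/style>/m, '')
end

sample = <<~HTML
  <html><head><style type="text/css">
    body { color: red; }
  </style></head><body><p>Hello</p></body></html>
HTML

puts strip_style_tags(sample)
```

Note that the pattern is case-sensitive, so an upper-case `<STYLE>` element would pass through unchanged.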
.parse(html, url: nil, debug: false, max_elems_to_parse: 0, nb_top_candidates: 5, char_threshold: 500, classes_to_preserve: [], keep_classes: false, disable_json_ld: false, serializer: nil, allow_video_regex: nil, link_density_modifier: 0) ⇒ Object
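Class-level wrapper around the instance method of the same name, as nodo does not support class methods. It strips style tags from the HTML, forwards the keyword options to the instance method as positional arguments, and re-raises Nodo::JavaScriptError as ReadabilityJs::Error.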
# File 'lib/readability_js/nodo.rb', line 15
def self.parse(html, url: nil, debug: false, max_elems_to_parse: 0, nb_top_candidates: 5, char_threshold: 500, classes_to_preserve: [], keep_classes: false, disable_json_ld: false, serializer: nil, allow_video_regex: nil, link_density_modifier: 0)
  begin
    # remove style tags from html, so jsdom does not need to process css and its warnings are not shown
    html = html.gsub(/<style[^>]*>.*?<\/style>/m, '')
    self.new.parse html, url, debug, max_elems_to_parse, nb_top_candidates, char_threshold, classes_to_preserve, keep_classes, disable_json_ld, serializer, allow_video_regex, link_density_modifier
  rescue ::Nodo::JavaScriptError => e
    raise ReadabilityJs::Error.new "#{e.message}"
  end
end
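The wrapper pattern used here (keyword arguments in, positional arguments out, backend errors re-raised under the library's own error class) can be illustrated without Node or the gem installed. This is a sketch under stated assumptions: `DemoParser`, `BackendError`, and `DemoError` are hypothetical stand-ins for ReadabilityJs::Nodo, ::Nodo::JavaScriptError, and ReadabilityJs::Error, and the instance method body is invented purely for demonstration.

```ruby
# Hypothetical stand-ins; none of these names belong to ReadabilityJs or Nodo.
class BackendError < StandardError; end # plays the role of ::Nodo::JavaScriptError
class DemoError < StandardError; end    # plays the role of ReadabilityJs::Error

class DemoParser
  # Instance method with positional parameters, similar in shape to a
  # function exposed through a JavaScript bridge.
  def parse(html, url, debug)
    raise BackendError, 'document is empty' if html.empty?

    { 'length' => html.length, 'url' => url, 'debug' => debug }
  end

  # Class-level wrapper: accepts keyword options, forwards them positionally,
  # and translates backend errors so callers rescue a single error class.
  def self.parse(html, url: nil, debug: false)
    new.parse(html, url, debug)
  rescue BackendError => e
    raise DemoError, e.message
  end
end

result = DemoParser.parse('<p>Hi</p>', url: 'https://example.com')
puts result.inspect

begin
  DemoParser.parse('')
rescue DemoError => e
  puts "translated: #{e.message}"
end
```

Because the positional order is fixed by the instance method, the wrapper has to pass its options in exactly that order; callers get keyword arguments and do not have to remember it.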
.probably_readerable(html) ⇒ Object
# File 'lib/readability_js/nodo.rb', line 38
def self.probably_readerable(html)
  self.is_probably_readerable(html)
end