Class: Webby::LinkValidator
- Inherits: Object
- Defined in: lib/webby/link_validator.rb
Overview
The Webby LinkValidator class is used to validate the hyperlinks of all the HTML files in the output directory. By default, only links to other pages in the output directory are checked. However, setting the :external flag to true will cause hyperlinks to external web sites to be validated as well.
Instance Attribute Summary
- #validate_externals ⇒ Object
  Returns the value of attribute validate_externals.
Class Method Summary
- .validate(opts = {}) ⇒ Object
  A convenience method that instantiates a new link validator and runs the validations.
Instance Method Summary
- #check_file(fn) ⇒ Object
  Check the given file (identified by its filename) by iterating through all the configured xpaths and validating the hyperlinks they select.
- #initialize(opts = {}) ⇒ LinkValidator (constructor)
  call-seq: LinkValidator.new( opts = {} ).
- #validate ⇒ Object
  Iterate over all the HTML files in the output directory and validate the hyperlinks.
- #validate_anchor(uri, doc) ⇒ Object
  Validate that the anchor fragment of the URI exists in the given document.
- #validate_uri(uri, dir) ⇒ Object
  Validate that the page the URI refers to actually exists.
Constructor Details
#initialize(opts = {}) ⇒ LinkValidator
call-seq:
LinkValidator.new( opts = {} )
Creates a new LinkValidator object. The only supported option is the :external flag. When set to true, the link validator will also check links to external websites. This is done by opening a connection to the remote site and pulling down the page specified in the hyperlink. Use with caution.
# File 'lib/webby/link_validator.rb', line 32
def initialize( opts = {} )
  @log = Logging::Logger[self]

  glob = ::File.join(::Webby.site.output_dir, '**', '*.html')
  @files = Dir.glob(glob).sort

  @attr_rgxp = %r/\[@(\w+)\]$/o

  @validate_externals = opts.getopt(:external, false)
  @valid_uris = ::Webby.site.valid_uris.flatten
  @invalid_uris = []
end
Instance Attribute Details
#validate_externals ⇒ Object
Returns the value of attribute validate_externals.
# File 'lib/webby/link_validator.rb', line 21
def validate_externals
  @validate_externals
end
Class Method Details
.validate(opts = {}) ⇒ Object
A convenience method that instantiates a new link validator and runs the validations.
# File 'lib/webby/link_validator.rb', line 17
def self.validate( opts = {} )
  new(opts).validate
end
Instance Method Details
#check_file(fn) ⇒ Object
Check the given file (identified by its filename) by iterating through all the configured xpaths and validating the hyperlinks they select.
# File 'lib/webby/link_validator.rb', line 55
def check_file( fn )
  @log.info "validating #{fn}"

  dir = ::File.dirname(fn)
  @doc = Hpricot(::File.read(fn))

  ::Webby.site.xpaths.each do |xpath|
    @attr_name = nil

    @doc.search(xpath).each do |element|
      @attr_name ||= @attr_rgxp.match(xpath)[1]
      uri = URI.parse(element.get_attribute(@attr_name))
      validate_uri(uri, dir)
    end
  end

  @doc = @attr_name = nil
end
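The attribute that check_file reads from each element is taken from the trailing [@attr] predicate of the xpath, using the pattern stored in @attr_rgxp by the constructor. The following is a minimal sketch of that extraction only. The helper name attr_name_for and the xpath strings are hypothetical; the real list comes from Webby.site.xpaths, whose contents are not shown on this page.

```ruby
# Same pattern the constructor assigns to @attr_rgxp.
ATTR_RGXP = %r/\[@(\w+)\]$/

# Hypothetical helper (not part of Webby): returns the attribute name
# given by the trailing [@attr] predicate, or nil when there is none.
def attr_name_for(xpath)
  match = ATTR_RGXP.match(xpath)
  match && match[1]
end

attr_name_for('/html/body//a[@href]')   # "href"
attr_name_for('/html/body//img[@src]')  # "src"
attr_name_for('/html/body//a')          # nil
```

Unlike this sketch, check_file indexes the match result directly, so an xpath that selects elements but lacks a trailing attribute predicate would raise a NoMethodError there.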
#validate ⇒ Object
Iterate over all the HTML files in the output directory and validate the hyperlinks.
# File 'lib/webby/link_validator.rb', line 47
def validate
  @files.each {|fn| check_file fn}
end
#validate_anchor(uri, doc) ⇒ Object
Validate that the anchor fragment of the URI exists in the given document. The document is an Hpricot document object.
Returns true if the anchor exists in the document and false if it does not. It also returns false when the URI has no fragment.
# File 'lib/webby/link_validator.rb', line 139
def validate_anchor( uri, doc )
  return false if uri.fragment.nil?

  anchor = '#' + uri.fragment
  if doc.at(anchor).nil?
    @log.error "invalid URI '#{uri.to_s}'"
    false
  else true end
end
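The anchor check can be illustrated without an HTML parser. The sketch below is a simplification under stated assumptions: the helper anchor_exists? is hypothetical, and it collects id attribute values with a regular expression instead of querying an Hpricot document as the real method does. It only shows the true/false contract described above.

```ruby
require 'uri'

# Hypothetical stand-in for validate_anchor. It scans raw HTML for id
# attributes; the real method asks an Hpricot document for '#fragment'.
def anchor_exists?(uri, html)
  return false if uri.fragment.nil?
  ids = html.scan(/\bid\s*=\s*["']([^"']+)["']/i).flatten
  ids.include?(uri.fragment)
end

html = '<h1 id="intro">Intro</h1><p>text</p>'
anchor_exists?(URI.parse('#intro'), html)     # true
anchor_exists?(URI.parse('#missing'), html)   # false
anchor_exists?(URI.parse('page.html'), html)  # false, no fragment
```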
#validate_uri(uri, dir) ⇒ Object
Validate that the page the URI refers to actually exists. The directory of the current page being processed is needed in order to resolve relative paths.
If the URI is relative, the output directory is searched for the corresponding page. If the URI is absolute and can be opened, the remote server is contacted and the page is requested from it; this happens only if the LinkValidator was created with the :external flag set to true. Any other URI is reported with a warning that it could not be validated.
# File 'lib/webby/link_validator.rb', line 83
def validate_uri( uri, dir )
  # for relative URIs, we can see if the file exists in the output folder
  if uri.relative?
    return validate_anchor(uri, @doc) if uri.path.empty?

    path = if uri.path =~ %r/^\//
      ::File.join(::Webby.site.output_dir, uri.path)
    else
      ::File.join(dir, uri.path)
    end
    path = ::File.join(path, 'index.html') if ::File.extname(path).empty?

    uri_str = path.dup
    (uri_str << '#' << uri.fragment) if uri.fragment
    return if @valid_uris.include? uri_str

    if test ?f, path
      valid = if uri.fragment
        validate_anchor(uri, Hpricot(::File.read(path)))
      else true end
      @valid_uris << uri_str if valid
    else
      @log.error "invalid URI '#{uri.to_s}'"
    end

  # if the URI responds to the open method, then try to access the URI
  elsif uri.respond_to? :open
    return unless @validate_externals
    return if @valid_uris.include? uri.to_s

    if @invalid_uris.include? uri.to_s
      @log.error "could not open URI '#{uri.to_s}'"
      return
    end

    begin
      uri.open {|_| nil}
      @valid_uris << uri.to_s
    rescue Exception
      @log.error "could not open URI '#{uri.to_s}'"
      @invalid_uris << uri.to_s
    end

  # otherwise, post a warning that the URI could not be validated
  else
    return if @valid_uris.include? uri.to_s
    @log.warn "could not validate URI '#{uri.to_s}'"
  end
end
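The way a relative URI is mapped to a file in the output directory can be sketched in isolation. This is an illustration, not the library's API: the helper resolve_local_path and the directory names are hypothetical, and output_dir stands in for Webby.site.output_dir.

```ruby
require 'uri'

# Hypothetical helper mirroring the path resolution in validate_uri.
def resolve_local_path(uri, dir, output_dir)
  path = if uri.path.start_with?('/')
    File.join(output_dir, uri.path)   # site-absolute link
  else
    File.join(dir, uri.path)          # relative to the current page
  end
  # A path with no file extension is treated as a directory.
  path = File.join(path, 'index.html') if File.extname(path).empty?
  path
end

resolve_local_path(URI.parse('/css/site.css'), 'output/blog', 'output')
# "output/css/site.css"
resolve_local_path(URI.parse('post.html#top'), 'output/blog', 'output')
# "output/blog/post.html"
resolve_local_path(URI.parse('/about/'), 'output/blog', 'output')
# "output/about/index.html"
```

The fragment is not part of the resolved path. In validate_uri it is appended only to the string used as the cache key in @valid_uris, so the same page with different anchors is checked separately.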