validate-website

Description

Web crawler for checking the validity of your documents

Installation

Debian

apt install ruby-dev libxslt1-dev libxml2-dev

For complete local validation, also install the tidy packages.

RubyGems

gem install validate-website

Synopsis

validate-website [OPTIONS]
validate-website-static [OPTIONS]

Examples

validate-website -v -s https://www.ruby-lang.org/
validate-website -v -x tidy -s https://www.ruby-lang.org/
validate-website -v -x nu -s https://www.ruby-lang.org/
validate-website -h

Description

validate-website is a web crawler that checks markup validity against XML Schema / DTD and reports URLs that are not found (see doc/validate-website.adoc for more information).

validate-website-static checks the markup validity of your local documents against XML Schema / DTD (see doc/validate-website-static.adoc for more information).

HTML5 is supported through libtidy5 or the Validator.nu web service.

Exit status

  • 0: Markup is valid and no 404 found.
  • 64: Invalid markup found.
  • 65: Some pages were not found.
  • 66: Both invalid markup and pages not found.
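
As a minimal sketch of how a script might act on these statuses, the snippet below maps each code from the list above to a short message. The names `EXIT_STATUS_MESSAGES`, `describe_exit_status` and `validate_site` are hypothetical helpers introduced for illustration; they are not part of validate-website.

```ruby
# Exit statuses as listed above; the messages are paraphrased.
EXIT_STATUS_MESSAGES = {
  0  => 'markup is valid and no 404 found',
  64 => 'invalid markup found',
  65 => 'pages not found',
  66 => 'invalid markup and pages not found'
}.freeze

# Hypothetical helper: turn an exit status into a readable message.
def describe_exit_status(status)
  EXIT_STATUS_MESSAGES.fetch(status) { "unexpected exit status #{status}" }
end

# Hypothetical helper: run the crawler on a URL and report the outcome.
# Returns true only when the exit status is 0.
def validate_site(url)
  system('validate-website', '-s', url)
  status = $?.exitstatus
  warn "validate-website: #{describe_exit_status(status)}"
  status == 0
end
```

Calling `validate_site('https://www.ruby-lang.org/')` assumes that validate-website is installed and available on your PATH.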

In your application

require 'validate_website/validator'
body = '<!DOCTYPE html><html></html>'
v = ValidateWebsite::Validator.new(Nokogiri::HTML(body), body)
v.valid? # => false

Jekyll static site validation

You can add this Rake task to validate a Jekyll site:

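# HTTP_URL is expected to be defined elsewhere in your Rakefile.
desc 'validate _site with validate website'
task validate: :build do
  Dir.chdir("_site") do
    system("validate-website-static",
           "--verbose",
           "--exclude", "examples",
           "--site", HTTP_URL)
    # Propagate the validator's exit status to the caller.
    exit($?.exitstatus)
  end
end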

More info

HTML5

Tidy5

If libtidy5 is found on your system, it is used by default to validate your HTML5 documents. This does not depend on a third-party service; everything is done locally.

nokogiri

nokogiri can validate HTML5 documents without a third-party service, but it reports fewer errors than tidy.

Validator.nu web service

When the --html5-validator nu option is used, HTML5 validation is done through the Validator.nu web service, so the content of your web pages is logged by a third party. This is not the case for the other validators, because validate-website uses the XML Schema or DTD files stored in the data/ directory.

Please read http://about.validator.nu/#tos for more info on the HTML5 validation service.

Use validator standalone web server locally

You can download the validator jar and start it with:

java -cp PATH_TO/vnu.jar nu.validator.servlet.Main 8888

Then pass the following option to validate-website, or set the environment variable instead:

--html5-validator-service-url http://localhost:8888/
# or
export VALIDATOR_NU_URL="http://localhost:8888/"

This keeps you from being blacklisted by the Validator.nu web service.
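
If you launch validate-website from a Ruby script, the choice between the public service and a local instance can be made when building the command line. The sketch below is only an illustration: `validate_website_command` is a hypothetical helper, not part of the tool, and it uses only the options shown in this README.

```ruby
# Hypothetical helper: build the argument list for a validate-website run
# that uses the Validator.nu backend.
def validate_website_command(site, service_url: ENV['VALIDATOR_NU_URL'])
  command = ['validate-website', '--html5-validator', 'nu', '-s', site]
  # Pass the service URL only when one is configured; otherwise the
  # tool's own default applies.
  unless service_url.nil? || service_url.empty?
    command += ['--html5-validator-service-url', service_url]
  end
  command
end
```

The resulting array can be handed to `system(*validate_website_command('https://www.ruby-lang.org/'))`. Passing the arguments as a list avoids shell quoting issues with the URLs.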

Tests

With a standard environment:

bundle exec rake

Credits

  • Thanks to tenderlove for Nokogiri; this tool is inspired by markup_validity.
  • Thanks also to Chris Kite for the Anemone web-spider framework and to postmodern for Spidr.

Contributors

See GitHub.

License

The MIT License

Copyright (c) 2009-2022 Laurent Arnoud

