basilisk
basilisk is a command-line front-end for the anemone web-crawler (github.com/chriskite/anemone). It produces reports that are useful for QA-ing websites, and it includes an extensible page processor class for writing your own page processors.
Included page processors:
- seo: generates a csv with the following columns: url, title, description, keywords, h1s, h2s.
- sitemap: generates an xml sitemap.
- image: generates a list of broken images and images lacking an alt attribute.
- error: generates a csv of urls returning HTTP response codes other than success and redirect.
See the generated yml config file for even more options.
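As a rough illustration of the page processor idea, here is a minimal, self-contained Ruby sketch of the filtering an error-style processor performs. It does not use basilisk's real classes: the Page struct, the PageProcessor base class, the process method and the rows accessor are all hypothetical names chosen for this example, and basilisk's actual processor interface may differ. It also assumes "success" means a 2xx status and "redirect" means a 3xx status.

```ruby
# Hypothetical stand-in for a crawled page. anemone's real page object
# carries much more (body, headers, links); only url and code are needed here.
Page = Struct.new(:url, :code)

# Hypothetical base class. basilisk's real processor class is not shown
# in this README, so this interface is an assumption.
class PageProcessor
  # Called once per crawled page; subclasses decide what to record.
  def process(page)
    raise NotImplementedError, "#{self.class} must implement #process"
  end
end

# Records pages whose status code is neither success (2xx) nor redirect (3xx).
class ErrorProcessor < PageProcessor
  attr_reader :rows

  def initialize
    @rows = []
  end

  def process(page)
    return if (200..399).cover?(page.code)

    @rows << [page.url, page.code]
  end
end

# Sample data standing in for the result of a crawl.
pages = [
  Page.new("http://example.com/", 200),
  Page.new("http://example.com/old", 301),
  Page.new("http://example.com/missing", 404),
  Page.new("http://example.com/broken", 500)
]

processor = ErrorProcessor.new
pages.each { |page| processor.process(page) }

# Print one "url,code" line per failing page.
processor.rows.each { |url, code| puts "#{url},#{code}" }
```

Keeping the per-page hook separate from the output step means the same collected rows could be written to a csv, a plain list, or any other report format.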
install
sudo gem install basilisk
usage
To create a new search:
basil create [search_name] [url]
- Creates a search config file ([search_name].yml). Edit this file to change the default options, to choose which page processors to run, to add regex and css terms to search for across the site, and to add regexes for skipping urls.
To run the search:
basil run [search_name]
- Runs the specified search. Note: you must create a search before running it. Files generated by the page processors will reside in a folder called [search_name].
author & license
basilisk is licensed under a modified MIT licence. See LICENCE.txt.
basilisk was written by Kyle Banker and relies heavily on the anemone web-crawler by Chris Kite.
Copyright 2009 Alexander Interactive, Inc.