weblicate
Replicate a website.
Weblicate creates a copy of a web page, complete with third party assets, to be run on your own webserver. When given a HAR file, weblicate writes all assets to local disk. It appends a domain to the end of all URLs so you can simulate external requests from sites like doubleclick and google.
HAR (HTTP Archive) files are a great way to pass around information about web pages. Firebug lets you export them.
Installation
You need Ruby and RubyGems installed
sudo gem install weblicate
Usage
Running the following command:
weblicate www.cnn.com.har
Results in the following response:
Now run these commands...
rsync -avz www.cnn.com-files/ localhost:/var/www/weblicate/www.cnn.com/
scp www.cnn.com-vhosts localhost:/etc/apache2/sites-enabled/
cat www.cnn.com-hosts >> /etc/hosts # For access from workstation
OPTIONAL - if you want access from the wider internet
sh www.cnn.com-dns # Create domains on Slicehost
and generates the following output:
www.cnn.com-files # All files and assets for the page
www.cnn.com-hosts # Entries that can be added to your local hosts file
www.cnn.com-vhosts # Apache vhosts entries for all domains
By default, weblicate appends ‘.localhost’ to all domains. You can over ride this if you want you weblicant to be accessible to the greater internet.
weblicate www.cnn.com.har yourdomain.com
Weblicate generates a script that creates the domains. It Works For Me.
www.cnn.com-dns # A script to create DNS entries (on Slicehost.com)
Copyright
Copyright © 2010 Mike Bailey. See LICENSE for details.