WebLog Parser - Readme


gem install weblog-parser


WebLog Parser reads a webserver logfile and counts page visits and unique page views. It uses a command-line interface.

Getting Started

An example log file can be found at https://tinyurl.com/ve8x3qs. Download this file and name it 'weblog.log'. Then run the command:

wlparser -i -f weblog.log

This reads the log file and accepts either ip4 or ip6 addresses as valid. The file contains both, so if you omit the -i option it will show some errors, because the default is ip4 validation only.


wlparser -h
wlparser --help

Shows a list of options.

wlparser -f logfile.log
wlparser --file logfile.log

Reads a log file and displays the results.

wlparser -m 'logfile1.log logfile2.log'
wlparser --multiple_files 'logfile1.log logfile2.log'

Reads a quoted, space-separated list of log files and displays the results. All files given using the -f or -m options will be read and the output combined.

If no files are specified, the default file 'webserver.log' will be read.
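
The file-selection rules above could be sketched with Ruby's standard OptionParser. This is only an illustration under assumptions: `files_from_args` and `DEFAULT_LOG_FILE` are hypothetical names, and the gem's own OptionHandler may be implemented differently.

```ruby
require 'optparse'

# Default file name taken from this README.
DEFAULT_LOG_FILE = 'webserver.log'

# Hypothetical helper (not the gem's code): returns the list of log files
# to read for the given command-line arguments.
def files_from_args(argv)
  files = []
  parser = OptionParser.new do |opts|
    # -f / --file adds a single file
    opts.on('-f', '--file FILE') { |file| files << file }
    # -m / --multiple_files takes one quoted, space-separated list
    opts.on('-m', '--multiple_files FILES') { |list| files.concat(list.split) }
  end
  parser.parse(argv)
  # Fall back to the default file when nothing was specified
  files.empty? ? [DEFAULT_LOG_FILE] : files
end

p files_from_args(['-f', 'a.log', '-m', 'b.log c.log'])
p files_from_args([])
```

The first call combines the files from both options into one list; the second falls back to the default file.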

wlparser -c
wlparser --color

Displays colored text output. Colors can be changed in Constants.rb.

wlparser -C
wlparser --no_color

Disables colored text output.

wlparser -v
wlparser --verbose

Shows extra information, including all validation warnings.

wlparser -q
wlparser --quiet

Displays minimal information, i.e. only important warnings. Information is still written to a file if a file output option is selected. Disables verbose.

wlparser -o
wlparser --output_file info.txt

Writes output to a file. If no file name is given, the default 'log_info.txt' is used, but this only works when the option is the last argument given.

wlparser -t
wlparser --timestamp

Adds a timestamp to the output file. If an output file is given that already exists, this is turned on automatically.

wlparser -x
wlparser --text

Sets file output format to text, similar to that displayed (default).

wlparser -j
wlparser --json

Sets file output format to JSON.
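
As a rough illustration of JSON file output, the counts could be serialized with Ruby's standard json library. The key names and the `results_to_json` helper below are assumptions made for this sketch; the layout that wlparser actually writes may differ.

```ruby
require 'json'

# Hypothetical helper (not the gem's code): builds a JSON document from two
# hashes that map a page path to a count.
def results_to_json(page_visits, unique_page_views)
  results = {
    'page_visits' => page_visits,
    'unique_page_views' => unique_page_views
  }
  JSON.pretty_generate(results)
end

puts results_to_json({ '/home' => 3, '/about' => 1 },
                     { '/home' => 2, '/about' => 1 })
```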

wlparser -4
wlparser --ip4_validation

Validates ip addresses using ip4 format (default).

wlparser -6
wlparser --ip6_validation

Validates ip addresses using ip6 format.

wlparser -i
wlparser --ip4ip6_validation

Validates ip addresses if they match either ip4 or ip6 format.

wlparser -I
wlparser --no_ip_validation

Does not validate ip addresses; assumes they are all valid.
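
The four ip validation modes above can be illustrated with the address patterns from Ruby's standard resolv library. This is a minimal sketch, assuming a hypothetical `valid_ip?` helper and mode names chosen for the example; it is not the gem's ipValidator class, which may apply different rules.

```ruby
require 'resolv'

# Hypothetical helper (not the gem's code).
# mode is :ip4, :ip6, :ip4ip6 or :none, mirroring the four options above.
def valid_ip?(address, mode = :ip4)
  ip4 = Resolv::IPv4::Regex.match?(address)
  ip6 = Resolv::IPv6::Regex.match?(address)
  case mode
  when :ip4    then ip4
  when :ip6    then ip6
  when :ip4ip6 then ip4 || ip6
  when :none   then true # no validation: every address is accepted
  else raise ArgumentError, "unknown mode: #{mode}"
  end
end

p valid_ip?('192.0.2.1')                # ip4 address, default ip4 mode
p valid_ip?('1234:1234::1234')          # ip6 address fails ip4 validation
p valid_ip?('1234:1234::1234', :ip4ip6) # accepted when either format is allowed
```

With the default mode, a log file that mixes both formats produces warnings for every ip6 address, which is why the Getting Started example uses -i.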

wlparser -p
wlparser --path_validation

Validates webpage path (default).

wlparser -P
wlparser --no_path_validation

Does not validate webpage paths; assumes they are all valid.

wlparser -r
wlparser --remove_invalid

Ignores logs if either the ip address or the path is invalid.

wlparser -R
wlparser --warn_invalid

Warns about logs with an invalid ip address or path, but still reads them (default).

wlparser -g
wlparser --page_visits

Displays page visits in results and in text file output (default).

wlparser -G
wlparser --no_page_visits

Does not display page visits in results or text file output.

wlparser -u
wlparser --unique_page_views

Displays unique page views in results and in text file output (default).

wlparser -U
wlparser --no_unique_page_views

Does not display unique page views in results or text file output.
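
To make the difference between the two statistics concrete, here is a small sketch. It assumes that a page visit is any log for a path and that a unique page view is a distinct ip address for that path; the helper names and the in-memory array of pairs are hypothetical and not taken from the gem.

```ruby
# entries is an array of [path, ip_address] pairs (hypothetical format).

# Counts every log for each path.
def page_visits(entries)
  entries.each_with_object(Hash.new(0)) { |(path, _ip), counts| counts[path] += 1 }
end

# Counts each path/ip combination only once.
def unique_page_views(entries)
  entries.uniq.each_with_object(Hash.new(0)) { |(path, _ip), counts| counts[path] += 1 }
end

entries = [
  ['/home', '192.0.2.1'],
  ['/home', '192.0.2.1'],
  ['/home', '192.0.2.2'],
  ['/about', '192.0.2.1']
]

p page_visits(entries)       # => {"/home"=>3, "/about"=>1}
p unique_page_views(entries) # => {"/home"=>2, "/about"=>1}
```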

Log Format

Logs should be on separate lines. There should be a space separator between the webpage path and the ip address.

Example log with ip4 address: \webpage\index

Logs can use either ip4 or ip6 addresses.

ip4 addresses should be valid, i.e. between 0.0.0.0 and 255.255.255.255, although you can skip this check.

Example log with ip6 address: \webpage\index 1234:1234:1234:1234:1234:1234:1234:1234

ip6 addresses can be compressed e.g.

\webpage\index 1234:1234::1234
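
Splitting a log line in this format could look like the sketch below. `parse_log_line` is a hypothetical helper written for illustration; the gem's LogReader may handle malformed lines differently.

```ruby
# Hypothetical helper (not the gem's code): splits one log line into
# [path, ip_address]. Returns nil unless the line has exactly two
# whitespace-separated fields.
def parse_log_line(line)
  fields = line.split
  fields.length == 2 ? fields : nil
end

path, ip = parse_log_line('\webpage\index 1234:1234::1234')
puts "path: #{path}"
puts "ip:   #{ip}"

p parse_log_line('missing_ip_address') # a line with one field returns nil
```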


Tests can be run from the cloned repository using:

rake test

The git repository is: https://github.com/davidmorton0/WebLogParser

Tests have been separated into:

  • unit tests - test methods in each class
  • integration tests - test the whole app
  • performance tests - parse a log file with 10,000 logs once and a log file with 100 logs 100 times, then calculate the time taken and the logs parsed per second. The log files contain a mixture of ip4 and ip6 addresses.
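
The kind of measurement the performance tests describe can be sketched with Ruby's standard benchmark library. `measure_parse_rate` is a hypothetical helper, and the generated lines and the plain `split` stand in for real log files and the real parser.

```ruby
require 'benchmark'

# Hypothetical helper (not the gem's code): times the given block and
# returns [seconds, logs_per_second] for log_count logs.
def measure_parse_rate(log_count)
  seconds = Benchmark.realtime { yield }
  [seconds, log_count / seconds]
end

# Stand-in data: 10,000 generated log lines.
lines = Array.new(10_000) { |i| "/page/#{i % 10} 192.0.2.#{(i % 250) + 1}" }

seconds, rate = measure_parse_rate(lines.size) { lines.each(&:split) }
puts format('%d logs in %.4f s (%.0f logs/second)', lines.size, seconds, rate)
```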

App structure

A class diagram can be found here: https://tinyurl.com/tky2f74. Note that the dependencies on Constants are not shown.


  • wlparser - Starts the app


  • Parser - Holds the log information and changes the format
  • LogReader - Loads files then reads logs. Validates logs, ip addresses and paths
  • ipValidator - Validates ip addresses
  • PathValidator - Validates the path for the webpage
  • OptionHandler - Sets the options from the command line arguments given
  • Formatter - Formats information for text or display output
  • OutputProcessor - Assembles the information for output
  • WarningHandler - Handles the warnings found when parsing the logs


  • LogParser - Calls the methods in order
  • Constants - Contains default options and other constants used in the app
  • TestData - Contains the data used in the tests
  • ColorText - Adds color to text
  • Version - Gives the current version number


  • test_logs contains log files used in testing