# Skab

Skab is a tool for running statistical analyses of the A/B testing experiments we run at Songkick.

We mainly use it to generate CSV files that we can plot in Google Docs to determine whether an A/B test is a success or a failure.

## Getting started

• Install skab by running `gem install skab`

• Run the tool with the `skab` command

## Command line arguments

```
skab [output] [model] [model_args]
```

The command line accepts a variable number of arguments:

• `output` is the name of the output module used to print the data

• `model` is the name of the model used to represent the process being analysed

• All other arguments are model dependent and are passed to the model
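As an illustration of how the arguments line up, here is a minimal Ruby sketch that splits an argument list into the three parts described above. The helper name `split_skab_args` is hypothetical and is not part of skab itself; it only mirrors the usage line.

```ruby
# Hypothetical helper (not part of skab): splits an argument list into
# the output name, the model name, and the remaining model arguments.
def split_skab_args(argv)
  output, model, *model_args = argv
  { output: output, model: model, model_args: model_args }
end

parsed = split_skab_args(%w[distribution poisson 1450 1430])
puts "output: #{parsed[:output]}"
puts "model: #{parsed[:model]}"
puts "model args: #{parsed[:model_args].join(' ')}"
```

Everything after the model name is handed over untouched, which matches the idea that each model interprets its own arguments.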

## Outputs

Skab is able to output different statistics, all based on the model used to generate the distribution.

We currently support two main outputs:

• Distribution: the discrete probability distribution for each group, based on the model used to represent the process

• Differential: the discrete probability distribution of the difference Xb - Xa between the two groups
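To make the differential output concrete, the following Ruby sketch derives the distribution of Xb - Xa from two discrete distributions. It assumes the two groups are independent and that each distribution is a Hash mapping a value to its probability; both are assumptions of this sketch, and the function name `differential` is hypothetical rather than skab's own code.

```ruby
# Sketch: distribution of Xb - Xa for two independent discrete
# distributions, each given as a Hash of value => probability.
def differential(dist_a, dist_b)
  diff = Hash.new(0.0)
  dist_a.each do |xa, pa|
    dist_b.each do |xb, pb|
      # Every pair (xa, xb) contributes its joint probability to xb - xa.
      diff[xb - xa] += pa * pb
    end
  end
  # Sort by difference so the result does not depend on insertion order.
  diff.sort.to_h
end

a = { 0 => 0.5, 1 => 0.5 }
b = { 0 => 0.25, 1 => 0.75 }
differential(a, b).each { |delta, prob| puts "#{delta},#{prob}" }
```

A distribution whose mass sits mostly above zero suggests that group B outperforms group A.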

## Models

Skab currently supports two models, each of which generates a distribution of the mean from the observed values:

• Poisson model, which works with the rate of events over a specific period of time

• Binomial model, which works with success rates

### The Poisson model

The Poisson model accepts two integer parameters: A and B. Each parameter is the measured number of events occurring in group A or B, respectively.

The distribution output lists the probability of each possible mean for groups A and B, according to the Poisson law of small numbers.

Here is an example, with 1450 events observed for group A and 1430 for group B:

```
skab distribution poisson 1450 1430
```

Note that the Poisson distribution is expensive to compute for large numbers (> 100), so this model approximates it with a normal distribution (using a variance of delta).
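The Ruby sketch below compares an exact Poisson likelihood with a normal approximation to show why the substitution is reasonable for large counts. It is not skab's implementation: the function names are hypothetical, and the normal curve here is assumed to be centred on the observed count with a variance equal to that count (the usual normal approximation to a Poisson), which may differ from the variance skab actually uses.

```ruby
# Exact Poisson likelihood of observing `count` events when the true mean
# is `mean`, computed in log space to avoid overflow for large counts.
def poisson_likelihood(count, mean)
  # Math.lgamma returns [log(|gamma(x)|), sign]; gamma(count + 1) = count!
  log_p = count * Math.log(mean) - mean - Math.lgamma(count + 1).first
  Math.exp(log_p)
end

# Normal approximation (assumption of this sketch): density at `mean` of a
# normal distribution centred on `count` with variance `count`.
def normal_approximation(count, mean)
  variance = count.to_f
  Math.exp(-((mean - count)**2) / (2.0 * variance)) /
    Math.sqrt(2.0 * Math::PI * variance)
end

[1400, 1450, 1500].each do |mean|
  exact = poisson_likelihood(1450, mean)
  approx = normal_approximation(1450, mean)
  puts "mean=#{mean} exact=#{exact} approx=#{approx}"
end
```

Near the observed count the two curves are almost indistinguishable, while the normal form avoids evaluating large factorials.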

### The binomial model

The binomial model generates a distribution of success rates from the number of trials and successes for each of groups A and B.

The distribution output lists the probable success rates and their respective probabilities for groups A and B.

For example, this command generates the binomial distribution for:

• 200 successes out of 450 trials for group A

• 220 successes out of 470 trials for group B

```
skab distribution binomial 450 200 470 220
```
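As a rough picture of what a distribution of success rates looks like, the Ruby sketch below weights each candidate rate by the binomial likelihood of the observed data and normalises the weights. It assumes every candidate rate is equally plausible beforehand and uses a fixed grid of rates; the function name `success_rate_distribution` and the `steps` parameter are hypothetical, and skab's own calculation may differ.

```ruby
# Sketch: discrete distribution over candidate success rates for
# `successes` out of `trials`. Each rate p is weighted by
# p**successes * (1 - p)**failures (the binomial coefficient is the same
# for every p, so it cancels out when normalising).
def success_rate_distribution(trials, successes, steps = 1000)
  failures = trials - successes
  # Skip 0 and 1 so that Math.log never receives 0.
  rates = (1...steps).map { |i| i.to_f / steps }
  log_weights = rates.map do |p|
    successes * Math.log(p) + failures * Math.log(1 - p)
  end
  # Subtract the maximum before exponentiating to avoid underflow.
  max = log_weights.max
  weights = log_weights.map { |w| Math.exp(w - max) }
  total = weights.sum
  rates.zip(weights.map { |w| w / total }).to_h
end

group_a = success_rate_distribution(450, 200)
group_b = success_rate_distribution(470, 220)
puts "most likely rate for A: #{group_a.max_by { |_, prob| prob }.first}"
puts "most likely rate for B: #{group_b.max_by { |_, prob| prob }.first}"
```

The most likely rate for each group lands close to its observed ratio of successes to trials, and the spread around it narrows as the number of trials grows.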

## Known issues

This software relies on Hash ordering to display values in the correct order. On Ruby versions older than 1.9, hash ordering is not guaranteed, which causes some output to be inconsistent (mainly the differential CSV and summary outputs).
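One way to avoid depending on Hash ordering is to sort the entries by key before printing them. The Ruby sketch below shows the idea; the helper name `rows_in_key_order` is hypothetical and this is not how skab is currently written.

```ruby
# Sketch: return [value, probability] pairs sorted by value, so the
# output order does not depend on how the Hash was populated.
def rows_in_key_order(distribution)
  distribution.sort_by { |value, _probability| value }
end

unordered = { 3 => 0.2, 1 => 0.5, 2 => 0.3 }
rows_in_key_order(unordered).each do |value, probability|
  puts "#{value},#{probability}"
end
```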