OutlierTree Ruby
:deciduous_tree: OutlierTree - explainable outlier/anomaly detection - for Ruby
Produces human-readable explanations for why values are detected as outliers
Price (2.50) looks low given Department is Books and Sale is false
:evergreen_tree: Check out IsoTree for an alternative approach that uses Isolation Forest
Installation
Add this line to your application’s Gemfile:
gem "outliertree"
Getting Started
Prep your data
data = [
  {department: "Books", sale: false, price: 2.50},
  {department: "Books", sale: true, price: 3.00},
  {department: "Movies", sale: false, price: 5.00},
  # ...
]
Train a model
model = OutlierTree.new
model.fit(data)
Get outliers
model.outliers(data)
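A common pattern is to fit on rows you trust and then check a separate batch. The sketch below wraps the two calls shown above, `fit` and `outliers`, in a hypothetical helper named `check_batch` (not part of the gem). Passing a different dataset to `outliers` than the one used for `fit` is an assumption here, and the return value is handed back untouched because its exact shape is not described in this README.

```ruby
# Hypothetical helper (not part of the gem): fit on trusted rows,
# then ask the model for outliers in a separate batch.
# `model` is any object that responds to `fit` and `outliers`,
# for example OutlierTree.new.
def check_batch(model, history, batch)
  model.fit(history)
  # Returned as-is; inspect it to see what the gem reports.
  model.outliers(batch)
end

# Usage, assuming the gem is installed and loaded:
#   flagged = check_batch(OutlierTree.new, history, batch)
#   p flagged
```

Keeping the model as an argument means the helper can be exercised with a stand-in object in tests, without compiling the native extension.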
Parameters
Pass parameters to the constructor - default values below
OutlierTree.new(
  max_depth: 4,
  min_gain: 0.01,
  z_norm: 2.67,
  z_outlier: 8.0,
  pct_outliers: 0.01,
  min_size_numeric: 25,
  min_size_categ: 50,
  categ_split: "binarize",
  categ_outliers: "tail",
  numeric_split: "raw",
  follow_all: false,
  gain_as_pct: true,
  nthreads: -1
)
See a detailed explanation
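When only a few parameters change, it can help to keep the overrides in one place and catch misspelled names early. The following is a minimal sketch, not part of the gem: `OUTLIER_TREE_DEFAULTS` and `outlier_tree_params` are hypothetical names, and the defaults are simply copied from the list above.

```ruby
# Defaults copied from the parameter list above.
OUTLIER_TREE_DEFAULTS = {
  max_depth: 4,
  min_gain: 0.01,
  z_norm: 2.67,
  z_outlier: 8.0,
  pct_outliers: 0.01,
  min_size_numeric: 25,
  min_size_categ: 50,
  categ_split: "binarize",
  categ_outliers: "tail",
  numeric_split: "raw",
  follow_all: false,
  gain_as_pct: true,
  nthreads: -1
}.freeze

# Hypothetical helper: merge overrides into the defaults, rejecting
# names that are not in the documented list (catches typos early).
def outlier_tree_params(**overrides)
  unknown = overrides.keys - OUTLIER_TREE_DEFAULTS.keys
  unless unknown.empty?
    raise ArgumentError, "unknown parameter(s): #{unknown.join(", ")}"
  end
  OUTLIER_TREE_DEFAULTS.merge(overrides)
end

params = outlier_tree_params(max_depth: 3, nthreads: 1)
# With the gem loaded: model = OutlierTree.new(**params)
```

The double splat in the final comment passes the merged hash as keyword arguments, which is the form the constructor call above uses.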
Data
Data can be an array of hashes
[
  {department: "Books", sale: false, price: 2.50},
  {department: "Books", sale: true, price: 3.00},
  {department: "Movies", sale: false, price: 5.00}
]
Or a Rover data frame
Rover.read_csv("data.csv")
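For the array-of-hashes form, records that arrive as JSON can be turned into the symbol-keyed shape shown above with Ruby's standard `json` library. This is a sketch under stated assumptions: the sample JSON is made up for illustration, `same_columns?` is a hypothetical helper, and whether the gem requires every row to share the same columns is an assumption rather than something stated here.

```ruby
require "json"

# Example input; in practice this might come from a file or an API.
json = <<~JSON
  [
    {"department": "Books", "sale": false, "price": 2.50},
    {"department": "Books", "sale": true, "price": 3.00},
    {"department": "Movies", "sale": false, "price": 5.00}
  ]
JSON

# symbolize_names: true yields symbol keys, matching the hashes above.
data = JSON.parse(json, symbolize_names: true)

# Hypothetical check (an assumption, not a documented requirement):
# every row has the same set of columns.
def same_columns?(rows)
  rows.map { |row| row.keys.sort }.uniq.size <= 1
end

raise ArgumentError, "rows have differing columns" unless same_columns?(data)
```

The resulting `data` can then be passed to `fit` and `outliers` in the same way as the hand-written array.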
Performance
OutlierTree uses OpenMP when possible for best performance. To enable OpenMP on Mac, run:
brew install libomp
Then reinstall the gem.
gem uninstall outliertree --force
bundle install
Resources
History
View the changelog
Contributing
Everyone is encouraged to help improve this project. Here are a few ways you can help:
- Report bugs
- Fix bugs and submit pull requests
- Write, clarify, or fix documentation
- Suggest or add new features
To get started with development:
git clone --recursive https://github.com/ankane/outliertree-ruby.git
cd outliertree-ruby
bundle install
bundle exec rake compile
bundle exec rake test