AprendizajeMaquina
Aprendizaje maquina is a gem that help us to write ruby machine learning algorithms.
Installation
Add this line to your application's Gemfile:
gem 'aprendizaje_maquina'
And then execute:
$ bundle
Or install it yourself as:
$ gem install aprendizaje_maquina
Usage
linear regression model
first
require 'aprendizaje_maquina'
load data from a CSV file
load = AprendizajeMaquina::Cargar.new("file.csv")
# specify the column that you want to store on a vector
y = load.to_vector(3)
# if you don't specify the column or range of columns
# this put all the data of the csv file in a matrix
matrix = load.to_matrix
# create a matrix with the data in the column 0 of the csv file
x = load.to_matrix(0) # you can specify range like this load.to_matrix(0..4)
x_with_ones = x.add_ones # this add a column of ones to the matrix
to normalize data
x.normalize
create an instance of the class RegresionLineal
regresion_lineal = AprendizajeMaquina::RegresionLineal.new(x_matrix,y_vector)
regresion_lineal.find_ecuation # (or use the alias :train) return a Vector
m = Matrix[[1,95]]
puts regresion_lineal.make_prediction(m) # (or use the alias :predict) to make predictions
# => Vector[193.45225618631895]
linear regresion with arrays
x = [74,92,63,72,58,78,85,85,73,62,80,72]
y = [168,196,170,175,162,169,190,186,176,170,176,179]
regresion_simple = AprendizajeMaquina::RegresionLineal.new(x,y)
regresion_simple.train
p regresion_simple.ecuacion
p regresion_simple.predict(95)
Logistic Classification
data = AprendizajeMaquina::Cargar.new("data.csv")
x = data.to_matrix(0..1).add_ones
y = data.to_vector(2)
initial_theta = Vector[0,0,0]
cl = AprendizajeMaquina::ClasificacionLogistica.new(x,y,initial_theta)
training
the method ClasificacionLogistica#train receives 3 inputs, the first is the numbers of iterations, the second is the alpha value(step size), last one is type of training method ('SGD' for Stochastic Gradient Descents, 'Grad' for Batch Gradiendt Descent and 'Newm' for Newton's method)
example 1:
cl.train(12,0.01,'SGD')
example 2:
cl.train(10,'NewM') # Newton's method dont use alpha
example 3:
cl.train(400,0.001,'Grad')
predictions
if cl.predict(Matrix[[1,24,0]]) == 1
p "CANSADO"
else
p "DESCANSADO"
end
make predictions for multiclass(one vs all)
initial_theta_for_each_class = [
Vector[-38.98494868465186, 3.133704064187691,-1.0058753929521247],
Vector[40.93814883472139,-3.2195737672278586, -0.8080682715294277],
Vector[-7.220460,0.256681,1.141166]
]
predicted_val = []
initial_theta_for_each_class.each do |e|
multiclass = AprendizajeMaquina::ClasificacionLogistica.new(x,y,e)
predicted_val << multiclass.predict(Matrix[[1,13.5,1.83]])
end
if predicted_val[0] == 1
puts "Vino Tinto"
elsif predicted_val[1] == 1
puts "Vino Rosado"
elsif predicted_val[2] == 1
puts "Vino Blanco"
else
puts predicted_val
end
Clustering
load_data = AprendizajeMaquina::Cargar.new('clustering_data.csv')
dataset = load_data.to_matrix
# initialize with 2 cluster centroids
clustering = AprendizajeMaquina::KmeansClustering.new(2,dataset)
# fit the model with 20 iterations
clustering.fit(20)
# watch the values in their respective cluster
p clustering.cluster(0)
p clustering.cluster(1)
# Predict the closest cluster
p clustering.predict(Vector[63,190])
Decision tree
tree = AprendizajeMaquina::DecisionTree.new(dataset)
print tree.display_tree
puts tree.predict(datatest)
License
The gem is available as open source under the terms of the MIT License.