Class: Rumale::Clustering::PowerIteration
- Inherits:
- Object
  - Object
  - Rumale::Clustering::PowerIteration
- Includes:
- Base::BaseEstimator, Base::ClusterAnalyzer
- Defined in:
- lib/rumale/clustering/power_iteration.rb
Overview
PowerIteration is a class that implements power iteration clustering.
Reference
-
F. Lin and W. W. Cohen, “Power Iteration Clustering,” Proc. ICML’10, pp. 655–662, 2010.
Instance Attribute Summary
-
#embedding ⇒ Numo::DFloat
readonly
Return the data in embedded space.
-
#n_iter ⇒ Integer
readonly
Return the number of iterations run for optimization.
Attributes included from Base::BaseEstimator
Instance Method Summary
-
#fit(x) ⇒ PowerIteration
Analyze clusters with the given training data.
-
#fit_predict(x) ⇒ Numo::Int32
Analyze clusters and assign samples to clusters.
-
#initialize(n_clusters: 8, affinity: 'rbf', gamma: nil, init: 'k-means++', max_iter: 1000, tol: 1.0e-8, eps: 1.0e-5, random_seed: nil) ⇒ PowerIteration
constructor
Create a new cluster analyzer with power iteration clustering.
-
#marshal_dump ⇒ Hash
Dump marshal data.
-
#marshal_load(obj) ⇒ nil
Load marshal data.
Methods included from Base::ClusterAnalyzer
Constructor Details
#initialize(n_clusters: 8, affinity: 'rbf', gamma: nil, init: 'k-means++', max_iter: 1000, tol: 1.0e-8, eps: 1.0e-5, random_seed: nil) ⇒ PowerIteration
Create a new cluster analyzer with power iteration clustering.
# File 'lib/rumale/clustering/power_iteration.rb', line 40
def initialize(n_clusters: 8, affinity: 'rbf', gamma: nil, init: 'k-means++',
               max_iter: 1000, tol: 1.0e-8, eps: 1.0e-5, random_seed: nil)
  check_params_integer(n_clusters: n_clusters, max_iter: max_iter)
  check_params_float(tol: tol, eps: eps)
  check_params_string(affinity: affinity, init: init)
  check_params_type_or_nil(Float, gamma: gamma)
  check_params_type_or_nil(Integer, random_seed: random_seed)
  check_params_positive(n_clusters: n_clusters, max_iter: max_iter, tol: tol, eps: eps)
  @params = {}
  @params[:n_clusters] = n_clusters
  @params[:affinity] = affinity
  @params[:gamma] = gamma
  @params[:init] = init == 'random' ? 'random' : 'k-means++'
  @params[:max_iter] = max_iter
  @params[:tol] = tol
  @params[:eps] = eps
  @params[:random_seed] = random_seed
  @params[:random_seed] ||= srand
  @embedding = nil
  @n_iter = nil
end
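With `affinity: 'rbf'`, the affinity matrix is built from an RBF kernel controlled by `gamma`. The following plain-Ruby sketch illustrates what such a kernel computes on nested arrays. The helper `rbf_affinity` is hypothetical and is not Rumale's implementation, and the fallback of `gamma` to `1 / n_features` when it is `nil` is an assumption of this sketch.

```ruby
# Hypothetical helper, not Rumale's implementation: RBF affinity between
# the rows of `x`, computed as exp(-gamma * squared Euclidean distance).
def rbf_affinity(x, gamma: nil)
  # Assumption of this sketch: fall back to 1 / n_features when gamma is nil.
  gamma ||= 1.0 / x.first.size
  x.map do |a|
    x.map do |b|
      sq_dist = a.zip(b).sum { |p, q| (p - q)**2 }
      Math.exp(-gamma * sq_dist)
    end
  end
end

points = [[0.0, 0.0], [0.0, 1.0], [3.0, 4.0]]
affinity = rbf_affinity(points, gamma: 0.5)
```

A larger `gamma` makes the affinity decay faster with distance, so only very close samples remain strongly connected.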
Instance Attribute Details
#embedding ⇒ Numo::DFloat (readonly)
Return the data in embedded space.
# File 'lib/rumale/clustering/power_iteration.rb', line 23
def embedding
  @embedding
end
#n_iter ⇒ Integer (readonly)
Return the number of iterations run for optimization.
# File 'lib/rumale/clustering/power_iteration.rb', line 27
def n_iter
  @n_iter
end
Instance Method Details
#fit(x) ⇒ PowerIteration
Analyze clusters with the given training data.
# File 'lib/rumale/clustering/power_iteration.rb', line 68
def fit(x, _y = nil)
  check_sample_array(x)
  raise ArgumentError, 'Expect the input affinity matrix to be square.' if @params[:affinity] == 'precomputed' && x.shape[0] != x.shape[1]
  # initialize some variables.
  # The constructor stores this option under :affinity (there is no :metric key).
  affinity_mat = @params[:affinity] == 'precomputed' ? x : Rumale::PairwiseMetric.rbf_kernel(x, nil, @params[:gamma])
  affinity_mat[affinity_mat.diag_indices] = 0.0
  n_samples = affinity_mat.shape[0]
  tol = @params[:tol].fdiv(n_samples)
  # calculate normalized affinity matrix.
  degrees = affinity_mat.sum(axis: 1)
  normalized_affinity_mat = (1.0 / degrees).diag.dot(affinity_mat)
  # initialize embedding space.
  @embedding = degrees / degrees.sum
  # optimization
  @n_iter = 0
  error = Numo::DFloat.ones(n_samples)
  @params[:max_iter].times do |t|
    @n_iter = t + 1
    new_embedding = normalized_affinity_mat.dot(@embedding)
    new_embedding /= new_embedding.abs.sum
    new_error = (new_embedding - @embedding).abs
    break if (new_error - error).abs.max <= tol
    @embedding = new_embedding
    error = new_error
  end
  self
end
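The loop in `fit` can be followed more easily without the Numo calls. Below is a minimal sketch of the same steps on plain Ruby arrays: zero the diagonal, row-normalize the affinity matrix by the degrees, start from the normalized degree vector, and iterate until the per-sample change stops changing. The helper name `power_iteration_embedding` and the toy affinity matrix are assumptions made for illustration; they are not part of Rumale.

```ruby
# Plain-Ruby sketch of the power iteration loop (no Numo, no Rumale).
# `power_iteration_embedding` is a hypothetical helper for illustration.
def power_iteration_embedding(affinity, max_iter: 1000, tol: 1.0e-8)
  n = affinity.size
  # Zero the diagonal so that a sample has no affinity with itself.
  a = affinity.each_with_index.map do |row, i|
    row.each_with_index.map { |v, j| i == j ? 0.0 : v.to_f }
  end
  degrees = a.map(&:sum)
  # Row-normalize the affinity matrix: W = D^-1 A.
  w = a.each_with_index.map { |row, i| row.map { |v| v / degrees[i] } }
  # Start from the degree vector scaled to unit L1 norm.
  total = degrees.sum
  embedding = degrees.map { |d| d / total }
  threshold = tol / n
  error = Array.new(n, 1.0)
  n_iter = 0
  max_iter.times do |t|
    n_iter = t + 1
    new_embedding = w.map { |row| row.zip(embedding).sum { |wij, vj| wij * vj } }
    norm = new_embedding.sum(&:abs)
    new_embedding.map! { |v| v / norm }
    new_error = new_embedding.zip(embedding).map { |p, q| (p - q).abs }
    # Stop once the per-sample change has itself stopped changing.
    break if new_error.zip(error).map { |p, q| (p - q).abs }.max <= threshold
    embedding = new_embedding
    error = new_error
  end
  [embedding, n_iter]
end

# Toy affinity matrix: samples 0-2 are strongly connected, sample 3 is not.
affinity = [
  [0.0, 0.9, 0.8, 0.1],
  [0.9, 0.0, 0.7, 0.1],
  [0.8, 0.7, 0.0, 0.2],
  [0.1, 0.1, 0.2, 0.0]
]
embedding, n_iter = power_iteration_embedding(affinity)
```

Note that the stopping rule compares successive error vectors rather than the error itself, so the loop halts when convergence slows down, not only when the embedding is fully stationary.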
#fit_predict(x) ⇒ Numo::Int32
Analyze clusters and assign samples to clusters.
# File 'lib/rumale/clustering/power_iteration.rb', line 101
def fit_predict(x)
  check_sample_array(x)
  fit(x)
  kmeans = Rumale::Clustering::KMeans.new(
    n_clusters: @params[:n_clusters], init: @params[:init],
    max_iter: @params[:max_iter], tol: @params[:tol], random_seed: @params[:random_seed]
  )
  # The embedding is one-dimensional; reshape it to an n_samples x 1 matrix for k-means.
  kmeans.fit_predict(@embedding.expand_dims(1))
end
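The second stage of `fit_predict` is ordinary k-means applied to the one-dimensional embedding. As a rough illustration of that stage only, here is a sketch of Lloyd's algorithm on scalar values. The helper `kmeans_1d`, its deterministic initialization, and the sample embedding values are assumptions of this sketch; Rumale itself delegates to `Rumale::Clustering::KMeans` with the configured `init` strategy.

```ruby
# Hypothetical helper: Lloyd's algorithm on scalar values. It stands in for
# Rumale::Clustering::KMeans only to illustrate the second stage.
def kmeans_1d(values, n_clusters:, max_iter: 100)
  sorted = values.sort
  # Deterministic initialization (an assumption of this sketch): pick
  # centers spread evenly across the sorted values.
  step_div = [n_clusters - 1, 1].max
  centers = Array.new(n_clusters) { |k| sorted[(k * (sorted.size - 1)) / step_div] }
  labels = []
  max_iter.times do
    # Assign each value to its nearest center.
    new_labels = values.map { |v| centers.each_index.min_by { |k| (v - centers[k]).abs } }
    break if new_labels == labels
    labels = new_labels
    # Move each center to the mean of its members (keep it if it has none).
    centers = centers.each_index.map do |k|
      members = values.each_index.select { |i| labels[i] == k }.map { |i| values[i] }
      members.empty? ? centers[k] : members.sum / members.size
    end
  end
  labels
end

# Made-up embedding values: three samples near 0.3 and two near 0.05.
embedding = [0.301, 0.299, 0.300, 0.051, 0.049]
labels = kmeans_1d(embedding, n_clusters: 2)
```

Samples whose embedding values are close end up with the same label, which is why the power iteration step is stopped before the embedding collapses to a constant vector.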
#marshal_dump ⇒ Hash
Dump marshal data.
# File 'lib/rumale/clustering/power_iteration.rb', line 113
def marshal_dump
  { params: @params,
    embedding: @embedding,
    n_iter: @n_iter }
end
#marshal_load(obj) ⇒ nil
Load marshal data.
# File 'lib/rumale/clustering/power_iteration.rb', line 121
def marshal_load(obj)
  @params = obj[:params]
  @embedding = obj[:embedding]
  @n_iter = obj[:n_iter]
  nil
end
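Ruby's `Marshal` calls `marshal_dump` when serializing an object and `marshal_load` on a freshly allocated instance when deserializing it. The sketch below shows that round trip with a small stand-in class. `TinyEstimator` and its fixed "fitted" values are hypothetical and exist only to mirror the pattern used by the two methods above.

```ruby
# Hypothetical stand-in class; it only mirrors the marshal_dump / marshal_load
# pattern shown above and is not part of Rumale.
class TinyEstimator
  attr_reader :params, :embedding, :n_iter

  def initialize(n_clusters: 2)
    @params = { n_clusters: n_clusters }
    @embedding = nil
    @n_iter = nil
  end

  # Pretend to fit: store fixed values as the learned state.
  def fit
    @embedding = [0.25, 0.25, 0.5]
    @n_iter = 3
    self
  end

  # Called by Marshal.dump; returns the state to serialize.
  def marshal_dump
    { params: @params, embedding: @embedding, n_iter: @n_iter }
  end

  # Called by Marshal.load on a newly allocated instance.
  def marshal_load(obj)
    @params = obj[:params]
    @embedding = obj[:embedding]
    @n_iter = obj[:n_iter]
    nil
  end
end

fitted = TinyEstimator.new(n_clusters: 3).fit
restored = Marshal.load(Marshal.dump(fitted))
```

Because `marshal_load` restores every instance variable explicitly, `initialize` is not run on the restored object, so any state added to the class later must also be added to both methods.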