Class: Rumale::Manifold::TSNE
- Inherits:
- Object
- Includes:
- Base::BaseEstimator, Base::Transformer
- Defined in:
- lib/rumale/manifold/tsne.rb
Overview
TSNE is a class that implements t-Distributed Stochastic Neighbor Embedding (t-SNE) with a fixed-point optimization algorithm. The fixed-point algorithm usually converges faster than gradient descent and does not require learning parameters such as the learning rate and momentum.
Reference
- L. van der Maaten and G. Hinton, “Visualizing Data using t-SNE,” J. of Machine Learning Research, vol. 9, pp. 2579–2605, 2008.
- Z. Yang, I. King, Z. Xu, and E. Oja, “Heavy-Tailed Symmetric Stochastic Neighbor Embedding,” Proc. NIPS’09, pp. 2169–2177, 2009.
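In the embedding space, t-SNE measures pairwise similarities with a Student-t kernel with one degree of freedom, as described in the van der Maaten and Hinton reference above. The following pure-Ruby sketch illustrates that computation on plain arrays; the method name `t_distributed_probabilities` is hypothetical, and the gem itself performs the equivalent algebra internally with Numo::DFloat matrices.

```ruby
# Student-t (one degree of freedom) similarity matrix used by t-SNE
# in the low-dimensional embedding space. Illustrative pure-Ruby sketch.
def t_distributed_probabilities(points)
  n = points.size
  # Unnormalized similarities: (1 + ||y_i - y_j||^2)^-1, zero on the diagonal.
  w = Array.new(n) { Array.new(n, 0.0) }
  n.times do |i|
    n.times do |j|
      next if i == j
      sq_dist = points[i].zip(points[j]).sum { |a, b| (a - b)**2 }
      w[i][j] = 1.0 / (1.0 + sq_dist)
    end
  end
  # Normalize so all entries sum to one, giving the joint distribution Q.
  total = w.flatten.sum
  w.map { |row| row.map { |v| v / total } }
end

q = t_distributed_probabilities([[0.0, 0.0], [1.0, 0.0], [0.0, 3.0]])
```

The heavy tail of the Student-t kernel is what lets moderately distant points in the input space be placed far apart in the embedding, which mitigates the crowding problem.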
Instance Attribute Summary
-
#embedding ⇒ Numo::DFloat
readonly
Return the data in representation space.
-
#kl_divergence ⇒ Float
readonly
Return the Kullback-Leibler divergence after optimization.
-
#n_iter ⇒ Integer
readonly
Return the number of iterations run for optimization.
-
#rng ⇒ Random
readonly
Return the random generator.
Attributes included from Base::BaseEstimator
Instance Method Summary
-
#fit(x) ⇒ TSNE
Fit the model with given training data.
-
#fit_transform(x) ⇒ Numo::DFloat
Fit the model with training data, and then transform them with the learned model.
-
#initialize(n_components: 2, perplexity: 30.0, metric: 'euclidean', init: 'random', max_iter: 500, tol: nil, verbose: false, random_seed: nil) ⇒ TSNE
constructor
Create a new transformer with t-SNE.
-
#marshal_dump ⇒ Hash
Dump marshal data.
-
#marshal_load(obj) ⇒ nil
Load marshal data.
Constructor Details
#initialize(n_components: 2, perplexity: 30.0, metric: 'euclidean', init: 'random', max_iter: 500, tol: nil, verbose: false, random_seed: nil) ⇒ TSNE
Create a new transformer with t-SNE.
# File 'lib/rumale/manifold/tsne.rb', line 59

def initialize(n_components: 2, perplexity: 30.0, metric: 'euclidean', init: 'random',
               max_iter: 500, tol: nil, verbose: false, random_seed: nil)
  check_params_integer(n_components: n_components, max_iter: max_iter)
  check_params_float(perplexity: perplexity)
  check_params_string(metric: metric, init: init)
  check_params_boolean(verbose: verbose)
  check_params_type_or_nil(Float, tol: tol)
  check_params_type_or_nil(Integer, random_seed: random_seed)
  check_params_positive(n_components: n_components, perplexity: perplexity, max_iter: max_iter)
  @params = {}
  @params[:n_components] = n_components
  @params[:perplexity] = perplexity
  @params[:max_iter] = max_iter
  @params[:tol] = tol
  @params[:metric] = metric
  @params[:init] = init
  @params[:verbose] = verbose
  @params[:random_seed] = random_seed
  @params[:random_seed] ||= srand
  @rng = Random.new(@params[:random_seed])
  @embedding = nil
  @kl_divergence = nil
  @n_iter = nil
end
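The `perplexity` parameter sets the effective number of neighbors considered for each point: internally, t-SNE searches for a per-point Gaussian bandwidth whose conditional distribution has entropy equal to the log of the perplexity. The sketch below illustrates that bisection search for a single point in pure Ruby; the method names `perplexity_of` and `find_beta` are hypothetical and not part of the gem's API.

```ruby
# Perplexity of a Gaussian conditional distribution over one point's
# squared distances to its neighbors. beta = 1 / (2 * sigma^2).
def perplexity_of(sq_dists, beta)
  probs = sq_dists.map { |d| Math.exp(-beta * d) }
  z = probs.sum
  probs.map! { |v| v / z }
  entropy = -probs.sum { |v| v > 0 ? v * Math.log(v) : 0.0 }
  Math.exp(entropy)
end

# Bisection search for the beta that matches a target perplexity.
def find_beta(sq_dists, target_perplexity, iters: 50)
  lo = 0.0
  hi = 1.0e6
  beta = 1.0
  iters.times do
    if perplexity_of(sq_dists, beta) > target_perplexity
      lo = beta # distribution too spread out: raise beta (shrink sigma)
    else
      hi = beta # distribution too concentrated: lower beta (widen sigma)
    end
    beta = (lo + hi) / 2.0
  end
  beta
end
```

Perplexity decreases monotonically in beta, which is why a simple bisection suffices. Values between 5 and 50 are commonly recommended for the `perplexity` parameter.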
Instance Attribute Details
#embedding ⇒ Numo::DFloat (readonly)
Return the data in representation space.
# File 'lib/rumale/manifold/tsne.rb', line 30

def embedding
  @embedding
end
#kl_divergence ⇒ Float (readonly)
Return the Kullback-Leibler divergence after optimization.
# File 'lib/rumale/manifold/tsne.rb', line 34

def kl_divergence
  @kl_divergence
end
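The value reported by #kl_divergence is the t-SNE objective: the Kullback-Leibler divergence KL(P || Q) = Σ p_ij log(p_ij / q_ij) between the high-dimensional affinities P and the embedding similarities Q. A pure-Ruby sketch of that sum over two probability matrices (illustrative only; `kl_divergence` here is a standalone function, not the gem's reader method):

```ruby
# KL divergence between two joint probability matrices given as
# nested arrays. Terms with p == 0 contribute nothing by convention.
def kl_divergence(p_mat, q_mat)
  p_mat.flatten.zip(q_mat.flatten).sum do |p, q|
    p > 0 ? p * Math.log(p / q) : 0.0
  end
end
```

The divergence is zero only when P and Q agree, so smaller values after optimization indicate that the embedding reproduces the input neighborhood structure more faithfully.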
#n_iter ⇒ Integer (readonly)
Return the number of iterations run for optimization.
# File 'lib/rumale/manifold/tsne.rb', line 38

def n_iter
  @n_iter
end
#rng ⇒ Random (readonly)
Return the random generator.
# File 'lib/rumale/manifold/tsne.rb', line 42

def rng
  @rng
end
Instance Method Details
#fit(x) ⇒ TSNE
Fit the model with given training data.
# File 'lib/rumale/manifold/tsne.rb', line 91

def fit(x, _not_used = nil)
  check_sample_array(x)
  raise ArgumentError, 'Expect the input distance matrix to be square.' if @params[:metric] == 'precomputed' && x.shape[0] != x.shape[1]
  # initialize some variables.
  @n_iter = 0
  distance_mat = @params[:metric] == 'precomputed' ? x**2 : Rumale::PairwiseMetric.squared_error(x)
  hi_prob_mat = gaussian_distributed_probability_matrix(distance_mat)
  y = init_embedding(x)
  lo_prob_mat = t_distributed_probability_matrix(y)
  # perform fixed-point optimization.
  one_vec = Numo::DFloat.ones(x.shape[0]).expand_dims(1)
  @params[:max_iter].times do |t|
    break if terminate?(hi_prob_mat, lo_prob_mat)
    a = hi_prob_mat * lo_prob_mat
    b = lo_prob_mat * lo_prob_mat
    y = (b.dot(one_vec) * y + (a - b).dot(y)) / a.dot(one_vec)
    lo_prob_mat = t_distributed_probability_matrix(y)
    @n_iter = t + 1
    puts "[t-SNE] KL divergence after #{@n_iter} iterations: #{cost(hi_prob_mat, lo_prob_mat)}" if @params[:verbose] && (@n_iter % 100).zero?
  end
  # store results.
  @embedding = y
  @kl_divergence = cost(hi_prob_mat, lo_prob_mat)
  self
end
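The loop above performs the fixed-point update y ← (diag(B·1)·y + (A − B)·y) / (A·1), where A = P ∘ Q and B = Q ∘ Q are elementwise products of the high- and low-dimensional probability matrices. The sketch below writes one such update out in pure Ruby on nested arrays so the algebra is explicit; `fixed_point_step` is a hypothetical name, and the gem performs the same computation with Numo::DFloat matrix operations.

```ruby
# One fixed-point update of the embedding y given the input affinities
# p_mat (P) and the current embedding similarities q_mat (Q).
def fixed_point_step(p_mat, q_mat, y)
  n = y.size
  dim = y.first.size
  # Elementwise products A = P .* Q and B = Q .* Q.
  a = p_mat.map.with_index { |row, i| row.map.with_index { |v, j| v * q_mat[i][j] } }
  b = q_mat.map.with_index { |row, i| row.map.with_index { |v, j| v * q_mat[i][j] } }
  a_row_sum = a.map(&:sum) # A * 1
  b_row_sum = b.map(&:sum) # B * 1
  Array.new(n) do |i|
    Array.new(dim) do |d|
      cross = (0...n).sum { |j| (a[i][j] - b[i][j]) * y[j][d] }
      (b_row_sum[i] * y[i][d] + cross) / a_row_sum[i]
    end
  end
end
```

Note that when P equals Q the cross term vanishes and the update returns y unchanged, which is exactly the fixed-point condition: the optimization stops moving once the embedding similarities match the input affinities.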
#fit_transform(x) ⇒ Numo::DFloat
Fit the model with training data, and then transform them with the learned model.
# File 'lib/rumale/manifold/tsne.rb', line 124

def fit_transform(x, _not_used = nil)
  fit(x)
  @embedding.dup
end
#marshal_dump ⇒ Hash
Dump marshal data.
# File 'lib/rumale/manifold/tsne.rb', line 131

def marshal_dump
  { params: @params,
    embedding: @embedding,
    kl_divergence: @kl_divergence,
    n_iter: @n_iter,
    rng: @rng }
end
#marshal_load(obj) ⇒ nil
Load marshal data.
# File 'lib/rumale/manifold/tsne.rb', line 141

def marshal_load(obj)
  @params = obj[:params]
  @embedding = obj[:embedding]
  @kl_divergence = obj[:kl_divergence]
  @n_iter = obj[:n_iter]
  @rng = obj[:rng]
  nil
end