Class: Rumale::Ensemble::AdaBoostRegressor
- Inherits: Object
- Includes:
- Base::BaseEstimator, Base::Regressor
- Defined in:
- lib/rumale/ensemble/ada_boost_regressor.rb
Overview
AdaBoostRegressor is a class that implements AdaBoost.RT for regression. This class uses a decision tree as the weak learner.
Reference
- D. L. Shrestha and D. P. Solomatine, “Experiments with AdaBoost.RT, an Improved Boosting Scheme for Regression,” Neural Computation 18 (7), pp. 1678–1710, 2006.
-
Instance Attribute Summary
- #estimator_weights ⇒ Numo::DFloat (readonly): Return the weight for each weak learner.
- #estimators ⇒ Array<DecisionTreeRegressor> (readonly): Return the set of estimators.
- #feature_importances ⇒ Numo::DFloat (readonly): Return the importance for each feature.
- #rng ⇒ Random (readonly): Return the random generator for random selection of feature index.
Attributes included from Base::BaseEstimator
Instance Method Summary
- #fit(x, y) ⇒ AdaBoostRegressor: Fit the model with given training data.
- #initialize(n_estimators: 10, threshold: 0.2, exponent: 1.0, criterion: 'mse', max_depth: nil, max_leaf_nodes: nil, min_samples_leaf: 1, max_features: nil, random_seed: nil) ⇒ AdaBoostRegressor (constructor): Create a new regressor with AdaBoost.
- #marshal_dump ⇒ Hash: Dump marshal data.
- #marshal_load(obj) ⇒ nil: Load marshal data.
- #predict(x) ⇒ Numo::DFloat: Predict values for samples.
Methods included from Base::Regressor
Constructor Details
#initialize(n_estimators: 10, threshold: 0.2, exponent: 1.0, criterion: 'mse', max_depth: nil, max_leaf_nodes: nil, min_samples_leaf: 1, max_features: nil, random_seed: nil) ⇒ AdaBoostRegressor
Create a new regressor with AdaBoost.
```ruby
# File 'lib/rumale/ensemble/ada_boost_regressor.rb', line 58

def initialize(n_estimators: 10, threshold: 0.2, exponent: 1.0,
               criterion: 'mse', max_depth: nil, max_leaf_nodes: nil,
               min_samples_leaf: 1, max_features: nil, random_seed: nil)
  check_params_type_or_nil(Integer, max_depth: max_depth, max_leaf_nodes: max_leaf_nodes,
                           max_features: max_features, random_seed: random_seed)
  check_params_integer(n_estimators: n_estimators, min_samples_leaf: min_samples_leaf)
  check_params_float(threshold: threshold, exponent: exponent)
  check_params_string(criterion: criterion)
  check_params_positive(n_estimators: n_estimators, threshold: threshold, exponent: exponent,
                        max_depth: max_depth, max_leaf_nodes: max_leaf_nodes,
                        min_samples_leaf: min_samples_leaf, max_features: max_features)
  @params = {}
  @params[:n_estimators] = n_estimators
  @params[:threshold] = threshold
  @params[:exponent] = exponent
  @params[:criterion] = criterion
  @params[:max_depth] = max_depth
  @params[:max_leaf_nodes] = max_leaf_nodes
  @params[:min_samples_leaf] = min_samples_leaf
  @params[:max_features] = max_features
  @params[:random_seed] = random_seed
  @params[:random_seed] ||= srand
  @estimators = nil
  @feature_importances = nil
  @rng = Random.new(@params[:random_seed])
end
```
Instance Attribute Details
#estimator_weights ⇒ Numo::DFloat (readonly)
Return the weight for each weak learner.
```ruby
# File 'lib/rumale/ensemble/ada_boost_regressor.rb', line 33

def estimator_weights
  @estimator_weights
end
```
#estimators ⇒ Array<DecisionTreeRegressor> (readonly)
Return the set of estimators.
```ruby
# File 'lib/rumale/ensemble/ada_boost_regressor.rb', line 29

def estimators
  @estimators
end
```
#feature_importances ⇒ Numo::DFloat (readonly)
Return the importance for each feature.
```ruby
# File 'lib/rumale/ensemble/ada_boost_regressor.rb', line 37

def feature_importances
  @feature_importances
end
```
#rng ⇒ Random (readonly)
Return the random generator for random selection of feature index.
```ruby
# File 'lib/rumale/ensemble/ada_boost_regressor.rb', line 41

def rng
  @rng
end
```
Instance Method Details
#fit(x, y) ⇒ AdaBoostRegressor
Fit the model with given training data.
```ruby
# File 'lib/rumale/ensemble/ada_boost_regressor.rb', line 91

def fit(x, y) # rubocop:disable Metrics/AbcSize
  check_sample_array(x)
  check_tvalue_array(y)
  check_sample_tvalue_size(x, y)
  # Check target values
  raise ArgumentError, 'Expect target value vector to be 1-D arrray' unless y.shape.size == 1
  # Initialize some variables.
  n_samples, n_features = x.shape
  @params[:max_features] = n_features unless @params[:max_features].is_a?(Integer)
  @params[:max_features] = [[1, @params[:max_features]].max, n_features].min
  observation_weights = Numo::DFloat.zeros(n_samples) + 1.fdiv(n_samples)
  @estimators = []
  @estimator_weights = []
  @feature_importances = Numo::DFloat.zeros(n_features)
  sub_rng = @rng.dup
  # Construct forest.
  @params[:n_estimators].times do |_t|
    # Fit weak learner.
    ids = Rumale::Utils.choice_ids(n_samples, observation_weights, sub_rng)
    tree = Tree::DecisionTreeRegressor.new(
      criterion: @params[:criterion], max_depth: @params[:max_depth],
      max_leaf_nodes: @params[:max_leaf_nodes], min_samples_leaf: @params[:min_samples_leaf],
      max_features: @params[:max_features], random_seed: sub_rng.rand(Rumale::Values.int_max)
    )
    tree.fit(x[ids, true], y[ids])
    p = tree.predict(x)
    # Calculate errors.
    abs_err = ((p - y) / y).abs
    err = observation_weights[abs_err.gt(@params[:threshold])].sum
    break if err <= 0.0
    # Calculate weight.
    beta = err**@params[:exponent]
    weight = Math.log(1.fdiv(beta))
    # Store model.
    @estimators.push(tree)
    @estimator_weights.push(weight)
    @feature_importances += weight * tree.feature_importances
    # Update observation weights.
    update = Numo::DFloat.ones(n_samples)
    update[abs_err.le(@params[:threshold])] = beta
    observation_weights *= update
    observation_weights = observation_weights.clip(1.0e-15, nil)
    sum_observation_weights = observation_weights.sum
    break if sum_observation_weights.zero?
    observation_weights /= sum_observation_weights
  end
  @estimator_weights = Numo::DFloat.asarray(@estimator_weights)
  @feature_importances /= @estimator_weights.sum
  self
end
```
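The per-round bookkeeping in fit can be illustrated with plain Ruby arrays. This is a hypothetical sketch of one AdaBoost.RT boosting round (not Rumale code); the predictions stand in for a fitted tree's output:

```ruby
# Hypothetical values standing in for one boosting round.
y         = [2.0, 4.0, 5.0, 10.0]        # targets
p         = [2.1, 3.0, 5.2, 10.1]        # weak learner's predictions
weights   = [0.25, 0.25, 0.25, 0.25]     # observation weights (sum to 1)
threshold = 0.2
exponent  = 1.0

# Relative error per sample, as in the implementation: |(p - y) / y|.
abs_err = y.each_index.map { |i| ((p[i] - y[i]) / y[i]).abs }

# Weighted error rate: total weight of the samples whose relative
# error exceeds the threshold.
err = y.each_index.sum { |i| abs_err[i] > threshold ? weights[i] : 0.0 }

# Learner weight: beta = err**exponent, weight = log(1 / beta).
beta   = err**exponent
weight = Math.log(1.0 / beta)

# Down-weight the samples the learner got (relatively) right,
# then renormalize so the weights sum to one again.
updated = y.each_index.map { |i| abs_err[i] <= threshold ? weights[i] * beta : weights[i] }
total   = updated.sum
updated = updated.map { |w| w / total }
```

Only sample 1 exceeds the threshold here, so err is 0.25, the learner gets weight log(4), and the well-predicted samples are shrunk by beta before renormalization.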
#marshal_dump ⇒ Hash
Dump marshal data.
```ruby
# File 'lib/rumale/ensemble/ada_boost_regressor.rb', line 159

def marshal_dump
  { params: @params,
    estimators: @estimators,
    estimator_weights: @estimator_weights,
    feature_importances: @feature_importances,
    rng: @rng }
end
```
#marshal_load(obj) ⇒ nil
Load marshal data.
```ruby
# File 'lib/rumale/ensemble/ada_boost_regressor.rb', line 169

def marshal_load(obj)
  @params = obj[:params]
  @estimators = obj[:estimators]
  @estimator_weights = obj[:estimator_weights]
  @feature_importances = obj[:feature_importances]
  @rng = obj[:rng]
  nil
end
```
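These two methods are the hooks Ruby's Marshal module calls, so a trained estimator can be serialized with Marshal.dump and restored with Marshal.load. The mechanism can be shown with a minimal stand-in class (illustrative only, not Rumale code):

```ruby
# Minimal class defining the same Marshal hooks the regressor uses.
class TinyModel
  attr_reader :params, :weights

  def initialize(params = {}, weights = [])
    @params  = params
    @weights = weights
  end

  # Called by Marshal.dump: return the state to serialize.
  def marshal_dump
    { params: @params, weights: @weights }
  end

  # Called by Marshal.load: restore state from the dumped hash.
  def marshal_load(obj)
    @params  = obj[:params]
    @weights = obj[:weights]
    nil
  end
end

model    = TinyModel.new({ n_estimators: 10 }, [0.5, 1.2])
restored = Marshal.load(Marshal.dump(model))
```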
#predict(x) ⇒ Numo::DFloat
Predict values for samples.
```ruby
# File 'lib/rumale/ensemble/ada_boost_regressor.rb', line 146

def predict(x)
  check_sample_array(x)
  n_samples, = x.shape
  predictions = Numo::DFloat.zeros(n_samples)
  @estimators.size.times do |t|
    predictions += @estimator_weights[t] * @estimators[t].predict(x)
  end
  sum_weight = @estimator_weights.sum
  predictions / sum_weight
end
```
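The final prediction is the weighted average of the weak learners' outputs, normalized by the total learner weight. In plain Ruby, with hypothetical numbers:

```ruby
# Predictions from three weak learners for two samples, plus the
# learners' weights (hypothetical values).
learner_preds   = [[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]]
learner_weights = [0.5, 1.0, 1.5]

n_samples   = learner_preds.first.size
sum_weight  = learner_weights.sum
predictions = Array.new(n_samples, 0.0)

# Accumulate each learner's weighted predictions.
learner_preds.each_with_index do |preds, t|
  preds.each_with_index do |p, i|
    predictions[i] += learner_weights[t] * p
  end
end

# Normalize by the total learner weight, as #predict does.
predictions = predictions.map { |v| v / sum_weight }
```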