Class: Statsample::Regression::Multiple::MatrixEngine
Inherits: BaseEngine
- Object
- BaseEngine
- Statsample::Regression::Multiple::MatrixEngine
Defined in: lib/statsample/regression/multiple/matrixengine.rb
Overview
Pure Ruby class for multiple regression analysis, based on a covariance or correlation matrix.
Use Statsample::Regression::Multiple::RubyEngine if you have a Dataset, to avoid setting all the details by hand.
Remember: NEVER use a covariance matrix if you have missing data. Use only a correlation matrix in that case.
Example:
matrix=[[1.0, 0.5, 0.2], [0.5, 1.0, 0.7], [0.2, 0.7, 1.0]]
lr=Statsample::Regression::Multiple::MatrixEngine.new(matrix,2)
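As a rough illustration of what a matrix-based regression works with, the sketch below derives standardized coefficients directly from the example correlation matrix. It assumes that `2` selects the third variable as the criterion (an interpretation; this page does not say how the index is resolved). It uses plain arrays and a hypothetical helper `solve_2x2`, not the statsample API.

```ruby
# Correlation matrix from the example; variable 2 is treated as the criterion
# (assumption made for this sketch).
matrix = [[1.0, 0.5, 0.2],
          [0.5, 1.0, 0.7],
          [0.2, 0.7, 1.0]]
y_var      = 2
predictors = [0, 1]

# Hypothetical helper: solves the 2x2 system a * x = b with Cramer's rule.
def solve_2x2(a, b)
  det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
  raise ArgumentError, "singular system" if det.abs < 1e-15
  [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
   (a[0][0] * b[1] - b[0] * a[1][0]) / det]
end

# Correlations among predictors, and between each predictor and the criterion.
r_xx = predictors.map { |i| predictors.map { |j| matrix[i][j] } }
r_xy = predictors.map { |i| matrix[i][y_var] }

# Standardized coefficients solve r_xx * beta = r_xy.
betas = solve_2x2(r_xx, r_xy)
puts betas.map { |b| b.round(3) }.inspect   # prints [-0.2, 0.8]
```

Plain arrays keep the sketch free of dependencies; the constructor shown further down works on matrix objects instead.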
Direct Known Subclasses
Instance Attribute Summary
- #cases ⇒ Object
- #digits ⇒ Object (writeonly)
  Sets the attribute digits.
- #x_mean ⇒ Object
  Hash of means for predictors.
- #x_sd ⇒ Object
  Hash of standard deviations of predictors.
- #y_mean ⇒ Object
  Mean of the criterion.
- #y_sd ⇒ Object
  Standard deviation of the criterion. Only useful with a correlation matrix, where it defaults to 1.
Attributes inherited from BaseEngine
#digits, #name, #total_cases, #valid_cases
Instance Method Summary
- #coeffs ⇒ Object
  Hash of b or raw coefficients.
- #coeffs_se ⇒ Object
  Standard error for coefficients.
- #constant ⇒ Object
  Value of constant.
- #constant_se ⇒ Object
  Standard error for constant.
- #constant_t ⇒ Object
  t value for constant.
- #df_e ⇒ Object
  Degrees of freedom for error.
- #df_r ⇒ Object
  Degrees of freedom for regression.
- #initialize(matrix, y_var, opts = Hash.new) ⇒ MatrixEngine (constructor)
  Create object.
- #r ⇒ Object
  Multiple correlation, on random models.
- #r2 ⇒ Object
  Get R^2 for the regression. For fixed models, it is the coefficient of determination.
- #sst ⇒ Object
  Total sum of squares.
- #standarized_coeffs ⇒ Object
  Hash of beta or standardized coefficients.
- #tolerance(var) ⇒ Object
  Tolerance for a given variable, defined as (1-R^2) of the regression of the other independent variables over the selected one. Reference: talkstats.com/showthread.php?t=5056.
Methods inherited from BaseEngine
#anova, #assign_names, #coeffs_t, #coeffs_tolerances, #estimated_variance_covariance_matrix, #f, #mse, #msr, #predicted, #probability, #process, #r2_adjusted, #report_building, #residuals, #se_estimate, #se_r2, #sse, #sse_direct, #ssr, #ssr_direct, #standarized_predicted, #univariate?
Methods included from Summarizable
Constructor Details
#initialize(matrix, y_var, opts = Hash.new) ⇒ MatrixEngine
Create object
# File 'lib/statsample/regression/multiple/matrixengine.rb', line 36
def initialize(matrix,y_var, opts=Hash.new)
  matrix.extend Statsample::CovariateMatrix
  raise "#{y_var} variable should be on data" unless matrix.fields.include? y_var
  if matrix._type==:covariance
    @matrix_cov=matrix
    @matrix_cor=matrix.correlation
    @no_covariance=false
  else
    @matrix_cor=matrix
    @matrix_cov=matrix
    @no_covariance=true
  end

  @y_var=y_var
  @fields=matrix.fields-[y_var]

  @n_predictors=@fields.size
  @predictors_n=@n_predictors
  @matrix_x= @matrix_cor.submatrix(@fields)
  @matrix_x_cov= @matrix_cov.submatrix(@fields)
  raise LinearDependency, "Regressors are linearly dependent" if @matrix_x.determinant<1e-15

  @matrix_y = @matrix_cor.submatrix(@fields, [y_var])
  @matrix_y_cov = @matrix_cov.submatrix(@fields, [y_var])

  @y_sd=Math::sqrt(@matrix_cov.submatrix([y_var])[0,0])

  @x_sd=@n_predictors.times.inject({}) {|ac,i|
    ac[@matrix_x_cov.fields[i]]=Math::sqrt(@matrix_x_cov[i,i])
    ac;
  }

  @cases=nil
  @x_mean=@fields.inject({}) {|ac,f|
    ac[f]=0.0
    ac;
  }

  @y_mean=0.0
  @name=_("Multiple reggresion of %s on %s") % [@fields.join(","), @y_var]

  opts_default = {:digits=>3}
  opts = opts_default.merge opts
  opts.each{|k,v|
    self.send("#{k}=",v) if self.respond_to? k
  }
  result_matrix=@matrix_x_cov.inverse * @matrix_y_cov

  if matrix._type == :covariance
    @coeffs=result_matrix.column(0).to_a
    @coeffs_stan=coeffs.collect {|k,v|
      coeffs[k]*@x_sd[k].quo(@y_sd)
    }
  else
    @coeffs_stan=result_matrix.column(0).to_a
    @coeffs=standarized_coeffs.collect {|k,v|
      standarized_coeffs[k]*@y_sd.quo(@x_sd[k])
    }
  end
  @total_cases=@valid_cases=@cases
end
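The last branch of the constructor converts between raw (b) and standardized (beta) coefficients by scaling with the ratio of standard deviations. A minimal standalone sketch of that conversion follows; the variable names and values are made up for illustration and nothing here calls the library.

```ruby
# Hypothetical standard deviations and standardized coefficients.
y_sd  = 2.0
x_sd  = { "a" => 4.0, "b" => 0.5 }
betas = { "a" => -0.2, "b" => 0.8 }

# Raw coefficient: b_k = beta_k * sd_y / sd_k
raw = betas.map { |k, beta| [k, beta * y_sd / x_sd[k]] }.to_h

# And back again: beta_k = b_k * sd_k / sd_y
back = raw.map { |k, b| [k, b * x_sd[k] / y_sd] }.to_h

puts raw.inspect
puts back.inspect
```

With a correlation matrix every standard deviation is 1, so both sets of coefficients coincide until other standard deviations are supplied.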
Instance Attribute Details
#cases ⇒ Object
# File 'lib/statsample/regression/multiple/matrixengine.rb', line 98
def cases
  raise "You should define the number of valid cases first" if @cases.nil?
  @cases
end
#digits=(value) ⇒ Object (writeonly)
Sets the attribute digits
# File 'lib/statsample/regression/multiple/matrixengine.rb', line 33
def digits=(value)
  @digits = value
end
#x_mean ⇒ Object
Hash of means for predictors. By default, each mean is set to 0.
# File 'lib/statsample/regression/multiple/matrixengine.rb', line 26
def x_mean
  @x_mean
end
#x_sd ⇒ Object
Hash of standard deviations of predictors. Only useful with a correlation matrix, where each value defaults to 1.
# File 'lib/statsample/regression/multiple/matrixengine.rb', line 21
def x_sd
  @x_sd
end
#y_mean ⇒ Object
Mean of the criterion. By default, set to 0.
# File 'lib/statsample/regression/multiple/matrixengine.rb', line 29
def y_mean
  @y_mean
end
#y_sd ⇒ Object
Standard deviation of the criterion. Only useful with a correlation matrix, where it defaults to 1.
# File 'lib/statsample/regression/multiple/matrixengine.rb', line 24
def y_sd
  @y_sd
end
Instance Method Details
#coeffs ⇒ Object
Hash of b or raw coefficients
# File 'lib/statsample/regression/multiple/matrixengine.rb', line 121
def coeffs
  assign_names(@coeffs)
end
#coeffs_se ⇒ Object
Standard error for coefficients. The standard error of a coefficient depends on:
- Tolerance of the coefficient: lower tolerance implies higher error
- R^2 of the model: higher r2 implies lower error
Reference:
- Cohen et al. (2003). Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences
# File 'lib/statsample/regression/multiple/matrixengine.rb', line 159
def coeffs_se
  out={}
  #mse=sse.quo(df_e)
  coeffs.each {|k,v|
    out[k]=@y_sd.quo(@x_sd[k])*Math::sqrt( 1.quo(tolerance(k)))*Math::sqrt((1-r2).quo(df_e))
  }
  out
end
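The expression in `coeffs_se` can be evaluated by hand. The sketch below plugs in illustrative numbers; R^2 = 0.52, a tolerance of 0.75, 100 cases and 2 predictors are all assumptions chosen for the example, and the code is independent of the library.

```ruby
# Hypothetical inputs for a single predictor.
y_sd      = 1.0
x_sd      = 1.0
tolerance = 0.75   # 1 - R^2 of this predictor regressed on the others
r2        = 0.52   # R^2 of the full model
cases     = 100
n_pred    = 2
df_e      = cases - n_pred - 1

# se = (sd_y / sd_x) * sqrt(1 / tolerance) * sqrt((1 - R^2) / df_e)
se = (y_sd / x_sd) * Math.sqrt(1.0 / tolerance) * Math.sqrt((1.0 - r2) / df_e)

# Same model, but with a much lower tolerance for the predictor.
se_low_tol = (y_sd / x_sd) * Math.sqrt(1.0 / 0.25) * Math.sqrt((1.0 - r2) / df_e)

puts se.round(4)
puts se_low_tol > se   # a lower tolerance inflates the standard error
```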
#constant ⇒ Object
Value of constant
# File 'lib/statsample/regression/multiple/matrixengine.rb', line 116
def constant
  c = coeffs
  @y_mean - @fields.inject(0) { |a,k| a + (c[k] * @x_mean[k])}
end
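The constant is the criterion mean minus the coefficient-weighted predictor means. A small standalone sketch of the same arithmetic, with made-up coefficients and means:

```ruby
# Hypothetical raw coefficients and means.
coeffs = { "a" => 2.0, "b" => -1.0 }
x_mean = { "a" => 3.0, "b" => 4.0 }
y_mean = 10.0

# constant = mean(y) - sum(b_k * mean(x_k))
constant = y_mean - coeffs.sum { |k, b| b * x_mean[k] }

puts constant   # 10 - (6 - 4) = 8.0
```

With the default means of 0 set in the constructor, the same expression evaluates to 0.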
#constant_se ⇒ Object
Standard error for constant. This method recreates the estimated variance-covariance matrix using means, standard deviations and the covariance matrix, so it needs the covariance matrix; it returns nil when only a correlation matrix was given.
# File 'lib/statsample/regression/multiple/matrixengine.rb', line 176
def constant_se
  return nil if @no_covariance
  means=@x_mean
  #means[@y_var]=@y_mean
  means[:constant]=1
  sd=@x_sd
  #sd[@y_var]=@y_sd
  sd[:constant]=0
  fields=[:constant]+@matrix_cov.fields-[@y_var]
  # Recreate X'X using the variance-covariance matrix
  xt_x=::Matrix.rows(fields.collect {|i|
    fields.collect {|j|
      if i==:constant or j==:constant
        cov=0
      elsif i==j
        cov=sd[i]**2
      else
        cov=@matrix_cov.submatrix(i..i,j..j)[0,0]
      end
      cov*(@cases-1)+@cases*means[i]*means[j]
    }
  })
  matrix=xt_x.inverse * mse
  matrix.collect {|i| Math::sqrt(i) if i>0 }[0,0]
end
#constant_t ⇒ Object
t value for constant
# File 'lib/statsample/regression/multiple/matrixengine.rb', line 168
def constant_t
  return nil if constant_se.nil?
  constant.to_f / constant_se
end
#df_e ⇒ Object
Degrees of freedom for error
# File 'lib/statsample/regression/multiple/matrixengine.rb', line 139
def df_e
  cases-@n_predictors-1
end
#df_r ⇒ Object
Degrees of freedom for regression
# File 'lib/statsample/regression/multiple/matrixengine.rb', line 135
def df_r
  @n_predictors
end
#r ⇒ Object
Multiple correlation, on random models.
# File 'lib/statsample/regression/multiple/matrixengine.rb', line 112
def r
  Math::sqrt(r2)
end
#r2 ⇒ Object
Get R^2 for the regression. For fixed models, it is the coefficient of determination. On random models, it is the ‘squared-multiple correlation’. Equal to:
- 1-(|R| / |R_x|), or
- Sum(b_i*r_yi) <- used
# File 'lib/statsample/regression/multiple/matrixengine.rb', line 108
def r2
  @n_predictors.times.inject(0) {|ac,i| ac+@coeffs_stan[i]* @matrix_y[i,0]}
end
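The two expressions for R^2 can be checked against each other on a small case. The sketch below uses the correlation matrix from the overview, treats the last variable as the criterion (an assumption), and relies on a hypothetical `det` helper written for this example rather than on any library call.

```ruby
# Full correlation matrix; the last variable is treated as the criterion.
r = [[1.0, 0.5, 0.2],
     [0.5, 1.0, 0.7],
     [0.2, 0.7, 1.0]]

# Hypothetical helper: determinant by Laplace expansion along the first row.
def det(m)
  return m[0][0] if m.size == 1
  m[0].each_with_index.sum(0.0) do |value, col|
    minor = m.drop(1).map { |row| row[0...col] + row[(col + 1)..-1] }
    (col.even? ? 1 : -1) * value * det(minor)
  end
end

r_x  = r[0..1].map { |row| row[0..1] }   # correlations among predictors
r_xy = [r[0][2], r[1][2]]                # predictor-criterion correlations

# Standardized coefficients for this matrix (solution of r_x * beta = r_xy).
betas = [-0.2, 0.8]

r2_sum = betas.zip(r_xy).sum { |beta, r_yi| beta * r_yi }   # Sum(b_i * r_yi)
r2_det = 1.0 - det(r) / det(r_x)                            # 1 - |R| / |R_x|

puts r2_sum.round(4)
puts r2_det.round(4)
```

Both lines should print the same value up to floating-point rounding.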
#sst ⇒ Object
Total sum of squares
# File 'lib/statsample/regression/multiple/matrixengine.rb', line 130
def sst
  @y_sd**2*(cases-1.0)
end
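A short numeric sketch of the total sum of squares, with hypothetical values for the standard deviation, the number of cases and R^2. The split into regression and error parts uses the standard identity SST = SSR + SSE; whether the inherited #ssr and #sse methods compute it this way is not shown on this page.

```ruby
# Hypothetical inputs.
y_sd  = 2.0
cases = 51
r2    = 0.52

# Total sum of squares: variance of y times (n - 1).
sst = y_sd**2 * (cases - 1.0)

# Standard decomposition (assumption about how the parts relate).
ssr = r2 * sst
sse = (1.0 - r2) * sst

puts sst
puts (ssr + sse - sst).abs < 1e-9
```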
#standarized_coeffs ⇒ Object
Hash of beta or standardized coefficients
# File 'lib/statsample/regression/multiple/matrixengine.rb', line 126
def standarized_coeffs
  assign_names(@coeffs_stan)
end
#tolerance(var) ⇒ Object
Tolerance for a given variable, defined as (1-R^2) of the regression of the other independent variables over the selected one.
Reference: talkstats.com/showthread.php?t=5056
# File 'lib/statsample/regression/multiple/matrixengine.rb', line 147
def tolerance(var)
  return 1 if @matrix_x.column_size==1
  lr=Statsample::Regression::Multiple::MatrixEngine.new(@matrix_x, var)
  1-lr.r2
end
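To make the definition concrete, the sketch below computes the tolerance of one predictor from the correlations among three predictors. The matrix values and the helper `tolerance_of` are hypothetical; the helper only handles exactly three predictors and does not use the library.

```ruby
# Correlations among three predictors only (hypothetical values).
r_x = [[1.0, 0.5, 0.2],
       [0.5, 1.0, 0.7],
       [0.2, 0.7, 1.0]]

# Hypothetical helper: tolerance of predictor `var` when there are exactly
# three predictors, i.e. 1 - R^2 of `var` regressed on the other two.
def tolerance_of(r_x, var)
  i, j = (0...r_x.size).to_a - [var]
  r_ij = r_x[i][j]
  det  = 1.0 - r_ij**2
  # Standardized coefficients of the auxiliary regression (2x2 solve).
  beta_i = (r_x[i][var] - r_ij * r_x[j][var]) / det
  beta_j = (r_x[j][var] - r_ij * r_x[i][var]) / det
  r2 = beta_i * r_x[i][var] + beta_j * r_x[j][var]
  1.0 - r2
end

(0..2).each { |v| puts "predictor #{v}: #{tolerance_of(r_x, v).round(3)}" }
```

A tolerance near 1 means the predictor shares little variance with the others; values near 0 signal near-collinearity.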