Class: Statsample::Regression::Multiple::BaseEngine
- Includes: Summarizable
- Defined in: lib/statsample/regression/multiple/baseengine.rb
Overview
Base class for Multiple Regression Engines
Direct Known Subclasses
Instance Attribute Summary
-
#cases ⇒ Object
readonly
Minimum number of valid cases for pairs of correlation.
-
#digits ⇒ Object
Returns the value of attribute digits.
-
#name ⇒ Object
Name of analysis.
-
#total_cases ⇒ Object
readonly
Number of total cases (dataset.cases).
-
#valid_cases ⇒ Object
readonly
Number of valid cases (listwise).
Class Method Summary
Instance Method Summary
-
#anova ⇒ Object
Calculate F Test.
- #assign_names(c) ⇒ Object
-
#coeffs_se ⇒ Object
Standard Error for coefficients.
-
#coeffs_t ⇒ Object
T values for coeffs.
-
#coeffs_tolerances ⇒ Object
Tolerances for each coefficient.
-
#constant_se ⇒ Object
Standard error for constant.
-
#constant_t ⇒ Object
T for constant.
-
#df_e ⇒ Object
Degrees of freedom for error.
-
#df_r ⇒ Object
Degrees of freedom for regression.
-
#estimated_variance_covariance_matrix ⇒ Object
Estimated variance-covariance matrix. Used to calculate the standard error of the constant.
-
#f ⇒ Object
F statistic (Fisher) for the ANOVA.
-
#initialize(ds, y_var, opts = Hash.new) ⇒ BaseEngine
constructor
A new instance of BaseEngine.
-
#mse ⇒ Object
Mean Square Error.
-
#msr ⇒ Object
Mean square Regression.
-
#predicted ⇒ Object
Retrieves a vector with predicted values for y.
-
#probability ⇒ Object
p-value of the F test.
- #process(v) ⇒ Object
-
#r ⇒ Object
Multiple R.
-
#r2_adjusted ⇒ Object
R^2 Adjusted.
- #report_building(b) ⇒ Object
-
#residuals ⇒ Object
Retrieves a vector with residual values for y.
-
#se_estimate ⇒ Object
Standard error of estimate.
-
#se_r2 ⇒ Object
Standard error of R^2 (the source marks this formula as uncertain).
-
#sse ⇒ Object
Sum of squares (Error).
- #sse_direct ⇒ Object
-
#ssr ⇒ Object
Sum of squares (regression).
-
#ssr_direct ⇒ Object
Sum of squares of regression using the predicted value minus y mean.
-
#sst ⇒ Object
Sum of squares Total.
-
#standarized_predicted ⇒ Object
Retrieves a vector with standardized predicted values for y.
-
#tolerance(var) ⇒ Object
Tolerance for a given variable. See talkstats.com/showthread.php?t=5056.
Methods included from Summarizable
Constructor Details
#initialize(ds, y_var, opts = Hash.new) ⇒ BaseEngine
Returns a new instance of BaseEngine.
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 20
    def initialize(ds, y_var, opts = Hash.new)
      @ds=ds
      @predictors_n=@ds.vectors.size-1
      @total_cases=@ds.nrows
      @cases=@ds.nrows
      @y_var=y_var
      @r2=nil
      @name=_("Multiple Regression: %s over %s") % [ ds.vectors.to_a.join(",") , @y_var]

      opts_default={:digits=>3}
      @opts=opts_default.merge opts

      @opts.each{|k,v|
        self.send("#{k}=",v) if self.respond_to? k
      }
    end
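The constructor merges the caller's options over a default of `:digits => 3`, then forwards each entry to a writer method only when the object responds to the key. The following is a minimal sketch of that pattern; `OptsDemo` is a hypothetical class used for illustration and is not part of Statsample.

```ruby
# Hypothetical class illustrating the option-merging pattern; not part of Statsample.
class OptsDemo
  attr_accessor :digits, :name

  def initialize(opts = {})
    defaults = { digits: 3 }
    # Caller-supplied values take precedence over the defaults.
    @opts = defaults.merge(opts)
    @opts.each do |key, value|
      # Keys without a matching reader are skipped silently.
      send("#{key}=", value) if respond_to?(key)
    end
  end
end

demo = OptsDemo.new(name: "My analysis", unknown: 1)
puts demo.digits
puts demo.name
```

Because unknown keys are ignored rather than rejected, a misspelled option name would go unnoticed under this pattern.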
Instance Attribute Details
#cases ⇒ Object (readonly)
Minimum number of valid cases for pairs of correlation
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 10
    def cases
      @cases
    end
#digits ⇒ Object
Returns the value of attribute digits.
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 16
    def digits
      @digits
    end
#name ⇒ Object
Name of analysis
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 8
    def name
      @name
    end
#total_cases ⇒ Object (readonly)
Number of total cases (dataset.cases)
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 14
    def total_cases
      @total_cases
    end
#valid_cases ⇒ Object (readonly)
Number of valid cases (listwise)
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 12
    def valid_cases
      @valid_cases
    end
Class Method Details
.univariate? ⇒ Boolean
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 17
    def self.univariate?
      true
    end
Instance Method Details
#anova ⇒ Object
Calculate F Test
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 37
    def anova
      @anova||=Statsample::Anova::OneWay.new(:ss_num=>ssr, :ss_den=>sse, :df_num=>df_r, :df_den=>df_e, :name_numerator=>_("Regression"), :name_denominator=>_("Error"), :name=>"ANOVA")
    end
#assign_names(c) ⇒ Object
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 212
    def assign_names(c)
      a={}
      @fields.each_index {|i|
        a[@fields[i]]=c[i]
      }
      a
    end
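`assign_names` pairs each field name with the value at the same position in `c`. Here is a standalone sketch of that mapping; the method name `assign_names_sketch` and the explicit `fields` parameter (standing in for `@fields`) are assumptions made for illustration.

```ruby
# Hypothetical standalone version; `fields` stands in for the @fields instance variable.
def assign_names_sketch(fields, values)
  fields.each_index.each_with_object({}) do |i, named|
    # Position i in `values` is labelled with the i-th field name.
    named[fields[i]] = values[i]
  end
end

p assign_names_sketch(%w[a b], [[1, 2], [3, 4]])
```

Values beyond the number of field names are dropped, because only the indices of `fields` are visited.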
#coeffs_se ⇒ Object
Standard Error for coefficients
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 149
    def coeffs_se
      out={}
      mse=sse.quo(df_e)
      coeffs.each {|k,v|
        out[k]=Math::sqrt(mse/(@ds[k].sum_of_squares * tolerance(k)))
      }
      out
    end
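The expression above computes each standard error as the square root of the MSE divided by the product of the predictor's sum of squares and its tolerance. The sketch below works this through on plain arrays. It assumes that `sum_of_squares` means the sum of squared deviations from the mean; `coefficient_se` is a hypothetical helper, not a Statsample method.

```ruby
# Hypothetical helper; assumes sum_of_squares is the sum of squared
# deviations from the mean of the predictor.
def coefficient_se(mse, x, tolerance)
  mean = x.sum / x.size.to_f
  ss_x = x.sum { |v| (v - mean)**2 }
  Math.sqrt(mse / (ss_x * tolerance))
end

puts coefficient_se(2.5, [1.0, 2.0, 3.0, 4.0], 0.5)
```

A lower tolerance (a predictor that is well explained by the others) shrinks the denominator and therefore inflates the standard error.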
#coeffs_t ⇒ Object
T values for coeffs
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 101
    def coeffs_t
      out={}
      se=coeffs_se
      coeffs.each do |k,v|
        out[k]=v / se[k]
      end
      out
    end
#coeffs_tolerances ⇒ Object
Tolerances for each coefficient
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 142
    def coeffs_tolerances
      @fields.inject({}) {|a,f|
        a[f]=tolerance(f);
        a
      }
    end
#constant_se ⇒ Object
Standard error for constant
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 182
    def constant_se
      estimated_variance_covariance_matrix[0,0]
    end
#constant_t ⇒ Object
T for constant
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 178
    def constant_t
      constant.to_f/constant_se
    end
#df_e ⇒ Object
Degrees of freedom for error
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 122
    def df_e
      @valid_cases-@predictors_n-1
    end
#df_r ⇒ Object
Degrees of freedom for regression
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 118
    def df_r
      @predictors_n
    end
#estimated_variance_covariance_matrix ⇒ Object
Estimated variance-covariance matrix. Used to calculate the standard error of the constant.
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 165
    def estimated_variance_covariance_matrix
      #mse_p=mse
      columns=[]
      @ds_valid.vectors.each{|k|
        v = @ds_valid[k]
        columns.push(v.to_a) unless k == @y_var
      }
      columns.unshift([1.0]*@valid_cases)
      x=::Matrix.columns(columns)
      matrix=((x.t*x)).inverse * mse
      matrix.collect {|i| Math::sqrt(i) if i>=0 }
    end
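The method builds a design matrix X with a leading column of ones, computes (X'X)^-1 scaled by the MSE, and then takes the square root of every non-negative entry, so the value it returns holds standard errors on the diagonal rather than variances. Here is a small sketch with Ruby's `matrix` library; `sqrt_cov_matrix` and its arguments are hypothetical names chosen for illustration.

```ruby
require "matrix"

# Hypothetical helper mirroring the steps above on plain arrays.
def sqrt_cov_matrix(predictor_columns, mse)
  n = predictor_columns.first.size
  # Prepend the intercept column of ones.
  columns = [[1.0] * n] + predictor_columns
  x = Matrix.columns(columns)
  cov = (x.t * x).inverse * mse
  # Negative entries (possible off the diagonal) become nil.
  cov.collect { |v| Math.sqrt(v) if v >= 0 }
end

m = sqrt_cov_matrix([[-1.0, 0.0, 1.0]], 3.0)
puts m[0, 0]  # standard error of the constant
puts m[1, 1]  # standard error of the single slope
```

Element `[0, 0]` corresponds to the intercept because the column of ones is placed first, which is why `constant_se` reads that position.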
#f ⇒ Object
F statistic (Fisher) for the ANOVA
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 126
    def f
      anova.f
    end
#mse ⇒ Object
Mean Square Error
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 114
    def mse
      sse.quo(df_e)
    end
#msr ⇒ Object
Mean square Regression
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 110
    def msr
      ssr.quo(df_r)
    end
#predicted ⇒ Object
Retrieves a vector with predicted values for y
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 45
    def predicted
      Daru::Vector.new(
        @total_cases.times.collect do |i|
          invalid = false
          vect = @dep_columns.collect {|v| invalid = true if v[i].nil?; v[i]}
          if invalid
            nil
          else
            process(vect)
          end
        end
      )
    end
#probability ⇒ Object
p-value of the F test
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 130
    def probability
      anova.probability
    end
#process(v) ⇒ Object
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 240
    def process(v)
      c=coeffs
      total=constant
      @fields.each_index{|i|
        total+=c[@fields[i]]*v[i]
      }
      total
    end
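`process` evaluates the fitted equation for one case: it starts from the constant and adds each coefficient multiplied by the matching predictor value. The sketch below is a standalone illustration; `predict_case` and its explicit parameters are hypothetical stand-ins for `constant`, `coeffs` and `@fields`.

```ruby
# Hypothetical standalone version of the linear predictor.
def predict_case(constant, coeffs, fields, values)
  fields.each_index.inject(constant) do |total, i|
    # coeffs is keyed by field name; values is positional.
    total + coeffs[fields[i]] * values[i]
  end
end

puts predict_case(1.0, { "a" => 2.0, "b" => -0.5 }, %w[a b], [3, 4])
```

Note that the values must be supplied in the same order as the field names, since the lookup is positional on one side and by name on the other.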
#r ⇒ Object
Multiple R
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 77
    def r
      raise "You should implement this"
    end
#r2_adjusted ⇒ Object
Adjusted R^2. Estimates the population R^2 using the Ezekiel formula. Always lower than the sample R^2.
Reference:
-
Leach, L. & Henson, R. (2007). The Use and Impact of Adjusted R2 Effects in Published Regression Research. Multiple Linear Regression Viewpoints, 33(1), 1-11.
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 89
    def r2_adjusted
      r2-((1-r2)*@predictors_n).quo(df_e)
    end
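As a worked example of the expression above, the sketch below computes the adjusted R^2 from a sample R^2, the number of valid cases `n` and the number of predictors `p`, with `df_e = n - p - 1`. The helper name `adjusted_r2` and the input values are made up for illustration.

```ruby
# Hypothetical helper: r2 - ((1 - r2) * p) / df_e, with df_e = n - p - 1.
def adjusted_r2(r2, n, p)
  df_e = n - p - 1
  r2 - ((1 - r2) * p).quo(df_e)
end

puts adjusted_r2(0.75, 23, 2)
```

This is algebraically the same as the more common form `1 - (1 - r2) * (n - 1) / (n - p - 1)`.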
#report_building(b) ⇒ Object
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 185
    def report_building(b)
      di="%0.#{digits}f"
      b.section(:name=>@name) do |g|
        c=coeffs
        g.text _("Engine: %s") % self.class
        g.text(_("Cases(listwise)=%d(%d)") % [@total_cases, @valid_cases])
        g.text _("R=")+(di % r)
        g.text _("R^2=")+(di % r2)
        g.text _("R^2 Adj=")+(di % r2_adjusted)
        g.text _("Std.Error R=")+ (di % se_estimate)

        g.text(_("Equation")+"="+ sprintf(di,constant) +" + "+ @fields.collect {|k| sprintf("#{di}%s",c[k],k)}.join(' + ') )

        g.parse_element(anova)
        sc=standarized_coeffs

        cse=coeffs_se
        g.table(:name=>_("Beta coefficients"), :header=>%w{coeff b beta se t}.collect{|field| _(field)} ) do |t|
          t.row([_("Constant"), sprintf(di, constant), "-", constant_se.nil? ? "": sprintf(di, constant_se), constant_t.nil? ? "" : sprintf(di, constant_t)])
          @fields.each do |f|
            t.row([f, sprintf(di, c[f]), sprintf(di, sc[f]), sprintf(di, cse[f]), sprintf(di, c[f].quo(cse[f]))])
          end
        end
      end
    end
#residuals ⇒ Object
Retrieves a vector with residual values for y
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 63
    def residuals
      Daru::Vector.new(
        (0...@total_cases).collect do |i|
          invalid=false
          vect=@dep_columns.collect{|v| invalid=true if v[i].nil?; v[i]}
          if invalid or @ds[@y_var][i].nil?
            nil
          else
            @ds[@y_var][i] - process(vect)
          end
        end
      )
    end
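Both `predicted` and `residuals` keep one entry per case and put `nil` wherever a predictor is missing. `residuals` also yields `nil` when the observed y is missing. The sketch below reproduces that behaviour on plain arrays instead of `Daru::Vector`; `predicted_and_residuals` and its positional `coeffs` array are assumptions made for illustration.

```ruby
# Hypothetical helper; columns is an array of predictor arrays,
# coeffs is positional (one coefficient per column).
def predicted_and_residuals(columns, y, constant, coeffs)
  y.each_index.map do |i|
    row = columns.map { |col| col[i] }
    if row.any?(&:nil?)
      # A missing predictor invalidates both outputs for this case.
      [nil, nil]
    else
      yhat = constant + row.each_with_index.sum { |x, j| coeffs[j] * x }
      [yhat, y[i].nil? ? nil : y[i] - yhat]
    end
  end
end

p predicted_and_residuals([[1.0, nil, 3.0]], [2.0, 5.0, nil], 0.5, [1.0])
```

Keeping the `nil` placeholders preserves alignment with the original dataset rows, at the cost of requiring callers to skip them in later calculations.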
#se_estimate ⇒ Object
Standard error of estimate
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 41
    def se_estimate
      Math::sqrt(sse.quo(df_e))
    end
#se_r2 ⇒ Object
Standard error of R^2 (the source marks this formula as uncertain)
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 159
    def se_r2
      Math::sqrt((4*r2*(1-r2)**2*(df_e)**2).quo((@cases**2-1)*(@cases+3)))
    end
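The expression above can be evaluated by hand for small inputs. The sketch below restates it as a standalone function, where `cases` plays the role of `@cases`. It only mirrors the arithmetic in the source and makes no claim about whether the formula itself is the appropriate estimator; `se_r2_sketch` is a hypothetical name.

```ruby
# Hypothetical restatement of the expression used by se_r2.
def se_r2_sketch(r2, df_e, cases)
  numerator   = 4 * r2 * (1 - r2)**2 * df_e**2
  denominator = (cases**2 - 1) * (cases + 3)
  Math.sqrt(numerator.quo(denominator))
end

puts se_r2_sketch(0.5, 2, 5)
```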
#sse ⇒ Object
Sum of squares (Error)
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 97
    def sse
      sst - ssr
    end
#sse_direct ⇒ Object
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 237
    def sse_direct
      sst-ssr
    end
#ssr ⇒ Object
Sum of squares (regression)
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 93
    def ssr
      r2*sst
    end
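The sums of squares, degrees of freedom and mean squares on this page chain together: `ssr = r2 * sst`, `sse = sst - ssr`, `msr = ssr / df_r`, `mse = sse / df_e`. The sketch below runs the chain on made-up numbers and, assuming the standard F ratio `msr / mse`, derives the statistic that the ANOVA object would report. `anova_summary` is a hypothetical helper.

```ruby
# Hypothetical helper chaining the quantities defined on this page.
def anova_summary(r2, sst, valid_cases, predictors)
  ssr  = r2 * sst                       # sum of squares, regression
  sse  = sst - ssr                      # sum of squares, error
  df_r = predictors                     # degrees of freedom, regression
  df_e = valid_cases - predictors - 1   # degrees of freedom, error
  msr  = ssr.quo(df_r)
  mse  = sse.quo(df_e)
  # Assumption: F is the usual ratio of the two mean squares.
  { ssr: ssr, sse: sse, df_r: df_r, df_e: df_e, msr: msr, mse: mse, f: msr / mse }
end

p anova_summary(0.75, 200.0, 23, 2)
```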
#ssr_direct ⇒ Object
Sum of squares of regression using the predicted value minus y mean
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 222
    def ssr_direct
      mean=@dy.mean
      cases=0
      ssr=(0...@ds.cases).inject(0) {|a,i|
        invalid=false
        v=@dep_columns.collect{|c| invalid=true if c[i].nil?; c[i]}
        if !invalid
          cases+=1
          a+((process(v)-mean)**2)
        else
          a
        end
      }
      ssr
    end
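`ssr_direct` accumulates the squared distance between each predicted value and the mean of y, skipping cases with a missing predictor. The sketch below shows the same accumulation on an array of predictions in which invalid cases are already `nil`; `ssr_from_predictions` is a hypothetical helper.

```ruby
# Hypothetical helper; nil marks a case that could not be predicted.
def ssr_from_predictions(predictions, y_mean)
  predictions.compact.sum { |yhat| (yhat - y_mean)**2 }
end

puts ssr_from_predictions([1.0, nil, 3.0], 2.0)
```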
#sst ⇒ Object
Sum of squares Total
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 81
    def sst
      raise "You should implement this"
    end
#standarized_predicted ⇒ Object
Retrieves a vector with standardized predicted values for y
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 59
    def standarized_predicted
      predicted.standarized
    end
#tolerance(var) ⇒ Object
Tolerance for a given variable. See talkstats.com/showthread.php?t=5056
    # File 'lib/statsample/regression/multiple/baseengine.rb', line 135
    def tolerance(var)
      ds = assign_names(@dep_columns)
      ds.each { |k,v| ds[k] = Daru::Vector.new(v) }
      lr = self.class.new(Daru::DataFrame.new(ds),var)
      1 - lr.r2
    end
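The method regresses the chosen predictor on the remaining predictors and returns one minus the R^2 of that auxiliary regression. With exactly two predictors, that R^2 is the squared Pearson correlation between them, which allows a short standalone sketch. `pearson_r` and `tolerance_two_predictors` are hypothetical helpers, and the shortcut does not cover models with more than two predictors.

```ruby
# Hypothetical helpers; valid only for the two-predictor case.
def pearson_r(x, y)
  n  = x.size.to_f
  mx = x.sum / n
  my = y.sum / n
  sxy = x.zip(y).sum { |a, b| (a - mx) * (b - my) }
  sxx = x.sum { |a| (a - mx)**2 }
  syy = y.sum { |b| (b - my)**2 }
  sxy / Math.sqrt(sxx * syy)
end

def tolerance_two_predictors(x1, x2)
  # 1 - R^2 of x1 regressed on x2
  1 - pearson_r(x1, x2)**2
end

puts tolerance_two_predictors([1.0, 2.0, 3.0, 4.0], [1.0, -1.0, -1.0, 1.0])
```

A tolerance near 1 indicates that the predictor shares little variance with the others, while a value near 0 signals strong collinearity.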