Class: Statsample::Factor::PrincipalAxis
- Includes:
- DirtyMemoize, Summarizable
- Defined in:
- lib/statsample/factor/principalaxis.rb
Overview
Principal Axis Analysis for a covariance or correlation matrix.
For PCA, use Statsample::Factor::PCA
Usage:
require 'statsample'
a = Daru::Vector.new([2.5, 0.5, 2.2, 1.9, 3.1, 2.3, 2.0, 1.0, 1.5, 1.1])
b = Daru::Vector.new([2.4,0.7,2.9,2.2,3.0,2.7,1.6,1.1,1.6,0.9])
ds= Daru::DataFrame.new({:a => a,:b => b})
cor_matrix=Statsample::Bivariate.correlation_matrix(ds)
pa=Statsample::Factor::PrincipalAxis.new(cor_matrix)
pa.iterate(1)
pa.m
=> 1
pca.component_matrix
=> GSL::Matrix
[ 9.622e-01
9.622e-01 ]
pca.communalities
=> [0.962964636346122, 0.962964636346122]
References:
-
SPSS Manual
-
Smith, L. (2002). A tutorial on Principal Component Analysis. Available on courses.eas.ualberta.ca/eas570/pca_tutorial.pdf
Constant Summary collapse
- DELTA =
Minimum difference between succesive iterations on sum of communalities
1e-3
- MAX_ITERATIONS =
Maximum number of iterations
25
Instance Attribute Summary collapse
-
#eigenvalues ⇒ Object
readonly
Eigenvalues of factor analysis.
-
#epsilon ⇒ Object
Tolerance for iterations.
-
#initial_eigenvalues ⇒ Object
readonly
Initial eigenvalues.
-
#iterations ⇒ Object
readonly
Number of iterations required to converge.
-
#m ⇒ Object
Number of factors.
-
#max_iterations ⇒ Object
Maximum number of iterations.
-
#name ⇒ Object
Name of analysis.
-
#smc ⇒ Object
Use SMC(squared multiple correlations) as diagonal.
Class Method Summary collapse
-
.separate_matrices(matrix, y) ⇒ Object
Returns two matrixes from a correlation matrix with regressors correlation matrix and criteria xy matrix.
Instance Method Summary collapse
-
#communalities(m = nil) ⇒ Object
Communality for all variables given m factors.
-
#component_matrix(m = nil) ⇒ Object
Component matrix for m factors.
- #initial_communalities ⇒ Object
-
#initialize(matrix, opts = Hash.new) ⇒ PrincipalAxis
constructor
A new instance of PrincipalAxis.
-
#iterate(m = nil) ⇒ Object
(also: #compute)
Iterate to find the factors.
- #report_building(generator) ⇒ Object
Methods included from Summarizable
Constructor Details
#initialize(matrix, opts = Hash.new) ⇒ PrincipalAxis
Returns a new instance of PrincipalAxis.
65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 |
# File 'lib/statsample/factor/principalaxis.rb', line 65 def initialize(matrix, opts=Hash.new) @matrix=matrix if @matrix.respond_to? :fields @fields=@matrix.fields else @fields=@matrix.row_size.times.map {|i| _("Variable %d") % (i+1)} end @n_variables=@matrix.row_size @name="" @m=nil @initial_eigenvalues=nil @initial_communalities=nil @component_matrix=nil @delta=DELTA @smc=true @max_iterations=MAX_ITERATIONS opts.each{|k,v| self.send("#{k}=",v) if self.respond_to? k } if @matrix.respond_to? :fields @variables_names=@matrix.fields else @variables_names=@n_variables.times.map {|i| "V#{i+1}"} end if @m.nil? pca=PCA.new(::Matrix.rows(@matrix.to_a)) @m=pca.m end @clean=true end |
Instance Attribute Details
#eigenvalues ⇒ Object (readonly)
Eigenvalues of factor analysis
58 59 60 |
# File 'lib/statsample/factor/principalaxis.rb', line 58 def eigenvalues @eigenvalues end |
#epsilon ⇒ Object
Tolerance for iterations
49 50 51 |
# File 'lib/statsample/factor/principalaxis.rb', line 49 def epsilon @epsilon end |
#initial_eigenvalues ⇒ Object (readonly)
Initial eigenvalues
46 47 48 |
# File 'lib/statsample/factor/principalaxis.rb', line 46 def initial_eigenvalues @initial_eigenvalues end |
#iterations ⇒ Object (readonly)
Number of iterations required to converge
43 44 45 |
# File 'lib/statsample/factor/principalaxis.rb', line 43 def iterations @iterations end |
#m ⇒ Object
Number of factors. Set by default to the number of factors with eigenvalues > 1 (Kaiser criterion).
Warning: Kaiser criterion overfactors! Give yourself some time and use Horn’s Parallel Analysis.
40 41 42 |
# File 'lib/statsample/factor/principalaxis.rb', line 40 def m @m end |
#max_iterations ⇒ Object
Maximum number of iterations
55 56 57 |
# File 'lib/statsample/factor/principalaxis.rb', line 55 def max_iterations @max_iterations end |
#name ⇒ Object
Name of analysis
32 33 34 |
# File 'lib/statsample/factor/principalaxis.rb', line 32 def name @name end |
#smc ⇒ Object
Use SMC(squared multiple correlations) as diagonal. If false, use 1
52 53 54 |
# File 'lib/statsample/factor/principalaxis.rb', line 52 def smc @smc end |
Class Method Details
.separate_matrices(matrix, y) ⇒ Object
Returns two matrixes from a correlation matrix with regressors correlation matrix and criteria xy matrix.
178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 |
# File 'lib/statsample/factor/principalaxis.rb', line 178 def self.separate_matrices(matrix, y) ac=[] matrix.column_size.times do |i| ac.push(matrix[y,i]) if i!=y end rxy=Matrix.columns([ac]) rows=[] matrix.row_size.times do |i| if i!=y row=[] matrix.row_size.times do |j| row.push(matrix[i,j]) if j!=y end rows.push(row) end end rxx=Matrix.rows(rows) [rxx,rxy] end |
Instance Method Details
#communalities(m = nil) ⇒ Object
Communality for all variables given m factors
97 98 99 100 101 102 103 |
# File 'lib/statsample/factor/principalaxis.rb', line 97 def communalities(m=nil) if m!=@m or @clean iterate(m) raise "Can't calculate comunality" if @communalities.nil? end @communalities end |
#component_matrix(m = nil) ⇒ Object
Component matrix for m factors
105 106 107 108 109 110 |
# File 'lib/statsample/factor/principalaxis.rb', line 105 def component_matrix(m=nil) if m!=@m or @clean iterate(m) end @component_matrix end |
#initial_communalities ⇒ Object
154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 |
# File 'lib/statsample/factor/principalaxis.rb', line 154 def initial_communalities if @initial_communalities.nil? if @smc # Based on O'Connors(2000) @initial_communalities=@matrix.inverse.diagonal.map{|i| 1-(1.quo(i))} =begin @[email protected]_size.times.collect {|i| rxx , rxy = PrincipalAxis.separate_matrices(@matrix,i) matrix=(rxy.t*rxx.inverse*rxy) matrix[0,0] } =end else @initial_communalities=[1.0]*@matrix.column_size end end @initial_communalities end |
#iterate(m = nil) ⇒ Object Also known as: compute
Iterate to find the factors
112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 |
# File 'lib/statsample/factor/principalaxis.rb', line 112 def iterate(m=nil) @clean=false m||=@m @m=m t = @max_iterations work_matrix=@matrix.to_a prev_com=initial_communalities pca=PCA.new(::Matrix.rows(work_matrix)) @initial_eigenvalues=pca.eigenvalues prev_sum=prev_com.inject(0) {|ac,v| ac+v} @iterations=0 t.times do |i| "#{@name}: Iteration #{i}" if $DEBUG @iterations+=1 prev_com.each_with_index{|v,it| work_matrix[it][it]=v } pca=PCA.new(::Matrix.rows(work_matrix)) @communalities=pca.communalities(m) @eigenvalues=pca.eigenvalues com_sum = @communalities.inject(0) {|ac,v| ac+v} #jump=true break if (com_sum-prev_sum).abs < @delta @communalities.each_with_index do |v2,i2| raise "Variable #{i2} with communality > 1" if v2>1.0 end prev_sum=com_sum prev_com=@communalities end @component_matrix=pca.component_matrix(m) @component_matrix.extend CovariateMatrix @component_matrix.name=_("Factor Matrix") @component_matrix.fields_x = @variables_names @component_matrix.fields_y = m.times.map {|i| "factor_#{i+1}"} end |
#report_building(generator) ⇒ Object
197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 |
# File 'lib/statsample/factor/principalaxis.rb', line 197 def report_building(generator) iterate if @clean generator.section(:name=>@name) do |s| s.text _("Number of factors: %d") % m s.text _("Iterations: %d") % @iterations s.table(:name=>_("Communalities"), :header=>[_("Variable"),_("Initial"),_("Extraction")]) do |t| communalities(m).each_with_index {|com,i| t.row([@fields[i], sprintf("%0.4f", initial_communalities[i]), sprintf("%0.3f", com)]) } end s.table(:name=>_("Total Variance"), :header=>[_("Factor"), _("I.E.Total"), _("I.E. %"), _("I.E.Cum. %"), _("S.L.Total"), _("S.L. %"), _("S.L.Cum. %") ]) do |t| ac_eigen,ac_i_eigen=0,0 @initial_eigenvalues.each_with_index {|eigenvalue,i| ac_i_eigen+=eigenvalue ac_eigen+=@eigenvalues[i] new_row=[ _("Factor %d") % (i+1), sprintf("%0.3f",eigenvalue), sprintf("%0.3f%%", eigenvalue*100.quo(@n_variables)), sprintf("%0.3f",ac_i_eigen*100.quo(@n_variables)) ] if i<@m new_row.concat [ sprintf("%0.3f", @eigenvalues[i]), sprintf("%0.3f%%", @eigenvalues[i]*100.quo(@n_variables)), sprintf("%0.3f",ac_eigen*100.quo(@n_variables)) ] else new_row.concat ["","",""] end t.row new_row } end s.parse_element(component_matrix) end end |