Class: Eluka::FeatureVectors
- Inherits: Object
  - Object
    - Eluka::FeatureVectors
- Defined in: lib/eluka/feature_vector.rb
Instance Method Summary
- #add(vector, label = 0) ⇒ Object
  We just keep all data points stored and convert them to feature vectors only on demand.
- #define_features ⇒ Object
  For training data points we make sure all the features are added to the feature list.
- #initialize(features, train) ⇒ FeatureVectors (constructor)
  Feature vectors for a data point need to know the global list of features and their respective ids.
- #to_libSVM(sel_features = nil) ⇒ Object
  Creates feature vectors and converts them to LibSVM format: a multiline string with one data point per line.
Constructor Details
#initialize(features, train) ⇒ FeatureVectors
Feature vectors for a data point need to know the global list of features and their respective ids.
During training, new features are added to the features list as they are found.
Hence the object needs to know whether the vectors it computes are for training or for classification.
# File 'lib/eluka/feature_vector.rb', line 25
def initialize(features, train)
  @fvs = Array.new
  @features = features # Instance of features
  @train = train       # Boolean
end
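To illustrate why the train flag matters, here is a minimal sketch. It does not load the eluka gem: SketchFeatures is a hypothetical stand-in for the features object (only add, f_count and term are modelled, since those are the calls the documented code makes), and SketchVectors is a condensed paraphrase of this class with selected-feature filtering left out. The sample terms are made up for illustration.

```ruby
# Hypothetical stand-in for the shared features registry (not the gem's class).
class SketchFeatures
  def initialize
    @terms = []
  end

  # Registers a term once; ids are 1-based positions in the list.
  def add(term)
    @terms << term unless @terms.include?(term)
  end

  def f_count
    @terms.size
  end

  def term(id)
    @terms[id - 1]
  end
end

# Condensed paraphrase of the documented class (no selected-feature filtering).
class SketchVectors
  def initialize(features, train)
    @fvs = []
    @features = features
    @train = train
  end

  def add(vector, label = 0)
    @fvs.push([vector, label])
  end

  def to_libSVM
    # Only training data is allowed to grow the shared feature list.
    @fvs.each { |vector, _| vector.each_key { |t| @features.add(t) } } if @train
    @fvs.map do |vector, label|
      pairs = (1..@features.f_count).map do |id|
        value = vector[@features.term(id)]
        value ? "#{id}:#{value}" : nil
      end
      ([label] + pairs.compact).join(" ")
    end.join("\n")
  end
end

features = SketchFeatures.new

training = SketchVectors.new(features, true)
training.add({ "apple" => 2, "pear" => 1 }, 1)
train_out = training.to_libSVM # registers "apple" (id 1) and "pear" (id 2)

unseen = SketchVectors.new(features, false)
unseen.add({ "pear" => 3, "plum" => 1 })
test_out = unseen.to_libSVM    # "plum" was never registered, so it is left out

puts train_out
puts test_out
```

Both objects share one features instance, so the ids assigned during training are the ids used at classification time. With train set to false the feature list is left untouched.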
Instance Method Details
#add(vector, label = 0) ⇒ Object
We just keep all data points stored and convert them to feature vectors only on demand.
# File 'lib/eluka/feature_vector.rb', line 34
def add(vector, label = 0)
  @fvs.push([vector, label])
end
#define_features ⇒ Object
For training data points we make sure all the features are added to the feature list.
# File 'lib/eluka/feature_vector.rb', line 41
def define_features
  @fvs.each do |vector, label|
    vector.each do |term, value|
      @features.add(term)
    end
  end
end
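As a rough illustration of what this loop achieves, the sketch below runs the same nested iteration against local variables. SketchFeatures is a hypothetical stand-in for the features object, not the gem's class. It assumes that adding a term twice registers it only once and that ids follow first-seen order. The data points are invented.

```ruby
# Hypothetical stand-in for the features registry; it models only the
# methods used below and assumes duplicate terms are ignored.
class SketchFeatures
  def initialize
    @terms = []
  end

  def add(term)
    @terms << term unless @terms.include?(term)
  end

  def f_count
    @terms.size
  end

  # Ids are 1-based, matching the (1..f_count) range used by to_libSVM.
  def term(id)
    @terms[id - 1]
  end
end

features = SketchFeatures.new

# Stored data points in the [vector, label] shape that #add pushes.
fvs = [
  [{ "apple" => 1, "pear" => 2 }, 1],
  [{ "pear" => 1, "plum" => 4 }, -1]
]

# Same nested loop as define_features, written against local variables.
fvs.each do |vector, _label|
  vector.each do |term, _value|
    features.add(term)
  end
end

registered = (1..features.f_count).map { |id| features.term(id) }
p registered # => ["apple", "pear", "plum"]
```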
#to_libSVM(sel_features = nil) ⇒ Object
Creates feature vectors and converts them to LibSVM format: a multiline string with one data point per line.
If a list of selected features is provided, only those features are inserted.
# File 'lib/eluka/feature_vector.rb', line 56
def to_libSVM(sel_features = nil)
  # Load the selected features into a Hash
  sf = Hash.new
  if (sel_features)
    sel_features.each do |f|
      sf[f] = 1
    end
  end

  self.define_features if (@train) # This method is needed only for training data

  output = Array.new
  @fvs.each do |vector, label|
    line = Array.new
    line.push(label)
    (1..@features.f_count).each do |id| # OPTIMIZE: Change this line to consider sorting in case of terms being features
      term = @features.term(id)
      if ( value = vector[term] ) then
        line.push([id, value].join(":")) if sf[term] or not sel_features
      end
    end
    output.push(line.join(" "))
  end
  output.join("\n")
end
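A usage sketch may make the output format and the effect of sel_features easier to see. It is self-contained rather than loading the gem: SketchFeatures is a hypothetical stand-in for the features object, and SketchFeatureVectors restates the methods listed above in compact form. The terms, values and labels are invented for illustration.

```ruby
# Hypothetical stand-in for the features registry (not the gem's class).
class SketchFeatures
  def initialize
    @terms = []
  end

  def add(term)
    @terms << term unless @terms.include?(term)
  end

  def f_count
    @terms.size
  end

  def term(id)
    @terms[id - 1] # ids are 1-based
  end
end

# Compact restatement of the documented methods, for illustration only.
class SketchFeatureVectors
  def initialize(features, train)
    @fvs = []
    @features = features
    @train = train
  end

  def add(vector, label = 0)
    @fvs.push([vector, label])
  end

  def define_features
    @fvs.each { |vector, _label| vector.each_key { |term| @features.add(term) } }
  end

  def to_libSVM(sel_features = nil)
    # Index the selected features for constant-time lookup.
    sf = {}
    sel_features.each { |f| sf[f] = 1 } if sel_features
    define_features if @train
    output = @fvs.map do |vector, label|
      line = [label]
      (1..@features.f_count).each do |id|
        term = @features.term(id)
        value = vector[term]
        line.push("#{id}:#{value}") if value && (sf[term] || !sel_features)
      end
      line.join(" ")
    end
    output.join("\n")
  end
end

features = SketchFeatures.new
fvs = SketchFeatureVectors.new(features, true)
fvs.add({ "apple" => 1, "pear" => 2 }, 1)
fvs.add({ "pear" => 1, "plum" => 4 }, -1)

full = fvs.to_libSVM             # every known feature
picked = fvs.to_libSVM(["pear"]) # only the selected feature

puts full
puts picked
```

Each line starts with the label, followed by id:value pairs in ascending id order. In this sketch, filtering by selected features drops pairs but does not renumber the ids, so "pear" keeps id 2 in both outputs.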